Computing: Free Pascal Programming

Creating command line programs with Lazarus.


Command line programs (i.e. programs running in Windows Command Prompt ("DOS box") or in a Linux terminal) may look like dinosaurs to mainstream computer users: necessary at the beginning of the PC age, but of no use today. Advanced users and specialists in domains such as bioinformatics know, however, that there are lots of situations where a graphical interface is not needed and a given task can run without any user interaction, similar to batch programs on a mainframe, taking its input data from the command line or a file and writing its output to the console or to a file. On Unix systems, lots of administration is done using shell scripts or text-based Perl programs. As an example of a simple command line program on Windows, have a look at my tutorial about Web development environment setup on MS Windows: 5. Perl, where I describe a tool, written in Free Pascal, that copies the Perl or PHP file currently being edited in Komodo Edit from my development library to the webserver and then starts the web browser, pointed to the URL of the script; in other words, a command line program that allows testing CGI scripts from within Komodo Edit.
There is no fundamental difference between creating a graphical application and a command line program: In Lazarus, choose New > Simple Program or New > Program (instead of New > Application). The difference between these two kinds of program is that New > Program automatically includes some compiler directives and units; in particular, it sets {$mode objfpc}{$H+} and declares the Classes unit, which allows object-oriented programming.
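To give an idea of what such a program looks like, here is a minimal sketch along the lines of what New > Program sets up. The program name HelloConsole and the use of TStringList are chosen for illustration only; the template actually generated may differ with your Lazarus version.

```pascal
program HelloConsole;   // hypothetical program name

{$mode objfpc}{$H+}     // Object Pascal mode, long strings by default

uses
  Classes;              // the unit declared by New > Program

var
  Lines: TStringList;   // a class from the Classes unit
begin
  Lines := TStringList.Create;
  try
    Lines.Add('Hello from a command line program');
    WriteLn(Lines[0]);
  finally
    Lines.Free;         // always release the object
  end;
end.
```

With New > Simple Program, you would start from an empty program body and add the directives and units yourself, if you need them.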
There are, however, two points to consider when creating a command line program:
  • Windows Command Prompt (unlike Linux terminals) does not use UTF8 encoding. When a Free Pascal program writes to the console, non-ASCII characters ("non-English letters") are displayed as "garbage" (the characters actually displayed being ANSI encoded, and ANSI depends on the current Windows System code page).
  • Console programs often need command line parameters and sometimes the current, or special, environment variables. Getting their values is easy, using the ParamCount and ParamStr functions and the GetEnv function respectively, but how can you test a command line program that needs parameters directly from within Lazarus?
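As a reminder of how these functions are used, here is a minimal sketch that lists the command line parameters and reads one environment variable. The program name ShowArgs and the choice of the PATH variable are assumptions made for illustration.

```pascal
program ShowArgs;       // hypothetical program name

{$mode objfpc}{$H+}

uses
  Dos;                  // provides GetEnv

var
  I: Integer;
begin
  // ParamStr(0) is the program itself, ParamStr(1)..ParamStr(ParamCount)
  // are the parameters given on the command line
  WriteLn('Program: ', ParamStr(0));
  WriteLn('Number of parameters: ', ParamCount);
  for I := 1 to ParamCount do
    WriteLn('  Parameter ', I, ': ', ParamStr(I));
  // GetEnv returns an empty string if the variable is not set
  WriteLn('PATH = ', GetEnv('PATH'));
end.
```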
UTF8 display in Windows Command Prompt.
As I said above, writing from a Free Pascal program to Windows Command Prompt without taking care of non-ASCII characters results in a meaningless display of these characters. As an example, here is a screenshot of my "3-month multi-lingual calendar", where the accented characters and umlauts are displayed as one or two symbols that depend on the Windows System code page.
Incorrect UTF8 display in Windows Command Prompt
In the original version of the program, I used "e" instead of every "e with accent", and "ae", "oe" and "ue" instead of the umlauts. Just a work-around, not a solution. Later, when I searched the Free Pascal Wiki, I discovered that the UTF8 issue in my "3-month multi-lingual calendar" program could be fixed without any code page conversions or other function calls, simply by including the LazUTF8 unit in the program. In fact, with this unit included, everything is handled by Lazarus itself: With every call to the Write procedure, Lazarus converts the text to be displayed from UTF8 to the current Windows System code page (and my "é" and "ä" were automatically displayed correctly). But then, running the program on my new laptop, with a newer version of Windows 10, all non-ASCII characters appeared as the same two "undefined" characters. I spent lots of time searching forums and other Internet sites for an explanation of why some characters are displayed correctly in some cases and not in others. And I think that I have found a solution for most situations.
There are two things that you always have to do when dealing with non-ASCII characters in command line programs:
  • Use the string type for all data written to the console. So, do not use the UTF8String or the UnicodeString type. And if you define one-character constants with a non-ASCII value, better be careful and type them as string, too.
  • Always declare the usage of the LazUTF8 unit, even if your code doesn't include any call to its functions.
Simply adding the LazUTF8 unit to the uses statement of a console program is not enough, however. When I tried to build my calendar program, the compilation aborted with a Unit not found error message!
Compilation abortion due to LazUTF8 unit not found
Like a Linux distribution or a Perl distribution, the Lazarus development environment is based on packages. These are "collections" of units, installed and upgraded as a whole, taking into account the other packages they depend on and the minimum versions of these dependencies. In order to build an application or a program that uses a given unit, the installation of one or more packages may be required. Old units like Crt, Dos or Graph, as well as newer units such as SysUtils, DateUtils or Math, do not require the installation of a particular package, and Lazarus knows all it needs to know to build a program that uses them. For other units, Lazarus must be told explicitly that a given package is required. The best known example is the LCL package, needed to build any GUI application. The error message that I got when trying to build my program tells us that the LazUtils package is required to use the LazUTF8 unit. This package, automatically added to the project when needed in a GUI application, is not automatically available for command line programs. To add a package requirement to a project, choose Project > Project Inspector in the Lazarus menu bar, then, in the Project Inspector window, select New requirement.
Lazarus Project Inspector: Adding a new requirement
The New requirement window opens and shows an impressive list of packages: those shipped with Lazarus by default and, maybe, packages that you installed yourself using the Package Manager or that you found on the Internet. As I said, some of them (such as LCL) are used by GUI applications in general, others are meant for specific kinds of applications. In our case, choose LazUtils in the list and push the OK button.
Adding the LazUtils package to a command line program
With the LazUtils package requirement added, I rebuilt the command line program: Success!
Successful build of a UTF8 command line program
And rerunning the "3-month multi-lingual calendar" program: Mäerz, Abrëll, MÉ, DË ... all characters were now displayed correctly.
Correct UTF8 display in Windows Command Prompt
Until one day, running the program on my new Dell, I got this:
Incorrect UTF8 display in Windows Command Prompt on latest Windows 10 versions
It took some time before I figured out how output to Windows Command Prompt really works, which factors have to be considered when writing non-ASCII characters, and why a character is displayed correctly in some cases, as two boxes with a question mark in others, and sometimes as some characters depending on the current code page. Here is how I see things, and what should make it possible to display your non-English text correctly in nearly all cases. Which character is actually displayed depends on the following:
  1. The way Lazarus interprets and converts characters when building the program.
  2. The System code page.
  3. The version of Windows.
  4. Whether or not the Crt unit is used.
Concerning point 1, we already saw it: When using non-ASCII characters, the usage of the LazUTF8 unit is mandatory (and all variables and constants to be displayed should be declared as strings).
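The two rules of point 1 can be sketched as follows. This is only an illustration, not the code of my calendar program: the program name Utf8Demo and the constant names are hypothetical, and it is assumed that the source file is saved as UTF8 (the default in the Lazarus editor) and that the LazUtils package has been added as a project requirement.

```pascal
program Utf8Demo;       // hypothetical program name

{$mode objfpc}{$H+}

uses
  LazUTF8;              // needs the LazUtils package as project requirement;
                        // included even though none of its functions is called

const
  // All text to be displayed is typed as string,
  // not as UTF8String or UnicodeString
  Months: array[1..2] of string = ('Mäerz', 'Abrëll');
  // One-character non-ASCII constants are typed as string, too
  EAcute: string = 'é';

var
  I: Integer;
begin
  for I := Low(Months) to High(Months) do
    WriteLn(Months[I]);
  WriteLn('One-character constant: ', EAcute);
end.
```

Whether the letters are then displayed correctly still depends on the other three points of the list.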
The display in Command Prompt is not UTF-8 encoded, but depends on the current System code page, which itself depends on the System locale, i.e. the language used for non-Unicode applications, such as Command Prompt. The System locale is set in the Administrative tab of Control Panel > Region. If you use a Latin based character set, you should set the System locale to English (United States), even if your Windows language is German or French. Doing so sets the System code page to OEM 437, the character set of the original IBM PC, where character codes 128 to 168 include all major accented letters used in European languages. With the LazUTF8 unit being used, all these letters will display correctly in Command Prompt (and it was with these settings that my calendar program displayed proper Luxembourgish days of the week on my old HP laptop). This works whether or not you use the Crt unit. On the other hand, all characters not defined in OEM 437 (such as ω, Д, or א) will result in "garbage" display. Note that you can display the current System code page in Command Prompt by running the command chcp (without parameters).
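The code pages can also be queried from within a program. The following sketch assumes the Windows API functions GetACP, GetOEMCP and GetConsoleOutputCP, as declared in the Free Pascal Windows unit; the program name ShowCodePages is hypothetical, and the code is an illustration only, not part of my calendar program.

```pascal
program ShowCodePages;  // hypothetical program name

{$mode objfpc}{$H+}

{$IFDEF WINDOWS}
uses
  Windows;              // declares GetACP, GetOEMCP, GetConsoleOutputCP
{$ENDIF}

begin
  {$IFDEF WINDOWS}
  // ANSI code page of the system
  WriteLn('ANSI code page:           ', GetACP);
  // OEM code page of the system (437 for English (United States))
  WriteLn('OEM code page:            ', GetOEMCP);
  // Code page currently used for console output
  // (should correspond to what chcp reports)
  WriteLn('Console output code page: ', GetConsoleOutputCP);
  {$ELSE}
  WriteLn('This sketch only applies to Windows.');
  {$ENDIF}
end.
```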
Microsoft is currently working on a way to solve the display issues in Command Prompt. There is a new terminal application available in Microsoft Store; I tried it and got the same problems with my calendar as on my Dell. Another approach, currently at beta stage, is to give Command Prompt worldwide language support. This feature, activated by default on the latest versions of Windows 10, may be selected or unselected in the Administrative tab of Control Panel > Region. It sets the System code page to the pseudo UTF8 code page 65001. This works fine with Free Pascal Write statements. Not only all Latin based characters, but also letters of the Greek, Cyrillic or Arabic alphabets are displayed correctly (I did not try with Asian languages, but I suppose that it works, too).
There is an issue, however: Code page 65001 is not compatible with some legacy code, and such legacy code is used in the Free Pascal Crt unit. This means that if you use the Crt unit in your programs, you can't use code page 65001 (the result of doing so is what I got when running my calendar program on my new Dell, where worldwide language support was activated by default). This is no issue if you only need Latin based characters: Use OEM 437 by unchecking the UTF8 option in the System locale settings. On the other hand, if your command line programs do not use Crt, enabling worldwide language support ensures that you can write to Command Prompt without having to worry about code page issues.
Windows System locale settings: Activating or not worldwide language support
Testing console programs with command line parameters.
Testing console programs with command line parameters is not really a problem. First, you can build your program and then run it from Command Prompt with the actual parameters. Second, you can fill the command line variables by adding assignments during the test phase. The first of these procedures has the disadvantage that you have to switch to Command Prompt each time you want to run the program. The second one allows you to stay within Lazarus, but the code that reads the command line parameters will not be tested. Being able to tell Lazarus directly to run a program with given command line parameters would be convenient. And, of course, this possibility exists.
In the Lazarus menu bar, choose Run > Run Parameters.
Specifying command line parameters in Lazarus [1]
The Run Parameters window opens, and you can specify your command line parameters in the Command line parameters field, just the way you would do it at the command line. Note that in this window, you can set several other options, such as starting your program from another program, setting an output display (Linux only) or defining a working directory (useful in particular if your program uses files stored in a given directory).
Specifying command line parameters in Lazarus [2]
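A small test program makes it easy to verify that the parameters entered in the Run Parameters window actually arrive in the program, and that the code reading them works. The following is only a sketch: the program name ParamCheck and the expected year and month parameters are hypothetical, chosen for illustration.

```pascal
program ParamCheck;     // hypothetical program name

{$mode objfpc}{$H+}

uses
  SysUtils;             // provides TryStrToInt and ExtractFileName

var
  Year, Month: Integer;
begin
  // Exactly two parameters are expected (assumption of this sketch)
  if ParamCount <> 2 then
  begin
    WriteLn('Usage: ', ExtractFileName(ParamStr(0)), ' <year> <month>');
    Halt(1);
  end;
  // Both parameters must be integers, the month between 1 and 12
  if (not TryStrToInt(ParamStr(1), Year))
    or (not TryStrToInt(ParamStr(2), Month))
    or (Month < 1) or (Month > 12) then
  begin
    WriteLn('Invalid parameters: ', ParamStr(1), ' ', ParamStr(2));
    Halt(2);
  end;
  WriteLn('Year = ', Year, ', month = ', Month);
end.
```

Entering, for example, 2024 3 in the Command line parameters field and running the program from within Lazarus exercises the same code path as starting it from Command Prompt with these two parameters.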
Testing console programs with given environment variables.
Environment variables are special operating system variables that contain information about the OS itself, about the current user and their private directories, about the paths to be searched for executables and the ones corresponding to Program Files and Program Files (x86), in a similar way as on a webserver, where environment variables contain information about the server and website access. Sometimes it is necessary to read these variables, which may be done using the GetEnv function, included in the Dos unit. There are also cases where you want to run your program with environment variables different from those set on your system. Examples are testing a program that takes different actions depending on the platform (32bit or 64bit), or a program that uses data from a user library (and thus needs to know the computer user name).
There is no need to change your real environment variables. Just tell Lazarus to run your program with the environment variables set to the values you want. To do this, choose Run > Run Parameters as before and set your test environment variables in the Environment tab, by adding them to the User overrides section.
Specifying testing environment variables in Lazarus
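To try out the overrides, a program along the following lines can be used. It is a sketch under stated assumptions: the program name EnvCheck is hypothetical, and USERNAME and PROCESSOR_ARCHITECTURE are the standard Windows variables for the user name and the platform; on other systems, other variable names apply.

```pascal
program EnvCheck;       // hypothetical program name

{$mode objfpc}{$H+}

uses
  Dos;                  // provides GetEnv

var
  UserName, Arch: string;
begin
  // GetEnv returns an empty string if the variable is not set
  UserName := GetEnv('USERNAME');
  Arch := GetEnv('PROCESSOR_ARCHITECTURE');

  if UserName = '' then
    WriteLn('USERNAME is not set')
  else
    WriteLn('User: ', UserName);

  // Take different actions depending on the platform
  if Arch = 'AMD64' then
    WriteLn('Running on a 64bit platform')
  else if Arch = 'x86' then
    WriteLn('Running on a 32bit platform')
  else
    WriteLn('Platform not recognized: "', Arch, '"');
end.
```

Adding, for example, PROCESSOR_ARCHITECTURE with the value x86 to the User overrides section lets you test the 32bit branch on a 64bit machine, without touching the system settings.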

