CGI (Common Gateway Interface)
CGI stands for Common Gateway Interface. It is a standard that lets external gateway programs interface with information servers (such as HTTP servers).
CGI applications are usually written in scripting languages such as Perl, although nowadays they are written in other languages as well. They generally get the file extension `.cgi`, but they can also end in others, such as `.py` for Python.
It is used when the webserver needs to interact dynamically with a user. Usually the user fills in a form and submits it to the server; the CGI script retrieves the data, processes it, and returns the result to the webserver, which passes it on to the user.
- applications run on the server
- reusable pieces of code
- well-defined standard supported by most web servers
- interface is consistent, and programs can be written in many languages such as C, C++, Python, Java, and Perl.
- person writing the CGI can write it independently of the OS which the server uses
- simple basic way of passing information about the user's request from the webserver to the application program and getting a response back.
- by default, CGI scripts run in the security context of the server.
This webserver's CGI directive, `cgi_pass`, can be set in 2 different ways:
- 2 arguments (extension, executable), e.g. `cgi_pass .pl /usr/bin/perl;`. If the request targets a directory (the CGI bin), the server goes into it, finds the first CGI script with the given extension, and executes it with the given executable. If the request targets a specific CGI script, the server checks whether its extension matches and, if so, runs it. This form can be used when the script itself is not an executable.
- 1 argument (executable), e.g. `cgi_pass ./get_query.pl;`. The server executes the given CGI executable if the request targets the CGI directory or if that specific CGI is requested.
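To make the two forms concrete, here is a minimal Python sketch of how such a directive line could be split into its one- or two-argument form. The names `CgiPass` and `parse_cgi_pass` are hypothetical and chosen for illustration; this is not the webserver's actual parser.

```python
from typing import NamedTuple, Optional


class CgiPass(NamedTuple):
    """Hypothetical container for a parsed cgi_pass directive."""
    extension: Optional[str]  # e.g. ".pl"; None for the one-argument form
    executable: str           # interpreter, or the CGI executable itself


def parse_cgi_pass(line: str) -> CgiPass:
    """Split a cgi_pass line into its one- or two-argument form."""
    tokens = line.strip().rstrip(";").split()
    if not tokens or tokens[0] != "cgi_pass":
        raise ValueError("not a cgi_pass directive")
    args = tokens[1:]
    if len(args) == 2:
        # Two arguments: (extension, executable)
        return CgiPass(extension=args[0], executable=args[1])
    if len(args) == 1:
        # One argument: the executable to run
        return CgiPass(extension=None, executable=args[0])
    raise ValueError("cgi_pass takes 1 or 2 arguments")


print(parse_cgi_pass("cgi_pass .pl /usr/bin/perl;"))
print(parse_cgi_pass("cgi_pass ./get_query.pl;"))
```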
The CGI script is usually found at a certain URL, and requesting that URL runs the script. Generally speaking, the process follows these steps:
- A directory is created within the webserver to contain the scripts. This folder is usually called `cgi-bin`.
- The user sends a request to the server in the form of `http://mywebsite.com/cgi-bin/mycgiscript.pl`.
- The server recognises that the requested file is a CGI script and, instead of sending back the file, runs the script and passes the output of the script to the web client.
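As an illustration of what the server would run in the last step, here is a minimal CGI script sketched in Python. The `name` query parameter and the `build_response` helper are assumptions made for this example; any script that prints headers, a blank line, and then a document follows the same shape.

```python
#!/usr/bin/env python3
import os
import sys
from html import escape
from urllib.parse import parse_qs


def build_response(environ) -> str:
    """Build a CGI response: headers, a blank line, then the document."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    # "name" is a made-up parameter for this example
    name = query.get("name", ["world"])[0]
    body = f"<html><body><h1>Hello, {escape(name)}!</h1></body></html>\n"
    headers = (
        "Content-Type: text/html\n"
        f"Content-Length: {len(body.encode())}\n"
    )
    return headers + "\n" + body


if __name__ == "__main__":
    # The server captures whatever the script writes to standard output
    sys.stdout.write(build_response(os.environ))
```

Placed in `cgi-bin` and made executable, a script like this would be run by the server instead of being sent back as a file.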
When an HTTP server receives a request for a CGI script, the server gives the script the details of the request. There are 4 major ways in which an HTTP server and a CGI script communicate:
- Environment variables: the HTTP server uses environment variables to pass information about the request to the CGI script. Depending on the type of request, the variables may or may not contain the information the script requires to function properly.
- The command line: mostly used for `isindex` queries. However, `isindex` queries are discouraged because communicating directly with the command line can cause security risks.
- Standard input: used for HTTP `POST` and `PUT` requests. The HTTP server passes the information to the CGI script via standard input. The amount of information written to standard input is stored in the `CONTENT_LENGTH` environment variable.
- Standard output: a script returns its output on standard output. The output can be a document generated by the script, or instructions to the server for retrieving the desired output.
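The standard input channel can be sketched from the script's side as follows. This is a hedged Python example, assuming a URL-encoded form body; `read_request_body` and `parse_form` are hypothetical helper names. The key point is that the script reads exactly `CONTENT_LENGTH` bytes rather than reading until end of file.

```python
import io
from urllib.parse import parse_qs


def read_request_body(environ, stdin) -> bytes:
    """Read exactly CONTENT_LENGTH bytes from a binary stdin stream."""
    try:
        length = int(environ.get("CONTENT_LENGTH") or 0)
    except ValueError:
        length = 0  # treat a malformed length as "no body"
    return stdin.read(length) if length > 0 else b""


def parse_form(environ, stdin) -> dict:
    """Decode a URL-encoded POST body into a dict of value lists."""
    return parse_qs(read_request_body(environ, stdin).decode("utf-8"))


# Stand-ins for os.environ and sys.stdin.buffer, so the sketch runs anywhere
fake_env = {"CONTENT_LENGTH": "7"}
fake_stdin = io.BytesIO(b"a=1&b=2 trailing bytes the script must not read")
print(parse_form(fake_env, fake_stdin))
```

In a real script the call would be `parse_form(os.environ, sys.stdin.buffer)`.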
The script can return the following HTTP headers on standard output:
HTTP Header | Description |
---|---|
Content-type: string | Format (MIME type) of the file being returned, given as a string. |
Content-length: string | Length of the returned data in bytes. Used by the browser to determine how much time is needed to download the result. Used in the POST method |
Location: URL string | Can be used to redirect a request to any file. The URL string specifies the URL to be returned instead of the URL that was requested |
Expires: Date string | Date used by the browser to determine when the page expires and needs refreshing. The format is: dd mon yyyy hh:mm:ss. NOT USED |
Set-Cookie: string | Cookie to be set, passed as a string. NOT USED |
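On the server side, the captured output has to be split back into these headers and the document that follows them. A minimal Python sketch, assuming the script separates headers from the body with a blank line; `split_cgi_output` is a hypothetical name, not a function of this webserver:

```python
def split_cgi_output(raw: bytes):
    """Split raw CGI output into (headers dict, body bytes)."""
    # Accept either CRLF or bare LF line endings; use whichever blank
    # line appears first so a later one inside the body is ignored.
    found = [(raw.find(sep), sep) for sep in (b"\r\n\r\n", b"\n\n")]
    found = [(pos, sep) for pos, sep in found if pos != -1]
    if not found:
        raise ValueError("CGI output has no blank line after the headers")
    pos, sep = min(found, key=lambda item: item[0])
    head, body = raw[:pos], raw[pos + len(sep):]

    headers = {}
    for line in head.splitlines():
        name, colon, value = line.partition(b":")
        if not colon:
            raise ValueError(f"malformed header line: {line!r}")
        # Header names are case-insensitive, so store them lower-cased
        key = name.strip().decode("latin-1").lower()
        headers[key] = value.strip().decode("latin-1")
    return headers, body


headers, body = split_cgi_output(b"Content-type: text/html\n\n<h1>hi</h1>")
print(headers, body)
```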
CGI programs also use environment variables. All CGI programs have access to the following variables:
Variable | Description |
---|---|
CONTENT_TYPE | Used when a file is uploaded from the user. Gives the data type of the attached content |
HTTP_USER_AGENT | Gives info about the browser that initiated the request |
QUERY_STRING | Used in GET requests. It is the URL-encoded information sent from the browser |
CONTENT_LENGTH | Used with POST requests. Gives the length of the query information |
SCRIPT_NAME | Name of the CGI script |
PATH_INFO | Full path where the CGI script is kept |
DOCUMENT_ROOT | Root of the document provided |
REMOTE_HOST | Host of the request |
SCRIPT_FILENAME | File name of the script |
SERVER_NAME | Name of the server (host) |
SERVER_PORT | Port where the server is located (port) |
SERVER_PROTOCOL | Always HTTP/1.1 in our case |
SERVER_SOFTWARE | foodserv in this case |
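A rough Python sketch of how a server could fill in part of this table from a request. The function name `build_cgi_env`, its parameters, and the assumption that request headers arrive as a dict with lower-cased names are all illustrative; a `Host` header without a port is assumed to mean port 80, and IPv6 hosts are not handled.

```python
def build_cgi_env(uri: str, headers: dict, script_filename: str,
                  document_root: str) -> dict:
    """Build a subset of the CGI environment from request data."""
    path, _, query = uri.partition("?")
    host, _, port = headers.get("host", "").partition(":")
    env = {
        "QUERY_STRING": query,
        "SCRIPT_NAME": path,
        "SCRIPT_FILENAME": script_filename,
        "DOCUMENT_ROOT": document_root,
        "SERVER_NAME": host,
        "SERVER_PORT": port or "80",
        "SERVER_PROTOCOL": "HTTP/1.1",
        "SERVER_SOFTWARE": "foodserv",
    }
    # These variables are only set when the matching header was sent
    optional = (
        ("content-type", "CONTENT_TYPE"),
        ("content-length", "CONTENT_LENGTH"),
        ("user-agent", "HTTP_USER_AGENT"),
    )
    for header, variable in optional:
        if header in headers:
            env[variable] = headers[header]
    return env


env = build_cgi_env(
    "/cgi-bin/mycgiscript.pl?a=1&b=2",
    {"host": "mywebsite.com:8080", "user-agent": "curl/8.0"},
    "/var/www/cgi-bin/mycgiscript.pl",  # made-up path for the example
    "/var/www",
)
print(env["QUERY_STRING"], env["SERVER_PORT"])
```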
The CGI class takes the request class and finds the requested URI and its appropriate target configuration.
It then runs the `setup()` function, which creates the absolute path starting at the current directory (used for path finding) and sets up the argv and envp lists.
These functions also validate that:
- the file exists
- the file is an executable
- the file is allowed to execute according to the config file

If these checks fail, the class throws an error.
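The three checks can be sketched in Python as follows. This is not the project's code: `validate_cgi_target` and `CgiValidationError` are hypothetical names, and "allowed by the config file" is modelled here as a set of permitted file extensions, which is an assumption.

```python
import os


class CgiValidationError(Exception):
    """Raised when a CGI target fails one of the checks."""


def validate_cgi_target(path: str, allowed_extensions) -> str:
    """Return the absolute path of a CGI target, or raise if a check fails."""
    # Absolute path built from the current directory
    absolute = os.path.abspath(path)
    if not os.path.isfile(absolute):
        raise CgiValidationError(f"no such file: {absolute}")
    if not os.access(absolute, os.X_OK):
        raise CgiValidationError(f"not executable: {absolute}")
    # Assumption: the config lists the extensions that may be executed
    if os.path.splitext(absolute)[1] not in allowed_extensions:
        raise CgiValidationError(f"not allowed by config: {absolute}")
    return absolute
```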
If this goes well, it proceeds to the `execute()` function, which executes the CGI and captures the output.
It creates two pipe fds: one for reading (to capture the output) and one for writing (for POST). If the request is not a POST, it immediately closes the writing fd.
Afterwards it forks. In the child process, the argument and environment variable arrays are built, the appropriate fds are dupped and/or closed, and the CGI script is executed with `execve`. If the execution fails, the child returns 1.
In the parent process, if the method is a POST, it writes the POST data to the write fd (which is linked to the stdin of the child). It then retrieves the output from the child's stdout and stores it for later. It also returns the exit code: 0 on success (leading to HTTP code 200); otherwise it throws a 502 Bad Gateway.
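The fork, pipe, and `execve` flow described above can be sketched with Python's `os` module, which wraps the same POSIX calls. This is an illustration under assumptions, not the project's implementation: `run_cgi` and `http_status_for_exit` are hypothetical names, the sketch uses two separate pipes (one per direction), and it writes the whole body before reading, which could block for very large bodies and outputs.

```python
import os
import sys


def run_cgi(argv, env, body=b""):
    """Run a CGI program; return (exit_code, captured_stdout)."""
    in_r, in_w = os.pipe()    # parent writes the POST body -> child stdin
    out_r, out_w = os.pipe()  # child stdout -> parent reads the output
    pid = os.fork()
    if pid == 0:
        # Child: wire the pipes to stdin/stdout, then replace the process
        try:
            os.dup2(in_r, 0)
            os.dup2(out_w, 1)
            for fd in (in_r, in_w, out_r, out_w):
                os.close(fd)
            os.execve(argv[0], argv, env)
        finally:
            os._exit(1)  # only reached if setup or execve failed

    # Parent: close the ends that belong to the child
    os.close(in_r)
    os.close(out_w)
    try:
        offset = 0
        while offset < len(body):
            offset += os.write(in_w, body[offset:])
    except BrokenPipeError:
        pass  # the child exited without reading its input
    os.close(in_w)  # signals end of input to the child

    chunks = []
    while True:
        chunk = os.read(out_r, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(out_r)

    _, status = os.waitpid(pid, 0)
    code = os.WEXITSTATUS(status) if os.WIFEXITED(status) else 1
    return code, b"".join(chunks)


def http_status_for_exit(code: int) -> int:
    """Exit code 0 maps to 200; anything else maps to 502 Bad Gateway."""
    return 200 if code == 0 else 502


# Demo: the Python interpreter stands in for a CGI script that echoes stdin
demo = [sys.executable, "-c",
        "import sys; sys.stdout.write(sys.stdin.read().upper())"]
print(run_cgi(demo, dict(os.environ), b"hello"))
```

For a request that is not a POST, `body` stays empty, so the write end is closed straight away and the child sees end of input immediately.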
Common Gateway Interface (CGI) – How it Works, Features & Applications