HTTP
Introduction
HTTP (Hypertext Transfer Protocol) is the protocol used to retrieve web pages from a server (normally on port 80) and to send information entered in a form back to a server. It is perhaps the most complex of the protocols that we will meet.
Like other protocols, this one can be explored using Telnet to act as a primitive web browser, sending and receiving information according to the protocol.
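In outline, the HTTP protocol works as follows: the client opens a connection to the server, the client sends a request, the server sends a response, and the server closes the connection.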
Thus a separate connection is used for each request.
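One way to see why each request needs its own connection is to play both roles with plain sockets. The following is a minimal sketch, not a real web server: the helper names `serve_once`, `fetch` and `demo` are hypothetical, and a throwaway server on the loopback address stands in for a real host so that the example does not depend on the network.

```python
import socket
import threading


def serve_once(listener: socket.socket) -> None:
    """Toy server: accept one connection, answer one request, then close."""
    conn, _ = listener.accept()
    with conn:
        data = b""
        # Read until the empty line that ends the request.
        while b"\r\n\r\n" not in data:
            chunk = conn.recv(1024)
            if not chunk:
                break
            data += chunk
        body = b"<HTML>hello</HTML>"
        response = b"\r\n".join([
            b"HTTP/1.0 200 OK",
            b"Content-Type: text/html",
            b"Content-Length: %d" % len(body),
            b"",            # empty line between header and contents
            body,
        ])
        conn.sendall(response)
    # Leaving the with-block closes the connection.


def fetch(host: str, port: int, path: str) -> bytes:
    """Client: connect, send one request, read until the server closes."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(f"GET {path} HTTP/1.0\r\n\r\n".encode("ascii"))
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:   # empty read: the server has closed the connection
                break
            chunks.append(chunk)
    return b"".join(chunks)


def demo() -> bytes:
    """Run the toy server in a thread and fetch one page from it."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))     # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    thread = threading.Thread(target=serve_once, args=(listener,), daemon=True)
    thread.start()
    try:
        return fetch("127.0.0.1", port, "/index.html")
    finally:
        thread.join(timeout=5)
        listener.close()


if __name__ == "__main__":
    print(demo().decode("ascii"))
```

The client knows the response is complete only because the server closes the connection, so a second request would need a second call to `fetch`.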
The response code 200 OK is the most common response, signaling that the request was successful. There are many response codes. They are grouped as shown below:
Response code | Meaning
200 - 299 | success
300 - 399 | web browser needs to go to another page
400 - 499 | client error
500 - 599 | server error
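The grouping can be expressed as a small lookup on the numeric range. This is only a sketch: the function name `status_group` is hypothetical, and codes outside the four ranges in the table are simply reported as "other".

```python
def status_group(code: int) -> str:
    """Return the meaning of the range that a response code falls in."""
    if 200 <= code <= 299:
        return "success"
    if 300 <= code <= 399:
        return "web browser needs to go to another page"
    if 400 <= code <= 499:
        return "client error"
    if 500 <= code <= 599:
        return "server error"
    return "other"   # not covered by the table above


if __name__ == "__main__":
    for code in (200, 301, 404, 503):
        print(code, status_group(code))
```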
Some common response codes are:
Common Response Codes
Response code | Meaning
200 OK | The request was successful.
301 Moved Permanently | The page has moved to a new URL.
304 Not Modified | The client asked for the page only if it had changed, and it has not changed since the client's copy was retrieved.
400 Bad Request | The request has faulty syntax.
401 Unauthorized | Authorization is needed to access this page. Either the authorization is wrong or it has not been supplied.
404 Not Found | The server cannot find the page. This is a common error.
503 Service Unavailable | The server is temporarily unable to handle the request, perhaps because of maintenance or overloading.
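Many libraries already know the standard text that goes with each code. As an illustration, Python's standard `http.HTTPStatus` enumeration can turn a number into the text shown in the table; the wrapper name `describe` is a hypothetical helper for this sketch.

```python
from http import HTTPStatus


def describe(code: int) -> str:
    """Format a numeric code with its standard text, e.g. '404 Not Found'."""
    status = HTTPStatus(code)   # raises ValueError for a code it does not know
    return f"{status.value} {status.phrase}"


if __name__ == "__main__":
    for code in (200, 301, 304, 400, 401, 404, 503):
        print(describe(code))
```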
The Request
The client sends a request, for example:
GET /index.html HTTP/1.0
Accept: text/html
Accept: image/gif
User-Agent: Lynx/2.4
The request is a sequence of ASCII lines terminated by an empty line. On the first line, the first item is the request type (here GET) and, as we have seen, the second item is the path name. This is followed by the version of the HTTP protocol that the client understands. This first line is all that is required. However, the client can provide other information. Each piece of information is on a separate line and takes the form:
keyword: value
For example:
Accept: text/html
says that the client can accept HTML documents. Another example is:
Accept: image/gif
which again allows the server to tailor information to what the client is able to process. The client can also say which web browser and version it is, for example:
User-Agent: Lynx/2.4
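The layout of a request can be sketched as a function that assembles the lines. The name `build_request` and its parameters are hypothetical. The sketch assumes each line ends with a carriage return and line feed, which is the line ending HTTP uses on the wire, and it takes the headers as a list of pairs so that a keyword such as Accept can appear more than once.

```python
def build_request(path, headers=(), method="GET", version="HTTP/1.0") -> bytes:
    """Assemble a request: request line, 'keyword: value' lines, empty line."""
    lines = [f"{method} {path} {version}"]
    lines += [f"{keyword}: {value}" for keyword, value in headers]
    lines.append("")    # the empty line that terminates the request
    return "".join(line + "\r\n" for line in lines).encode("ascii")


if __name__ == "__main__":
    request = build_request(
        "/index.html",
        [("Accept", "text/html"),
         ("Accept", "image/gif"),
         ("User-Agent", "Lynx/2.4")],
    )
    print(request.decode("ascii"), end="")
```

Calling `build_request("/index.html")` with no headers gives the shortest valid request: the first line followed by the empty line.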
There are other request types in addition to GET:
HEAD retrieves only the header for the file, not its contents, so that the browser can see whether the file has been updated since it last retrieved a copy
POST is used in conjunction with forms and CGI (see later).
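The difference between GET and HEAD can be observed with Python's standard `http.client` and `http.server` modules. This is a sketch under stated assumptions: `DemoHandler`, `PAGE`, `request` and `demo` are hypothetical names, and the server is a local stand-in that returns the same small page for every path.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<HTML><BODY>hello</BODY></HTML>"


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that serves the same small page for every path."""

    def _send_headers(self) -> None:
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()

    def do_GET(self) -> None:
        self._send_headers()
        self.wfile.write(PAGE)      # GET: header followed by the page

    def do_HEAD(self) -> None:
        self._send_headers()        # HEAD: header only, no contents

    def log_message(self, format, *args) -> None:
        pass                        # keep the demo quiet


def request(method: str, port: int):
    """Send one request; return (status, Content-Length header, body)."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request(method, "/index.html")
        response = conn.getresponse()
        return (response.status,
                response.getheader("Content-Length"),
                response.read())
    finally:
        conn.close()


def demo():
    """Start the local server, then issue a GET and a HEAD for the same page."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)   # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        return request("GET", port), request("HEAD", port)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    for result in demo():
        print(result)
```

Both requests report the same status and Content-Length, but only the GET returns the page itself.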
The Response
The response consists of a number of header lines, followed by an empty line, followed by the contents of the file, usually HTML. For example:
HTTP/1.1 200 OK
Date: Mon, 12 Jul 1999 12:42:22 GMT
Server: Apache/1.3.6 (Unix)
Last-Modified: Wed, 07 Jul 1999 17:14:42 GMT
ETag: "fcdd-17e-37838b02"
Accept-Ranges: bytes
Content-Length: 382
Connection: close
Content-Type: text/html

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<HTML>
<HEAD>
etc.
The first line gives the HTTP version number and a response code (see above).
The third line gives the name and version number of the server program.
The last line of the header specifies the MIME type of the content.
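A response can be taken apart along the same lines. The sketch below is illustrative only: the function name `parse_response` is hypothetical, it assumes the whole response is already in memory and is well formed, and it keeps only the last value when a header keyword is repeated.

```python
def parse_response(raw: bytes):
    """Split a raw response into version, code, reason, headers and body."""
    # The empty line separates the header from the contents of the file.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    # First line: HTTP version, response code, and its text.
    version, code, reason = lines[0].split(" ", 2)

    headers = {}
    for line in lines[1:]:
        keyword, _, value = line.partition(":")   # split at the first colon only
        headers[keyword.strip().lower()] = value.strip()
    return version, int(code), reason, headers, body


if __name__ == "__main__":
    raw = (
        b"HTTP/1.1 200 OK\r\n"
        b"Server: Apache/1.3.6 (Unix)\r\n"
        b"Content-Type: text/html\r\n"
        b"\r\n"
        b"<HTML>"
    )
    print(parse_response(raw))
```

Header keywords are stored in lower case here because HTTP treats them as case-insensitive, so `headers["content-type"]` gives the MIME type of the content.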