HTTP Messages
These notes cover HTTP messages and their structure. This will be useful for understanding how to parse and compose these messages. While we already re-use the existing HTTP parser from the Node.js folks, it can't hurt to understand it better!
Messages form the basic unit of communication between client and server. There are two types of messages: requests and responses. Requests are issued from the client to the server, while responses are issued from the server to the requesting client. A message is composed of a start line, zero or more headers, an empty line, and optionally a message-body.
In the interest of robustness, servers SHOULD ignore any empty line(s) received where a Request-Line is expected. In other words, if the server is reading the protocol stream at the beginning of a message and receives a CRLF first, it should ignore the CRLF.
... an HTTP/1.1 client MUST NOT preface or follow a request with an extra CRLF.
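A minimal sketch of the server-side tolerance described above, assuming the bytes read so far are held in memory. The function name `skip_leading_empty_lines` is hypothetical and is not part of any existing parser.

```python
def skip_leading_empty_lines(data: bytes) -> bytes:
    """Drop any CRLF sequences that appear where a Request-Line is expected."""
    while data.startswith(b"\r\n"):
        data = data[2:]
    return data
```

A client should never send these extra CRLFs, so this only matters on the receiving side.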
Both request and response messages begin with a start line which is terminated by a CRLF.
The Request-Line begins with a method token, followed by the Request-URI and the protocol version, and ending with CRLF. The elements are separated by SP characters. No CR or LF is allowed except in the final CRLF sequence.
Method <SP> Request-URI <SP> HTTP-Version <CRLF>
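To make the grammar concrete, here is a rough sketch of splitting a Request-Line into its three elements. The name `parse_request_line` and the choice to raise `ValueError` on malformed input are assumptions made for illustration; a real parser would likely validate the method token and URI more strictly.

```python
def parse_request_line(line: bytes):
    """Return (method, request_uri, http_version) from a raw Request-Line."""
    if not line.endswith(b"\r\n"):
        raise ValueError("Request-Line must end with CRLF")
    content = line[:-2]
    # No CR or LF is allowed except in the final CRLF.
    if b"\r" in content or b"\n" in content:
        raise ValueError("stray CR or LF inside Request-Line")
    parts = content.split(b" ")
    # Exactly three non-empty elements separated by single SP characters.
    if len(parts) != 3 or not all(parts):
        raise ValueError("expected: Method SP Request-URI SP HTTP-Version")
    method, uri, version = (p.decode("ascii") for p in parts)
    if not version.startswith("HTTP/"):
        raise ValueError("unrecognized HTTP-Version")
    return method, uri, version
```

For example, `parse_request_line(b"GET /index.html HTTP/1.1\r\n")` yields the method, the Request-URI, and the version as three strings.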
The first line of a Response message is the Status-Line, consisting of the protocol version followed by a numeric status code and its associated textual phrase, with each element separated by SP characters. No CR or LF is allowed except in the final CRLF sequence.
HTTP-Version <SP> Status-Code <SP> Reason-Phrase <CRLF>
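The Status-Line can be handled in much the same way, with one wrinkle: the Reason-Phrase may itself contain spaces, so only the first two SP characters act as separators. The sketch below assumes a hypothetical helper named `parse_status_line`; it is not taken from any existing parser.

```python
def parse_status_line(line: bytes):
    """Return (http_version, status_code, reason_phrase) from a raw Status-Line."""
    if not line.endswith(b"\r\n"):
        raise ValueError("Status-Line must end with CRLF")
    text = line[:-2].decode("latin-1")
    if "\r" in text or "\n" in text:
        raise ValueError("stray CR or LF inside Status-Line")
    # Split on the first two spaces only; the rest is the Reason-Phrase.
    parts = text.split(" ", 2)
    if len(parts) != 3:
        raise ValueError("expected: HTTP-Version SP Status-Code SP Reason-Phrase")
    version, code, reason = parts
    if len(code) != 3 or not (code.isascii() and code.isdigit()):
        raise ValueError("Status-Code must be three digits")
    return version, int(code), reason
```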
A message may optionally include a block of headers following its start line. The last header line is followed by an empty line (CRLF). If no headers are provided the empty line must still be included.
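One way to picture the overall layout is to split a complete message at the first empty line. This is only a sketch: it assumes the whole message is already in memory, and the name `split_head_and_body` is hypothetical.

```python
def split_head_and_body(message: bytes):
    """Return (start_line, header_lines, body) for a complete raw message."""
    # The empty line shows up as two consecutive CRLFs.
    head, separator, body = message.partition(b"\r\n\r\n")
    if not separator:
        raise ValueError("message is missing the empty line after its headers")
    start_line, *header_lines = head.split(b"\r\n")
    return start_line, header_lines, body
```

With no headers at all, `header_lines` comes back as an empty list, but the empty line is still required.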
There are four categories of headers:
- General (RFC 2616 Section 4.5)
- Request (RFC 2616 Section 5.3)
- Response (RFC 2616 Section 6.2)
- Entity (RFC 2616 Section 7.1)
Each header field consists of a name followed by a colon (":") and the field value.
Field-Name ":" Field-Value
Field names are case-insensitive.
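In practice, case-insensitivity means lookups should normalize names before comparing them. A small sketch, assuming headers are kept as a list of `(name, value)` pairs; both that representation and the name `get_header_values` are assumptions made for illustration.

```python
def get_header_values(headers, name):
    """Return every value whose field name matches `name`, ignoring case."""
    wanted = name.lower()
    return [value for field_name, value in headers if field_name.lower() == wanted]
```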
The field value MAY be preceded by any amount of LWS, though a single SP is preferred.
name:value       <-- Valid
name:     value  <-- Valid
name: value      <-- Valid and preferred
Header fields can be extended over multiple lines by preceding each extra line with at least one SP or HT.
name: A very long
<SP> header value.
name: Another very long
<HT> header value this time using a tab.
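A sketch of how a parser might undo this folding before interpreting the fields. It assumes the header block has already been split into lines without their CRLFs, and it joins each continuation with a single SP, which is one permitted choice rather than the only one. The name `unfold_header_lines` is hypothetical.

```python
def unfold_header_lines(lines):
    """Merge continuation lines (starting with SP or HT) into the line before them."""
    unfolded = []
    for line in lines:
        if line[:1] in (" ", "\t"):
            if not unfolded:
                raise ValueError("continuation line with no header before it")
            # Replace the fold and its surrounding whitespace with a single space.
            unfolded[-1] = unfolded[-1].rstrip(" \t") + " " + line.strip(" \t")
        else:
            unfolded.append(line)
    return unfolded
```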
The field-content does not include any leading or trailing LWS...
Such leading or trailing LWS MAY be removed without changing the semantics of the field value.
abc: <SP> 123 <SP> --> value is "123" not " 123 "
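The rule above amounts to splitting at the first colon and trimming SP and HT from both ends of what follows. A minimal sketch, assuming the line has already been unfolded and stripped of its CRLF; `parse_header_field` is a hypothetical name.

```python
def parse_header_field(line: str):
    """Return (field_name, field_value) with leading and trailing LWS removed."""
    name, colon, value = line.partition(":")
    if not colon or not name:
        raise ValueError("header field must look like: Field-Name ':' Field-Value")
    return name, value.strip(" \t")
```

All three spellings `name:value`, `name:     value`, and `name: value` produce the same pair.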
Any LWS that occurs between field-content MAY be replaced with a single SP before interpreting the field value or forwarding the message downstream.
The order in which header fields with differing field names are received is not significant. However, it is "good practice" to send general-header fields first, followed by request-header or response-header fields, and ending with the entity-header fields.
Multiple message-header fields with the same field-name MAY be present in a message if and only if the entire field-value for that header field is defined as a comma-separated list.
name: value1,value2,value3
The order in which header fields with the same field-name are received is therefore significant to the interpretation of the combined field value, and thus a proxy MUST NOT change the order of these field values when a message is forwarded.
When composing a message, we must list the values in the combined field in the same order in which the developer appended them.
http_message_append_header("name", "value1");
http_message_append_header("name", "value2");
Results in a header that looks like...
name: value1,value2
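One possible way to get this behavior is to keep a per-name list of values and join them only when the message is composed. The sketch below is a Python rendering, not the actual implementation: the `HttpMessage` class and its methods are hypothetical stand-ins for `http_message_append_header`, and it assumes every repeated field is one whose value is defined as a comma-separated list.

```python
class HttpMessage:
    """Collects headers and preserves the order in which values were appended."""

    def __init__(self):
        # Lowercased name -> (name as first written, list of values in append order).
        # Plain dicts keep insertion order in Python 3.7+.
        self._headers = {}

    def append_header(self, name, value):
        key = name.lower()  # field names are case-insensitive
        if key not in self._headers:
            self._headers[key] = (name, [])
        self._headers[key][1].append(value)

    def header_lines(self):
        """Return one 'name: v1,v2' line per distinct field name."""
        return [f"{name}: {','.join(values)}" for name, values in self._headers.values()]


message = HttpMessage()
message.append_header("name", "value1")
message.append_header("name", "value2")
print("\r\n".join(message.header_lines()))
```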
Following the CRLF after the headers is the message-body. Not all messages have bodies. A body may be optional or not allowed at all, depending on the message type (request or response) and on what the method (e.g., GET, POST) allows.
The presence of a message-body in a request is signaled by the inclusion of a Content-Length or Transfer-Encoding header field in the request's message-headers.
... if the request method does not include defined semantics for an entity-body, then the message-body SHOULD be ignored when handling the request.
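For requests, the check reduces to looking for either of the two header fields. A sketch, assuming headers are available as `(name, value)` pairs; the name `request_has_body` is hypothetical.

```python
def request_has_body(headers):
    """True if Content-Length or Transfer-Encoding signals a message-body."""
    names = {name.lower() for name, _value in headers}
    return "content-length" in names or "transfer-encoding" in names
```

Whether the body is then used or ignored still depends on the semantics of the request method.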
All responses to the HEAD request method MUST NOT include a message-body, even though the presence of entity-header fields might lead one to believe they do.
All 1xx (informational), 204 (no content), and 304 (not modified) responses MUST NOT include a message-body. All other responses do include a message-body, although it MAY be of zero length.
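The response rules can be collected into a single predicate. This is a sketch under the assumption that the parser knows which request method the response answers; `response_has_body` is a hypothetical name.

```python
def response_has_body(request_method: str, status_code: int) -> bool:
    """Apply the rules above: HEAD, 1xx, 204, and 304 responses carry no body."""
    if request_method == "HEAD":  # method tokens are case-sensitive
        return False
    if 100 <= status_code <= 199 or status_code in (204, 304):
        return False
    # Every other response has a body, though it may be zero bytes long.
    return True
```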