Saturday, October 2, 2010

HTTP 1.1, RFC 2616 and reading comprehension

I've read with interest some documentation from Microsoft about how the HTTP 1.1 specification mandates some behavior. To quote:
WinInet limits connections to a single HTTP 1.0 server to four simultaneous connections. Connections to a single HTTP 1.1 server are limited to two simultaneous connections. The HTTP 1.1 specification (RFC2616) mandates the two-connection limit.

This seems to be saying that browsers are only allowed (via some mythical mandate) to use two connections per server and that any connections past two must block. After reading through the HTTP 1.1 specification (again), I'm troubled that many folks have seriously misinterpreted this passage. This is especially troubling because the manner in which RFCs are written is VERY explicit, and it is (for me) really easy to understand the difference between a requirement and a recommendation. What is even more troubling is that people quote the Microsoft reinterpretation of the specification as if it were a direct quote of the specification.

So for my example, the top of RFC 2616 states:
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [34].

If we chase down RFC 2119, we find:

1. MUST This word, or the terms "REQUIRED" or "SHALL", mean that the
definition is an absolute requirement of the specification.

2. MUST NOT This phrase, or the phrase "SHALL NOT", mean that the
definition is an absolute prohibition of the specification.

3. SHOULD This word, or the adjective "RECOMMENDED", mean that there
may exist valid reasons in particular circumstances to ignore a
particular item, but the full implications must be understood and
carefully weighed before choosing a different course.

4. SHOULD NOT This phrase, or the phrase "NOT RECOMMENDED" mean that
there may exist valid reasons in particular circumstances when the
particular behavior is acceptable or even useful, but the full
implications should be understood and the case carefully weighed
before implementing any behavior described with this label.

Then in the HTTP spec we see:
Clients that use persistent connections SHOULD limit the number of simultaneous connections that they maintain to a given server. A single-user client SHOULD NOT maintain more than 2 connections with any server or proxy. A proxy SHOULD use up to 2*N connections to another server or proxy, where N is the number of simultaneously active users. These guidelines are intended to improve HTTP response times and avoid congestion.
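To make the distinction mechanical, here is a minimal sketch that scans a passage for the RFC 2119 keywords quoted above and reports the requirement level of each one. The names `LEVELS`, `requirement_levels`, and `SPEC` are made up for this illustration, and the level labels are my own shorthand for the RFC 2119 definitions, not official terms.

```python
import re

# Shorthand labels for the RFC 2119 definitions quoted above.
LEVELS = {
    "MUST NOT": "absolute prohibition",
    "SHALL NOT": "absolute prohibition",
    "MUST": "absolute requirement",
    "REQUIRED": "absolute requirement",
    "SHALL": "absolute requirement",
    "SHOULD NOT": "recommendation (against)",
    "NOT RECOMMENDED": "recommendation (against)",
    "SHOULD": "recommendation",
    "RECOMMENDED": "recommendation",
}

# Longest keywords first so "SHOULD NOT" is tried before "SHOULD".
# Matching is case-sensitive: RFC keywords are written in capitals.
_PATTERN = re.compile(
    r"\b(" + "|".join(sorted(LEVELS, key=len, reverse=True)) + r")\b"
)


def requirement_levels(text):
    """Return (keyword, level) pairs in order of appearance."""
    return [(m.group(1), LEVELS[m.group(1)]) for m in _PATTERN.finditer(text)]


# The connection-limit passage from RFC 2616, as quoted above.
SPEC = (
    "Clients that use persistent connections SHOULD limit the number of "
    "simultaneous connections that they maintain to a given server. "
    "A single-user client SHOULD NOT maintain more than 2 connections with "
    "any server or proxy. A proxy SHOULD use up to 2*N connections to "
    "another server or proxy, where N is the number of simultaneously "
    "active users."
)

for keyword, level in requirement_levels(SPEC):
    print(f"{keyword}: {level}")
```

Every keyword it finds in the connection-limit passage is a SHOULD or a SHOULD NOT. There is not a single MUST in it, which is the whole point.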

What's my problem?
  • If someone has taken the time to formally define things in a certain context, it is professionally irresponsible to change the meaning of their statements.
  • If you are distributing technical documentation, make sure you have your facts right and use unambiguous language. Remember, not everyone speaks English as their native language, nor do they necessarily have the inclination to go chase down quoted sources.
  • If you are trying to cite documentation, chase down the originator; don't rely on second, third, fourth, or nth parties to give you your information unless you REALLY trust them.

Let's dissect a portion of the original quote:

Connections to a single HTTP 1.1 server are limited to two simultaneous connections.

Which of the following statements does this concretely assert?
  • An HTTP server will not accept more than two simultaneous connections.
  • An HTTP server might accept only two connections or might accept more.
  • Clients cannot make more than two simultaneous connections to the same server.
  • Clients can actually make more than two simultaneous connections, but we've limited them to two.
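The last reading is the one that matches the RFC: the cap is a policy the client chooses, not something the protocol enforces. Here is a rough sketch of that idea using a per-host semaphore. Everything in it (`ConnectionLimiter`, `max_per_host`, `peak_concurrency`, the fake request that just sleeps) is hypothetical and only simulates connections; it is not how WinInet or any real browser implements its limit.

```python
import threading
import time


class ConnectionLimiter:
    """Client-side cap on simultaneous 'connections' per host (illustrative)."""

    def __init__(self, max_per_host=2):
        # 2 mirrors the RFC 2616 recommendation; it is a setting, not a law.
        self.max_per_host = max_per_host
        self._lock = threading.Lock()
        self._semaphores = {}

    def _semaphore_for(self, host):
        with self._lock:
            if host not in self._semaphores:
                self._semaphores[host] = threading.BoundedSemaphore(
                    self.max_per_host
                )
            return self._semaphores[host]

    def run(self, host, work):
        # Callers past the cap block here until a slot frees up.
        with self._semaphore_for(host):
            return work()


def peak_concurrency(limit, requests=6):
    """Fire `requests` fake requests at one host; return the peak in flight."""
    limiter = ConnectionLimiter(max_per_host=limit)
    lock = threading.Lock()
    state = {"active": 0, "peak": 0}

    def fake_request():
        with lock:
            state["active"] += 1
            state["peak"] = max(state["peak"], state["active"])
        time.sleep(0.05)  # stand-in for network I/O
        with lock:
            state["active"] -= 1

    threads = [
        threading.Thread(target=limiter.run, args=("example.com", fake_request))
        for _ in range(requests)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["peak"]


if __name__ == "__main__":
    print("cap of 2:", peak_concurrency(2))
    print("cap of 6:", peak_concurrency(6))
```

With the cap at 2, the peak never exceeds two requests in flight. Change one number and the same client happily opens more, so nothing about HTTP itself was ever stopping it.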

For the lay person (other than perhaps lawyers), these distinctions probably seem like minute and petty semantic wrangling. For professional software developers they are, however, terribly important.

Why? Because computers don't do exactly what you want them to do; they do exactly what you tell them to do.

Reread that a couple of times please...

Any subjective interpretation you expect the computer to do on your behalf does NOT happen. Anybody who has used a computer has probably run into problems where the computer is not doing what they want and they cannot understand why. There are millions of lines of code you are interacting with, and their behavior is often specified with ambiguous language like the original paragraph. More importantly, those specifications are restated and modified via the "telephone game" effect until the original, sane requirements are completely lost.
