Professor, CSE Dept, KSIT
Professor, CAOS, IISC
HTTP Protocol Mechanisms for High Performance Applications
HTTP is the most widely used application-layer protocol today, and is supported by almost every device that connects to a network. As web pages continue to grow in size and complexity, web browsers and HTTP itself have evolved to ensure that end users can meaningfully engage with this rich content. This article describes how HTTP has evolved from a simple request-response paradigm to include mechanisms for high-performance applications.
In the original version of HTTP (now commonly referred to as HTTP/0.9), each request consists of a one-line ASCII character string (GET followed by the path of the resource being requested) terminated by CR/LF, and the response is also an ASCII character stream. There are no headers containing meta-information about requests and responses. A new TCP connection is opened for each request, and closed after the response is received.
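The one-line request format described above can be sketched in a few lines of Python. This is only an illustration of the format as stated in the text; the function names build_simple_request and parse_simple_request are hypothetical and do not come from any library.

```python
def build_simple_request(path: str) -> bytes:
    """Build the one-line request: GET, a space, the path, then CR/LF."""
    if not path.startswith("/"):
        raise ValueError("path must start with '/'")
    if any(ch in path for ch in " \r\n"):
        raise ValueError("path must not contain spaces or line breaks")
    return f"GET {path}\r\n".encode("ascii")


def parse_simple_request(line: bytes) -> str:
    """Return the requested path from a one-line request, or raise ValueError."""
    if not line.endswith(b"\r\n"):
        raise ValueError("request must be terminated by CR/LF")
    method, sep, path = line[:-2].decode("ascii").partition(" ")
    if method != "GET" or not sep or not path:
        raise ValueError("expected 'GET <path>'")
    return path


request = build_simple_request("/index.html")
print(request)                        # b'GET /index.html\r\n'
print(parse_simple_request(request))  # /index.html
```

Note that nothing in the request identifies a protocol version, a host name, or a content type, which is precisely the meta-information that later versions of HTTP carry in headers.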
This protocol soon evolved to support additional features, including serving non-HTML documents (such as images) and enabling clients and servers to negotiate the type of content to be exchanged, its size, its encoding, etc. These features are formally captured in the HTTP/1.0 Informational RFC 1945, which describes the common usage of HTTP between clients and servers. It defines HTTP as “an application-level protocol with the lightness and speed necessary for distributed, collaborative, hypermedia information systems”. The first formal standard for HTTP/1.1 was defined in RFC 2068, which was later superseded by RFC 2616. This is the standard most commonly used today, even though the protocol has evolved to the new HTTP/2 standard defined in RFC 7540.
To address these questions, this article shows how HTTP has evolved to include multiple mechanisms that improve application performance. Web application developers cannot control factors such as network performance and hardware configuration. Hence, they must consider the following aspects when optimizing the network performance of a web application:
i. Making use of HTTP persistent connections (using the Keep-Alive header). This avoids the round trip delay of setting up a new TCP connection for each request, thereby improving performance.
ii. Making efficient use of caching as supported by the HTTP protocol.
iii. Making use of chunked transfer encoding to display content as it is received, rather than waiting to receive the entire content before displaying it.
We will focus on these three techniques, although there are several other ways to improve performance, including compression, minimizing the number of HTTP redirects (more than 80% of websites use redirects), reducing the number of DNS resolutions, and making use of Content Delivery Networks (CDNs) to locate large content (typically, multimedia) closer to users, thereby reducing end-to-end delay.
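To make the third technique more concrete, the following is a minimal sketch of how a receiver could decode a body sent with chunked transfer encoding, handing over each chunk as soon as it is complete. It assumes the HTTP/1.1 chunk layout (a hexadecimal size line, CR/LF, the data, CR/LF, and a final zero-size chunk), ignores trailer fields, and works on an in-memory byte string rather than a live socket. The function name iter_chunks and the sample body are hypothetical.

```python
from typing import Iterator


def iter_chunks(body: bytes) -> Iterator[bytes]:
    """Yield the payload of each chunk in a chunked-encoded HTTP/1.1 body."""
    pos = 0
    while True:
        line_end = body.find(b"\r\n", pos)
        if line_end == -1:
            raise ValueError("missing CR/LF after chunk size")
        # The size is hexadecimal; anything after ';' is a chunk extension.
        size_field = body[pos:line_end].split(b";", 1)[0].strip()
        size = int(size_field, 16)
        pos = line_end + 2
        if size == 0:
            return  # last chunk; optional trailer fields are ignored here
        data = body[pos:pos + size]
        if len(data) != size or body[pos + size:pos + size + 2] != b"\r\n":
            raise ValueError("truncated or malformed chunk")
        yield data  # a browser could already render this part
        pos += size + 2


encoded = b"5\r\nHello\r\n7\r\n, world\r\n0\r\n\r\n"
for chunk in iter_chunks(encoded):
    print(chunk)
```

Because the function is a generator, the caller sees the first chunk before the later ones have been examined, which mirrors how a browser can start displaying content before the whole response has arrived.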
Following the experiential approach taken in the first article of this series (June 2017 issue of ACC.digital), we describe the setup for experiments that illustrate these concepts. The basic setup is shown in Figure 1; it consists of a web server machine (running the Apache web server) named myweb.com, and a client machine running the Firefox web browser¹. The two machines (which can be basic desktops or laptops) are connected via a simple network (e.g., via a Wi-Fi access point, or directly with an Ethernet cable). For concreteness, we will assume that the web server’s IP address is 10.1.1.1 and the client’s IP address is 10.1.1.101. (Please note that in your experimental setup, these IP addresses are likely to be different.)
Figure 1: A simple connection between the client (left) running the Firefox browser and the web server (right)
For our experiments, we need to observe the details of the TCP connections established and terminated between these two machines. To do so, we will make use of the most widely used network diagnostic tool: Wireshark. This utility allows us to capture and analyze each byte of every packet that traverses the network. For a detailed understanding of Wireshark, we refer readers to the user guide.
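Before moving to packet captures, the effect we will be looking for (how many TCP connections are established for a given number of requests) can be previewed on a single machine. The sketch below is not part of the experimental setup: it assumes a small Python server on the loopback address as a stand-in for the Apache server on myweb.com, and a Python client as a stand-in for the browser. All names in it (Handler, fetch, count_connections) are hypothetical.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

accepted = []  # one entry per TCP connection accepted by the server


class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets the server keep the connection open

    def setup(self):
        # setup() runs once per accepted connection, not once per request.
        super().setup()
        accepted.append(self.client_address)

    def do_GET(self):
        body = b"hello\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def fetch(port, paths, persistent):
    """Fetch each path, reusing one connection or opening one per request."""
    conn = None
    for path in paths:
        if conn is None:
            conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        headers = {} if persistent else {"Connection": "close"}
        conn.request("GET", path, headers=headers)
        conn.getresponse().read()  # drain the body before the next request
        if not persistent:
            conn.close()
            conn = None
    if conn is not None:
        conn.close()


def count_connections(persistent):
    """Return how many TCP connections the server accepted for three requests."""
    accepted.clear()
    server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        fetch(server.server_address[1], ["/a", "/b", "/c"], persistent)
    finally:
        server.shutdown()
        server.server_close()
    return len(accepted)


print("persistent:", count_connections(True))
print("one connection per request:", count_connections(False))
```

With a persistent connection the server accepts a single TCP connection for all three requests, while with Connection: close it accepts one per request. In a Wireshark capture of the real setup, the same difference would appear as one TCP handshake versus several.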
¹ The need for this specific browser is explained later.
All modern browsers use persistent connections by default.