Experiment 3: Queuing Delay
Within a network element, queuing delay can occur at three places: (1) at the input interface, where a packet waits to be received; (2) in the application's input buffer, where packets wait to be processed; and (3) at the output interface, where a packet waits to be transmitted (the output interface may be busy transmitting packets ahead of it in the queue). For queuing delay to occur, there must be at least two transmitters: when two packets are transmitted concurrently, one of them must wait. We thus modify the setup of Experiment 1 so that two or more clients send data concurrently to the server. Two hosts Ha and Hc are connected to Hb via a single Ethernet switch on a 10 Mbps link (Figure 2).
When the two client machines transmit data packets simultaneously, the packets are queued at the input buffer of the network switch, and then at the input buffer of the server application until the server reads the data. It is practically very challenging to witness queuing at the switch, since that would require very tight synchronization between the two client machines (even a slight offset between the two transmissions may avoid any queuing there). Thus, for simplicity, we assume that all queuing delay occurs at the application's input buffer.
Each client machine sends its first packet, and then sends every subsequent packet only after receiving the response to the previous one. Since a client has at most one packet outstanding at any time, there is no queuing delay on the client side.
The simple server application is designed as single threaded: it reads data from a client, sleeps for a time tsleep (to emulate processing delay), and then responds. Once the response is sent, it immediately reads the next client request, processes it, responds, and so on. With tsleep = 10 ms and a single client, the average response time is measured as 13.23 ms; the end-to-end delay without any queuing is thus about 3.23 ms. With two clients (and tsleep = 10 ms), we observed response times of 20.21 ms and 20.22 ms as measured by Ha and Hc respectively. Hence the queuing delay encountered by each packet was about 7 ms (the difference between 20.21 ms and 13.23 ms). This 7 ms can be explained in a simplistic way as follows. While one client's request is being processed for 10 ms, the other client's packet spends about 3.23 ms in propagation, transmission and processing; this time overlaps with, and is therefore subsumed in, the 10 ms for which the server is busy. Thus each packet needs to wait only about 7 ms in the queue (not 10 ms). The response time of 20.2 ms can then be accounted for as 3.2 ms of propagation and transmission delay, 10 ms of processing delay and 7 ms of queuing delay.
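The single-threaded server described above can be sketched as below; this is an illustrative reconstruction (UDP, port number, and buffer size are assumptions), not the exact experiment code.

```python
import socket
import time

TSLEEP = 0.010  # emulated processing delay: 10 ms, as in the experiment

def run_server(host="127.0.0.1", port=5001, n_requests=None):
    """Serve requests strictly one at a time: read, sleep TSLEEP, respond.
    n_requests limits how many requests to serve (None = run forever).
    While one request sleeps, later arrivals queue in the socket buffer,
    which is the queuing delay the experiment measures."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    srv.bind((host, port))
    served = 0
    while n_requests is None or served < n_requests:
        data, addr = srv.recvfrom(2048)   # read the next client request
        time.sleep(TSLEEP)                # emulate processing time
        srv.sendto(data, addr)            # respond, then loop immediately
        served += 1
    srv.close()
```

Because there is only one thread, a second request that arrives during the `time.sleep` call simply waits in the OS receive buffer until the loop comes back around to `recvfrom`.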
The reader is encouraged to conduct the experiment with three concurrent clients. Here, while one packet is being processed (for 10 ms, on account of tsleep = 10 ms), the request from the second client waits for about 7 ms, but the request from the third client waits for about 17 ms. Thereafter, every subsequent request from all three clients witnesses a queuing delay of about 17 ms. Thus, the average response time for the clients is expected to be around 30 ms (corresponding to 17 ms of queuing delay, 10 ms of processing delay and about 3 ms of propagation and transmission delay). Figure 3 and Figure 4 depict the processing and waiting (queuing) times for two and three concurrent clients respectively. The reader is encouraged to conduct experiments with four or more concurrent clients and analyze the queuing delay and overall response time.
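The steady-state arithmetic above generalizes to N clients and can be captured in a small back-of-the-envelope model; the formula below is our reading of the analysis in this section (using the measured ~3.2 ms network delay), not a measurement.

```python
def expected_response_ms(n_clients, t_sleep_ms=10.0, t_net_ms=3.2):
    """Steady-state response time for N stop-and-wait clients sharing one
    single-threaded server: network delay + processing delay + queue wait.
    A request waits for the other (N - 1) requests to be processed, minus
    its own network time, which overlaps with the server being busy."""
    queue_ms = max(0.0, (n_clients - 1) * t_sleep_ms - t_net_ms)
    return t_net_ms + t_sleep_ms + queue_ms
```

With the experiment's numbers this reproduces the observations: about 13.2 ms for one client, 20 ms for two, and 30 ms for three; the `max(0, ...)` term also captures the observation that no queuing occurs once tsleep drops below the network delay.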
The reader is encouraged to vary the parameter tsleep and notice that queuing delays are observed only when tsleep is substantially larger than the transmission delay. If tsleep is less than about 3 ms, the second client experiences no queuing delay.
The reader is further encouraged to conduct the experiment while keeping the processing delay smaller than the combined transmission and propagation delays, and then observe and analyze the end-to-end delay. This provides useful insights into the interplay of these delays.
Such experiments are valuable in the context of real-life applications, which typically receive large numbers of concurrent requests. If application load causes significant delays, it is necessary to quantify the queuing delay, determine the causes of the processing delay, and then formulate strategies to reduce the delay. We explore this issue in our next experiment.