Networking · 40 min

HTTP and TLS

How the web works on top of TCP: request bytes, status classes, caching, persistent connections, HTTP/2, HTTP/3, TLS 1.3, and certificate chains.

Why This Matters

A cold HTTPS fetch over TCP and TLS 1.3 usually spends one round trip on the TCP three-way handshake, one round trip on the TLS handshake, then at least one network flight for the HTTP request and first response bytes. At 80 ms RTT, protocol setup alone can consume about 160 ms before the server sees the request body.

For ML systems, this path sits under model download, feature service calls, vector database queries, inference APIs, and telemetry. A mis-set Cache-Control header can multiply traffic. A TLS termination choice can decide whether the backend sees the real client address. A single lost TCP segment can stall all HTTP/2 streams on that connection.

Core Definitions

Definition

HTTP message

An HTTP request contains a method, request target, protocol version, header fields, a blank line, and an optional body. An HTTP response contains a protocol version, status code, reason phrase in HTTP/1.x, header fields, a blank line, and an optional body.

Definition

TLS authenticated key exchange

TLS 1.3 combines an ephemeral Diffie-Hellman key exchange with certificate authentication. The key exchange produces fresh traffic keys; the certificate chain binds the server public key to a DNS name trusted by the client.

Definition

Persistent connection

A persistent HTTP connection carries more than one request-response exchange over the same transport connection. HTTP/1.1 defaults to persistence unless either peer sends Connection: close.

Definition

Head-of-line blocking

Head-of-line blocking occurs when one missing earlier item prevents later independent work from being delivered. TCP has byte-stream ordering, so a lost segment can block later bytes even when they belong to another HTTP/2 stream.

HTTP Request and Response Bytes

HTTP/1.1 is text at the message layer. The first line names the operation and target. Each header field is an ASCII name and value separated by a colon, and the header block ends with \r\n\r\n.

GET /models/resnet50.onnx HTTP/1.1\r\n
Host: cdn.example.net\r\n
User-Agent: cp-client/1.0\r\n
Accept: application/octet-stream\r\n
If-None-Match: "sha256-9b8c"\r\n
\r\n

The first 16 bytes of the request are:

47 45 54 20 2f 6d 6f 64 65 6c 73 2f 72 65 73 6e
 G  E  T     /  m  o  d  e  l  s  /  r  e  s  n

The method is GET, the path is /models/resnet50.onnx, and the version is HTTP/1.1. GET requests normally have no body. POST and PUT often do, and the receiver needs either Content-Length or Transfer-Encoding: chunked to find the message boundary on a persistent connection.

A small JSON inference request over HTTP/1.1 looks like this:

POST /v1/embed HTTP/1.1\r\n
Host: api.example.com\r\n
Content-Type: application/json\r\n
Content-Length: 25\r\n
\r\n
{"text":"systems matter"}

Content-Length is 25 because the body is exactly these 25 bytes:

7b 22 74 65 78 74 22 3a 22 73 79 73 74 65 6d 73
20 6d 61 74 74 65 72 22 7d

The response status code class gives the first routing decision. 1xx is informational, for example 100 Continue. 2xx means success, with 200 OK, 201 Created, and 204 No Content common. 3xx redirects or selects another representation, such as 301 Moved Permanently, 302 Found, and 304 Not Modified. 4xx means the request is invalid for this client context, such as 400 Bad Request, 401 Unauthorized, 403 Forbidden, and 404 Not Found. 5xx means the server side failed, with 500 Internal Server Error, 502 Bad Gateway, and 503 Service Unavailable common at proxies and load balancers.

Cacheable Methods and Conditional Requests

GET and HEAD are the usual cacheable methods. HEAD asks for the same headers as GET without the response body, which is useful for checking size, validators, and freshness. POST responses can be cached only when explicit headers make them reusable, but many shared caches avoid doing so.

Two validators matter in daily debugging. Last-Modified is a timestamp. ETag is an opaque token chosen by the server. Strong ETags require byte-for-byte equality of the selected representation. Weak ETags, written like W/"abc", permit semantic equivalence without byte equality.

A cached object has:

HTTP/1.1 200 OK\r\n
Date: Tue, 12 May 2026 10:00:00 GMT\r\n
Cache-Control: max-age=60\r\n
ETag: "m-7f2a"\r\n
Content-Length: 1024\r\n
\r\n
...1024 bytes...

At 10:00:30 GMT, the cache can reuse it without revalidation. At 10:01:10 GMT, it is stale. The client can send:

GET /weights/block-0001.bin HTTP/1.1\r\n
Host: cdn.example.net\r\n
If-None-Match: "m-7f2a"\r\n
\r\n

If unchanged, the server replies:

HTTP/1.1 304 Not Modified\r\n
Date: Tue, 12 May 2026 10:01:10 GMT\r\n
ETag: "m-7f2a"\r\n
\r\n

No body is sent. For a 1 GiB shard, the validation exchange replaces 1,073,741,824 payload bytes with a few hundred header bytes. If-Modified-Since performs the same role with timestamps, but ETags avoid clock granularity and clock skew problems.

Persistent Connections, Pipelining, and Sockets

HTTP/1.0 commonly opened one TCP connection per object. HTTP/1.1 keeps the connection open by default, so a client can send a second request after reading the first response. This avoids another TCP handshake and lets TCP congestion state carry over.

The socket code is ordinary stream I/O. HTTP message parsing sits above read and write.

#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/* Error handling elided; addr is a struct sockaddr_in already filled
   with the server address and port. */
int fd = socket(AF_INET, SOCK_STREAM, 0);
connect(fd, (struct sockaddr *)&addr, sizeof(addr));

const char req[] =
    "GET /health HTTP/1.1\r\n"
    "Host: inference.example.com\r\n"
    "Connection: close\r\n"
    "\r\n";

write(fd, req, sizeof(req) - 1);  /* sizeof includes the NUL terminator */

char buf[4096];
ssize_t n;
while ((n = read(fd, buf, sizeof(buf))) > 0) {
    /* parse status line, headers, then body framing */
}
close(fd);

Persistent connections require correct body framing. If response A has no Content-Length, no chunked encoding, and the connection remains open, the client cannot know where A ends and response B starts. That is why HTTP/1.1 servers either send a length, use chunked encoding, or close the connection to delimit the body.

HTTP/1.1 pipelining allowed a client to send request 2 before response 1 arrived. It failed in practice because responses had to be returned in order. If request 1 triggered a slow database query and request 2 was a static 200-byte file, response 2 still waited behind response 1. Intermediaries also had inconsistent behavior, so browsers mostly avoided pipelining.

HTTP/2 and HTTP/3 Framing

HTTP/2 keeps HTTP semantics but changes the wire format to binary frames. Each frame has a 9-byte header:

00 00 0c | 01   | 04    | 00 00 00 03
 length  | type | flags | R + stream id

Here 00 00 0c is length 12, type 01 is HEADERS, flags 04 means END_HEADERS, and stream id is 3. DATA frames, HEADERS frames, WINDOW_UPDATE frames, and others share the connection. Streams are independent at the HTTP layer, so stream 3 can carry /a while stream 5 carries /b.

Multiplexing fixes HTTP/1.1 application-layer head-of-line blocking, but not TCP byte-stream blocking. Suppose TCP bytes 0 to 1199 hold part of stream 3, and bytes 1200 to 2399 hold part of stream 5. If the segment carrying bytes 0 to 1199 is lost, the receiver TCP stack cannot deliver bytes 1200 to 2399 to HTTP/2 yet. Stream 5 waits even though its bytes arrived.

HTTP/3 moves HTTP semantics onto QUIC over UDP. QUIC implements encryption, congestion control, loss recovery, and streams in user space. A lost QUIC packet that contains data for stream 3 does not prevent delivery of already received stream 5 data, because QUIC ordering is per stream rather than one global byte stream. Packet loss still costs retransmission time, but independent streams are not blocked by a missing byte range from another stream.

TLS 1.3 Handshake and Certificate Chain

TLS 1.3 starts with ClientHello. The client sends supported protocol versions, cipher suites, supported key-exchange groups, random bytes, and an ECDHE key share. For example, with X25519 the key share is 32 bytes. The server replies with ServerHello, selects parameters, and sends its ECDHE share.

Both sides compute the same Diffie-Hellman shared secret. TLS then derives handshake traffic keys from that secret and the transcript hash. The server sends encrypted handshake messages, including Certificate, CertificateVerify, and Finished. The client verifies the certificate chain and the signature over the transcript, then sends its own Finished. Application data follows under application traffic keys.

The certificate chain is usually leaf to intermediate to root:

Leaf certificate
  subject: api.example.com
  public key: server signing key
  issuer: Example Intermediate CA

Intermediate certificate
  subject: Example Intermediate CA
  public key: intermediate signing key
  issuer: Example Root CA

Root certificate
  subject: Example Root CA
  public key: root signing key
  trust source: operating system or browser trust store

The server normally sends the leaf and intermediate, not the root. The client already has trusted roots. Verification checks signatures, validity times, DNS name matching via Subject Alternative Name, key usage, and revocation policy when configured.

Forward secrecy comes from ephemeral Diffie-Hellman. If an attacker records traffic today and steals the server certificate private key next month, old TLS 1.3 sessions remain protected because the traffic keys came from an ephemeral ECDHE secret that was not stored as the certificate private key. This is the major reason TLS 1.3 removed static RSA key exchange from the main handshake.

The Model

The useful latency model counts network flights, not API calls. For a cold HTTP/1.1 request over TCP and TLS 1.3:

T_{\text{first-byte}} \approx T_{\text{TCP handshake}} + T_{\text{TLS 1.3 handshake}} + T_{\text{HTTP request to response}}

With RTT 80 ms, the setup part is about 2 × 80 = 160 ms before the server processes the HTTP request. The full first-byte time is near 240 ms plus server compute if the request is sent only after the TLS handshake completes.

The connection reuse invariant is simple: once TCP and TLS are established, later HTTP requests on that connection avoid both handshakes. If an inference client sends 100 requests over one kept-alive TLS connection and the RTT is 20 ms, it avoids roughly 99 × 40 = 3960 ms of handshake round-trip delay compared with opening a fresh TCP plus TLS connection for each request. Concurrency, server time, and congestion control change the exact wall-clock value, but the avoided network flights are real.

Load balancers interact with this model in two common modes. In TLS termination mode, the load balancer completes TLS with the client, reads HTTP, then forwards to a backend over another connection. Routing can use Host, path, cookies, and HTTP/2 stream metadata. In TCP pass-through mode, the load balancer forwards encrypted bytes and cannot inspect HTTP fields, though it can route from IP, port, and sometimes SNI from the TLS ClientHello. When TLS terminates at the load balancer, backends often receive X-Forwarded-For or Forwarded headers to recover the client address, but applications must trust those headers only from known proxies.

Common Confusions

Watch Out

HTTP/2 multiplexing does not remove TCP head-of-line blocking

HTTP/2 removes the rule that response 2 must wait for response 1 at the HTTP layer. It still rides on one ordered TCP byte stream. If an earlier TCP segment is missing, later bytes are withheld from the HTTP/2 parser even when they belong to another stream. HTTP/3 changes the transport substrate to QUIC streams over UDP, so loss is scoped per stream.

Watch Out

A certificate does not encrypt the session by itself

The certificate authenticates a public key and a DNS name. The symmetric traffic keys come from the TLS key schedule, seeded by ephemeral Diffie-Hellman and transcript hashes. Replacing ECDHE with only a certificate would authenticate the server but would not give forward secrecy.

Watch Out

A 304 response is not a redirect

304 Not Modified is in the 3xx class, but it is a cache validation response. The client reuses its stored body. It should not move to a different URL unless a redirect status such as 301, 302, 303, 307, or 308 is present with a Location header.

Exercises

ExerciseCore

Problem

An HTTP cache stored a response at 12:00:00 with Cache-Control: max-age=120, ETag: "w42", and a 50 MiB body. At 12:03:00 the client needs the same URL. Write the conditional request and compute the payload bytes saved if the server returns 304 Not Modified with a 160-byte header block and no body.

ExerciseCore

Problem

Decode this HTTP/2 frame header: 00 00 05 00 01 00 00 00 07. Give the payload length, type, flags, and stream id.

ExerciseAdvanced

Problem

A client sends 20 small API calls. RTT is 50 ms. Ignore server compute and response transmission time, and assume no packet loss. Compare handshake round-trip cost for opening a fresh TCP plus TLS 1.3 connection for every call versus sending all calls over one persistent connection.

References

Canonical:

  • W. Richard Stevens, Kevin R. Fall, TCP/IP Illustrated, Volume 1: The Protocols (2nd ed., 2012), ch. 13-15 and ch. 18 — TCP connection behavior, data flow, and transport security context
  • Andrew S. Tanenbaum and David J. Wetherall, Computer Networks (5th ed., 2011), §6.4 and §8.6 — HTTP, the Web, SSL/TLS, and public-key certificates
  • W. Richard Stevens, Bill Fenner, Andrew M. Rudoff, UNIX Network Programming, Volume 1: The Sockets Networking API (3rd ed., 2004), ch. 4, ch. 6, ch. 30 — TCP sockets, I/O multiplexing, and client-server design
  • R. Fielding, M. Nottingham, J. Reschke, RFC 9110: HTTP Semantics (2022), §6, §9, §13, §15 — messages, methods, conditional requests, and status codes
  • E. Rescorla, RFC 8446: The Transport Layer Security Protocol Version 1.3 (2018), §2, §4, §4.4, §7 — TLS 1.3 handshake, authentication, and key schedule
  • M. Thomson, C. Benfield, RFC 9113: HTTP/2 (2022), §4, §5 — binary frames, streams, and multiplexing

Accessible:

  • Ilya Grigorik, High Performance Browser Networking, ch. 9, ch. 12, ch. 18 — HTTP performance, TLS, and browser networking
  • MDN Web Docs, An overview of HTTP — request and response structure with practical examples
  • Cloudflare Learning Center, What happens in a TLS handshake? — visual walkthrough of TLS handshake phases

Next Topics

  • /computationpath/tcp-congestion-control
  • /computationpath/dns-and-service-discovery
  • /computationpath/load-balancers-and-reverse-proxies
  • /computationpath/quic-and-http3
  • /topics/public-key-cryptography