From 60b7fc345efd4a7a51d65cf555aa5cac454811d2 Mon Sep 17 00:00:00 2001
From: Christine Dodrill
Date: Tue, 19 May 2020 13:55:23 -0400
Subject: [PATCH] How http requests work (#149)

* blog: how HTTP requests work
* updates
* philosophy tag
* IP -> UDP
---
 ...how-http-requests-work-2020-05-19.markdown | 164 ++++++++++++++++++
 1 file changed, 164 insertions(+)
 create mode 100644 blog/how-http-requests-work-2020-05-19.markdown

diff --git a/blog/how-http-requests-work-2020-05-19.markdown b/blog/how-http-requests-work-2020-05-19.markdown
new file mode 100644
index 0000000..9c17f16
--- /dev/null
+++ b/blog/how-http-requests-work-2020-05-19.markdown
@@ -0,0 +1,164 @@
+---
+title: How HTTP Requests Work
+date: 2020-05-19
+tags:
+ - http
+ - ohgod
+ - philosophy
+---
+
+# How HTTP Requests Work
+
+Reading this webpage is possible because of millions of hours of effort with
+tens of thousands of actors across thousands of companies. At some level it's a
+minor miracle that this all works at all. Here's a preview into the madness that
+goes into hitting enter on christine.website and this website being loaded.
+
+## Beginnings
+
+The user types `https://christine.website` into the address bar and hits
+enter on the keyboard. This sends a signal over USB to the computer and the
+kernel polls the USB controller for a new message. It's recognized as coming from the
+keyboard. The input is then sent to the browser through an input driver talking
+to a windowing server talking to the browser program.
+
+The browser reads the memory region normally reserved for the address bar. The
+browser then parses this string as an [RFC 3986][rfc3986] URI and scrapes out
+the scheme (https), hostname (christine.website) and path (/). The browser
+then uses this information to create an abstract HTTP request object with the
+Host header set to christine.website, the HTTP method set to GET, and the path
+set to /.
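
The parsing step above can be sketched in a few lines of Python. This is only an illustration, not how any real browser does it: the `Request` class and `build_request` function are hypothetical names made up for this example, and `urlsplit` from the standard library stands in for the browser's own RFC 3986 parser.

```python
from dataclasses import dataclass, field
from urllib.parse import urlsplit


@dataclass
class Request:
    """Hypothetical stand-in for a browser's abstract request object."""
    method: str
    scheme: str
    host: str
    path: str
    headers: dict = field(default_factory=dict)


def build_request(url: str) -> Request:
    """Split a URL into scheme, hostname and path, then build a GET request."""
    parts = urlsplit(url)
    # An empty path means the root of the site.
    path = parts.path or "/"
    return Request(
        method="GET",
        scheme=parts.scheme,
        host=parts.hostname,
        path=path,
        headers={"Host": parts.hostname},
    )


req = build_request("https://christine.website")
print(req.method, req.path, req.headers["Host"])  # GET / christine.website
```

Cookies, language preferences and the other headers would be layered onto `headers` afterwards.
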
+This request object then passes through various layers of credential
+storage and middleware to add the appropriate cookies and other headers in order
+to tell my website what language it should localize the response to, what
+compression methods the browser understands, and what browser is being used to
+make the request.
+
+[rfc3986]: https://tools.ietf.org/html/rfc3986
+
+## Connections
+
+The browser then checks if it has a connection to christine.website open
+already. If it does not, then it creates a new one. To do that, it first has to
+figure out what the IP address of christine.website is using [DNS][dns]. A
+DNS request is made over [UDP][udp] on port 53 to the DNS server configured in
+the operating system (such as 8.8.8.8, 1.1.1.1 or 75.75.75.75). The UDP
+socket is created using operating system-dependent system calls and a DNS
+request is sent.
+
+[udp]: https://en.wikipedia.org/wiki/User_Datagram_Protocol
+[dns]: https://en.wikipedia.org/wiki/Domain_Name_System
+
+The resulting packet is addressed to the DNS server and added to the
+operating system's output queue. The operating system then looks in its routing
+table to see where the packet should go. If the packet matches a route, it is
+queued for output to the relevant network card. The network card layer then
+checks the ARP table to see what [MAC address][macaddress] the
+[Ethernet][ethernet] frame should be sent to. If the ARP table doesn't have a
+match, then an ARP request is broadcast to every node on the local network. Then
+the driver waits for an ARP reply to be sent to it with the correct IP -> MAC
+address mapping. The driver then uses this information to send out the Ethernet
+frame to the node that matches the next-hop IP address in the routing table. From there
+the packet is validated on the router it was sent to. The router then unwraps the
+frame to the IP layer to figure out the destination network interface to use.
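
The routing table lookup that the sending machine and every router along the way perform is, at its core, a longest-prefix match. Here is a minimal sketch of that idea, assuming a made-up three-entry table; the addresses, interface names and the `lookup` helper are all hypothetical, and a real kernel uses far more efficient data structures than a linear scan.

```python
import ipaddress

# Hypothetical routing table: (destination network, next hop, interface).
# A next hop of None means the destination is directly on the local link.
ROUTES = [
    ("0.0.0.0/0", "192.168.1.1", "eth0"),   # default route via the local router
    ("192.168.1.0/24", None, "eth0"),       # the local network itself
    ("10.8.0.0/16", "10.8.0.1", "tun0"),    # e.g. a VPN
]


def lookup(dest: str):
    """Return (address to resolve with ARP, interface) for the best route."""
    addr = ipaddress.ip_address(dest)
    best = None
    for cidr, next_hop, iface in ROUTES:
        net = ipaddress.ip_network(cidr)
        # The most specific (longest) matching prefix wins.
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop, iface)
    if best is None:
        raise LookupError(f"no route to {dest}")
    _, next_hop, iface = best
    # Without a gateway the frame goes straight to the destination.
    return (next_hop or dest, iface)


print(lookup("8.8.8.8"))       # ('192.168.1.1', 'eth0')
print(lookup("192.168.1.20"))  # ('192.168.1.20', 'eth0')
```

The address returned here is the one the ARP table gets asked about, which is why a packet for 8.8.8.8 ends up in an Ethernet frame addressed to the local router.
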
+If this router also does NAT, it creates an entry in the NAT table so that
+replies can find their way back, kept for a site-configured amount of time (for
+UDP at least). It then passes the packet on to the next node and this process is
+repeated until it gets to the remote DNS server.
+
+[macaddress]: https://en.wikipedia.org/wiki/MAC_address
+[ethernet]: https://en.wikipedia.org/wiki/Ethernet
+
+The DNS server then unwraps the Ethernet frame into an IP packet, then a
+UDP datagram and finally a DNS request. It checks its database for a match and if one is
+not found, it attempts to discover the correct name server to contact by using an
+NS record query to its upstreams or the authoritative name server for the
+`.website` top-level domain. This then kicks off another round of Ethernet frames and UDP
+packets until it reaches the upstream DNS server which hopefully should reply
+with the correct address. Once the DNS server gets the information that is
+needed, it sends the results back to the client as a wire-format DNS
+response.
+
+UDP is unreliable by design, so this packet may or may not survive the entire
+round trip. It may take one or more retries for the DNS information to get to
+the remote server and back, but it usually works the first time. The response to
+this request is cached based on the time-to-live specified in the DNS response.
+The response also contains the IP address of christine.website.
+
+## Security
+
+The scheme used in the URL determines which TCP port the browser connects to.
+If it is http, it uses port 80. If it is https, it uses port 443. The user
+specified HTTPS, so port 443 on whatever IP address DNS returned is dialed using
+the operating system's network stack system calls. The [TCP][tcp] three-way
+handshake is started with that target IP address and port. The client sends a
+SYN packet, the server replies with a SYN-ACK packet and the client replies with
+an ACK packet.
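
To make the three packets a bit more concrete, here is a toy model of the handshake that tracks only the flags and the sequence and acknowledgement numbers. It is a sketch under stated assumptions: the `Segment` class and `three_way_handshake` function are hypothetical, the initial sequence numbers are picked by hand (a real stack randomizes them), and no actual packets are sent.

```python
from dataclasses import dataclass

SEQ_MOD = 2**32  # TCP sequence numbers are 32 bits wide and wrap around


@dataclass
class Segment:
    flags: frozenset
    seq: int
    ack: int = 0


def three_way_handshake(client_isn: int, server_isn: int):
    """Return the three segments exchanged to open a TCP connection."""
    syn = Segment(frozenset({"SYN"}), seq=client_isn)
    # The server acknowledges the client's SYN (which uses up one sequence
    # number) and announces its own initial sequence number.
    syn_ack = Segment(
        frozenset({"SYN", "ACK"}),
        seq=server_isn,
        ack=(syn.seq + 1) % SEQ_MOD,
    )
    # The client acknowledges the server's SYN in turn.
    ack = Segment(
        frozenset({"ACK"}),
        seq=(syn.seq + 1) % SEQ_MOD,
        ack=(syn_ack.seq + 1) % SEQ_MOD,
    )
    return [syn, syn_ack, ack]


for seg in three_way_handshake(client_isn=1000, server_isn=5000):
    print(sorted(seg.flags), seg.seq, seg.ack)
# ['SYN'] 1000 0
# ['ACK', 'SYN'] 5000 1001
# ['ACK'] 1001 5001
```
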
+Once that final ACK arrives, the TCP session is established and data can
+be sent and received through it.
+
+[tcp]: https://en.wikipedia.org/wiki/Transmission_Control_Protocol
+
+However, this data is UNENCRYPTED by default. [Transport Layer Security][tls] is
+used to encrypt this data so prying eyes can't look into it. TLS has its own
+handshake too. The session is established by sending a TLS ClientHello message
+with the domain name (christine.website), the list of ciphers the client
+supports, any application layer protocols the client supports (like HTTP/2) and
+the list of TLS versions that the client supports. This information is sent over
+the wire to the remote server using that entire long and complicated process
+that I spelled out for how DNS works, except a TCP session requires the other
+side to acknowledge when data is successfully received. The server on the other
+end replies with a ServerHello that names the TLS version and cipher it picked from
+the client's lists, followed by its TLS certificate and the application protocol it chose. Then they do an [encryption
+session setup rain dance][tlsraindance] that I don't completely understand and
+the resulting channel is encrypted: ciphertext is written to and
+read from the wire and a session layer translates that ciphertext to cleartext
+for the other parts of the browser stack.
+
+[tls]: https://en.wikipedia.org/wiki/Transport_Layer_Security
+[tlsraindance]: https://www.cloudflare.com/learning/ssl/what-happens-in-a-tls-handshake/
+
+The browser then uses the information in the server's handshake reply to decide how
+to proceed from here.
+
+## HTTP
+
+If the browser notices the server selected HTTP/2, it sets up an HTTP/2 session
+(by exchanging a connection preface and settings over the wire, like what I described for DNS)
+and creates a new stream for this request.
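
As a rough illustration of what a stream looks like on the wire, every HTTP/2 frame starts with a fixed 9-byte header that carries the payload length, the frame type, some flags and the stream ID. The sketch below packs that header for a HEADERS frame on stream 1, the first stream a client opens. The `frame_header` helper is a hypothetical name, and the payload is a placeholder: a real one would be the HPACK-compressed request headers.

```python
import struct

FRAME_HEADERS = 0x1      # frame type for a HEADERS frame
FLAG_END_STREAM = 0x1    # no request body follows
FLAG_END_HEADERS = 0x4   # the header block fits in this one frame


def frame_header(length: int, frame_type: int, flags: int, stream_id: int) -> bytes:
    """Pack the fixed 9-byte header that starts every HTTP/2 frame."""
    if not 0 <= length < 2**24 or not 0 <= stream_id < 2**31:
        raise ValueError("length or stream id out of range")
    # 24-bit length, 8-bit type, 8-bit flags, then 1 reserved bit plus a
    # 31-bit stream identifier, all big-endian.
    return struct.pack(">I", length)[1:] + struct.pack(">BBI", frame_type, flags, stream_id)


# Placeholder payload standing in for the HPACK-encoded request headers.
payload = b"\x00" * 20
header = frame_header(
    len(payload),
    FRAME_HEADERS,
    FLAG_END_STREAM | FLAG_END_HEADERS,
    stream_id=1,
)
print(header.hex())  # 000014010500000001
```

Because each frame names its stream, many requests can share one TCP connection at the same time.
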
+The browser then formats the request
+as HTTP/2 wire format bytes (a binary format) and writes it to the HTTP/2 stream,
+which writes it to the HTTP/2 framing layer, which writes it to the encryption
+layer, which writes it to the network socket and sends it over the internet.
+
+If the browser notices the server DOES NOT support HTTP/2, it formats the
+request as HTTP/1.1 wire format bytes and writes it to the encryption layer,
+which writes it to the network socket and sends it over the internet using that
+complicated process I spelled out for DNS.
+
+This then hits the remote load balancer which parses the client HTTP request and
+uses site-local configuration to select the best application server to handle
+the request. It then forwards the client's HTTP request to the correct server
+by creating a TCP session to that backend, writing the HTTP request and waiting
+for a response over that TCP session. Depending on site-local configuration
+there may be layers of encryption involved.
+
+## Application Server
+
+Now, the request finally gets to the application server. This TCP session is
+accepted by the application server and the headers are read into memory. The
+path is read by the application server and the correct handler is chosen. The
+HTML for the front page of christine.website is rendered, written to the TCP
+session and sent to the load balancer, which encrypts it with TLS. The encrypted
+HTML gets sent back over the internet to your browser and then your browser
+decrypts it and starts to parse and display the website. The browser will run
+into places where it needs more resources (such as stylesheets or images), so it will
+make additional HTTP requests to the load balancer to grab those too.
+
+---
+
+The end result is that the user sees the website in all its glory. Given all
+these moving parts it's astounding that this works as reliably as it does. Each
+of the TCP, ARP and DNS exchanges also happens at each level of the stack.
+There are layers upon layers upon layers of interacting protocols and
+implementations.
+
+This is why it is hard to reliably put a website on the internet. If there is a
+god, they are surely the one holding all these potentially unreliable systems
+together to make everything appear like it is working.