What Happens Technically when Someone Opens a Web Browser and Goes to a Website?

Question
Browsing the World Wide Web is a part of everyday life. In a job interview, you are asked to describe what happens behind the scenes when you open a web browser and enter a URL. The hiring manager wants to hear what you know about networking, packets, HTML, JavaScript, PHP, and the other technologies involved, and how the sequence proceeds. What are the details of what happens behind the scenes?

Answer
After someone double-clicks a web browser icon, the operating system launches the browser's executable binary file (consuming some CPU and memory). The user then types a URL and presses the "Enter" key. The scheme at the start of the URL (e.g., "http://") tells the browser which protocol to use; a web browser supports protocols other than HTTP. Browsers usually add the "http://" scheme themselves (some newer browsers try "https://" first) if someone just types in a bare name like "google.com".
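The scheme handling described above can be sketched in a few lines of Python. This is only an illustration, not how any particular browser is implemented; the function name normalize_typed_url and its default_scheme parameter are made up for this example.

```python
from urllib.parse import urlsplit


def normalize_typed_url(typed, default_scheme="http"):
    """Prepend a scheme when the user typed a bare name such as 'google.com'.

    Hypothetical helper: real browsers apply many more rules (search
    keywords, history, HTTPS upgrades) before deciding what to load.
    """
    typed = typed.strip()
    if "://" not in typed:
        typed = f"{default_scheme}://{typed}"
    return typed


parts = urlsplit(normalize_typed_url("google.com"))
print(parts.scheme, parts.hostname)
```

The scheme (http) selects the protocol, and the hostname (google.com) is what gets handed to DNS in the next step.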

The next process is the Domain Name System (DNS) lookup request (according to blog.catchpoint.com). For the purposes of this explanation, we will assume this request is new and not benefited by any caching*. A DNS query, normally a UDP packet addressed to port 53 rather than a TCP packet on port 80, is sent from the workstation to its configured DNS resolver. The packet leaves the subnet through the default gateway (the router) and then goes to the internet service provider. The internet service provider's DNS resolver does not route the request itself; it asks a root nameserver, which refers it to the nameservers for the top-level domain (e.g., .com) (this is partially attributable to blog.catchpoint.com).

The resolver then queries a nameserver for the TLD (e.g., a nearby .com server), which does not hold the final answer but refers the resolver to the authoritative DNS server for the domain. These DNS queries for the domain's IP address (the google in google.com) normally use port 53 (usually UDP, with TCP used for large responses).** An IP address (four octet values separated by three periods if IPv4 is being used) is obtained from this authoritative DNS server via an A record (whereas an AAAA record would support IPv6 according to https://support.dnsimple.com/articles/aaaa-record/).***
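To make the DNS query less abstract, here is a minimal sketch that builds the bytes of a DNS question by hand, following the standard DNS message layout (a 12-byte header followed by the question). It does not send anything; the function name build_dns_query and the query ID are arbitrary choices for this example.

```python
import struct

QTYPE_A = 1      # IPv4 address record
QTYPE_AAAA = 28  # IPv6 address record


def build_dns_query(hostname, qtype=QTYPE_A, query_id=0x1234):
    """Return the raw bytes of a single-question DNS query."""
    # Header: ID, flags (0x0100 = recursion desired), 1 question,
    # 0 answer / authority / additional records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # The name is encoded as length-prefixed labels ending in a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", qtype, 1)  # class 1 = IN (internet)
    return header + question


query = build_dns_query("google.com")
print(len(query), query[12:])
# To actually send it, one would use a UDP socket and port 53, e.g.
# sock.sendto(query, (resolver_ip, 53)), where resolver_ip is whatever
# resolver the workstation is configured to use.
```

Asking for an AAAA record instead of an A record only changes the query type field; the rest of the message stays the same.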

With the name resolved, the browser can now send packets to this IP address. Intermediate routers forward each packet using their routing tables; the destination address in the packet header determines the next hop at each step (according to this UK authoritative website). Layer 3 of the OSI model handles the routing of the packet toward its destination.
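The way a router picks a next hop from its routing table can be sketched as a longest-prefix match. The table below is entirely hypothetical (the prefixes and next-hop addresses are invented for illustration), and real routers use far more efficient data structures.

```python
import ipaddress

# Hypothetical routing table: (destination prefix, next hop).
ROUTES = [
    ("0.0.0.0/0", "203.0.113.1"),  # default route, matches everything
    ("10.0.0.0/8", "10.0.0.1"),
    ("10.1.0.0/16", "10.1.0.1"),
]


def next_hop(destination, routes=ROUTES):
    """Pick the next hop whose prefix matches the destination most specifically."""
    dest = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(prefix), hop)
        for prefix, hop in routes
        if dest in ipaddress.ip_network(prefix)
    ]
    if not matches:
        raise LookupError(f"no route to {destination}")
    # The most specific route is the one with the longest prefix.
    _, hop = max(matches, key=lambda match: match[0].prefixlen)
    return hop


print(next_hop("10.1.2.3"))
print(next_hop("198.51.100.7"))
```

Each router along the path repeats this kind of lookup on the destination address, which is why the packet header alone is enough to move it hop by hop.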

Before any web content is exchanged, the client workstation and the web server (e.g., Nginx, Apache Web Server, or IIS) establish a TCP connection with the three-way handshake. This process consists of a SYN packet from the client workstation (the first way of the handshake), a SYN-ACK packet from the web server (the second way of the handshake), and an ACK packet from the client workstation (the third way of the handshake). TCP headers carry flags that are designated as SYN, ACK, FIN, RST, PSH, and URG (according to this packetlife.net article); a SYN-ACK is a single packet with both the SYN and ACK flags set.
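The flags live as individual bits in the TCP header, so a packet capture tool decodes them with simple bit tests. The sketch below uses the standard bit values for the six flags named above; the decode_flags helper is just an illustrative name.

```python
# Standard bit positions of the six classic TCP flags.
TCP_FLAGS = {
    0x01: "FIN",
    0x02: "SYN",
    0x04: "RST",
    0x08: "PSH",
    0x10: "ACK",
    0x20: "URG",
}


def decode_flags(flag_bits):
    """Return the names of the flags set in a TCP header's flag bits."""
    return [name for bit, name in TCP_FLAGS.items() if flag_bits & bit]


# The three packets of the handshake, identified by their flag bits.
handshake = [
    ("client -> server", 0x02),  # SYN
    ("server -> client", 0x12),  # SYN and ACK set in the same packet
    ("client -> server", 0x10),  # ACK
]
for direction, bits in handshake:
    print(direction, "+".join(decode_flags(bits)))
```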

The above process often takes less than one second. HTTP stands for HyperText Transfer Protocol. HTTP leverages TCP (Transmission Control Protocol) behind the scenes.

Now the HTTP request process begins (according to blog.catchpoint.com). A GET [HTTP method] request is made over the connection (established via the three-way handshake); this is a request to retrieve the resource identified by the URL (according to W3.org). The web server sends the requested content back in packets over the same connection. An HTTP response status code is also returned, usually behind the scenes. There are five classes of such codes; to read more about them, see this Mozilla web page.
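On the wire, an HTTP/1.1 GET request is just a few lines of text, and the response starts with a status line. The following sketch builds one and parses the other; the helper names are made up for this example, and the sample response is hand-written rather than captured from a real server.

```python
def build_get_request(host, path="/"):
    """Return the bytes of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",  # blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status_line(response):
    """Split the first line of a response into (version, code, reason)."""
    status_line = response.split(b"\r\n", 1)[0].decode("ascii")
    version, code, reason = status_line.split(" ", 2)
    return version, int(code), reason


print(build_get_request("google.com"))
sample = b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>"
print(parse_status_line(sample))
```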

The response code may play a role in how the browser handles the presentation (according to this GlassDoor posting) as opposed to being merely informational for human debugging efforts. A 2XX code means success and will normally be invisible to the user. A 3XX code means a redirect: the browser takes further action, typically requesting a different URL, and this process will also usually be undetectable to the user. A 4XX code means the server considers the request itself to be at fault (e.g., 404 when the requested page does not exist), and the user typically sees an error page. A 5XX code means the server failed while handling the request, and the user typically sees an error message from the server.
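Because the class of a status code is simply its first digit, grouping codes takes one integer division. A small sketch, with class labels paraphrased from the five standard categories:

```python
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def status_class(code):
    """Map an HTTP status code such as 404 to its class name."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    return STATUS_CLASSES[code // 100]


for code in (200, 301, 404, 503):
    print(code, status_class(code))
```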

Web browsers may or may not respond to the HTTP status code returned. Some will allow a cookie to be stored on a 3XX response code, while others require 2XX (according to this StackOverflow posting).

Assuming the response code is 2XX, the data carried in the downloaded TCP segments is reassembled into the complete HTTP response, and the user's web browser processes its body. The body can be HTML, JavaScript, CSS, JSON, XML, .txt, and various other files (such as .gif or .jpg). Server-side code such as PHP normally runs on the web server, so the browser receives its output rather than the PHP source. Some web pages have static content whereas others are more dynamic.**** The reassembly of segments is done by TCP at layer 4 of the OSI model, while the parsing and rendering by the web browser happens at layer 7.*****
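As a rough sketch of the reassembly step, the example below joins a response that arrived in several chunks and separates the headers from the body. The chunk boundaries and content are invented for illustration; in practice the operating system's TCP stack delivers the bytes in order and the browser parses them as a stream.

```python
def split_response(chunks):
    """Join received chunks and return (status_line, headers, body)."""
    raw = b"".join(chunks)
    # A blank line (CRLF CRLF) separates the headers from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    header_lines = head.decode("iso-8859-1").split("\r\n")
    headers = {}
    for line in header_lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return header_lines[0], headers, body


# Chunks deliberately cut in awkward places, as can happen on a network.
chunks = [
    b"HTTP/1.1 200 OK\r\nContent-Ty",
    b"pe: text/html\r\nContent-Length: 13\r\n\r",
    b"\n<p>Hello</p>\n",
]
status_line, headers, body = split_response(chunks)
print(status_line, headers["content-type"], body)
```

The Content-Type header is what tells the browser whether to treat the body as HTML to render, an image to decode, JSON to hand to a script, and so on.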

Normally cookies are enabled on the web browser; some sites (and most major websites according to HowStuffWorks.com) use cookies, which are small pieces of data that the browser stores on your workstation. These can retain login sessions and help the website owner to gather accurate statistics on website visitors. To learn about how cookies work, see HowStuffWorks.com.
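A cookie normally arrives in a Set-Cookie response header, and the browser sends it back in a Cookie request header on later visits. The sketch below uses Python's standard http.cookies module to show both directions; the cookie name and value are invented, and the two helper functions are illustrative names only.

```python
from http.cookies import SimpleCookie


def parse_set_cookie(header_value):
    """Parse the value of a Set-Cookie header into a cookie jar."""
    jar = SimpleCookie()
    jar.load(header_value)
    return jar


def cookie_request_header(jar):
    """Build the value of the Cookie header sent back to the site."""
    return "; ".join(f"{name}={morsel.value}" for name, morsel in jar.items())


jar = parse_set_cookie("session_id=abc123; Path=/; HttpOnly")
print(jar["session_id"].value, jar["session_id"]["path"])
print(cookie_request_header(jar))
```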

For situations where data is sent to the website (e.g., a username and/or password are sent from the client workstation) and not just downloaded from the website, the HTTP POST method may be invoked. This would happen after the three-way handshake, which takes place at the underlying TCP layer.
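For comparison with a GET request, here is a sketch of what a form submission might look like as a raw POST request. The host, path, and field names are hypothetical, and a real login form should only ever be submitted over HTTPS.

```python
from urllib.parse import urlencode


def build_post_request(host, path, fields):
    """Return the bytes of a minimal form-encoded HTTP/1.1 POST request."""
    body = urlencode(fields).encode("ascii")
    head = "\r\n".join([
        f"POST {path} HTTP/1.1",
        f"Host: {host}",
        "Content-Type: application/x-www-form-urlencoded",
        f"Content-Length: {len(body)}",  # tells the server how much body to read
        "",
        "",
    ]).encode("ascii")
    return head + body


request = build_post_request(
    "example.com", "/login", {"username": "alice", "password": "p@ss word"}
)
print(request.decode("ascii"))
```

Unlike GET, the data travels in the request body after the blank line, not in the URL.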

If the web browser is closed, a FIN packet is sent to terminate the connection (as taken from blog.catchpoint.com).

By default HTTP uses port 80, but a different port number can be used (either specified in the URL, or a forwarding load balancer could send traffic to a different port). HTTPS is closely related to HTTP: it is HTTP carried over an encrypted TLS connection. HTTPS, if properly implemented, encrypts traffic between the workstation and the web server. HTTPS uses port 443 by default instead of port 80.
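The port rule can be expressed as: use the port in the URL if there is one, otherwise fall back to the scheme's default. A minimal sketch, where effective_port is an illustrative name:

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def effective_port(url):
    """Return the TCP port a client would connect to for this URL."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port  # an explicit port in the URL wins
    return DEFAULT_PORTS[parts.scheme]


for url in ("http://example.com/", "https://example.com/", "http://example.com:8080/"):
    print(url, effective_port(url))
```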

There are persistent and non-persistent HTTP connections. A persistent connection stays open after a response, so the three-way handshake is already completed; therefore reloading the web page can be done more quickly than with a non-persistent connection, which needs a new handshake for each request. (This paragraph is paraphrased from geeksforgeeks.org).
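One way to observe a persistent connection is to run a tiny local server and make two requests through the same client connection. In the sketch below, which uses only Python's standard library, the server echoes back the client's source port; if both responses show the same port, the second request reused the existing TCP connection. The handler and function names are invented for this example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class EchoPortHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # HTTP/1.1 keeps connections open by default

    def do_GET(self):
        body = f"client port {self.client_address[1]}".encode("ascii")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def fetch_twice_on_one_connection():
    """Send two GET requests over a single TCP connection to a local server."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), EchoPortHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
    bodies = []
    try:
        for _ in range(2):
            conn.request("GET", "/")
            bodies.append(conn.getresponse().read().decode("ascii"))
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
    return bodies


bodies = fetch_twice_on_one_connection()
print(bodies)
```

With a non-persistent connection, the client would open a new socket for the second request and the reported port would normally differ.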

To learn more about the different HTTP versions, see this Mozilla website. You may also want to read the article on this site, "How Do You Search the Logs of a Website that Is Not Functioning Properly?"

* There are four potential caches that could speed up the DNS record retrieval process (identifying an IP address via a website address). These four caches are the browser's cache, the OS cache, the router's cache, and the ISP cache (according to this posting).

** The TLD DNS server does not have every FQDN; it will be able to point to a relevant authoritative nameserver according to Cloudflare.com.

*** The DNS resolution process can be more complicated involving DNS caching servers, a database and recursive lookups.

**** The Fiddler tool can help show what happens when a URL request is made. It helps show the different portions of the web page and how they are rendered (e.g., the different sources of different pictures and images).

***** HTTP marks the end of each header line with a CRLF (carriage return plus line feed) sequence, and a blank line separates the headers from the body; within text content, lines may end in either a CRLF (the Windows convention) or an LF symbol. This processing is normal but it could involve a security risk; to read about the threat of a CRLF injection, see this posting.
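A common defense against CRLF injection is to refuse any header value that contains a carriage return or line feed, since those characters would let an attacker start a new header line. The following is a minimal sketch of that check; safe_header_value is a hypothetical helper, and mature web frameworks generally perform this kind of validation themselves.

```python
def safe_header_value(value):
    """Return the value unchanged, or raise if it could inject a new header line."""
    if "\r" in value or "\n" in value:
        raise ValueError("header value must not contain CR or LF")
    return value


# A harmless redirect target passes through.
print("Location: " + safe_header_value("/home"))

# User input that tries to smuggle in an extra header is rejected.
try:
    safe_header_value("/home\r\nSet-Cookie: admin=1")
except ValueError as error:
    print("rejected:", error)
```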


FFR
If you want to know more about how a Kubernetes cluster supports a website and how the routing works to a given Pod supporting a website, see this posting.

As of February 2021, there are over 1,500 TLDs recognized by the IANA (according to https://www.iana.org/domains/root/db). TLDs can have a generic category (e.g., .com) or country category (such as .uk, the TLD under which co.uk sits; see page 1212 of The Linux Programming Interface).

The tracert (on Windows) and traceroute (on Linux/Unix) commands can illustrate the intermediate hops via command prompt.

For more information about the three-way handshake that TCP uses, see these postings:

For more information about the different DNS servers, read the following. Different DNS servers play a role at different points in the resolution process (according to TechTarget). The DNS stub resolver server is one type; for the example of how DNS works, the DNS stub resolver server is your computer at home (according to TechTarget). If you use the hosts file on your computer at home, you can see how DNS resolution works with this "DNS server." The next, or second, DNS server is the DNS recursive resolver server (according to TechTarget); this can be something provided by your ISP or be your router or cable modem in your house (according to TechTarget). Finally the authoritative DNS server is used; this third one may instruct the recursive one to query a different authoritative server for the relevant top-level domain (e.g., .com) (according to TechTarget).
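The hosts file lookup mentioned above can be sketched as follows. The sample file content is invented (the addresses come from ranges reserved for documentation), and parse_hosts and resolve_locally are illustrative names; a real stub resolver also honors operating system settings about lookup order.

```python
def parse_hosts(text):
    """Parse hosts-file text into a {hostname: address} table."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            table.setdefault(name.lower(), address)  # first entry wins
    return table


def resolve_locally(hostname, table):
    """Return the local answer, or None to signal that DNS must be asked."""
    return table.get(hostname.lower())


SAMPLE_HOSTS = """
# hypothetical hosts file
127.0.0.1    localhost
192.0.2.10   intranet.example  intranet   # made-up internal site
"""

table = parse_hosts(SAMPLE_HOSTS)
print(resolve_locally("intranet", table))
print(resolve_locally("google.com", table))
```

A None result here corresponds to the point where the stub resolver hands the question to the recursive resolver.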
