How does the Internet work?
In our interconnected world, the internet has become an integral part of our daily lives, enabling instant communication, access to information, and seamless online experiences.
Have you ever wondered how this vast digital network functions and connects us to the practically boundless repository of knowledge, media, services and more?
In this article, we will delve into the inner workings of the internet, demystifying its underlying architecture and the technology that powers this global phenomenon. Right from the basics of data transmission to the complexities of routing and protocols, join me on a journey to uncover the secrets of how the internet works, empowering you with a deeper understanding of the digital web that shapes our modern existence.
To start with, we will learn about what clients and servers are and how they work together in accordance with the client-server model.
Client-Server Model
The client-server model is the backbone of modern networking, forming the foundation for seamless communication between devices and services over the Internet. At its core, this model involves two key components:
Client
In the client-server model, a client refers to a computing device or software application that initiates requests to a server for resources, services, or information. Clients can be various devices such as personal computers, laptops, smartphones, tablets, or IoT devices, as well as different software applications running on these devices.
Server
In the client-server model, a server is a computing device or software application that responds to requests from clients and provides them with the requested resources, services, or information. Production servers are designed to handle a large number of incoming client requests and are typically hosted in data centres with high-performance hardware and robust network connections.
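To make the two roles concrete, here is a minimal sketch using Python's standard `socket` module. The function names (`serve_once`, `ask_server`, `demo`) and the "Hello, " reply are made up for illustration; both the client and the server run on the same machine here, whereas in practice they would usually sit on different devices.

```python
import socket
import threading


def serve_once(server_sock: socket.socket) -> None:
    """Server role: wait for one client, read its request, send a reply."""
    conn, _addr = server_sock.accept()
    with conn:
        request = conn.recv(1024)
        conn.sendall(b"Hello, " + request)


def ask_server(host: str, port: int, message: bytes) -> bytes:
    """Client role: initiate the connection, send a request, read the reply."""
    with socket.create_connection((host, port), timeout=5) as client:
        client.sendall(message)
        chunks = []
        # Keep reading until the server closes the connection.
        while chunk := client.recv(1024):
            chunks.append(chunk)
        return b"".join(chunks)


def demo() -> bytes:
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server_sock:
        # Port 0 asks the operating system for any free port.
        server_sock.bind(("127.0.0.1", 0))
        server_sock.listen(1)
        port = server_sock.getsockname()[1]
        worker = threading.Thread(target=serve_once, args=(server_sock,))
        worker.start()
        reply = ask_server("127.0.0.1", port, b"client")
        worker.join()
        return reply
```

Calling `demo()` starts the server in a background thread, lets the client send the bytes `client`, and returns whatever the server answered. Note that the client always speaks first: the server only waits and responds.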
Request-Response Model
The second important model we need to understand is the request-response model. It is the fundamental communication pattern behind the exchange of data and services between clients and servers: a client sends a specific request for a resource or an action, and the server replies with the requested data or the result of that action.
Generally, the request-response model for the WWW (World Wide Web) transfers information over HTTP or HTTPS. Naturally, the question that now arises is: what are HTTP and HTTPS?
Story - Birth of the World Wide Web (WWW)
Sir Tim Berners-Lee, born on June 8, 1955, is a British computer scientist widely recognized as the inventor of the World Wide Web. In 1989, while working at CERN (the European Organization for Nuclear Research), Berners-Lee proposed the concept of a global information system that would use hypertext to link documents together. He created the first web browser and editor (called WorldWideWeb and later renamed Nexus) in 1990. Along with this, he developed the HyperText Transfer Protocol (HTTP) and the HyperText Markup Language (HTML), fundamental technologies that enable the exchange of information and the creation of web pages. On August 6, 1991, he posted the first-ever web page on the internet, introducing people to this revolutionary system. In 1994, he founded the World Wide Web Consortium (W3C) to standardize and evolve web technologies. Tim Berners-Lee's groundbreaking contribution laid the foundation for the modern web, revolutionizing the way information is shared, accessed, and connected across the globe. His vision of an open and accessible web continues to shape our digital world today.
Hypertext Markup Language - The language of the Web!
Hypertext is a concept that refers to text that contains links or references to other pieces of information, often in the form of clickable words or phrases. The term "hypertext" was coined by Ted Nelson in the 1960s, and it became a fundamental concept behind the World Wide Web.
In traditional printed media, such as books or articles, the text is linear and sequential. However, hypertext allows for non-linear and associative connections between pieces of information. When you encounter hypertext in digital form, such as on a website or in an electronic document, you can click on hyperlinks (embedded within the text) to access additional information or related content.
For example, a hypertext document about history might contain links to specific events or historical figures. By clicking on these links, you can jump to more detailed information about the selected topics. This interconnected structure allows users to navigate through information in a more flexible and dynamic manner, following their interests or needs.
The World Wide Web, which is built on the principles of hypertext, uses hypertext markup language (HTML) to create web pages that contain hyperlinks. Hypertext is a fundamental concept that revolutionized information access and navigation, enabling the seamless linking of information across the internet and creating a web of interconnected knowledge.
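As a small illustration of how hyperlinks live inside an HTML document, the sketch below pulls the link targets out of a page using Python's built-in `html.parser` module. The sample page, its paths, and the names `LinkCollector` and `extract_links` are invented for this example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href target of every <a> tag in an HTML document."""

    def __init__(self) -> None:
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag's attributes.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def extract_links(html_text: str) -> list:
    collector = LinkCollector()
    collector.feed(html_text)
    collector.close()
    return collector.links


# A made-up history page with two hyperlinks.
SAMPLE_PAGE = """
<p>Read about a <a href="/events/some-event.html">historical event</a>
and a <a href="/people/some-person.html">historical figure</a>.</p>
"""
```

`extract_links(SAMPLE_PAGE)` returns the two paths in the order they appear. A browser does something similar when it turns each `<a>` element into a clickable link.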
How is Hypertext delivered? - HTTP and HTTPS
HTTP (Hypertext Transfer Protocol) and HTTPS (Hypertext Transfer Protocol Secure) are application-layer protocols that are widely used for data communication over the Internet. They are part of the TCP/IP protocol suite, which is the set of networking protocols used to facilitate communication between devices on the internet.
Here's how HTTP and HTTPS work with the IP, TCP, and UDP protocols:
- Internet Protocol (IP): IP is responsible for addressing packets and routing them across the internet. Think of an IP address as a postal address for computers: just as you receive mail at your postal address, a computer receives packets (also called datagrams) at its IP address. IP defines the structure of the addresses (IPv4 or IPv6) that identify devices connected to the network. It is a connectionless protocol, meaning it does not establish a connection between devices before sending data; each packet is routed independently to its destination. IPv4 and IPv6 are the two versions of the protocol in use, and the main difference lies in their address formats and the number of available addresses.
  - IPv4: IPv4 uses 32-bit addresses written in dotted-decimal format, such as 192.168.0.1. Because only about 4.3 billion IPv4 addresses exist, address exhaustion became a concern as more devices connected to the internet.
  - IPv6: IPv6, on the other hand, uses 128-bit addresses written in hexadecimal format, such as 2001:0db8:85a3:0000:0000:8a2e:0370:7334. With approximately 340 undecillion (340 trillion trillion trillion) unique addresses, it provides an enormous address space, so that every connected device can have a unique, globally routable address.
- Transmission Control Protocol (TCP): TCP is a transport-layer protocol that operates above IP and provides reliable, ordered, and error-checked data delivery between applications running on devices. It establishes a connection-oriented communication channel between two devices before transmitting data. TCP ensures that data packets are delivered in sequence and retransmits lost packets if necessary, ensuring data integrity and reliability.
- User Datagram Protocol (UDP): Like TCP, UDP is also a transport-layer protocol that operates above IP. However, UDP is a connectionless protocol, meaning it does not establish a dedicated connection before transmitting data. Instead, it sends data packets (datagrams) without checking if they are received or in order. UDP is useful for applications that prioritize speed and efficiency over reliability, such as real-time streaming or online gaming.
- HTTP/HTTPS: HTTP and HTTPS are application-layer protocols that sit on top of the transport layer. HTTP/1.1 and HTTP/2 run over TCP, while the newer HTTP/3 runs over QUIC, a transport that is itself built on UDP. They define how web browsers and web servers communicate with each other to transfer data, particularly web pages, images, videos, and other resources. The key difference between HTTP and HTTPS is the security aspect:
  - HTTP: Plain HTTP sends data as unencrypted text. This lack of encryption makes HTTP vulnerable to eavesdropping and data interception, especially on unsecured networks.
  - HTTPS: HTTPS, on the other hand, is HTTP carried over a connection encrypted with TLS (Transport Layer Security), the successor to the now-deprecated SSL (Secure Sockets Layer). Encryption secures the data exchanged between the client (web browser) and the server, so that sensitive information remains confidential and protected from unauthorized access.
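The difference in address size between IPv4 and IPv6 is easy to check with Python's standard `ipaddress` module. This is only an illustrative sketch; the helper name `describe` is made up, and the two addresses are the sample ones used above.

```python
import ipaddress

v4 = ipaddress.ip_address("192.168.0.1")
v6 = ipaddress.ip_address("2001:0db8:85a3:0000:0000:8a2e:0370:7334")


def describe(address) -> dict:
    """Summarise an address: its IP version, its width in bits, and how
    many distinct addresses that width can represent."""
    bits = address.max_prefixlen  # 32 for IPv4, 128 for IPv6
    return {"version": address.version, "bits": bits, "total": 2 ** bits}


print(describe(v4))   # total is 2**32, roughly 4.3 billion
print(describe(v6))   # total is 2**128, roughly 340 undecillion
print(v6.compressed)  # the same IPv6 address in its shortened form
```

The `compressed` attribute shows the conventional short form of an IPv6 address, which drops leading zeros and collapses one run of all-zero groups into `::`.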
In summary, HTTP and HTTPS are application-layer protocols used for data communication over the Internet. They sit on top of a transport-layer protocol (traditionally TCP; HTTP/3 uses QUIC, which runs over UDP), which in turn operates on top of the Internet Protocol (IP). HTTPS adds encryption so that the transfer is secure as well as reliable.
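To see what "connectionless" means in practice, here is a hedged sketch that sends a single UDP datagram between two sockets on the same machine. The function name `udp_round_trip` is made up for illustration. Notice that there is no `connect`, `listen`, or `accept` step, unlike with TCP.

```python
import socket


def udp_round_trip(message: bytes) -> bytes:
    """Send one datagram from a sender socket to a receiver socket."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        # Port 0 asks the operating system for any free port.
        receiver.bind(("127.0.0.1", 0))
        # Give up after 5 seconds instead of waiting forever for a lost datagram.
        receiver.settimeout(5)
        # No handshake: the datagram is simply addressed and sent.
        sender.sendto(message, receiver.getsockname())
        data, _sender_address = receiver.recvfrom(1024)
        return data
```

On the local loopback interface the datagram arrives in practice, but UDP itself makes no such promise. Across a real network the message could be lost, duplicated, or reordered, and it would be up to the application to notice.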
Under the Hood of HTTP
HTTP (Hypertext Transfer Protocol) is an application-layer protocol used for data communication on the World Wide Web. It defines how messages are formatted and transmitted between web browsers (clients) and web servers. HTTP operates on a request-response model, where clients send requests to servers, and servers respond with the requested data. Let's take a closer look at some HTTP requests and responses to understand how it works.
HTTP and REST (Representational State Transfer) are closely related: REST is an architectural style that is most commonly implemented over HTTP. RESTful APIs (Application Programming Interfaces) are built on top of HTTP, leveraging its methods, headers, and status codes to create a standardized and scalable way for clients to interact with resources on the server.
HTTP Request
An HTTP request is a message sent by a client (typically a web browser) to a server, following the rules of the Hypertext Transfer Protocol (HTTP). It is a fundamental part of how data is exchanged between clients and servers on the World Wide Web. The components of a typical HTTP request are shown below.
- Request Line: The first line of an HTTP request contains the method, the path of the requested resource (Uniform Resource Identifier or URI), and the HTTP version.
GET /example.html HTTP/1.1
- Headers: After the request line come headers that provide additional information about the request, such as the target host, the user agent, and accepted content types. Most headers are optional, but HTTP/1.1 requires the Host header.
Host: example.com
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.54 Safari/537.36
- Body: Some requests, like POST or PUT, may contain a body with data to be sent to the server (e.g., form data or JSON data).
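The three parts above can be assembled by hand. The following is a minimal sketch, not a full HTTP client: `build_request` is a hypothetical helper written for this article, and the `/items` path in the usage note is likewise made up.

```python
from typing import Optional


def build_request(method: str, path: str, host: str,
                  headers: Optional[dict] = None, body: bytes = b"") -> bytes:
    """Assemble a raw HTTP/1.1 request: request line, headers, blank line, body."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    extra = dict(headers or {})
    if body:
        # Tell the server how many body bytes to expect.
        extra["Content-Length"] = str(len(body))
    lines.extend(f"{name}: {value}" for name, value in extra.items())
    # Lines end with CRLF, and an empty line separates the headers from the body.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body
```

For instance, `build_request("GET", "/example.html", "example.com")` produces the request line and Host header shown above, followed by an empty line. Passing a body, as in `build_request("POST", "/items", "example.com", {"Content-Type": "application/json"}, b'{"a": 1}')`, adds a Content-Length header and appends the body after the blank line.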
HTTP Response
An HTTP response is a message sent by a server in reply to an HTTP request made by a client (typically a web browser). It follows the rules of the Hypertext Transfer Protocol (HTTP) and carries the requested data or status information about the request. The components of a typical HTTP response are shown below.
- Status Line: The first line of an HTTP response contains the HTTP version, the status code, and the reason phrase.
HTTP/1.1 200 OK
- Headers: After the status line, there can be optional headers providing additional information about the response, such as the content type, content length, and more.
Content-Type: text/html
Content-Length: 1234
- Body: The response body contains the data requested by the client (e.g., HTML content, JSON data, image data, etc.).
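Going the other way, a raw response can be split back into those three parts. This is a simplified sketch that assumes a well-formed HTTP/1.1 response held entirely in memory; `parse_response` and the sample response are made up for illustration, and real clients handle many more cases (chunked bodies, repeated headers, a missing reason phrase, and so on).

```python
def parse_response(raw: bytes) -> dict:
    """Split a raw HTTP/1.1 response into status line, headers, and body."""
    # The first empty line separates the head from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lowercase.
        headers[name.strip().lower()] = value.strip()
    return {"version": version, "status": int(code), "reason": reason,
            "headers": headers, "body": body}


SAMPLE_RESPONSE = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 14\r\n"
    b"\r\n"
    b"<h1>Hello</h1>"
)
```

`parse_response(SAMPLE_RESPONSE)` yields the status code as an integer, the headers as a dictionary, and the body as bytes, which mirrors the Status Line, Headers, and Body breakdown above.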
Status Codes
HTTP responses include a three-digit status code that indicates the result of the request. Status codes can be categorized into different types based on their numerical ranges:
- 1xx Informational: The server received the request and is continuing to process it. Example: 100 Continue, 101 Switching Protocols
- 2xx Success: The request was successfully received, understood, and processed by the server. Example: 200 OK, 201 Created, 204 No Content
- 3xx Redirection: The client must take additional action to complete the request, typically by following a redirect to a different URL. Example: 301 Moved Permanently, 302 Found (a temporary redirect), 304 Not Modified
- 4xx Client Errors: The client seems to have made an error in the request, such as providing invalid data or requesting a non-existing resource. Example: 400 Bad Request, 401 Unauthorized, 403 Forbidden, 404 Not Found
- 5xx Server Errors: The server failed to fulfil a valid request due to an error on the server's side. Example: 500 Internal Server Error, 502 Bad Gateway, 503 Service Unavailable
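Because the category is determined by the first digit, classifying a status code takes only a few lines. Here is a small sketch using Python's standard `http.HTTPStatus` enum for the reason phrases; the `STATUS_CLASSES` table and the `describe_status` helper are made up for this article.

```python
from http import HTTPStatus

# The first digit of the code selects the category.
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def describe_status(code: int) -> str:
    """Return the code, its standard reason phrase, and its category."""
    category = STATUS_CLASSES.get(code // 100, "Unknown")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        # The code is not one that the standard library knows about.
        phrase = "Unrecognized"
    return f"{code} {phrase} ({category})"


for code in (200, 301, 404, 503):
    print(describe_status(code))
```

A client that receives a code it does not recognize can still fall back on the category, which is why the sketch looks up the first digit separately from the exact code.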
Building Blocks of the Web
- HTML (Hypertext Markup Language): HTML provides the structure and content of a webpage. It defines the various elements on the page, such as headings, paragraphs, images, links, forms, and more.
- CSS (Cascading Style Sheets): CSS is used to control the presentation and layout of webpages. It styles the HTML elements, defining their colours, fonts, sizes, margins, and positioning.
- JavaScript: JavaScript adds interactivity and dynamic behaviour to webpages. It enables actions such as form validation, animation, handling user interactions, and updating content without requiring a page reload.
- Images and Graphics: Images and graphics are essential for visual content on a website. They include photographs, icons, logos, and other visual elements that enhance the user experience.
- Typography: Typography refers to the choice of fonts and their styling on the website. Selecting appropriate fonts can significantly impact the readability and aesthetics of the site.
- Navigation: Website navigation provides a way for users to move around the site and access different pages. It can include menus, breadcrumbs, sitemaps, and other elements that help users find the information they need.
- Responsive Design: Responsive design ensures that the website adapts and looks good on different devices and screen sizes, including desktops, tablets, and mobile phones.
- Forms: Forms allow users to submit data, make selections, and interact with the website. They are used for tasks such as contact forms, login forms, and search boxes.
- Media (Audio and Video): Media elements like audio and video enrich the website's content and engagement. They can be used for background music, video presentations, or embedded tutorials.
Webpage vs. Website vs. Webapp
The terms "Webpage," "Website," and "Webapp" are related but refer to different concepts on the World Wide Web. Here's a breakdown of the differences between them:
- Webpage:
  - A webpage is a single document or resource that can be displayed in a web browser.
  - It is typically written in Hypertext Markup Language (HTML) and may contain text, images, videos, links, and other media elements.
  - A basic webpage is often static, meaning it presents fixed content to the user, although pages can also include scripts that add interactive or dynamic behaviour.
  - Each webpage is accessed using a specific Uniform Resource Locator (URL).
  - For example, the "About Us" page or the "Contact" page on a website is a single webpage.
- Website:
  - A website is a collection of related webpages that are linked together and accessible from a common domain or subdomain.
  - It consists of multiple interconnected webpages that provide information, services, or functionality to visitors.
  - Websites can be either static, with fixed content, or dynamic, with content that is updated or personalized based on user interactions or data.
  - A website may contain a homepage, multiple content pages, image galleries, contact forms, and other features that form a cohesive unit.
  - Websites are hosted on web servers and can be accessed by users through their web browsers.
  - For example, www.example.com is a website that can consist of multiple webpages accessible through different URLs.
- Web App (Web Application):
  - A web app, short for web application, is a dynamic and interactive software program that runs in a web browser.
  - Unlike static webpages, web apps offer more complex functionality and often involve user interactions, data processing, and real-time updates.
  - Web apps are typically written in programming languages such as JavaScript, Python, or Ruby on the backend and may use frameworks and libraries like React, Angular, or Vue to enhance their frontend capabilities.
  - They can provide a user experience similar to that of native applications, offering features like form submission, data retrieval, real-time chat, and more.
  - Web apps can be hosted on web servers and accessed through a web browser, just like websites.
  - Examples of web apps include email clients, social media platforms, online banking systems, and productivity tools like Google Docs.
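The difference between fixed and generated content can be sketched on the server side with Python's WSGI interface (the standard way Python web servers call application code). Everything specific here is made up for illustration: the `/about.html` and `/greet` paths, the `app` function, and the `call_app` helper, which invokes the application directly instead of going through a real web server.

```python
import html
from urllib.parse import parse_qs
from wsgiref.util import setup_testing_defaults

# Static content: every visitor receives exactly the same bytes.
ABOUT_PAGE = b"<h1>About Us</h1>"


def app(environ, start_response):
    """A tiny WSGI application with one static page and one dynamic page."""
    path = environ.get("PATH_INFO", "/")
    if path == "/about.html":
        body = ABOUT_PAGE
    elif path == "/greet":
        # Dynamic content: the response depends on what the user sent.
        query = parse_qs(environ.get("QUERY_STRING", ""))
        name = query.get("name", ["stranger"])[0]
        body = f"<h1>Hello, {html.escape(name)}!</h1>".encode("utf-8")
    else:
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"Not Found"]
    start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
    return [body]


def call_app(path: str, query: str = ""):
    """Call the application the way a WSGI server would, without a network."""
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path
    environ["QUERY_STRING"] = query
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], body
```

`call_app("/about.html")` always returns the same page, while `call_app("/greet", "name=Ada")` builds its page from the query string. Real web apps go much further (databases, sessions, frontend frameworks), but the static-versus-generated distinction is the same.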
Other commonly used Protocols on the Internet
Several common protocols are used on the internet to enable communication between devices and facilitate the exchange of data. Some of the most widely used protocols include:
- FTP (File Transfer Protocol): FTP is used for transferring files between a client and a server over a network. It allows users to upload, download, and manage files on remote servers.
- SMTP (Simple Mail Transfer Protocol): SMTP is the standard protocol used for sending and receiving email messages between email servers. It enables the transmission of outgoing emails from a sender's mail server to the recipient's mail server.
- POP3 (Post Office Protocol version 3): POP3 is a protocol used to retrieve email messages from a mail server to a client's email application (like Outlook or Thunderbird). It allows users to download and store emails locally on their devices.
- IMAP (Internet Message Access Protocol): IMAP is another protocol for retrieving email messages from a mail server to a client's email application. Unlike POP3, IMAP allows users to access emails without downloading them, keeping the messages stored on the server.
- DNS (Domain Name System): DNS translates human-readable domain names (like www.example.com) into IP addresses that computers can understand. It enables users to access websites using domain names, simplifying the process of locating resources on the internet.
- DHCP (Dynamic Host Configuration Protocol): DHCP is used to automatically assign IP addresses and other network configuration information to devices on a network. It simplifies the process of setting up and managing IP addresses for connected devices.
- SSH (Secure Shell): SSH is a cryptographic network protocol used for secure remote access and secure file transfer between devices. It provides encrypted communication for secure login and remote command execution.
- SNMP (Simple Network Management Protocol): SNMP is used for network management and monitoring devices like routers, switches, and servers. It allows administrators to gather information and manage devices remotely.
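Of the protocols above, DNS is the easiest to try from code, because every operating system ships a resolver. The sketch below uses `socket.getaddrinfo` from Python's standard library; the `resolve` helper is made up for illustration. It asks the operating system's resolver, which consults DNS along with local sources such as a hosts file.

```python
import socket


def resolve(hostname: str) -> list:
    """Return the sorted, de-duplicated IP addresses a hostname maps to."""
    infos = socket.getaddrinfo(hostname, None, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # the address string is the first element of sockaddr.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    # "localhost" resolves without any network access.
    print(resolve("localhost"))
```

Calling `resolve("www.example.com")` on a machine with internet access would return that site's current addresses, which may include both IPv4 and IPv6 entries.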
These protocols, along with many others, form the foundation of modern Internet communication, making it possible for devices and services to interact seamlessly and efficiently across the global network.
In conclusion, understanding how the internet works empowers us with insights into the technology that shapes our modern world. The digital web continues to evolve, connecting people, information, and services, and the underlying architecture will always remain a fascinating realm to explore and marvel at. Learning more about the Internet can be a highly empowering and rewarding activity, especially for those aspiring to be developers.