Reading time: 7 minutes
“This site uses cookies and similar technologies to help personalize our content and provide a better experience.” How many times have you come across this message when opening a website? It usually gets little attention: most people click “Accept” to get rid of the annoying message as quickly as possible. But what are these cookies in practice, and why are we asked to accept their use every time?
Although the name and the cover image might suggest otherwise, I’m obviously not referring to the cookies we usually have for breakfast. The cookies we will look at belong entirely to the world of computer science.
Let’s remember what happens when you visit a website
The first thing you need to know about cookies is that they are essential for the functioning of the web. To help you understand why, let me first remind you of what happens when you visit a website. Let’s take the case of social networks like Instagram, Facebook, Twitter, or similar platforms. Every time you connect to one of these, your computer sends a request over the internet to another computer (the one that manages the social network) and receives a response containing the requested page, post, or any other information. In this operation, your device is referred to as the client, while the computer that manages the social network is called the server. Of course, this mechanism applies in general – it happens every time you visit any other website. However, the example of a social network was not chosen at random, and you will understand why later on.
The HTTP protocol
So, client and server communicate through a series of messages. You should know that these messages are based on a communication protocol called HTTP (HyperText Transfer Protocol). This protocol defines a sort of language that allows clients and servers to communicate. There are two types of HTTP messages: request messages (called HTTP requests) and response messages (called HTTP responses). HTTP requests correspond to the messages that the client sends to the server to request a specific resource. On the other hand, HTTP responses correspond to the messages that the server sends to the client containing the requested resources.
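To make the two message types more concrete, here is a minimal sketch in Python that builds the text of an HTTP request and picks apart the text of an HTTP response. The host `social.example`, the path `/feed`, and the helper names `build_request` and `parse_response` are assumptions made up for illustration; real browsers and servers send many more headers than this.

```python
# Illustrative only: the host name, path, and helper names are made up.

def build_request(method: str, path: str, host: str) -> str:
    """Return the text of a minimal HTTP/1.1 request."""
    lines = [
        f"{method} {path} HTTP/1.1",  # request line: method, resource, version
        f"Host: {host}",              # header naming the server being contacted
        "",                           # an empty line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines)


def parse_response(raw: str):
    """Split a raw HTTP response into status code, headers, and body."""
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    status_code = int(status_line.split(" ")[1])
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_code, headers, body


# What the client sends...
request = build_request("GET", "/feed", "social.example")

# ...and a hand-written example of what the server might send back.
response = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
    "<html>your feed</html>"
)
status, headers, body = parse_response(response)

print(request)
print(status, headers["content-type"], body)
```

The request names the resource the client wants, and the response carries a status code, some headers, and the resource itself.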
HTTP has become well established as the primary system for transmitting information on the web, having been used for over three decades. However, some characteristics of HTTP can be limiting for the modern web. One of these is that it is stateless: neither the server nor the client maintains, at the protocol level, any information about previously exchanged messages. This is a problem whenever a user needs to stay authenticated to use a service, as is the case with social networks. In other words, for a social network or any other web service that requires authentication, once the user has been recognized, the server should be able to remember them across multiple requests.
Let me explain further with an example. Suppose you are accessing your social media profile. You go to the login page, enter your username and password, and click submit. At this point, your computer will send a request to the social media server, which, after checking your username and password, will understand that it’s indeed you wanting to access your profile. The server will then return the data related to your homepage with posts from your friends and people you follow. So far, so good: everything will have concluded with a single exchange of messages, and no one will need to remember anything.
Now, let’s say you want to like a post from one of your friends. You click on the thumbs-up icon, and your computer automatically generates a new request to send to the social media server, which should then recognize you and save your choice. Now, if we were using pure HTTP and no other mechanism, unfortunately, the server would have no way to remember that you were already authenticated. As mentioned earlier, HTTP is stateless. In other words, the server would see your new request as coming from an unknown entity and would ask you to authenticate again by entering your username and password.
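The problem can be sketched with a toy server in Python. This is not how a real social network is implemented: the `USERS` table, the dictionary-shaped requests, and the `handle_request` function are hypothetical stand-ins. The point is that each call is handled in isolation, just like a request over pure HTTP.

```python
# Hypothetical credentials and handler, for illustration only.
USERS = {"alice": "s3cret"}


def handle_request(request: dict) -> str:
    """Handle one request in isolation: nothing is kept between calls."""
    username = request.get("username")
    password = request.get("password")
    # The only way to recognize the user is from data inside this very request.
    if username in USERS and USERS[username] == password:
        return f"200 OK: {request['action']} done for {username}"
    return "401 Unauthorized: please log in"


# First request: the login form carries the credentials, so it succeeds.
login = handle_request(
    {"action": "login", "username": "alice", "password": "s3cret"}
)

# Second request: the "like" carries no credentials, and the server kept no memory.
like = handle_request({"action": "like post 42"})

print(login)
print(like)
```

The second request is rejected even though it comes right after a successful login, because nothing in it tells the server who is asking.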
Of course, in reality, this does not happen. Once you have authenticated yourself, your social media platform remembers you and won’t ask for your username and password again until you click on the “Log out” button or your session expires. So, how does the server remember you if HTTP is stateless? The answer should be apparent by now: it uses cookies.
What are cookies?
A cookie is simply a small piece of data that moves between the client and the server. You can think of it as a tag that the server hands you when you authenticate with the correct username and password. In HTTP terms, the server sends the cookie in a response header called Set-Cookie, and your browser stores it and sends it back in a request header called Cookie. After authentication, whenever you want to perform an operation, your computer attaches that tag to the request. This way, when the server receives a request carrying your tag, it can read it to recognize you and spare you the authentication process. This is why cookies are of fundamental importance for the functioning of the web. But then why are they often described in a negative way? And why are cookies used on websites that don’t require any authentication at all?
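The tag mechanism can be sketched as follows, using `http.cookies.SimpleCookie` and `secrets` from Python's standard library. Everything else is an assumption for illustration: the `USERS` and `SESSIONS` tables, the cookie name `session_id`, and the `login` and `like_post` functions are hypothetical, and a real service would add expiry, secure cookie attributes, and persistent storage.

```python
import secrets
from http.cookies import SimpleCookie

USERS = {"alice": "s3cret"}  # hypothetical account
SESSIONS = {}                # session id -> username, kept on the server


def login(username: str, password: str) -> dict:
    """Check credentials; on success return response headers carrying a cookie."""
    if USERS.get(username) != password:
        return {}
    session_id = secrets.token_hex(16)  # random, hard-to-guess tag
    SESSIONS[session_id] = username
    cookie = SimpleCookie()
    cookie["session_id"] = session_id
    return {"Set-Cookie": cookie["session_id"].OutputString()}


def like_post(request_headers: dict, post_id: int) -> str:
    """Recognize the user from the Cookie header instead of asking for a password."""
    cookie = SimpleCookie(request_headers.get("Cookie", ""))
    morsel = cookie.get("session_id")
    username = SESSIONS.get(morsel.value) if morsel is not None else None
    if username is None:
        return "401 Unauthorized: please log in"
    return f"200 OK: {username} liked post {post_id}"


# Client side: keep the cookie from the login response...
response_headers = login("alice", "s3cret")
received = SimpleCookie(response_headers["Set-Cookie"])
cookie_header = f"session_id={received['session_id'].value}"

# ...and attach it to every later request.
with_cookie = like_post({"Cookie": cookie_header}, 42)
without_cookie = like_post({}, 42)

print(with_cookie)
print(without_cookie)
```

Note that the cookie itself holds only a random identifier. The link between that identifier and the user lives on the server, which is one common design but not the only possible one.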
Cookies and privacy
The ability to recognize a user across multiple requests can also be exploited to keep a sort of “history” of their activities. In this way, cookies can be used to build highly detailed profiles of users and their interests. These profiles can then be used for personalized advertising, for offering more relevant services, and for other commercial purposes.
Have you ever performed an internet search and subsequently received relevant advertisements? Cookies are partly connected to this situation. They are used to store information about the user’s activities on the website, such as visited pages, purchased items, browsing preferences, and more. Advertising companies use this information to display ads that align with the user’s browsing habits and interests. For example, if a user visited an e-commerce website to purchase a specific product, they might see advertisements for that product on other websites later on.
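As a rough illustration of how such a history can accumulate, the sketch below simulates an advertising service whose content is embedded in pages of several websites. It is a deliberately simplified model and rests on assumptions: the `ad_request` function, the cookie name `visitor_id`, and the site names are all hypothetical, and real advertising systems are far more elaborate.

```python
import uuid
from collections import defaultdict

# visitor id -> pages seen; held by the hypothetical advertising service
PROFILES = defaultdict(list)


def ad_request(cookies: dict, page: str) -> dict:
    """Simulate the ad service handling a request from a page that embeds its ads.

    Returns the cookies the browser should store for the ad service.
    """
    # Reuse the visitor's identifier if the browser sent one, else create it.
    visitor_id = cookies.get("visitor_id") or uuid.uuid4().hex
    PROFILES[visitor_id].append(page)
    return {"visitor_id": visitor_id}


# The browser's cookie jar for the ad service starts out empty.
browser_cookies = {}

# Visiting two unrelated sites that both embed the same ad service.
browser_cookies.update(ad_request(browser_cookies, "shop.example/running-shoes"))
browser_cookies.update(ad_request(browser_cookies, "news.example/sports"))

profile = PROFILES[browser_cookies["visitor_id"]]
print(profile)
```

Because the same identifier comes back with every request, the two visits end up in a single profile, which is what makes it possible to show an ad for the shoes on the news site.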
However, it is important to note that the use of cookies for behavioral advertising is subject to regulations and limitations under privacy laws such as the GDPR in Europe and the CCPA in California. Users have the right to choose whether to allow or deny the use of cookies for behavioral advertising and to exercise their privacy rights over their personal data. This is why the first thing you are asked when accessing a website is consent for cookies.
I hope it’s clearer to you now what cookies are. The next time you visit a new website, you’ll be able to make an informed decision about whether to accept them or not.
Before concluding, I want to make a clarification. Earlier, I said that logging in and loading your homepage concludes with a single exchange of messages, so that no one needs to remember anything. That is a simplification I deemed necessary to help you understand the issue at hand. In reality, if you were to disable cookies, you would likely not even be able to see your homepage, because just loading it often requires multiple interactions between the client and the server. It was important to clarify this, but the underlying concept remains valid.
That’s all for now. Thank you for reading this far. Best regards!