Demystifying the Dark Web: An Introduction to Tor and Onion Routing

When you think of the dark web, you might think of hackers, the Silk Road, and other illegal services that operate covertly and mysteriously. However, the essence of the dark web – anonymous communication – is useful for more than just illicit behavior. The technologies that surround the dark web are used by journalists, people who need access to censored news sites, the U.S. military, and even people who just want to avoid annoying targeted ads. It may also surprise you that the infrastructure used to browse and communicate anonymously on the dark web is not so different from that of the internet we know now.

This article is meant to help clarify the dark web, shedding light on how its technology simply builds onto the systems we’re already familiar with. You’ll get an introduction to Tor, the popular open-source software that allows anonymous communication over the web, and learn how Tor helps form the dark web and what it means to use Tor for yourself.

What is Tor?

Tor is free, open-source software that helps anonymize your traffic while you browse the web. It is maintained by a nonprofit organization called The Tor Project.

To understand how it works, we can first think of the internet as a big collection of devices (such as computers, routers, and servers) that can communicate with each other – with routers being the devices that facilitate these conversations.

When you try to visit a website on a normal web browser (like Google Chrome or Firefox), your web traffic goes through several routers on the internet before it reaches its destination. If someone happens to intercept your traffic at any point along its route, they can see key information like what IP address it came from, which one it’s going to, and potentially the data you’re sending if the communication is unencrypted. In short, you don’t have anonymity – this lets hackers, your ISP (Internet Service Provider), or companies like Facebook snoop on what sites you visit.

A diagram of a normal route from your computer to a website.

Onion routing: a way to anonymize traffic

Tor offers a solution to this problem by anonymizing your traffic as you browse the web with the help of a distributed, anonymous network. This network, known as the Tor network, is a series of volunteer-run routers, or relays, that can talk to each other (as well as normal devices on the internet). These particular routers are special because they share an agreement to forward anonymous messages sent using Tor, as defined by a routing protocol called onion routing.

Instead of finding a direct path from your computer to your destination site, onion routing chooses a random series of relays, or nodes, on the Tor network to pass your traffic through. This random path is called a Tor circuit, and is 3 nodes long by convention, with an entry node (also known as a guard), a middle node, and an exit node.
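To make the idea of a circuit concrete, here is a minimal sketch of picking three distinct relays at random. The relay names and the `pick_circuit` helper are made up for illustration; a real Tor client reads relays from the published directory and weighs its choices by things like bandwidth, so treat this as a simplification rather than Tor's actual selection logic.

```python
import random

# Hypothetical relay names; a real client would read these from the
# publicly published list of Tor relays.
RELAYS = ["relay-a", "relay-b", "relay-c", "relay-d", "relay-e", "relay-f"]

def pick_circuit(relays, rng=random):
    """Pick three distinct relays to act as entry (guard), middle, and exit."""
    entry, middle, exit_node = rng.sample(relays, 3)
    return {"entry": entry, "middle": middle, "exit": exit_node}

circuit = pick_circuit(RELAYS)
print(circuit)
```

Because `random.sample` never returns the same element twice, no relay can appear in two positions of the same circuit.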

Image description: Diagram of a path from your computer to a website, going through the Tor network.

Before your computer (the Tor client) sends your web traffic through the circuit, Tor will encrypt it with multiple layers of encryption, much like that of an onion. These layers correspond to, and can only be decrypted by, each respective node in the circuit. For a primer on public-key encryption, refer here.

Image description: The initial encrypted message sent out by the Tor client, wrapped in encryptions made with the entry node’s key, the middle node’s key, and then finally the exit node’s key.

When your traffic first leaves, Tor will modify it so that only the source (your computer) and the immediate destination (the entry node) can be known to anyone who might intercept it. The rest of the information – like the two other nodes in the circuit, the final destination, and any data you send – is incrementally decrypted, one layer per node, and used to forward your traffic until it reaches its final destination.
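The layering can be sketched in a few lines of code. The example below uses a toy XOR cipher built from SHA-256 as a stand-in for real encryption, and the `wrap` and `peel` helpers and the hop names are invented for this illustration. It is not how Tor encrypts traffic and it is not secure; it only shows the order in which layers are added and removed.

```python
import hashlib

def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy cipher: XOR data with a SHA-256-derived keystream.
    Applying it twice with the same key restores the input.
    Illustration only; NOT Tor's real cryptography."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

def wrap(message: bytes, hops):
    """hops is a list of (name, key) pairs in path order, entry first.
    The exit's layer is applied first so the entry's layer ends up outermost."""
    payload = message
    next_hop = "destination"
    for name, key in reversed(hops):
        payload = keystream_xor(key, next_hop.encode() + b"|" + payload)
        next_hop = name
    return payload

def peel(payload: bytes, key: bytes):
    """What a single relay does: remove its own layer and learn the next hop."""
    plain = keystream_xor(key, payload)
    next_hop, _, inner = plain.partition(b"|")
    return next_hop.decode(), inner

hops = [("entry", b"entry-key"), ("middle", b"middle-key"), ("exit", b"exit-key")]
cell = wrap(b"GET / HTTP/1.1", hops)
for name, key in hops:
    next_hop, cell = peel(cell, key)
    print(f"{name} forwards to {next_hop}")
```

Each relay can only read the name of the next hop; everything beneath its own layer still looks like random bytes until the following relay peels it.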

Image description: Image of how the data packet gets decrypted at each step of the Tor circuit.

How does Onion Routing help anonymize?

Onion routing has a very helpful property: each node in the circuit only knows who sent it the message, and where to send it next. In normal routing, the source and destination of your traffic are visible along its whole route. With Tor, the source of your traffic (i.e. your IP address) is never associated with your final destination at any single point along the way.

The beginning and end of the circuit are the most vulnerable points. At the start of the route, your ISP, for example, might be able to see that traffic from your device is headed to a Tor entry node, and can know that you’re using Tor. But once the encrypted message goes into the Tor network, your ISP won’t be able to know what you’re trying to access with Tor, or the data that you’re sending.

The exit node – where your traffic leaves the Tor network and goes to its final destination – is another point of vulnerability. If you’re not using an encrypted communication protocol (e.g. using HTTP instead of HTTPS), this can allow a snooper to intercept and read personally-identifying data in the route’s final leg.

However, even if you are using an encrypted protocol to send something personally identifying, like logging into Facebook, your anonymity is at risk if Facebook happens to correlate your identity to other traffic seen at the same exit node. This is why, as a rule of thumb, you shouldn’t send anything involving personal information (like your username, email, Social Security Number…) while using Tor.

What is the Tor Browser, and how does it help anonymize browsing?

Tor’s onion routing protocol helps anonymize the transport part of sending data on the network, but there are other ways surfing the web can leak personal information that have nothing to do with your IP address. The Tor Browser is a special web browser built not only to offer the benefits of Tor routing when you go online, but also protect against other major privacy vulnerabilities associated with normal web browsing.

You might be surprised that the Tor Browser looks, well, like a normal browser. The Tor Browser is basically a modified version of Mozilla Firefox, which is open-source and lends itself to modification. It looks and behaves pretty much like any other browser you use to surf the web. You can use it to go to any websites you normally go to, like social media or news sites.

How does the Tor Browser help you browse with Tor?

When you visit a website with the Tor Browser, the browser software automatically chooses a random 3-node Tor circuit from a publicly published list of Tor nodes. By clicking the “i” icon to the left of the address bar, you can observe the entry node, the middle node, and the exit node in your Tor circuit, along with their IP addresses and countries of origin. The screenshots below show what the browser looks like when visiting a normal website: https://check.torproject.org.

Screenshots of the Tor Browser on check.torproject.org, with and without the Tor circuit popup.

In the eyes of your destination website, “your” IP address is, in fact, the IP address of the last node in the Tor circuit. (E.g. in the case above, the site thinks the source’s IP is 104.244.72.251 in Luxembourg).

What else does it do to provide anonymity?

Sending your data along the Tor Network isn’t enough to ensure anonymity – that’s why the Tor Browser implements additional measures to safeguard your browsing. Below are a few key functionalities of the browser, and additional ones can be found in the Tor Browser’s official design document.

Refreshing and resetting

Because staying on the same Tor circuit can risk your anonymity, the Tor Browser has ways to give you new circuits while browsing. The browser automatically chooses a new random circuit about every 10 minutes. In addition, you can refresh the circuit for any open tab, or click a “New Identity” button that closes any open tabs, clears cookies and browsing history, uses new circuits for all connections and restarts the browser.
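As a rough sketch of this behavior, the class below hands out the same circuit until it is about 10 minutes old, then builds a new one, and lets you force a fresh circuit on demand. The `CircuitManager` name and its methods are hypothetical, and the real browser's scheduling is more involved than a single timer.

```python
import time

ROTATE_AFTER_SECONDS = 10 * 60  # "about every 10 minutes", as described above

class CircuitManager:
    """Hypothetical helper that reuses a circuit until it gets too old."""

    def __init__(self, build_circuit, clock=time.monotonic):
        self._build = build_circuit   # function that returns a new circuit
        self._clock = clock           # injectable clock, handy for testing
        self._circuit = None
        self._created = None

    def current(self):
        """Return the active circuit, building a new one if it has expired."""
        now = self._clock()
        if self._circuit is None or now - self._created >= ROTATE_AFTER_SECONDS:
            self._circuit = self._build()
            self._created = now
        return self._circuit

    def new_identity(self):
        """Drop the current circuit so the next request gets a fresh one."""
        self._circuit = None
```

Passing the clock in as a parameter is only a design convenience: it lets you simulate the passage of time without waiting 10 minutes.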

Screenshots of the Tor browser, with the identity refresh popup.

Avoiding Fingerprinting

The term fingerprinting describes how your identity can be profiled through the information your browser tells sites about you. In the background, normal browsers send sites information like what browser you’re using, what OS you’re running, and even your screen resolution. Though this information is important for a good user experience, it also compromises your anonymity – and so the Tor Browser helps prevent fingerprinting when it can.

For example, all browsers send something called the User-Agent, which tells sites what browser and OS you’re using. The Tor browser sends the same generic, false User-Agent for anyone that browses on it, essentially making everyone’s traffic look the same on the network:

Mozilla/5.0 (Windows NT 6.1; rv:60.0) Gecko/20100101 Firefox/60.0

Fingerprinting can also be done through some browser features. For example, HTML5 Canvas, which is used by many websites to display graphics and animations, can expose subtle properties specific to your computer, like your video card, font packs, and graphics library, which sites can combine into a fingerprint of your identity. In response, the Tor Browser asks your permission before running Canvas elements.
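Here is a small sketch of why uniform values help. It hashes a handful of browser-reported attributes into a short identifier: different attribute sets give different identifiers, while identical ones are indistinguishable. The `fingerprint` function, the attribute names, and every value other than the User-Agent string quoted above are assumptions made for this example; real fingerprinting scripts combine many more signals.

```python
import hashlib
import json

def fingerprint(attributes: dict) -> str:
    """Hash a set of browser-reported attributes into a short identifier."""
    canonical = json.dumps(attributes, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]

# Hypothetical values that two people's normal browsers might report.
alice = {"user_agent": "ExampleBrowser/1.0 (Linux)", "screen": "2560x1440", "timezone": "Europe/Berlin"}
bob = {"user_agent": "ExampleBrowser/2.3 (macOS)", "screen": "1440x900", "timezone": "America/Chicago"}

# The User-Agent is the one quoted in this article; screen and timezone
# are placeholder values chosen for the example.
TOR_USER_AGENT = "Mozilla/5.0 (Windows NT 6.1; rv:60.0) Gecko/20100101 Firefox/60.0"
uniform = {"user_agent": TOR_USER_AGENT, "screen": "1000x1000", "timezone": "UTC"}

print(fingerprint(alice) == fingerprint(bob))              # distinct profiles
print(fingerprint(dict(uniform)) == fingerprint(uniform))  # identical profiles
```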

Ensuring Encrypted Communication

The Tor Browser automatically includes an extension called HTTPS-Everywhere. This extension was made as a collaboration between The Tor Project and the Electronic Frontier Foundation (EFF) to rewrite unencrypted requests (like those made over HTTP) to use HTTPS instead, helping protect your data when it exits the Tor network.
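The core idea of rewriting a request can be sketched as follows. This is a deliberately naive version: the `upgrade_to_https` helper is invented for illustration and simply swaps the scheme, whereas the real extension relied on per-site rules, since not every site serves the same content over HTTPS.

```python
from urllib.parse import urlsplit

def upgrade_to_https(url: str) -> str:
    """Rewrite an http:// URL to https://, leaving other URLs untouched.
    Naive on purpose: explicit ports and sites without HTTPS are not handled."""
    parts = urlsplit(url)
    if parts.scheme != "http":
        return url
    return parts._replace(scheme="https").geturl()

print(upgrade_to_https("http://example.com/news?id=7"))
print(upgrade_to_https("https://example.com/already-secure"))
```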

Making Tor Easy for You

Perhaps one of the most important features of the Tor browser is its ease-of-use – you don’t need to be a hacker in order to browse using Tor. Tor’s low barrier-to-entry allows more people to use Tor, which, in turn, helps all of its community stay anonymous in a larger sea of traffic.

Tor outside Tor Browser; other Tor-ified Applications

Using the official Tor Browser isn’t the only way you can use Tor. Other types of applications can be “Tor-ified”, i.e., made to connect to the internet through the Tor Network.

Other Tor Applications

There are plenty of useful applications designed specifically to use Tor. One example is OnionShare, a service that uses Tor to share files securely and anonymously (you can imagine that this would be important for journalists communicating sensitive information, for example).

Can you “Tor-ify” Other Browsers?

You can technically configure your normal browser, like Firefox or Chrome, to use Tor’s routing by altering the browser’s network settings. However, doing this is considered “a really bad idea” by The Tor Project, as these browsers will still allow you to be fingerprinted.

Can you Tor-ify your own applications / all of your network activity?

Similar to Tor-ifying non-Tor browsers, Tor-ifying all other activity on your computer is doable, but extremely risky for anonymity “unless you know exactly what you are doing” (The Tor Project). To Tor-ify all network activity, you will have to configure proxy settings on your computer to work with Tor.

However, many applications on your computer ignore proxy settings and can leak sensitive data, like your real IP address, your user name, and your time zone. One large-scale example of this problem is a series of privacy attacks on people who used BitTorrent through Tor, where many users unknowingly exposed their IP addresses.
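For a sense of what "proxy settings" means in practice, here is a minimal sketch of a settings dictionary that points an application at a local Tor client. It assumes a Tor daemon is listening on its default SOCKS port, 9050, on your own machine (the Tor Browser's bundled client uses 9150 instead); the `tor_proxy_settings` helper is made up for this example. All of the leak risks described above still apply to anything you route this way.

```python
def tor_proxy_settings(host: str = "127.0.0.1", port: int = 9050) -> dict:
    """Build proxy settings that route HTTP(S) traffic through a local Tor client.
    The host and port are assumptions about your local setup."""
    # "socks5h" asks the proxy to resolve hostnames as well, so DNS lookups
    # are not made outside of Tor.
    url = f"socks5h://{host}:{port}"
    return {"http": url, "https": url}

print(tor_proxy_settings())
```

A dictionary of this shape can be handed to the `proxies` argument of the third-party `requests` library, provided it is installed with SOCKS support.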

Wait, so what’s the dark web?

So far, we’ve discussed the Tor network and how the Tor browser (and other applications) implement onion routing for anonymous communication. But circling back – what is the dark web? And how does Tor enable it?

What does the dark web consist of?

The “dark web” describes a collection of web services that exist behind layers of encryption, cannot be found using traditional search engines, and cannot be visited with traditional web browsers.  The dark web consists mostly of services on the Tor network, but also includes some other networks like I2P, Freenet, and small peer-to-peer ones. For the purpose of this article, “dark web site” will specifically reference services on the Tor network.

The “dark” in the term comes from the 1970s, when researchers working on ARPANET – the precursor of the Internet as we know it – described isolated networks that were unconnected to ARPANET as “darknets”. And while many news articles may confuse the two, the dark web is also distinctly different from the deep web, which describes sites on the normal web that are hidden from search engines (like documents that sit behind a login).

Image description: A venn diagram showing the distinction between clear web, deep web, dark web.

Hidden Services: Websites on the dark web

What is a dark web site?

If someone uses Tor to put a site on the dark web, this basically means they are taking advantage of the anonymizing capabilities of Tor’s software to protect their identity as a web host. These dark web sites created with Tor are also known as onion services or hidden services, alluding to how they depend on Tor’s onion routing protocol, and how they can remain hidden by never having their IP address, or location, revealed.

The addresses for a hidden service look something like 3g2upl4pq6kufc4m.onion, all sharing the .onion root domain (this particular example is the hidden service version of DuckDuckGo, a popular, privacy-focused search engine). This next section explains how the strange look of these addresses, in fact, is a part of why these sites are inaccessible on the regular internet, as well as how their web hosts can remain anonymous.

Why can’t we access hidden services on regular browsers?

If you paste 3g2upl4pq6kufc4m.onion into your normal browser, you’ll get a message saying “This site can’t be reached.” This is because DNS (Domain Name System), the way your web browser normally helps you get to the site you want by translating a user-friendly name (like “example.com”) into an IP address (like 93.184.216.34), doesn’t know how to handle names ending in .onion. The .onion domain is officially considered a special-use domain by IANA (Internet Assigned Numbers Authority), and DNS providers are specifically told not to resolve these names.
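In code, that special treatment might look roughly like the sketch below: a resolver wrapper that refuses .onion names before any DNS query is made. The `is_onion_name` and `resolve` functions and the `dns_lookup` parameter are hypothetical, loosely modelled on the behavior described in RFC 7686, and are not taken from any real resolver.

```python
def is_onion_name(hostname: str) -> bool:
    """True if the rightmost label of the name is 'onion'."""
    labels = hostname.rstrip(".").lower().split(".")
    return labels[-1] == "onion"

def resolve(hostname: str, dns_lookup):
    """Resolve a name with the supplied lookup function, but never send
    .onion names to DNS. dns_lookup stands in for a real resolver."""
    if is_onion_name(hostname):
        raise LookupError(f"{hostname} is a special-use name and is not resolved via DNS")
    return dns_lookup(hostname)

# A stand-in lookup table instead of a real DNS query.
fake_dns = {"example.com": "93.184.216.34"}
print(resolve("example.com", fake_dns.get))
print(is_onion_name("3g2upl4pq6kufc4m.onion"))
```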

Unlike DNS, which relies on IP addresses to know where to send your traffic, Tor has its own way of allowing you to access a website without ever knowing its IP address. In fact, it’s extremely difficult to know the IP addresses behind .onion sites in general, and this is by design, ensuring web hosts have anonymity.

How does Tor let you access hidden services?

If DNS doesn’t know how to bring you to .onion sites, how does Tor bring you to them instead? Another version of this question is: as someone trying to access a hidden service, can I go to this onion site without ever needing to know its IP address?

In contrast to DNS, which resolves domain names into IP addresses, Tor resolves .onion names through its own system. This system works from the way onion services are created, as well as how they communicate with a client trying to access them – all resulting in the special property that no one (including you!) can ever see the IP address associated with an onion service. Here’s how this works, starting with how the service is first created:

  1. When an onion site sets up, it makes itself known to several randomly chosen servers in the Tor network: these are called introduction points. The site connects to these points (using a Tor circuit to keep anonymous) and provides them its public key, which gives a way of identifying the service without giving its IP address.

    Image description: The onion service communicating its public key to its introduction points via Tor circuits.
  2. Over a Tor circuit, the onion site uploads a descriptor containing its public key and the IPs of its introduction points to a type of directory available across the Tor network, known as a distributed hash table (DHT). The service also generates a domain name of the form “XYZ.onion”, where XYZ is a 16-character alphanumeric string based on a hash of its public key, and associates this name with its descriptor in the DHT.
    Image description: The onion service uploading its descriptor (containing its public key and introduction points) to the distributed hash table, over a Tor circuit.

    Image description: A diagram of the public key turning into the domain name via the hash function.
  3. When you try to visit the site at XYZ.onion, Tor will use the name to find the site’s information in the hash table, connecting to the table over a Tor circuit. The name is checked against a hash of the public key to verify that it’s the right onion service.

    Image description: Your computer making a request to the database containing XYZ.onion’s descriptor, over a Tor circuit.
  4. Your Tor client picks a random server to be a rendezvous point, creates a Tor circuit to it, and tells it a one-time secret message. This rendezvous point will be the meeting point at which you and the hidden service will try to communicate.

    Image description: Your computer sends a secret message to the rendezvous point over a Tor circuit.
  5. Your Tor client sends a message to one of the introduction points. This message is encrypted so that only the onion site can decrypt it, and contains the address of the rendezvous point and the one-time secret. The introduction point that receives this then forwards it to the onion site. All these connections are also made over Tor circuits.

    Image description: Your Tor client sends a message to one of the introduction points.
  6. The onion site decrypts the message and creates a Tor circuit to the rendezvous point, sending the same one-time secret. Because the rendezvous point sees the same one-time secret from the client and onion service, it knows to act as a messenger between them.

    Image description: Both your computer and your onion site are sending the same secret message to the rendezvous point, over Tor circuits.
  7. The rendezvous point joins the two circuits into a path of 6 nodes between the client and the onion site: 3 from the client, and 3 to the service. Finally, you can communicate to and from the onion site! This communication uses end-to-end encryption, meaning that only the true senders and receivers can read or modify the data – further ensuring that your communication is secure.

    The final path of communication setup between your computer and the onion site you’re requesting.
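The matching of one-time secrets in steps 4 to 6 can be sketched as a tiny simulation. The `RendezvousPoint` class, its method names, and the circuit labels are all invented for illustration, and the encryption of the introduction message and the Tor circuits themselves are left out; this only shows how a shared secret lets the rendezvous point pair up two parties that never learn each other's address.

```python
import secrets

class RendezvousPoint:
    """Hypothetical relay that pairs a client and a service by a shared secret."""

    def __init__(self):
        self._waiting = {}   # one-time secret -> waiting client circuit
        self.joined = []     # (client circuit, service circuit) pairs

    def establish(self, secret: bytes, client_circuit: str) -> None:
        """Step 4: the client registers its one-time secret."""
        self._waiting[secret] = client_circuit

    def join(self, secret: bytes, service_circuit: str) -> bool:
        """Step 6: the service presents the same secret to be connected."""
        client_circuit = self._waiting.pop(secret, None)
        if client_circuit is None:
            return False     # unknown or already-used secret
        self.joined.append((client_circuit, service_circuit))
        return True

rendezvous = RendezvousPoint()
one_time_secret = secrets.token_bytes(20)   # length chosen arbitrarily here
rendezvous.establish(one_time_secret, "client-circuit")

# Step 5 (not modelled): the secret reaches the service via an introduction point.
print(rendezvous.join(one_time_secret, "service-circuit"))
print(rendezvous.join(b"wrong-secret", "imposter-circuit"))
```

Removing the secret once it has been matched is what makes it one-time: a second party presenting the same value is turned away.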

Looking at this final circuit can be quite interesting, because it shows web hosts on Tor are essentially doing the same thing as someone who’s browsing with Tor – constantly ensuring that their IPs are hidden from any visitor (or a snooping authority) by communicating behind various layers of onion routing. As a host of a hidden service, keeping your IP hidden is usually your foremost concern (especially if your service is something you wouldn’t want authorities to probe into).
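Going back to step 2, here is a sketch of how a 16-character name can be derived from a public key. It follows the older naming scheme described in the wiki page listed in the sources: hash the encoded public key with SHA-1, keep the first 80 bits, and write them in base32. The placeholder bytes below stand in for a real key, and the function names are made up for this example; newer onion services use longer names built differently.

```python
import base64
import hashlib

def onion_name_from_key(public_key_bytes: bytes) -> str:
    """Derive a 16-character .onion name from an encoded public key.
    80 bits of SHA-1 output become exactly 16 base32 characters."""
    digest = hashlib.sha1(public_key_bytes).digest()
    return base64.b32encode(digest[:10]).decode("ascii").lower() + ".onion"

def name_matches_key(name: str, public_key_bytes: bytes) -> bool:
    """Step 3's check: does the name really belong to this public key?"""
    return name == onion_name_from_key(public_key_bytes)

placeholder_key = b"placeholder bytes standing in for an encoded public key"
name = onion_name_from_key(placeholder_key)
print(name)
print(name_matches_key(name, placeholder_key))
print(name_matches_key(name, b"some other key"))
```

Because the name is computed from the key, nobody needs a central registry to vouch for it: anyone holding the descriptor can recompute the hash and compare.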

Other uses for hidden services

While keeping identities private is the typical case for onion sites, sometimes they aren’t concerned with their own privacy at all. For example, the New York Times and Facebook both have onion-ized versions of their normal websites to give access to people who live in places where these platforms are normally censored.

You might ask: “Why even create .onion sites? Can’t people just visit the normal versions of these sites while on Tor?” The main answer is that onion-izing a site offers even more protection to its users. When visiting a normal site over Tor, there is the chance that the connection can be snooped on close to the exit node, revealing the website and possibly your identity as a visitor (more on this will be in the next section).

However, when you visit an onion service over Tor, it’s much harder for someone to do that snooping. Even if someone happens to compromise the exit node talking to the onion service, they can only know the IP associated with the service, not the actual onion service linked to it, and not the contents of any data sent.

A secondary answer is that .onion sites can solve problems you might have when visiting their normal equivalents over Tor. For example, connecting to Facebook’s regular site over Tor is known to be problematic – the unusual IP addresses Tor chooses as your exit nodes can make you appear like a bot. This problem is a result of Facebook’s regular security mechanisms, but is addressed in the development of Facebook’s onion service.

Tor’s Vulnerabilities

While the Tor Project does its best to anonymize your browsing as much as possible, it’s not a perfect solution. One major vulnerability it’s unable to account for is something called traffic-analysis attacks. Traffic-analysis attacks happen when an observer can view the start and end of your traffic through the Tor network, by being able to see you and either the destination site or the Tor exit node. This observer may be able to see the flow in and out of each side, and use timing correlation to identify which traffic is yours.
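To see why timing alone can give you away, consider this toy sketch. It counts packets per second on the entry side and on two candidate flows seen on the exit side, then computes a plain Pearson correlation between them. The numbers are invented for illustration, and real attacks are considerably more sophisticated, but the intuition is the same: the flow whose rhythm matches yours is probably yours.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Made-up packet counts per second.
entry_side = [3, 0, 7, 1, 0, 9, 2, 0]    # seen between you and your entry node
exit_flow_a = [3, 0, 6, 1, 0, 9, 2, 1]   # candidate flow leaving an exit node
exit_flow_b = [1, 4, 0, 5, 3, 0, 2, 6]   # an unrelated flow

print(round(pearson(entry_side, exit_flow_a), 2))
print(round(pearson(entry_side, exit_flow_b), 2))
```

Note that the observer never decrypts anything here; the pattern of when packets move is enough.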

Tor’s response to this is selecting entry guards in a way that increases the chance that you won’t be profiled on an entry node compromised by an attacker. Somewhat unintuitively, randomly choosing a small set of relays to use as your entry node offers a greater chance of avoiding authorities than selecting a completely random entry relay each time. For details on this logic, check out the description of entry guards in Tor’s FAQ.
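The arithmetic behind this can be sketched with a simplified model. Assume a fraction of relays is controlled by an attacker and that every pick is uniform and independent; both assumptions, along with the example numbers (5% compromised, 3 guards, 1000 circuits), are chosen purely for illustration and are not taken from Tor's FAQ.

```python
def p_bad_entry_fresh(compromised_fraction: float, circuits: int) -> float:
    """Chance of hitting a compromised entry at least once when every
    circuit picks a brand-new random entry relay."""
    return 1 - (1 - compromised_fraction) ** circuits

def p_bad_entry_guards(compromised_fraction: float, guards: int) -> float:
    """Chance that at least one relay in a small, fixed guard set is
    compromised. It does not grow with the number of circuits you build."""
    return 1 - (1 - compromised_fraction) ** guards

print(p_bad_entry_fresh(0.05, 1000))
print(p_bad_entry_guards(0.05, 3))
```

With a fresh entry every time, you are almost certain to run into a bad one eventually; with a fixed set of guards, most users never do, no matter how long they browse.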

Conclusion 

The plentiful, multifaceted, and sometimes contradictory uses for Tor – from avoiding ad-targeting by Facebook to avoiding censorship in order to access Facebook, and from being a tool used by the US government to being a tool to circumvent the US government – make it a fascinating site for exploration.

The myth of it being a tool only for nefarious, abject activity counteracts its potential for being known as a tool for access, democracy, and justice. One way to help counteract this myth is to learn more and engage with Tor yourself.

Learning More

The Tor Project website provides explanatory introductions, manuals, blog posts, and FAQs about everything Tor.

Using the Tor Browser

The Tor Project includes an easy way to download the Tor Browser. In addition to helping you avoid tracking, surveillance, and censorship while staying anonymous on the web, adding your own traffic to Tor’s network helps make Tor more effective for all its users.

Setting up your own Tor Relay

The Tor network is made possible by volunteers (maybe future you!). You can set up your own Tor relay according to the Tor Project guide, choosing between a guard, middle, exit, or bridge (a special type of unlisted relay that helps users connect to Tor if their ISP censors access to it). Contributing a relay helps make the Tor network faster, more robust against attacks, and more stable in case of outages, and it helps protect the anonymity of its users.

Set up your onion service

Interested in creating an anonymized web service? Want to switch up your usual web development routine? Host an onion service that can only be accessed through Tor by following the Tor Project’s guides.

Acknowledgements & Sources

I would like to thank the Tech Learning Collective for hosting a very informative workshop about Tor. This article would probably not be as coherent without them. Also thank you to Tom Igoe for providing feedback in my draft-stages.

Sources

https://trac.torproject.org/projects/tor/wiki/doc/HiddenServiceNames#Howare.onionnamescreated
https://www.eff.org/https-everywhere
https://www.torproject.org/
https://tools.ietf.org/html/rfc7686
https://www.talari.com/glossary_faq/what-are-router-protocols/
https://2019.www.torproject.org/docs/faq.html.en
https://2019.www.torproject.org/projects/torbrowser/design/
https://2019.www.torproject.org/about/torusers.html.en
https://www.lifewire.com/what-is-a-router-2618162
https://foreignpolicy.com/2013/12/09/the-darknet-a-short-history/
https://2019.www.torproject.org/docs/onion-services.html.en
https://blog.torproject.org/bittorrent-over-tor-isnt-good-idea
https://www.wired.com/2014/10/facebook-tor-dark-site/