What We Should Know about DNS

1. What is DNS?

DNS, which stands for Domain Name System, is a hierarchical, decentralized naming system for devices (computers, smartphones, etc.), services, and other resources connected to the Internet or a private network. Its main job is to associate domain names with numerical IP addresses. DNS has been in use since 1985. http://symbolics.com/ is the oldest registered .com domain name on the Internet.

A domain name is the name of a website, and an IP address is a number that indicates the location of a computer or other device on a network. For example, ITP uses a server with the IP address 128.122.157.181. ITP can also be reached through a few names, such as itp.nyu.edu and tisch.nyu.edu/itp (the latter includes a directory). DNS associates the numerical IP address (128.122.157.181) with these memorable domain names.

Breakdown of a URL into a protocol, subdomain, domain, and directory

                Figure 1 The Domain name of ITP
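
The breakdown in Figure 1 can be sketched with Python's standard urllib.parse module. This is only a minimal sketch: the function name split_url is hypothetical, and treating the last two labels of the host as the domain is a simplifying assumption that holds for nyu.edu but not for names under suffixes such as .co.uk.

```python
from urllib.parse import urlsplit

def split_url(url):
    """Split a URL into protocol, subdomain, domain, and directory.

    Assumption: the domain is the last two labels of the host name
    (true for itp.nyu.edu, not for every domain).
    """
    parts = urlsplit(url)
    labels = parts.hostname.split(".")
    return {
        "protocol": parts.scheme,
        "subdomain": ".".join(labels[:-2]),
        "domain": ".".join(labels[-2:]),
        "directory": parts.path,
    }

print(split_url("https://tisch.nyu.edu/itp"))
```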

Breakdown of an IP address attributing the first 3 numbers to the network and the last to the host machine

                              Figure 2 IP Address structure
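
The split in Figure 2 can be tried out with Python's standard ipaddress module. This sketch assumes a 24-bit network prefix, which matches the "first three numbers" drawn in the figure; real networks use many other prefix lengths, and the function name split_ipv4 is hypothetical.

```python
import ipaddress

def split_ipv4(address, prefix_len=24):
    """Return (network address, host number) for an IPv4 address.

    Assumption: prefix_len=24, i.e. the first three numbers name the
    network and the last number names the host machine.
    """
    iface = ipaddress.ip_interface(f"{address}/{prefix_len}")
    network = iface.network.network_address
    host_number = int(iface.ip) - int(network)
    return str(network), host_number

# The address of the ITP server mentioned in the text.
print(split_ipv4("128.122.157.181"))
```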

 

All devices connected to the Internet have IP addresses. Users can reach a device if they know its IP address, but IP addresses sometimes change, and users would then have to memorize a new number, which is not easy. Consider two cases in which an IP address changes. The first is a change of server, as when you migrate a site from your current hosting service to another one. The IP address changes because the site now runs on a completely different server. The second is the use of dynamic IP addresses in a private network, where addresses can change when an administrator changes router settings or restarts servers. Memorizing a few IP addresses would be fine, but in everyday life most people use many websites, so memorizing IP addresses is not practical. That is why we have DNS.

Speaking IP addresses is confusing but speaking domain names is simple.

                                        Figure 3 IP Addresses and Domain Names

2. How DNS Works

DNS is a hierarchical, decentralized system composed of the root servers, the name servers for each TLD (top-level domain, e.g. .com, .edu, .net, .org), and the name servers for individual domains. The authoritative name servers that serve the DNS root zone, commonly known as the “root servers”, are a network of hundreds of servers in many countries around the world. They are configured in the DNS root zone as 13 named authorities, a.root-servers.net through m.root-servers.net, and all of them serve the same root zone, which points to the name servers for each TLD. The root servers are important because if all of them went down, we could not use any domain names or email addresses. In 2002, someone launched a DDoS (“Distributed Denial of Service”) attack against all 13 root servers; fortunately, the attack did not succeed, and the servers did not go down.

Recently (October 24th, 2017), b.root-servers.net, one of the 13 root servers, was renumbered from 192.228.79.201 to 199.9.14.201. According to the institution that manages b.root-servers.net, the change was made for the B-Root DNS service. (http://www.root-servers.org/news/b-root-begins-anycast-in-may.txt) Information about such an address change is very important because DNS servers store the root server addresses in a file called the root hints file, and they also use a cache, a component that stores data so that future requests for that data can be served faster. If the file is not updated, then some time after the old address stops being used, the system may not work properly. For instance, the cache server may take a long time to restart, and name resolution (described below) may fail.

When a DNS client looks up a name, it traces a path like the one in the following image. First, it asks a root server which TLD server is responsible for the .edu domain. Next, it asks that TLD server which name server is responsible for nyu.edu. From that name server it can then get information about nyu.edu itself or about a subdomain such as itp.nyu.edu.

Diagram showing the interactions between a client and several servers through a cache server

                      Figure 4 The Root Server and Name Servers.
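
The path in Figure 4 can be imitated with a small in-memory simulation. Nothing here touches the network: the server names in ZONES are hypothetical stand-ins, and the final address simply reuses the ITP address from the text. Real resolvers send the full name to each server and follow the referrals they get back; this sketch asks one label at a time to make the hierarchy visible.

```python
# Hypothetical stand-ins for real name servers; no DNS query is sent.
ZONES = {
    "root": {"edu": "edu-tld-server"},
    "edu-tld-server": {"nyu.edu": "nyu-name-server"},
    "nyu-name-server": {"itp.nyu.edu": "128.122.157.181"},
}

def resolve(name):
    """Walk from the root toward `name`, returning (address, servers asked)."""
    labels = name.split(".")
    server = "root"
    path = [server]
    # Ask about "edu", then "nyu.edu", then "itp.nyu.edu".
    for i in range(len(labels) - 1, -1, -1):
        suffix = ".".join(labels[i:])
        answer = ZONES[server][suffix]
        if i == 0:
            return answer, path   # the last answer is the IP address
        server = answer           # otherwise it names the next server to ask
        path.append(server)

address, path = resolve("itp.nyu.edu")
print(" -> ".join(path), "=>", address)
```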

 

3. How a Web Browser and DNS Work Together

In this section, let’s see how a web browser gets an IP address when you type a domain name into it. First, the user enters a domain name in the browser’s address bar. The browser then calls a function called the resolver.

The resolver sends a request to a DNS server, saying in effect, “I would like to access google.com, so tell me the domain’s IP address.”

The DNS server then responds with the IP address of the domain that the resolver asked about. This process is called “name resolution”. The two sides communicate over UDP because the packets they exchange are small and UDP is lighter than TCP. User Datagram Protocol (UDP) is one of the main Internet protocols; it uses a simple connectionless communication model with a minimum of protocol mechanism. As a result, the process does not take much time.
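
To give a feel for how small such a request is, the sketch below builds the bytes of a DNS query for google.com by hand, following the message layout in RFC 1035. The function name build_query and the query ID are arbitrary choices for illustration, and nothing is sent over the network.

```python
import struct

def build_query(name, query_id=0x1234):
    """Build a DNS query message asking for the A record of `name`."""
    # Header: ID, flags (0x0100 = recursion desired), 1 question,
    # and 0 answer, authority, and additional records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: each label is prefixed by its length; a zero byte ends it.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii") for label in name.split(".")
    ) + b"\x00"
    # QTYPE 1 = A (IPv4 address), QCLASS 1 = IN (Internet).
    return header + qname + struct.pack("!HH", 1, 1)

packet = build_query("google.com")
print(len(packet), "bytes")
```

The whole question fits in a few dozen bytes, which is why a single UDP datagram is enough. Sending it would be one sendto() call on a UDP socket to port 53 of a DNS server; that step is left out so the example runs without network access.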

Above, I described the process as if it were one-to-one communication to keep it simple. In reality there are many DNS servers, and name resolution can go through several of them. Before explaining the steps, I would like to mention the general hierarchy of the organizations behind IP addresses and domain names. The Internet Corporation for Assigned Names and Numbers (ICANN), a nonprofit public benefit corporation, oversees IP addresses and domain names, and it entrusts the day-to-day work to the Internet Assigned Numbers Authority (IANA). IANA’s activities can be broadly grouped into three categories: management of the DNS root, coordination of the global pool of IP addresses, and protocol assignments.

IANA delegates the management of IP addresses to five Regional Internet Registries (RIRs). The African Network Information Center (AFRINIC) serves Africa. The American Registry for Internet Numbers (ARIN) serves the United States, Canada, several parts of the Caribbean region, and Antarctica. The Asia-Pacific Network Information Centre (APNIC) serves Asia, Australia, New Zealand, and neighboring countries. The Latin America and Caribbean Network Information Centre (LACNIC) serves Latin America and parts of the Caribbean region. The Réseaux IP Européens Network Coordination Centre (RIPE NCC) serves Europe, Russia, the Middle East, and Central Asia.

Map showing the regions served by each Regional Internet Registry

                  Figure 5 Map of Regional Internet Registries         
                 https://en.wikipedia.org/wiki/Regional_Internet_registry

 

Below the RIRs, National Internet Registries (NIRs) and ISPs manage IP addresses for their countries or regions.

Here are the details of the ISP’s job. The first step is the same as described above: the resolver sends a request to a server. This server is called the “local server” and is generally provided by the ISP. The local server works as a cache server, which means it keeps information about domains and their IP addresses. The cache server is controlled by the ISP, and the cache expiration depends on the time-to-live (TTL) values set in each domain’s zone, including its “Start of Authority (SOA)” record. Generally, cache entries expire within a few days. If the local server has the information that the browser’s resolver is looking for, it sends that information back. If it does not, the local server’s own resolver starts looking for the information, and the local server then responds to the browser’s resolver. The difference between the two resolvers is that the local server’s resolver communicates with as many DNS servers as needed until it finds the IP address.

Diagram showing the interactions between a client and target machine as well as several other servers through a local server

                          Figure 6 Web browser’s request path
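
The local server's behavior, answering from its cache while an entry is fresh and asking other servers otherwise, can be sketched as a small class. This is a toy model under stated assumptions: the class name CachingResolver, the single fixed TTL, and the upstream lookup table are all hypothetical, and 192.0.2.10 is an address reserved for documentation.

```python
import time

class CachingResolver:
    """Toy local server: answer from cache, otherwise ask an upstream lookup."""

    def __init__(self, upstream, ttl_seconds, clock=time.monotonic):
        self.upstream = upstream        # callable: domain -> IP address
        self.ttl_seconds = ttl_seconds  # assumption: one TTL for every entry
        self.clock = clock
        self.cache = {}                 # domain -> (ip, expiry time)

    def resolve(self, domain):
        now = self.clock()
        entry = self.cache.get(domain)
        if entry and entry[1] > now:
            return entry[0], "cache"
        # Cache miss or expired entry: stands in for querying other DNS servers.
        ip = self.upstream(domain)
        self.cache[domain] = (ip, now + self.ttl_seconds)
        return ip, "upstream"

# Hypothetical upstream data instead of real DNS servers.
records = {"example.com": "192.0.2.10"}
local = CachingResolver(records.__getitem__, ttl_seconds=300)
print(local.resolve("example.com"))  # first request goes upstream
print(local.resolve("example.com"))  # second request is served from the cache
```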

4. IANA and ICANN

IANA stands for Internet Assigned Numbers Authority. IANA began in 1988 as a project group headed by Jon Postel at the Information Sciences Institute (ISI) of the University of Southern California (USC). It was a volunteer group that managed domains, IP addresses, and ports, with financial support from the US government. In 1998, the IANA functions were transferred to the Internet Corporation for Assigned Names and Numbers (ICANN), a nonprofit public benefit corporation. ICANN is responsible for addresses, port assignments, and the stability of unique identifiers (UIDs), while IANA manages the DNS root, IP addressing, and other protocol resources. IANA reports to ICANN because it is a subordinate part of ICANN.

 

5. Social Media and DNS

Lastly, I would like to mention a case I came across. Two people asked me about web hosting, and both needed changes to DNS-related settings. The first needed to change her A record and name server information because her site had been hacked and she wanted to move to another hosting service. The second needed to create a website because he wanted to promote his project. They had one issue in common: Facebook did not allow them to share links to their sites on the Facebook feed or in Messenger. The following table shows each site’s condition, along with extratorrent.cc for comparison.

                  Site 1 (got hacked)         Site 2                                            extratorrent.cc
Security issue    ○ (ranked caution level)    ×                                                 (untested)
DNS problem       ×                           ○ (the domain didn’t work properly for a week)    ×
Facebook share    ×                           ×                                                 △ (requires user’s confirmation)

 

Table 1 The sites’ conditions

 

From this table, I can make a couple of assumptions. The first is that Facebook may check data from security companies and also has its own criteria for shared information: torrent sites are generally not very safe, but there may be demand for torrent information from some users. The second is that Facebook cares about DNS information to some degree, because Site 2 is built with WordPress and almost all of its content is its own, with one video from Vimeo. These findings are very interesting because I had thought that Facebook only checks whether a link is broken, using the Open Graph protocol (OGP). The Open Graph protocol enables any web page to become a rich object in a social graph. For instance, it is used on Facebook to allow any web page to have the same functionality as any other object on Facebook. OGP is composed of basic metadata like the following.

 

Basic metadata of the Open Graph Protocol of title, type, image, and url

                    Figure 7 The OGP Basic Metadata
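
The four basic properties in Figure 7 are ordinary meta tags in a page's head, so they can be read with Python's standard html.parser module. The sketch below is only an illustration: the class name OpenGraphParser and the sample page with its example.com URLs are made up, and this is not a description of how Facebook itself reads pages.

```python
from html.parser import HTMLParser

class OpenGraphParser(HTMLParser):
    """Collect <meta property="og:..." content="..."> tags from a page."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("property") or ""
        if name.startswith("og:") and "content" in attributes:
            self.properties[name] = attributes["content"]

# Hypothetical page using the four basic OGP properties.
SAMPLE = """
<html><head>
<meta property="og:title" content="My Project">
<meta property="og:type" content="website">
<meta property="og:image" content="https://example.com/cover.jpg">
<meta property="og:url" content="https://example.com/">
</head><body></body></html>
"""

parser = OpenGraphParser()
parser.feed(SAMPLE)
for key, value in parser.properties.items():
    print(key, "=", value)
```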

 

In addition, Facebook provides a tool called “Sharing Debugger.” I checked four sites with it: Site 1, Site 2, Google.com, and Facebook.com.

Screenshot of sharing debugger tool, showing the Open Graph Protocol information

                        Figure 8 Sharing Debugger https://www.google.com

 

As far as I understand, Facebook gathers the IP address, OGP metadata, redirect path, canonical URL, and probably security information published by security companies. My guess is that Facebook also collects data from its users, although it does not make this public.