Exploring the Geography of the Internet

There are a number of network diagnostic tools that you can use to get to know the geography of the internet. Many of them are command line tools developed for POSIX operating systems, though they may be available under Windows and other non-POSIX systems too. What follows is an introduction to some of those tools, as a way of finding your way around the net. These notes describe IP-based networks.

Karl Ward’s introduction to Unix is good for further reading if you’ve never used a POSIX command line interface.

Your Connection To the Network

Before you can find other devices (often called hosts) on a network, you need to know if you’re connected to that network. Every network-enabled computer has one or more network interfaces. The ifconfig command gives you a way to list those interfaces. (On recent Linux systems where ifconfig is not installed, ip addr shows similar information.) On most POSIX command line interfaces, type the following:

ifconfig

This command will list all your computer’s network interfaces. Here’s a typical response (addresses have been changed):

ens3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 100.100.100.100  netmask 255.255.192.0  broadcast 100.100.127.255
        inet6 fe80::50ff:b600:feff:dfff  prefixlen 64  scopeid 0x20
        ether aa:bb:cc:dd:ee:ff  txqueuelen 1000  (Ethernet)
        RX packets 10558  bytes 47804427 (47.8 MB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 6974  bytes 1096138 (1.0 MB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 146  bytes 12380 (12.3 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 146  bytes 12380 (12.3 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

This response tells you there are two network interfaces on this host. The first is an Ethernet connection, with an address that’s been assigned by the router to which it’s connected. It has two IP addresses, the one marked inet and the one marked inet6. The former is the IPv4 address and the latter the IPv6 address. Though Internet Protocol version 6 (IPv6) is increasingly common, most networks still support the older IPv4 protocol as well.

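Notice that the broadcast address has the same initial numbers (100.100) as the IPv4 address, but a different set of final numbers. Each of the four numbers in an IPv4 address is called an octet. The netmask determines which part of the address identifies the network and which part identifies the host, and the broadcast address is the network part followed by a host part that is all ones in binary. On a network with the netmask 255.255.255.0, that works out to the x.x.x.255 address. Packets sent to the broadcast address are sent to all hosts on that network.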
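To see where the broadcast address in the ifconfig output comes from, here is a minimal sketch in Python using the standard ipaddress module. The function name broadcast_address is just an illustrative choice, and the inputs are the sample address and netmask from the listing above.

```python
import ipaddress


def broadcast_address(ip: str, netmask: str) -> str:
    """Return the broadcast address of the network that `ip` belongs to."""
    # strict=False lets us pass a host address instead of the network's base address.
    network = ipaddress.ip_network(f"{ip}/{netmask}", strict=False)
    return str(network.broadcast_address)


if __name__ == "__main__":
    # Values taken from the sample ifconfig output.
    print(broadcast_address("100.100.100.100", "255.255.192.0"))  # 100.100.127.255
```

With a netmask of 255.255.255.0 the same function returns an address ending in .255, which is the case most home networks use.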

The second interface is the loopback interface, which is the host’s interface to itself. The loopback address is conventionally 127.0.0.1, which is a reserved address in the IP address space. It also has the reserved name localhost. One program on a host can communicate with another program on the same host over a network interface, and this is generally done over the loopback address. The packets sent from one program to another over this address never leave the host, and the address is available whether the host is connected to a network or not.
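As a rough illustration of two programs talking over the loopback address, the sketch below plays both roles in one Python script: one socket listens on 127.0.0.1 and another connects to it. The function name loopback_round_trip is hypothetical, and the single-threaded approach is only suitable for short messages.

```python
import socket


def loopback_round_trip(message: bytes) -> bytes:
    """Send `message` from one socket to another over 127.0.0.1 and return what arrived."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        # Port 0 asks the operating system to pick any free port.
        server.bind(("127.0.0.1", 0))
        server.listen(1)
        server.settimeout(2)
        port = server.getsockname()[1]

        # The "client" program connects to the listening "server" program.
        with socket.create_connection(("127.0.0.1", port), timeout=2) as client:
            connection, _client_address = server.accept()
            with connection:
                connection.settimeout(2)
                client.sendall(message)
                client.shutdown(socket.SHUT_WR)  # signal that nothing more will be sent
                received = b""
                while True:
                    chunk = connection.recv(1024)
                    if not chunk:
                        break
                    received += chunk
                return received


if __name__ == "__main__":
    print(loopback_round_trip(b"hello over loopback"))
```

This works even with the network cable unplugged or Wi-Fi off, since the packets never leave the host.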

ifconfig is not just a tool for listing network interfaces; it’s also used to configure them, and there are many options for its use. Though most consumer machines have a graphical interface for configuring network interfaces, this tool can do all you need to configure a new interface. For more about it, you can view the manual page by typing man ifconfig.

Your Local Network

Once you know your own interface, you might want to know about your local network. Your computer keeps a table of the local network addresses it’s aware of, starting with the router to which it’s connected. This table is built using the Address Resolution Protocol (ARP), which matches IP addresses to hardware (MAC) addresses, and the arp command lists it. Type

arp -a

to see your ARP table.

If you’ve just turned the computer on and connected to nothing but your router, you’ll get a reply like this:

_gateway (100.100.127.1) at aa:bb:dd:cc:ff:ee [ether] on ens2

That’s telling you that your computer knows there’s another host on the network with the address 100.100.127.1, and that host’s MAC address is aa:bb:dd:cc:ff:ee. As your computer makes connections to other hosts on the local network, it will add entries to the ARP table.
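If you want to work with ARP entries in a script, a line like the one above can be split into its parts. The following is a sketch only: the regular expression is written against the sample line shown here, the field names are my own, and the exact output format of arp -a varies between operating systems.

```python
import re
from typing import Dict, Optional

# Pattern modeled on the sample line above; the [hwtype] part is treated as optional.
ARP_LINE = re.compile(
    r"^(?P<name>\S+) \((?P<ip>[0-9.]+)\) at (?P<mac>[0-9a-fA-F:]+)"
    r"(?: \[(?P<hwtype>\w+)\])? on (?P<interface>\S+)"
)


def parse_arp_line(line: str) -> Optional[Dict[str, Optional[str]]]:
    """Return the fields of one `arp -a` line, or None if the line does not match."""
    match = ARP_LINE.match(line.strip())
    return match.groupdict() if match else None


if __name__ == "__main__":
    sample = "_gateway (100.100.127.1) at aa:bb:dd:cc:ff:ee [ether] on ens2"
    print(parse_arp_line(sample))
```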

Like ifconfig, arp is a configuration tool, and can be used to add and remove items from the arp table. For more information, see the manual page by typing man arp.

Contacting a Remote Host

When a host is on a network, it listens for connections from other hosts on network ports. Every network interface has up to 65,535 ports on which it can listen or send messages out. The actual number of ports that a host listens on is usually far fewer; in fact, it’s good practice to set up a firewall program on your computer that blocks traffic coming into or going out of all ports except those that are used by specific applications. Consumer operating systems like macOS and Windows ship with built-in firewalls.

Many common network applications use well-known ports. For example, ssh, used for encrypted connection to a remote host’s command line interface, is run on port 22 by convention. Web server applications, which use the HTTP and HTTPS protocols, typically listen on port 80 for HTTP and port 443 for HTTPS, and so forth.

ICMP and ping

There are a number of ways to contact another host, depending on what you want to know. One of the most basic questions you might ask a remote host is “Are you on the network?” The Internet Control Message Protocol (ICMP) includes messages designed to do just that. IP-based hosts are expected to respond to ICMP messages, and you can send these messages using the ping command like so:

ping -c 3 93.184.216.34

The -c 3 flag tells ping to send three ICMP messages. If the host responds, the ping program will give you results like this:

PING example.com (93.184.216.34): 56 data bytes
Request timeout for icmp_seq 0
64 bytes from 93.184.216.34: icmp_seq=1 ttl=51 time=75.231 ms
64 bytes from 93.184.216.34: icmp_seq=2 ttl=51 time=88.344 ms

This tells you the response time from the host, and the time to live (TTL) of the returned message. Every packet on the internet goes through multiple hosts (your router, your router’s router, your target’s router, and so forth), and each of those is a hop. The TTL is the number of hops a packet may still take: each host that forwards the packet reduces its TTL by one, and when the TTL reaches zero the packet is discarded. The ping command typically defaults to a TTL of 64 hops. You can set the number of hops with the -m flag (on macOS and BSD systems; the Linux version of ping uses -t instead) like so:

ping -c 3 -m 8 93.184.216.34

This will set the TTL for your messages to 8 hops. Try it and see what the result is.
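The ttl value in a ping reply can also give a rough idea of how many hops the reply travelled. The sketch below assumes the remote host started its reply at one of the commonly used initial TTL values (64, 128, or 255) and subtracts what is left. This is a heuristic, not something ping reports, and the function name estimate_hops is my own.

```python
import re
from typing import Optional

# Assumption: the replying host used one of these common starting TTLs.
COMMON_INITIAL_TTLS = (64, 128, 255)


def estimate_hops(reply_line: str) -> Optional[int]:
    """Estimate the hops a reply travelled from the ttl= field of a ping output line."""
    match = re.search(r"ttl=(\d+)", reply_line)
    if match is None:
        return None  # for example, a "Request timeout" line has no ttl field
    remaining = int(match.group(1))
    for initial in COMMON_INITIAL_TTLS:
        if remaining <= initial:
            return initial - remaining
    return None


if __name__ == "__main__":
    line = "64 bytes from 93.184.216.34: icmp_seq=1 ttl=51 time=75.231 ms"
    print(estimate_hops(line))  # 13, if the reply started with a TTL of 64
```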

One way of testing a local network to see what other hosts are on it is to send a ping to the broadcast address for the network. For example, if your IP address is 10.0.1.1 and your netmask is 255.255.255.0, you might send the following ping:

ping -c 3 10.0.1.255

Depending on the router’s security policy and the other hosts’ policies, you may get a reply from each host on the local network. (On Linux, ping may require the -b flag before it will send to a broadcast address.) This command, followed by arp -a, is a useful way to get a list of hosts on your local net.

Many hosts block ICMP packets by default, as ICMP messages are often used in network exploits. So it’s good practice to be careful with your use of ping, and to expect that sometimes you won’t get a reply from a host you know is online.

For more on ping, type man ping to see the manual.

netcat

The netcat command is a useful utility for contacting hosts using a number of protocols. Like the cat command, which allows you to view the contents of a file, netcat is intended as a viewer of network connections. One of the simplest uses of netcat is as a port scanner, to see if a server is listening on a given port. For example, if you think a web server is listening on port 80, you can check like so:

nc -z example.com 80

The -z flag tells netcat not to send any data, but just to report whether there’s a listener on that address at that port. The reply in this case is as follows (some versions of netcat print it only if you add the -v flag):

Connection to example.com port 80 [tcp/http] succeeded!

You can also set a range of ports, and netcat will scan them all to see if they’re open, like so:

nc -z example.com 80-8080
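The same kind of check can be sketched in Python by attempting a TCP connection and seeing whether it succeeds. This is an approximation of what nc -z does, not its actual implementation, and the names port_is_open and open_ports are illustrative.

```python
import socket
from typing import Iterable, List


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and names that do not resolve.
        return False


def open_ports(host: str, ports: Iterable[int], timeout: float = 1.0) -> List[int]:
    """Return the subset of `ports` that accept a TCP connection."""
    return [port for port in ports if port_is_open(host, port, timeout)]


if __name__ == "__main__":
    print(port_is_open("example.com", 80))
    print(open_ports("localhost", range(8000, 8010)))
```

Scanning a wide range one port at a time like this is slow, because each closed or filtered port can take up to the full timeout.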

netcat can also be used to send a packet of info using TCP or UDP, or to listen for incoming connections. You can run a simple chat server by running netcat in one terminal window like so:

nc -l 8888

The -l flag means netcat should stay open and listen for incoming connections. Then connect to it from another terminal like so:

nc localhost 8888

Then type messages from one window to another. If you know your IP address, you can also connect to your computer from another host. To quit, type control-C.

If you want to send or receive UDP packets instead of the default TCP, you can use the -u flag. As usual, for more on this command, check out the manual using man nc.
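To show how UDP differs from TCP, here is a small Python sketch that sends one UDP datagram over the loopback address. Unlike TCP, there is no connection to set up: the sender simply addresses a packet to the receiver. The function name udp_send_and_receive is hypothetical.

```python
import socket


def udp_send_and_receive(message: bytes) -> bytes:
    """Send one UDP datagram to a local receiver and return what the receiver got."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        # Port 0 asks the operating system to pick any free port.
        receiver.bind(("127.0.0.1", 0))
        receiver.settimeout(2)
        # No listen() or accept(): UDP has no connection to establish.
        sender.sendto(message, receiver.getsockname())
        data, _sender_address = receiver.recvfrom(4096)
        return data


if __name__ == "__main__":
    print(udp_send_and_receive(b"hello over UDP"))
```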

DNS – Numbers to Names

When you tried the arp command, you may have noticed that each response listed the name, not just the number, of the host. The Domain Name System, or DNS, is the system by which names are matched to host addresses. (It’s not to be confused with the Dynamic Host Configuration Protocol, or DHCP, which is the protocol by which hosts are assigned IP addresses.) A domain name server on a network keeps a list of known names, and when a host receives those names from a DNS server, it stores them locally. You can look up the name-to-number information about a host using the nslookup command like so:

nslookup www.example.com

You’ll typically get a reply like so:

Server:		8.8.8.8
Address:	8.8.8.8#53

Non-authoritative answer:
Name:	www.example.com
Address: 93.184.216.34

This response tells you the DNS server that did the lookup for you (in this case Google’s DNS, 8.8.8.8), and the response it got, namely that the name www.example.com is assigned to the IP address 93.184.216.34. The “Non-authoritative answer” message is telling you that the server answering is not itself the authoritative name server for the example.com domain; it got the answer from its local cache or from another name server.
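Programs usually do this kind of lookup through the operating system’s resolver rather than by running nslookup. As a sketch, the Python function below asks the system resolver for the addresses of a name using socket.getaddrinfo. The function name resolve is my own, and unlike nslookup the system resolver may also consult local sources such as a hosts file.

```python
import socket
from typing import List


def resolve(hostname: str) -> List[str]:
    """Return the unique IP addresses the system resolver reports for `hostname`."""
    # Restricting to TCP avoids getting one result per socket type for each address.
    results = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    addresses: List[str] = []
    for _family, _socktype, _proto, _canonname, sockaddr in results:
        address = sockaddr[0]
        if address not in addresses:
            addresses.append(address)
    return addresses


if __name__ == "__main__":
    # Needs a working network connection; may list both IPv4 and IPv6 addresses.
    print(resolve("www.example.com"))
```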

If you want to get more information about a name registration, the whois command can deliver it. The whois command will return the domain name registration record from the registry or registrar responsible for the name, including the registrar, name servers, the contact info of the organization that owns the name, the valid dates of the registration, and more. Though it’s good practice to include basic contact info on a whois record, many domain name registry companies allow you to make this private, and list only themselves rather than their clients.

How Do I Get There? – traceroute

Messages go through several hosts to get from sender to destination. The traceroute command lets you trace the likely route of a message. Traceroute takes advantage of the time-to-live property of packets. You may have noticed that when you sent ping messages with a low TTL, you got the address of the last host to reply, like so:

ping -c 3 -m 2 example.com

36 bytes from coregwb-te7-8-vl901-wlangwd-wwh.net.nyu.edu (10.254.4.38): Time to live exceeded

By extending the time to live by one each time you send a message until you reach the intended destination, you can learn the likely hosts through which your message passes. You can automate this with traceroute. Here’s a trace from a machine internal to the NYU network to www.example.com:

traceroute www.example.com

traceroute to www.example.com (93.184.216.34), 64 hops max, 52 byte packets
 1  10.18.0.3 (10.18.0.3)  71.135 ms  70.981 ms  70.061 ms
 2  * coregwb-te7-8-vl901-wlangwd-wwh.net.nyu.edu (10.254.4.38)  70.260 ms  70.048 ms
 3  128.122.1.36 (128.122.1.36)  83.277 ms  79.135 ms  80.395 ms
 4  ngfw-palo-vl1500.net.nyu.edu (192.168.184.228)  71.461 ms  71.837 ms  70.870 ms
 5  * nyugwa-outside-ngfw-vl3080.net.nyu.edu (128.122.254.114)  71.312 ms  80.912 ms
 6  nyunata-vl1000.net.nyu.edu (192.168.184.221)  71.090 ms  134.406 ms  71.827 ms
 7  nyugwa-vl1001.net.nyu.edu (192.76.177.202)  366.141 ms  71.049 ms  71.123 ms
 8  dmzgwa-ptp-nyugwa-vl3081.net.nyu.edu (128.122.254.109)  71.478 ms  71.848 ms  71.870 ms
 9  128.122.254.68 (128.122.254.68)  71.110 ms  71.080 ms  70.884 ms
10  ix-xe-7-3-2-0.tcore2.nw8-new-york.as6453.net (64.86.62.13)  71.921 ms  71.786 ms  71.478 ms
11  if-ae-3-2.tcore4.njy-newark.as6453.net (64.86.62.26)  72.859 ms  73.581 ms  72.676 ms
12  if-ae-11-14.tcore2.nto-new-york.as6453.net (63.243.186.5)  73.043 ms
    if-ae-11-15.tcore2.nto-new-york.as6453.net (63.243.216.5)  72.858 ms  72.694 ms
13  if-ae-12-2.tcore1.n75-new-york.as6453.net (66.110.96.5)  73.271 ms  75.386 ms  73.217 ms
14  66.110.96.61 (66.110.96.61)  78.345 ms  73.301 ms  73.462 ms
15  152.195.68.141 (152.195.68.141)  74.140 ms  73.798 ms
    152.195.68.131 (152.195.68.131)  74.154 ms
16  93.184.216.34 (93.184.216.34)  72.058 ms  72.554 ms  72.662 ms
17  * * 93.184.216.34 (93.184.216.34)  73.866 ms

You can see it started from a host with a private IP address (10.18.0.3) and then went to NYU’s various hosts. NYU’s public addresses all start with 128.122. From there it went to NYU’s service providers, then to the provider for example.com, and finally to example.com itself (93.184.216.34). Each hop is probed three times, so you see three round-trip times for each hop, one per probe. An asterisk means that no reply to that probe arrived in time.
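The logic of raising the TTL one step at a time can be sketched without touching the network. The Python below is a simulation only: path is a made-up list of hosts standing in for the real route, and send_probe pretends to be the network by reporting which host would answer a probe with a given TTL. A real traceroute sends actual packets, which generally requires elevated privileges.

```python
from typing import List, Tuple


def send_probe(path: List[str], ttl: int) -> Tuple[str, bool]:
    """Simulate one probe along a non-empty `path`.

    Returns the address that answers and whether it is the destination.
    Each host on the path uses up one unit of TTL, so a probe with TTL n
    is answered by the n-th host, or by the destination if n is large enough.
    """
    reached_destination = ttl >= len(path)
    answering_index = min(ttl, len(path)) - 1
    return path[answering_index], reached_destination


def trace(path: List[str], max_hops: int = 64) -> List[str]:
    """Discover the hosts on `path` by sending probes with TTL 1, 2, 3, and so on."""
    discovered = []
    for ttl in range(1, max_hops + 1):
        address, reached_destination = send_probe(path, ttl)
        discovered.append(address)
        if reached_destination:
            break
    return discovered


if __name__ == "__main__":
    # A shortened, illustrative path using addresses from the trace above.
    example_path = ["10.18.0.3", "10.254.4.38", "128.122.1.36", "93.184.216.34"]
    for hop_number, address in enumerate(trace(example_path), start=1):
        print(hop_number, address)
```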

Traceroute is a powerful tool that network administrators use all the time to ensure that their networks are connected to the internet at large, and that messages are getting through. You can use it to determine where delays are, what your most common gateways to specific domains are, and more. There are many add-on tools that have traceroute at the core, and provide more information, like YouGetSignal’s visual traceroute, which attempts to look up the location of each host as well.

Traceroute can send messages using a variety of protocols. On macOS and BSD systems, the -P flag lets you choose your protocol; other versions of traceroute may use different flags. As more hosts have turned off ICMP response by default, UDP is the more common protocol for traceroute, but you can use anything the command supports. As usual, see man traceroute for more details.

There are many other network diagnostic tools you’ll learn as you explore the net more, but these will give you a good start on most network exploration.