The Internet - TCP/IP

The Internet Runs On Standards

Interoperate - The internet involves equipment like computers, routers, phones, browsers etc. formatting bytes and sending them to some other piece of equipment. This only works if the various parties agree on a standard format for the bytes. There were early attempts at "corporate" standards, where Microsoft or IBM would control the format. These were a total failure.

The successful solution was "open standards" - standards not under the control of any one vendor. Anyone is free to read and implement the standard: no fee, no patents, no permission required. Open standards turned out to be the key enabler of the internet as we know it today.

You can buy a GE coffee maker, but you are not then required to buy the GE plug and cord and power supply. The plug is a standard, anything plugs into it. TCP/IP is the same idea.

standard AC power plug

Internet - TCP/IP Standards

Previous lecture .. LAN, e.g. ethernet, Wi-Fi, one house
Internet - world-wide network built on open standards
Internet is like a phone system for computers
TCP/IP Standards, 1974, government sponsored research
Open standards, vendor neutral - super successful pattern
Capitalism's "internet" attempts failed before open standards
With standard foundation in place, capitalism does great building on it

The previous LAN examples connect computers all on the same LAN. Now we will scale the problem up to send packets between any two computers on earth.

The worldwide Internet is built on the TCP/IP family of standards (Transmission Control Protocol / Internet Protocol), which solves the larger problem of sending packets between computers across the whole internet. These are free and open, vendor-neutral standards, which is probably the reason they have been so incredibly successful.

IP Address

Every computer on the internet has an IP address
Here looking at IP v4 addresses, v6 on the horizon
e.g. 171.64.64.166
IP addr is exactly 4 bytes (4 1-byte numbers)
Left part encodes "neighborhood" on internet
Just like phone, 650-725-0000
e.g. 171.64.xxx.xxx generally Stanford campus
e.g. 171.64.64.xxx my floor of Gates building
Cannot just make up an IP addr, depends on location

Every computer on the internet has an "IP address" that identifies it (like a phone number). The IP address is 4 bytes, written as four numbers separated by dots, like "171.64.2.3". The left part of the address encodes in part where that IP address is in the whole internet -- for example any 171.64.(anything) is part of Stanford (like the area code of a phone number). More specifically, in my part of the Gates building, all the IP addresses begin 171.64.64.XX, varying only in that last byte.
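To make the "4 bytes" and "left part is the neighborhood" ideas concrete, here is a small Python sketch using the standard ipaddress module. Writing the Stanford and Gates-floor neighborhoods as the prefixes 171.64.0.0/16 and 171.64.64.0/24 is an assumption made for illustration, based only on the patterns described above.

```python
import ipaddress

addr = ipaddress.ip_address("171.64.64.166")

# An IPv4 address is exactly 4 bytes; the dotted form shows each byte in decimal.
raw = addr.packed
byte_values = list(raw)

# Assumed prefixes standing in for "Stanford campus" and "my floor of Gates".
campus = ipaddress.ip_network("171.64.0.0/16")
floor = ipaddress.ip_network("171.64.64.0/24")

print(len(raw), byte_values)
print(addr in campus, addr in floor)
# An address from a different neighborhood is not part of the campus prefix.
print(ipaddress.ip_address("173.255.219.70") in campus)
```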

Sandra Bullock Blooper

TCP/IP blooper in this video of "The Net"...

video

Domain Names

Domain names
e.g. "www.google.com" "web.stanford.edu"
A human-readable name for an IP addr
The famous x.org x.edu x.com x.gov names
codingbat.com (name for 173.255.219.70)
pippy.stanford.edu (name for 171.64.64.28)
Domain system can look up an IP addr from a domain name
So when you use a domain name on the internet...
1. The domain name is looked up to get IP addr
2. All packet sending uses IP addrs
Registering a domain name costs $30 a year or so
Whoever grabs it first can keep it, unless someone else owns the trademark
Parasites: domain squatters grab the domain, try to re-sell
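The two-step pattern above (look up the name, then send using the IP addr) can be sketched in Python. The function name send_to and the little KNOWN table are made up for this example; the table just holds the two name/address pairs quoted above so the sketch runs without a network. socket.gethostbyname is the real standard-library lookup call.

```python
import socket

def send_to(name, resolver=socket.gethostbyname):
    """Step 1: look up the IP addr for the name. Step 2: address the packet by IP."""
    ip_addr = resolver(name)
    return {"to": ip_addr, "data": b"hello"}

# Stand-in lookup table (an assumption for the demo, not the real domain system).
KNOWN = {"codingbat.com": "173.255.219.70", "pippy.stanford.edu": "171.64.64.28"}

packet = send_to("codingbat.com", resolver=KNOWN.__getitem__)
print(packet["to"])
```

Calling send_to("codingbat.com") with no resolver argument would do a real lookup over the network, and the address returned today may differ from the one in these notes.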

Router

A computer typically connects to a router for service
-We'll say the router is "upstream"
The router provides the internet service, handling the computer's packets
Router has multiple network connections
-It needs at least 2 connections to forward anything
-Forwards packets from one connection to the other
My office computer is at 171.64.64.166
That computer connects "upstream" to router 171.64.64.1
That router handles traffic for a few local computers
Left side of computer and router IP addresses are typically the same
-They are in the same IP "neighborhood"

The most common way for a computer to be "on the internet" is to establish a connection with a "router" which is already on the internet. The computer establishes a connection via, say, ethernet to exchange packets with the router. The router is "upstream" of the computer, connecting the computer to the whole internet. For example, the computer in my Stanford office has IP address 171.64.64.166, and it has a one-hop ethernet connection to its router upstream at 171.64.64.1, and this router handles packets for my computer. Often the router's IP address will end in .1, such as my router's 171.64.64.1. Typically the IP address of the computer and its router will look the same on the left side, since they are in the same "neighborhood" of the internet.

IP Packet - From: and To: IP Addresses

TCP/IP defines a standard IP Packet
Defines addresses, data format, checksum scheme
The TCP/IP packet has both from: and to: fields
The from: and to: fields are both IP addresses
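Here is a toy Python sketch of a packet with from:/to: fields and a checksum. The ToyPacket layout (4 bytes From, 4 bytes To, then data) is invented for this example. The real IP header has more fields, and its checksum covers only the header. The checksum function itself is the 16-bit one's-complement sum style that IP uses.

```python
import ipaddress
from dataclasses import dataclass

def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement sum of the data, then complemented."""
    if len(data) % 2:
        data += b"\x00"          # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:           # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

@dataclass
class ToyPacket:
    src: str      # From: IP address
    dst: str      # To: IP address
    data: bytes

    def to_bytes(self) -> bytes:
        # Toy layout (not the real header): 4 bytes From, 4 bytes To, then data.
        return (ipaddress.ip_address(self.src).packed
                + ipaddress.ip_address(self.dst).packed
                + self.data)

    def checksum(self) -> int:
        return internet_checksum(self.to_bytes())

pkt = ToyPacket("171.64.64.166", "173.255.219.70", b"data data data")
good = pkt.checksum()

# Flip one letter in transit: the receiver's recomputed checksum no longer matches.
damaged = ToyPacket(pkt.src, pkt.dst, b"data dbta data")
print(hex(good), damaged.checksum() == good)
```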

That's A Lot Of Hopping!

How does a packet get around the internet? Answer: Hop Hop Hop Hop Hop Hop Hop Hop Hop. Strange but true.

packet proceeds by multiple hops

Suppose 171.64.64.166 sends a packet over to 173.255.219.70
IP packet marked with ultimate From:/To: IP addrs
Router strategy: send the packet 1 hop closer to its destination
Hop 1: 171.64.64.166 sends packet up to its router
Hop 2: 171.64.64.1 sends packet up to its bigger router
Hop hop hop, over to destination, 10-20 hops typically
Analogy: source capillary up to major artery, over, and down to destination capillary
Picture of Internet/Routers: opte.org Internet Maps

Suppose my computer at 171.64.64.166 wants to send a packet to a computer at 173.255.219.70 somewhere out on the internet (actually that's the codingbat.com server I administer). The Internet is essentially made of a big web of routers talking to each other.

1. My computer prepares an IP packet which includes in particular From:/To: information as IP addresses, like this: (IP Packet From:171.64.64.166 To:173.255.219.70 data data data data).

2. My computer sends that IP packet to my upstream router, one hop, over ethernet. This is the "first hop" of the packet on its journey.

3. The 171.64.64.1 router looks at the To:/From: of the packet and forwards it to the next router, one hop closer to its ultimate destination. Essentially, the router has its own upstream router which is bigger and knows more about the layout of the internet. The packet is forwarded, one hop at a time, until it reaches its ultimate destination. Each router does not need to know the whole route to the destination; each router just needs to know which way to send the packet to get it one-hop closer to its destination. The routers look at the left part of the IP address to get the packet to the right neighborhood -- 173.255.x.x -- with the right part of the address -- x.x.219.70 -- coming into play only when the packet is near its ultimate destination.
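The "each router only knows the next hop" idea can be sketched in Python. Everything in TABLES is hypothetical: the router names campus-router, core-router and hosting-router and their table entries are invented for the demo. Only the two end addresses and 171.64.64.1 come from the example above. Each router picks the matching entry with the longest left-part prefix, and "0.0.0.0/0" plays the role of "send it upstream".

```python
import ipaddress

# Hypothetical routers. Each table maps an address prefix to the next hop.
TABLES = {
    "171.64.64.1": {"0.0.0.0/0": "campus-router"},
    "campus-router": {"171.64.0.0/16": "171.64.64.1", "0.0.0.0/0": "core-router"},
    "core-router": {"171.64.0.0/16": "campus-router", "173.255.0.0/16": "hosting-router"},
    "hosting-router": {"173.255.219.70/32": "173.255.219.70", "0.0.0.0/0": "core-router"},
}

def next_hop(router, dst):
    """Pick the matching prefix with the longest (most specific) left part."""
    matches = [ipaddress.ip_network(prefix) for prefix in TABLES[router]
               if dst in ipaddress.ip_network(prefix)]
    best = max(matches, key=lambda net: net.prefixlen)
    return TABLES[router][str(best)]

def route(src, first_router, dst_text):
    """Follow the packet one hop at a time; no router sees the whole path."""
    dst = ipaddress.ip_address(dst_text)
    path, here = [src, first_router], first_router
    while here != dst_text:
        here = next_hop(here, dst)
        path.append(here)
    return path

path = route("171.64.64.166", "171.64.64.1", "173.255.219.70")
print(" -> ".join(path))
```

Note that the right part of the address (x.x.219.70) only matters in the last table, which matches the description above.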

Router Analysis

Each router knows enough to figure the next hop, not the whole route
There is no "center" of the internet that knows everything
The initiating computer does not typically know anything, delegating to its router
"Core" routers, towards the middle, bigger, fancier, more connections
Routers measure connection functionality/breakage all the time
-Choose alternative routes in real time from breakage, congestion
Routers are a distributed, collaborative system
-Each does its part, cooperatively solving the whole thing
-VS. centralized/top-down system

The routing of a packet from your computer is like a capillary/artery system .. your computer is down at the capillary level, your packet gets forwarded up to larger and larger arteries, makes its way over to the right area, and then down to smaller and smaller capillaries again, finally arriving at its destination. The ultimate destination puts all the packets back together in the right order to recover the original image file or whatever. The routers at the ends have a trivial upstream/downstream configuration, so the next hop for a packet is pretty simple. More central "core" routers tend to have several possible outgoing connections, so they have a more complicated choice about which link to use for the next hop.

The routers, collectively, measure what networks are reachable over what links, and dynamically adjust what links to use for each packet. One simple metric would be to route packets the way that takes the fewest number of hops. In reality, the metrics used are more complex than this. The routing system is resilient to router hardware failures, overloading of certain links due to normal traffic, and links going down. The path taken by an IP packet can change from minute to minute. The routers are another example of a distributed, collaborative system. The old joke is that the backhoe is the IP packet's natural predator in the wild, as construction will sometimes slice through an important data cable, suddenly breaking a link in use. The routers "route around" such damage automatically.
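The fewest-hops metric and "routing around" a broken link can be sketched in Python. This is a deliberately centralized toy: one function sees the whole map and runs a breadth-first search, whereas real routers work this out cooperatively with more complex metrics. The routers A through E and their links are invented for the example.

```python
from collections import deque

# Hypothetical routers A..E and the links between them.
LINKS = {("A", "B"), ("B", "E"), ("A", "C"), ("C", "D"), ("D", "E")}

def fewest_hops(links, start, goal):
    """Breadth-first search: returns a path with the fewest hops, or None."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)
    queue, came_from = deque([start]), {start: None}
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:      # walk back to the start
                path.append(node)
                node = came_from[node]
            return path[::-1]
        for nxt in sorted(neighbors.get(node, [])):
            if nxt not in came_from:
                came_from[nxt] = node
                queue.append(nxt)
    return None

before = fewest_hops(LINKS, "A", "E")
after = fewest_hops(LINKS - {("B", "E")}, "A", "E")   # the backhoe cuts the B-E link
print(before, after)
```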

Note that my computer does not need to know the layout of the internet. My computer just needs to have a connection to its upstream router, and the router, and its upstream router etc., will handle the routing from there.

Very broadly speaking, most data you get or send on the Internet goes in packets which take between 10 and 20 hops from origin to destination.

Paying For Internet Service

Internet service is like a basic utility
Typically you pay a provider for your "upstream" service
Say, $30 per month, for a 10 Mbps (megabits per second) connection
They in turn pay some of that money to their upstream
Sadly, the internet service business in the US is not very competitive = high cost
"Net Neutrality" - a good idea, avoid market manipulation by (few) internet providers
If there were 10 providers competing to provide service, you would not need Net Neutrality legislation

Special "Local" IP Addresses

Note that 10.x.x.x and 192.168.x.x addresses are special "local" IP addresses
These addresses are not valid out on the internet at large
They are used within an organization but not outside it
These are translated to a real IP addr as a packet makes its way
Frequently given out by Wi-Fi routers .. why I mention them
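If you want to check whether an address is one of these "local" ones, Python's standard ipaddress module has an is_private property. A minimal sketch (the helper name is_local is made up here):

```python
import ipaddress

def is_local(text):
    """True for special 'local' addresses such as 10.x.x.x and 192.168.x.x."""
    return ipaddress.ip_address(text).is_private

for text in ["10.0.0.5", "192.168.1.20", "171.64.64.166", "173.255.219.70"]:
    print(text, "local" if is_local(text) else "public")
```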

What Does it Mean to Be On the Internet?

On the internet - e.g. connect to a Wi-Fi router
1. Computer connects to an upstream router to handle traffic. Most Wi-Fi access points combine Wi-Fi radios and a router.
2. The router typically gives the computer an IP address to use
The computer cannot pick an arbitrary IP address, since the left part of the address depends on the location on the internet ... details known by the router
Also, you don't want to pick an IP address in use by someone else, so the router gives you a known good one
3. DHCP "Dynamic Host Configuration Protocol" - automatically configure network settings to work locally. Computers very often use this feature to get needed network configuration from the router automatically.

So what does it mean for a computer to be on the internet? Typically it means the computer has established a connection with a router. The commonly used DHCP standard (Dynamic Host Configuration Protocol) facilitates connecting to a router: the computer establishes a temporary connection, and the router gives your computer an IP address to use temporarily. Typically DHCP is used when you connect to a Wi-Fi access point.
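The bookkeeping behind "the router gives you a known good address" can be sketched in Python. This is not the DHCP message exchange itself, just a toy version of the idea: the router hands out unused addresses from its own neighborhood. The class ToyAddressPool and the 192.168.1.x network are assumptions made for the example.

```python
import ipaddress

class ToyAddressPool:
    """Hands out unused addresses from the router's own neighborhood."""

    def __init__(self, network, router_addr):
        self.network = ipaddress.ip_network(network)
        self.in_use = {ipaddress.ip_address(router_addr)}
        self.leases = {}   # computer name -> address

    def request(self, computer):
        if computer in self.leases:          # same computer asks again
            return str(self.leases[computer])
        for addr in self.network.hosts():    # usable addresses in the network
            if addr not in self.in_use:
                self.in_use.add(addr)
                self.leases[computer] = addr
                return str(addr)
        raise RuntimeError("no free addresses")

    def release(self, computer):
        # The address is only on loan; it goes back in the pool afterwards.
        self.in_use.discard(self.leases.pop(computer))

pool = ToyAddressPool("192.168.1.0/24", "192.168.1.1")
laptop = pool.request("laptop")
phone = pool.request("phone")
print(laptop, phone)
```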

Experiment: bring up the networking control panel of your computer. It should show what IP address you are currently using and the IP address of your router. You will probably see some text mentioning that DHCP is being used.

Demo: DNS Lookup

Here I use the "host" program to look up the IP addr of a domain name. You don't have to do this; I'm just demoing.

$ host codingbat.com    # I type in a command here
codingbat.com has address 173.255.219.70
codingbat.com mail is handled by 10 mx01.1and1.com.
codingbat.com mail is handled by 10 mx00.1and1.com.
$ host www.google.com
www.google.com has address 216.58.217.196
www.google.com has IPv6 address 2607:f8b0:4007:808::2004

Demo: Ping

"Ping" is an old and very simple internet utility. Your computer sends a "ping" packet to any computer on the internet, and the computer responds with a "ping" reply (not all computers respond to ping). In this way, you can check if the other computer is functioning and if the network path between you and it works. As a verb, "ping" is also used in regular English this way .. not sure if that's from the internet or the other way around.

Experiment: Most computers have a ping utility, or you can try "ping" on the command line. Try pinging www.google.com or pippy.stanford.edu (171.64.64.28, on Nick's desk). Try pinging poland.pl ... much farther away from Stanford.

The time reported is the number of milliseconds (a fraction of a second) used for the packet to go and come back. 1 ms = 1/1000 of a second. This "round trip delay" is different from bandwidth.

Here I run the "ping" program for a few addresses, see what it reports

$ ping www.google.com  # I type in a command here
PING www.l.google.com (74.125.224.144): 56 data bytes
64 bytes from 74.125.224.144: icmp_seq=0 ttl=53 time=8.219 ms
64 bytes from 74.125.224.144: icmp_seq=1 ttl=53 time=5.657 ms
64 bytes from 74.125.224.144: icmp_seq=2 ttl=53 time=5.825 ms
^C                            # Type ctrl-C to exit
--- www.l.google.com ping statistics ---
3 packets transmitted, 3 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 5.657/6.567/8.219/1.170 ms
$ ping pippy.stanford.edu
PING pippy.stanford.edu (171.64.64.28): 56 data bytes
64 bytes from 171.64.64.28: icmp_seq=0 ttl=64 time=0.686 ms
64 bytes from 171.64.64.28: icmp_seq=1 ttl=64 time=0.640 ms
64 bytes from 171.64.64.28: icmp_seq=2 ttl=64 time=0.445 ms
64 bytes from 171.64.64.28: icmp_seq=3 ttl=64 time=0.498 ms
^C
--- pippy.stanford.edu ping statistics ---
4 packets transmitted, 4 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.445/0.567/0.686/0.099 ms
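The summary line at the end of each ping run can be recomputed from the individual times. A short Python sketch, using the three google times from the output above. The function name summarize is made up, and using the population standard deviation (statistics.pstdev) is my guess at how ping computes its stddev figure.

```python
import statistics

google_ms = [8.219, 5.657, 5.825]

def summarize(times):
    """Return (min, avg, max, stddev) for a list of round-trip times in ms."""
    return (min(times), statistics.mean(times), max(times), statistics.pstdev(times))

print("round-trip min/avg/max/stddev = %.3f/%.3f/%.3f/%.3f ms" % summarize(google_ms))
```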

Traceroute

See series of hops
First few hops in your IP neighborhood
Farther away hops .. more milliseconds
Note New York, London .. big jump in milliseconds
Hop count does not go up linearly with distance
East bay - 30 miles from Stanford - 11 hops
Russia - 10000 miles from Stanford - 19 hops

Traceroute is a program that will attempt to identify all the routers in between you and some other computer out on the internet - demonstrating the hop-hop-hop quality of the internet. Most computers have some sort of "traceroute" utility available if you want to try it yourself (not required). Some routers are visible to traceroute and some not, so it does not provide completely reliable output. However, it is a neat reflection of the hop-hop-hop quality of the internet. Here are two example traceroutes from my office: first to codingbat.com, and then to a randomly chosen computer with a Serbian (.rs) domain name.

$ traceroute -q 1 codingbat.com   # typing a command to the computer
traceroute to codingbat.com (173.255.219.70), 64 hops max, 52 byte packets
 1  yoza-vlan70 (171.64.70.2)  2.039 ms
 2  bbra-rtr-a (171.64.255.129)  0.932 ms
 3  boundarya-rtr (172.20.4.2)  3.174 ms
 4  dca-rtr (68.65.168.51)  27.085 ms
 5  dc-svl-agg1--stanford-10ge.cenic.net (137.164.50.157)  2.485 ms
 6  dc-oak-core1--svl-agg1-10ge.cenic.net (137.164.47.123)  3.262 ms
 7  dc-paix-px1--oak-core1-ge.cenic.net (137.164.47.174)  4.046 ms
 8  hurricane--paix-px1-ge.cenic.net (198.32.251.70)  14.252 ms
 9  10gigabitethernet1-2.core1.fmt1.he.net (184.105.213.65)  9.117 ms
10  linode-llc.10gigabitethernet2-3.core1.fmt1.he.net (64.62.250.6)  4.975 ms
11  li229-70.members.linode.com (173.255.219.70)  4.761 ms
$ traceroute -q 1 yujor.fon.bg.ac.rs
traceroute to hostweb.fon.bg.ac.rs (147.91.128.13), 64 hops max, 52 byte packets
 1  csmx-west-rtr.sunet (171.64.64.2)  32.802 ms
 2  171.64.255.204 (171.64.255.204)  0.478 ms
 3  dc-svl-agg1--stanford-10ge.cenic.net (137.164.50.157)  0.972 ms
 4  dc-svl-core1--svl-agg1-10ge.cenic.net (137.164.47.121)  2.784 ms
 5  hpr-svl-hpr2--svl-core1.cenic.net (137.164.26.249)  1.107 ms
 6  lax-hpr2--svl-hpr2-10g-2.cenic.net (137.164.25.49)  13.880 ms
 7  hpr-i2-newnet--lax-hpr.cenic.net (137.164.26.134)  9.213 ms          # See the ms go way up here
 8  et-1-0-0.111.rtr.hous.net.internet2.edu (198.71.45.20)  41.892 ms    # houston
 9  et-10-0-0.105.rtr.atla.net.internet2.edu (198.71.45.12)  65.663 ms   # atlanta
10  et-9-0-0.104.rtr.wash.net.internet2.edu (198.71.45.7)  78.620 ms     # DC
11  abilene-wash.mx1.fra.de.geant.net (62.40.125.17)  179.285 ms         # jumped the Atlantic
12  ae0.mx1.pra.cz.geant.net (62.40.98.52)  179.336 ms
13  ae2.mx2.bra.sk.geant.net (62.40.98.55)  183.670 ms
14  ae0.mx1.bud.hu.geant.net (62.40.98.110)  199.815 ms
15  amres-gw.mx1.bud.hu.geant.net (62.40.125.178)  207.006 ms
16  amres-l-j-agg.rcub.bg.ac.rs (147.91.6.85)  193.146 ms
17  cisco3550-fon.rcub.bg.ac.rs (147.91.7.92)  193.536 ms
18  rcub-fon-gw4.rcub.bg.ac.rs (147.91.5.172)  193.758 ms
19  hostweb.fon.bg.ac.rs (147.91.128.13)  208.213 ms

The numbers down the left side are the number of "hops" to that machine. The "ms" figures are the number of milliseconds (1 ms = 1 thousandth of a second) it took for the send/reply. Notice that as the hops get further away, it does roughly take more milliseconds. The first few hops are Stanford addresses, then the route goes over some provider, until it arrives at Linode, which is the company that provides the hardware where codingbat.com currently lives. Small mystery: it seems like the first hop should be 171.64.64.1 which is the first router from my office; apparently that router is invisible to traceroute.
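If you want to pick apart traceroute output with a program, here is a Python sketch that parses a few lines copied from the second traceroute above (trailing comments dropped) and finds the biggest jump in milliseconds between adjacent hops. The regular expression assumes the "-q 1" one-time-per-hop format shown here; other traceroute versions may print differently.

```python
import re

# A few lines copied from the second traceroute above.
OUTPUT = """\
 9  et-10-0-0.105.rtr.atla.net.internet2.edu (198.71.45.12)  65.663 ms
10  et-9-0-0.104.rtr.wash.net.internet2.edu (198.71.45.7)  78.620 ms
11  abilene-wash.mx1.fra.de.geant.net (62.40.125.17)  179.285 ms
12  ae0.mx1.pra.cz.geant.net (62.40.98.52)  179.336 ms
"""

# hop number, host name, (IP addr), milliseconds
LINE = re.compile(r"^\s*(\d+)\s+(\S+)\s+\(([\d.]+)\)\s+([\d.]+) ms")

def parse(output):
    hops = []
    for line in output.splitlines():
        m = LINE.match(line)
        if m:
            hops.append((int(m.group(1)), m.group(2), m.group(3), float(m.group(4))))
    return hops

def biggest_jump(hops):
    """Return the pair of adjacent hops with the largest increase in ms."""
    pairs = zip(hops, hops[1:])
    return max(pairs, key=lambda pair: pair[1][3] - pair[0][3])

hops = parse(OUTPUT)
before, after = biggest_jump(hops)
print(before[1], "->", after[1], round(after[3] - before[3], 3), "ms")
```

On these lines the biggest jump is the hop from the Washington DC router to the one in Frankfurt, which is the Atlantic crossing.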

Internet and the Speed of Light

The speed that a packet can go is never faster than speed of light
It will go some fraction slower than light because,..
-The wires jump from city to city, not the shortest route
-Signal in a wire/fiber is slower than in a vacuum
-The routers take some time to pack/unpack/forward the packets
-There's other traffic using the routers too
That said, it's neat that the ping times are within a small factor of the speed-of-light limit

Here's a traceroute of a computer in London, ae-9.r24.londen12.uk.bb.gin.ntt.net. I did a traceroute of theregister.co.uk to get this ae-xxx IP in London.

$ traceroute -q 1 ae-9.r24.londen12.uk.bb.gin.ntt.net
traceroute to ae-9.r24.londen12.uk.bb.gin.ntt.net (129.250.2.19), 64 hops max, 52 byte packets
 1  csmx-west-rtr-vl3864.sunet (171.64.64.2)  1.231 ms
 2  dc-svl-rtr-vl8.sunet (171.64.255.204)  0.562 ms
 3  dc-svl-agg4--stanford-100ge.cenic.net (137.164.23.144)  1.212 ms
 4  10-1-1-91.ear1.sanjose1.level3.net (4.15.122.45)  1.416 ms
 5  ae-1-2.ear1.sanjose3.level3.net (4.69.209.149)  2.256 ms
 6  ntt-level3-4x10g.sanjose.level3.net (4.68.62.206)  3.908 ms
 7  ae-1.r02.snjsca04.us.bb.gin.ntt.net (129.250.3.59)  2.298 ms
 8  ae-11.r23.snjsca04.us.bb.gin.ntt.net (129.250.6.118)  1.801 ms
 9  ae-3.r21.sttlwa01.us.bb.gin.ntt.net (129.250.3.125)  21.240 ms
10  ae-0.r20.sttlwa01.us.bb.gin.ntt.net (129.250.2.53)  19.918 ms
11  ae-0.r24.nycmny01.us.bb.gin.ntt.net (129.250.4.14)  79.439 ms
12  ae-9.r24.londen12.uk.bb.gin.ntt.net (129.250.2.19)  163.360 ms

Note: last 2 lines: New York 79 ms, London 163 ms

Back of the envelope math is a great skill!
Just 1 or 2 digits and some zeros
Stanford - New York: 2500 miles
Packet goes out and back
Distance: 5000 miles
5000 miles / speed-of-light (186000 miles per second) = fraction of a second for that dist
5000/186000 = .027 seconds, aka 27 ms
Compare this figure to airplane trip!
Packet time was 79 ms
Packets seem to travel effectively about 1/3 speed of light

Now try London too

Stanford - London: 5300 miles
Packet goes out and back
Distance: 10600 miles
10600/186000 = .057 seconds, aka 57 ms
Traceroute packet time: 163 ms
About a third speed of light!
Just a rule of thumb / upper-limit
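The back-of-the-envelope math above is easy to redo in a few lines of Python. The distances and measured times are the ones from these notes; the constant name and function name are made up for the sketch.

```python
MILES_PER_SECOND = 186_000   # speed of light in a vacuum, rounded

def light_round_trip_ms(one_way_miles):
    """Milliseconds for light to go out and back over the given distance."""
    return 2 * one_way_miles / MILES_PER_SECOND * 1000

for city, miles, measured_ms in [("New York", 2500, 79.439), ("London", 5300, 163.360)]:
    ideal = light_round_trip_ms(miles)
    fraction = ideal / measured_ms
    print(city, round(ideal), "ms at light speed,", round(fraction, 2), "of light speed achieved")
```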

TCP/IP Summary Picture

packet hopping across many routers