introduction to the internet - weeblykevinx-chiu.weebly.com/uploads/8/9/8/3/8983380/... · the...

25
The Internet and the Web – a Behind the Scenes Introduction Scott Barker INFO 200, Autumn 2013 Some graphics courtesy of: Elisabeth Jones INFO 200, Summer 2010

Upload: others

Post on 28-May-2020

5 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Introduction to the Internet - Weeblykevinx-chiu.weebly.com/uploads/8/9/8/3/8983380/... · The Internet has been around for a long time (started in the 1960’s) The Internet is a

The Internet and the Web – a Behind the Scenes Introduction Scott Barker INFO 200, Autumn 2013 Some graphics courtesy of: Elisabeth Jones INFO 200, Summer 2010

Page 2: Introduction to the Internet - Weeblykevinx-chiu.weebly.com/uploads/8/9/8/3/8983380/... · The Internet has been around for a long time (started in the 1960’s) The Internet is a

Today’s agenda • Brief History

• Basics of how the Internet works (protocols)

• Look at some tools, do some packet sniffing

Advanced warning – TONS of stuff today, but the goal is a general understanding of the main concepts, not a mastery of all the details Some slides are dense so you can refer to them later and they make sense

Page 3: Introduction to the Internet - Weeblykevinx-chiu.weebly.com/uploads/8/9/8/3/8983380/... · The Internet has been around for a long time (started in the 1960’s) The Internet is a

What is the Internet? • The Internet is a world-wide set of communication networks that

interoperate using the TCP/IP protocol suite – Think of “communication software” or “standards” when you see

“protocol”

– more on TCP/IP and other protocols later…

• A network of networks • A loosely controlled anarchy

• Sometimes called “The Public Internet” to differentiate it from

private internets or intranets

• Questions like: What does it look like?, How big is it? How many users are there? How much data is present? Are all exceedingly difficult to answer. Why?

Page 4: Introduction to the Internet - Weeblykevinx-chiu.weebly.com/uploads/8/9/8/3/8983380/... · The Internet has been around for a long time (started in the 1960’s) The Internet is a

Brief History

In the 1960’s the Defense Advanced Research Projects Agency (DARPA) wanted to develop a network that would connect geographically diverse, heterogeneous computers

1968/69 -- 4 sites are connected in the original ARPAnet demonstration

To expand this network, standards were necessary, so 1974 --First TCP/IP protocols developed 1983, the use of TCP/IP had been mandated by the Secretary of Defense

Page 5: Introduction to the Internet - Weeblykevinx-chiu.weebly.com/uploads/8/9/8/3/8983380/... · The Internet has been around for a long time (started in the 1960’s) The Internet is a

The Original ARPANet Connecting East and West Coasts

Page 6: Introduction to the Internet - Weeblykevinx-chiu.weebly.com/uploads/8/9/8/3/8983380/... · The Internet has been around for a long time (started in the 1960’s) The Internet is a

Additional Design Considerations

• Network must scale well

• Must connect a very large number of nodes and different network types

• Should continue functioning even if a particular segment failed or if there were unreliable portions of the network – Why do you think this was important?

Page 7: Introduction to the Internet - Weeblykevinx-chiu.weebly.com/uploads/8/9/8/3/8983380/... · The Internet has been around for a long time (started in the 1960’s) The Internet is a

TCP/IP

Provides reliable, in-order, end-to-end transmission of data All Internet communications happen or is built on top of TCP/IP Invention of TCP/IP credited to Vint Cerf, 1973. Cerf often considered the “Father of the Internet”

TCP/IP often called a “protocol suite” because many other protocols are built on top of it

Transmission Control Protocol /Internet Protocol

Page 8: Introduction to the Internet - Weeblykevinx-chiu.weebly.com/uploads/8/9/8/3/8983380/... · The Internet has been around for a long time (started in the 1960’s) The Internet is a

Protocol • An agreed upon way to communicate

• Protocols establish rules for the communication to be successful

– We use protocols in everyday life as well as on the Internet – Protocols may exists for years, even when other parts of the process or

other technologies change

Example: Sending a letter through the US Post Office

What protocols are in use here? What’s missing or wrong in the two examples?

Page 9: Introduction to the Internet - Weeblykevinx-chiu.weebly.com/uploads/8/9/8/3/8983380/... · The Internet has been around for a long time (started in the 1960’s) The Internet is a

What and how fast? Letters, words, sentences, pictures, graphics, sounds, videos are all things that we need to communicate over the Internet Since the Internet is a digital communications network, everything must be translated into a series of 1’s and 0’s

A single 1 or 0 is a bit A sequence of eight 1’s and 0’s (for example 10010100) is a byte A kilobyte is 1,000 bytes A megabyte is 1 million bytes A gigabyte is 1,000 megabytes Network bandwidth or speed is measured by how many bits per second can be transmitted and it is usually expressed in BITS per second (not bytes per second) Note: the b is usually written in lower case for bits, as opposed to upper case for Bytes). (56Kb/sec, 768Kb/sec, 10Mb/sec, 100Mb/sec, 1Gb/sec, 10Gb/sec) Bandwidth varies depending on the type of technology used to connected to the Internet and your Internet provider (can also be a function of what you are willing to pay for).

To Measure Bandwidth:

Try Speakeasy.net Network Speed Test

or

SpeedTest.Net For your mobile device

Page 10: Introduction to the Internet - Weeblykevinx-chiu.weebly.com/uploads/8/9/8/3/8983380/... · The Internet has been around for a long time (started in the 1960’s) The Internet is a

How is information turned into binary? Letters and numbers – one standard is ASCII (American Standard Code for Information Interchange)

“A” has an “ASCII” value of 65 decimal or 1000001 in binary “B” has an ASCII value of 66 decimal or 1000010 “a” has an ASCII value of 97, or 1100001

Pictures, graphics, sounds, videos – standard binary file formats also are agreed upon (jpeg, mpeg, mp3, MANY more)

– Note: transmitting pictures is generally much more bandwidth intensive than transmitting text, and videos are generally much more bandwidth intensive than pictures. Why?

– But the continuing desire to do so has resulted in a strong push and demand for faster and faster Internet connections for all

Page 11: Introduction to the Internet - Weeblykevinx-chiu.weebly.com/uploads/8/9/8/3/8983380/... · The Internet has been around for a long time (started in the 1960’s) The Internet is a

Getting those 1’s and 0’s someplace… Packet Switching vs. Circuit Switching

Communication technologies are typically based on two approaches:

Circuit switch approach - dedicated connection between two devices Packet switch approach - bandwidth is shared and all data is “packetized” or turned into little pieces. There may be many possible connections between devices.

Most computer networks (including the Internet) use this packet approach Why?

Page 12: Introduction to the Internet - Weeblykevinx-chiu.weebly.com/uploads/8/9/8/3/8983380/... · The Internet has been around for a long time (started in the 1960’s) The Internet is a

Network Packets

Having the data to communicate (“the payload”) isn’t enough Addressing information is necessary so packets are delivered to the correct device When you combine this addressing type information with real data, you have a “packet” A network packet includes source, destination, other delivery details, and data. In the case of Ethernet (the most widely deployed type of local area network), these packets can be up to 1500 bytes in size.

Given the 1500 byte max size of a packet, a single web page may be broken into dozens or even hundreds of packets went sent across the Internet

Remember each box in the above diagram represents a series of 1’s and 0’s

Page 13: Introduction to the Internet - Weeblykevinx-chiu.weebly.com/uploads/8/9/8/3/8983380/... · The Internet has been around for a long time (started in the 1960’s) The Internet is a

Demo

Looking at packets using a Packet Sniffer

(wireshark)

Page 14: Introduction to the Internet - Weeblykevinx-chiu.weebly.com/uploads/8/9/8/3/8983380/... · The Internet has been around for a long time (started in the 1960’s) The Internet is a

IP (Internet Protocol) Addressing Each device on the Internet must have a unique public “IP Address”

In the current version of IP (IPv4), addresses are 4 bytes long or 32 bits. (Can support 2^32 addresses or 4,294,967,296 devices) 10011000000000100101000100000001 To make them easy for humans to remember, the 32 bit sequence is divided into 4 pieces (each 8 bits long) and then converted to a decimal number. Each piece of the address is then separated by a “.” 10011000 00000010 01010001 00000001 152 2 81 1

For example: 152.2.81.1 Note: with 8 bits you can represent 256 different possibilities (2^8), so each decimal number will be between 0 and 255 0 and 255 are also reserved for special purposes so in practice, you will only see devices assigned IP address that have numbers between 1-254

IP Addresses are running low, new version of IP (IPv6) is on the horizon

Addresses are 128 bits. Can support 2^128 addresses (340 undecillion addresses).

Page 15: Introduction to the Internet - Weeblykevinx-chiu.weebly.com/uploads/8/9/8/3/8983380/... · The Internet has been around for a long time (started in the 1960’s) The Internet is a

Private Addresses

A portion of the Internet address space has been reserved for “private” IP addresses

192.168.x.x 172.16.0.0 – 172.31.255.255 10.0.0.0 – 10.255.255.255

Private addresses may be used within a home network or on corporate/organizational networks. This allows those machines to use TCP/IP for communication but not have those devices accessible publically. Why would you want to do that?

Page 16: Introduction to the Internet - Weeblykevinx-chiu.weebly.com/uploads/8/9/8/3/8983380/... · The Internet has been around for a long time (started in the 1960’s) The Internet is a

Assigning Numbers to Machines

• Statically

A network administrator decides how to allocate those numbers on their local network and each device is setup manually

• Dynamically

Device requests an IP address from a pool of shared addresses on boot The address assigned is “leased” for a particular time of use On subsequent occasions your machine may get a different IP address Done using a protocol called DHCP (dynamic host configuration protocol)

Page 17: Introduction to the Internet - Weeblykevinx-chiu.weebly.com/uploads/8/9/8/3/8983380/... · The Internet has been around for a long time (started in the 1960’s) The Internet is a

Demo

ipconfig and ipconfig /all Windows Network GUI

On Mac – System Preferences/Network

Page 18: Introduction to the Internet - Weeblykevinx-chiu.weebly.com/uploads/8/9/8/3/8983380/... · The Internet has been around for a long time (started in the 1960’s) The Internet is a

What we saw…

• Our MAC address – Media Access Control, the unique address of our network interface card

• Our IP address

• How we got our IP address (statically or DHCP)

• But some other things we haven’t talked about yet like: – Default Gateway – Primary DNS Server

Page 19: Introduction to the Internet - Weeblykevinx-chiu.weebly.com/uploads/8/9/8/3/8983380/... · The Internet has been around for a long time (started in the 1960’s) The Internet is a

Default Gateway/Routers

• The Internet is HUGE and would not scale if all packets to/from all destinations appeared everywhere or if every device had to know how to get to every other device

• Instead, specialized devices called “routers” segment the Internet into pieces

• A router is a traffic copy, responsible for keeping data that is on your network on your network only, and passing it on when it needs to go elsewhere

• The “default router” or “default gateway” tells your computer where data is sent when it isn’t on your network

• Routers themselves have default routers, so to get from one place to another a packet may traverse many routers before arriving at the destination

Page 20: Introduction to the Internet - Weeblykevinx-chiu.weebly.com/uploads/8/9/8/3/8983380/... · The Internet has been around for a long time (started in the 1960’s) The Internet is a

Domain Name Service (DNS)

• Most users don’t see or use numerical IP addresses often. They use names instead www.uw.edu www.facebook.com but it is possible to enter the number in a web browser, like: http://157.166.226.26

• The use of names rather than numbers is made possible by DNS, the Domain Name Service – the DNS server stores name/address pairs so that you can enter

a name and it will lookup the number and give that information to the lower level protocol (TCP/IP)

Page 21: Introduction to the Internet - Weeblykevinx-chiu.weebly.com/uploads/8/9/8/3/8983380/... · The Internet has been around for a long time (started in the 1960’s) The Internet is a

DNS Continued

• DNS is a hierarchical system that includes .com, .org, .edu, .mil, and country based addresses like .ca or .nz ICANN made news this summer by opening the number of possible domains to be virtually unlimited (.sport, .eco have been suggested, many more)

• DNS for your computer/device may be provided by your organization (iSchool, UW), your ISP, or a commercial company (e.g. GoDaddy or Network Solutions). The server you contact for DNS information may be set statically or given to you via DHCP

• You can query DNS manually using “nslookup”

• You can find out some details about who owns a particular DNS name by doing a “whois” lookup.

www.networksolutions.com/whois

Page 22: Introduction to the Internet - Weeblykevinx-chiu.weebly.com/uploads/8/9/8/3/8983380/... · The Internet has been around for a long time (started in the 1960’s) The Internet is a

Demo

nslookup

Page 23: Introduction to the Internet - Weeblykevinx-chiu.weebly.com/uploads/8/9/8/3/8983380/... · The Internet has been around for a long time (started in the 1960’s) The Internet is a

Internet applications Include:

The web, email, ftp, remote desktop, instant messaging, and older things like telnet and USENET news. Most mobile “apps” also are Internet applications built on top of TCP/IP. They consume or store data “in the cloud”. All utilized TCP/IP as their protocol for getting packets from one place to another. Many use Application Level Protocols as well.

Page 24: Introduction to the Internet - Weeblykevinx-chiu.weebly.com/uploads/8/9/8/3/8983380/... · The Internet has been around for a long time (started in the 1960’s) The Internet is a

Application Level Protocols

Each application has it’s own protocol “on top” of TCP/IP, such as SMTP for mail, or HTTP for Web. That’s why TCP/IP is sometimes called a “Protocol Suite”.

It includes not only the protocols for getting data from one place to another in a reliable fashion, but it also includes application layer protocols that define how email works, how the Web works etc.

The Internet is called an “open” network because how these applications work is widely known. Any number of vendors can build products on top of these standards and have them interoperate with others.

Standards based protocols are defined in Internet RFC’s (Requests for Comment).

Not all Internet applications are based on open protocols. Increasingly, many products and services are built on “proprietary” protocols and don’t interoperate well or at all. (Instant Messaging is an example, iTunes, Exchange/Outlook “free/busy”).

Page 25: Introduction to the Internet - Weeblykevinx-chiu.weebly.com/uploads/8/9/8/3/8983380/... · The Internet has been around for a long time (started in the 1960’s) The Internet is a

Summary

The Internet has been around for a long time (started in the 1960’s) The Internet is a network of networks. Data on the Internet is broken into small pieces called packets. Routers insure packets go only where they need to go. TCP/IP is a suite of protocols that every device on the Internet uses to communicate. There are many applications on the Internet and each application has it’s own application layer protocol on top of TCP/IP The web is but one Internet application, the web and the Internet are not the same thing