chapter 4 network layer tami meredith. 1. routing and switching in general 2. ip and the internet 3....

CSCI 3421: Data Communications and Networking

Chapter 4: Network Layer

Tami Meredith

Overview

1. Routing and Switching in General
2. IP and the Internet
3. Routing in the Internet
4. Broadcast/Multicast

TCP/IP Protocol Suite

Network Layer

Get packets from the sender to the receiver
At each step/link, packets must be:
A. Routed: an output channel must be selected; routing is a network-wide process for route-finding
B. Forwarded: moved from the input to the output channel; forwarding is a local activity at each router

Forwarding

Every router has a forwarding table
1. Packet is received and buffered
2. Network header is examined
3. Data in the header is used as a lookup value in the forwarding table (the table identifies the output channel to use)
4. Packet is forwarded to that channel’s output buffer
5. Packet is transmitted
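The five steps above can be sketched in a few lines of Python. This is an illustrative model only: the `Packet` class, the dictionary-based forwarding table, and the channel names `ch0`/`ch1` are assumptions made for the example, and a real router performs these steps in hardware.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Packet:
    dst: str        # destination address taken from the network header
    payload: bytes


def forward(packet, forwarding_table, output_buffers):
    """Steps 2-4: examine the header, look up the channel, queue the packet."""
    channel = forwarding_table.get(packet.dst)   # steps 2 and 3: header lookup
    if channel is None:
        return None                              # no matching entry: drop
    output_buffers[channel].append(packet)       # step 4: move to output buffer
    return channel


def transmit(output_buffers, channel):
    """Step 5: send the packet at the head of the channel's output buffer."""
    buffer = output_buffers[channel]
    return buffer.popleft() if buffer else None


# Hypothetical router with two output channels.
table = {"10.0.0.1": "ch0", "10.0.0.2": "ch1"}
buffers = {"ch0": deque(), "ch1": deque()}
chosen = forward(Packet("10.0.0.2", b"hello"), table, buffers)
print(chosen)
```

An exact-match dictionary keeps the sketch short; Internet routers match on address prefixes rather than whole addresses.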

Terminology

Routers: use network-layer data to perform routing

Link-layer switches: use link-layer data to make forwarding decisions

Both are “Packet Switches”

Service Models

1. Guaranteed Delivery with Bounded Delay – Packet will (always) arrive at destination AND within a specified time interval

2. Guaranteed Delivery – Packet will always (eventually) arrive at the destination

3. Best Effort – Nothing is guaranteed

Packet “Flows”

May guarantee ordering
May guarantee a minimum available bandwidth
May guarantee a maximum difference in transmission time for each packet (i.e., jitter)
May provide encryption/security (hides even transport-layer details)

Network Types

Datagram: connectionless service; every packet is independent
  E.g., the Internet
Virtual Circuit (VC): create a connection-oriented circuit (e.g., a route) from sender to receiver that all packets will travel
  Requires: setup, data transfer, teardown
  Circuit ID changes for each link (no global knowledge needed, only local, thus simpler)

Virtual Circuits

TCP vs. Network VCs

TCP: host to host; segments may take different routes
VC: all routers participate; packets take the same route

Datagram Networks

Stateless
Routing tables use longest prefix matching
Tables need updating every few MINUTES
-- Complexity of the network is in the end host systems and the network is as minimal as possible
VCs need table updates at the microsecond level
-- Complexity of the system is in the network and hosts can do almost nothing
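Longest prefix matching can be illustrated with Python's standard `ipaddress` module. This is a minimal sketch: the table contents and link numbers are made up for the example, and the linear scan stands in for the specialised lookup structures a real router would use.

```python
import ipaddress


def longest_prefix_match(table, destination):
    """Return the link for the most specific prefix containing destination."""
    dest = ipaddress.ip_address(destination)
    best = None
    for prefix, link in table.items():
        network = ipaddress.ip_network(prefix)
        if dest in network:
            # Keep the match with the longest (most specific) prefix.
            if best is None or network.prefixlen > best[0].prefixlen:
                best = (network, link)
    return best[1] if best else None


# Hypothetical forwarding table: prefix -> output link number.
table = {"200.23.16.0/20": 0, "200.23.18.0/23": 1, "0.0.0.0/0": 2}
print(longest_prefix_match(table, "200.23.19.7"))
```

The address 200.23.19.7 falls inside both 200.23.16.0/20 and 200.23.18.0/23; the /23 entry wins because it is more specific.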

Router Architecture

Data Plane (Hardware): 51.2 ns per 64-byte packet @ 10 Gb/s
------------------------------------------------------------------
Control Plane (Software): millisecond time frames

Switching Techniques

Queuing

When a queue (buffer) gets filled, data must be discarded = packet loss
Major factor in ensuring QoS
Various strategies
  Drop tail: discard incoming packets
  Select and drop one from an output queue
Buffer size must be determined
Old rule of thumb: RTT * Link Capacity = Buffer Size
  250 msec * 10 Gb/s = 2.5 Gbit
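The rule-of-thumb arithmetic above can be checked directly. The function name is only a label for this example; the inputs are the slide's own figures.

```python
def buffer_size_bits(rtt_seconds, link_capacity_bps):
    """Rule of thumb: buffer size = round-trip time * link capacity."""
    return rtt_seconds * link_capacity_bps


size = buffer_size_bits(0.250, 10e9)   # 250 msec at 10 Gb/s
print(size / 1e9, "Gbit")              # size in gigabits
print(size / 8 / 1e6, "MB")            # the same size in megabytes
```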

Head-Of-Line Blocking

IPv4 (RFC 791)

Datagram

Version Number: 4
Header Length: usually 20 bytes
Type of Service: router administrator policy
Datagram Length: max 65535 bytes, usually 1500
Fragmentation Data: … more to come
Time To Live: hop counter (decremented)
Protocol: see IANA Protocol Numbers 2012
Header Checksum: of the header, in 16-bit words
Addresses: source and destination
Options: various things
Data: payload! The good stuff

Why a Checksum (When TCP has it)?
  TCP can be carried over some other network protocol.
  Non-TCP data can be carried in an IP datagram.
  Must be recomputed at every hop since the TTL changes.

Protocol Number
http://www.iana.org/assignments/protocol-numbers/protocol-numbers.xml

Usually 4
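The header checksum discussed above is a one's-complement sum over the header's 16-bit words. The sketch below follows RFC 791; the 20-byte sample header is illustrative data, not taken from the slides. It also shows why the checksum must be recomputed after the TTL is decremented.

```python
import struct


def ipv4_header_checksum(header: bytes) -> int:
    """One's-complement sum of the header taken as 16-bit big-endian words."""
    total = sum(word for (word,) in struct.iter_unpack("!H", header))
    while total >> 16:                       # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Sample 20-byte header with the checksum field (bytes 10-11) set to zero.
header = bytearray.fromhex("450000730000400040110000c0a80001c0a800c7")
checksum = ipv4_header_checksum(bytes(header))
header[10:12] = struct.pack("!H", checksum)

# A receiver verifies by summing the whole header: a result of 0 means valid.
valid = ipv4_header_checksum(bytes(header)) == 0

# A router decrements the TTL (byte 8), so the stored checksum goes stale.
header[8] -= 1
stale = ipv4_header_checksum(bytes(header)) != 0
print(hex(checksum), valid, stale)
```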

Fragmentation

Network MTU sizes in bytes
  16 Mbps Token Ring        17914
  4 Mbps Token Ring          4464
  FDDI                       4352
  Ethernet                   1500
  IEEE 802.3/802.2           1492
  PPPoE (WAN Miniport)       1480
  X.25 (Old, CompuServe)      576

What do we do if a hop has to be on a link using a smaller MTU?

We FRAGMENT the IP datagram into parts!
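As a rough sketch of how a datagram is split, the function below computes the fragments for a given payload and MTU. It assumes a 20-byte header with no options, and relies on the IPv4 rule that fragment offsets are expressed in 8-byte units, so every fragment except the last carries a multiple of 8 data bytes. The function name and tuple layout are choices made for this example.

```python
def fragment(payload_len, mtu, header_len=20):
    """Return (offset in 8-byte units, data bytes, more-fragments flag) tuples."""
    max_data = (mtu - header_len) // 8 * 8    # data per fragment, multiple of 8
    fragments = []
    offset = 0
    while offset < payload_len:
        size = min(max_data, payload_len - offset)
        more = offset + size < payload_len    # set on all but the last fragment
        fragments.append((offset // 8, size, more))
        offset += size
    return fragments


# A 4000-byte datagram (3980 data bytes) crossing a 1500-byte MTU link.
for frag in fragment(3980, 1500):
    print(frag)
```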

Fragments

Addressing

Interface: A boundary between a host and the physical layer

IP Addresses are associated with interfaces, not with hosts

ICANN controlled
None left (the unallocated IPv4 address pool is exhausted)
“Dotted Decimal” notation

Subnets

A portion of a network formed by considering all interfaces as independent and disconnected from their hosts (but not from other interfaces)

Addressing

CIDR: Classless Interdomain Routing
  Based on prefix matching (i.e., subnet mask)
  Prefix causes route/address aggregation
  Longest match used (most specific address)
Historically we used classful addressing
  3 classes: A, B, C
  Wasteful of addresses
Note: 255.255.255.255 = Broadcast Address
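Route aggregation under CIDR can be demonstrated with the standard library's `ipaddress.collapse_addresses`. The eight /24 subnets here are example values chosen so that they share a common 21-bit prefix.

```python
import ipaddress

# Eight contiguous /24 subnets: 200.23.16.0/24 through 200.23.23.0/24.
subnets = [ipaddress.ip_network(f"200.23.{i}.0/24") for i in range(16, 24)]

# They share their first 21 bits, so one prefix can advertise all of them.
aggregate = list(ipaddress.collapse_addresses(subnets))
print(aggregate[0], aggregate[0].netmask)
```

A router outside this block then needs a single table entry instead of eight.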

Dynamic Host Configuration Protocol
  Allows IP addresses to be arbitrarily assigned
  Avoids manual configuration
  Provides flexibility
  Allows addresses to be used by multiple hosts and thus reused
  May always assign the same address to a host

Simple Scenario

DHCP in Action

Network Address Translation
  Used for SOHO (Small Office/Home Office)
  Hidden subnets; allows many computers to share an IP address

NAT Routers

Must be both DHCP clients and servers
Somewhat controversial
  “Misuses” port numbers
  Routers are only supposed to process packets up to layer 3
  Hosts should be talking to each other without the network layer modifying messages
  Makes P2P (e.g., Skype) much more difficult

Should just use IPv6
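The way a NAT router reuses port numbers to share one public address can be sketched as a translation table. The class name, the starting port 5001, and the addresses are all assumptions for illustration; real NAT devices also track the protocol, remote endpoint, and timeouts.

```python
import itertools


class NatTable:
    """Toy NAT translation table: (private IP, port) <-> public port."""

    def __init__(self, public_ip, first_port=5001):
        self.public_ip = public_ip
        self._next_port = itertools.count(first_port)
        self._out = {}   # (private_ip, private_port) -> public_port
        self._in = {}    # public_port -> (private_ip, private_port)

    def outbound(self, private_ip, private_port):
        """Rewrite the source of an outgoing packet, adding an entry if needed."""
        key = (private_ip, private_port)
        if key not in self._out:
            port = next(self._next_port)
            self._out[key] = port
            self._in[port] = key
        return self.public_ip, self._out[key]

    def inbound(self, public_port):
        """Find the hidden host for an incoming packet, or None if unknown."""
        return self._in.get(public_port)


nat = NatTable("138.76.29.7")
print(nat.outbound("10.0.0.1", 3345))
print(nat.inbound(5001))
```

An incoming packet for a port with no table entry has nowhere to go, which is one reason unsolicited P2P connections are hard behind NAT.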

Datagram

Consistent 40-byte header
128-bit addresses
Flow/Priority facilities
Version = 6
Class = IPv4 Type of Service
Next Header = IPv4 Protocol

V4 to V6 …

Flag Day: We ALL just stop using V4 and start using V6 at midnight (UTC) of …

Every piece of Network software will need to be replaced on that day or be able to check the clock to know which IP to use

All the software, routers, systems, will all just work correctly, all the million administrators will know and change it all correctly at exactly the right time …

Dual Stack Approach

All network nodes know both IPv4 and IPv6

Each node has 2 addresses
If it must SEND to an IPv4 node and it RECEIVES an IPv6 datagram, it rebuilds the v6 datagram into a v4 datagram

Never turns v4 into v6 since it can’t recover the flow ID (also for efficiency)

Tunnelling

When we must send to a v4 node, we put the ENTIRE v6 packet into a v4 packet

When we can, we extract the v6 packet and continue sending it

RFC 792
Internet Control Message Protocol
Carried inside IP datagrams
“Secret” layer above Network but below Transport

ICMP Message

Message Format

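Ping: ICMP type 8 (code 0); the reply is type 0 (code 0)
Source Quench: not used
Traceroute
a) Send garbage UDP (bad port) with TTL of 1, 2, 3, …
b) Each router where the TTL expires replies with ICMP type 11 (code 0)
c) Wait for the destination's ICMP response of type 3 (code 3), port unreachable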

Routing

Network Layer = Forwarding + Routing (We’ve seen forwarding)

All hosts are attached to a default router known as the source router

Routing is the finding of a route from the source router to the destination router

Graph Theory

A graph consists of nodes (routers) and edges (connections between routers)

Edges are somehow weighted according to some cost to use them (traffic, time, fiscal)

Paths are routes from one node to another

Paths can be shortest (fewest number of edges) or least cost (lowest aggregate cost to use)

Routing Approaches

Routing can use global knowledge of the entire network (link-state routing)

Routing can be decentralised and function in a distributed/iterative manner (e.g., distance vector routing)

Routing can be static (topology not changed) or dynamic (reacts to topological changes)

Routing can be load-sensitive (edge weights vary with load; not used in the Internet) or load-insensitive

Link-State Routing

Requires global network knowledge
Uses Dijkstra’s algorithm (can also use Prim’s algorithm)
O(n^2)
Can oscillate
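A compact sketch of Dijkstra's algorithm is given below. It uses a binary heap from `heapq` rather than the simple O(n^2) scan the slide refers to, and the six-node graph with its link costs is an example chosen for illustration.

```python
import heapq


def build_graph(edges):
    """Build an undirected weighted graph from (node, node, cost) tuples."""
    graph = {}
    for a, b, cost in edges:
        graph.setdefault(a, {})[b] = cost
        graph.setdefault(b, {})[a] = cost
    return graph


def dijkstra(graph, source):
    """Return the least cost from source to every reachable node."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue                      # stale heap entry, already improved
        for neighbour, cost in graph[node].items():
            candidate = d + cost
            if candidate < dist.get(neighbour, float("inf")):
                dist[neighbour] = candidate
                heapq.heappush(heap, (candidate, neighbour))
    return dist


edges = [("u", "v", 2), ("u", "x", 1), ("u", "w", 5), ("v", "x", 2),
         ("v", "w", 3), ("x", "w", 3), ("x", "y", 1), ("w", "y", 1),
         ("w", "z", 5), ("y", "z", 2)]
costs_from_u = dijkstra(build_graph(edges), "u")
print(costs_from_u)
```

Note that the whole graph is passed in: this is the "global network knowledge" requirement in practice.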

Distance Vector Routing

Asynchronous: nodes operate independently

Iterative: nodes exchange information with neighbours until all information is distributed

Self-terminating: automatically stops when all information is distributed

Distributed: each node only needs information from its neighbours

Bellman-Ford Equation

$d_x(y) = \min_v \{ c(x,v) + d_v(y) \}$

If, en route to y, we stop at node v (after one hop), then the least-cost path to y through v costs c(x,v), the cost to get to v, plus d_v(y), the least cost to get from v to y

For all possible v, simply choose the minimum one
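The equation can be applied repeatedly until no estimate changes. The sketch below simulates this in a single process, so it shows the update rule rather than the asynchronous message exchange of a real distance-vector protocol; the three-node topology and its costs are example values.

```python
def distance_vector(costs):
    """Iterate d_x(y) = min_v { c(x,v) + d_v(y) } until nothing changes.

    costs[x][v] is the cost of the direct link from x to neighbour v.
    """
    nodes = list(costs)
    inf = float("inf")
    # Initial estimates: direct link cost if adjacent, otherwise infinity.
    dist = {x: {y: 0 if x == y else costs[x].get(y, inf) for y in nodes}
            for x in nodes}
    changed = True
    while changed:
        changed = False
        for x in nodes:
            for y in nodes:
                if x == y:
                    continue
                best = min(costs[x][v] + dist[v][y] for v in costs[x])
                if best < dist[x][y]:
                    dist[x][y] = best
                    changed = True
    return dist


costs = {"x": {"y": 2, "z": 7}, "y": {"x": 2, "z": 1}, "z": {"x": 7, "y": 1}}
dist = distance_vector(costs)
print(dist["x"]["z"])
```

Here the direct x-z link costs 7, but going via y costs 2 + 1 = 3, so the estimate for d_x(z) drops to 3.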

Bellman-Ford-Moore Algorithm

Example:

Distance-Vector

Algorithm

Hierarchical Routing

The Internet is BIG!
100s of millions of hosts
Any routing algorithm for the entire Internet would be virtually impossible
Break the Internet into components called Autonomous Systems (AS)
Each AS is controlled by a single corporate entity (e.g., Bell, Rogers, Cogeco)
ASs are connected by gateway routers
Network connecting all the main gateways is called the Core Internet (about 100 Gb/s)
No precise definition of who is a core participant

Autonomous Systems

Route in two manners
1. To hosts that are served by the system
2. To gateways to other systems

Issue: How do you know which exit gateway to use unless you know everything connected to each gateway?

Need an inter-AS routing protocol (as well as intra-AS protocols) Inter = between, Intra = within

ISPs often create subnets and treat them as AS (e.g., Aliant within Bell)

Policy

Routing is not, in reality, based on distance, cost, bandwidth, etc.

Routing between AS is generally governed by policy

Which companies do we have agreements with? Finance more than anything governs routing decisions (need to be able to bill someone)!

Do some routes obey/violate international agreements?

The Internet

Intra-AS Routing
  RIP: Routing Information Protocol
  OSPF: Open Shortest Path First
  AS provider can route however they so wish!
Inter-AS Routing
  BGP: Border Gateway Protocol

One of the oldest routing protocols
Popular because it is part of BSD (1982) supporting TCP/IP
V1 (RFC 1058) and V2 (RFC 2453)
Distance-vector (i.e., local) algorithm
Hop count is the cost metric! (Brutally simple)
No hop count greater than 15 is permitted
Updates exchanged every 30 seconds between neighbours
Updates are called RIP response messages or RIP advertisements

RIP v1 in Practice

Only handles classful routing
Drops a route if it’s not advertised within 180 seconds
Uses UDP (not raw IP)
No need for ordering or continuation
Typically used in lower-tier networks
Vulnerable to attack (no support for router validation)

RIP Advertisement

Message Details

Command (1: Req, 2: Resp, 3: TraceOn, 4: TraceOff, 5: Sun)
Version: 1 or 2
Address Family Identifier: IP = 2
IP Address: use most specific
  Network Number: e.g., 128.6.0.0
  Subnet Number: e.g., 128.6.4.0
  Host Address: e.g., 128.6.4.1
  Default: 0
Metric: hop count
Up to 25 route identification entries

Need:
  Destination subnet
  Next router on the route there
  Number of hops (cost metric) to get to the destination
Have:
  Destination address
  You know the IP of the router that sent this information to you (the source address of the datagram carrying the UDP message)
  Metric = cost
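Turning what an advertisement provides into what the routing table needs can be sketched as follows. This follows the usual RIP update rule (add one hop, treat 16 as unreachable, always accept news from the current next hop); the table layout and function name are assumptions for the example, and timers are omitted.

```python
INFINITY = 16  # RIP treats a metric of 16 as "unreachable"


def process_advertisement(routing_table, sender_ip, entries):
    """Update routing_table from one neighbour's (destination, metric) list.

    routing_table maps destination -> (next_hop_ip, hop_count).
    """
    for destination, metric in entries:
        new_metric = min(metric + 1, INFINITY)   # one extra hop via the sender
        current = routing_table.get(destination)
        if current is None:
            if new_metric < INFINITY:
                routing_table[destination] = (sender_ip, new_metric)
        elif current[0] == sender_ip or new_metric < current[1]:
            # Accept news from the current next hop, or any shorter route.
            routing_table[destination] = (sender_ip, new_metric)
    return routing_table


rip_table = {"128.6.4.0": ("10.0.0.2", 5)}
process_advertisement(rip_table, "10.0.0.3",
                      [("128.6.4.0", 2), ("128.6.0.0", 15)])
print(rip_table)
```

The route to 128.6.4.0 switches to the new neighbour at 3 hops, while 128.6.0.0 is not installed because 15 + 1 hops reaches the unreachable value.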

RIP v2, RIPng

RIP v2: RFC 2453 (1998)
  Supports CIDR
  Supports MD5 authentication
  Provides route tags to differentiate internal and external routes
RIPng: RFC 2080
  Supports IPv6
  More like v1 than v2

Formats

Generally used in upper-tier networks
Uses Dijkstra’s algorithm
Link cost set by administrator (policy decision) and permits route tuning
  Setting every cost to 1 gives hop count
  Inverse of bandwidth (make high bandwidth lower cost)
  Artificial values to promote/avoid specific routes
Carried in raw IP packets
Link-state broadcast upon change or every 30 MINUTES!

OSPF Advances

Security: Exchanges can be authenticated using simple (useless) passwords or MD5 authentication

Support for multiple same-cost paths (load distribution)

Support for multicast routing
Support for hierarchical routing
  Routers can be classified as area border routers
  Special routers identify a backbone area
  Routes go to the backbone, through the backbone, to the destination

Inter-AS Routing: BGP

BGP: Border Gateway Protocol
1. Obtain subnet reachability data from neighbours
2. Propagate this data internally within the AS
3. Find good routes

BGP is COMPLEX – takes years to fully understand and be able to administer

Books exist on how to configure it
Routing is based on policy

BGP Basics

Uses TCP connections on port 179 to connect AS gateways between two ASs
  External sessions connect two ASs
  Internal sessions connect the nodes of a single AS
Routing is for CIDR prefixes, not hosts
ASs have ASNs (AS Numbers) assigned by ICANN (RFC 1930)

BGP vs Intra AS Routing

Policy: it is everything when it comes to BGP, and it’s mostly irrelevant within an AS

Scale: the core Internet is big and can’t be divided, but a single AS can be subdivided

Performance: generally secondary to policy in BGP, but a primary concern within an AS

Multiple Destinations

Broadcast: send to all nodes in the system
  Addressing not needed
Multicast: send to a subset of nodes
  Does not go to all
  Requires addressing
  Generally good to minimise traffic
N-way unicast: send a copy to everyone and ignore duplications
  Needs no new support
  Can be really inefficient

Broadcast Techniques

N-way unicast
  Send a copy to everyone and ignore duplications
  Needs no new support
  Can be really inefficient
Flooding
  Differentiate addressed vs. broadcast packets
  Send broadcast packets to all neighbours
Spanning Trees
  Predetermine optimal (no redundancy, least cost) transmission routes

Flooding

Uncontrolled Flooding
  Node X sends the packet to all its neighbours
  Node Y sends it to all the neighbours except the one it received it from
  Can result in cycles
Sequence Number Flooding
  Use sequence numbers to check for duplicates; don’t forward them
  May be slow and inefficient due to the need to store sequence numbers and do lookups
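Sequence-number flooding can be simulated on a small graph to show that duplicates are discarded and the broadcast stops even when the topology has a cycle. The graph representation and function name are assumptions for this sketch, and all nodes are simulated in one process.

```python
from collections import deque


def flood(graph, source, seq=1):
    """Simulate sequence-number flooding; return (delivered, transmissions)."""
    seen = {node: set() for node in graph}   # per-node record of (source, seq)
    seen[source].add((source, seq))
    queue = deque((source, n) for n in graph[source])
    transmissions = 0
    while queue:
        sender, node = queue.popleft()
        transmissions += 1
        if (source, seq) in seen[node]:
            continue                         # duplicate: do not forward again
        seen[node].add((source, seq))
        for neighbour in graph[node]:
            if neighbour != sender:          # never send back to the sender
                queue.append((node, neighbour))
    delivered = {n for n in graph if (source, seq) in seen[n]}
    return delivered, transmissions


# A triangle: uncontrolled flooding would circulate packets forever here.
triangle = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"]}
delivered, transmissions = flood(triangle, "A")
print(delivered, transmissions)
```

The `seen` sets are the storage and lookup cost mentioned above: every node must remember which broadcasts it has already handled.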

Reverse Path Forwarding

A form of flooding
When we receive a packet
  Transmit it to all neighbours
  Except the one we got it from
  Only if the packet arrived on the link that is on our own shortest path to the packet’s source
That is, we can ignore packets that come via longer routes since a copy also arrives via the shorter route
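The forwarding decision at a single node is a short check. In this sketch, `next_hop_to_source` stands for the neighbour on this node's own shortest unicast path back to the packet's source (assumed to be known from the routing table); the names are illustrative.

```python
def rpf_forward(neighbours, next_hop_to_source, arrived_from):
    """Return the neighbours to forward to under reverse path forwarding."""
    if arrived_from != next_hop_to_source:
        return []                        # came via a longer route: discard
    # Arrived on the shortest path to the source: flood to everyone else.
    return [n for n in neighbours if n != arrived_from]


print(rpf_forward(["A", "B", "C"], next_hop_to_source="A", arrived_from="A"))
print(rpf_forward(["A", "B", "C"], next_hop_to_source="A", arrived_from="B"))
```

Unlike sequence-number flooding, no per-packet state is stored; the node relies only on its existing unicast routing information.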

Spanning Trees

Can be done either globally or locally (just like routing)
Many algorithms exist
One example is the center-based approach
  Pick a controller (core)
  At some time (e.g., entering the network) a node unicasts a join message to the core
  Nodes already in the tree do not forward this message

Broadcast in Action

OSPF uses a variation of sequence number flooding to send link-state advertisements

Applications (e.g., Gnutella) may implement broadcasting

However, application-level broadcasting is really just multicast (it reaches only nodes using the application)

BOOTP and DHCP use broadcast

IPv4 Broadcasting

Older (historic) forms of broadcasting exist but are obsolete

Broadcast address for an IPv4 host can be obtained by doing a bitwise OR of the bit complement of the subnet mask and the host’s IP address

Example: To broadcast to an IPv4 subnet with the address space 172.16.0.0/12 (subnet mask 255.240.0.0), the broadcast address is 172.16.0.0 | 0.15.255.255 = 172.31.255.255.

IP broadcast address 255.255.255.255 is the broadcast address of the zero network (0.0.0.0) – all hosts on this network but not on connected neighbouring networks
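The bitwise OR rule above can be reproduced with the standard `ipaddress` module. The helper name is chosen for this example; the library's own `broadcast_address` attribute is printed alongside as a cross-check.

```python
import ipaddress


def broadcast_address(ip, mask):
    """OR the address with the bit complement of the subnet mask."""
    ip_int = int(ipaddress.IPv4Address(ip))
    mask_int = int(ipaddress.IPv4Address(mask))
    host_bits = ~mask_int & 0xFFFFFFFF       # complement, limited to 32 bits
    return str(ipaddress.IPv4Address(ip_int | host_bits))


result = broadcast_address("172.16.0.0", "255.240.0.0")
print(result)
print(ipaddress.ip_network("172.16.0.0/12").broadcast_address)
```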

IPv6 Broadcasting

IPv6 uses multicast addressing to the all-hosts multicast group

No IPv6 protocols are defined to use the all-hosts address

Multicast

Not all hosts participate
Stupid approach: broadcast the message and have unsubscribed hosts discard it
  No extra infrastructure needed (simple)
  Lots of wasted bandwidth (unnecessary packets)
Solution: define multicast groups and give each group an address
  Obviously, this can use up a lot of address numbers

Internet Group Management Protocol (RFC 3376)

Membership Query: Who’s in the group?
Membership Report: I’m in the group!
  Reports can be sent without queries to join the group
Leave Group (optional): I’m no longer in the group
  Can also “leave” by ignoring queries

Multicast Approaches

Group-Shared Tree
  All routers in the group use the same multicast tree
  Trick is to find the right center
Source-Based Tree
  Every router in the group (that can be a data source) has its own tree
  Based on RPF
  Use pruning when a part of the broadcast tree isn’t needed

The Internet

DVMRP (RFC 1075): Distance-Vector Multicast Routing Protocol
  RPF with pruning
PIM (RFC 3973): Protocol Independent Multicast Routing
  Dense Mode: flood (RPF) and prune, similar to DVMRP
  Sparse Mode: use rendezvous (RV) points to set up the multicast distribution tree
MSDP (RFC 3618, 4611): Multicast Source Discovery Protocol; permits connection of PIM sparse RV nodes from different domains
SSM (RFC 3569, 4607): Source-Specific Multicast
  Only a single source/sender
BGP Multicast (RFC 4271): permits routing information from other protocols (e.g., multicast) to be carried on BGP routed networks
