Copyright © 2005 Department of Computer Science
The Edge of Smartness
Carey Williamson
Department of Computer Science
University of Calgary
Email: [email protected]
Main Message
• Now, more than ever, we need “smart edge” devices to enhance the performance, functionality, and efficiency of the Internet
[Figure: two end systems, each with a five-layer protocol stack (Application, Transport, Network, Data Link, Physical), connected through the Core Network]
The End-to-End Principle
• Central design tenet of the Internet (simple core)
• Represented in design of TCP/IP protocol stack
• Wikipedia: Whenever possible, communication protocol operations should be defined to occur at the end-points of a communications system
• Some good reading:
– J. Saltzer, D. Reed, and D. Clark, “End-to-End Arguments in System Design”, ACM ToCS, 1984
– M. Blumenthal and D. Clark, “Rethinking the Design of the Internet: The end to end arguments vs. the brave new world”, ACM ToIT, 2001
The End-to-End Principle: Revisited
• Claim: The ongoing evolution of the Internet is blurring our notion of what an end system is
• This is true for both client side and server side
– Client: mobile phones, proxies, middleboxes, WLAN
– Server: P2P, cloud, data centers, CDNs, Hadoop
• When something breaks in the Internet protocol stack, we have to find a suitable retrofit to make it work properly
• We have done this repeatedly for decades, and will likely keep doing it again and again!
(Selected) Existing Examples
• Mobility: Mobile IP, MoM, Home/Foreign Agents
• Small devices: mobile portals, content transcoding
• Web traffic volume: proxy caching, CDNs
• Wireless: I-TCP, Proxy TCP, Snoop TCP, cross-layer
• IP address space: Network Address Translation (NAT)
• Multi-homing: smart devices, cognitive networks, SIP
• Big data: P2P file sharing, BT, download managers
• P2P file sharing: traffic classification, traffic shapers
• Security concerns: firewalls, intrusion/anomaly detection
• Intermittent connectivity: delay-tolerant networks (DTN)
• Deep space: inter-planetary IP
The Smart Edge
• Similar “tweaks” will be needed at server side
• Putting new functionality in a “smart edge” device seems like a logical choice, for reasons of performance, functionality, efficiency, security
• What is meant by “smart”?
– Interconnected: one or more networks; define basic information units; awareness of location/context
– Instrumented: suitably represent user activities; location, time, identity, and activity; perf metrics
– Intelligent: provisioning, management, adaptation; appropriate decision-making in real-time
Example 1: Redundant Traffic Elimination
Basic Principles of RTE
• If you can “remember” what you have sent before, then you don’t have to send another copy
• Redundant Traffic Elimination (RTE)
• Done using a dictionary of chunks and their associated fingerprints
• Examples:
– Joke telling by certain CS professors
– Data deduplication in storage systems (90% savings)
– “WAN Optimization” in networks (20% savings)
Redundant Traffic Elimination (RTE)
• Purpose: Use bottleneck link more efficiently
• Basic idea: Use a cache of data chunks to avoid transmitting identical chunks more than once
• RTE process:
– Divide IP packet into chunks
– Select a subset of chunks
– Store a cache of chunks at two ends of a network link or path
– Transfer only chunks that are not cached
• Works within and across files
• Combines caching and chunking
[Figure: a packet divided into Chunk A, Chunk B, and Chunk C (with chunk distance and overlap marked), and a chunk cache that stores each chunk alongside its fingerprint, where FP A = fingerprint(Chunk A)]
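The RTE process above can be sketched in a few lines of Python. This is a minimal illustration, not the scheme from the slides: it assumes fixed-size chunks, caches every chunk rather than a selected subset, and uses a truncated SHA-1 digest as the fingerprint. The names `CHUNK_SIZE`, `encode`, `decode`, and `wire_size` are hypothetical.

```python
import hashlib

CHUNK_SIZE = 64  # bytes; illustrative value, not taken from the slides


def fingerprint(chunk: bytes) -> bytes:
    """Short fingerprint of a chunk (truncated SHA-1, an assumption)."""
    return hashlib.sha1(chunk).digest()[:8]


def split_chunks(payload: bytes, size: int = CHUNK_SIZE) -> list:
    """Divide a payload into fixed-size, non-overlapping chunks."""
    return [payload[i:i + size] for i in range(0, len(payload), size)]


def encode(payload: bytes, cache: dict) -> list:
    """Sender side: send a fingerprint instead of any chunk already cached."""
    tokens = []
    for chunk in split_chunks(payload):
        fp = fingerprint(chunk)
        if fp in cache:
            tokens.append(("ref", fp))      # receiver already holds this chunk
        else:
            cache[fp] = chunk               # remember what we have sent
            tokens.append(("raw", chunk))
    return tokens


def decode(tokens: list, cache: dict) -> bytes:
    """Receiver side: rebuild the payload, learning new chunks as they arrive."""
    out = bytearray()
    for kind, value in tokens:
        if kind == "raw":
            cache[fingerprint(value)] = value
            out += value
        else:
            out += cache[value]             # look the chunk up by fingerprint
    return bytes(out)


def wire_size(tokens: list) -> int:
    """Bytes actually transferred (ignoring token framing overhead)."""
    return sum(len(value) for _, value in tokens)
```

Both ends keep their own cache and update it in the same order, so a fingerprint sent by the sender always resolves at the receiver. Deployed RTE systems typically pick chunk boundaries from the content itself rather than at fixed offsets, which this sketch does not attempt.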
RTE Process Pipeline
[Figure: flowcharts of the current and proposed RTE pipelines. Current: Packet, NIC, Chunking (no overlap), Fingerprinting, FIFO cache management, Forwarding. Proposed: adds the decision steps “Large enough?”, “Content promising?”, and “Overlap OK?”, plus chunk expansion and non-FIFO cache management.]
• Improve traditional RTE
• Exploit traffic non-uniformities:
– Packet size (bypass technique)
– Chunk popularity (new cache management scheme)
– Content type (content-aware RTE)
• Up to 50% more detected redundancy
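The slides name the bypass technique and a non-FIFO cache management scheme without specifying them. The sketch below is one plausible reading, not the authors' design: small packets skip RTE entirely, and the cache evicts the least-hit chunk instead of the oldest one. The threshold `MIN_PACKET_BYTES`, both cache classes, and `count_hits` are hypothetical names introduced for illustration.

```python
from collections import OrderedDict

MIN_PACKET_BYTES = 500  # hypothetical bypass threshold, not from the slides


def should_bypass(packet: bytes, min_size: int = MIN_PACKET_BYTES) -> bool:
    """Small packets skip RTE processing (the "Large enough?" = No branch)."""
    return len(packet) < min_size


class FifoCache:
    """Baseline: evict the oldest chunk when the cache is full."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.items = OrderedDict()  # fingerprint -> chunk, oldest first

    def lookup(self, fp):
        return self.items.get(fp)

    def insert(self, fp, chunk):
        if fp in self.items:
            return
        if len(self.items) >= self.capacity:
            self.items.popitem(last=False)  # drop the oldest entry
        self.items[fp] = chunk


class PopularityCache:
    """Assumed non-FIFO policy: evict the chunk with the fewest hits."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.items = {}  # fingerprint -> chunk
        self.hits = {}   # fingerprint -> number of cache hits

    def lookup(self, fp):
        chunk = self.items.get(fp)
        if chunk is not None:
            self.hits[fp] += 1
        return chunk

    def insert(self, fp, chunk):
        if fp in self.items:
            return
        if len(self.items) >= self.capacity:
            victim = min(self.hits, key=self.hits.get)  # ties: oldest entry
            del self.items[victim]
            del self.hits[victim]
        self.items[fp] = chunk
        self.hits[fp] = 0


def count_hits(cache, fingerprints) -> int:
    """Replay a stream of fingerprints and count how many were cache hits."""
    hits = 0
    for fp in fingerprints:
        if cache.lookup(fp) is not None:
            hits += 1
        else:
            cache.insert(fp, b"chunk-" + fp)
    return hits
```

On a stream where one popular chunk is interleaved with one-off chunks, such as `[b"P", b"P", b"a", b"b", b"P", b"c", b"d", b"P"]` with a capacity of 2, the FIFO cache keeps evicting the popular chunk while the popularity-based cache retains it.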
Main Sources of Redundancy
Type     Value   Description                  Example
Nulls    57.1%   Consecutive null bytes       0x00000000
Text     16.7%   Plain text (English)         Gnutella
HTTP      7.3%   HTTP directives              Content-Type:
Mixed     6.2%   Plain text and other chars   14pt font
Binary    5.8%   Random characters            0x27c46128
HTML      3.7%   HTML code fragments          <HTML> <p>
Char+1    3.2%   Repeated text chars          AAAAAAAz
RTE Summary
• Improves traditional RTE savings by up to 50%
• Techniques can be used individually or together
• RTE very beneficial for wireless traffic
– 30% of users have 10-50% redundant traffic
• Proposed a novel content-aware RTE
– Improve RTE savings by up to 38%
• Challenges of content-aware RTE
– Needs refinement to be able to work on real traces, or exploit an appropriate traffic classification scheme
– Needs improvement in execution time
Example 2: The TCP Incast Problem
Motivation
• Emerging IT paradigms
– Data centers, grid computing, HPC, multi-core
– Cluster-based storage systems, SAN, NAS
– Large-scale data management “in the cloud”
– Data manipulation via “services-oriented computing”
• Cost and efficiency advantages from IT trends, economy of scale, specialization marketplace
• Performance advantages from parallelism
– Partition/aggregation, Hadoop, multi-core, etc.
– Think RAID at Internet scale! (1000x)
Problem Formulation
• High-speed, low-latency network (RTT ≤ 0.1 ms)
• Highly-multiplexed link (e.g., 1000 flows)
• Highly-synchronized flows on bottleneck link
• Limited switch buffer size (e.g., 100 packets)
• Result: TCP retransmission timeouts and TCP throughput degradation
• How to provide high goodput for data center applications?
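A back-of-envelope calculation shows why these numbers lead to timeouts. The sketch below uses the example values from this slide (1000 flows, a 100-packet buffer, a 0.1 ms RTT) and makes two assumptions that are not in the slides: every flow sends one packet at the same instant into an empty buffer with no draining during the burst, and the minimum retransmission timeout is 200 ms, a common default in TCP implementations. The function names are hypothetical.

```python
def burst_drops(flows: int, window_pkts: int, buffer_pkts: int) -> int:
    """Packets lost when all flows send one window at once into an empty buffer.

    Worst-case simplification: nothing drains from the buffer during the burst.
    """
    offered = flows * window_pkts
    return max(0, offered - buffer_pkts)


def rtts_idle_per_timeout(rto_us: int, rtt_us: int) -> int:
    """How many round-trip times a flow sits idle while waiting for one RTO."""
    return rto_us // rtt_us


# Example values from the slide, plus the assumed 200 ms minimum RTO.
dropped = burst_drops(flows=1000, window_pkts=1, buffer_pkts=100)
idle_rtts = rtts_idle_per_timeout(rto_us=200_000, rtt_us=100)
print(f"dropped packets in one synchronized burst: {dropped}")
print(f"RTTs spent idle per retransmission timeout: {idle_rtts}")
```

Under these assumptions most of the burst is dropped, and a flow that loses its whole window can only recover through a timeout that lasts thousands of round trips, which is the throughput degradation the slide refers to.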
Summary: TCP Incast Problem
• Data centers have specific network characteristics
• TCP-incast throughput collapse problem emerges
• Solutions:
– Tweak TCP parameters for this environment
– Redesign TCP for this environment
– Rewrite applications for this environment (Facebook)
– Smart edge coordination for uploads/downloads
Concluding Remarks
• We need “smart edge” devices to enhance the performance, functionality, security, and efficiency of the Internet (now more than ever!)
[Figure: two end systems, each with a five-layer protocol stack (Application, Transport, Network, Data Link, Physical), connected through the Core Network]
Future Outlook and Opportunities
• Traffic classification
• QoS management
• Load balancing
• Security and privacy
• Cloud computing
• Virtualization everywhere
• Multipath TCP congestion control
• …