Post on 17-Jan-2016
Why We STILL Don’t Know How To Simulate Networks
Mostafa H. Ammar
College of Computing, Georgia Institute of Technology
Atlanta, GA
Disclaimer
My personal perspective: networking researcher, not a simulationist
Have written and used discrete-event computer simulations for over 25 years
Involved in the COMPASS project at GT for the last 7 years
The Main Message
The use of simulation has been growing in the networking community
Current shifts in the networking research landscape have increased the importance of simulation as a tool for evaluation
There is a crisis of credibility causing people to question the validity of simulations
Why and How to Fix it?
Evaluating Networks: A Spectrum
A spectrum of approaches
Mathematical analysis
Computer simulation
Computer emulation
Prototype testbed
Real network testing/deployment
Toward real network deployment: increased cost/overhead
Toward mathematical analysis: decreased realism/accuracy
A Brief History of Network Simulation
In the beginning: a combination of mathematical analysis, small-scale prototypes, and simulation.
However, simulation was primitive and accessible only to people that had computers and knew how to program them.
Early Examples of Network Simulation
Kleinrock’s thesis (1962) used simulation to validate his Independence assumption.
“I invented effective dynamic routing procedures and also established the analytic model by which you could calculate delay . . . and to simulate it I had to make some fundamental assumptions - I simulated the hell out of it to show that the assumptions worked.” (Leonard Kleinrock, http://www.computer.org/internet/v1n3/kleinrock9702.htm)
Early Examples of Network Simulation
Paul Baran, On Distributed Communications: II. Digital Simulation of Hot-Potato Routing in a Broadband Distributed Communications Network, http://www.rand.org/publications/RM/RM3103
II. The Simulated Network
Description
The size of the network simulated was limited by the amount of storage available in the IBM 7090 computer using FORTRAN. A heavy storage requirement was dictated by the need for each simulated node or station to maintain a table of recorded handover numbers--the tag appended to each message indicating the number of times that message has been relayed. For each node, a table containing handover numbers to every other node via every one of up to a maximum of eight links is needed.
The Rise of Network Simulation
As computing became more accessible, more and more people started doing simulations
Papers using simulation:
INFOCOM '85: 10%; '92-'98: ~60%
SIGCOMM '89: 4/29; '98: 13/26; '04: 11/30
Networking Research Landscape
Early efforts dealt with relatively simple phenomena on small-scale networks.
Current research deals with complex phenomena on large-scale networks.
A long story …
Network Research Landscape
Systems are:
Less tractable mathematically
Difficult to prototype
And yet everyone has access to abundant computing
=> Simulation is more viable and often the only evaluation tool available
Crisis of Credibility
“Some claim that stochastic simulation as a performance evaluation tool of various dynamic systems, including telecommunication networks, is misused, and that the spread of this phenomenon is so wide that one can speak about a deep credibility crisis. It is even claimed that one cannot rely on the majority of the published results of performance evaluation studies of dynamic systems based on stochastic simulation.”
From: Pawlikowski, K., Jeong, H.-D. J., Lee, J.-S. R.: On Credibility of Simulation Studies of Telecommunication Networks. IEEE Communications Magazine, Jan. 2002, 132-139.
Crisis of Credibility
“I favor a stamp: WARNING: COMPUTER SIMULATION – MAY BE ERRONEOUS and UNVERIFIABLE. Like on cigarettes.”
Michael Crichton in “State of Fear”
Crisis of Credibility
From: Cavin, Sasson and Schiper – On the accuracy of MANET Simulators
[Figure: results shown for three simulators: Opnet, Ns-2, Glomosim]
Crisis of Credibility
A typical paper review:
“This paper should be rejected because its evaluation section is weak. The simulation (uses questionable models) and/or (simulates too small a network) and/or (does not have a valid statistical analysis of the simulation output) and/or … (your own critique here).”
Reasons for the Credibility Crisis
Confusion regarding the role of simulation
Impossibility of simulating Internet-scale networks
Difficulty in building realistic models
Lack of standards for validation and repeatability
The Roles of Simulation
To validate approximate analysis
To get/confirm first-order insights into new techniques
To understand complex interactions among various entities/procedures
To perform relative evaluation among alternatives
To answer questions regarding deployability in a real network
The Roles of Simulation
Different tools may be needed for different roles
The burden on accuracy, repeatability and validity is highly dependent on the role
The role is not always (rarely?) stated up front
A Personal Experience
Parts and Holes in a Manufacturing Transfer Line
A Significant Failure
Simulation has not been able to answer wide-scale deployability questions: multicast, QoS, RED, …
Perhaps it’s a matter of simulation scale
Large-Scale Network Simulation
Large-scale network simulation offers a way to:
Verify validity of simulation results on small networks
Examine issues of scale
Validate theoretical models for large networks
But it has been quite challenging to build large-scale simulations
Fujimoto, Perumalla, Park, Wu, Ammar, Riley, "Large Scale Simulation: How Big? How Fast?," Proceedings of the 11th IEEE/ACM International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS), October 2003.
Quantifying Simulator Performance
Execution time: T ≈ (NF * PF * HF) / PTS
NF = number of flows
PF = packets sent per flow
HF = average hops per flow
PTS = simulator speed (simulated packet transmissions / sec)
NF * PF * HF = number of packet transmissions (hops) to be simulated
Ignores lost packets and protocol-generated packets (e.g., acks)
Example:
500,000 active UDP flows, 1.0 Mbps per flow, average of 8 hops to reach the destination
Assume 1 KByte packets (125 packets per sec per flow)
Workload: simulate 500 million packet transmissions per second of network operation
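The arithmetic behind this example can be checked with a short Python sketch. The function names are illustrative, 1 KByte is taken as 1000 bytes so that the slide's 125 packets per second holds, and the simulator speed of 100,000 PTS is a hypothetical value chosen only to show how the formula is used.

```python
def packet_transmissions(num_flows, packets_per_flow, hops_per_flow):
    """Packet transmissions (hops) to simulate: NF * PF * HF."""
    return num_flows * packets_per_flow * hops_per_flow


def execution_time(num_flows, packets_per_flow, hops_per_flow, pts):
    """Estimated run time in seconds: T ~ (NF * PF * HF) / PTS."""
    return packet_transmissions(num_flows, packets_per_flow, hops_per_flow) / pts


# Slide example: 500,000 UDP flows at 1.0 Mbps each, 8 hops on average.
flow_rate_bps = 1.0e6
packet_size_bits = 1000 * 8  # assumption: 1 KByte = 1000 bytes
packets_per_sec_per_flow = flow_rate_bps / packet_size_bits  # 125 packets/s

# Transmissions generated by one second of network operation.
workload = packet_transmissions(500_000, packets_per_sec_per_flow, 8)

# Hypothetical simulator speed, not a figure from the slides.
assumed_pts = 100_000
slowdown = execution_time(500_000, packets_per_sec_per_flow, 8, assumed_pts)

print(f"Workload: {workload:,.0f} packet transmissions per simulated second")
print(f"At {assumed_pts:,} PTS: {slowdown:,.0f} s of run time per simulated second")
```

Under these assumptions the workload matches the 500 million transmissions quoted on the slide, and the ratio of run time to simulated time is simply the workload divided by PTS.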
Scalability of Packet Level Simulators
[Figure: simulator speed in PTS (traffic that can be simulated in real time; log scale, 10^2 to 10^10) versus network size (hosts, routers, etc.; log scale, 1 to 10^8). Regions shown: sequential simulation, time-parallel simulation, and space-parallel simulation (parallel discrete event simulation). One region is annotated "Our focus".]
Approaches to Parallel Network Simulation
Build “from scratch” approach:
Substantial effort to build & validate new models
Users must learn a new simulator
SSFNet, Qualnet, Javasim
Federated simulation approach:
Simulators integrated via a software backplane/RTI
Exploit existing software, validated models & user base
Heterogeneous simulations
PDNS
[Figure: a large-scale parallel network simulator composed of several NS instances connected through a backplane/RTI]
Hardware Platforms
Sequential: Sun / Solaris; Ultra-80, UltraSPARC-II 450MHz; 4GB memory
Parallel: Intel / RedHat Linux 7.3; 8-way Pentium-III XEON (2MB L2 cache) SMP; 550MHz clock speed; 4GB memory; 17 SMPs (136 CPUs) connected via Gigabit Ethernet
Performance measurements are conservative (due to hardware performance)
Sequential Performance Comparison (Single Campus Network – ~500 nodes and links)

                              COTS            ns-2**          GTNetS          ns-2**
                              (Sun/Solaris)   (Sun/Solaris)   (Sun/Solaris)   (Intel/Linux)
Events                        30,700,649      9,107,023       9,143,553       9,117,070
Packet Transmissions*         4,658,390       4,546,074       4,571,264       4,551,084
Events/Packet Transmission    6.59            2.00            2.00            2.00
Run Time (sec)                1,677           104             112.3           48
Packet Trans. / Sec. (PTS)    2,778           43,712          40,706          94,814

* A packet transmission involves simulating a packet transmission over a single link
** Includes NixVectors optimization
Average end-to-end delay differed by less than 3%
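As a consistency check, the PTS row of this comparison follows from dividing packet transmissions by run time. The Python sketch below uses only the figures reported above; the dictionary layout and function name are illustrative.

```python
# (packet transmissions, run time in seconds) as reported in the comparison
runs = {
    "COTS (Sun/Solaris)": (4_658_390, 1677),
    "ns-2 (Sun/Solaris)": (4_546_074, 104),
    "GTNetS (Sun/Solaris)": (4_571_264, 112.3),
    "ns-2 (Intel/Linux)": (4_551_084, 48),
}


def pts(transmissions, run_time_sec):
    """Simulator speed: packet transmissions simulated per wall-clock second."""
    return transmissions / run_time_sec


rates = {name: pts(*values) for name, values in runs.items()}

for name, rate in rates.items():
    print(f"{name}: {rate:,.0f} PTS")
```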
PDNS Performance on Cluster (Perumalla/Park)
[Figure: PTS (packet transmissions per second, 0 to 2,500,000) versus number of processors (8 to 120)]
Each processor simulates ~5000 nodes and links
Up to 120 processors simulating 645,600 nodes
Lemieux Supercomputer
Pittsburgh Supercomputing Center
http://www.psc.edu/machines/tcs/lemieux.html
• 750 HP-Alpha ES45 servers
• 4 Gbytes memory per server
• 4 CPUs per server
• 1GHz CPU
• 3000 CPUs total
• 64-bit computing
• Quadrics interconnect
PDNS Performance on PSC (Perumalla)
[Figure: million packet transmissions per second (0 to 140) versus number of processors (0 to 1536); curves: ideal/linear and PDNS performance]
147K PTS on one CPU
Campus network topology, FTP traffic (500 packets/flow, TCP)
Scale problem size & number of CPUs (up to ~4 million network nodes)
Performance up to 106 million PTS
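One rough way to read these numbers against the ideal/linear curve is to compare the peak rate with what perfect scaling from one CPU would give. The Python sketch below does this under a loud assumption: the slide does not say how many CPUs produced the 106 million PTS figure, so the largest value on the plot axis (1536) is used here purely for illustration.

```python
def parallel_efficiency(parallel_pts, single_cpu_pts, num_cpus):
    """Fraction of ideal (linear) scaling achieved: measured / (one CPU * N)."""
    return parallel_pts / (single_cpu_pts * num_cpus)


single_cpu_pts = 147_000    # from the slide: 147K PTS on one CPU
peak_pts = 106_000_000      # from the slide: up to 106 million PTS
assumed_cpus = 1536         # ASSUMPTION: largest processor count on the plot axis

speedup = peak_pts / single_cpu_pts
efficiency = parallel_efficiency(peak_pts, single_cpu_pts, assumed_cpus)

print(f"Speedup over one CPU: {speedup:.0f}x")
print(f"Efficiency relative to linear scaling on {assumed_cpus} CPUs: {efficiency:.0%}")
```

Because the problem size grows with the number of CPUs, this is a weak-scaling comparison, not a statement about how fast a fixed-size network could be simulated.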
But… Can we build an Internet-scale Simulation?
A “back-of-the-envelope” calculation:
100 million Internet hosts
1 router for every 100 hosts, and each router has 4 links
50% of end-hosts have 56Kbps access and 50% have 10Mbps access
Router-to-router links are as follows: 50% @ 10Mbps, 40% @ 100Mbps, 5% @ 655Mbps and 5% @ 2.4Gbps
Utilization is 50% for access links and 10% for network links
1% of hosts have active connections
Average packet size = 5000 bits
George Riley, Mostafa Ammar, "Simulating Large Networks: How Big is Big Enough?" Proceedings of First International Conference on Grand Challenges for Modeling and Simulation, January 2002.
Back of the Envelope Calculation (cont’d)
2.9 x 10^11 events per second
Assume we can process 10^6 events per second (~500,000 PTS)
=> 290,000 CPU seconds (4 days) for every second of Internet time!!!
=> need 300 Terabytes of memory in ns, not including routing table space!!!
=> need 14 Terabytes for event logging for each second of simulation time!!!
Requires 1000 parallel CPUs with 300 GB of main memory and 1.4 TB of disk storage in each!!!
Would not speed things up much; simply allows simulation to run
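The CPU-time and per-CPU memory figures can be reproduced from the stated event rates with a few lines of Python. This is a sketch of the arithmetic only; it takes the 2.9 x 10^11 events per second as given and does not rederive it from the topology assumptions. Variable names are illustrative, and 1 Terabyte is taken as 1000 GB.

```python
# Figures taken from the slide
events_per_internet_sec = 2.9e11   # events generated by one second of Internet time
events_per_cpu_sec = 1e6           # events one CPU is assumed to process per second
total_memory_tb = 300              # memory needed in ns, excluding routing tables
num_cpus = 1000                    # size of the parallel machine considered

# CPU seconds needed for each simulated second of Internet time
cpu_seconds = events_per_internet_sec / events_per_cpu_sec
cpu_days = cpu_seconds / 86_400    # 86,400 seconds per day

# Memory share per CPU if the state is spread evenly (1 TB taken as 1000 GB)
memory_per_cpu_gb = total_memory_tb * 1000 / num_cpus

print(f"{cpu_seconds:,.0f} CPU seconds = {cpu_days:.2f} days per simulated second")
print(f"{memory_per_cpu_gb:.0f} GB of main memory per CPU")
```

The exact quotient is a little under three and a half days, which the slide rounds up to 4 days.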
Wait a few years and computing power will catch up
Possibly … but the network itself is also growing.
Even with Moore’s Law increase in processing power we will need 300x10^6 CPU seconds for every wallclock second (assuming typical Internet growth).
Open Question: What is the right simulation size to explore Internet-scale performance issues?
Many Challenges Remain
Tools & parallel simulation issues:
Robust performance
Making parallel simulation more transparent, “automatic” (BenchMap and AutoPart)
Access to HPC platforms
Visualization tools
Modeling issues [Floyd/Paxson]:
Building credible large-scale models and scenarios
Verifying and validating large-scale simulations
Topology? Traffic?
Methodologies and tools to effectively utilize the simulators
Building Realistic Models
The Simulation Modeler’s Dilemma:
One needs to eliminate “unimportant” details in the simulation in order to speed up simulation (avoid kitchen-sink simulations)
But how can one tell if a detail is unimportant?
Simulate and see if there is any difference. This is considered wasted effort.
Perhaps we should encourage these kinds of results!
Incorporating Packet-Level Details in P2P Simulations
Access bandwidth affects throughput significantly
Models which do not capture packet-level details do not reveal the difference
He, Ammar, Riley, Raj, Fujimoto, "Mapping Peer Behavior to Packet-Level Details: A Framework for Packet-Level Simulation of Peer-to-Peer Systems," Proceedings of MASCOTS 2003.
Building Realistic Models
A significant challenge, especially for large-scale simulation
Significant attention to topology modeling but very little understanding of other important issues:
Workload modeling
Cross-layer interactions (particularly for wireless networks)
Modeling of operations and overheads
Cross-layer modeling
A perfect instance of the Modeler’s Dilemma
Split-stack composition may be helpful
Xu, Riley, Ammar, Fujimoto, "Split Protocol Stack Network Simulations using the Dynamic Simulation Backplane," Proceedings of the Ninth International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS'01), August 2001.
Simulation Split Vertically
Each simulator simulates a portion of the protocol stack of the entire network
[Figure: Simulator 1 and Simulator 2 each cover one portion of the protocol stack across all nodes A, B, C, D, E, F]
Splitting Protocol Stack
Protocol stack split between TCP and IP
[Figure: the two halves of the split stack are simulated by ns2 and Glomosim]
Workload Modeling
See our work presented in this conference about generating TCP workloads to match observed network utilization.
Qi He, Constantinos Dovrolis, Mostafa Ammar, "A Methodology for the Optimal Configuration of TCP Traffic in Network Simulation under Link Load Constraints," Proceedings of the 38th Annual Simulation Symposium, San Diego, April 2005.
Simulation Validation and Repeatability
The issue: Given that the simulation model is correct, how can one trust the results from the simulation?
Two types of problems: technical and social
Technical Issues
Code trustworthiness
Open source and reusable code is a big improvement
Good experimental design
Random number generation
Correct statistical inference
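To make the last three items concrete, here is a minimal Python sketch of independent replications with recorded seeds and a confidence interval on the result. Everything in it is an illustrative assumption, not taken from the talk: the toy single-server queue, its parameters, the choice of 30 replications, and the normal-approximation interval (a Student-t interval would be slightly wider for this sample size). Using distinct seeds is a common convenience; it does not by itself guarantee statistically independent streams.

```python
import random
import statistics


def mean_wait_one_run(seed, num_customers=20_000, arrival_rate=0.8, service_rate=1.0):
    """One replication of a toy single-server queue; returns the mean waiting time.

    Uses Lindley's recursion: W[n+1] = max(0, W[n] + S[n] - A[n+1]).
    """
    rng = random.Random(seed)  # private, seeded generator so the run is repeatable
    wait = 0.0
    total_wait = 0.0
    for _ in range(num_customers):
        total_wait += wait
        service = rng.expovariate(service_rate)
        interarrival = rng.expovariate(arrival_rate)
        wait = max(0.0, wait + service - interarrival)
    return total_wait / num_customers


def confidence_interval(samples, level=0.95):
    """Normal-approximation confidence interval for the mean of the replications."""
    mean = statistics.mean(samples)
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * statistics.stdev(samples) / len(samples) ** 0.5
    return mean - half_width, mean + half_width


seeds = range(1, 31)  # 30 replications; the seeds are reported so others can rerun
results = [mean_wait_one_run(seed) for seed in seeds]
low, high = confidence_interval(results)

print(f"Mean wait over {len(results)} replications: {statistics.mean(results):.3f}")
print(f"95% confidence interval: [{low:.3f}, {high:.3f}]")
```

Reporting the seeds, the number of replications, and the interval alongside the point estimate is the kind of detail that lets a reader repeat the experiment and judge its statistical validity.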
Social Issues
Publication of enough details to allow repeatability – possibly even code
Allowance for Scholarly Credit for repeating experiments
Final Thoughts
Be open within the community about this issue
Provide acceptable guidelines for reporting simulation results: a checklist
Enough details for repeatability
Stronger enforcement of guidelines
Change reviewing process (perhaps only for journals)
Give scholarly credit for repeating other experiments
Network Topologies: CampusNet (Dartmouth)
10 campus networks connected in a ring
Single campus network: 538 nodes, 543 links