are you ready for the tutorial?

Download Are you ready for the tutorial?

If you can't read please download the document

Upload: ayoka

Post on 22-Mar-2016

59 views

Category:

Documents


2 download

DESCRIPTION

Are you ready for the tutorial?. Grab a Worksheet and instructions Did you do the pre-work? 1. Have an account a the GENI Portal. 2. Be a member of GEC19 Project. 3. Installed the Omni tools. 4. Configured the Omni tools. 5. Able to SSH from your PC. Pre-work at: - PowerPoint PPT Presentation

TRANSCRIPT

Tutorial: OpenFlow in GENI with Instrumentation and Monitoring

Are you ready for the tutorial?Grab a Worksheet and instructions

Did you do the pre-work?1. Have an account a the GENI Portal.2. Be a member of GEC19 Project.3. Installed the Omni tools.4. Configured the Omni tools.5. Able to SSH from your PC.

Pre-work at:http://tinyurl.com/gec19iaprework

Sponsored by the National Science Foundation GEC19-March 17 2014#1Tutorial: Inter-Aggregate ExperimentsAaron Helsinger, Luisa Nevers, Niky Riga GENI Project Office{ahelsing, lnevers, nriga}@bbn.comGEC19: March 17, 2014

Sponsored by the National Science FoundationSponsored by the National Science Foundation GEC19-March 17 2014#GENI: Infrastructure for Experimentation

GENI provides compute resources that can be connected in experimenter specified topologies.Sponsored by the National Science Foundation GEC19-March 17 2014#GENI is a nationwide suite of infrastructure forat scale experiments in networking, distributed systems, security, and novel applications.

GENI opens up huge new opportunitiesLeading-edge research in next-generation internetsRapid innovation in novel, large-scale applicationsKey GENI concept: slices & deep programmabilityInternet: open innovation in application programsGENI: open innovation deep into the network

3

GENI provides compute resources that can be connected in experimenter specified Layer 2 topologies.GENI: Infrastructure for ExperimentationSponsored by the National Science Foundation GEC19-March 17 2014#GENI is a nationwide suite of infrastructure forat scale experiments in networking, distributed systems, security, and novel applications.

GENI opens up huge new opportunitiesLeading-edge research in next-generation internetsRapid innovation in novel, large-scale applicationsKey GENI concept: slices & deep programmabilityInternet: open innovation in application programsGENI: open innovation deep into the network

4

Inter aggregate connectivity

Experiments live in isolated slicesHow are these links formed? Sponsored by the National Science Foundation GEC19-March 17 2014#5Single-AM Experiments

Separate Aggregate Manager100s of VMsBare metal hostsOpenFlow switch

Many topologies embedded in only one AMSponsored by the National Science Foundation GEC19-March 17 2014#Inter AM Connectivity Options

Sponsored by the National Science Foundation GEC19-March 17 2014#GENI architecture

MetroResearchBackbones

InternetISP

Regional NetworksCampus

g

g

g

Legend

GENI-enabled hardwareLayer 3Control PlaneLayer 2Data Plane

Control Plane : Over the InternetData Plane : Over the GENI backboneSponsored by the National Science Foundation GEC19-March 17 2014#Inter AM Connectivity Options

Control PlaneData Plane StaticDynamicControl InterfacePreconfigured VLANsGRE TunnelsGENI StitchingEG Stitching

Sponsored by the National Science Foundation GEC19-March 17 2014#Using the Control PlaneAlready setup, tested, stableUniversally supportedLimited bandwidth Shared 100 Mbps with all experimentersShared bandwidth with the rest of the InternetOnly Layer 3FirewallSponsored by the National Science Foundation GEC19-March 17 2014#Using the Data PlaneMore bandwidth Shared 1Gbps for most sitesLayer 2 connectionNo FirewallsUse of HW OpenFlow switchesLimited availabilityLimited topologiesLimited sitesHarder to setup/useTodays tutorial will use the Data PlaneSponsored by the National Science Foundation GEC19-March 17 2014#GENI StitchingSetup point-to-point VLANsBetween hosts on different AMsNot a broadcast domain

Dynamic, real-time setupNeed to coordinate multiple AMs Takes time Can fail

Provides traffic isolation and bandwidth constraints

A common concept used in other networks, applied to GENI, e.g. OSCARS, GLIFGENI RACK AGENI RACK BBackboneGENI RACK BRegional 1Regional 3Regional 2Sponsored by the National Science Foundation GEC19-March 17 2014#The stitcher toolCommand line tool Distributed with Omni/gcfScript built on top of OmniHas same omni commands

It takes care of Path computationAM coordinationError handling and recovery

Sponsored by the National Science Foundation GEC19-March 17 2014#Part I: Design/SetupDesign / Tutorial LogisticsCraft your RSpecsObtain Resources

Part II: ExecuteExecute ExperimentCollect Measurements

Part III: FinishTeardown Experiment

Sponsored by the National Science Foundation GEC19-March 17 2014#So now we are going to actually reserve resources. Tutorial Hands-on

Use InstaGENI Racks across the US and stitch across one or more of the above three networksSponsored by the National Science Foundation GEC19-March 17 2014#Tutorial LogisticsUse the information on your worksheet

We will reserve resources in 4 groupsTo avoid overloading the Internet2 (ION) aggregateYour group # is on your worksheet

Seeing errors in the logs is normal, most of them are not fatal

Wait for the blinking ball before running stitcher and reserving your resourcesSponsored by the National Science Foundation GEC19-March 17 2014#----- Meeting Notes (3/13/14 15:27) -----change the bullet orderTutorial: Topology

GENI Rack 1Go from single AM experiment to multi-AMSponsored by the National Science Foundation GEC19-March 17 2014#Tutorial: Hands-on Log in to the Portal and create a slice under GEC19 project

Launch Flack and load the RSpec

Adjust the topology Use your AMs,Make the inter-AM link of type stitched

Download the edited RSpec

Wait!Sponsored by the National Science Foundation GEC19-March 17 2014#----- Meeting Notes (3/13/14 15:27) -----Change the download to load an rspec

Two images

Slide to transition to hands on

Start hands-on and wait Group 1: Run StitcherRun:$ stitcher.py createsliver You should see something like:15:04:22 INFO stitcher: Checking that slice is valid...15:04:37 INFO stitcher: Slice urn:publicid:IDN+ch.geni.net:GEC19+slice+ expires on 2014- UTC15:04:53 INFO stitcher: Stitched reservation will include resources from these aggregates:15:04:53 INFO stitcher: 15:04:53 INFO stitcher: 15:04:53 INFO stitcher: 15:04:53 INFO stitch.Aggregate: Stitcher doing createsliver at ..When youve gotten that, raise your handSponsored by the National Science Foundation GEC19-March 17 2014#While we waitCreating dynamic circuits takes timeParticularly across Internet2 IONNegotiate a VLAN tag that is free at each aggregate

You will see output messages that look like errorsFor now, let them go

While stitcher works, well explain what is going on

Sponsored by the National Science Foundation GEC19-March 17 2014#GENI Stitching: Under the HoodHow does GENI Stitching Work?

Rack Configuration (network admins)Long process (~weeks, months)Done once in advanceManual

Inter-aggregate link reservations (experimenters)Automated (tools can make them)Quickish (~minutes, hours)Live, EasyRepeatable

Sponsored by the National Science Foundation GEC19-March 17 2014#How does this work? There are two kinds of things that happen to make thiswork: configuration / setup when each new rack is set up, and stuffthat the experimenter's tool and supporting services do when you go tocreate your circuit.21Key pointsRack are pre-configured for GENI Stitching by network engineers Not all racks are configured!

We use VLANs they are limited

Most circuits will go across the ION aggregateCircuits there are complexThis takes time, and can fail

More Details Later.Sponsored by the National Science Foundation#GEC19: March 17, 2014GENI Stitching Pre-ConfigurationIdentify paths from a rack to GENI backbone

Identify the network providers Typically a campus, a regional, and the backbone

Identify the endpoints and VLAN tags that can be used to connect to the rack

BackboneRegionalCampusRackAggregateGENI AggregatesSponsored by the National Science Foundation GEC19-March 17 2014#This work includes - Identify a path or paths from the new rack to other GENI aggregates. Typically this means a connection to a national backbone (Internet2 or NLR). - Identify the network providers involved (typically a campus, a regional, and the backbone) - Work with engineers at each network to identify the endpoints that can be used to connect to the rack, and a range of VLAN tags that can be designated for use by the rack. - Formally capture this commitment by each intermediate network for easy searching and as the basis of configuring systems - This work has been prototyped by Tom Lehman and Xi Yang of MAX: There is a config file for capturing the information and a script to generate a wiki page (screenshot). Basically this says that the GENI link from this particular GENI switch/port to this other GENI switch/port travels over circuits generously provided by the following network provider(s). This can be improved and expanded over time. - Configure the promised VLAN ranges and endpoints at each network. - Networks that provide dynamic circuit services (today we support OSCARS) can run a GENI Aggregate Manager, and Internet2 runs such an aggregate over ION. Such networks configure their aggregate manager and their OSCARS interfaces. - Networks that provide static circuits will provision all of the promised VLANs between fixed switch/port endpoints, relying on the GENI aggregate at the endpoints to control use of the VLANs.23SCSToolExperimenter: Creating a Circuit

Aggregate 2

Aggregate 1

1. Simple Request2. Send Path Request to Stitching Computation Service (SCS)3. Get Expanded Request4. Send Request to Aggregate 15. Get Manifest6. Repeat for Other Aggregates7. Manifest Back

Sponsored by the National Science Foundation GEC19-March 17 2014#----- Meeting Notes (3/13/14 15:51) -----simplify this a bit and talk about the stitching service before this have a slide just for stiching computation service

24

ToolExperimenter: Creating a Circuit

1. Simple Request7. Manifest BackAutomated by the tool

Sponsored by the National Science Foundation GEC19-March 17 2014#----- Meeting Notes (3/13/14 15:51) -----simplify this a bit and talk about the stitching service before this have a slide just for stiching computation service

25Experimenters Job: Build a Request1. Write one RSpec for all your resourcesAvoid unbound componentsAdd a link between 2 aggregates (or many links!)2 different tagsYou can specify particular VLAN tags if you choose to not advised!

2. Reserve your resourcesLinks and compute nodes reserved togetherRemember: links are node-to-nodeMulti-point VLANs are not supportedSponsored by the National Science Foundation GEC19-March 17 2014#Tool: Creating a CircuitSCSStitcherAggregate 2

Aggregate 1

3. Get Expanded Request4. Send Request to Aggregate 15. Get Manifest6. Repeat for Other Aggregates

2-3. Expand your request to find a path - Using the Stitching Computation Service (SCS)4-5. Generate a request RSpec for each aggregate and make the reservations - Check that dynamic circuits were successfully created7. Report back a combined summary of what you have at all the aggregates

2. Send Path Request SCS7. Manifest Back Any tool can do this!No central GENI stitching authority., Stitching is just a set of reservations at multiple aggregates.Sponsored by the National Science Foundation GEC19-March 17 2014#The tool and aggregates take over from here. There is no central GENIauthority that controls stitching. Stitching is just a set of resourcereservations like any other - these resources just happen to be VLANsthat you need to reserve and configure at multiple aggregates.

Any tool can do this. This tutorial uses a script called 'stitcher.py'.

stitcher.py1. Confirm you made a reasonable stitching request and have a slice 2. Expand your request to find a path for your circuit often using the Stitching Computation Service, which you will hear about later 3. Generate a request Rspec for each aggregate and make the reservations 4. Check if any dynamic circuits were successfully created 5. Report back a combined summary of what you have at all the aggregates

Step 1 is pretty basic and needs no further explanation.

----- Meeting Notes (3/13/14 15:54) -----Mention Flack27Success looks like 15:17:04 INFO stitch.launcher: All aggregates are complete.15:17:04 INFO stitcher: Saved combined reservation RSpec at X AMs to file X-manifest-rspec-stitching-combined.xmlStitching success: Reserved resources in slice X at X Aggregates (including X intermediate aggregate(s) not in the original request), creating 1 link(s).

If youve gotten that, raise your handSponsored by the National Science Foundation GEC19-March 17 2014#Group 2: Run StitcherRun:$ stitcher.py createsliver You should see something like:15:04:22 INFO stitcher: Checking that slice is valid...15:04:37 INFO stitcher: Slice urn:publicid:IDN+ch.geni.net:GEC19+slice+ expires on 2014- UTC15:04:53 INFO stitcher: Stitched reservation will include resources from these aggregates:15:04:53 INFO stitcher: 15:04:53 INFO stitcher: 15:04:53 INFO stitcher: 15:04:53 INFO stitch.Aggregate: Stitcher doing createsliver at ..When youve gotten that, raise your handSponsored by the National Science Foundation GEC19-March 17 2014#Stitcher: What is Happening?02:49:51 INFO stitcher: Loading agg_nick_cache file '/Users/inki/.gcf/agg_nick_cache'02:49:51 INFO stitcher: Loading config file /Users/inki/.gcf/omni_config02:49:51 INFO stitcher: Using control framework portal02:49:51 INFO stitcher: Checking that slice gec19demo is valid...02:49:52 INFO stitcher: Slice urn:publicid:IDN+ch.geni.net:SDN-Video-Traffic-Orchestrator+slice+gec19demo expires on 2014-03-19 15:03:29 UTC02:49:55 INFO stitcher: Stitched reservation will include resources from these aggregates:02:49:55 INFO stitcher: 02:49:55 INFO stitcher: 02:49:55 INFO stitcher: 02:49:55 INFO stitcher: 02:49:55 INFO stitcher: 02:49:55 INFO stitcher: 02:49:55 INFO stitcher: 02:49:55 INFO stitch.Aggregate: If you have used stitcher, you know it prints a lot of log messages. What is it doing?Sponsored by the National Science Foundation GEC19-March 17 2014#Stitcher Validates your Slice15:04:22 INFO stitcher: Checking that slice is valid...15:04:37 INFO stitcher: Slice urn:publicid:IDN+ch.geni.net:GEC19+slice+ expires on 2014- UTC

Stitcher does these early steps:

Check the RspecDoes it require stitching? Malformed?

Get your slice credentialDo it once to save timeSponsored by the National Science Foundation GEC19-March 17 2014#Stitcher Expands your RequestYour request RSpec doesnt need to say how to get from AM1 to AM2

Stitcher expands your request with stitching informationFinds a PathGets extra aggregate information

You will see:

15:04:53 INFO stitcher: Stitched reservation will include resources from these aggregates:15:04:53 INFO stitcher: 15:04:53 INFO stitcher: 15:04:53 INFO stitcher:

Sponsored by the National Science Foundation GEC19-March 17 2014#Finding a workable path, and the right reservation order can be hard.Stitching Computation Service (SCS) for path and workflow computationTom Lehman and Xi Yang wrote this optional serviceIncludes many heuristics to optimize path, chance of successAllows excluding particular connection points, VLANsOther tools may use different heuristicsStitcher uses the SCS

http://geni.maxgigapop.net/twiki/bin/view/GENI/NetworkStitchingAPIStitching Computation Service

IONSponsored by the National Science Foundation GEC19-March 17 2014# Once your tool has an expanded request, including the full path for your circuit, it reserves your resources. VLAN tags and links in a circuit are just resources like any other. And so in GENI they are reserved like any other: your tool uses the GENI AM API to send a request RSpec to the aggregates.

The order in which you make reservations does make a difference though. Some aggregates want to be told which VLAN tag to use, while others want to pick, etc. So your tool uses the workflow it figured out (or queried the SCS for) in step 1.

Then the tool makes the normal GENI AM API call to reserve your resources createsliver.----- Meeting Notes (3/13/14 16:25) -----fix the font size33Stitching RSpec Extension*Advertisement RSpecsThis local switch/port connects toThat remote aggregate switch/portUsing this VLAN range

Request RSpecsI want a circuit from A to BGet from A to B using this series of Hops: switch/port/suggested VLAN tag

Manifest RSpecsVLAN tags assigned for each hop* http://www.geni.net/resources/rspec/ext/stitch/Describes connections, paths, and requested or allocated VLANs

. 3747 .Sponsored by the National Science Foundation GEC19-March 17 2014# Stitching Extension How does your tool talk to aggregates about the circuit you want? With the Stitching RSpec Extension. http://www.geni.net/resources/rspec/ext/stitch/

This extension is added to your request RSpec to capture the path your circuit will take for each link you want. It shows a series of hops, from one switch/port/VLAN tag to the next switch/port/VLAN tag. This is how your tool and the aggregates talk about your circuit, but you as an experimenter don't need to care.34Expanded Request RSpec 10 1000000000 l2sc ethernet 9000 750-1000 990 false 2

Added intermediate AMs on linksCapacity specifiedStitching extension showing suggested VLANs at the bottom, and the range to pick fromSponsored by the National Science Foundation GEC19-March 17 2014#Intermediate AggregatesYour slice may require resources at additional aggregatesRegionals and a backbone that connect your aggregates

SCS figures this out and expands the request

Stitcher keeps track of all used aggregatesUsed later to renew and delete at all your aggregatesSponsored by the National Science Foundation GEC19-March 17 2014#Intermediate Aggregates: ExamplesInternet 2: ION and Internet2 service for dynamic circuits OSCARS to do dynamic circuits and VLAN translationSFA based AM translates GENI calls to OSCARSInternet2 operates this aggregate

MAX RegionalUses OSCARS like IONOperates a GENI AM

This is powerful!Connect arbitrary GENI ION endpointsEnables stitching to non-GENI resourcesSponsored by the National Science Foundation GEC19-March 17 2014#Resources are reserved through the GENI AM API CreateSliver callsStitcher uses Omni to make the AM API calls

Circuits and VLANs are resources like any other

Order of reservations mattersUse the computed workflow

Stitcher Reserves the ResourcesstitcherAggregate 2

Aggregate 1

Sponsored by the National Science Foundation GEC19-March 17 2014# Once your tool has an expanded request, including the full path for your circuit, it reserves your resources. VLAN tags and links in a circuit are just resources like any other. And so in GENI they are reserved like any other: your tool uses the GENI AM API to send a request RSpec to the aggregates.

The order in which you make reservations does make a difference though. Some aggregates want to be told which VLAN tag to use, while others want to pick, etc. So your tool uses the workflow it figured out (or queried the SCS for) in step 1.

Then the tool makes the normal GENI AM API call to reserve your resources createsliver.----- Meeting Notes (3/13/14 16:25) -----fix the font size38Success looks like 15:17:04 INFO stitch.launcher: All aggregates are complete.15:17:04 INFO stitcher: Saved combined reservation RSpec at X AMs to file X-manifest-rspec-stitching-combined.xmlStitching success: Reserved resources in slice X at X Aggregates (including X intermediate aggregate(s) not in the original request), creating 1 link(s).

If youve gotten that, raise your handSponsored by the National Science Foundation GEC19-March 17 2014#Group 3: Run StitcherRun:$ stitcher.py createsliver You should see something like:15:04:22 INFO stitcher: Checking that slice is valid...15:04:37 INFO stitcher: Slice urn:publicid:IDN+ch.geni.net:GEC19+slice+ expires on 2014- UTC15:04:53 INFO stitcher: Stitched reservation will include resources from these aggregates:15:04:53 INFO stitcher: 15:04:53 INFO stitcher: 15:04:53 INFO stitcher: 15:04:53 INFO stitch.Aggregate: Stitcher doing createsliver at ..When youve gotten that, raise your handSponsored by the National Science Foundation GEC19-March 17 2014#Stitcher Output: createsliver15:04:22 INFO stitcher: Checking that slice is valid...15:04:37 INFO stitcher: Slice urn:publicid:IDN+ch.geni.net:GEC19+slice+ expires on 2014- UTC15:04:53 INFO stitcher: Stitched reservation will include resources from these aggregates:15:04:53 INFO stitcher: 15:04:53 INFO stitcher: 15:04:53 INFO stitcher: 15:04:53 INFO stitch.Aggregate: Stitcher doing createsliver at ..15:05:11 INFO stitch.Aggregate: Allocation at complete.

Continuing to look behind the scenesSponsored by the National Science Foundation GEC19-March 17 2014#Aggregate Handles Stitch RequestsChecks resource availabilityVLANs, nodes, bandwidth?

Allocates resourcesConfigure nodesConfigure the switch to connect the VLAN to the node

Returns manifestVLAN allocated is reported in stitching extensionSet VLAN 3747

RequestManifestSponsored by the National Science Foundation GEC19-March 17 2014# When the aggregate gets this request, what happens? The aggregate sees the stitching extension in the request, specifically requesting that a particular port on a local switch be configured with a particular VLAN tag, for this slice. For the aggregate running one of your compute nodes, the request asks to have that wired all the way to a specific interface on your node. The aggregate treats this as any other resource request. - Is the requested node available? - Is the requested VLAN tag available, or is any tag available in the requested range? - Configure the node and start it - Configure the switch to use that VLAN tag for this slice, directing traffic to this new node, marking the VLAN tag as not available. - Return a manifest RSpec, reporting which node was reserved and which VLAN tag was reserved; the stitching piece is reported in the stitching extension.42Stitcher Handles Aggregate ResponseStitcher handles errorsVLAN is in use? Try another VLAN in the suggested rangeWait 30 sec - 10 min to let the aggregate free up resourcesNo VLANs available here? Try a different pathSomething else, like no node available? Tell the user

Stitcher reads VLAN out of manifestVLAN is inserted into request at next aggregatee.g. GPO IG picks tag 3747, so request to ION uses 3747

Sponsored by the National Science Foundation#GEC17: July 22, 2013 When the manifest comes back from the aggregate, your tool must find the VLAN tag that the aggregate picked or accepted at your suggestion. Then it will put that VLAN tag in the request for the next aggregate. And so the tool continues with the second and following aggregates

Things can go wrong of course, and the tool can try to handle those errors. - Do we need to pick a different VLAN tag? Try to do that - Do we need a new path? Clean up and go back to Step 1 - Or was this some other kind of error (like the aggregate has no more VMs available or doesn't provide the sliver_type you asked for)? Fail the request, cleaning up any reservations along the way.

stitcher, the Omni tool Heidi and Luisa used, checks for all these errors. Of course, this does mean that it can take a while for it to finish trying to find your path.43Handling Dynamic Circuit ErrorsDynamic circuits may fail later when configuring the switchesOSCARS, ION

Common ION errors:Sliver status for circuit was (still): failedsliverstatus: X is (still) at . Had error message: sliverstatus: X is (still) at . Delete and retry.

Stitcher will retryMany ION errors are transient

Sponsored by the National Science Foundation GEC19-March 17 2014#----- Meeting Notes (3/13/14 16:30) -----All errors should be the sameHandling Transient ErrorsYou will see error messages. Dont Panic!Stitcher handles many errors itselfFor an explanation of common error messages, see the appendix or wikihttp://trac.gpolab.bbn.com/gcf/wiki/Stitcher16:27:47 ERROR omni: {'output': "vlan tag 1712 for 'stitched4' not available", 'code': {'protogeni_error_log.16:27:47 INFO stitch.Aggregate: A requested VLAN was unavailable doing createsliver XX at : AMAPIError: Error from Aggregate: code 1. protogeni AM code: 1: vlan tag 1712 for 'stitched4' not available.16:27:47 INFO stitch.launcher: Will put back in the pool to allocate. Got VLAN was unavailable. Retry 2nd time with new suggested 1713 (not 1712)16:27:47 INFO stitch.launcher: Pausing for 30 seconds for Aggregates to free up resources...Sponsored by the National Science Foundation GEC19-March 17 2014#StitcherExperimenter View: Reported Results