Using perfSONAR and Science DMZ NYSERNet TECH SUMMIT 2014, Poughkeepsie, NY Thursday, June 12th John Hicks – [email protected] Network Research Engineer - Internet2


Page 1: Using perfSONAR and Science DMZ

Using perfSONAR and Science DMZ

NYSERNet TECH SUMMIT 2014, Poughkeepsie, NY
Thursday, June 12th
John Hicks – [email protected]
Network Research Engineer - Internet2

Page 2: Using perfSONAR and Science DMZ

Overview

•  perfSONAR
   –  Motivation
   –  Collaborative effort
   –  Use cases
•  Science DMZ
   –  Motivation
   –  Architecture
   –  Use cases

Slides made available by Internet2 and ESnet

Page 3: Using perfSONAR and Science DMZ

Measurement

Measuring network performance and monitoring network components are a critical part of any high-performance network deployed today. In-depth network measurement and monitoring services are key to giving researchers and engineers a view into application performance and to troubleshooting network problems.

Page 4: Using perfSONAR and Science DMZ

“The network is slow today”

•  How can your users effectively report problems?
•  How can users and the local administrators effectively solve multi-domain problems?
   –  Eliminate ‘who you know’
   –  Automate things when applicable
•  Components:
   –  Tools to use
   –  Questions to ask
   –  Methodology to follow
   –  How to ask for help
      –  Show issue with graphs and analysis
      –  Provide documentation

Page 5: Using perfSONAR and Science DMZ

Lawrence Berkeley National Laboratory U.S. Department of Energy | Office of Science

Soft Network Failures – Hidden Problems

Hard failures are well-understood
•  Link down, system crash, software crash
•  Traditional network/system monitoring tools designed to quickly find hard failures

Soft failures result in degraded capability
•  Connectivity exists
•  Performance impacted
•  Typically something in the path is functioning, but not well

Soft failures are hard to detect with traditional methods
•  No obvious single event
•  Sometimes no indication at all of any errors

Independent testing is the only way to reliably find soft failures

5 – ESnet Science Engagement ([email protected]) - 6/9/14 © 2014, Energy Sciences Network

Page 6: Using perfSONAR and Science DMZ

Sample Soft Failures

[Figure: two throughput plots (Gb/s). Panel titles: "Rebooted router with full route table" and "Gradual failure of optical line card". Annotations: normal performance, degrading performance, repair. Time scale: one month.]
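The "degrading performance" pattern in the figure is the kind of trend that regular throughput testing can flag automatically. The following is a minimal sketch, not part of perfSONAR: the function name, the sample data, the seven-day baseline, and the 80% threshold are all assumptions chosen for illustration.

```python
from statistics import median

def find_soft_failure(samples_gbps, baseline_days=7, threshold=0.8):
    """Return the index of the first sample below threshold * baseline, or None.

    The baseline is the median of the first `baseline_days` samples, which are
    assumed to represent normal performance.
    """
    if len(samples_gbps) <= baseline_days:
        return None  # not enough history to judge
    baseline = median(samples_gbps[:baseline_days])
    for day, value in enumerate(samples_gbps[baseline_days:], start=baseline_days):
        if value < threshold * baseline:
            return day
    return None

# Hypothetical daily results: a week of normal throughput, then a gradual decline.
history = [9.4, 9.3, 9.5, 9.4, 9.2, 9.4, 9.3, 9.1, 8.6, 7.9, 7.0, 6.1]
print(find_soft_failure(history))
```

A median baseline is used so that a single bad test in the baseline window does not distort the comparison; a real deployment would tune both the window and the threshold.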

Page 7: Using perfSONAR and Science DMZ

Testing Infrastructure – perfSONAR

perfSONAR is:
•  A widely-deployed test and measurement infrastructure
   −  ESnet, Internet2, US regional networks, international networks
   −  Laboratories, supercomputer centers, universities
•  A suite of test and measurement tools
•  A collaboration that builds and maintains the toolkit

By installing perfSONAR, a site can leverage over 1100 test servers deployed around the world

perfSONAR is ideal for finding soft failures
•  Alert to existence of problems
•  Fault isolation
•  Verification of correct operation

Page 8: Using perfSONAR and Science DMZ

perfSONAR Deployment Footprint

Page 9: Using perfSONAR and Science DMZ

Lookup Service Directory Search: http://stats.es.net/ServicesDirectory/

Page 10: Using perfSONAR and Science DMZ

perfSONAR Dashboard: http://ps-dashboard.es.net

Page 11: Using perfSONAR and Science DMZ

Overview

•  perfSONAR
   –  Motivation
   –  Collaborative effort
   –  Use cases
•  Science DMZ
   –  Motivation
   –  Architecture
   –  Use cases

Page 12: Using perfSONAR and Science DMZ

Efforts to help with network performance issues

•  perfSONAR
•  ESnet – Science DMZ
•  NSF
   –  CC-NIE
   –  CC-IIE
   –  IRNC
•  Internet2
   –  BTR (Broaden the Reach)
   –  Tiger team visits

Page 13: Using perfSONAR and Science DMZ

Stakeholders

•  perfSONAR development and support
   –  Internet2
   –  ESnet
   –  Indiana University
   –  DANTE
•  New model moving forward
   –  No more perfSONAR-MDM
   –  Combined effort
•  Meeting at TNC2014
•  Meeting in Ann Arbor

Page 14: Using perfSONAR and Science DMZ

Overview

•  perfSONAR
   –  Motivation
   –  Collaborative effort
   –  Use cases
•  Science DMZ
   –  Motivation
   –  Architecture
   –  Use cases

Page 15: Using perfSONAR and Science DMZ

Motivation – A Typical Scenario

•  User and resource are geographically separated
   –  Common case: Remote instrument + distributed users
•  Both have access to high speed communication network
   –  LAN infrastructure – 1Gbps Ethernet
   –  WAN infrastructure – 10Gbps Optical Backbone

Page 16: Using perfSONAR and Science DMZ

Motivation – A Typical Scenario

•  User wants to access a file at the resource (e.g. ~600MB)
•  Plans to use COTS tools (e.g. “scp”, but could easily be something scientific like “GridFTP” or simple like a web browser)
•  What are the expectations?
   –  1Gbps network (e.g. bottleneck speed on the LAN)
   –  600MB * 8 = 4,800 Mb file
   –  User expects line rate, e.g. 4,800 Mb / 1000 Mbps = 4.8 seconds
•  What are the realities?
   –  Congestion and other network performance factors
   –  Host performance
   –  Protocol performance
   –  Application performance
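The expectation arithmetic on this slide can be written out as a short sketch. The function names are made up for illustration, and the 120-second observed transfer is a hypothetical figure, not a measurement from the talk.

```python
def ideal_transfer_seconds(file_megabytes, link_mbps):
    """Best-case transfer time: file size in megabits divided by link rate."""
    file_megabits = file_megabytes * 8  # 8 bits per byte
    return file_megabits / link_mbps

def effective_mbps(file_megabytes, observed_seconds):
    """Average rate actually achieved by a transfer that took observed_seconds."""
    return file_megabytes * 8 / observed_seconds

print(ideal_transfer_seconds(600, 1000))  # 4.8
print(effective_mbps(600, 120))           # 40.0
```

Comparing the two numbers gives a quick sense of how far an observed transfer falls short of line rate before any deeper diagnosis starts.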

Page 17: Using perfSONAR and Science DMZ

Example perfSONAR Use Case

•  perfSONAR should be used to diagnose an end-to-end performance problem
   –  User attempts to download a remote resource
   –  Resource and user are geographically separate
   –  Both are assumed to be connected to high-performance networks
•  Poor transfer rate observed
•  How to solve this problem?

Page 18: Using perfSONAR and Science DMZ

Example perfSONAR Use Case

•  The traceroute tool can be used to help determine application path
   –  Only layer 3
   –  No layer 2 information
•  Performance problem could exist anywhere along application path
•  Problem could be
   –  Congested path segment
   –  Faulty equipment
   –  Misconfiguration
•  Where to look?

Page 19: Using perfSONAR and Science DMZ

Example perfSONAR Use Case

•  Typically, each segment of the path is controlled by a different domain
•  Network engineering staff in each domain could help fix the problem
   –  Which domain?
   –  How to contact staff?
•  How to convince engineering staff that there is a performance problem when no ‘hard failure’ exists?

Page 20: Using perfSONAR and Science DMZ

Example perfSONAR Use Case

•  If each domain has measurement data available via perfSONAR
   –  End user could discover this automatically
   –  No need to contact domain staff for this process
•  Automated tools can provide valuable information
•  Regular testing can expose trend anomalies
•  Open testing environments provide timely testing for partial path decomposition
•  Visualization and analysis tools provide clear evidence of success or failure
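Partial path decomposition can be pictured as testing from one end to each measurement host along the path in turn, and noting where results first degrade. The sketch below illustrates only that idea; it does not call any perfSONAR API. The host names, the `measure_mbps` callback, and the 50% tolerance are hypothetical.

```python
def first_bad_segment(path, measure_mbps, expected_mbps, tolerance=0.5):
    """Test from the first host to each later hop in turn.

    `path` is an ordered list of test-host names; `measure_mbps(src, dst)` is a
    caller-supplied function returning measured throughput. Returns the
    (previous_hop, hop) pair where throughput first falls below
    tolerance * expected_mbps, or None if every test looks healthy.
    """
    source = path[0]
    previous = source
    for hop in path[1:]:
        if measure_mbps(source, hop) < tolerance * expected_mbps:
            return (previous, hop)
        previous = hop
    return None

# Hypothetical results keyed by destination; names are made up for illustration.
fake_results = {"campus-edge": 940, "regional": 930, "backbone": 210, "remote-site": 200}
path = ["user-host", "campus-edge", "regional", "backbone", "remote-site"]
print(first_bad_segment(path, lambda src, dst: fake_results[dst], expected_mbps=1000))
```

With the made-up numbers above, throughput is healthy as far as the regional host and drops on the next test, which points attention at the segment between the regional and backbone hosts.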

Page 21: Using perfSONAR and Science DMZ

Example perfSONAR Use Case

•  When the problem is isolated based on testing
   –  ‘Soft failures’ often go unnoticed
   –  Observed ‘soft failures’ may not be reported
•  End user can now contact the domain in question
   –  Provide graphs and analysis to show the problem
   –  Domain staff can replicate analysis with their own testing infrastructure
•  Problem resolution could help others

Page 22: Using perfSONAR and Science DMZ

Another Use Case

•  International cultural event
•  Live musical performance coordinated over different time zones
•  Many network domains involved
•  Application path only partially instrumented with perfSONAR
•  Difficult to debug foreign networks

Page 23: Using perfSONAR and Science DMZ

International use case

•  Cultural event between Cleveland and Mumbai
   –  Live video stream
   –  Latency an issue
•  Testing application
   –  Initial tests worked OK
   –  Ongoing testing showed periodic unacceptable performance
•  What is wrong, and where is the problem?
   –  Mumbai last mile
   –  Indian network
   –  TEIN network (JGN-X leg)
   –  TransPAC Network
   –  Internet2 Network
   –  Local Cleveland leg

Page 24: Using perfSONAR and Science DMZ

International use case (cont.)

•  At least 6 networks involved
•  Cleveland, Internet2, TransPAC, and JGN-X instrumented with perfSONAR
   –  Able to verify network performance from Cleveland to the (far) edge of the JGN-X network in minutes
   –  Contacting a TEIN engineer to set up iperf and latency testing took 2 weeks (still good)
•  No visibility into Indian networks
   –  Difficult communication with foreign engineers (language, time zone, etc.)
   –  Relied on other engineers to test foreign networks
•  After 4 weeks of testing and debugging
   –  Switch misconfiguration discovered in Mumbai
   –  If perfSONAR had been available everywhere along the network path, the problem area could have been identified much faster

Page 25: Using perfSONAR and Science DMZ

Overview

•  perfSONAR
   –  Motivation
   –  Collaborative effort
   –  Use cases
•  Science DMZ
   –  Motivation
   –  Architecture
   –  Use cases

Page 26: Using perfSONAR and Science DMZ

Motivation

Networks are an essential part of data-intensive science
•  Connect data sources to data analysis
•  Connect collaborators to each other
•  Enable machine-consumable interfaces to data and analysis resources (e.g. portals), automation, scale

Performance is critical
•  Exponential data growth
•  Constant human factors
•  Data movement and data analysis must keep up

Effective use of wide area (long-haul) networks by scientists has historically been difficult

Page 27: Using perfSONAR and Science DMZ

Data Mobility in a Given Time Interval

This table available at: http://fasterdata.es.net/fasterdata-home/requirements-and-expectations/
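The table itself did not survive in this transcript, but the relationship it tabulates (data size, time interval, required rate) is simple arithmetic. The sketch below is an illustration under stated assumptions, not a reproduction of the ESnet table: the function name is made up, and it assumes decimal terabytes (10^12 bytes) and a sustained, loss-free rate.

```python
def required_gbps(terabytes, hours):
    """Sustained rate in Gb/s needed to move `terabytes` (10**12 bytes each) in `hours`."""
    bits = terabytes * 1e12 * 8
    seconds = hours * 3600
    return bits / seconds / 1e9

# Illustrative sizes only; consult the linked table for ESnet's own figures.
for size_tb in (1, 10, 100):
    print(f"{size_tb} TB in 1 hour needs about {required_gbps(size_tb, 1):.2f} Gb/s")
```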

Page 28: Using perfSONAR and Science DMZ

The Central Role of the Network

The very structure of modern science assumes science networks exist: high performance, feature rich, global scope

What is “The Network” anyway?
•  “The Network” is the set of devices and applications involved in the use of a remote resource
   −  This is not about supercomputer interconnects
   −  This is about data flow from experiment to analysis, between facilities, etc.
•  User interfaces for “The Network” – portal, data transfer tool, workflow engine
•  Therefore, servers and applications must also be considered

What is important?
1.  Correctness
2.  Consistency
3.  Performance

Page 29: Using perfSONAR and Science DMZ

TCP – Ubiquitous and Fragile

Networks provide connectivity between hosts – how do hosts see the network?
•  From an application’s perspective, the interface to “the other end” is a socket
•  Communication is between applications – mostly over TCP

TCP – the fragile workhorse
•  TCP is (for very good reasons) timid – packet loss is interpreted as congestion
•  Packet loss in conjunction with latency is a performance killer
•  Like it or not, TCP is used for the vast majority of data transfer applications (more than 95% of ESnet traffic is TCP)

Page 30: Using perfSONAR and Science DMZ

A small amount of packet loss makes a huge difference in TCP performance

[Figure: TCP throughput at increasing path distances: Local (LAN), Metro Area, Regional, Continental, International. Series: Measured (TCP Reno), Measured (HTCP), Theoretical (TCP Reno), Measured (no loss).]

With loss, high performance beyond metro distances is essentially impossible
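The slide does not say which model produced the "Theoretical (TCP Reno)" series. A commonly used approximation for loss-limited TCP Reno throughput is the Mathis et al. bound, rate ≤ (MSS / RTT) · (C / √p) with C ≈ 1.22, and the sketch below assumes that model. The round-trip times assigned to each distance class and the loss rate are hypothetical values for illustration.

```python
from math import sqrt

def mathis_mbps(mss_bytes, rtt_ms, loss_rate, c=1.22):
    """Approximate upper bound on TCP Reno throughput, in Mb/s (Mathis model)."""
    rtt_s = rtt_ms / 1000
    bytes_per_second = (mss_bytes / rtt_s) * (c / sqrt(loss_rate))
    return bytes_per_second * 8 / 1e6

# Hypothetical round-trip times per distance class; loss rate is illustrative.
for label, rtt_ms in [("Local (LAN)", 1), ("Metro Area", 5), ("Regional", 20),
                      ("Continental", 80), ("International", 200)]:
    print(f"{label:15s} {mathis_mbps(1460, rtt_ms, 0.0001):10.1f} Mb/s")
```

Because the bound scales with 1/RTT, the same loss rate that is barely noticeable on a LAN becomes crippling on a continental or international path, which is the point the slide makes.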

Page 31: Using perfSONAR and Science DMZ

Working With TCP In Practice

Far easier to support TCP than to fix TCP
•  People have been trying to fix TCP for years – limited success
•  Like it or not we’re stuck with TCP in the general case

Pragmatically speaking, we must accommodate TCP
•  Sufficient bandwidth to avoid congestion
•  Zero packet loss
•  Verifiable infrastructure
   −  Networks are complex
   −  Must be able to locate problems quickly
   −  Small footprint is a huge win – small number of devices so that problem isolation is tractable

Page 32: Using perfSONAR and Science DMZ

Putting A Solution Together

Effective support for TCP-based data transfer
•  Design for correct, consistent, high-performance operation
•  Design for ease of troubleshooting

Easy adoption is critical
•  Large laboratories and universities have extensive IT deployments
•  Drastic change is prohibitively difficult

Cybersecurity – defensible without compromising performance

Borrow ideas from traditional network security
•  Traditional DMZ – separate enclave at network perimeter (“Demilitarized Zone”)
   −  Specific location for external-facing services
   −  Clean separation from internal network
•  Do the same thing for science – Science DMZ

Page 33: Using perfSONAR and Science DMZ

The Science DMZ Design Pattern

Dedicated Systems for Data Transfer: Data Transfer Node
•  High performance
•  Configured specifically for data transfer
•  Proper tools

Network Architecture: Science DMZ
•  Dedicated network location for high-speed data resources
•  Appropriate security
•  Easy to deploy - no need to redesign the whole network

Performance Testing & Measurement: perfSONAR
•  Enables fault isolation
•  Verify correct operation
•  Widely deployed in ESnet and other networks, as well as sites and facilities

Page 34: Using perfSONAR and Science DMZ

Abstract or Prototype Deployment

Add-on to existing network infrastructure
•  All that is required is a port on the border router
•  Small footprint, pre-production commitment

Easy to experiment with components and technologies
•  DTN prototyping
•  perfSONAR testing

Limited scope makes security policy exceptions easy
•  Only allow traffic from partners
•  Add-on to production infrastructure – lower risk
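"Only allow traffic from partners" amounts to a source-address allowlist. In practice this would be enforced by ACLs on the Science DMZ switch or router rather than in application code; the sketch below only illustrates the matching logic. The prefixes are RFC 5737 documentation ranges standing in for partner networks, and the names are hypothetical.

```python
from ipaddress import ip_address, ip_network

# Hypothetical partner prefixes (RFC 5737 documentation ranges, not real sites).
PARTNER_PREFIXES = [ip_network("192.0.2.0/24"), ip_network("198.51.100.0/24")]

def is_partner(source_ip, prefixes=PARTNER_PREFIXES):
    """Return True if source_ip falls inside any allowed partner prefix."""
    addr = ip_address(source_ip)
    return any(addr in prefix for prefix in prefixes)

print(is_partner("192.0.2.17"))    # True
print(is_partner("203.0.113.5"))   # False
```

Keeping the list short is what makes this kind of policy easy to audit, which matches the slide's point about limited scope.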

Page 35: Using perfSONAR and Science DMZ

Science DMZ Design Pattern (Abstract)

[Diagram. Components: WAN, Border Router, Science DMZ Switch/Router, Enterprise Border Router/Firewall, Site / Campus LAN, high performance Data Transfer Node with high-speed storage, three perfSONAR hosts. Links: 10GE, with a 10G WAN connection. Annotations: clean, high-bandwidth WAN path; Site / Campus access to Science DMZ resources; per-service security policy control points.]

Page 36: Using perfSONAR and Science DMZ

Local And Wide Area Data Flows

[Diagram. Components: WAN, Border Router, Science DMZ Switch/Router, Enterprise Border Router/Firewall, Site / Campus LAN, high performance Data Transfer Node with high-speed storage, three perfSONAR hosts. Links: 10GE, with a 10G WAN connection. Annotations: clean, high-bandwidth WAN path; Site / Campus access to Science DMZ resources; per-service security policy control points. Legend: High Latency WAN Path, Low Latency LAN Path.]

Page 37: Using perfSONAR and Science DMZ

Support For Multiple Projects

Science DMZ architecture allows multiple projects to put DTNs in place
•  Modular architecture
•  Centralized location for data servers

This may or may not work well depending on institutional politics
•  Issues such as physical security can make this a non-starter
•  On the other hand, some shops already have service models in place

On balance, this can provide a cost savings – it depends
•  Central support for data servers vs. carrying data flows
•  How far do the data flows have to go?

Page 38: Using perfSONAR and Science DMZ

Multiple Projects

[Diagram. Components: WAN, Border Router, Science DMZ Switch/Router, Enterprise Border Router/Firewall, Site / Campus LAN, Project A DTN, Project B DTN, Project C DTN, two perfSONAR hosts. Links: 10GE, with a 10G WAN connection. Annotations: clean, high-bandwidth WAN path; Site / Campus access to Science DMZ resources; per-project security policy control points.]

Page 39: Using perfSONAR and Science DMZ

Supercomputer Center Deployment

High-performance networking is assumed in this environment
•  Data flows between systems, between systems and storage, wide area, etc.
•  Global filesystem often ties resources together
   −  Portions of this may not run over Ethernet (e.g. IB)
   −  Implications for Data Transfer Nodes

“Science DMZ” may not look like a discrete entity here
•  By the time you get through interconnecting all the resources, you end up with most of the network in the Science DMZ
•  This is as it should be – the point is appropriate deployment of tools, configuration, policy control, etc.

Office networks can look like an afterthought, but they aren’t
•  Deployed with appropriate security controls
•  Office infrastructure need not be sized for science traffic

Page 40: Using perfSONAR and Science DMZ

Supercomputer Center

[Diagram. Components: WAN (virtual circuit and routed connections), Border Router, Core Switch/Router, Firewall, Offices, two front end switches, Data Transfer Nodes, Supercomputer, Parallel Filesystem, three perfSONAR hosts.]

Page 41: Using perfSONAR and Science DMZ

Supercomputer Center Data Path

[Diagram. Components: WAN (virtual circuit and routed connections), Border Router, Core Switch/Router, Firewall, Offices, two front end switches, Data Transfer Nodes, Supercomputer, Parallel Filesystem, three perfSONAR hosts. Legend: High Latency WAN Path, Low Latency LAN Path, High Latency VC Path.]

Page 42: Using perfSONAR and Science DMZ

Overview

•  perfSONAR
   –  Motivation
   –  Collaborative effort
   –  Use cases
•  Science DMZ
   –  Motivation
   –  Architecture
   –  Use cases

Page 43: Using perfSONAR and Science DMZ

Data Site – Architecture

[Diagram. Components: WAN, Virtual Circuit (VC) connections, Provider Edge Routers, Border Routers, HA Firewalls, Site/Campus LAN, Data Service Switch Plane, Data Transfer Cluster, three perfSONAR hosts.]

Page 44: Using perfSONAR and Science DMZ

Data Site – Data Path

[Diagram. Components: WAN, Virtual Circuit (VC) connections, Provider Edge Routers, Border Routers, HA Firewalls, Site/Campus LAN, Data Service Switch Plane, Data Transfer Cluster, three perfSONAR hosts.]

Page 45: Using perfSONAR and Science DMZ

Distributed Science DMZ

Fiber-rich environment enables a distributed Science DMZ
•  No need to accommodate all equipment in one location
•  Allows the deployment of institutional science service

WAN services arrive at the site in the normal way

Dark fiber distributes connectivity to Science DMZ services throughout the site
•  Departments with their own networking groups can manage their own local Science DMZ infrastructure
•  Facilities or buildings can be served without building up the business network to support those flows

Security is more complex
•  Remote infrastructure must be monitored
•  Several technical remedies exist (arpwatch, no DHCP, separate address space, etc.)
•  Solutions depend on relationships with security groups

Page 46: Using perfSONAR and Science DMZ

Distributed Science DMZ – Dark Fiber

[Diagram. Components: WAN, Border Router, Science DMZ Switch/Router, Enterprise Border Router/Firewall, Site / Campus LAN, Project A DTN (remote), Project B DTN (remote), Project C DTN (remote), two perfSONAR hosts. Links: dark fiber, 10GE, with a 10G WAN connection. Annotations: clean, high-bandwidth WAN path; Site / Campus access to Science DMZ resources; per-project security policy control points.]

Page 47: Using perfSONAR and Science DMZ

Multiple Science DMZs – Dark Fiber

[Diagram. Components: WAN, Border Router, Science DMZ Switch/Routers, Enterprise Border Router/Firewall, Site / Campus LAN, Project A DTN (building A), Facility B DTN (building B), Cluster DTN (building C), Cluster (building C), four perfSONAR hosts. Links: dark fiber, 10GE, with a 10G WAN connection. Annotation: per-project security policy.]

Page 48: Using perfSONAR and Science DMZ

Science DMZ – Flexible Design Pattern

The Science DMZ design pattern is highly adaptable to research

Deploying a research Science DMZ is straightforward
•  The basic elements are the same
   −  Capable infrastructure designed for the task
   −  Test and measurement to verify correct operation
   −  Security policy well-matched to the environment, application set is strictly limited to reduce risk
•  Connect the research DMZ to other resources as appropriate

The same ideas apply to supporting an SDN effort
•  Test/research areas for development
•  Transition to production as technology matures and need dictates
•  One possible trajectory follows…

Page 49: Using perfSONAR and Science DMZ

Science DMZ – Separate SDN Connection

[Diagram. Components: WAN, SDN, Border Router, Production Science DMZ Switch/Router, SDN Science DMZ Switch/Router, Enterprise Border Router/Firewall, Site / Campus LAN, Production DTN, Research DTN, Science DMZ Connections, four perfSONAR hosts. Annotations: high performance routed path; SDN path; Site / Campus access to Science DMZ resources; per-service security policy control points.]

Page 50: Using perfSONAR and Science DMZ

Science DMZ – Production SDN Connection

[Diagram. Components: WAN, SDN, Border Router, Production SDN Science DMZ Switch/Router, Research Science DMZ Switch/Router, Enterprise Border Router/Firewall, Site / Campus LAN, Production DTN, Research DTN, Science DMZ Connections, four perfSONAR hosts. Annotations: high performance routed path; SDN path; Site / Campus access to Science DMZ resources; per-service security policy control points.]

Page 51: Using perfSONAR and Science DMZ

Science DMZ – SDN Campus Border

[Diagram. Components: WAN, Border Router, Production SDN Science DMZ Switch/Router, Research Science DMZ Switch/Router, Enterprise Border Router/Firewall, Site / Campus LAN, Production DTN, Research DTN, Science DMZ Connections, four perfSONAR hosts. Annotations: high performance multi-service path; Site / Campus access to Science DMZ resources; per-service security policy control points.]

Page 52: Using perfSONAR and Science DMZ

Common Threads

Two common threads exist in all these examples

Accommodation of TCP
•  Wide area portion of data transfers traverses purpose-built path
•  High performance devices that don’t drop packets

Ability to test and verify
•  When problems arise (and they always will), they can be solved if the infrastructure is built correctly
•  Small device count makes it easier to find issues
•  Multiple test and measurement hosts provide multiple views of the data path
   −  perfSONAR nodes at the site and in the WAN
   −  perfSONAR nodes at the remote site

Page 53: Using perfSONAR and Science DMZ

Router and Switch Output Queues

Interface output queue allows the router or switch to avoid causing packet loss in cases of momentary congestion

In network devices, queue depth (or ‘buffer’) is often a function of cost
•  Cheap, fixed-config LAN switches (especially in the 10G space) have inadequate buffering. Imagine a 10G ‘data center’ switch as the guilty party
•  Cut-through or low-latency Ethernet switches typically have inadequate buffering (the whole point is to avoid queuing!)

Expensive, chassis-based devices are more likely to have deep enough queues
•  Juniper MX and Alcatel-Lucent 7750 used in ESnet backbone
•  Other vendors make such devices as well - details are important
•  Thx to Jim: http://people.ucsc.edu/~warner/buffer.html

This expense is one driver for the Science DMZ architecture – only deploy the expensive features where necessary
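Why queue depth matters can be seen with a little arithmetic: when traffic arrives faster than an interface can drain it, the queue fills at the difference of the two rates, and packets are dropped once it is full. The sketch below is a simplified model with hypothetical buffer sizes; it ignores shared-memory buffer schemes and per-port allocation details, which vary by device.

```python
def burst_absorb_ms(buffer_bytes, in_gbps, out_gbps):
    """Milliseconds a queue of buffer_bytes can absorb a burst arriving at
    in_gbps while draining at out_gbps, before it overflows and drops packets."""
    if in_gbps <= out_gbps:
        return float("inf")  # queue never fills
    fill_rate_bps = (in_gbps - out_gbps) * 1e9
    return buffer_bytes * 8 / fill_rate_bps * 1000

# Hypothetical buffer sizes; real devices vary widely.
for name, size in [("shallow switch, 1 MB", 1_000_000),
                   ("deep-buffer router, 100 MB", 100_000_000)]:
    print(f"{name}: {burst_absorb_ms(size, 10, 1):.2f} ms")
```

Under these assumptions a 10G burst into a 1G port overflows a shallow buffer in under a millisecond, while a deep buffer rides out a burst roughly a hundred times longer.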

Page 54: Using perfSONAR and Science DMZ

Output Queue Drops – Common Locations

[Diagram. Components: WAN, Site Border Router, Site Core Switch/Router, Department Core Switch, Department cluster switch, Wiring closet switch, Workstations, 32+ cluster nodes, Cluster data transfer node. Links: 10GE and 1GE. Annotations: common locations of output queue drops for traffic outbound toward the WAN; common location of output queue drops for traffic inbound from the WAN; department uplink to site core constrained by budget or legacy equipment. Legend: inbound data path, outbound data path.]

Page 55: Using perfSONAR and Science DMZ

Overview

•  perfSONAR
   –  Motivation
   –  Collaborative effort
   –  Use cases
•  Science DMZ
   –  Motivation
   –  Architecture
   –  Use cases

Page 56: Using perfSONAR and Science DMZ

Example from Clemson – CC-NIE

•  Sponsored by NSF OCI(ACI)-1245936
•  “Clemson NextNet” = Production campus network using all three elements of Internet2 Innovation Platform (100G, DMZ, SDN)
•  Upgrade infrastructure (10G or 40G) to the lab/classroom/office in 20 campus buildings
•  Identify & work closely with researchers (weekly meeting)
•  Examples of success (beyond infrastructure), so far on campus:
   –  Diverse team working toward common goals = will continue well beyond project period
   –  CU Genomics Institute: Genomic code optimization & transfer speeds between lab, HPC resources, national databases
   –  Bioengineering: Transfer speeds between lab & HPC resource can support processing for remote surgery tests
   –  Visualization: Rendering & processing between upgraded buildings and HPC resources

Page 57: Using perfSONAR and Science DMZ

Clemson SDN component

•  Originally a prototype network running between two data centers
•  Now transitioned into a production SDN campus network, as depicted in the diagram(s)
•  Currently using a Floodlight controller
•  Utilizes OESS on AL2S
•  Planned collaboration & testing over AL2S:
   –  Genomics: NCBI, Indiana University, University of Florida
   –  Bioengineering: Mayo Clinic, Vanderbilt University
   –  Visualization: Indiana University, others in discussion

Page 58: Using perfSONAR and Science DMZ
Page 59: Using perfSONAR and Science DMZ
Page 60: Using perfSONAR and Science DMZ

Clemson University Science DMZ

Nov  2  2012  

Page 61: Using perfSONAR and Science DMZ

University of Utah Science DMZ

Page 62: Using perfSONAR and Science DMZ

University of Southern California Science DMZ

Page 63: Using perfSONAR and Science DMZ

References

–  Internet2
   •  http://www.internet2.edu
–  NSF – CC-NIE
   •  http://www.nsf.gov/pubs/2012/nsf12541/nsf12541.htm
–  NSF – CC-IIE
   •  http://www.nsf.gov/pubs/2014/nsf14521/nsf14521.htm
–  NSF – IRNC
   •  http://www.nsf.gov/funding/pgm_summ.jsp?pims_id=503382
–  Broaden the Reach (BTR)
   •  http://www.internet2.edu/news/detail/6254/

Page 64: Using perfSONAR and Science DMZ

References (cont.)

–  ESnet fasterdata knowledge base
   •  http://fasterdata.es.net/
–  Science DMZ paper
   •  http://www.es.net/assets/pubs_presos/sc13sciDMZ-final.pdf
–  Science DMZ email list
   •  https://gab.es.net/mailman/listinfo/sciencedmz
–  perfSONAR
   •  http://fasterdata.es.net/performance-testing/perfsonar/
   •  http://psps.perfsonar.net

Page 65: Using perfSONAR and Science DMZ

NYSERNet TECH SUMMIT 2014, Poughkeepsie, NY Thursday, June 12th

     

Questions?