
The Science DMZ – perfSONAR & Network Monitoring

Jason Zurawski - ESnet Engineering & Outreach

Operating Innovative Networks (OIN)

October 3rd & 4th, 2013

With contributions from S. Balasubramanian, E. Dart, B. Johnston, A. Lake, E. Pouyoul, L. Rotman, B. Tierney and others @ ESnet


Lawrence Berkeley National Laboratory U.S. Department of Energy | Office of Science

Overview

Part 1 (Today):
•  What is ESnet?
•  Science DMZ Introduction & Motivation
•  Science DMZ Architecture

Part 2 (Today):
•  perfSONAR
•  Science DMZ Security Best Practices

Part 3 (Today & Tomorrow):
•  The Data Transfer Node
•  Data Transfer Tools
•  Conclusions & Discussion

2 – ESnet Science Engagement ([email protected]) - 10/2/13


The Data Transfer Trifecta: The “Science DMZ” Model

Dedicated Systems for Data Transfer: Data Transfer Node
•  High performance
•  Configured for data transfer
•  Proper tools

Network Architecture: Science DMZ
•  Dedicated location for DTN
•  Proper security
•  Easy to deploy - no need to redesign the whole network

Performance Testing & Measurement: perfSONAR
•  Enables fault isolation
•  Verify correct operation
•  Widely deployed in ESnet and other networks, as well as sites and facilities


Test and Measurement – Keeping the Network Clean

The wide area network, the Science DMZ, and all its systems can be functioning perfectly

Eventually something is going to break
•  Networks and systems are built with many, many components
•  Sometimes things just break – this is why we buy support contracts

Other problems arise as well – bugs, mistakes, whatever

We must be able to find and fix problems when they occur

Why is this so important? Because we use TCP!


Where Are The Problems?

[Figure: network path from a source host (S) to a destination host (D). Diagram labels: Source Campus, Regional, NREN, Backbone, Destination Campus. Problem areas called out:]

•  Congested or faulty links between domains
•  Congested intra-campus links
•  Latency dependent problems inside domains with small RTT


Local Testing Will Not Find Everything

[Figure: path from a source host (S) to a destination host (D). Diagram labels: Source Campus, Regional, R&E Backbone, Regional, Destination Campus, with a switch with small buffers on the path.]

•  Performance is good when RTT is < ~10 ms
•  Performance is poor when RTT exceeds ~10 ms
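One way to see why a small-buffer switch may look fine in local tests but hurt long paths is the bandwidth-delay product (BDP), the amount of data in flight needed to keep a path full. The sketch below is illustrative only: the 10 Gbps rate and the RTT values are assumptions chosen for the example, and `bdp_bytes` is a hypothetical helper name, not something from the slides.

```python
def bdp_bytes(bandwidth_bps: float, rtt_ms: float) -> float:
    """Bandwidth-delay product: bytes in flight needed to fill the path."""
    return bandwidth_bps * (rtt_ms / 1000.0) / 8.0


if __name__ == "__main__":
    rate_bps = 10e9  # assumed 10 Gbps path (illustrative, not from the slides)
    for rtt_ms in (1, 10, 50, 100):
        megabytes = bdp_bytes(rate_bps, rtt_ms) / 1e6
        print(f"RTT {rtt_ms:>3} ms -> BDP {megabytes:.2f} MB")
```

The data in flight grows linearly with RTT, so a test over a 1 ms local path exercises far less buffering than a 50 ms wide-area transfer at the same rate.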


Soft Network Failures

Soft failures are where basic connectivity functions, but high performance is not possible.

TCP was intentionally designed to hide all transmission errors from the user:

•  “As long as the TCPs continue to function properly and the internet system does not become completely partitioned, no transmission errors will affect the users.” (From IEN 129, RFC 761)

Some soft failures only affect high bandwidth long RTT flows.

Hard failures are easy to detect & fix
•  soft failures can lie hidden for years!

One network problem can often mask others
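A rough way to illustrate why some soft failures only affect long-RTT flows is the widely cited Mathis et al. approximation, which bounds steady-state TCP throughput by (MSS / RTT) × (C / √p) for loss rate p. This model is not part of the original slides; the sketch below assumes C = 1, a 1460-byte MSS, and a made-up loss rate purely for illustration.

```python
import math


def mathis_throughput_bps(mss_bytes: float, rtt_ms: float,
                          loss_rate: float, c: float = 1.0) -> float:
    """Approximate upper bound on TCP throughput (bits/s) under random loss."""
    if loss_rate <= 0 or rtt_ms <= 0:
        raise ValueError("loss_rate and rtt_ms must be positive")
    rtt_s = rtt_ms / 1000.0
    return (mss_bytes * 8.0 / rtt_s) * (c / math.sqrt(loss_rate))


if __name__ == "__main__":
    loss = 1e-4  # assumed 0.01% packet loss (illustrative)
    for rtt_ms in (1, 10, 50, 100):
        mbps = mathis_throughput_bps(1460, rtt_ms, loss) / 1e6
        print(f"RTT {rtt_ms:>3} ms -> at most ~{mbps:.1f} Mbps")
```

Under this model the same loss rate costs ten times more throughput at 100 ms than at 10 ms, which is consistent with a fault staying invisible to short-distance tests.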


Network Monitoring

•  All networks do some form of monitoring.
•  Addresses needs of local staff for understanding state of the network
   o  Would this information be useful to external users?
   o  Can these tools function on a multi-domain basis?
•  Beyond passive methods, there are active tools.
   o  E.g. often we want a ‘throughput’ number. Can we automate that idea?
   o  Wouldn’t it be nice to get some sort of plot of performance over the course of a day? Week? Year? Multiple endpoints?

perfSONAR = Measurement Middleware


perfSONAR

All the previous network diagrams have little perfSONAR boxes everywhere
•  The reason for this is that consistent behavior requires correctness
•  Correctness requires the ability to find and fix problems
   -  You can’t fix what you can’t find
   -  You can’t find what you can’t see
   -  perfSONAR lets you see

Especially important when deploying high performance services
•  If there is a problem with the infrastructure, need to fix it
•  If the problem is not with your stuff, need to prove it
   -  Many players in an end to end path
   -  Ability to show correct behavior aids in problem localization


What is perfSONAR?

perfSONAR is a tool to:

•  Set network performance expectations

•  Find network problems (“soft failures”)

•  Help fix these problems

All in multi-domain environments

•  These problems are all harder when multiple networks are involved

perfSONAR provides a standard way to publish active and passive monitoring data

•  This data is interesting to network researchers as well as network operators


The “perfSONAR Toolkit” is an open source implementation and packaging of the perfSONAR measurement infrastructure and protocols from ESnet and Internet2

http://psps.perfsonar.net

All components are available as RPMs, and bundled into a CentOS 6-based “netinstall” and a “Live CD”

•  perfSONAR tools are much more accurate if run on a dedicated perfSONAR host, not on the DTN

Very easy to install and configure
•  Usually takes less than 30 minutes

perfSONAR Toolkit


Hands On – Configuration of a pSPT

The best source of information is here:
•  http://code.google.com/p/perfsonar-ps/wiki/pSPerformanceToolkit331

There are two use cases for configuration:
•  Diagnostic
   -  Burn CD, insert, boot, Done!
   -  You can’t configure regular testing, but you can test to this/log on and test with it
•  Permanent
   -  Couple of steps to install the Linux Distro


Hands On – Configuration of a pSPT

We will be using VMs:
•  perfsonar-ws-2.internet2.edu – perfsonar-ws-10.internet2.edu
•  Note – Some of you have to share, pair up!

Visit your VM in a web browser first, e.g.:
•  http://perfsonar-ws-XX.internet2.edu (where XX is your number)


Hands On – Enabling SSH

Click on “Enabled Services”
•  Note you may need to ‘ok’ a security warning

Username: “root”

Password: “psworkshop”


Hands On – Via the Web Interface …


Hands On – Setting up SSH

Click ‘SSH’ to enable the SSH service

Click “Save”
•  A progress bar will appear
•  When done “Configuration Saved And Services Restarted” will appear
•  Note: If you are sharing, only one of you will need to make this change

SSH is now available on your host


Hands On – Configuration of a pSPT

Open a terminal

SSH to root@perfsonar-ws-XX.internet2.edu (where XX is your number)


Hands On – Administrative Info

•  Do this first, otherwise a lot of other stuff won’t work.
•  Authentication is required
•  Always remember to save when you are done.


20  –  10/2/13,  ©  2013  ESnet,  Internet2  J.  Zurawski  –  [email protected]    


Hands On – Administrative Info

Click on ‘edit’ to edit (of course):

Press “OK” and “Save” when done:


Hands On – NTP

•  Do this second. Note that it may take a day to fully stabilize the clock
•  Pick 4 – 5 close servers for NTP
•  We have a fast way to do this, or you can manually select
•  Can also add your own servers if you don’t like ours
•  Note: Clocks are stable, no one should ‘save’, but feel free to play around and select closer ones if you want.


Hands On – NTP

Press “select closest” to run a selection

Add in servers manually


Hands On – Services

•  Services should be enabled/disabled from this screen (don’t use chkconfig, we overwrite that with each save…)
•  Shortcuts to enable bandwidth only vs latency only
•  SSH is disabled by default!
•  Note: Don’t ‘save’ after this part either, but feel free to see what the buttons do.


Hands On – Services

•  Select/de-select via buttons. Pick a use case as well


Hands On – Regular Testing

•  All regular testing follows the same pattern:
   -  Select a Type
   -  Select Parameters
   -  Add Hosts
   -  Save
•  Will only go over BWCTL here


Hands On – Regular Testing

•  Initial

Create test parameters

Add Hosts

Enter a new host


Hands On – Regular Testing

Let’s use these:
•  Test to 1 or 2 of your neighbors (perfsonar-ws-X.internet2.edu)
•  Test to Internet2
   -  Ping/OWAMP: owamp.losa.net.internet2.edu, owamp.chic.net.internet2.edu, owamp.hous.net.internet2.edu, owamp.salt.net.internet2.edu
   -  Traceroute/BWCTL: bwctl.losa.net.internet2.edu, bwctl.chic.net.internet2.edu, nms-bwctl.hous.net.internet2.edu, nms-bwctl.salt.net.internet2.edu

Set up Latency, BW, Ping, and Traceroute tests


Transition – What did we just do?

•  perfSONAR interface is meant to be simple (e.g. so easy even an Engineer Scientist CIO could do it)
•  Enabling this on campus is the first step to seeing a simulation of performance for a bulk data tool. Ideally you would place the perfSONAR server where the users are (e.g. if they are still traversing a firewall, why don’t you learn their pain?)
•  Configuring regular tests is systematic – pick regional and far away destinations.
•  Dust off netflow, and see where the data is going – configure tests to those locations too.


The Metrics

Use the correct tool for the job
•  To determine the correct tool, maybe we need to start with what we want to accomplish …

What do we care about measuring?
•  Packet Loss, Duplication, out-of-orderness (transport layer)
•  Achievable Bandwidth (e.g. “Throughput”)
•  Latency (Round Trip and One Way)
•  Jitter (Delay variation)
•  Interface Utilization/Discards/Errors (network layer)
•  Traveled Route
•  MTU Feedback


perfSONAR Toolkit Services

PS-Toolkit includes these measurement tools:

•  BWCTL: network throughput

•  OWAMP: network loss, delay, and jitter

•  traceroute

Test scheduler:

•  runs bwctl, traceroute, and owamp tests on a regular interval

Measurement Archives (data publication)

•  SNMP MA – router interface Data

•  pSB MA -- results of bwctl, owamp, and traceroute tests

Lookup Service: used to find services

PS-Toolkit includes these web100-based Troubleshooting Tools

•  NDT (TCP analysis, duplex mismatch, etc.)

•  NPAD (TCP analysis, router queuing analysis, etc.)


Toolkit Web Interface


Deployment By The Numbers

•  Last updated early Sept 2013. Adoption trend increases with each release. CC-NIE and innovation platform helped as well.


World-Wide perfSONAR-PS Deployments: 950+ as of October 2013


Adoption = A Checkbox?

•  Can say that about other technologies like IPv6 too …

•  Much like a car insurance policy, most will continue to pay the premiums even though they believe they drive ‘safely’

•  Most that have adopted have done so for a specific reason (e.g. it works)
   •  ~35 Countries
   •  ~205 Domains
   •  ~950 Instances
   •  30% have made the upgrade to the latest version so far (~2 months out from release)

•  Other macro trends:
   •  Those that deploy, deploy more than 1
   •  Huge uptick in Europe and Asia.
   •  Network Providers, Campuses, and VOs
   •  Not just the “usual” suspects
      -  Commercial entities, African NRENs, non-DOE government – many are IPv6 only (!)


Importance of Regular Testing

We can’t wait for users to report problems and then fix them (soft failures can go unreported for years!)

Things just break sometimes
•  Failing optics
•  Somebody messed around in a patch panel and kinked a fiber
•  Hardware goes bad

Problems that get fixed have a way of coming back
•  System defaults come back after hardware/software upgrades
•  New employees may not know why the previous employee set things up a certain way and back out fixes

Important to continually collect, archive, and alert on active throughput test results
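The "collect, archive, and alert" idea can be sketched in a few lines. This is a minimal illustration, not how perfSONAR or NAGIOS implement alerting: the function name `throughput_alerts`, the 5-sample window, the 50% drop threshold, and the sample values are all assumptions made up for the example.

```python
from statistics import median


def throughput_alerts(samples_mbps, window=5, drop_fraction=0.5):
    """Return indices of samples that fall below drop_fraction times the
    median of the preceding `window` samples (hypothetical alert rule)."""
    alerts = []
    for i in range(window, len(samples_mbps)):
        baseline = median(samples_mbps[i - window:i])
        if samples_mbps[i] < drop_fraction * baseline:
            alerts.append(i)
    return alerts


if __name__ == "__main__":
    # Made-up archive of regular throughput test results, in Mbps.
    history = [900, 910, 905, 895, 900, 300, 890]
    print(throughput_alerts(history))
```

Comparing each result against a rolling median of the archive, rather than a fixed number, is one way to notice a regression that returns after an upgrade without hand-tuning a threshold per path.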


perfSONAR Dashboard: http://ps-dashboard.es.net


Adding Attenuator to Noisy Link


Host Tuning Example

•  Host Configuration – spot when the TCP settings were tweaked…
•  Example Taken from REDDnet (UMich to TACC, using BWCTL measurement)
•  Host Tuning: http://fasterdata.es.net/fasterdata/host-tuning/linux/


Regular perfSONAR Tests

We run regular tests to check for two things

•  TCP throughput

•  One way delay and packet loss

perfSONAR has mechanisms for managing regular testing between perfSONAR hosts

•  Statistics collection and archiving

•  Graphs

•  Dashboard display

•  Integrate with NAGIOS

This infrastructure is deployed now – perfSONAR hosts at facilities can take advantage of it

At-a-glance health check for data infrastructure


Throughput Detail Graph

•  Temporary drop in performance was due to re-route around a fiber cut
•  Latency increase
•  Clean otherwise (performance stayed high)
•  Other than that, it’s stable, and performs well (over 2Gbps per stream)
•  This is a powerful tool for expectation management


Develop a Test Plan

What are you going to measure?
•  Achievable bandwidth
   -  2-3 regional destinations
   -  4-8 important collaborators
   -  4-8 (more if you are willing, especially to start) times per day to each destination
   -  20-30 second tests within a region, longer across oceans and continents
•  Loss/Availability/Latency
   -  OWAMP: ~10-20 collaborators over diverse paths
•  Interface Utilization & Errors (via SNMP)

What are you going to do with the results?
•  NAGIOS Alerts
•  Reports to user community
•  Dashboard
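The bandwidth-test numbers above translate into a daily testing budget. The sketch below simply multiplies out the low and high ends of the ranges on the slide; `daily_test_seconds` is a hypothetical helper, and treating every test as one-directional with the same duration is a simplifying assumption.

```python
def daily_test_seconds(destinations: int, tests_per_day: int,
                       seconds_per_test: int) -> int:
    """Total seconds per day one host spends running throughput tests."""
    return destinations * tests_per_day * seconds_per_test


if __name__ == "__main__":
    # Low end: 2 regional + 4 collaborators, 4 tests/day, 20 s each.
    low = daily_test_seconds(2 + 4, 4, 20)
    # High end: 3 regional + 8 collaborators, 8 tests/day, 30 s each.
    high = daily_test_seconds(3 + 8, 8, 30)
    for label, secs in (("low", low), ("high", high)):
        share = 100.0 * secs / 86400
        print(f"{label}: {secs} s/day ({share:.1f}% of the day)")
```

Even at the high end of these ranges the throughput tests occupy only a few percent of the day, which leaves room for longer intercontinental tests.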


http://psps.perfsonar.net/toolkit/hardware.html

Dedicated perfSONAR hardware is best
•  Server class is a good choice
•  Desktop/Laptop/Mini (Mac, Shuttle) can be problematic, but work in a diagnostic capacity

Other applications will perturb results

Separate hosts for throughput tests and latency/loss tests are preferred
•  Throughput tests can cause increased latency and loss
•  Latency tests on a throughput host are still useful however

1Gbps vs 10Gbps testers
•  There are a number of problems that only show up at speeds above 1Gbps

Virtual Machines do not always work well as perfSONAR hosts (use specific)
•  Clock sync issues are a bit of a factor
•  Throughput is reduced significantly for 10G hosts
•  VM technology and motherboard technology have come a long way, YMMV
•  NDT/NAGIOS/SNMP/1G BWCTL are good choices for a VM, OWAMP/10G BWCTL are not

Host Considerations


perfSONAR Deployment Locations

Critical to deploy such that you can test with useful semantics

perfSONAR hosts allow parts of the path to be tested separately
•  Reduced visibility for devices between perfSONAR hosts
•  Must rely on counters or other means where perfSONAR can’t go

Effective test methodology derived from protocol behavior
•  TCP suffers much more from packet loss as latency increases
•  TCP is more likely to cause loss as latency increases
•  Testing should leverage this in two ways
   -  Design tests so that they are likely to fail if there is a problem
   -  Mimic the behavior of production traffic as much as possible
•  Note: don’t design your tests to succeed
   -  The point is not to “be green” even if there are problems
   -  The point is to find problems when they come up so that the problems are fixed quickly


Sample Site Deployment


ATLAS Dashboard


Common perfSONAR Use Case

Trouble ticket comes in: “I’m getting terrible performance from site A to site B”

If there is a perfSONAR node at each site border:
•  Run tests between perfSONAR nodes
   -  Performance is often clean
•  Run tests from end hosts to perfSONAR host at site border
   -  Often find packet loss (using owamp tool)
   -  If not, problem is often the host tuning or the disk
   -  If not that, suspect a switch buffer overflow problem
      •  These are the hardest to prove

If there is not a perfSONAR node at each site border:
•  Try to get one deployed
•  Run tests to other nearby perfSONAR nodes


WAN Test Methodology – Problem Isolation

Segment-to-segment testing is unlikely to be helpful
•  TCP dynamics will be different
•  Problem links can test clean over short distances
•  An exception to this is hops that go through a firewall

Run long-distance tests
•  Run the longest clean test you can, then look for the shortest dirty test that includes the path of the clean test

In order for this to work, the testers need to be already deployed when you start troubleshooting
•  ESnet has at least one perfSONAR host at each hub location
   -  Many (most?) R&E providers in the world have deployed at least one
•  If your provider does not have perfSONAR deployed, ask them why, and then ask when they will have it done
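The "longest clean test, shortest dirty test" rule can be sketched as a small selection procedure. This is only an illustration of the reasoning, not a perfSONAR feature: the function name `isolate`, the host names, and the representation of results as a dictionary of clean/dirty flags are all hypothetical.

```python
def isolate(hosts, results):
    """Narrow down where a problem probably lies on a path.

    hosts:   perfSONAR host names in path order (hypothetical names).
    results: {(host_a, host_b): True if the test was clean, False if dirty}.
    Returns (longest_clean, shortest_dirty, suspect_hops), or None if the
    results do not contain a clean test enclosed by a dirty test.
    """
    pos = {name: i for i, name in enumerate(hosts)}

    def span(pair):
        a, b = sorted((pos[pair[0]], pos[pair[1]]))
        return a, b

    clean = [span(p) for p, ok in results.items() if ok]
    dirty = [span(p) for p, ok in results.items() if not ok]
    if not clean or not dirty:
        return None

    # Longest clean test by number of hops covered.
    ca, cb = max(clean, key=lambda s: s[1] - s[0])
    # Dirty tests whose path includes the whole clean path.
    covering = [(a, b) for a, b in dirty if a <= ca and b >= cb]
    if not covering:
        return None
    da, db = min(covering, key=lambda s: s[1] - s[0])

    # Hops in the dirty test that the clean test did not exercise.
    suspects = [(hosts[i], hosts[i + 1])
                for i in range(da, db) if not (ca <= i < cb)]
    return (hosts[ca], hosts[cb]), (hosts[da], hosts[db]), suspects


if __name__ == "__main__":
    path = ["campus-dmz", "campus-border", "regional", "backbone",
            "lab-border", "lab-dmz"]
    tests = {
        ("campus-border", "lab-dmz"): True,   # long test, clean
        ("campus-dmz", "lab-dmz"): False,     # slightly longer test, dirty
    }
    print(isolate(path, tests))
```

The hops left over after subtracting the clean path from the dirty path are the place to look first, and they usually identify the administrative domain to contact.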


Network Performance Troubleshooting Example

[Figure: network diagram. Labels: University Campus, National Laboratory, WAN; 10GE and Nx10GE links; Border perfSONAR and Science DMZ perfSONAR hosts at both sites; path marked “Poor Performance”.]


Wide Area Testing – Full Context

[Figure: the same path in full context. Labels: Campus (~1 msec), Regional Path (~2 msec), Internet2 path (~15 msec), ESnet path (~30 msec), Lab (~1 msec); 10GE, Nx10GE, and 100GE links; perfSONAR hosts throughout, including Border perfSONAR and Science DMZ perfSONAR hosts; path marked “Poor Performance”.]


Wide Area Testing – Long Clean Test

[Figure: the same diagram with a 48 msec test path overlaid, marked “Clean, Fast”. Labels: Campus (~1 msec), Regional Path (~2 msec), Internet2 path (~15 msec), ESnet path (~30 msec), Lab (~1 msec).]


Wide Area Testing – Poorly Performing Tests Illustrate Likely Problem Areas

[Figure: the same diagram with several test paths overlaid (48 msec and 49 msec), some marked “Clean, Fast” and others marked “Dirty, Slow”. Labels: Campus (~1 msec), Regional Path (~2 msec), Internet2 path (~15 msec), ESnet path (~30 msec), Lab (~1 msec).]


Lessons From This Example

This testing can be done quickly if perfSONAR is already deployed

Huge productivity
•  Reasonable hypothesis developed quickly
•  Probable administrative domain identified
•  Testing time can be short – an hour or so at most

Without perfSONAR, cases like this are very challenging
•  Time to resolution measured in months

In order to be useful for data-intensive science, the network must be fixable quickly, because it will break

The Science DMZ model allows high-performance use of the network, but perfSONAR is necessary to ensure the whole kit functions well


perfSONAR Community

perfSONAR-PS is working to build a strong user community to support the use and development of the software.

perfSONAR-PS Mailing Lists
•  Announcement Lists:
   -  https://mail.internet2.edu/wws/subrequest/perfsonar-ps-announce
   -  https://mail.internet2.edu/wws/subrequest/performance-node-announce
•  Users List:
   -  https://mail.internet2.edu/wws/subrequest/performance-node-users


More on perfSONAR

•  http://psps.perfsonar.net/
•  https://code.google.com/p/perfsonar-ps/


Overview

Part 1 (Today):
•  What is ESnet?
•  Science DMZ Introduction & Motivation
•  Science DMZ Architecture

Part 2 (Today):
•  perfSONAR
•  Science DMZ Security Best Practices

Part 3 (Today & Tomorrow):
•  The Data Transfer Node
•  Data Transfer Tools
•  Conclusions & Discussion


State of the Campus

Show of hands – is there a firewall on your campus?
•  Do you know who ‘owns’ it? Maintains it? Is it being maintained?
•  Have you ever asked for a ‘port’ to be opened? To whitelist a host? Does this involve an email to ‘a guy’ you happen to know?
•  Has it prevented you from being ‘productive’?

In General …
•  Yes, they exist.
•  Someone owns them, and probably knows how to add rules – but the ‘maintenance’ question is harder to answer.
   -  Like a router/switch, they need firmware updates too…
•  Will it impact you – ‘it depends’. Yes, it will have an effect on your traffic at all times, but will you notice?
   -  Small streams (HTTP, Mail, etc.) – you won’t notice slowdowns, but you will notice blockages
   -  Larger streams (Data movement, Video, Audio) – you will notice slowdowns


Say Hello to your Frenemy: The Campus Firewall

To be 100% clear – the firewall is a useful tool:

•  A layer of protection that is based on allowed, and disallowed, behaviors

•  One stop location to install instructions (vs. implementing in multiple locations)

•  Very necessary for things that need ‘assurance’ (e.g. student records, medical data, protecting the HVAC system, IP Phones, and printers from bad people, etc.)

To be 100% clear again, the firewall delivers functionality that can be implemented in different ways

•  Filtering ranges can be implemented via ACLs

•  Port/Host blocking can be done on a host by host basis

•  IDS tools can implement near real-time blocking of ongoing attacks that match heuristics


The role of Campus Firewalls

I am not here to make you throw away the Firewall
•  The firewall has a role; it’s time to define what that role is, and is not
•  Policy may need to be altered (pull out the quill pens and parchment)
•  Minds may need to be changed

I am here to make you think critically about campus security as a system. That requires:
•  Knowledge of the risks and mitigation strategies
•  Knowing what the components do, and do not do
•  Humans to implement and manage certain features – this may be a shock to some (lunch is never free)


When Security and Performance Clash

What does a firewall do?
•  Streams of packets enter into an ingress port – there is some buffering
•  Packet headers are examined. Have I seen a packet like this before?
   -  Yes – If I liked it, let it through; if I didn’t like it, goodbye.
   -  No – Who sent this packet? Are they allowed to send me packets? What port did it come from, and what port does it want to go to?
•  Packet makes it through processing and switching fabric to some egress port. Sent on its way to the final destination.

Where are the bottlenecks?
•  Ingress buffering – can we tune this? Will it support a 10G flow, let alone multiple 10G flows?
•  Processing speed – being able to verify quickly is good. Verifying slowly will make TCP sad
•  Switching fabric/egress ports. Not a huge concern, but these can drop packets too
•  Is the firewall instrumented to know how well it is doing? Could I ask it?


Causes of Jitter

•  Processing Delay: Time to process a packet
•  Queuing Delay: Time spent in ingress/egress queues to device
•  Transmission Delay: Time needed to put the packet on the wire
•  Propagation Delay: Time needed to travel on the wire
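Two of these four components can be computed directly from link speed and distance, which gives a feel for their scale. The sketch below is illustrative only: the function names are hypothetical, and the signal speed of 2×10^8 m/s (roughly two thirds of the speed of light, a common rule of thumb for fiber) is an assumption, not a figure from the slides. Processing and queuing delay depend on device load and are not modeled here.

```python
def transmission_delay_us(packet_bytes, link_bps):
    """Time to put a packet on the wire, in microseconds."""
    return packet_bytes * 8 / link_bps * 1e6


def propagation_delay_ms(distance_km, speed_m_per_s=2.0e8):
    """Time for a signal to travel the given distance, in milliseconds.

    The default speed is an assumed value for light in optical fiber.
    """
    return distance_km * 1000 / speed_m_per_s * 1e3


if __name__ == "__main__":
    for name, bps in (("1GE", 1e9), ("10GE", 10e9), ("100GE", 100e9)):
        delay = transmission_delay_us(1500, bps)
        print(f"1500-byte packet on {name}: {delay:.2f} us to transmit")
    print(f"1000 km of fiber: {propagation_delay_ms(1000):.1f} ms one way")
```

For a given packet size and path these two values are essentially fixed, so in practice the variation that shows up as jitter comes mostly from the processing and queuing terms.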


When Security and Performance Clash

Let’s look at two examples that highlight two primary network architecture use cases:

•  Totally protected campus, with a border firewall
   -  Central networking maintains the device, and protects all in/outbound traffic
   -  Pro: end of the line customers don’t need to worry (as much) about security
   -  Con: end of the line customers *must* be sent through the disruptive device

•  Unprotected campus, protection is the job of network customers
   -  Central networking gives you a wire and wishes you best of luck
   -  Pro: nothing in the path to disrupt traffic, unless you put it there
   -  Con: Security becomes an exercise that is implemented by all end customers


Brown University – Firewalls for All


Brown University Example

Results to host behind the firewall:


Brown University Example

In front of the firewall:


Brown Univ. Example – TCP Dynamics

Want more proof? Let’s look at a measurement tool through the firewall.
•  Measurement tools emulate a well-behaved application

‘Outbound’, not filtered:

    nuttcp -T 10 -i 1 -p 10200 bwctl.newy.net.internet2.edu
    92.3750 MB / 1.00 sec = 774.3069 Mbps 0 retrans
    111.8750 MB / 1.00 sec = 938.2879 Mbps 0 retrans
    111.8750 MB / 1.00 sec = 938.3019 Mbps 0 retrans
    111.7500 MB / 1.00 sec = 938.1606 Mbps 0 retrans
    111.8750 MB / 1.00 sec = 938.3198 Mbps 0 retrans
    111.8750 MB / 1.00 sec = 938.2653 Mbps 0 retrans
    111.8750 MB / 1.00 sec = 938.1931 Mbps 0 retrans
    111.9375 MB / 1.00 sec = 938.4808 Mbps 0 retrans
    111.6875 MB / 1.00 sec = 937.6941 Mbps 0 retrans
    111.8750 MB / 1.00 sec = 938.3610 Mbps 0 retrans

    1107.9867 MB / 10.13 sec = 917.2914 Mbps 13 %TX 11 %RX 0 retrans 8.38 msRTT


Through the firewall

‘Inbound’, filtered:

    nuttcp -r -T 10 -i 1 -p 10200 bwctl.newy.net.internet2.edu
    4.5625 MB / 1.00 sec = 38.1995 Mbps 13 retrans
    4.8750 MB / 1.00 sec = 40.8956 Mbps 4 retrans
    4.8750 MB / 1.00 sec = 40.8954 Mbps 6 retrans
    6.4375 MB / 1.00 sec = 54.0024 Mbps 9 retrans
    5.7500 MB / 1.00 sec = 48.2310 Mbps 8 retrans
    5.8750 MB / 1.00 sec = 49.2880 Mbps 5 retrans
    6.3125 MB / 1.00 sec = 52.9006 Mbps 3 retrans
    5.3125 MB / 1.00 sec = 44.5653 Mbps 7 retrans
    4.3125 MB / 1.00 sec = 36.2108 Mbps 7 retrans
    5.1875 MB / 1.00 sec = 43.5186 Mbps 8 retrans

    53.7519 MB / 10.07 sec = 44.7577 Mbps 0 %TX 1 %RX 70 retrans 8.29 msRTT
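The two nuttcp summary lines (outbound on the previous slide, inbound here) can be compared programmatically. The sketch below is a hypothetical helper, not part of nuttcp or perfSONAR; the regular expression is an assumption that matches only the output format shown on these slides.

```python
import re

# Matches "... = <rate> Mbps ... <n> retrans" as printed on the slides.
SUMMARY_RE = re.compile(r"=\s*([\d.]+)\s*Mbps.*?(\d+)\s*retrans")


def parse_summary(line):
    """Return (throughput_mbps, retransmits) from a nuttcp result line."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError(f"unrecognized nuttcp line: {line!r}")
    return float(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    outbound = ("1107.9867 MB / 10.13 sec = 917.2914 Mbps "
                "13 %TX 11 %RX 0 retrans 8.38 msRTT")
    inbound = ("53.7519 MB / 10.07 sec = 44.7577 Mbps "
               "0 %TX 1 %RX 70 retrans 8.29 msRTT")
    out_rate, out_retrans = parse_summary(outbound)
    in_rate, in_retrans = parse_summary(inbound)
    print(f"outbound: {out_rate:.1f} Mbps, {out_retrans} retransmits")
    print(f"inbound:  {in_rate:.1f} Mbps, {in_retrans} retransmits")
    print(f"outbound is roughly {out_rate / in_rate:.0f}x faster")
```

The RTT is nearly identical in both directions (8.38 ms vs 8.29 ms), so the difference in throughput tracks the retransmits seen on the filtered direction rather than the path length.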


tcptrace output: with and without a firewall

[Figure: two tcptrace plots, one labeled “firewall” and one labeled “No firewall”.]


The Pennsylvania State University – Firewalls for Some

Unprotected campus, protection is the job of network customers


The Pennsylvania State University

•  Initial report from network users: performance poor in both directions
   -  Outbound and inbound (normal issue is inbound through protection mechanisms)
•  From previous diagram – CoE firewall was tested
   -  Machine outside/inside of firewall. Test to point 10ms away (Internet2 Washington)

    jzurawski@ssstatecollege:~> nuttcp -T 30 -i 1 -p 5679 -P 5678 64.57.16.22
    5.8125 MB / 1.00 sec = 48.7565 Mbps 0 retrans
    6.1875 MB / 1.00 sec = 51.8886 Mbps 0 retrans
    …
    6.1250 MB / 1.00 sec = 51.3957 Mbps 0 retrans
    6.1250 MB / 1.00 sec = 51.3927 Mbps 0 retrans

    184.3515 MB / 30.17 sec = 51.2573 Mbps 0 %TX 1 %RX 0 retrans 9.85 msRTT


The Pennsylvania State University

•  Observation: net.ipv4.tcp_window_scaling did not seem to be working
   -  64K of buffer is default. Over a 10ms path, this means we can hope to see only 50Mbps of throughput:
   -  BDP (50 Mbit/sec, 10.0 ms) = 0.06 Mbyte
•  Implication: something in the path was not respecting the specification in RFC 1323, and was not allowing the TCP window to grow
   -  TCP window of 64 KByte and RTT of 1.0 ms <= 500.00 Mbit/sec.
   -  TCP window of 64 KByte and RTT of 5.0 ms <= 100.00 Mbit/sec.
   -  TCP window of 64 KByte and RTT of 10.0 ms <= 50.00 Mbit/sec.
   -  TCP window of 64 KByte and RTT of 50.0 ms <= 10.00 Mbit/sec.
•  Reading documentation for firewall:
   -  TCP flow sequence checking was enabled
   -  What would happen if this was turned off (both directions)?
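The window-limited bounds listed above follow from the relation throughput ≤ window / RTT, and the BDP figure from rate × RTT. A minimal sketch of both calculations; the function names are hypothetical, and treating "64 KByte" as exactly 65536 bytes is an assumption (the slide's figures are rounded, which is why the exact result comes out slightly higher than 50 Mbit/sec).

```python
def max_throughput_mbps(window_bytes, rtt_ms):
    """Upper bound on TCP throughput for a fixed window, in Mbit/s."""
    return window_bytes * 8 / (rtt_ms / 1000.0) / 1e6


def bdp_bytes(rate_mbps, rtt_ms):
    """Bandwidth-delay product: bytes in flight needed to fill the path."""
    return rate_mbps * 1e6 / 8 * (rtt_ms / 1000.0)


if __name__ == "__main__":
    window = 64 * 1024  # window size without window scaling (assumed exact)
    for rtt_ms in (1.0, 5.0, 10.0, 50.0):
        bound = max_throughput_mbps(window, rtt_ms)
        print(f"64 KByte window, RTT {rtt_ms:4.1f} ms: <= {bound:6.1f} Mbit/s")
    # Window needed to fill a 1 Gbit/s path at the 10 ms RTT measured here.
    print(f"BDP at 1000 Mbit/s, 10 ms: {bdp_bytes(1000, 10) / 1e6:.2f} MByte")
```

At a 10 ms RTT the exact bound is about 52 Mbit/sec, which lines up with the roughly 51 Mbps that nuttcp measured through the firewall.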


The Pennsylvania State University

    jzurawski@ssstatecollege:~> nuttcp -T 30 -i 1 -p 5679 -P 5678 64.57.16.22
    55.6875 MB / 1.00 sec = 467.0481 Mbps 0 retrans
    74.3750 MB / 1.00 sec = 623.5704 Mbps 0 retrans
    87.4375 MB / 1.00 sec = 733.4004 Mbps 0 retrans
    …
    91.7500 MB / 1.00 sec = 770.0544 Mbps 0 retrans
    88.6875 MB / 1.00 sec = 743.5676 Mbps 28 retrans
    69.0625 MB / 1.00 sec = 578.9509 Mbps 0 retrans

    2300.8495 MB / 30.17 sec = 639.7338 Mbps 4 %TX 17 %RX 730 retrans 9.88 msRTT


The Pennsylvania State University

Impacting real users:


Science DMZ Security

Goal – disentangle security policy and enforcement for science flows from that of business systems

Rationale
•  Science flows are relatively simple from a security perspective
•  Narrow application set on Science DMZ hosts
   -  Data transfer, data streaming packages
   -  Performance / packet loss monitoring tools
   -  No printers, document readers, web browsers, building control systems, staff desktops, etc.
•  Security controls that are typically implemented to protect business resources often cause performance problems
•  Sizing security infrastructure designed for business networks to handle large science flows is expensive


In Big Data Science, Performance Is a Core Requirement Too

Core information security principles
•  Confidentiality, Integrity, Availability (CIA)

In data-intensive science, performance is an additional core mission requirement (CIAP)

•  CIA principles are important, but if the performance isn’t there the science mission fails

•  This isn’t about “how much” security you have, but how the security is implemented

•  We need to be able to appropriately secure systems in a way that does not compromise performance


Science DMZ Placement Outside the Firewall

The Science DMZ resources are placed outside the enterprise firewall for performance reasons
•  The meaning of this is specific – Science DMZ traffic does not traverse the firewall data plane
•  This has nothing to do with whether packet filtering is part of the security enforcement toolkit

Lots of heartburn over this, especially from the perspective of a conventional firewall manager
•  Lots of organizational policy directives mandating firewalls
•  Firewalls are designed to protect converged enterprise networks
•  Why would you put critical assets outside the firewall???

The answer is that firewalls are typically a poor fit for high-performance science applications


The Ubiquitous Firewall

The workhorse device of network security – the firewall – has a poor track record in high-performance contexts
•  Firewalls are typically designed to support a large number of users/devices, each with low throughput requirements
   -  Data-intensive science typically generates a much smaller number of connections that are much higher throughput

Modern firewalls are far more than a packet filter:
•  Decode certain application protocols (IDS/IPS functionality, URL filter, etc.)
•  Rewrite headers (e.g. NAT)
•  VPN Gateway

None of these are relevant to Science DMZ applications


What’s Inside Your Firewall?

Vendor: “But wait – we don’t do this anymore!”
•  It is true that vendors are working toward line-rate 10G firewalls, and some may even have them now
•  10GE has been deployed in science environments for over 10 years
•  Firewall internals have only recently started to catch up with the 10G world
•  100GE is being deployed now, 40Gbps host interfaces are available now
•  Firewalls are behind again

In general, IT shops want to get 5+ years out of a firewall purchase
•  This often means that the firewall is years behind the technology curve
•  Whatever you deploy now, that’s the hardware feature set you get
•  When a new science project tries to deploy data-intensive resources, they get whatever feature set was purchased several years ago


Firewall Capabilities and Science Traffic

Firewalls have a lot of sophistication in an enterprise setting
•  Application layer protocol analysis (HTTP, POP, MSRPC, etc.)
•  Built-in VPN servers
•  User awareness

Data-intensive science flows don’t match this profile
•  Common case – data on filesystem A needs to be on filesystem Z
   -  Data transfer tool verifies credentials over an encrypted channel
   -  Then it opens a socket or set of sockets, and sends data until done (1TB, 10TB, 100TB, …)
•  One workflow can use 10% to 50% or more of a 10G network link

Do we have to use a firewall?


Firewalls vs Router Access Control Lists

When you ask a firewall administrator to allow data transfers through the firewall, what do they ask for?
•  IP address of your host
•  IP address of the remote host
•  Port range
•  That looks like an ACL to me – I can do that on the router!

Firewalls make expensive, low-performance ACL filters compared to the ACL capabilities that are typically built into the router
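The three items a firewall administrator asks for (local host, remote host, port range) map directly onto a stateless ACL entry. The sketch below shows only the matching logic and is not any router's configuration syntax; the `AclRule` type, the `permits` helper, the port range, and the addresses (taken from the RFC 5737 documentation ranges) are all hypothetical.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class AclRule:
    """One permit entry: local network, remote network, port range."""
    local: ipaddress.IPv4Network
    remote: ipaddress.IPv4Network
    port_lo: int
    port_hi: int


def permits(rules, local_ip, remote_ip, port):
    """Return True if any rule permits the flow; default is deny."""
    local = ipaddress.ip_address(local_ip)
    remote = ipaddress.ip_address(remote_ip)
    return any(
        local in rule.local
        and remote in rule.remote
        and rule.port_lo <= port <= rule.port_hi
        for rule in rules
    )


if __name__ == "__main__":
    # A DTN exchanging data with one collaborator network (example values).
    rules = [
        AclRule(
            local=ipaddress.ip_network("192.0.2.10/32"),
            remote=ipaddress.ip_network("198.51.100.0/24"),
            port_lo=50000,
            port_hi=51000,
        )
    ]
    print(permits(rules, "192.0.2.10", "198.51.100.7", 50500))  # data port
    print(permits(rules, "192.0.2.10", "203.0.113.5", 50500))   # other site
```

Each check is a simple comparison against packet header fields with no per-flow state, which is the kind of filtering routers generally perform in the forwarding path.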


Security Without Firewalls

Does this mean we ignore security? NO!
•  We must protect our systems
•  We just need to find a way to do security that does not prevent us from getting the science done

Lots of other security solutions
•  Host-based IDS and firewalls
•  Intrusion detection (Bro, Snort, others), flow analysis, …
•  Tight ACLs reduce attack surface (possible in many but not all cases)
•  Key point – performance is a mission requirement, and the security policies and mechanisms that protect the Science DMZ should be architected so that they serve the mission


If Not Firewalls, Then What?

•  Remember – the goal is to protect systems in a way that allows the science mission to succeed

•  There are multiple ways to solve this – some are technical, and some are organizational/sociological

•  Note: this is harder than just putting up a firewall and thinking you are done


Other Security Tools

Intrusion Detection Systems (IDS)
•  One example is Bro – http://bro-ids.org/
•  Bro is high-performance and battle-tested
   -  Bro protects several high-performance national assets
   -  Bro can be scaled with clustering: http://www.bro-ids.org/documentation/cluster.html

•  Other IDS solutions are available also

Blackhole Routing to block attacks

Netflow, IPFIX, sflow, etc. can provide visibility


Other Security Tools (2)

Aggressive access lists
•  More useful with project-specific DTNs
•  If the purpose of the DTN is to exchange data with a small set of remote collaborators, the ACL is pretty easy to write
•  Large-scale data distribution servers are hard to handle this way (but then, the firewall ruleset for such a service would be pretty open too)

Limitation of the application set
•  One of the reasons to limit the application set in the Science DMZ is to make it easier to protect
•  Keep unnecessary applications off the DTN (and watch for them anyway using a host IDS – take violations seriously)
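
To make the "small set of remote collaborators" case concrete, here is a minimal sketch of default-deny allow-list matching, written in Python rather than any router's ACL syntax. The prefixes and the port number (2811, used here only as an illustrative service port) are assumptions, not values from the talk.

```python
import ipaddress

# Hypothetical allow-list: (remote collaborator prefix, destination TCP port).
ALLOWED = [
    (ipaddress.ip_network("192.0.2.0/24"), 2811),
    (ipaddress.ip_network("198.51.100.0/25"), 2811),
]

def permitted(src_ip, dst_port, rules=ALLOWED):
    """Return True if some rule matches; everything else is denied by default."""
    addr = ipaddress.ip_address(src_ip)
    return any(addr in net and dst_port == port for net, port in rules)

if __name__ == "__main__":
    for src, port in [("192.0.2.44", 2811), ("198.51.100.200", 2811), ("192.0.2.44", 22)]:
        verdict = "permit" if permitted(src, port) else "deny"
        print(f"{src}:{port} -> {verdict}")
```

The list stays short because the DTN has one job; the same approach becomes unwieldy for a large-scale distribution server, as the slide notes.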


Other Security Tools (3)

Using a Host IDS is recommended for hosts in a Science DMZ

There are several open source solutions that have been recommended:

•  OSSec: http://www.ossec.net/

•  Rkhunter: http://rkhunter.sourceforge.net (rootkit detection + FIM)

•  chkrootkit: http://chkrootkit.org/

•  Logcheck: http://logcheck.org (log monitoring)

•  Fail2ban: http://www.fail2ban.org/wiki/index.php/Main_Page

•  denyhosts: http://denyhosts.sourceforge.net/
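
The general idea behind log-driven tools of this kind can be sketched as counting failed logins per source and flagging sources over a threshold. This is not how Fail2ban or denyhosts are actually implemented; the log line format, the regular expression, and the threshold are all assumptions for illustration.

```python
import re
from collections import Counter

# Assumed sshd-style message; real log formats vary by system and version.
FAILED_LOGIN = re.compile(r"Failed password for .* from (\S+) port \d+")

def offenders(log_lines, threshold=3):
    """Return the sorted source addresses with at least `threshold` failures."""
    counts = Counter()
    for line in log_lines:
        match = FAILED_LOGIN.search(line)
        if match:
            counts[match.group(1)] += 1
    return sorted(ip for ip, hits in counts.items() if hits >= threshold)

# Hypothetical log excerpt.
sample_log = [
    "sshd[101]: Failed password for root from 203.0.113.7 port 4242 ssh2",
    "sshd[102]: Failed password for invalid user admin from 203.0.113.7 port 4243 ssh2",
    "sshd[103]: Accepted publickey for alice from 192.0.2.10 port 5000 ssh2",
    "sshd[104]: Failed password for root from 203.0.113.7 port 4244 ssh2",
    "sshd[105]: Failed password for bob from 192.0.2.20 port 6000 ssh2",
]

if __name__ == "__main__":
    print(offenders(sample_log))
```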


Using OpenFlow to help secure the Science DMZ

Using OpenFlow to control access to a network-based service seems promising

•  E.g.: Sam Russell’s work at REANNZ:
   -  http://pieknywidok.blogspot.com.au/2013/01/thimble-secure-high-speed-connectivity.html
•  This could significantly reduce the attack surface for any authenticated network service


Collaboration Within The Organization

All stakeholders should collaborate on Science DMZ design, policy, and enforcement

The security people have to be on board
•  Remember: in some organizations security people already have political cover – it’s called the firewall
•  If a host gets compromised, the security officer can say they did their due diligence because there was a firewall in place
•  If the deployment of a Science DMZ is going to jeopardize the job of the security officer, expect pushback

The Science DMZ is a strategic asset, and should be understood by the strategic thinkers in the organization

•  Changes in security models
•  Changes in operational models
•  Enhanced ability to compete for funding
•  Increased institutional capability – greater science output


Is it possible to get a firewall that can handle 10G flows?

Yes, but just barely, and it will cost around $500K.
•  Will this $500K give you any added security over router ACLs?

10G host interfaces have been around for 10 years, and true 10G firewalls for only a couple years

How long will it take for there to be a true 40G firewall? Or 100G?


Thought Experiment

•  We’re going to do a thought experiment

•  Consider a network between three buildings – A, B, and C

•  This is supposedly a 10Gbps network end to end (look at the links on the buildings)

•  Building A houses the border router – not much goes on there except the external connectivity

•  Lots of work happens in building B – so much so that the processing is done with multiple processors to spread the load in an affordable way, and results are aggregated after

•  Building C is where we branch out to other buildings

•  Every link between buildings is 10Gbps – this is a 10Gbps network, right???


Notional 10G Network Between Buildings

[Figure: Building Layout. Labels shown: WAN, perfSONAR, Building A, Building B (drawn with many internal 1G links), Building C, 10GE links between buildings, and "To Other Buildings".]


Clearly Not A 10Gbps Network

If you look at the inside of Building B, it is obvious from a network engineering perspective that this is not a 10Gbps network

•  Clearly the maximum per-flow data rate is 1Gbps, not 10Gbps
•  However, if you convert the buildings into network elements while keeping their internals intact, you get routers and firewalls
•  What firewall did the organization buy? What’s inside it?
•  Those little 1G “switches” are firewall processors

This parallel firewall architecture has been in use for years
•  Slower processors are cheaper
•  Typically fine for a commodity traffic load
•  Therefore, this design is cost competitive and common
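
The per-flow limit can be worked through with simple arithmetic. The sketch below assumes a single flow is handled by one internal processor, so its ceiling is the slowest element on its path. The 1 TB dataset size is an assumed example, and protocol overhead and TCP behavior are ignored.

```python
def per_flow_ceiling_gbps(path_gbps):
    """A single flow can go no faster than the slowest element on its path."""
    return min(path_gbps)

def transfer_seconds(size_bytes, rate_gbps):
    """Ideal transfer time, ignoring protocol overhead and loss."""
    return size_bytes * 8 / (rate_gbps * 1e9)

# Path speeds in Gbps: 10GE in, one 1G internal processor, 10GE out.
firewall_path = [10, 1, 10]
# The same path with no 1G element in the middle.
clean_path = [10, 10, 10]

ONE_TB = 10**12  # assumed dataset size in bytes

if __name__ == "__main__":
    for name, path in [("through firewall", firewall_path), ("clean path", clean_path)]:
        rate = per_flow_ceiling_gbps(path)
        secs = transfer_seconds(ONE_TB, rate)
        print(f"{name}: {rate} Gbps per flow, {secs / 60:.0f} minutes for 1 TB")
```

Under these assumptions the same 1 TB transfer takes roughly 133 minutes through the 1G element versus about 13 minutes on the clean path, even though every external link is 10GE.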


Notional 10G Network Between Devices

[Figure: Device Layout. Labels shown: WAN, perfSONAR, Border Router, Firewall (drawn with many internal 1G links), Internal Router, 10GE links between devices, and "To Other Buildings".]


Notional Network Logical Diagram

[Figure: logical diagram. Labels shown: WAN, Border Router, Border Firewall, Internal Router, perfSONAR, and 10GE links.]


Security As a System

Component-based security is wrong. It needs to be a system.
•  E.g. the firewall by itself has limited use, and can be easily broken by a motivated attacker

System:
•  Cryptography to protect user access and data integrity
•  IDS to monitor before (and after) events
•  Host-based security is better for performance, but takes longer to implement. Firewalls are bad on performance but easy to plop down in a network.
•  Let your router help you – if you know communication patterns (and know those that should be disallowed), why not use filters?

Campus CI Plan. Make one, update it often. It shows funding bodies you know what is going on and have plans to address risks and foster growth.

Economic argument – if you are non-competitive for grants because you approached security from the wrong side, are you better off in the long run?


Security As a System

Data Provenance
•  Some bureaucratic document states that all campus traffic must be a) encrypted and b) passed through a firewall for packet inspection. Why?
   -  a) What data is private, and what isn’t? Student records, sure. Maybe even sensitive grant-related research. Encrypting all data is not necessary if you stop to think about the data. At least make it a user choice.
   -  b) Firewalls work when you can’t be sure of a traffic profile (e.g. they stop everything and give it the business). If you know the traffic profile, use that to your advantage: data from X sites on ports Y and Z.
•  Policy is:
   -  Written by those that often do not have practical experience
   -  Outdated almost immediately
•  Review (create) CI Plan regularly.


Security As a System

User Management
•  What is better: a centrally managed user system for all resources vs. independently managed on each machine?
•  Central
   -  Pro: Easier administration when adding/deleting
   -  Con: Single point of failure
•  Individual
   -  Pro/Con: Breach of one machine doesn’t necessarily imply that accounts on others are compromised (N.B. I think we are all guilty of recycling passwords though…)
•  Answer depends on your campus, which is another reason why the DMZ is a blueprint, not a packaged solution


Security As a System

Device Profiles
•  All the devices are equal (untrusted)
   -  Have the number of phones/tablets eclipsed hard campus resources for any of you yet?
   -  You should absolutely not trust these, or *many* of your hard campus resources
•  Some are more equal than others (trusted)
   -  Does the Physics group have a dedicated admin who ‘gets it’? They know Linux, and have implemented host-based security, plus split out heavy hitters from normal users?
   -  Give them a fast path (Penn State Model)
   -  If policy needs to be changed, start handing out certificates to groups that complete a training. CYA…


Sample Security Analysis from the University of Illinois (Nick Buraglio)

How is security handled on campus now?

Firewalls

IPS

ACLs

Black hole routing

IDS

Host IDS

SNMP collection

The first 2 (Firewalls and IPS) are the only ones with performance implications. Can we create a secure environment without them?


Sample Security Analysis from the University of Illinois (Nick Buraglio)

•  Management and Security Concerns:
   -  “Adding visibility is essential for accountability”
   -  “Timely mitigation of issues is required”
   -  “Automated mitigation is highly desirable”*
   -  “Once you’ve broken into a DMZ host you have an outpost in enemy territory”


Sample Security Analysis from the University of Illinois (Nick Buraglio)

University of Illinois management, network engineers, and security staff decided on the following for their Science DMZ:

•  Flow Data for accountability (netflow/sflow/jflow)

•  SNMP collection for baseline creation and capacity planning

•  Router ACLs for best practice ingress blocks

•  Passive network IDS for monitoring (Bro)

•  Host IDS on all hosts outside the firewall (OSSec)

•  IDS triggered black hole routing for mitigation
   -  Triggers from both network and host IDS

•  Bogon (bogus IP address) filtering
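
One way to picture IDS-triggered black hole routing is as a small decision step that turns alerts from the network and host IDS into a list of sources to block. The sketch below is not the University of Illinois implementation; the alert format, the sensor names, and the threshold are hypothetical, and actually injecting the black hole route into the routers is out of scope.

```python
from collections import defaultdict

def blackhole_candidates(alerts, min_alerts=2):
    """Return {source: [sensors]} for sources reported at least min_alerts times."""
    seen = defaultdict(list)
    for alert in alerts:
        seen[alert["src"]].append(alert["sensor"])
    return {src: sensors for src, sensors in seen.items() if len(sensors) >= min_alerts}

# Hypothetical alerts from a network IDS and a host IDS.
sample_alerts = [
    {"src": "203.0.113.7", "sensor": "network-ids"},
    {"src": "203.0.113.7", "sensor": "host-ids"},
    {"src": "198.51.100.9", "sensor": "network-ids"},
]

if __name__ == "__main__":
    for src, sensors in blackhole_candidates(sample_alerts).items():
        print(f"black hole {src} (reported by {', '.join(sensors)})")
```

Requiring more than one report before acting is a design choice that trades response time for fewer false positives; a site could just as reasonably act on a single high-confidence alert.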


Summary So Far

Monitoring is a key part of the story – ensures things work, don’t break, and stay fixed

Emulate the user’s use case: site the monitoring near users, and talk to them regularly about their experience

Security needs to evolve with technology and use case – one size fits all is wrong.

Revisit security choices often. The firewall team doesn’t need to be the bad guys as long as you are working toward the same goal.


The Science DMZ – perfSONAR & Network Monitoring

Questions?

Jason Zurawski - [email protected]

ESnet Science Engagement – [email protected]

http://fasterdata.es.net