nottingham/horizon 27th january 2016 cybersecurity and network measurement problematic...

73
Cybersecurity and network measurement problematic in so many ways Nottingham/Horizon Seminar 27 th January 2016 Team Jon h"p://www.cl.cam.ac.uk/~jac22 h"p://metricsitn.eu/

Upload: others

Post on 11-Mar-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Cybersecurity and network measurement

problematic in so many ways

Nottingham/Horizon

Seminar

27th January 2016

Team Jon

h"p://www.cl.cam.ac.uk/~jac22  h"p://metrics-­‐itn.eu/  

Page 2: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

outline  

•  Three  parts  1.  The  Internet  at  large  2.  Measurement  data  is  big  data  –  what’s  hard?  3.  Measuring  things  is  not  neutral  –  why?  

Page 3: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Part  1  –  The  Internet  is  Big  

•  Not  just  size  but  complexity  – Graph  data  –  has  billions  nodes  &  edges  

•  Hypergraph  –  edges  have  mulOple  meanings  •  Sparse,  and  dynamic  •  Topology,  topography,  policy  at  IP  level  •  Authorship,  ownership,  ACLs  at  Web  level  

– So  simple  quesOons  (clusters,  cliques,  hubs,  etc)  •  Are  computaOonally  very  expensive  (O(m^3/2))  

Page 4: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Part  2-­‐  Big  Data  

1.  The  “Big”  in  Big  Data  is  relaOve    2.  Big  Social  Data  3.  Big  Science  Big  Data  4.  Big  Private  Data  5.  Big  Bad  Data      Alan  Turing  InsOtute  for  Data  Science      h"p://www.turing.ac.uk/  

 

Page 5: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

“Big”  Data  is  RelaOve  

•  Social  Sciences  •  Natural  Sciences  •  ComputaOonal  Sciences  

Page 6: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Social  Sciences  

•  Big  >  12,  or  “Complete”  •  E.g.  all  of  a  family,  town,  country,  world  •  10  Billion  is  not  really  big    

–  if  you’re  just  counOng  •  Problem  is  Ground  Truth  

– E.g.  where  did  you  get  your  data  from?  

Page 7: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Social  Big  Data  problems  

•  Bias    – Sample  Bias  – Recruitment  Bias  – Survivor  Bias  

•  E.g.  Data  from  smart  phones  – Who  has  smart  phones?  

• What  type?  (MAC  addr  no  longer  tells  J  )  – WEIRD  

•  white  educated  industrialized  rich  democraOc  

Page 8: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Social  Graph  Data  

•  Even  when  you  have  “large”  data  – Beware,  McSherry  et  al,  results      h"p://www.frankmcsherry.org/graph/scalability/cost/2015/01/15/COST.html  –  See  annex  1  slides…    

– But  also  ground  truth  etc  etc  •  And  aforesaid  sample/ground  truth  quesOons  

Page 9: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Natural  Science  Data  

•  ParOcle  Physics:  LHC/CERN  – 600M  events/sec  – 10Gbps  – Mostly  noiseJ  

•  Square  Kilometer  Array  – 10^15  bps  (petabit  per  sec)  – Trickier  =  100*  the  whole  internet  L  

•  Lesson  -­‐    they  will  build  big  enough  processing  

Page 10: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

ComputaOonal  Science  

•  Complexity….Big  Bad  Data  •  GeneOcs/EpigeneOcs/Phenomics  

–  Interdependence  within  data  –    – poster  child  e.g.  is  protein  folding  – Complexity  in  model  is  exponenOal  – What  hope?  

•  Lesson:-­‐  people  will  do  approximaOon  algo  

Page 11: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Physics/Chem/Bio  

•  Use  HPC  clusters/rack  scale  systems  – Tighter  memory  interconnect  (e.g.  Cray)  – Very  very  large,  fast  RAM    – mulOple  terabytes  today  – Vector  processor  support  

•  Lesson:  Not  much  use  for  us…or  is  it?  

Page 12: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Private  Data  •  Much  social  data  is  PII  

–  Even  meta  data  is  PII  –  Protect  “big”  data  by  AAA  –  Anonymize?  Very  hard,  especially  graphs  –  Inference  on  nodes  easy  

•  Re-­‐idenOficaOon  is  almost  trivial  –  E.g.  m,  yellowcab,  medicare  –  Via  public  diary,  postcode,  other  sources  

•  DiffPriv  –  works,  but  care  sOll  needed  •  Homomorphic  Cryptography  –  tbd!!!  •  Lesson:-­‐  Not  a  solved  problem,  access  control  vital  

Page 13: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Public  Health  

•  Aside  from  IoT,  PH  is  biggest  valid  use  of  PII  – On  negaOve  side,  privacy  crucal&legal  – On  posiOve  side,  few  genuine  researchers,  so  – AAA&Diff  Priv  work  pre"y  well  

•  QuanOfied  Self  +  Wellbeing/fitness  already…  •  Fitbit,  food  diaries  etc  •  Lesson  –  good  moOves  but  mission  creep  

Page 14: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Big  Data  processing  tools  

•  Aside  from  Hadoop,    – Apache’s  Spark  Streaming  and  Graphx  – R  – Naiad  (unsupported  for  now)  – Write  your  own  J  

•  Also  care  about  data  center  network  – Latency  bounds  improve  performance  – See  annex  2  slides  

Page 15: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Big  AnalyOcs  companies  

•  Google,  facebook  – See  OpenStack  and  Datacenter  Networking  (Yongguang  Zhang)  later….  

•  Run  on  specialized  data  centers  – Non  standard  interconnects  

•   (clos  nets)  – Non  standard  protocols  

•  IP  rouOng  doesn’t  scale  (l2  bridge+vpn++)  •  TCP  hacks…  •  Rdma  (microsou)  

Page 16: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

google  1

The Google Stack

Source: Malte Schwarzkopf. “Operating system support for warehouse-scale computing”. PhDthesis. University of Cambridge Computer Laboratory (to appear), 2015, Chapter 2.

Chubby [Bur06]locking and coordination

Borg [VPK+15] and Omega [SKA+13]cluster manager and job scheduler

coordination & cluster management

Dapper [SBB+10]pervasive tracing

CPI2 [ZTH+13]interference mitigationm

onito

ring

tool

s

GFS/Colossus [GGL03]distributed block store and file system

BigTable [CDG+06]row-consistent multi-dimensional sparse map

MegaStore [BBC+11]cross-DC ACID database

Spanner [CDE+13]cross-DC multi-version DB

Dremel [MGL+10]columnar database

data storage

MapReduce [DG08]parallel batch processing

FlumeJava [CRP+10]parallel programming

Tenzing [CLL+11]SQL-on-MapReduce

Percolator [PD10]incremental processing PowerDrill [HBB+12]

query UI & columnar store

MillWheel [ABB+13]stream processing

Pregel [MAB+10]graph processing

data processing

Figure 1: The Google infrastructure stack. I omit the F1 database [SOE+12] (the back-endof which was superseeded by Spanner), and unknown front-end serving systems. Arrowsindicate data exchange and dependencies between systems; simple layering does not implya dependency or relation.

In addition, there are also papers that do not directly cover systems in the Google stack:

• An early-days (2003) high-level overview of the Google architecture [BDH03].

• An extensive description of Google’s General Configuration Language (GCL), sadly withsome parts blackened [Bok08].

• A study focusing on tail latency effects in Google WSCs [DB13].

• Several papers characterising Google workloads from public traces [MHC+10; SCH+11;ZHB11; DKC12; LC12; RTG+12; DKC13; AA14].

• Papers analysing the impact of workload co-location [MTH+11; MT13], hyperthread-ing [ZZE+14], and job packing strategies on workloads [VKW14].

Page 17: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

facebook  

1

The Facebook Stack

Source: Malte Schwarzkopf. “Operating system support for warehouse-scale computing”. PhDthesis. University of Cambridge Computer Laboratory (to appear), 2015, Chapter 2.

ÜberTrace [CMF+14, §3]pervasive tracing

MysteryMachine [CMF+14]performance modeling

monitoring toolsparallel data processing

HDFS [SKR+10]distributed block store and file system

MySQLsharded ACID database

HBase [BGS+11]multi-dimensional sparse map

f4 [MLR+14]warm blob storage

Haystack [BKL+10]hot blob storage

memcached [NFG+13]in-memory key-value store/cache

TAO [BAC+13]graph store

Wormhole [SAA+15]pub-sub replication

data storage

(Hadoop) MapReduce [DG08]parallel batch processing

Hive [TSA+10]SQL-on-MapReduce

Peregrine [MG12]interactive querying

Scuba [AAB+13]in-memory database

Unicorn [CBB+13]graph processing

Figure 1: The Facebook infrastructure stack. I omit front-end serving systems about whichdetails are unknown. Arrows indicate data exchange and dependencies between systems;simple layering does not imply a dependency or relation.

In addition, there are several papers that do not directly cover systems in the Facebook stack,but describe workloads, techniques or data centre hardware:

• Descriptions of the physical design of Facebook’s server machines as of 2011 [FHL+11]and data centre network architecture as of 2013 [FA13].

• Another paper on the HBase back-end for Facebook messages [ABC+12] and a measure-ment paper looking at the HDFS-level usage patterns of this HBase deployment [HBD+14].

• Papers on the use of erasure codes in HDFS at Facebook [RSG+13; SAP+13; RSG+14].

• Several papers analysing the Facebook memcached workload [AXF+12] and evaluatingnew sampling strategies to improve hit rates in memcached [LLD+13].

• A study of Facebook’s wide-area photo caching infrastructure [HBR+13].

• A description of how Facebook uses shared memory to persist in-memory state acrossrestarts of Scuba server processes [GCG+14].

• The HipHop Virtual Machine (HHVM) is a JIT compiler and runtime for PHP code heav-ily used in front-end page generation [AEM+14]. Previously, Facebooke used a source-to-source compiler (also called “HipHop”, HPHPc) to transform PHP into semanticallyequivalent C++ code that can be compiled into native code [ZPY+12].

Page 18: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

More  subtle  stuff  

•  Deep  learning  •  ML  using  neural  nets,  etc  

– May  be  amenable  to  other  non  standard  h/w  •  Not  transparent  or  even  explanatory?  

– Some  say  quantum  compuOng  •  Others  put  that  in  doubt…  

•  Lesson:-­‐  AI  is  ML  that  doesn’t  work,  yet  

Page 19: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Part  3  -­‐  Three  Use  Cases  

In  order  of  increasing  badness:  •  Maps  •  FluPhone  •  Censorship  

Page 20: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Use  Case  #1:  Crowd  Sourced  Net  Atlas  

•  Carna  Botnet  – Used  to  measure  net  from  420,000  vantage  points  – Used  default  password  exploit  –  Illegal  in  most  countries  

•  See  “Internet  census  2012:  port  scanning/0  using  insecure  embedded  devices”  

– Pass  “Does  no  harm”  test?  •  Technically  yes  &  no  (bandwidth  costs)  •  ReputaOonally  no  

Page 21: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Use  Case  #1  conOnued  

•  Was  it  useful?  – A  bit  – But  alternaOves  exist  – CAIDA  &  Internet  Atlas  Projects  

•  Is  it  dangerous?  Gives  an  open  example  of  an  exploit  Possibly  –  shows  where  to  a"ack  net  hubs    

Page 22: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Use  Case  #2:  FluPhone  •  Goal  to  collect  encounter  data    

–  during  H1/N1  influenza  epidemic  –  Get  SIR  parameters  early  –  Find  other  features  of  epidemic    –  Vector,  age/gender  effects,  herd  immunity  –  h"p://www.cl.cam.ac.uk/research/srg/netos/projects/archive/

fluphone/  •  At  start  of  epidemic,  mortality  was  high  

–  Privacy  not  an  issue  (noOfiable  disease)?  –  But  medical  ethics  commi"ee:  – Weren’t  allowed  to  collect  on  children!  –  Bad,  as  they  are  a  key  mix  part  of  flu  spreading!  

Page 23: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Use  Case  #2  conOnued  

/RFDWLRQ�SUR[LPLW\�GDWD

9LD�*356��*

7KH�DYHUDJH�SHUVRQ�HQFRXQWHUHG�RYHU������XQLTXH�GHYLFHV�RYHU�D�WHQ�GD\�SHULRG�

Page 24: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Use  Case  #3:  Censorship  

•  “Encore:  Lightweight  Measurement  of  Web  Censorship  with  Cross-­‐Origin  Requests”  

     h"p://conferences.sigcomm.org/sigcomm/2015/pdf/reviews/226pr.pdf  

•  lesson  in  Do’s  &Don’ts  – of  ethical  measurement  – Methodologically  

•  Idea:  cause  browser  visiOng  innocent  site  A  •  To  be  redirected  to  “is  it  censored,  site  B”  

Page 25: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Use  case  #3  conOnued  

What  could  possibly  go  wrong,  part  1?  1.  If  you  are  in  a  dangerous  country  and  your  

browser  visits  a  censored  site,  the  excuse  2.  “I  didn’t  click  on  that”  doesn’t  help  you  from  

being  arrested  and  tortured  3.  We  know  dangerous  countries  have  logging  

firewalls  to  implement  censorship    4.  E.g.  Bluecoat  technology  illegally  shipped  to  

Syria,  Iran,  Russia  etc  etc  

Page 26: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Use  case  #3  conOnued  

What  could  possibly  go  wrong,  part  2?  •  The  ACLs  will  rapidly  be  updated  

–  To  block  the  site  A  (redirector  script  site)  – Or  the  script  pa"ern  itself  –  Rendering  the  experiment  useless.  

•  Meanwhile,  other  people  have  already  done  successful  experiments  in  any  case,  e.g.  –  Censorship  in  the  Wild:  Analyzing  Internet  Filtering  in  Syria,  doi>10.1145/2663716.2663720  

– And  did  no  harm  

Page 27: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

And  another  thing  

•  Interference  is  a  bad  thing  –  In  today’s  internet  (of  things),  s/w  is  fragile  – You  don’t  know  what  a  device  is  (for)  – E.g.  ipad  for  reading  email  – Might  also  be  car  dashboard  (Tesla)  – You  change  library  (e.g.  random  #  gen)  – Might  crash  car…or  open  it  up  to  hackers  – Who  crash  car.  loss  of  privacy  -­‐>  loss  of  life  

Page 28: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Future  is  interesOng  •  Lots  to  do,  lots  not  to  do.  •  InteresOng/diverse  and  useful  

–  But  also  care  needed  –  Making  more  haystacks  to  find  less  needles…  –  Medical  ethics  overly  strict  –  AdverOsing  ethics  underly  strict  

•  Cybersecurity?  You  work  it  out…  –  Be"er  not  have  sample  bias  or  inexplicable  ML  –  Please  map  the  examples  I  gave  onto  cybersec  quesOons  –  And  see  what  would  be  useful,    –  and  what  would  be  counter-­‐produc.ve  

•  Excellent  careers  right  now  for  CS+X  –  For  X=science,  commerce,  math/stats  

Page 29: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

QuesOons?  

•  I’m  happy  to  repond  to  followups.                            [email protected]  

Page 30: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Annex  2  slides  on  Graph  Processing  

By  J  Crowcrou  from  work  by  Malte  Schqarzkopf  &  Frank  McSherry  

Computer  Laboratory  &  Unafilliated  University  of  Cambridge  

Page 31: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

tl;dr  #1  

•  Network  speed  may  not  ma"er  with  a  Spark  based  stack,  but  it  does  ma"er  to  higher  performance  analyOcs  stacks,  and  for  graph  processing  especially.  

•   By  moving  from  a  1G  to  a  10G  network,  we  see  a  2x-­‐3x  improvement  in  performance  for  Omely  dataflow.  

Page 32: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

tl;dr  #2  

•  A  well  balanced  distributed  system  offers  performance  improvements  even  for  graph  processing  problems  that  fit  into  a  single  machine;  

•   running  things  locally  isn't  always  the  best  strategy  

Page 33: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

tl;dr  #3  

•  PageRank  performance  on  GraphX  is  primarily  system  bound.  We  see  a  4x-­‐16x  performance  increase  when  using  Omely  dataflow  on  the  same  hardware,  which  suggests  that  GraphX  (and  other  graph  processing  systems)  leave  an  alarming  amount  of  performance  on  the  table  

Page 34: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

PageRank  in  Rust  

������������� ��������������������������� ���������&� & !�����"������� ������%��%������������������ ��$����" ����!����&���!����$����!��� ��������$�����$����� "��� ! �!��!��������������!�������������� ���� & !�� �����#�����������������"�!�������������������!���!�����

0%�0�/+)"�)�5�!&/�$.""�3&0%��+3"2".��+1.� +!"�&/��2�&(��("��%00, $&0%1� +)#.�*') /%"..5,�$".�*'��(+*$/&!"�&*/0.1 0&+*/��%00, $&0%1� +)#.�*') /%"..5,�$".�*'�(+�)�/0".������)!� ��/+�5+1�)�50.5�&0�#+.�5+1./"(#�

�������"�/"0�+10�0+�1*!"./0�*!�0%"��+00("*" '/�&*���*+*�0.&2&�(� +),10�0&+*�0%�0�3"�1*!"./0�*!�,."005�3"(( ��$"��*'��+��"� ("�.����$"��*'�&/�*+0�/+)"��.&((&�*0� +),10�0&+*���10�&0�&/�)+."�&*0"."/0&*$�0%�*���!&/0.&�10"!������0�&/���$++!�"4�),("�+#� +),10�0&+*/�0%�0�"4 %�*$"�!�0����$$."$�0"�!�0����*!��"*"#&0�#.+)�)�&*0�&*&*$&*!"4"!�!�0��."/&!"*0�&*�)")+.5��0� �*��(/+��"�&),(")"*0"!�&*�!&##"."*0�3�5/���*!�0%1/�%"(,/�0"�/"�+10�%+33"((��+.���!(5����,�.0& 1(�.��,,.+� %�#&0/�3&0%���,�.0& 1(�.�/5/0")

�%&/�3�/��(/+���$++!�+,,+.01*&05�0+�0.5�+10�0&)"(5�!�0�#(+3�&*��1/0��%00,/ $&0%1� +)#.�*') /%"..50&)"(5�!�0�#(+3� ��3%& %�&/��+0%���,+.0�0+��1/0��%00, 333.1/0�(�*$+.$� ��*!��*�"40"*/&+*�+#�0%"�0&)"(5�!�0�#(+3,�.�!&$)�&*���&�!��%00, ."/"�. %)& .+/+#0 +)��&�!� ��&)"(5�!�0�#(+3�&*��1/0�%�!�+*(5��""*�.1*�+*��(�,0+,�/+�#�.��/+�0%&/�3�/���$++!� %�* "�0+�/%�'"�/+)"��1$/�+10

�!!&0&+*�((5��0%"���)�.&!$"��+),10".�����%�/���*"3��)+!"(�!�0�� "*0."���3%& %�&/���)+!".*� (1/0".�"-1&,,"!3&0%�����*"03+.'&*$�'&0���*!�3"�3�*0"!�0+�/""�%+3�#�/0�0%&/��1/0� +!"� �*�$+��/�&0�01.*/�+10��&0�*+3�)+2"/,."005��.&/'(5��+1�((�/""

������ ��$"��*'�&/���*+0�3&(!(5� +),(& �0"!�$.�,%� +),10�0&+* �0%"�&!"��&/�0%�0�"� %�2".0"4�/0�.0/�3&0%�/+)"��)+1*0+#�."�(�2�(1"!��.�*'���3%& %�&0�.","�0"!(5�/%�."/��(+*$�0%"�!&." 0"!�"!$"/�0+�&0/�*"&$%�+./��#�+*"�'"",/�!+&*$0%&/�#+.�(+*$�"*+1$%��0%"�."�(�2�(1"!�.�*'/�/0�.0�0+�/0��&(&6"

�"."�&/���/0.�&$%0#+.3�.!��/".&�(���$"��*'�&),(")"*0�0&+*�&*��1/0

���!���"�����"�!��������"�!���&�"$���#��%#�)�����!���������*��������������������������������������#"����&����������&�"$���#��������������#$���&����������&�"$���#�������������������&����������&�"$���#���

������������������������������'��������"�!�����#���*�����'����� �����+�

������������������������������������$�"�$� �������������*�

��������������������������������������&�"$�'��������&�"$���#��*�������������#"��&�"$�'������!������#$�&�"$�'��������&�"$�'����������������#$�&�"$�'���� �������!������������+�

�������������� �������������������������'�(������"�!�����#���*��#$�(�����#"��'���+�����+�+�

�#�3"�(++'��0�0%"� +!"��0%"� +),10�0&+*�)�*&,1(�0"/�,".�2".0"4�/0�0"��,.",�.&*$ ���������� ��*!� ���������� �

1/&*$� ���������� ����*!�0%"*�/3&*$/�0%.+1$%� ���� ��&* ."�/&*$� ����� ��5� ����� �#+.�"� %�$.�,%�"!$"� �����

�%"�+*(5�,�.0�+#�0%&/� +),10�0&+*�0%�0�)�'"/�,�.�(("(&6�0&+*�!&##& 1(0�&/�1,!�0&*$� ����� ��" �1/"�0%"�"!$"/

)�5�(&*'��*5�,�&./�+#�2".0& "/��3"�!+*�0�%�2"��*����������,�.0&0&+*&*$�+#�."/,+*/&�&(&05�#+.�0%"/"�1,!�0"/��+�3"�((%�2"�0+�!+�0%�0

������������������ �"�3�*0�0+�)�,�0%&/� +),10�0&+*�� .+//�)1(0&,("�3+.'"./��0%."�!/��,.+ "//"/��+.� +),10"./����*!�#+.01*�0"(50%"."��."�/"2".�(�#�&.(5�/&),("�3�5/�0+�!+�0%&/��%"�)+/0� +))+*��,,.+� %�&/�0+�,�.0&0&+*�."/,+*/&�&(&05�#+."� %� ������ �� .+//�0%"�3+.'"./��/+�0%�0�"� %�3+.'".�&/�."/,+*/&�("�#+.���.+1$%(5�"-1�(�*1)�".�+#�2".0& "/

�"�3&((��//&$*�0%"�."/,+*/&�&(&05�#+.�,.+ "//&*$�2".0"4� � �0+�3+.'".� �����������

Page 35: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Impl  #1:  Send  everything  

�$�%�')�)�$#�)����$"%*)�)�$#��"$#��,$' �'(��,���!($�#����)$�%�')�)�$#�)���'�!�+�#)�()�)����#%*)(��#��#)�'"����)����)����(�,�!!�������#%*)�)$�)����$"%*)�)�$#��(��*()���(�)�$������(��($�,��"*()�"� ��(*'��)��)����� ������ �"� �(��)(�,�.�)$�,$' �'� ������� ��� �

�#���,����+����()'��*)����!!�)�������(�)$�)���,$' �'(��)���������# ��$"%*)�)�$#��(��*()���"�))�'�$�'�%��)��!.��%%!.�#��)���%�'�+�')�-�*%��)�(��,����������,$' �'���#��$��#��%�#��#)!.���#��)��#���)�'"�#�#��#���$""*#���)�#��)���*%��)�(�)$�)���'�# (�$�������+�')���(�

�����,$' �'�%'�%�'�(���"�((����$��)����$'"� ���������� ���#����)�#��)����#)�#����*%��)���'�)��'�)��#���'��)!.

�%%!.�#�� �� �)$�)���'�# (�����(��*%��)�(��'��)��#��-���#������),��#�)���,$' �'(��������*%��)�(�)$���+�')�-� �

�'��(�#)�)$�,$' �'� ������� ��� �

�������������������������������� ���(�,�!!����$*'���'()��"%!�"�#)�)�$#��,�����,��,�!!�&*�� !.���(��'���(����#����!�'�$*(!.��#�������#)�� $'��+�'.����� ������ ��)���+�')�-��,��%'�%�'����"�((���� ������ � ��#����)�#���#)�'�()��#�%�'�$'"�#���� �� �$%�'�)�$#��$'

)����������()�#�)�$#� � ������-���#����!!�$��)��(��"�((���(��(�#��#�������)$�)���,$' �'��#����'���$��)��

'���%��#)�+�')�-�

�������*'����!$,��!!*()'�)�(��$,�)��(�,$*!��%'$������#���(�))�#��,�)���$*'�,$' �'(��� �)$�� ���#�),$�%'$��((�(�� ��#��� ��

���(�"��#(�)��)�,��#����)$�(�#��($"����)����'$((�)���#�),$' �������������������������������)�,$*!���')��#!.�'�(*!)��#�($"��(�'�$*(��$""*#���)�$#� ��#�����,$*!��!$$ �%'�)).��$$���$"%�'���)$�����$,�+�'�,��,$*!��$#!.������!��)$��$#�!*���)��)������'!.�#�/+���"%!�"�#)�)�$#��(��$""*#���)�$#��$*#�����)�(�(������,���#��$���))�'�

������������������������������ �� ��������!!�)��)������,$' �'�"�#���(�"*!)�%!��+�')���(������'�����(�"�.�,�!!���+����()�#�)�$#(��#��$""$#��($�����,$' �'���#����*"*!�)��)���"�((���(��$'��������()�#�)�$#���#��(�#���*()�$#��"�((�����$'��������()�#�)��()�#�)�$#������,$' �'���#����*"*!�)��)��(��*%��)�(��#)$�����(��)��!���$'���!�'���(%�'(��+��)$'��%'$%$')�$#�!)$�)���#*"��'�$��+�')���(��#�)����#)�'���'�%�����*)�)��'���(���"*���(�"%!�'�,�.�

�����,$' �'��'$*%(��)(�����(��.��� �������'�)��'�)��#��.�($*'�����.�($��$�#�����,$' �'���#��)�'�)��)�'$*���)(���()�#�)�$#(�����*"*!�)���)(�*%��)�(��'$"������($*'���'�# ���#��)��#��((*��$#��*%��)���$'�)�����()�#�)�$#�

����(�#�!��)�'�������#)�'%'�)�)�$#�$��)��(��$���"���)�!$$ �!� ��)��(

���������������������������������������������� ��������� � ������������������������������������������������������������������� ������������

� �

� �

Page 36: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Impl  #2:  Worker-­‐level  aggregaOon  

����)'�������+#&��&�#" -���'�����'��&#!�*�&(���'� � ��%)� �!#�) #� ��� �������*� )��#�� ���� ��� ��'

��'(�"�(�������"��#!$��(� �� ��-������(#&�#�� ��� ������''�"�� ������� ���� ��"'(����#�� ����� ���#!$�&���(#

(���'#)&����&#)$����$$&#�����(����#����#�'�&�"�#!�����''��"(#���'!� �*��(#&� �� �&�(��&�(��"��� �&���*��(#&

��� ��+�������"���('� #�� �(-�

�������)&���� #+�� )'(&�(�'�(��' ������+#&��&�!�&��'�)$��(�'��#&�(���'�!����'(�"�(�#"���# ��")!��&���"(#��'�"� ��)$��(����'���&�') (��(��'��,�!$ ���#)&�+#&��&�'�()$��"�'�)$��,���"��"��#" -��!�''���'���"'(����#����"�(���"�/*��*�&'�#"�

�#&�#*�&��(��'��$$&#����� '#�$&#�)��'�#)($)('��'��(��#�'��+������ #+'�+#&��&'�(#�#*�& �$��#!!)"���(�#"�+�(��#!!)"���(�#" ���+#&��&���"�'(�&(�(� �"��#(��&�+#&��&'���#)(��('�)$��(�'�&���(��+�-��)'�"��(���"�(+#&���(�(��'�!��(�!���'�(���&�'(�#��(����#!$)(�&�

��� ������������������� � ���������������+#&��&� �*� ����&���(�#"��!$ �!�"(�(�#"�������#*����'�#)&���'(�$�&�#&!�"���!$ �!�"(�(�#"��#&�����"�(+#&����)(��(�'(� �'�"�'�%)�(������(�#����(�� ��#�!#&�����& -��#!$�&�����"�����"�(+#&�'��+����"����&���(�����(�!#&�����&�''�*� -��"��!#*�����&���(�#"�(#�(����������������(#��)&(��&�&��)���(����!#)"(�#����(�(&�"'!�((�������'���#+�*�&���#!�'��(�(����,$�"'��#��!#&���#!$)(�(�#"��"��!#&��'-"��&#"�.�(�#"��'�"��$&#��''� �*� ����&���(�'���&#''�!) (�$ ��+#&��&'�!)'(��+��(�� ���(���&#!������+#&��&�����'�&��)��'�(��#*�& �$$�"��#���#!$)(�(�#"��"���#!!)"���(�#"�(��(�+����"��#���)(�+��'�"�� �''���(��#*�&� �

����"��(������)&��� )'(&�(�'�(��' ���(�&�+����*�����&���(���)$��(�'��(�(���+#&��&� �*� ��+��� '#����&���(�(��!��(�(���$&#��''� �*� ���"�(����,�!$ ���(��'�&��)��'�(���")!��&�#��!�''���'��&#!��(#���

Page 37: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Impl  #3:  Process-­‐level  aggregaOon  

����)'�������+#&��&�#" -���'�����'��&#!�*�&(���'� � ��%)� �!#�) #� ��� �������*� )��#�� ���� ��� ��'

��'(�"�(�������"��#!$��(� �� ��-������(#&�#�� ��� ������''�"�� ������� ���� ��"'(����#�� ����� ���#!$�&���(#

(���'#)&����&#)$����$$&#�����(����#����#�'�&�"�#!�����''��"(#���'!� �*��(#&� �� �&�(��&�(��"��� �&���*��(#&

��� ��+�������"���('� #�� �(-�

�������)&���� #+�� )'(&�(�'�(��' ������+#&��&�!�&��'�)$��(�'��#&�(���'�!����'(�"�(�#"���# ��")!��&���"(#��'�"� ��)$��(����'���&�') (��(��'��,�!$ ���#)&�+#&��&�'�()$��"�'�)$��,���"��"��#" -��!�''���'���"'(����#����"�(���"�/*��*�&'�#"�

�#&�#*�&��(��'��$$&#����� '#�$&#�)��'�#)($)('��'��(��#�'��+������ #+'�+#&��&'�(#�#*�& �$��#!!)"���(�#"�+�(��#!!)"���(�#" ���+#&��&���"�'(�&(�(� �"��#(��&�+#&��&'���#)(��('�)$��(�'�&���(��+�-��)'�"��(���"�(+#&���(�(��'�!��(�!���'�(���&�'(�#��(����#!$)(�&�

��� ������������������� � ���������������+#&��&� �*� ����&���(�#"��!$ �!�"(�(�#"�������#*����'�#)&���'(�$�&�#&!�"���!$ �!�"(�(�#"��#&�����"�(+#&����)(��(�'(� �'�"�'�%)�(������(�#����(�� ��#�!#&�����& -��#!$�&�����"�����"�(+#&�'��+����"����&���(�����(�!#&�����&�''�*� -��"��!#*�����&���(�#"�(#�(����������������(#��)&(��&�&��)���(����!#)"(�#����(�(&�"'!�((�������'���#+�*�&���#!�'��(�(����,$�"'��#��!#&���#!$)(�(�#"��"��!#&��'-"��&#"�.�(�#"��'�"��$&#��''� �*� ����&���(�'���&#''�!) (�$ ��+#&��&'�!)'(��+��(�� ���(���&#!������+#&��&�����'�&��)��'�(��#*�& �$$�"��#���#!$)(�(�#"��"���#!!)"���(�#"�(��(�+����"��#���)(�+��'�"�� �''���(��#*�&� �

����"��(������)&��� )'(&�(�'�(��' ���(�&�+����*�����&���(���)$��(�'��(�(���+#&��&� �*� ��+��� '#����&���(�(��!��(�(���$&#��''� �*� ���"�(����,�!$ ���(��'�&��)��'�(���")!��&�#��!�''���'��&#!��(#���

Page 38: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Some  Baseline  figures  

$*�(380)�437*27.&00<�75<�&2)�1&/*�7-.6�.140*1*27&7.32�*9*2�61&57*5��(877.2,�(31487&7.32�&2)�39*50&44.2,(31182.(&7.32���3:*9*5��&6�:*�:.00�6-3:�.2�4&57�7:3�3+�7-.6�4367��*9*2�:.7-�.2+.2.7*0<�+&67�&,,5*,&7.32��:*(&2237�*;4*(7�&����2*7:35/�73�:.2�387�39*5����

����������*7�6�6**�-3:�:*00�385�.140*1*27&7.326�:35/�

$*�*9&08&7*�7-*�7.1*�73�)3�7:*27<�.7*5&7.326�3+��&,* &2/�32�7-*��&1!&!�(0867*5� �86.2,�!4&5/�5&4-%����&2)�385�7.1*0<�)&7&+03:�.140*1*27&7.32�

$*�86*�7:3�,5&4-6��7-36*�86*)�'<�7-*��5&4-%�4&4*5�-7746�:::�86*2.;�35,6<67*1+.0*6(32+*5*2(*36).��36).���4&4*5�,32=&0*=�4)+� ��&������*),*�,5&4-�3+":.77*5�+3003:*56��-774�&2�/&.67�&(�/575&(*6$$$���-710� �� �������&2)�&� ����*),*�,5&4-�3+�:*'0.2/6��-774�0&:�).�82.1.�.7:*')&7&8/������ ���������������"-*������������,5&4-�.6�&063�7-*�32*86*)�+35�7-*�5*68076�6-3:2�.2�7-*��!���60.)*�)*(/�-7746�7:.77*5�(31/&<3867*5-38767&786��� ��� � ���� �

������ ������� ����� �*+35*�:*�67&57��0*7�6�7-.2/�&'387�*;.67.2,�'&6*0.2*6�7-&7�6-380)�,.9*�631*�(327*;7�

�*03:��:*�6-3:�45*9.3860<�5*4357*)�1*&685*1*276�+531�!4&5/�&2)��5&4-%��&6�:*00�&6�7-*�5827.1*�3+��5&4-%32�385�(0867*5��&2)�7-*�5827.1*�3+�7:3�6.2,0*�7-5*&)*)�.140*1*27&7.326��+531�7-*���!"�4&4*5�-7746�:::�86*2.;�35,(32+*5*2(*-3736��:35/6-34�453,5&145*6*27&7.321(6-*55<� ��

������ ��� �� �� �� ����� � � ���������

!4&5/ �5&4-%�4&4*5�-7746�:::�86*2.;�35,6<67*1+.0*6(32+*5*2(*36).��36).���4&4*5�,32=&0*=�4)+�

��;� ���6 ����6

�5&4-% �5&4-%�4&4*5�-7746�:::�86*2.;�35,6<67*1+.0*6(32+*5*2(*36).��36).���4&4*5�,32=&0*=�4)+�

��;� ���6 ���6

�5&4-% 1*&685*)�32�385�(0867*5 ��;� �6 ��6

!.2,0*7-5*&)�6.140*5�

��!"�4&4*5�-7746�:::�86*2.;�35,(32+*5*2(*-3736��:35/6-34�453,5&145*6*27&7.321(6-*55<�

� 6 ���6

!.2,0*7-5*&)�61&57*5�

��!"�4&4*5�-7746�:::�86*2.;�35,(32+*5*2(*-3736��:35/6-34�453,5&145*6*27&7.321(6-*55<�

� ��6 ���6

":*27<�4&,*5&2/�.7*5&7.326��'&6*0.2*�1*&685*1*276�

!3�+&5��237-.2,�2*:��7-*�1*&685*1*27�32�385�(0867*5�(32+.516�7-&7�7-*�281'*56�+531�7-*��5&4-%�4&4*5�(&2'*�5*453)8(*)� ���35*39*5��7-*�0&4734�4*5+3516�45*77<�:*00�*9*2�7-38,-�.7�6�86.2,�320<�32*���#�(35*��5&7-*57-&2�������2�7-*� ���������� �,5&4-��7-*�6.2,0*�7-5*&)*)�.140*1*27&7.32��������'*&76�7-*�).675.'87*)

Page 39: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

System  

$*�(380)�437*27.&00<�75<�&2)�1&/*�7-.6�.140*1*27&7.32�*9*2�61&57*5��(877.2,�(31487&7.32�&2)�39*50&44.2,(31182.(&7.32���3:*9*5��&6�:*�:.00�6-3:�.2�4&57�7:3�3+�7-.6�4367��*9*2�:.7-�.2+.2.7*0<�+&67�&,,5*,&7.32��:*(&2237�*;4*(7�&����2*7:35/�73�:.2�387�39*5����

����������*7�6�6**�-3:�:*00�385�.140*1*27&7.326�:35/�

$*�*9&08&7*�7-*�7.1*�73�)3�7:*27<�.7*5&7.326�3+��&,* &2/�32�7-*��&1!&!�(0867*5� �86.2,�!4&5/�5&4-%����&2)�385�7.1*0<�)&7&+03:�.140*1*27&7.32�

$*�86*�7:3�,5&4-6��7-36*�86*)�'<�7-*��5&4-%�4&4*5�-7746�:::�86*2.;�35,6<67*1+.0*6(32+*5*2(*36).��36).���4&4*5�,32=&0*=�4)+� ��&������*),*�,5&4-�3+":.77*5�+3003:*56��-774�&2�/&.67�&(�/575&(*6$$$���-710� �� �������&2)�&� ����*),*�,5&4-�3+�:*'0.2/6��-774�0&:�).�82.1.�.7:*')&7&8/������ ���������������"-*������������,5&4-�.6�&063�7-*�32*86*)�+35�7-*�5*68076�6-3:2�.2�7-*��!���60.)*�)*(/�-7746�7:.77*5�(31/&<3867*5-38767&786��� ��� � ���� �

������ ������� ����� �*+35*�:*�67&57��0*7�6�7-.2/�&'387�*;.67.2,�'&6*0.2*6�7-&7�6-380)�,.9*�631*�(327*;7�

�*03:��:*�6-3:�45*9.3860<�5*4357*)�1*&685*1*276�+531�!4&5/�&2)��5&4-%��&6�:*00�&6�7-*�5827.1*�3+��5&4-%32�385�(0867*5��&2)�7-*�5827.1*�3+�7:3�6.2,0*�7-5*&)*)�.140*1*27&7.326��+531�7-*���!"�4&4*5�-7746�:::�86*2.;�35,(32+*5*2(*-3736��:35/6-34�453,5&145*6*27&7.321(6-*55<� ��

������ ��� �� �� �� ����� � � ���������

!4&5/ �5&4-%�4&4*5�-7746�:::�86*2.;�35,6<67*1+.0*6(32+*5*2(*36).��36).���4&4*5�,32=&0*=�4)+�

��;� ���6 ����6

�5&4-% �5&4-%�4&4*5�-7746�:::�86*2.;�35,6<67*1+.0*6(32+*5*2(*36).��36).���4&4*5�,32=&0*=�4)+�

��;� ���6 ���6

�5&4-% 1*&685*)�32�385�(0867*5 ��;� �6 ��6

!.2,0*7-5*&)�6.140*5�

��!"�4&4*5�-7746�:::�86*2.;�35,(32+*5*2(*-3736��:35/6-34�453,5&145*6*27&7.321(6-*55<�

� 6 ���6

!.2,0*7-5*&)�61&57*5�

��!"�4&4*5�-7746�:::�86*2.;�35,(32+*5*2(*-3736��:35/6-34�453,5&145*6*27&7.321(6-*55<�

� ��6 ���6

":*27<�4&,*5&2/�.7*5&7.326��'&6*0.2*�1*&685*1*276�

!3�+&5��237-.2,�2*:��7-*�1*&685*1*27�32�385�(0867*5�(32+.516�7-&7�7-*�281'*56�+531�7-*��5&4-%�4&4*5�(&2'*�5*453)8(*)� ���35*39*5��7-*�0&4734�4*5+3516�45*77<�:*00�*9*2�7-38,-�.7�6�86.2,�320<�32*���#�(35*��5&7-*57-&2�������2�7-*� ���������� �,5&4-��7-*�6.2,0*�7-5*&)*)�.140*1*27&7.32��������'*&76�7-*�).675.'87*)

Page 40: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Timely  dataflow  impl  

*.1-&.&/4"4*0/3�"/%�0/�4)&� ���������� �(2"1)�4)&�3*.1-&�3*/(-&4)2&"%&%�*.1-&.&/4"4*0/�*3�0/-9�;���

3-07&2�4)"/��2"1)!��"4� ��8�-&33�2&3052$&3�����)&�3."24&2�3*/(-&4)2&"%&%�*.1-&.&/4"4*0/�7*4)�"��*-#&2431"$&'*--*/(�$526&�(2"1)�-"9054�"-7"93�#&"43�4)&�%*342*#54&%�3934&.3�#9�#&47&&/�����"/%��8�

�3�4)"4�#"%�/&73�'02�%*342*#54&%�(2"1)�120$&33*/(�(&/&2"--9���&4�3�3&&�

����������� ������������������&4�3�4",&�052�%"4"1"2"--&-�*.1-&.&/4"4*0/�054�'02�"�31*/�� &�--�34"24�7*4)�+534�"�3*/(-&�."$)*/&�"/%�.06&'20.�0/&�$02&�40�.5-4*1-&�$02&3�� &�.&"352&�4)&�404"-�&-"13&%�4*.&��'*234�(2"1)��"/%�4)&�"6&2"(&�1&2*4&2"4*0/4*.&�0'�4)&�-"34�4&/�*4&2"4*0/3��3&$0/%�(2"1)����02�2&'&2&/$&�7&�"-30�3)07�4)&�2&35-43�'02��2"1)!�"/%�4)&3*.1-&�3*/(-&4)2&"%&%�*.1-&.&/4"4*0/3��"3�)02*:0/4"-�#"23��

������ �� �� ����� � � ���������

�*.&-9�%"4"'-07 �����3�� ���3� �����3������3�

�*.&-9�%"4"'-07 � ����3������3� �����3������3�

�*.&-9�%"4"'-07 � ����3���� �3� ����3������3�

�*.&-9�%"4"'-07 � ����3������3� ��� 3������3�

�*.&-9�%"4"'-07 � ����3������3� ���3������3�

�7&/49�1"(&2"/,�*4&2"4*0/3�0/�0/&�."$)*/&�.5-4*1-&�4)2&"%3�

&--�4)*3�*3�(00%��7*4)�0/&�4)2&"%�7&�34*--�1&2'02.�3*.*-"2-9�"3��2"1)!�"4� ���"/%�7&�0541&2'02.�4)&�3*.1-&3*/(-&4)2&"%&%�.&"352&.&/4�7*4)�+534�470�4)2&"%3 �"/%�7&�0541&2'02.�4)&�3."24�3*/(-&4)2&"%&%.&"352&.&/4�7*4)�&*()4�4)2&"%3�

������������������������������&4�3�/07�3&&�7)"4�)"11&/3�7)&/�7&�%*342*#54&�4)&�$0.154"4*0/�06&2�.5-4*1-&�$0.154&23���&2&�7&�)"6&�4)&$)0*$&�0'�53*/(�&*4)&2�"� ��/&4702,�*/4&2'"$&�02�"� ���/&4702,�*/4&2'"$&��7&�7*--�.&"352&�#04)�2&6&"-*/(�4)&1&2'02."/$&�("*/3�4)"4� ���#2*/(3��*'�"/9��

�/�"%%*4*0/�40�4)&�702,&2-&6&-�"((2&("4*0/�*.1-&.&/4"4*0/�'20.�"#06&�7&�"-30�*/$-5%&�.&"352&.&/43�'02120$&33-&6&-�"((2&("4*0/�0/�"� ��/&4702,��-"#&--&%�� �������)*3�*3�.02&�2&12&3&/4"4*6&�0'�"� �014*.*:&%*.1-&.&/4"4*0/�"/%�."4$)&3�7)"4��2"1)!�%0&3�

�02�&"$)�$0/'*(52"4*0/�7&�"("*/�2&1024�4)&�&-"13&%�4*.&�40�1&2'02.�47&/49�*4&2"4*0/3��'*234�(2"1)��"/%�4)&"6&2"(&�0'�4)&�'*/"-�4&/�*4&2"4*0/3��3&$0/%�(2"1)����&$"53&��2"1)!�"/%�052�*.1-&.&/4"4*0/�)"6&�%*''&2&/4�0/&0''�34"2451�$0343�4)&�"6&2"(&�*4&2"4*0/�4*.&�7)&/�4)&�$0.154"4*0/�*3�25//*/(�*3�120#"#-9�4)&�'"*2&34�.&42*$�'02$0.1"2*30/�

Page 41: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Now  you  can  have  mulOple  …  

 

System cores 1G 1G+ 10G 10G  speedup  over

1G+

total per-­iteration

Timelydataflow

1x8 107.6s(3.70s)

107.6s(3.70s)

107.6s(3.70s)

– –

Timelydataflow

2x8 115.2s(4.66s)

89.0s(3.51s)

65.6s(2.34s)

1.36x 1.50x

Timelydataflow

4x8 149.4s(6.77s)

80.9s(3.33s)

40.6s(1.49s)

1.99x 2.23x

Timelydataflow

8x8 145.4s(6.60s)

66.5s(2.86s)

27.6s(1.05s)

2.41x 2.72x

Timelydataflow

16x8 169.3s(7.51s)

51.8s(2.30s)

19.3s(0.75s)

2.68x 3.07x

GraphX 16x8 354.8s(13.4s)

333.7s(12.2s)

1.06x 1.10x

Elapsed  and  (per-­iteration)  times  for  twenty  PageRank  iterations  on  multiplemachines  using  the  twitter_rv  graph,  comparing  1G  and  10G  networks

System cores 1G 1G+ 10G 10G  speedup  over

1G+

total per-­iteration

Timelydataflow

1x8 137.1s(3.29s)

137.1s(3.29s)

137.1s(3.29s)

– –

Timelydataflow

2x8 173.3s(6.82s)

135.8s(4.82s)

80.7s  (2.31s) 1.68x 2.09x

Timelydataflow

4x8 231.9s(9.06s)

119.1s(4.67s)

51.4s  (1.54s) 2.32x 3.03x

Timelydataflow

8x8 196.4s(8.87s)

80.1s(3.18s)

34.1s  (1.07s) 2.35x 2.97x

Timelydataflow

16x8 231.2(10.25s)

53.9s(2.13s)

23.7s  (0.76s) 2.27x 2.80x

GraphX 8x8 666.8s(14.40s)

682.6s(15.00s)

0.98x 0.96x

GraphX 16x8 361.8s(9.30s)

357.9s(8.30s)

1.01x 1.12x

Elapsed  and  (per-­iteration)  times  for  twenty  PageRank  iterations  on  multiplemachines  using  the  uk_2007_05  graph,  comparing  1G  and  10G  networks

Phew,  that's  a  lot  of  data!  There  are  a  few  important  observations  that  we  can  draw  from  them,  though:

1.   Making  the  network  faster  does  not  improve  GraphX's  performance  much  (at  most  10-­12%),  which  confirmsthe  observations  of  the  NSDI  paper.

2.   Making  the  network  faster  does  improve  timely  dataflow's  performance  (by  2-­3x),  which  limits  the  generality

Page 42: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Conclusions  1  

•  As  we  have  seen,  the  three  implementaOons  (GraphX  and  the  two  Omely  dataflow  ones)  have  different  bo1leneck  resources.    

•  GraphX  does  more  compute  and  is  CPU-­‐bound  even  on  the  1G  network,  whereas  the  leaner  Omely  dataflow  implementaOons  become  CPU-­‐bound  only  on  the  10G  network.  

•  Drawing  conclusions  about  the  scalability  or  limitaOons  of  either  system  based  on  the  performance  of  the  other  is  likely  misguided.  

Page 43: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Conclusions  2  

•  Fast  10G  networks  do  help  reduce  reduce  the  runOme  of  parallel  computaOons  by  significantly  more  than  2-­‐10%:  we've  seen  speedups  up  to  3x  going  from  1G  to  10G.    

•  However,  the  structure  of  the  computaOon  and  the  implementaOon  of  the  data  processing  system  must  be  suited  to  fast  networks,  and  different  strategies  are  appropriate  for  1G  and  10G  networks.  

•   For  the  la"er,  being  less  clever  and  communicaOng  more  someOmes  actually  helps.  

Page 44: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Conclusions  3  

•  Distributed  data  processing  makes  sense  even  for  graph  computaOons  where  the  graph  fits  into  one  machine.    

•  When  computaOon  and  communicaOon  are  overlapped  sufficiently,  using  mul7ple  machines  yields  speedups  up  to  5x  (e.g.,  on  twi"er_rv,  1x8  vs.  16x8).  Running  everything  locally  isn't  necessarily  faster.  

Page 45: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Conclusions  4  

•  Can  make  PageRank  run  16x  faster  per  itera7on  using  distributed  7mely  dataflow  than  using  GraphX  (from  12.2s  to  0.75s  per  iteraOon).    

•  This  tells  us  something  about  how  much  scope  for  improvement  there  is  even  over  numbers  currently  considered  state-­‐of-­‐the-­‐art  in  research!  

Page 46: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Annex  2  -­‐  Systems  (th)at  Scale  –  reducing  latency  in  data  center  

network  

Jon  Crowcrou,    h"p://www.cl.cam.ac.uk/~jac22  

 

Page 47: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Cloud,  Data  Center,  Networks  

1.  New  Cloud  OS  to  meet  new  workloads  –  Includes  programming  language  – Collabs  incl  REMS  (w/  P.Gardner/Imperial)  

2.  New  Data  Center  structure  –  Includes  heterogeneous  h/w  – Collabs  incl  NaaS(Peter  Pietzuch  Imperial)  – Trilogy  (Mark  Handley  et  al  UCL)  

3.  New  Networks  (for  data  centers&)  – To  deal  with  aboveJ  

Page 48: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

What  not  talking  about  

•  Security    –  (we  do  that  –  had  another  workshop)  

•  Data  – Hope  Ed  folks  will!  

•  Scaling  Apps  – Oxford  

•  Languages  for  Apps  – Ed++  

Page 49: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

1.  Cloud  OS  

•  Unikernels  (Mirage,  SEL4,  ClickOS)  

Docker

User Processes

Filesystem

Network Stack

Kernel Threads

Language Runtime

Application Binary

Configuration Files

Oper

ating

Syst

em

User Processes

Filesystem

Network Stack

Kernel Threads

Language Runtime

Application Binary

Configuration Files

Oper

ating

Syst

em

Hypervisor

Hardware

Docker Container

(a) Containers, e.g., Docker.

Drawbridge

ntoskrnl Device Drivers

User Processes

IO Stack

Device Drivers

Kernel Threads

Language Runtime

Application Binary

Configuration Files

Win

dow

s 7 O

SPlatform Adaptation

Layer

Kernel Threads

IO Stack

Security Monitor

Library OS

Application Binary

Configuration Files

ntoskrnl

Hardware

Hos

t O

S

Hardware

Picoprocess

(b) Picoprocesses, e.g., Drawbridge.

Mirage

User Processes

Filesystem

Xen

Network Stack

Kernel Threads

Language Runtime

Mirage Runtime

Xen

Application Code

Application Binary

Configuration Files

ARM Hardware ARM Hardware

Oper

ating

Syst

em Mirage Unikernel

(c) Unikernels, e.g., MirageOS.

Figure 2: Contrasting approaches to application containment.

The Xen 4.4 release added support for recent ARMarchitectures, specifically ARM v7-A and ARM v8-A.These include extensions that let a hypervisor managehardware virtualized guests without the complexity offull paravirtualization. The Xen/ARM port is markedlysimpler than x86 as it can avoid a range of legacy re-quirements: e.g., x86 VMs require qemu device emu-lation, which adds considerably to the trusted comput-ing base [7]. Simultaneously, Xen/ARM is able to sharea great deal of the mature Xen toolstack with Xen/x86,including the mechanics for specifying security policiesand VM configurations.

Jitsu can thus target both Xen/ARM and Xen/x86, re-sulting in a consistent interface that spans a range of de-ployment environments, from conventional x86 serverhosting environments to the more resource-constrainedembedded environments with which we are particularlyconcerned, where ARM CPUs are commonplace.

2.3 Xen/ARM UnikernelsBringing up MirageOS unikernels on ARM required de-tailed work mapping the libOS model onto the ARM ar-chitecture. We now describe booting MirageOS uniker-nels on ARM, their memory management requirements,and device virtualization support.

Xen Boot Library. The first generation of uniker-nels such as MirageOS [26, 25] (OCaml), HaLVM [11](Haskell) and the GuestVM [32] (Java) were constructedby forking Mini-OS, a tiny Xen library kernel that ini-tialises the CPU, displays console messages and allocatesmemory pages [39]. Over the years, Mini-OS has beendirectly incorporated into many other custom Xen oper-ating systems, has had semi-POSIX compatibility boltedon, and has become part of the trusted computing basefor some distributions [7]. This copying of code becomesa maintenance burden when integrating new features thatget added to Mini-OS. Before porting to ARM, we there-fore rearranged Mini-OS to be installed as a system li-

brary, suitable for static linking by any unikernel.4 Func-tionality not required for booting was extracted into sep-arate libraries, e.g., libm functionality is now providedby OpenLibM (which originates from FreeBSD’s libm).

An important consequence of this is that a libc isno longer required for the core of MirageOS: all libcfunctionality is subsumed by pure OCaml libraries in-cluding networking, storage and unicode handling, withthe exception of the rarely used floating point formattingcode used by printf, for which we extracted code fromthe musl libc. Removing this functionality does notjust benefit codesize: these embedded libraries are bothsecurity-critical (they run in the same address space asthe type-safe unikernel code) and difficult to audit (theytarget a wide range of esoteric hardware platforms andthus require careful configuration of many compile-timeoptions). Our refactoring thus significantly reduced thesize of a unikernel’s trusted computing base as well asimproving portability.

Fast Booting on ARM. We then ported Mini-OS toboot against the new Xen ARM ABI. This domain build-ing process is critical to reducing system latency, sowe describe it here briefly. Xen/ARM kernels use theLinux zImage format to boot into a contiguous mem-ory area. The Xen domain builder allocates a fresh vir-tual machine descriptor, assigns RAM to it and loadsthe kernel at the offset 0x8000 (32KB). Execution be-gins with the r2 register pointing to a Flattened DeviceTree (FDT). This is a similar key/value store to the onesupplied by native ARM bootloaders and provides a uni-fied tree for all further aspects of VM configuration. TheFDT approach is much simpler than x86 booting, wherethe demands of supporting multiple modes (paravirtual,hardware-assisted and hybrids) result in configuration in-formation being spread across virtualized BIOS, memoryand Xen-specific interfaces.

4Our Mini-OS changes have been released back to Xen and are be-ing integrated in the upstream distribution that will become Xen 4.6.

3

Page 50: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Unikernels  in  OCaml  

•  But  also  Go,  Scala,  Rust  etc  •  Type  safety-­‐>security,  reliability  •  Apps  can  be  legacy  or  in  same  languages  

Linux Kernel

UnikernelsUnikernels

Xen

ARM Hardware

Linux Kernel

Jitsu Toolstack

XenStore

Unikernels Legacy VMs

inco

min

g tr

affic

domain 0

outgoing traffic

shared memory transport

Figure 1: Jitsu architecture: external network connec-tivity is handled solely by memory-safe unikernels con-nected to general purpose VMs via shared memory.

2 Embedded UnikernelsBuilding software for embedded systems is typicallymore complex than for standard platforms. Embeddedsystems are often power-constrained, impose soft real-time constraints, and are designed around a monolithicfirmware model that forces whole system upgrades ratherthan upgrade of constituent packages. To date, general-purpose hypervisors have not been able to meet these re-quirements, though microkernels have made inroads [9].

Several approaches to providing application isolationhave received attention recently. As each provides dif-ferent trade-offs between security and resource usage,we discuss them in turn (§2.1), motivating our choice ofunikernels as our unit of deployment. We then outline thenew Xen/ARM port that uses the latest ARM v7-A vir-tualization instructions (§2.2) and provide details of ourimplementation of a single-address space ARM uniker-nel using this new ABI (§2.3).

2.1 Application ContainmentStrong isolation of multi-tenant applications is a require-ment to support the distribution of application and sys-tem code. This requires both isolation at runtime as wellas compact, lightweight distribution of code and associ-ated state for booting. We next describe the spectrum ofapproaches meeting these goals, depicted in Figure 2.

OS Containers (Figure 2a). FreeBSD Jails [19] andLinux containers [38] both provide a lightweight mecha-nism to separate applications and their associated kernelpolicies. This is enforced via kernel support for isolatednamespaces for files, processes, user accounts and otherglobal configuration. Containers put the entire mono-lithic kernel in the trusted computing base, while stillpreventing applications from using certain functionality.Even the popular Docker container manager does not yetsupport isolation of root processes from each other.1

1https://docs.docker.com/articles/security/

Both the total number and ongoing high rate of dis-covery of vulnerabilities indicate that stronger isolationis highly desirable (see Table 2). An effective way toachieve this is to build applications using a library op-erating system (libOS) [10, 24] to run over the smallertrusted computing base of a simple hypervisor. This hasbeen explored in two modern strands of work.

Picoprocesses (Figure 2b). Drawbridge [34] demon-strated that the libOS approach can scale to runningWindows applications with relatively low overhead (just16MB of working set memory). Each application runsin its own picoprocess on top of a hypervisor, and thistechnique has since been extended to running POSIX ap-plications as well [15]. Embassies [22] refactors the webclient around this model such that untrusted applicationscan run on the user’s computer in low-level native codecontainers that communicate externally via the network.

Unikernels (Figure 2c). Even more specialised appli-cations can be built by leveraging modern programminglanguages to build unikernels [25]. Single-pass compi-lation of application logic, configuration files and devicedrivers results in output of a single-address-space VMwhere the standard compiler toolchain has eliminated un-necessary features. This approach is most beneficial forsingle-purpose appliances as opposed to more complexmulti-tenant services (§5).

Unikernel frameworks are gaining traction for manydomain-specific tasks including virtualizing networkfunctions [29], eliminating I/O overheads [20], build-ing distributed systems [6] and providing a minimal trustbase to secure existing systems [11, 7]. In Jitsu we usethe open-source MirageOS2 written in OCaml, a stati-cally type-safe language that has a low resource footprintand good native code compilers for both x86 and ARM.A particular advantage of using MirageOS when work-ing with Xen is that all the toolstack libraries involvedare written entirely in OCaml [36], making it easier tosafely manage the flow of data through the system and toeliminate code that would otherwise add overhead [18].

2.2 ARM Hardware VirtualizationXen is a widely deployed type-1 hypervisor that isolatesmultiple VMs that share hardware resources. It was orig-inally developed for x86 processors [2], on which it nowprovides three execution modes for VMs: paravirtualiza-tion (PV), where the guest operating system source is di-rectly modified; hardware emulation (HVM), where spe-cialised virtualization instructions and paging featuresavailable in modern x86 CPUs obviate the need to mod-ify guest OS source code; and a hybrid model (PVH) thatenables paravirtualized guests to use these newer hard-ware features for performance.3

2http://www.openmirage.org

3See Belay et al [4] for an introduction to the newer VT-x features.

2

Page 51: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Data  Centers  don’t  just  go  fast  

•  They  need  to  serve  applicaOons  1.  Latency,  not  just  throughput  2.  Face  users    

1.  Web,  video,  ultrafast  trade/gamers  2.  Face  AnalyOcs…  

3.  Availability  &  Failure  Detectors  4.  ApplicaOon  code  within  network  5.  NIC  on  host  or  switch  –  viz  

Page 52: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Industry  (see  pmJ  )  

Azure  h"p://conferences.sigcomm.org/sigcomm/2015/pdf/papers/keynote.pdf  Facebook:  h"p://conferences.sigcomm.org/sigcomm/2015/pdf/papers/p123.pdf  Google:  h"p://conferences.sigcomm.org/sigcomm/2015/pdf/papers/p183.pdf    

Page 53: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

2.  DeterminisOc  latency  bounding  

•  Learned  what  I  was  teaching  wrong!  •  I  used  to  say:  

–  Integrated  Service  too  complex  •  Admission&scheduling  hard  

– Priority  Queue  can’t  do  it    •  PGPS  computaOon  for  latency?  

•  I  present  Qjump  scheme,  which  – Uses  intserv  (PGPS)  style  admission  ctl  – Uses  priority  queues  for  service  levels  – h"p://www.cl.cam.ac.uk/research/srg/netos/qjump/  

Page 54: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Data  Center  Latency  Problem  

•  Tail  of  the  distribuOon,    – due  to  long/bursty  flows  interfering  

•  Need  to  separate  classes  of  flow  – Low  latency  are  usually  short  flows  (or  RPCs)  – Bulk  transfers  aren’t  so  latency/ji"er  sensiOv  

Page 55: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Data  Center  Qjump  SoluOon  

–  In  Data  Center,  not  general  Internet!  •  can  exploit  topology  &  •  traffic  matrix  &    •  source  behaviour  knowledge  

– Regular,  and  simpler  topology  key  – But  also  largely  “cooperaOve”  world…  

Page 56: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Hadoop  perturbs  Ome  synch  

Page 57: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Hadoop  perturbs  memcached  

Page 58: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Hadoop  perturbs  Naiad  

Page 59: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Qjump  –  two  pieces  

1.  At  network  config  Ome  –  Compute  a  set  of  (8*)  rates  based  on  –  Traffic  matric  &  hops  =>  fan  in  (f)    

2.  At  run  Ome  –  Flow  assigns  itself  a  priority/rate  class  –  subject  it  to  (per  hypervisor)  rate  limit  

*  8  arbitrary  –  but  ouen  h/w  supportedJ  

Page 60: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Memcached  latency  redux  w/  QJ  

Page 61: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

QJ  naiad  barrier  synch  latency  redux  

Page 62: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Web  search  FCT100Kb    ave  

Page 63: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Big  Picture  Comparison    –  Related  work…  

Page 64: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Failure  Detectors  

•  2PC  &  CAP  theorem  •  Recall  CAP  (Brewer’s  Hypothesis)  

– Consistency,  Availability,  ParOOons  – Strong&  weak  versions!  –  If  have  net&node  determinisOc  failure  detector,  isn’t  necessarily  so!  

•  What  can  we  use  CAP-­‐able  system  for?  

Page 65: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

2b  2PC  throughput  with  and  without  QJump  

Page 66: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Consistent,  parOOon  tolerant  app?  

•  Souware  Defined  Net  update!  – Distributed  controllers  have  distributed  rules  – Rules  change  from  Ome  to  Ome  – Need  to  update,  consistently  – Need  update  to  work  in  presence  of  parOOons    

•  By  definiOon!  – So  Qjump  may  let  us  do  this  too!  

Page 67: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

3.  ApplicaOon  code  -­‐>  Network  

•  Last  piece  of  data  center  working  for  applicaOon  

•  Switch  and  Host  NICs  have  a  lot  of  smarts  – Network  processors,    –  like  GPUs  or  (net)FPGAs  – Can  they  help  applicaOons?  –  In  parOcular,  avoid  pathological  traffic  pa"erns  (e.g.  TCP  incast)  

Page 68: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

ApplicaOon  code  

•  E.g.  shuffle  phase  in  map/reduce  – Does  a  bunch  of  aggregaOon  –  (min,  max,  ave)  on  a  row  of  results  – And  is  cause  of  traffic  “implosion”  – So  do  work  in  stages  in  the  switches  in  the  net  (like  merge  sort!)  

•  Code  very  simple  •  Cross-­‐compile  into  switch  NIC  cpus  

Page 69: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Other  applicaOon  examples  

•  Are  many  …  •  Arose  in  AcOve  Network  research  

– Transcoding  – EncrypOon  – Compression  –  Index/Search  

•  Etc  etc  

Page 70: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Need  language  to  express  these  

•  Finite  iteraOon    •  (not  Turing-­‐complete  language)  •  So  design  python–  with  strong  types!  •  Work  in  progress  in  NaaS  project  at  Imperial  and  Cambridge…  

Page 71: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Tiny Terabit Datacentre An End-Host Networked-Server Architecture

ü  High Performance ü  Resource Isolation ü  Flexible Implementation

ü  Predictable Latency ü  Low Latency Interconnect ü  Affordable

Page 72: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

NITRO"

§

72

Networks, Interfaces and Transports!for Rack-Scale Operating Systems!

Page 73: Nottingham/Horizon 27th January 2016 Cybersecurity and network measurement problematic ...jac22/talks/notts-ethical... · 2016-01-20 · Part11–The1Internetis1Big1 • Notjustsize1butcomplexity1

Conclusions/Discussion  

•  Data  Center  is  a  special  case!  •  Its  important  enough  to  tackle  

– We  can  hard  bound  latency  easily  – We  can  detect  failures  and  therefore  solve  some  nice  distributed  consensus  problems  

– We  can  opOmise  applicaOons  pathological  traffic  pa"erns    

–  Integrate  programming  of  net&hosts  – Weird  new  h/w…  

•  Plenty  more  to  do…