guarded’modules:’adap/vely’ extending’the’vmm’s’privileges...

43
Guarded Modules: Adap/vely Extending the VMM’s Privileges Into the Guest Kyle C. Hale Peter Dinda Department of Electrical Engineering and Computer Science Northwestern University hIp://halek.co hIp://presciencelab.org hIp://v3vee.org hIp://xstack.sandia.gov/hobbes

Upload: others

Post on 25-Jan-2021

6 views

Category:

Documents


0 download

TRANSCRIPT

  • Guarded  Modules:  Adap/vely  Extending  the  VMM’s  Privileges  Into  

    the  Guest    

    Kyle  C.  Hale  Peter  Dinda  

    Department  of  Electrical  Engineering  and  Computer  Science  Northwestern  University  

    hIp://halek.co  hIp://presciencelab.org  hIp://v3vee.org  hIp://xstack.sandia.gov/hobbes  

  • Redefining  the  boundaries  between  VMM  and  guest  OS  

    2  

    VMM

    Guest OS

  • Redefining  the  boundaries  between  VMM  and  guest  OS  

    3  

    Guest OS

    VMM

    VMM VMM

  • We  want  to  evolve  the  VMM-‐guest  rela/onship  

     …where  the  interface  between  the  two  is  more  flexible    …and  where  parts  of  the  VMM  may  actually  live  inside  the  guest    The  laIer  is  the  focus  of  this  work        

    4  

  • 5  

    Guest-context virtual service

    VMM

    Guest

    Virtual service

    GEARS*      

       

    *  K.  Hale,  P.  Dinda.  Shi$ing  Gears  to  Enable  Guest-‐context  Virtual  Services,  ICAC  2012  

  • 6  

    VMM

    Guest

    Virtual service

    privileged operations

  • 7  

    VMM

    Guest

    Virtual service

    Guarded Module

    can still interact with guest

  • How  can  we  isolate  and  protect  pieces  of  code  in  a  guest  OS  that  run  at  higher  privilege  than  the  rest  of  the  guest?    How  can  we  allow  legacy  code  con/nue  to  use  guest  func/onality?  

    8  

    We  show  how  with  two  examples          

  • •  OS-‐independent,  embeddable  VMM  •  Support  for  mul/ple  host  OSes  (Linux,  KiIen  LWK)  •  Open  source,  available  at    

    hIp://v3vee.org/palacios  

    9  

    Palacios  VMM  

  • Outline  

    •  Mo/va/on  •  Christoiza/on  •  Threat  Model  and  Run/me  Invariants  •  Run/me  System  and  Border  Crossings  •  Examples  – Selec/vely  Privileged  PCI  Passthrough    – Selec/vely  Privileged  MONITOR/MWAIT  

    •  Conclusions  

    10  

  • Guarded  Module  Transforma/on  

    11  

    Linux Kernel Module

    Guarded Module

    Christoization

    guarded module compiler

  • Christoiza/on  

    12  

    Compile-‐/me  and  link-‐/me  wrapping    we  use  gcc  toolchain  to  instrument  (wrap)  all  calls  out  of  and  into  the  module  

    Christo  and  Jeanne-‐Claude  wrap  the  Reichstag,  Berlin,  1995  

  • The  guest  is  not  to  be  trusted  

    We  assume  a  threat  model  in  which  a  malicious  kernel  wants  to  hijack  a  service’s  privilege      Execu/on  paths  entering  and  leaving  the  guarded  module  must  be  checked  

    13  

  • We  maintain  control  flow  integrity  

    Christoiza/on  allows  VMM  to  trap  all  entries  into  and  exits  from  module          VMM  validates  the  environment  for  unauthorized  changes  to  execu/on  path  (e.g.  return  oriented  aIacks)  

    14  

    VMM

    Guest

    Guarded Module

    printk()  

    record  environment  

    = privileged operation

  • We  maintain  control  flow  integrity  

    15  

    VMM

    Guest

    Guarded Module return  

    Environment  ok?  Entry  point  valid?  

    = privileged operation

  • …and  code  integrity  

    16  

    VMM

    Guest

    Guarded Module write  to  module  code  

    = privileged operation

  • …and  data  integrity  

    17  

    VMM

    Guest

    Guarded Module

    write  to  module  data  

    private  module  data  

    = privileged operation

  • What  we  don’t  provide  

    Parameter  checking    Module  cloaking    Currently  no  support  for  interac/ons  between  guarded  modules      

    18  

  • Programmer’s  perspec/ve  

    1.  Write  a  Linux  kernel  module  (or  use  an  exis/ng  one)  

    2.  Run  it  through  our  guarded  module  compiler  (christoiza/on)  

    3.  Op/onal:  verify  iden/fied  module  entry  points  

    4.  Pass  to  administrator,  who  registers  guarded  module  with  VMM  at  run/me  

    19  

  • RECAP:  Guarded  Modules  are  guest  context  virtual  services  

     …that  can  have  elevated  privilege,  are  protected  from  the  untrusted  guest  that  they  run  in,    yet  can  s/ll  use  its  func/onality    The  implementa/on  is  small:  ~220  lines  of  Perl,  ~260  lines  of  Ruby,  and  ~1000  lines  of  C    (includes  both  the  GM  compila/on  toolchain  and  run/me  system)    available  online  at  hIp://v3vee.org/palacios  

    20  

  • Run/me:  module  entries/exits  trap  to  the  VMM  

     We  call  these  trapped  events  border  crossings                

    21  

    Border-‐out  call  

    Border-‐in  ret  

    Border-‐out  ret  

    Border-‐in  call  

    Privileged  

    Unprivileged  

    Illegal  ac/vity  

  • Wrapper  stubs    exit_wrapped:      popq  %r11      pushq  %rax      movq  $border_out_call,  %rax      vmmcall      popq  %rax      callq  exit      pushq  %rax      movq  $border_in_ret,  %rax      vmmcall      popq  %rax      pushq  %r11      ret  (to  into  guarded  module)    

     22  

    Trap  to  VMM,    record  environment  

    Trap  to  VMM,    Check  integrity  of  environment    

  • 23  

    Guest KernelGuest Kernel

    Guarded ModuleB d t

    Border

    ngs

    Border out

    Border inde

    rCrossin

    VMMPrivilegedPrivileged

    Bor

    gHardwareAccess

    gVMMAccess

    Border ControlState Machine

    Hardware

  • Typical  Border  Crossing  

    24  

    guarded module

    guest

    printk()  

    module_init()  

    return  

    = privileged operation

  • Nested  Border  Crossings  

    25  

    guarded module

    guest

    foo()  bar()  

    callback()  return  

    return  return  

    = privileged operation

  • Border  Crossing  from  External  Event  

    26  

    guarded module

    guest

    interrupt  iret  

    = privileged operation

  • Experimental  Setup  

    •  Dell  PowerEdge  R415  – 2  sockets,  4  cores  each  =>  8  total  cores  – 2.2  GHz  AMD  Opteron  4122  – 16  GB  memory  – Host  kernel:  Fedora  15  with  Linux  2.6.38  – Guest:  single  vcore  with  Busybox  environment,  Linux  2.6.38  

    27  

  • System-‐independent  overhead  is  low  

    28  

    0  

    1000  

    2000  

    3000  

    4000  

    5000  

    6000  

    7000  

    8000  

    9000  

    Border-‐out  Call   Border-‐in  Ret   Border-‐in  Call   Border-‐out  Ret  

    Cycles  

    misc_exit_handle  

    entry  lookup  

    misc_hcall  

    lower/raise  

    Cost of any VM exit

  • …but  ensuring  control-‐flow  integrity  is  expensive  

    29  0  

    1000  

    2000  

    3000  

    4000  

    5000  

    6000  

    7000  

    8000  

    9000  

    10000  

    Border-‐out  Call   Border-‐in  Ret   Border-‐in  Call   Border-‐out  Ret  

    Cycles  

    entry_lookup  

    misc_exit_handle  

    misc_hcall  

    check  

    priv  lower/raise  

    Stack checking

  • Outline  

    •  Mo/va/on  •  Christoiza/on  •  Threat  Model  and  Run/me  Invariants  •  Run/me  System  and  Border  Crossings  •  Examples  – Selec/vely  Privileged  PCI  Passthrough    – Selec/vely  Privileged  MONITOR/MWAIT  

    •  Related  Work  and  Conclusions  

    30  

  • We  transformed  a  NIC  driver  into  a  guarded  module  

    31  

    guest VMM

    guarded module

    Broadcom  BCM5716  1Gb/s  

    no  manual  modifica/ons  to  NIC  driver!  

  • SelecFvely  expose  the  PCI  BARs  to  the  guest  

    32  

    guarded  module  2  unprivileged  mode  

    PCI  I/O  and  memory  BARs    

    guarded  module  1  privileged  mode  

    r   r  w   w  0000..  val  

    Hardware  NIC    

  • Bandwidth  drops,  but  border  crossing  count  is  very  high!  

    33  

    Packet  Sends  

    Border-‐in   1.06  

    Border-‐out   1.06  

    Border  Crossings  /  Packet  Send   2.12  

    Packet  Receives  

    Border-‐in   4.64  

    Border-‐out   4.64  

    Border  Crossings  /  Packet  Receive   9.28  

    Each  border  crossing  is  ~16,000  cycles  (7.3  μs)      

    Many of these are leaf functions!

  • We  implemented  an  adap/ve  idle  loop  with  selec/ve  privilege  

     MONITOR/MWAIT  instruc/ons  allow  a  CPU  to  go  into  a  low-‐power  state  un/l  a  write  occurs  to  a  region  of  memory    MONITOR  [addr]  …  MWAIT    

    34  

  • VMs  can’t  typically  use  these  instruc/ons  

     Puts  physical  core  to  sleep.  Other  VMs/processes  on  that  core  will  starve    But  if  the  guest  knows  how  many  guests  are  on  the  machine  (VMM  state),  we  can  let  it  run  these  instruc/ons  when  idling    Can’t  let  untrusted  parts  of  guest  hijack  this  capability!  

    35  

  • Adap/ve  mwait_idle()  as  a  guarded  module  

    36  

    guest

    guarded module

    idle()  

    Other  vcores  present?  

    no  

    mwait_idle()  

  • 37  

    pcore  0  

    zzz  

    vcore  0  

    mwait_idle()  

    idle  loop  redirec/on  stub  (guarded  module)  

    pcore  1  

    zzz  

    vcore  0  

    vcore  1  

    default_idle()  

    Scenario 1: a single vcore

    Scenario 2: vcores sharing a physical core

  • Guarded  Modules  as  adapFve,  guest-‐context  virtual  services  

     We’re  leveraging  VMM  global  informa/on  about    the  environment    That  informa/on  is  only  exposed  to  the  guarded  module—this  presents  a  new  way  to  adapt  VMs  at  run/me  

    38  

  • Related  Work  

    Nooks  –  isolate  faulty  code  in  kernel  modules  with  wrappers.  Kernel  requires  modifica/on,  protec/ng  guest  from  modules    [Swit,  SOSP  ‘03]  LXFI,  SecVisor  –  protect  kernel  against  aIack  with  VMM-‐authorized  code      [Mao,  SOSP  ’11],  [Seshadri,  SOSP  ‘07]  

    SIM  –  guest-‐resident  VMM  code,  but  special-‐purpose,  uses  completely  separate  address  space  [Sharif,  CCS  ‘09]  

     39  

  • Conclusions  

    We’ve  shown  the  feasibility  of  adapFvely  extending  the  VMM  into  the  guest  with  guarded  modules    General  technique  to  automa/cally  transform  kernel  modules  into  guarded  modules    Two  proof-‐of-‐concept  examples:    -‐  Selec/ve  Privilege  for  Commodity  NIC  driver    -‐  Selec/ve  Privilege  for  MONITOR/MWAIT  

    40  

  • Future  Work  

    Feasibility  of  automa/cally  inlining  leaf  func/ons  into  modules  (making  them  more  self-‐contained)    Further  mo/va/ng  examples  for  guarded  modules    Generaliza/on  of  guarded  modules—modules  with  VMM-‐controlled,  specialized  execu/on  modes        

    41  

  • We  are  rethinking  system  sotware  interfaces  

    This  talk  focused  on  virtualiza/on,  but  we’re  thinking  bigger  (HW/VMM/OS/app)    It’s  /me  to  reconsider  the  structure  of  our  system  sotware  stacks    We  can  adapt  sotware  services,  but  can  we  adapt  their  organiza/on/structure?    

     

    42  

  • For  more  info:    Kyle  C.  Hale  hIp://halek.co  hIp://presciencelab.org  hIp://v3vee.org  hIp://xstack.sandia.gov/hobbes