Lightweight Monitoring of the Progress of Remotely Executing Computations


Post on 22-Jan-2016


  • Lightweight Monitoring of the Progress of Remotely Executing Computations

    Shuo Yang, Ali R. Butt, Y. Charlie Hu, Samuel P. Midkiff

    Purdue University

  • Harvesting Unused Resources

    - Typical workloads are bursty:
      - Periods of little or no processing
      - Periods of insufficient CPU resources
    - Idle cycles are not usable in the future
    - Exploiting the value of wasted idle resources:
      - Achieves more available processing capability for free or at low cost
      - Smooths out the workload

  • The Need for Remote Monitoring

    - Centralized cycle sharing: SETI@Home, Genome@Home, IBM (with United Devices), Condor, Microsoft (with GridIron), etc.
    - P2P-based cycle sharing (Butt et al. [VM04]):
      - Individual nodes can utilize the system → more incentive
      - Nodes can be across administrative domains → more available resources
    - Remote execution motivates remote monitoring:
      - Unreliable resources
      - Untrusted resources

  • Review of GridCop [Yang et al. PPoPP05]

    [Diagram: the submitted job (H-code), with a reporting module, runs in a sandboxed JVM on the host machine; it sends progress and partial-computation messages to a processing module (S-code) running in the submitter's JVM]

  • Our New Contribution: Key Differences from GridCop

    - Uses probabilistic code instrumentation
    - Prevents replay attacks (like GridCop)
    - Requires no recomputation → reduces network traffic and submitter-machine overhead
    - Ties the progress information closely to the program structure:
      - Makes spoofing more difficult
      - PC values reflect the internal nature of the program's binary code

  • Outline

    - Overview
    - Design of the Lightweight Monitoring Mechanism
    - Experimental Results
    - Related Research and Conclusions

  • System Overview: Code Generation

    [Diagram: the code generation system transforms the original code into host-code and submitter-code]

    - Host-code, executed on the host: emits progress information (beacons) during the computation
    - Submitter-code, executed on the submitter: processes the beacons

  • System Overview

    [Diagram: the submitted job (H-code), with its reporting module, runs on the host machine and sends beacons to the beacon-processing module (S-code) on the submitter]

  • Basic Idea of FSA Tracking

    - Beacons are placed at significant execution points along the CFG
    - Beacons can be viewed as states in an FSA
      - They can be placed at any site satisfying the compiler's instrumentation criteria, e.g. MPI call sites in this paper
    - The host emits beacon messages at significant execution points → an FSA emitting transition symbols
    - The submitter processes the beacon messages → a mirror FSA recognizing legal transitions

  • An FSA Example

    main() {
      mpi_irecv();    // S1
      if (predicate) {
        mpi_send();   // S2
      }
      mpi_wait();     // S3
    }

    [FSA diagram: states S1, S2, S3; transitions S1 → S2 and S2 → S3, plus S1 → S3 when the branch is not taken]

  • Binary file Location Beacon (BLB)

    - BLB values are the virtual addresses of instructions in the virtual memory of a process → states in the FSA

    [Diagram: process address space with stack, heap, bss, initialized data, and code segment; in the code segment:
      804a641: call mpi_irecv
      804a679: call mpi_send
      804a69b: call mpi_wait]

  • PC Values: Labels Driving the Transitions in the FSA

    main() {
      pc = getPC();
      mpi_irecv();         // 0x804a641
      deposit_beacon(pc);
      if (predicate) {
        pc = getPC();
        mpi_send();        // 0x804a679
        deposit_beacon(pc);
      }
      pc = getPC();
      mpi_wait();          // 0x804a69b
      deposit_beacon(pc);
    }

    [FSA diagram: states @804a641, @804a679, @804a69b, with transitions labeled by the corresponding PC values]

    - The compiler inserts a getPC() call in front of each BLB
    - getPC() returns the address of the next instruction

  • Tracking the Progress of an MPI Program

    main() {
      pc = getPC();
      mpi_irecv();         // 0x804a641
      deposit_beacon(pc);
      if (predicate) {
        pc = getPC();
        mpi_send();        // 0x804a679
        deposit_beacon(pc);
      }
      pc = getPC();
      mpi_wait();          // 0x804a69b
      deposit_beacon(pc);
    }

    [FSA diagram: states @804a641, @804a679, @804a69b, with transitions labeled by the corresponding PC values]

  • Attacks on the FSA Mechanism

    - Susceptible to a replay attack:
      - Record the stream of beacons from a previous run
      - Replay the stream in a future run (cheating to gain undeserved compensation)
    - Reverse engineering the binary executable to understand the control flow graph:
      - Expensive → NP-hard in the worst case ([Wang, PhD thesis, University of Virginia])

  • Probabilistic BLB

    - Each MPI function call site is a BLB candidate, but not necessarily a BLB site
    - A candidate is used as a BLB site with probability PB in (0, 1)
    - Effect: an individual MPI function call site may be a BLB in the FSA in one code generation, but not in the next

  • Probabilistic BLBs Guard Against Attacks

    - The same job can have a different FSA each time it is submitted to the host
      - This leads to a different legal beacon-value stream
    - Defeats the replay attack by making it detectable
    - Reverse engineering by binary analysis must be repeated by a cheating host on every run
      - Break once, spoof only once → too expensive!

  • One FSA with Probabilistic BLBs

    main() {
      pc = getPC();
      mpi_irecv();         // 0x804a641
      deposit_beacon(pc);
      if (predicate) {
        pc = getPC();
        mpi_send();        // 0x804a679
        deposit_beacon(pc);
      }
      pc = getPC();
      mpi_wait();          // 0x804a69b
      deposit_beacon(pc);
    }

    [FSA diagram: states @804a641, @804a679, @804a69b]

  • Another FSA with Probabilistic BLBs

    main() {
      pc = getPC();
      mpi_irecv();         // 0x804a641
      deposit_beacon(pc);
      if (predicate) {
        mpi_send();        // 0x804a679  (not instrumented this time)
      }
      pc = getPC();
      mpi_wait();          // 0x804a69b
      deposit_beacon(pc);
    }

    [FSA diagram: states @804a641 and @804a69b only]

  • Outline

    - Overview
    - Design of the Lightweight Monitoring Mechanism
    - Experimental Results
    - Related Research and Conclusions

  • Experimental Setup

    - Submitter machine @ UIUC (thanks to Josep Torrellas):
      - Intel 3 GHz Xeon, 512 KB cache, 1 GB main memory
      - Running a Linux 2.4.20 kernel
    - Host machines @ Purdue:
      - A cluster of 8 Pentium IV machines (each node with 512 KB cache and 512 MB main memory), interconnected by Fast Ethernet
      - Running FreeBSD 4.7 and MPICH 1.2.5
    - Network access:
      - Both sites connected to their campus networks via Ethernet
      - UIUC ↔ Purdue: representing a typical scenario of cycle sharing across a WAN

  • Benchmarks & Evaluation Metrics

    - NAS Parallel Benchmarks (NPB) 3.2: a benchmark suite for evaluating the performance of parallel computational resources
    - Run-time computation overhead
    - Network traffic overhead (network resources are not free)
    - Beacon distribution over time (the capability to track progress incrementally)

  • Host-Side Computation Overhead for Different Numbers of Nodes

    - Overhead = (Tmonitoring - Toriginal) / Toriginal * 100%
    - A lower bar is better
    - Does not increase monotonically with the number of processes

    [Chart data:]

          2 nodes   4 nodes   8 nodes
    EP    1.20%     1.23%     1.24%
    IS    1.37%     1.64%     1.65%
    MG    1.50%     1.68%     1.63%
    CG    1.74%     2.05%     1.90%

  • Host-Side Computation Overhead under Different Input Sizes

    - Overhead = (Tmonitoring - Toriginal) / Toriginal * 100%
    - A lower bar is better
    - Lower overhead for larger problem sizes

    [Chart data: different input sizes on 8 nodes]

          size B   size C
    EP    1.24%    1.18%
    IS    1.65%    1.46%
    MG    1.63%    0.75%
    CG    1.90%    1.16%

  • Submitter-Side Computation Cost

    - Overhead = time(submitter code) / execution time
    - An imperfect metric: the number depends on the submitter's hardware, the submitter's workload, the host's speed, etc.

  • Network Traffic Incurred by Monitoring

    - Bytes sent over the network between the host and submitter machines, divided by the total execution time
    - Low bandwidth usage

  • Beacon Distribution over Time

    - A uniform distribution enables incremental tracking

  • Outline

    - Overview
    - Design of the Lightweight Monitoring Mechanism
    - Experimental Results
    - Related Research and Conclusions

  • Related Research

    - L. F. Sarmenta [CCGrid01], W. Du et al. [ICDCS04]:
      - A host performs the same computation on different inputs
      - Needs a central manager
    - Yang et al. [PPoPP05]:
      - Partially duplicates the computation
      - Incurs more network traffic, associated with the recomputation
    - Hofmeyr et al. [J. of Computer Security 98], Chen and Wagner [CCS02]:
      - Use system call sequences to detect intrusions
      - Approaches for achieving host security

  • Conclusions

    - Lightweight monitoring over a WAN/Internet is possible
    - No changes to the host-side system are required
    - Instrumentation can be performed automatically

  • Host-Side Overhead Details (Slide 22)

    - Overhead = (Tmonitoring - Toriginal) / Toriginal
    - Does not increase monotonically with an increase in the number of processes (Nprocess)
    - When Nprocess increases:
      - The denominator, Toriginal, decreases
      - The numerator, Tmonitoring - Toriginal, also decreases (the number of MPI calls decreases, decreasing the overhead of BLB message generation)
      - Synchronization: there is always one extra thread per process, no matter how many processes are running

  • Host-Side Overhead Details (Slide 23)

    - Overhead = (Tmonitoring - Toriginal) / Toriginal
    - Results in lower overhead for larger problem sizes
    - When the problem size increases:
      - The denominator (Toriginal) increases
      - The numerator (Tmonitoring - Toriginal) stays similar, since the number of MPI calls is similar