specialization tools and techniques for systematic optimization of system software
DESCRIPTION
Specialization Tools and Techniques for Systematic Optimization of System Software. Dylan McNamee, Jonathan Walpole, Calton Pu, Crispin Cowan, Charles Krasic, Ashvin Goel, Perry Wagle. Presented by: Rizal Arryadi. Background. - PowerPoint PPT PresentationTRANSCRIPT
Specialization Tools and Techniques for Systematic
Optimization of System Software
Dylan McNamee, Jonathan Walpole, Calton Pu, Crispin Cowan, Charles Krasic, Ashvin Goel, Perry Wagle
Presented by: Rizal Arryadi
Background
• Much complexity in OS code arises from the requirement to handle all possible system states.
• There are conflict between correctness across all applications vs high performance for individual applications
• Micro-kernel approach to OS shows the conflict between performance, modularity, and portability
Approaches• Write general-purpose code, but optimized for a few
anticipated common cases– Problem: “common” cases varies
• Explicit Customization: incorporate customizability into system structure– SPIN, Exokernel, Synthesis, Mach, etc.– Problem:
• Burden in system tuners• Limit access to global system state optimization
opportunities are reduced
• Inferred Customization (Specialization): – Automatically derived optimization– Create optimized code for common cases– Restricting code, not extending it
Process Specialization
• Also called Partial Evaluation• Consider a program P, taking 2 arguments S and D,
producing a result R:run P(S,D) = R
A specialization of P wrt S is as follows:run Ps(D) = run P(S,D) = R
• Creating optimized code for common cases– Burden in system tuner is reduced– But, more complex analysis of system
• Tedious, error-prone• Result is more complex, harder to debug & maintain
Objective of this paper
• Provides toolkit to reduce manual work in specialization
• Evaluate the toolkit’s effectiveness in operating system components
Fundamentals of Specialization
• Specialization Predicates– States of the system known in advance
• Partial Evaluation– Given specialization predicates, separate the
static parts from the dynamic parts
• Guards– Enable/Disable specialized code when
specialized terms are modified
Three Kinds of Specialization
• Static specialization– Predicates known at compile time– Partial Evaluation can be applied before execution
• Dynamic specialization– Defer specialization until runtime– The values of spec predicates are not established
until some point during execution– Once established, hold for the remainder of execution
• Optimistic specialization– Spec predicates only hold for bounded time intervals
(aka quasi-invariants)
Steps to specialize a system 1. Identify specialization predicates
• Use developer’s knowledge• Locate code that can be optimized• Estimate the net performance improvement
2. Generate specialized code• Use partial evaluation to automate it
3. Check when specialization predicates hold• For dynamic and optimistic specialization• Locate all places that can cause predicates to change, and “guard”
them
4. Replace specialized code• Replugging: enabling/disabling one version of specialized code with
another• Not surprising: synchronization issue
Tempo: specialized code generator• Partial evaluator for C programs• Challenge in binding time analysis
– Side effects in C• Pointers & aliases • Structures & arrays• Functions that modify global state
– Ignored in conventional approaches but captured well by Tempo
• Tempo features:– Use sensitivity: accurate treatment of “nonliftable values”– Flow sensitivity: variables could be static – Context sensitivity: assign specific binding time desc for each context– Return sensitivity: return value of side-effecting functions can be static
• Static specialization• Compile-time and run-time specialization
Enabling/Disabling Specialized Code:
TypeGuard
• Place guards at the site of modifications to specialization predicate terms
• To be efficient, the predicate terms must be used more frequently than modified.
• Problems:– May include too many sites– Different variables, same name (locally defined)– Aliases pass by reference– Spec Predicate terms often are not simple scalar, but
fields of dynamically allocated structures.
Enabling/Disabling Specialized Code:
TypeGuard• Type based static analysis• Two-phase approach:
– Phase 1• Statically identify structure types whose fields are spec
predicate terms• Extend them to include spec predicate ID (SPID)• Identify statements that update a guarded field• Insert the guarding code
– Phase 2• Set SPID dynamically when specialized code is enabled• Clear it when specialized code is disabled• Check it when a spec predicate term is modified
Enabling/Disabling Specialized Code:
TypeGuard• Guarding example:
current->uid = bar
would become:if (current.SPID!=NULL)
current.SPID->update_uid(bar);
else
current->uid = bar;
Enabling/Disabling Specialized Code:
MemGuard• TypeGuard issues warnings about alias-producing operations, and it
needs to be validated.• MemGuard guarantees complete guard coverage • Uses memory protection HW to write-protect pages that contain
spec predicate terms• The write-fault handler check if the address being written is a spec
predicate term; if so, perform guarded write and might trigger replugging
• Uses HW memory protection guaranteed to capture all writes to spec predicate terms
• Drawback:– Coarse granularity– High Overheads
Replugger
• Replace current code with code that is consistent with the new state of specialization predicate.
• Problem: concurrent replugging and function invocation – Solution: synchronization using locks
• Factors affect the design:– Concurrent invocation of the same repluggable function
• Can be avoided by associating functions with threads– Concurrency between replugging and invocation
• With concurrent invocation (i.e. counting replugger):– Use counter to detect whether threads are exectuing the
repluggable function– Use stub function holding_tank to avoid invocation while
the function being replugged
• Without concurrent invocation:– Use a boolean flag
Experiments: Specializing RPC• Applying Tempo to Sun RPC’s marshaling process
• Static specialization the spec predicates are available when stubs are generated.
• Specialization Opportunities:– Some fields in data structures used by marshaling code have
values known at stub generation time
• Some details:– Encoding/Decoding Dispatch Elimination– Buffer Overflow Checking Elimination– Exit Status Propagation– Marshaling Loop Unrolling
Experiments: Specializing RPC (Cont.)• Performance
Experiments: Specializing Packet Filters• BSD Packet Filter (BPF):
– Interface for selecting packets from a network interface
• Specialization Opportunities:– BPF is executed many times– To be specialized: the packet interpreter– “Specializing an interpreter wrt a particular program effectively
compiles that program”
• Either static and dynamic specialization– Static: packet filter program is available in advance– Dynamic: if the packet filter program is presented immediately
before execution overhead are included in run-time.
Experiments: Specializing Packet Filters (cont.)• A packet is read and filterd by calling:
Parameters: packet filter program, a packet, the length of the original packet, the amount of data present
• BPF interpreter: Recursion:
Experiments: Specializing Packet Filters (Cont.)• Performance
• Code SizeUnspecialized: 550 lines
Specialized (6 instr filter): 366 lines
Specialized (10 instr filter): 576 lines
Experiments: Specializing Signals• Statement ret = kill (pid, n)
• Specialization opportunity: a process might repeatedly sends the same signal to the same destination process both task_structs won’t change
Experiments: Specializing Signals (cont.)• Optimistic specialization:
– if any predicate terms are modified between signals, the specialized code is invalid (e.g. the destination exits)
• TypeGuard is used to identify locations that require guarding.
Experiments:
Specializing Signals (Cont.)• Performance • Application Level Impact
– 100,000 Producer-Consumer iterations
– Buffer size: 4– Unspecialized: 11.9 s– Specialized: 5.6 s
• Code Size– 4 functions into 1, 59 LOC into 18
Related Work
• Multistage programming– Programs that generate other programs– Tempo is a:
• Automatic• Heterogenous• Static and Run-time generation• Two or Three Stage system
• Aspect-oriented programming“Different aspects of a system’s behavior tend to have their own natural form, so while one abstraction framework might do a good job of capturing one aspect, it will do a less good job capturing others”