metatm & txlinux
DESCRIPTION
MetaTM & TxLinux Hany Ramadan, Christopher Rossbach, Donald Porter, Owen Hofmann, Aditya Bhandari, Emmett Witchel University of Texas at Austin. TM Background. Transactional programming is an emerging alternative to locks Avoids problems such as deadlock - PowerPoint PPT PresentationTRANSCRIPT
MetaTM & TxLinux
Hany Ramadan, Christopher Rossbach, Donald Porter, Owen Hofmann, Aditya Bhandari, Emmett
Witchel
University of Texas at Austin
TM Background
Transactional programming is an emerging alternative to locks Avoids problems such as deadlock Avoids performance-complexity tradeoffs
HTM holds the promise of simpler programming and good performance
TM: “What’s the OS got to do with it?”
Lack of realistic workloads (counter, splash-2)
Will current results hold on real programs? Unclear design tradeoffs; Feature set unsettled
OS is a real-life, parallel workload OS will benefit from transactions
Reduces synchronization complexity System-call and interrupt control paths will benefit
Architectural support is needed for OS
Average Transaction Count
0
500,000
1,000,000
1,500,000
2,000,000
2,500,000
3,000,000
3,500,000
Other TMs Nearest TM MetaTM
Ave
rag
e T
x/B
ench
mar
k
Outline
TxLinux MetaTM
Goals Features Interrupt handling
Issue: Stack memory Experimental results
TxLinux 2.6.16.1
Slaballocator
Zoneallocator
IP routing
Socketlocking
Various MMstructures
Directorycache
Pathnametranslation
Memory managementNetworkingFile system
Spin-locksSequencelocks
RCU(read-copy-update)
Converted ~30% of dynamic synchronization to transactions
MetaTM: Design goals
HTM model co-designed with TxLinux Extensions to x86 ISA Architectural support for OS Execution-driven simulation
A platform for TM research Multiple HTM design points Eager & lazy version management Eager conflict detection
MetaTM: Model features
commit cost(lazy)
abort cost(eager)
polite karma
exponential linear
eruption
Versionmanagement
Contentionmanagement(eager)
Backoff policy
Tx demarcation
random
timestamp polka sizematters
xbegin xend
xpush xpopMultiple Tx
TxLinux: Interrupt handling
Question: What happens to active tx on an interrupt?
Interrupt handlers allowed to use transactions
Factors weighing against abort Transaction length growing Interrupt frequency
Answer: Active transactions are suspended on interrupt
MetaTM: Multiple Tx support
Multiple active transactions on a processor At most one running, all others are suspended
Interface xpush suspends current transaction xpop resumes suspended transaction Suspended transactions maintained in LIFO order
New execution context is unrelated to old one Same conflict semantics with all other transactions
May start new transactions
Outline
TxLinux MetaTM
Goals Features Interrupt handling
Issue: Stack memory Experimental results
Issue: Stack memory
Transactions can span stack frames Why: Retain same flexibility as locks Problem: Live stack overwrite (correctness)
Solution: Stack Pointer Checkpointfoo()
{
atomic
{
}
}
foo()
{
bar()
baz()
}
bar() { xbegin }
baz() { xend }
intr statedo_IRQ
foo+8bar
Live stack overwrite
0xC0
StkPtr
localsfoo
0x80
0x40
0x00
Error: invalidreturn address
Tx Reg. Checkpoint
PC: bar+4
StkPtr: 0x40
(other regs..)
foo+4: call barfoo+8: <work>foo+12:xend
bar+0: xbeginbar+4: ret
do_irq: iret
Conflict
localsfoo
Only interrupts that arrive in kernel mode have this problem
intr statedo_irq
foo+8bar
Live stack overwrite, fixed
0xC0
StkPtr
localsfoo
0x80
0x40
0x00Tx Reg. Checkpoint
PC: bar+4
StkPtr: 0x40
(other regs..)
foo+4: call barfoo+8: <work>foo+12:xend
bar+0: xbeginbar+4: ret
do_irq: iret
localsfoo
Conflict
Fixed by setting ESP to Checkpointed ESP on interrupt
Outline
TxLinux MetaTM
Goals Features Interrupt handling
Issue: Stack memory Experimental results
Experiments
Setup Workloads System characteristics
Execution time Transaction rates Transaction origins
Studies Contention management Commit & Abort penalties
Setup
Simics 3.0.17 8-processor, x86 system (1 Ghz) Memory hierarchy
L1: sep D/I, 16KB, 4-way, 1-cycle hit L2: 4MB, 8-way, 16-cycle hit, MESI protocol
Main memory: 1GB, 200-cycle hit Other devices
Disk device (DMA, 5.5ms latency) Tigon3 gigabit nic (DMA,0.1ms latency)
Workloads to exercise TxLinux
counter shared counter micro- benchmark (8 threads)
pmake Runs make -j 8 to compile files from libFLAC 1.1.2
netcat streams data over TCP network conn.
MAB simulates software development file system workloads
configure 8 instances of configure for tetex
find 8 instances of find on a 78MB directory searching for text
Note: Only TxLinux creates transactions
Kernel Execution Time
0
2
4
6
8
10
12
14
counter pmake netcat MAB config find
Ker
nel
Exe
cuti
on
Tim
e (s
)
Linux
TxLinux
counter
%Kern. time 91% 13% 54% 57% 43% 50%
High kernel time justifies transactions in the OS
Transaction Rates
32,48616,635
449,322182,072 121,808
1
10
100
1,000
10,000
100,000
1,000,000
pmake netcat MAB config find
Tra
ns
ac
tio
ns
/ S
ec
Restart Rate 2.6% 3.1% 1.7% 2.1% 10.2%
Find workload has highest contention in TxLinux
Transaction Origins
0
20
40
60
80
100
pmake netcat MAB config find
% T
ran
sact
ion
s
System calls
Interrupts, kthreads
Kernel locks accessed from both system call and interrupt handling contexts
Contention Management Study
0.00
0.50
1.00
1.50
2.00
2.50
3.00
3.50
4.00
eruption karma kindergarten polka size matters timestamp
Policy
Norm
alized R
esta
rt R
ate
.
counter
pmake
netcat
MAB
configure
find
Polka best performer, but complex to implement; SizeMatters viable
Stall-on-conflict – reduces conflicts, but not always performance
counter
Commit & Abort Study
0
0.5
1
1.5
2
2.5
3
counter pmake netcat MAB configure find
Re
lati
ve
Sy
ste
m T
ime
0
100
1,000
10,000
0.75
0.80
0.85
0.90
0.95
1.00
1.05
1.10
1.15
1.20
1.25
counter pmake netcat MAB configure find
Re
lati
ve
Sy
ste
m T
ime
0
100
1,000
10,000
Performance sensitive to commit penalty, not abort
Confirms benefit of eager version management (fast commits)
Nor
mal
ized
Ker
nel T
ime
Nor
mal
ized
Ker
nel T
ime
Commit Cost
Abort Cost
Related Work
TM Models TCC [Hammond04], UTM [Anaian05], LogTM [Moore06], VTM [Rajwar05]
Suspension techniques Escape actions [Zilles06] – can’t start tx
Interrupt handling XTM [Chung06] – also tries to avoid aborts
Contention management Scherer & Scott [PODC’05] – in STM context
Conclusions
TM needs realistic workloads TxLinux the largest TM benchmark
OS needs TM Complex synchronization; large % of runtime
Building & running TxLinux reveals much Architectural support needed (Tx suspension)
Contention management is important Cost studies confirm fast commits
… more in the paper