superoptimization venkatesh karthik srinivasan guest lecture in cs 701, nov. 10, 2015

70
Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Upload: damian-turner

Post on 18-Jan-2018

215 views

Category:

Documents


0 download

DESCRIPTION

Superoptimization Input: An instruction sequence I – Instructions belong to an ISA – No loops Output: An instruction-sequence I’ – Instructions belong to same ISA – I’ is equivalent to I On all inputs, I and I’ produce the same results – I’ is the optimal implementation of I + timeout t an improved shorter faster lesser energy consumption

TRANSCRIPT

Page 1: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Superoptimization

Venkatesh Karthik SrinivasanGuest Lecture in CS 701, Nov. 10, 2015

Page 2: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

2

IA-32 Primer

Registers

1000

ESP

Stack

push eax

20

EAX20

mov ebx, eax

20

EBX

mov [esp], 60

60

lea esp, [esp+4]

9961000

add eax, ebx

40

RegistersEAX, EBX, ECX, EDX,

ESP, EBP, EIP,ESI, EDI

FlagsCF, OF, ZF, SF, PF, …

Page 3: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Superoptimization

• Input: An instruction sequence I– Instructions belong to an ISA – No loops

• Output: An instruction-sequence I’– Instructions belong to same ISA – I’ is equivalent to I • On all inputs, I and I’ produce the same results

– I’ is the optimal implementation of I

+ timeout t

an improved • shorter• faster• lesser energy consumption

Page 4: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Superoptimization

sub eax, ecxneg eaxsub eax, 1

IA-32Superoptimizer

not eaxadd eax, ecx

EAX ← ECX – EAX – 1

Page 5: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Superoptimization

add ebp. 4mov [ebp], eaxadd ebp, 4mov [ebp], ebx

IA-32Superoptimizer

mov [ebp+4], eaxmov [ebp+8], ebxadd ebp, 8

Page 6: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Why do we need a superoptimizer?

Page 7: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

© Alvin Cheung, CSE 501

Page 8: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Peephole Optimization

• Done at low-level IR– Closer to assembly code

• Sliding window – “peephole” • Peephole rules– Rewrite rules of form “LHS → RHS”

• If current peephole matches LHS of some rule, replace with RHS

Page 9: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Peephole RulesRule category LHS RHS

Eliminating redundant loads and stores

mov reg, [addr] // Load contents of [addr] in regmov [addr], reg // Store contents of reg in [addr]

(Both instructions must be in the same basic block)

mov reg, [addr]

Control-flow optimizations

goto L1…L1: goto L2

goto L2…L1: goto L2

Algebraic rewrites

add oprnd, 0 (none)

imul oprnd, 2 shl oprnd, 1

Machine idioms

add oprnd, 1 inc oprnd

Page 10: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Problem with Peephole Rules

• Manually written– Cumbersome– Error-prone

Page 11: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Peephole Superoptimizer

Superoptimizer

.

.

.mov eax, [esp]mov [esp], eax...mov eax, [esp]add eax, 1mov [esp], eax...

.

.

.mov eax, [esp]...mov eax, [esp]add eax, 1mov [esp], eax...

Page 12: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Peephole Superoptimizer

Superoptimizer

.

.

.mov eax, [esp]mov [esp], eax...mov eax, [esp]add eax, 1mov [esp], eax...

.

.

.mov eax, [esp]....add [esp], 1mov eax, [esp]...

Page 13: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Peephole Superoptimizer

Superoptimization is a sloooooooooowwwww process

Page 14: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Optimization Database

Training programs Instruction sequences

Superoptimizer

Optimization database

Page 15: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Peephole Superoptimizer

.

.

.mov eax, [esp]mov [esp], eax...mov eax, [esp]add eax, 1mov [esp], eax...

.

.

.mov eax, [esp]...mov eax, [esp]add eax, 1mov [esp], eax...

Optimization database

Page 16: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Peephole Superoptimizer

.

.

.mov eax, [esp]mov [esp], eax...mov eax, [esp]add eax, 1mov [esp], eax...

.

.

.mov eax, [esp]....add [esp], 1mov eax, [esp]...Optimization

database

Page 17: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Outline• Problem statement• Motivation

– Peephole optimization– Speeding up critical inner loops

• Equivalence of two instruction-sequences• A naïve superoptimizer• Massalin/Bansal-Aiken superoptimizer

– Canonicalization– Test cases– Counterexamples as tests – Pruning away sub-optimal candidates

• Stochastic superoptimization

Page 18: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Checking equivalence of two instruction-sequences

• Key primitive in superoptimizers and related tools

• “Do two loop-free instruction sequences I and I’ produce the same outputs for all possible inputs?”

Page 19: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Checking equivalence of two instruction-sequences

1. Encode I as a logical formula ϕ2. Encode I’ as a logical formula ψ3. Use an SMT solver to check if ϕ ⟺ ψ is

logically valid

How to encode an instruction sequence

as a logical formula?

How to use an SMT solver to check if two

formulas are equivalent?

Page 20: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

How to encode an instruction sequence I as a logical formula ϕ?

1. Symbolically evaluate I on Id state to obtain post-state σ

2. Convert σ into a logical formula

Page 21: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Concrete Evaluation

• IA-32 state • IA-32 interpreter– Concrete operational semantics of IA-32

instructions

Page 22: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

IA-32 State

• ⟨RegMap, FlagMap, MemMap⟩• RegMap : Register ↦ 32-bit value• FlagMap : Flag ↦ Boolean value• MemMap : 32-bit address ↦ 8-bit value

⟨[EBX ↦ 8, ESP ↦ 1000], [ ], [1000 ↦ 2]⟩⟨[EAX ↦ 0, EBX ↦ 8, ECX ↦ 0, … , ESP ↦ 1000, EBP ↦ 0, … , EDI ↦ 0],

[SF ↦ false, CF ↦ false, … OF ↦ false], [0 ↦ 0, 4 ↦ 0, 8 ↦ 0, … ,1000 ↦ 2, 1004 ↦ 0, … , 4294967292 ↦ 0]⟩

32-bit address ↦ 32-bit value

Page 23: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Concrete Evaluation

mov eax, [esp]add eax, 4sub ebx, 4mov [esp], ebx ⟨[EBX ↦ 8, ESP ↦ 1000],

[ ], [1000 ↦ 2]⟩

Page 24: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Concrete Evaluation

mov eax, [esp]add eax, 4sub ebx, 4mov [esp], ebx ⟨[EBX ↦ 8, ESP ↦ 1000],

[ ], [1000 ↦ 2]⟩

Page 25: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Concrete Evaluation

mov eax, [esp]add eax, 4sub ebx, 4mov [esp], ebx ⟨[EAX ↦ 2, EBX ↦ 8, ESP ↦ 1000],

[ ], [1000 ↦ 2]⟩

Page 26: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Concrete Evaluation

mov eax, [esp]add eax, 4sub ebx, 4mov [esp], ebx ⟨[EAX ↦ 6, EBX ↦ 8, ESP ↦ 1000],

[PF ↦ true], [1000 ↦ 2]⟩

Page 27: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Concrete Evaluation

mov eax, [esp]add eax, 4sub ebx, 4mov [esp], ebx ⟨[EAX ↦ 6, EBX ↦ 4, ESP ↦ 1000],

[PF ↦ true], [1000 ↦ 2]⟩

Page 28: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Concrete Evaluation

mov eax, [esp]add eax, 4sub ebx, 4mov [esp], ebx ⟨[EAX ↦ 6, EBX ↦ 4, ESP ↦ 1000],

[PF ↦ true], [1000 ↦ 4]⟩

Page 29: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Symbolic Evaluation

• Symbolic IA-32 state• Symbolic IA-32 interpreter– Symbolic transformers for instructions– Obtained from concrete operational semantics of

IA-32 instructions

Page 30: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Symbolic IA-32 State• ⟨RegMap, FlagMap, MemMap⟩• RegMap : Symbolic register-constant ↦ term• FlagMap : Symbolic flag-constant ↦ formula• MemMap : Function symbol ↦ function-update expression

ID state⟨[EAX’ ↦ EAX, EBX’ ↦ EBX, … , ESP’ ↦ ESP, EBP’ ↦ EBP, … , EDI’ ↦ EDI], [SF’ ↦ SF, CF’ ↦ CF, … ZF’ ↦ ZF],

[Mem’ ↦ Mem]⟩

QFBV+Arrays

Page 31: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Symbolic Evaluation

mov eax, [esp]add eax, 4sub ebx, 4mov [esp], ebx

⟨[EAX’ ↦ EAX, EBX’ ↦ EBX, … , ESP’ ↦ ESP, … ], [SF’ ↦ SF, … , ZF’ ↦ ZF],

[Mem’ ↦ Mem]⟩

Page 32: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Symbolic Evaluation

mov eax, [esp]add eax, 4sub ebx, 4mov [esp], ebx

⟨[EAX’ ↦ EAX, EBX’ ↦ EBX, … , ESP’ ↦ ESP, … ], [SF’ ↦ SF, … , ZF’ ↦ ZF],

[Mem’ ↦ Mem]⟩

Page 33: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Symbolic Evaluation

mov eax, [esp]add eax, 4sub ebx, 4mov [esp], ebx

⟨[EAX’ ↦ Mem(ESP), EBX’ ↦ EBX, … , ESP’ ↦ ESP, … ], [SF’ ↦ SF, … , ZF’ ↦ ZF],

[Mem’ ↦ Mem]⟩

Page 34: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Symbolic Evaluation

mov eax, [esp]add eax, 4sub ebx, 4mov [esp], ebx

⟨[EAX’ ↦ Mem(ESP) + 4, EBX’ ↦ EBX, … , ESP’ ↦ ESP, … ], [SF’ ↦ ((Mem(ESP) + 4) < 0), … , ZF’ ↦ ((Mem(ESP) + 4) = 0)],

[Mem’ ↦ Mem]⟩

Page 35: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Symbolic Evaluation

mov eax, [esp]add eax, 4sub ebx, 4mov [esp], ebx

⟨[EAX’ ↦ Mem(ESP) + 4, EBX’ ↦ EBX – 4, … , ESP’ ↦ ESP, … ], [SF’ ↦ ((EBX – 4) < 0), … , ZF’ ↦ ((EBX – 4) = 0)],

[Mem’ ↦ Mem]⟩

Page 36: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Symbolic Evaluation

mov eax, [esp]add eax, 4sub ebx, 4mov [esp], ebx

⟨[EAX’ ↦ Mem(ESP) + 4, EBX’ ↦ EBX – 4, … , ESP’ ↦ ESP, … ], [SF’ ↦ ((EBX – 4) < 0), … , ZF’ ↦ ((EBX – 4) = 0)],

[Mem’ ↦ Mem[ESP ↦ EBX – 4]]⟩

Page 37: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Convert a symbolic state into a logical formula

⟨[EAX’ ↦ Mem(ESP) + 4, EBX’ ↦ EBX – 4, … , ESP’ ↦ ESP, … ], [SF’ ↦ ((EBX – 4) < 0), … , ZF’ ↦ ((EBX – 4) = 0)],

[Mem’ ↦ Mem[ESP ↦ EBX – 4]]⟩

EAX’ = Mem(ESP) + 4 ∧ EBX’ = EBX – 4 ∧ … ∧ ESP’ = ESP ∧ … ∧ SF’ = ((EBX – 4) < 0) ∧ … ∧ ZF’ = ((EBX – 4) = 0) ∧ Mem’ = Mem[ESP ↦ EBX – 4]

Page 38: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

How to encode an instruction sequence I as a logical formula ϕ?

1. Symbolically evaluate I on Id state to obtain post-state σ

2. Convert σ into a logical formula

push eax EAX’ = EAX ∧ EBX’ = EBX ∧ … ∧ ESP’ = ESP – 4 ∧ … ∧ SF’ = SF ∧ … ∧ ZF’ = ZF ∧ Mem’ = Mem[ESP – 4 ↦ EAX]ESP’ = ESP – 4 ∧ Mem’ = Mem[ESP – 4 ↦ EAX]

Page 39: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Exercise

• Convert the following instruction sequence into a QFBV formula

lea esp, [esp - 4]mov [esp], 1push ebpmov ebp, esp

Page 40: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Checking Equivalence of Two Instruction-Sequences

1. Encode I as a logical formula ϕ2. Encode I’ as a logical formula ψ3. Use an SMT solver to check if ϕ ⟺ ψ is

logically valid

How to encode an instruction sequence

as a logical formula?

How to use an SMT solver to check if two

formulas are equivalent?

Page 41: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

How to use SMT solver to check equivalence?

• Conventionally used for “satisfiability”– Symbolic execution for test-case generation

EAX > 0 EBX = EAX + 4 ∧ EBX > 0 ∧ SMT solver

SAT[EAX ↦ 1, EBX ↦ 5]

EAX > 0 EBX = EAX + 4 ∧ EBX ∧ ≤ 0

SMT solver UNSAT

Page 42: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

How to use SMT solver to check equivalence?

• Can also be used to check “validity”– Synthesis and superoptimization

• To check if ϕ ⟺ ψ is logically valid, check satisfiability of (ϕ ⟺ ψ)

((EAX’ = EAX + EAX EBX’ = EBX * 2) ∧⟺ (EAX’ = EAX * 2 EBX’ = EBX >> ∧1))

SMT solverUNSAT

(Two formulas are equivalent)

SMT solverSAT

[EAX ↦ 1, EBX ↦ 1](Counterexample)

((EAX’ = EAX + EAX EBX’ = EBX * 2) ∧⟺ (EAX’ = EAX * 3 EBX’ = EBX >> ∧1))

Page 43: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Checking equivalence of two instruction sequences

sub eax, ecxneg eaxsub eax, 1

EAX’ = -1 * (EAX + (-1 * ECX)) - 1 ∧SF’ = (-1 * (EAX + (-1 * ECX)) - 1) < 0

not eaxadd eax, ecx

EAX’ = -1 * EAX + ECX - 1 ∧SF’ = (-1 * EAX + ECX - 1) < 0

((EAX’ = -1 * (EAX + (-1 * ECX)) - 1 ∧SF’ = (-1 * (EAX + (-1 * ECX)) - 1) < 0) ⟺ (EAX’ = -1 * EAX + ECX - 1 ∧

SF’ = (-1 * EAX + ECX - 1) < 0))

SMT solver UNSAT

Page 44: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Checking equivalence of two instruction sequences

mov eax, [esp]add eax, ebx

EAX’ = Mem(ESP) + EBX ∧SF’ = (Mem(ESP) + EBX) < 0

mov eax, [esp]sub eax, ebx

((EAX’ = Mem(ESP) + EBX ∧SF’ = (Mem(ESP) + EBX) < 0) ⟺ (EAX’ = Mem(ESP) + EBX ∧SF’ = (Mem(ESP) - EBX) < 0))

SMT solverSAT

[EBX ↦ 1, ESP ↦ 1000,Mem[1000 ↦ 1]]

EAX’ = Mem(ESP) + EBX ∧SF’ = (Mem(ESP) - EBX) < 0

Page 45: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Outline• Problem statement• Motivation

– Peephole optimization• Equivalence of two instruction-sequences• A naïve superoptimizer• Massalin/Bansal-Aiken superoptimizer

– Canonicalization– Test cases– Counterexamples as tests – Pruning away sub-optimal candidates

• Stochastic superoptimization

Page 46: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Thousands of unique IA-32 instruction

schemas

Exponential cost of

enumeration+ =

Instruction-Sequence

Enumerator

Equivalence Check

I

A Naïve Superoptimizer

Optimality check

No

Timeoutexpired

I’

Page 47: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Outline• Problem statement• Motivation

– Peephole optimization• Equivalence of two instruction-sequences• A naïve superoptimizer• Massalin/Bansal-Aiken superoptimizer

– Canonicalization– Test cases– Counterexamples as tests – Pruning away sub-optimal candidates

• Stochastic superoptimization

Page 48: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Canonicalizationmov ebp, espmov esp, ebp

IA-32Superoptimizer

mov ebp, esp

mov ebp, eaxmov eax, ebp

IA-32Superoptimizer

mov ebp, eax

mov esi, espmov esp, esi

IA-32Superoptimizer

mov esi, esp

Page 49: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Canonicalizationmov ebp, espmov esp, ebp

IA-32Superoptimizer

mov ebp, esp

mov ebp, eaxmov eax, ebp

IA-32Superoptimizer

mov ebp, eax

mov esi, espmov esp, esi

IA-32Superoptimizer

mov esi, esp

mov reg1, reg2mov reg2, reg1

IA-32Superoptimizer

mov reg1, reg2

Page 50: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Canonicalization

add ebp, 20add ebp, 30

IA-32Superoptimizer

add ebp, 50

add reg1, c1add reg1, c2

IA-32Superoptimizer

add reg1, c1+c2

Page 51: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Canonicalization

add ebp, 20add ebp, 30 Canonicalizer

add ebp, 50

add reg1, c1add reg1, c2

Uncanonicalizer add reg1, c1+c2

IA-32Superoptimizer

reg1 ↦ ebpc1 ↦ 20c2 ↦ 30

Page 52: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Instruction-Sequence

Enumerator

Equivalence Check

I

A Naïve Superoptimizer

Optimality check

No

Timeoutexpired

I’

Page 53: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Instruction-Sequence

Enumerator

Equivalence Check

I

A Naïve Superoptimizer + Canonicalization

Optimality check

No

Timeoutexpired

I’

Canonicalizer

Canonicalizer

I’’

Uncanonicalizer

Page 54: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Outline• Problem statement• Motivation

– Peephole optimization• Equivalence of two instruction-sequences• A naïve superoptimizer• Massalin/Bansal-Aiken superoptimizer

– Canonicalization– Test cases– Counterexamples as tests – Pruning away sub-optimal candidates

• Stochastic superoptimization

Page 55: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Instruction-Sequence

Enumerator

Equivalence Check

I

A Naïve Superoptimizer + Canonicalization

Optimality check

No

Timeoutexpired

I’

Canonicalizer

Canonicalizer

I’’

Uncanonicalizer

Page 56: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Test Cases

mov reg1, [reg2]add reg1, reg3

mov reg1, [reg2]sub reg1, reg3

[REG3 ↦ 0, REG2 ↦ 1000,Mem[1000 ↦ 1]]

[REG1 ↦ 1, REG3 ↦ 0, REG2 ↦ 1000,Mem[1000 ↦ 1]]

[REG1 ↦ 1, REG3 ↦ 0, REG2 ↦ 1000,Mem[1000 ↦ 1]]

[REG3 ↦ 1, REG2 ↦ 1000,Mem[1000 ↦ 1]]

[REG1 ↦ 2, REG3 ↦ 1, REG2 ↦ 1000,Mem[1000 ↦ 1]]

[REG1 ↦ 0, REG3 ↦ 1, REG2 ↦ 1000,Mem[1000 ↦ 1]]

Page 57: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Instruction-Sequence

Enumerator

Equivalence check

I

Test Cases

Optimalitycheck

No

Timeoutexpired

I’

Canonicalizer

Canonicalizer Test cases

Fail

I’’

Uncanonicalizer

Page 58: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Checking equivalence of two instruction sequences

mov reg1, [reg2]add reg1, reg3

REG1’ = Mem(REG2) + REG3 ∧SF’ = (Mem(REG2) + REG3) < 0

mov reg1, [reg2]sub reg1, reg3

((REG1’ = Mem(REG2) + REG3 ∧SF’ = (Mem(REG2) + REG3) < 0) ⟺ (REG1’ = Mem(REG2) + REG3 ∧SF’ = (Mem(REG2) – REG3) < 0))

SMT solver

SAT[REG3 ↦ 1, REG2 ↦ 1000,Mem[1000 ↦ 1]]

REG1’ = Mem(REG2) + REG3 ∧SF’ = (Mem(REG2) – REG3) < 0

Counterexample

Page 59: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Instruction-Sequence

Enumerator

Equivalence check

I

Test Cases

Optimalitycheck

No

Timeoutexpired

I’

Canonicalizer

Canonicalizer Test cases

Fail

Counter-example

Counterexample-guided inductive

synthesis (CEGIS)

I’’

Uncanonicalizer

Page 60: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Outline• Problem statement• Motivation

– Peephole optimization• Equivalence of two instruction-sequences• A naïve superoptimizer• Massalin/Bansal-Aiken superoptimizer

– Canonicalization– Test cases– Counterexamples as tests – Pruning away sub-optimal candidates

• Stochastic superoptimization

Page 61: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Pruning sub-optimal candidates

.

.

.

.

.

.add ebp, c1+c2.....

One-instruction sequences Two-instruction sequences Three-instruction sequences

.

.

.

.

.

.add ebp, c1add ebp, c2.....

.

.

.

.

.

.

.

.

.

.

.

.

.

Page 62: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Instruction-Sequence

Enumerator + Pruning

Equivalence check

I

Superoptimizer

Optimalitycheck

No

Timeoutexpired

I’

Canonicalizer

Canonicalizer Test cases

Fail

Counter-example

Page 63: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Results – Search-space pruningLength Original search

spaceAfter canonicalization

After pruning Reduction factor

1 5453 997 644 8.5

2 29 million 2.49 million 1.2 million 24.7

3 162.1 billion 8.6 billion 3.11 billion 52.1

Original search space

After canonicalization

After pruning

After test-cases

Page 64: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

But still ….

Superoptimization is still a sloooooooooowwwww process

After canonicalization, pruning, and testing, checking all candidates of length up to 3 instructions takes several hours

Page 65: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Outline• Problem statement• Motivation

– Peephole optimization• Equivalence of two instruction-sequences• A naïve superoptimizer• Massalin/Bansal-Aiken superoptimizer

– Canonicalization– Test cases– Counterexamples as tests – Pruning away sub-optimal candidates

• Stochastic superoptimization

Page 66: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

© Eric Schkufza, ASPLOS 2013

Page 67: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

© Eric Schkufza, ASPLOS 2013

Page 68: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

© Eric Schkufza, ASPLOS 2013

Page 69: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

© Eric Schkufza, ASPLOS 2013

Montgomery multiplication kernel in OpenSSL library (116

instructions):STOKE’s result is 16 instructions

shorter and 1.6 times faster than gcc –O3’s result

Page 70: Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015

Outline• Problem statement• Motivation

– Peephole optimization• Equivalence of two instruction-sequences• A naïve superoptimizer• Massalin/Bansal-Aiken superoptimizer

– Canonicalization– Test cases– Counterexamples as tests – Pruning away sub-optimal candidates

• Stochastic superoptimization