Trading Fault Tolerance for Performance in AN encoding
Norman A. Rink and Jeronimo Castrillon
Technische Universität Dresden
Computing Frontiers, 15-17 May 2017, Siena, Italy
Outline
1. Introduction – hardware faults and software-based fault tolerance
2. AN encoding
3. Configurable compiler-implemented AN encoding
4. Fault experiments and results
5. Summary and outlook
Hardware faults and soft errors
- Faults are a long-standing and recurring issue.
  - In safety-critical embedded devices.
  - In servers/data centers and HPC workloads.
- What causes hardware faults?
  - Cosmic radiation.
  - “Dim silicon” (near-threshold computing to save energy).
  - Temperature variations, process variations.
  → Typically lead to transient hardware faults, aka soft errors.
- Non-negligible fault rates in emerging computing paradigms:
  - Silicon-/carbon-based nano-technologies, graphene.
  - Chemical information processing, bio-inspired/quantum computing, NVM.
taken from S. Borkar, “Designing reliable systems from unreliable components: …,” IEEE Micro, vol. 25, no. 6, 2005.
Hardware faults and soft errors – cont’d
- The trade-off of computing with faulty hardware:
[Figure: cost (runtime, energy) plotted against error probability/output degradation. Labelled regions: safety-/security-critical applications, e.g. automotive, operating system (kernels); non-critical user applications, processes that can be restarted; approximate computing applications, e.g. image processing, machine learning; future and emerging HW (perhaps).]
→ Hardware-implemented fault tolerance is not flexible.
Error detection through redundant code
- Typical approach to detecting/correcting hardware faults:
  - Replicate data flow (duplication is sufficient for error detection).
  - Insert checks.
Original code:
    %3 = add i64 %0, %1
    %4 = mul i64 %3, %2

Fault-tolerant code (after transformation):
    ; duplicated dataflow
    %3 = add i64 %0, %1
    %r3 = add i64 %r0, %r1
    %4 = mul i64 %3, %2
    %r4 = mul i64 %r3, %r2
    ; error check
    %f0 = icmp eq i64 %4, %r4
    br i1 %f0, label %continue, label %recover

Runtime overhead of duplicated dataflow: typically < 2x.
- State of the art: EDDI ’02, SWIFT ’05.
  - Many variations and improvements exist.
  - It is usually assumed that the memory system is already protected, typically by ECC.
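The duplicate-and-compare idea on this slide can be sketched in C. This is an illustrative sketch only, not the code produced by EDDI or SWIFT: the function name `dup_add_mul` and its return convention are assumptions, and `volatile` is used here merely to discourage an optimizing compiler from merging the two copies of the dataflow.

```c
#include <stdint.h>

/* Computes (a + b) * c twice and compares the results.
   Returns 0 and stores the result in *out if both copies agree,
   returns -1 (the "recover" path) otherwise.
   dup_add_mul is a hypothetical name used for illustration. */
int dup_add_mul(int64_t a, int64_t b, int64_t c, int64_t *out) {
    /* Replicated inputs; volatile discourages the compiler from
       folding the redundant computation into the original one. */
    volatile int64_t ra = a, rb = b, rc = c;

    int64_t v3 = a + b;      /* original dataflow   */
    int64_t r3 = ra + rb;    /* duplicated dataflow */
    int64_t v4 = v3 * c;
    int64_t r4 = r3 * rc;

    if (v4 != r4) {          /* error check */
        return -1;
    }
    *out = v4;
    return 0;
}
```

In a fault-free run both copies agree, so calling `dup_add_mul(2, 3, 4, &out)` returns 0 and leaves 20 in `out`.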
AN encoding
- Correctness condition: Integer values are multiples of a fixed constant A.
  → Check for faults like this:
if (n % A != 0) { exit(AN_ERROR_CODE); }
- Advantages over replication:
  - Data in memory is encoded (automatically protected): no need to replicate loads/stores.
  - Suitable for multi-threaded and shared memory applications.
- Disadvantages: Large runtime overheads (several 10x and more).
  - Checking is expensive (modulo operation).
  - Decoding is expensive (division by A).
  - Decoding is often required, e.g., for address operands.
  → Idea: trade the frequencies of these operations (checking, decoding) for runtime.
Aside: more eager variants of AN encoding have even larger overheads.
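A minimal C sketch of the three basic operations (encode, check, decode) may help make the cost argument concrete. It assumes A = 3, the value used in the worked example of the back-up slides; the helper names and the value of `AN_ERROR_CODE` are hypothetical, and overflow of the encoded value is ignored.

```c
#include <stdint.h>
#include <stdlib.h>

#define A 3              /* illustrative constant (A = 3 as in the back-up example) */
#define AN_ERROR_CODE 2  /* hypothetical exit code; the slides give no value */

/* Encoding: multiply by A (assumes n * A does not overflow). */
int64_t an_encode(int64_t n) { return n * A; }

/* A value is valid if it is a multiple of A: one modulo operation. */
int an_is_valid(int64_t v) { return v % A == 0; }

/* Check as on the slide: abort the program on an invalid value. */
void an_check(int64_t v) {
    if (!an_is_valid(v)) {
        exit(AN_ERROR_CODE);
    }
}

/* Decoding: check first, then divide by A. */
int64_t an_decode(int64_t v) {
    an_check(v);
    return v / A;
}
```

With these definitions, 14 encodes to 42. Flipping the lowest bit of 42 gives 43, which is not a multiple of 3, so the check flags it. More generally, a single bit flip changes a value by a power of two, and no power of two is divisible by an odd A > 1, so such a flip in an encoded value is caught by the check as long as no overflow is involved.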
AN encoding – cont’d
- Make a program fault-tolerant by transforming it into an AN-encoded program.

1. Value encoding:
       integer constant c  →  integer constant c*A

2. Operation encoding, e.g.:
   multiplication:
       %3 = mul i64 %0, %1
     becomes
       %t3 = mul i64 %0, %1
       %3 = div i64 %t3, A
   load:
       %1 = load i64* %0
     becomes
       %t0 = ptrtoint i64* %0 to i64
       %t1 = div i64 %t0, A
       %t3 = inttoptr i64 %t1 to i64*
       %1 = load i64* %t3

3. Check insertion:
   - Where non-encoded values enter/leave the scope of AN encoding:
     memory accesses, calls to external functions, return values.
   → Often referred to as synchronization points.
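The reason multiplication needs the extra division, while addition does not, follows from the arithmetic: (a·A) + (b·A) = (a+b)·A is already a valid code word, whereas (a·A)·(b·A) = (a·b)·A² carries one factor of A too many. A small C sketch, assuming A = 3 and hypothetical helper names, with operands small enough that the intermediate product does not overflow:

```c
#include <stdint.h>

#define A 3  /* illustrative constant (assumption) */

/* Encoded addition: the sum of two multiples of A is a multiple of A,
   so the native add can be used unchanged. */
int64_t an_add(int64_t xa, int64_t ya) {
    return xa + ya;
}

/* Encoded multiplication: the raw product is (x*y)*A*A, so one factor
   of A is divided out again, mirroring the mul/div pair on the slide.
   Assumes xa * ya fits into 64 bits. */
int64_t an_mul(int64_t xa, int64_t ya) {
    int64_t t = xa * ya;
    return t / A;
}
```

For example, the encoded operands 18 (for 6) and 21 (for 7) yield 39 (for 13) under `an_add` and 126 (for 42) under `an_mul`.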
Configurable compiler-implemented AN encoding
- optional pre-optimizations
  - AN encoding increases the complexity of code.
  - Hence, after AN encoding, the compiler may fail to spot opportunities for optimization.
- value encoding, operation encoding, check insertion
  - As previously discussed.
  - Remember, checks are inserted at so-called synchronization points.
- expansion
  - Turn encoding, decoding and checking operations into (sequences of) native machine operations.
Encoding variants
- check insertion
  - Always check before values are stored to memory.
  - Always check before values are decoded.
Code generated for the variants during compilation.
One local accumulator per function.
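The slide only names "one local accumulator per function" without showing the generated code. The following C sketch shows one way an accumulator could reduce the number of modulo operations, under the assumption that values are summed into the accumulator and a single check covers all of them (a sum of multiples of A is again a multiple of A). The type and function names are hypothetical, A = 3 is chosen for illustration, and overflow of the accumulator is ignored.

```c
#include <stdint.h>

#define A 3  /* illustrative constant (assumption) */

/* Hypothetical per-function accumulator. */
typedef struct {
    int64_t acc;
} an_accu_t;

/* Instead of checking v immediately, add it to the accumulator. */
void an_accumulate(an_accu_t *s, int64_t v) {
    s->acc += v;
}

/* One modulo operation validates everything accumulated so far,
   assuming at most one of the accumulated values is corrupted. */
int an_accu_ok(const an_accu_t *s) {
    return s->acc % A == 0;
}
```

The design trade-off in this sketch is that detection is deferred until the accumulator is checked, and that two corrupted values could cancel out, which the single-fault assumption rules out.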
Runtime overheads
label  test case
A-C    bubblesort
D      CRC (cyclic redundancy checker)
E      DES encryption
F      Dijkstra
G-I    lex, parse, eval
J      Fibonacci
K      integer matrix multiply
L      memcopy
M-O    quicksort

Best results with pre-optimizations.
Fault injection
- Assumptions:
  - Only a single fault affects program execution.
  - Only single bit flips occur.
  → Commonly justified by the rarity of faults (SEU: single event upset).
- Simulate symptoms of faults by …
  - … flipping a random bit in a random register (faults in the CPU).
  - … flipping a random bit in a random memory location that is accessed (faults in memory).
- Evaluate AN encoding on the conditional probability:
  pSDC = P( “silent data corruption” | “hardware fault (visible at the architecture level)” )
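The single-bit-flip fault model can be illustrated with a toy experiment in C. This is not the fault injection setup used for the results in this talk, which injects into registers and memory of running programs; it only flips bits of one encoded value and counts how many flips a modulo check would miss. The function names are hypothetical, and the cast from `uint64_t` back to `int64_t` assumes the usual two's complement behavior of common compilers.

```c
#include <stdint.h>

/* Flip one bit of v (bit 0 is the least significant bit). */
int64_t flip_bit(int64_t v, unsigned bit) {
    return (int64_t)((uint64_t)v ^ (UINT64_C(1) << bit));
}

/* Encode n with the constant a, flip each of the 64 bits in turn,
   and count the flips that still leave a multiple of a, i.e. the
   flips that a check (v % a != 0) would not detect. */
int count_undetected(int64_t n, int64_t a) {
    int64_t encoded = n * a;
    int undetected = 0;
    for (unsigned bit = 0; bit < 64; ++bit) {
        if (flip_bit(encoded, bit) % a == 0) {
            ++undetected;
        }
    }
    return undetected;
}
```

For the value 14, an odd constant such as 3 leaves no single bit flip undetected, whereas the even constant 4 misses every flip above the two lowest bits. Dividing such a count by the number of injected faults gives an estimate in the spirit of pSDC for this toy setting only.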
Faults in the CPU – full results
[Figure: stacked bars of the relative frequencies (0.0 to 1.0) of the outcomes correct, detected, hang, crash, and sdc, for test cases A to O. Bars per test case: the plain label (e.g. A) and the variants .1, .2, .3, .po.1, .po.2, .po.3. The sdc share is pSDC.]
Faults in the CPU and in memory – SDC
[Figure: two bar charts of pSDC for test cases A to O and their means, for the variants 1, 2, 3, po.1, po.2, po.3; one panel for faults in the CPU, one for faults in memory. Annotations: pSDC · 10^3; 24x, 55x, 7x, 15x.]
Memory:
- Memory accesses are relatively rare.
- A single vulnerable access has a much stronger effect.
- Stack accesses have been found to be particularly vulnerable.
Summary and outlook
- On less reliable HW, some applications …
  - … can live with the occasional error (approximate computing).
  - … may require additional measures to tolerate faults (e.g. OS kernels).
- Encoding is an interesting alternative to instruction duplication.
  - Protection of entire computing systems, multi-threaded and shared memory applications.
- AN encoding has large runtime overheads (several 10x to 100x).
  - Here: reduction from 9.9x to 3.6x, accompanied by an increase in pSDC.
  - Further reduction to 2.1x if pointers are not encoded.
- Can we design HW that supports error detection by encoding?
- Must understand better the interaction of encoding with compiler optimizations.
- In assessments, would like to avoid long-running fault injection experiments.
Work supported by the German Research Foundation (DFG) within the Cluster of Excellence ‘Center for Advancing Electronics Dresden’ (cfaed). Thank you.
Back up
Worked example of AN encoding
(A = 3)

Source:
    *p = foo( (*p) + 42 );

Original code:
    %1 = load i64* %0
    %2 = add i64 %1, 42
    %3 = call i64 @foo(i64 %2)
    store i64 %3, i64* %0

After value encoding:
    %1 = load i64* %0
    %2 = add i64 %1, 126
    %3 = call i64 @foo(i64 %2)
    store i64 %3, i64* %0

After operation encoding:
    %t0 = ptrtoint i64* %0 to i64
    %t1 = div i64 %t0, 3
    %t2 = inttoptr i64 %t1 to i64*
    %1 = load i64* %t2
    %2 = add i64 %1, 126
    %t3 = div i64 %2, 3
    %t4 = call i64 @foo(i64 %t3)
    %3 = mul i64 %t4, 3
    %t5 = ptrtoint i64* %0 to i64
    %t6 = div i64 %t5, 3
    %t7 = inttoptr i64 %t6 to i64*
    store i64 %3, i64* %t7

After check insertion:
    call void @check(i64 %0)
    %t0 = ptrtoint i64* %0 to i64
    %t1 = div i64 %t0, 3
    %t2 = inttoptr i64 %t1 to i64*
    %1 = load i64* %t2
    call void @check(i64 %1)
    %2 = add i64 %1, 126
    call void @check(i64 %2)
    %t3 = div i64 %2, 3
    %t4 = call i64 @foo(i64 %t3)
    %3 = mul i64 %t4, 3
    call void @check(i64 %0)
    %t5 = ptrtoint i64* %0 to i64
    %t6 = div i64 %t5, 3
    %t7 = inttoptr i64 %t6 to i64*
    call void @check(i64 %3)
    store i64 %3, i64* %t7
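For readers more comfortable with C than with LLVM IR, the final stage can be approximated at source level. The sketch below is a simplification under stated assumptions: the pointer itself is left unencoded (unlike in the IR above), `foo` is a stand-in external function that simply doubles its argument, and `AN_ERROR_CODE` and `update_encoded` are hypothetical names.

```c
#include <stdint.h>
#include <stdlib.h>

#define A 3              /* as in the worked example */
#define AN_ERROR_CODE 2  /* hypothetical exit code */

/* Stand-in for the external, non-encoded function foo (assumption). */
int64_t foo(int64_t x) { return 2 * x; }

/* Abort if v is not a multiple of A. */
void check(int64_t v) {
    if (v % A != 0) {
        exit(AN_ERROR_CODE);
    }
}

/* Encoded version of *p = foo((*p) + 42), where *p holds an encoded
   value and the pointer p itself is NOT encoded (simplification). */
void update_encoded(int64_t *p) {
    int64_t v = *p;
    check(v);                    /* check the loaded value         */
    int64_t s = v + 42 * A;      /* constant 42 is encoded as 126  */
    check(s);                    /* check before decoding          */
    int64_t r = foo(s / A) * A;  /* decode argument, encode result */
    check(r);                    /* check before storing           */
    *p = r;
}
```

Starting from the encoded value 15 (for 5), the call stores 282, which decodes to 94 = foo(5 + 42).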