OCR Introspection EDT Characterization & Profiling Infrastructure Intel TG Team

Upload: bryce-blankenship

Post on 31-Dec-2015



OCR Introspection EDT Characterization & Profiling Infrastructure

Intel TG Team


Overview

• Background
• Approach
• Current Methodology
• Initial Results
• Implementation & Interfaces
• Next Steps


Background

• Can we benefit from any ‘predictable behavior’ of EDTs?
  – Code or data placement choices based on EDTs’ resource requirements
  – Programmer productivity tool (datablock partitioning, EDT functionality)
• Characterizing EDT behavior
• Exploiting this information at runtime


Approach

[Figure: sequential_cholesky() EDT. X axis: TileSize (0 to 120). Y axis: log scale (1 to 100000000); reads and writes in bytes, floating point ops as a count. Legend: Read (bytes), with polynomial fits. Cubic best-fit curves shown on the plot:]

f(x) = 0.333333333333 x³ + 0.5 x² + 0.166666666665 x + 1.61995338E-11

f(x) = 2 x³ + 10.0000000000001 x² + 51.9999999999917 x + 100.000000000032

f(x) = 22 x³ + 62.0000000000046 x² + 83.9999999997752 x + 68.0000000023669

• Express EDT metrics as a function of size of data operated upon by EDT

• Use the above function at runtime to quickly estimate EDT’s resource requirements
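The runtime estimate described above can be sketched as a polynomial evaluation. This is only an illustration, not code from OCR: the function names, the lowest-order-first coefficient layout, and the rounding of the coefficients are assumptions. The coefficients are one of the cubic fits from the sequential_cholesky() plot, rounded to integers; the slide text does not say which metric that curve belongs to.

```c
#include <assert.h>
#include <stddef.h>

/* Evaluate c[0] + c[1]*x + ... + c[n-1]*x^(n-1) with Horner's rule.
 * Coefficients are stored lowest order first (an assumption of this sketch). */
double poly_eval(const double *c, size_t n, double x)
{
    assert(c != NULL && n > 0);
    double y = c[n - 1];
    for (size_t i = n - 1; i-- > 0; )
        y = y * x + c[i];
    return y;
}

/* One of the cubic fits from the plot, rounded to integers:
 * f(x) = 2x^3 + 10x^2 + 52x + 100. */
static const double kCholeskyFit[] = { 100.0, 52.0, 10.0, 2.0 };

/* Hypothetical helper: estimate the fitted metric for a given tile size. */
double estimate_for_tile(double tile_size)
{
    return poly_eval(kCholeskyFit, 4, tile_size);
}
```

Horner's rule needs one multiply and one add per coefficient, so the estimate stays cheap enough to compute each time an EDT becomes ready.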


Current Methodology

Several simplifying assumptions
  – EDT behavior is consistent
  – Metrics can be expressed as polynomials
  – Sum of sizes of datablocks used as a proxy for size of data operated upon by EDT

[Diagram: profiling workflow]

OCR App
  → LLVM Pass, Statistics →
Raw EDT Metrics: Instruction Count, Floating Point Ops, Reads & Writes, Datablock sizes
  → Summarize, Best-fit →
EDT Metrics: represented as polynomial function on sum of sizes of all datablocks used by an EDT
  → Generate header →
Profiling header: “edt1”, {polynomial coeffs}, “edt2”, {polynomial coeffs}, “edt3”, {polynomial coeffs},
  → Rebuild →
Runtime: can use the projections to inform its heuristics

#include <ocr-profiling.h>

struct _profileStruct gProfilingTable[] = {
    {"mainEdt", 1, 1, 1, 1, 0.0, 0.0, 0.0, 0.0,
     (double []){229,}, (double []){16,}, (double []){2068,}, (double []){2632,}, 0, },
    {"node_110_body", 2, 2, 2, 2, 0.0, 0.0, 0.0, 0.0,
     (double []){77620,25,}, (double []){0,0,}, (double []){14052,0,}, (double []){13432,1,}, 0, },
    {"node_210_body", 2, 2, 3, 2, 0.0, 0.0, 0.0, 0.0,
     (double []){63115,0,}, (double []){0,0,},
     (double []){-12838.4,0.853455,-1.40977e-05,}, (double []){28000,7.27596e-12,}, 0, },
    {"node_220_body", 3, 3, 4, 2, 0.0, 0.0, 0.0, 0.0,
     (double []){113716,-0.0242976,2.13737e-07,}, (double []){7004.25,-0.00177152,1.55834e-08,},
     (double []){33.7103,0.993679,-4.52053e-07,3.38311e-12,}, (double []){-3109.02,0.547309,}, 0, },
};
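A table like gProfilingTable could be searched by EDT function name. The sketch below is a guess at how that might look, not the OCR implementation: struct edt_profile, kProfiles, and find_profile are hypothetical names, and the entry is simplified to a single metric because the slides do not show the layout of struct _profileStruct. The sample values are the first coefficient arrays of the "mainEdt" and "node_110_body" entries.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified entry: one metric per EDT instead of the
 * several metrics carried by the generated table. */
struct edt_profile {
    const char   *name;     /* EDT function name */
    size_t        ncoeffs;  /* number of polynomial coefficients */
    const double *coeffs;   /* coefficient values, in table order */
};

static const double kMainEdt[] = { 229.0 };
static const double kNode110[] = { 77620.0, 25.0 };

static const struct edt_profile kProfiles[] = {
    { "mainEdt",       1, kMainEdt },
    { "node_110_body", 2, kNode110 },
};

/* Linear search by name; returns NULL when no projection exists. */
const struct edt_profile *find_profile(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof kProfiles / sizeof kProfiles[0]; ++i)
        if (strcmp(kProfiles[i].name, name) == 0)
            return &kProfiles[i];
    return NULL;
}
```

A linear scan is probably adequate if the lookup happens once per EDT template; a hash table would only matter for applications with very many EDT functions.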

OCR: LLVM Pass, Statistics credit: Romain Cledat

FP Ops = y = 0.33x^2 + 1.5x + 8000, where x = sum(size_of(DB[i]))
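As a worked instance of the formula above, the following sketch sums the datablock sizes and then applies the example polynomial. The function names and the use of a plain array of byte counts are assumptions made for illustration; OCR's own datablock types are not shown in the slides.

```c
#include <assert.h>
#include <stddef.h>

/* x = sum(size_of(DB[i])): total bytes across an EDT's datablocks. */
double sum_datablock_sizes(const size_t *sizes, size_t count)
{
    assert(sizes != NULL || count == 0);
    double total = 0.0;
    for (size_t i = 0; i < count; ++i)
        total += (double)sizes[i];
    return total;
}

/* y = 0.33x^2 + 1.5x + 8000, the example FP-ops projection. */
double project_fp_ops(double x)
{
    return (0.33 * x + 1.5) * x + 8000.0;
}
```

For three datablocks of 1000, 2000, and 3000 bytes, x is 6000 and the projection comes to roughly 11.9 million floating point operations.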


Initial Results

[Figure: Weighted Normalized Projection RMS Errors (%), y axis 0 to 16, with one series each for Instr Count, FP Ops, Reads, and Writes. Applications on the x axis: SCF-NWCHEM, Cholesky, Lulesh, Conjugate Gradient, Embarrassingly Parallel, Fourier Transform, Integer Sort, LU Solver, Multi-Grid]

OCR Apps credit: Chih-Chieh Yang, Adam Smith (NAS), Roger Golliver (LULESH), Jamie Arteaga (SCF)

Two phases

Three modes


Implementation & Interfaces

• Profiling
  – On x86, LLVM-pass + OCR statistics hooks added by Romain
  – On FSim, hardware counters
• Data analysis
  – Offline tools for summarizing & best fit
  – Making them online will add to runtime overhead (how much?)
• Projections
  – Optional “profile.h” #include’d by OCR app
    • Lists function name corresponding to each EDT with the polynomial coefficients of its metrics
  – Runtime picks it up during template creation
  – Runtime can use them when EDT is ready to run
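The best-fit step in the data analysis stage can be illustrated with an ordinary least-squares polynomial fit. The slides do not say which fitting method the offline tools use, so the sketch below is an assumed stand-in: it builds the normal equations and solves them by Gaussian elimination with partial pivoting. The names poly_fit, dabs, and MAX_TERMS are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TERMS 8  /* supports polynomials up to degree 7 */

static double dabs(double v) { return v < 0 ? -v : v; }

/* Fit y ~ c[0] + c[1]*x + ... + c[degree]*x^degree by least squares.
 * Returns 0 on success, -1 if the normal equations look singular. */
int poly_fit(const double *x, const double *y, size_t n,
             size_t degree, double *c)
{
    size_t m = degree + 1;
    assert(x && y && c && m <= MAX_TERMS && n >= m);

    double a[MAX_TERMS][MAX_TERMS + 1] = {{0}};

    /* Accumulate A = V^T V and b = V^T y, where V[k][i] = x[k]^i. */
    for (size_t k = 0; k < n; ++k) {
        double pw[2 * MAX_TERMS - 1];
        pw[0] = 1.0;
        for (size_t p = 1; p < 2 * m - 1; ++p)
            pw[p] = pw[p - 1] * x[k];
        for (size_t i = 0; i < m; ++i) {
            for (size_t j = 0; j < m; ++j)
                a[i][j] += pw[i + j];
            a[i][m] += pw[i] * y[k];
        }
    }

    /* Forward elimination with partial pivoting. */
    for (size_t col = 0; col < m; ++col) {
        size_t piv = col;
        for (size_t r = col + 1; r < m; ++r)
            if (dabs(a[r][col]) > dabs(a[piv][col]))
                piv = r;
        if (dabs(a[piv][col]) < 1e-12)
            return -1;
        if (piv != col)
            for (size_t j = 0; j <= m; ++j) {
                double t = a[col][j];
                a[col][j] = a[piv][j];
                a[piv][j] = t;
            }
        for (size_t r = col + 1; r < m; ++r) {
            double f = a[r][col] / a[col][col];
            for (size_t j = col; j <= m; ++j)
                a[r][j] -= f * a[col][j];
        }
    }

    /* Back substitution. */
    for (size_t i = m; i-- > 0; ) {
        double s = a[i][m];
        for (size_t j = i + 1; j < m; ++j)
            s -= a[i][j] * c[j];
        c[i] = s / a[i][i];
    }
    return 0;
}
```

Normal equations become ill-conditioned when x spans large values such as datablock sizes in bytes, so a real tool would likely rescale x or use a QR-based solver before trusting high-degree coefficients.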


Next Steps

• Clean up the post-processing tools & integrate with OCR
• Using the projections
  – Code/data placement decisions
  – Reacting to temperature, failures
• Better projections
  – Polynomial functions not always accurate
  – Datablock reads vs. writes & stack accesses
  – Quantifying other metrics