Empirical Comparison of Three Versioning Architectures. Hajime Fujita (1,2,*), Kamil Iskra (2), Pavan Balaji (2), Andrew A. Chien (1,2). (1) University of Chicago, (2) Argonne National Laboratory. Cluster 2015, Sep 11, 2015. This work was supported by the Office of Advanced Scientific Computing Research, Office of Science, U.S. Department of Energy, under Award DE-SC0008603 and Contract DE-AC02-06CH11357 and completed in part with resources provided by the University of Chicago Research Computing Center. * Now at Intel


TRANSCRIPT

Page 1: Empirical Comparison of Three Versioning Architectures

Hajime Fujita (1,2,*), Kamil Iskra (2), Pavan Balaji (2), Andrew A. Chien (1,2)
(1) University of Chicago, (2) Argonne National Laboratory

Cluster 2015, Sep 11, 2015

This work was supported by the Office of Advanced Scientific Computing Research, Office of Science, U.S. Department of Energy, under Award DE-SC0008603 and Contract DE-AC02-06CH11357 and completed in part with resources provided by the University of Chicago Research Computing Center.

* Now at Intel

Page 2: Background

• High error rate in large-scale supercomputers
• Growing concern about latent errors (e.g. silent data corruption)
  o Errors with a delay (latency) between their occurrence and their detection
• Multi-versioned data stores are a promising approach to addressing latent errors

Page 3: How Does Multi-versioning Help?

• Multi-versioning enables flexible recovery from latent errors

[Figure: three recovery timelines, each running from "Start" through "error occurred" to "Error detected", with corrupted checkpoints/versions in between. Panels: (a) Traditional checkpoint/restart (Restart); (b) Rollback using an old version (restored version, Rollback); (c) Forward error correction using an old version (recovery using a part of an old version, new state).]

Page 4: Programming with GVR (Global View Resilience)

• Globally-shared, multi-version array for application state preservation
• Explicit library calls for array manipulation/version creation

[Figure: processes issue Put and Get operations on global arrays (Array A, Array B); each array holds multiple versions (Version 1, Version 2, ...).]
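As a rough illustration of the programming model on this slide, the sketch below mimics explicit put/get calls and explicit version creation on a toy array. The names `VersionedArray`, `put`, `get`, and `version_inc` are hypothetical and are not the GVR library's API; the toy runs in a single process and, for simplicity, keeps a full copy per version.

```python
class VersionedArray:
    """Toy, single-process stand-in for a multi-version array (not the GVR API)."""

    def __init__(self, size):
        self.current = [0] * size   # working copy that put/get operate on
        self.versions = []          # preserved snapshots, oldest first

    def put(self, offset, values):
        """Write `values` into the current version starting at `offset`."""
        if offset < 0 or offset + len(values) > len(self.current):
            raise IndexError("put out of range")
        self.current[offset:offset + len(values)] = values

    def get(self, offset, count, version=None):
        """Read from the current version, or from a preserved one if given."""
        source = self.current if version is None else self.versions[version]
        return source[offset:offset + count]

    def version_inc(self):
        """Preserve the current contents as a new version and return its number."""
        self.versions.append(list(self.current))
        return len(self.versions) - 1


array_a = VersionedArray(8)
array_a.put(0, [1, 2, 3])
v0 = array_a.version_inc()           # explicit version creation
array_a.put(1, [99])                 # later update (imagine it is corrupted)
restored = array_a.get(0, 3, version=v0)   # read the preserved state back
```

Reading from `v0` after the later `put` is the kind of access that rollback or partial recovery from an old version would rely on.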

Page 5: Many Versions are Partial Updates

[Figure: bar chart of the modified ratio per version (0% to 100%), split into Modified and Unmodified, for OpenMC and canneal.]

• Opportunity for saving storage/bandwidth requirements

H. Fujita et al., "Log-Structured Global Array for Efficient Multi-Version Snapshots," CCGrid 2015

Page 6: How to Make Versions Efficiently?

• Approach 1: Copy entire array each time
• Approach 2: Keep updated data only
• Approach 3: Allocate memory block on-demand

[Figure: "Current" and "Old" array diagrams for the approaches, arranged along a Low-to-High axis labeled "Runtime Overhead" and "Memory Savings".]

Page 7: Approach 1: Flat Array

• Copy and keep entire array on each version creation

[Figure: Current Version, Version 1, Version 0]

✔ Simple structure, fast access
✖ High memory demand, copy overhead
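To make the "high memory demand, copy overhead" point concrete, here is a small back-of-the-envelope sketch. It assumes the flat scheme holds the current array plus one full copy per retained version; the function names and the count of four retained versions are hypothetical, while 256 MiB/proc is the array size quoted on the benchmark slides.

```python
MIB = 1024 * 1024


def flat_memory_bytes(array_bytes, retained_versions):
    """Memory held by a flat scheme: current array plus one full copy per version."""
    return array_bytes * (retained_versions + 1)


def flat_copy_bytes_per_version(array_bytes):
    """Bytes copied at each version creation, regardless of how much changed."""
    return array_bytes


array_bytes = 256 * MIB   # per-process array size from the benchmark slides
footprint = flat_memory_bytes(array_bytes, retained_versions=4)  # 4 is hypothetical
copied = flat_copy_bytes_per_version(array_bytes)
# footprint is 5 * 256 MiB = 1280 MiB per process; copied is 256 MiB per version
```

Under this assumption neither number depends on the fraction of the array that was actually modified.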

Page 8: Approach 2: Flat with Change Tracking

• Use a flat array for the current version, then record only the updated regions upon version creation

[Figure: Current Version, Version 1, Version 0]

• How to detect an updated region?
  o User: GVR library records updates on write operations (e.g. put() or acc())
  o Kernel: Page write protection + page fault handling

✔ Relatively fast access, small footprint
✖ At least one full array, change tracking overhead
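The user-level variant of change tracking can be sketched as follows: every write marks the blocks it touches, and version creation copies only those blocks. This is a minimal single-process sketch under stated assumptions, not the GVR implementation; the class `TrackedFlatArray`, its methods, and the 4-element block size are hypothetical, and versions here store new values (the redo direction).

```python
class TrackedFlatArray:
    """Flat current array plus per-version copies of modified blocks only."""

    BLOCK = 4  # elements per tracked block (hypothetical granularity)

    def __init__(self, size):
        self.current = [0] * size
        self.dirty = set()     # block numbers written since the last version
        self.versions = []     # one dict per version: block number -> block copy

    def put(self, offset, values):
        """Write into the current array and record which blocks were touched."""
        if offset < 0 or offset + len(values) > len(self.current):
            raise IndexError("put out of range")
        if not values:
            return
        self.current[offset:offset + len(values)] = values
        first = offset // self.BLOCK
        last = (offset + len(values) - 1) // self.BLOCK
        self.dirty.update(range(first, last + 1))

    def version_inc(self):
        """Copy only the dirty blocks into a new version, then reset tracking."""
        delta = {b: self.current[b * self.BLOCK:(b + 1) * self.BLOCK]
                 for b in sorted(self.dirty)}
        self.versions.append(delta)
        self.dirty.clear()
        return len(self.versions) - 1

    def get(self, index, version=None):
        """Read one element; old versions are found by walking back through deltas."""
        if version is None:
            return self.current[index]
        block, within = divmod(index, self.BLOCK)
        for v in range(version, -1, -1):
            if block in self.versions[v]:
                return self.versions[v][block][within]
        return 0  # block never modified up to that version: initial value


arr = TrackedFlatArray(16)
arr.put(5, [7, 8])
v0 = arr.version_inc()
arr.put(0, [1])
arr.put(6, [9])
v1 = arr.version_inc()
```

Reads of the current version stay a plain array access, while reads of an old version may have to search several deltas, which is one way to picture the trade-off listed on the slide.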

Page 9: Approach 3: Log-structured Array

• Allocate memory block on-demand
  o Allocated regions form a log
  o Log = data + metadata (index)

[Figure: Current Version, Version 1, Version 0, Log]

✔ Small footprint
✖ High access overhead

H. Fujita et al., "Log-Structured Global Array for Efficient Multi-Version Snapshots," CCGrid 2015
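A simplified way to picture the log-structured array: blocks are allocated only when first written in a version, appended to a log, and located through an index. The sketch below is an assumption-laden toy, not the design from the CCGrid 2015 paper; `LogStructuredArray`, the 4-element block size, and keeping the index beside the log rather than inside it are all illustrative choices.

```python
class LogStructuredArray:
    """Toy log-structured array: append-only block log plus per-version index."""

    BLOCK = 4  # elements per block (hypothetical)

    def __init__(self):
        self.log = []              # data: blocks in allocation order
        self.index = {}            # metadata for the current version: block -> log position
        self.version_indexes = []  # frozen copy of the index for each version
        self.frozen_upto = 0       # log positions below this belong to old versions

    def put(self, index, value):
        block, within = divmod(index, self.BLOCK)
        pos = self.index.get(block)
        if pos is None or pos < self.frozen_upto:
            # First write to this block in the current version: allocate on demand,
            # carrying over the previous contents if the block existed before.
            old = self.log[pos] if pos is not None else [0] * self.BLOCK
            self.log.append(list(old))
            pos = len(self.log) - 1
            self.index[block] = pos
        self.log[pos][within] = value

    def get(self, index, version=None):
        lookup = self.index if version is None else self.version_indexes[version]
        block, within = divmod(index, self.BLOCK)
        pos = lookup.get(block)    # every access goes through the index
        return 0 if pos is None else self.log[pos][within]

    def version_inc(self):
        self.version_indexes.append(dict(self.index))
        self.frozen_upto = len(self.log)
        return len(self.version_indexes) - 1


arr = LogStructuredArray()
arr.put(5, 7)
v0 = arr.version_inc()
arr.put(5, 8)   # allocates a new block; the block seen by v0 stays untouched
arr.put(6, 1)   # same block, already allocated in this version: updated in place
```

Only two blocks exist in the log after these writes, which reflects the small footprint, while the index lookup on every `get` and `put` hints at where the access overhead comes from.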

Page 10: Problem Statement

Which array architecture brings the best performance and the lowest memory consumption under various workload characteristics?

Page 11: Synthetic Benchmark

• Get() and Put() to random locations + version creation
• Based on APEX-Map [E. Strohmaier et al. 2004]

[Figure: access probability versus array index across processes P0, P1, P2]

Parameter:
• Versioning frequency (= how many get/put ops per version)

Environment:
• UChicago RCC Midway
• Intel Xeon E5-2670 (8 cores x2)
• InfiniBand FDR-10
• MVAPICH2 (gcc)
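The workload described on this slide can be approximated by a loop that mixes reads and writes at random indices and creates a version every fixed number of operations. This is a hedged single-process sketch, not the authors' benchmark: the function `run_workload`, its parameters, and the power-law index draw (intended to imitate APEX-Map-style locality, with `alpha` as an assumed name for the skew parameter) are all assumptions.

```python
import random


def run_workload(array_size, num_ops, ops_per_version,
                 read_ratio=0.5, alpha=1.0, seed=0):
    """Issue random reads/writes and count the versions that would be created."""
    rng = random.Random(seed)
    array = [0] * array_size
    reads = writes = versions_created = 0
    for op in range(1, num_ops + 1):
        # Skewed index draw: alpha = 1.0 is uniform; smaller alpha concentrates
        # accesses near index 0 (assumed model of locality).
        index = min(int(array_size * rng.random() ** (1.0 / alpha)),
                    array_size - 1)
        if rng.random() < read_ratio:
            _ = array[index]          # stands in for Get()
            reads += 1
        else:
            array[index] = op         # stands in for Put()
            writes += 1
        if op % ops_per_version == 0:
            versions_created += 1     # a real run would create a version here
    return reads, writes, versions_created


reads, writes, versions = run_workload(
    array_size=1024, num_ops=1000, ops_per_version=100)
```

A real measurement would replace the plain list with one of the three array architectures and time the loop to obtain throughput.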

Page 12: Runtime Performance

• Flat with change tracking is best for performance

[Figure: throughput (K ops/s); #procs=32, block size=4096 B, array size=256 MiB/proc, read ratio=50%]

Page 13: Memory Usage

• Log-structured array is best for memory usage

[Figure: memory usage (MiB); #procs=32, block size=4096 B, array size=256 MiB/proc, read ratio=50%, versioning frequency=1e-5]

Page 14: Related Work

• Log-structured file systems
  o LFS [Rosenblum 1992], PLFS [Bent 2009]
  o Focused on improving write performance, while our focus is on capturing writes
• Log-structured distributed data stores
  o RAMCloud [Ongaro 2011, Rumble 2014], SILT [Lim 2011], Pilaf [Mitchell 2013]
  o Similar structure to log-structured array
  o GVR is array-oriented (not KV-oriented)
• Incremental checkpointing
  o [Plank 1995], TICK [Gioiosa 2005], [Agarwal 2004]
  o Not focused on RDMA, which poses a new challenge to transparent change tracking

Page 15: Summary

• Compared three versioning architectures for efficient versioning
  o Flat
  o Flat with change tracking
  o Log-structured array
• Findings from synthetic benchmark
  o Flat with change tracking: best performance in most cases
  o Log-structured array: best choice for memory savings
• Future Work
  o Broader evaluation including version retrieval cost and application-level performance
  o Investigation of hardware/software architectures that allow fine-grain, efficient change tracking on remote memory

http://gvr.cs.uchicago.edu

Page 16: Backup

Page 17: Versioning Schemes

• Flat
  o Contiguous buffer, whole copy for each version
• Flat+change tracking
  o Flat array serves as a current version, keeps copies of modified blocks
  o Change tracking mechanisms
    • User-specified (arbitrary granularity)
    • OS kernel (page-level)
    • CPU (page-level)
  o Versioning directions
    • Incremental: logs new value
    • Decremental: logs old value
• Log-structured Array
  o Appends modified blocks to the log
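The two versioning directions can be illustrated with a toy example: an incremental (redo) log stores the new values written between consecutive versions, and a decremental (undo) log stores the old values they replaced. The sketch below is illustrative only; the function names are hypothetical, and the logs are built by diffing full snapshots purely for brevity, whereas a real change-tracking scheme would record them as writes happen.

```python
def make_logs(snapshots):
    """Build redo and undo logs from full snapshots listed oldest first."""
    redo, undo = [], []
    for old, new in zip(snapshots, snapshots[1:]):
        changed = [i for i in range(len(old)) if old[i] != new[i]]
        redo.append({i: new[i] for i in changed})   # incremental: logs new value
        undo.append({i: old[i] for i in changed})   # decremental: logs old value
    return redo, undo


def restore_with_redo(oldest, redo, target):
    """Start from the oldest version and replay deltas forward up to `target`."""
    state = list(oldest)
    for delta in redo[:target]:
        state.update(delta) if isinstance(state, dict) else [
            state.__setitem__(i, v) for i, v in delta.items()]
    return state, target                            # (state, deltas applied)


def restore_with_undo(newest, undo, target):
    """Start from the newest version and roll deltas back down to `target`."""
    state = list(newest)
    applied = 0
    for delta in reversed(undo[target:]):
        for i, value in delta.items():
            state[i] = value
        applied += 1
    return state, applied


snapshots = [[0, 0, 0], [1, 0, 0], [1, 2, 0], [1, 2, 3]]
redo, undo = make_logs(snapshots)
old_via_redo = restore_with_redo(snapshots[0], redo, target=1)
old_via_undo = restore_with_undo(snapshots[-1], undo, target=1)
new_via_undo = restore_with_undo(snapshots[-1], undo, target=3)
```

In this toy, restoring version 1 takes one delta with redo but two with undo, while restoring the newest version takes no deltas with undo, so the cheaper direction depends on which end of the history is being restored.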

Page 18: Versioning Schemes

• Flat array
• Flat with change tracking
• Log-structured array

Page 19: Change Tracking/Versioning Direction

• Change tracking schemes
  o User
  o Kernel
  o Hardware
• Versioning directions
  o Undo
  o Redo

Page 20: Fine-grain Comparison on Memory Change Tracking (1)

• Memory access latency of the first write to each page
• Kernel change tracking has higher latency due to page fault handling

Page 21: Fine-grain Comparison on Memory Change Tracking (2)

[Figure, two panels: performance with redo versions relative to no versioning, array size=128 MiB; performance with undo versions relative to redo ones, array size=128 MiB]

Page 22: Performance Comparison (1)

• Flat with change tracking works the best when versioning frequency is high
• Log-structured array has poor performance, especially when versioning frequency is low

[Figure: performance plot with axis annotations "Better" and "More frequent"]

Page 23: Performance Comparison (2)

[Figure: performance over various versioning frequencies; RMA, #procs=32, block size=4096 B, array size=512 MB/proc, read ratio=50%]

• Log-structured array works better for localized (smaller k) access patterns

Page 24: Memory Consumption

• Log-structured array requires the least amount of memory
• Undo versioning requires additional memory for the undo buffer
• Flat array requires a fixed amount of memory, regardless of locality
• For flat with tracking and log-structured array, higher locality incurs lower memory consumption

Page 25: Version Retrieval Cost

• Partial retrieval
  o e.g. Localized recovery
  1. Create 256 versions with certain fill ratio
  2. Pick one version
  3. Read from 10,000 random locations in that version
• Full retrieval
  o e.g. Full rollback
  1. Create 256 versions with certain fill ratio
  2. Pick one version
  3. Read the entire region of that version

[Figure: Get issued against one of several stored versions, shown once for each retrieval mode]
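The two retrieval procedures above can be sketched on a toy version store. This is not the authors' experiment code: versions are kept as redo-style deltas (one possible layout), the array is far smaller than in the slides, timing is omitted, and all names (`build_versions`, `read_from_version`, `read_full_version`, `ARRAY_SIZE`) are hypothetical.

```python
import random


def build_versions(array_size, num_versions, fill_ratio, rng):
    """Create versions as deltas, each modifying about fill_ratio of the array."""
    per_version = max(1, int(array_size * fill_ratio))
    deltas = []
    for v in range(num_versions):
        indices = rng.sample(range(array_size), per_version)
        deltas.append({i: v + 1 for i in indices})
    return deltas


def read_from_version(deltas, version, index):
    """Partial retrieval: find the latest delta at or before `version` holding index."""
    for v in range(version, -1, -1):
        if index in deltas[v]:
            return deltas[v][index]
    return 0  # never written up to that version: initial value


def read_full_version(deltas, version, array_size):
    """Full retrieval: replay deltas 0..version onto an all-zero array."""
    state = [0] * array_size
    for delta in deltas[:version + 1]:
        for i, value in delta.items():
            state[i] = value
    return state


rng = random.Random(0)
ARRAY_SIZE = 4096                       # hypothetical toy size
deltas = build_versions(ARRAY_SIZE, num_versions=256, fill_ratio=0.01, rng=rng)
picked = rng.randrange(256)             # step 2: pick one version
locations = [rng.randrange(ARRAY_SIZE) for _ in range(10_000)]
partial = [read_from_version(deltas, picked, i) for i in locations]  # step 3 (partial)
full = read_full_version(deltas, picked, ARRAY_SIZE)                 # step 3 (full)
```

To turn this into a measurement, each retrieval call could be wrapped with `time.perf_counter()` and repeated for different picked versions and fill ratios.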

Page 26: Full Version Retrieval Cost

• Flat/log array have constant cost of version rollback
• Redo versioning is good at restoring older versions, whereas undo is good at newer versions

Page 27: Partial Version Retrieval Cost

• Flat/log array have less variable, shorter latency
• Flat with tracking shows higher variation and higher average latency

[Figure: fill ratio = 1%]

Page 28: Incremental/decremental

Page 29: Summary on Evaluation

• Flat with change tracking architecture achieves the best performance in most cases
• Flat and log-structured arrays achieve less variable and lower version retrieval cost, whereas flat with change tracking shows more variable and higher cost
• Flat with change tracking would be the best for moderate or low versioning frequency
• Log-structured array is the best choice if the versioning frequency is high, or memory consumption is the primary concern

Page 30: Future Work

• Analysis of data redundancy inside the array, seeking a way to harden the array (e.g. error correction coding)
• Design & evaluation of a network/memory device capable of fine-grain change tracking on (remote) memory access