
Hajime Fujita, Cluster 2015

Empirical Comparison of Three Versioning Architectures

Hajime Fujita (1, 2, *), Kamil Iskra (2), Pavan Balaji (2), Andrew A. Chien (1, 2)

(1) University of Chicago, (2) Argonne National Laboratory

Sep 11, 2015

This work was supported by the Office of Advanced Scientific Computing Research, Office of Science, U.S. Department of Energy, under Award DE-SC0008603 and Contract DE-AC02-06CH11357 and completed in part with resources provided by the University of Chicago Research Computing Center.

* Now at Intel


Background

• High error rate in large-scale supercomputers
• Growing concern about latent errors (e.g. silent data corruption)
  o Errors with latency between their occurrence and detection
• Multi-versioned data stores are a promising approach to addressing latent errors


How Multi-version Helps?

• Multi-versioning enables flexible recovery from latent errors

[Figure: timelines of three recovery schemes, each running from start to error detection. (a) Traditional checkpoint/restart: restart. (b) Rollback using an old version: rollback to a restored version. (c) Forward error correction using an old version: recovery using a part of an old version to build a new state.]


Programming with GVR (Global View Resilience)

• Globally-shared, multi-version array for application state preservation
• Explicit library calls for array manipulation/version creation

[Figure: processes issue Put and Get operations on shared arrays (Array A, Array B), each holding multiple versions (Version 1, Version 2, ...).]
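
As a rough illustration of the programming model on this slide, the sketch below mimics put, get, and version creation on a single-process, in-memory array. The class and method names (`ToyVersionedArray`, `put`, `get`, `version_inc`) are hypothetical and are not the GVR API; the real library operates on globally shared, distributed arrays.

```python
class ToyVersionedArray:
    """Single-process stand-in for a multi-version array (not the GVR API)."""

    def __init__(self, size):
        self.current = [0] * size   # working copy that put/get operate on
        self.versions = []          # preserved snapshots, oldest first

    def put(self, offset, values):
        """Write values into the current version starting at offset."""
        self.current[offset:offset + len(values)] = values

    def get(self, offset, count, version=None):
        """Read from the current version, or from a preserved one by index."""
        src = self.current if version is None else self.versions[version]
        return src[offset:offset + count]

    def version_inc(self):
        """Preserve the current state as a new version and return its index."""
        self.versions.append(list(self.current))
        return len(self.versions) - 1


a = ToyVersionedArray(8)
a.put(0, [1, 2, 3])
v0 = a.version_inc()
a.put(1, [9])                   # a later update, possibly a corrupted one
print(a.get(0, 3))              # current state: [1, 9, 3]
print(a.get(0, 3, version=v0))  # preserved state: [1, 2, 3]
```

Reading from `v0` after the later update is what makes rollback or partial recovery possible once an error is detected.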

Many Versions are Partial Updates

[Figure: modified ratio per version (0% to 100%, modified vs. unmodified) for OpenMC and canneal.]

Opportunity for saving storage/bandwidth requirements

H.Fujita, et al., Log-Structured Global Array for Efficient Multi-Version Snapshots, CCGrid 2015


How to Make Versions Efficiently?

• Approach 1: Copy entire array each time
• Approach 2: Keep updated data only
• Approach 3: Allocate memory block on-demand

[Figure: current and old version layouts for each approach, placed on an axis running from low to high runtime overhead and memory savings.]


Approach 1: Flat Array

• Copy and keep entire array on each version creation

[Figure: Current Version, Version 1, Version 0.]

✔ Simple structure, fast access
✖ High memory demand, copy overhead
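
To make the memory cost of the flat approach concrete, here is a minimal sketch assuming a byte-addressed, single-process array. `FlatVersionedArray` and its methods are illustrative names, not part of GVR.

```python
class FlatVersionedArray:
    """Approach 1 sketch: every version is a full copy of the array."""

    def __init__(self, nbytes):
        self.current = bytearray(nbytes)
        self.versions = []  # each entry is a complete copy

    def put(self, offset, data):
        self.current[offset:offset + len(data)] = data

    def create_version(self):
        # Copy time and memory are proportional to the full array size,
        # no matter how little was modified.
        self.versions.append(bytes(self.current))

    def footprint(self):
        """Bytes held by the current array plus all preserved versions."""
        return len(self.current) + sum(len(v) for v in self.versions)


arr = FlatVersionedArray(4096 * 4)
for i in range(3):
    arr.put(i * 4096, b"x")   # touch a single byte per version
    arr.create_version()
print(arr.footprint())        # 65536: four full 16 KiB copies
```

Even though each version changes one byte, the footprint grows by a full array per version.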


Approach 2: Flat with Change Tracking

• Use a flat array for current version, then only record updated regions upon version creation
• How to detect an updated region?
  o User: GVR library records updates on write operations (e.g. put() or acc())
  o Kernel: Page write protection + page fault handling

[Figure: Current Version, Version 1, Version 0.]

✔ Relatively fast access, small footprint
✖ At least one full array, change tracking overhead
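
The user-level variant of change tracking can be sketched as follows: the write path records which blocks it touched, and version creation copies only those blocks. The block size, class name, and method names are assumptions made for illustration, not GVR's implementation.

```python
BLOCK = 4096  # tracking granularity in bytes (assumed)


class TrackedArray:
    """Approach 2 sketch: flat current array plus per-version modified blocks."""

    def __init__(self, nbytes):
        self.current = bytearray(nbytes)
        self.dirty = set()   # block numbers written since the last version
        self.versions = []   # each entry: {block number: block contents}

    def put(self, offset, data):
        self.current[offset:offset + len(data)] = data
        # User-level tracking: the write path itself records touched blocks.
        first = offset // BLOCK
        last = (offset + len(data) - 1) // BLOCK
        self.dirty.update(range(first, last + 1))

    def create_version(self):
        # Only blocks modified since the previous version are copied.
        delta = {b: bytes(self.current[b * BLOCK:(b + 1) * BLOCK])
                 for b in sorted(self.dirty)}
        self.versions.append(delta)
        self.dirty.clear()


arr = TrackedArray(BLOCK * 4)
arr.put(BLOCK - 2, b"abcd")   # straddles blocks 0 and 1
arr.create_version()
arr.put(3 * BLOCK, b"z")
arr.create_version()
print([sorted(v) for v in arr.versions])   # [[0, 1], [3]]
```

A kernel-level variant would replace the bookkeeping in `put` with page write protection and a fault handler, which is why it tracks at page granularity.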


Approach 3: Log-structured Array

• Allocate memory block on-demand
  o Allocated regions form a log
  o Log = data + metadata (index)

[Figure: Current Version, Version 1, Version 0, and the Log.]

✔ Small footprint
✖ High access overhead

H.Fujita, et al., Log-Structured Global Array for Efficient Multi-Version Snapshots, CCGrid 2015
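
A minimal sketch of the log idea, assuming whole-block writes and a dictionary as the index: blocks are appended on first write after each version, and a version is just a frozen copy of the index. This is an illustration of the structure described above, not the layout used in the CCGrid 2015 paper; all names are hypothetical.

```python
class LogArray:
    """Approach 3 sketch: blocks live in an append-only log, found via an index."""

    def __init__(self):
        self.log = []        # data: blocks in allocation order
        self.index = {}      # metadata for the current version: block -> log slot
        self.fresh = set()   # blocks already allocated since the last version
        self.versions = []   # frozen index for each preserved version

    def put(self, block, data):
        """Write one whole block (partial-block writes are left out)."""
        if block in self.fresh:
            self.log[self.index[block]] = data   # rewrite this version's copy
        else:
            self.log.append(data)                # allocate a block on demand
            self.index[block] = len(self.log) - 1
            self.fresh.add(block)

    def get(self, block, version=None):
        """Read a block; every access goes through an index lookup."""
        index = self.index if version is None else self.versions[version]
        slot = index.get(block)
        return None if slot is None else self.log[slot]

    def create_version(self):
        """Freeze the current index; no data is copied."""
        self.versions.append(dict(self.index))
        self.fresh.clear()
        return len(self.versions) - 1


arr = LogArray()
arr.put(0, b"a0")
arr.put(2, b"c0")
v0 = arr.create_version()
arr.put(2, b"c1")                           # only block 2 gets a new log entry
print(len(arr.log))                         # 3
print(arr.get(2, version=v0), arr.get(2))   # b'c0' b'c1'
```

The index lookup on every `get` and `put` is the access overhead noted on the slide; the absence of any full array is the source of the small footprint.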


Problem Statement

Which array architecture brings the best performance and the lowest memory consumption under various workload characteristics?


Synthetic Benchmark

• Get() and Put() to random locations + version creation
• Based on APEX-Map [E. Strohmaier et al. 2004]

Parameter:
• Versioning frequency (= how many get/put ops per version)

Environment:
• UChicago RCC Midway
• Intel Xeon E5-2670 (8 cores x2)
• Infiniband FDR-10
• MVAPICH2 (gcc)

[Figure: access probability over array index across processes P0, P1, P2.]
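
The shape of such a workload can be sketched as a generator of get/put operations with a version created every fixed number of operations. This is a simplified stand-in, not the benchmark used in the talk: locations are drawn uniformly, whereas APEX-Map controls locality with parameters omitted here, and the function and parameter names are invented for the example.

```python
import random


def synthetic_trace(num_ops, num_blocks, read_ratio, ops_per_version, seed=0):
    """Yield (operation, block) tuples for a toy get/put/version workload."""
    rng = random.Random(seed)
    for i in range(1, num_ops + 1):
        op = "get" if rng.random() < read_ratio else "put"
        yield op, rng.randrange(num_blocks)
        if i % ops_per_version == 0:
            yield "version", None   # create a version after this many ops


trace = list(synthetic_trace(num_ops=1000, num_blocks=64,
                             read_ratio=0.5, ops_per_version=100))
num_versions = sum(1 for op, _ in trace if op == "version")
print(num_versions)   # 10
```

Feeding the same trace to each array architecture is one way to compare them under identical access patterns.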


Runtime Performance

Flat with change tracking is best for performance

[Figure: throughput (K ops/s). #procs=32, block size=4096 B, array size=256 MiB/proc, read ratio=50%]


Memory Usage

Log-structured array is best for memory usage

[Figure: memory usage (MiB). #procs=32, block size=4096 B, array size=256 MiB/proc, read ratio=50%, versioning frequency=1e-5]


Related Work

• Log-structured file systems
  o LFS [Rosenblum 1992], PLFS [Bent 2009]
  o Focused on improving write performance, while our focus is on capturing writes
• Log-structured distributed data stores
  o RAMCloud [Ongaro 2011, Rumble 2014], SILT [Lim 2011], Pilaf [Mitchell 2013]
  o Similar structure to log-structured array
  o GVR is array-oriented (not KV-oriented)
• Incremental checkpointing
  o [Plank 1995], TICK [Gioiosa 2005], [Agarwal 2004]
  o Not focusing on RDMA, a new challenge to transparent change tracking


Summary

• Compared three versioning architectures for efficient versioning
  o Flat
  o Flat with change tracking
  o Log-structured Array
• Findings from synthetic benchmark
  o Flat with change tracking: best performance in most cases
  o Log-structured array: best choice for memory savings
• Future Work
  o Broader evaluation including version retrieval cost and application-level performance
  o Investigation on hardware/software architecture that allows fine-grain, efficient change tracking on remote memory

http://gvr.cs.uchicago.edu


Backup



Versioning Schemes

• Flat
  o Contiguous buffer, whole copy for each version
• Flat+change tracking
  o Flat array serves as a current version, keeps copies of modified blocks
  o Change tracking mechanisms
    • User-specified (arbitrary granularity)
    • OS kernel (page-level)
    • CPU (page-level)
  o Versioning directions
    • Incremental: logs new value
    • Decremental: logs old value
• Log-structured Array
  o Appends modified blocks to the log
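
The two versioning directions can be contrasted in a few lines. The sketch below records the same writes both ways, assuming "incremental" corresponds to a redo-style log of new values and "decremental" to an undo-style log of old values; the function and variable names are made up for the example.

```python
def write_with_logs(array, index, new_value, redo_log, undo_log):
    """Apply one write and record it in both styles for comparison."""
    undo_log.append((index, array[index]))  # decremental: keep the old value
    array[index] = new_value
    redo_log.append((index, new_value))     # incremental: keep the new value


base = [0, 0, 0, 0]
array = list(base)
redo_log, undo_log = [], []
write_with_logs(array, 1, 7, redo_log, undo_log)
write_with_logs(array, 3, 5, redo_log, undo_log)

# Redo: replay new values on top of the older state to move forward.
forward = list(base)
for i, v in redo_log:
    forward[i] = v

# Undo: replay old values in reverse on the newer state to move backward.
backward = list(array)
for i, v in reversed(undo_log):
    backward[i] = v

print(forward == array, backward == base)   # True True
```

Logging old values requires reading the location before overwriting it, which hints at why the two directions can differ in write-path cost.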



Versioning Schemes

• Flat array
• Flat with change tracking
• Log-structured array


Change Tracking/Versioning Direction

• Change tracking schemes
  o User
  o Kernel
  o Hardware
• Versioning directions
  o Undo
  o Redo


Fine-grain Comparison on Memory Change Tracking (1)

• Memory access latency of the first write to each page
• Kernel change tracking has higher latency due to page fault handling


Fine-grain Comparison on Memory Change Tracking (2)

[Figure: performance with redo versions relative to no versioning, array size=128 MiB]

[Figure: performance with undo versions relative to redo ones, array size=128 MiB]


Performance Comparison (1)


• Flat with change tracking works the best when versioning frequency is high

• Log-structured array has poor performance especially when versioning frequency is low



Performance Comparison (2)

[Figure: performance over various versioning frequencies, RMA, #procs=32, block size=4096 B, array size=512 MB/proc, read ratio=50%]

• Log-structured array works better for localized (smaller k) access patterns


Memory Consumption

• Log-structured array requires the least amount of memory
• Undo versioning requires additional memory for the undo buffer
• Flat array requires a fixed amount of memory, regardless of locality
• For flat with tracking and log array, higher locality incurs lower memory consumption
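
These trends follow from a back-of-the-envelope model of how many blocks each scheme has to hold. The function below is such a model under stated assumptions: index and bookkeeping overhead is ignored, the undo buffer is not modeled, and the example numbers (ten versions, about 1% of blocks modified per version) are invented for illustration rather than taken from the measurements.

```python
def footprint_blocks(total_blocks, modified_per_version):
    """Return the number of blocks each scheme holds under a simple model.

    modified_per_version lists how many distinct blocks each preserved
    version modified.
    """
    versions = len(modified_per_version)
    deltas = sum(modified_per_version)
    return {
        # Full copy per version plus the current array.
        "flat": total_blocks * (versions + 1),
        # One full current array plus the modified blocks of each version.
        "flat_tracking": total_blocks + deltas,
        # Only blocks that were written are ever allocated.
        "log": deltas,
    }


# 256 MiB array in 4096 B blocks = 65536 blocks.
blocks = (256 * 1024 * 1024) // 4096
usage = footprint_blocks(blocks, [655] * 10)
print(usage)
```

Higher locality means fewer distinct blocks modified per version, which shrinks the `flat_tracking` and `log` terms while leaving `flat` unchanged.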


Version Retrieval Cost

• Partial retrieval
  o e.g. Localized recovery
  1. Create 256 versions with certain fill ratio
  2. Pick one version
  3. Read from 10,000 random locations in that version
• Full retrieval
  o e.g. Full rollback
  1. Create 256 versions with certain fill ratio
  2. Pick one version
  3. Read the entire region of that version

[Figure: Get operations reading from one of several versions, for partial and full retrieval.]


Full Version Retrieval Cost

• Flat/log array have constant cost of version rollback
• Redo versioning is good at restoring older versions, whereas undo is good at newer versions
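
One way to see the redo/undo asymmetry is to count how many per-version deltas must be replayed to rebuild a given version. The sketch assumes a redo chain anchored at a full copy of version 0 and an undo chain anchored at a full copy of the latest version; the actual storage layout behind the measurements may differ, and the function name is hypothetical.

```python
def deltas_to_apply(target, latest, direction):
    """Number of per-version deltas replayed to rebuild version `target`."""
    if not 0 <= target <= latest:
        raise ValueError("target must lie between 0 and latest")
    if direction == "redo":
        return target            # replay forward from version 0
    if direction == "undo":
        return latest - target   # replay backward from the latest version
    raise ValueError(direction)


latest = 255   # 256 versions, numbered 0..255
for target in (0, 128, 255):
    print(target,
          deltas_to_apply(target, latest, "redo"),
          deltas_to_apply(target, latest, "undo"))
```

Under this model the cost of a flat or log-structured array would not depend on `target` at all, since every version is directly addressable.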


Partial Version Retrieval Cost

• Flat and log-structured arrays have shorter, less variable latency
• Flat with tracking shows higher average latency and higher variation

Fill ratio = 1%


Incremental/decremental



Summary on Evaluation

• Flat with change tracking architecture achieves the best performance in most cases

• Flat and log-structured arrays achieve lower, less variable version retrieval cost, whereas flat with change tracking shows higher, more variable cost

• Flat with change tracking would be the best for moderate or low versioning frequency

• Log-structured array is the best choice if the versioning frequency is high, or memory consumption is the primary concern



Future Work

• Analysis of data redundancy inside the array, seeking a way to harden the array (e.g. error correction coding)
• Design & evaluation of a network/memory device capable of fine-grain change tracking on (remote) memory access
