Interactive Hadoop via Flash and Memory



DESCRIPTION

Enterprises are using Hadoop for interactive, real-time data processing via projects such as the Stinger Initiative. We describe two new HDFS features – Centralized Cache Management and Heterogeneous Storage – that allow applications to effectively use low-latency storage media such as Solid State Disks and RAM. In the first part of this talk, we discuss Centralized Cache Management to coordinate caching of important datasets and place tasks for memory locality. HDFS deployments today rely on the OS buffer cache to keep data in RAM for faster access. However, the user has no direct control over what data is held in RAM or how long it's going to stay there. Centralized Cache Management allows users to specify which data to lock into RAM. Next, we describe Heterogeneous Storage support for applications to choose storage media based on their performance and durability requirements. Perhaps the most interesting of the newer storage media are Solid State Drives, which provide improved random I/O performance over spinning disks. We also discuss memory as a storage tier, which can be useful for temporary files and intermediate data for latency-sensitive real-time applications. In the last part of the talk, we describe how administrators can use quota mechanism extensions to manage fair distribution of scarce storage resources across users and applications.

TRANSCRIPT

Page 1: Interactive Hadoop via Flash and Memory

© Hortonworks Inc. 2011

Interactive Hadoop via Flash and Memory

Arpit Agarwal

[email protected]

@aagarw

Chris Nauroth

[email protected]

@cnauroth

Page 1

Page 2: Interactive Hadoop via Flash and Memory


HDFS Reads

Architecting the Future of Big Data

Page 3: Interactive Hadoop via Flash and Memory


HDFS Short-Circuit Reads


Page 4: Interactive Hadoop via Flash and Memory


HDFS Short-Circuit Reads


Page 5: Interactive Hadoop via Flash and Memory


Shortcomings of Existing RAM Utilization

• Lack of Control – The kernel decides what to retain in cache and what to evict, based on observed access patterns.

• Sub-optimal RAM Utilization – Tasks from multiple jobs are interleaved on the same node, and one task's activity can trigger eviction of data that would have been valuable to keep in cache for another task.


Page 6: Interactive Hadoop via Flash and Memory


Centralized Cache Management

• Provides users with explicit control of which HDFS file paths to keep resident in memory.

• Allows clients to query location of cached block replicas, opening possibility for job scheduling improvements.

• Utilizes off-heap memory, not subject to GC overhead or JVM tuning.


Page 7: Interactive Hadoop via Flash and Memory


Using Centralized Cache Management

• Pre-Requisites
  – Native Hadoop library required; currently supported on Linux only.
  – Set the process ulimit for maximum locked memory.
  – Configure dfs.datanode.max.locked.memory in hdfs-site.xml, set to the amount of memory to dedicate to caching.
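The pre-requisites above can be sketched as the following hdfs-site.xml fragment (the 4 GB figure is an illustrative assumption; size it to the DataNode's RAM, and note it must not exceed the locked-memory ulimit, i.e. `ulimit -l`, of the DataNode process):

```xml
<!-- hdfs-site.xml on each DataNode -->
<property>
  <name>dfs.datanode.max.locked.memory</name>
  <!-- Bytes of RAM to dedicate to caching; 4 GB here is an example value. -->
  <value>4294967296</value>
</property>
```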

• New Concepts
  – Cache Pool
    – Contains and manages a group of cache directives.
    – Has Unix-style permissions.
    – Can constrain resource utilization by defining a maximum number of cached bytes or a maximum time to live.
  – Cache Directive
    – Specifies a file system path to cache.
    – Specifying a directory caches all files in that directory (not recursive).
    – Can specify the number of replicas to cache and a time to live.


Page 8: Interactive Hadoop via Flash and Memory


Using Centralized Cache Management


Page 9: Interactive Hadoop via Flash and Memory


Using Centralized Cache Management

• CLI: Adding a Cache Pool

> hdfs cacheadmin -addPool common-pool
Successfully added cache pool common-pool.

> hdfs cacheadmin -listPools
Found 1 result.
NAME         OWNER     GROUP     MODE       LIMIT      MAXTTL
common-pool  cnauroth  cnauroth  rwxr-xr-x  unlimited  never


Page 10: Interactive Hadoop via Flash and Memory


Using Centralized Cache Management

• CLI: Adding a Cache Directive

> hdfs cacheadmin -addDirective \
    -path /hello-amsterdam \
    -pool common-pool
Added cache directive 1

> hdfs cacheadmin -listDirectives
Found 1 entry
ID  POOL         REPL  EXPIRY  PATH
 1  common-pool     1  never   /hello-amsterdam


Page 11: Interactive Hadoop via Flash and Memory


Using Centralized Cache Management

• CLI: Removing a Cache Directive

> hdfs cacheadmin -removeDirective 1
Removed cached directive 1

> hdfs cacheadmin -removeDirectives \
    -path /hello-amsterdam
Removed cached directive 1
Removed every cache directive with path /hello-amsterdam


Page 12: Interactive Hadoop via Flash and Memory


Using Centralized Cache Management

• API: DistributedFileSystem Methods

public void addCachePool(CachePoolInfo info)

public RemoteIterator<CachePoolEntry> listCachePools()

public long addCacheDirective(CacheDirectiveInfo info)

public RemoteIterator<CacheDirectiveEntry> listCacheDirectives(CacheDirectiveInfo filter)

public void removeCacheDirective(long id)
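These methods can be exercised roughly as follows. This is a minimal sketch, not the talk's own example: it assumes a running HDFS cluster reachable via fs.defaultFS, the Hadoop client libraries on the classpath, and illustrative pool and path names.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.CacheDirectiveInfo;
import org.apache.hadoop.hdfs.protocol.CachePoolInfo;

public class CacheAdminSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Assumes fs.defaultFS points at an HDFS cluster (hypothetical deployment).
        DistributedFileSystem dfs = (DistributedFileSystem) FileSystem.get(conf);

        // Create a pool, then a directive that pins one replica of a path in RAM.
        dfs.addCachePool(new CachePoolInfo("common-pool")
                .setMode(new FsPermission((short) 0755)));
        long id = dfs.addCacheDirective(new CacheDirectiveInfo.Builder()
                .setPool("common-pool")
                .setPath(new Path("/hello-amsterdam"))
                .setReplication((short) 1)
                .build());

        // Data stays resident in DataNode memory until the directive is removed.
        dfs.removeCacheDirective(id);
    }
}
```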


Page 13: Interactive Hadoop via Flash and Memory


Centralized Cache Management Behind the Scenes


Page 14: Interactive Hadoop via Flash and Memory


Centralized Cache Management Behind the Scenes

• Block files are memory-mapped into the DataNode process.

> pmap `jps | grep DataNode | awk '{ print $1 }'` | grep blk
00007f92e4b1f000 124928K r--s- /data/dfs/data/current/BP-1740238118-127.0.1.1-1395252171596/current/finalized/blk_1073741827
00007f92ecd21000 131072K r--s- /data/dfs/data/current/BP-1740238118-127.0.1.1-1395252171596/current/finalized/blk_1073741826


Page 15: Interactive Hadoop via Flash and Memory


Centralized Cache Management Behind the Scenes

• Pages of each block file are 100% resident in memory.

> vmtouch /data/dfs/data/current/BP-1740238118-127.0.1.1-1395252171596/current/finalized/blk_1073741826
  Files: 1
  Directories: 0
  Resident Pages: 32768/32768  128M/128M  100%
  Elapsed: 0.001198 seconds

> vmtouch /data/dfs/data/current/BP-1740238118-127.0.1.1-1395252171596/current/finalized/blk_1073741827
  Files: 1
  Directories: 0
  Resident Pages: 31232/31232  122M/122M  100%
  Elapsed: 0.00172 seconds
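The underlying mechanism can be reproduced with the JDK alone. This self-contained sketch memory-maps a small temporary file and faults its pages in with MappedByteBuffer.load(), which is analogous to what the DataNode does for cached block files (the DataNode additionally pins the pages with mlock via native code):

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MmapDemo {
    // Memory-map a file read-only, fault its pages in, and return the first byte.
    static byte firstByteViaMmap(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            buf.load(); // touch every page so it becomes resident in RAM
            return buf.get(0);
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a DataNode block file, written to a temp path.
        Path blockFile = Files.createTempFile("blk_", ".data");
        Files.write(blockFile, new byte[]{42, 1, 2, 3});
        System.out.println(firstByteViaMmap(blockFile)); // prints 42
        Files.delete(blockFile);
    }
}
```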


Page 16: Interactive Hadoop via Flash and Memory


HDFS Zero-Copy Reads

• Applications read straight from direct byte buffers, backed by the memory-mapped block file.

• Eliminates overhead of intermediate copy of bytes to buffer in user space.

• Applications must change code to use a new read API on DFSInputStream:

public ByteBuffer read(ByteBufferPool factory, int maxLength, EnumSet<ReadOption> opts)
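The new read API is exposed through FSDataInputStream. The following is a hedged sketch of how a client might call it; it assumes a running cluster, the Hadoop client on the classpath, and a hypothetical file path:

```java
import java.nio.ByteBuffer;
import java.util.EnumSet;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.ReadOption;
import org.apache.hadoop.io.ElasticByteBufferPool;

public class ZeroCopyReadSketch {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        ElasticByteBufferPool pool = new ElasticByteBufferPool();
        // "/hello-amsterdam/data" is an illustrative path, not from the talk.
        try (FSDataInputStream in = fs.open(new Path("/hello-amsterdam/data"))) {
            // Returns a buffer backed by the memory-mapped block when possible;
            // checksums can be skipped for cached data, which was verified at load time.
            ByteBuffer buf = in.read(pool, 1024 * 1024,
                    EnumSet.of(ReadOption.SKIP_CHECKSUMS));
            if (buf != null) {
                // ... process buf directly, with no copy into a user-space byte[] ...
                in.releaseBuffer(buf); // hand the buffer back when done
            }
        }
    }
}
```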


Page 17: Interactive Hadoop via Flash and Memory


Heterogeneous Storages for HDFS


Page 18: Interactive Hadoop via Flash and Memory


Goals

• Extend HDFS to support a variety of Storage Media
• Applications can choose their target storage
• Use existing APIs wherever possible


Page 19: Interactive Hadoop via Flash and Memory


Interesting Storage Media


Storage Medium          Cost         Example Use Case
Spinning Disk (HDD)     Low          High-volume batch data
Solid State Disk (SSD)  10x of HDD   HBase tables
RAM                     100x of HDD  Hive materialized views
Your custom medium      ?            ?

Page 20: Interactive Hadoop via Flash and Memory


HDFS Storage Architecture - Before


Page 21: Interactive Hadoop via Flash and Memory


HDFS Storage Architecture - Now


Page 22: Interactive Hadoop via Flash and Memory


Storage Preferences

• Introduce a Storage Type per Storage Medium
• Storage Hint from application to HDFS
  – Specifies the application's preferred Storage Type
• Advisory
• Subject to available space/quotas
• Fallback Storage is HDD
  – May be configurable in the future


Page 23: Interactive Hadoop via Flash and Memory


Storage Preferences (continued)

• Specify preference when creating a file
  – Write replicas directly to the Storage Medium of choice
• Change preference for an existing file
  – E.g., to migrate existing file replicas from HDD to SSD
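As a concrete illustration of expressing such a preference: the sketch below uses the storage-policy API that later shipped in Hadoop 2.6 (HDFS-6584), not the exact phase-2 API described in this talk, and assumes a running cluster plus a hypothetical path:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class StoragePreferenceSketch {
    public static void main(String[] args) throws Exception {
        DistributedFileSystem dfs =
                (DistributedFileSystem) FileSystem.get(new Configuration());
        // Ask HDFS to place one replica on SSD and the rest on disk. The hint is
        // advisory: subject to available space and quotas, falling back to HDD.
        // "/hbase/tables" is an illustrative path.
        dfs.setStoragePolicy(new Path("/hbase/tables"), "ONE_SSD");
    }
}
```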


Page 24: Interactive Hadoop via Flash and Memory


Quota Management

• Extend existing Quota Mechanisms
• Administrators ensure fair distribution of limited resources


Page 25: Interactive Hadoop via Flash and Memory


File Creation with Storage Types


Page 26: Interactive Hadoop via Flash and Memory


Move existing replicas to target Storage Type


Page 27: Interactive Hadoop via Flash and Memory


Transient Files (Planned feature)

• Target storage type is Memory
  – Writes will go to RAM
  – Allow short-circuit writes, equivalent to short-circuit reads, to local in-memory block replicas
• Checkpoint files to disk by changing the storage type
• Or discard
• High-performance writes for low-volume transient data
  – E.g., Hive materialized views


Page 28: Interactive Hadoop via Flash and Memory


References

• http://hortonworks.com/blog/heterogeneous-storages-hdfs/
• HDFS-2832 – Heterogeneous Storages phase 1 – DataNode as a collection of storages
• HDFS-5682 – Heterogeneous Storages phase 2 – APIs to expose Storage Types
• HDFS-4949 – Centralized cache management in HDFS
