What's new in Hadoop 3.0

65
1 What‘s new in Hadoop 3.0 Heiko Loewe @loeweh

Upload: heiko-loewe

Post on 08-Apr-2017


Category:

Data & Analytics



TRANSCRIPT

Page 1: What's new in hadoop 3.0

1

What's new in Hadoop 3.0
Heiko Loewe
@loeweh

Page 2: What's new in hadoop 3.0

2

Tributes

• Zhe Zhang, Erasure Coding
• Akira Ajisaka, Script Rewrite
• Junping Du, Yarn Timeline Service v2
• And many others

Page 3: What's new in hadoop 3.0

3

Hadoop 3.0 Roadmap
• Hadoop 3.x Releases
• Planned for hadoop-3.0.0
  – Classpath isolation on by default HADOOP-11656
• hadoop-3.0.0-alpha1
  – HADOOP
    • Move to JDK8+
    • Shell script rewrite HADOOP-9902
    • Move default ports out of ephemeral range HDFS-9427
  – HDFS
    • Removal of hftp in favor of webhdfs HDFS-5570
    • Support for more than two standby NameNodes HDFS-6440
    • Support for Erasure Codes in HDFS HDFS-7285
    • Intra-datanode balancer HDFS-1312
  – YARN
    • YARN Timeline Service v.2 YARN-2928
  – MAPREDUCE
    • Derive heap size or mapreduce.*.memory.mb automatically MAPREDUCE-5785

Page 4: What's new in hadoop 3.0

4

Release / Features / 2.X

Release   Feature
2.0       NameNode High-Availability
2.2       Federation, Snapshots, NFS v3 Mount
2.3       Heterogeneous Storage (Phase 1), In Memory Caching
2.4       Rolling Upgrades, POSIX ACL
2.5       Extended Attributes
2.6       Hot Swap Volumes, Heterogeneous Storage (Phase 2), transparent Encryption
2.7       Files w/ variable Block Length, Inotify
2.8       Docker Container in Linux, Yarn ATS 1.5, changing resources on alloc Yarn Container
2.9       TBD

Page 5: What's new in hadoop 3.0

5

What's Apache Hadoop 3?

[Figure: Hadoop release timeline, 2009 to 2016. Source: Akira Ajisaka]

• branch-1 (branch-0.20), Hadoop1 (EOL): 0.20.1, 0.20.205, 1.0.0, 1.1.0, 1.2.1 (stable)
• Other early releases: 0.21.0, 0.22.0
• branch-0.23: 0.23.0, 0.23.11 (final)
• branch-2, Hadoop2: 2.0.0-alpha, 2.1.0-beta, 2.2.0, 2.3.0, 2.4.0, 2.5.0, 2.6.0, 2.7.0
• trunk, Hadoop3: Hadoop 3 and 2 diverged in 2011 (5 years ago!)
• Features labelled along the timeline: Security; New append; NameNode Federation, YARN; NameNode HA; HDFS Snapshots, NFSv3 support, Windows; Heterogeneous storage, HDFS in-memory caching; HDFS ACLs; HDFS Rolling Upgrades, Application History Server, RM Automatic Failover; YARN Rolling Upgrades, Transparent Encryption, Archival Storage; Drop JDK6 support, Truncate API

Page 6: What's new in hadoop 3.0

6

Break compatibility

A major version bump is the opportunity to clean up the code
• Deprecated APIs can be removed only when the major version changes
  – @Public and @Stable Java API
  – REST API
  – Metrics/JMX
  – CLI
  – Environment variables
• Wire compatibility can be broken
  – A 2.X client cannot talk to a 3.X server and vice versa
• Compatibility Guide:
  – Apache Hadoop 3.0.0-alpha1 – Apache Hadoop Compatibility

Page 7: What's new in hadoop 3.0

7

Erasure Coding


Page 8: What's new in hadoop 3.0

8

Traditional Hadoop

• HDFS inherits 3-way replication from Google File System
• Simple, scalable and robust
• 200% storage overhead
• Secondary replicas rarely accessed

Page 9: What's new in hadoop 3.0

9

Erasure Coding Saves Storage

• Simplified example: storing 2 bits
• Same data durability
  – can lose any 1 bit
• Pro: half the storage overhead
• Con: slower recovery

Replication: data 1 0, copy 1 0 → 2 extra bits
XOR coding: data 1 0, parity 1 ⊕ 0 = 1 → 1 extra bit
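The 2-bit example can be sketched in a few lines of Python. This is only an illustration of the XOR idea on single bits; the function names are made up for this sketch and are not part of any Hadoop API.

```python
def xor_parity(bits):
    """Return the single parity bit for a list of 0/1 data bits."""
    parity = 0
    for b in bits:
        parity ^= b
    return parity


def recover(cells, parity):
    """Rebuild the one missing data bit (marked None) from the survivors and the parity bit."""
    missing = [i for i, b in enumerate(cells) if b is None]
    if len(missing) != 1:
        raise ValueError("XOR coding can repair exactly one lost bit")
    repaired = list(cells)
    survivors = [b for b in cells if b is not None]
    repaired[missing[0]] = xor_parity(survivors) ^ parity
    return repaired


data = [1, 0]
parity = xor_parity(data)          # the single extra bit that is stored
print(parity)
print(recover([None, 0], parity))  # first data bit lost
print(recover([1, None], parity))  # second data bit lost
```

Recovery has to read every surviving bit plus the parity, which is the "slower recovery" cost named on the slide; a replica can simply be copied.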

Page 10: What's new in hadoop 3.0

10

Erasure Coding with m Data, n Parity
Reed Solomon Coding

Page 11: What's new in hadoop 3.0

11

Durability and Efficiency
Data Durability = How many simultaneous failures can be tolerated?
Storage Efficiency = What fraction of the storage holds useful data?

                        Data Durability   Storage Efficiency
Single Replica          0                 100%
3-way Replication       2                 33%
XOR with 6 data cells   1                 86%
RS (6,3)                3                 67%
RS (10,4)               4                 71%
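The efficiency column follows from dividing the useful units by all stored units. A minimal Python sketch, assuming each scheme is described by its number of data units and redundant units (the tuple layout is chosen for this example only):

```python
def storage_efficiency(data_units, redundant_units):
    """Fraction of raw storage that holds useful data."""
    return data_units / (data_units + redundant_units)


# (name, data units, redundant units); in these schemes the number of
# redundant units equals the number of simultaneous failures tolerated.
schemes = [
    ("Single Replica", 1, 0),
    ("3-way Replication", 1, 2),
    ("XOR with 6 data cells", 6, 1),
    ("RS (6,3)", 6, 3),
    ("RS (10,4)", 10, 4),
]

for name, k, m in schemes:
    print(f"{name:<22} durability={m} efficiency={storage_efficiency(k, m):.0%}")
```

RS (10,4) tolerates twice as many failures as 3-way replication while more than doubling the share of storage that holds useful data.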

Page 12: What's new in hadoop 3.0

12

Contiguous Layout

Page 13: What's new in hadoop 3.0

13

Striped Layout

Page 14: What's new in hadoop 3.0

14

Other Implementation

Page 15: What's new in hadoop 3.0

15

HDFS Roadmap

Page 16: What's new in hadoop 3.0

16

Erasure Coding: Current Status
• Phase 1: striping layout
  – C = 64KB (default)
  – Works for small files
  – No data locality
  – Available on trunk
• Phase 2: contiguous layout
  – C = 128MB (= HDFS block size)
  – Does not work for small files
  – Data locality
  – Now in progress (HDFS-8030)

[Figure: incoming data is cut into cells of size C and distributed across DataNode 1, DataNode 2, DataNode 3, DataNode 4, DataNode 5, ...]
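How the cell size C decides where a byte lands can be sketched as follows. The sketch assumes cells are dealt round-robin over 6 data blocks, as in the (6,3) Reed-Solomon example later in the deck; the function and constant names are hypothetical and this is not the HDFS client code.

```python
CELL_SIZE = 64 * 1024   # C = 64KB, the default named on the slide
DATA_BLOCKS = 6         # assumption: 6 data blocks per group, as in RS (6,3)


def locate(offset, cell_size=CELL_SIZE, data_blocks=DATA_BLOCKS):
    """Map a logical byte offset to (data block index, offset inside that block)."""
    cell = offset // cell_size      # index of the cell within the file
    stripe = cell // data_blocks    # row of cells the offset falls into
    block = cell % data_blocks      # round-robin over the data blocks
    return block, stripe * cell_size + offset % cell_size


print(locate(0))                     # first byte of the file
print(locate(64 * 1024))             # first byte of the second cell
print(locate(6 * 64 * 1024 + 10))    # wraps around to the first block again
```

With C = 64KB a 1 MB file already spans 16 cells and therefore touches every data block, which is why striping works for small files but gives up data locality. With C = 128MB the same file stays inside one block.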

Page 17: What's new in hadoop 3.0

17

Name Server Changes
Mapping Logical and Storage Blocks

Too Many Storage Blocks?
Hierarchical Naming Protocol:

Page 18: What's new in hadoop 3.0

18

Erasure Coding: Write files using (6,3)-Reed-Solomon
• Write data to 9 DNs in parallel

[Figure: the EC client splits the incoming data into 6 data blocks (DN1 to DN6) and 3 parity blocks (DN7 to DN9)]

Page 19: What's new in hadoop 3.0

19

Erasure Coding: Read files
• Read data from 6 DNs in parallel

[Figure: the EC client reads from DN1 to DN6; DN7 to DN9 are not read]

Page 20: What's new in hadoop 3.0

20

Erasure Coding: Read files when DN fails
• Read data from arbitrary 6 DNs in parallel

[Figure: one DataNode is marked as failed (×); the EC client reads from six of the remaining DataNodes among DN1 to DN9]

Page 21: What's new in hadoop 3.0

21

hdfs erasure command

[loewe@loewe hadoop-3.0.0-alpha1]$ bin/hdfs erasurecode -help
Usage: hdfs erasurecode [generic options]
    [-getPolicy <path>]
    [-help [cmd ...]]
    [-listPolicies]
    [-setPolicy [-p <policyName>] <path>]

-getPolicy <path> :
  Get erasure coding policy information about at specified path

-help [cmd ...] :
  Displays help for given command or all commands if none is specified.

-listPolicies :
  Get the list of erasure coding policies supported

-setPolicy [-p <policyName>] <path> :
  Set a specified erasure coding policy to a directory
  Options :
    -p <policyName>  erasure coding policy name to encode files. If not passed the default policy will be used
    <path>           Path to a directory. Under this directory files will be encoded using specified erasure coding policy

Page 22: What's new in hadoop 3.0

22

Acceleration with Intel ISA-L

• 1 legacy coder
  – From Facebook’s HDFS-RAID project
• 2 new coders
  – Pure Java: code improvement over HDFS-RAID
  – Native coder with Intel’s Intelligent Storage Acceleration Library (ISA-L)

Page 23: What's new in hadoop 3.0

23

Benchmarks

Page 24: What's new in hadoop 3.0

24

Benchmarks

Page 25: What's new in hadoop 3.0

25

Benchmarks

Page 26: What's new in hadoop 3.0

26

Benchmarks

Page 27: What's new in hadoop 3.0

27

Yarn Timeline Service v2


Page 28: What's new in hadoop 3.0

28

First, A bit of Vision…

• The evolution of Hadoop started with YARN
• YARN's evolution will continue to drive Hadoop forward
• Hadoop 3 will still use YARN, but with a lot of improvements

Hadoop 3

Page 29: What's new in hadoop 3.0

29

Several important trends in the age of Hadoop 3.0+

[Figure: YARN and Other Platform Services]
• Platform services: Storage, Resource Management, Security, Service Discovery, Management, Monitoring, Alerts, Governance
• Workloads: IOT Assembly (Kafka, Storm, HBase, Solr); MR, Tez, Spark, …
• Innovating frameworks: Flink, DL (TensorFlow), etc.
• Various environments: On Premise, Private Cloud, Public Cloud

Page 30: What's new in hadoop 3.0

30

Yarn Architecture

[Figure: Yarn with the components Resource Database, Scheduler, and Application Timeline Service]

Page 31: What's new in hadoop 3.0

31

YARN Process Flow - Walkthrough

[Figure: a cluster of NodeManagers hosting AM 1 with Containers 1.1 to 1.3 and AM2 with Containers 2.1 to 2.4; Client2 talks to the Yarn Scheduler, and a Timeline Service Client (Query) talks to the Yarn Timeline Service]

Page 32: What's new in hadoop 3.0

32

Why Timeline Service v2

• Scalability and reliability challenges
  – Single instance of Timeline Server
  – Storage (single local LevelDB instance)
• Usability
  – Flow
  – Metrics and configuration as first-class citizens
  – Metrics aggregation up the entity hierarchy

Page 33: What's new in hadoop 3.0

33

Highlights

v.1                                    v.2
Single writer/reader Timeline Server   Distributed writer/collector architecture
Single local LevelDB storage*          Scalable storage (HBase)
v.1 entity model                       New v.2 entity model
No aggregation                         Metrics aggregation
REST API                               Richer query REST API

Page 34: What's new in hadoop 3.0

34

Architecture

• Separation of writers (“collectors”) and readers
• Distributed collectors: one collector for each app
• Dedicated RM collector for RM-generated data
• Collector discovery via RM
• Pluggable storage with HBase as default storage

Page 35: What's new in hadoop 3.0

35

Distributed Collectors and Readers

Page 36: What's new in hadoop 3.0

36

New Entity Model

• Flows and flow runs as parents of YARN application entities
• First-class configuration (key-value pairs)
• First-class metrics (single-value or time series)
• Designed to handle multi-cluster environments out of the box

Page 37: What's new in hadoop 3.0

37

What is a flow

• A flow is a group of YARN applications that are launched as parts of a logical app
• Oozie, Scalding, Pig, etc.
  – name: “frequent_visitor_stat”
  – run id: 1466097809000
  – version: “b9b9068”

Page 38: What's new in hadoop 3.0

38

Metrics Aggregation

• Application level
  – Rolls up sub-application metrics
  – Performed in real time in the collectors, in memory
• Flow run level
  – Rolls up app level metrics
  – Performed in HBase region servers via coprocessors
• Offline aggregation (TBD)
  – Rolls up on user, queue, and flow offline periodically
  – Phoenix tables
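The two online roll-up levels can be pictured with a small Python sketch. It assumes metrics are plain counters that are summed on the way up; the record layout, the metric names, and the `roll_up` helper are hypothetical and only stand in for what the collectors and the HBase coprocessors do.

```python
from collections import defaultdict


def roll_up(records, key):
    """Sum metric values per group; `key` selects the level to aggregate to."""
    totals = defaultdict(lambda: defaultdict(int))
    for rec in records:
        for metric, value in rec["metrics"].items():
            totals[key(rec)][metric] += value
    return {group: dict(metrics) for group, metrics in totals.items()}


# Hypothetical sub-application samples for two apps of one flow run.
samples = [
    {"flow_run": 1466097809000, "app": "app_1", "metrics": {"cpu_ms": 100, "bytes_read": 10}},
    {"flow_run": 1466097809000, "app": "app_1", "metrics": {"cpu_ms": 50, "bytes_read": 5}},
    {"flow_run": 1466097809000, "app": "app_2", "metrics": {"cpu_ms": 30, "bytes_read": 7}},
]

# Application level: sub-application metrics rolled up per app.
app_level = roll_up(samples, key=lambda r: (r["flow_run"], r["app"]))

# Flow run level: app level metrics rolled up per flow run.
app_records = [
    {"flow_run": run, "app": app, "metrics": metrics}
    for (run, app), metrics in app_level.items()
]
flow_run_level = roll_up(app_records, key=lambda r: r["flow_run"])

print(app_level)
print(flow_run_level)
```

The flow run totals are computed from the app level totals rather than from the raw samples, mirroring the layering on the slide.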

Page 39: What's new in hadoop 3.0

39

More Cloud Friendly
• Elastic
  – Dynamic Resource Configuration
    • YARN-291
    • Allows tuning an NM's resources down/up at runtime
  – Graceful decommissioning of NodeManagers
    • YARN-914
    • Drains a node that’s being decommissioned to allow running containers to finish
• Efficient
  – Support for container resizing
    • YARN-1197
    • Allows applications to change the size of an existing container

Page 40: What's new in hadoop 3.0

40

More Cloud Friendly (Contd.)
• Isolation
  – Embrace container technology to achieve better isolation
  – Resource isolation support for disk and network
    • YARN-2619 (disk), YARN-2140 (network)
    • Containers get a fair share of disk and network resources using Cgroups
  – Docker support in LinuxContainerExecutor
    • YARN-3611
    • Support to launch Docker containers alongside process
    • Packaging and resource isolation
• Operation
  – Container upgrades (YARN-4726)
    • "Do an upgrade of my Spark / HBase apps with minimal impact to end-users"
  – AM Restart With Work Preserving
    • MAPREDUCE-6608

Page 41: What's new in hadoop 3.0

41

Task level native optimization (MAPREDUCE-2841)

• Add a native implementation of the map output collector
  – Sort, Spill and IFile serialization
• Prerequisites
  – Built with -Pnative option
  – Custom writable types and comparators are not supported
• Setting
  <property name="mapreduce.job.map.output.collector.class" value="org.apache.hadoop.mapred.nativetask.NativeMapOutputCollectorDelegator">

Page 42: What's new in hadoop 3.0

42

Benchmark

• Release Note in the issue:
  – "For shuffle-intensive jobs this may provide speed-ups of 30% or more."
• Benchmarked with 3 slaves (m3.xlarge)
  – CentOS 7.2
  – 3.0.0-SNAPSHOT (revision 5865fe2b)
• A very shuffle-intensive wordcount job
  – Input: 2.6GB (compressed)
  – Shuffle: 14GB
  – Output: 10GB

Page 43: What's new in hadoop 3.0

43

Updated Web UI

Page 44: What's new in hadoop 3.0

44

Setup Timeline Service v2

• Set up the HBase cluster (1.1.x)
  – Add the timeline service jar to HBase
  – Install the flow run coprocessor
  – Create tables via TimelineSchemaCreator utility
• Configure the YARN cluster
  – Enable Timeline Service v.2
  – Add hbase-site.xml for the timeline collector and readers
  – Start the timeline reader daemon

Page 45: What's new in hadoop 3.0

45

Shell Script rewrite


Page 46: What's new in hadoop 3.0

46

Directory Structure

Page 47: What's new in hadoop 3.0

47

bin/hadoop Command

Page 48: What's new in hadoop 3.0

48

bin/hdfs Command

Page 49: What's new in hadoop 3.0

49

Shell Script Rewrite (HADOOP-9902)

• Hadoop and Shell Script
  – Launching daemons
  – Hadoop CLI
• Difficult to understand
  – What is the correct env var to set an option?
    • java classpath?
    • java.library.path?
    • GC options?
  – How to add the option to the env var
  – We have to read almost all the shell scripts!
• New CLI is not completely backward compatible
  – Hadoop 3: bin/hdfs namenode -format
  – Hadoop 2: bin/hadoop namenode -format

Apache Hadoop 3.0.0-alpha1 – Apache Hadoop Compatibility

Page 51: What's new in hadoop 3.0

51

.hadoop-env and .hadooprc (HADOOP-11353, HADOOP-13045)

• Very similar to .bashrc
  – Read the API doc
  – Create your own ~/.hadoopXX
    • .hadoop-env: hadoop-env.sh for each user
    • .hadooprc: called after shell env vars are configured
  – And that's all :)
• ex.) Set additional classpath (.hadooprc)
  hadoop_add_classpath /path/to/my/jar

Page 52: What's new in hadoop 3.0

52

--debug option is available

Useful for troubleshooting

$ hadoop --debug version
DEBUG: hadoop_parse_args: processing version
DEBUG: hadoop_parse: asking caller to skip 1
DEBUG: HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop
(snip)
DEBUG: Applying the user's .hadooprc
DEBUG: Initial CLASSPATH=/path/to/my/jar
DEBUG: Initialize CLASSPATH
(snip)
DEBUG: Initial CLASSPATH=/usr/local/hadoop/share/hadoop/common/lib/*

CLASSPATH was overwritten!! (before HADOOP-13045)

Page 53: What's new in hadoop 3.0

53

Many new features, bug fixes, improvements

• 'hadoop distch' to change the ownership and permissions on many files via MapReduce job
• 'hadoop jnipath' to print java.library.path
• 'hadoop --daemon' instead of hadoop-daemon.sh
  – ex.) hdfs --daemon status namenode
  – The return code for status is LSB-compatible
  – hadoop-daemon(s).sh are now deprecated
• .out files are now appended (not overwritten)
  – Allows external log rotation
• and many more
  – see https://issues.apache.org/jira/browse/HADOOP-9902

Page 54: What's new in hadoop 3.0

54

Derive heap size or mapreduce.*.memory.mb automatically (MAPREDUCE-5785)

• In Hadoop 2, two similar properties must be set :(
  – mapreduce.{map,reduce}.memory.mb
    • The amount of memory to request from the scheduler for each task (ex. 2048)
  – mapreduce.{map,reduce}.java.opts
    • Java options for YARN containers (ex. -Xmx2G)
• In Hadoop 3, either is enough
  – .java.opts is derived from .memory.mb and vice versa
    • .java.opts = .memory.mb * mapreduce.job.heap.memory-mb.ratio
    • .memory.mb = .java.opts / mapreduce.job.heap.memory-mb.ratio
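The two formulas can be worked through with the example values from the slide (2048 MB and -Xmx2G). The sketch below assumes a ratio of 0.8 for mapreduce.job.heap.memory-mb.ratio; the helper names and the rounding are choices made for this example and do not claim to match the MapReduce implementation.

```python
import re

HEAP_RATIO = 0.8  # assumed value of mapreduce.job.heap.memory-mb.ratio


def heap_from_memory_mb(memory_mb, ratio=HEAP_RATIO):
    """Derive a -Xmx option (in MB) from the container size in MB."""
    return f"-Xmx{int(memory_mb * ratio)}m"


def memory_mb_from_java_opts(java_opts, ratio=HEAP_RATIO):
    """Derive the container size in MB from the -Xmx option inside java.opts."""
    match = re.search(r"-Xmx(\d+)([gGmM])", java_opts)
    if match is None:
        raise ValueError("no -Xmx option with a g/m suffix found")
    heap_mb = int(match.group(1)) * (1024 if match.group(2) in "gG" else 1)
    return round(heap_mb / ratio)


print(heap_from_memory_mb(2048))           # only memory.mb was set
print(memory_mb_from_java_opts("-Xmx2G"))  # only java.opts was set
```

Under this assumed ratio a 2048 MB container yields a heap of 1638 MB, and a 2 GB heap asks the scheduler for a 2560 MB container; the gap leaves room for non-heap JVM memory.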

Page 55: What's new in hadoop 3.0

55

Intra Data-Node Balancer (HDFS-1312)

• Due to activities like deletes, the volumes of a DataNode may become unevenly filled

[Figure: DataNode block placing policies]

Page 56: What's new in hadoop 3.0

56

Intra Data-Node Balancer (HDFS-1312)

• Offline scripts existed to rebalance a DataNode
• HDFS-1312 introduces an online process that rebalances the volumes of a DataNode
• "hdfs diskbalancer" command

Page 57: What's new in hadoop 3.0

57

Multiple Name Nodes


Page 58: What's new in hadoop 3.0

58

Support more than two NameNodes (HDFS-6440)

• Hadoop 2 supports only 2 NameNodes
  – 1 active and 1 standby
• Hadoop 3 supports 2 or more standby NameNodes
  – provides additional fault tolerance
  – prevents multiple standby NNs from checkpointing at the same time
  – the number of standbys should be small due to block reports (typically 3 or 5 NameNodes)

Page 59: What's new in hadoop 3.0

59

Old Layout

[Figure: two NameNodes, each paired with a ZK Failover Controller, a three-node Zookeeper ensemble, and fencing between the NameNodes]

• ZK Failover Controller
  – Monitors the health of the NN
  – Participates in the election of the active NN
  – Coordinates the transition process
  – Fences the other NN if it wins the election
• Zookeeper
  – The information about which NN is active is kept here

Page 60: What's new in hadoop 3.0

60

New Layout

[Figure: one Active NameNode and several Standby NameNodes (… 4 .. 5), each paired with a ZK Failover Controller, a three-node Zookeeper ensemble, and fencing between the NameNodes]

• ZK Failover Controller
  – Monitors the health of the NN
  – Participates in the election of the active NN
  – Coordinates the transition process
  – Fences the other NN if it wins the election
• Zookeeper
  – The information about which NN is active is kept here

Page 61: What's new in hadoop 3.0

61

All Features Supported

• Checkpointing with NFS
• Checkpointing with Quorum Journal Daemon
• Manual Failover
• Automatic Failover

Page 62: What's new in hadoop 3.0

62

Incompatible changes

• Many deprecated APIs will be removed
  – hftp/hsftp/s3 -> webhdfs/s3{n,a}
  – Metrics v1
  – org.apache.hadoop.Records
  – and more
• Improved CLI output
  – 'mapred job -list' shows the job name as well
  – 'hadoop fs -du' shows the raw disk usage, and its output is aligned in a more Unix-like way
  – and more
• Search 'Incompatible change' flag
  – https://s.apache.org/sMO4

Page 63: What's new in hadoop 3.0

63

Bump up the versions of the libraries

• Drop JDK7 support (HADOOP-11858)
• Dependency Hell
  – Tomcat
  – Jetty
  – Jersey
  – Guava
  – Log4J
  – Jackson
  – And many more

[Slide background: a classpath listing of the 3.0.0-SNAPSHOT jars under /usr/local/hadoop/share/hadoop/ (common, hdfs, mapreduce, yarn), ending with /usr/java/latest/lib/tools.jar. Among others it contains jackson-core-asl-1.9.13.jar, guava-11.0.2.jar, jersey-core-1.9.jar, jetty-util-6.1.26.jar and log4j-1.2.17.jar.]

Page 64: What's new in hadoop 3.0

64

Classpath isolation (HADOOP-11656)

• Relaxing the "dependency hell"
  – Separate client and server jars
  – Client jar does not pull any third party dependencies
• If the isolation is done ...
  – We can safely upgrade the libraries in server code
  – In branch-2, the upgrade is incompatible :(

Page 65: What's new in hadoop 3.0

65

Thank You
Follow me on Twitter: @loeweh