YARN Services
Talk at: ApacheCon EU 2014
![Page 1: YARN Services](https://reader034.vdocuments.mx/reader034/viewer/2022052316/559445681a28ab02738b4575/html5/thumbnails/1.jpg)
Hadoop YARN Services
Steve Loughran, Hortonworks
stevel at hortonworks.com
@steveloughran
ApacheCon EU, November 2014
![Page 2: YARN Services](https://reader034.vdocuments.mx/reader034/viewer/2022052316/559445681a28ab02738b4575/html5/thumbnails/2.jpg)
Apache Hadoop + YARN:
An OS for data
![Page 3: YARN Services](https://reader034.vdocuments.mx/reader034/viewer/2022052316/559445681a28ab02738b4575/html5/thumbnails/3.jpg)
An OS can do more than SQL
statements
![Page 4: YARN Services](https://reader034.vdocuments.mx/reader034/viewer/2022052316/559445681a28ab02738b4575/html5/thumbnails/4.jpg)
An OS can do more than run
admin-installed apps
![Page 5: YARN Services](https://reader034.vdocuments.mx/reader034/viewer/2022052316/559445681a28ab02738b4575/html5/thumbnails/5.jpg)
An OS lets you run whatever
you want!
![Page 6: YARN Services](https://reader034.vdocuments.mx/reader034/viewer/2022052316/559445681a28ab02738b4575/html5/thumbnails/6.jpg)
An OS Offers
• Persistent Storage
• Execution of code
• jobs & services
• scheduling
• Communications
• Security
![Page 7: YARN Services](https://reader034.vdocuments.mx/reader034/viewer/2022052316/559445681a28ab02738b4575/html5/thumbnails/7.jpg)
YARN Services:
Long-lived applications within a Hadoop cluster
![Page 8: YARN Services](https://reader034.vdocuments.mx/reader034/viewer/2022052316/559445681a28ab02738b4575/html5/thumbnails/8.jpg)
Background: YARN
[Diagram: cluster nodes each running HDFS and a YARN Node Manager, plus the YARN Resource Manager (“the RM”)]
• Servers run YARN Node Managers (NMs)
• NMs heartbeat to the Resource Manager (RM)
• RM schedules work over the cluster
• RM allocates containers to apps
• NMs start containers
• NMs report container health
![Page 9: YARN Services](https://reader034.vdocuments.mx/reader034/viewer/2022052316/559445681a28ab02738b4575/html5/thumbnails/9.jpg)
Client creates App Master
[Diagram: a Client, the YARN Resource Manager (“the RM”), and cluster nodes each running HDFS and a YARN Node Manager; an Application Master runs in the cluster]
![Page 10: YARN Services](https://reader034.vdocuments.mx/reader034/viewer/2022052316/559445681a28ab02738b4575/html5/thumbnails/10.jpg)
“AM” requests containers
[Diagram: the YARN Resource Manager and cluster nodes each running HDFS and a YARN Node Manager; the Application Master and three Containers run on the nodes]
![Page 11: YARN Services](https://reader034.vdocuments.mx/reader034/viewer/2022052316/559445681a28ab02738b4575/html5/thumbnails/11.jpg)
Short lived applications
• failure: clean restart
• logs: collect at end
• placement: by data
• security: Kerberos delegation tokens
• discovery: launcher app can track
![Page 12: YARN Services](https://reader034.vdocuments.mx/reader034/viewer/2022052316/559445681a28ab02738b4575/html5/thumbnails/12.jpg)
Long-lived services
• failure: stay available
• logs: ongoing collection
• placement: availability, performance
• security: ??
• discovery: ???
![Page 13: YARN Services](https://reader034.vdocuments.mx/reader034/viewer/2022052316/559445681a28ab02738b4575/html5/thumbnails/13.jpg)
YARN-896: Support for YARN services
![Page 14: YARN Services](https://reader034.vdocuments.mx/reader034/viewer/2022052316/559445681a28ab02738b4575/html5/thumbnails/14.jpg)
Log aggregation
Service registration & discovery
Windowed failure tracking
Anti-affinity placement
Gang scheduling
Applications to continue over AM restart
Container resource flexing
Container reuse
Kerberos token renewal
Container signalling
Net & Disk resources
Labelled nodes & queues
YARN-896
REST
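Windowed failure tracking means judging a service by its recent failures rather than by every failure since launch, so a long-lived application is not killed for problems that happened days ago. The following is a minimal sketch of that idea in plain Java, not YARN code: the class `WindowedFailureTracker`, its methods, and the window and threshold values are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical helper: counts failures that fall inside a sliding time window. */
class WindowedFailureTracker {
    private final long windowMillis;
    private final int threshold;
    private final Deque<Long> failureTimes = new ArrayDeque<>();

    WindowedFailureTracker(long windowMillis, int threshold) {
        this.windowMillis = windowMillis;
        this.threshold = threshold;
    }

    /** Records a failure at nowMillis; returns true if recent failures exceed the threshold. */
    boolean recordFailure(long nowMillis) {
        failureTimes.addLast(nowMillis);
        prune(nowMillis);
        return failureTimes.size() > threshold;
    }

    /** Number of failures still inside the window at nowMillis. */
    int recentFailures(long nowMillis) {
        prune(nowMillis);
        return failureTimes.size();
    }

    /** Drops failures older than the window; timestamps are assumed to be non-decreasing. */
    private void prune(long nowMillis) {
        while (!failureTimes.isEmpty()
                && nowMillis - failureTimes.peekFirst() > windowMillis) {
            failureTimes.removeFirst();
        }
    }
}
```

With a one-second window and a threshold of 2, three failures in quick succession trip the limit, while the same three failures spread over several seconds do not.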
![Page 15: YARN Services](https://reader034.vdocuments.mx/reader034/viewer/2022052316/559445681a28ab02738b4575/html5/thumbnails/15.jpg)
Log aggregation
Service registration & discovery
Windowed failure tracking
Anti-affinity placement
Gang scheduling
Applications to continue over AM restart
Container resource flexing
Container reuse
Kerberos token renewal
Container signalling
Net & Disk resources
Labelled nodes & queues
Hadoop 2.6
(Docker)
REST
![Page 16: YARN Services](https://reader034.vdocuments.mx/reader034/viewer/2022052316/559445681a28ab02738b4575/html5/thumbnails/16.jpg)
YARN-913 Service Registry
$ slider resolve --path ~/services/org-apache-slider/storm1
{
  "type" : "JSONServiceRecord",
  "external" : [ {
    "api" : "http://",
    "addressType" : "uri",
    "protocolType" : "webui",
    "addresses" : [ {
      "uri" : "http://nn.example.com:46132"
    } ]
  }, {
    "api" : "classpath:org.apache.slider.publisher.configurations",
    "addressType" : "uri",
    "protocolType" : "REST",
    "addresses" : [ {
      "uri" : "http://nn.example.com:46132/ws/v1/slider/publisher/slider"
    } ]
  } ]
}
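A client that has resolved a service record still has to pick the endpoint it wants. The sketch below shows one way to do that, assuming the record has already been parsed into plain objects. `Endpoint`, `RegistryLookup`, and `firstUri` are hypothetical names for illustration and are not part of the registry API; the sample values are copied from the record above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

/** Hypothetical in-memory model of one endpoint entry in a service record. */
class Endpoint {
    final String api;
    final String protocolType;
    final List<String> uris;

    Endpoint(String api, String protocolType, List<String> uris) {
        this.api = api;
        this.protocolType = protocolType;
        this.uris = uris;
    }
}

class RegistryLookup {
    /** Returns the first URI of the first endpoint with the given protocol type, if any. */
    static Optional<String> firstUri(List<Endpoint> endpoints, String protocolType) {
        return endpoints.stream()
                .filter(e -> e.protocolType.equals(protocolType))
                .flatMap(e -> e.uris.stream())
                .findFirst();
    }

    /** The two external endpoints from the resolved record, as sample data. */
    static List<Endpoint> exampleExternal() {
        return Arrays.asList(
                new Endpoint("http://", "webui",
                        Arrays.asList("http://nn.example.com:46132")),
                new Endpoint("classpath:org.apache.slider.publisher.configurations", "REST",
                        Arrays.asList("http://nn.example.com:46132/ws/v1/slider/publisher/slider")));
    }
}
```

Calling `RegistryLookup.firstUri(RegistryLookup.exampleExternal(), "REST")` yields the publisher URL, and an unknown protocol type yields an empty `Optional` rather than an exception.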
![Page 17: YARN Services](https://reader034.vdocuments.mx/reader034/viewer/2022052316/559445681a28ab02738b4575/html5/thumbnails/17.jpg)
Internal and external
"internal" : [ {
  "api" : "classpath:org.apache.slider.agents.secure",
  "addressType" : "uri",
  "protocolType" : "REST",
  "addresses" : [ {
    "uri" : "https://nn.example.com:47749/ws/v1/slider/agents"
  } ]
} ]
![Page 18: YARN Services](https://reader034.vdocuments.mx/reader034/viewer/2022052316/559445681a28ab02738b4575/html5/thumbnails/18.jpg)
Failures
[Diagram: the YARN Resource Manager and cluster nodes each running HDFS and a YARN Node Manager; the Application Master and three Containers are running]
![Page 19: YARN Services](https://reader034.vdocuments.mx/reader034/viewer/2022052316/559445681a28ab02738b4575/html5/thumbnails/19.jpg)
Failures
[Diagram: the YARN Resource Manager and two YARN Node Manager nodes; two Containers remain and no Application Master is shown]
![Page 20: YARN Services](https://reader034.vdocuments.mx/reader034/viewer/2022052316/559445681a28ab02738b4575/html5/thumbnails/20.jpg)
Failures
[Diagram: the YARN Resource Manager and two YARN Node Manager nodes; an Application Master is shown again, alongside two Containers]
• container 1
• container 2
• lost: container 3
![Page 21: YARN Services](https://reader034.vdocuments.mx/reader034/viewer/2022052316/559445681a28ab02738b4575/html5/thumbnails/21.jpg)
Easy: enabling
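// Client: ask YARN to keep containers running when the AM restarts,
// and allow up to 8 AM attempts
amLauncher.setKeepContainersOverRestarts(true);
amLauncher.setMaxAppAttempts(8);
// Server (restarted AM): containers that survived the previous attempt
List<Container> liveContainers =
    amRegistrationData.getContainersFromPreviousAttempts();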
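Once a restarted AM has the list of surviving containers, it can compare that list with what the application expects to be running and re-request whatever is missing. This is a minimal sketch of that comparison in plain Java rather than YARN API code. `RestartReconciler`, `lostContainers`, and the string container names are hypothetical, chosen to mirror the failure slide where container 3 is lost.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper: works out which containers did not survive an AM restart. */
class RestartReconciler {
    /** Convenience for building a sorted set of container names. */
    static Set<String> setOf(String... names) {
        return new TreeSet<>(Arrays.asList(names));
    }

    /** Expected containers that are not among the survivors; these need re-requesting. */
    static Set<String> lostContainers(Set<String> expected, Set<String> survivors) {
        Set<String> lost = new TreeSet<>(expected);
        lost.removeAll(survivors);
        return lost;
    }
}
```

For example, with three expected containers and survivors `container 1` and `container 2`, `lostContainers` returns only `container 3`.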
![Page 22: YARN Services](https://reader034.vdocuments.mx/reader034/viewer/2022052316/559445681a28ab02738b4575/html5/thumbnails/22.jpg)
Harder: rebuilding state
[Diagram: AM state components, each marked as Persisted, Rebuilt, or Transient]
• Node Map
• Placement History
• Specification
• Container Queues
• Component Map
• Event History
![Page 23: YARN Services](https://reader034.vdocuments.mx/reader034/viewer/2022052316/559445681a28ab02738b4575/html5/thumbnails/23.jpg)
Log Aggregation
<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>
![Page 24: YARN Services](https://reader034.vdocuments.mx/reader034/viewer/2022052316/559445681a28ab02738b4575/html5/thumbnails/24.jpg)
Labels
$ yarn rmadmin...
-addToClusterNodeLabels [label1,label2,label3]
-removeFromClusterNodeLabels [label1,label2,label3]
-replaceLabelsOnNode [node1:port,label1,label2]
-directlyAccessNodeLabelStore
![Page 25: YARN Services](https://reader034.vdocuments.mx/reader034/viewer/2022052316/559445681a28ab02738b4575/html5/thumbnails/25.jpg)
Labels offer
• Separation of workloads
• Separation of service roles
• Separation of production & dev code
• Allocation to specific hardware classes
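The separation that labels offer comes down to restricting a request to the nodes that carry a given label. The sketch below illustrates that filtering in plain Java. It is not the scheduler's implementation, and `LabelPlacement`, `nodesWithLabel`, and any node and label names passed to it are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper: selects candidate nodes for a labelled request. */
class LabelPlacement {
    /** Returns the names of nodes that carry the requested label, in sorted order. */
    static List<String> nodesWithLabel(Map<String, Set<String>> nodeLabels, String label) {
        List<String> matches = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : nodeLabels.entrySet()) {
            if (entry.getValue().contains(label)) {
                matches.add(entry.getKey());
            }
        }
        Collections.sort(matches);
        return matches;
    }
}
```

Given nodes labelled for production and for dev, a request for `production` would only ever see the production nodes, which is what keeps the two workloads apart.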
![Page 26: YARN Services](https://reader034.vdocuments.mx/reader034/viewer/2022052316/559445681a28ab02738b4575/html5/thumbnails/26.jpg)
Security
• Token expiry a core Kerberos feature
• Token expiry inimical to service longevity
• Specifically: token delegation
![Page 27: YARN Services](https://reader034.vdocuments.mx/reader034/viewer/2022052316/559445681a28ab02738b4575/html5/thumbnails/27.jpg)
Security
YARN:
AM/RM token renewal
NM HDFS access for AM container relaunch
You: embrace keytabs, test lots
![Page 28: YARN Services](https://reader034.vdocuments.mx/reader034/viewer/2022052316/559445681a28ab02738b4575/html5/thumbnails/28.jpg)
…so you can now
• Write long lived apps
• with failure resilience
• centralised log viewing
• labelled/isolated placement
• in secure clusters
![Page 29: YARN Services](https://reader034.vdocuments.mx/reader034/viewer/2022052316/559445681a28ab02738b4575/html5/thumbnails/29.jpg)
Why not just use Mesos?
![Page 30: YARN Services](https://reader034.vdocuments.mx/reader034/viewer/2022052316/559445681a28ab02738b4575/html5/thumbnails/30.jpg)
Hadoop is everywhere!
![Page 31: YARN Services](https://reader034.vdocuments.mx/reader034/viewer/2022052316/559445681a28ab02738b4575/html5/thumbnails/31.jpg)
Log aggregation
Service registration & discovery
Windowed failure tracking
Anti-affinity placement
Gang scheduling
Applications to continue over AM restart
Container resource flexing
Container reuse
Kerberos token renewal
Container signalling
Net & Disk resources
Labelled nodes & queues
Hadoop 2.7+
REST
Docker
![Page 32: YARN Services](https://reader034.vdocuments.mx/reader034/viewer/2022052316/559445681a28ab02738b4575/html5/thumbnails/32.jpg)
Questions?
http://hadoop.apache.org