Post on 14-Dec-2015


HBase

OUTLINE

• Basic
• Data Model
• Implementation
– Architecture of HBase
– HBase Server
– HRegionServer

Basic

• HBase directly uses or subclasses the parent Hadoop implementation

[Figure: four Linux hosts]

Basic

• Problems with a database:
– Growth of data
– Complexity of installation and maintenance
Solution: Relational Database Management System (RDBMS)

• Problems with multiple RDBMS instances (across nodes):
– JOIN
– Not efficient
– Rebalancing
Solution: NoSQL database

Basic

• NoSQL database:
– Distributed
– Scalable
– Easy to use (e.g. put, get, alter)

Basic

• List of NoSQL databases:
– Open source
  • HBase (Apache)
  • Cassandra (Facebook)
– Commercial
  • SimpleDB (Amazon)
  • BigTable (Google)

Basic

• HBase:
– Hadoop's database
– Version 0.20.6 released
– Used with Map/Reduce

Data Model

Table

• Members: Row, Column, TimeStamp

Row key              TimeStamp  Column "Contents"
"com.yahoo.news.tw"  t3         "We develop a 6,000-meter underwater robot"
                     t2         "How mosquitoes search for human flesh"
                     t1         "… Wang 40…"
"com.cnn.www"        t1         "'Speaking' with brain waves"

Table

• Add a column

Row key              TimeStamp  "Contents"                                   "Anchor"
"com.yahoo.news.tw"  t3         "We develop a 6,000-meter underwater robot"
                     t2         "How mosquitoes search for human flesh"
                     t1         "… Wang 40…"
"com.cnn.www"        t1         "'Speaking' with brain waves"

Table

• Add <Family, Label>

Row key              TimeStamp  "Contents"                                   "Anchor"
"com.yahoo.news.tw"  t5                                                      "Anchor:tech" = "Silvia"
                     t4                                                      "Anchor:sports" = "Eric"
                     t3         "We develop a 6,000-meter underwater robot"
                     t2         "How mosquitoes search for human flesh"
                     t1         "… Wang 40…"
"com.cnn.www"        t1         "'Speaking' with brain waves"

The "Anchor" family expanded by label:

"Anchor_tech"  "Anchor_sports"
"Silvia"       "Eric"
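The table slides above can be read as a nested map: row key, then column (written "family:label"), then timestamp, then value. The following is a minimal sketch of that reading, not HBase's actual storage format or client API. The class name `ToyTable`, the integer timestamps standing in for t1..t5, and the placeholder page text are assumptions made for illustration.

```python
from collections import defaultdict


class ToyTable:
    """Toy model of the slide's table: row key -> column -> timestamp -> value."""

    def __init__(self):
        # Nested maps; a column name is written "family:label" as on the slide.
        self.rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row_key, column, timestamp, value):
        """Store one cell version; older versions of the cell are kept."""
        self.rows[row_key][column][timestamp] = value

    def get(self, row_key, column, timestamp=None):
        """Return the value at `timestamp`, or the newest version if omitted."""
        versions = self.rows.get(row_key, {}).get(column, {})
        if not versions:
            return None
        if timestamp is None:
            timestamp = max(versions)
        return versions.get(timestamp)


table = ToyTable()
# Integer timestamps 1..5 stand in for t1..t5 (an assumption).
table.put("com.yahoo.news.tw", "Contents:", 1, "... Wang 40...")
table.put("com.yahoo.news.tw", "Contents:", 3, "newer page text")  # placeholder value
table.put("com.yahoo.news.tw", "Anchor:sports", 4, "Eric")
table.put("com.yahoo.news.tw", "Anchor:tech", 5, "Silvia")

print(table.get("com.yahoo.news.tw", "Contents:"))               # newest version
print(table.get("com.yahoo.news.tw", "Contents:", timestamp=1))  # version at t1
print(table.get("com.yahoo.news.tw", "Anchor:tech"))
```

Adding a new label such as "Anchor:tech" needs no schema change in this sketch, which matches the slide's point that labels are added freely under an existing family.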

Region

Row key              TimeStamp  "Contents"                                   "Anchor"
"com.yahoo.news.tw"  t5                                                      "Anchor:tech" = "Silvia"
                     t4                                                      "Anchor:sports" = "Eric"
                     t3         "We develop a 6,000-meter underwater robot"
                     t2         "How mosquitoes search for human flesh"
                     t1         "… Wang 40…"
"com.cnn.www"        t1         "'Speaking' with brain waves"
"com.abc.www"
"com.def.www"

[Figure: the rows of the table are grouped into region1 and region2]

Example: Region1 (com.yahoo.news.tw, com.def.www>, ID

Notation: Region (start row key, end row key> & identifier
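The region notation above (start row key, end row key, identifier) can be sketched as a small range check. This is an illustrative model only. It assumes the usual HBase convention that the start key is inclusive and the end key is exclusive, with an empty key meaning "unbounded". The boundaries chosen for `region1` and `region2` are hypothetical and are not taken from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    start_key: bytes  # first row key covered (inclusive, by assumption)
    end_key: bytes    # first row key NOT covered; b"" means no upper bound
    region_id: int    # the identifier from the slide's notation

    def contains(self, row_key: bytes) -> bool:
        """True if row_key falls in the half-open range [start_key, end_key)."""
        after_start = row_key >= self.start_key
        before_end = self.end_key == b"" or row_key < self.end_key
        return after_start and before_end


# Hypothetical boundaries: everything below "com.def.www" goes to region1.
region1 = Region(b"", b"com.def.www", 1)
region2 = Region(b"com.def.www", b"", 2)

for key in [b"com.abc.www", b"com.cnn.www", b"com.def.www", b"com.yahoo.news.tw"]:
    owner = region1 if region1.contains(key) else region2
    print(key.decode(), "-> region", owner.region_id)
```

Because the ranges are half-open, every row key belongs to exactly one region, including a key equal to a boundary.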

Sort

• Rows are sorted by row key, in byte order

• Labels are added under a column family
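Byte-ordered sorting can be illustrated with plain Python `bytes`, which compare byte by byte. The sketch below sorts the row keys from the slides. The second list (`row1`, `row2`, `row10`) is a made-up example showing that byte order is not numeric order.

```python
row_keys = ["com.yahoo.news.tw", "com.cnn.www", "com.abc.www", "com.def.www"]

# Encode to bytes so the comparison is byte-wise, as for stored row keys.
encoded = [key.encode("utf-8") for key in row_keys]
sorted_keys = sorted(encoded)
for key in sorted_keys:
    print(key.decode("utf-8"))

# Byte order is not numeric order: "row10" sorts before "row2"
# because the byte for "1" is smaller than the byte for "2".
numbered = sorted(k.encode("utf-8") for k in ["row2", "row10", "row1"])
print(numbered)
```

A common consequence is that numeric row keys are zero-padded (for example `row02`, `row10`) when numeric ordering is wanted.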

Locking

[Figure: the same table as on the Region slide, with User1, User2, User3, and User4 each issuing an update]

Implementation

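Architecture of HBase

NN: NameNode, DN: DataNode, HM: HMaster, HR: HRegionServer

[Figure: a cluster in which a client talks to HDFS (one NameNode, five DataNodes) and to HBase (one HMaster, five HRegionServers), together with ZooKeeper]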

Rebalance

• As the regions on a single host grow:
– a region is split at a row boundary into two new regions of approximately equal size
– splitting continues until no region exceeds the size threshold

• Splitting is automatic
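The splitting rule above can be sketched as follows: a region is a sorted list of rows, and any region larger than a threshold is cut on a row boundary into two daughters of roughly equal size. This is a toy model under stated assumptions, not HBase's split code. The function names, the per-row sizes, and the threshold value are all hypothetical.

```python
def region_size(rows):
    """Total size in bytes of a region given as [(row_key, size_in_bytes), ...]."""
    return sum(size for _, size in rows)


def split_region(rows):
    """Split sorted rows (at least two) into two daughters of roughly equal size."""
    half = region_size(rows) / 2
    running = 0
    cut = 1
    for index, (_, size) in enumerate(rows):
        running += size
        if running >= half:
            cut = index + 1
            break
    cut = min(max(cut, 1), len(rows) - 1)  # keep both daughters non-empty
    return rows[:cut], rows[cut:]


def rebalance(regions, threshold):
    """Keep splitting until no multi-row region is above the threshold."""
    pending, done = list(regions), []
    while pending:
        region = pending.pop()
        if region_size(region) > threshold and len(region) > 1:
            pending.extend(split_region(region))
        else:
            done.append(region)
    return sorted(done)  # restore row-key order


# Eight hypothetical rows of 100 bytes each in a single region.
rows = [(f"row{i:02d}", 100) for i in range(8)]
result = rebalance([rows], threshold=250)
for region in result:
    print(region[0][0], "..", region[-1][0], region_size(region), "bytes")
```

With these numbers the 800-byte region is split twice, giving four regions of 200 bytes each.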

HBase Master

• The master node is lightly loaded
• Assigns the replacement daughter regions after a split
• Recovers from RegionServer failures

RegionServer

• Carries zero or more regions
• Serves client read/write/scan requests
– Random access
• Splits regions automatically
• Sends heartbeats to the Master

RegionInfo

• Region metadata
– the current list, state, recent history, and location of all regions afloat on the cluster

{NAME => 'docs', FAMILIES => [{NAME => 'cache', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'false'}

.META.

HBase in operation

• Assume a region size of 256 MB and a row size of 1 KB

• Lookup chain: -ROOT- → .META. → user region
– -ROOT- addresses 2.6 x 10^5 .META. regions
– .META. addresses 6.9 x 10^10 user regions
– in total, 1.8 x 10^19 (2^64) bytes of user data
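The figures above follow from simple arithmetic, which the sketch below reproduces. It assumes that -ROOT- is a single 256 MB region, that every catalog row takes 1 KB, and that each catalog row describes exactly one region. The variable names are chosen for illustration.

```python
REGION_SIZE = 256 * 1024 ** 2  # 256 MB per region
ROW_SIZE = 1024                # 1 KB per catalog row

# One catalog row describes one region, so a catalog region
# can address this many regions:
entries_per_region = REGION_SIZE // ROW_SIZE

meta_regions = entries_per_region                 # addressed by the single -ROOT- region
user_regions = meta_regions * entries_per_region  # addressed by all .META. regions
user_bytes = user_regions * REGION_SIZE           # total addressable user data

print(f"{meta_regions:.1e} META regions")
print(f"{user_regions:.1e} user regions")
print(f"{user_bytes:.1e} bytes of user data")
print(user_bytes == 2 ** 64)
```

In powers of two this is 2^18 META regions, 2^36 user regions, and 2^36 x 2^28 = 2^64 bytes.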

HBase in operation

NN: NameNode, DN: DataNode, HM: HMaster, HR: RegionServer (shown as R)

[Figure: a cluster of one NameNode, five DataNodes, one HMaster, five RegionServers, and ZooKeeper; the HBase client consults ROOT and META (Steps 1 and 2) and then sends its request to the user region (Step 3)]

• Read requests
– Step 1: find the location of -ROOT-
– Step 2: find the location of the .META. region
– Step 3: access the user region space
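The three read steps above can be sketched as a chain of sorted lookups. This is a toy model, not the HBase client. It assumes that the location of -ROOT- comes from ZooKeeper (which appears in the figure), and that each catalog maps a region's start key to the next hop. The names `Catalog`, `find_region_server`, and the `regionserver-N` strings are hypothetical.

```python
import bisect


class Catalog:
    """Sorted map from region start key to a location (toy -ROOT- or .META.)."""

    def __init__(self, entries):
        self.start_keys = sorted(entries)
        self.locations = [entries[key] for key in self.start_keys]

    def locate(self, row_key):
        # Pick the last region whose start key is <= row_key.
        index = bisect.bisect_right(self.start_keys, row_key) - 1
        return self.locations[index]


# Two hypothetical .META. regions, each mapping user-region start keys to servers.
meta_low = Catalog({"": "regionserver-1", "com.def.www": "regionserver-2"})
meta_high = Catalog({"n": "regionserver-3"})

# -ROOT- maps start keys to the .META. region that covers them.
root = Catalog({"": meta_low, "n": meta_high})

# Assumption: ZooKeeper holds the location of -ROOT-.
zookeeper = {"root-location": root}


def find_region_server(row_key):
    root_region = zookeeper["root-location"]   # Step 1: location of -ROOT-
    meta_region = root_region.locate(row_key)  # Step 2: location of the .META. region
    return meta_region.locate(row_key)         # Step 3: the user region's server


for key in ["com.cnn.www", "com.yahoo.news.tw", "org.example.www"]:
    print(key, "->", find_region_server(key))
```

Each step is a binary search over sorted start keys, which works because regions are sorted by row key.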

HBase in operation

[Figure: the same cluster; the HBase client interacts directly with a RegionServer]

• Read requests
– Clients cache the location information of ROOT, META, and user regions
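Client-side caching of region locations can be sketched as below: the first read for a key range pays for a catalog lookup, and later reads in the same range go straight to the cached server. This is an illustrative model only. `CachingClient`, `catalog_lookup`, and the region boundaries are hypothetical, and cache invalidation after a region moves is left out.

```python
# Hypothetical regions: (start key inclusive, end key exclusive, server).
# An empty end key means "no upper bound".
REGIONS = [
    ("", "com.def.www", "regionserver-1"),
    ("com.def.www", "", "regionserver-2"),
]


def in_range(row_key, start, end):
    return start <= row_key and (end == "" or row_key < end)


def catalog_lookup(row_key):
    """Stand-in for the ROOT/META lookup; returns the matching region entry."""
    for entry in REGIONS:
        if in_range(row_key, entry[0], entry[1]):
            return entry
    raise KeyError(row_key)


class CachingClient:
    def __init__(self, lookup):
        self.lookup = lookup  # function: row_key -> (start, end, server)
        self.cache = []       # region entries seen so far
        self.lookups = 0      # number of catalog lookups performed

    def locate(self, row_key):
        for start, end, server in self.cache:
            if in_range(row_key, start, end):
                return server  # cache hit: no catalog round trip
        self.lookups += 1
        entry = self.lookup(row_key)
        self.cache.append(entry)
        return entry[2]


client = CachingClient(catalog_lookup)
print(client.locate("com.abc.www"), client.lookups)
print(client.locate("com.cnn.www"), client.lookups)        # same region, served from cache
print(client.locate("com.yahoo.news.tw"), client.lookups)  # new region, one more lookup
```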

HBase in operation

• State held by a table's RegionServer

[Figure: the HBase client interacts with a RegionServer, which holds an HLog and several Regions; each Region contains HStores, and each HStore has a MemStore and HFiles]

HBase in operation

• A client requests to save data in a table

[Figure: the HBase client sends the write to a RegionServer; the components shown are the HLog, Regions, HStores, the MemStore, and HFiles]
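The components in the figure (HLog, MemStore, HFile) suggest the following write path, sketched here as a toy. The ordering used (append to the log first, then update the in-memory store, then flush to an immutable file once a threshold is reached) is an assumption based on how HBase is generally described, not something the slide spells out. The class `ToyStore` and its `flush_threshold` parameter are hypothetical, and the sketch collapses the RegionServer, Region, and HStore levels into one object.

```python
class ToyStore:
    def __init__(self, flush_threshold=2):
        self.hlog = []      # write-ahead log: every edit is appended here first
        self.memstore = {}  # in-memory buffer of recent edits
        self.hfiles = []    # immutable sorted files, newest last
        self.flush_threshold = flush_threshold

    def put(self, row_key, value):
        self.hlog.append((row_key, value))  # 1. record the edit in the log
        self.memstore[row_key] = value      # 2. apply it in memory
        if len(self.memstore) >= self.flush_threshold:
            self.flush()                    # 3. spill to a file when full

    def flush(self):
        """Write the MemStore out as one sorted, immutable file and clear it."""
        self.hfiles.append(sorted(self.memstore.items()))
        self.memstore = {}

    def get(self, row_key):
        """Check the MemStore first, then the files from newest to oldest."""
        if row_key in self.memstore:
            return self.memstore[row_key]
        for hfile in reversed(self.hfiles):
            for key, value in hfile:
                if key == row_key:
                    return value
        return None


store = ToyStore(flush_threshold=2)
store.put("com.abc.www", "v1")
store.put("com.cnn.www", "v1")  # second edit triggers a flush
store.put("com.abc.www", "v2")  # newer value stays in the MemStore

print(store.get("com.abc.www"))
print(store.get("com.cnn.www"))
print(len(store.hlog), len(store.hfiles), len(store.memstore))
```

Reads consult the MemStore before the files so that the newest value wins, which is why the overwritten key returns "v2" even though "v1" is still present in the flushed file.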

Characteristics of HBase

• Fault tolerance
• Batch processing
• Automatic partitioning
• Scales linearly with new nodes
