HBase
OUTLINE
• Basic
• Data Model
• Implementation
  – Architecture of HBase
    • HBase Server
    • HRegionServer
Basic
• HBase directly uses or subclasses the parent Hadoop implementation
Basic
[Figure: four Linux nodes]
Basic
• Problems with a single database:
  – Growth of data
  – Complexity of installation and maintenance
  Solution: Relational Database Management System (RDBMS)
• Problems with multiple RDBMS nodes:
  – JOIN
  – Not effective
  – Rebalance
  Solution: NoSQL database
Basic
• NoSQL database:
  – Distributed
  – Scalable
  – Easy to use (e.g. put, get, alter)
Basic
• List of NoSQL databases:
  – Open source
    • HBase (Yahoo!)
    • Cassandra (Facebook)
  – Commercial
    • SimpleDB (Amazon)
    • BigTable (Google)
Basic
• HBase:
  – Hadoop's database
  – Version 0.20.6 released
  – Used with MapReduce
OUTLINE
• Basic
• Data Model
• Implementation
  – Architecture of HBase
    • HBase Server
    • HRegionServer
Table
• Members: Row, Column, TimeStamp

Row key              TimeStamp  Column "Contents"
"com.yahoo.news.tw"  t3         "Our country develops a 6,000-meter underwater robot"
                     t2         "How mosquitoes search for human flesh"
                     t1         "… Wang 40…"
"com.cnn.www"        t1         "Using brain waves to 'speak'"
Table
• Add column "Anchor"

Row key              TimeStamp  "Contents"                                             "Anchor"
"com.yahoo.news.tw"  t3         "Our country develops a 6,000-meter underwater robot"
                     t2         "How mosquitoes search for human flesh"
                     t1         "… Wang 40…"
"com.cnn.www"        t1         "Using brain waves to 'speak'"

• Add <Family, Label>
Table

Row key              TimeStamp  "Contents"                                             "Anchor"
"com.yahoo.news.tw"  t5                                                                "Anchor:tech" = "Silvia"
                     t4                                                                "Anchor:sports" = "Eric"
                     t3         "Our country develops a 6,000-meter underwater robot"
                     t2         "How mosquitoes search for human flesh"
                     t1         "… Wang 40…"
"com.cnn.www"        t1         "Using brain waves to 'speak'"

"Anchor_tech"  "Anchor_sports"
Silvia         Eric
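The table above can be read as a nested map: row key, then column ("family:label"), then timestamp, then value. The sketch below is only an in-memory illustration of that shape; `ToyTable`, `put` and `get` are hypothetical names and not the HBase client API, and the integer timestamps 1 to 5 stand in for t1 to t5.

```python
from collections import defaultdict


class ToyTable:
    """Toy model: row key -> column ("family:label") -> {timestamp: value}."""

    def __init__(self):
        self.rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row, column, timestamp, value):
        # Each write adds a new version; older versions are kept.
        self.rows[row][column][timestamp] = value

    def get(self, row, column, timestamp=None):
        """Return the newest value at or before `timestamp` (newest overall if None)."""
        versions = self.rows.get(row, {}).get(column, {})
        candidates = [t for t in versions if timestamp is None or t <= timestamp]
        return versions[max(candidates)] if candidates else None


table = ToyTable()
table.put("com.yahoo.news.tw", "Contents:", 1, "... Wang 40 ...")
table.put("com.yahoo.news.tw", "Contents:", 2, "mosquito story")
table.put("com.yahoo.news.tw", "Contents:", 3, "underwater robot story")
table.put("com.yahoo.news.tw", "Anchor:sports", 4, "Eric")
table.put("com.yahoo.news.tw", "Anchor:tech", 5, "Silvia")

print(table.get("com.yahoo.news.tw", "Contents:"))      # newest version
print(table.get("com.yahoo.news.tw", "Contents:", 2))   # version as of t2
print(table.get("com.yahoo.news.tw", "Anchor:tech"))
```

Adding a new label under an existing family ("Anchor:tech", "Anchor:sports") needs no schema change in this model, which is the point of the <Family, Label> slide.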
Region

Row key              TimeStamp  "Contents"                                             "Anchor"
"com.yahoo.news.tw"  t5                                                                "Anchor:tech" = "Silvia"
                     t4                                                                "Anchor:sports" = "Eric"
                     t3         "Our country develops a 6,000-meter underwater robot"
                     t2         "How mosquitoes search for human flesh"
                     t1         "… Wang 40…"
"com.cnn.www"        t1         "Using brain waves to 'speak'"
"com.abc.www"
"com.def.www"

• The rows of the table are divided into region1 and region2
• Example: Region1 (com.yahoo.news.tw, com.def.www>, ID
• Notation: Region (start row key, end row key> & identifier
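A region is identified by a range of row keys, so finding the region for a row is a search over the sorted start keys. The sketch below assumes two hypothetical regions split at "com.def.www", with the start key inclusive and the end key exclusive; the boundaries and the name `find_region` are illustrative only.

```python
import bisect

# Hypothetical boundaries: each region covers [its start key, the next start key).
# An empty start key means "from the beginning of the table".
REGION_STARTS = ["", "com.def.www"]
REGION_NAMES = ["region1", "region2"]


def find_region(row_key):
    """Return the region whose start key is the greatest one <= row_key."""
    idx = bisect.bisect_right(REGION_STARTS, row_key) - 1
    return REGION_NAMES[idx]


for key in ["com.abc.www", "com.cnn.www", "com.def.www", "com.yahoo.news.tw"]:
    print(key, "->", find_region(key))
```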
Sort
• Rows are sorted by row key, in byte order
• Add a label on a column family
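Byte order is not always the order a reader expects. The short example below sorts the row keys from the table as bytes, then shows that numeric suffixes sort by their digits rather than by value (the "row1", "row2", "row10" keys are made up for the illustration).

```python
keys = ["com.yahoo.news.tw", "com.cnn.www", "com.def.www", "com.abc.www"]

# Compare keys as raw bytes, the way a byte-ordered store would.
ordered = sorted(k.encode("utf-8") for k in keys)
print(ordered)

# Numeric suffixes compare digit by digit, so "row10" lands before "row2".
numeric = sorted([b"row10", b"row2", b"row1"])
print(numeric)

# Zero-padding the number restores the intended order.
padded = sorted(b"row%04d" % n for n in (10, 2, 1))
print(padded)
```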
Locking
[Figure: the table from the Region slide, with User1, User2, User3 and User4 each issuing an update.]
OUTLINE
• Basic
• Data Model
• Implementation
  – Architecture of HBase
    • HBase Server
    • HRegionServer
Architecture of HBase
NN: NameNode, DN: DataNode, HM: HMaster, HR: HRegionServer
[Figure: cluster diagram with a Client, HDFS (one NN and five DNs), HBase (one HM and five HRs), and ZooKeeper.]
Rebalance
• As the regions on a single host grow, a region is split at a row boundary into two new regions of approximately equal size.
• Splitting continues until no region exceeds the size threshold.
• Splitting is automatic.
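The splitting rule can be sketched as follows. This is a simplified model under stated assumptions, not HBase's actual split code: a region is a sorted list of hypothetical (row key, size in bytes) pairs, the split point is the row boundary closest to half the total size, and `split_until_small` is a made-up name.

```python
def split_until_small(rows, threshold):
    """Split a sorted region into pieces whose total size is <= threshold.

    rows: list of (row_key, size_in_bytes), sorted by row key.
    Returns a list of regions, each a list of rows.
    """
    total = sum(size for _, size in rows)
    if total <= threshold or len(rows) < 2:
        return [rows]  # small enough, or a single row that cannot be split

    # Find the row boundary where the running size first reaches half the total.
    running = 0
    split_at = len(rows) - 1
    for i, (_, size) in enumerate(rows):
        running += size
        if running >= total / 2:
            # Keep both halves non-empty.
            split_at = min(max(i + 1, 1), len(rows) - 1)
            break

    left, right = rows[:split_at], rows[split_at:]
    return split_until_small(left, threshold) + split_until_small(right, threshold)


region = [("a", 40), ("b", 40), ("c", 40), ("d", 40)]
daughters = split_until_small(region, threshold=100)
print(daughters)
```

Splits always fall between rows, never inside one, which is why a region holding a single oversized row is returned as is.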
HBase Master
• The master node is lightly loaded.
• Assigns the replacement daughter regions.
• Recovers from region server failures.
RegionServer
• Carries zero or more regions
• Serves client read/write/scan requests
  – Random access
• Splits regions automatically
• Sends heartbeats to the Master
RegionInfo
• Region metadata
  – the current list, state, recent history, and location of all regions afloat on the cluster
{NAME => 'docs', FAMILIES => [{NAME => 'cache', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'false'}]}
.META.
HBase in operation
• Assume a region size of 256 MB and rows of 1 KB each
• -ROOT- → .META. → user regions
  – 2.6 x 10^5 .META. regions
  – 6.9 x 10^10 user regions
  – 1.8 x 10^19 (2^64) bytes of user data
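The figures on this slide follow from the two stated sizes. The arithmetic below assumes that 256 MB is the size of every region (catalog and user alike), that each catalog row takes 1 KB and describes one region, and that -ROOT- is a single region.

```python
REGION_SIZE = 256 * 2**20   # 256 MB per region (from the slide)
ROW_SIZE = 2**10            # 1 KB per catalog row (from the slide)

# One catalog region can hold this many rows, i.e. point at this many regions.
rows_per_region = REGION_SIZE // ROW_SIZE

meta_regions = rows_per_region                 # -ROOT- points at .META. regions
user_regions = meta_regions * rows_per_region  # each .META. region points at user regions
user_bytes = user_regions * REGION_SIZE        # total addressable user data

print(f"{meta_regions:.1e} .META. regions")
print(f"{user_regions:.1e} user regions")
print(f"{user_bytes:.1e} bytes of user data")
```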
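HBase in operation
NN: NameNode, DN: DataNode, HM: HMaster, HR: RegionServer (drawn as R)
[Figure: cluster diagram with the HBase Client, HDFS (one NN and five DNs), HBase (one HM and five Rs), and ZooKeeper. The client's request consults ZooKeeper, -ROOT- and .META. (steps 1 and 2), then reaches the user region (step 3).]
• Read requests
  – Step 1: find the location of -ROOT-
  – Step 2: find the location of the .META. region
  – Step 3: go to the user region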
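The three steps can be sketched with plain dictionaries and sorted lists. Everything named here (`ZOOKEEPER`, `ROOT_TABLE`, `META_TABLES`, the server and region names, and the row-key boundaries) is a hypothetical stand-in for the real catalog, chosen only to show the order of the lookups.

```python
import bisect


def locate(sorted_entries, key):
    """Return the value whose start key is the greatest one <= key."""
    starts = [start for start, _ in sorted_entries]
    return sorted_entries[bisect.bisect_right(starts, key) - 1][1]


# Step 1 data: where the -ROOT- region lives (assumed to be kept in ZooKeeper).
ZOOKEEPER = {"root": "server-a"}
# Step 2 data: -ROOT- maps row-key ranges to .META. regions.
ROOT_TABLE = [("", "meta-1"), ("com.m", "meta-2")]
# Step 3 data: each .META. region maps row-key ranges to user-region servers.
META_TABLES = {
    "meta-1": [("", "server-b"), ("com.def.www", "server-c")],
    "meta-2": [("com.m", "server-d")],
}


def find_user_region_server(row_key):
    root_server = ZOOKEEPER["root"]                       # Step 1: locate -ROOT-
    meta_region = locate(ROOT_TABLE, row_key)             # Step 2: locate the .META. region
    user_server = locate(META_TABLES[meta_region], row_key)  # Step 3: locate the user region
    return root_server, meta_region, user_server


print(find_user_region_server("com.cnn.www"))
print(find_user_region_server("com.yahoo.news.tw"))
```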
HBase in operation
NN: NameNode, DN: DataNode, HM: HMaster, HR: RegionServer (drawn as R)
[Figure: the same cluster diagram; the HBase Client interacts with the RegionServer.]
• Read requests
  – Clients cache the locations of -ROOT-, .META. and the user region
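Caching means the catalog is consulted only the first time a region is touched. The sketch below is a hypothetical client-side cache, not the HBase client implementation: `CachingClient` and `slow_lookup` are made-up names, and `slow_lookup` stands in for the full -ROOT- and .META. walk.

```python
class CachingClient:
    """Remembers region locations so repeated reads skip the catalog lookup."""

    def __init__(self, lookup):
        self._lookup = lookup      # function: row_key -> (start_key, end_key, server)
        self._regions = []         # cached (start_key, end_key, server) tuples
        self.catalog_lookups = 0   # how many times the catalog was consulted

    def locate(self, row_key):
        # Serve from the cache when the key falls inside a known region.
        for start, end, server in self._regions:
            if start <= row_key and (end == "" or row_key < end):
                return server
        # Cache miss: do the full lookup once and remember the region.
        self.catalog_lookups += 1
        region = self._lookup(row_key)
        self._regions.append(region)
        return region[2]


def slow_lookup(row_key):
    """Hypothetical catalog: two regions split at "com.def.www" ("" = open end)."""
    if row_key < "com.def.www":
        return ("", "com.def.www", "server-b")
    return ("com.def.www", "", "server-c")


client = CachingClient(slow_lookup)
first = client.locate("com.abc.www")         # miss: consults the catalog
second = client.locate("com.cnn.www")        # hit: same region, no catalog lookup
third = client.locate("com.yahoo.news.tw")   # miss: a different region
print(first, second, third, client.catalog_lookups)
```

A real client also has to drop a cached entry when a region moves or splits; that invalidation step is left out of this sketch.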
HBase in operation
Interacts with RegionServer
• State held by a region server for a table
[Figure: the HBase Client talks to a Region Server, which holds an HLog and several Regions; each Region contains HStores, and an HStore has a MemStore and HFiles.]
HBase in operation
RegionServer
• The client sends a request to save data in a table
[Figure: the same Region Server diagram: HBase Client, HLog, Regions, HStores, MemStore, HFiles.]
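The components in the diagram suggest the usual write path: append to the HLog, buffer in the MemStore, and flush to an HFile when the buffer fills. The sketch below is a toy model of that flow under those assumptions; `ToyStore` and its tiny `flush_threshold` (counted in entries rather than bytes) are hypothetical.

```python
class ToyStore:
    """Toy write path: HLog (append-only) -> MemStore (in memory) -> HFiles (sorted)."""

    def __init__(self, flush_threshold=3):
        self.hlog = []        # write-ahead log, appended before anything else
        self.memstore = {}    # recent writes held in memory
        self.hfiles = []      # flushed, sorted, immutable snapshots
        self.flush_threshold = flush_threshold

    def put(self, key, value):
        self.hlog.append((key, value))   # 1. log the edit first
        self.memstore[key] = value       # 2. then apply it in memory
        if len(self.memstore) >= self.flush_threshold:
            self.flush()                 # 3. spill to a file when the buffer is full

    def flush(self):
        self.hfiles.append(sorted(self.memstore.items()))
        self.memstore = {}

    def get(self, key):
        # Newest data wins: check the MemStore, then HFiles from newest to oldest.
        if key in self.memstore:
            return self.memstore[key]
        for hfile in reversed(self.hfiles):
            for k, v in hfile:
                if k == key:
                    return v
        return None


store = ToyStore(flush_threshold=2)
store.put("b", "1")
store.put("a", "2")   # second entry triggers a flush
store.put("b", "3")   # newer value for "b" stays in the MemStore
print(store.hfiles, store.memstore, store.get("b"), store.get("a"))
```

Logging before buffering is what lets the MemStore contents be rebuilt after a region server failure, which ties in with the fault-tolerance point on the final slide.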
Characteristics of HBase
• Fault tolerance
• Batch processing
• Automatic partitioning
• Scales linearly with new nodes