HBaseCon 2015 - HBase @ Flipboard

TRANSCRIPT

Page 1: HBaseCon 2015- HBase @ Flipboard

HBASE @ FLIPBOARD

Page 2: HBaseCon 2015- HBase @ Flipboard

What is Flipboard?

Page 3: HBaseCon 2015- HBase @ Flipboard
Page 4: HBaseCon 2015- HBase @ Flipboard

Publications

Page 5: HBaseCon 2015- HBase @ Flipboard

People

Page 6: HBaseCon 2015- HBase @ Flipboard

Topics

Page 7: HBaseCon 2015- HBase @ Flipboard

Scale

100+ million users

250,000+ new users/day

15+ million magazines

Page 8: HBaseCon 2015- HBase @ Flipboard

Why HBase?

• Transferable Hadoop operational expertise

• MapReduce!

• Write throughput

• Better elasticity than MySQL, our other primary data store

• Strong consistency

• Column-oriented, as opposed to simple K/V

Page 9: HBaseCon 2015- HBase @ Flipboard

What do we use it for?

• User-generated magazines, likes, comments

• Vanity metrics for those magazines- daily + all-time counters/HLLs

• Follow graph

• RSS feeds

• More and more every day…

Page 10: HBaseCon 2015- HBase @ Flipboard

Magazine Storage

• Stored in a single HBase table

• Magazines live in one column family (“magazine”)

• Articles in temporal order in another CF (“article”)

• Logically, everything shared is tagged with magazine ID (prefix compression helps here)

• Makes the calculation of everything a user shared efficient

Page 11: HBaseCon 2015- HBase @ Flipboard

User Magazines

Row key: sha1(userid)
Columns: magazine:<magazineid> (one per magazine)
Values: MagazineData (serialized JSON)

Magazine CF of the collection table. Listing the magazines a user has created is a single read.

Data is stored as serialized JSON for language interoperability, but is parsed and serialized by plain old Java objects.
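A minimal sketch of that single read, shown with the standard synchronous Java client for brevity (the deck's own Java/Scala code goes through AsyncHBase). The "collection" table and "magazine" CF names come from the slides; the user ID and everything else is illustrative:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Map;
import java.util.NavigableMap;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

public class ListUserMagazines {
  public static void main(String[] args) throws Exception {
    // Row key is sha1(userid), as on the slide.
    byte[] rowKey = MessageDigest.getInstance("SHA-1")
        .digest("user-1234".getBytes(StandardCharsets.UTF_8));

    try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
         Table collection = conn.getTable(TableName.valueOf("collection"))) {
      // One Get over the "magazine" CF returns every magazine the user has created.
      Get get = new Get(rowKey).addFamily(Bytes.toBytes("magazine"));
      Result result = collection.get(get);
      NavigableMap<byte[], byte[]> magazines =
          result.getFamilyMap(Bytes.toBytes("magazine"));
      if (magazines != null) {
        for (Map.Entry<byte[], byte[]> cell : magazines.entrySet()) {
          String magazineId = Bytes.toString(cell.getKey());    // qualifier is the magazine ID
          String magazineJson = Bytes.toString(cell.getValue()); // MagazineData (serialized JSON)
          System.out.println(magazineId + " -> " + magazineJson);
        }
      }
    }
  }
}
```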

Page 12: HBaseCon 2015- HBase @ Flipboard

User Magazines

Row key: sha1(userid)
Columns: [Reverse Unix TS]:[magazineid] (one per shared article)
Values: Article Data (serialized JSON)

Articles CF of the collection table. Kept in temporal order so that the most recently shared articles are first.

Access patterns are usually newest first. HBase filters are used to slice wide rows.
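A hedged sketch of both halves of this layout: building a qualifier that sorts newest-first, and slicing the wide row so only the most recent N cells come back. ColumnPaginationFilter is just one reasonable choice; the slides don't say which filter Flipboard uses:

```java
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.ColumnPaginationFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class RecentArticles {
  /** Reverse Unix timestamp so lexicographic qualifier order == newest first. */
  static byte[] articleQualifier(long shareTimeSeconds, String magazineId) {
    String reverseTs = String.format("%019d", Long.MAX_VALUE - shareTimeSeconds);
    return Bytes.toBytes(reverseTs + ":" + magazineId);   // [Reverse Unix TS]:[magazineid]
  }

  /** Fetch the n most recently shared articles from a user's wide row. */
  static Result newestArticles(Table collection, byte[] sha1UserId, int n)
      throws java.io.IOException {
    Get get = new Get(sha1UserId).addFamily(Bytes.toBytes("article"));
    // Qualifiers already sort newest-first, so the first n columns are the n newest.
    get.setFilter(new ColumnPaginationFilter(n, 0));
    return collection.get(get);
  }
}
```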

Page 13: HBaseCon 2015- HBase @ Flipboard

User Magazines

Row key: Article ID
Columns: like:[userid], reflip:[new article id], comment:[timestamp][userid]
Values: JSON (who/when), JSON (where it was reflipped), JSON (comment/person)

Social Activity

One cell per like, since a user can only like an article once. There can be many comment and reflip cells from one user per article.

Alternative orderings can be computed from Elasticsearch indexes
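Because the qualifier encodes the user ID, a like is a single idempotent cell write. A sketch under assumptions: the CF name "social" and the helper shape are made up; only the qualifier formats come from the slide:

```java
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class RecordSocialActivity {
  // CF name "social" is a stand-in; the deck only gives the qualifier shapes.
  static final byte[] CF = Bytes.toBytes("social");

  /** One cell per like: the qualifier encodes the user, so re-liking overwrites the same cell. */
  static void like(Table articles, String articleId, String userId, String whoWhenJson)
      throws java.io.IOException {
    Put put = new Put(Bytes.toBytes(articleId))
        .addColumn(CF, Bytes.toBytes("like:" + userId), Bytes.toBytes(whoWhenJson));
    articles.put(put);
  }

  /** Comments can repeat per user, so the timestamp is part of the qualifier. */
  static void comment(Table articles, String articleId, String userId, String commentJson)
      throws java.io.IOException {
    long ts = System.currentTimeMillis() / 1000L;
    Put put = new Put(Bytes.toBytes(articleId))
        .addColumn(CF, Bytes.toBytes("comment:" + ts + userId), Bytes.toBytes(commentJson));
    articles.put(put);
  }
}
```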

Page 14: HBaseCon 2015- HBase @ Flipboard

User Magazines

Row key: sha1(userid)
Columns: magazine:<magazineid>, contributor:<magazine>:<userid>
Values: JSON metadata

Multiple Contributors

Magazine CF contains magazines that the user can share into.

Contributor CF contains the user’s magazines that others are allowed to share into.

Page 15: HBaseCon 2015- HBase @ Flipboard

User Magazines

Row key: magID
Columns: <metric>:<day>, <metric>_count
Values: long count for the day, all-time count

Per-magazine metrics live in the stats CF.

Atomic increments for counters, both a per-day count and a total count: total articles, contributors, etc.
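A sketch of what recording one new article might look like as a pair of atomic increments; the day formatting and helper are assumptions, while the <metric>:<day> and <metric>_count qualifier shapes are from the slide:

```java
import java.time.LocalDate;
import java.time.ZoneOffset;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class MagazineCounters {
  static final byte[] STATS_CF = Bytes.toBytes("stats");

  /** Bump both the per-day and the all-time counter for one metric, e.g. "article". */
  static void bump(Table table, byte[] magId, String metric) throws java.io.IOException {
    String day = LocalDate.now(ZoneOffset.UTC).toString();   // e.g. 2015-05-07
    // <metric>:<day> -- long count for that day; row key is the magazine ID.
    table.incrementColumnValue(magId, STATS_CF, Bytes.toBytes(metric + ":" + day), 1L);
    // <metric>_count -- all-time count.
    table.incrementColumnValue(magId, STATS_CF, Bytes.toBytes(metric + "_count"), 1L);
  }
}
```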

Page 16: HBaseCon 2015- HBase @ Flipboard

User Magazines

Row key: magID
Columns: <unique>:<day>, <unique>_count
Values: HLL for an individual day, pre-merged HLL

Unique readers are kept in each magazine’s row as a serialized HyperLogLog.

Allows merging unique data over day ranges or displaying an all-time count.
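The slides don't name an HLL library; here is a sketch assuming something like stream-lib's HyperLogLog, merging the per-day sketches fetched for a date range into one unique-readers estimate:

```java
import com.clearspring.analytics.stream.cardinality.HyperLogLog;
import java.util.List;

public class UniqueReaders {
  /**
   * Merge the serialized per-day HLL cells fetched for a magazine (one byte[] per
   * <unique>:<day> column) into a single cardinality estimate for the range.
   */
  static long uniqueReaders(List<byte[]> dailyHllCells) throws Exception {
    HyperLogLog merged = null;
    for (byte[] cell : dailyHllCells) {
      HyperLogLog day = HyperLogLog.Builder.build(cell);   // deserialize one day's sketch
      if (merged == null) {
        merged = day;
      } else {
        merged.addAll(day);                                 // HLL union is lossless
      }
    }
    return merged == null ? 0L : merged.cardinality();
  }
}
```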

Page 17: HBaseCon 2015- HBase @ Flipboard

Social Graph

Row key: sha1(userid)
Columns: follow:userid, follower:userid, stats:<counter>
Values: JSON (person that I follow), JSON (person that follows me), long count of followers/following

Stored in the friends table, in the follow/follower/stats CFs; metadata in MySQL

Alternative indexes in Elasticsearch

Page 18: HBaseCon 2015- HBase @ Flipboard

HBase Table Access Patterns

• Tables optimized for application access patterns (“design for the questions, not the answers”)

• Fetching an individual magazine - collection table, magazine CF, [magazine ID] -> cell

• Fetching an individual article - article table, article:[article ID] cell

• Fetching an article’s stats - article table, article:stats cells

• Fetching a magazine’s articles - collection table, article CF, with a cell limit and a column qualifier that starts with the magazine ID

• Fetching a user’s magazines - collection table, magazine CF, [magazine ID] in the CQ

Page 19: HBaseCon 2015- HBase @ Flipboard

Client Stats

• Articles: sum(magazine stats:article_count for each magazine)

• Magazines: count(collection:magazine) cells

• Followers: friends:stats:follower_count + sum(magazine stats:subscriber_count for each magazine)
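A compact sketch of the first of those aggregations - summing stats:article_count across a user's magazines with one batched multi-get. The helper and the batching choice are illustrative, not necessarily how Flipboard does it:

```java
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class ClientStats {
  static final byte[] STATS_CF = Bytes.toBytes("stats");
  static final byte[] ARTICLE_COUNT = Bytes.toBytes("article_count");

  /** Articles stat: sum(magazine stats:article_count) over all of the user's magazines. */
  static long totalArticles(Table table, List<byte[]> magazineIds) throws java.io.IOException {
    List<Get> gets = new ArrayList<>();
    for (byte[] magId : magazineIds) {
      // Stats rows are keyed by magazine ID.
      gets.add(new Get(magId).addColumn(STATS_CF, ARTICLE_COUNT));
    }
    long total = 0;
    for (Result r : table.get(gets)) {            // one batched multi-get
      byte[] counter = r.getValue(STATS_CF, ARTICLE_COUNT);
      if (counter != null) {
        total += Bytes.toLong(counter);           // counters are stored as 8-byte longs
      }
    }
    return total;
  }
}
```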

Page 20: HBaseCon 2015- HBase @ Flipboard

More Client Stats

• Summary stats use counters from the article table; detailed stats (who liked the article?) read individual cells

• We can cache the feed of items, but the stats/like state is calculated per user

• likes: article:stats:like_count

• reflips: article:stats:reflip_count

• comments: article:stats:comment_count

Page 21: HBaseCon 2015- HBase @ Flipboard

Even More Client Stats

Page 22: HBaseCon 2015- HBase @ Flipboard

AsyncHBase Usage

• Our fork adds column filters on wide rows - we’d like to get these upstream

• Stats requests require scatter/gather reads for several tables, sometimes over multiple HBase clusters

• AsyncHBase requests are grouped into a single Deferred

• Most requests are a get on a single row, no multi row scans

• Most requests wait once until the results are returned or a deadline expires

• If data is returned late or HBase regions are not available, partial calculations are allowed (we just display the stats we’ve got)
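A hedged sketch of that pattern with AsyncHBase: fire the gets in parallel, turn individual failures into empty results so partial stats still render, group everything into a single Deferred, and wait once with a deadline. The helper shape and deadline handling are assumptions:

```java
import com.stumbleupon.async.Callback;
import com.stumbleupon.async.Deferred;
import java.util.ArrayList;
import java.util.List;
import org.hbase.async.GetRequest;
import org.hbase.async.HBaseClient;
import org.hbase.async.KeyValue;

public class ScatterGatherStats {
  static ArrayList<ArrayList<KeyValue>> fetchStats(HBaseClient client,
                                                   List<GetRequest> gets,
                                                   long deadlineMs) throws Exception {
    List<Deferred<ArrayList<KeyValue>>> pending = new ArrayList<>();
    for (GetRequest get : gets) {
      pending.add(client.get(get)
          // Swallow per-request failures: partial stats are acceptable, so a missing
          // region or a late answer becomes an empty row instead of an error.
          .addErrback(new Callback<ArrayList<KeyValue>, Exception>() {
            public ArrayList<KeyValue> call(Exception e) {
              return new ArrayList<KeyValue>();
            }
          }));
    }
    // All the reads for one stats response are grouped into a single Deferred,
    // and we wait exactly once, up to the deadline.
    return Deferred.group(pending).join(deadlineMs);
  }
}
```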

Page 23: HBaseCon 2015- HBase @ Flipboard

Handling HBase Failures

• Most patterns are read-before-write, which causes early failure

• We can tolerate some data loss (atomic increments, vanity stats)

• Individual servers track inflight requests to HBase, slow puts and gets, and report to Graphite

• Various levels of caching allow HBase recovery/region reassignment without end users noticing

• Read Only mode - writes are stopped at the application layer

• Ability to switch to replica under duress

Page 24: HBaseCon 2015- HBase @ Flipboard

Current HBase Fleet

• 15 clusters

• ~100 tables

• ~250TB in HDFS

• ~250 RegionServers

• Busiest clusters: 100,000+ qps, 1000 regions

Page 25: HBaseCon 2015- HBase @ Flipboard

HBase Fleet, continued

• All in EC2 😳

• Nothing in VPC, yet

• Each cluster lives within an AZ

• 1 durable cluster doing cross-AZ HBase-level replication

• 1 cluster running Stargate (it works, but we’re not in love with it)

Page 26: HBaseCon 2015- HBase @ Flipboard

HBase History at Flipboard

Oldest current production instances launched in 2011

Page 27: HBaseCon 2015- HBase @ Flipboard

(This cluster is going away soon 😀)

Page 28: HBaseCon 2015- HBase @ Flipboard

HBase Version Distribution

• 0.90: dwindling, thankfully

• 0.94: Moved to Snappy; Stargate cluster for RSS storage; Python writers, Go readers

• 0.96: First CDH5/Java 7/Ubuntu Precise clusters; magazines live in one of these

• 0.98: pre-calculated user homefeeds, more

• 1.0, 1.1: Soon…

Page 29: HBaseCon 2015- HBase @ Flipboard

Which instance types?

• Started off with m1.xlarges for the 1.6TB of ephemeral (spinning) disk; when we started using HBase, AWS didn’t have SSDs

• Moved to hi1.4xlarges (16 cores, 60GB RAM, 2x1TB local SSD)

• Moved to i2s (next-gen SSD instances, made for databases) as soon as AWS let us launch them!

• ❤ i2s; some 2x, some 4x

Page 30: HBaseCon 2015- HBase @ Flipboard

AWS tips

• Use instance storage, not EBS

• Rely on HDFS to keep your bits replicated instead of using EBS

• Cross-AZ latency is minimal, but traffic is expensive!

• Push HDFS snapshots to S3 (we trigger this from Jenkins)

• If you jack up your network-related timeouts to handle AWS’ network flakiness, your MTTR rises, so be careful…

• Upgrade often, you’ll get more sleep!

Page 31: HBaseCon 2015- HBase @ Flipboard

Clients

• Java, Scala- AsyncHBase, which we love. Added column filtering for wide rows.

• Python + Go: protobufs over HTTP via Stargate, which works

• We use HAProxy everywhere, so we use that to load balance requests to Stargate servers

Page 32: HBaseCon 2015- HBase @ Flipboard

What’s next for HBase at Flipboard?

• Moar HBase

• 1.0, now that CDH has it in 5.4.0; 1.1 when CDH gets it, hopefully soon…

• Region replicas (HBASE-10070) will help with use cases that can tolerate timeline consistency; 1.1 will have many improvements here

• Compaction throttling! (HBASE-8329)

• Java 8 + G1, Ubuntu Trusty, 3.13 kernel

• EC2 placement groups + VPC enhanced networking, once we’re in (no charge for these)

• HTrace (we use a little Zipkin, would love to get more HBase visibility)

• Multitenancy improvements in Apache HBase will help us put more customers on a cluster

Page 33: HBaseCon 2015- HBase @ Flipboard

Wish List

HydraBase! (HBASE-12259)

Async client in Apache HBase so it keeps pace (HBASE-12684 is a start!)

Native Go client for 0.96+

Page 34: HBaseCon 2015- HBase @ Flipboard

Thanks!

• Matt Blair (@mb on Flipboard, @mattyblair on Twitter)

• Jason Culverhouse (@jsonculverhouse on both)

• Sang Chi (@sangchi on Flipboard, @sandbreaker on Twitter)