Transcript
Page 1: HBaseCon 2012 | Building Mobile Infrastructure with HBase

Building Mobile Infrastructure with HBase

May 2012

Nate Putnam

Page 2: HBaseCon 2012 | Building Mobile Infrastructure with HBase

About Me

• Core Data and Analytics (1 year)

• Previously Engineer at Jive Software (4 years)

• Contributor to HBase/Zookeeper

Page 3: HBaseCon 2012 | Building Mobile Infrastructure with HBase

In this Talk

• About Urban Airship

• What is mobile infrastructure?

• Common mistakes

• Performance tuning and monitoring

• Tag system use case

• Questions

Page 4: HBaseCon 2012 | Building Mobile Infrastructure with HBase

What is an Urban Airship?

• Hosting for mobile services that developers should not build themselves

• Unified API for services across platforms

• SLAs for throughput, latency

Page 5: HBaseCon 2012 | Building Mobile Infrastructure with HBase

By The Numbers

• Hundreds of millions of devices

• Front end API sustains thousands of requests per second

• Millions of Android devices online all the time

• 6 months for the company to deliver 1M messages; now more than a hundred million a day.

Page 6: HBaseCon 2012 | Building Mobile Infrastructure with HBase

Mobile Infrastructure?

• Similar services as any large website.

• Identity

• Messaging

• Reporting

• Segmentation

• Waste time, waste battery, lose customers

Page 8: HBaseCon 2012 | Building Mobile Infrastructure with HBase

Mistakes

• Didn’t have any Hadoop experience; mistakes were made.

• Get READY, XMAS IS COMING!

Page 9: HBaseCon 2012 | Building Mobile Infrastructure with HBase

Mistakes

XMAS :)

Page 10: HBaseCon 2012 | Building Mobile Infrastructure with HBase

Mistakes

• Not tuning OS

• XFS

• Disable swap

• Large page support

Page 12: HBaseCon 2012 | Building Mobile Infrastructure with HBase

Mistakes

Using RAID or virtualized disks

Page 13: HBaseCon 2012 | Building Mobile Infrastructure with HBase

Mistakes

Not using Cloudera

Page 14: HBaseCon 2012 | Building Mobile Infrastructure with HBase

Mistakes

• Not tuning Hadoop

• dfs.datanode.max.xcievers

• Replication factor

• Number of mappers

Page 15: HBaseCon 2012 | Building Mobile Infrastructure with HBase

Mistakes

• Not tuning HBase

• Handler count

• Region Size

• Block Cache

• HFile block size

Page 16: HBaseCon 2012 | Building Mobile Infrastructure with HBase

Mistakes

• Coding against HBase

• Use lightweight access (checkAndPut, exists)

• HBaseTestingUtility is great

• Batching *or* concurrency
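One reading of the batching bullet is that many small writes should be grouped into fewer round trips. The sketch below shows only the grouping step in plain Java. The `Batcher` class and the batch size are hypothetical, and nothing here calls the HBase client; each resulting batch would be handed to whatever bulk write call the client offers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: splits pending writes into fixed-size batches. */
final class Batcher {
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the slice so each batch is independent of the source list.
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> writes = Arrays.asList("w1", "w2", "w3", "w4", "w5");
        // Each inner list would be sent to the store in one round trip.
        for (List<String> batch : partition(writes, 2)) {
            System.out.println(batch);
        }
    }
}
```
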

Page 18: HBaseCon 2012 | Building Mobile Infrastructure with HBase

Mistakes

• Schema

• Non-distributed keys

• Too many column families

• Compression

• Versions
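A common remedy for keys that do not distribute well is to prefix each row key with a bucket derived from a hash of the key, so that sequential keys land in different regions. The following is a minimal sketch of that idea, not the schema used in the talk. The `SaltedKeys` class, the bucket count of 16, and the `device-…` key format are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper: prefixes a row key with a hash-derived bucket byte. */
final class SaltedKeys {
    static final int BUCKETS = 16; // assumed value, e.g. one bucket per pre-split region

    static byte[] salt(String naturalKey) {
        byte[] key = naturalKey.getBytes(StandardCharsets.UTF_8);
        byte[] digest;
        try {
            digest = MessageDigest.getInstance("MD5").digest(key);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // MD5 is required on every Java platform
        }
        // Map the first digest byte (as unsigned) onto a bucket in [0, BUCKETS).
        byte bucket = (byte) ((digest[0] & 0xff) % BUCKETS);
        byte[] salted = new byte[key.length + 1];
        salted[0] = bucket;
        System.arraycopy(key, 0, salted, 1, key.length);
        return salted;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 5; i++) {
            String k = "device-" + (1000 + i); // hypothetical sequential keys
            System.out.println(k + " -> bucket " + salt(k)[0]);
        }
    }
}
```

The trade-off is that a range scan over the natural key order now needs one scan per bucket, so this fits point lookups and full scans better than ordered range reads.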

Page 19: HBaseCon 2012 | Building Mobile Infrastructure with HBase

Mistakes

• How do you tell if your region sizes are good?

• How do you tell if your splits are good?

• http://bobcopeland.com/blog/2012/04/graphing-hbase-splits/

Page 20: HBaseCon 2012 | Building Mobile Infrastructure with HBase

Mistakes

Page 21: HBaseCon 2012 | Building Mobile Infrastructure with HBase

Mistakes

Page 22: HBaseCon 2012 | Building Mobile Infrastructure with HBase

Mistakes

Page 23: HBaseCon 2012 | Building Mobile Infrastructure with HBase

Performance Monitoring

• statshtable exposes Yammer metrics around HBase client calls.

• Written by @dave_revell

• Simple integration with your existing Java code
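The general pattern behind this kind of client-side monitoring is to wrap each call, measure how long it takes, and record the result. The sketch below illustrates that pattern in plain Java. It is not the statshtable API and does not use Yammer metrics; the `TimedCalls` class and its methods are hypothetical, and the lambda stands in for a real client call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical timing wrapper; not the statshtable API. */
final class TimedCalls {
    private final List<Long> latenciesNanos = new ArrayList<>();

    /** Runs the call, records its wall-clock latency, and returns its result. */
    <T> T time(Supplier<T> call) {
        long start = System.nanoTime();
        try {
            return call.get();
        } finally {
            long elapsed = System.nanoTime() - start;
            // Lock only while recording so timed calls can run concurrently.
            synchronized (latenciesNanos) {
                latenciesNanos.add(elapsed);
            }
        }
    }

    int count() {
        synchronized (latenciesNanos) {
            return latenciesNanos.size();
        }
    }

    long maxNanos() {
        synchronized (latenciesNanos) {
            long max = 0;
            for (long n : latenciesNanos) {
                max = Math.max(max, n);
            }
            return max;
        }
    }

    public static void main(String[] args) {
        TimedCalls timer = new TimedCalls();
        // Stand-in for a client call such as a get or put.
        String value = timer.time(() -> "row-value");
        System.out.println(value + " calls=" + timer.count() + " maxNanos=" + timer.maxNanos());
    }
}
```
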

Page 26: HBaseCon 2012 | Building Mobile Infrastructure with HBase

Performance Monitoring

https://github.com/urbanairship/statshtable

Page 27: HBaseCon 2012 | Building Mobile Infrastructure with HBase

Tag System

• Give an object one or more aliases

• Ability to query based on those aliases

Page 28: HBaseCon 2012 | Building Mobile Infrastructure with HBase

Tag System

• Hard on relational databases

• Large datasets are spread out on disk

• Query plan can fall back to full table scan

• Long reads hurt other parts of the system

Page 30: HBaseCon 2012 | Building Mobile Infrastructure with HBase

Tag System

• Legacy system running on Postgres

• Sharded dataset, adding new shards is tedious

• Needed something that was low latency and high throughput

• Support for grouping, AND, OR, NOT
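To make the query requirement concrete, here is a minimal sketch of grouped AND/OR/NOT matching against one object's tag set, using `java.util.function.Predicate`. It is not the production query engine described in the talk. The `TagQuery` class and the tag names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.function.Predicate;

/** Hypothetical tag predicates; not the production query engine. */
final class TagQuery {
    /** Matches when the object's tag set contains the given tag. */
    static Predicate<Set<String>> tag(String name) {
        return tags -> tags.contains(name);
    }

    public static void main(String[] args) {
        // (sports AND NOT muted) OR vip; chaining order expresses the grouping.
        Predicate<Set<String>> query =
            tag("sports").and(tag("muted").negate()).or(tag("vip"));

        Set<String> device = new HashSet<>(Arrays.asList("sports", "news"));
        System.out.println(query.test(device)); // prints "true"
    }
}
```

A brute-force matcher can apply a predicate like this to every candidate row and emit only the rows for which it holds.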

Page 31: HBaseCon 2012 | Building Mobile Infrastructure with HBase

Tag System

• Bake-off between HBase and Lucene

• HBase won because of operational ease and known scalability characteristics

Page 32: HBaseCon 2012 | Building Mobile Infrastructure with HBase

Tag System

• Match can return full row or key

• Only touch disk when necessary

• Brute force, fast, simple

Page 34: HBaseCon 2012 | Building Mobile Infrastructure with HBase

Tag System

• Good

• Fast: 3-node EC2 cluster, ~200k/sec, ~0.5 seconds to first result

• 3.5 million send time decreased 9x

• Scales easily

• Bad

• Key partitions can’t be changed

• Inefficient in the scaled down case

Page 35: HBaseCon 2012 | Building Mobile Infrastructure with HBase

Thanks!

• HBase

• Urban Airship http://www.urbanairship.com

• We’re hiring! http://urbanairship.com/company/jobs/

• @nateputnam

Page 36: HBaseCon 2012 | Building Mobile Infrastructure with HBase

Questions?

