Building Mobile Infrastructure with HBase
May 2012
Nate Putnam
About Me
• Core Data and Analytics (1 year)
• Previously Engineer at Jive Software (4 years)
• Contributor to HBase/Zookeeper
In this Talk
• About Urban Airship
• What is mobile infrastructure?
• Common mistakes
• Performance tuning and monitoring
• Tag system use case
• Questions
What is an Urban Airship?
• Hosting for mobile services that developers should not build themselves
• Unified API for services across platforms
• SLAs for throughput, latency
By The Numbers
• Hundreds of millions of devices
• Front end API sustains thousands of requests per second
• Millions of Android devices online all the time
• Took the company 6 months to deliver 1M messages; now more than a hundred million a day
Mobile Infrastructure?
• Similar services as any large website.
• Identity
• Messaging
• Reporting
• Segmentation
• Waste time, waste battery, lose customers
Mistakes
• Didn’t have any Hadoop experience; mistakes were made.
• Get READY, XMAS IS COMING!
Mistakes
XMAS :)
Mistakes
• Not tuning OS
• XFS
• Disable swap
• Large page support
Mistakes
Using RAID or virtualized disks
Mistakes
Not using Cloudera
Mistakes
• Not tuning Hadoop
• dfs.datanode.max.xcievers
• Replication factor
• Number of mappers
Mistakes
• Not tuning HBase
• Handler count
• Region Size
• Block Cache
• HFile block size
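As a rough illustration of how region size and block cache settings interact with cluster size, the Python sketch below does the back-of-the-envelope arithmetic. Every number is hypothetical (not Urban Airship's), and the function names are made up for this example.

```python
import math

GIB = 1024 ** 3

def region_counts(table_bytes, max_region_bytes, servers):
    """Estimate total regions and regions per server.

    Assumes every region is full, so this is a lower bound:
    real regions are usually only partly full.
    """
    total = math.ceil(table_bytes / max_region_bytes)
    return total, math.ceil(total / servers)

def block_cache_bytes(heap_bytes, cache_fraction):
    """The block cache is sized as a fraction of the region server heap."""
    return int(heap_bytes * cache_fraction)

# Hypothetical cluster: 2 TiB table, 4 GiB max region size, 10 region servers.
total, per_server = region_counts(2048 * GIB, 4 * GIB, 10)

# Hypothetical region server: 8 GiB heap, a quarter of it for the block cache.
cache = block_cache_bytes(8 * GIB, 0.25)
```

Running the numbers like this before changing a setting makes it easier to see whether a smaller region size would push the per-server region count too high.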
Mistakes
• Coding against HBase
• Use lightweight access (checkAndPut, exists)
• HBaseTestingUtility is great
• Batching *or* concurrency
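To illustrate why a check-and-put style call is lighter than a read followed by a write, here is a toy in-memory model in Python. It is not the HBase client API: `ToyTable` and its methods are hypothetical stand-ins that only mimic the semantics of `checkAndPut` and `exists`.

```python
import threading

class ToyTable:
    """In-memory stand-in for a table; NOT the HBase client API."""

    def __init__(self):
        self._rows = {}
        self._lock = threading.Lock()

    def exists(self, row, column):
        """Report presence without returning the stored value."""
        with self._lock:
            return column in self._rows.get(row, {})

    def check_and_put(self, row, column, expected, new_value):
        """Write new_value only if the current value equals expected.

        expected=None means "the cell must be absent". The compare and
        the write happen under one lock, so no other writer can slip
        in between them.
        """
        with self._lock:
            cells = self._rows.setdefault(row, {})
            if cells.get(column) != expected:
                return False
            cells[column] = new_value
            return True

table = ToyTable()
first = table.check_and_put("device-1", "alias", None, "nate")    # cell absent: written
second = table.check_and_put("device-1", "alias", None, "other")  # cell present: rejected
```

The caller gets a single boolean back in both cases, so no row data has to travel to the client just to decide whether to write.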
Mistakes
• Schema
• Non-distributed keys
• Too many column families
• Compression
• Versions
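One common way to avoid non-distributed keys is to prefix each row key with a short hash-derived salt, so that sequential IDs land in different regions. A minimal Python sketch, assuming 16 buckets and an MD5-based prefix (both arbitrary choices for illustration, not something the talk specifies):

```python
import hashlib
from collections import Counter

NUM_BUCKETS = 16  # hypothetical bucket count, fixed up front

def salted_key(natural_key: str) -> bytes:
    """Prefix the key with one bucket byte derived from its hash."""
    raw = natural_key.encode("utf-8")
    bucket = hashlib.md5(raw).digest()[0] % NUM_BUCKETS
    return bytes([bucket]) + raw

# Sequential IDs would otherwise sort next to each other in one region.
keys = [salted_key(f"device-{i:08d}") for i in range(10_000)]

# Count how many keys fall into each bucket.
spread = Counter(key[0] for key in keys)
```

The trade-off is that a scan over a range of natural keys has to fan out to every bucket, and the bucket count is baked into the stored keys, so changing it later means rewriting them.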
Mistakes
• How do you tell if your region sizes are good?
• How do you tell if your splits are good?
• http://bobcopeland.com/blog/2012/04/graphing-hbase-splits/
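One simple check for uneven splits is to compare each region's size against the median. The Python sketch below uses made-up region names and sizes; in practice the numbers would come from whatever reports your store file sizes.

```python
from statistics import median

def uneven_regions(region_sizes_mb, factor=2.0):
    """Return regions more than `factor` times larger or smaller than the median."""
    mid = median(region_sizes_mb.values())
    return {
        name: size
        for name, size in region_sizes_mb.items()
        if size > mid * factor or size < mid / factor
    }

# Hypothetical region sizes in MB.
sizes = {"r1": 950, "r2": 1020, "r3": 4100, "r4": 980, "r5": 120}
outliers = uneven_regions(sizes)
```

Here the median is 980 MB, so `r3` (too large) and `r5` (too small) are flagged as candidates for a closer look.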
Performance Monitoring
• statshtable exposes Yammer metrics around HBase client calls.
• Written by @dave_revell
• Simple integration with your existing Java code
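The general pattern of wrapping client calls with timing metrics can be sketched in a few lines. This Python example only illustrates the idea and is not how the library itself is implemented; `TimedClient` and `FakeTable` are hypothetical names.

```python
import time
from collections import defaultdict

class FakeTable:
    """Stand-in for a real table client."""

    def __init__(self):
        self._data = {}

    def put(self, row, value):
        self._data[row] = value

    def get(self, row):
        return self._data.get(row)

class TimedClient:
    """Delegates to an inner client and records per-method latencies in seconds."""

    def __init__(self, inner):
        self._inner = inner
        self.latencies = defaultdict(list)

    def __getattr__(self, name):
        # Only called for attributes not found on TimedClient itself.
        target = getattr(self._inner, name)
        if not callable(target):
            return target

        def timed(*args, **kwargs):
            start = time.perf_counter()
            try:
                return target(*args, **kwargs)
            finally:
                # Record the latency even if the call raised.
                self.latencies[name].append(time.perf_counter() - start)

        return timed

client = TimedClient(FakeTable())
client.put("row-1", "hello")
value = client.get("row-1")
```

Because the wrapper exposes the same methods as the client it wraps, calling code does not need to change beyond constructing the wrapper.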
Performance Monitoring
https://github.com/urbanairship/statshtable
Tag System
• Give an object one or more aliases
• Ability to query based on those aliases
Tag System
• Hard on relational databases
• Large datasets are spread out on disk
• Query plan can fall back to full table scan
• Long reads hurt other parts of the system
Tag System
• Legacy system running on Postgres
• Sharded dataset, adding new shards is tedious
• Needed something low-latency and high-throughput
• Support for grouping, AND, OR, NOT
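The kind of query this implies can be modelled as a small expression tree evaluated against each object's tag set. This is a minimal Python sketch with a made-up tuple-based query format, not the query language Urban Airship actually exposes.

```python
def matches(query, tags):
    """Evaluate a nested query against a set of tags.

    A query is either a tag string or a tuple:
    ("and", q1, q2, ...), ("or", q1, q2, ...), ("not", q).
    Nesting tuples provides grouping.
    """
    if isinstance(query, str):
        return query in tags
    op, *operands = query
    if op == "and":
        return all(matches(q, tags) for q in operands)
    if op == "or":
        return any(matches(q, tags) for q in operands)
    if op == "not":
        (operand,) = operands  # "not" takes exactly one operand
        return not matches(operand, tags)
    raise ValueError(f"unknown operator: {op!r}")

# (sports OR news) AND NOT opted_out
query = ("and", ("or", "sports", "news"), ("not", "opted_out"))

# Hypothetical devices and their tags.
devices = {
    "device-a": {"sports", "portland"},
    "device-b": {"news", "opted_out"},
    "device-c": {"weather"},
}
selected = sorted(d for d, tags in devices.items() if matches(query, tags))
```

Only `device-a` is selected: `device-b` is excluded by the NOT clause and `device-c` matches neither tag in the OR group.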
Tag System
• Bake off between HBase and Lucene
• HBase won because of operational ease and known scalability characteristics
Tag System
• Match can return full row or key
• Only touch disk when necessary
• Brute force, fast, simple
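A brute-force match can still deliver its first result quickly if it streams: scan rows in order and yield each hit as soon as it is found, returning only the key unless the caller asks for the full row. The Python generator below sketches that idea over an in-memory list; the row layout and the `keys_only` flag are assumptions for illustration.

```python
def scan_matches(rows, predicate, keys_only=True):
    """Lazily yield matching keys, or (key, tags) pairs when keys_only is False."""
    for key, tags in rows:
        if predicate(tags):
            yield key if keys_only else (key, tags)

# Hypothetical rows: (row key, set of tags).
rows = [
    ("device-a", {"sports"}),
    ("device-b", {"news"}),
    ("device-c", {"sports", "news"}),
]

def wants_sports(tags):
    return "sports" in tags

# The first hit is available without scanning the remaining rows.
first_key = next(scan_matches(rows, wants_sports))

# Ask for full rows only when the caller needs more than the key.
full_rows = list(scan_matches(rows, wants_sports, keys_only=False))
```

Because the generator is lazy, time to first result depends on where the first match sits, not on the size of the whole dataset.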
Tag System
• Good
• Fast: 3-node EC2 cluster, ~200k/sec, ~0.5 seconds to first result
• Time for a 3.5 million send decreased 9x
• Scales easily
• Bad
• Key partitions can’t be changed
• Inefficient in the scaled down case
Thanks!
• HBase
• Urban Airship http://www.urbanairship.com
• We’re hiring! http://urbanairship.com/company/jobs/
• @nateputnam
Questions?