HBaseCon 2012 | Building Mobile Infrastructure with HBase

Building Mobile Infrastructure with HBase, May 2012, Nate Putnam

Upload: cloudera-inc

Post on 08-Jul-2015


DESCRIPTION

In this session you will learn the common mistakes made when deploying a high-write environment and building an analytics database in HBase, pick up tips on how to diagnose and debug performance bottlenecks, and get an overview of an open source monitoring utility developed at Urban Airship for finding HBase hotspots. The session also presents a case study of how Urban Airship replaced a tag system running on a highly sharded PostgreSQL cluster with one built on HBase: the options explored for creating a high-throughput Boolean tag system and how it was ultimately built on HBase.

TRANSCRIPT

Page 1: HBaseCon 2012 | Building Mobile Infrastructure with HBase

Building Mobile Infrastructure with HBase

May 2012

Nate Putnam

Page 2

About Me

• Core Data and Analytics (1 year)

• Previously Engineer at Jive Software (4 years)

• Contributor to HBase/ZooKeeper

Page 3

In this Talk

• About Urban Airship

• What is mobile infrastructure?

• Common mistakes

• Performance tuning and monitoring

• Tag system use case

• Questions

Page 4

What is an Urban Airship?

• Hosting for mobile services that developers should not build themselves

• Unified API for services across platforms

• SLAs for throughput, latency

Page 5

By The Numbers

• Hundreds of millions of devices

• Front end API sustains thousands of requests per second

• Millions of Android devices online all the time

• It took the company 6 months to deliver 1M messages; it now delivers a hundred million plus a day.

Page 6

Mobile Infrastructure?

• Services similar to those of any large website.

• Identity

• Messaging

• Reporting

• Segmentation

• Waste time, waste battery, lose customers

Page 7

Mobile Infrastructure?

Page 8

Mistakes

• We didn’t have any Hadoop experience, so mistakes were made.

• Get READY, XMAS IS COMING!

Page 9

Mistakes

XMAS :)

Page 10

Mistakes

• Not tuning OS

• XFS

• Disable swap

• Large page support

Page 11

Mistakes

Page 12

Mistakes

Using RAID or virtualized disks

Page 13

Mistakes

Not using Cloudera

Page 14

Mistakes

• Not tuning Hadoop

• dfs.datanode.max.xcievers

• Replication factor

• Number of mappers

Page 15

Mistakes

• Not tuning HBase

• Handler count

• Region Size

• Block Cache

• HFile block size

Page 16

Mistakes

• Coding against HBase

• Use lightweight access (checkAndPut, exists)

• HBaseTestingUtility is great

• Batching *or* concurrency
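
The batching half of the "batching *or* concurrency" point can be illustrated with a small sketch: grouping many small writes into fewer round trips. This is plain Python under stated assumptions, not HBase client code; `Put`, `batched`, `write_all`, and the `send_batch` callback are hypothetical names standing in for whatever multi-put call the real client offers.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

# Hypothetical stand-in for an HBase Put: (row key, value).
Put = Tuple[bytes, bytes]


def batched(puts: Iterable[Put], batch_size: int) -> Iterator[List[Put]]:
    """Yield successive lists of at most batch_size puts."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[Put] = []
    for put in puts:
        batch.append(put)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Flush the final, possibly smaller, batch.
        yield batch


def write_all(
    puts: Iterable[Put],
    send_batch: Callable[[List[Put]], None],
    batch_size: int = 100,
) -> int:
    """Send puts in batches and return the number of round trips made."""
    round_trips = 0
    for batch in batched(puts, batch_size):
        send_batch(batch)  # assumed to perform one round trip per call
        round_trips += 1
    return round_trips
```

With 250 writes and a batch size of 100, `write_all` makes 3 calls to `send_batch` instead of 250 single-row calls.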

Page 17

Mistakes

Page 18

Mistakes

• Schema

• Non-distributed keys

• Too many column families

• Compression

• Versions
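
One common way to address non-distributed keys (for example, sequential IDs or timestamps that all sort next to each other) is to prefix each row key with a small hash-derived salt. The slides do not say which approach Urban Airship used, so treat this as a generic sketch; `salted_key` and the bucket count of 16 are assumptions chosen for illustration.

```python
import hashlib


def salted_key(natural_key: bytes, buckets: int = 16) -> bytes:
    """Prefix natural_key with a one-byte bucket derived from its hash.

    The same natural key always maps to the same bucket, so point reads
    can recompute the prefix, while adjacent natural keys are spread
    across buckets instead of sorting next to each other.
    """
    if not 1 <= buckets <= 256:
        raise ValueError("buckets must fit in one byte (1..256)")
    digest = hashlib.sha256(natural_key).digest()
    bucket = digest[0] % buckets
    return bytes([bucket]) + natural_key
```

The trade-off is on the read side: a range scan over natural keys has to fan out across every bucket and merge the results.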

Page 19

Mistakes

• How do you tell if your region sizes are good?

• How do you tell if your splits are good?

• http://bobcopeland.com/blog/2012/04/graphing-hbase-splits/

Page 20

Mistakes

Page 21

Mistakes

Page 22

Mistakes

Page 23

Performance Monitoring

• Exposes Yammer metrics around HBase client calls.

• Written by @dave_revell

• Simple integration with your existing Java code
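
The idea of exposing metrics around client calls can be sketched as a thin wrapper that times every method call on an underlying client. This is not the statshtable API (which is a Java library reporting through Yammer metrics); `TimedClient`, its `latencies` attribute, and the injectable `clock` are hypothetical names used only for illustration.

```python
import time
from collections import defaultdict
from typing import Any, Callable, DefaultDict, List


class TimedClient:
    """Wrap any client object and record the latency of each method call."""

    def __init__(self, client: Any, clock: Callable[[], float] = time.perf_counter):
        self._client = client
        self._clock = clock
        # Maps method name -> list of observed latencies in seconds.
        self.latencies: DefaultDict[str, List[float]] = defaultdict(list)

    def __getattr__(self, name: str) -> Any:
        # Only reached for attributes not found on the wrapper itself.
        target = getattr(self._client, name)
        if not callable(target):
            return target

        def timed(*args: Any, **kwargs: Any) -> Any:
            start = self._clock()
            try:
                return target(*args, **kwargs)
            finally:
                # Record the latency even if the call raised.
                self.latencies[name].append(self._clock() - start)

        return timed
```

Because the wrapper forwards every call unchanged, existing code can swap in the wrapped object without other modifications, which is the kind of drop-in integration the slide describes.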

Page 24

Performance Monitoring

Page 25

Performance Monitoring

Page 26

Performance Monitoring

https://github.com/urbanairship/statshtable

Page 27

Tag System

• Give an object one or more aliases

• Ability to query based on those aliases

Page 28

Tag System

• Hard on relational databases

• Large datasets are spread out on disk

• Query plan can fall back to full table scan

• Long reads hurt other parts of the system

Page 29

Tag System

Page 30

Tag System

• Legacy system running on Postgres

• Sharded dataset, adding new shards is tedious

• Needed something that was low latency and high throughput

• Support for grouping, AND, OR, NOT
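
The requirement for grouping with and, or, and not can be sketched as a small recursive evaluator over one object's tag set. The slides do not describe the actual query representation, so the nested-tuple expression format and the `matches` function here are assumptions for illustration.

```python
from typing import AbstractSet, Tuple, Union

# Hypothetical expression format: a tag name, or a tuple
# ("and" | "or" | "not", operand, ...). Nesting provides grouping.
Expr = Union[str, Tuple]


def matches(expr: Expr, tags: AbstractSet[str]) -> bool:
    """Return True if the tag set satisfies the Boolean expression."""
    if isinstance(expr, str):
        return expr in tags
    op, *operands = expr
    if op == "and":
        return all(matches(e, tags) for e in operands)
    if op == "or":
        return any(matches(e, tags) for e in operands)
    if op == "not":
        if len(operands) != 1:
            raise ValueError("'not' takes exactly one operand")
        return not matches(operands[0], tags)
    raise ValueError(f"unknown operator: {op!r}")
```

For example, `("and", "sports", ("or", "ios", "android"), ("not", "opted_out"))` matches an object tagged `{"sports", "ios"}` but not one that also carries `"opted_out"`.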

Page 31

Tag System

• Bake-off between HBase and Lucene

• HBase won because of operational ease and known scalability characteristics

Page 32

Tag System

• Match can return full row or key

• Only touch disk when necessary

• Brute force, fast, simple

Page 33

Tag System

Page 34

Tag System

• Good

• Fast: 3-node EC2 cluster, ~200k/sec, ~0.5 seconds to first result

• Time for a 3.5 million send decreased 9x

• Scales easily

• Bad

• Key partitions can’t be changed

• Inefficient in the scaled down case

Page 35

Thanks!

• HBase

• Urban Airship http://www.urbanairship.com

• We’re hiring! http://urbanairship.com/company/jobs/

• @nateputnam

Page 36

Questions?