Gap Inc Direct: Serving Apparel Catalog from HBase for Live Website
HBaseCon 2012 Applications Track – Case Study

Upload: cloudera-inc

Post on 25-May-2015


DESCRIPTION

Gap Inc Direct, the online division of Gap Inc., uses HBase to serve the apparel catalog in real time for the websites of all its brands and markets. This case study reviews the business case as well as key decisions regarding schema selection and cluster configuration. We also discuss implementation challenges and the lessons learned.

TRANSCRIPT

Page 1: HBaseCon 2012 | Gap Inc Direct: Serving Apparel Catalog from HBase for Live Website


Gap Inc Direct: Serving Apparel Catalog from HBase for Live Website

HBaseCon 2012

Applications Track – Case Study

Page 2: HBaseCon 2012 | Gap Inc Direct: Serving Apparel Catalog from HBase for Live Website

Who Are We?

Suraj Varma
Director of Technology Implementation
Gap Inc Direct (GID), San Francisco, CA
IRC: svarma

Gupta Gogula
Director-IT & Domain Architect of Catalog Management & Distribution
Gap Inc Direct (GID), San Francisco, CA

Page 3: HBaseCon 2012 | Gap Inc Direct: Serving Apparel Catalog from HBase for Live Website


Agenda - Case Study

Problem Domain

HBase Schema Specifics

HBase Cluster Specifics

Learning & Challenges

Page 4: HBaseCon 2012 | Gap Inc Direct: Serving Apparel Catalog from HBase for Live Website

[Diagram: timeline of GID site growth. Labels: 2005 new site launch; 2007 Piperlime; 2008 Universality; 2009 Athleta; 2010 CA & EU markets. Diagram layers: incoming traffic, markets (US, CA, EU), application servers, databases.]

Page 5: HBaseCon 2012 | Gap Inc Direct: Serving Apparel Catalog from HBase for Live Website

Problem Domain

Evolution of the GID Apparel Catalog
  2005: three independent brands in US
  2010: five integrated brands in US, CA, EU

Rapid Expansion of Apparel Catalog

However, each brand / market combination necessitated separate logical catalog databases

Page 6: HBaseCon 2012 | Gap Inc Direct: Serving Apparel Catalog from HBase for Live Website

What We Wanted …

Single Catalog store for all brands/markets
  Horizontally scalable over time
  Cross brand business features

Access data store directly
  To take advantage of inventory awareness of items

Minimal Caching – only for optimization
  Keeping caches in sync is a problem.

Highly Available

Page 7: HBaseCon 2012 | Gap Inc Direct: Serving Apparel Catalog from HBase for Live Website

Initial Explorations

Sharded RDBMS, Memcached, etc.
  Significant effort was required
  Still had scalability limits

Non-relational alternatives considered

HBase POC (early 2010)
  Promising results, so we decided to move ahead

Page 8: HBaseCon 2012 | Gap Inc Direct: Serving Apparel Catalog from HBase for Live Website

Why HBase?

Strong Consistency Model
Server Side Filters
Automatic Sharding, Distribution, Failover
Hadoop Integration out of the box

General Purpose
  Other use cases outside of Catalog

Strong Community!

Page 9: HBaseCon 2012 | Gap Inc Direct: Serving Apparel Catalog from HBase for Live Website

Architecture Diagram

[Diagram: HBase cluster at the center. Backend services send mutations to the cluster: near real time inventory updates, pricing updates, item updates. Incoming requests for catalog data are served from the cluster.]

Page 10: HBaseCon 2012 | Gap Inc Direct: Serving Apparel Catalog from HBase for Live Website

Cluster Traffic Patterns

Read Mostly
  Website Traffic
  Sync MR Jobs

Write / Delete Bursts
  Catalog Publish (phase out to near real-time updates from originating systems)
  MR jobs on Live Cluster

Continuous Writes
  Inventory Updates

Page 11: HBaseCon 2012 | Gap Inc Direct: Serving Apparel Catalog from HBase for Live Website

Data Model & Access Patterns

Rows: 100KB avg size, 1000-5000 cols, sparse rows

Hierarchical Data (Primarily)
  SKU -> Style Lookups (child -> parent)
  Cross Brand Sell (sibling <-> sibling)

Data Access Patterns
  Full Product Graph in one read
  Single path of graph from root to leaf node
  Search - Secondary Indices
  Large Feed files

Page 12: HBaseCon 2012 | Gap Inc Direct: Serving Apparel Catalog from HBase for Live Website


Primary Access Patterns

READ FULL GRAPH

READ SINGLE PATH / EDGE

Page 13: HBaseCon 2012 | Gap Inc Direct: Serving Apparel Catalog from HBase for Live Website

HBase Schema Management

Built custom “bean to schema mapper”
  POJO graph <-> HBase qualifiers
  Flexibility to shorten column qualifiers
  Flexibility to change schema qualifiers (per environment / developer)

<…>
<association>one-to-many</association>
<prefix>SC</prefix>
<uniqueId>colorCd</uniqueId>
<beanName>styleColorBean</beanName>
<…>
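The slides do not show the mapper itself, so the following is only a minimal sketch of what a "bean to schema" flattening step might look like. It assumes a mapping entry shaped like the XML fragment above (a short prefix plus the field that uniquely identifies each child). The names `MAPPING`, `flatten`, `images`, and `imageId` are hypothetical, and a nested dict stands in for the POJO graph.

```python
# Hypothetical mapping, modeled on the XML fragment above: each one-to-many
# child collection gets a short prefix and names the field that identifies
# an individual child.
MAPPING = {
    "styleColors": {"prefix": "SC", "unique_id": "colorCd"},
    "images": {"prefix": "SCIMG", "unique_id": "imageId"},
}


def flatten(bean, path=""):
    """Flatten a nested dict (standing in for a POJO graph) into
    {qualifier: value}, embedding each child's prefix and id in the name."""
    cells = {}
    for field, value in bean.items():
        if isinstance(value, list):  # one-to-many association
            rule = MAPPING[field]
            for child in value:
                child_path = f"{path}{rule['prefix']}_{child[rule['unique_id']]}_"
                cells.update(flatten(child, child_path))
        else:
            cells[f"{path}{field}"] = value
    return cells


# Illustrative data only; not taken from the Gap catalog.
style = {
    "name": "Slim Jeans",
    "styleColors": [
        {"colorCd": "0012", "images": [{"imageId": "10", "path": "/img/a.jpg"}]},
    ],
}
cells = flatten(style)
```

Because the prefixes live in the mapping rather than in the beans, shortening or changing a qualifier would only require editing the mapping entry, which is the flexibility the slide describes.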

Page 14: HBaseCon 2012 | Gap Inc Direct: Serving Apparel Catalog from HBase for Live Website

Schema Example - Hierarchy

<PP>_<id1>_<QQ>_<id2>_<RR>_<id3>_<name>

Where PP is parent, QQ is child, RR is grandchild

cf1:VAR_1_SC_0012_colorCd
cf2:VAR_1_SC_0012_SCIMG_10_path

Pattern: ANCESTOR IDS EMBEDDED IN QUALIFIER NAME
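As a rough illustration of the naming pattern, the sketch below builds and parses qualifiers of this shape and derives the prefix that selects one branch of the graph. The function names are hypothetical, and the sketch assumes that prefixes, ids, and attribute names never contain an underscore.

```python
def build_qualifier(ancestors, name):
    """ancestors: list of (prefix, id) pairs from the root to the owning node."""
    parts = [part for pair in ancestors for part in pair]
    return "_".join(parts + [name])


def parse_qualifier(qualifier):
    """Split a qualifier back into its (prefix, id) pairs and attribute name.
    Assumes prefixes, ids, and names contain no underscore."""
    parts = qualifier.split("_")
    name = parts[-1]
    pairs = list(zip(parts[0:-1:2], parts[1:-1:2]))
    return pairs, name


def path_prefix(ancestors):
    """Prefix shared by every qualifier at or below the given node."""
    return build_qualifier(ancestors, "")


# An in-memory dict stands in for one wide HBase row (illustrative values).
row = {
    "VAR_1_SC_0012_colorCd": "0012",
    "VAR_1_SC_0012_SCIMG_10_path": "/img/a.jpg",
    "VAR_1_SC_0034_colorCd": "0034",
}
prefix = path_prefix([("VAR", "1"), ("SC", "0012")])
branch = {q: v for q, v in row.items() if q.startswith(prefix)}
```

Reading the whole row returns the full graph, while a prefix match returns a single branch. On a real cluster a server-side column filter could play the role of the `startswith` check; that substitution is an assumption, not something the slides spell out.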

Page 15: HBaseCon 2012 | Gap Inc Direct: Serving Apparel Catalog from HBase for Live Website

Schema – Lookups

Secondary Index
  <id3> => RR ; QQ ; PP
  FilterList with (RR, QQ, PP) ids to get thin slice path

[Diagram: secondary index row KEY_5555 pointing to its ancestor ids (labels as extracted: 14444, 333, 22)]

Pattern: SECONDARY INDEX TO HIERARCHICAL ANCESTORS
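The lookup can be sketched in memory as follows: a secondary index maps a leaf id to its chain of ancestors, and that chain is turned into one qualifier prefix per level so that only the cells on the root-to-leaf path are kept. `SECONDARY_INDEX`, `ROW`, and `thin_slice` are hypothetical names, the data is illustrative, and plain dicts stand in for the index table, the catalog row, and the server-side filter list.

```python
# Hypothetical index: leaf id -> ancestor chain as (prefix, id) pairs, root first.
SECONDARY_INDEX = {
    "10": [("VAR", "1"), ("SC", "0012"), ("SCIMG", "10")],
}

# Hypothetical wide row holding a full product graph.
ROW = {
    "VAR_1_name": "regular",
    "VAR_1_SC_0012_colorCd": "0012",
    "VAR_1_SC_0012_SCIMG_10_path": "/img/a.jpg",
    "VAR_1_SC_0012_SCIMG_11_path": "/img/b.jpg",
    "VAR_1_SC_0034_colorCd": "0034",
    "VAR_2_name": "petite",
}


def thin_slice(row, leaf_id):
    """Return only the cells that sit directly on the path from the root
    to the given leaf. Assumes attribute names contain no underscore."""
    prefixes = []
    prefix = ""
    for node_prefix, node_id in SECONDARY_INDEX[leaf_id]:
        prefix += f"{node_prefix}_{node_id}_"
        prefixes.append(prefix)  # "VAR_1_", "VAR_1_SC_0012_", ...
    selected = {}
    for qualifier, value in row.items():
        # Keep a cell if it is an attribute of one of the nodes on the path.
        if any(
            qualifier.startswith(p) and "_" not in qualifier[len(p):]
            for p in prefixes
        ):
            selected[qualifier] = value
    return selected


path_cells = thin_slice(ROW, "10")
```

Siblings such as `SC_0034` and `SCIMG_11` are excluded, which is what makes the slice "thin" compared with reading the full graph.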

Page 16: HBaseCon 2012 | Gap Inc Direct: Serving Apparel Catalog from HBase for Live Website

Schema – Future Dates, Large Files

“Publish at Midnight”
  Future Dated PUTs
  Get/Scan with time range
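The effect of a future-dated write combined with a time-ranged read can be modeled in a few lines. This sketch assumes the half-open interval [min, max) that HBase uses for time ranges, and it assumes both versions of the cell are retained; `read_cell`, `MIDNIGHT`, and the prices are hypothetical.

```python
def read_cell(versions, min_ts, max_ts):
    """Return the newest value whose timestamp lies in [min_ts, max_ts),
    or None. versions is a list of (timestamp, value) pairs."""
    visible = [(ts, value) for ts, value in versions if min_ts <= ts < max_ts]
    if not visible:
        return None
    return max(visible, key=lambda pair: pair[0])[1]


MIDNIGHT = 1_000_000  # hypothetical publish time, in arbitrary time units

# The new price is written ahead of time with a timestamp of MIDNIGHT.
price_versions = [(900_000, "49.95"), (MIDNIGHT, "39.95")]

# Readers always ask for "everything up to and including now".
now = MIDNIGHT - 1
before_publish = read_cell(price_versions, 0, now + 1)

now = MIDNIGHT
after_publish = read_cell(price_versions, 0, now + 1)
```

The writer does nothing at midnight; the new value simply becomes visible once the readers' upper bound passes its timestamp.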

Large Feed Files
  Sharded into smaller chunks, < 2MB per cell

[Diagram: row KEY_nnnn with chunk cells S_1, S_2, S_3, S_4]

Pattern: SHARDED CHUNKS
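A minimal sketch of the chunking idea, assuming the chunk cells are named `S_1`, `S_2`, ... as in the diagram labels. The functions `shard` and `reassemble` are hypothetical, and a dict stands in for the cells of one row.

```python
CHUNK_LIMIT = 2 * 1024 * 1024  # the "< 2MB per cell" bound from the slide


def shard(payload, chunk_size=CHUNK_LIMIT - 1):
    """Split bytes into {qualifier: chunk} with qualifiers S_1, S_2, ..."""
    return {
        f"S_{i + 1}": payload[start:start + chunk_size]
        for i, start in enumerate(range(0, len(payload), chunk_size))
    }


def reassemble(cells):
    """Concatenate chunks in numeric order of their S_<n> suffix."""
    ordered = sorted(cells, key=lambda q: int(q.split("_")[1]))
    return b"".join(cells[q] for q in ordered)


feed = bytes(5 * 1024 * 1024)  # a 5 MB stand-in for a feed file
cells = shard(feed)
restored = reassemble(cells)
```

Sorting on the numeric suffix matters once a file has ten or more chunks, because `S_10` sorts before `S_2` lexicographically.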

Page 17: HBaseCon 2012 | Gap Inc Direct: Serving Apparel Catalog from HBase for Live Website

HBase Cluster

16 Slave (RS + TT + DN) Nodes
  8 & 16 GB RAM

3 Master (HM, ZK, JT, NN) Nodes
  8 GB RAM

NN Failover via NFS

Page 18: HBaseCon 2012 | Gap Inc Direct: Serving Apparel Catalog from HBase for Live Website

Configurations – Block Cache, GC

Block Cache
  Maximize Block Cache
  hfile.block.cache.size: 0.6

Garbage Collection
  MSLAB enabled
  CMSInitiatingOccupancyFraction

Page 19: HBaseCon 2012 | Gap Inc Direct: Serving Apparel Catalog from HBase for Live Website

Configurations – Timeouts

Quick Recovery on node failure
  Default timeouts too large
  zookeeper.session.timeout

Region Server
  hbase.rpc.timeout

Data Node
  dfs.heartbeat.recheck.interval
  heartbeat.recheck.interval

Page 20: HBaseCon 2012 | Gap Inc Direct: Serving Apparel Catalog from HBase for Live Website

20

Learnings – Regions

Block Cache Size Tuning Block Cache Churn

Hot Row scenarios Perf Tests & Doing Phased Rollouts

Hot Region issues Perf Tests & Pre-split Regions.

Filters CPU Intensive – profiling needed.
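Pre-splitting means choosing region boundaries up front so that load is spread from the first request instead of piling onto one region. The slides do not describe the row key format, so this sketch assumes, purely for illustration, row keys that begin with a uniformly distributed zero-padded hexadecimal prefix; `split_points` is a hypothetical helper.

```python
def split_points(num_regions, key_width=4):
    """Evenly spaced split keys over a zero-padded hexadecimal key space.
    Returns num_regions - 1 boundaries, or an empty list for fewer than two."""
    if num_regions < 2:
        return []
    space = 16 ** key_width
    return [
        format(space * i // num_regions, f"0{key_width}x")
        for i in range(1, num_regions)
    ]


# Sixteen regions, chosen here only to mirror the 16 slave nodes listed earlier.
boundaries = split_points(16, key_width=2)
```

HBase table creation accepts a list of split keys, so boundaries like these could be supplied when the table is created. Whether evenly spaced keys actually balance load depends on the real key distribution, which is why the slide pairs pre-splitting with performance tests.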

Page 21: HBaseCon 2012 | Gap Inc Direct: Serving Apparel Catalog from HBase for Live Website

Learnings – Monitoring, Hardware

Monitoring is crucial
  Layer by layer -> what’s the bottleneck
  Metrics to target optimization & tuning
  Troubleshooting

Non Uniform Hardware
  Sub-optimal region distribution
  Hefty boxes lightly loaded.

Page 22: HBaseCon 2012 | Gap Inc Direct: Serving Apparel Catalog from HBase for Live Website

Learnings – Miscellaneous

M/R Jobs running on live cluster
  Has an impact – so cannot run full throttle
  Go easy …

Feature Enablement – Phase in
  Don’t turn on several features together
  Easier identification of potential hot regions / rows, overloaded RS, etc

Page 23: HBaseCon 2012 | Gap Inc Direct: Serving Apparel Catalog from HBase for Live Website

Phasing In Features

Enable features individually to measure impact and tune cluster accordingly

[Diagram: HBase cluster receiving inventory updates, pricing updates, and item updates from backend services, plus a lot more incoming requests. Feature “A” enabled: additional “N” req / sec. Feature “B” enabled: additional “K” req / sec.]

Page 24: HBaseCon 2012 | Gap Inc Direct: Serving Apparel Catalog from HBase for Live Website

Challenges – Search, Transactions

Search
  No out-of-the-box secondary indexes.
  Custom solution with Solr

Transactions
  Only row level atomicity
  But … can’t pack everything into a single row
  Atomic Cross-Row Put/Delete and HBASE-5229 appear to be potential partial solutions (0.94+)

Page 25: HBaseCon 2012 | Gap Inc Direct: Serving Apparel Catalog from HBase for Live Website

Challenges – Optimal Schema

Orthogonal access patterns
  Optimize for most frequently used pattern.

Filters
  May suffice, with early out configurations
  Impacts CPU usage

Duplicate data for every access pattern
  Too drastic
  Effort to keep all copies in sync

Page 26: HBaseCon 2012 | Gap Inc Direct: Serving Apparel Catalog from HBase for Live Website

Challenges - Backups

Rebuild from source data
  Takes time … but no data loss

Export / import based backups
  Faster … but stale
  Another MR on live cluster

Better options in future releases …

Page 27: HBaseCon 2012 | Gap Inc Direct: Serving Apparel Catalog from HBase for Live Website

Gap Inc Direct

We’re hiring! http://www.gapinc.com