Interactive Data Analytics with Couchbase N1QL: Couchbase Connect 2015

UNLEASH THE POWER OF COUCHBASE THROUGH N1QL (NICKEL)

ARVIND JADE, GOVINDARAJAN RAGHUNATHAPURAM


Post on 11-Aug-2015



AGENDA

About Nielsen

Answers on Demand, Big Data Platform

Business Challenges

Why NoSQL with Couchbase?

Couchbase Usage Models

Next Steps

Q & A

ABOUT NIELSEN

Nielsen is a leading global information and measurement company that enables companies to understand consumers and consumer behavior

Nielsen measures and monitors what consumers watch (programming, advertising) and what consumers buy (categories, brands, products) on a global and local basis

Nielsen has a presence in approximately 100 countries spread across Africa, Asia, Australia, Europe, Middle East, North America, South America and Russia

ECOSYSTEM

REPORTING

Avg 175,000 Scan reports per month (>25% cross-category)

Avg 7,000 Panel reports per month

MONTHLY USAGE

Avg 5,500 unique users and 11,000 named users on the system

DATABASES

Scan
• Core – 159
• Custom – 141

Panel
• Core – 34
• Custom – 65

PROCESSING

175B records processed per week on scan, 10X that amount on the monthly run

430MM purchase transactions across 100,000 Nielsen panelists per week

DATA VOLUMES

1.5 PB of scan data: 130K stores, 4.3M UPCs, 5 years

SLAs

Guaranteed system availability of 14 hours per day M-F

Weekly refreshed scan data updates available +9 6AM ET

Weekly refreshed panel data updates available +16 6AM ET

METRICS

AOD PLATFORM

[Architecture diagram with three columns: SOURCE DATA, ONDEMAND ENGINE, USER EXPERIENCE]

Flexible adapters to accommodate multiple input source files.

Disaggregated data warehouse supports ultimate flexibility. Messaging architecture supports seamless orchestration.

Powerful BI and rendering engine to provide fast, rich insights.

Source data: Trips, T-log, POS, ProdRef, Stores, Households, Loyalty Cards

ETL POWERED BY IBM

MESSAGING AND ORCHESTRATION POWERED BY TIBCO

FACT PROCESSING

DIMENSION BUILDS

PROCESSING POWERED BY IBM’S NETEZZA

BI ENGINE POWERED BY JAVA

RENDERING ENGINE BY EXTJS

PanelODS, ScanODS, LoyaltyODS

Disaggregated Data Warehouse: Fact, Dim

On-the-fly Virtual Aggregations

Answers

AOD APPLICATION

AOD DATA TIER

REPORT BUILDER, REPORT PLAYER, DATA SELECTOR

BUSINESS CHALLENGES

WHERE WE WERE?

BIG Problems
o Growth - Expensive scaling
o Relational Fatigue
o Fragmented Caching
o No Unified Analytics
o Ad Hoc querying capability

[Diagram: existing data stores] Application Log, Access Log, Netezza Reporting, Oracle Report Data, SQL Server Audit Data, Report Cache

[Diagram: Unified Big Data Platform] Caching Layer, Netezza Reporting, Application Log, Oracle Report Metadata, SQL Server Audit Data, Access Log

OUR WANTS & NEEDS

Better Scalability – Be elastic to accommodate new data growth with ease.

Faster Performance

Cheaper – Can we get utopia for cheap?

Insights – Ability to run analytics

Faster feature release.

WHY NOSQL

Schema-less – Schema updates and cost of change were very high

Horizontal scaling – Sharding and replication with no single point of failure

Deep Analytics – Incremental map/reduce, aggregated searching

Cost – Commodity hardware

High Performance – Low latency and high throughput

COUCHBASE JOURNEY

2012: Couchbase 2.0 mobilization, prototyping

2013: Couchbase 2.0, 4-node cluster live in 1 data center, for document and cache storage

2014: Upgraded to Couchbase 2.5.1, 16-node clusters in 2 data centers, advanced views, Unified Analytics with Elasticsearch

2015: Upgrade to Couchbase 4.0, adopt N1QL (Nickel) for ad hoc querying, Couchbase Lite for mobile prototyping

COUCHBASE USAGE MODEL

As Document and Cache persistence store
Real-time application uses Couchbase for a responsive UI

As Reverse index store using map/reduce for faster lookup
Needed a solution that keeps client reports agnostic of back-end changes by updating reports at scale (millions)

For Unified analytics combining indexes from Couchbase and Elasticsearch
Provide holistic insight into report metrics and system health

For Ad Hoc Querying – Instant Analytics (NEW)
Ability to query key spaces and indexes using SQL-like interfaces
N1QL – SQL-like query language

USAGE MODEL 1 DOCUMENT STORE

REPORT SELECTION MODEL


REPORT DEFINITION MODEL
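A document store keeps each report as one JSON document under a key. The sketch below illustrates that idea only: the dict stands in for a bucket, and the key format and every field name are hypothetical, not taken from the slides.

```python
import json

# Stand-in for a bucket: a plain dict mapping document key -> JSON text.
# A real deployment would go through a Couchbase SDK instead.
bucket = {}

def upsert(key, doc):
    """Store a document as JSON text under its key, replacing any older version."""
    bucket[key] = json.dumps(doc)

def get(key):
    """Return the document for a key, or None when the key is absent."""
    raw = bucket.get(key)
    return json.loads(raw) if raw is not None else None

# Hypothetical report definition; every field name here is illustrative.
report_definition = {
    "type": "report_definition",
    "owner": "analyst_01",
    "database": "scan_core",
    "products": ["upc-0001", "upc-0002"],
    "measures": ["units", "dollars"],
}

upsert("report::1001", report_definition)

# Schema-less storage: a later document can carry a new field without a migration.
upsert("report::1002", {**report_definition, "owner": "analyst_02", "layout": "grid"})

print(get("report::1001")["measures"])
print(get("report::1002")["layout"])
```

Because the second document adds a `layout` field without touching the first, the example also hints at why schema changes are cheap in this model.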


USAGE MODEL 2 INCREMENTAL MAP/REDUCE

MAP REDUCE SAMPLE
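Couchbase views define their map and reduce functions in JavaScript. The Python sketch below only simulates the logic of a reverse index that is refreshed incrementally, one changed document at a time. The `products` field, the key format, and the class name are assumptions made for illustration.

```python
from collections import defaultdict

def map_report(doc_id, doc):
    """Map step: emit one (product, report id) pair per product a report references."""
    for product in doc.get("products", []):
        yield product, doc_id

def reduce_count(values):
    """Reduce step: number of reports that reference a product."""
    return len(values)

class ReverseIndex:
    """Product -> report ids, refreshed one document at a time."""

    def __init__(self):
        self.rows = defaultdict(set)   # product -> set of report ids
        self.emitted = {}              # report id -> products emitted last time

    def update(self, doc_id, doc):
        # Incremental refresh: retract what this document emitted before,
        # then apply its new emits. Other documents are never rescanned.
        for product in self.emitted.pop(doc_id, set()):
            self.rows[product].discard(doc_id)
            if not self.rows[product]:
                del self.rows[product]
        products = set()
        for product, report_id in map_report(doc_id, doc):
            self.rows[product].add(report_id)
            products.add(product)
        self.emitted[doc_id] = products

    def lookup(self, product):
        """Report ids that reference the product, in sorted order."""
        return sorted(self.rows.get(product, set()))

    def count(self, product):
        return reduce_count(self.rows.get(product, set()))

index = ReverseIndex()
index.update("report::1001", {"products": ["upc-0001", "upc-0002"]})
index.update("report::1002", {"products": ["upc-0002"]})
print(index.lookup("upc-0002"))

# Editing one report only re-maps that report.
index.update("report::1001", {"products": ["upc-0001"]})
print(index.lookup("upc-0002"), index.count("upc-0001"))
```

A lookup by product then returns only the reports that reference it, which is what makes a targeted update of affected documents possible.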


USAGE MODEL 3 UNIFIED ANALYTICS

Sources: Reporting Data, Report Audit Data, Application Log Data

Connectors: Custom Connector, Map/Reduce, Logstash, JDBC Connector

Metrics
1.4 B Reporting Data Points
3 TB of Index Size
20 M Audit Records
5 TB of Application Log

UNIFIED ANALYTICS


UNIFIED JSON MODEL
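One way to picture a unified JSON model is a shared envelope wrapped around records from each source, so that a single query can span reporting data, audit data, and application logs. The envelope layout, source names, and sample records below are hypothetical and serve only as a sketch.

```python
def unify(source, record, id_field, time_field):
    """Wrap one source-specific record in a shared envelope (hypothetical layout)."""
    payload = {k: v for k, v in record.items() if k not in (id_field, time_field)}
    return {
        "source": source,
        "id": f"{source}::{record[id_field]}",
        "timestamp": record[time_field],  # same-format UTC strings sort chronologically
        "payload": payload,
    }

# Illustrative records; each source names its id and time fields differently.
events = [
    unify("reporting",
          {"report_id": 1001, "run_at": "2015-06-01T06:00:00Z", "rows": 1200},
          "report_id", "run_at"),
    unify("audit",
          {"audit_id": 77, "logged_at": "2015-06-01T06:01:30Z", "action": "open_report"},
          "audit_id", "logged_at"),
    unify("app_log",
          {"line_no": 5, "ts": "2015-06-01T06:10:00Z", "level": "ERROR"},
          "line_no", "ts"),
]

def between(events, start, end):
    """Events from any source whose timestamp falls in [start, end)."""
    return [e for e in events if start <= e["timestamp"] < end]

window = between(events, "2015-06-01T06:00:00Z", "2015-06-01T06:05:00Z")
print([e["id"] for e in window])
```

With every record in the same shape, a time-window filter no longer needs to know which system produced a given event.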


UNIFIED ANALYTICS

USAGE MODEL 4 AD HOC QUERYING

N1QL BUSINESS USE CASES


Derive metrics – Get stats on user selections, usage patterns

Quantify Impacts - During a data refresh, identify the set of impacted reports to predict cost of change and impact

Identify and update JSON documents - Operational Need
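The three use cases above might translate into N1QL statements along the following lines. This is a sketch under assumptions: the bucket name `reports`, the fields `type`, `owner`, and `products`, and the named parameters are all hypothetical. The Python functions do not call Couchbase; they mirror the intent of each statement on in-memory documents.

```python
from collections import Counter

# Hypothetical N1QL statements, kept as strings for reference only.
N1QL_USAGE_STATS = """
SELECT r.owner, COUNT(*) AS report_count
FROM reports r
WHERE r.type = "report_definition"
GROUP BY r.owner
"""

N1QL_IMPACTED = """
SELECT META(r).id
FROM reports r
WHERE ANY p IN r.products SATISFIES p = $old END
"""

N1QL_FIX = """
UPDATE reports r
SET r.products = ARRAY_REPLACE(r.products, $old, $new)
WHERE ANY p IN r.products SATISFIES p = $old END
"""

# In-memory stand-in for the bucket: document key -> document.
docs = {
    "report::1001": {"type": "report_definition", "owner": "analyst_01",
                     "products": ["upc-0001", "upc-0002"]},
    "report::1002": {"type": "report_definition", "owner": "analyst_01",
                     "products": ["upc-0002"]},
    "report::1003": {"type": "report_definition", "owner": "analyst_02",
                     "products": ["upc-0003"]},
}

def usage_stats(docs):
    """Derive metrics: number of report definitions per owner."""
    return Counter(d["owner"] for d in docs.values()
                   if d.get("type") == "report_definition")

def impacted(docs, product):
    """Quantify impact: keys of reports that reference the product."""
    return sorted(k for k, d in docs.items() if product in d.get("products", []))

def fix(docs, old, new):
    """Operational update: swap one product code for another in affected reports."""
    changed = impacted(docs, old)
    for key in changed:
        docs[key]["products"] = [new if p == old else p for p in docs[key]["products"]]
    return changed

print(usage_stats(docs))
print(impacted(docs, "upc-0002"))
print(fix(docs, "upc-0002", "upc-0099"))
```

Running the impact query before the update gives the count of affected reports up front, which is the "predict cost of change" step in the second use case.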

N1QL DEMO


BUSINESS GAINS


Faster Performance - Overall processing time for fixing client reports is now reduced to one fifth

Smart search - With creation of reverse index, able to perform targeted search and convert only affected documents.

Real time insights – Combining Couchbase and Elasticsearch, able to derive instant analytics, near real time.

Scalable - Able to onboard new clients rapidly

Ad Hoc Querying – Able to empower analysts to run ad hoc analytics.


HEADING TOWARDS

Upgrade cluster to Couchbase 4.0 to leverage Multi Dimensional Scaling

Prototype Couchbase Lite for mobile certification for AOD Application

Leverage the power of N1QL for Instant Analytics


Q & A