The Evolution of Viadeo's BI Infrastructure, by François Le Lay

Post on 01-Nov-2014

http://fr.viadeo.com/fr/profile/francois.lelay

Techdays 22/11/2012

The evolution of Business Intelligence at Viadeo

Agenda

What is Business Intelligence?

Key Roles

Viadeo Data

Technical Solutions: a short history

What is Business Intelligence?

Layers, from top to bottom:

• Actions: marketing actions, business strategies, operations
• Insights: forecasting, predicting, statistics, competitor information, analysis
• Awareness: reports, dashboards
• Application Stack: metadata, KPIs, visual templates, security, information dissemination, scheduling
• Data Warehouse & ETL: plumbing of structured and unstructured data, logic to persist data

Feedback

Key Roles: the Business Analyst

• BI Dashboards Specification
  • Simple (metrics)
  • Complex (data viz)
• Information Access
  • BI dashboards (scalars)
  • Direct (SQL, Datameer)
• Analysis
  • Follow-up
  • Proactive
• Web Product Specification
  • Functional (challenge the PO)
  • Technical (enforce data quality)

Key Roles: the Big Data Engineer

• Data plumbing
  • Real time
  • Batch
• Expose to apps
  • REST/Scala/Java APIs
  • JDBC/ODBC
• Awareness
  • Implement data visualization
  • Enforce data quality

Viadeo data: The Dynamics

Usage

• 45 million members
• Worldwide presence
• China, India, Russia, Mexico, ...
• Mobile app, web, API
• B2B / B2C

Mining

User Engagement

Viadeo data: Graph

Technical solutions: The Beginnings

Phase 1: 2006-2008
MySQL, server name: Peach
Internal tool to allow C-level, Sales, … to access data

Phase 2: 2008-2010
MySQL, server name: Lakitu

Technical solutions: A better architecture

Phase 3: 2010-2012

MySQL servers:

• Server name: "Unified ODS"
• Server name: ODS LiveCluster 1
• Server name: ODS LiveCluster 2
• Server name: ODS LiveCluster 3
• Server name: ODS LiveCluster 5

Technical solutions: 2 new internal products

Scala-centric, Play! framework

Cross-channel messaging system

• Email, mobile, social
• Flexible content management
• Flexible targeting of recipients
• Content testing strategies: A/B, multivariate
• Event-driven: web app events, mobile events, ad hoc events
• Automation, scheduling, frequency capping
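Two of the messaging features listed above, A/B content testing and frequency capping, can be illustrated with a small sketch. This is not Viadeo's implementation (the talk describes a Scala/Play! product and gives no code); it is a minimal Python illustration, and the names `assign_variant`, `FrequencyCap`, and `try_send` are hypothetical.

```python
import hashlib
from collections import defaultdict, deque


def assign_variant(member_id: str, campaign: str, variants=("A", "B")) -> str:
    """Deterministically map a member to one content variant of a campaign."""
    digest = hashlib.sha256(f"{campaign}:{member_id}".encode("utf-8")).hexdigest()
    # The same (campaign, member) pair always hashes to the same variant.
    return variants[int(digest, 16) % len(variants)]


class FrequencyCap:
    """Allow at most `max_messages` per member within a sliding time window."""

    def __init__(self, max_messages: int, window_seconds: float):
        self.max_messages = max_messages
        self.window_seconds = window_seconds
        self._sent = defaultdict(deque)  # member_id -> timestamps of sent messages

    def try_send(self, member_id: str, now: float) -> bool:
        timestamps = self._sent[member_id]
        # Forget messages that have fallen out of the window.
        while timestamps and now - timestamps[0] >= self.window_seconds:
            timestamps.popleft()
        if len(timestamps) >= self.max_messages:
            return False  # cap reached: the message should be skipped or delayed
        timestamps.append(now)
        return True
```

Hashing the campaign together with the member identifier keeps the assignment stable across sends without storing it, and gives each campaign an independent split.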

Analytics

• Data visualization: based on JavaScript (D3.js, processing.js, etc.)
• Tabular reports, OLAP navigation
• Pluggable alerts: business activity monitoring

A common requirement: scalability!

• Viadeo data is big
• Processing performance is not an option, it is mandatory

Technical solutions: a new architecture based on the CQRS pattern
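The slide names the CQRS (Command Query Responsibility Segregation) pattern without detail. As a rough sketch of the general idea only, and not of Viadeo's architecture, the Python below separates a write side that records events from a read side that answers queries from a denormalized view. The event `ProfileViewed` and both classes are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ProfileViewed:
    """Hypothetical event: one member viewed another member's profile."""
    viewer_id: str
    viewed_id: str


class WriteSide:
    """Command side: validates commands, records events, notifies subscribers."""

    def __init__(self):
        self.events = []
        self._subscribers = []

    def subscribe(self, handler) -> None:
        self._subscribers.append(handler)

    def record_profile_view(self, viewer_id: str, viewed_id: str) -> None:
        if viewer_id == viewed_id:
            raise ValueError("self-views are not recorded in this sketch")
        event = ProfileViewed(viewer_id, viewed_id)
        self.events.append(event)
        for handler in self._subscribers:
            handler(event)


class ProfileViewsReadModel:
    """Query side: a denormalized view kept up to date from events."""

    def __init__(self):
        self._views = Counter()

    def apply(self, event: ProfileViewed) -> None:
        self._views[event.viewed_id] += 1

    def views_for(self, member_id: str) -> int:
        return self._views[member_id]
```

Queries never touch the write side: wiring `write.subscribe(read.apply)` is the only link between the two, so each side can be stored and scaled separately.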

Technical solutions: a new architecture

• Master dataset:
  • Historical data stored in HBase
  • Provided as a service by the architects team
• Datamarts:
  • Built on HDFS using MapReduce jobs
  • MapReduce eased by use of the Cascading library and its Scala DSL (Scalding)
  • Pushed to in-memory distributed storage
  • Elasticsearch, Riak
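To make the datamart step concrete, the sketch below mimics the map, shuffle, and reduce phases of such a job in plain Python. It only illustrates the MapReduce model: the real jobs described in the talk run on Hadoop through Cascading/Scalding, and the record fields `country` and `day` are assumptions chosen for the example.

```python
from collections import defaultdict


def map_phase(records):
    """Emit one ((country, day), 1) pair per raw event."""
    for record in records:
        yield (record["country"], record["day"]), 1


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the values of each key to get one datamart row per key."""
    return {key: sum(values) for key, values in grouped.items()}


def build_datamart(records):
    """Aggregate raw events into per-country, per-day counts."""
    return reduce_phase(shuffle(map_phase(records)))
```

The resulting dictionary plays the role of the small, precomputed datamart that would then be pushed to a serving store for fast reads.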

Technical solutions: A better architecture

(Diagram: MySQL, Sqoop)

Conclusion

• Many scalable data storage solutions
• Rapid application development frameworks and low-risk programming languages on the JVM
• Custom analytics = what we implement is what we use
• Analytical needs are very well identified
• Blend data stream and batch processing to answer different needs
• Pluggable data mining R&D
• Analytics for Viadeo members/recruiters/companies: Social Media Monitoring as a Complex Event Processing topic

?

Thanks!

flelay@viadeoteam.com
Tel: 01 75 70 12 93
