Building Enterprise Advance Analytics Platform

SoCal Data Science Conference, 09.25.2016
Raymond Fu, Practice Architect, Trace3



Raymond Fu, Practice Architect, Trace3

16 years of IT experience specializing in big data, business intelligence, and enterprise architecture. 10-year corporate career with Bank of America, highlighted by leading many data integration and warehousing initiatives arising from mergers and acquisitions.

Founded his own technology company, Xceed Consulting Group, in 2012 to enable data-driven solutions.

Joined California-based consulting company Trace3 in 2016 as a practice architect for the Data Intelligence team.

Blog: Everything About Data

Twitter: @RaymondxFu

Big Data Disruption

• Typically, organizations have a firm grasp of the People, Process, and Technology required to deliver capabilities, articulate an end-to-end roadmap, and identify platforms and resources.

• Big Data disrupts the traditional architecture paradigm. Organizations may have an idea or interest, but they don’t necessarily know what will come out of it.

• The answer or outcome for an initial question will trigger the next set of questions. Pursuing it requires a unique combination of skill sets that are new and in short supply.

• The pursuit of the answer is advanced analytics.

Advanced Analytics Definition

• The process, tools, technology, and collaboration to create predictive models that enable/drive strategic and operational decisions. The predictive models (1) generate insights and hypotheses and (2) test/score them through experiments, so organizations KNOW what works better.

• Predictive models are created using machine learning, deep learning, advanced data management tools and visualization tools

• An integral part of Advanced Analytics includes the operationalization of the predictive models so they can be rapidly scored and decisioned at scale
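As a rough illustration of how an experiment can show "what works better", the sketch below compares the conversion rates of two variants with a two-proportion z-test. The function name, the scenario, and the counts are hypothetical and are not taken from the talk; only the Python standard library is used.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants perform the same.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical experiment: offers chosen by the current rule (A)
# versus offers chosen by a predictive model (B).
z, p = two_proportion_z_test(conv_a=200, n_a=5000, conv_b=260, n_b=5000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the difference between the variants is unlikely to be chance alone. In practice the sample sizes and the significance threshold would be fixed before the experiment runs.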

Advanced Analytics Relevancy


Organizations’ goals

Advanced Analytics’ goals

What’s different today

Obstacles to the goals

Advanced Analytics Process

Data management
• Connectivity
• Landing
• Ingestion
• Knowledge
• Preparation

Analytics creation (business modeling)
• Domain knowledge
• Hypothesis development
• Model architecture
• Algorithm selection and development
• Feature engineering
• Visualization
• Collaboration
• Reproducibility
• Data mining
• Statistical data shaping
• Training
• Cross-validation testing
• Environment and libraries

Analytics operationalization (model production and deployment)
• Production feature generation, modeling, testing
• Deployment
• Parallel experiments
• Performance assessment
• Continuous integration and deployment
• Model iteration and redeployment
• R-T and batch scoring
• Decisioning

Organization and business impact
• Business metric assessment

IT/DE, DS | LoB, DS | DS, IT/DE, LoB | LoB, DS, IT/DE
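The scoring and decisioning steps at the end of the process could look roughly like the sketch below, which applies one scoring function both to a single incoming record (real-time) and to a list of records (batch), then maps each score to a decision. The model coefficients, feature names, threshold, and decision labels are all hypothetical placeholders, not values from the talk.

```python
import math

# Hypothetical coefficients of an already-trained logistic model.
WEIGHTS = {"recency_days": -0.03, "orders_last_year": 0.4}
BIAS = -1.0
THRESHOLD = 0.5  # hypothetical decision cut-off


def score(record):
    """Real-time path: return the predicted probability for one record."""
    linear = BIAS + sum(w * record.get(name, 0.0) for name, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-linear))


def decide(probability):
    """Decisioning: turn a score into an action."""
    return "send_offer" if probability >= THRESHOLD else "no_action"


def score_batch(records):
    """Batch path: score and decide for many records with the same logic."""
    return [decide(score(record)) for record in records]


customers = [
    {"recency_days": 10, "orders_last_year": 5},
    {"recency_days": 90, "orders_last_year": 1},
]
print(decide(score(customers[0])))  # single record, as an event arrives
print(score_batch(customers))       # whole data set, e.g. a nightly run
```

Sharing one `score` function between both paths is a design choice in this sketch: it keeps real-time and batch results consistent when the model is iterated and redeployed.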

Enterprise Big Data Strategy

• Information management
  • Data architecture, data governance and metadata management.
  • Address key issues such as data integration and data quality.

• Data platform modernization
  • Enterprise data warehouse offload.
  • Data lake platform assessment.

• Advanced Analytics
  • Methodology
  • Tools recommendation
  • Operationalization

Enterprise Architecture Approach

• Step 1 – Establish Business Context and Scope (incubate ideas)

• Step 2 – Establish an Architecture Vision

• Step 3 – Assess the Current State

• Step 4 – Establish Future State and Economic Model

• Step 5 – Develop a Strategic Roadmap

• Step 6 – Establish Governance over the Architecture

Establishing an Architecture Vision


The architecture development process needs to be more fluid than, and different from, an SDLC-like architecture process. It must allow organizations to continuously assess progress, correct course where needed, balance cost, and gain acceptance.

Advanced Analytics Capabilities

Category / Capability / Items

Category: Organization and business impact

Fast, informed decisions
• Time from question to hypothesis to model implementation to informed decision

Strategic and operational role
• Degree of input into business/policy decisions
• Perceived and quantified value of analytics

Category: Analytics operationalization

Model performance
• Execution of experiments in parallel
• Model performance for scoring and decisioning

Model deployment
• Continuous integration and deployment

Category: Analytics creation

Efficient model creation
• Use of data mining and visualization tools
• Rapidly spun-up environment customized to individual data scientists that enables execution of large data sets and highly mathematical algorithms
• Collaboration among data scientists and between data scientist and lines of business; reuse of data sets and models
• Model reproducibility (including versions, algorithms, data sets, parameters, notes, environment)

Appropriate model selection
• Understanding, and appropriate use, of model architecture and algorithms, feature engineering, hyper parameterization, statistical and mathematical concepts, training and validation, scoring, and decisioning
• Use of ML and DL concepts, tools, and libraries
• Use of graph systems

Category: Data management

Data capability
• Infrastructure and tools to access and cleanse data

Data knowledge and confidence
• Understanding of, and confidence in, data (e.g. what is available, their relationships)

Data access
• Access to internal and external data through infrastructure, logical associations, and tools
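The model reproducibility item above lists what should be captured for each model: versions, algorithms, data sets, parameters, notes, and environment. One possible way to record these is sketched below using only the Python standard library. The class name, field names, and sample values are hypothetical and not part of the talk.

```python
import hashlib
import json
import platform
import sys
from dataclasses import asdict, dataclass, field


def hash_dataset(rows):
    """Fingerprint a data set so a later run can confirm it used the same data."""
    digest = hashlib.sha256()
    for row in rows:
        digest.update(json.dumps(row, sort_keys=True).encode("utf-8"))
    return digest.hexdigest()


def current_environment():
    """Capture a minimal description of the runtime environment."""
    return {"python": platform.python_version(), "platform": sys.platform}


@dataclass(frozen=True)
class ModelRun:
    model_version: str
    algorithm: str
    dataset_sha256: str
    parameters: dict
    notes: str = ""
    environment: dict = field(default_factory=current_environment)

    def fingerprint(self):
        """Stable hash over every recorded field of the run."""
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


rows = [{"customer_id": 1, "churned": 0}, {"customer_id": 2, "churned": 1}]
run = ModelRun(
    model_version="0.1.0",
    algorithm="logistic_regression",
    dataset_sha256=hash_dataset(rows),
    parameters={"learning_rate": 0.1, "epochs": 20},
    notes="baseline run",
)
print(run.fingerprint())
```

Two runs with the same fingerprint were configured identically. Any change to the data, parameters, notes, or environment yields a different fingerprint, which makes silent drift easier to spot.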

Enterprise Information Management Capabilities


Advanced Analytics Reference Architecture

[Diagram: labels recovered from the reference architecture slide]

• Structured data source: CRM, ERP, Accounting
• Unstructured data source: Clickstream, Sensor Info, Images/Video, Event Logs, Social Media
• ETL into RDBMS and Big Data stores
• Storage: HDFS, NoSQL, Cloud Storage
• Consumers: Business Intelligence / Data Visualization, Advanced Analytics
• Tools: Real-time Streaming, Library (ML and DL), Online ML
• Machine Learning API: Google Prediction, AWS, Azure, BigML, IBM Watson
• Other labels: Teradata, Operation, AWS, Azure, torch

Advanced Analytics Services

Service Type / Services

Overall Assessment
• Advanced Analytics assessment

Architecture
• Architecture for data science
• Architecture for cloud analytics

ETL/ELT
• Data source identification and integration
• Data virtualization
• Data preparation

Data analysis and modeling (data science)
• Statistical / quantitative analysis
• Descriptive analysis
• Predictive modeling
• Machine learning
• Deep learning
• Graph systems
• Simulation and optimization

Visualization and insight presentation and recommendations
• Data exploration / mining / advanced visualization to understand the data
• Insight presentation and recommendations

Tools recommendation
• Infrastructure
• Software tools
• Software environment, programming, libraries

Process improvement
• Analytics process improvement
• Data governance
• Model governance
• Continuous integration and deployment of models

Organizational capabilities
• Advanced analytics organization structure and roles
• Advanced analytics training
• Advanced analytics staff augmentation

Best Practice

• Align Analytics with Specific Business Goals
• Ease Skills Shortage with Standards and Governance
• Optimize Knowledge Transfer with a Center of Excellence
• Top Payoff is Aligning Unstructured with Structured Data
• Plan Your Discovery Lab for Performance
• Align with the Cloud Operating Model

Example 1: Oracle


Example 2: Google Cloud Platform – Building Blocks


Example 2: Google Cloud Platform – Stepping Stone

Thank you!