Making data science pay: Mastering the challenges of analytics operations · 13/06/2018


Making data science pay:

Mastering the challenges of analytics operations

Michel Debiche

Guest Analyst, STAC

Copyright © 2018 Securities Technology Analysis Center LLC

Cognitive Reset Part 1: Window management technology


Cognitive Reset Part 2: Window management context


Investment Process

[Diagram: four stages, Gather information, Digest information, Make decisions, Execute decisions, connected by flows labeled Actions, Results, and Info.]


Pressures

• Scale

  • Volume

  • Variety

  • Density

  • Computational complexity

• Velocity of innovation

• Cost

• Regulation


Dimensions of scale

• Scale

  • Volume

  • Variety

    • Kinds of data: structured, unstructured, text, binary

    • Data entities: millions of time series

  • Density

    • Transactions in microseconds

    • Simultaneous transactions on multiple channels

  • Computational complexity

    • NLP, image processing, AI

• Velocity of innovation

  • Competitive pressures: new datasets, new models, new technologies

  • Evolving opportunities

  • Feedback loops


Responses

• DevOps

• Data Lake

• Open Source

• Big Data

• Data Science

• AI


Issues

• Model Factories: Hundreds of models with nowhere to go

• Redundant engineering

• Open source interoperation and upgrade nightmares

• Murky, expensive data lakes contributing little value

• Skills mismatches

• User resistance to new technologies

• Data lineage, audit trails


Goals

• Maximize returns

• Minimize risk

  • Market risk

  • Model risk

  • Systems risk

  • Data risk

  • Operational risk (people)

• Maximize productivity


Principles

• Optimize use of resources

  • People

  • Time

  • Data

  • Technology

• End-to-end process design

• Agility

• Constant improvement


Industrial Engineering

• Similar challenges and goals

• Eventually came to software engineering as DevOps

• Need to carry paradigm over to full data-to-decision pipeline

• Why is it so hard?


DevOps: Elegant Concept


DevOps: More complicated to implement


So let’s think about QuantOps™


Investment Process

[Diagram: four stages, Gather information, Digest information, Make decisions, Execute decisions, connected by flows labeled Actions, Results, and Info.]


Investment Process, Expanded

[Diagram. Components:

• Research data, develop and test models

• DevOps for data prep, analytical functions, API

• Production pipeline: data to curated feature

• Model scoring engine

• Model testing manager

Labels on data stores and flows: Data, Core, Research Data, Feature updates, Test backlog, Model suite updates, Features, Scores, Results, Ad hoc data ingestion, Ideas, Function library, Feature definition, Model definition, Model Repository, Feature preparation code, Function definition.]
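One possible reading of the production side of this diagram, as a minimal Python sketch: a function library defines features, a production pipeline turns data into features, and a scoring engine applies every model in a repository to those features. All names here (FEATURE_LIBRARY, Model, ModelRepository, prepare_features, score_all) and the bid/ask fields are illustrative assumptions, not taken from the talk.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Row = Dict[str, float]

# Hypothetical "function library": named feature functions over one raw data row.
FEATURE_LIBRARY: Dict[str, Callable[[Row], float]] = {
    "spread": lambda row: row["ask"] - row["bid"],
    "mid": lambda row: (row["ask"] + row["bid"]) / 2.0,
}


@dataclass
class Model:
    """Toy linear model: a weighted sum over named features (an assumption)."""
    name: str
    features: List[str]
    weights: List[float]

    def score(self, feature_values: Row) -> float:
        return sum(w * feature_values[f] for f, w in zip(self.features, self.weights))


@dataclass
class ModelRepository:
    """Hypothetical stand-in for the diagram's model repository."""
    models: Dict[str, Model] = field(default_factory=dict)

    def register(self, model: Model) -> None:
        self.models[model.name] = model


def prepare_features(row: Row) -> Row:
    """Production pipeline step: raw data in, named features out."""
    return {name: fn(row) for name, fn in FEATURE_LIBRARY.items()}


def score_all(repo: ModelRepository, features: Row) -> Dict[str, float]:
    """Scoring engine step: apply every registered model to the features."""
    return {name: model.score(features) for name, model in repo.models.items()}


# Usage with made-up numbers.
repo = ModelRepository()
repo.register(Model("toy_linear", ["spread", "mid"], [0.5, 0.25]))
features = prepare_features({"bid": 99.0, "ask": 101.0})
scores = score_all(repo, features)
```

Keeping feature definitions and model definitions in separate registries mirrors the separate "Feature definition" and "Model definition" labels in the diagram; a real system would add versioning and persistence.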


A Unifying Paradigm: QuantOps as a DAG

• Standardize the connections

• Carefully define the data APIs

• Then all the technology is pluggable

• Makes it possible to efficiently address:

  • Orchestration

  • Data lineage

  • Monitoring

  • Audit trails

  • Automated code generation and testing
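To make the DAG idea concrete, here is a minimal Python sketch, assuming each stage declares its inputs by name. With that single convention, orchestration (a topological run order), an audit trail, and data lineage fall out of the graph structure. The names Stage, run_dag, and lineage are hypothetical, and the toy stages stand in for real data, feature, and scoring steps; only graphlib.TopologicalSorter is an actual standard-library API (Python 3.9+).

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Any, Callable, Dict, List, Set, Tuple


@dataclass(frozen=True)
class Stage:
    """One pluggable node: a name, the upstream stages it reads, and a function."""
    name: str
    inputs: Tuple[str, ...]
    run: Callable[..., Any]


def run_dag(stages: List[Stage]) -> Tuple[Dict[str, Any], List[Dict[str, Any]]]:
    """Run stages in dependency order and record a simple audit trail."""
    by_name = {s.name: s for s in stages}
    graph = {s.name: set(s.inputs) for s in stages}  # node -> predecessors
    results: Dict[str, Any] = {}
    audit: List[Dict[str, Any]] = []
    # static_order() raises graphlib.CycleError if the graph is not a DAG.
    for name in TopologicalSorter(graph).static_order():
        stage = by_name[name]
        results[name] = stage.run(*(results[i] for i in stage.inputs))
        audit.append({"stage": name, "inputs": list(stage.inputs)})
    return results, audit


def lineage(stages: List[Stage], target: str) -> Set[str]:
    """Return every upstream stage that the target depends on."""
    by_name = {s.name: s for s in stages}
    seen: Set[str] = set()
    stack = list(by_name[target].inputs)
    while stack:
        name = stack.pop()
        if name not in seen:
            seen.add(name)
            stack.extend(by_name[name].inputs)
    return seen


# Usage: stages can be declared in any order; the DAG decides the run order.
stages = [
    Stage("score", ("feature",), lambda f: f * 2.0),
    Stage("feature", ("raw_data",), lambda xs: sum(xs) / len(xs)),
    Stage("raw_data", (), lambda: [1.0, 2.0, 3.0, 4.0]),
]
results, audit = run_dag(stages)
```

Because each stage only sees the outputs of its named inputs, any stage can be swapped for another implementation with the same inputs and output without touching the rest of the graph, which is one way to interpret "pluggable" here.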


Where does STAC fit in?

• Implementing analytics ops is a big commitment with big payoffs

• Biggest challenge: effective communication and change management

• Design needs to be process-oriented and based on user needs

• Technology needs to respond to process requirements, not vice versa

• Emerging STAC roles:

  • Facilitate dialogue & training on analytics ops challenges & best practices

  • Accelerate technology selection based on community-sourced standards driven by a process-oriented model of the investment process

• Let us know if you want to be involved!