big data + data science startup focus points

BIG DATA & DATA SCIENCE START-UP FOCUS POINTS + BUSINESS AND TECHNOLOGY REFERENCE ARCHITECTURE @TomZorde

Upload: tom-zorde

Post on 18-Jan-2017


Category:

Business



TRANSCRIPT

Page 1: Big data + data science startup focus points

BIG DATA & DATA SCIENCE START-UP FOCUS POINTS

+ BUSINESS AND TECHNOLOGY REFERENCE ARCHITECTURE

@TomZorde

Page 2: Big data + data science startup focus points

I HAVE AN IDEA FOR A DATA SCIENCE START-UP

• Use these slides to focus conversation

• What stage are you at?

• What is the problem you’re trying to solve?

• What type of business model would work?

• Tools? – A rapidly evolving space.

• Reference Architecture helps identify what level of the stack we’re talking about.

Page 3: Big data + data science startup focus points

AREAS OF EARLY FOCUS

SEED STAGE – Research & Development

1. Research and define the concept, business model, and internal and sourced capabilities

2. Define customer value proposition and identify target market

ANGEL – Business Planning & Product Development

3. Identify services and products required and evaluate gaps for go-to-market readiness

4. Source funding partner to build minimum viable product and get commitment for round 2 funding

5. Assemble team and build MVP prototype exceeding expectations

ROUND 1/ SERIES A FUNDING – Commercially operational

ROUND 2 / SERIES B FUNDING – Fully Operational

ROUND 3 / SERIES C FUNDING – Expansion

IPO/ ACQUISITION

Page 4: Big data + data science startup focus points

BUSINESS PLANNING & DEVELOPMENT - LOGICAL STEPS

1. Full business needs and information requirements analysis. Business Drivers:

• Revenue generation? Cost reduction? Customer retention? Compliance?

• Process Improvement? Fraud detection? Analytics? Dashboard?

• Solving a tough problem? Retiring/replacing assets, technologies and systems?

2. Technology Evaluation and Selection

• Define requirements and objectives first

• Evaluate a variety of technology stacks – develop a framework first

3. Board Support for Start-up Resources

4. Prototyping, Discovery, and Planning

• Rent Infrastructure in Cloud – VMware, AWS, MS Azure and others

• Use Spare Hardware and Network Bandwidth

• Assessment, Proposal. Project/Program Plan for next steps

• Start small and keep delivering

5. Architecture Design, Estimation, Business Case

6. Obtain funding and executive sponsorships, owners, etc.

7. SDLC, don’t forget Hardware, Security, Testing, Data governance etc.
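One way to read "develop a framework first" in step 2 is as a weighted scoring matrix that is agreed before any stack is looked at. The sketch below is only an illustration: the criteria, the weights, the candidate names `stack_a` and `stack_b`, and every score are hypothetical placeholders, not values from the slides.

```python
from typing import Dict, List

# Hypothetical criteria and weights (assumptions for illustration; they sum to 1.0).
WEIGHTS: Dict[str, float] = {
    "fit_to_requirements": 0.4,
    "team_skills": 0.2,
    "integration_effort": 0.2,
    "maturity_and_security": 0.2,
}

# Placeholder 1-5 scores for two made-up candidates, not real product ratings.
CANDIDATES: Dict[str, Dict[str, int]] = {
    "stack_a": {"fit_to_requirements": 4, "team_skills": 2,
                "integration_effort": 3, "maturity_and_security": 4},
    "stack_b": {"fit_to_requirements": 3, "team_skills": 5,
                "integration_effort": 4, "maturity_and_security": 3},
}


def weighted_score(scores: Dict[str, int],
                   weights: Dict[str, float] = WEIGHTS) -> float:
    """Return the weighted sum of the scores; fail loudly if a criterion is unscored."""
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(weights[c] * scores[c] for c in weights)


def rank_stacks(candidates: Dict[str, Dict[str, int]]) -> List[str]:
    """Return candidate names ordered from highest to lowest weighted score."""
    return sorted(candidates,
                  key=lambda name: weighted_score(candidates[name]),
                  reverse=True)


if __name__ == "__main__":
    for name in rank_stacks(CANDIDATES):
        print(name, round(weighted_score(CANDIDATES[name]), 2))
```

Fixing the weights before scoring keeps the requirements and objectives in charge of the outcome, rather than the preferences of whoever demos a tool first.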

Page 5: Big data + data science startup focus points

FORESEEABLE CHALLENGES

Business urgency, time to market pressures

• A Big Data/Data Science start-up needs careful planning

• Big Data needs infrastructure, software stacks, people, and a start-up plan

Lack of Big Data Resources, Lack of Sponsorships (except in some companies)

• Big Data is complex and requires multiple skill sets (mostly new to many companies) – Infrastructure, Administration, Security, Programming, Testing, etc.

• Skepticism about Big Data

Integration with Existing Technologies and Systems

• Cannot develop isolated big data solutions

• Integration with existing systems will be a top challenge (requires both sides to do additional work)

Open Source: Stability, Maturity, and Security

Page 6: Big data + data science startup focus points

INFORMATION AS A PRODUCT/SERVICE

TYPES OF RELEVANT BUSINESS MODELS

Differentiation

New Services

Customer Experience

Contextual Relevance

Brokering

Raw Data

Benchmarking

Analysis and Insight (Meta Data)

Delivery

Market Place

Facilitator

Advertising

Page 7: Big data + data science startup focus points

REFERENCE ARCHITECTURE

Decisions & Insight

Analytics & Discovery

Data Access and Distribution

Data Collection & Organisation

Infrastructure Platform

Monitoring, Alerts, Tools, Security, Governance

• The technology stack is rapidly evolving with all traditional as well as new vendors providing offerings

• Open source tools remain at the foundation layers.

• Different use cases will require different technology tools.
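To make "what level of the stack we're talking about" concrete, the layers on this slide can be written down as a small data structure. This is a minimal sketch, not part of the original deck: the helper name `locate`, the numbering of levels from the bottom up, and the reading that each layer builds on the ones beneath it while monitoring, security and governance cut across all of them are assumptions drawn from the slide layout.

```python
from typing import Dict, List

# Layers in the order they appear on the slide, top to bottom.
LAYERS: List[str] = [
    "Decisions & Insight",
    "Analytics & Discovery",
    "Data Access and Distribution",
    "Data Collection & Organisation",
    "Infrastructure Platform",
]

# Assumed to apply to every layer (shown as a sidebar on the slide).
CROSS_CUTTING: List[str] = ["Monitoring", "Alerts", "Tools", "Security", "Governance"]


def locate(layer: str) -> Dict[str, object]:
    """Describe where a layer sits in the stack (level 1 is the bottom layer)."""
    idx = LAYERS.index(layer)  # raises ValueError for an unknown layer name
    return {
        "layer": layer,
        "level": len(LAYERS) - idx,
        "builds_on": LAYERS[idx + 1:],   # assumption: lower layers support upper ones
        "cross_cutting": CROSS_CUTTING,
    }


if __name__ == "__main__":
    print(locate("Analytics & Discovery"))
```

In a conversation about a start-up idea, calling `locate` on the layer under discussion gives a quick reminder of what has to exist underneath it before it can deliver value.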

Page 8: Big data + data science startup focus points

REFERENCE ARCHITECTURE

Decisions & Insight

• IBM Watson

• Industry Specific

Analytics & Discovery

• SAP Business Objects

• IBM Cognos

• SAS Analytics

• Dell Statistica

• Oracle Hyperion

• Microsoft BI

• KNIME

• Pentaho

• Informatica

Page 9: Big data + data science startup focus points

REFERENCE ARCHITECTURE

Data Access and Distribution

• Document: MongoDB, CouchDB

• Graph: Neo4j, Titan

• Key Value Pair: Riak, Redis

• Columnar: Cassandra, HBase

• Search: Lucene, Solr, Elasticsearch
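The grouping of data stores above can be kept as a simple lookup so that a tool mentioned in conversation is quickly placed in its category. The sketch below only restates the slide's own grouping; the names `STORES_BY_MODEL` and `model_of` are hypothetical, and the table is not an exhaustive or authoritative classification.

```python
from typing import Dict, List, Optional

# Data model -> example products, exactly as grouped on the slide.
STORES_BY_MODEL: Dict[str, List[str]] = {
    "document": ["MongoDB", "CouchDB"],
    "graph": ["Neo4j", "Titan"],
    "key-value": ["Riak", "Redis"],
    "columnar": ["Cassandra", "HBase"],
    "search": ["Lucene", "Solr", "Elasticsearch"],
}


def model_of(product: str) -> Optional[str]:
    """Return the data model a product is listed under, or None if it is not listed."""
    wanted = product.casefold()  # case-insensitive match, e.g. "hbase" finds "HBase"
    for model, products in STORES_BY_MODEL.items():
        if any(p.casefold() == wanted for p in products):
            return model
    return None


if __name__ == "__main__":
    print(model_of("Redis"))
```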

Monitoring, Alerts, Tools, Security, Governance:

• Hadoop: Apache, Cloudera, Hortonworks, MapR, IBM

• SQL Mapping: Hive

• Big Data Transformation: Pig

• Hadoop Load: Sqoop

• Realtime-ETL: Storm

• Cluster Computing: Apache Spark

• Languages: Python, Java, R, Scala

Page 10: Big data + data science startup focus points

REFERENCE ARCHITECTURE

Data Collection & Organisation (Batch & Real-Time)

• Hadoop

• Hadoop MapReduce

• Mahout

Infrastructure Platform

• AWS

• Azure

• Mortar

• Google BigQuery

• Qubole

• Dell

• HP

• IBM

Page 11: Big data + data science startup focus points

BIG DATA & DATA SCIENCE START-UP FOCUS POINTS

@TomZorde

Thank you