Big Data Update 2018 - DAMA Indiana


Big Data Update 2018

by H. Michael Covert and Victoria Loewengart

January 19, 2018

Proprietary and Confidential

Agenda

• Financial and Market Overview

• New technology continues to emerge

• Key Trends for the future


Who are we?

• Analytics Inside was founded in 2011

• Based in Columbus, Ohio

• Big Data and advanced analytics

– Training – Big Data and Machine Learning

– Consulting Services

– Products – Machine Learning, Graph Theory, and Natural Language Processing

• Specializing in

– Health Care

– Education

– Intelligence


Big Data Trends for 2018

• The term “Big Data” is slowly disappearing. It’s all Big Data now! Google has amassed 10 – 15 EB of data!

• In a sense, Big Data has become just another enabling technology.

– Machine Learning and Artificial Intelligence have become the new table stakes

• Now, visual and language identification are appearing everywhere. From Siri and Alexa, to Tesla, and even drones!


Big Data Trends for 2018

• Cloud-based computing has emerged as a generalized computing platform and is now pervasive

• Specialized cloud-based platforms have emerged

– Health care – fitness, social interaction, home health care

– Online – search by voice, automated shopping, personal assistants, music, fake news detection

– Financial – revenue cycle management, online banking

– Grocers! Automotive! Legal! Education!

– Programs that write programs!

• Micro-service architecture is fueling this


The Big Data Market


Interestingly though, the total market is almost double this size since Big Data is now only a component of what we once called the total market.

Machine Learning Market

The Big Data Market

• Labor (and talent) shortages continue

– The largest investment is coming from:

• Cybersecurity

• Healthcare

• Retail (and automotive)

• Software is emerging to ease the skill set required to produce ML enabled systems

– But beware! This stuff isn’t easy, and bad algorithms yield bad answers very quickly!


Algorithm Marketplaces


• Specialized companies designed to solve business problems by building and selling custom designed algorithms

– Algorithmia – out of USC Viterbi

– OpenAI – Elon Musk

– Clarifai, Nara Logics

– MetaMind – purchased by Salesforce

– DeepMind – purchased by Google

– PrecisionHawk – AI for drones!!!!

Technology Trends


• Some pure technology:

– Kudu

– Apache Drill

– In-memory databases

• MemSQL, MapD

– Algorithm marketplaces

– Deep learning frameworks

• TensorFlow, Keras, Caffe, Torch, DL4J …

– GPUs and TPUs

– Edge computing (Open Edge Computing, EdgeX Foundry, …)

– Blockchain

– Dark data tools

• Stanford DeepDive

• Indiana University Health RX for Mining Dark Data

• Stitch Fix

– Quantum computing


Apache Kudu


• Apache Kudu is a functional replacement for HDFS that has finally matured

– No more “Write-once read-many” design. Allows scalable distributed deletes and updates to data (Raft Consensus)

– It is an enabling technology for advanced analytics and for real-time reporting

• Columnar

• Linearly scalable

• Fault tolerant

• Optimized for:

– Multi-core

– Solid state disk
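To make the contrast with a write-once design concrete, here is a minimal single-process sketch of a columnar table that supports in-place updates and deletes by primary key. It is a toy illustration only: `ToyColumnarTable` and its methods are hypothetical names, not Kudu's client API, and nothing here models distribution or Raft replication.

```python
from typing import Any, Dict, List


class ToyColumnarTable:
    """Toy in-memory columnar table keyed by primary key (not Kudu's API)."""

    def __init__(self, columns: List[str]) -> None:
        # One list per column, so values of the same column sit together.
        self.columns: Dict[str, List[Any]] = {name: [] for name in columns}
        self.row_of_key: Dict[Any, int] = {}  # primary key -> row position
        self.keys: List[Any] = []             # row position -> primary key

    def insert(self, key: Any, **values: Any) -> None:
        if key in self.row_of_key:
            raise KeyError(f"duplicate primary key: {key!r}")
        self.row_of_key[key] = len(self.keys)
        self.keys.append(key)
        for name, column in self.columns.items():
            column.append(values.get(name))

    def update(self, key: Any, **values: Any) -> None:
        # In-place update: only the named columns are rewritten.
        row = self.row_of_key[key]
        for name, value in values.items():
            self.columns[name][row] = value

    def delete(self, key: Any) -> None:
        # Move the last row into the freed slot so every column stays dense.
        row = self.row_of_key.pop(key)
        last = len(self.keys) - 1
        if row != last:
            moved_key = self.keys[last]
            self.keys[row] = moved_key
            self.row_of_key[moved_key] = row
            for column in self.columns.values():
                column[row] = column[last]
        self.keys.pop()
        for column in self.columns.values():
            column.pop()

    def scan(self, name: str) -> List[Any]:
        # A columnar scan reads one column without touching the others.
        return list(self.columns[name])


table = ToyColumnarTable(["patient", "heart_rate"])
table.insert(1, patient="a", heart_rate=72)
table.insert(2, patient="b", heart_rate=88)
table.insert(3, patient="c", heart_rate=64)
table.update(2, heart_rate=90)   # mutate a single cell
table.delete(1)                  # remove a row by key
print(table.scan("heart_rate"))
```

The point of the sketch is the access pattern: an analytic scan touches one column, while updates and deletes address rows by key, which a write-once file layout cannot do without rewriting files.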

Apache Drill


• Drill is a highly scalable SQL engine designed for Big Data. It is an open source engine inspired by Google’s Dremel

– Schema-less, using a JSON data model, like MongoDB and Elasticsearch

– It is “real SQL-2003” that operates on structured and “semi-structured” data sources
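The "schema-less" idea can be sketched in a few lines: records are read as they are, and fields are resolved at query time rather than declared up front. The sketch below is plain Python, not Drill; the sample records and the helper names `read_records` and `select` are made up for illustration. It mimics a query of the shape `SELECT name, visits FROM ... WHERE visits > 2` over JSON lines whose fields differ from record to record.

```python
import json
from typing import Any, Callable, Dict, List

# Hypothetical sample data: each line is a JSON record, and the fields vary.
RAW = """
{"name": "Ann", "city": "Columbus", "visits": 3}
{"name": "Bob", "visits": 7, "tags": ["vip"]}
{"name": "Cy", "city": "Dayton"}
"""


def read_records(text: str) -> List[Dict[str, Any]]:
    # No schema is declared; each record keeps whatever fields it has.
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def select(
    records: List[Dict[str, Any]],
    fields: List[str],
    where: Callable[[Dict[str, Any]], bool],
) -> List[Dict[str, Any]]:
    # Missing fields come back as None, in the spirit of SQL NULL.
    return [{f: rec.get(f) for f in fields} for rec in records if where(rec)]


records = read_records(RAW)
result = select(records, ["name", "visits"], lambda r: r.get("visits", 0) > 2)
print(result)
```

The third record has no `visits` field and is filtered out rather than causing an error, which is the practical benefit of resolving the schema at read time.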

Apache Hive


• Hortonworks continues to enhance Hive

– Insert and update capabilities and persistent servers (no more MapReduce) via Tez

– Soon it will have an MDX interface to allow robust OLAP integration and SQL-2011 certification

Advances in Computing


• In memory storage

– Spark

– MemSQL

• Advanced computing architectures

– The usage of GPUs

• MapD

• AWS – warehouses of Elastic GPUs!

– In Memory Computing

• IBM Phase Change Memory – exploits the physical characteristics of memory to allow computation to occur without moving data to the CPU! Initial results show 200X performance improvements.

MapD


Advances in Computing


• Blockchain – contains many concepts derived from Big Data

• Immutability – after it is written, it cannot be changed

• Distribution – many servers handle the load

• Replication – data exists across the network redundantly

• Natural Language Processing (NLP)
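The immutability property listed for blockchain above is usually achieved by chaining hashes: each block stores the hash of the one before it, so changing an old entry breaks every later link. The following is a minimal sketch under that assumption. The block layout and the function names are illustrative, and distribution, replication, and consensus are deliberately left out.

```python
import hashlib
import json
from typing import Any, Dict, List

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index: int, data: str, prev_hash: str) -> str:
    # Hash a canonical JSON encoding of the block's contents.
    payload = json.dumps(
        {"index": index, "data": data, "prev_hash": prev_hash}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain: List[Dict[str, Any]], data: str) -> None:
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    index = len(chain)
    chain.append(
        {
            "index": index,
            "data": data,
            "prev_hash": prev_hash,
            "hash": block_hash(index, data, prev_hash),
        }
    )


def is_valid(chain: List[Dict[str, Any]]) -> bool:
    # Recompute every hash and check that each block points at its predecessor.
    prev_hash = GENESIS_HASH
    for block in chain:
        if block["prev_hash"] != prev_hash:
            return False
        expected = block_hash(block["index"], block["data"], block["prev_hash"])
        if block["hash"] != expected:
            return False
        prev_hash = block["hash"]
    return True


chain: List[Dict[str, Any]] = []
for entry in ["alice pays bob 5", "bob pays carol 2", "carol pays dave 1"]:
    append_block(chain, entry)

valid_before = is_valid(chain)
chain[1]["data"] = "bob pays carol 200"  # tamper with an already-written block
valid_after = is_valid(chain)
print(valid_before, valid_after)
```

Strictly speaking this makes tampering detectable rather than impossible; in a real network, replication across many servers is what makes a rewritten copy lose out to the honest ones.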

Advances in NLP:

• Natural Language Understanding (NLU)

• Natural Language Generation (NLG)


Ok Google

Hey Siri

Alexa

C:> Cortana

Natural Language Processing Technology


• Will drive significant cultural change

– Primary method of human-machine interface

– Eventual acceptance of “machines” as companions (real people?)

– Machines will become active participants. They will talk to us without being “called for.”

Edge Computing


• Edge computing utilizes layered computation, some server based, and some occurring at the edge (mobile phone, IoT device, drone, etc.)

– Coupled with NLP, edge computing becomes the strategic delivery technology that is driving its acceptance
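The layering described above can be sketched as two functions: one that runs on the device and reduces raw readings to a small summary, and one that runs on the server and combines summaries from many devices. Device names, readings, and the alert threshold below are invented for illustration, and no particular edge framework is assumed.

```python
from statistics import mean
from typing import Any, Dict, List


def edge_summarize(
    device_id: str, readings: List[float], alert_above: float
) -> Dict[str, Any]:
    # Runs on the device: raw readings stay local, only a summary is sent.
    return {
        "device": device_id,
        "count": len(readings),
        "mean": mean(readings),
        "max": max(readings),
        "alert": max(readings) > alert_above,
    }


def server_aggregate(summaries: List[Dict[str, Any]]) -> Dict[str, Any]:
    # Runs on the server: combines per-device summaries into a fleet view.
    total = sum(s["count"] for s in summaries)
    fleet_mean = sum(s["mean"] * s["count"] for s in summaries) / total
    return {
        "devices": len(summaries),
        "readings": total,
        "fleet_mean": fleet_mean,
        "alerts": [s["device"] for s in summaries if s["alert"]],
    }


# Hypothetical raw data held on two edge devices.
raw = {"drone-1": [20.0, 22.0, 24.0], "phone-7": [30.0, 50.0]}
summaries = [edge_summarize(d, r, alert_above=40.0) for d, r in raw.items()]
report = server_aggregate(summaries)
print(report)
```

The design choice is where the threshold check lives: the device decides locally whether something needs attention, so the alert does not depend on shipping every reading to the server first.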

Real World Technology Trends


• Restock Kroger

– Mobile GPS, cyber-currency, online profile, and in-store IoT converge

• ClickList ordering, cashier-less checkout

• Facebook Fact Checking Alert

– Proliferation of false news stories

– Foreign national political interference

• Stitch Fix online clothing

The Algorithmic Approach of


What Will the Future Hold?


• Machine learning and artificial intelligence will gain ground

– Job markets will be impacted. A lot of jobs will simply disappear

– There will be greater scrutiny of how this technology is used, and specifically of how transparent its use is.

• Ethics will be closely monitored

• Privacy will be “encroached upon” even more than it already is

• Legal actions will result

What Will the Future Hold?


• Humans will become increasingly “augmented”

– Visual and vocal interactions will assume the primary role of interactions with computers

– High speed question answering will occur

– Cars and trucks will become self-driven

What Will the Future Hold?


• AlphaGo beat Lee Sedol, one of the world’s best Go players, 4 games to 1 (March 2016)

– Reinforcement learning

– The real breakthrough is that this is “general purpose” AI.

• It is NOT Watson beating Ken Jennings. It is NOT Deep Blue beating Garry Kasparov.

– In fact, what we are seeing is thought by many to be some sort of “new” type of intelligence.
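For readers new to the term, reinforcement learning means learning which action to take from reward signals alone. The sketch below is tabular Q-learning on a five-state corridor, a far simpler setting than Go and not a description of how AlphaGo works (AlphaGo combined deep neural networks with tree search). The environment, constants, and function names are all assumptions chosen for illustration, and instead of an exploring agent the loop simply sweeps every state-action pair so the run is deterministic.

```python
from typing import Dict, List, Tuple

N_STATES = 5                 # states 0..4; state 4 is the terminal goal
ACTIONS = ["left", "right"]
ALPHA, GAMMA = 0.5, 0.9      # learning rate and discount factor


def step(state: int, action: str) -> Tuple[int, float]:
    # Deterministic corridor: reward 1.0 only when the goal is reached.
    if action == "right":
        nxt = min(state + 1, N_STATES - 1)
    else:
        nxt = max(state - 1, 0)
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward


def train(sweeps: int = 200) -> Dict[Tuple[int, str], float]:
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(sweeps):
        for s in range(N_STATES - 1):      # the terminal state is never updated
            for a in ACTIONS:
                nxt, reward = step(s, a)
                best_next = max(q[(nxt, b)] for b in ACTIONS)
                # Q-learning update: move the estimate toward the observed target.
                q[(s, a)] += ALPHA * (reward + GAMMA * best_next - q[(s, a)])
    return q


q = train()
policy: List[str] = [
    max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)
]
print(policy)
```

Nothing tells the learner that "right" is correct; the preference emerges because reward at the goal propagates backward through the value estimates, which is the core idea the slide refers to.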

Questions and Answers


http://www.AnalyticsInside.us


Training. Consulting. Advanced. Analytical. Intelligence.