Getting Started With Big Data

Upload: soner-altin

Post on 20-Jun-2015

Category: Technology

DESCRIPTION

Big data for beginners. This talk tries to show that the saying "Big data is like teenage sex: everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it..." is totally wrong.

TRANSCRIPT

Page 1: Getting Started with Big Data

Getting Started With Big Data

Page 2: Getting Started with Big Data

“Wanted: Ph.D.-level statistician with the technical skill to use data-visualization software and a deep understanding of the _____ industry.”

– Getting started in ‘Big Data’

Page 3: Getting Started with Big Data

Gartner: "high volume, velocity and/or variety information assets that demand cost-effective, innovative forms of information processing that enable enhanced insight, decision making, and process automation."

SAS: "The term big data has been around for decades, and we've been doing analytics all this time. It's not big, it's just bigger."

Wikipedia: "Big data is the term for a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications."

Page 9: Getting Started with Big Data

Source | Size | Equivalent number of standard laptop hard drives

U.S. Library of Congress | 235 terabytes | 320

YouTube | 48 hours of video / minute, 371 terabytes / day | 495

Facebook | 100 terabytes / day | 133

Walmart | 2.5 petabytes / hour | 3500

Average hospital in 2015 | 665 terabytes / year | 908
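
The drive counts in the table above are a simple division of data size by drive capacity. The slide does not say what capacity it assumes for a "standard laptop" drive; its figures work out to roughly 0.7 to 0.75 terabytes per drive. The sketch below assumes 0.75 terabytes (the name ASSUMED_DRIVE_TB and the helper drives_needed are illustrative, not from the deck), so its results only approximate the slide's numbers.

```python
# Assumed capacity of one "standard laptop hard drive", in terabytes.
# The slide does not state this value; 0.75 TB is a guess that roughly
# reproduces its figures.
ASSUMED_DRIVE_TB = 0.75

# Sizes taken from the table, converted to terabytes (1 PB = 1000 TB).
SIZES_TB = {
    "U.S. Library of Congress": 235,
    "YouTube (per day)": 371,
    "Facebook (per day)": 100,
    "Walmart (per hour)": 2.5 * 1000,
    "Average hospital in 2015 (per year)": 665,
}


def drives_needed(size_tb, drive_tb=ASSUMED_DRIVE_TB):
    """Express a data size as a rounded number of drives."""
    return round(size_tb / drive_tb)


if __name__ == "__main__":
    for name, size_tb in SIZES_TB.items():
        print(f"{name}: {size_tb:g} TB is about {drives_needed(size_tb)} drives")
```

Under this assumption the YouTube and Facebook rows match the slide, while the other rows come out somewhat lower, which suggests the deck used a slightly different capacity or rounding for those.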

Page 11: Getting Started with Big Data

• Recommendation engine

• Network monitoring

• Sentiment analysis

• Fraud detection

• Risk modelling

• Customer experience analytics

• Marketing campaign analysis

• Customer churn analysis

Page 12: Getting Started with Big Data

• Visa recently reported that it has greatly improved its ability to detect fraudulent transactions (estimated at 6 cents out of every $100, or 0.06%) by increasing the amount of data it analyzes and looking at a broader range of attributes for each transaction.

• Citibank has improved the quality of its consumer loan portfolio by hiring IBM's Watson supercomputer as a "financial advisor." By using information on market conditions as well as the applicant's life events, interactions on social media and past decisions, the company is able to get a far better prediction of potential loan defaults and fraud.

• Walmart applied big data techniques and technologies to allow it to understand how to better serve its online customers. The retailer generated product and category popularity scores by mining social media, which it combined with a self-teaching semantic search capability honed by the clickstream data of 45 million online shoppers each month.

• Netflix uses data analysis to refine movie recommendations and customer searches, as well as to identify which movies and TV shows to license or develop.

Page 14: Getting Started with Big Data

Data is only as good as the intelligence we can glean from it, and that entails effective data analytics and a whole lot of computing power to cope with the exponential increase in volume.

Page 16: Getting Started with Big Data

data.gov

Page 17: Getting Started with Big Data

books.google.com/ngrams
