Big Data. What is Big Data? Big Data Analytics: 11 Case Histories and Success Stories

Big Data

Upload: clyde-hill

Post on 28-Dec-2015




Page 1: Big Data. What is Big Data?  Big Data Analytics: 11 Case Histories and Success Stories

Big Data

Page 2

What is Big Data?

• https://www.youtube.com/watch?v=c4BwefH5Ve8
• Big Data Analytics: 11 Case Histories and Success Stories
• https://www.youtube.com/watch?annotation_id=annotation_3535169775&feature=iv&src_vid=c4BwefH5Ve8&v=t4wtzIuoY0w

Page 3

Big Data
• Data Size:
– Gigabyte
– Terabyte: for example, a terabyte USB drive
– Petabyte: Wal-Mart handles more than 1 million customer transactions every hour, feeding databases estimated at more than 2.5 petabytes
– Exabyte: the traffic flowing over the internet amounts to about 700 exabytes annually
– Zettabyte
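
The size ladder above can be made concrete with a little arithmetic. The sketch below assumes decimal (SI) prefixes, where each step is a factor of 1,000; the slides do not say which convention is meant, and the names `UNITS` and `convert` are illustrative only.

```python
# Decimal (SI) byte units: each step up is a factor of 1,000.
# Assumption: the slides do not state whether decimal or binary prefixes apply.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB"]


def convert(value: float, from_unit: str, to_unit: str) -> float:
    """Convert a quantity between two of the byte units listed in UNITS."""
    steps = UNITS.index(from_unit) - UNITS.index(to_unit)
    if steps >= 0:
        return value * 1000 ** steps   # going to a smaller unit
    return value / 1000 ** (-steps)    # going to a larger unit


if __name__ == "__main__":
    # The 2.5 PB Wal-Mart figure expressed in terabytes.
    print(convert(2.5, "PB", "TB"))
    # The 700 EB annual internet traffic figure expressed in zettabytes.
    print(convert(700, "EB", "ZB"))
```

Seen this way, the annual internet traffic figure is a little under one zettabyte, which is why the zettabyte is the next rung on the ladder.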

Page 4

Big Data: Some Facts

• The world’s information is doubling every two years
• The world generated 1.8 ZB of information in 2011
• Cisco predicts that by 2016 annual global IP traffic will reach 1.3 zettabytes
• There will be 19 billion networked devices by 2016
• 70% of this data is being generated by individuals as opposed to enterprises & organizations
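
To see what "doubling every two years" implies, the following sketch extrapolates from the 1.8 ZB figure for 2011. It is a naive projection that assumes the doubling rate stays constant; the function name and its parameters are hypothetical, and the results are not forecasts from the slides.

```python
def projected_zettabytes(year: int,
                         base_year: int = 2011,
                         base_zb: float = 1.8,
                         doubling_years: float = 2.0) -> float:
    """Project data volume in ZB, assuming steady doubling every `doubling_years`."""
    elapsed = year - base_year
    return base_zb * 2 ** (elapsed / doubling_years)


if __name__ == "__main__":
    for year in (2011, 2013, 2015, 2021):
        print(year, round(projected_zettabytes(year), 1), "ZB")
```

Under this assumption the volume grows by a factor of 32 per decade, since ten years contain five doublings.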

Page 5

Big Data Sources

• Web sites
• Social media
• Machine generated
• RFID
• Image, video, and audio
• Etc.

Page 6

Big Data Challenges
• Big Data are high-volume, high-velocity, and/or high-variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization.
• “3Vs”:
– Volume: Size >= 30-50 TBs
– Velocity: Processing speed
– Variety:
  • Structured: able to fit in a database table
  • Unstructured data

Page 7

Do Companies care about Data?

• Not really. What they care about are Key Performance Indicators (KPIs)
• Some examples of KPIs are
– Revenue
– Profit
– Revenue per customer/employee
– Customer attrition: the loss of clients or customers
• Big Data is only useful if it helps drive KPIs
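
The KPIs listed above are simple ratios over business totals. The sketch below computes them from made-up period figures; the `Period` fields and the attrition formula (customers lost divided by customers at the start of the period) are assumptions chosen for illustration, since the slides give no formulas.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Period:
    revenue: float         # total revenue in the period
    costs: float           # total costs in the period
    customers_start: int   # customers at the start of the period
    customers_lost: int    # customers who left during the period


def kpis(p: Period) -> Dict[str, float]:
    """Compute the KPIs named on the slide from hypothetical period totals."""
    return {
        "revenue": p.revenue,
        "profit": p.revenue - p.costs,
        "revenue_per_customer": p.revenue / p.customers_start,
        # One common definition of attrition; others exist.
        "attrition_rate": p.customers_lost / p.customers_start,
    }


if __name__ == "__main__":
    quarter = Period(revenue=1_000_000, costs=800_000,
                     customers_start=2_000, customers_lost=100)
    for name, value in kpis(quarter).items():
        print(name, value)
```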

Page 8

Big Data to KPIs

Page 9

Applications
• Text mining: deriving high-quality information from text.
– Text categorization, text clustering, concept/entity extraction, sentiment analysis, etc.
• Web mining:
– Web usage mining
– Web content mining
• Social media mining
– Salesforce Radian6 Social Marketing Cloud
  • http://www.youtube.com/watch?v=EH1dcFh_-I4
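
As a small illustration of sentiment analysis, one of the text mining tasks named above, the sketch below labels a sentence by counting words from two tiny word lists. The lexicons and the function name are hypothetical, and production tools use far richer models than this.

```python
import re

# Hypothetical, deliberately tiny sentiment lexicons for illustration.
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "poor", "hate", "terrible", "slow"}


def sentiment(text: str) -> str:
    """Label text as positive, negative, or neutral by counting lexicon hits."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for post in ("I love this great phone",
                 "Terrible service and slow delivery",
                 "The package arrived on Tuesday"):
        print(post, "->", sentiment(post))
```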

Page 10

Hadoop HDFS: Hadoop Distributed File System

• “Imagine you had a file that was larger than your PC's capacity. You could not store that file, right? Hadoop lets you store files bigger than what can be stored on one particular node or server. So you can store very, very large files. It also lets you store many, many files.”
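
The idea in the quote, storing a file too large for any single node, can be sketched by cutting the file into fixed-size blocks and spreading them over several machines. This is a toy model only: the function name, the round-robin placement, and the node names are assumptions, and real HDFS block placement and replication are more involved.

```python
from typing import List, Tuple


def place_blocks(file_size: int, block_size: int,
                 nodes: List[str]) -> List[Tuple[int, int, str]]:
    """Split a file into fixed-size blocks and assign them to nodes round-robin.

    Returns (block_index, block_length, node) triples. Sizes are in arbitrary
    units; the last block may be shorter than block_size.
    """
    placement = []
    offset = 0
    index = 0
    while offset < file_size:
        length = min(block_size, file_size - offset)
        placement.append((index, length, nodes[index % len(nodes)]))
        offset += length
        index += 1
    return placement


if __name__ == "__main__":
    # A 250-unit file does not fit on one 200-unit node, but its blocks do.
    for block in place_blocks(250, 100, ["node1", "node2"]):
        print(block)
```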

Page 11

Hadoop: MapReduce

• “rather than take the conventional step of moving data over a network to be processed by software, MapReduce uses a smarter approach tailor made for big data sets.”

• “…rather than move the data to the software, MapReduce moves the processing software to the data.” (InfoWeek)
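
The map and reduce steps can be illustrated with the classic word count. The sketch below runs in a single Python process, so it shows only the programming model; in Hadoop the map function would run on the nodes that already hold each chunk of data. All function names here are illustrative and are not Hadoop APIs.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_phase(chunk: str) -> List[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group the emitted values by key between the map and reduce phases."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups: Dict[str, List[int]]) -> Dict[str, int]:
    """Sum the grouped counts for each word."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(chunks: Iterable[str]) -> Dict[str, int]:
    """Run map on every chunk, then shuffle and reduce the results."""
    pairs = [pair for chunk in chunks for pair in map_phase(chunk)]
    return reduce_phase(shuffle(pairs))


if __name__ == "__main__":
    # Each string stands in for a chunk stored on a different node.
    print(word_count(["big data", "Big deal"]))
```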

Page 12

NoSQL Database

• NoSQL (“Not Only SQL”) is a broad class of database management systems that do not adhere to the widely used relational database management system model.

• They are useful for working with huge quantities of data whose nature does not require a relational model.
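
One way to picture data that "does not require a relational model" is a document store, where each record is a free-form document looked up by key and records need not share the same fields. The class below is a hypothetical in-process toy, not the API of any real NoSQL product.

```python
from typing import Any, Dict, List, Optional


class DocumentStore:
    """Toy key-to-document store with no fixed schema (illustration only)."""

    def __init__(self) -> None:
        self._docs: Dict[str, Dict[str, Any]] = {}

    def put(self, key: str, doc: Dict[str, Any]) -> None:
        """Store a copy of the document under the given key."""
        self._docs[key] = dict(doc)

    def get(self, key: str) -> Optional[Dict[str, Any]]:
        """Return the document for the key, or None if it is absent."""
        return self._docs.get(key)

    def find(self, field: str, value: Any) -> List[Dict[str, Any]]:
        """Return every document whose field equals value; missing fields never match."""
        return [d for d in self._docs.values() if field in d and d[field] == value]


if __name__ == "__main__":
    store = DocumentStore()
    # The two documents have different fields; no table schema is declared.
    store.put("u1", {"name": "Ann", "city": "Oslo"})
    store.put("u2", {"name": "Bo", "tags": ["rfid", "video"]})
    print(store.find("city", "Oslo"))
```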

Page 13

In-Memory Database
• An in-memory database is a database management system that primarily relies on main memory for computer data storage. It is contrasted with database management systems that employ a disk storage mechanism.
• Main memory databases are faster than disk-optimized databases.
• Good for Big Data analytics.
• Can use non-volatile memory modules that retain data even when electrical power is removed.
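
A quick way to try an in-memory database is SQLite's `:memory:` mode, available through Python's standard `sqlite3` module. SQLite is used here only as a convenient stand-in and is not one of the products discussed in these slides; the table, the data, and the function name are made up for the example.

```python
import sqlite3
from typing import Iterable, List, Tuple


def total_revenue_by_region(rows: Iterable[Tuple[str, float]]) -> List[Tuple[str, float]]:
    """Load rows into an in-memory SQLite database and aggregate them by region."""
    conn = sqlite3.connect(":memory:")  # no file on disk; the data lives in RAM
    try:
        conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
        conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
        )
        return cursor.fetchall()
    finally:
        conn.close()  # closing the connection discards the in-memory database


if __name__ == "__main__":
    sample = [("east", 10.0), ("west", 5.0), ("east", 2.5)]
    print(total_revenue_by_region(sample))
```

Because this database disappears when the connection closes, it also shows why durability matters for in-memory systems that must survive a loss of power.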

Page 14

SAP HANA

• SAP HANA (High-Speed Analytical Appliance) uses sophisticated data compression to store data in random access memory. Its performance is reported to be 10,000 times faster than standard disks, which allows companies to analyze data in a matter of seconds instead of long hours. (Techopedia)