bda 2012 big data why the big fuss?

Big Data: Why the big fuss?

Upload: christopher-bradley

Post on 03-Dec-2014

DESCRIPTION

Big Data: why the big fuss? Volume, Variety, Velocity ... we all know the three V's of Big Data. But Big Data that yields little information is useless, so focus on the fourth V: Value. If you haven't sorted out quality and data governance for your "little data", think seriously about whether you want to venture into the world of Big Data.

TRANSCRIPT

Page 1: BDA 2012 Big data why the big fuss?

Big Data: Why the big fuss?

Page 2: BDA 2012 Big data why the big fuss?

Presenter

Chris Bradley
Chief Development Officer
[email protected]
+44 1225 475000

My blog: Information Management, Life & Petrol
http://infomanagementlifeandpetrol.blogspot.com

@InfoRacer

Page 3: BDA 2012 Big data why the big fuss?

Introductions

Chris has spent 32 years in the information management field, working for leading organisations in Data Management Strategy, Master Data Management, Metadata Management, Data Warehouse and Business Intelligence.

After graduating in 1979, Chris worked for the MoD (Navy), Volvo, Thorn EMI (as Head of Information Management), Reader’s Digest Inc (as European CIO), and Coopers and Lybrand Management Consultancy, where he established and ran the International Data Management practice.

Chris heads IPL’s Business Consultancy practice and is advising several Energy, Pharmaceutical, Finance and Government clients on Business Process and Information Asset Management.

Chris is a member of the MPO, a Director of DAMA UK and holds the CDMP Master certification. He co-authored “Data Modelling For The Business – A Handbook for aligning the business with IT using high-level data models”.

Chris is a columnist and frequent contributor to industry publications. He authors an experts channel on the influential BeyeNETWORK, is a recognised thought-leader in Information Management and a regular key speaker at major international Information Management conferences.

Christopher Bradley
Chief Development Officer
[email protected]
+44 1225 475000

Blog: Information Management, Life & Petrol
http://infomanagementlifeandpetrol.blogspot.com

@InfoRacer

Page 4: BDA 2012 Big data why the big fuss?

Who is IPL?

Trusted, independent consulting & solutions company
• 30 year track record
• 300 staff, £28m+ turnover
• High-stakes, business & mission critical contexts
• Consistently exceed expectations

Business Consulting Division
• Information Management
  - IM Strategy
  - Information Security & Assurance
  - Data Governance
  - Information Exploitation
  - Master Data Management
  - Information Architecture
  - Business Intelligence
  … turning Information into a strategic asset
• Enterprise Architecture
• Business Process Management
• Programme Management

IPL Consulting Clients

Page 5: BDA 2012 Big data why the big fuss?

Three V’s

Page 8: BDA 2012 Big data why the big fuss?

Volume
• Big data comes in one size: large. All enterprises are awash with data, and can easily amass terabytes and petabytes of information.
• Can systems scale up without degrading performance intolerably?

Velocity
• Frequently time-sensitive, big data should be used as it streams into the enterprise in order to maximise its value to the business.
• How can you calculate mean values across a constantly changing landscape?

Variety
• Big data extends beyond structured data to include unstructured data of all varieties: text, audio, video, click streams, log files and more.
• How do you apply the normal methods of analytics and reporting with unknown structures?
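One possible answer to the Velocity question about mean values over constantly changing data is to update the mean incrementally as each value arrives, rather than re-reading everything. The sketch below is illustrative only and is not taken from the presentation; the names `RunningMean` and `sliding_means` are hypothetical.

```python
from collections import deque


class RunningMean:
    """Arithmetic mean of a stream, updated one observation at a time."""

    def __init__(self) -> None:
        self.count = 0
        self.mean = 0.0

    def update(self, value: float) -> float:
        """Fold one new observation into the mean and return the new mean."""
        self.count += 1
        # Move the mean towards the new value by 1/count of the gap,
        # so no earlier observations need to be stored.
        self.mean += (value - self.mean) / self.count
        return self.mean


def sliding_means(values, window):
    """Yield the mean of the most recent `window` values after each arrival."""
    recent = deque()
    total = 0.0
    for value in values:
        recent.append(value)
        total += value
        if len(recent) > window:
            # Drop the oldest value so the mean tracks recent data only.
            total -= recent.popleft()
        yield total / len(recent)


if __name__ == "__main__":
    rm = RunningMean()
    for reading in [2, 4, 6]:
        rm.update(reading)
    print(rm.mean)
    print(list(sliding_means([1, 2, 3, 4], window=2)))
```

The running mean weights all history equally, while the sliding window forgets old values; which one suits a given stream depends on how quickly the "landscape" is expected to change.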

Page 9: BDA 2012 Big data why the big fuss?

Data volume keeps growing
• The total amount of global data is expected to grow to 2.7 zettabytes during 2012 (up 48% from 2011)*
• Equivalent of every person sending 30 tweets/hour for the next 1200 years!
• Enterprises will manage 50 times more data and files will grow 75 times in the next decade
• 80% of the world’s data is unstructured

* IDC Digital Universe Study 2011
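As a quick sanity check on the growth figure, the 2011 baseline implied by "2.7 zettabytes, up 48%" can be worked out directly. This is simple arithmetic on the two numbers quoted on the slide; the variable names are illustrative and the decimal (SI) definition of a zettabyte is an assumption.

```python
ZETTABYTE_IN_BYTES = 10**21  # assumption: decimal (SI) zettabyte

volume_2012_zb = 2.7  # figure quoted on the slide
growth_rate = 0.48    # "up 48% from 2011"

# Undo the growth to recover the implied 2011 volume.
volume_2011_zb = volume_2012_zb / (1 + growth_rate)
added_in_2012_zb = volume_2012_zb - volume_2011_zb

print(f"Implied 2011 volume: {volume_2011_zb:.2f} ZB")
print(f"Added during 2012:   {added_in_2012_zb:.2f} ZB")
print(f"2012 volume in bytes: {volume_2012_zb * ZETTABYTE_IN_BYTES:.1e}")
```

On these figures the 2011 volume comes out at roughly 1.82 ZB, so about 0.88 ZB would be added in a single year.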

Page 10: BDA 2012 Big data why the big fuss?

Isn’t it all relative?

Page 11: BDA 2012 Big data why the big fuss?

The 7 dimensions of data
• Users
• Devices
• Capacity
• Media
• Advances
• Software
• Automation

Page 12: BDA 2012 Big data why the big fuss?

Users
• Population increase
• Computing demographic

Devices
• Proliferation
• Portability

Capacity
• Miniaturisation
• Reducing costs

Media
• More choice
• Temptation to fill

Advances
• File sizes
• New formats

Software
• Needs more space
• More files

Automation
• Solution fulfilment
• Augmentation

Page 13: BDA 2012 Big data why the big fuss?

Then and now

Dimension | Then | Now
Users | IT in the workplace | Anywhere
Devices | 3270 / Green screen | Fixed and mobile
Capacity | KBs and MBs | PBs, ZBs & YBs
Media | Expensive floppy disks | Cheap cards and sticks
Advances | Dedicated | Multi-purpose
Software | Minimal/business | Complex/everything
Automation | Business processes | What isn’t?

Page 14: BDA 2012 Big data why the big fuss?

Big data is not a new problem…

Page 15: BDA 2012 Big data why the big fuss?

[Figure: Then vs Now, by dimension: Users, Devices, Capacity, Media, Advances, Software, Automation]

Page 16: BDA 2012 Big data why the big fuss?

[Figure: Then vs Now, by dimension: Users, Devices, Capacity, Media, Advances, Software, Automation, Data]

Page 17: BDA 2012 Big data why the big fuss?

It’s all about scale … + the combination

Page 18: BDA 2012 Big data why the big fuss?

Back to basics
• Still all about good Information and Data Management
• Driver = Need to act faster
• Challenge = Joining it all up … and that’s getting harder
• Objective = Remains the same … Information Exploitation

Page 19: BDA 2012 Big data why the big fuss?

The three Vs

Page 20: BDA 2012 Big data why the big fuss?

The fourth V
• What is needed? (VARIETY)
• In what quantity? (VOLUME)
• And by when? (VELOCITY)

VALUE

Page 21: BDA 2012 Big data why the big fuss?

What’s the point of Big Data yielding Little Information?

Page 22: BDA 2012 Big data why the big fuss?

Understand what it is that you need

Page 23: BDA 2012 Big data why the big fuss?

Remember “Garbage in…”

Quality is a key factor:
• Unstructured – Homeland Security may not care
• Structured – poorly calibrated meters = bigger garbage

Faults in the technology and processes produce exaggerated errors
Bad decisions get made faster
It’s all about scale … get the IM basics for ‘little data’ right first
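The point about poorly calibrated meters can be illustrated with a small simulation: random noise averages out as volume grows, but a systematic calibration error does not. This is a hedged sketch under invented assumptions (a true value of 100, a meter that reads 2 units high, Gaussian noise); none of these numbers come from the presentation.

```python
import random

TRUE_VALUE = 100.0  # hypothetical quantity being measured
BIAS = 2.0          # hypothetical calibration error: meter reads 2 units high
NOISE_SD = 1.0      # hypothetical random measurement noise


def simulate_readings(n, bias, seed=0):
    """Return n simulated meter readings with a fixed bias plus random noise."""
    rng = random.Random(seed)
    return [TRUE_VALUE + bias + rng.gauss(0.0, NOISE_SD) for _ in range(n)]


def mean_error(n, bias=BIAS):
    """Difference between the average reading and the true value."""
    readings = simulate_readings(n, bias)
    return sum(readings) / len(readings) - TRUE_VALUE


if __name__ == "__main__":
    for n in (100, 100_000):
        print(f"n={n:>7}: biased meter error = {mean_error(n):+.3f}, "
              f"well calibrated error = {mean_error(n, bias=0.0):+.3f}")
```

With more readings the well calibrated error shrinks towards zero, while the biased meter's error settles at the bias itself: more data makes the wrong answer look more certain, not more correct.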

Page 24: BDA 2012 Big data why the big fuss?

More data isn’t necessarily better

Page 25: BDA 2012 Big data why the big fuss?

The fundamentals
• Data Architecture
• Data Governance
• Master Data Management
• Information Security
• Data Quality
• Metadata Management
• Business Intelligence

Information Management Core Disciplines. Source: DAMA-I

Page 26: BDA 2012 Big data why the big fuss?

Managing Big Data successfully
• Data quality
• Sort out your ‘little data’ first

Page 28: BDA 2012 Big data why the big fuss?
Page 29: BDA 2012 Big data why the big fuss?

Managing Big Data successfully
• Data quality
• Sort out your ‘little data’ first
• Select the right technology solution(s)
• Understand the analytics required:
  - Near real-time
  - Mining deeper than before
• Design optimal presentation channels
• Target the skills you need

• Key/value data stores, e.g. Cassandra
• Columnar/tabular NoSQL data stores, e.g. Hadoop, Hypertable
• MPP appliances, e.g. Greenplum, Netezza
• XML data stores, e.g. CuDB, MarkLogic
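To make the difference between the key/value and columnar categories concrete, the sketch below lays out the same three records both ways using plain Python dictionaries. It is only an illustration of the two access patterns; it does not use or imitate the API of Cassandra, Hadoop or any other product named above, and the sample records are invented.

```python
# Invented sample records for illustration.
records = [
    {"id": "m1", "site": "north", "reading": 10.5},
    {"id": "m2", "site": "south", "reading": 12.0},
    {"id": "m3", "site": "north", "reading": 9.5},
]

# Key/value layout: each whole record is stored and fetched by its key.
key_value = {rec["id"]: rec for rec in records}

# Columnar layout: all values of one field are stored together,
# which suits scans and aggregates over a single field.
columnar = {field: [rec[field] for rec in records] for field in records[0]}

if __name__ == "__main__":
    # Point lookup by key.
    print(key_value["m2"])
    # Aggregate over one column without touching the other fields.
    print(sum(columnar["reading"]) / len(columnar["reading"]))
```

Which layout fits best depends on the analytics required: point lookups favour the first, while scans over one attribute favour the second.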

Page 30: BDA 2012 Big data why the big fuss?

Conclusions
• Keep it all in perspective, most of this is not new
• True value comes from deep understanding of the three Vs
• Remember the fourth V is the bottom line
• More data does not necessarily mean better information or wiser decisions
• Apply data management fundamentals before the technology for Big Data

Page 31: BDA 2012 Big data why the big fuss?

Questions

My blog: Information Management, Life & Petrol
http://infomanagementlifeandpetrol.blogspot.com

@InfoRacer

Tel: +44 1225 475000
email: [email protected]

Page 32: BDA 2012 Big data why the big fuss?
Page 33: BDA 2012 Big data why the big fuss?

Financial Services Opportunities
• Creating actionable intelligence – credit history
• Customer insight
• Fraud detection
• Regulatory compliance

Page 34: BDA 2012 Big data why the big fuss?

Big Data sources
• Key/value Data Stores such as Cassandra
• Columnar/tabular NoSQL Data Stores such as Hadoop & Hypertable
• Massively Parallel Processing Appliances such as Greenplum & Netezza
• XML Data Stores such as CuDB & MarkLogic

Data Federation / Data Virtualisation approaches are stepping up to meet this challenge

Page 35: BDA 2012 Big data why the big fuss?

Don’t forget Data Quality
• Managing the quality of the data is of the utmost importance
• What’s the use of this vast resource if its quality and trustworthiness are questionable?
• Driving your data quality capability up the maturity levels is key

Page 36: BDA 2012 Big data why the big fuss?

Data Quality Maturity Assessment

Level 1 - Initial
Limited awareness within the enterprise of the importance of information quality. Very few, if any, processes in place to measure quality of information. Data is often not trusted by business users.

Level 2 - Repeatable
The quality of a few data sources is measured in an ad hoc manner. A number of different tools are used to measure quality. The activity is driven by projects or departments. Limited understanding of good versus bad quality. Identified issues are not consistently managed.

Level 3 - Defined
Quality measures have been defined for some key data sources. Specific tools adopted to measure quality with some standards in place. The processes for measuring quality are applied at consistent intervals. Data issues are addressed where critical.

Level 4 - Managed
Data quality is measured for all key data sources on a regular basis. Quality metrics information is published via dashboards etc. Active management of data issues through the data ownership model ensures issues are often resolved. Quality considerations baked into the SDLC.

Level 5 - Optimised
The measurement of data quality is embedded in many business processes across the enterprise. Data quality issues addressed through the data ownership model. Data quality issues fed back to be fixed at source.
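The maturity levels above turn on whether data quality is actually measured. As a minimal sketch of what one such measure might look like, the function below computes completeness: the share of required field values that are present. The function name, field names and sample records are hypothetical and not drawn from the presentation; real assessments would typically add further measures such as validity and consistency.

```python
def completeness(records, required_fields):
    """Return the share of required field values that are present.

    A value counts as missing when the field is absent, None, or an
    empty string. Returns 1.0 when there is nothing to check.
    """
    total = len(records) * len(required_fields)
    if total == 0:
        return 1.0
    present = sum(
        1
        for record in records
        for field in required_fields
        if record.get(field) not in (None, "")
    )
    return present / total


if __name__ == "__main__":
    # Invented customer records with some gaps.
    customers = [
        {"id": 1, "name": "A. Smith"},
        {"id": 2, "name": ""},
        {"id": 3},
    ]
    score = completeness(customers, ["id", "name"])
    print(f"Completeness: {score:.0%}")
```

Run at consistent intervals and published on a dashboard, a figure like this is the kind of metric the Defined and Managed levels describe.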