Access 2011: Big Data in Libraries


Upload: robotninja

Post on 29-Jun-2015


Page 1: Access 2011: Big Data in Libraries

BIG DATA

Page 2: Access 2011: Big Data in Libraries
Page 3: Access 2011: Big Data in Libraries

"datasets that grow so large that they become difficult to work with using relational databases and within a tolerable elapsed time"

Page 4: Access 2011: Big Data in Libraries

BIG DATA IS BIG

Page 5: Access 2011: Big Data in Libraries

LIKE, REALLY BIG

Page 6: Access 2011: Big Data in Libraries

FACEBOOK: 140 BILLION PHOTOS

HUMAN GENOME: 3 BILLION BASE PAIRS

GOOGLE: 50 BILLION WEB PAGES

WORLDCAT: 1.5 BILLION ITEM RECORDS

Page 7: Access 2011: Big Data in Libraries

NOT REALLY

Page 8: Access 2011: Big Data in Libraries

EUROPEANA: 20 MILLION (715K / COUNTRY)

LIBRARY OF CONGRESS: 1.9 MILLION

CANADIANA: 1 MILLION

LIBRARY AND ARCHIVES CANADA: 3.5 MILLION (ARCHIVAL DESCRIPTIONS)

Page 9: Access 2011: Big Data in Libraries

BIG DATA IS COMPLICATED

Page 10: Access 2011: Big Data in Libraries

1966

Page 11: Access 2011: Big Data in Libraries

1976

Page 12: Access 2011: Big Data in Libraries

Page 13: Access 2011: Big Data in Libraries

Page 14: Access 2011: Big Data in Libraries

NOT REALLY

Page 15: Access 2011: Big Data in Libraries

ಠ_ಠ

Page 16: Access 2011: Big Data in Libraries
Page 17: Access 2011: Big Data in Libraries
Page 18: Access 2011: Big Data in Libraries

SCALABILITY

Page 19: Access 2011: Big Data in Libraries

● ICA-AtoM (LAMP)

● BENCHMARK 3.5M RECORDS (current largest: < 100K)

● 100% OPEN SOURCE SOFTWARE

● COMMODITY HARDWARE

Page 20: Access 2011: Big Data in Libraries
Page 21: Access 2011: Big Data in Libraries

CAN WE DO IT?

Page 22: Access 2011: Big Data in Libraries

WRITE SPEED

Page 23: Access 2011: Big Data in Libraries

READ SPEED

Page 24: Access 2011: Big Data in Libraries

WRITE MEMORY

Page 25: Access 2011: Big Data in Libraries

READ MEMORY

Page 26: Access 2011: Big Data in Libraries

NOSQL vs. SQL (a.k.a. ODM vs. ORM)

● 4x - 10x FASTER

● 50% - 90% LESS MEMORY

Page 27: Access 2011: Big Data in Libraries
Page 28: Access 2011: Big Data in Libraries
Page 29: Access 2011: Big Data in Libraries

RELATIONAL DATABASES SCALE WELL IF YOUR DATA IS NOT HIERARCHICAL

SOLR SCALES WELL IF YOU HAVE INFINITE RAM

BEWARE THE DOGMA OF SQL

NOSQL IS A VIABLE OPTION

THINK SIDEWAYS: SCALE OUT →

Page 30: Access 2011: Big Data in Libraries
Page 31: Access 2011: Big Data in Libraries

THE CLOUD IS A LIE

Page 32: Access 2011: Big Data in Libraries

“big data is less about size, and more about freedom”

open source tools + distributed design = new opportunities