Get Real About Big Data

The Team Jim Blomo (@JimBlomo) Big Data @ Yelp Amazon, Pbworks Lecturer @ UC Berkeley He likes Distributed systems, startups, fitness, and whatever else you've got. Dave Mariani (@Dmariani, Klout 144) Big Data @ Yahoo! Blue Lithium, MindeShare Klout: 30B calls/month, Yahoo!: 20TB/Day across multiple 4,000 node Hadoop clusters

Upload: bruno-aziza

Post on 17-Nov-2014

Category:

Technology


DESCRIPTION

This presentation was used at the Big Data Day at the Computer History Museum on June 7.

TRANSCRIPT

Page 1: Get Real About Big Data

The Team

• Jim Blomo (@JimBlomo)
– Big Data @ Yelp
– Amazon, Pbworks
– Lecturer @ UC Berkeley
– He likes distributed systems, startups, fitness, and whatever else you've got.

• Dave Mariani (@Dmariani, Klout 144)
– Big Data @ Yahoo!
– Blue Lithium, MindeShare
– Klout: 30B calls/month, Yahoo!: 20TB/Day across multiple 4,000 node Hadoop clusters

Page 5: Get Real About Big Data

www.crunchbase.sisense.com

Page 6: Get Real About Big Data

@SiSense

Page 7: Get Real About Big Data

1

DATA SKILLS

Page 8: Get Real About Big Data

Little Bit of That…

Page 9: Get Real About Big Data

Little Bit of This…

Page 10: Get Real About Big Data

Favorite Data Scientist Hire?

Page 11: Get Real About Big Data

Source: Drew Conway

Little Bit of This…

Page 12: Get Real About Big Data

Little Bit of This…

http://bit.ly/dssurvey

Page 13: Get Real About Big Data

2

STARTING FROM SCRATCH?

Page 14: Get Real About Big Data

Starting from SCRATCH?

Page 15: Get Real About Big Data

Starting from CRAP?

Page 17: Get Real About Big Data

3

SAME DIFF?

Page 18: Get Real About Big Data

WHAT’S WHAT?

Page 19: Get Real About Big Data

Big different from…?

Page 21: Get Real About Big Data

4

FEATURE FEST?

Page 22: Get Real About Big Data

What feature…?

Page 23: Get Real About Big Data

What feature…?

Page 26: Get Real About Big Data

TRY IT @ WWW.SISENSE.COM