data streaming algorithms - university of massachusetts ... › cs590d › lectures › ... ·...

Data Streaming Algorithms Barna Saha

Post on 30-May-2020


Data Streaming Algorithms

Barna Saha


Developing Streaming Algorithms

• The main hurdle is the space.

• Often it is much more efficient to get an approximate answer than an exact answer.

• Often the algorithm uses randomization like hashing and sampling.
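
As one illustration of sampling under a space limit, the sketch below uses reservoir sampling, a standard technique that is not named on the slide and is chosen here only as an example. It keeps a uniform random sample of k items from a stream whose length is not known in advance, storing only k items. The names reservoir_sample, stream, k, and rng are assumptions made for this example.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Return a uniform random sample of k items from an iterable.

    Illustrative sketch (Algorithm R): only k items are held in memory,
    regardless of how long the stream turns out to be.
    """
    rng = rng or random.Random()
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Pick a slot uniformly from 0..i (inclusive); the new item
            # replaces a stored one with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = item
    return reservoir
```

For instance, reservoir_sample(range(1_000_000), 10) returns 10 elements while never storing more than 10 at a time. If the stream has fewer than k items, all of them are returned.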


Mini Exercise [Due Oct 31st]

• Implement Count-Min Sketch and plot the frequency of all elements as reported by the Count-Min Sketch data structure as well as their true frequencies, using ε = 0.01 and number of hash functions = 25.
  – Data: consider a stream of size 1,000,000 where each element in the stream arrives from [1, 1000] chosen uniformly at random.
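
One possible starting point for the exercise is sketched below; it is a minimal illustration under stated assumptions, not a reference solution. It assumes the commonly used table width of ⌈e/ε⌉ counters per row (272 for ε = 0.01) and hash functions of the form ((a·x + b) mod p) mod width with random a and b; the slide does not specify either choice. The names CountMinSketch, run_experiment, and all parameters are hypothetical. Plotting is left out so the block runs with the standard library alone.

```python
import math
import random
from collections import Counter


class CountMinSketch:
    # Assumption: a Mersenne prime far larger than any item in [1, 1000].
    PRIME = 2**31 - 1

    def __init__(self, epsilon, depth, seed=0):
        # Assumption: width = ceil(e / epsilon), the usual Count-Min choice.
        self.width = math.ceil(math.e / epsilon)
        self.depth = depth
        rng = random.Random(seed)
        # One (a, b) pair per row defines that row's hash function.
        self.params = [
            (rng.randrange(1, self.PRIME), rng.randrange(0, self.PRIME))
            for _ in range(depth)
        ]
        self.table = [[0] * self.width for _ in range(depth)]

    def _bucket(self, row, item):
        a, b = self.params[row]
        return ((a * item + b) % self.PRIME) % self.width

    def add(self, item, count=1):
        # Increment one counter in every row.
        for row in range(self.depth):
            self.table[row][self._bucket(row, item)] += count

    def estimate(self, item):
        # Collisions only inflate counters, so the minimum is the best guess.
        return min(
            self.table[row][self._bucket(row, item)]
            for row in range(self.depth)
        )


def run_experiment(stream_size=1_000_000, universe=1000,
                   epsilon=0.01, depth=25, seed=0):
    """Feed a uniform random stream to the sketch; return true and estimated counts."""
    rng = random.Random(seed + 1)  # separate seed from the hash parameters
    sketch = CountMinSketch(epsilon, depth, seed)
    true_counts = Counter()
    for _ in range(stream_size):
        x = rng.randint(1, universe)  # uniform over [1, universe]
        sketch.add(x)
        true_counts[x] += 1
    estimates = {x: sketch.estimate(x) for x in range(1, universe + 1)}
    return true_counts, estimates
```

Calling run_experiment() with its defaults matches the exercise parameters, though in pure Python the full 1,000,000-item run may take a while; a smaller stream_size is useful for a quick check. The two returned mappings can be handed to a plotting library of choice to compare estimated and true frequencies. Because every update is a positive increment, each estimate is at least the true count; the standard analysis bounds the overestimate by ε times the stream length with high probability.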