TOOLS PRESENTATION Eddie Aronovich eddiea@cs.tau.ac.il


Post on 19-Jan-2016


Page 1: Eddie Aronovich eddiea@cs.tau.ac.il. “command line” input; Files; Web crawling (pull); Web sensors (using API - push)

TOOLS PRESENTATION

Eddie Aronovich eddiea@cs.tau.ac.il

Page 2:

ONCE UPON A TIME

Page 3:

“EVOLUTION OF THE INPUT”

“command line” input

Files

Web crawling (pull)

Web sensors (using API - push)
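The last two stages differ mainly in who initiates the transfer. The following is a minimal sketch of that difference, not code from the slides: `pull_readings` and `Sensor` are hypothetical names, and the in-memory values stand in for a real web source. In the pull style the consumer asks for data when it wants it; in the push style the source delivers data to whoever subscribed.

```python
from typing import Callable, Iterator, List


def pull_readings(source: Iterator[int], count: int) -> List[int]:
    """Pull: the consumer polls the source, up to `count` times."""
    readings = []
    for _ in range(count):
        try:
            readings.append(next(source))
        except StopIteration:
            break  # the source has nothing more to offer
    return readings


class Sensor:
    """Push: the source calls every registered listener when data arrives."""

    def __init__(self) -> None:
        self._listeners: List[Callable[[int], None]] = []

    def subscribe(self, listener: Callable[[int], None]) -> None:
        self._listeners.append(listener)

    def emit(self, value: int) -> None:
        for listener in self._listeners:
            listener(value)


# Pull: we decide to ask twice, so we get two values.
pulled = pull_readings(iter([1, 2, 3]), count=2)

# Push: the sensor decides when values arrive; we only register a callback.
pushed: List[int] = []
sensor = Sensor()
sensor.subscribe(pushed.append)
sensor.emit(10)
sensor.emit(20)
```

With pull, the consumer controls the rate but may poll when nothing is new; with push, data arrives as soon as the source has it, at the cost of having to handle it whenever it comes.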

Page 6:

PYTHON CODE FOR JSON FORMAT

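    import json
    from pprint import pprint

    # Open the file named 'json_data'; the with-block closes it automatically.
    with open('json_data') as json_data:
        data = json.load(json_data)  # parse the JSON text into Python objects

    pprint(data)  # pretty-print the parsed structure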
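To see what the parsed result looks like without needing a file on disk, here is a small sketch that parses a string instead. The sample document in `raw` is made up for illustration; `json.loads` and `json.dumps` are the string counterparts of `json.load` and `json.dump`.

```python
import json

# Hypothetical sample document, used only for illustration.
raw = '{"sensor": "s1", "readings": [1, 2.5, null], "active": true}'

data = json.loads(raw)

# JSON types map onto Python types:
#   object -> dict, array -> list, null -> None, true/false -> True/False
readings = data["readings"]
is_active = data["active"]

# Serializing and parsing again yields an equal structure.
round_trip = json.loads(json.dumps(data))
```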

Page 7:

WEB CRAWLING

wget + parser (html2txt)

ETL (Extract, Transform, Load)

Structured vs. Unstructured data
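The three points above can be sketched together as one small pipeline. This is an illustration under stated assumptions, not the tooling from the talk: the HTML is an inline sample standing in for a page already fetched (for example by wget), `TextExtractor`, `html_to_text`, `transform` and `load` are hypothetical names, the standard-library `html.parser` plays the role of the HTML-to-text step, and an in-memory SQLite table is the load target.

```python
import sqlite3
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> content."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def html_to_text(html):
    """Unstructured HTML in, plain text out."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def transform(url, html):
    """Turn one raw page into a structured row: (url, body, word_count)."""
    text = html_to_text(html)
    return (url, text, len(text.split()))


def load(rows):
    """Load the rows into an in-memory SQLite table and return the connection."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pages (url TEXT PRIMARY KEY, body TEXT, word_count INTEGER)"
    )
    conn.executemany("INSERT INTO pages VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


# Extract: sample data standing in for downloaded pages (assumption).
pages = {
    "http://example.com/a": (
        "<html><head><style>p {color: red}</style></head>"
        "<body><h1>Hello</h1><p>World</p></body></html>"
    ),
}

rows = [transform(url, html) for url, html in pages.items()]
conn = load(rows)
result = conn.execute("SELECT url, body, word_count FROM pages").fetchall()
```

The raw page is unstructured: markup, styling and text are mixed together. After the transform step each page is a row with named, typed columns, which is what makes it queryable.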

Page 9:

OVERVIEW

Collect Data (and extract it)

Analyze Data

Build a model

Run the model

Collect more data
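The steps above form a loop rather than a straight line. A toy sketch of that loop follows; everything in it is assumed for illustration: `collect` returns fixed batches in place of a real data source, and the "model" simply predicts the mean of all data seen so far.

```python
from statistics import mean
from typing import Callable, Dict, List


def collect(batch_index: int) -> List[float]:
    """Hypothetical data source: returns one fixed batch per iteration."""
    batches = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]
    return batches[batch_index]


def analyze(data: List[float]) -> Dict[str, float]:
    """Summarize the data collected so far."""
    return {"n": len(data), "mean": mean(data)}


def build_model(summary: Dict[str, float]) -> Callable[[], float]:
    """Toy model: always predict the mean observed so far."""
    return lambda: summary["mean"]


data: List[float] = []
predictions: List[float] = []

for i in range(3):
    data.extend(collect(i))       # collect data (and, on later passes, more data)
    summary = analyze(data)       # analyze data
    model = build_model(summary)  # build a model
    predictions.append(model())   # run the model
```

Each pass rebuilds the model from a larger data set, so the prediction shifts as more data comes in.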

Page 10: