Collecting Uncertain Data the Reactive Way
Jeff Smith @jeffksmithjr
x.ai is a personal assistant who schedules meetings for you
Reactive Machine Learning
Machine Learning Systems
Traits of Reactive Systems
Reactive Strategies
Reactive Machine Learning
Collecting Data
What’s for dinner?
Reactive Data Collection
Modeling Uncertain Data
Certain Data Model
case class ZebraReading(sensorId: Int, locationId: Int, timestamp: Long, count: Int)
Uncertainty Interval
27 to 33
Uncertain Data Model
case class PreyReading(sensorId: Int, locationId: Int, timestamp: Long, animalsLowerBound: Double, animalsUpperBound: Double, percentZebras: Double)
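One way to work with a reading like this is to carry the bounds through every derived quantity instead of collapsing them to a single number. The sketch below is an illustration, not code from the talk: `Interval` and `UncertainReadings.zebraInterval` are hypothetical names, and it assumes `percentZebras` is a fraction between 0 and 1 that can simply be applied to both bounds.

```scala
// Same shape as the PreyReading case class on the slide.
case class PreyReading(sensorId: Int, locationId: Int, timestamp: Long,
                       animalsLowerBound: Double, animalsUpperBound: Double,
                       percentZebras: Double)

// Hypothetical helper type: a closed interval [lower, upper].
case class Interval(lower: Double, upper: Double) {
  require(lower <= upper, "lower bound must not exceed upper bound")
  def width: Double = upper - lower
  def contains(x: Double): Boolean = lower <= x && x <= upper
}

object UncertainReadings {
  // Assumption: percentZebras is treated as exact and scales both bounds.
  def zebraInterval(r: PreyReading): Interval =
    Interval(r.animalsLowerBound * r.percentZebras,
             r.animalsUpperBound * r.percentZebras)
}
```

A caller would ask for `UncertainReadings.zebraInterval(reading)` and keep the resulting interval, so the width of the uncertainty stays visible downstream.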
Scaling Data Collection
Simple Data Architecture
Mutable State
case class Region(id: Int)
import collection.mutable.HashMap
var densities = new HashMap[Region, Double]()
densities.put(Region(4), 52.4)
Scaling with Queues
Out of Order Updates
densities.put(Region(6), 73.6)
densities.put(Region(6), 0.5)
densities.get(Region(6)).get // returns 0.5
// The same two updates, arriving in the opposite order
densities.put(Region(6), 0.5)
densities.put(Region(6), 73.6)
densities.get(Region(6)).get // returns 73.6
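The order dependence can be made explicit by replaying the same updates in both orders against a fresh map. This is a minimal sketch rather than code from the talk; `OutOfOrderUpdates.replay` is a hypothetical helper name.

```scala
import scala.collection.mutable

case class Region(id: Int)

object OutOfOrderUpdates {
  // Applies each update as a blind overwrite, in arrival order,
  // and returns an immutable snapshot of the final state.
  def replay(updates: Seq[(Region, Double)]): Map[Region, Double] = {
    val densities = mutable.HashMap.empty[Region, Double]
    updates.foreach { case (region, density) => densities.put(region, density) }
    densities.toMap
  }
}
```

Replaying `Seq(Region(6) -> 73.6, Region(6) -> 0.5)` and its reverse leaves different values for `Region(6)`, because the map only remembers whichever write arrived last.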
Concurrent Collections
import collection.mutable._
var synchronizedDensities = new LinkedHashMap[Region, Double]() with SynchronizedMap[Region, Double]
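`SynchronizedMap` has been deprecated since Scala 2.11 and is not part of the 2.13 collections, so on a newer compiler the slide code needs a substitute. A possible stand-in, shown here as a sketch and not as the talk's approach, is `scala.collection.concurrent.TrieMap`; the `ConcurrentDensities` object and its method names are hypothetical.

```scala
import scala.collection.concurrent.TrieMap

case class Region(id: Int)

object ConcurrentDensities {
  // Thread-safe map from the standard library; no external locking needed.
  val densities: TrieMap[Region, Double] = TrieMap.empty[Region, Double]

  // Overwrites any existing value and returns the previous one, if any.
  def record(region: Region, density: Double): Option[Double] =
    densities.put(region, density)

  // Atomically inserts only when the key is missing;
  // returns the value already present, if any.
  def recordIfAbsent(region: Region, density: Double): Option[Double] =
    densities.putIfAbsent(region, density)
}
```

Thread safety of this kind protects the map's internal structure. It does not by itself decide which of two competing updates should win.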
Scaling with Locks
Immutable Facts
import play.api.libs.json.Json
case class PreyReading(sensorId: Int, locationId: Int, timestamp: Long, animalsLowerBound: Double, animalsUpperBound: Double, percentZebras: Double)
implicit val preyReadingFormatter = Json.format[PreyReading]
val reading = PreyReading(36, 12, System.currentTimeMillis(), 12.0, 18.0, 0.60)
// readingId is a helper that is not shown on the slides
val setDoc = bucket.set[PreyReading](readingId(reading), reading)
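The idea of storing readings as immutable facts can be tried without a database. The sketch below uses an in-memory `Vector` as a stand-in for the bucket (an assumption for illustration only) and derives the current view from the facts rather than overwriting state; `ImmutableFacts`, `append`, and `latestBySensor` are hypothetical names.

```scala
case class PreyReading(sensorId: Int, locationId: Int, timestamp: Long,
                       animalsLowerBound: Double, animalsUpperBound: Double,
                       percentZebras: Double)

object ImmutableFacts {
  // Facts are only ever appended; nothing is updated in place.
  def append(log: Vector[PreyReading], reading: PreyReading): Vector[PreyReading] =
    log :+ reading

  // Derived view: the most recent reading per sensor, chosen by the
  // timestamp carried in the fact rather than by arrival order.
  def latestBySensor(log: Seq[PreyReading]): Map[Int, PreyReading] =
    log.groupBy(_.sensorId).map { case (sensorId, readings) =>
      sensorId -> readings.maxBy(_.timestamp)
    }
}
```

Because every fact carries its own timestamp, appending the same readings in a different order yields the same derived view, provided no two readings from one sensor share a timestamp.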
Scaling with Distributed Databases
Handling Incomplete Data
Distributed Data Storage
Querying Complete Data
bucket.searchValues[PreyReading]("prey", "by_sensor_id")(new Query().setIncludeDocs(true))
  .enumerate
  .apply(Iteratee.foreach { doc => println(s"Prey Reading: $doc") })
Complete Data
Partition Tolerance
Querying Incomplete Data
bucket.searchValues[PreyReading]("prey", "by_sensor_id")(new Query().setIncludeDocs(true))
  .enumerate
  .apply(Iteratee.foreach { doc => println(s"Prey Reading: $doc") })
Incomplete Data
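When some partitions do not answer, a query result can state how complete it is instead of looking like a full answer. The following is a hypothetical sketch, not the talk's code and not a database API: `PartialResult` and `IncompleteData.collect` are invented names, and each partition's response is modeled as an `Option`, with `None` standing for a partition that did not reply.

```scala
// A query result that records how many partitions contributed to it.
case class PartialResult[A](values: Seq[A], partitionsAnswered: Int, partitionsTotal: Int) {
  require(partitionsTotal > 0, "at least one partition is expected")
  require(partitionsAnswered >= 0 && partitionsAnswered <= partitionsTotal)

  // Fraction of partitions that responded, between 0.0 and 1.0.
  def completeness: Double = partitionsAnswered.toDouble / partitionsTotal
  def isComplete: Boolean = partitionsAnswered == partitionsTotal
}

object IncompleteData {
  // Merges per-partition responses; None marks an unreachable partition.
  def collect[A](responses: Seq[Option[Seq[A]]]): PartialResult[A] =
    PartialResult(
      values = responses.flatten.flatten,
      partitionsAnswered = responses.count(_.isDefined),
      partitionsTotal = responses.size
    )
}
```

A consumer can then check `isComplete` or `completeness` before treating the values as the whole picture.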
Reactive Data Collection
For Later
reactivemachinelearning.com medium.com/data-engineering
Manning
Jeff Smith
x.ai @xdotai New York, New York
skillsmatter.com/conferences/6862-scala-exchange-2015#skillscasts
Thank You