sparking pandas: an experiment

SPARKING PANDAS: AN EXPERIMENT

PyConOtto - Florence '17

Francesco Bruni

brunifrancesco

WHO I AM

MSc in Telecommunication Engineering

Functional pythonista

Currently working with geo data

OUTLINE

Why Sparking Pandas

Functional data processing pipelines

A real world application

Conclusions

WHY SPARKING PANDAS

What if your data don't fit into memory?

APACHE SPARK: THE COMPONENTS

APACHE SPARK: THE ARCHITECTURE

FUNCTIONAL DATA PROCESSING PIPELINES

Higher-order functions

Immutable data

Lazy evaluation
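The three ideas above can be sketched in plain Python. This is only an illustrative example, not code from the talk: the `readings` source, the `pipeline` helper and the two step functions are hypothetical names, and no Spark API is involved.

```python
from functools import reduce
from itertools import islice


def readings():
    """Lazily yield immutable (sensor_id, value) tuples, one at a time (hypothetical data)."""
    n = 0
    while True:  # an endless source: it can never be materialised in memory
        yield (f"s{n % 3}", float(n))
        n += 1


def pipeline(source, *steps):
    """Higher-order function: takes step functions and applies them in order."""
    return reduce(lambda rows, step: step(rows), steps, source)


def keep_even(rows):
    """Lazy filter: keeps rows whose value is even."""
    return filter(lambda row: row[1] % 2 == 0, rows)


def scale(rows):
    """Lazy map: builds new tuples instead of mutating the input rows."""
    return map(lambda row: (row[0], row[1] * 10), rows)


lazy = pipeline(readings(), keep_even, scale)  # nothing has been computed yet
first_three = tuple(islice(lazy, 3))           # work happens only here
print(first_three)
```

Building `lazy` does no work; rows are pulled through the filter and the map only when `islice` asks for them, which is why the endless source is not a problem.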

THE EXPERIMENT

The scenario

Containerized application

THE SCENARIO

CONTAINERIZED APPLICATION

Containerized components

Constrained memory nodes

docker-composed ecosystem

HANDS ON CODE

Apache Spark basics

Linear regression

Near real time processing with Apache Kafka
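As a rough illustration of why linear regression suits a distributed engine, the sketch below fits a line by summarising each partition independently and then merging the summaries. It is plain Python under stated assumptions, not the talk's code and not the Spark MLlib API: `partition_stats`, `merge`, `fit` and the sample `partitions` are hypothetical names and data.

```python
from functools import reduce


def partition_stats(points):
    """Map step: summarise one partition as (n, sum_x, sum_y, sum_xy, sum_xx)."""
    n = sx = sy = sxy = sxx = 0.0
    for x, y in points:
        n += 1
        sx += x
        sy += y
        sxy += x * y
        sxx += x * x
    return (n, sx, sy, sxy, sxx)


def merge(a, b):
    """Reduce step: summaries from two partitions combine by element-wise addition."""
    return tuple(u + v for u, v in zip(a, b))


def fit(partitions):
    """Ordinary least squares for y = slope * x + intercept from merged summaries."""
    n, sx, sy, sxy, sxx = reduce(merge, map(partition_stats, partitions))
    slope = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    intercept = (sy - slope * sx) / n
    return slope, intercept


# Hypothetical data: two "partitions" of points lying on y = 2x + 1.
partitions = [[(0, 1), (1, 3)], [(2, 5), (3, 7)]]
slope, intercept = fit(partitions)
print(slope, intercept)
```

Each partition is reduced to five numbers, so a worker never needs more than its own slice of the data in memory and only the small summaries travel between nodes.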

CONCLUSIONS

Complex structure

Worth the effort with a lot of data

Worker nodes should be distributed

Keep exploring :)

QUESTIONS?

brunifrancesco

https://github.com/brunifrancesco/docker-spark
