PyData: Past, Present, Future (PyData SV 2014 Keynote)

PyData: Past, Present, Future Peter Wang @pwang Continuum Analytics PyData SV 2014

Upload: peter-wang

Post on 11-Aug-2014

DESCRIPTION

From the closing keynote: a look back at the last two years of PyData, a discussion of Python's role in the growing and changing data analytics landscape, and encouragement of ways to grow the community.

TRANSCRIPT

Page 1: PyData: Past, Present Future (PyData SV 2014 Keynote)

PyData: Past, Present, Future

Peter Wang @pwang

Continuum Analytics

PyData SV 2014

Page 2: PyData: Past, Present Future (PyData SV 2014 Keynote)

How did we get here?

Page 3: PyData: Past, Present Future (PyData SV 2014 Keynote)

“Python Data Workshop” March 3, 2012, Google HQ

Page 4: PyData: Past, Present Future (PyData SV 2014 Keynote)
Page 5: PyData: Past, Present Future (PyData SV 2014 Keynote)
Page 6: PyData: Past, Present Future (PyData SV 2014 Keynote)
Page 7: PyData: Past, Present Future (PyData SV 2014 Keynote)
Page 8: PyData: Past, Present Future (PyData SV 2014 Keynote)
Page 11: PyData: Past, Present Future (PyData SV 2014 Keynote)

“Guido, please help us convince core dev to work with us to solve the packaging problem!”

“Meh. Feel free to solve it yourselves.”

Page 24: PyData: Past, Present Future (PyData SV 2014 Keynote)

“What Packaging Problem?”
“I just use….”

• pip & virtualenv
• homebrew
• rpm
• apt-get
• emerge
• tar -zxf
• double-click MSI
• configure ; make ; make install
• export PYTHONPATH=…

from python import technical_debt

Page 25: PyData: Past, Present Future (PyData SV 2014 Keynote)

This Packaging Problem

Page 30: PyData: Past, Present Future (PyData SV 2014 Keynote)

PyData: The First 2 Years

• Oct 2012: First PyData Conf, NYC

• March 2013: PyData SV (PyCon)
• July 2013: PyData Boston (Microsoft)
• Oct 2013: PyData NYC (JP Morgan)

• Feb 2014: PyData UK (Level39)
• May 2014: PyData SV (Facebook)
• July 2014: PyData Berlin (EuroPython)
• October 2014: NYC (Strata NYC)

• October 2014: NYC (YOUR COMPANY HERE)

Page 32: PyData: Past, Present Future (PyData SV 2014 Keynote)

PyData: The First 10 Years

• IPython Notebook: 2005-2011
• pandas: 2008-2009
• scikit-learn: 2007
• NumPy: 2006

Page 34: PyData: Past, Present Future (PyData SV 2014 Keynote)

PyData: The First 15 Years

• IPython Notebook: 2005-2011
• pandas: 2008-2009
• scikit-learn: 2007
• NumPy: 2006
• SciPy: 1999
• IPython: 2001
• matplotlib: 2002

http://numfocus.org/johnhunter.html

Page 35: PyData: Past, Present Future (PyData SV 2014 Keynote)

PyData: The First 20 Years

• Numarray: 2001
• Numeric: 1995
• Matrix Obj: 1994

• IPython Notebook: 2005-2011
• pandas: 2008-2009
• scikit-learn: 2007
• NumPy: 2006
• IPython: 2001
• matplotlib: 2002

Page 42: PyData: Past, Present Future (PyData SV 2014 Keynote)

Way Way Back

• Python: 1989-1991
• v1.0: 1994
• “ABC, SETL… …That would appeal to UNIX/C hackers”

$ conda create -n py10 python=1.0

http://continuum.io/blog/python-1.0

Page 43: PyData: Past, Present Future (PyData SV 2014 Keynote)

Way Way Back

It is interactive, structured, high-level, and intended to be used instead of BASIC, Pascal, or AWK.

It is not meant to be a systems-programming language but is intended for teaching or prototyping.

Page 44: PyData: Past, Present Future (PyData SV 2014 Keynote)
Page 45: PyData: Past, Present Future (PyData SV 2014 Keynote)
Page 46: PyData: Past, Present Future (PyData SV 2014 Keynote)
Page 47: PyData: Past, Present Future (PyData SV 2014 Keynote)
Page 48: PyData: Past, Present Future (PyData SV 2014 Keynote)

“In June [1960] we were introduced to this tall college kid that always signed his name with lowercase letters. He was don knuth … don claimed that he could write the [Algol] compiler and a language manual all by himself during his three and a half month summer vacation.”

Page 49: PyData: Past, Present Future (PyData SV 2014 Keynote)

PyData NYC 2013 Keynote

Page 52: PyData: Past, Present Future (PyData SV 2014 Keynote)

http://tuulos.github.io/sf-python-meetup-sep-2013/#/

“One of the most exciting features in development is the Numba-based UDF compiler. Building UDFs for Impala currently requires writing C++ or Java code and registering them manually with the cluster. Writing C++/Java code is more difficult, time-consuming, and error-prone for many data analysts.”

http://blog.cloudera.com/blog/2014/04/a-new-python-client-for-impala/

Page 53: PyData: Past, Present Future (PyData SV 2014 Keynote)

http://grokbase.com/t/python/python-list/01az9hmtf1/python-development-practices

Page 55: PyData: Past, Present Future (PyData SV 2014 Keynote)

Glue 2.0

Python’s legacy as a powerful glue language:
• manipulate files
• call fast libraries

Next-gen Glue:
• Link data silos
• Link disjoint memory & compute
• Unify disparate runtime models
• Transcend legacy models of computers

Page 58: PyData: Past, Present Future (PyData SV 2014 Keynote)

Hard Problems in Data Science

Lots of data
Messy data
Noisy data

Lots of computers
Lots of tools
Lots of hacking

More questions
More data
More people

Page 61: PyData: Past, Present Future (PyData SV 2014 Keynote)

The Hype & The Opportunity

“Internet Revolution” True Believer, 1996: Businesses that build network capability into their core will outcompete and destroy their competition.

“Data Revolution” True Believer, 2014: Businesses that build data comprehension into their core will destroy their competition over the next 5-15 years.

(1993 == 2011?)

Page 65: PyData: Past, Present Future (PyData SV 2014 Keynote)

Soft Problems in Data Science

Computers

EE

Applications

CS

DATA

Insights

Math, Stats

Page 66: PyData: Past, Present Future (PyData SV 2014 Keynote)

Computers

Applications

Data

Insights

Page 68: PyData: Past, Present Future (PyData SV 2014 Keynote)

Computers

DATA

Applications

Data Scientist

Page 69: PyData: Past, Present Future (PyData SV 2014 Keynote)

2013 Data Science Salary Survey

http://www.oreilly.com/data/free/stratasurvey.csp

Page 70: PyData: Past, Present Future (PyData SV 2014 Keynote)
Page 71: PyData: Past, Present Future (PyData SV 2014 Keynote)

“Python is the second best language…”

...Because it blurs the lines between “user” and “maker”.

We stand on the shoulders of Users who became Makers.

Some people say: “R has a very strong user community.”

I want people to say that “Python has a strong maker community.”

Page 75: PyData: Past, Present Future (PyData SV 2014 Keynote)

Standing Tall

• Science: Standing on the shoulders of giants

• Programming: Standing on each other’s toes

• But in Python, we stand on each other’s shoulders: a community that bootstraps itself

Page 76: PyData: Past, Present Future (PyData SV 2014 Keynote)

“For there is but one veritable problem - the problem of human relations…”

— Antoine de Saint-Exupéry

Page 77: PyData: Past, Present Future (PyData SV 2014 Keynote)

https://archive.org/details/Scipy2010-PeterWang-PythonEvangelism101

Page 78: PyData: Past, Present Future (PyData SV 2014 Keynote)
Page 79: PyData: Past, Present Future (PyData SV 2014 Keynote)
Page 80: PyData: Past, Present Future (PyData SV 2014 Keynote)
Page 81: PyData: Past, Present Future (PyData SV 2014 Keynote)
Page 82: PyData: Past, Present Future (PyData SV 2014 Keynote)
Page 83: PyData: Past, Present Future (PyData SV 2014 Keynote)
Page 84: PyData: Past, Present Future (PyData SV 2014 Keynote)
Page 85: PyData: Past, Present Future (PyData SV 2014 Keynote)

Participate

• Submit issues and pull requests
• Represent for the tools you love in social media conversations
• Start PyData meetups
• Come to PyData conferences and present
• Encourage diversity!

Page 86: PyData: Past, Present Future (PyData SV 2014 Keynote)

How did we get here?

• Hard Work
• By a community of people
• Who cared
• About code and people

Page 87: PyData: Past, Present Future (PyData SV 2014 Keynote)

Where do we go from here?

• More hard work
• More community
• More caring
• More code
• More people

Python is not just glue. Python and PyData are communities!
