PyCon APAC 2016 Keynote

Saturday Morning Keynote Wes McKinney @wesmckinn PyCon APAC 2016 (Seoul)

Upload: wes-mckinney

Post on 23-Jan-2018


TRANSCRIPT

Page 1: PyCon APAC 2016 Keynote

Saturday Morning Keynote
Wes McKinney @wesmckinn

PyCon APAC 2016 (Seoul)

Page 2: PyCon APAC 2016 Keynote

Me

DataPad

Apache Arrow

Feather

ibis

Page 3: PyCon APAC 2016 Keynote

In process: Python for Data Analysis, 2nd Edition. Coming 2017 (in English :)

Page 4: PyCon APAC 2016 Keynote

Q: What brings you here?

Page 5: PyCon APAC 2016 Keynote

Our shared values

Page 6: PyCon APAC 2016 Keynote

Pride in software craftsmanship

Page 7: PyCon APAC 2016 Keynote

My story

•  Accidental software developer

•  2007: My first job (financial research analyst)

•  I started writing Python libraries to do my own work better

•  Soon I was helping my colleagues work better, too

Page 8: PyCon APAC 2016 Keynote

Tools

Page 9: PyCon APAC 2016 Keynote

Tools

Page 10: PyCon APAC 2016 Keynote

Empathy: the feeling that you understand and share another person's experiences and emotions; the ability to share someone else's feelings

Source: Merriam-Webster's Learner's Dictionary

Page 11: PyCon APAC 2016 Keynote

Open source is wonderful…

Page 12: PyCon APAC 2016 Keynote

Open source is wonderful… but it can also be frustrating

Page 13: PyCon APAC 2016 Keynote

Sustainable open source

•  How to keep contributors from drowning / burning out?

•  How to fund the work?

•  How to protect and serve the community?

Page 14: PyCon APAC 2016 Keynote

The Grind

Page 15: PyCon APAC 2016 Keynote

“The grind is an endless stream of bug reports, requests, demands, questions, and occasional inquisitions.”

DHH, Creator of Ruby on Rails

Page 16: PyCon APAC 2016 Keynote

pandas, the open source project

•  Parts of the code date back to April 2008

•  Over 600 unique contributors on GitHub

•  Active project maintainers range from 4 to 7 people

•  > 6900 closed issues

•  > 5100 pull requests

Page 17: PyCon APAC 2016 Keynote

pandas at end of 2012

Page 18: PyCon APAC 2016 Keynote

April 7, 2014

Page 19: PyCon APAC 2016 Keynote

"Some might argue that [Heartbleed] is the worst vulnerability found (at least in terms of its potential impact) since commercial traffic began to flow on the Internet."

Joseph Steinberg, Forbes cybersecurity columnist

Page 20: PyCon APAC 2016 Keynote

“There should be at least … [6] full time OpenSSL team members, not just one, able to concentrate … without having to hustle commercial work. If you’re a … in a position to do something about it, give it some thought. Please. I’m getting old and weary and I’d like to retire some day.”

Steve Marquess, OpenSSL team

Page 21: PyCon APAC 2016 Keynote

By Nadia Eghbal, supported by the Ford Foundation

For more on this

Page 22: PyCon APAC 2016 Keynote

“The Cathedral and the Bazaar”

Page 23: PyCon APAC 2016 Keynote

Python’s normalization in industry

•  Python has become a leading language instead of something “experimental” or “risky”

•  Many businesses founded on the growth of the Python user base

•  See Paul Graham’s 2004 essay “The Python Paradox”: how things have changed!

Page 24: PyCon APAC 2016 Keynote

Governance: “the processes of interaction and decision-making among the actors involved in a collective problem…”

M. Hufty (via Wikipedia)

Page 25: PyCon APAC 2016 Keynote

Openness and Transparency

Page 26: PyCon APAC 2016 Keynote

Consensus

Page 27: PyCon APAC 2016 Keynote

Some example governance documents

•  NumPy (see the docs)

•  IPython / Jupyter governance
   – github.com/jupyter/governance

•  pandas
   – github.com/pydata/pandas-governance
   – Modeled after Jupyter governance

Page 28: PyCon APAC 2016 Keynote

http://numfocus.org

http://apache.org

Page 29: PyCon APAC 2016 Keynote
Page 30: PyCon APAC 2016 Keynote

conda-forge

•  Community-curated conda package channel (hosted on anaconda.org)

•  Reproducible build infrastructure (Docker + CircleCI + Travis CI + AppVeyor)

•  Automated GitHub helper tools

conda config --add channels conda-forge

Page 31: PyCon APAC 2016 Keynote

What is next for pandas?

•  pandas 1.0
   – A stable, maintenance-only release

•  Beginning “pandas 2.0”
   – Planning significant refactoring on the internals of Series, DataFrame

Page 32: PyCon APAC 2016 Keynote

Why pandas 2.0?

•  Some changes are difficult / impossible to do in an incremental way

•  pandas’s relationship with the ecosystem has evolved over the last 5 years

•  Make pandas
   – Faster and use less memory
   – Fix long-standing limitations / inconsistencies
   – Easier interoperability / extensibility

Page 33: PyCon APAC 2016 Keynote

Apache Arrow

http://arrow.apache.org

Page 34: PyCon APAC 2016 Keynote

High Performance Sharing & Interchange

Today

•  Each system has its own internal memory format

•  70-80% CPU wasted on serialization and deserialization

•  Similar functionality implemented in multiple projects

With Arrow

•  All systems utilize the same memory format

•  No overhead for cross-system communication

•  Projects can share functionality (e.g., Parquet-to-Arrow reader)
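The contrast on this slide can be illustrated with a small sketch that uses only Python's standard library. It is not Arrow's API: `array` and `memoryview` stand in for a shared in-memory column layout, `pickle` stands in for a serialization round trip, and the values are made up for illustration.

```python
import pickle
from array import array

# "System A" holds a column of float64 values in one contiguous buffer.
column = array("d", [1.0, 2.5, 4.0, 8.0])

# Today: hand the data over by serializing it, then deserializing a copy.
payload = pickle.dumps(list(column))
copied = pickle.loads(payload)

# Shared layout: "System B" reads the very same buffer through a view.
# No bytes are copied or re-encoded.
view = memoryview(column)

# A change made by System A is visible through the view, but not in the copy.
column[0] = 100.0
shared_first = view[0]
copied_first = copied[0]

print(shared_first, copied_first)
```

The view reflects the update because both sides agree on the layout of the underlying memory, which is the property the "With Arrow" column describes across systems and languages.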

Page 35: PyCon APAC 2016 Keynote

Feather File Format for Python and R

•  Problem: fast, language-agnostic binary data frame file format

•  By Wes McKinney (Python) and Hadley Wickham (R)

•  Read speeds close to disk IO performance

•  Leverages Apache Arrow

Page 36: PyCon APAC 2016 Keynote

Thank you

@wesmckinn
http://wesmckinney.com

pandas sprint on Monday!