PyPedia The free programming environment that anyone can edit! Alexandros Kanterakis Genomics Coordination Center, Department of Genetics, University Medical Center, Groningen, The Netherlands


Post on 31-Mar-2015



PyPedia

The free programming environment that anyone can edit!

Alexandros Kanterakis

Genomics Coordination Center, Department of Genetics, University Medical Center, Groningen, The Netherlands


Introduction


How not to be a bioinformatician

• Stay low level at every level
• Be open source without being open
• Make tools that make no sense to scientists
• Do not ever share your results and do not reuse
• Never maintain your databases and web services
• Be unreachable and isolated


So, you think you can be a bioinformatician…

• Imagine you only have: A personal computer with a browser and an Internet connection

• Answer the following question:
– Who is the current prime minister of Latvia?


SYTYCBAB

• Imagine you only have: A personal computer with a browser and an Internet connection
• Answer the following question:
– Compute the Hardy-Weinberg equilibrium of a set of genotypes

✔ Execute ✖ Source ✖ Documentation

✖ Execute ✔ Source ✖ Documentation

✖ Execute ✖ Source ✔ Documentation
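
For reference, the Hardy-Weinberg question above comes down to comparing observed genotype counts with the counts expected from the allele frequencies. The following is a minimal sketch in plain Python 3, assuming a biallelic marker and genotypes given as 2-tuples of alleles. The function name `hardy_weinberg` is illustrative only and is not taken from PyPedia.

```python
from collections import Counter


def hardy_weinberg(genotypes):
    """Return allele frequencies, expected genotype counts and chi-square."""
    n = len(genotypes)
    alleles = sorted({allele for g in genotypes for allele in g})
    if len(alleles) != 2:
        raise ValueError("this sketch assumes exactly two alleles")
    a, b = alleles
    # Genotypes are unordered: ("A", "G") and ("G", "A") are the same heterozygote.
    observed = Counter("".join(sorted(g)) for g in genotypes)
    p = (2 * observed[a + a] + observed[a + b]) / (2 * n)  # frequency of allele a
    q = 1 - p
    expected = {a + a: p * p * n, a + b: 2 * p * q * n, b + b: q * q * n}
    chi_sq = sum((observed[k] - e) ** 2 / e for k, e in expected.items() if e > 0)
    return {"p": p, "q": q, "expected": expected, "chi_sq": chi_sq}


result = hardy_weinberg([("A", "A"), ("A", "G"), ("G", "A"), ("G", "G")])
print(result)
```

A chi-square value close to zero means the observed counts match the Hardy-Weinberg expectation; turning it into a p-value would need a chi-square distribution with one degree of freedom, which this sketch leaves out.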


But what about…

✔ Execute ✔ Source ✔ Documentation

? Web environment, online execution
? Open Source
? Integrate with other tools
? Edit a method and share it
? Examples and Unit tests
? Deploy in the cloud
? Frequency of new releases

wiki


A python sandbox to the rescue

From: http://wiki.python.org/moin/SandboxedPython

So: Google App Engine + MediaWiki = PyPedia


www.pypedia.com


Code as wiki


HTML input as wiki


Executing a method in a remote computer

• Edit your user page and add an “ssh” section:

==ssh==
host=ec2-107-22-59-115.compute-1.amazonaws.com
username=JohnDoe
path=/home/JohnDoe/runPyPedia

• This content is NOT shown to anyone
• Install the PyPedia client on the remote computer (details on pypedia.com)
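
To make the layout of the “ssh” section concrete, here is a small sketch that reads such a section out of wiki text as key=value pairs. It is only an illustration: the parser and the name `parse_ssh_section` are assumptions, not part of PyPedia, and the real server may read the section differently.

```python
def parse_ssh_section(wiki_text):
    """Collect the key=value lines that follow the '==ssh==' heading."""
    settings = {}
    in_section = False
    for line in wiki_text.splitlines():
        line = line.strip()
        if line.startswith("==") and line.endswith("=="):
            # Any heading closes the previous section; only '==ssh==' opens ours.
            in_section = line.strip("=").strip().lower() == "ssh"
            continue
        if in_section and "=" in line:
            key, value = line.split("=", 1)
            settings[key.strip()] = value.strip()
    return settings


# Hypothetical user page: the ssh values are the ones from the slide,
# the 'notes' section is made up to show that other sections are ignored.
page = """==ssh==
host=ec2-107-22-59-115.compute-1.amazonaws.com
username=JohnDoe
path=/home/JohnDoe/runPyPedia
==notes==
color=blue
"""
ssh = parse_ssh_section(page)
print(ssh)
```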


“Execute on remote computer”

Example: Fixed_point_user_JohnDoe

The cloud instance contains: numpy, scipy, matplotlib

Like SAGE but with custom execution environments (e.g. BioPython, PyCogent, …)


Cool, but I want to call the function from my local computer..

• Install the PyPedia python library:

git clone git://github.com/kantale/pypedia.git

• Load the function in python:

>>> import pypedia
>>> from pypedia import Pairwise_linkage_disequilibrium
>>> Pairwise_linkage_disequilibrium([("A","A"), ("A","G"), ("G","G"), ("G","A")], [("A","A"), ("A","G"), ("G","G"), ("A","A")])
{'haplotypes': [('AA', 0.49999999997393502, 0.3125), ('AG', 2.606498430642265e-11, 0.1875), ('GA', 0.12500000002606498, 0.3125), ('GG', 0.37499999997393502, 0.1875)], 'R_sq': 0.59999999983318408, 'Dprime': 0.99999999986098675}
>>>

• You can call the method of any user and your method can be called by anyone.

• Edit locally, push changes.
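
To help read the output above, the sketch below recomputes `R_sq` and `Dprime` from the four haplotype frequencies using the textbook definitions of linkage disequilibrium. It is written in Python 3, the function name `ld_from_haplotypes` is an assumption, and it is not the code of the PyPedia article, which appears to estimate the haplotype frequencies first (a step skipped here). The input frequencies are the ones printed above, rounded.

```python
def ld_from_haplotypes(freqs):
    """Compute D, |D'| and r^2 from two-locus haplotype frequencies.

    freqs maps two-letter haplotypes (first locus, second locus) to
    frequencies summing to 1. Both loci are assumed biallelic and polymorphic.
    """
    a1 = sorted({h[0] for h in freqs})[0]  # reference allele at locus 1
    b1 = sorted({h[1] for h in freqs})[0]  # reference allele at locus 2
    p_a = sum(f for h, f in freqs.items() if h[0] == a1)
    p_b = sum(f for h, f in freqs.items() if h[1] == b1)
    # D: observed haplotype frequency minus the frequency under independence.
    d = freqs.get(a1 + b1, 0.0) - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r_sq = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return {"D": d, "Dprime": d_prime, "R_sq": r_sq}


# Haplotype frequencies rounded from the output shown above.
ld = ld_from_haplotypes({"AA": 0.5, "AG": 0.0, "GA": 0.125, "GG": 0.375})
print(ld)
```

With these rounded inputs the sketch gives an `R_sq` of about 0.6 and a `Dprime` of 1, in line with the values returned by the PyPedia call.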


• At the top of each article there is a button:

• It creates a personalized version of the article that only you can edit.

• This is similar to GitHub’s “fork” feature.


Using PyPedia for open science

• A complete analysis can be hosted in PyPedia

• Any finding generated or published should be easily shared and reproduced.

• The reproduction of a finding takes time even when the source code is released.


Reproducible science

• PyPedia offers a REST interface:

www.pypedia.com/index.php?b_timestamp=<YYYYMMDDHHMMSS>&get_code=<python code>

• It returns the most recent version of the <python code> that was edited before the timestamp.

• Reproduce the analysis by sharing a single URL:

http://www.pypedia.com/index.php?b_timestamp=20120102101010&get_code=print Pairwise_linkage_disequilibrium([("A","A"), ("A","G"), ("G","G"), ("G","A")], [("A","A"), ("A","G"), ("G","G"), ("A","A")])
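
The timestamp rule can be pictured as a lookup in an article's edit history. The sketch below is an illustration in Python 3, not PyPedia's server code: the function `revision_before`, the in-memory `history` list and the choice to treat an edit made exactly at the timestamp as included are all assumptions.

```python
from bisect import bisect_right
from datetime import datetime

STAMP = "%Y%m%d%H%M%S"  # the YYYYMMDDHHMMSS layout used by b_timestamp


def revision_before(revisions, b_timestamp):
    """Return the source of the latest revision made at or before b_timestamp.

    revisions is a list of (timestamp_string, source) pairs sorted by time.
    """
    datetime.strptime(b_timestamp, STAMP)  # raises ValueError if malformed
    stamps = [stamp for stamp, _ in revisions]
    # Fixed-width digit strings sort in the same order as the times they encode.
    index = bisect_right(stamps, b_timestamp)
    if index == 0:
        return None  # the article did not exist yet at that time
    return revisions[index - 1][1]


# Made-up edit history of a single article.
history = [
    ("20111201090000", "def f(): return 1"),
    ("20120101120000", "def f(): return 2"),
    ("20120301080000", "def f(): return 3"),
]
chosen = revision_before(history, "20120102101010")
print(chosen)
```

Because the timestamp pins every article to one revision, the same URL keeps returning the same code even after the articles are edited later.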


Reproducing an experiment

#> curl \
    --data-urlencode 'b_timestamp=20120501010101' \
    --data-urlencode 'get_code=print Pairwise_linkage_disequilibrium([("A","A"), ("A","G"), ("G","G"), ("G","A")], [("A","A"), ("A","G"), ("G","G"), ("A","A")])' \
    http://www.pypedia.com/index.php \
    --output code.py

#> python code.py
{'haplotypes': [('AA', 0.49999999997393502, 0.3125), ('AG', 2.606498430642265e-11, 0.1875), ('GA', 0.12500000002606498, 0.3125), ('GG', 0.37499999997393502, 0.1875)], 'R_sq': 0.59999999983318408, 'Dprime': 0.99999999986098675}
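
The curl call above can also be written with the Python 3 standard library. This is a sketch under the assumption that the endpoint accepts the same form-encoded POST that curl sends; the helper names `build_request` and `fetch_code` are illustrative. Only the request is built here; `fetch_code` is defined but not called, since it needs network access and a responding server.

```python
from urllib.parse import parse_qs, urlencode
from urllib.request import Request, urlopen

ENDPOINT = "http://www.pypedia.com/index.php"  # URL taken from the slides


def build_request(b_timestamp, code):
    """Build a form-encoded POST, like curl's --data-urlencode options."""
    body = urlencode({"b_timestamp": b_timestamp, "get_code": code})
    return Request(ENDPOINT, data=body.encode("ascii"))


def fetch_code(b_timestamp, code, path="code.py"):
    """Send the request and save the returned script (needs network access)."""
    with urlopen(build_request(b_timestamp, code)) as response:
        with open(path, "wb") as handle:
            handle.write(response.read())


call = ('print Pairwise_linkage_disequilibrium('
        '[("A","A"), ("A","G"), ("G","G"), ("G","A")], '
        '[("A","A"), ("A","G"), ("G","G"), ("A","A")])')
request = build_request("20120501010101", call)
print(request.get_method(), request.full_url)
# Decode the body again to show the two fields the server would receive.
print(parse_qs(request.data.decode("ascii")))
```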


Meta-webserver

• HTML injection is allowed and encouraged!

http://www.pypedia.com/index.php/Draw_face_user_Kantale

• Example: run HTML code posted on gist:

http://www.pypedia.com/index.php?run_code=import urllib2
print urllib2.urlopen('https://raw.github.com/gist/2689822/bbea0c43b278d7c4c04b3f7a23ba43f558fba98b/index_full.html').read()

Click me!


• All content is under the Simplified BSD License
• Two namespaces:
– Validated articles, e.g. Minor_allele_frequency
   • Safe, only admins can edit
– User articles, e.g. Minor_allele_frequency_user_John
   • Unsafe, edited by the individual user
– High-quality articles from the User namespace are promoted to the Validated namespace
– Validated articles cannot call User articles (duh..)
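
The calling rule between the two namespaces can be sketched as a check on article titles. This is a Python 3 illustration only: it assumes User articles are recognised by the `_user_<name>` suffix seen in the examples and that user names contain no underscore, and the helpers `is_user_article` and `may_call` are hypothetical rather than PyPedia's own.

```python
import re

# Title pattern assumed from the examples: <Article>_user_<Username>
USER_ARTICLE = re.compile(r"^(?P<name>.+)_user_(?P<user>[^_]+)$")


def is_user_article(title):
    """True for titles such as Minor_allele_frequency_user_John."""
    return USER_ARTICLE.match(title) is not None


def may_call(caller, callee):
    """Validated articles must not call User articles; anything else may."""
    return is_user_article(caller) or not is_user_article(callee)


print(may_call("Minor_allele_frequency_user_John", "Minor_allele_frequency"))
print(may_call("Minor_allele_frequency", "Minor_allele_frequency_user_John"))
```

The first call is allowed (a User article using a Validated one), the second is refused, which keeps unreviewed code out of the safe namespace.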


Some thoughts (on the embarrassing occasion that I have some minutes left)

Code as wiki, program as wiki concept
• Multidimensional expansion
• As Mao said: Let a thousand flowers scripts bloom (and some of them rot in hell)
• Minimize the distance: D_sanity(SCRIPT_made_by_IT_guy, SCRIPT_useful_to_biologists)
• Encyclopedialize ™ your scripts because open source isn’t enough!

Future steps:
• Attract editors, make communities!
• If it can be done in python, why not Ruby, …?


• Contact: [email protected]
• Source code license: GPL v3
• Content license: Simplified BSD license
• Join us in google groups: http://groups.google.com/group/pypedia
• Twitter: @PyPedia

• PyPedia’s source code:
– Mediawiki extension: https://github.com/kantale/PyPedia_server
– Python library: https://github.com/kantale/pypedia

• Acknowledgements:
– Despoina Antonakaki
– Kostas Tselios
– Morris A. Swertz