morgan fouesneau [email protected] observatoire astronomique de strasbourg march 28,...
TRANSCRIPT
An Introduction to Python for Astronomers (and curious)
Morgan [email protected]
Observatoire Astronomique de StrasbourgMarch 28, 2011 – One Day Session
http://astro.unistra.fr/~morgan/python/M11.pptx
Take-Awaywhy python? and good references
What, Why?
In astronomy
Should I throw away my C, Fortran codes?
Language power
Starting points when stuck in python
“Meals are not provided!”
Scientific Computing
Language
Application
Not to scale…
Why Python?
A programmer can be 5-10 times more productive than others
Java:
Fortran (90):
Python:
As easy as IDL (or Yorick):
public class HelloWorld{ public static void main(string[] args){ System.out.println(“Hello World!”); }}
print “Hello World!”
print, “Hello World!”
PROGRAM HelloWorld PRINT *, "Hello World!"END PROGRAM HelloWorld
“Hello World!” …
Why Python?
Easy and Quick coding
Already a reference in astronomy
Faster than IDL (especially for data access)
More flexible than any other language
Between fully compiled and pure script program
Free
Quantitative study based on 80 programmers implementing the same
program
http://page.mi.fu-berlin.de/prechelt/Biblio/jccpprtTR.pdf
C/C++, Fortran
Working horses of science…
zillions of codes existing
optimized for runtime performance
no reason to replace them (at this time)
rather cooperate (f2py, swig)
but...
not really general purpose
relatively primitive datatypes
manual memory management
slow edit/compile/test cycle
no reason to replace them (yet), rather cooperate
IDL, Matlab, Mathematica… Extremely popular these days
Interactive, great visualization, good libraries
but...
Not really general purpose
Not really ready for large scale programs
Not free, even expensive
We can still call them from Python but you will never do that unless you have a gun over your
head
Perl, Ruby, Shell scripts… Perl very strong at text manipulation
Ruby seems to be popular in scientific community in Japan
Shell scripts... you will not want to do shell scripting any more once you know Python (unless forced)!
Python Language
Easy to learn but yet powerful
Fast and easy solutions for everyday problems
Allows development of large and sophisticated software packages
Several astronomical program packages implemented using python
Excellent tool for scientists
Easy parallel programming
official Python website www.python.org
Example
Simple interactive example:
>>> t # print all t-values (assumed to be already defined)array([ 0. , 0.8975979 , 1.7951958, 2.6927937, 3.5903916, 4.48798951, 5.38558741, 6.28318531 ])
>>> t[2:4] # get sub-array with the elements at the indexes 2,3array([ 1.7951958, 2.6927937])
>>> p = 1.8 # set our point
>>> abs(t-p) # how much do the t[:]-values differ from p?array([ 1.8 , 0.9024021 , 0.0048042 , 0.8927937, 1.7903916, 2.68798951, 3.58558741, .48318531 ])
Example
>>> abs(t-p) # how much do the t[:]-values differ from p?array([ 1.8 , 0.9024021 , 0.0048042 , 0.8927937, 1.7903916, 2.68798951, 3.58558741, .48318531 ])<...>
>>> # how large is the 4-th largest absolute distance between the t[:]-values and p>>> dt_m = sort(abs(t-p))[3]
>>> # where are the four elements of t[:]closest to p ?>>> abs(t-p) <= dt_m # corresponds to a “where” command in IDLarray([False, True, True, True, True, False, False, False])
>>> # construct the sub-arrays; (1) get the 4 t[:]-values>>> t_p = t[abs(t-p)<=dt_m]
>>> t_parray([ 0.8975979, 1.7951958, 2.6927937, 3.5903916])
In Astronomy
Programs and libraries implemented (mainly in Python):
BoA (Bolometer Analysis Package)
APECS (APEX control software)
Astro-WISE (widefield imaging system)
PyFITS (access to FITS files)
SAMPy (Samp communications)
...
In Astronomy
Wrappers for existing programs and libraries:
PYGILDAS (GILDAS)
ParselTongue (AIPS)
PyRAF (IRAF)
PyMIDAS (MIDAS)
PyDS9 (DS9 via XPA messaging system)
…
Examples
Glue It All Together With Python
Python
Fortran
Java
C/C++
• Pyrex• Weave• Cython
• Jython
• f2py• pyfort
Keep it simple
Embed “basic” languages were really needed (e.g. brut/intensive comp.)
Communicate with almost any application
Ads Break: Dive into Python Python reference book
Python from novice to pro Available online available in multiple languages. http://diveintopython.org
$
0.00
Free!!
Ads Break: Good places to know
Websites
Pythonhttp://www.python.org Ipython http://ipython.scipy.org SciPy http://www.scipy.org NumPy http://numpy.scipy.org Matplotlib http://matplotlib.sourceforge.net Astropython http://www.astropython.org AstroPy http://www.astro.washington.edu/users/rowen/
AstroPy.html Mathesaurus http://mathesaurus.sourceforge.net quick reference for
switching F2py http://www.scipy.org/F2py c / f77 / f90 glue (is part of NumPy) PyTables http://www.pytables.org/ all around hdf5 format PyFITS http://www.stsci.edu/resources/software_hardware/pyfits/ FITS SAMPy http://cosmos.iasf-milano.inaf.it/pandora/sampy SAMP in python Mayavihttp:// code.enthought.com/projects/mayavi/ SAMP in python
This workshopWho, where, what, how?
Audience Observatoire: Toddler, grown up, elderly… anyone curious Must have at least basic experience in one or more computer languages
Content: Give you what you need for everyday problems asap
First half of course: Basic language elements and basic visualizationsome modules from standard library, elements of NumPy, SciPy and matplotlib
Second half of course: Advanced topics and installing packages
Complex scripting, exceptions, classes… finding packages and how to install them…
“Daily Specials”
Technical part: Where, How…? Avoid any personal installation
SSH connection to `ASTROMASTER` Username: `guestpython` Password: `python2011` This account will be deleted within the next few days Create your own folder in this user’s home directory
Installed packages numpy, matplotlib, pyfits, python-tables, setuptools,
tkinter, ipython
IDE: Eclipse, Spyder…
IDE: Interactive Development Environment i.e.: Very assisted Examples: Eclipse (ext. Python), Spyder…
http://code.google.com/p/spyderlib/
Spyder*
Editor, Console, History Log, Variable explorer, Project manager, online help
* Scientific PYthon Development EnviRonment
IPython
“comprehensive environment for interactive and exploratory computing.”
In [ ]: myVariable? prints documentation of the variable In [ ]: myFunction?? prints the source code (if accessible)
“comprehensive environment for interactive and exploratory
computing.”http://ipython.scipy.org/
Basic language elementsNumbers, variables, math and while
Python's numerical types
Operators for numerical types
Variables, variable names, and assignment to variables
The math module from the Standard Library
Loops with the while statement
“Classic stuffs and some specials.”
Integers
>>> 2, 0
>>> -4711
>>> 07, 022 # Octal literals
>>> 0x9, 0xa, 0XF # Hex literals
>>> 17 + 4 # expression
>>> 0xa - 2
>>> 23 ** (2+3) # 23⁵
>>> 7 / 2, 7 / -2 # Int division
Floats
>>> 2.3
>>> -4.
>>> 0.1, .1
>>> 2.99E10, 6.62607e-27, -1e10
>>> 1.7 + .4
>>> 17. + 4
>>> 7./2., 7./2., 7./2.
Arithmetic operators
>>> 1 + .4
>>> 2 - 3
>>> 25 * -17
>>> 2.**100
>>> 2/3, 2/-3 # Int division
>>> 2./3 # Float division
>>> 2.//3, 2.//3 # Floor division
>>> 7 % 3, 7 % -3 # Int Modulus
>>> 7. % 3, 7.2 % 2.5 # Modulus
Nb: (n//m)*m + (n%m) = n
Bitwise operators
>>> 17 & 5 # Bitwise AND
>>> 17 | 5 # Bitwise OR
>>> 17 ^ 5 # Bitwise EXOR
>>> ~17 # Bitwise complement
>>> 17 << 3 # Bitshift left
>>> 0x11 >> 2 # Bitshift right
Complex numbers
>>> 2.+3j, 2-3J # complex literals
>>> j # will not work
>>> 1J # but this will
>>> complex(1,2)
>>> 1+1j*2, (1+1j)*2
>>> (2.+3j).real, (2+3j).imag
Variables
>>> x = 2 # Assign variable
>>> x # Display
>>> x + 3 # Use variable
>>> y = x + 3 # New variable
>>> x = x + 1 # Assign new value
>>> x += 1 # Shorthand; no x++
Some stunts
>>> x, y = 2, 3
>>> x, y = y, x
>>> x = y = z = 0 # !! WARNING !!
Names
Case sensitive >>> xy, Xy = 2, 3
Must begin w. letter >>> 9x = 2 # not working>>> x9 = 2 # ok>>> _x = 2 # ok, but special
must not be keyword >>> if = 2
List of reserved keywords and
as
assert
break
class
continue
def
del
elif
else
except
exec
finally
for
from
global
if
import
in
is
lambda
None
not
or
pass
raise
return
try
while
with
yield
Built-in math functions
>>> abs(-2.)
>>> abs(1+1j)
>>> max(1,2,3,4)
>>> min(1,2,3,4)
>>> hex(17), oct(-5)
>>> round(1.23456, 2)
Comparisons
>>> x = 2 # Assignment
>>> x == 2 # Testing equality
>>> x != 0 # True
>>> x < 1 # False
>>> (x >= 2.) == False # False
>>> 1 < x <= 3 # (1<x) and (x<=3)
>>> 3 < x <= 1
>>> 1 < x >= 0
The while statement
>>> x = 0
>>> while x <= 10: # Bool. expr.... print x, x**2 # Indentation... x += 1... <RETURN>
>>> print x # Unindented again
Indentation = Block
Top level must not be indented
It does not matter how much blanks you use, but:
Uniform within each block
Most people use 4 blanks per level but tabs work too.
http://www.python.org/doc/essays/styleguide.html
While exercises
How much is the smallest / largest positive float? Help: Use a while loop, compare to 0 and float(‘inf’)
>>> x = 1. >>> x = 2.** 32 >>> while x > 0: >>> while x != float(‘inf’): ... x /= 2. ... x *= 2. ... print x ... print x
How precise is your installation? Help: Add smaller and smaller numbers to 1 while
comparing the sum to 1>>> x = 1.>>> while 1. + x > 1.: ... x /= 2. ... print x
While exercises – Solutions How much is the smallest / largest positive float?
Help: Use a while loop, compare to 0 and float(‘inf’)>>> x = 1. >>> x = 2.** 32 >>> while x > 0: >>> while x != float(‘inf’): ... x /= 2. ... x *= 2. ... print x ... print x
How precise is your installation? Help: Add smaller and smaller numbers to 1 while
comparing the sum to 1>>> x = 1.>>> while 1. + x > 1.: ... x /= 2. ... print x
4.94065645841e-324
While exercises – Solutions How much is the smallest / largest positive float?
Help: Use a while loop, compare to 0 and float(‘inf’)>>> x = 1. >>> x = 2.** 32 >>> while x > 0: >>> while x != float(‘inf’): ... x /= 2. ... x *= 2. ... print x ... print x
How precise is your installation? Help: Add smaller and smaller numbers to 1 while
comparing the sum to 1>>> x = 1.>>> while 1. + x > 1.: ... x /= 2. ... print x
4.94065645841e-324 8.98846567431e+307
1.11022302463e-16
Script FilesLarge scale program
Python's package notion (‘import’)
Module, Package
Differences between import and execfile or %run
Documentation
“A complex system that works is invariably found to have evolved from a simple
system that worked” – John Gall
Execute scripts
Run script from Unix shell with $> python myscript.py
Alternative: Make script executable $> chmod u+x myscript.py Add “ShiBang” line at the top of the file: #!/usr/bin/python
Execute: $> ./myscript.py
Or run script from within Python >>> execfile(“myscript.py”)
Or within IPython: In [1]: %run myscript[.py]
Content
Multiline codes (functions, classes, …) or interactive command list
Documentation Usually in the __doc__ function within the “module” (file)Ex: def __doc__:
... “”” this is my documentation for this module.
usage: >helloWorld()... “””
Directly in each block: def helloWorld(): ... “”” This is a dummy function “”” ... print “hello World!” Usage:>>> import myscript In[1]: %run myscript>>> help(myscript) In[2]: helloWorld?>>> help(myscript.helloWorld) In[3]: helloWorld??
Content: specials
__doc__ : introduce documentation
In general, any __<name>__ is a special function/variable
Select instructions that are executed only when running the script and not when import is called
everything in the “__main__” block __main__ : Top-level script environment if __name__ == “__main__”:
<...>
A first concrete problemReading a photometric dataset to generate HR diagram
Python's package notion (‘import’)
Mandatory packages for doing science: numpy, matplotlib
File access: reading from / writing to files
Lists, array and indexes, slices
For loop statements: while, for
Functions: classical and lambda functions
Basic visualization: plot, scatter, contours, colormaps
First python scripts
Problem description
Step-by-step
Read the file
Manipulate the data
Build the figures
Step-by-step: read the file File description: data.ascii
made of a header (lines start with #) 6424 lines into 9 columns (space as separator)
Step-by-step: read the file First solution: elbow-grease (like in C)
f = open('data.ascii', 'r') # 'r' by default Reading: f.read(), f.readline(), f.readlines() f.close()
Reading is done by simply looping over the lines
Long solution, you rarely need to be at this low-level.
Learning interest…
Step-by-step: read the file#include <stdio>#include <stdlib>
int main(void){ int ncols = 9, nlines = 6424, i, j; string filename = “data.ascii”; FILE *file; float *data = (float *) malloc( ncols * nlines * sizeof(float) ); file = fopen(filename, "r"); for ( i = 0; i < nlines; i++ ){ for ( j=0; j < ncols; j++ ){ fscanf( file, "%f ", data[j + ncols*i] ); } } fclose(file); ... free(data);}
If you had to do it with C code…
…I am not sure this works though
Step-by-step: read the file>>> import numpy # low-level does not mean very-low>>> ncol = 9>>> nrow = 6424>>> data = numpy.array(ncol, nrow) # common array (one type of data)>>> l = '#'>>> while l[0] == '#': #Skip the comments ... l = f.readline() """Warning: at the end of the loop, l contains the first data row """>>> for i in range(nrow): # <=> i in [0, ..., 6423] ... """ We should check whether ‘\n’ is present at the end of l ... and cut it out. In any other language, we may have something like: ... if (l[len(l)-1] == '\n') { l = l[0:len(l)-2] } ... But not in python: """ ... if l[-1] == '\n': ... l = l[:-1] ... l = l.split() # split according to spaces ... for k in range(ncol): # Now fill data array ... """Python only reads strings that have to be converted then """ ... data[k,i] = float(l[k]) ... l = f.readline() # go on to the next line>>> f.close()
Step-by-step: read the file A bit more in a reptile way:
>>> data = [] # declare an empty list>>> for l in f: #loop over each line of f (don’t care about #lines) ... if l[0] != '#': ... # Magical `inline` for loop, faster than a classic `for` ... cl = [ float(k) for k in l[:-1].split()] ... data.append( cl ) # add processed line to data>>> f.close() """ data constains n-cells of m-values. In order to access the
data by columns, we either transpose the "matrix", or create a sub-list:"""
>>> d = numpy.transpose(data)>>> bflux = d[0]>>> blux = [ k[0] for k in data ]
Step-by-step: read the file Finally in python: (think functions)
>>> def readAscii(fichier, transpose = False): """ Function that will read the file """ ... f = open('data.ascii') ... # lambda function <=> math function, x-->f(x) ... convertToFloat = lambda x: [ float(k) for k in x ] ... data = [ convertToFloat( l[:-1].split() ) \ for l in f if l[0] != '#' ] ... f.close() ... if transpose: ... return numpy.transpose(data) ... else: ... return data
>>> data = readAscii(f)>>> d = readAscii(f, transpose = True) # pédagoqique ;)
Step-by-step: read the file BUT: Someone already coded that for you
Either numpy or scipy or matplotlib or pylab>>> data = numpy.loadtxt('data.ascii')
Variations:>>> data = numpy.loadtxt('data.ascii', comments='#', skiprows=0, delimiter=None )
>>> bflux, bmag, bmerr, sep, vflux, vmag, vmerr, x, y =
numpy.loadtxt('data.ascii', unpack=True)
Step-by-step: read the file loadtxt() is usually fine
But sometimes can be limited for big files
Other options are available e.g. ATpy, asciitable, … http://astropython.org/resources for other references ATpy is also able to read and write fits, votable,
sql …
Step-by-step: the data
Calibrate the dataset:
We apply operations to the whole arrays
>>> bmag = bmag - 1.3 # classic >>> vmag -= 2.5 # better for memory space
Step-by-step: Visualize
sereval possibilities for visualization 2D: matplotlib/pylab (other exists) 3D, matplotlib (still young), mayavi2 (openGL), …
We will use matplotlib/pylab in this workshop
>>> from numpy import loadtxt # only what we need>>> import pylab as plt # import & alias to plt
>>> bflux, bmag, bmerr, sep, vflux, vmag, vmerr, x, y =
loadtxt('data.ascii', unpack=True)
>>> bmag += 1.3; vmag += 2.5 # zeropoint correction
Step-by-step: Visualize
Plot star positions:
>>> plt.figure() # declare a new figure >>> plt.plot(x, y, ',', color='0.0') # do the plot # do not hesitate with help cmds: `plt.plot?`>>> plt.xlabel('X (in pixels)') # set labels>>> plt.ylabel('Y (in pixels)')>>> plt.grid()>>> plt.show()# display (when not auto)
Step-by-step: Visualize
Plot star positions:
>>> plt.figure() # declare a new figure >>> plt.plot(x, y, ',', color='0.0') # do the plot # do not hesitate with help cmds: `plt.plot?`>>> plt.xlabel('X (in pixels)') # set labels>>> plt.ylabel('Y (in pixels)')>>> plt.grid()>>> plt.show()# display (when not auto)
Step-by-step: Visualize
Build HR diagram of the stars:
>>> plt.figure()>>> plt.scatter(bmag-vmag, vmag) # `plt.plot` would also work ;)>>> plt.xlabel(r'B$-$V')>>> plt.ylabel(r'V')>>> plt.xlim(1.9,3.4)>>> plt.ylim(6.7,15) >>> plt.ylim(plt.ylim()[::-1]) # reverse y-axis, or also `plt.ylim(15, 6.7)`>>> plt.savefig(‘myfigure.png’) # can export into eps,ps,pdf,png,svg… >>> plt.show()
Step-by-step: Visualize
Build HR diagram of the stars:
>>> plt.figure()>>> plt.scatter(bmag-vmag, vmag) # `plt.plot` would also work ;)>>> plt.xlabel(r'B$-$V')>>> plt.ylabel(r'V')>>> plt.xlim(1.9,3.4)>>> plt.ylim(6.7,15) >>> plt.ylim(plt.ylim()[::-1]) # reverse y-axis, or also `plt.ylim(15, 6.7)`>>> plt.savefig(‘myfigure.png’) # can export into eps,ps,pdf,png,svg… >>> plt.show()
Step-by-step: Be an artist Now explore matplotlib/pylab
Generate a density plot of this HR diagram numpy.where, pylab.histogram2d pylab.contour and/or pylab.contourf pylab.figure and/or pylab.subplot
Step-by-step: Be an artistimport numpy, pylab# get the databflux, bmag, bmerr, sep, vflux, vmag, vmerr, x, y = numpy.loadtxt('data.ascii',
unpack=True)bmag += 1.3vmag += 2.5
# Select space region of interestind = numpy.where( ( (bmag-vmag) > -0.5 ) & ( (bmag-vmag) < 2. ) & ( vmag > 10. ) & ( vmag < 22. ) )#do the histogram h = pylab.histogram2d( bmag[ind]-vmag[ind], vmag[ind], bins=100 )image = h[0]extent=[ h[1][0], h[1][-1], h[2][0], h[2][-1] ] #axes
limits
Step-by-step: Be an artist#display partax = pylab.subplot(121) #Create a subplot, 1 row 2 cols, #1ax.imshow( image.transpose(), extent=extent, origin='lower', aspect='auto‘, cmap=pylab.cm.Blues)ax.set_xlim(-0.5,1.0) ; ax.set_ylim(20, 11)ax.set_ylabel(r'V‘) ; ax.set_xlabel(r'B$-$V')ax.set_title('histogram2d')
ax = pylab.subplot(122) #Create a subplot, 1 row 2 cols, #2levels = [0.1, 3., 6., 9., 12., 15., 18., 21.]ax.contourf( image.transpose(), extent=extent,
levels=levels,cmap=pylab.cm.Blues )ax.set_xlim(-0.5,1.0); ax.set_ylim(20, 11)ax.set_ylabel(r'V') ; ax.set_xlabel(r'B$-$V')ax.set_title('contourf')
pylab.savefig('m3_ex2.png')
Step-by-step: Be an artist
Step-by-step: Be an artistax = pylab.gcf().get_axes() # get subplotsax1 = ax[1] # second subplot (contourf)ax1.set_ytitle('')# delete tick labels on the y-axispylab.setp(ax1.get_yticklabels(), visible=False)# shrink space between the plots pylab.subplots_adjust(wspace=0.001)
Use some more magic and add minor ticks, change fonts, ...
(battery included in figure.py)
Step-by-step: Be an artist
Digressions around variablesData: values, pointers and references
Location, affectation in C
Location, affectation in Python
Differences between equality and identity
“It’s a trap!”
Digression: Variables in C
int x, y;
Memory location x type: int
Memory location y type: int
? ?
Digression: Variables in C
int x, y;
x = 2;
Memory location x type: int
Memory location y type: int
? ?
2 ?
Digression: Variables in C
int x, y;
x = 2;
x = 3;
Memory location x type: int
Memory location y type: int
? ?
2 ?
3 ?
Digression: Variables in C
int x, y;
x = 2;
x = 3;
y = x;
Memory location x type: int
Memory location y type: int
? ?
2 ?
3 ?
3 3
Value copied
Digression: Variables in Python
>>> planet = “Pluto”
>>
Variables(names)
Objects(data)
Planet
“Pluto”, type: str
Digression: Variables in Python
>>> planet = “Pluto”
>>> planet = “Tiamat”
Variables(names)
Objects(data)
Planet“Tiamat”, type:
str
“Pluto”, type: str
Digression: Variables in Python
>>> planet = “Pluto”
>>> planet = “Tiamat”
>>> legend = planet
Variables(names)
Objects(data)
Planet“Tiamat”, type:
strlegend
Digression: Variables in Python
>>> planet = "Pluto“
>>> planet = “Tiamat“
>>> legend = planet
>>> del planet
Variables(names)
Objects(data)
Planet“Tiamat”, type:
strlegend
Digression: Variables in Python
>>> planet = "Pluto“
>>> planet = “Tiamat“
>>> legend = planet
>>> del planet
>>> del legend
Variables(names)
Objects(data)
“Tiamat”, type:
strlegend
Digression: Identity or Equality>>> l = [1,2,3]
>>> m = [1,2,3] # l and m different objects
>>> l == m # Equality (of values) tested with ==
>>> l is m # Identity (of objects) tested with is
>>> l[0] = 42
>>> print l
>>> print m # m[0] = ?
>>> l = [1,2,3]; m = l # l == m ? l is m ?
Digression: Variable copy>>> l = numpy.array([1,2,3])
>>> m = l.copy() # l and m different objects
>>> l == m # Equality (of values) tested with ==
>>> l is m # Identity (of objects) tested with is
>>> l[0] = 42
>>> print l
>>> print m # m[0] = ?
Python SpecialsParticular variables in Python: tuples, lists, arrays, & dictionaries
tuples, lists, arrays
dictionaries
Looping philosophy
Use case: Find out which the 10 most frequent words in a text are.
Tuple & list
Tuples and lists are ordered sequences of values of different types
Tuples are immutable: does not support item assignment
Lists are mutable
In [1]: t = (1,2,3); u = [1,2,3]
In [2]: t[0], t[-1], t[1:], u[0], u[-1], u[1:]Out[2]: (1, 3, (2, 3), 1, 3, [2, 3])
In [3]: t[0] = 10 TypeError : 'tuple' object does not support item assignment
In [4]: u[0] = 10; u.append(4) #ok
In [5]: v = list(t) # v is now a list of t-valuesIn [6]: del u[0] # delete one element
“It feels like somebody is using your head for origami, but it doesn't bend that way…”
Tuple & list
In [1]: t = ([1,2,3],[4,5])
In [2]: t[0] = 42 # list is immutableTypeError : ‘tuple’ object does not support item assignment
In [3]: t[0][0] = 42; print t Out[3]: ([42,2,3],[4,5])
In [4]: t+t # same result with listsOut[4]: ([42,2,3],[4,5],[42,2,3],[4,5])
In [4]: t*2 # duplicate!! (also with lists)Out[4]: ([42,2,3],[4,5],[42,2,3],[4,5])
“It feels like somebody is using your head for origami, but it doesn't bend that way…”
Dictionaries
Dictionaries are unordered, mutable set of (key, value) pairs
In [1]: d = {'spam': 2, 'eggs': 3}
Order is not preserved
Use of keys instead of indexes to access data (kind of named array)
In [2]: d[‘spam’]Out[2]: 2 In [3]: d[‘spam’] = 10000In [4]: d[‘bacon’]KeyError: ...
Kind of named lists or like c-structures
Dictionaries
In [2]: d[‘spam’]Out[2]: 2 In [3]: d[‘spam’] = 10000In [4]: d[‘bacon’]KeyError: ...
Either check in advance: >>> d.has_key('bacon')
Or use get method with def. val. >>> d.get('bacon', 42)
Adding items: >>> d['bacon'] = 13
Removing items: >>> del d['spam']
Keys may be of any immutable type but unique (value are any) [1,2] (1,2) ‘str’, val…
Kind of named lists or like c-structures
Looping philosophy
Loops use while or for constructions
while constructions are “classical”
For construction aims at iterating over an object>>> for i in range(10): … # range(10) = [0,…,9]
Lists, tuples and dictionaries are iterable objects
Any object with a __iter__ function is iterable>>> d = {'spam': 2, 'eggs': 3, 'bacon': 5}
>>> for k in d: ... print k, d[k]
>>> for k in d.keys(): # Clearer equivalent version ... print k, d[k]
>>> for k in sorted(d.keys()): # Sorted loop ... print k, d[k]
To be or not to be iterable…
Looping special: inline for-loop
Special for-loop construction
>>> d = {'spam': 2, 'eggs': 3, 'bacon': 5}
>>> val = []:>>> for k in d.keys(): ... if d[k] > 2.5:
... val.append( (k,d[k]) )
>>> v = [ (k,d[k]) for k in d.keys() if d[k] > 2.5 ]
Automatic and optimized allocation, hence faster!
Use Case: 10 most frequent words in a text
Find out which the 10 most frequent words in a text are.
“The First Men in the Moon” by H. G. Wells (Herbert Georges) http://www.gutenberg.org/ebooks/1013 (UTF-8 plain text format)
Help: use dictionaries or lists or tuples, and for-loops
Useful help: import re – this let you use regular expressions (see in particular: re.sub) Open the text file, read the full text into a variable Filter for `\r\n` sequences Be careful of case sensitivity, punctuation and so on… during word counts Some commands (not all needed):
▪ <str>.split, <str>.join, <str>.upper, <str>.lower, <str>.replace …▪ <list>.count, re.sub, numpy.argsort, numpy.unique ...
Use Case: 10 most frequent words in a text
Power Law !!!
codesources/textCount.py
Subplots & linked axes <…>.subplot(…,sharex=ax)
Use Case: 10 most frequent words in a text def count():
f = open('firstMenOnTheMoon.txt') #open file# =============== Get all lines and set to lower case ================txt = [ l.lower() for l in f if l != '\r\n']f.close() #file can be closed nowtxt = ''.join(txt)# ================== filter & replace punctuation ====================txt = txt.replace('\r\n', ' ').replace('--', ' ')w = re.sub('[.,:\'"!?;()]',' ', txt)w = w.split()# ========================== Do the stats ============================ #words = set(w) # also works, generate a sequence of unique wordswords = numpy.unique(w)counts = [ w.count(kwd) for kwd in words ]sind = numpy.argsort(counts)[::-1]sc = numpy.array(counts)[sind]sw = numpy.array(words)[sind]
print “Statistics:\ncontains %d words\n“ % (len(w)for k in range(10):
print "Word '%s' used '%d' times" % (sw[k], sc[k])
return sw,sc
codesources/textCount.py
A Second ProblemFitting a black body model on the Sun’s spectrum
Fitting data with arbitrary functions
Using scipy: scipy.optimize, scipy.optimize.leastsq
reading FITS files with pyfits
Lambda functions
2nd Problem: black body fitting We will fit a blackbody model onto the solar spectrum.
First, we need to code the blackbody model Planck constant: h = 6.626069e-34 (J.s) speed of light: c = 2.99792458e8 (m.s) Boltzmann constant: k = 1.3807e-23 (J/K)
1 parameter: T
Read the Solar spectrum: sun.fits
Optimize using a least square method
2nd Problem: black body fittingimport numpyfrom numpy import exp
# Module global variablesh = 6.626069e-34 # Planck (J.s)c = 2.99792458e8 # speed of light (m.s)k = 1.3807e-23 # Boltzmann (J/K)
def blackbody(wave, temp):""" wave in angstroms, """BBCONST = h*c/(k * 1.e-10)exparg = BBCONST/(numpy.array(wave)*temp)return 2*h*c**2*1e50/( wave**5 *
(numpy.exp(exparg)-1.) )
2nd Problem: black body fittingfrom scipy import optimizeimport pylab
Def fakeData():#Pure Black body with noisedx = numpy.linspace(1000,20000, 1000)dy = blackbody(dx, 7000)dy /= dy.sum()dy += numpy.random.normal(0,1e-4, len(dy))return dx, dy
2nd Problem: black body fitting# The target functionfitfunc = lambda p, x: p[0]*blackbody(x,p[1])# Distance to the target functionerrfunc = lambda p, x, y: fitfunc(p,x)-y
dx, dy = fakeData() # Test casep0 = [1., 6000] # Initial guess for the parametersp1, success = optimize.leastsq(errfunc, p0[:],
args=(dx, dy) )
pylab.figure()pylab.plot(dx,dy, ',')pylab.plot(dx,fitfunc(p1,dx), color='#ff0000')pylab.xlabel(r'Wavelength ($\rm{\AA}$)')pylab.ylabel(r'Flux (Arbitrary Units)')
2nd Problem: black body fitting
2nd Problem: black body fitting
A short digression using pyfits
In [1]: h = pyfits.open('sun.fits')In [2]: hOut[2]: [<pyfits.core.PrimaryHDU object at 0xae03bcc>, <pyfits.core.BinTableHDU object at 0xb9f57ec>]
In [3]: print h[1].headerXTENSION= 'BINTABLE' / binary table extension
<...>In [4]: h[1].data.field('flux')Out[4]: array([ 6.19000000e-05, 5.61400000e-04, 4.90100000e-03, ..., 2.95000000e-10, 1.01000000e-10, 3.38000000e-12])
In [5]: x = h[1].data.field('wavelength')In [6]: y = h[1].data.field('flux')
2nd Problem: black body fitting
Using pyfits
def getSunData():import pyfitst = pyfits.open(‘sun.fits’)x = h[1].data.field('wavelength')y = h[1].data.field('flux')t.close()return x, y
Now we can go back to our problem and fit the Solar spectrum Adapt the previous method to the new dataset Plot the result Look at the statistics What is the official temperature of the Sun?
2nd Problem: black body fittingx, y = getSunData() # Sun spectrum Case
ind = numpy.where((x>1e3) & (y< 2.0e4))p1, success = optimize.leastsq(errfunc, p0[:], args=(x[ind], y[ind]))e1 = errfunc(p1, x[ind], y[ind])e1 *= e1print " Norm = %g, Temperature = %f, error=%f " %
(p1[0],p1[1], numpy.sqrt(e1.sum()) )
pylab.figure()pylab.plot(x[ind], y[ind])pylab.plot(x[ind], fitfunc(p1, x[ind]), color='#ff0000')pylab.xlabel(r'Wavelength ($\rm{\AA}$)')pylab.ylabel(r'Flux (Arbitrary Units)')
2nd Problem: black body fitting Norm = 8.91261e-14, Temperature = 5466.278346, error=4.390300 Norm = 6.65493e-14, Temperature = 5800.000000, error=5.359789
VO InteractionWith SAMPy
Glue It All Together With Python
VO Interaction (SAMPy)
Simple example: (requires mytables, mysamp and dependencies)
In [1]: import mysamp, mytables
In [2]: c = mysamp.Client()[SAMP] Warning (2011-03-21T11:08:53.322297): Another
SAMP Hub is already running. Start-up process aborted.
In [3]: t = mytables.load('toto.csv')Auto-detected input type: csv
In [4]: c['t'] = tAuto-detected input type: fitsOverwrite existing file 't.fits'.Data exported into t.fitsAuto-detected input type: fits
<=> Aladin & Topcat launched before already providing a hub
Advanced Python: ClassesObjects oriented codes, graphical interface, Fortran-C interaction
Objects & Classes in Python Everything is object in python
int, float, …, variables, functions, classes, modules …
“a class is a construct that is used as a blueprint to create objects of that class.” – Hum… A class is a definition of a set of variables associated with
functions Equivalent to c/fortran structures + functions An object is an instance of one class
Example: a class “Vector” A vector of 3 coordinates Function/”methods”: addition, scalar product, projection …
Objects & Classes in Python Coding a Vector class in Python:
class Vector(object): # “object” is the top-lvl type
def __init__(self, x,y,z): # constructorself.x, self.y, self.z = x,y,z
def sum(self, p): # addition: p1+p2return Vector(self.x+p.x, self.y+p.y, self.z+p.z)
def scalarProd(self,p): # scalar product p1*p2return self.x*p.x + self.y*p.y + self.z*p.z
Instanciate Vector objects: >>> a = Vector(1,0,0) >>> b = Vector(0,1,0) >>> c = a.scalarProd(b) >>> print c # ???
Objects & Classes in Python What about print a?
Print is equivalent to a.__str__ (“convert a to string”) Hence you need to define (overload) a __str__ method Example: def __str__(self):
return “Vect(%f,%f,%f)” % (self.x, self.y, self.z)
Sometimes it is better to use ‘+’ and ‘*’ Reading and usage are easier You need to define operator functions:__add__, __sub__, __mul__, __div__, __pow__ …
More at http://docs.python.org/reference/datamodel.html
Objects & Classes in Python Now comes the good part: exercise
Try to implement a complete “Vector class” of your own
Manage a string representation of the vectors Manage the operators ‘+’, ‘– ’ as common sense Implement a scalar product solution based on ‘*’ Implement vector product based on ‘**’ (power operator)
Objects & Classes in Pythonclass Vector(object):
def __init__(self, x, y=None, z=None):# Class constructor: from list/tuple or 3 valuesif hasattr(x,'__len__'):
assert len(x) == 3self.x, self.y, self.z = x
else:assert((y != None) & (z != None))self.x, self.y, self.z = x,y,z
def __repr__(self):# String repr, called when >>> Vector([1,2,3]) <CR>s = self.__str__()+", "s += object.__repr__(self)return s
def __str__(self):# String Conversionreturn "Vect(%s)" % [self.x,self.y,self.z]
Objects & Classes in Python
def __add__(self, p):if isinstance(p, Vector): # if p Vector, sum of
p1*p2return Vector(self.x+p.x, self.y+p.y,
self.z+p.z)else: # if not p Vector
return Vector(self.x+p, self.y+p, self.z+p)def __mul__(self,p):
if isinstance(p, Vector): # scalar product p1*p2return self.x*p.x + self.y*p.y + self.z*p.z
else:return Vector(self.x*p, self.y*p, self.z*p)
def __pow__(self,p):if isinstance(p, Vector): # vector product p1**p2
x = self.y*p.z-self.z*p.yy = self.z*p.x-self.x*p.zz = self.x*p.y-self.y*p.xreturn Vector(x,y,z)
else:return Vector(self.x**p, self.y**p,
self.z**p)
Objects & Classes in PythonIn [1]: import vectorIn [2]: a = vector.Vector(1,0,0)In [3]: b = vector.Vector([0,1,0])In [4]: a+1Out[4]: Vect([2, 1, 1]), <vector.Vector object at 0x9544bcc>
In [5]: print a+bVect([1, 1, 0])
In [6]: print (a+1)**2Vect([4, 1, 1])
In [7]: print a**b, b**aVect([0, 0, 1]) Vect([0, 0, -1])
Advanced Python: User InterfaceGraphical User Interface from python
GUI from python
Many classical libraries are available Tcl/TK, QT, Wx, WxGTK …
An example is better than words… Example taken from Joachim K. homeworks “Applet” to show the physics of a 2 body gravitational
interaction One body is fixed and the second “orbits” around Trajectory of the second body is plotted
▪ Depending on its initial position and velocity
Source in `OneBody.py`
btw JK, may I be cannibal? ;)
GUI from python
Advanced Python: F2pyC/C++, Fortran interaction
Advanced Python: F2py
Python presents many advantages
However, too slow for highly intensive numerical computations
You can interface Python with C or Fortran codes Best of all worlds Pre- and post- processing in Python Intensive part in Fortran (or in C)
F2py is one solution (many others exists) Requires NumPy (> = 1.3) and Scipy_distutils (>= 0.2.2) A few lines in your fortran source code Interface file (F2py is able to auto-generate simple interface that
you can edit)
Advanced Python: F2py
Example (F90/95): <norme.f90>
subroutine norme(a,b,c)real(8), intent(in) :: a,bReal(8), intent(out) :: cc = sqrt( a*a + b*b )
end subroutine norme
Generate the interface with f2py
$> f2py -c norme.f90 -m vect \--fcompiler=gfortran --f90flags=-03
Advanced Python: F2py
From a python shell:>>> import vect>>> c = vect.norme(3,4)>>> print c5.0
Documentation is also generated by f2py:>>> print vect.norme.__doc__norme – Function signature:c = norme(a,b)Required arguments:a : input floatb : input floatReturn objects:c: float
Advanced Python: F2py
Works with lists and numpy.arrays
subroutine norme(a,c,n)integer :: n, ireal(8), dimension(n), intent(in) :: a!f2py optional, depend(a)::n=len(a)real(8), intent(out) :: creal(8) :: sumcsumc = 0.do i=1,n
sumc = sumc + a(i)*a(i)end doc = sqrt( sumc )
end subroutine norme
Advanced Python: F2py
From a python shell:
>>> from vect import *>>> a = [2,3,4] # a given list>>> type(a)<type ‘list’>>>> norme(a)5.3851648071345037
>>> from numpy import array>>> a = array([2,3,4]) # an equivalent
numpy.array>>> type(a)<type ‘numpy.ndarray’>>>> norme(a)5.3851648071345037
Advanced Python: F2py
You can do much more Callbacks are available (call python from fortran) Encapsulate fortran within python objects Compile fortran code inside python scripts …
Visit http://www.scipy.org/F2py
Advanced Python: SWIGC/C++ wrapper for python and many more (19 languages!)
Please don’t shoot… I haven’t use it yet!
Just trying to answer some of your questions
http://www.swig.org
Advanced Python: SWIG
SWIG is an interface compiler connects programs written in C and C++ with scripting languages such as Perl, Python … works by taking the declarations found in C/C++ header
files generates the wrapper code that scripting languages
need to access the underlying C/C++ code.
SWIG currently generates wrapper code for nineteen different target languages: Allegro CL, C#, CFFI, CLISP, Chicken, D, Go, Guile, Java,
Lua, Modula-3, Mzscheme, OCAML, Octave, Perl, PHP, Python, R, Ruby, Tcl, UFFI
Advanced Python: SWIG
So you want to get going in a hurry? To illustrate the use of SWIG, suppose you have some C functions you want added to Python
/* File : example.c */ #include <time.h> double My_variable = 3.0; int fact(int n) { if (n <= 1) return 1; else return n*fact(n-1); } int my_mod(int x, int y) { return (x%y); } char *get_time() { time_t ltime; time(<ime); return ctime(<ime); }
Advanced Python: SWIG
Interface file Now, in order to add these files to your favorite
language, you need to write an "interface file" which is the input to SWIG. An interface file for these C functions might look like this :
/* example.i */ %module example %{ /* Put header files here or function declarations */ extern double My_variable; extern int fact(int n); extern int my_mod(int x, int y); extern char *get_time(); %} extern double My_variable; extern int fact(int n); extern int my_mod(int x, int y); extern char *get_time();
Advanced Python: SWIG
Building a Python module Turning C code into a Python module is also easy. Simply do the following: $> swig -python example.i $> gcc -c example.c example_wrap.c \ -I/usr/local/include/python2.1 $> ld -shared example.o example_wrap.o -o _example.so
We can now use the Python module as follows : >>> import example >>> example.fact(5) 120 >>> example.my_mod(7,3) 1 >>> example.get_time() 'Sun Feb 11 23:01:07 1996’ More at http://www.swig.org
3D VisualizationMayavi2, Blender
3D Visualization: Mayavi2 A short example from the official website:
def getData():import numpy as np# Create data, z Gaussian function of x and y.np.random.seed(12345)x = 4*(np.random.random(500) – 0.5)y = 4*(np.random.random(500) – 0.5)f = lambda (x, y): np.exp(-(x**2 + y**2))z = f(x, y)return (x,y,z)
3D Visualization: Mayavi2from enthought.mayavi import mlabmlab.figure(1, fgcolor=(0, 0, 0), bgcolor=(1, 1, 1))
# get Datasetx, y, z = getData()# Visualize the pointspts = mlab.points3d(x, y, z, z, scale_mode=‘none’,
scale_factor=0.2)
# Create and visualize the meshmesh = mlab.pipeline.delaunay2d(pts)surf = mlab.pipeline.surface(mesh)
mlab.view(47, 57, 8.2, (0.1, 0.15, 0.14))mlab.show()
3D Visualization: Mayavi2
3D Visualization: Mayavi2 This example: how you can use Mayavi interactively to visualize
numpy arrays while doing numerical work with scipy.
It assumes that you are now familiar with numerical Python tools.
Use-case: study the trajectories of a particle in a potential. very common problem: orbits, particles… visualization of the potential and the trajectories
The potential: a periodic lattice, immersed in a parabolic confinement. We will shake this potential and see how the particle jumps from a hole
of the lattice to another. The parabolic confinement is there to limit the excursions of the
particle
More at http://code.enthought.com/projects/mayavi
3D Visualization: Mayavi2
Code in the appendixes
3D Visualization: Blender Blender is a free open-source 3D content creation suite
Movies are being made with it (e.g. Sintel, available on
youtube)
A suite like 3D Studio Max: 3D scenes, movies, animations
Full integration of python: scripting scenes and movies.
2.5x version: Full UI in python (calculations might be in C)
Who said science is not fun?
3D Visualization: Blender
Non scientific model of M31 (Andromeda Galaxy)
https://bitbucket.org/quiark/galaxy/.../galaxy2.blend
And you can even fly-in !!
3D Visualization: Blender N-body simulation of the Antennae Galaxy from
Florent Renaud
Python to build the scene
Blender to fly-in
Just imagine…
Installation FAQSetuptools mainly! (easy_install, pip)
Installation
How to install python? Apt-get install python / fink install python / python-
x.xx.msi(ish?)
About the “packages”: First install `setuptools` (or `pip` equiv., python package index)▪ Apt-get / fink / windows binary
Then easy_install [opts] <package name>▪ E.g. easy_install numpy, easy_install pyfits…▪ Is also able to update packages (-U option)
Quid if I am not root of my computer? VirtualEnv is the fix. http://pypi.python.org/pypi/virtualenv▪ Isolated python envir. (bottle)
That’s all folks !!!You got out of the egg ...
You have now the materials to go on by yourself… Soon you will dance to the flute songs
AppendixesOne day is quite short to show everything…
3D Visualization: tuto Mayavi2 This example: how you can use Mayavi interactively to visualize
numpy arrays while doing numerical work with scipy.
It assumes that you are now familiar with numerical Python tools.
Use-case: study the trajectories of a particle in a potential. very common problem: orbits, particles… visualization of the potential and the trajectories
The potential: a periodic lattice, immersed in a parabolic confinement. We will shake this potential and see how the particle jumps from a hole
of the lattice to another. The parabolic confinement is there to limit the excursions of the
particle
More at http://code.enthought.com/projects/mayavi
3D Visualization: tuto Mayavi2 Use the mlab module to interactively visualize the
data. For this it is best to use an interactive Python shell Mayavi2 application has a built-in interactive shell also
Let us visualize the 3D isosurfaces of the potential:>>> from enthought.mayavi import mlab>>> mlab.contour3d(X, Y, Z, V)
We can interact with the visualization Rotation, zoom, … In particular vary the iso-surfaces (maya-tree)
3D Visualization: tuto Mayavi2 We want to study the motion of a particle in this potential
We derive the corresponding forces from the gradient of the potential.
We create a gradient functiondef gradient(f, x, y, z, d=0.01): """ Return the gradient of f in (x, y, z). """ fx = f(x+d, y, z); fx_ = f(x-d, y, z) fy = f(x, y+d, z); fy_ = f(x, y-d, z) fz = f(x, y, z+d); fz_ = f(x, y, z-d) return ( (fx-fx_)/(2*d), (fy-fy_)/(2*d),
(fz-fz_)/(2*d) ) Let us visualize the corresponding vector field.
only along a cut for X=50, and every three points on our gridVx, Vy, Vz = gradient( V, X[50,::3,::3], Y[50,::3,::3], Z[50,::3,::3] )mlab.quiver3d( X[50,::3,::3], Y[50,::3,::3], Z[50,::3,::3],
Vx, Vy, Vz, scale_factor=-0.2, color=(1,1,1) )
3D Visualization: tuto Mayavi2 $> ipython -wthread import numpy as np
def V(x, y, z): """ Potential: A 3D sinusoidal lattice with a parabolic confinement. """ return np.cos(10*x) + np.cos(10*y) + p.cos(10*z)
+ 2*(x**2 + y**2 + z**2)
We would like to see what it looks like in 3D. To do this we create a 3D grid of points, and sample it on these
later X, Y, Z = np.mgrid[-2:2:100j, -2:2:100j, -2:2:100j]
3D Visualization: tuto Mayavi2 A good view of the potential:
turning off auto contours choosing -0.5 and 15
We can also change the colors used by selecting a different LUT (Look Up Table). select a paired as it separates well levels.
To get a better view of the potential: display more contours, but closed contours hide their interior. One solution: a cut plane. Right-click on the IsoSurface node and add a ScalarCutPlane. You
can move the cut plane by clicking on it and dragging.
Add a colorbar: mlab.colorbar(title=“Potential”)
3D Visualization: tuto Mayavi2
3D Visualization: tuto Mayavi2 We can use scipy to integrate the trajectories.
first define a dynamical flow of the system The flow is used by every ODE (ordinary differential equation)
solver The dynamics is made of the force deriving from the potential
def flow(r, t): """ The dynamical flow of the system """ x, y, z, vx, vy, vz = r fx, fy, fz = gradient(V, x-.2*np.sin(6*t),
y-.2*np.sin(6*t+1), z-.2*np.sin(6*t+2))
return np.array( (vx, vy, vz, -fx - 0.3*vx, -fy - 0.3*vy, -fz - 0.3*vz) )
3D Visualization: tuto Mayavi2 Now we can integrate the trajectory:
from scipy.integrate import odeint
R0 = (0, 0, 0, 0, 0, 0) # Initial conditionst = np.linspace(0, 50, 500) # Time SamplingR = odeint(flow, R0, t)
Finally we can plot the trajectories:
x, y, z, vx, vy, vz = R.Ttrajectory = mlab.plot3d(x, y, z, t,
colormap=“hot”,tube_radius=None)mlab.colorbar(trajectory, title=“Time”)
3D Visualization: tuto Mayavi2