python for scientific computing

Python for Scientific Computing Go Frendi Gunawan

Upload: go-asgard

Post on 23-Jun-2015


Page 1: Python for scientific computing

Python for Scientific Computing

Go Frendi Gunawan

Page 2: Python for scientific computing

Scientific Computing

● Fortran
● MATLAB
● Scilab
● GNU Octave
● Mathematica
● Python

Fortran was the first widely used programming language for scientific purposes.

MATLAB is one of the most popular programming languages for scientific computing. It is made by MathWorks, and you need to spend a lot of money to get a license. MATLAB is relatively easy for scientists who are not programmers. It uses the matrix as its basic data type, which is good for you if you are familiar with discrete mathematics.

GNU Octave and Scilab are open source alternatives to MATLAB.

Mathematica, created by Wolfram Research, is also used for scientific purposes.

Python is a general purpose programming language created by Guido van Rossum. Although it was not created solely for scientific purposes, it serves them well because it has so many free libraries.

Page 3: Python for scientific computing

Why Python?

● Free & open source
● General purpose
● Readable
● Has many libraries

Python is free and open source, so you can use it at no cost. Using free and open source software matters, because science should be accessible to everyone.

Python was not created only for scientific purposes. It is a general purpose language that can be used for almost everything, such as web development, scripting, automation, etc.

Python is readable because "readability counts". Using a readable language makes your programs much easier to maintain.

Python has many libraries, like numpy, scipy, scikit-learn and matplotlib. With those libraries you can do most of what you would do in MATLAB by using Python.

Page 4: Python for scientific computing

How easy is Python?

Python:

    # show prompt and read input
    your_name = raw_input("What's your name? ")

Java:

    // show prompt
    System.out.print("What's your name? ");
    // read input
    Scanner keyboard = new Scanner(System.in);
    String your_name = keyboard.nextLine();

Python:

    # define array (or list)
    fruits = ["Strawberry", "Orange", "Grape"]

Java:

    // define array
    String[] fruits = {"Strawberry", "Orange", "Grape"};

Python:

    # print out array components
    for fruit in fruits:
        print(fruit)

Java:

    // print out array components
    for (int i = 0; i < fruits.length; i++) {
        System.out.println(fruits[i]);
    }

Python:

    # swap two variables' values
    a = 5
    b = 7
    a, b = b, a

Java:

    // swap two variables' values
    int a = 5;
    int b = 7;
    int c = a;
    a = b;
    b = c;

To make it clear, I compare Python and Java. Not only do you need to write less, but the Python code is also more readable and intuitive.

It is easy enough that I'm sure even Hirasawa Yui could understand Python in a few days :)

Page 5: Python for scientific computing

Let's get started

● Debian and Ubuntu users can run:
    sudo apt-get install python-numpy python-scipy python-sklearn python-matplotlib
  (on newer releases the packages are prefixed with python3-, for example python3-numpy)

● Windows users can download the batteries-included package called Canopy (formerly EPD Free):
    https://www.enthought.com/products/canopy/

Most Linux distributions come with Python pre-installed, so all you need to do is install the "scientific" libraries, like numpy, scipy, sklearn and matplotlib. That's all.

Unlike Linux, Windows doesn't come with Python pre-installed, so you need to install Python before installing the libraries. There is no need to worry, because Enthought Inc. provides a package with batteries included. All you need to do is download the package and keep clicking the Next button (again, I think Hirasawa Yui can also do this easily).

Since Python is cross-platform, you can write your code on Windows and run it on Linux, or vice versa. Java is not the only "write once, run everywhere" language anymore.

Page 6: Python for scientific computing

Nice to meet you - はじめまして

● Python uses indentation to mark blocks.

● You don't need semicolons, dollar signs, curly braces or other mythical characters. You can write Python code in a normal way, with no need to type $0meThing LIKE '%this%'.

● Python is case sensitive, so THIS is different from This, and neither is equal to this. True is not true, and False is not FALSE.
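To make the points in the list above concrete, here is a small sketch written for Python 3. The variable names (temperature, messages and so on) are made up for illustration and are not part of the original slides.

```python
# Indentation defines the block: both indented lines belong to the if.
temperature = 31
messages = []
if temperature > 30:
    messages.append("hot")
    messages.append("drink water")
messages.append("done")  # not indented, so it runs regardless of the if

# Names are case sensitive: these are three different variables.
this = 1
This = 2
THIS = 3

# True is a built-in constant; lowercase true is just an undefined name.
try:
    true
    lowercase_true_exists = True
except NameError:
    lowercase_true_exists = False

print(messages)               # ['hot', 'drink water', 'done']
print(this, This, THIS)       # 1 2 3
print(lowercase_true_exists)  # False
```

Changing the indentation of a line changes which block it belongs to, so it is worth picking one style (four spaces is the usual choice) and sticking to it.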

This is just a brief introduction to Python's syntax. I hope you get the idea.

Learning Python is actually easy, but mastering it takes more effort. To understand Python better, please take a look at http://www.diveintopython.net/

In the meantime, you don't need to master it yet.

Page 7: Python for scientific computing

Show something

    # integer value
    print(1)
    print(1+1)

    # integer variable
    a = 2
    print(1+a)

    a = a+2
    print(a)
    print(1+a)
    print(a*2-2)

    # string
    print("Ok,..")

    # list
    a = [1,2,3,4,5,6,7]
    print(a)
    print(a[0])
    print(a[-1])
    print(a[1:5])
    print(a[1:])
    print(a[:5])

    # dictionary
    b = {"name" : "Yui", "position" : "guitar", "age" : 17}
    print(b["name"])
    print(b["age"])

Look, look everyone...

You can show any value or variable by using print. In Python 2.x, print is a statement and is written as: print "some_value". In Python 3.x, print is a function and is written as: print("some_value")

It is better to write code that runs on both Python 2.x and 3.x, which is why I use print("some_value") in the examples.

Python supports various data types including, but not limited to, int, float, str, list, dict and tuple. There is no separate double or char type: a float is already double precision, and a single character is just a str of length one. You can get more information about Python data types at http://www.diveintopython.net/native_data_types/index.html
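As a quick check of the data types mentioned here, the following sketch (Python 3) asks each value for its type and repeats the list slices from the slide with their results. The values list is only an illustration.

```python
# One sample value per built-in type mentioned in the text.
values = [1, 2.5, "Ok", [1, 2, 3], {"name": "Yui"}, (4, 5)]
type_names = [type(v).__name__ for v in values]
print(type_names)  # ['int', 'float', 'str', 'list', 'dict', 'tuple']

# Indexing and slicing a list: the start is included, the stop is excluded.
a = [1, 2, 3, 4, 5, 6, 7]
print(a[0])    # 1                (first item)
print(a[-1])   # 7                (last item)
print(a[1:5])  # [2, 3, 4, 5]     (items at index 1 up to, not including, 5)
print(a[1:])   # [2, 3, 4, 5, 6, 7]
print(a[:5])   # [1, 2, 3, 4, 5]
```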

Page 8: Python for scientific computing

Ask for something

    # Ask for a string value
    name = raw_input("Your name? ")
    print("hello " + name)

    # Ask for an int value, conversion needed
    age = int(raw_input("Your age? "))
    print("your age is " + str(age))

    # Now, let's use a dictionary
    person = {}
    person["name"] = raw_input("Your name? ")
    person["age"] = int(raw_input("Your age? "))
    print("hello " + person["name"] + " you are " + str(person["age"]) + " years old")

Do you have candy?

raw_input is not the only way to ask for input in Python 2. We can also use the input function instead, which is easier yet less secure.

Using input, you can do something like the following:

    have_candy = input("Have some candy? ")
    if have_candy:
        how_many = input("How many? ")
        how_many = how_many - 1
        print("Now you only have " + str(how_many))

First, the program shows a prompt asking whether you have candies or not. You can type True directly, and it is evaluated as a boolean. Next (if you typed True), the program asks how many candies you have, and you can type an integer value as the answer.

But this is less secure, because input evaluates whatever you type as Python code, for example 1+1, 5==5, etc.

On the other hand, raw_input treats every answer as a str. Therefore, conversion is needed. (In Python 3, raw_input was renamed to input and the evaluating version was removed, so input always returns a str.)
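If you want the convenience of typing True or 3 without letting the user run arbitrary code, one option is to read the answer as text and convert it yourself. The sketch below is written for Python 3; parse_count and parse_literal are hypothetical helper names invented for this example, while int and ast.literal_eval come from the standard library.

```python
import ast


def parse_count(text):
    """Convert user text to an int; return None if it is not a whole number."""
    try:
        return int(text.strip())
    except ValueError:
        return None


def parse_literal(text):
    """Parse a plain Python literal (True, 3, [1, 2]) without executing code."""
    try:
        return ast.literal_eval(text.strip())
    except (ValueError, SyntaxError):
        return None


print(parse_count("12"))                  # 12
print(parse_count("1+1"))                 # None (expressions are rejected)
print(parse_literal("True"))              # True
print(parse_literal("__import__('os')"))  # None (function calls are rejected)
```

In a real program you would pass the string returned by input (Python 3) or raw_input (Python 2) to these helpers and ask again when the result is None.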

Page 9: Python for scientific computing

Branching: Choose one and only one !!!

    a = int(raw_input("Gimme a number "))
    if a < 5:
        print("Too few")
    elif a < 10:
        print("Fine...")
    else:
        print("Too much")

Branching in Python is straightforward. Notice that you don't need curly braces or "begin-end" around each block, but you do need a colon (:) at the end of each condition.

If a is less than 5, the program shows "Too few". If a is not less than 5 but is still less than 10, the program shows "Fine...". Otherwise (a is 10 or more), the program shows "Too much".

To make comparisons in Python, you can use the following operators: <, >, !=, ==, <=, >= (Python 2 also accepts <> for "not equal", but it was removed in Python 3).

You can also combine conditions with the "and" and "or" operators, like this:

    if a < 5 and a > 3:
        print("a must be 4")

Notice that in Python we use elif instead of else if.
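The same branching logic can be wrapped in a function so it is easy to try with several numbers. This is a small Python 3 sketch; the function name judge is made up for the example.

```python
def judge(a):
    """Return the message the slide's if/elif/else would print for a."""
    if a < 5:
        return "Too few"
    elif a < 10:
        return "Fine..."
    else:
        return "Too much"


results = [judge(n) for n in (3, 5, 9, 10)]
print(results)  # ['Too few', 'Fine...', 'Fine...', 'Too much']

# Comparisons can also be chained: this is the same as 3 < a and a < 5.
a = 4
print(3 < a < 5)  # True
print(a != 5)     # True
```

Only the first matching branch runs, which is why 5 and 9 both end up in the "Fine..." branch even though they are also "not less than 5".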

Page 10: Python for scientific computing

Looping: Do it over, over and over again

There are 2 kinds of loops in Python:

● While

    cake_list = ["tart", "donut", "shortcake"]
    i = 0
    while i < len(cake_list):
        print(cake_list[i])
        i = i+1

● For

    cake_list = ["tart", "donut", "shortcake"]
    for cake in cake_list:
        print(cake)

    while hungry:
        eat()

Using python just to say that?

Unlike in many other programming languages, the for loop in Python iterates over the items of a list (or any other sequence) instead of counting an index.

However, you can still do something like this:

    for i in [0,1,2,3,4]:
        print(i)

You can also use xrange to make it even easier (in Python 3, use range instead):

    for i in xrange(5):
        print(i)

If you want to make a countdown loop, you can use the following, which prints 4, 3, 2, 1:

    for i in xrange(4,0,-1):
        print(i)
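Here is a Python 3 version of the loops above, assuming range in place of xrange. It also uses enumerate, which is not in the slides, as one way to get both the index and the item without managing a counter by hand.

```python
cake_list = ["tart", "donut", "shortcake"]

# for loop with index and item together
numbered = []
for i, cake in enumerate(cake_list):
    numbered.append(str(i) + ": " + cake)
print(numbered)  # ['0: tart', '1: donut', '2: shortcake']

# countdown: start at 4, stop before 0, step -1
countdown = list(range(4, 0, -1))
print(countdown)  # [4, 3, 2, 1]

# while loop: add the numbers 1 to 5
total = 0
n = 1
while n <= 5:
    total = total + n
    n = n + 1
print(total)  # 15
```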

Page 11: Python for scientific computing

np.array: Play with matrices easily

    import numpy as np

    a = np.array([[1,2,3],[4,5,6]])
    print(a.shape)
    print(a.T)
    print(a.diagonal())
    print(a.max())
    print(a.min())
    print(a.mean())
    print(a.reshape(6,1))
    print(np.dot(a,a.T))

Homework done ...

The NumPy library lets you use MATLAB-like matrices. To make a NumPy array, use np.array(list).

A NumPy array has various attributes and methods, including (but not limited to):
● shape: get the matrix shape
● T (or transpose): transpose the matrix
● diagonal: get the matrix's diagonal
● max: find the maximum element in the array
● min: find the minimum element in the array
● mean: find the mean of the elements in the array
● std: find the standard deviation of the elements in the array
● reshape: change the matrix shape
● dot: do the matrix product (also available as np.dot)
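To see what these attributes and methods return, here is the slide's 2 x 3 matrix again with the results written as comments. This is a sketch for Python 3 with NumPy installed.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

print(a.shape)                # (2, 3): two rows, three columns
print(a.T.shape)              # (3, 2): the transpose swaps rows and columns
print(a.diagonal())           # [1 5]
print(a.max(), a.min())       # 6 1
print(a.mean())               # 3.5
print(a.reshape(6, 1).shape)  # (6, 1): same six values, one per row

# Matrix product of a (2 x 3) with its transpose (3 x 2) gives a 2 x 2 matrix.
product = np.dot(a, a.T)
print(product.tolist())       # [[14, 32], [32, 77]]
```

Note that shape and T are attributes (no parentheses), while diagonal, max, min, mean, std and reshape are methods that you call.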

Page 12: Python for scientific computing

matplotlib.pyplot: Your visual friend

    import numpy as np
    import matplotlib.pyplot as plt

    x = np.array([1,2,3,4,5])
    # y will consist of the square of each item of x
    y = x**2
    plt.plot(x,y)     # make the line plot
    plt.scatter(x,y)  # make the point plot
    plt.show()        # show the plot

    # read an image and show it
    img = plt.imread('/home/gofrendi/k-on.png')
    plt.imshow(img)
    plt.show()

Umm, what are x and y?

Some kind of food I guess ...

matplotlib.pyplot lets you do plotting. A graphical representation of numbers helps us analyze things better.

The plot function draws a line that connects the points in order, while the scatter function draws each (x, y) pair as a separate dot.

The imread function reads an image file into a variable (a NumPy array), which imshow can then display.

Page 13: Python for scientific computing

Classification

● Yui has
    strength = 4.2
    agility = 10.6

● Should she become monk, barbarian or demon hunter?

No... I don't want to be a monk

Classification is one of the most popular topics in computer science. A "smart system" usually learns from data (called the training set) before it is able to do the classification task.

The trained classifier can then predict the class of new items as needed.

Strength and agility are usually called features or dimensions, while monk, barbarian and demon hunter are usually called classes or labels.

Page 14: Python for scientific computing

Let's Python

    # columns: strength, agility, class
    table = np.array([
        [8.4, 1, 2],
        [1.6, 10, 1],
        [3.6, 8.7, 2],
        [1.2, 6.7, 3],
        [8, 2.8, 2],
        [9.3, 7.3, 1],
        [4.3, 2.8, 3],
        [6.6, 7.5, 2],
        [8.5, 8.7, 1],
        [0.4, 1.7, 3],
        [8.2, 6.1, 1],
        [4.5, 5, 3],
        [3.3, 6.2, 2],
        [5.6, 5.2, 2],
        [2.4, 2.9, 3],
        [9.2, 9.6, 1],
        [7.2, 1.4, 2],
        [3, 9.9, 1],
        [2.7, 4, 3]
    ])

1 = barbarian

2 = monk

3 = demon hunter

Barbarian??? It sounds like barbeque

How could "barbarian" become "barbeque"?

Because the table is a numeric NumPy array, every target class is represented by a number. In this case, 1 is for barbarian, 2 is for monk, and 3 is for demon hunter.
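One way to keep the numbers readable is to store the mapping in a dictionary and translate back to names when needed. The sketch below (Python 3) uses only the first four rows of the table to stay short; class_names, data, target and labels are illustrative names.

```python
import numpy as np

# First four rows of the table: strength, agility, class number.
table = np.array([
    [8.4, 1, 2],
    [1.6, 10, 1],
    [3.6, 8.7, 2],
    [1.2, 6.7, 3],
])

# Mapping from class number to class name, as listed on the slide.
class_names = {1: "barbarian", 2: "monk", 3: "demon hunter"}

data = table[:, :2]               # features: strength and agility
target = table[:, 2].astype(int)  # class numbers as integers

labels = [class_names[int(t)] for t in target]
print(data.shape)  # (4, 2)
print(labels)      # ['monk', 'barbarian', 'monk', 'demon hunter']
```

The slice table[:, :2] means "all rows, the first two columns", and table[:, 2] means "all rows, the third column".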

Page 15: Python for scientific computing

Do some plotting

    x = table[:, 0]
    y = table[:, 1]
    target = table[:, 2]

    import matplotlib.pyplot as plt
    # The others
    plt.scatter(x,y, c=target)
    # Yui
    plt.plot(4.2,10.6, 'r^')
    plt.show()

[Figure: scatter plot of the data. Legend: Barbarian, Monk, Demon Hunter, Yui]

Ah, safe... I don't need to be a monk ...

Graphical representation is usually more meaningful compared to numeric representation. Can you guess what Yui should be?

Page 16: Python for scientific computing

sklearn: The Classifiers

    # K-Nearest Neighbors
    from sklearn.neighbors import KNeighborsClassifier
    # Support Vector Machine
    from sklearn.svm import SVC
    # Decision Tree
    from sklearn.tree import DecisionTreeClassifier
    # Random Forest
    from sklearn.ensemble import RandomForestClassifier
    # Naive Bayes
    from sklearn.naive_bayes import GaussianNB
    # Linear Discriminant Analysis (older releases: from sklearn.lda import LDA)
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    # Quadratic Discriminant Analysis (older releases: from sklearn.qda import QDA)
    from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis

All of sklearn's classifiers have the same interface. They have fit and predict methods, which we will see in the next slide.

Page 17: Python for scientific computing

Do classification in a few lines

    # prepare data & target
    table = np.array([ [8.4, 1, 2], [1.6, 10, 1], …, [2.7, 4, 3] ])
    data = table[:,:2]
    target = table[:,2]

    # use SVM as classifier
    from sklearn.svm import SVC
    classifier = SVC()
    # fit classifier (learning phase)
    classifier.fit(data, target)
    # get classifier's learning result
    prediction = classifier.predict(data)

    # count correct & wrong predictions
    correct_prediction = 0
    wrong_prediction = 0
    for i in xrange(len(prediction)):
        if prediction[i] == target[i]:
            correct_prediction += 1
        else:
            wrong_prediction += 1
    print("correct prediction : "+str(correct_prediction))
    print("wrong prediction : "+str(wrong_prediction))

You got 100, Mr. Classifier...

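SVM (Support Vector Machine) is one of the most widely used classifiers. We won't talk about the theory here, but you can see how well it fits this data. Keep in mind that the predictions above are made on the same data the classifier was trained on, so the count shows how well it learned the training set, not how well it will handle new data.

Besides SVM, you can also use Decision Tree, Random Forest, LDA, QDA and the others in exactly the same way.

For instance, we can use Gaussian Naive Bayes as follows:

    from sklearn.naive_bayes import GaussianNB
    classifier = GaussianNB()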
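Because every classifier shares the same fit and predict methods, it is easy to try several of them in one loop. The sketch below is not from the original slides: it assumes Python 3 and a recent scikit-learn, uses the full table from the earlier slide, and holds back part of the data with train_test_split so each classifier is scored on rows it has not seen. The split size and random_state values are arbitrary choices for the illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# columns: strength, agility, class (1 = barbarian, 2 = monk, 3 = demon hunter)
table = np.array([
    [8.4, 1, 2], [1.6, 10, 1], [3.6, 8.7, 2], [1.2, 6.7, 3], [8, 2.8, 2],
    [9.3, 7.3, 1], [4.3, 2.8, 3], [6.6, 7.5, 2], [8.5, 8.7, 1], [0.4, 1.7, 3],
    [8.2, 6.1, 1], [4.5, 5, 3], [3.3, 6.2, 2], [5.6, 5.2, 2], [2.4, 2.9, 3],
    [9.2, 9.6, 1], [7.2, 1.4, 2], [3, 9.9, 1], [2.7, 4, 3],
])
data = table[:, :2]
target = table[:, 2].astype(int)

# Keep about 30% of the rows aside for testing; stratify keeps all classes
# represented in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    data, target, test_size=0.3, random_state=0, stratify=target)

# Same two calls for every classifier: fit on training rows, score on test rows.
scores = {}
for classifier in (SVC(), GaussianNB(), DecisionTreeClassifier(random_state=0)):
    classifier.fit(X_train, y_train)
    scores[type(classifier).__name__] = classifier.score(X_test, y_test)

for name, accuracy in scores.items():
    print(name, accuracy)
```

With only 19 rows the scores will vary a lot depending on which rows end up in the test part, so treat them as a rough indication rather than a ranking of the classifiers.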

Page 18: Python for scientific computing

See the classifier visually

    # look for max and min of x & y
    x = data[:,0]
    y = data[:,1]
    x_max, x_min = x.max()+1, x.min()-1
    y_max, y_min = y.max()+1, y.min()-1
    # prepare the grid of x (abscissa) and y (ordinate) values
    xx, yy = np.meshgrid(np.arange(x_min, x_max, 0.01),
                         np.arange(y_min, y_max, 0.01))

    # let the classifier predict each point
    Z = classifier.predict(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)

    # plot the contour of xx, yy and Z
    plt.contourf(xx, yy, Z)
    # plot the points
    plt.scatter(x,y,c=prediction)
    # see where Yui really is
    plt.plot(4.2,10.6, 'r^')
    plt.show()

I am Demon Hunter

Looking at the classifier visually usually gives you an idea of its characteristics.

This also lets you find out what's wrong with the classification process.
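The meshgrid, ravel and np.c_ steps are the least obvious part of the plotting code, so here is a tiny grid you can inspect by hand. This is a sketch for Python 3 with NumPy; instead of a trained classifier it uses a made-up rule (label 1 where x is at least 1) purely to show how the predictions are folded back into the grid shape.

```python
import numpy as np

# A 3-wide, 2-high grid: x takes the values 0, 1, 2 and y takes 0, 1.
xx, yy = np.meshgrid(np.arange(0, 3, 1), np.arange(0, 2, 1))
print(xx.tolist())  # [[0, 1, 2], [0, 1, 2]]
print(yy.tolist())  # [[0, 0, 0], [1, 1, 1]]

# ravel flattens each grid; np.c_ pairs them up as one (x, y) point per row.
points = np.c_[xx.ravel(), yy.ravel()]
print(points.tolist())
# [[0, 0], [1, 0], [2, 0], [0, 1], [1, 1], [2, 1]]

# Stand-in for classifier.predict(points): a hypothetical rule, not a real model.
Z = (points[:, 0] >= 1).astype(int)

# reshape puts one label back at each grid position, ready for contourf.
Z = Z.reshape(xx.shape)
print(Z.tolist())   # [[0, 1, 1], [0, 1, 1]]
```

In the slide's code the grid step is 0.01, so the same idea produces a very fine grid and the contour plot looks like smooth colored regions.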

Page 19: Python for scientific computing

Each classifier has its own characteristics

● Check this out:
    http://scikit-learn.org/stable/auto_examples/plot_classifier_comparison.html#example-plot-classifier-comparison-py

Just as people have different characteristics and expertise, classifiers also differ from each other.

It's important to choose the classifier that best matches your case to get the best result.

Page 20: Python for scientific computing

Thank you ...

ありがとう ...

谢谢 ...