

02/03/2020 Matplotlib-TUTORIALS

localhost:8889/lab 1/32

PYTHON FOR DATA SCIENCE

Visualisation

Matplotlib & Seaborn

Matplotlib is a Python 2D plotting library which produces publication-quality figures in a variety of hardcopy formats and interactive environments across platforms. Matplotlib can be used in Python scripts, the Python and IPython shells, the Jupyter notebook, web application servers, and four graphical user interface toolkits.

Third party packages

A large number of third party packages extend and build on Matplotlib functionality, including several higher-level plotting interfaces (seaborn, holoviews, ggplot, ...), and two projection and mapping toolkits (basemap and cartopy).

matplotlib.pyplot is a collection of command style functions that make matplotlib work like MATLAB. Each pyplot function makes some change to a figure: e.g., creates a figure, creates a plotting area in a figure, plots some lines in a plotting area, decorates the plot with labels, etc.

In matplotlib.pyplot various states are preserved across function calls, so that it keeps track of things like the current figure and plotting area, and the plotting functions are directed to the current axes (please note that "axes" here and in most places in the documentation refers to the axes part of a figure and not the strict mathematical term for more than one axis).
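The "current figure" and "current axes" bookkeeping described above can be observed directly. The following is a minimal sketch, not part of the original notebook; the Agg backend line is an assumption made so the snippet also runs outside Jupyter.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; omit this line inside Jupyter
import matplotlib.pyplot as plt

fig = plt.figure()          # this figure becomes the "current figure"
ax = fig.add_subplot(111)   # this axes becomes the "current axes"

# pyplot functions act on whatever is current, without being told which axes to use
plt.plot([1, 2, 3], [2, 4, 6])
plt.title("drawn on the current axes")

same_figure = plt.gcf() is fig   # gcf = "get current figure"
same_axes = plt.gca() is ax      # gca = "get current axes"
n_lines = len(ax.lines)          # lines that plt.plot added to ax
print(same_figure, same_axes, n_lines)
```

Because plt.plot and plt.title never receive the axes as an argument, the only way they can reach ax is through the state that pyplot keeps between calls.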

Tip: In Jupyter Notebook, you can also include %matplotlib inline to display your plots inside your notebook.

Load the required libraries

In [101]:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

%matplotlib inline
#plt.plot?

Plot a point


In [82]:

plt.plot(4, 3, '.')

Out[82]:

[<matplotlib.lines.Line2D at 0x1a201c6d68>]

Plot a number of points

In [102]:

x = np.array([2,4,6,8,10,12,14,16])
y = x/2

plt.figure(figsize=(10,5))
plt.scatter(x, y, c='green')
plt.show()


Label the axes

In [104]:

plt.plot([1, 2, 3, 4])
plt.ylabel('vertical')
plt.xlabel('horizontal')
plt.show()

Create multiple plots with subplots


In [129]:

names = ['class1', 'class2', 'class3', 'class4', 'class5', 'class6', 'class7']
scores = [10, 15, 20, 30, 40, 50, 100]

plt.figure(figsize=(15, 8))

plt.subplot(131)  # find the meaning of the parameter inside the subplot function
plt.bar(names, scores)
plt.subplot(132)
plt.scatter(names, scores)
plt.subplot(133)
plt.plot(names, scores)
plt.suptitle('Categorical Plotting')  # you can give titles, xlabels and ylabels to each of the plots as well
plt.show()
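As the last comment in the cell says, each plot can carry its own title and axis labels. One possible way to do that is sketched below using plt.subplots, which returns the axes objects explicitly. The shortened data, the panel titles and the axis labels are illustrative choices, and the Agg backend line is an assumption for running outside Jupyter.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; omit this line inside Jupyter
import matplotlib.pyplot as plt

names = ['class1', 'class2', 'class3']   # shortened illustrative data
scores = [10, 15, 20]

# one row, three columns: axes is an array holding three Axes objects
fig, axes = plt.subplots(1, 3, figsize=(15, 5))

axes[0].bar(names, scores)
axes[0].set_title('Bar')
axes[1].scatter(names, scores)
axes[1].set_title('Scatter')
axes[2].plot(names, scores)
axes[2].set_title('Line')

# labels are set per axes, so every panel gets its own
for ax in axes:
    ax.set_xlabel('class')
    ax.set_ylabel('score')

fig.suptitle('Categorical Plotting')
```

Holding on to the axes objects avoids relying on which subplot is "current" when a title or label is set.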


What are the differences between add_axes and add_subplot?

The calling signature of add_axes is add_axes(rect), where rect is a list [x0, y0, width, height] denoting the lower left point of the new axes in figure coordinates (x0, y0) and its width and height. So the axes is positioned in absolute coordinates on the canvas.

The calling signature of add_subplot does not directly provide the option to place the axes at a predefined position. Instead, it lets you specify where the axes should sit on a subplot grid. The usual and easiest way to specify this position is the three-integer notation,

e.g. ax = fig.add_subplot(231)

In this example a new axes is created at the first position (1) on a grid of 2 rows and 3 columns. To produce only a single axes, add_subplot(111) would be used (first plot on a 1 by 1 subplot grid). (In newer matplotlib versions, add_subplot() without any arguments is possible as well.)
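The difference can be checked on a single figure. This is a small sketch rather than code from the notebook: the rectangle values are arbitrary, and the Agg backend line is an assumption for running outside Jupyter.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; omit this line inside Jupyter
import matplotlib.pyplot as plt

fig = plt.figure()

# add_axes: absolute placement, rect = [x0, y0, width, height] in figure coordinates
ax_abs = fig.add_axes([0.1, 0.2, 0.5, 0.6])

# add_subplot: placement on a grid, here position 1 of a 2-row by 3-column grid
ax_grid = fig.add_subplot(231)

bounds = ax_abs.get_position().bounds                  # (x0, y0, width, height)
geometry = ax_grid.get_subplotspec().get_geometry()    # (rows, cols, start, stop), zero-based
print(bounds)
print(geometry)
```

The first axes reports back the rectangle it was given, while the second only knows its grid cell; its actual rectangle is derived from the grid by matplotlib.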

Seaborn

Seaborn comes with a large number of high-level interfaces and customized themes that matplotlib lacks, which matters because with matplotlib alone it can be difficult to figure out the settings that make plots attractive.

Most matplotlib functions also do not work as directly with dataframes as seaborn's do.

NB: Seaborn visualisations are based on matplotlib

In [107]:

import seaborn as sns

Let's load a dataset to be used

In [108]:

ourdata=pd.read_excel("Pokemon.xls")


In [111]:

ourdata.head()

Out[111]:

         Name Type 1  Type 2  Total  HP  Attack  Defense  Atk  Def  Speed  Stage Legenda
0   Bulbasaur  Grass  Poison    318  45      49       49   65   65     45      1     Fal
1     Ivysaur  Grass  Poison    405  60      62       63   80   80     60      2     Fal
2    Venusaur  Grass  Poison    525  80      82       83  100  100     80      3     Fal
3  Charmander   Fire     NaN    309  39      52       43   60   50     65      1     Fal
4  Charmeleon   Fire     NaN    405  58      64       58   80   65     80      2     Fal

In [112]:

# lmplot() quickly plots the linear relationship between two variables ("lm" stands for linear regression model)
sns.lmplot(x='Attack', y='Defense', data=ourdata)
plt.show()

No regression line and adding hue

Setting fit_reg=False to remove the regression line


In [43]:

sns.lmplot(x='Attack', y='Defense', data=ourdata, fit_reg=False, hue='Stage')

Out[43]:

<seaborn.axisgrid.FacetGrid at 0x1a1e37a860>

We set hue='Stage' to color our points by the Pokémon's evolution stage. This hue argument is very useful because it allows you to express a third dimension of information using color.
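To see what hue does without the Pokémon file at hand, the sketch below builds a small random stand-in table (the values are made up, only the column names follow the notebook) and counts how many separate point groups seaborn draws. The Agg backend line is an assumption for running outside Jupyter.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; omit this line inside Jupyter
import numpy as np
import pandas as pd
import seaborn as sns

# hypothetical stand-in data: random Attack/Defense values, 20 rows per Stage
rng = np.random.default_rng(0)
toy = pd.DataFrame({
    'Attack': rng.integers(30, 120, size=60),
    'Defense': rng.integers(30, 120, size=60),
    'Stage': np.repeat([1, 2, 3], 20),
})

# fit_reg=False drops the regression line; hue draws one coloured group per Stage value
g = sns.lmplot(x='Attack', y='Defense', data=toy, fit_reg=False, hue='Stage')

ax = g.axes.flat[0]
n_groups = len(ax.collections)   # one scatter collection per hue level
print(n_groups)
```

Each Stage value is drawn as its own scatter collection with its own colour, which is how the third variable becomes visible on a two-dimensional plot.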


In [44]:

fig = plt.figure()
# add_axes(rect) takes rect = [x0, y0, width, height]: the lower left point (x0, y0)
# of the new axes in figure coordinates, plus its width and height, so the axes is
# positioned in absolute coordinates on the canvas
a1 = fig.add_axes([0, 0, 1, 1])

x = np.arange(1, 10)
a1.plot(x, np.exp(x), 'r')
a1.set_title('range of numbers')
plt.ylim(0, 10000)
plt.xlim(0, 10)

# explicitly set x and y labels
plt.xlabel("x-axis")
plt.ylabel('y-axis')
plt.show()


In [113]:

ourdata.head()

Out[113]:

         Name Type 1  Type 2  Total  HP  Attack  Defense  Atk  Def  Speed  Stage Legenda
0   Bulbasaur  Grass  Poison    318  45      49       49   65   65     45      1     Fal
1     Ivysaur  Grass  Poison    405  60      62       63   80   80     60      2     Fal
2    Venusaur  Grass  Poison    525  80      82       83  100  100     80      3     Fal
3  Charmander   Fire     NaN    309  39      52       43   60   50     65      1     Fal
4  Charmeleon   Fire     NaN    405  58      64       58   80   65     80      2     Fal


In [45]:

plt.figure(figsize=(15,15))
sns.boxplot(data=ourdata)


Out[45]:

<matplotlib.axes._subplots.AxesSubplot at 0x1a1df3a860>


In [46]:

# drop the unnecessary columns
ourdata1 = ourdata.drop(['Total', 'Legendary', 'Stage'], axis=1)
ourdata1.head()

Out[46]:

         Name Type 1  Type 2  HP  Attack  Defense  Atk  Def  Speed
0   Bulbasaur  Grass  Poison  45      49       49   65   65     45
1     Ivysaur  Grass  Poison  60      62       63   80   80     60
2    Venusaur  Grass  Poison  80      82       83  100  100     80
3  Charmander   Fire     NaN  39      52       43   60   50     65
4  Charmeleon   Fire     NaN  58      64       58   80   65     80


In [114]:

plt.figure(figsize=(15,15))
sns.boxplot(data=ourdata1)


Out[114]:

<matplotlib.axes._subplots.AxesSubplot at 0x1a2634e908>


In [115]:

# Calculate correlations between the numeric columns
# (select_dtypes leaves out text columns such as Name, which recent pandas versions refuse to correlate)
corr = ourdata1.select_dtypes(include='number').corr()
corr

Out[115]:

               HP    Attack   Defense       Atk       Def     Speed
HP       1.000000  0.306768  0.119782  0.236649  0.490978 -0.040939
Attack   0.306768  1.000000  0.491965  0.146312  0.369069  0.194701
Defense  0.119782  0.491965  1.000000  0.187569  0.139912 -0.053252
Atk      0.236649  0.146312  0.187569  1.000000  0.522907  0.411516
Def      0.490978  0.369069  0.139912  0.522907  1.000000  0.392656
Speed   -0.040939  0.194701 -0.053252  0.411516  0.392656  1.000000
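Each entry in this table is a Pearson correlation coefficient. As a rough illustration of what corr() computes, the sketch below recomputes one entry by hand on the five rows printed by head() earlier. With only five rows the numbers will not match the full-dataset table above, and the pearson helper is a name introduced here for illustration.

```python
import numpy as np
import pandas as pd

# the five rows shown by head(); not the full dataset
toy = pd.DataFrame({
    'Name': ['Bulbasaur', 'Ivysaur', 'Venusaur', 'Charmander', 'Charmeleon'],
    'HP': [45, 60, 80, 39, 58],
    'Attack': [49, 62, 82, 52, 64],
    'Defense': [49, 63, 83, 43, 58],
})

# keep numeric columns only, then let pandas build the correlation matrix
corr = toy.select_dtypes(include='number').corr()

def pearson(x, y):
    """Pearson correlation: covariance of x and y scaled by both spreads."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()
    return (xc * yc).sum() / np.sqrt((xc ** 2).sum() * (yc ** 2).sum())

by_hand = pearson(toy['HP'], toy['Attack'])
print(np.isclose(by_hand, corr.loc['HP', 'Attack']))
```

The matrix is symmetric with ones on the diagonal because every column correlates perfectly with itself and the formula does not depend on the order of x and y.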


In [116]:

plt.figure(figsize=(10,10))
sns.heatmap(corr)  # Creating Heatmap


Out[116]:

<matplotlib.axes._subplots.AxesSubplot at 0x1a26b2ff60>


In [117]:

ourdata1.head()

Out[117]:

         Name Type 1  Type 2  HP  Attack  Defense  Atk  Def  Speed
0   Bulbasaur  Grass  Poison  45      49       49   65   65     45
1     Ivysaur  Grass  Poison  60      62       63   80   80     60
2    Venusaur  Grass  Poison  80      82       83  100  100     80
3  Charmander   Fire     NaN  39      52       43   60   50     65
4  Charmeleon   Fire     NaN  58      64       58   80   65     80

Univariate Visualisation

Distplot

The most convenient way to take a quick look at a univariate distribution in seaborn is the distplot() function. By default, this will draw a histogram and fit a kernel density estimate (KDE). It is meant for a univariate set of observations, that is, a single variable, so we choose one particular column of the dataset.

In [118]:

# Note: distplot() is deprecated in seaborn 0.11 and later; histplot() and displot() are its replacements
sns.distplot(ourdata1['Defense'])

Out[118]:

<matplotlib.axes._subplots.AxesSubplot at 0x1a26bc97f0>

Use a boxplot to confirm your distribution


In [51]:

sns.boxplot(x=ourdata1['Defense'])

Out[51]:

<matplotlib.axes._subplots.AxesSubplot at 0x1a1d590390>

You can explicitly turn off the kde

Read about kde:

https://pythontic.com/pandas/series-plotting/kernel%20density%20estimation%20plot

https://pythontic.com/pandas/dataframe-plotting/kernel%20density%20estimation%20plot

https://en.wikipedia.org/wiki/Kernel_density_estimation

https://www.statsmodels.org/stable/examples/notebooks/generated/kernel_density.html
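For readers who want to see what a kernel density estimate actually computes, here is a minimal NumPy sketch. It is not seaborn's implementation: the gaussian_kde_1d helper, the fixed bandwidth of 10 and the evaluation grid are all assumptions made for illustration (seaborn chooses its bandwidth automatically), and the samples are just the five Defense values printed by head().

```python
import numpy as np

def gaussian_kde_1d(samples, grid, bandwidth):
    """Average of Gaussian bumps, one centred on each sample, evaluated on grid."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # distance of every grid point from every sample, in units of the bandwidth
    z = (grid[:, None] - samples[None, :]) / bandwidth
    bumps = np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)
    return bumps.sum(axis=1) / (len(samples) * bandwidth)

defense = [49, 63, 83, 43, 58]   # Defense values from the head() output
grid = np.linspace(min(defense) - 60, max(defense) + 60, 500)
density = gaussian_kde_1d(defense, grid, bandwidth=10.0)

# a density should enclose an area of about 1 (simple Riemann sum over the grid)
area = density.sum() * (grid[1] - grid[0])
print(round(float(area), 3))
```

A smaller bandwidth gives a bumpier curve that follows individual points, while a larger one gives a smoother curve; the histogram drawn alongside the KDE plays a similar role with bin width.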


In [119]:

sns.distplot(ourdata1['Defense'], kde=False)

Out[119]:

<matplotlib.axes._subplots.AxesSubplot at 0x1a26d92240>

We can also use kdeplot() to plot only the kde


In [120]:

sns.kdeplot(ourdata1['Defense'])
plt.show()

We can also shade the area under the kde curve for better visualisation by using shade=True

In [54]:

# shade=True shades the area under the curve for a better view
# (newer seaborn releases use fill=True for the same effect)
sns.kdeplot(ourdata1['Defense'], shade=True)
plt.show()

Bivariate distributions: visualisations

Jointplot


In [121]:

ourdata.head()

Out[121]:

         Name Type 1  Type 2  Total  HP  Attack  Defense  Atk  Def  Speed  Stage Legenda
0   Bulbasaur  Grass  Poison    318  45      49       49   65   65     45      1     Fal
1     Ivysaur  Grass  Poison    405  60      62       63   80   80     60      2     Fal
2    Venusaur  Grass  Poison    525  80      82       83  100  100     80      3     Fal
3  Charmander   Fire     NaN    309  39      52       43   60   50     65      1     Fal
4  Charmeleon   Fire     NaN    405  58      64       58   80   65     80      2     Fal

In [55]:

# x and y are passed by keyword; recent seaborn versions no longer accept them positionally
sns.jointplot(x='Defense', y='Attack', data=ourdata)

Out[55]:

<seaborn.axisgrid.JointGrid at 0x1a1dac25f8>

As you can see, a histogram is plotted for Defense and another for Attack, with a scatter plot of Defense against Attack between them.


Confirm their correlation

In [56]:

ourdata1[['Defense','Attack']].corr()

Out[56]:

          Defense    Attack
Defense  1.000000  0.491965
Attack   0.491965  1.000000

We can also explicitly set the 'kind' of visualisation to be displayed

e.g.: kind="scatter" or "reg" or "resid" or "kde" or "hex"

In [96]:

#NB: use shift and tab to get more info about a particular function

sns.jointplot(x='Defense', y='Attack', data=ourdata, kind='kde')
plt.show()


In [122]:

sns.jointplot(x='Defense', y='Attack', data=ourdata, kind='reg')
plt.show()

#good when you want to explain the residuals

Visualising more than two variables: Pairwise Bivariate Distributions Using pairplot()


In [98]:

sns.pairplot(ourdata[['Defense','Attack','HP']], kind='scatter')
plt.show()


You can change the diagonal kind

Let's try diag_kind='kde'

In [99]:

sns.pairplot(ourdata[['Defense','Attack','HP']], kind='scatter', diag_kind='kde')
plt.show()


In [61]:

ourdata.head()

Out[61]:

         Name Type 1  Type 2  Total  HP  Attack  Defense  Atk  Def  Speed  Stage Legenda
0   Bulbasaur  Grass  Poison    318  45      49       49   65   65     45      1     Fal
1     Ivysaur  Grass  Poison    405  60      62       63   80   80     60      2     Fal
2    Venusaur  Grass  Poison    525  80      82       83  100  100     80      3     Fal
3  Charmander   Fire     NaN    309  39      52       43   60   50     65      1     Fal
4  Charmeleon   Fire     NaN    405  58      64       58   80   65     80      2     Fal

Categorical Data Visualisation

In [123]:

data=pd.read_csv('Automobile.csv')

In [125]:

data.head()

Out[125]:

   symboling  normalized_losses         make fuel_type aspiration number_of_doors   body_style
0          3                168  alfa-romero       gas        std             two  convertible
1          3                168  alfa-romero       gas        std             two  convertible
2          1                168  alfa-romero       gas        std             two    hatchback
3          2                164         audi       gas        std            four        sedan
4          2                164         audi       gas        std            four        sedan

5 rows × 26 columns


In [64]:

# x and y are passed by keyword; recent seaborn versions no longer accept them positionally
sns.stripplot(x='number_of_doors', y='horsepower', data=data)

Out[64]:

<matplotlib.axes._subplots.AxesSubplot at 0x1a1e4352b0>

Cars with two doors tend to have higher horsepower than cars with four doors

In [65]:

sns.boxplot(x='number_of_doors', y='horsepower', data=data)

Out[65]:

<matplotlib.axes._subplots.AxesSubplot at 0x1a1e37ab00>
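A visual comparison between categories can be backed up with a grouped summary table. The sketch below uses a tiny hand-made table: the horsepower numbers are invented for illustration and are not taken from Automobile.csv; only the column names follow the notebook.

```python
import pandas as pd

# hypothetical values, NOT taken from Automobile.csv
toy = pd.DataFrame({
    'number_of_doors': ['two', 'two', 'two', 'four', 'four', 'four'],
    'horsepower': [111, 154, 160, 102, 115, 110],
})

# one row per category: how many cars, and their typical horsepower
summary = toy.groupby('number_of_doors')['horsepower'].agg(['count', 'mean', 'median'])
print(summary)
```

The mean column corresponds to what a bar plot of these two variables displays by default, and the median to the line drawn inside each box of a box plot, so the table gives the numbers behind both pictures.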


In [66]:

sns.barplot(x='number_of_doors', y='horsepower', data=data)

Out[66]:

<matplotlib.axes._subplots.AxesSubplot at 0x1a1d823d30>

Perform similar operations with the other variables

Read more: https://seaborn.pydata.org/introduction.html

MrBriit