Using Docker for Data Science - Part 2

Upload: calvin-giles

Post on 14-Jul-2015

USING DOCKER FOR DATA SCIENCE

RECAP

WHY DOCKER
Portable environment
Isolated between projects
Stateless
Fast local file access
Heterogeneous

GET DOCKER
boot2docker .dmg or .exe
apt-get install docker.io ...

https://docs.docker.com/installation/

RUN SCIPYSERVER
$ docker run -d -e "PASSWORD=YourPassword?" ipython/scipyserver

$ docker run \
    -d \
    -e "PASSWORD=YourPassword?" \
    --name dev_notebook \
    -p 443:8888 \
    ipython/scipyserver

https://localhost:443
https://{boot2docker ip}:443

CREATE DATA-ONLY CONTAINERS
$ docker run \
    -d \
    -v ~/notebooks:/notebooks \
    --name notebooks_container \
    ubuntu echo notebooks

$ docker run -d -v ~/data:/data --name data_container ubuntu echo

MOUNT DATA-ONLY CONTAINERS
$ docker stop dev_notebook
$ docker rm dev_notebook

$ docker run \
    -d \
    -e "PASSWORD=YourPassword?" \
    --name dev_notebook \
    -p 443:8888 \
    --volumes-from data_container \
    --volumes-from notebooks_container \
    ipython/scipyserver

CREATE A DOCKERFILE
FROM ipython/scipyserver
MAINTAINER Calvin Giles <[email protected]>

COPY requirements.txt /requirements.txt
RUN pip2 install -r /requirements.txt
RUN pip3 install -r /requirements.txt

$ docker build \
    -t calvingiles/ds-notebook \
    .

$ docker run \
    -d \
    -e "PASSWORD=YourPassword?" \
    --name dev_notebook \
    -p 443:8888 \
    --volumes-from data_container \
    --volumes-from notebooks_container \
    calvingiles/ds-notebook
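
Once the image is built, it can be worth checking from inside a notebook that everything in requirements.txt actually got installed. The following is only a rough sketch: the sample requirements content is hypothetical, the parser handles simple `name==version` style lines only, and `importlib.metadata` assumes a Python 3.8+ kernel (so it would not cover the pip2 side of the image).

```python
import re
from importlib import metadata  # standard library from Python 3.8


def requirement_names(text):
    """Extract bare package names from the text of a requirements.txt."""
    names = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments
        if not line or line.startswith("-"):    # skip blanks and pip options
            continue
        # Cut at the first version specifier, extras bracket, marker or space.
        names.append(re.split(r"[<>=!~;\[\s]", line, maxsplit=1)[0])
    return names


def missing_packages(names):
    """Return the names that are not installed for this interpreter."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


# Hypothetical requirements.txt content, for illustration only.
sample = """
# extra packages for the notebook image
pandas==0.15.2
seaborn>=0.5
requests[security]
"""
print(requirement_names(sample))
```

In a running notebook, reading `/requirements.txt` (the path the Dockerfile copies to) and passing the names to `missing_packages` should return an empty list if the build went as expected.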

THIS TIME
Creating and connecting to local database containers
Tweaking the boot2docker VM memory from 2GB to 8GB (or more...)
Automated builds with GitHub linking
Forget everything and use fig

CREATE LOCAL DATABASE CONTAINERS
$ docker run -d -v /var/lib/postgresql/data --name=pg_data ubuntu
$ docker run -d --volumes-from pg_data --name=dev_postgres postgres
$ docker run -d --name=dev_mongo mongo

$ docker run \
    -d \
    -e "PASSWORD=YourPassword?" \
    --link dev_postgres:dev_postgres \
    --link dev_mongo:dev_mongo \
    --name dev_notebook \
    -p 443:8888 \
    --volumes-from data_container \
    --volumes-from notebooks_container \
    calvingiles/ds-notebook
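
Inside the notebook container, `--link name:alias` exposes the linked container through environment variables of the form `<ALIAS>_PORT_<port>_TCP_ADDR` and `<ALIAS>_PORT_<port>_TCP_PORT` (and a hosts entry for the alias). A minimal sketch of reading them from a notebook follows; the helper names, the `postgres` user and the `postgres` database are assumptions for illustration, not something the slides specify.

```python
import os


def linked_service(alias, port, environ=None):
    """Return (host, port) of a container linked as --link <name>:<alias>."""
    environ = os.environ if environ is None else environ
    prefix = "{}_PORT_{}_TCP".format(alias.upper(), port)
    return environ[prefix + "_ADDR"], int(environ[prefix + "_PORT"])


def postgres_url(environ=None, user="postgres", database="postgres"):
    """Build a connection URL for the dev_postgres link (hypothetical helper)."""
    host, port = linked_service("dev_postgres", 5432, environ)
    return "postgresql://{}@{}:{}/{}".format(user, host, port, database)


# Stand-in for the variables Docker would set; the address is made up.
fake_env = {
    "DEV_POSTGRES_PORT_5432_TCP_ADDR": "172.17.0.5",
    "DEV_POSTGRES_PORT_5432_TCP_PORT": "5432",
}
print(postgres_url(fake_env))
```

Called with no arguments in the linked container, `postgres_url()` reads the real environment; the resulting URL could then be handed to whichever database client is installed in the image.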

TWEAK THE MEMORY IN YOUR VM ABOVE 2GB
Either:

$ boot2docker delete
$ boot2docker init -m 5555
... lots of output ...
$ boot2docker info
{ ... "Memory":5555 ...}

Or (doesn't lose data stored inside the VM rather than on the host):

$ boot2docker stop
$ VBoxManage modifyvm boot2docker-vm --memory 5555
$ boot2docker start
$ boot2docker info
{ ... "Memory":5555 ...}
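
To confirm the change without scanning the output by eye, the `Memory` field can be pulled out programmatically. This is a sketch that assumes `boot2docker info` prints a single JSON object, as the excerpt above suggests; the sample string and the function names are hypothetical.

```python
import json
import subprocess


def vm_memory_mb(info_output):
    """Return the "Memory" value (in MB) from boot2docker info output."""
    return json.loads(info_output)["Memory"]


def read_vm_memory_mb():
    """Run boot2docker info and parse it (needs boot2docker on the PATH)."""
    output = subprocess.check_output(["boot2docker", "info"])
    return vm_memory_mb(output.decode("utf-8"))


# Hypothetical, trimmed-down output used only to exercise the parser.
sample = '{"Name": "boot2docker-vm", "Memory": 5555}'
print(vm_memory_mb(sample))
```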

AUTOMATED BUILDS WITH GITHUB LINKING
Commit Dockerfile, requirements.txt etc. to a GitHub repo
Add an "Automated Build" on Docker Hub
Select the repo and accept defaults
Check the "Build Details" for your repo and wait for the build to finish

$ docker run <dockername>/<reponame>

FORGET EVERYTHING AND USE FIG
http://www.fig.sh/install.html

$ curl -L https://github.com/docker/fig/releases/download/1.0.1/fig-`uname -s`-`uname -m` > ~/bin/fig
$ chmod +x ~/bin/fig

FIG.YML -- DATA
notebooks:
  command: echo created
  image: busybox
  volumes:
    - "~/Google Drive/notebooks:/notebooks/analysis"
data:
  command: echo created
  image: busybox
  volumes:
    - "~/Google Drive/data:/data/analysis"
...

FIG.YML -- POSTGRES
...
devpostgresdata:
  command: echo created
  image: busybox
  volumes:
    - /var/lib/postgresql/data
devpostgres:
  environment:
    - POSTGRES_PASSWORD
  image: postgres
  links:
  ports:
    - "5432:5432"
  volumes_from:
    - devpostgresdata
...

FIG.YML -- NOTEBOOK SERVER
...
ds_server:
  environment:
    - PASSWORD
  image: calvingiles/data-science-environment
  links:
    - devpostgres:postgres
  ports:
    - "443:8888"
  volumes_from:
    - notebooks
    - data

FIG UP
In the same directory as fig.yml:

$ fig rm
$ PASSWORD=MyPass POSTGRES_PASSWORD=PGPass fig up -d
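
Because fig.yml lists `PASSWORD` and `POSTGRES_PASSWORD` without values, fig takes them from the shell that launches it, so forgetting one is easy to do. A small wrapper along these lines could guard against that; it is a sketch only, and the function names are made up for illustration.

```python
import os
import subprocess

# The two variables fig.yml passes through from the calling environment.
REQUIRED = ("PASSWORD", "POSTGRES_PASSWORD")


def missing_variables(environ, required=REQUIRED):
    """Return the required variables that are unset or empty."""
    return [name for name in required if not environ.get(name)]


def fig_up(environ=None):
    """Run `fig up -d` only if every required variable is set."""
    environ = os.environ if environ is None else environ
    missing = missing_variables(environ)
    if missing:
        raise SystemExit("Set these before running fig: " + ", ".join(missing))
    # Needs fig on the PATH and fig.yml in the current directory.
    subprocess.check_call(["fig", "up", "-d"], env=dict(environ))


print(missing_variables({"PASSWORD": "MyPass"}))
```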

HERE'S ONE I MADE EARLIER
$ curl -L http://goo.gl/rW47v3 > fig.yml
$ PASSWORD=MyPass POSTGRES_PASSWORD=PGPass fig up -d

NEXT TIME
Linking to private git repositories
Lessons learnt from using fig
Resizing boot2docker volume (to fix "no space left on device")
Fixing "Error response from daemon: client and server don't have same version"
TLS and CA certs to fix "Your connection is not private"
Whatever other pain I have had to deal with before then
Whatever pain you feel -- let me know @calvingiles

MORE?
Docker:
    http://docs.docker.com/userguide/
    http://docs.docker.com/reference/commandline/cli/
Fig:
    http://www.fig.sh/
ipython docker images:
    https://registry.hub.docker.com/repos/ipython/
my docker image:
    https://github.com/calvingiles/data-science-environment
    https://registry.hub.docker.com/u/calvingiles/data-science-environment/
fig.yml gist:
    http://goo.gl/rW47v3

ABOUT ME
Calvin Giles
Data Scientist at Adthena
PyData Meetup [email protected]
@calvingiles on twitter, github, docker hub (and many more)