using docker for data science - part 2
GET DOCKER
boot2docker .dmg or .exe
apt-get install docker.io ...
https://docs.docker.com/installation/
RUN SCIPYSERVER
$ docker run -d -e "PASSWORD=YourPassword?" ipython/scipyserver
$ docker run \
    -d \
    -e "PASSWORD=YourPassword?" \
    --name dev_notebook \
    -p 443:8888 \
    ipython/scipyserver
https://localhost:443
https://{boot2docker ip}:443
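The two URLs above differ only in the host: localhost on Linux, the VM address when Docker runs inside boot2docker. A minimal sketch of that choice follows; notebook_url is a hypothetical helper name, and the default port 443 is taken from the -p 443:8888 mapping on the slide.

```shell
# Print the notebook URL for a given host (default: localhost)
# and host port (default: 443, the port published by -p 443:8888).
notebook_url() {
    host="${1:-localhost}"
    port="${2:-443}"
    printf 'https://%s:%s\n' "$host" "$port"
}
```

With boot2docker, something like notebook_url "$(boot2docker ip)" should give the address to open in the browser.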
CREATE DATA-ONLY CONTAINERS
$ docker run \
    -d \
    -v ~/notebooks:/notebooks \
    --name notebooks_container \
    ubuntu echo notebooks
$ docker run -d -v ~/data:/data --name data_container ubuntu echo
MOUNT DATA-ONLY CONTAINERS
$ docker stop dev_notebook
$ docker rm dev_notebook
$ docker run \
    -d \
    -e "PASSWORD=YourPassword?" \
    --name dev_notebook \
    -p 443:8888 \
    --volumes-from data_container \
    --volumes-from notebooks_container \
    ipython/scipyserver
CREATE A DOCKERFILE
FROM ipython/scipyserver
MAINTAINER Calvin Giles <[email protected]>
COPY requirements.txt /requirements.txt
RUN pip2 install -r /requirements.txt
RUN pip3 install -r /requirements.txt
$ docker build \
    -t calvingiles/ds-notebook \
    .
$ docker run \
    -d \
    -e "PASSWORD=YourPassword?" \
    --name dev_notebook \
    -p 443:8888 \
    --volumes-from data_container \
    --volumes-from notebooks_container \
    calvingiles/ds-notebook
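The Dockerfile above copies requirements.txt from the build directory, so that file has to sit next to the Dockerfile before docker build runs. A minimal sketch of creating one is below; write_requirements is a hypothetical helper, and the two package names are placeholders, not the author's actual list. Since the Dockerfile installs with both pip2 and pip3, anything listed should support both interpreters.

```shell
# Write a starter requirements.txt into the given build directory
# (default: the current directory).
# The package names are placeholders; replace them with your own.
write_requirements() {
    dir="${1:-.}"
    cat > "$dir/requirements.txt" <<'EOF'
requests
sqlalchemy
EOF
}
```

Running write_requirements . in the directory holding the Dockerfile, followed by the docker build command from the slide, should then pick the file up.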
THIS TIME
Creating and connecting to local database containers
Tweaking the boot2docker vm memory from 2GB to 8 (or more...)
Automated builds with github linking
Forget everything and use fig
CREATE LOCAL DATABASE CONTAINERS
$ docker run -d -v /var/lib/postgresql/data --name=pg_data ubuntu
$ docker run -d --name=dev_postgres postgres
$ docker run -d --name=dev_mongo mongo
$ docker run \
    -d \
    -e "PASSWORD=YourPassword?" \
    --link dev_postgres:dev_postgres \
    --link dev_mongo:dev_mongo \
    --name dev_notebook \
    -p 443:8888 \
    --volumes-from data_container \
    --volumes-from notebooks_container \
    calvingiles/ds-notebook
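Inside the notebook container, a --link dev_postgres:dev_postgres shows up as a hosts entry for the alias plus environment variables such as DEV_POSTGRES_PORT_5432_TCP_ADDR and DEV_POSTGRES_PORT_5432_TCP_PORT. A minimal sketch of turning those into a connection URL follows. pg_url is a hypothetical helper, and the user and database names assume the postgres image defaults (both "postgres") with no password in the URL.

```shell
# Build a PostgreSQL URL from the variables Docker's --link injects.
# Falls back to the link alias and the default port when they are unset.
# Assumes the postgres image defaults: user "postgres", database "postgres".
pg_url() {
    host="${DEV_POSTGRES_PORT_5432_TCP_ADDR:-dev_postgres}"
    port="${DEV_POSTGRES_PORT_5432_TCP_PORT:-5432}"
    printf 'postgresql://postgres@%s:%s/postgres\n' "$host" "$port"
}
```

Calling pg_url from a shell in the notebook container gives a string that a client library in the notebook could use as its connection target.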
TWEAK THE MEMORY IN YOUR VM ABOVE 2GB
Either:
$ boot2docker delete
$ boot2docker init -m 5555
... lots of output ...
$ boot2docker info
{ ... "Memory":5555 ...}
Or (doesn't lose data that is persisted in the VM rather than on the host):
$ boot2docker stop
$ VBoxManage modifyvm boot2docker-vm --memory 5555
$ boot2docker start
$ boot2docker info
{ ... "Memory":5555 ...}
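To confirm the change without reading the whole info output by eye, the memory value can be pulled out of the JSON. This is a rough sketch, assuming boot2docker info prints JSON containing a "Memory" key as shown on the slide; vm_memory_mb is a hypothetical helper name.

```shell
# Read JSON on stdin and print the number that follows "Memory":.
# Prints nothing if the key is not present.
vm_memory_mb() {
    sed -n 's/.*"Memory":[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}
```

Usage would be along the lines of boot2docker info | vm_memory_mb, which should print 5555 after either of the two procedures above.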
AUTOMATED BUILDS WITH GITHUB LINKING
Commit Dockerfile, requirements.txt etc. to a github repo
Add an "Automated Build" on docker hub
Select the repo and accept defaults
Check the "Build Details" for your repo and wait for the build to finish
$ docker run <dockername>/<reponame>
FORGET EVERYTHING AND USE FIG
http://www.fig.sh/install.html
$ curl -L https://github.com/docker/fig/releases/download/1.0.1/fig-`uname -s`-`uname -m` > ~/bin/fig
$ chmod +x ~/bin/fig
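The download URL above is assembled from the operating system name and machine architecture reported by uname. A small sketch that prints the URL before anything is downloaded is shown below; fig_url is a hypothetical helper, and the default version 1.0.1 is simply the one used on the slide.

```shell
# Print the fig release URL for this machine.
# uname -s gives the OS name (for example Linux or Darwin),
# uname -m gives the architecture (for example x86_64).
fig_url() {
    version="${1:-1.0.1}"
    printf 'https://github.com/docker/fig/releases/download/%s/fig-%s-%s\n' \
        "$version" "$(uname -s)" "$(uname -m)"
}
```

Something like mkdir -p ~/bin && curl -L "$(fig_url)" > ~/bin/fig would then match the slide, with the added step of creating ~/bin if it does not exist. The directory also needs to be on PATH for the fig command to be found.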
FIG.YML -- DATA
notebooks:
  command: echo created
  image: busybox
  volumes:
    - "~/Google Drive/notebooks:/notebooks/analysis"
data:
  command: echo created
  image: busybox
  volumes:
    - "~/Google Drive/data:/data/analysis"
...
FIG.YML -- POSTGRES
...
devpostgresdata:
  command: echo created
  image: busybox
  volumes:
    - /var/lib/postgresql/data
devpostgres:
  environment:
    - POSTGRES_PASSWORD
  image: postgres
  links:
  ports:
    - "5432:5432"
  volumes_from:
    - devpostgresdata
...
FIG.YML -- NOTEBOOK SERVER
...
ds_server:
  environment:
    - PASSWORD
  image: calvingiles/data-science-environment
  links:
    - devpostgres:postgres
  ports:
    - "443:8888"
  volumes_from:
    - notebooks
    - data
FIG UP
In the same directory as fig.yml:
$ fig rm
$ PASSWORD=MyPass POSTGRES_PASSWORD=PGPass fig up -d
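The fig.yml entries list PASSWORD and POSTGRES_PASSWORD without values, so the values are taken from the shell environment that runs fig. If one is missing, the containers start without it. A minimal guard is sketched below; check_secrets is a hypothetical helper, not part of fig.

```shell
# Return non-zero (and name the culprit on stderr) if either
# variable that fig.yml passes through is unset or empty.
check_secrets() {
    missing=0
    for name in PASSWORD POSTGRES_PASSWORD; do
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing environment variable: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

One way to use it is to export both variables first and then run check_secrets && fig up -d, so that fig only starts when both are present.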
HERE'S ONE I MADE EARLIER
$ curl -L http://goo.gl/rW47v3 > fig.yml
$ PASSWORD=MyPass POSTGRES_PASSWORD=PGPass fig up -d
NEXT TIME
Linking to private git repositories
Lessons learnt from using fig
Resizing boot2docker volume (to fix "no space left on device")
Fixing "Error response from daemon: client and server don't have same version"
TLS and CA certs to fix "Your connection is not private"
Whatever other pain I have had to deal with before then
Whatever pain you feel -- let me know @calvingiles
MORE?
Docker:
http://docs.docker.com/userguide/
http://docs.docker.com/reference/commandline/cli/
Fig:
http://www.fig.sh/
ipython docker images:
https://registry.hub.docker.com/repos/ipython/
my docker image:
https://github.com/calvingiles/data-science-environment
https://registry.hub.docker.com/u/calvingiles/data-science-environment/
fig.yml gist:
http://goo.gl/rW47v3
ABOUT ME
Calvin Giles
Data Scientist at Adthena
PyData Meetup [email protected]
@calvingiles on twitter, github, docker hub (and many more)