Running Galaxy in a Secure Environment Using Docker
Abdulrahman Azab
05, May
First experiences at TSD 2.0
KISS!
Keep It Simple, Stupid!
Agenda
• TSD: Services for Sensitive Data
• Running Galaxy inside TSD: Challenges
• Docker
• Galaxy Inside TSD as a Docker Container
• Galaxy Tools inside TSD as Docker Containers
TSD: Services for Sensitive Data
TSD Services for sensitive data: Architecture
[Figure: TSD architecture diagram. Labels: two-factor authentication; project VMs P01, P02, ..., Pn; per-user VMs P1-u1, P1-u2, ..., P1-um; HNAS file system; Colossus with SLURM, CE nodes, and a parallel file system.]
TSD Services for sensitive data: Data Transfer
[Figure: TSD data transfer diagram. Labels: pXX users; SFTP; file-lock protocol; file sluice (tsd-fx01, tsd-fx02); pXX/import, pXX/export, pXX/fx/import, pXX/fx/export; HNAS file system; Colossus file system; SLURM; WWW.]
Running Galaxy inside TSD: Challenges
[Figure: project VMs P01, P02, ..., Pn shown with the Colossus file system, the HNAS file system, SLURM, and WWW.]
Running Galaxy inside TSD: Challenges
• Access to and from the outside world is highly restricted, so Galaxy cannot be installed or updated from public repositories.
• Galaxy is a web portal. It is NOT designed to run inside an isolated environment; it needs regular online updates.
• Galaxy Tool Shed tools need regular updates as well.
Ideas for Installing Galaxy:
• Install Galaxy on a VM and take the VM inside the TSD [Not permitted so far].
• Get all Galaxy installation files, take them inside the TSD through the file sluice, and install Galaxy there [Allowed, but painful].
• Make a Galaxy Docker image and take it inside the TSD [Permitted and easy].
Running Galaxy inside TSD: Challenges
Ideas for Upgrading Galaxy:
• Upgrade online from the public Galaxy repository (https://bitbucket.org/galaxy) [Not permitted].
• Take a Docker image of the new Galaxy inside the TSD [Permitted and easy].
Docker
Docker is an open-source project that automates the deployment of applications inside software containers by providing an additional layer of abstraction and automation of operating-system-level virtualization on Linux. [www.docker.com]
Docker vs. VM
[Figure: side-by-side comparison of Docker containers and virtual machines.]
Docker: Run Platforms
• Various Linux distributions (Ubuntu, Fedora, RHEL, CentOS, openSUSE, ...)
• Cloud (Amazon EC2, Google Compute Engine, Rackspace)
• Windows, OS X: Boot2Docker
Docker: Build an Image
[Figure: image build flow. Labels: Dockerfile, installation script, load base image, build new image.]
Example: Dockerfile for TopHat, Bowtie2, and SAMtools
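FROM ubuntu
MAINTAINER John Wregglesworth <[email protected]>
# Build tools and libraries needed to unpack the archives and compile SAMtools
RUN apt-get update && apt-get install -y python unzip gcc make bzip2 zlib1g-dev ncurses-dev
# COPY keeps the archives as plain files. ADD would auto-extract the local
# tarballs, and the tar/bunzip2 steps below would then fail.
COPY tophat-2.0.10.Linux_x86_64.tar.gz tophat.tgz
COPY bowtie2-2.1.0-linux-x86_64.zip bowtie.zip
COPY samtools-0.1.19.tar.bz2 samtools.tar.bz2
# Unpack TopHat and Bowtie2 into /tophat and /bowtie2
RUN tar xzf tophat.tgz && unzip bowtie.zip && mv tophat-2.0.10.Linux_x86_64 tophat && mv bowtie2-2.1.0 bowtie2
# Unpack and compile SAMtools in /samtools
RUN bunzip2 samtools.tar.bz2 && tar xf samtools.tar && mv samtools-0.1.19 samtools && cd samtools && make
ENV PATH /bowtie2:/tophat:/samtools:$PATH
# Build the example lambda virus index so the image can be smoke-tested
RUN bowtie2-build /bowtie2/example/reference/lambda_virus.fa lambda_virus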
Example: Running the TopHat, Bowtie2, and SAMtools Container
$ docker run -t azab/bowtie2 bowtie2 --version
[Figure: the image (TopHat, Bowtie2, SAMtools) running as a container on the host.]
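A minimal sketch of how the image might be built and then used for an actual alignment, assuming the Dockerfile and the three archives sit in the build directory. The helper names, the `DOCKER` override variable, and the reads file are illustrative assumptions, not part of the original slides. `/lambda_virus` is the index prefix written by the Dockerfile's last `RUN` step, which runs in the image's default working directory `/`.

```shell
#!/bin/sh
# DOCKER can be overridden (e.g. DOCKER=echo) to preview commands without running them.
DOCKER="${DOCKER:-docker}"

# Build the image from the Dockerfile in the given directory and tag it.
build_tool_image() {
    # $1 = image tag, $2 = build context directory
    "$DOCKER" build -t "$1" "$2"
}

# Align single-end reads against the example index baked into the image.
# The host data directory is mounted at /data so the output outlives the container.
run_bowtie2() {
    # $1 = image tag, $2 = host data directory, $3 = reads file name (hypothetical)
    "$DOCKER" run --rm -v "$2":/data "$1" \
        bowtie2 -x /lambda_virus -U "/data/$3" -S "/data/${3%.*}.sam"
}
```

For example, `build_tool_image azab/bowtie2 .` followed by `run_bowtie2 azab/bowtie2 "$PWD/data" reads.fq` would write `reads.sam` next to the input file.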
Galaxy inside TSD as a Docker Container
$ docker run -d -p 8080:80 bgruening/galaxy-stable
[Figure: host port 8080 mapped to port 80 of the container.]
Tool installation??
Data Storage??
Galaxy inside TSD as a Docker Container
$ docker run -d -p 8080:80 -v /home/user/galaxy-export/:/export/ bgruening/galaxy-stable
[Figure: host directory /home/user/galaxy-export/ mounted at /export/ in the container; host port 8080 mapped to container port 80.]
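The port mapping and volume mount in the command above can be wrapped in a small helper. This is a sketch under assumptions: the function name, the `DOCKER` override variable, and the default port are illustrative; only the `docker run` flags and the image name come from the slide. `-p HOST:80` publishes the container's port 80 on the host, and `-v HOST_DIR:/export/` bind-mounts a host directory at /export/, so anything written there stays on the host when the container is removed.

```shell
#!/bin/sh
# DOCKER can be overridden (e.g. DOCKER=echo) to preview the command.
DOCKER="${DOCKER:-docker}"

# Start Galaxy detached, publishing container port 80 and mounting an export directory.
start_galaxy() {
    # $1 = host export directory, $2 = host port (defaults to 8080)
    export_dir="$1"
    host_port="${2:-8080}"
    mkdir -p "$export_dir" || return 1   # make sure the mount source exists
    "$DOCKER" run -d -p "${host_port}:80" -v "${export_dir}:/export/" bgruening/galaxy-stable
}
```

Calling `start_galaxy /home/user/galaxy-export/` reproduces the command on the slide; a second argument selects a different host port.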
Galaxy inside TSD as a Docker Container
[Figure: two containers on one host (Container1 and Container2, the latter labelled with a deepTools Dockerfile), each with an /export/ mount point; host directory /home/user/galaxy-export/.]
Galaxy inside TSD as a Docker Container
[Figure: deployment workflow. Labels: insilico.hpc.uio.no, with a container mounting /home/user/galaxy-export/ at /export/; Dockerfile; image tarball and tools tarball; tsd/p77, host p77-galaxy01-l, with a container mounting /galaxy/galaxy-export/ at /export/.]
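One way to produce and consume the image tarball in this workflow is with `docker save` and `docker load`. This is a sketch under assumptions: the function names, the `DOCKER` override variable, and the tarball paths are illustrative, and the transfer through the file sluice is left out because the slides do not give its commands.

```shell
#!/bin/sh
# DOCKER can be overridden (e.g. DOCKER=echo) to preview commands.
DOCKER="${DOCKER:-docker}"

# Outside TSD: write an image, including its layers and tags, to a tarball.
export_image() {
    # $1 = image name, $2 = output tarball path
    "$DOCKER" save -o "$2" "$1"
}

# Inside TSD: restore the image from the transferred tarball.
import_image() {
    # $1 = tarball path
    "$DOCKER" load -i "$1"
}
```

For example, `export_image bgruening/galaxy-stable galaxy.tar` on the outside host, then `import_image galaxy.tar` on the TSD host once the file has been imported. The same pair of calls covers the upgrade case: save the newer image outside and load it inside.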
Galaxy Tools inside TSD as Docker Containers
[Figure: Galaxy container on the host with /home/user/galaxy-export/ mounted at /export/, alongside separate tool containers on the same host.]
Galaxy Tools inside TSD as Docker containers: Bowtie2
bowtie2_wrapper.xml:
<requirements>
    <requirement type="package" version="2.2.4">bowtie2</requirement>
    <requirement type="package" version="0.1.18">samtools</requirement>
</requirements>
tool_dependencies.xml:
<tool_dependency>
    <package name="bowtie2" version="2.2.4">
        <repository changeset_revision="2b25b6e8d108" name="package_bowtie_2_2_4" owner="devteam" toolshed="https://toolshed.g2.bx.psu.edu" />
    </package>
    <package name="samtools" version="0.1.18">
        <repository changeset_revision="171cd8bc208d" name="package_samtools_0_1_18" owner="devteam" toolshed="https://toolshed.g2.bx.psu.edu" />
    </package>
</tool_dependency>
Galaxy Tools inside TSD as Docker containers: Bowtie2
bowtie2_wrapper.xml:
<requirements>
    <container type="docker">azab/bowtie2</container>
</requirements>
tool_dependencies.xml: not needed, since the dependencies are packaged in the container image.
Galaxy Tools inside TSD as Docker containers: Advantages
• Package the tool together with its runtime environment in a container (no need to install the runtime on the server).
• No need to define and include dependencies.
• Isolated runtime inside the containers (problems stay inside).
Galaxy Tools inside TSD as Docker containers: Issues
• How many containers can run "and be stable" on a production server?
• Is the Docker engine itself stable enough?
• What about the disk-space overhead?
• Others?
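Some of these questions can at least be measured on a test server. The sketch below assumes a Docker release that provides the `docker system df` and `docker stats --no-stream` subcommands; the function name and the `DOCKER` override variable are illustrative, and the numbers reported will depend entirely on the host and the tools being run.

```shell
#!/bin/sh
# DOCKER can be overridden (e.g. DOCKER=echo) to preview commands.
DOCKER="${DOCKER:-docker}"

report_docker_usage() {
    # Disk space taken by images, containers, and volumes
    "$DOCKER" system df
    # One-shot snapshot of CPU and memory use per running container
    "$DOCKER" stats --no-stream
}
```

Running `report_docker_usage` before and after a batch of Galaxy jobs gives a rough first view of the disk and memory cost per tool container.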