BioQuant’s Large Scale Data Facility Usage Patterns, Problems, Solutions Christian Thiemann Marc Hemberger BioQuant IT, Ruprecht-Karls-Universität Heidelberg


Post on 28-Aug-2019


Page 1

BioQuant’s Large Scale Data Facility

Usage Patterns, Problems, Solutions

Christian Thiemann Marc Hemberger

BioQuant IT, Ruprecht-Karls-Universität Heidelberg

Page 2

Central Technology Platforms @ BioQuant

Page 3

BioQuant LSDF

[Diagram. Sites: DKFZ, BioQuant, KIT. Components: LSDF, cluster, workstations, microscopes, tapes. Link labels: NFS, CIFS, TSM, rsync. Link speeds: 10 G, 1 G.]

Page 4

BioQuant - Current LSDF Implementation

3,300 HDDs in RAID6 (8+2): 6.1 PB (physical), 4.3 PB (logical)

NFS & CIFS

2x 10 GBit/s: 1.1 GB/s (read), 700 MB/s (write)

Tier 0: fresh bits are stored here

Tier 1: aged data is moved here

Disk groups: 240x 1 TB, 480x 1 TB, 480x 1 TB, 300x 1 TB, 480x 2 TB, 420x 3 TB, 480x 3 TB, 420x 3 TB
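As a rough cross-check of the disk counts on this slide, the sketch below adds up the listed drive groups and applies the 8+2 RAID6 data fraction. It uses nominal drive sizes only; the function names are illustrative. The nominal totals come out higher than the 6.1 PB and 4.3 PB quoted on the slide, which does not say how its figures were derived.

```python
# (drive count, nominal drive size in TB) for each group listed on the slide
ENCLOSURES = [
    (240, 1), (480, 1), (480, 1), (300, 1),
    (480, 2),
    (420, 3), (480, 3), (420, 3),
]


def total_drives(enclosures):
    """Number of drives across all groups."""
    return sum(count for count, _ in enclosures)


def nominal_capacity_tb(enclosures):
    """Sum of nominal drive sizes in TB, before any RAID overhead."""
    return sum(count * size_tb for count, size_tb in enclosures)


def raid6_data_fraction(data_disks=8, parity_disks=2):
    """Fraction of a RAID6 (8+2) stripe that holds data rather than parity."""
    return data_disks / (data_disks + parity_disks)


drives = total_drives(ENCLOSURES)
nominal_tb = nominal_capacity_tb(ENCLOSURES)
usable_tb = nominal_tb * raid6_data_fraction()

print(f"{drives} drives, {nominal_tb} TB nominal, about {usable_tb:.0f} TB after RAID6 parity")
```

The drive count matches the 3,300 HDDs stated on the slide; the capacity figures are only an upper bound under the stated assumptions.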

Page 5

Microscopy

BioQuant LSDF

Group Shares

Sequencing

~ 80 TB

~ 70 TB

~ 1.3 PB

~ 1.6 PB total

Page 6

Microscopy

BioQuant LSDF

Group Shares

Sequencing

RNA interference

image size ~ 2 MB

~ 2,000 images per sample / run

future runs up to 70,000 images

many samples per study / experiment

“typical” throughput: 20,000 images per day

peak throughput: 3,000,000 images (3 TB) in three weeks

many small files
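The microscopy figures above can be turned into daily volumes with a few lines of arithmetic. This is a sketch using decimal units (1 TB = 1,000,000 MB) and the numbers from the slide; the constant and function names are illustrative. Note that the peak figures (3,000,000 images in 3 TB) imply roughly 1 MB per image rather than the ~2 MB quoted for a typical image; the slide does not explain the difference.

```python
IMAGE_SIZE_MB = 2                 # "image size ~ 2 MB"
TYPICAL_IMAGES_PER_DAY = 20_000   # "typical" throughput
PEAK_IMAGES = 3_000_000           # peak throughput, image count
PEAK_VOLUME_TB = 3                # peak throughput, volume
PEAK_DAYS = 21                    # "three weeks"


def daily_volume_gb(images_per_day, image_size_mb):
    """Data volume per day in GB for a given image rate and image size."""
    return images_per_day * image_size_mb / 1000


def implied_image_size_mb(volume_tb, images):
    """Average image size in MB implied by a total volume and an image count."""
    return volume_tb * 1_000_000 / images


typical_gb_per_day = daily_volume_gb(TYPICAL_IMAGES_PER_DAY, IMAGE_SIZE_MB)
peak_images_per_day = PEAK_IMAGES / PEAK_DAYS
peak_image_size_mb = implied_image_size_mb(PEAK_VOLUME_TB, PEAK_IMAGES)

print(f"typical: {typical_gb_per_day:.0f} GB/day")
print(f"peak: {peak_images_per_day:.0f} images/day at about {peak_image_size_mb:.1f} MB each")
```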

Page 7

Microscopy

BioQuant LSDF

Group Shares

Sequencing

many small files

[Data-flow diagram: microscopes write raw data to the LSDF once; workstations read the raw data once (or copy it), write intermediate results once, and produce results of negligible size.]

Page 8

Microscopy

BioQuant LSDF

Group Shares

Sequencing

many small files

[Data-flow diagram: microscopes write raw data to the LSDF once; workstations read the raw data once (or copy it), write intermediate results once, and produce results of negligible size.]

e.g. KNIME, http://www.knime.org

Page 9

throughput: 10–20 genomes per week

Microscopy

BioQuant LSDF

Group Shares

Sequencing

International Cancer Genome Consortium

~ 200 GB raw data per genome

~ 2 TB after processing

intermediate results are retained

total ~ 2 PB of data

~ 1 PB at BioQuant LSDF

few large files
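The sequencing numbers above translate into weekly volumes as follows. The sketch assumes that the "~ 2 TB after processing" figure is per genome, like the raw-data figure, and uses decimal units; the names are illustrative.

```python
RAW_GB_PER_GENOME = 200           # "~ 200 GB raw data per genome"
PROCESSED_TB_PER_GENOME = 2       # assumed to be per genome
GENOMES_PER_WEEK = (10, 20)       # "10-20 genomes per week"
LSDF_SHARE_TB = 1000              # "~ 1 PB at BioQuant LSDF"


def weekly_volume(genomes_per_week, size_per_genome):
    """Volume produced per week, in the same unit as size_per_genome."""
    return genomes_per_week * size_per_genome


raw_gb_per_week = tuple(weekly_volume(n, RAW_GB_PER_GENOME) for n in GENOMES_PER_WEEK)
processed_tb_per_week = tuple(weekly_volume(n, PROCESSED_TB_PER_GENOME) for n in GENOMES_PER_WEEK)
genomes_in_lsdf_share = LSDF_SHARE_TB // PROCESSED_TB_PER_GENOME

print(f"raw: {raw_gb_per_week[0]} to {raw_gb_per_week[1]} GB/week")
print(f"processed: {processed_tb_per_week[0]} to {processed_tb_per_week[1]} TB/week")
print(f"about {genomes_in_lsdf_share} processed genomes fit in the LSDF share")
```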

Page 10

Microscopy

BioQuant LSDF

Group Shares

Sequencing

few large files

[Data-flow diagram: sequencers write raw data to the BioQuant LSDF once; the cluster reads raw data and temp data often, writes temp data once, and produces results of negligible size.]

Page 11

Microscopy

BioQuant LSDF

Group Shares

Sequencing

Results Database

Jürgen Eils, DKFZ

Page 12

Microscopy

BioQuant LSDF

Group Shares

Sequencing

Jürgen Eils, DKFZ

Results Database

Page 13

BioQuant LSDF - Complex AA - Requirements

[Diagram. Sites: BioQuant, DKFZ, KIT, Von-Leitner-Institut f.. verteiltes Echtzeit-Java. Components: LSDF, cluster, workstations, microscopes. Links: NFS, some marked "?". Identity sources: AD, LDAP, AD at Uni HD, Shibboleth.]

Page 14

BioQuant Client LSDF

The Problem: NFSv3 uses numeric UID/GIDs

Page 15

Hi, I’m 3224. Who owns myfile.txt?

myfile.txt: user 103224 group 101000

BioQuant Client LSDF

???

The Problem: NFSv3 uses numeric UID/GIDs

Page 16

Hi, I’m 3224. Who owns myfile.txt?

myfile.txt: user 103224 group 101000

That only works on weak minds...

But here, have a sandwich instead.

BioQuant Client LSDF

???

chgrp 1801 myfile.txt

What?

sudo chgrp 1801 myfile.txt

The Problem: NFSv3 uses numeric UID/GIDs

Page 17

BioQuant Client LSDF

NFSv3 on-route UID/GID translation

NFS Proxy

The Workaround

Page 18

Hi, I’m 3224. Who owns myfile.txt?

myfile.txt: user 103224 group 101000

BioQuant Client LSDF

Ok

NFSv3 on-route UID/GID translation

NFS Proxy

The Workaround

Hi, I’m 103224. Who owns myfile.txt?

myfile.txt: user 3224 group 1000

Page 19

Hi, I’m 3224. Who owns myfile.txt?

myfile.txt: user 103224 group 101000

Still ok

BioQuant Client LSDF

Ok

chgrp 1801 myfile.txt

Ok

sudo chgrp 1801 myfile.txt

NFSv3 on-route UID/GID translation

NFS Proxy

The Workaround

Hi, I’m 103224. Who owns myfile.txt?

myfile.txt: user 3224 group 1000

chgrp 101801 myfile.txt

sudo chgrp 101801 myfile.txt
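The translation the NFS proxy performs in this exchange can be sketched as a simple ID shift. The constant offset of 100,000 is an assumption inferred from the example numbers on the slides (3224 to 103224, 1000 to 101000, 1801 to 101801); the class and method names are hypothetical and do not describe the actual proxy implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OffsetIdMapper:
    """Shifts numeric UIDs/GIDs between a client's ID space and the LSDF's."""

    offset: int = 100_000  # assumption: inferred from the slide examples

    def to_server(self, client_id: int) -> int:
        """Rewrite an ID in a request travelling from the client to the LSDF."""
        return client_id + self.offset

    def to_client(self, server_id: int) -> int:
        """Rewrite an ID in a reply travelling from the LSDF to the client."""
        if server_id < self.offset:
            raise ValueError(f"ID {server_id} is outside the translated range")
        return server_id - self.offset


proxy = OffsetIdMapper()

# Client: "Hi, I'm 3224" becomes "Hi, I'm 103224" on the LSDF side.
print(proxy.to_server(3224))
# LSDF: "user 103224 group 101000" becomes "user 3224 group 1000" for the client.
print(proxy.to_client(103224), proxy.to_client(101000))
# Client: "chgrp 1801 myfile.txt" becomes "chgrp 101801 myfile.txt".
print(proxy.to_server(1801))
```

A fixed offset per institution keeps the mapping stateless, but every request and reply that carries an ID has to pass through the proxy, which is why the slides call it a workaround.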

Page 20

BioQuant Client LSDF

The Solution: NFSv4 usernames

Page 21

Hi, I’m 3224. Who owns myfile.txt?

myfile.txt: user 103224 group 101000

BioQuant Client LSDF

Ok

NFSv4 usernames

Hi, I’m bq_cthiemann@bioquant. Who owns myfile.txt?

myfile.txt: user bq_cthiemann@BQ group bq_admins@BQ

The Solution

Page 22

Hi, I’m 3224. Who owns myfile.txt?

myfile.txt: user 103224 group 101000

BioQuant Client LSDF

Ok

chgrp 1801 myfile.txt

Ok

NFSv4 usernames

Hi, I’m bq_cthiemann@bioquant. Who owns myfile.txt?

myfile.txt: user bq_cthiemann@BQ group bq_admins@BQ

chgrp rattorturers@BQ myfile.txt

The Solution
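The idea behind NFSv4 usernames can be sketched as follows: each side keeps its own numeric IDs, and only name@domain strings cross the wire. This is a conceptual model, not the NFSv4 implementation; the class name, the lookup tables and the use of a single shared domain string "BQ" on both sides are assumptions made for illustration.

```python
class NameIdMapper:
    """Maps between local numeric IDs and name@domain strings for one host."""

    def __init__(self, domain, ids_by_name):
        self.domain = domain
        self._id_by_name = dict(ids_by_name)
        self._name_by_id = {num: name for name, num in self._id_by_name.items()}

    def to_wire(self, local_id):
        """Turn a local numeric ID into the string sent over the network."""
        return f"{self._name_by_id[local_id]}@{self.domain}"

    def from_wire(self, principal):
        """Turn a received name@domain string back into a local numeric ID."""
        name, _, domain = principal.partition("@")
        if domain != self.domain:
            raise LookupError(f"unknown domain in {principal!r}")
        return self._id_by_name[name]


# Hypothetical tables: the same account has different numeric IDs on each side.
client = NameIdMapper("BQ", {"bq_cthiemann": 3224, "bq_admins": 1000})
server = NameIdMapper("BQ", {"bq_cthiemann": 103224, "bq_admins": 101000})

on_wire = client.to_wire(3224)
print(on_wire)                      # the name travels, not the number
print(server.from_wire(on_wire))    # the server resolves it to its own ID
```

Because the numbers never leave the host, the two sides no longer need to agree on a UID/GID space, only on account names.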

Page 23

NFSv4 usernames

The Solution:

• SONAS 'will be' NFSv4-capable in 2014

• EMC Isilon has supported NFSv4 since 2011

• BioQuant ran Isilon POCs to evaluate

- Authentication Zones

- Multiple Authentication Providers

- NFSv4 with Kerberos

NFSv4 development has been slow

“no economical incentive” (IBM)

Page 24

LDAP

AD

NIS

Local files

Kerberos

Isilon - Multiple Authentication Providers

Isilon can use multiple authentication providers

LDAP, Active Directory, NIS, Local Files, Local Provider

User Mapping for UID/GID from/to SID is very flexible

Page 25

Isilon - Multiple Authentication Providers

Each “Authentication Zone” is associated with

- a SmartConnect Zone (IP pool)

- a list of auth source associations (trusted or untrusted)

- a list of shares

[Diagram. SmartConnect Zones 1, 2, 3; Access Zones 1, 2, 3; shares /ifs/deptA, /ifs/deptB, /ifs/deptC; auth sources AD 1 and AD 2 or LDAP.]

Page 26

LSDF Requirements

Possible with Isilon and Access Zones?

1. Provide access to data via CIFS from BioQuant such that

a) users are authenticated by the UniHD AD, either by password or (preferably) using Kerberos: Yes

b) file permissions / ACLs can be edited using Windows Explorer or other standard tools: Yes

c) the same data can be concurrently accessed via NFS: Yes

2. Provide access to data via NFS from BioQuant and other institutions such that

a) users are securely authenticated and authorized by their home institution (UniHD AD for BioQuant, DKFZ AD for DKFZ users, etc.): Yes

b) file permissions / ACLs can be edited using standard Unix tools: Yes

c) collisions of UID/GIDs from different institutions do not lead to data leakage (e.g., a BioQuant user with UID 500, from BioQuant LDAP, will not have access to files owned by a DKFZ user with UID 500 from DKFZ AD): Yes

d) access to files can be granted across institutional boundaries if desired (e.g., a group of BioQuant users may get access to a directory owned by a DKFZ user): conflicts with c)
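Requirements 2c and 2d can be illustrated with a small model in which an identity is the pair (zone, UID) rather than the bare UID. This is a conceptual sketch only and does not describe how Isilon implements Access Zones; the class names, zone labels, file path and the `shared_with` field are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Identity:
    """A user as seen by the storage system: zone plus numeric UID."""

    zone: str
    uid: int


@dataclass(frozen=True)
class FileEntry:
    path: str
    owner: Identity
    shared_with: frozenset = frozenset()  # explicit cross-zone grants (2d)


def may_access(who: Identity, entry: FileEntry) -> bool:
    """Access requires the same zone AND UID, or an explicit grant."""
    return who == entry.owner or who in entry.shared_with


bq_user = Identity("BQ", 500)      # UID 500 from BioQuant LDAP
dkfz_user = Identity("DKFZ", 500)  # UID 500 from DKFZ AD

private = FileEntry("/example/dkfz/private.dat", owner=dkfz_user)
shared = FileEntry("/example/dkfz/shared.dat", owner=dkfz_user,
                   shared_with=frozenset({bq_user}))

print(may_access(bq_user, private))  # same UID, different zone
print(may_access(bq_user, shared))   # explicit cross-zone grant
```

The tension between 2c and 2d shows up in the second file: a grant across zones needs some way to name a user from another zone, which a bare numeric UID cannot do.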

Page 27

31 EMC CONFIDENTIAL—INTERNAL USE ONLY

Current Environment

[Diagram. The LSDF is accessed from BQ, DKFZ, UniHD and bwGRiD. Identity sources: LDAP (BQ users); AD+SFU (DKFZ users); AD at UniHD (BQ users, bwGRiD users); LDAP (bwGRiD users).]

Overlaps in UID/GID values are possible.

Page 28

Isilon Authentication Solution for LSDF-2

LSDF2

Sites: BQ, DKFZ, bwGRiD-HD

Identity sources: LDAP (BQ users); AD+SFU (DKFZ users); AD at UniHD (BQ users, UniHD); LDAP (bwGRiD users)

Access Zone 1: BQ. Protocols: NFS, SMB; Auth1: LDAP-BQ; Auth2: UniHD-AD; Krb: UniHD-AD; ID-Mapping: External (same account name in LDAP + AD)

Access Zone 2: UniHD. Protocols: NFS; Auth1: LDAP-HD; Auth2: none; Krb: Uni-HD; ID-Mapping: none (could be done externally). Access to files that were created via SMB?

Access Zone 3: DKFZ. Protocols: NFS; Auth1: DKFZ-AD (SFU); Auth2: none; Krb: DKFZ-AD; ID-Mapping: AD-SFU but no SMB access

[Diagram labels: krb, mapping, user/passwd file extract, SID, UID/GID]

Solution attributes:

+ Isilon Access Zones can use existing auth infrastructure

+ All parties remain autonomous

+ No additional user administration required

+ Data access segregated in a secure manner

+ Can have overlapping UID/GIDs between Access Zones

Page 29

Isilon Collaboration Solution for LSDF-2

LSDF2

Sites: BQ, DKFZ, bwGRiD-HD

Identity sources: LDAP (BQ users); AD+SFU (DKFZ users); AD at UniHD (BQ users, UniHD); LDAP (bwGRiD users)

Access Zone 1: BQ. Protocols: NFS, SMB; Auth1: LDAP-BQ; Auth2: UniHD-AD; Krb: UniHD-AD; ID-Mapping: External (same account name in LDAP + AD)

Access Zone 2: UniHD. Protocols: NFS; Auth1: LDAP-HD; Auth2: none; Krb: Uni-HD; ID-Mapping: none (could be done externally)

Access Zone 3: DKFZ. Protocols: NFS; Auth1: DKFZ-AD (SFU); Auth2: none; Krb: DKFZ-AD; ID-Mapping: AD-SFU but no SMB access

[Diagram labels: krb, mapping, SID, UID/GID]

Access Zone 4: Collaboration. Protocols: NFS4 + ?; Auth1: LSDF2-AD; Auth2: none; Krb: LSDF2-AD; On-Disk-Identity: SID; ID-Mapping: ????? (map from unique SIDs to new and unique UIDs?)

LSDF2-AD: no users; one-way forest trust

This works as before
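The open ID-mapping question for the collaboration zone (mapping unique SIDs to new, unique UIDs) could be approached with an allocator like the one sketched below. This is one possible answer under stated assumptions, not what the slides or Isilon specify: the class name, the starting UID of 1,000,000 and the example SID strings are all hypothetical.

```python
import itertools


class SidUidAllocator:
    """Hands out one fresh UID per SID and remembers the assignment."""

    def __init__(self, first_uid=1_000_000):  # hypothetical reserved range
        self._next_uid = itertools.count(first_uid)
        self._uid_by_sid = {}

    def uid_for(self, sid: str) -> int:
        """Return the UID for a SID, allocating a new one on first sight."""
        if sid not in self._uid_by_sid:
            self._uid_by_sid[sid] = next(self._next_uid)
        return self._uid_by_sid[sid]


allocator = SidUidAllocator()

# Hypothetical SIDs of two users from different institutions who both
# have UID 500 in their home directory service.
bq_sid = "S-1-5-21-1111111111-2222222222-3333333333-500"
dkfz_sid = "S-1-5-21-4444444444-5555555555-6666666666-500"

print(allocator.uid_for(bq_sid))
print(allocator.uid_for(dkfz_sid))
print(allocator.uid_for(bq_sid))  # repeated lookups return the same UID
```

Since SIDs are unique across domains, keying the table on the SID avoids the UID collisions between institutions; the table itself would need to be stored persistently, which this sketch does not attempt.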

Page 30

Summary: AA problem understood!

→ Real life will tell

EMC Isilon has a solution ready for

• Multiple Authentication Providers

- LDAP, Active Directory, Local, Kerberos

- Including flexible ID-Mapping

• Multiple Zones for Multi-Tenancy

• Proposed solution for collaboration is external

In-Band solution

- → cross tenancy shares not compliant

- requires solution for “identity problem” by storage provider: none given

- no economical incentive for this option