Cloud based Projects at Belfast e-Science Centre: An Overview

Terry Harmer, London, February 2010

http://www.besc.ac.uk

DESCRIPTION

A presentation by Terry Harmer of the Belfast eScience Centre at the Repositories and the Cloud meeting organised by Eduserv and JISC in London on Feb 23 2010.

TRANSCRIPT

Page 1: Cloud based Projects at Belfast eScience Centre

February 2010 1

Cloud based Projects at Belfast e-Science Centre

An Overview

Terry Harmer

London

http://www.besc.ac.uk

Page 2: Cloud based Projects at Belfast eScience Centre

What do I do?

Technical Director of Belfast e-Science

• Develop project ideas for digital economy applications
• Form consortia to bid for funding … usually write the project funding proposals
  … funding from EPSRC, INI, TSB and private companies
• Lead technical architect for projects
• Project manager … also do software development

These projects are increasingly based on utility infrastructure: a mix of owned resources and resources from multiple utility vendors.

Page 3: Cloud based Projects at Belfast eScience Centre

Talk Outline

1. Talk objective
2. BeSC?
   ..some history of BeSC applications
3. Evolution of our infrastructure
4. 2 Examples of utility-centric deployed applications
5. Issues

Page 4: Cloud based Projects at Belfast eScience Centre

Objective

• To present some large-scale projects that are in field deployment with established user groups
– Dynamic and utility cloud focused
– Why this approach and what advantages has this approach given us.

• Issues, advantages, problems, pitfalls
– Headline
– Technical

Page 5: Cloud based Projects at Belfast eScience Centre

Belfast e-Science Centre?

• Belfast e-Science was established in 2002 with funding from EPSRC and the DTI under the UK e-Science programme.
– Funded since then by TSB, EPSRC, INI, MoD, QinetiQ
– Currently one of four EPSRC Platform Award funded e-Science Centres in the UK.
– BeSC is entirely self funding (and has been since 2002)

• We have the attitude of, and tend to operate like, a small R&D company
• Don’t really use resources within a University infrastructure
• Have close connections with many companies but less with host Uni.
• Mainly deal with commercial users and organisations.
• Have a tight budget and (perhaps too) big ambitions.

Page 6: Cloud based Projects at Belfast eScience Centre

A bit of BeSC Context

• Somewhat by accident, BeSC focused on commercial/industrial applications
– Some of the accident was a result of the initial DTI Centre funding emphasising commercial applications and Tony Hey’s Darwinian view of the e-Science programme.

• The industrial/commercial focus grew from the challenges within the application areas, which we felt offered something new and distinct to the e-Science community.
– No one else was focusing there so it made us unique
– There are real and significant challenges

Page 7: Cloud based Projects at Belfast eScience Centre

Why Commercial/Industrial?

• Media domain
– Speed : user driven
– Security : video is treated as money.
– Large data sizes : larger than LHC for example

• Financial domain
– Speed : 100,000+ share trades per second
– Security : company business
– Heavily regulated : where and how data is moved

• On-demand infrastructure/resources
– Hosting/utility management are necessary parts of a dynamic digital economy and technology still required.

• Applications
– Digital media : BBC, QinetiQ
– Financial Services : First Derivatives, ??
– Military Applications : UK MoD, QinetiQ

• Technology
– Resources, auto-deployment, on-demand resources
– Management of owned and 3rd party clouds
– Autonomic management, SLAs and scaling

Page 8: Cloud based Projects at Belfast eScience Centre

Experimental Environment (2003)

[Diagram not captured in the transcript; surviving label: BBC NI]

Page 9: Cloud based Projects at Belfast eScience Centre

Experimental Environment (Spring 2004)

Page 10: Cloud based Projects at Belfast eScience Centre

Experimental Environment (Autumn 2004)

Page 11: Cloud based Projects at Belfast eScience Centre

Experimental Environment (Autumn 2005)

Page 12: Cloud based Projects at Belfast eScience Centre

Deployment Environment 2010

Page 13: Cloud based Projects at Belfast eScience Centre

2 Examples

1. Financial Services with FD

2. On-demand media with BBC/QinetiQ/BT

Page 14: Cloud based Projects at Belfast eScience Centre

First Derivatives plc

• Provider of software to banks and financial services companies.
– Have software in 9 of the top 10 banks.
– Develop auto-trading software
– Provide financial services, consulting, technology outsourcing, design etc

Page 15: Cloud based Projects at Belfast eScience Centre

Simple Architecture

Page 16: Cloud based Projects at Belfast eScience Centre

Solution 1: 2004-2005

Page 17: Cloud based Projects at Belfast eScience Centre

Current Cloud Solution (2006- )

Page 18: Cloud based Projects at Belfast eScience Centre

Digital Media (2002-)

• Working in the evolving on-demand media environment
– Partners: BBC/QinetiQ/BT … completed late 2009
– Started pre-iPlayer and YouTube!

• Early concern was better resource utilisation in an expensive and highly dynamic environment.
– Early model of pooled resources

• Most recently in on-demand media infrastructures
– Project PRISM with BBC/QinetiQ/BT
– Supporting game console to phone to set-top box access.
– Much of our work now is on military media infrastructures.

Page 19: Cloud based Projects at Belfast eScience Centre

A circuit based infrastructure

[Diagram labels: Large-scale Content Store; Network Controller; Scheduling Automation; Broadcast Transmitters; Uplink to Satellite; Internet; Content Store; Presentation Suite; Scheduling Automation]

Page 20: Cloud based Projects at Belfast eScience Centre

A Media SOA (this slide dates from 2003!)

[Diagram labels: Services at each of BBC Northern Ireland, BBC Scotland, BBC Wales and BBC Network; High Speed Network]

Page 21: Cloud based Projects at Belfast eScience Centre

Mobile Non-geographic services (slide from 2005!)

[Diagram labels: Services at BBC Northern Ireland, BBC Scotland, BBC Wales, BBC Network, BBC Yorkshire, BBC West, BBC South, BBC E Midlands, BBC W Midlands, BBC NE, BBC East, BBC South West, BBC EY & L, BBC London and BBC NW; BBC Satellite; High Speed Network]

Page 22: Cloud based Projects at Belfast eScience Centre

A Dynamic Utility Cloud

Page 23: Cloud based Projects at Belfast eScience Centre

Work flow

• Work with BBC is winding down
– Expertise is moving to military applications.

• Currently managing 1+ petabytes of media content

• Has managed close to 2 petabytes in the last 4 years.

Page 24: Cloud based Projects at Belfast eScience Centre

Infrastructure Summary

• Dynamic collections of services
– Managing real user groups

• Services scale to established SLAs
– We attempt to keep our deployed infrastructure low

• Our infrastructure is a mix of owned and utility infrastructure
– Buying capacity and storage on demand is our norm.
– Increasingly the utility part is the majority for processing and user interfaces
– Owned infrastructure is a secure repository.
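As an illustration of scaling services to an SLA while keeping the deployed infrastructure low, the sketch below works out the smallest number of instances that covers a given request rate with some spare headroom. It is not BeSC's implementation: the function names, the `headroom` parameter and the idea of a fixed per-instance rate are all assumptions made for this example.

```python
import math


def instances_needed(request_rate: float, per_instance_rate: float,
                     headroom: float = 0.2, minimum: int = 1) -> int:
    """Smallest instance count that serves request_rate with spare headroom.

    All parameters are hypothetical: per_instance_rate is the load one
    instance is assumed to sustain within the SLA, and headroom is the
    fraction of extra capacity kept in reserve.
    """
    if per_instance_rate <= 0:
        raise ValueError("per_instance_rate must be positive")
    if request_rate < 0 or headroom < 0:
        raise ValueError("request_rate and headroom must be non-negative")
    required = request_rate * (1.0 + headroom) / per_instance_rate
    # Never drop below the minimum, so the service stays reachable.
    return max(minimum, math.ceil(required))


def scaling_delta(current: int, needed: int) -> int:
    """Positive: instances to buy on demand. Negative: instances to release."""
    return needed - current
```

Releasing instances when `scaling_delta` is negative is what keeps the deployed footprint low; buying them when it is positive is the on-demand capacity the slide describes.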

Page 25: Cloud based Projects at Belfast eScience Centre

Advantages (headline)

• Develop an infrastructure that suits the application we are deploying.
– The cost of ownership is pretty low.
– As an R&D organisation we can punch above our small size and relatively small budget.
– Experiment with great flexibility running parallel shared infrastructures.
– Reach out to real user groups
– ….. Unconstrained by (often entirely justified) corporate/academic infrastructure procedures.

You own what you need to own for as long as you need to own it and it can be configured exactly for your needs.

Page 26: Cloud based Projects at Belfast eScience Centre

Issues (general)

• Utility resource market is immature
– We treat providers as a commodity market place.
– The offerings can be difficult to compare
  • No standard unit of compute/storage
  • Prices will be dependent on the user usage pattern
  • What you get and what you can buy varies widely
– Some attempts at customer lock-in to providers
– Multi-provider clouds can be (relatively) expensive

• Need to think carefully about what is stored, where it is stored, how long it is stored, who has access
– We have put a lot of work into automated policy based content management … because we do not have the people to manage this.
  • Based around SAML and XACML
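The point that prices depend on the usage pattern can be made concrete with a small sketch. Everything here is hypothetical: the two tariffs, their prices and the three-part pricing model (instance hours, storage, transfer) are invented for illustration and do not describe any real provider.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Tariff:
    """A made-up provider price list (currency units are arbitrary)."""
    name: str
    per_instance_hour: float
    per_gb_month_stored: float
    per_gb_transferred: float


@dataclass(frozen=True)
class Usage:
    """One month of usage for a single application."""
    instance_hours: float
    gb_stored: float
    gb_transferred: float


def monthly_cost(tariff: Tariff, usage: Usage) -> float:
    """Total monthly bill for this usage pattern under this tariff."""
    return (tariff.per_instance_hour * usage.instance_hours
            + tariff.per_gb_month_stored * usage.gb_stored
            + tariff.per_gb_transferred * usage.gb_transferred)


def cheapest(tariffs: Sequence[Tariff], usage: Usage) -> Tariff:
    """Pick the tariff with the lowest bill for this usage pattern."""
    return min(tariffs, key=lambda t: monthly_cost(t, usage))


# Hypothetical tariffs: A has cheap compute, B has cheap storage and transfer.
TARIFFS = [
    Tariff("provider-a", 0.10, 0.15, 0.20),
    Tariff("provider-b", 0.12, 0.10, 0.05),
]
COMPUTE_HEAVY = Usage(instance_hours=1000, gb_stored=10, gb_transferred=10)
MEDIA_HEAVY = Usage(instance_hours=100, gb_stored=1000, gb_transferred=2000)
```

With these invented numbers the compute-heavy pattern is cheaper on `provider-a` and the media-heavy pattern on `provider-b`, which is why no single ranking of providers holds for every application.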

Page 27: Cloud based Projects at Belfast eScience Centre

Issues (general)

• Provider APIs and features constantly changing.
– No standard API
– New services and providers appearing.
– APIs not very well documented

• Weak SLAs from providers
– Currently we build our infrastructure assuming there is no SLA.
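One common way to cope with non-standard provider APIs and the absence of a usable SLA is to hide each provider behind a small shared interface and fail over between providers. The sketch below shows that general pattern only; it is not the BeSC design, and `ComputeProvider`, `FakeProvider`, `ProviderError` and `start_with_failover` are hypothetical names invented for this example.

```python
from typing import List, Protocol, Sequence, Tuple


class ProviderError(Exception):
    """Raised when a provider cannot satisfy a request."""


class ComputeProvider(Protocol):
    """Minimal interface each provider-specific adapter would implement."""
    name: str

    def start_instance(self, image: str) -> str:
        ...


class FakeProvider:
    """Stand-in adapter for the example; a real one would wrap a vendor API."""

    def __init__(self, name: str, healthy: bool = True) -> None:
        self.name = name
        self.healthy = healthy
        self._started = 0

    def start_instance(self, image: str) -> str:
        if not self.healthy:
            raise ProviderError(f"{self.name} is unavailable")
        self._started += 1
        return f"{self.name}/{image}/{self._started}"


def start_with_failover(providers: Sequence[ComputeProvider],
                        image: str) -> Tuple[str, str]:
    """Try providers in order; return (provider name, instance id)."""
    failures: List[str] = []
    for provider in providers:
        try:
            return provider.name, provider.start_instance(image)
        except ProviderError as exc:
            # No SLA is assumed, so a failure is routine: record it, move on.
            failures.append(str(exc))
    raise ProviderError("no provider could start the instance: "
                        + "; ".join(failures))
```

Keeping the interface this small means a change in one vendor's API is absorbed by that vendor's adapter alone.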

Page 28: Cloud based Projects at Belfast eScience Centre

Issues (technical)

• Machine performance unpredictable.
– CPU features especially unpredictable and can make a big difference to compute-heavy tasks
  • e.g. we are heavy video transcoding users.
– Individual instances can be (surprisingly) unreliable (hosts DO crash)

• Bandwidth unpredictable and can be costly

• Required to manage OS images
– Proliferation of images
– Using anything but vendor images requires trust in creator.
  • Nobody has a trust framework - you have to trust that user
– Creating own images (or using other people's) means more machines to keep up to date!
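Because the CPU features of a rented instance are not known in advance, one option is to check them once the instance is up and before giving it compute-heavy work. The sketch below parses text in the format of `/proc/cpuinfo` on x86 Linux, where supported features appear on a `flags` line. The helper names and the required flags in the usage note are illustrative assumptions, not a statement of what any particular transcoder needs.

```python
from typing import Iterable, List, Set


def parse_cpu_flags(cpuinfo_text: str) -> Set[str]:
    """Return the CPU feature flags from /proc/cpuinfo-style text.

    Only the first "flags" line is read, on the assumption that all
    cores of one instance report the same features.
    """
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_flags(cpuinfo_text: str, required: Iterable[str]) -> List[str]:
    """Sorted list of required flags the instance does not report."""
    return sorted(set(required) - parse_cpu_flags(cpuinfo_text))
```

On a Linux instance this could be called as `missing_flags(open("/proc/cpuinfo").read(), ["sse4_2", "avx"])`; a non-empty result is a signal to release the instance and request another rather than run a slow job on it.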

Page 29: Cloud based Projects at Belfast eScience Centre

Issues (security)

• Low latency to other consumers' boxes decreases attacker cost and time to perform timing attacks
– Nefarious, rich attackers can get on your box and slow you down or potentially compromise key generation
  • See http://people.csail.mit.edu/tromer/papers/cloudsec.pdf

• DDoS on a cloud provider can be very damaging to everyone in it
– Larger providers just increase the cost of the attack but reward is also high (see http://www.theregister.co.uk/2009/10/05/amazon_bitbucket_outage/ )

• No (meaningful) security QoS

• Post-attack analysis challenging - in most clouds you cannot inspect a disk to see logs without starting up machine

• Potential data security issues: who has access to physical boxes?
  • e.g. Amazon recommends all data on disks be encrypted

Page 30: Cloud based Projects at Belfast eScience Centre

Issues (banal)

• Area not well understood
– What was the inventory tag of the machine?

• Why are you not using our in-house IS cluster?