Cloud-based Projects at Belfast eScience Centre
DESCRIPTION
A presentation by Terry Harmer of the Belfast eScience Centre at the Repositories and the Cloud meeting organised by Eduserv and JISC in London on 23 February 2010.

TRANSCRIPT
February 2010 1
Cloud-based Projects at Belfast e-Science Centre
An Overview
Terry Harmer
London
http://www.besc.ac.uk
What do I do?
Technical Director of Belfast e-Science
• Develop project ideas for digital economy applications
• Form consortia to bid for funding … usually write the project funding proposals
… funding from EPSRC, INI, TSB and private companies
• Lead Technical architect for projects
• Project Manager… also do software development
These projects are increasingly based around utility infrastructure, consisting of our own resources and multiple utility vendors.
Talk Outline
1. Talk objective
2. BeSC?
   …some history of BeSC applications
3. Evolution of our infrastructure
4. 2 examples of utility-centric deployed applications
5. Issues
Objective
• To present some large-scale projects that are in field deployment with established user groups
– Dynamic and utility cloud focused
– Why this approach, and what advantages it has given us
• Issues, advantages, problems, pitfalls
– Headline
– Technical
Belfast e-Science Centre?
• Belfast e-Science was established in 2002 with funding from EPSRC and the DTI under the UK e-Science programme.
– Funded since then by TSB, EPSRC, INI, MoD, QinetiQ
– Currently one of four EPSRC Platform Award funded e-Science Centres in the UK.
– BeSC is entirely self-funding (and has been since 2002)
• We have the attitude of, and tend to operate like, a small R&D company
• Don't really use resources within the University infrastructure
• Have close connections with many companies, but less with the host Uni.
• Mainly deal with commercial users and organisations.
• Have a tight budget and (perhaps too) big ambitions.
A bit of BeSC Context
• As somewhat of an accidental decision, BeSC focused on commercial/industrial applications
– Some of the accident was a result of the initial DTI Centre funding emphasising commercial applications, and Tony Hey's Darwinian view of the e-Science programme.
• The industrial/commercial focus grew from the challenges within the application areas, which we felt offered something new and distinct to the e-Science community.
– No one else was focusing there, so it made us unique
– There are real and significant challenges
Why Commercial/Industrial?
• Media domain
– Speed: user driven
– Security: video is treated as money.
– Large data sizes: larger than the LHC, for example
• Financial domain
– Speed: 100,000+ share trades per second
– Security: company business; heavily regulated as to where and how data is moved
• On-demand infrastructure/resources
– Hosting/utility management are necessary parts of a dynamic digital economy, and the technology is still required.
• Applications
– Digital media – BBC, QinetiQ
– Financial Services – First Derivatives, ??
– Military Applications – UK MoD, QinetiQ
• Technology
– Resources, auto-deployment, on-demand resources
– Management of owned and 3rd-party clouds
– Autonomic management, SLAs and scaling
Experimental Environment (2003)
Experimental Environment (Spring 2004)
Experimental Environment (Autumn 2004)
Experimental Environment (Autumn 2005)
Deployment Environment 2010
2 Examples
1. Financial Services with FD
2. On-demand media with BBC/QinetiQ/BT
First Derivatives plc
• Provider of software to banks and financial services companies.
– Have software in 9 of the top 10 banks.
– Develop auto-trading software
– Provide financial services, consulting, technology outsourcing, design etc.
Simple Architecture
Solution 1: 2004-2005
Current Cloud Solution (2006- )
Digital Media (2002-)
• Working in the evolving on-demand media environment
– Partners: BBC/QinetiQ/BT… completed late 2009
– Started pre-iPlayer and YouTube!
• Early concern was better resource utilisation in an expensive and highly dynamic environment.
– Early model of pooled resources
• Most recently in on-demand media infrastructures
– Project PRISM with BBC/QinetiQ/BT
– Supporting access from games console to phone to set-top box.
– Much of our work now is on military media infrastructures.
A circuit-based infrastructure
[Diagram: BBC Northern Ireland, BBC Scotland, BBC Wales and BBC Network sites; large-scale content store, network controller, scheduling automation, broadcast transmitters, uplink to satellite, Internet, content store, presentation suite]
A Media SOA (this slide dates from 2003!)
[Diagram: services at BBC Northern Ireland, BBC Scotland, BBC Wales and BBC Network, connected by a high-speed network]
Mobile Non-geographic services (slide from 2005!)
[Diagram: services at BBC Northern Ireland, Scotland, Wales, Network, Yorkshire, West, South, E Midlands, W Midlands, NE, East, South West, EY & L, London, NW and Satellite, connected by a high-speed network]
A Dynamic Utility Cloud
Work flow
• Work with the BBC is winding down
– Expertise is moving to military applications.
• Currently managing around 1+ petabytes of media content
• Has managed close to 2 petabytes in the last 4 years.
Infrastructure Summary
• Dynamic collections of services
– Managing real user groups
• Services scale to established SLAs
– We attempt to keep our deployed infrastructure low
• Our infrastructure is a mix of owned and utility infrastructure
– Buying capacity and storage on demand is our norm.
– Increasingly, the utility part is the majority for processing and user interfaces
– Owned infrastructure is a secure repository.
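The owned/utility split above amounts to a simple capacity decision: serve demand from owned infrastructure first, and buy utility capacity on demand for the remainder. A minimal sketch of that decision, with invented units and names (this is not BeSC's actual scheduler):

```python
# Minimal sketch of an owned-first, utility-burst capacity decision.
# Units and names are illustrative, not BeSC's real system.

def plan_capacity(demand_units, owned_units):
    """Return (owned_used, utility_to_buy) for a given demand."""
    owned_used = min(demand_units, owned_units)
    utility_to_buy = max(0, demand_units - owned_units)
    return owned_used, utility_to_buy

# Demand within owned capacity: no utility purchase needed.
print(plan_capacity(40, 100))   # (40, 0)
# Demand above owned capacity: burst the excess to a utility provider.
print(plan_capacity(250, 100))  # (100, 150)
```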
Advantages (headline)
• Develop an infrastructure that suits the application we are deploying.
– The cost of ownership is pretty low.
– As an R&D organisation, we can punch above our small size and relatively small budget.
– Experiment with great flexibility, running parallel shared infrastructures.
– Reach out to real user groups…
– …unconstrained by (often entirely justified) corporate/academic infrastructure procedures.

You own what you need to own, for as long as you need to own it, and it can be configured exactly for your needs.
Issues (general)
• Utility resource market is immature
– We treat providers as a commodity market place.
– The offerings can be difficult to compare
  • No standard unit of compute/storage
  • Prices will be dependent on the user's usage pattern
  • What you get and what you can buy varies widely
– Some attempts at customer lock-in by providers
– Multi-provider clouds can be (relatively) expensive
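With no standard unit of compute or storage, comparing offerings means normalising each one to a common unit first. A toy sketch of that normalisation, with entirely invented provider names, prices and capacities:

```python
# Toy normalisation of provider offerings to a common unit
# (price per GB-month of storage). All figures are invented for illustration.

offerings = {
    "provider_a": {"price": 120.0, "gb": 1000, "months": 1},
    "provider_b": {"price": 250.0, "gb": 5000, "months": 0.5},
}

def price_per_gb_month(offer):
    """Normalise a bundle price to cost per GB-month."""
    return offer["price"] / (offer["gb"] * offer["months"])

# Cheapest per normalised unit first.
for name, offer in sorted(offerings.items(),
                          key=lambda kv: price_per_gb_month(kv[1])):
    print(f"{name}: {price_per_gb_month(offer):.3f} per GB-month")
```

The point is only that the raw bundle prices (120 vs 250) say nothing until reduced to a common unit; usage-pattern-dependent pricing makes the real comparison harder still.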
• Need to think carefully about what is stored, where it is stored, how long it is stored, and who has access
– We have put a lot of work into automated policy-based content management… because we do not have the people to manage this.
  • Based around SAML and XACML
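The XACML model reduces to evaluating attribute-based rules against a request and returning Permit or Deny. A heavily simplified sketch of that decision step; the attribute names and example policy are invented, and a real deployment would use actual SAML assertions and an XACML engine rather than this toy:

```python
# Simplified XACML-style policy decision point (PDP).
# Rules match on subject/resource/action attributes; no match means Deny.
# Attribute names and the example policy are invented for illustration.

POLICIES = [
    {"role": "archivist", "resource_class": "media", "action": "read",
     "effect": "Permit"},
    {"role": "archivist", "resource_class": "media", "action": "delete",
     "effect": "Deny"},
]

def decide(request):
    """Return the effect of the first matching rule, Deny if none match."""
    for rule in POLICIES:
        if all(request.get(k) == v for k, v in rule.items() if k != "effect"):
            return rule["effect"]
    return "Deny"  # deny by default when no rule applies

print(decide({"role": "archivist", "resource_class": "media",
              "action": "read"}))  # → Permit
```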
Issues (general)
• Provider APIs and features constantly changing.
– No standard API
– New services and providers appearing.
– APIs not very well documented
• Weak SLAs from providers
– Currently we build our infrastructure assuming there is no SLA.
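"Assume there is no SLA" translates, in practice, into building retry and failover into every provider call. A minimal sketch of that pattern; the provider callables here are stand-ins, not real vendor APIs:

```python
# Minimal retry-then-failover pattern for calling utility providers,
# built on the assumption that no SLA will be honoured.
# The provider callables below are stand-ins, not real vendor APIs.

def call_with_failover(providers, attempts_each=2):
    """Try each provider a few times; fall through to the next on failure."""
    last_error = None
    for provider in providers:
        for _ in range(attempts_each):
            try:
                return provider()
            except Exception as err:  # a real system would narrow this
                last_error = err
    raise RuntimeError("all providers failed") from last_error

# Simulated provider that times out once, then succeeds.
outcomes = iter([Exception("timeout"), "ok"])

def flaky_provider():
    result = next(outcomes)
    if isinstance(result, Exception):
        raise result
    return result

print(call_with_failover([flaky_provider]))  # → ok (second attempt succeeds)
```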
Issues (technical)
• Machine performance unpredictable.
– CPU features especially unpredictable, and can make a big difference to compute-heavy tasks
  • e.g. we are heavy video-transcoding users.
– Individual instances can be (surprisingly) unreliable (hosts DO crash)
• Bandwidth unpredictable and can be costly
• Required to manage OS images
– Proliferation of images
– Using anything but vendor images requires trust in the creator.
  • Nobody has a trust framework; you have to trust that user
– Creating your own images (or using other people's) means more machines to keep up to date!
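One way to cope with unpredictable per-instance performance is to benchmark each freshly started instance before admitting it to a compute pool (e.g. for transcoding), rejecting the slow ones. A sketch with an invented workload and threshold:

```python
# Sketch: admit a freshly started instance to a compute pool only if a
# quick CPU benchmark meets a threshold. Workload and threshold are
# illustrative, not a real acceptance test for any provider.
import time

def quick_benchmark(iterations=200_000):
    """Return rough operations-per-second for a tight arithmetic loop."""
    start = time.perf_counter()
    total = 0
    for i in range(iterations):
        total += i * i
    elapsed = time.perf_counter() - start
    return iterations / elapsed

def admit_instance(min_ops_per_sec=1_000_000):
    """Return (admitted?, measured score) for this machine."""
    score = quick_benchmark()
    return score >= min_ops_per_sec, score

admitted, score = admit_instance()
print(f"admitted={admitted}, score={score:,.0f} ops/s")
```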
Issues (security)
• Low latency to other consumers' boxes decreases attacker cost and time to perform timing attacks
– Nefarious, well-funded attackers can get on your box and slow you down, or potentially compromise key generation
  • See http://people.csail.mit.edu/tromer/papers/cloudsec.pdf
• DDoS on a cloud provider can be very damaging to everyone in it
– Larger providers just increase the cost of the attack, but the reward is also higher (see http://www.theregister.co.uk/2009/10/05/amazon_bitbucket_outage/)
• No (meaningful) security QoS
• Post-attack analysis is challenging: in most clouds you cannot inspect a disk to see logs without starting up the machine
• Potential data security issues: who has access to the physical boxes?
– e.g. Amazon recommends all data on disks be encrypted
Issues (banal)
• Area not well understood
– "What was the inventory tag of the machine?"
– "Why are you not using our in-house IS cluster?"