Supporting Researchers in the Cloud

Dr. Ann Borda

Executive Director, VeRSI (Victorian eResearch Strategic Initiative)

Chief Executive Officer, VPAC (Victorian Partnership for Advanced Computing) and V3 Alliance

Post on 16-Apr-2017

IDC Report – The Digital Universe in 2020

• From 2005 to 2020, the digital universe will grow by a factor of 300, from 130 exabytes to 40,000 exabytes, or 40 trillion gigabytes (more than 5,200 gigabytes for every man, woman, and child in 2020).

• From now until 2020, the digital universe will about double every two years.

• Only a tiny fraction of the digital universe has been explored for analytic value. By 2020, as much as 33% of the digital universe will contain information that might be valuable if analyzed.

• By 2020, nearly 40% of the information in the digital universe will be "touched" by cloud computing providers — meaning that a byte will be stored or processed in a cloud somewhere in its journey from originator to disposal.

• The amount of information individuals create themselves (writing documents, taking pictures, downloading music, etc.) is far less than the amount of information being created about them in the digital universe.

www.emc.com/leadership/digital-universe/
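The figures in these bullets can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers quoted above; the decimal conversion (1 exabyte = 10^9 gigabytes) and the variable names are assumptions made for illustration, and the implied population is derived from the per-person figure rather than quoted from the report.

```python
import math

# Figures quoted in the bullets above (IDC, "The Digital Universe in 2020").
START_YEAR, END_YEAR = 2005, 2020
START_EB, END_EB = 130, 40_000
GB_PER_EB = 10**9  # assumption: decimal units, 1 exabyte = 1e9 gigabytes

# Overall growth, quoted on the slide as "a factor of 300".
growth_factor = END_EB / START_EB

# Total size in gigabytes, quoted as "40 trillion gigabytes".
total_gb = END_EB * GB_PER_EB

# Number of doublings needed to reach that growth, and the average
# time per doubling over the 15-year period.
doublings = math.log2(growth_factor)
doubling_time = (END_YEAR - START_YEAR) / doublings

# Population implied by "5,200 gigabytes per person" (derived, not quoted).
implied_population = total_gb / 5_200

print(f"growth factor: {growth_factor:.0f}x")
print(f"total: {total_gb:.1e} GB")
print(f"average doubling time: {doubling_time:.2f} years")
print(f"implied 2020 population: {implied_population / 1e9:.1f} billion")
```

The growth factor comes out at about 308 (rounded to 300 in the report), and the average doubling time over 2005 to 2020 is roughly 1.8 years, which is consistent with "about double every two years".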

Gartner Top Trends 2014

A Connected World

Internet of Things - en.wikipedia.org/wiki/Internet_of_Things

What is the cloud?

The word 'cloud' is now ubiquitous when discussing online technologies and services.

“Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.”

The U.S. National Institute of Standards and Technology (NIST) Sept 2013

Service Models - Definitions

Software-as-a-Service (SaaS): Applications served over the Internet, like Google Docs. Such applications frequently include collaboration or sharing features that would be more difficult to implement in desktop software.

Platform-as-a-Service (PaaS): Specialized APIs for building applications on the Internet, like Google App Engine.

Infrastructure-as-a-Service (IaaS): Low-level services for basic storage and computing. Examples include Amazon Web Services, Windows Azure, and Google Compute Engine.
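One common way to picture the difference between the three models is by how much of the stack the provider manages. The sketch below is illustrative only: the layer names and the cut-off chosen for each model are simplifying assumptions (real offerings draw the line differently), and the example services are the ones named in the definitions above.

```python
from enum import Enum

# Simplified stack, lowest layer first. Layer names are assumed for illustration.
STACK = [
    "networking", "storage", "servers", "virtualisation",
    "operating system", "runtime", "application",
]


class ServiceModel(Enum):
    """Each value is the highest layer the provider manages (assumed cut-off)."""
    IAAS = "virtualisation"
    PAAS = "runtime"
    SAAS = "application"


# Example services as named in the definitions above.
EXAMPLES = {
    ServiceModel.SAAS: ["Google Docs"],
    ServiceModel.PAAS: ["Google App Engine"],
    ServiceModel.IAAS: ["Amazon Web Services", "Windows Azure", "Google Compute Engine"],
}


def provider_managed(model: ServiceModel) -> list[str]:
    """Layers the cloud provider looks after under this model."""
    return STACK[: STACK.index(model.value) + 1]


def user_managed(model: ServiceModel) -> list[str]:
    """Layers left to the researcher or their IT support."""
    return STACK[STACK.index(model.value) + 1:]


if __name__ == "__main__":
    for model in ServiceModel:
        print(model.name, "| examples:", EXAMPLES[model],
              "| user manages:", user_managed(model) or ["(none)"])
```

Read this way, moving from IaaS to SaaS hands more of the stack to the provider, which is why SaaS needs the least technical effort from a researcher and IaaS offers the most control.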

Service Models

http://en.wikipedia.org/wiki/Cloud_computing

Cloud Service Providers – Just a few!

1. Amazon
2. VMware
3. Microsoft Azure
4. SalesForce
5. Google
6. Rackspace
7. IBM
8. Citrix
9. Joyent
10. SoftLayer
11. OpenStack
12. Cisco
13. AT&T
14. GoGrid
15. Oracle
16. SAP
17. Dropbox
18. Verizon/Terremark

A View from the Researcher

Challenges and Opportunities

Researchers as extreme “information workers”

Consumers and creators of information and knowledge

Open and closed access publishing of data and results

“Data sets are becoming the new instruments of research”

Dan Atkins, University of Michigan

Drivers

• Key technological drivers

• Moore's Law: the exponential increase in computing power and solid-state memory (faster, cheaper devices, higher capacity storage, etc.)

• Dramatic increase in communication bandwidth

• Research Process: coping with the data deluge
– Finding and accessing data
– Linking data
– Processing data
– Interpreting data
– Presenting results

• Increased international and cross-organisational collaboration - Doing what was previously impossible

Genomics

High Energy Physics

Astronomy

Research Patterns

*Humanities and social sciences (hybrid models across these types)

Based on a slide by Ewan Birney, EMBL, 2010

21st Century Research

Organise Data

Discover Data

Publish Data

Use Data

Collect Data

Generate Data

> 80% of researcher's time

< 20% of researcher's time

A basic reality of research

Acknowledgement: Dr. Rhys Francis, Oct 2013

Reality of Data Use?

Researcher infrastructures at a Glance

Compute

Data

Networks

Tools

Data scales can vary (most research isn't about big data)

Infrastructure Stack 2009-2013 (NCRIS & Super Science)

• Infrastructure Co-ordination
• Research Data Commons: better data management, description and access
• Extended Bandwidth
• Better HPC modelling
• Larger data collections
• Shared Access Methods
• Digital Laboratories: better research tools, environments and workflows

• NCI & Pawsey: + $156M
• Australian Research Education Networks (AREN): + $40M
• Research Data Storage Initiative (RDSI): + $50M
• Australian Access Federation (AAF): + $2M
• Australian National Data Service (ANDS): + $75M
• National eResearch Collaboration Tools and Resources: + $69.5M
• Australian eResearch Infrastructure Council: $1.5M

• National Capabilities: + $246M
• Research Integration: + $144.5M

R. Francis, AERIC, 2010

A National Stimulation Package!

• Do computational modeling, complete data analysis, visualize results: NCI, Pawsey
• Keep data and observations; describe, collect, share, find, and re-use them: ANDS, RDSI
• Use new tools and apps, work remotely and collaborate in the cloud: NeCTAR
• Increased connectivity and bandwidth (AREN, AAF); single sign-on, high-reliance services, high-reliability servers

Acknowledgement: Rhys Francis, AeRIC, May 2012

www.nectar.org.au

Cloud – What’s in it for researchers?

Cloud computing is cost-competitive for a wide variety of research workloads and application scenarios

Some examples:

• 24/7 Web applications
• Shared access to documents, data and tools for multiple collaborators
• Large-scale data processing
• "Bursty" or "spiky" CPU-intensive workloads
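The "bursty" case is where the cost argument is easiest to see: owned hardware has to be sized for the peak and paid for around the clock, while cloud capacity is paid for only while it runs. The sketch below is a back-of-the-envelope comparison, not a pricing model. Every number in it (the per-core-hour rates, the 720-hour month, the workload shapes) is a hypothetical value chosen for illustration, and the function names are invented for this example.

```python
HOURS_PER_MONTH = 720  # assumption: a 30-day month

# Hypothetical rates in dollars per core-hour; not real prices.
OWNED_RATE = 0.03   # amortised cost of owned hardware, paid whether busy or idle
CLOUD_RATE = 0.10   # on-demand cloud price, paid only while cores are running


def core_hours(bursts):
    """Total core-hours in a workload given as (cores, hours) pairs."""
    return sum(cores * hours for cores, hours in bursts)


def owned_cost(peak_cores, rate=OWNED_RATE, hours=HOURS_PER_MONTH):
    """Owned hardware is sized for the peak and costs money all month."""
    return peak_cores * hours * rate


def cloud_cost(bursts, rate=CLOUD_RATE):
    """Cloud capacity is billed only for the core-hours actually used."""
    return core_hours(bursts) * rate


def utilisation(bursts, peak_cores, hours=HOURS_PER_MONTH):
    """Fraction of the peak-sized capacity that the workload really uses."""
    return core_hours(bursts) / (peak_cores * hours)


# Hypothetical workloads: one day at 200 cores plus a small baseline,
# versus 200 cores busy for the whole month.
bursty = [(200, 24), (4, 696)]
steady = [(200, 720)]

if __name__ == "__main__":
    for name, workload in [("bursty", bursty), ("steady", steady)]:
        print(f"{name}: utilisation {utilisation(workload, 200):.0%}, "
              f"cloud ${cloud_cost(workload):,.2f} vs owned ${owned_cost(200):,.2f}")
```

Under these made-up rates the cloud is cheaper for the bursty workload and more expensive for the steady one; the break-even point is a utilisation of OWNED_RATE / CLOUD_RATE, here 30%. Real comparisons would also need to account for storage, data transfer and staff time.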

Humanities and the NeCTAR Research Cloud

• Researcher Lauren Gawne is a linguist who has just completed a PhD thesis at the University of Melbourne: a linguistic description of Lamjung Yolmo, a Tibeto-Burman language of Nepal.

• “I can now search texts much quicker and modify them to suit my purposes. I don't really 'write code' as in create programs from scratch, but I'm quite happy to tinker with it… I am part of a generation who grew up with computers and I navigate html and xml when writing blogs and so forth.”

Software as a Service - NeCTAR Research Cloud

“By running Tilemill on the NeCTAR servers I was able to work on the program through my browser, and it ran as effectively for me as for any other student in the workshop (even those with shiny new computers). I was also free to move between working on it at home, or from a computer in the office.

For the collaborative group map we could work together remotely, which definitely made it easier.”

Virtual Laboratories

• CSIRO - Virtual Geophysics Laboratory

• Genomics Virtual Laboratory

• University of Tasmania - Marine Virtual Laboratory

• The All Sky Virtual Observatory

• Climate and Weather Science Laboratory

• Humanities Networked Infrastructure (HuNI)

• The Characterisation Virtual Laboratory: research environments for exploring inner space

http://www.nectar.org.au/virtual-laboratories-1

eResearch Tools

• Macquarie University - UniCarbKB: e-infrastructure for glycomics
• University of Western Australia - cloud-based bio-informatics tools
• University of Queensland - OzTrack: tools for the storage, analysis and visualisation of animal tracking data
• Monash University - Bioscience Data Platform - TARDIS in the cloud
• Australian Synchrotron - tools for the Australian Synchrotron community
• Australian National University - Drishti and Voluminous - volume visualisation tools
• University of NSW - federated Archaeological information management system
• Curtin University - Collaborative and Automated Tools for the analysis of marine imagery and video (CATAMI)
• Monash University - Geology from Geodynamics
• Queensland Cyber Infrastructure – Quadrant
• Centre of Excellence for Particle Physics - high throughput computing for globally connected science
• University of Queensland - Aust-ESE project - tools to support collaborative authoring and management of electronic scholarly editions
• University of Adelaide - Submission, harmonisation and retrieval of ecological data – SHaRED
• University of Melbourne - Human Variome Project, Australian node clinical and molecular data linkage tools
• Schizophrenia Research Institute - Extension and Enhancement of Systems for the Australian Schizophrenia Research Bank
• CSIRO - Cloud based image analysis and processing toolbox

http://www.nectar.org.au/eresearch-tools

Data, its management and use: a common consideration between the stakeholders

Data

Framework for access and use

Libraries supporting Researchers in the Cloud

"as researchers move into the cloud, and the world grows information rich, where are libraries (in the cloud)"?

“use the cloud to go to them, not they come to us?”

Future Researcher + Future Libraries

Some gaps:

– Research interactions often cross organisational boundaries and span multiple infrastructures
– Researchers managing complex workflows
– What is the role of service providers like libraries?
– What new skills/roles are needed to make stuff happen?
– Addressing barriers to uptake

– Conventional wisdom suggests we may need interoperability or a standard (or two), AND a lot of people working together.

Understanding Research Innovation

McCrindle Research – Emerging Research Methods http://www.mccrindle.com.au

Europeana Research

Platform

Content & Data

Tools

“Portal”

“Annotation”

“API”

“SPARQL”

Services

http://pro.europeana.eu/web/europeana-cloud

Helix Nebula

• http://helix-nebula.eu/

• CERN, EMBL, ESA

• Helix Nebula, the Science Cloud, will support the massive IT requirements of European scientists.

• The project aims to pave the way for the development and exploitation of a Cloud Computing Infrastructure, initially based on the needs of European IT-intense scientific research organisations, while also allowing the inclusion of other stakeholders’ needs (governments, businesses and citizens).

Linked Open Data Cloud

• http://linguistics.okfn.org/resources/llod/


Digital Curation

• Digital data curation involves a wide range of activities, many of which may be suitable for deployment within a cloud environment.

• These range from infrequent, resource-intensive tasks which will benefit from the ability to rapidly provision resources, to day-to-day collaborative activities which can be facilitated by networked cloud services.

E.g. Kindura project (duraspace.org): https://jiscinfonetcasestudies.pbworks.com/w/page/45197715/Kindura

Digital curation and the cloud White Paper 2012

Communities of practice

FutureEverything

Apps for Europe

Apps for Europe is a support network that provides tools to transform ideas for data-based apps into viable businesses, and FutureEverything is a key member of that network.

It brings together a powerful European network of individuals and organisations who have been involved in open data programmes and in helping promising ideas to scale.

http://futureeverything.org

Developer communities

New Models of Delivery

• Complement the continuum of library/information services that are provided at local, faculty, institutional, state, national levels
– For researcher uptake - all levels of infrastructure must be in place, or at least services they can access
– Bring researcher champions on board

• People as Infrastructure

• Design everything with the user in mind

Joined up Thinking

• Identify the most commonly requested services (plug-and-play services) and strategise with agility and in consultation…

• Libraries as aggregators, facilitators, evaluators, validators, ‘nodes’ …

• Flexible underlying (technical) services…. fronted by a collaborative (aka human help desk) shop front…

• Invest in accessible & coordinated user support & training

• Invest in awareness and outreach

Ideal User Models?

Any Questions?

Dr. Ann Borda, V3 Alliance

Level 3, Thomas Cherry Building (201), The University of Melbourne 3010

Phone: 03 8344 8322 | Mobile: 0437 469 417 | Email: [email protected] | [email protected]

Copyright (c) 2013, V3 Alliance, Dr. Ann Borda

This work is licensed under a Creative Commons Attribution 2.5 Australia License. To view a copy of this license visit: http://creativecommons.org/licenses/by/2.5/au/