Methodbox: From open-data to open-insight MethodBox Team Jul 2011

Upload: verdi

Post on 21-Jan-2016

Page 1: Methodbox: From open-data to open-insight

Methodbox: From open-data to open-insight

MethodBox Team, Jul 2011

Page 2: Methodbox: From open-data to open-insight

Presentation

• Problem: Data tsunami + puddles of insight

• Solution: Collective efficient science

• Deployment: Sense-making networks on open-data

Page 3: Methodbox: From open-data to open-insight

Quote

“…you call it Epidemiology and we call it quantitative Social Science”

A leading researcher, Jul 2011

• Open data
• Common methods
• Potentially complementary expertise

Page 4: Methodbox: From open-data to open-insight

Obesity Example

Fragmented understanding of public health problems such as obesity

... data, methods/models and expertise split across disciplines (e.g. social vs. biomedical) and settings (e.g. academia vs. healthcare)

Page 5: Methodbox: From open-data to open-insight

Puddles of research around the organising principle

… but policies need the big picture

Page 6: Methodbox: From open-data to open-insight

Data Example

• Time series data from Health Visitors from Wirral

• Data deposited with UKDA but not used for 16 years

• Children measured at the time the obesity epidemic took hold…

Page 7: Methodbox: From open-data to open-insight

Material deprivation affecting children (households with children: % on benefits in 2001-3)

Wirral (0.3M), UK

[Map legend: fifths of IDAC 2004. Red (light) = most deprived; Red (dark); Purple; Blue (dark); Blue (light) = most affluent]

Page 8: Methodbox: From open-data to open-insight

BMI of 3 yr olds, 1988 - 1989

[Map legend: fifths of BMI SDS. Red (light) = fattest; Red (dark); Purple; Blue (dark); Blue (light) = thinnest]

Page 9: Methodbox: From open-data to open-insight

BMI of 3 yr olds, 1990 - 1991

[Map legend: fifths of BMI SDS. Red (light) = fattest; Red (dark); Purple; Blue (dark); Blue (light) = thinnest]

Page 10: Methodbox: From open-data to open-insight

BMI of 3 yr olds, 1992 - 1993

[Map legend: fifths of BMI SDS. Red (light) = fattest; Red (dark); Purple; Blue (dark); Blue (light) = thinnest]

Page 11: Methodbox: From open-data to open-insight

BMI of 3 yr olds, 1994 - 1995

[Map legend: fifths of BMI SDS. Red (light) = fattest; Red (dark); Purple; Blue (dark); Blue (light) = thinnest]

Page 12: Methodbox: From open-data to open-insight

BMI of 3 yr olds, 1996 - 1997

[Map legend: fifths of BMI SDS. Red (light) = fattest; Red (dark); Purple; Blue (dark); Blue (light) = thinnest]

Page 13: Methodbox: From open-data to open-insight

BMI of 3 yr olds, 1998 - 1999

[Map legend: fifths of BMI SDS. Red (light) = fattest; Red (dark); Purple; Blue (dark); Blue (light) = thinnest]

Page 14: Methodbox: From open-data to open-insight

BMI of 3 yr olds, 2000 - 2001

[Map legend: fifths of BMI SDS. Red (light) = fattest; Red (dark); Purple; Blue (dark); Blue (light) = thinnest]

Page 15: Methodbox: From open-data to open-insight

BMI of 3 yr olds, 2002 - 2003

[Map legend: fifths of BMI SDS. Red (light) = fattest; Red (dark); Purple; Blue (dark); Blue (light) = thinnest]

Page 16: Methodbox: From open-data to open-insight

Child Obesity: Action 6 years after signal in the data

[Figure: Body Mass Index (BMI) trend in Wirral 3y-olds from 1988 to 2003. x-axis: month of measurement by Health Visitor (Mar-88 to Aug-04). y-axis: three-monthly rolling average BMI SDS (-0.4 to 0.5). Annotations on the chart: Clues, Actions.]

SDS = standard deviation score from 1990 British Growth Reference charts; adjusts for age and sex of the child
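The quantity plotted on this slide can be sketched in a few lines. The British 1990 growth reference is published as LMS parameters by age and sex, so the sketch below uses the standard LMS formula for a standard deviation score and then pools measurements over a three-month window. The L, M, S values and the measurements are invented placeholders, not reference-table values or Wirral data, and pooling all measurements in the window is only one reading of "three-monthly rolling average".

```python
from collections import defaultdict
from math import log


def lms_sds(value, L, M, S):
    """Standard deviation score of `value` given LMS parameters."""
    if L == 0:
        return log(value / M) / S
    return ((value / M) ** L - 1) / (L * S)


def rolling_mean_by_month(records, window=3):
    """records: iterable of (year, month, sds).

    Returns [(year, month, mean)] for each month that has data, where the
    mean pools every score measured in that month and the window-1
    calendar months before it.
    """
    by_month = defaultdict(list)
    for year, month, sds in records:
        by_month[year * 12 + (month - 1)].append(sds)

    trend = []
    for idx in sorted(by_month):
        pooled = [
            s
            for k in range(idx - window + 1, idx + 1)
            for s in by_month.get(k, [])
        ]
        trend.append((idx // 12, idx % 12 + 1, sum(pooled) / len(pooled)))
    return trend


# Placeholder LMS values for a single age/sex group (NOT from the reference).
L, M, S = -1.5, 16.0, 0.08
# Invented (year, month, BMI) measurements.
measurements = [(1988, 3, 16.0), (1988, 4, 16.4), (1988, 5, 15.8), (1988, 6, 16.9)]
records = [(y, m, lms_sds(bmi, L, M, S)) for y, m, bmi in measurements]

for year, month, mean_sds in rolling_mean_by_month(records):
    print(f"{year}-{month:02d}: {mean_sds:+.2f}")
```

In real use the L, M, S triple would be looked up per child from the reference tables by age and sex before scoring.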

Page 17: Methodbox: From open-data to open-insight

Similar Data in 2011

• National Child Measurement Programme

• Anonymised national database

• Could be opened (like the National Pupil Database) and extended to other policy-relevant, timely research

Page 18: Methodbox: From open-data to open-insight

Data Already in UK Data Archive

• Example: Health Surveys for England (annual)

• Analyses feed national policies

• Does evidence need to be localised?...

Page 19: Methodbox: From open-data to open-insight

Women and not men from low-income households are fatter in England

[Figure: BMI (y-axis, 25 to 27.5) by income fifth, 1 to 5, low to high (x-axis), plotted separately for men and women.]

Data from Health Survey for England

Page 20: Methodbox: From open-data to open-insight

Women from low-income households and men from high-income households are fatter in Greater Manchester

[Figure: BMI (y-axis, 25 to 27.5) by income fifth, 1 to 5, low to high (x-axis), plotted separately for men and women.]

Data from Health Survey for England
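The national and local charts are the same tabulation run on different subsets. A minimal sketch of that tabulation follows, assuming each survey record carries the hypothetical fields `sex`, `income_fifth` (1 = lowest income), `region` and `bmi`; the rows are invented for illustration and are not HSE data.

```python
from collections import defaultdict
from statistics import mean


def mean_bmi_by_income_fifth(rows, region=None):
    """Mean BMI per (sex, income fifth); optionally restricted to one region."""
    groups = defaultdict(list)
    for row in rows:
        if region is not None and row["region"] != region:
            continue
        groups[(row["sex"], row["income_fifth"])].append(row["bmi"])
    return {key: round(mean(values), 2) for key, values in sorted(groups.items())}


# Invented records; field names and values are assumptions.
rows = [
    {"sex": "F", "income_fifth": 1, "region": "Greater Manchester", "bmi": 28.0},
    {"sex": "F", "income_fifth": 1, "region": "Elsewhere", "bmi": 26.0},
    {"sex": "M", "income_fifth": 5, "region": "Greater Manchester", "bmi": 27.5},
    {"sex": "M", "income_fifth": 5, "region": "Elsewhere", "bmi": 26.5},
]

national = mean_bmi_by_income_fifth(rows)
local = mean_bmi_by_income_fifth(rows, region="Greater Manchester")
print(national)
print(local)
```

Running both calls side by side is what lets a localised pattern show up against the national one.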

Page 21: Methodbox: From open-data to open-insight

Linked-data ≠ Linked: data, methods & investigators

Previous slides show social-biomedical signals about obesity from under-used datasets

Biomedical Research: Data, methods & investigators

Social Research: Data, methods & investigators

Page 22: Methodbox: From open-data to open-insight

MethodBox Aim

... to increase the sharing and reuse of data sources & extracts and data processing methods in one in-silico environment ('e-Lab') shared by social and health researchers

Page 23: Methodbox: From open-data to open-insight

e-Lab

Socially-stimulating science, in-silico

Research Object

Find, Share, Reuse

• Data-sources
• Data-preparation scripts
• Research protocol
• Statistical analysis scripts
• Slides
• Working datasets
• Figures/Graphics
• Manuscripts
• References
• Analysis-logs & notes
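One way to picture a Research Object is as a single shareable bundle that lists its components by kind. The sketch below is illustrative only: the class names, fields and methods are hypothetical and are not taken from MethodBox or any e-Lab implementation.

```python
from dataclasses import dataclass, field

# Component kinds named on the slide.
KINDS = {
    "data-source", "data-preparation script", "research protocol",
    "statistical analysis script", "slides", "working dataset",
    "figure", "manuscript", "reference", "analysis log",
}


@dataclass
class Item:
    kind: str
    title: str
    owner: str

    def __post_init__(self):
        if self.kind not in KINDS:
            raise ValueError(f"unknown kind: {self.kind!r}")


@dataclass
class ResearchObject:
    title: str
    items: list = field(default_factory=list)

    def add(self, item):
        self.items.append(item)
        return self

    def find(self, kind):
        """Find: all items of one kind."""
        return [item for item in self.items if item.kind == kind]

    def contributors(self):
        """Share: who contributed something to this bundle."""
        return sorted({item.owner for item in self.items})

    def reuse(self, kinds, new_title):
        """Reuse: start a new bundle from selected kinds of item."""
        kept = [item for item in self.items if item.kind in kinds]
        return ResearchObject(new_title, kept)


ro = ResearchObject("Obesity and deprivation")
ro.add(Item("data-source", "Survey extract", "researcher_a"))
ro.add(Item("data-preparation script", "Derive BMI fifths", "researcher_b"))
follow_up = ro.reuse({"data-preparation script"}, "Follow-up study")
print(ro.contributors(), [item.title for item in follow_up.items])
```

Keeping scripts, data and notes in one object is what allows a second researcher to pick up the method without reconstructing it from a paper.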

Page 24: Methodbox: From open-data to open-insight

National Dataset Example

• Health Surveys for England
– Large-scale (participants * variables)
– Annual since early 90s
– Under-used by NHS who fund it
– Key barrier: extracting a research-ready subset of data
– Data archive playground = e-Lab

Page 25: Methodbox: From open-data to open-insight

Supporting and Developing Interdisciplinary Understanding

Sharing resources – tools, methods, data

Sharing expertise – discussions and reuse around shared resources

Promoting interdisciplinary working

Developing interdisciplinary understanding – language, tacit assumptions, methods

First step - sharing of resources

Shared resources provide the basis for discussion

Discussions lead to deeper interdisciplinary understanding

Understanding of other domains promotes more effective interdisciplinary working

Page 26: Methodbox: From open-data to open-insight

Facilitating a social network of data archive users…

…toward a reward environment for sharing data, methods, and expertise

Page 27: Methodbox: From open-data to open-insight

Browsing for data extracts made by a social network of data archive users…

Page 28: Methodbox: From open-data to open-insight
Page 29: Methodbox: From open-data to open-insight

Shopping for variables from across different years of survey collections…
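"Shopping" can be pictured as filling a basket of (survey year, variable) picks and then pulling those columns into one extract. The sketch below assumes each survey year is already loaded as a list of row dictionaries; the function name, the variable names and the data are invented for illustration and are not checked against any survey documentation.

```python
def build_extract(surveys, basket):
    """Pull the chosen variables out of each chosen survey year.

    surveys: {year: [row dicts]}
    basket:  {year: [variable names]}
    Returns a single list of rows, each tagged with its survey year.
    """
    extract = []
    for year, variables in sorted(basket.items()):
        for row in surveys[year]:
            missing = [v for v in variables if v not in row]
            if missing:
                raise KeyError(f"{year}: variables not in survey: {missing}")
            picked = {v: row[v] for v in variables}
            picked["survey_year"] = year
            extract.append(picked)
    return extract


# Invented survey data for two years.
surveys = {
    2005: [{"id": 1, "bmi": 26.1, "age": 41, "smoker": "no"}],
    2006: [{"id": 7, "bmi": 27.3, "age": 39, "income_fifth": 2}],
}
basket = {2005: ["bmi", "age"], 2006: ["bmi", "age", "income_fifth"]}

for row in build_extract(surveys, basket):
    print(row)
```

Tagging each row with its year keeps the pooled extract traceable back to the survey it came from.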

Page 30: Methodbox: From open-data to open-insight

Instant access to relevant parts of survey documentation

Page 31: Methodbox: From open-data to open-insight

Making the data extract visible…

Linking a data extract with a script for deriving variables…

Sharing and visibility

Page 32: Methodbox: From open-data to open-insight

Enabling user-visibility for data extraction or derivation contributions…

Page 33: Methodbox: From open-data to open-insight

Current MethodBox

Video link

Page 34: Methodbox: From open-data to open-insight

Training Course Apr '10

• Trained a mixture of NHS, academic and industry users of HSE in the use of MethodBox
• Course run in conjunction with CCSR
• Feedback forms completed by 15 of 16 attendees, asked to rate MethodBox from 1 (negative) to 7 (positive) on the following statements:
– I thought MethodBox was:
  • Terrible - Wonderful: Mean = 5.57
  • Difficult to understand - Easy = 5.57
  • Frustrating to use - Satisfying = 5.79
  • Dull - Stimulating = 5.29
  • Rigid - Flexible = 5.71
  • Difficult to navigate - Easy to navigate = 6

Page 35: Methodbox: From open-data to open-insight

Attitudes to Sharing

Group | Data | Scripts
Academic social scientists | Yes | No
Academic epidemiologists/medical researchers | No | Yes
NHS & Local Govt. analysts | Yes | Yes

Page 36: Methodbox: From open-data to open-insight

MethodBox Evolution

• Amazon-like user-prompting for other variables that may be relevant to the set being extracted
• More surveys/datasets incorporated
• User-contributed & community-curated datasets
• ….
• Feature request list exceeds resources
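The "Amazon-like" prompting in the first bullet could be approximated by counting which variables tend to appear alongside the ones already chosen in earlier extracts. This is a sketch under that assumption; the scoring rule, the function name and the example extracts are hypothetical and do not describe how MethodBox does or would implement it.

```python
from collections import Counter


def suggest_variables(past_extracts, current, top_n=3):
    """Suggest variables that co-occur with the current selection.

    Each past extract votes for its other variables, weighted by how many
    variables it shares with the current selection.
    """
    current = set(current)
    scores = Counter()
    for extract in past_extracts:
        extract = set(extract)
        overlap = len(extract & current)
        if overlap:
            for variable in extract - current:
                scores[variable] += overlap
    # Highest score first; ties broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [variable for variable, _ in ranked[:top_n]]


# Invented history of extracts made by other users.
past_extracts = [
    ["bmi", "age", "sex"],
    ["bmi", "income_fifth"],
    ["age", "smoker"],
    ["bmi", "age", "income_fifth"],
]
print(suggest_variables(past_extracts, ["bmi"], top_n=2))
```

A production version would likely need to normalise for how common each variable is overall, so that ubiquitous variables such as age do not dominate every suggestion.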

Page 37: Methodbox: From open-data to open-insight

Building on Successful E-Science

• Most widely used scientific workflow sharing systems: myGrid, Taverna, myExperiment

• Over a decade of programme funding has sustained world-leading e-infrastructure R&D

• Ready to leverage more outputs from open-linked data

Page 38: Methodbox: From open-data to open-insight

Toward Open Insight

• Researcher A is expert in deprivation
• Researcher B is expert in obesity
• Both use a common data archive but don't usually meet
• MethodBox shares the expertise of A and B to create a more complete model of deprivation in obesity

Page 39: Methodbox: From open-data to open-insight

Conclusion

• Open-data alone is not enough

• Social e-infrastructure for science is needed

• Sharing insights and methods is key, and can be achieved through systems like MethodBox + ESDS