Structural analysis of the aggregate outputs from the 2011 Census to develop alternative integrated multidimensional conceptual models of data and geographies for easier management and dissemination

Justin Hayes

UK Data Service

What the census tells us workshop

Manchester

23 July 2014

Making it easier for everyone to find, understand and use the bits of the census they’re interested in

Overview

• Traditional and integrated approaches
• Work with 2011 outputs

• Integrated descriptive model
• Integrated model of geographies

• Ongoing work with data producers

Our job

• Find
• Understand
• Use
• Automated systems with online interfaces
• Online and interactive support
• Main services now freely available to everyone

Traditional tabular aggregate outputs

• Outputs conceived and specified as tables
• Details of individual tables defined through consultation with different user groups
• Per-table categorisations and descriptions
• Complex table universes and footnotes
• Visual layout an important consideration
• Extended metadata unattached

• Complex process!
• Number of tables limited by resource available

• Numerous inconsistencies between tables
• Effectively separate datasets

Traditional tabular dissemination

Integrated aggregate outputs

• Deconstruct tables
• Assemble and rationalise all variables and categories in tables
• Variable-ise table universes and footnotes
• Create a standardised library of variables to describe all data

• Define integrated models of characteristics (What?) and geographies (Where?)
• Enables global operations/queries
• Framework for attachment of extended metadata
• Facilitates description and transfer using standards

• Provide access via Web service API
• Data becomes self-describing
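
The deconstruction step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table identifiers, variable names, category labels and the `normalise` rule are all hypothetical, and the actual InFuse processing is not described in these slides.

```python
from collections import defaultdict

# Hypothetical per-table definitions: each table names its own variables and
# category labels, with small inconsistencies between tables.
TABLES = {
    "T1": {"Sex": ["Males", "Females"],
           "Age": ["0 to 15", "16 to 64", "65 and over"]},
    "T2": {"sex": ["males", "females"],
           "Tenure": ["Owned", "Rented"]},
}


def normalise(label: str) -> str:
    """Rationalise a label so the same concept matches across tables
    (assumed rule: trim, lower-case, collapse whitespace)."""
    return " ".join(label.strip().lower().split())


def build_library(tables):
    """Merge per-table variables into one library.

    Returns (library, used_in):
      library: variable -> ordered list of distinct categories
      used_in: variable -> set of table ids that use it
    """
    library = defaultdict(list)
    used_in = defaultdict(set)
    for table_id, variables in tables.items():
        for variable, categories in variables.items():
            key = normalise(variable)
            used_in[key].add(table_id)
            for category in categories:
                cat = normalise(category)
                if cat not in library[key]:
                    library[key].append(cat)
    return dict(library), dict(used_in)


library, used_in = build_library(TABLES)
print(library["sex"], sorted(used_in["sex"]))
```

Once every table draws on the same library, "sex" in one table and "Sex" in another resolve to a single variable, which is what makes global operations across tables possible.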

Integrated dissemination

Variable combination selection

Category combination selection

Area selection

Data download

InFuse

Under the bonnet

• Integrated multidimensional descriptive model
• Integrated model of geographies
• The really important bits!

InFuse 2011 release 2: Raw data

• England and Wales Local and Detailed Characteristics to output area level
• UK harmonised data to local authority level
• 422 tables, mainly multivariate
• 31 geography types
• 241,334 areas
• 11,311 files
• 15 GB volume

Integrated descriptive model

• Processing of raw metadata
• Deconstruction, rationalisation and re-integration
• Library of variables and categories
• Re-insertion of data values
• Attachment of associated metadata

• Global description using standards
• Global operations via Web service API

• Data is self-describing
• Enables lightweight, generic applications
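
To make "self-describing" concrete, here is a minimal Python sketch in which every value is stored with its area and category combination, so one generic query function works across the whole dataset. The area codes, figures, and the `query` function are hypothetical; this is not the InFuse API.

```python
# Hypothetical library of variables and categories (names are illustrative).
LIBRARY = {
    "sex": ["males", "females"],
    "tenure": ["owned", "rented"],
}

# Each value is keyed by an area plus a category combination, so it carries
# its own description rather than relying on a table layout.
VALUES = [
    {"area": "A1", "categories": {"sex": "males", "tenure": "owned"}, "value": 120},
    {"area": "A1", "categories": {"sex": "females", "tenure": "owned"}, "value": 130},
    {"area": "A1", "categories": {"sex": "males", "tenure": "rented"}, "value": 60},
    {"area": "A2", "categories": {"sex": "males", "tenure": "owned"}, "value": 75},
]


def query(values, area=None, **wanted):
    """Return every value matching the requested area and categories.

    Unknown categories are rejected against the library, so a typo fails
    loudly instead of silently returning nothing.
    """
    for variable, category in wanted.items():
        if category not in LIBRARY.get(variable, []):
            raise ValueError(
                f"unknown category {category!r} for variable {variable!r}"
            )
    return [
        v for v in values
        if (area is None or v["area"] == area)
        and all(v["categories"].get(var) == cat for var, cat in wanted.items())
    ]


owned_a1 = query(VALUES, area="A1", tenure="owned")
total_owned_a1 = sum(v["value"] for v in owned_a1)
print(total_owned_a1)
```

Because the description travels with each value, a client application needs no table-specific code, which is the sense in which such applications can stay lightweight and generic.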

Benefits of this work

• Data producers
• Efficient data management
• Flexible output production
• Best value

• Application developers
• Easy access to self-describing web services
• Lightweight generic applications

• End users
• Quick and easy global search
• Context along with data

InFuse 2011 release 2: Processed data

• 97 variables
• 2,501 categories
• 281 variable combinations
• 140 thousand category combinations
• 4.6 billion values

• A 460 km high stack of sticky notes!
• Anticipating approximately 10 billion values in all
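
The sticky-note figure can be checked with a quick calculation. The slide does not state a note thickness; the sketch below assumes 0.1 mm (100 micrometres) per note, which is the thickness implied by the two published numbers.

```python
VALUES = 4_600_000_000        # values in InFuse 2011 release 2 (from the slide)
NOTE_THICKNESS_UM = 100       # assumed: 0.1 mm per sticky note, in micrometres
UM_PER_KM = 1_000_000_000     # micrometres in one kilometre

# One note per value, stacked: total height in kilometres.
stack_km = VALUES * NOTE_THICKNESS_UM / UM_PER_KM
print(stack_km)
```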

Integrated model of UK census geographies

• Assembly of raw information on geographies
• 31 geography types
• 241,334 areas (anticipating ~2 million including postcodes)
• Direct and indirect hierarchies

• Simplified presentational model
• 11 composite geography layers
• Simplification of merged geographies in England and Wales

• Calculation of ‘missing’ data
• Linkage between descriptive and geography models

• Partial availability of data for geographies and extents
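
One way to picture the calculation of ‘missing’ data is to sum published values up a direct hierarchy. The Python sketch below assumes values are published only for the lowest-level areas and that each area has a single parent; the area codes and counts are hypothetical, and indirect hierarchies are not covered.

```python
from collections import defaultdict

# Hypothetical direct hierarchy: child area -> parent area.
PARENT = {
    "OA1": "WardA", "OA2": "WardA", "OA3": "WardB",
    "WardA": "LA1", "WardB": "LA1",
}

# Assumption: values are published only for the lowest-level areas.
PUBLISHED = {"OA1": 10, "OA2": 15, "OA3": 7}


def fill_missing(parent, published):
    """Derive values for higher-level areas by summing each published
    lowest-level value into every one of its ancestors."""
    derived = defaultdict(int)
    for area, value in published.items():
        node = parent.get(area)
        while node is not None:
            derived[node] += value
            node = parent.get(node)
    totals = dict(published)
    totals.update(derived)
    return totals


totals = fill_missing(PARENT, PUBLISHED)
print(totals["WardA"], totals["WardB"], totals["LA1"])
```

If values were also published for intermediate areas, this simple walk would double count, so a fuller version would need to track which level each published value belongs to.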

Raw admin and statistical geographies

Admin and statistical geography layers

infuse.mimas.ac.uk/help/definitions/2011geographies

What’s next for InFuse

• Interface improvements
• Geography-first option
• Fine-tune interface features
• Select categories from more than one category combination
• ‘Select all’ categories
• Back button
• Geography tree improvements (multiple hierarchies)

• User testing

What’s next?

• More data
• More comparable data

• Different data
• Boundary and flow data

• More functionality
• Personalisation, analysis and visualisation

• Public InFuse API
• Work with statistical agencies?

• Machine-friendly data from source
• Flexible generation with automated disclosure control?
• Information on usage and contact with users

What is the UK Data Service?

• a comprehensive resource funded by the ESRC

• a single point of access to a wide range of secondary social science data

• support, training and guidance

UK Data Service Census Support

• Specialist function of UK Data Service

• Access and support services for outputs from recent UK censuses

• Add value by making census outputs easy to find, understand and use

• Engagement with UK census agencies

• Long history of technological innovation in service development

• census.ukdataservice.ac.uk

• Aggregate component of census outputs

Census Support at Manchester

Justin Hayes

Rob Dymond-Green

Richard Wiseman

Jamey Hart

Give InFuse a go!

infuse.mimas.ac.uk

• Comments, questions and ideas welcome
• help@ukdataservice.ac.uk