QB’er Demonstration
Tool for converting and linking statistical datasets to a cloud of interconnected historical datasets.
QB’er - Demonstration
Ashkan Ashkpour, IISH – CLARIAH WP4, 07-10-2016
GOAL OF THIS PRESENTATION
From CSV files and structured statistical data to (harmonized) interlinked data on the Web
Data Tooling Interlinked Datasets on the web
PROBLEM - Today’s Workflow
• Gather and enter own data
• Find data on multiple repositories
• Download
• Clean and reshape
• Merge
• Clean and reshape…
• Analyse
PROBLEM - Disconnected data and efforts
We keep repeating the same work on the same datasets
Comparability across time and datasets
https://blog.gaijinpot.com/knowledge-sharing-economy/
LOSS OF...
• Provenance
• Cleaning efforts (sometimes up to 60% of the work)
• Valuable mappings (discarding time-consuming prior work)
• Expert decisions
• Discoverability
SOLUTION: INTEGRATE DISSIMILAR DATA IN FLEXIBLE AND ACCOUNTABLE WAYS
HARMONIZATION AND RDF
What we want is harmonization by way of:
Standardization and Classification
Flexible approach while providing accountability
QB’ER
Empower individual researchers to:
Code and harmonize individual datasets according to the best practices of the community (e.g. HISCO, SDMX, World Bank) or against the code lists of their colleagues
Share their own code lists with fellow researchers
Align code lists across datasets
Publish their standards-compliant datasets on a Structured Data Hub
Collaborative growing of a graph of interconnected datasets
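A minimal sketch of what "publishing a dataset's structure" as RDF could look like. It assumes the W3C RDF Data Cube vocabulary (`qb:`), which these slides do not name explicitly, and treats every variable as a dimension. The base URI, the dataset identifier and the function name `structure_as_turtle` are hypothetical and are not QB’er's actual output.

```python
# Namespace of the W3C RDF Data Cube vocabulary.
QB = "http://purl.org/linked-data/cube#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
# Placeholder base URI, assumed for illustration only.
BASE = "http://example.org/dataset/"


def structure_as_turtle(dataset_id, variables):
    """Describe a dataset's variables as Data Cube dimensions, in Turtle."""
    lines = [
        f"@prefix qb: <{QB}> .",
        f"@prefix rdfs: <{RDFS}> .",
        f"@prefix ex: <{BASE}> .",
        "",
        f"ex:{dataset_id}-dsd a qb:DataStructureDefinition ;",
    ]
    # One qb:component per variable, attached to the structure definition.
    components = [f"    qb:component [ qb:dimension ex:{v} ]" for v in variables]
    lines.append(" ;\n".join(components) + " .")
    # Declare each variable as a dimension with a human-readable label.
    for v in variables:
        lines.append(f'ex:{v} a qb:DimensionProperty ; rdfs:label "{v}" .')
    return "\n".join(lines)


turtle = structure_as_turtle("utrecht1829", ["occupation", "birthplace"])
print(turtle)
```

Because every variable gets its own URI, two datasets that reuse the same URI for a variable become linked without any further merging step.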
INPUT
DEMO EXAMPLE
Nieuwkomers in de Utrechtse volkstelling van 1829 en 1839 (Newcomers in the Utrecht census of 1829 and 1839)
http://hdl.handle.net/10622/KMAJLE
Utrecht 1829
Utrecht 1839
Variables
Values
TO CONCLUDE…
• Generic, domain-independent tool
• Uploading of a dataset and extraction of variables and value frequencies
• Mapping of variable values to codes (while preserving the originals!)
• Publishing of dataset structure as Linked Data
• Align codes and identifiers across datasets
• Provenance of all assertions to the SDH traceable to time and person
• Crowd-based production of code lists and mappings
• Sharing / Reuse other people’s work (or stand on the shoulders of giants)
• No disposable research
QUESTIONS?
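The extraction and mapping steps listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not QB’er's implementation: the sample rows, the column names, the helper names `value_frequencies` and `map_values`, and the codes in `OCCUPATION_CODES` are all invented (the codes are placeholders, not real HISCO codes).

```python
import csv
import io
from collections import Counter

# Hypothetical sample data; not taken from the Utrecht census files.
SAMPLE_CSV = """id,occupation,birthplace
1,boer,Utrecht
2,Boer,Amersfoort
3,smid,Utrecht
"""


def value_frequencies(csv_text):
    """Return {variable: Counter(value -> count)} for every column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    freqs = {name: Counter() for name in reader.fieldnames}
    for row in reader:
        for name, value in row.items():
            freqs[name][value] += 1
    return freqs


def map_values(csv_text, variable, mapping):
    """Add a '<variable>_code' column; the original value is left untouched."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Unmapped values get None, so gaps in the code list stay visible.
        row[variable + "_code"] = mapping.get(row[variable])
        rows.append(row)
    return rows


# Placeholder code list: spelling variants map to the same code.
OCCUPATION_CODES = {"boer": "code:farmer", "Boer": "code:farmer", "smid": "code:smith"}

freqs = value_frequencies(SAMPLE_CSV)
coded = map_values(SAMPLE_CSV, "occupation", OCCUPATION_CODES)
print(freqs["occupation"])
print(coded[1])
```

Keeping the code in a separate column, instead of overwriting the source value, is what lets a mapping be reviewed, corrected or replaced later without returning to the raw file.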