A Day in the Life of an RDF Curator Josephine Anne Gough Information Architect Team Lead F. Hoffmann-La Roche Ltd Switzerland About Work on the Roche Global Data Standards Repository

Post on 20-Jul-2020

Page 1: Josephine Anne Gough - Lex Jansen · oXygen XML editor XSLT development Graph Adminsitration GDSR Browser GDSR Administering GDSR Search GDSR UI Models . Content Training…so glad

A Day in the Life of an RDF Curator Josephine Anne Gough Information Architect Team Lead F. Hoffmann-La Roche Ltd Switzerland

About Work on the Roche Global Data Standards Repository

Who am I ?

•  My name is Amy, Didier, Robin, Ivan

•  I am an Information Architect working in the Roche Data Standards Office

•  I used to work in Data Management / Statistical Programming

•  I found out about RDF and Semantic technology from the GDSR

•  I took a leap of faith and changed my career direction to work in RDF

What is RDF ?

•  RDF is a really cool technology

•  It’s been around for 15 years

•  The Resource Description Framework (RDF) lets us describe all of our Data Standards in abstract models and then fill those models with content

•  RDF means we can really describe our standards in a machine-readable format

•  RDF means we can uniquely identify things

•  RDF means that we can link pieces of information and even whole models together

•  RDF….is just really cool when you get to know it…you get hooked.
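
As a rough illustration of the points above (unique identifiers, and links between pieces of information), the sketch below models RDF statements as plain Python tuples instead of using a real RDF library. Every URI, property name and label in it is hypothetical and chosen only for the example; none of them is taken from the GDSR.

```python
# Minimal sketch of the RDF idea: each statement is a (subject, predicate, object) triple.
# All URIs below are invented for illustration; they are not real GDSR identifiers.
EX = "http://example.org/standards/"

triples = {
    (EX + "VS", EX + "label", "Vital Signs"),
    (EX + "VS", EX + "hasVariable", EX + "VSTESTCD"),
    (EX + "VSTESTCD", EX + "label", "Vital Signs Test Short Name"),
    (EX + "VSTESTCD", EX + "usesCodelist", EX + "VSTESTCD_CL"),
    (EX + "VSTESTCD_CL", EX + "label", "Vital Signs Test Code"),
}


def objects(graph, subject, predicate):
    """Return every object linked to `subject` by `predicate`, sorted."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


def follow(graph, start, *predicates):
    """Walk a chain of predicates from `start`, following links through the graph."""
    current = [start]
    for predicate in predicates:
        current = [o for node in current for o in objects(graph, node, predicate)]
    return current


# Follow the links: domain -> variable -> codelist -> label
codelist_labels = follow(
    triples, EX + "VS", EX + "hasVariable", EX + "usesCodelist", EX + "label"
)
print(codelist_labels)
```

Because each thing has its own URI, the same identifier can appear as the object of one statement and the subject of another, and that is what lets the walk hop from a domain to its variable to the variable's codelist.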

Demo of RDF

•  Short demo of RDF in TopBraid to show linking of information through an RDF network.

What is the GDSR?

•  The GDSR is Roche’s RDF Dataset where all Data Standards are defined

•  GDSR stands for Global Data Standards Repository

•  The GDSR holds CDISC, Roche and other standards such as Questionnaires, Biomarkers, Lab data.

•  It has a browser and web services….but I’ll get onto that later

•  It is accessed by anyone in Roche who needs information about Clinical Data Standards or wants to download products built from them

•  It is available to the business 24/7, 365 days a year

•  Our job is basically to look after the content and exports from the GDSR

Roche GDSR

GDSR Menu

Data Collection homepage

SDTM VS Domain

CDISC Controlled Terminology

Labs filtering for Albumin

Asthma Questionnaire

GDSR Search

Today

•  My boss “Cheffe” keeps me really busy

•  But she’s nice

She makes sure that training is part of the job

and boy is there a lot to learn !

TRAINING

Technical Training

•  SPARQL: SPARQL Protocol and RDF Query Language

•  RDF: Resource Description Framework

•  HTTP REST: Representational State Transfer

•  XML: Extensible Markup Language

•  URIs: Uniform Resource Identifiers

•  RDFS: RDF Schema

•  XSLT: Extensible Stylesheet Language Transformations

•  OWL: Web Ontology Language

•  SKOS: Simple Knowledge Organization System

Tools Training

TopBraid RDF development

Graph Validation

oXygen XML editor XSLT development

Graph Administration

GDSR Browser

GDSR

Administering GDSR Search

GDSR UI Models

Content Training…so glad I knew some of these already !

Roche Data Collection

SDTM

Roche extensions to CT

Roche EDC

ISO 11179

Roche extensions to SDTM

Questionnaires ePRO

CDISC Controlled Terminology

ADaM

Roche Data Analysis

Biomedical Concepts

Other skills

User Requirements

Documentation

UAT

Customer Facing

Validation and Testing

GDSR Overall Architecture

W3C Semantic Standards RDF - OWL - SKOS

ISO 11179 for Metadata Registries

CDISC Foundational Standards

Sponsor Extensions

Protocol Submission DC DT DA

HTTP REST Application Services

Overwhelming at first!

•  A lot to learn

•  Business needs to be aware of the learning curve

•  Continuous reading/training program

•  Simple steps

•  Progression to more complex parts

•  Each step is absorbing and interesting

•  Keep the training wheel turning

My job broadly splits into three parts

•  Day to day changes to the GDSR metadata

•  Product development

•  Modeling

CHANGES TO METADATA

Daily content changes - Jira requests

•  A user sends us a request to change metadata via the Jira ticket system.

•  Today I’ve got a bunch of changes, for example:

•  They want to change the CRF label on one of the Medical History Forms

•  The SDTM group have some new annotations

•  There is a new Questionnaire to be uploaded.
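
A request like the CRF label change above boils down to replacing one statement in the right graph. The sketch below shows that idea with a plain Python set of triples. In practice the edit is made in TopBraid and loaded through the admin panel, so this is only an illustration: the URIs, the `label` and `partOf` properties and the label texts are all hypothetical.

```python
# Sketch only: a graph held as a set of (subject, predicate, object) triples.
# The identifiers and labels below are invented for illustration.
EX = "http://example.org/crf/"
LABEL = EX + "label"


def replace_value(graph, subject, predicate, new_value):
    """Return a new graph in which `subject` has exactly one `predicate` value."""
    # Drop any existing statement for this subject/predicate pair...
    kept = {t for t in graph if not (t[0] == subject and t[1] == predicate)}
    # ...then add the single new statement.
    kept.add((subject, predicate, new_value))
    return kept


medical_history = {
    (EX + "MH_TERM", LABEL, "Medical History Term"),
    (EX + "MH_TERM", EX + "partOf", EX + "MH_FORM"),
    (EX + "MH_FORM", LABEL, "Medical History"),
}

updated = replace_value(
    medical_history, EX + "MH_TERM", LABEL, "Reported Medical History Term"
)
```

Returning a new set and leaving the original untouched keeps the old version available for comparison or review before the change is loaded anywhere.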

Example Jira request to IA team

Which graph do I change ?

Changing the CRF field label for Medical History in TopBraid

Admin Panel to load content into the GDSR

Medical History Form in GDSR browser

Adding new standards: bulk upload example (CRF)

PRODUCT DEVELOPMENT

Product development

1.  Web services return XML

2.  Transform XML into other formats using XSLT
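
The two steps above can be sketched as follows. This is not the team's XSLT: it swaps in Python's standard-library XML parser to show the same kind of transformation, turning an XML payload into CSV. The element and attribute names and the sample records are assumptions made up for this example, not the actual GDSR web service schema.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical XML, standing in for what a web service might return.
SAMPLE_XML = """\
<conversions>
  <conversion analyte="Albumin" fromUnit="g/dL" toUnit="g/L" factor="10"/>
  <conversion analyte="Hemoglobin" fromUnit="g/dL" toUnit="g/L" factor="10"/>
</conversions>
"""

# Column order of the CSV; each name is read from the matching XML attribute.
FIELDS = ["analyte", "fromUnit", "toUnit", "factor"]


def xml_to_csv(xml_text):
    """Flatten each <conversion> element into one CSV row."""
    root = ET.fromstring(xml_text)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(FIELDS)
    for item in root.iter("conversion"):
        # Missing attributes become empty cells rather than raising an error.
        writer.writerow([item.get(name, "") for name in FIELDS])
    return out.getvalue()


csv_text = xml_to_csv(SAMPLE_XML)
print(csv_text)
```

The same separation applies whichever tool does the transforming: the web service stays the single source of the content, and each export format is a separate, replaceable transformation on top of it.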

Product development in oXygen

Product for SAS programmers: Lab unit conversions in CSV

GDSR Export Products in Production Today

•  Operational CRFs in PDF

•  SDTM and Codelists in Excel

•  Lab Analytes in Excel and CSV

•  Questionnaires in Excel and CSV

•  Medidata Rave ALS (EDC build)

•  PK test codes in CSV and Excel format

Operational CRF

Product downloads from the GDSR

•  Demo of SDTM spreadsheet export if time allows

GDSR Export Products in progress

•  Non-CRF contracts with standard specifications

•  Conformance checks

•  Data Analysis TLGs

•  Data Analysis VAD specifications

•  eSubmission

eSubmission CRF

MODELING

RDF Modeling

•  Models are currently separated logically by domain, but all are Linked Data

•  40+ existing models in GDSR

•  New content can require new models and/or extensions

•  The team has learnt from scratch

How to learn modeling in RDF ?

RDF Apprenticeship

Know the domain

Explore existing models

Make extensions to existing models

Experiment designing new models

Guidance and review from experienced modelers

Iterate, review with peers, try out

Modeling: Case history in apprenticeship

1.  Make changes to existing Data Collection models via Jira

2.  Create a whole new CRF for a new TA

3.  Learn and explore CDISC SDTM models and code lists

4.  See if same model could be applied to non-CRF data

5.  Do a gap analysis of use cases between SDTM and non-CRF

6.  Propose any new Classes or Predicates for the model

7.  Present ideas to team – “Live” modeling workshops

8.  Finalise the model and fill it with content

RDF at the heart of….

RDF finally makes it a reality that we can link together:

•  Standards to Standards (Via Models and Linked Data)

•  Standards to Governance (in an MDR with Admin and Validation)

•  Standards to User Interface (configurable Browser and Search)

•  Standards to Web Services (configurable schema downloads in XML)

•  Standards to exports (Word, PDF, Excel, etc.)

Thanks go to the Roche IA Team:

•  Amy Klopman (SDTM Models, SOA Models)

•  Robin Köger (our XSLT and Web Service go-to, Data Analysis Models)

•  Didier Clement (Data Collection Models, non CRF and Biomarker Models)

•  Ivan Robinson (UI Models, Controlled Terminology Models)

Special thanks for our most interesting jobs go to:

Frederik Malfait, creator of the GDSR

Doing now what patients need next