
January 30, 2020 | Utrecht, The Netherlands

Abstracts

Conference 2020“Towards Data Driven Health”

enabling data driven health

Abstractbook Health‐RI conference 2020

# | Primary contact | Title
1 | Aaike van Oord | Biobanken.nl, transparency for patients and public
2 | Talia Santos | SQLite4Radiomics: Automated Feature Extraction Integration with ConQuest DICOM
3 | Trynke de Jong | FAIRification of The Lifelines Cohort Study and Biobank
4 | Jeroen Beliën | The iCRF Generator: Generating interoperable electronic case report forms using online codebooks
5 | Sofie Hansen | Call for action: register and test BBMRI-NL’s request portal Podium
6 | Erik van Iperen | BBMRI’s request portal Podium for samples, data, and images from Dutch biobanks and data collections
7 | Arturo Moncada-Torres | Showcasing the Personal Health Train: Federated Learning in Real Time using VANTAGE6
8 | Harm Buisman | Cancer surveillance
9 | Kees Ebben | Projecting Patient Data onto Clinical Decision Trees in Oncoguide
10 | Brenda Hijmans | User demo of a public study in the national cBioPortal instance hosted by Health-RI
11 | Derek de Beurs | Applying machine learning on health record data from general practitioners to predict suicidality
12 | Carmen Rubio Alarcón | FAIR data management and stewardship in real practice: capture and integration of translational research data from the PLCRC sub-studies MEDOCC, MEDOCC-CrEATE and PROVENC3
13 | Rosemarijn Looije | Performance Indicators and the infrastructure for health data
14 | A.W. (Sandra) van den Belt-Dusebout | Use of encrypted BSN in record linkage of epidemiological cohorts and biobanks with disease to ensure valid linkages with optimal privacy protection.
15 | Tim Hulsen | The Ten Commandments of Translational Research Informatics
16 | Jan Worst | A tranSMART driven clinical diagnostic decision support system on anemia
17 | Rob Hooft | Data Stewardship Wizard
18 | Rob Hooft | Data Desks at the University Medical Centers: Facilitating Access to Expertise on Data Handling
19 | Yaron Caspi | Changes in the intracranial volume from early adulthood to the sixth decade of life
20 | Annette Gijsbers | The PALGA Portal - Streamlining and professionalizing the request, delivery and use of pathology data and materials.
21 | Valeriu Codreanu | Generating CT-scans with 3D Generative Adversarial Networks Using Supercomputers
22 | Alberto Traverso | Medical images and AI: the need of a big data revolution
23 | Betzabel Cajiao | Laboratory variation of molecular testing in a Dutch cohort of metastatic non-small cell lung cancer patients from 2017
24 | Evert-Ben van Veen | Herziening Gedragscode gezondheidsonderzoek
25 | Ronald van Schijndel | Supporting your Research; Tools for Data Management and Processing
26 | K. Joeri van der Velde | FAIR Genomes: Standardizing a meta-data schema for FAIRifying personal genome data workflows
27 | Marian Beekman | BBMRI-Omics: Valuable resource of multi-omics data and analysis tools
28 | Jack Broeren | Uitdagingen bij het bouwen van een Fair Data station
29 | Anne-Charlotte Fauvel | EATRIS-Plus - a multi-omic toolbox to support cross omic analysis and data integration in clinical samples
30 | Marcel Koek | Fully automatic construction of optimal radiomics workflows
31 | Hakim Achterberg | Fastr workflow engine for reproducible and managed large-scale processing
32 | Marcel Koek | Quantitative Imaging Biomarker Storage and Compute Infrastructure
33 | Adriaan Versteeg | Streamlining manual tasks in large medical imaging studies
34 | Trynke de Jong | Linkage of Lifelines and PALGA data: Enhancing multidisciplinary research
35 | Celia van Gelder | Towards FAIR Data Steward as profession for the Lifesciences
36 | Eric Vermeulen | Self-initiated donation to a biobank. Should and could biobanks offer this option?
37 | Fatima El Messlaki | CBS Microdata Services
38 | Martin Boeckhout | Ethics review for non-invasive (nWMO) health research: moving towards a shared approach
39 | Evelien van der Schaaf | Metadata matters
40 | Erik Flikkenschld | Getting started with trusted FAIR data lakes
41 | Merel Wassenaar | The real world nature of Prospective Dutch ColoRectal Cancer cohort (PLCRC)
42 | Nathalie Hijmering | HOVON Pathology Facility and Biobank: Making the right choices for workflow and data
43 | Petros Kalendralis | Public radiomics data collections in an open access Semantic Web (SPARQL) endpoint
44 | Matthijs Sloep | The FAIRification of clinical data with modular knowledge graphs.
45 | Stuti Nayak | Privacy Sensitive Distributed Analysis of Dementia Cohorts from Hospitals in The Netherlands
46 | H. Pieterman | Only one copy
47 | Inga Tharun | Personal Health Train Coalition
48 | Arturo Moncada-Torres | Implementation and Deployment of a Federated Logistic Regression
49 | Thomas Rooijakkers | Secure Log Rank Test in Survival Analysis on Vertically Partitioned Data using Multi-Party Computation
50 | Rick Jansen | Characterization of depression symptoms using large scale questionnaire data in the Dutch population: a BBMRI-BIONIC study
51 | Louis Ter Meer | What is a “digital” patient? An ontological approach.
52 | Menno de Vries | FAIRification at DNA, RNA and protein level in studying colorectal tumor progression
53 | Paula Jansen | The Handbook for Adequate Natural Data Stewardship (HANDS)
54 | Rogier de Jong | SURF Research Access Management, an authorisation and authentication service optimised for researchers
55 | A.E.C. Schroten | Administration of research logistics
56 | Fariba Ahmadizar | Cardio-metabolic profiling - Association of EFV with increased levels of circulating lipid metabolites
57 | Bob van Wijk | Amsterdam UMC expertise center for high performance computing
58 | Tessa van der Geest | EPTRI - European Paediatric Translational Research Infrastructure: a bridge towards the future of paediatric medicine
59 | Rogier van der Stijl | Public-private partnerships in biobanking and biobank-related research
60 | Rogier van der Stijl | Recommendations for sustainable biobanking
61 | Peggy Manders | COREON – Committee on Regulation and Research
62 | Martin Brandt | Galaxy in education using the SURF Research Cloud
63 | Tessa van der Geest | European Paediatric Translational Research Infrastructure (EPTRI): a survey to map the expertise of the excellence of developmental pharmacology in pan-European countries
64 | Nicolien A. van Vliet | Thyroid function and metabolomics: from observational research in BBMRI cohorts to causal inference through Mendelian Randomization
65 | Karlijn Groenen | Applying the FAIR Data principles to a Rare Disease registry: a case study of the VASCA registry
66 | Maxime Bos | The metabolic profile of arterial calcification in the multi-cohort BBMRI setting
67 | Mark Scheffer | e/MTIC - Health Data Portal initiative
68 | Janine Felix | The Pregnancy And Childhood Epigenetics (PACE) Consortium - A platform for epigenome-wide association meta-analyses
69 | Cees Hof | The DANS services for sharing, cataloguing and archiving your health data
70 | Daniele Bizzarri | Metabolic risk scores: from metabolome to phenotype and back
71 | Purva Kulkarni | Towards precision diagnostics: Untargeted metabolomics for the diagnosis of inborn errors of metabolism in individual patients


Floorplan Jaarbeurs Supernova


Abstract: 1 (Demonstration)

Biobanken.nl, transparency for patients and public

Aaike van Oord (1), Tieneke Schaaij‐Visser (2, 3), Huig Schipper (4), Theo Mulder (4), Eric Vermeulen (5), Ilse Broeders (6), Edmar Weitenberg (3), Susanne Rebers (1), Marjanka Schmidt (1, 2)

(1) ELSI Servicedesk/NKI, (2) BBMRI‐NL, (3) Lygature, (4) Patient and public advisory council (BBMRI‐NL PPAC), (5) VSOP, (6) Lifelines

To create more transparent and accessible communication towards the public about biobanking

research (with clinical and population biobanks), the website Biobanken.nl has been edited and

reinstated as the place for patients, participants and public to find answers to their questions.

The website has been reinstated in co‐operation with a diverse group of stakeholders, ranging from

BBMRI‐NL and large biobanks like PALGA and Lifelines to organizations like VSOP and the Patient and

Public Advisory Council of BBMRI‐NL.

At this Health‐RI conference we publicly announce that the new website Biobanken.nl is open to the

public. It also has a new key feature: it offers patients and the general public answers to

questions they might have about biobanking and the storage and use of their data and tissue. New

questions result in answers that will be added to the website.

> Useful for biobank researchers and patient organizations

Within the network of Health‐RI the website Biobanken.nl offers a clear and easily accessible medium

for communicating difficult and sometimes sensitive information about privacy, legal rights, research

techniques, impact on patients, ethical boundaries, FAIR data stewardship, patient involvement and

financial organisation of biobanking research.

We invite all biobank research professionals to start using Biobanken.nl for their public

communication, apart from specific dissemination of scientific results of their research. In co‐

operation with media like Kennislink.nl and Dutch hospitals, we will communicate the existence of

Biobanken.nl to a broad audience in 2020.

> Answering questions

Transparency is the goal even though biobanking research and its legal and ethical limitations are

sometimes complicated. If information is not available on the website, website visitors are

encouraged to submit their question, to which an answer will be provided. The expert team of the

ELSI Servicedesk, consisting of experts such as legal specialists and ethicists, is available to provide

expert advice if needed.

Acknowledgements

I would like to thank Tieneke Schaaij‐Visser for her support in keeping Biobanken.nl the national platform for transparency about biobanking, also after the website was rebuilt. Tieneke persuaded me to change my initial plan to submit an abstract for a poster and to propose a demonstration instead. It is indeed a better way to reach out to all biobanking professionals about the existence of Biobanken.nl and how they can use it to reach the broad public.

Keywords: science communication, transparency, biobanking, ethical, legal and societal issues,

patient communication



SQLite4Radiomics: Automated Feature Extraction Integration with ConQuest DICOM

Talia Santos (BSc) (1, 2), Lars van Driel (1, 2), Ivan Zhovannik (MSc) (2, 3), René Monshouwer (PhD) (2)

(1) Fontys University of Applied Sciences, Eindhoven, The Netherlands, (2) Department of Radiation Oncology, Radboud Institute for Health Sciences, Radboud University Medical Centre, Nijmegen, The Netherlands, (3) Department of Radiation Oncology (MAASTRO), GROW – School for Oncology and Development Biology, Maastricht, The Netherlands

Background: Radiomics stands for quantitative medical image analysis for non‐invasive disease

characterization. Radiomic features can then be fed into machine learning models. These models

could become a powerful asset for prognosis and diagnosis prediction, and treatment selection. But

the process of extracting these features is oftentimes unstructured and time‐consuming, and requires

certain IT skills. Moreover, radiomics extraction is rarely integrated into clinical imaging systems

(PACS), which is crucial for its clinical translation.

Methods: We developed the SQLite4Radiomics pipeline to standardize radiomics extraction and integrate

it into the Conquest DICOM PACS tool. The radiomic feature extraction is performed by means of the

IBSI‐compliant open‐source Pyradiomics package. The tool was tested with open anonymized imaging

data from a MAASTRO LUNG‐1 cancer cohort hosted on Health‐RI’s XNAT repository.

Functionality: The tool can perform radiomic feature extraction on stored data based on a selected

extraction strategy and a parameter file. The parameter file defines the settings for Pyradiomics. A

separate configuration file allows the researcher to change the region of interest selection strategy,

among other advanced features. The output of this process is saved to a file that can then be used

for machine learning.
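
As an illustration of the final step, the sketch below stores extracted features in an SQLite file and reads them back as a per-patient feature table for machine learning. It is a minimal sketch under stated assumptions: the table name `radiomic_features`, its columns, the feature names and the numeric values are all hypothetical and are not taken from SQLite4Radiomics or from any real patient data.

```python
import sqlite3

def save_features(conn, patient_id, features):
    """Store one row per (patient, feature) pair. The schema is hypothetical."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS radiomic_features ("
        "patient_id TEXT, feature TEXT, value REAL, "
        "PRIMARY KEY (patient_id, feature))"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO radiomic_features VALUES (?, ?, ?)",
        [(patient_id, name, float(value)) for name, value in features.items()],
    )
    conn.commit()

def load_feature_matrix(conn):
    """Pivot the long-format rows into {patient_id: {feature: value}}."""
    matrix = {}
    rows = conn.execute(
        "SELECT patient_id, feature, value FROM radiomic_features "
        "ORDER BY patient_id, feature"
    )
    for patient_id, name, value in rows:
        matrix.setdefault(patient_id, {})[name] = value
    return matrix

# Made-up example values; a real run would take them from the extraction step.
conn = sqlite3.connect(":memory:")
save_features(conn, "PATIENT-001", {"firstorder_Mean": -512.3, "shape_VoxelVolume": 1830.0})
save_features(conn, "PATIENT-002", {"firstorder_Mean": -498.7, "shape_VoxelVolume": 2210.5})
matrix = load_feature_matrix(conn)
print(matrix)
```

Storing features in long format keeps the schema stable when the parameter file enables or disables feature classes; the pivot to one row per patient happens only when the data is loaded for modelling.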

Discussion: The goal of this project was to automate the feature extraction process and reduce time

spent on this aspect of research. It was also important to provide a tool that can be reused and

extended. The tool is open‐source and is available on GitHub. The pipeline tool works through

a command line interface. A version with a web‐based interface was developed for internal use at

the Radboudumc and will be presented during the demonstration.

Acknowledgements

We would like to thank all the clinical physicists (Klinisch Fysici) at the Radiotherapy department at the Radboudumc.

Keywords: PACS; DICOM; radiomics; SQLite; ConQuest DB


Abstract: 3 (Demonstration & Pitch)

FAIRification of The Lifelines Cohort Study and Biobank

Trynke R. de Jong (1), Gijs L. Faber (1), Ruud van Vliet (2), Morris A. Swertz (3), Aafje Dotinga (4)

(1) Lifelines Cohort & Biobank, (2) Trivento B.V., (3) Molgenis

The Lifelines Cohort Study and Biobank collects longitudinal health‐related data and samples from

~167,000 inhabitants of the northern parts of the Netherlands (including children and elderly). Our

rapidly expanding collection is available for researchers working in the multidisciplinary field of

healthy ageing. Furthermore, researchers may design additional studies to collect extra biological

samples, physical measurements or questionnaire data from our participants.

In 2019 we completely restructured our database to increase the FAIR‐ness of our data, by i)

streamlining the import of novel data and metadata from various sources into a data platform, ii)

facilitating the release of customized datasets to researchers, and iii) increasing our (meta)data

quality. This resulted in a new data model and a new data platform developed on AWS with our ICT‐

partner Trivento (www.trivento.nl), and a new data catalogue developed by our partner Molgenis

(www.molgenis.org). Both developers allowed Lifelines data managers to configure functionalities

(and access some underlying code).

In our new model, each data point is given a unique place along three axes: the WHO‐axis (“which

participant delivered the data point?”), the WHEN‐axis (“in which context was the data point

collected?”), and the WHAT‐axis (“to which variable does the data point belong?”). The three axes

are implemented as filters in the new catalogue, allowing researchers to compile a compact, pre‐

filtered data order and to determine the number of participants who delivered a given variable in a

given context.
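
The three-axis idea can be sketched in a few lines of Python. This is only an illustration of the concept, not the Lifelines data model itself: the class name `DataPoint`, the field names, the context labels and the example values are all assumptions made for this sketch.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DataPoint:
    who: str       # WHO axis: participant identifier
    when: str      # WHEN axis: context in which the value was collected
    what: str      # WHAT axis: variable the value belongs to
    value: object  # the measured or reported value

def count_participants(points, what, when):
    """Number of distinct participants who delivered `what` in context `when`."""
    return len({p.who for p in points if p.what == what and p.when == when})

# Made-up data points for illustration only.
points = [
    DataPoint("P1", "baseline", "body_weight", 71.0),
    DataPoint("P2", "baseline", "body_weight", 84.5),
    DataPoint("P1", "follow_up_1", "body_weight", 72.3),
    DataPoint("P2", "baseline", "smoking_status", "never"),
]
n_baseline_weight = count_participants(points, what="body_weight", when="baseline")
print(n_baseline_weight)
```

Because every data point carries all three coordinates, a catalogue filter reduces to a simple selection on `what` and `when`, followed by counting distinct `who` values.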

A fourth axis, HOW, represents the protocols used to collect the data (for more detail, see our new

metadata wiki: http://wiki‐lifelines.web.rug.nl/doku.php). Combined, the four axes enable the

identification, processing, and communication of quality issues (i.e. data points or data sets that do

not properly adhere to a standard protocol) at various levels. In addition, the model ensures the

rapid incorporation of secondary data developed by our expert users.

Acknowledgements None

Keywords: Lifelines ‐ platform ‐ data model ‐ catalogue ‐ FAIR


Abstract: 4 (Demonstration & Pitch)

The iCRF Generator: Generating interoperable electronic case report forms using online codebooks

Sander de Ridder (1), Jan‐Willem Boiten (2), Gerrit Meijer (3), Jeroen Beliën (1)

(1) Amsterdam UMC, Vrije Universiteit Amsterdam, (2) Lygature, Utrecht, (3) Netherlands Cancer Institute

Semantic interoperability of clinical data is essential to preserve its meaning and intent when the

data is exchanged, re‐used or integrated with other data. Achieving semantic interoperability requires the

use of a communication standard, such as HL7, as well as (functional) information standards.

Manually mapping clinical data to a medical thesaurus such as SNOMED CT is complicated and

requires expert knowledge of both the dataset, including its context, and the thesaurus. As an

alternative, the (re‐)use of codebooks, data definitions which may already have been mapped to a

thesaurus, can be a viable approach.

We’ve developed the iCRF Generator, a Java program which can generate the core of an

interoperable electronic case report form (iCRF) for three of the major electronic data capture

systems (EDCs): OpenClinica 3, Castor EDC and REDCap. To build their CRFs, users can select one or

more items from established codebooks, available from an online system called ART‐DECOR. ART‐

DECOR is an open‐source tool suite that supports the creation and maintenance of HL7 templates

and allows the storage of dataset definitions. Nictiz, the centre of expertise for eHealth and the

Dutch SNOMED‐CT release centre, facilitates ART‐DECOR to create health information standards that

are publicly accessible. The iCRF Generator currently provides access to six of these codebooks,

amongst which the Basic Health Data Set (Basisgegevensset Zorg) and the Clinical Building Blocks

(Zorginformatiebouwstenen). By providing an easy to use method to create CRFs for multiple EDCs

based on the same codebooks, interoperability can be more easily attained.
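
The principle of deriving several form definitions from one coded item list can be sketched as follows. The sketch is in Python rather than Java for brevity, and everything in it is an assumption: the codebook entries, the placeholder codes and both output layouts are hypothetical and do not reproduce the ART‐DECOR content or the import formats of OpenClinica, Castor EDC or REDCap.

```python
import csv
import io
import json

# Hypothetical codebook items; ids, labels and codes are placeholders,
# not entries from ART-DECOR or SNOMED CT.
CODEBOOK = [
    {"id": "body_weight", "label": "Body weight", "datatype": "decimal",
     "unit": "kg", "code": "PLACEHOLDER-0001"},
    {"id": "body_height", "label": "Body height", "datatype": "decimal",
     "unit": "cm", "code": "PLACEHOLDER-0002"},
]

def render_csv(items):
    """Render the items as a CSV form definition (hypothetical column layout)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["field_name", "field_label", "field_type", "thesaurus_code"])
    for item in items:
        label = f"{item['label']} ({item['unit']})"
        writer.writerow([item["id"], label, item["datatype"], item["code"]])
    return buffer.getvalue()

def render_json(items):
    """Render the same items as a JSON form definition (hypothetical layout)."""
    fields = [
        {"name": item["id"], "question": item["label"], "unit": item["unit"],
         "type": item["datatype"], "code": item["code"]}
        for item in items
    ]
    return json.dumps({"fields": fields}, indent=2)

csv_form = render_csv(CODEBOOK)
json_form = render_json(CODEBOOK)
print(csv_form)
print(json_form)
```

Both outputs carry the same thesaurus code for each item, which is what keeps data captured in different systems comparable afterwards.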

Acknowledgements

We thank Jan‐Willem Boiten (Lygature), Gerben Rienk Visser (Trial Data Solutions) and Maarten Ligtvoet (Nictiz) for reviewing the paper and providing invaluable suggestions. We also thank Wessel Sloof (UMCG) for testing the generated REDCap exports.

Keywords: Interoperability, eCRF, iCRF, Codebook, FAIR, Software, EDC, Clinical data



Call for action: register and test BBMRI‐NL’s request portal Podium

Erik van Iperen (1), Sofie Hansen (2), Tieneke Schaaij‐Visser (2), David van Enckevort (3), Jeroen Beliën (4), Morris Swertz (3), Folkert van Kemmendade (5), Jan‐Willem Boiten (2)

(1) Durrer Center for Cardiovascular Research, (2) BBMRI‐NL/Lygature, Utrecht, (3) Department of Genetics, University Medical Center Groningen, University Groningen, Groningen, The Netherlands, (4) Amsterdam UMC, Vrije University Amsterdam, department of pathology, Amsterdam, The Netherlands, (5) ErasmusMC, Rotterdam, The Netherlands

Background/information:

The exchange of samples and data from health registries, health databases, image archives and

biobanks to researchers is often still administered through e‐mails, fax or telephone. This can make

the management of a request difficult and may hinder the efficient use of valuable resources.

Therefore, within the BBMRI‐NL 2.0 project, we have developed a generic national request portal

‘Podium’.

Methods:

‘Podium’ was developed together with The Hyve, an open‐source software company, and with the

valuable input of various existing request procedures in the Netherlands such as: the BIOS‐

consortium, PSI, PALGA, Lifelines, GO‐NL and PHARMO. Podium is currently in production and can be

used free of cost by researchers and organizations alike for requesting and managing requests of

samples and data in order to facilitate and stimulate efficient, optimal and shared use of available

resources within the Netherlands. Podium supports the following steps (including linked requests): a

generic request form, evaluation and approval of a request, track and trace of the data, and sample

release. It is also possible to link Podium to existing back‐end systems, as has been

done at the NKI with ART.

Results:

In Podium we currently have 74 users and 17 organizations registered, including Go‐NL, BIOS, PSI,

NKI, Pharmo, and the NELSON study. However, we need your input! To enable a national one‐stop

shop request portal, including linked requests ‐ we need more organizations to register. To then

further improve our services, we need your feedback on how Podium works, and we are looking for a

use case to test our first request.

Conclusion:

Based on your input and wishes, we will submit a request for change to update the functionality of

Podium where needed. We hope you will join us in this effort in making existing data more easily

accessible for research.

Acknowledgements BBMRI‐NL

Keywords: FAIR, generic data request tool



BBMRI’s request portal Podium for samples, data, and images from Dutch biobanks and data collections

Erik van Iperen (1), Sofie Hansen (2), David van Enckevort (2, 3), Jeroen Beliën (4), Morris Swertz (3), Jan‐Willem Boiten (2)

(1) Durrer Center for Cardiovascular Research, (2) BBMRI‐NL/Lygature, Utrecht, (3) Department of Genetics, University Medical Center Groningen, University Groningen, Groningen, The Netherlands, (4) Amsterdam UMC, Vrije University Amsterdam, department of pathology, Amsterdam, The Netherlands

The exchange of samples and data from biobank/lab to researcher is traditionally administered

through e‐mails, fax or telephone. Within the BBMRI‐NL 2.0 project, we have developed together

with The Hyve a generic request portal ‘Podium’ for requesting samples and data in order to facilitate

and stimulate efficient, optimal and shared use of available resources within the Netherlands.

Podium is directly linked to the BBMRI‐NL catalogue.

The workflow that is supported in Podium is based on the valuable input of various existing request

procedures in the Netherlands such as: BIOS‐consortium, PSI, PALGA, Lifelines, GO‐NL and PHARMO.

We are currently evaluating the use of Podium at several organizations and, based on their input and

wishes, we plan to submit a request for change to update the functionality of Podium where needed.

Podium offers all researchers and biobanks a portal supporting the process of requesting samples

and data in a standardized manner, improving quality, reliability and accountability. Podium supports

the following steps: a generic request form, evaluation and approval of a request, track and trace of

the data, and sample release. Every process step is logged. Linked requests are also supported: a

linked request is a request composed of different data types from two or more organizations for

which the resulting datasets need to be linked by subject and/or sample (e.g. materials from PALGA

and data from PHARMO).
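
A request workflow of this kind, in which every step is logged, can be pictured as a small state machine. The sketch below is illustrative only: the state names, the `Request` class and the rule that a request involving two or more organizations counts as linked are assumptions drawn from the description above, not Podium's actual implementation or API.

```python
from datetime import datetime, timezone

# Hypothetical states and allowed transitions for a request.
TRANSITIONS = {
    "draft": {"submitted"},
    "submitted": {"approved", "rejected"},
    "approved": {"released"},
    "rejected": set(),
    "released": set(),
}

class Request:
    def __init__(self, request_id, organizations):
        self.request_id = request_id
        self.organizations = list(organizations)
        self.state = "draft"
        self.log = []  # (timestamp, event) pairs, one per process step
        self._record("created")

    @property
    def is_linked(self):
        """A linked request draws on two or more organizations."""
        return len(self.organizations) >= 2

    def _record(self, event):
        self.log.append((datetime.now(timezone.utc).isoformat(), event))

    def advance(self, new_state):
        """Move to `new_state` if allowed, and log the step."""
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"cannot go from {self.state} to {new_state}")
        self._record(f"{self.state} -> {new_state}")
        self.state = new_state

request = Request("REQ-0001", ["PALGA", "PHARMO"])
for step in ("submitted", "approved", "released"):
    request.advance(step)
print(request.state, request.is_linked, len(request.log))
```

Keeping the allowed transitions in one table makes the audit log and the permitted workflow easy to check against each other.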

A link with the catalogue enables users to select interesting samples/data in the catalogue, and

request access to this selection across multiple organizations at once.

The backend system of the NKI, ART, has been successfully linked to Podium using the API.

After evaluating Podium at multiple organizations, we have a list of new features and changes. The next

step will be to prioritize these new features and changes and find additional funding to develop and

implement them.

Acknowledgements NA

Keywords: Request, tool, FAIR, data access



Showcasing the Personal Health Train: Federated Learning in Real Time using VANTAGE6

Frank Martin (1), Arturo Moncada‐Torres (1), Melle Sieswerda (1), Johan van Soest (2), and Gijs Geleijnse (1)

(1) Netherlands Comprehensive Cancer Organization (IKNL), Eindhoven, NL, (2) Maastricht University Medical Centre+, Maastricht, NL

The growing complexity of cancer diagnosis and treatment needs data sets that are larger and richer

than currently available in a single location or database. This requires incorporating data from

different sources, which is typically done by generating a copy of each dataset and centralizing it at a

trusted party. Unfortunately, sharing patient information is becoming increasingly problematic due

to several risks and challenges, such as loss of data control, logistics of data transmission, and privacy

concerns.

Recently, the Personal Health Train (PHT) has emerged as a platform with the potential to overcome

these limitations. Under this approach, different parties (i.e., the stations) can answer their research

questions (i.e., the trains) collaboratively by exchanging aggregated data and/or statistics while

keeping the underlying data on site, safe and undisclosed. To make the PHT a feasible, long‐term

solution, a robust, flexible, and reliable infrastructure (i.e., the railways) is required to handle the

collaborations between parties.

At IKNL, we have developed VANTAGE6, our open‐source priVAcy preserviNg federaTed leArninG

infrastructurE for Secure Insight eXchange. This framework consists mainly of a central server, nodes,

and an interface with the user. We will showcase the capabilities of VANTAGE6 in a real time

Federated Learning demo by computing the average age of a group of participants in a Round‐robin

scenario. Furthermore, we will demonstrate how to perform a privacy‐preserving logistic regression

(as proposed by Li et al., 2015) to predict patient survival (i.e., the train) using the Breast Cancer

Wisconsin Diagnostic Data Set.
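
The round‐robin average can be illustrated without any infrastructure. In the sketch below, a travelling state (the train) visits each station in turn and picks up only a running sum and count, never individual ages. It is a plain Python illustration of the idea and does not use the VANTAGE6 API; the station names and ages are made up.

```python
def visit_station(train_state, local_ages):
    """Runs at one station: adds only local aggregates to the travelling state."""
    return {
        "sum": train_state["sum"] + sum(local_ages),
        "count": train_state["count"] + len(local_ages),
    }

# Made-up local data; in a real setting each list never leaves its station.
stations = {
    "station_a": [34, 51, 47],
    "station_b": [62, 58],
    "station_c": [29, 41, 55, 38],
}

state = {"sum": 0, "count": 0}
for name, ages in stations.items():  # the train visits the stations in turn
    state = visit_station(state, ages)

average_age = state["sum"] / state["count"]
print(round(average_age, 2))
```

In this naive form the second station can read the first station's exact aggregate from the incoming state. Whether and how that is masked (for example with a random starting offset that the initiator subtracts at the end) is a design choice the sketch does not address.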

Acknowledgements ‐

Keywords: distributed learning, infrastructure, logistic regression



Cancer surveillance

Harm Buisman (1), Guido Out (1)

(1) IKNL

Progress in cancer has historically depended on hunches from the field. For example, a doctor suspects a new

treatment is better, a researcher hypothesizes that centralization of care improves survival, or

patients report increased local leukemia rates. These hunches depend on coincidence. What if

nobody gets a hunch? On top of that, doing analyses can be cumbersome and time‐intensive.

Answering research questions requires a researcher to write their own scripts. Also, manually

analyzing multiple cancers, regions, and treatments takes a lot of time. As a result, the

reduction of the impact of cancer is hampered by chance and time factors.

In the cancer surveillance program at IKNL we develop a cancer monitor with analytics capabilities

based on available data such as the Netherlands Cancer Registry. This allows us to 1) automatically

identify surprising findings from the data, 2) provide continuous monitoring on multiple indicators

such as incidence, survival or variation in care, and 3) monitor a variety of dimensions such as tumor

type, region or gender at the same time. The tooling in this monitor provides an additional data

service to inspect found patterns using powerful visualizations and a toolbox of statistical analyses.
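
One simple way to picture automatic flagging of surprising findings is to compare observed case counts with expected counts per region. The sketch below uses the common approximation z = (observed - expected) / sqrt(expected) for count data. It is an assumption chosen for illustration, not the method used in the IKNL monitor, and the region names, counts and threshold are made up.

```python
import math

def flag_surprising(observed, expected, threshold=3.0):
    """Return (region, z) pairs whose observed count deviates strongly
    from the expected count, using z = (O - E) / sqrt(E)."""
    flagged = []
    for region, obs in observed.items():
        exp = expected[region]
        z = (obs - exp) / math.sqrt(exp)
        if abs(z) >= threshold:
            flagged.append((region, round(z, 2)))
    return flagged

# Made-up counts for illustration only.
observed = {"region_a": 120, "region_b": 98, "region_c": 161}
expected = {"region_a": 115.0, "region_b": 102.0, "region_c": 121.0}

flagged = flag_surprising(observed, expected)
print(flagged)
```

When many tumor types, regions and indicators are screened at once, some deviations will exceed any fixed threshold by chance, so flagged patterns are starting points for inspection rather than conclusions.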

With increased identification of promising research directions and a speedup in doing analyses, the

cancer surveillance program at IKNL helps reduce the impact of cancer.

Acknowledgements IKNL for over 30 years of data registered into the Netherlands Cancer Registry

Keywords: cancer, algorithms, data mining, visualization, GIS



Projecting Patient Data onto Clinical Decision Trees in Oncoguide

Kees Ebben, clinical informatician (1), Guido Out, scientific software developer (1), Arturo Moncada‐Torres, clinical data scientist (1), Thijs van Vegchel, clinical informatician (1)

(1) Netherlands Comprehensive Cancer Organization (IKNL), Eindhoven, NL

In oncology, clinical practice guidelines (CPGs) describe the best practices for treating specific

populations of patients. It has been shown that their implementation improves quality of care by

reducing unwanted variability and bettering outcomes in clinical practice.

CPGs are commonly structured as manuals, where the best practices are described in text through

many chapters. Unfortunately, textual CPGs are frequently ambiguous and inconsistent.

Furthermore, quite often the recommendations for a group of patients are spread through the whole

text, which makes it hard for the reader to get a clear picture of the decision process for a specific

treatment.

To address these issues, we have transformed the textual recommendations into data‐driven clinical

decision trees (CDTs) using FAIR principles. In order to analyze the flow of patients from the trees’

stem (i.e., original pool of patients) to the trees’ leaves (i.e., groups of patients for a specific

recommendation) we projected real‐world patient data for breast and prostate cancer onto the CDTs

in Oncoguide (www.oncoguide.nl). First, we created an inventory of the data‐items of all CDTs. Then,

we performed a delta analysis between these data‐items and the variables available in the

Netherlands Cancer Registry (NCR). We selected the CDTs where all data‐items could be obtained.

Next, we presented the total number of patients that matched the relevant data‐item value, and

showed the number (and percentage) of these patients that were treated according to the CPG. All

occurring treatments were presented and classified as adherent or non‐adherent. Finally, we

visualized this information on top of the CDT at each branch and leaf.
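
The projection step can be sketched as routing each patient record to a leaf and counting, per leaf, how many patients received the recommended treatment. The tree, the data items and the recommendations below are invented purely to show the mechanics; they are not taken from Oncoguide or from any clinical guideline, and the patient records are made up.

```python
from collections import Counter

def leaf_for(patient):
    """Route a patient through a hypothetical two-level decision tree."""
    if patient["stage"] == "early":
        return "leaf_1" if patient["age"] < 75 else "leaf_2"
    return "leaf_3"

# Hypothetical recommendation attached to each leaf.
RECOMMENDED = {"leaf_1": "treatment_x", "leaf_2": "treatment_y", "leaf_3": "treatment_z"}

def project(patients):
    """Return {leaf: (n_patients, n_adherent, percent_adherent)}."""
    totals, adherent = Counter(), Counter()
    for patient in patients:
        leaf = leaf_for(patient)
        totals[leaf] += 1
        if patient["treatment"] == RECOMMENDED[leaf]:
            adherent[leaf] += 1
    return {
        leaf: (totals[leaf], adherent[leaf], 100.0 * adherent[leaf] / totals[leaf])
        for leaf in totals
    }

# Made-up records for illustration only.
patients = [
    {"stage": "early", "age": 60, "treatment": "treatment_x"},
    {"stage": "early", "age": 68, "treatment": "treatment_y"},
    {"stage": "early", "age": 80, "treatment": "treatment_y"},
    {"stage": "advanced", "age": 55, "treatment": "treatment_z"},
]
summary = project(patients)
print(summary)
```

The per-leaf counts and percentages are the numbers that would be drawn on top of the tree at each branch and leaf.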

This visualization gives unambiguous and structured insight into guideline adherence. It also has the

potential to aid in the initial development and update of CPGs and could serve retrospectively as

evidence to measure and determine best clinical practice for patient (sub)populations.

Acknowledgements

We would like to thank Janneke Verloop, Katja Aben, Aafke Honkoop, Theo de Reijke, and Ignace de Hingh for their support and fruitful discussions during the development of this project.

Keywords: Clinical Practice Guidelines – Real‐world data – Data Projection – Breast Cancer – Prostate Cancer – Guideline Adherence



User demo of a public study in the national cBioPortal instance hosted by Health‐RI

N. Cassman (1), B. S. Hijmans (1), M. J. de Vries (1), J. Hudecek (1), M. Bierkens (1), R.J.A Fijneman (1), R. Azevedo (2), J.‐W. Boiten (2), G.A. Meijer (1)

(1) Department of Pathology, The Netherlands Cancer Institute, Plesmanlaan 121 1066 CX, Amsterdam, The Netherlands, (2) Lygature, Utrecht, The Netherlands

The cBio Cancer Genomics Portal, ‘cBioPortal’, is an open source data integration platform that

enables (cancer) researchers to view and query complex genomic datasets in a comprehensive

manner. The platform was originally developed by Memorial Sloan Kettering Cancer Center (New

York, USA) (1) and is actively maintained and further developed by an international community. The

original instance of cBioPortal (http://cbioportal.org) currently provides access to data from almost

83000 tumor samples from 273 public studies.

Health‐RI hosts a national instance of cBioPortal in the Netherlands (2), with controlled access. As

operators of the data team of Health‐RI we are involved with safe and secure importing of studies to

the national cBioPortal and would like to show you the possibilities of this platform and how it could

aid you in FAIR data management.

To show cBioPortal’s analysis and visualization capabilities, we would like to invite you to a demo

from the perspective of a researcher wishing to analyze a study’s dataset. We will explore the public

study ‘Low‐Grade Gliomas’ (3) using the national cBioPortal. We will take the viewer through up to

nine steps, as follows:

1. Logging in

2. cBioPortal interface

3. Study view

4. Study exploration

5. Study exploration: Group comparison

6. Study exploration: Patient level view

7. Gene panel‐based view

8. Gene panel‐based view: OncoPrint

9. Gene panel‐based view: Plots

After going through these steps, the viewer will have become familiar with exploring study data in

cBioPortal. The demo will include research questions, encouraging the audience to participate

actively.

Acknowledgements

1. Cerami et al. (2012). The cBio Cancer Genomics Portal: An Open Platform for Exploring Multidimensional Cancer Genomics Data. Cancer Discovery 2(5): 401–404.

2. https://trait.health‐ri.nl/trait‐tools

3. Johnson et al. (2014). Mutational analysis reveals the origin and therapy‐driven evolution of recurrent glioma. Science 343(6167): 189–193.

Keywords: cBioPortal, Health‐RI, demo


Abstract: 11 (Poster & Pitch)

Applying machine learning on health record data from general

practitioners to predict suicidality Kasper van Mens (1), Elke Elzinga (2), Mark Nielen (3), Joran Lokkerbol (4), Rune Poortvliet (3), Gé

Donker (3), Marianne Heins (3), Joke Korevaar (3), Michel Dückers (3), Claire Aussems (3), Marco

Helbich (5), Bea Tiemens (6), Renske Gilissen (2), Aartjan Beekman (7), Derek de Beurs (3)

(1) Altrecht Mental Healthcare, Utrecht, The Netherlands, (2) 113 Suicide Prevention, Amsterdam,

The Netherlands, (3) Nivel, Netherlands Institute for Health Services Research, Utrecht, The

Netherlands, (4) Centre of Economic Evaluation & Machine Learning, Trimbos Institute (Netherlands

Institute of Mental Health), Utrecht, the Netherlands, (5) Human Geography and Spatial Planning,

Utrecht University, Utrecht, The Netherlands, (6) Behavioural Science Institute, Radboud University,

Nijmegen, The Netherlands, (7) Psychiatry, Amsterdam Public Health (research institute), Amsterdam

UMC, Vrije Universiteit Amsterdam

Background

Suicidal behaviour is difficult to detect in general practice. Machine learning algorithms using

routinely collected data might support General Practitioners (GPs) in the detection of suicidal

behaviour. In this paper, we applied machine learning techniques to help GPs recognize suicidal

behaviour in primary care patients using routinely collected general practice data.

Methods

This case-control study used data from a nationally representative primary care database including

over 1.5 million patients (Nivel Primary Care Database). Patients with a suicide (attempt) in 2017 were

selected as cases (N = 574) and an at-risk control group (N = 207,308) was selected from patients with

psychological vulnerability but without a suicide attempt in 2017. RandomForest was trained on a

small subsample of the data (training set), and externally validated on unseen data (test set).

Results

Almost two-thirds (65%) of the cases visited their GP within the last 30 days before the suicide

(attempt). RandomForest showed a positive predictive value (PPV) of 0.05 (0.04 – 0.06), with a

sensitivity of 0.39 (0.32 – 0.47) and area under the curve (AUC) of 0.85 (0.81 – 0.88). Almost all

controls were accurately labelled as controls (specificity = 0.98 (0.97 – 0.98)). Among a sample of 650

at‐risk primary care patients, the algorithm would label 20 patients as high‐risk. Of those, one would

be an actual case and additionally, one case would be missed.
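The worked example above can be retraced with a short calculation. The sketch below is illustrative only and is not the authors' code: the function name and the way prevalence is derived from the reported case and control counts are assumptions. With the rounded point estimates (sensitivity 0.39, specificity 0.98) it gives a PPV of about 0.05, roughly one detected case and one missed case among 650 patients. The figure of 20 flagged patients matches a specificity of 0.97, the lower end of the reported interval, which presumably reflects rounding.

```python
def screening_outcomes(n_patients, prevalence, sensitivity, specificity):
    """Expected confusion-matrix counts when screening n_patients."""
    cases = n_patients * prevalence
    controls = n_patients - cases
    true_pos = sensitivity * cases             # cases flagged as high-risk
    false_neg = cases - true_pos               # cases missed
    false_pos = (1.0 - specificity) * controls # controls flagged as high-risk
    flagged = true_pos + false_pos
    return {
        "true_pos": true_pos,
        "false_neg": false_neg,
        "false_pos": false_pos,
        "flagged": flagged,
        "ppv": true_pos / flagged,
    }


# Assumption: prevalence in the at-risk group is taken from the reported N values.
prevalence = 574 / (574 + 207_308)

point = screening_outcomes(650, prevalence, sensitivity=0.39, specificity=0.98)
lower = screening_outcomes(650, prevalence, sensitivity=0.39, specificity=0.97)

print(f"PPV at specificity 0.98: {point['ppv']:.3f}")
print(f"Flagged at specificity 0.97: {lower['flagged']:.1f}")
```

Because the cases are so rare, the number of flagged patients is dominated by false positives, so small changes in specificity shift it considerably.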

Conclusion

This is the first study to apply machine learning to predict suicidal behaviour using general practice

data. Our results showed that these techniques can be used as a complementary step in the

identification and stratification of patients at risk of suicidal behaviour. The results are encouraging

and provide a first step towards using automated screening directly in clinical practice. Additional data from

different social domains, such as employment and education, might improve accuracy.

Acknowledgements

Netherlands Organisation for Health Research and Development (ZonMw), Dutch Ministry of Health.

Keywords: routine health care data, random forest, suicide prevention



FAIR data management and stewardship in real practice: capture and

integration of translational research data from the PLCRC sub‐studies

MEDOCC, MEDOCC-CrEATE and PROVENC3

Lana Meiqari (1), Carmen Rubio Alarcón (1), Dave E.W. van der Kruijssen (2), Suzanna Schraa (2),

Maaike Koelink (2), Olivier Paping (2), Miranda van Dongen (1), Mirthe Lanfermeijer (1), Menno de

Vries (1), Noriko Cassman (1), Brenda Hijmans (1), Rinus Voorham (3), Mariska Bierkens (1), Veerle

M.H. Coupé (4), Miriam Koopman (2), Gerrit A. Meijer (1), Geraldine R. Vink (2, 5), Remond J.A.

Fijneman (1)

(1) The Netherlands Cancer Institute, Amsterdam, The Netherlands, (2) University Medical Center Utrecht,

Utrecht University, Utrecht, The Netherlands, (3) Pathologisch‐Anatomisch Landelijk Geautomatiseerd Archief

(PALGA), Houten, The Netherlands, (4) Amsterdam University Medical Centers, Free University, Amsterdam, The

Netherlands, (5) Netherlands Comprehensive Cancer Organisation, Utrecht, the Netherlands

Background: Colorectal cancer (CRC) is the second most common cancer in the Netherlands. For early

stage disease, there is an unmet clinical need to better define who to treat (or not to treat) with

adjuvant chemotherapy after primary tumor resection. Detection of cell‐free circulating tumor DNA

(ctDNA) in post‐surgery liquid biopsies is a promising biomarker for minimal residual disease (MRD)

and associated with disease recurrence. Therefore, using the Prospective Dutch CRC cohort (PLCRC)

infrastructure, we evaluate the prognostic value of ctDNA post‐surgery in three study cohorts:

MEDOCC (stage II, observational), MEDOCC‐CrEATE (stage II, interventional), and PROVENC3 (stage

III, observational).

Aim: FAIR data management of clinical, biobanking and molecular data from MEDOCC, MEDOCC‐

CrEATE and PROVENC3 to support data collection, analysis and dissemination.

Methods: Following PLCRC informed consent, patients from 25 participating hospitals are registered in

SLIM. Clinical data are collected via the Netherlands Cancer Registry (‘NCR’) database and via on-site

registration in Castor EDC. Tumor tissue and blood samples are collected and shipped to the

Netherlands Cancer Institute (NKI). Tissue blocks are requested via the PALGA portal (part of the

Dutch National Tissue Portal project) or sent directly to NKI. Tissue and blood biosample data will be

registered in various NKI systems (Glims, LMS, Molpa). Molecular data will be obtained by targeted

sequencing (Illumina) and analyzed using the PGDx bioinformatics pipeline. Clinical and longitudinal

molecular data will be integrated and uploaded to cBioPortal.

Results: An overview of the data flow has been drafted. Collaborations are needed to integrate

clinical data from NCR and Castor and biobanking data. Traceability of biosamples within the NKI

systems is being defined. The data model for cBioPortal is being optimized.

Conclusion: Defining a data flow at the start of a project is key to improving translational research

quality and to ensure FAIR data management and Good Clinical Practice.

Acknowledgements

PTRC IT & data project team.

Keywords: FAIR, translational research, PLCRC, ctDNA, NCR, SLIM, PALGA, Castor, cBioPortal.



Performance Indicators and the infrastructure for health data

Rosemarijn Looije (1), Annemarijn Prins-van Ginkel (1), Astrid Roskes (1)

(1) UMC Utrecht, Department of Business Intelligence, Heidelberglaan 100 Utrecht, The Netherlands

The FAIR (Findable, Accessible, Interoperable and Re-usable) format provides advantages for all

stakeholders: hospitals, patients, researchers and insurance companies.

At UMC Utrecht we develop management information including strategic dashboards with key

performance indicators, which are evaluated each year and changed according to the focus of the

hospital. In addition, we develop dashboards with a specific focus, for instance, management

information regarding clinical processes and financial outcomes. A healthcare data infrastructure that

follows the FAIR format will provide benefits for the quality of our indicators. Some examples of

these benefits are listed below.

First, every hospital (or hospital department) currently has its own non-standardized manner of registering data

and deciding on the business rules that result in an indicator. By making use of the same data in

the same format across hospitals and hospital departments, sharing and comparing will become

easier. This would benefit hospital-wide decision-making.

Second, ambiguously registered data could be disambiguated through the addition of other

information. As the human-registered (and error-prone) data is backed with automatically registered

data, the indicators for the management information become a better representation of the truth.

Third, it provides information that is now hard to come by, because the patient journey continues

after hospitalization. The patient data that is created after hospitalization could be added. This will

support the ongoing development of (quality) indicators. In addition, this could result in

identification of certain patient types, thereby providing the possibility to improve patient care.

In turn, these identified patient types provide opportunities for research into, and application of,

personalized healthcare and forecasting. The results can be validated by monitoring the key

performance indicators thereafter.

In summary, all these aspects will result in improved healthcare, benefiting all aforementioned

stakeholders.

Acknowledgements

UMC Utrecht

Keywords: FAIR, management information, quality indicators



Use of encrypted BSN in record linkage of epidemiological cohorts and

biobanks with disease to ensure valid linkages with optimal privacy

protection.

A.W. (Sandra) van den Belt-Dusebout (1), R. (Rosemarie) Wijnands (1), O. (Otto) Visser (2), H.

(Hannelore) Hofhuis (3), J.L. (Hans) van Vlaanderen (4), J.A. (Jasper) Bovenberg (5), G. (Gerard) van

Grootheest (6), F.E. (Floor) van Leeuwen (1)

(1) Antoni van Leeuwenhoek ‐ The Netherlands Cancer Institute (NKI‐AVL), Amsterdam, (2)

Netherlands Comprehensive Cancer Organisation (in Dutch, IKNL), Utrecht, (3) The nationwide

network and registry of histo‐ and cytopathology in the Netherlands (in Dutch, PALGA), Houten, (4)

ZorgTTP, Houten, (5) jurist, (6) GGZ inGeest, Amsterdam

Background and purpose

Record linkage between cohorts/biobanks and (disease) registries is essential to efficiently and

validly answer important epidemiologic research questions. Yet in the Netherlands it is prohibited by

law to use the Citizen Service Number (in Dutch: Burger Service Nummer, BSN) for research

purposes. Therefore, record linkage can only be performed using personal identifying data, e.g.

name, date of birth and postal code. However, large registries also increase the chance of

incorrect/uncertain links while restrictions in the Dutch Personal Data Protection Act (Dutch: Wet

bescherming persoonsgegevens) complicate checking links, yielding incorrect research outcomes.

Therefore, this project aimed to develop an improved standard record linkage procedure through use

of irreversibly encrypted BSNs, to ensure valid linkages with optimal privacy protection.

Methods

We used the nationwide OMEGA‐cohort of 42,000 women treated for subfertility between 1980 and

2001. Detailed information is available on fertility treatments, life‐style factors and cancer diagnoses

(through record linkages with the Netherlands Cancer Registry and PALGA). A large biobank consists

of toenail clippings providing DNA and tumor tissue blocks of hormone‐related cancers providing

more specific phenotypic information. ZorgTTP performed pseudonymization to enable anonymized

linkage on our behalf.
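The idea of linking on an irreversibly encrypted identifier can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used by ZorgTTP: a keyed hash (HMAC-SHA256) stands in for the irreversible encryption, the key and all identifiers are made up, and the function name is hypothetical. Each source replaces the identifier by its pseudonym, and records are then joined on the pseudonym alone.

```python
import hashlib
import hmac


def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """One-way keyed hash of an identifier (assumed stand-in for encryption)."""
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical key; in practice it would be held by the trusted third party only.
KEY = b"demo-key-held-by-trusted-third-party"

# Made-up identifiers, not real BSNs.
cohort = {"111222333": "OMEGA-0001", "444555666": "OMEGA-0002"}  # identifier -> study ID
registry = {"111222333": "diagnosis-A", "999888777": "diagnosis-B"}  # identifier -> record

# Each source is pseudonymized separately; the raw identifier is dropped.
cohort_pseudo = {pseudonymize(i, KEY): study_id for i, study_id in cohort.items()}
registry_pseudo = {pseudonymize(i, KEY): record for i, record in registry.items()}

# Linkage is an exact match on pseudonyms.
linked = {
    cohort_pseudo[p]: registry_pseudo[p]
    for p in cohort_pseudo.keys() & registry_pseudo.keys()
}
print(linked)
```

Because the same identifier always maps to the same pseudonym under one key, exact matching is preserved, while the identifier itself cannot be read back from the linked file.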

Results

After having overcome many procedural and ethical problems, several pseudonymized PALGA record

linkages and NCR record linkages based on encrypted personal identifiers have been performed.

Linkage results from NCR and PALGA were compared based on OMEGA identification numbers.

Across several record linkage scenarios based on personal identifiers, compared with the linkage

based on encrypted BSN, the best combination of sensitivity and specificity was 0.91 and 0.98, and the

worst combination was 0.91 and 0.77.

Conclusion

Using encrypted BSNs yields the best linkage results with optimal privacy protection.

Acknowledgements

This study was funded by BBMRI-NL 1.0 and BBMRI-NL 2.0 Science Voucher. Tissue blocks have been

collected through BBMRI CP2011‐39. We thank VUmc and UMCU for providing encrypted BSNs.

Keywords: Record linkage; encrypted BSN; validation; privacy protection


Abstract: 15 (Poster)

The Ten Commandments of Translational Research Informatics

Tim Hulsen (1)

(1) Philips Research

Translational research applies findings from basic science to enhance human health and well‐being.

In translational research projects, academia and industry work together to improve healthcare, often

through public‐private partnerships. This “translation” is often not easy, because it means that the

so‐called “valley of death” will need to be crossed: many interesting findings from fundamental

research do not result in new treatments, diagnostics and prevention. To cross the valley of death,

fundamental researchers need to collaborate with clinical researchers and with industry so that

promising results can be implemented in a product. The success of translational research projects

often does not depend only on the fundamental science and the applied science, but also on the

informatics needed to connect everything: the translational research informatics. This informatics,

which includes data management, data stewardship and data governance, enables researchers to

store and analyze their ‘big data’ in a meaningful way, and enables application in the clinic. The author

has worked on the information technology infrastructure for several translational research projects

in oncology for the past nine years, and presents his lessons learned in this poster in the form of ten

commandments. These commandments are not only useful for the data managers, but for all

involved in a translational research project. Some of the commandments deal with topics that are

currently in the spotlight, such as machine readability, the FAIR Guiding Principles and the GDPR

regulations. Others are mentioned less in the literature, but are just as crucial for the success of a

translational research project.

Acknowledgements

The author would like to thank everyone involved in the CTMM-TraIT, CTMM-PCMM, Movember

GAP3, ERSPC, RE‐IMAGINE and LIMA projects.

Keywords: Translational research, medical informatics, data management, data curation, data

science



A tranSMART driven clinical diagnostic decision support system on

anemia

J. Worst DBA (1) and Prof. Dr. H.J. van den Herik (1)

(1) Leiden University

Data integration

In Health‐RI, the common goal is to interconnect the biomedical resources, empowering researchers

to develop better personalized medicine and health solutions. In the context of personalized medicine,

a tranSMART-driven CDDSS (clinical diagnostic decision support system) focused on anemia will support

a correct prognosis, insofar as currently available medical knowledge allows. According to Engelen (2018),

cooperation focused on the integration of clinical data leads to medical knowledge, which at present (2019)

doubles every 3 to 4 years.

Clinical data patterns representing the health condition of elderly patients

Mild or moderate anemia is often asymptomatic; signs and symptoms such as breathlessness

and/or fatigue upon strenuous exercise are common in severe anemia. The understanding of anemia (Bunn and

Aster, 2011; Boogaerts and Verhoef, 2017) is based on the balance between production and destruction

of red blood cells, which explains the level of circulating red blood cells. Erythropoiesis is an example

of a complex system. Ineffective erythropoiesis is critical to the pathophysiological explanation of

destructive anemias such as iron deficiency, myelodysplastic, and megaloblastic anemia.

Our study detected data patterns of a disease underlying anemia as a clinical indicator. Anemia

affects the immune system of the elderly (those > 65 years), which makes them

vulnerable to acute disease. The traditional notion has been that anemia in elderly individuals

always reflects a serious underlying condition. It has long been recognized that a proportion of

patients, usually elderly, have anemia that does not meet diagnostic criteria for a specific etiology

(unexplained anemia), with a prevalence of 17 to 45 % among the elderly. At present, it is

realistic for a 65-year-old to live another 20 years or more.

Acknowledgements

We would like to thank Lygature, Health-RI and The Hyve.

Keywords: anemia, tranSMART, CDDSS, elderly patients, data patterns, life‐span



Data Stewardship Wizard

Rob Hooft (1), Marek Suchánek (2), Vojtěch Knaisl (2), Jan Slifka (2), Robert Pergl (2)

(1) DTL, The Netherlands, (2) Czech Technical University in Prague, Czech Republic

We will present the latest state of our tool, the Data Stewardship Wizard.

The Data Stewardship Wizard is a tool for data management planning that is focused on getting the

most value out of data management planning for the project itself rather than on fulfilling

obligations. It is based on FAIR Data Stewardship, in which each data‐related decision in a project

acts to optimize the Findability, Accessibility, Interoperability and/or Reusability of the data. The

background to this philosophy is that the first reuser of the data is the researcher himself. The tool

encourages the consulting of expertise and experts, can help researchers avoid risks they did not

know they would encounter, and can help them discover helpful technologies they did not know

existed.

Data management planning has several sociological problems:

* The activity is seen as an obligation, a burden, by researchers.

* Some data management novices underestimate the risks of insufficient data management and data management planning and think that their knowledge of computing in the home environment extrapolates to research data management in the lab.

* It is hard for experts in specific aspects of data management to be found by the researchers that need them the most. Expertise lists do not work for users who are unaware that they will be running into a specific data management problem during the project.

We try to solve those problems using our tool, the “Data Stewardship Wizard”. We use the term

“Data Stewardship” to indicate that the activity is not only taking place during the project, but

extends to the long term maintenance of the resulting research data. We use the term “Wizard” to

refer to the tool as an “expert system” providing context dependent guidance to its users.

Our wizard:

* alleviates the negative view of data management planning by focusing primarily on the benefits for the research project itself and the researcher, not on the obligations;

* can help to show researchers all the different aspects of data management: IT, archival, sustainability and the entire FAIR data spectrum. The guidance tells stories of experts who have learned their lessons the hard way;

* points to available experts and expertise exactly where the issue at hand is brought up in the questionnaire.

The Data Stewardship Wizard (https://ds-wizard.org/)

* presents questions in a hierarchical fashion, so that only relevant data management subjects are presented to the user;

* can function as a checklist for data stewards operating in a project, just like pilots use a checklist to fly a plane: it ascertains that the experts do not forget any aspects of the planning;

* consists mostly of closed questions, encouraging thinking through all aspects and avoiding the problem that the researcher does not know where to start writing, thereby preventing the urge to copy an existing data management plan from another project.

Technically, the wizard consists of

* an open source web tool (with containerized installation) to present hierarchical data management questionnaires, storing intermediate results in a database;

* a knowledge model that contains a few hundred questions and is easy to extend;

* a system to maintain knowledge models and to adapt them to your own institute or infrastructure;

* a templating engine that can be used to transform the data stewardship plan into a standard data management plan following funder guidelines.
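How a hierarchical questionnaire shows only the relevant questions can be illustrated with a small sketch. This is not the Data Stewardship Wizard's actual knowledge model or API: the `Question` class, the `relevant_questions` function and the example questions are all hypothetical. Follow-up questions are attached to a specific answer and are visited only when that answer was given.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Question:
    text: str
    # Follow-up questions, keyed by the answer that makes them relevant.
    followups: Dict[str, List["Question"]] = field(default_factory=dict)


def relevant_questions(questions: List[Question], answers: Dict[str, str]) -> List[str]:
    """Return question texts in order, descending only into answered branches."""
    shown = []
    for question in questions:
        shown.append(question.text)
        answer = answers.get(question.text)
        # Unanswered questions, or answers without follow-ups, open no branch.
        shown.extend(relevant_questions(question.followups.get(answer, []), answers))
    return shown


# Hypothetical two-question model with one conditional follow-up.
model = [
    Question(
        "Will you collect personal data?",
        {"yes": [Question("What is the legal basis for processing?")]},
    ),
    Question("Where will the data be stored during the project?"),
]

print(relevant_questions(model, {"Will you collect personal data?": "no"}))
print(relevant_questions(model, {"Will you collect personal data?": "yes"}))
```

Answering "no" skips the legal-basis question entirely, which is the sense in which only relevant subjects are presented to the user.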

Acknowledgements

This work has been funded in part by ELIXIR, with in-kind contributions from DTL and Czech Technical

University in Prague.

Keywords: DMP, Data Management



Data Desks at the University Medical Centers: Facilitating Access to

Expertise on Data Handling

Rob Hooft (1), Anne-Lotte Masson (2), Margo van Reen (3), Mirjam Brullemans (3), Erik van Iperen

(4), Rudy Scholte (4), Harry Pijl (5), Judith Manniën (6), Petra van Overveld (6), Pascal Suppers (7),

Ronald van Schijndel (8) and Salome Scholtens (9)

(1) DTL, (2) Erasmus MC, (3) Radboudumc, (4) Amsterdam UMC locatie Meibergdreef, (5) UMC

Utrecht, (6) LUMC, (7) MUMC+, (8) Amsterdam UMC locatie de Boelelaan, (9) UMCG

Each of the University Medical Centers (UMCs) in The Netherlands has its own data expertise center.

These provide help with many different aspects of data management to all researchers. Over the past few

years we have brought these groups together in the data4lifesciences work package “Access to

Expertise”, and we will keep coming together within Health-RI in the future.

The group collaborates in order to:

* Exchange experience on how to run this kind of expertise desk (organisation and business)

* Exchange experience of the kind of questions received and how they are answered

* Discuss the position of data stewards in the UMCs

* Exchange practices for training of researchers as well as support staff

* Discuss related initiatives in The Netherlands, e.g. LCRDM

Through the exchange of information we want to achieve a landscape in which researchers from any

of the organisations can get access to expertise in the entire network.

Acknowledgements

NFU

Keywords: data stewardship, expertise, support, NFU, collaboration



Changes in the intracranial volume from early adulthood to the sixth

decade of life

Yaron Caspi (1), Rachel M. Brouwer (1), Hugo G. Schnack (1), Marieke E. van de Nieuwenhuijzen

(1), Wiepke Cahn (1), René S. Kahn (1,2), Wiro J. Niessen (3), Aad van der Lugt (3), Hilleke Hulshoff

Pol (1)

(1) UMC Utrecht Brain Center, Department of Psychiatry, University Medical Center Utrecht, The

Netherlands, (2) Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY,

USA, (3) Department of Radiology and Nuclear Medicine, Erasmus MC: University Medical Center

Rotterdam, The Netherlands

Aging is manifested in structural changes of the brain. To understand normal and abnormal brain

aging, it is important to study various brain biomarkers among different age groups.

The total intracranial volume (ICV) is an important biomarker. Previous work about ICV aging

gravitated around two views. The first view claims that a substantial ICV reduction of about 0.2

%/year occurs during adulthood. The second view suggests that the ICV stays constant during

adulthood. In light of these conflicting positions, it is essential to clarify if, and to what extent, ICV

changes with age.

We measured ICV in MRI (T1w) brain scans using a longitudinal design. Subjects were scanned at three

different time points with an average time interval of 3 years between scans (the number of individuals

was 563, 363 and 323; their mean ages were 27.13±7.23, 30.07±7.15 and 33.67±7.57 years). By

applying a semi-automatic, in-house-built algorithm for ICV extraction, we measured individual

trajectories of ICV changes between 20 and 62 years. This procedure allows us to detect

within-individual changes in ICV with increasing age.
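To make the unit %/year concrete, the sketch below computes an annualized within-individual change from repeated scans of one subject. It is a simplified illustration, not one of the three analysis methods used in the study: the function name is hypothetical, the volumes are made up, and the change is taken between the first and last scan relative to the first volume.

```python
def annual_percent_change(scans):
    """Annualized percent change between first and last scan.

    scans: iterable of (age_in_years, volume) pairs for one individual.
    """
    ordered = sorted(scans)  # sort by age
    (first_age, first_volume), (last_age, last_volume) = ordered[0], ordered[-1]
    relative_change = (last_volume - first_volume) / first_volume
    return 100.0 * relative_change / (last_age - first_age)


# Made-up example: three scans, three years apart, volumes in millilitres.
subject = [(50, 1500.0), (53, 1496.0), (56, 1491.9)]
rate = annual_percent_change(subject)
print(f"{rate:.2f} %/year")
```

A cross-sectional estimate would instead compare different individuals of different ages, so differences between generations can be mistaken for change within a person.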

Using three different analysis methods, we detected subtle but statistically significant longitudinal

trajectories of the ICV from adulthood to the middle of the sixth decade of life. Though the extent of

change differs between the analysis methods, they all show the same trend and order-of-magnitude

changes. For example, one of the methods shows that at age 20 there was an increase of +0.03 %/year, and at

age 55 there was a decline of -0.09 %/year. Thus, the ICV trajectory changes from growth to

decline, and the decline accelerates with age.

When using a cross-sectional approach, we find a constant ICV decline rate of about 0.2 %/year.

Thus, the cross-sectional approach estimated a stronger decline than the longitudinal one, which we

interpret as a generational effect.

Acknowledgements

The authors would like to thank Hakim Achterberg, Marcel Koek, Adriaan Versteeg, Thomas Phil,

Thomas Kroes, Baldur van Lew, Marcel Zwiers and Seyed Mostafa Kia for a collaboration within the

BBMRI‐NL work‐package 3.

This work was supported by the Netherlands Organization for Scientific Research (NWO

184.033.111), Biobanking and BioMolecular resources Research Infrastructure The Netherlands

(BBMRI‐NL2.0), and by the ENIGMA World Aging Center grant (NIH 1R56AG058854‐01, subaward

112068003).


The infrastructure for the GROUP study is funded through the Geestkracht programme of the Dutch

Health Research Council (Zon‐Mw, grant number 10‐000‐1001), and matching funds from

participating pharmaceutical companies (Lundbeck, AstraZeneca, Eli Lilly, Janssen Cilag) and

universities and mental health care organizations (Amsterdam: Academic Psychiatric Centre of the

Academic Medical Center and the mental health institutions: GGZ Ingeest, Arkin, Dijk en Duin, GGZ

Rivierduinen, Erasmus Medical Centre, GGZ Noord Holland Noord. Groningen: University Medical

Center Groningen and the mental health institutions: Lentis, GGZ Friesland, GGZ Drenthe, Dimence,

Mediant, GGNet Warnsveld, Yulius Dordrecht and Parnassia psycho‐medical center The Hague.

Maastricht: Maastricht University Medical Centre and the mental health institutions: GGzE, GGZ

Breburg, GGZ Oost‐Brabant, Vincent van Gogh voor Geestelijke Gezondheid, Mondriaan, Virenze

riagg, Zuyderland GGZ, MET ggz, Universitair Centrum Sint‐Jozef Kortenberg, CAPRI University of

Antwerp, PC Ziekeren Sint‐Truiden, PZ Sancta Maria Sint‐Truiden, GGZ Overpelt, OPZ Rekem. Utrecht:

University Medical Center Utrecht and the mental health institutions Altrecht, GGZ Centraal and

Delta.

Keywords: Intracranial Volume, Aging, Brain age, MRI, Longitudinal design, Cross‐sectional design.



The PALGA Portal ‐ Streamlining and professionalizing the request,

delivery and use of pathology data and materials.

Annette Bruggink (1), Rinus Voorham (1), Stefan Willems (2), Iris Nagtegaal (3), Folkert van

Kemenade (4)

(1) PALGA, (2) UMCU, (3) RadboudUMC, (4) ErasmusMC

PALGA, the Dutch pathology registry, delivers data from national databases for purposes such as

scientific research, medical quality control, and for the evaluation and monitoring of screening

programs. Furthermore, PALGA offers the option of linking cohort data via a Trusted Third Party (TTP).

PALGA contains over 85 million pathology records and the accompanying materials, FFPE blocks and

tissue slides, are stored in 45 pathology labs. The PALGA Portal allows fast, easy and safe access to

these resources and with that, stimulates secondary use of pathology data and tissues for research.

The PALGA Portal was built in collaboration with BBMRI. The PALGA portal is a web‐based portal that

allows researchers to request pathology data or material from almost all diagnostic pathology labs in

the Netherlands. Laboratory Requests are forwarded to the designated labs and track‐and‐traced.

HUB‐employees, stationed in every academic hospital and serving the non‐academic labs, aid in

picking, registering and sending the requested materials.

Before the start of the PALGA portal almost all pathology labs were visited to introduce the PALGA

portal. In 2018, 35 of the 45 pathology labs were visited to evaluate the use of the PALGA portal. We have

spoken with more than 100 pathologists and laboratory staff to give an update about the PALGA

portal, to discuss the changes under the GDPR in pathology research, and to collect suggestions for

improvement.

In 2019 (Q1-Q3), 144 requests for ‘PA material’ were sent to the laboratories. In total, 19,558 PA numbers

were requested, of which 14,745 concerned FFPE material. The other 4,813 were pathology

reports or clinical data.

The PALGA Portal has streamlined and professionalized the request, delivery and use of pathology

data and material for research. It has been shown to increase efficiency and transparency for both the

requesting researchers and providing pathology labs.

Acknowledgements

BBMRI

Keywords: PALGA portal, Data, FFPE blocks



Generating CT‐scans with 3D Generative Adversarial Networks Using

Supercomputers

(1) David Ruhe, (2) Valeriu Codreanu, (3) Caspar van Leeuwen, (4) Damian Podareanu, (5) Vikram

Saletore, (6) Jonas Teuwen

(1) SURFsara, (2) Intel, (3) NKI

It is already a well‐known fact in the computer vision community that current deep learning methods

achieve very accurate results compared to traditional methods. These approaches are very data

hungry, and also require that training data overlaps as much as possible with the true data

distribution, such that the algorithmic bias is minimized when deploying these systems in the real‐

world. Although deep learning applied to medical imaging has been shown to achieve good results within

the same medical centre that provided the training data, generalization to other centres is often poor

because of a lack of large, multi‐centre (public) datasets. One of the main reasons that such data sets

are scarce is that privacy concerns make sharing very difficult. To overcome this challenge, this study

aims to generate synthetic 3D CT (computed tomography) scans, which would allow hospitals around the

world to share medical images that follow the same data distribution as their real data.

We extend to voxel space recent work on a technique that progressively grows GANs (Generative

Adversarial Networks) during training, and validate our techniques using the open

LIDC‐IDRI thoracic CT dataset. Generating CT samples is very challenging in terms of computational

and memory requirements, as it requires both using 3D convolutional layers, as well as the ability to

generate large‐dimensionality 512x512x128 scans. In order to iterate faster over this high

computational complexity model, we have used distributed training, our current models being

trained on up to 256 dual‐socket Intel Cascade Lake nodes (~12000 cores).
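A rough size estimate shows why volumes of this dimensionality are demanding. The sketch below counts the voxels in one 512x512x128 scan and the memory needed to hold it, assuming 32-bit floating point values (4 bytes per voxel); the data type and the function name are assumptions, not details from the study.

```python
def volume_size(shape, bytes_per_voxel=4):
    """Return (voxel count, size in MiB) for a dense volume of the given shape."""
    voxels = 1
    for dim in shape:
        voxels *= dim
    return voxels, voxels * bytes_per_voxel / 2**20


# One full-resolution scan, assuming float32 storage.
voxels, mib = volume_size((512, 512, 128))
print(f"{voxels:,} voxels, {mib:.0f} MiB per volume")
```

This covers a single output volume only. During training, the intermediate feature maps of 3D convolutional layers hold many channels per voxel, so the working memory is a large multiple of this figure.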

Acknowledgements

We would like to acknowledge the Endeavor support team and especially Mr. Mallick Arigapudi.

Keywords: medical imaging; deep learning; generative adversarial networks



Medical images and AI: the need of a big data revolution

Alberto Traverso (1), Ivan Zhovannik (1, 2), Ibrahim Hadzic (1), Suraj Pai (1), Dominik Jeurissen (1),

Andre Dekker (1)

(1) Department of Radiotherapy, Maastro Clinic, Doctor Tanslaan 12, 6229ET Maastricht (NL), (2)

Department of Radiotherapy, Radboud University Medical Center, Geert Grooteplein Zuid 10, 6525GA

Nijmegen (NL)

Medical images represent the largest share of data produced in the clinic and play a key role in

radiation oncology. Cancer diagnosis, treatment and monitoring are based on patients’ scans.

However, visual inspection of medical images does not capture the unique information that scans

embed. This is because medical images should be considered more than just pictures: they are big data.

The advent of AI (Artificial Intelligence)-driven computational pipelines opened the possibility to

automatically inspect images via “artificial eyes” and extract valuable information, more than we as

humans can process. This is called quantitative imaging. This information has the advantage of

non-invasively (compared, for example, to biopsies) exploiting the unique biology of a patient's tumor, and it

can support data-driven decisions about our patients: better decisions, better cancer care. However,

the medical imaging community was not ready for this revolution. After an initial phase of

excitement with promising results for several diseases, investigation of the weaknesses and drawbacks

of this new methodology followed, especially where a gap exists between academic results and

translation to the clinic. Lack of transparency, low quality of input data, absence of accepted

methodology, complexity of the problem to be solved, and limited generalizability of image-derived models

represent the major drawbacks. Practically, all these drawbacks were generated by not considering

medical images and derived information as big data, with all the issues that come with that, but also

with all the methodology already available to tackle this hyperdimensional problem. Only by taking

a step back and recognizing the importance of considering quantitative imaging as a big data problem

can we solve the above-mentioned issues. We will present our ongoing efforts to revitalize

quantitative imaging, which include: improvement of image quality, better and reproducible AI

tied to FAIR principles, and common computational infrastructure that is stronger than

data-sharing barriers.

Acknowledgements The authors would like to acknowledge their colleagues at Moffitt Cancer Centre, Princess Margaret

Cancer Centre and the DIAG (Diagnostic Image Analysis Group) in Nijmegen.

Keywords: medical imaging, AI, big data, transparent science, radiation oncology



Laboratory variation of molecular testing in a Dutch cohort of metastatic non‐small cell lung cancer patients from 2017

(1) Betzabel Cajiao Garcia, (2) Chantal CHJ Kuijpers, (3) Michel M van den Heuvel, (4) Anne SR van Lindert, (5) Ronald AM Damhuis, (6) Stefan M Willems

(1) University Medical Centre Utrecht, (2) University Medical Centre Utrecht, Foundation PALGA, (3)

Radboudumc, (4) University Medical Centre Utrecht, (5) Netherlands Comprehensive Cancer

Organisation, (6) University Medical Centre Utrecht

Background & objective: Adequate and timely testing for genetic alterations in non‐small cell lung

cancer (NSCLC) is necessary to consider targeted therapy when a certain genetic alteration is present.

Previously, we demonstrated that in the Netherlands molecular testing was suboptimal in 2015, as

25% (EGFR/KRAS and ALK) to 50% (ROS1) of patients were not tested according to guidelines, and

notable laboratory variation was present. Currently, by analyzing a cohort of metastatic NSCLC from

2017, we aim to assess whether the performance of molecular testing improved.

Methods: All stage IV non‐squamous NSCLC with incidence year 2017 from the Netherlands Cancer

Registry were matched to the Dutch pathology registry (PALGA). Using information extracted from

pathology excerpts, proportions of tumors tested for EGFR/KRAS mutation, and ALK, ROS‐1, and RET

rearrangement <3 months after diagnosis were determined and variation between 42 laboratories

was assessed.

Results: Of 3746 identified patients, we have currently analyzed 1565 (42.0%). Fifty‐five patients were ineligible after matching, leaving 1510 (40.3%) patients. EGFR/KRAS testing was performed

in 1245 patients (82.5%) (laboratory variation 63.6‐100%). Of the EGFR/KRAS wildtype tumors

(n=608), 422 (69.4%) tumors were tested for ALK (33.3‐100%), 305 (50.2%) for ROS‐1 (0‐100%), and

214 (35.2%) for RET (0‐100%). Insufficient tumor tissue and inappropriate specimens were the most frequently stated reasons for not testing.

Conclusions: These preliminary data show significantly higher EGFR/KRAS, ALK and ROS‐1 testing

proportions compared to 2015. Further improvement remains possible, in some laboratories more

than in others, and especially for ROS‐1 and RET testing, to identify candidates for targeted therapy.

Combining FAIR data from two national databases facilitates data‐driven feedback to clinicians to enhance personalized treatment of lung cancer patients.

Acknowledgements UMC Utrecht, Foundation PALGA, Pfizer, Roche, AstraZeneca

Keywords: Non‐Small‐Cell Lung Carcinoma, molecular testing, personalized medicine, data aggregation, PALGA, FAIR data



Revision of the Code of Conduct for Health Research (Gedragscode gezondheidsonderzoek)

Dr. Martin Boeckhout (1), mr. Evert‐Ben van Veen (1)

(1) MLC Foundation

The 2004 Code of Conduct for Health Research (Gedragscode gezondheidsonderzoek) has long provided clarity about the conditions under which patient data may be used in health research. In the meantime the GDPR (AVG) has arrived, unfortunately sometimes accompanied by "GDPR cramp" ("AVG kramp"), and health research itself has changed considerably in character. With support from VWS and ZonMw, COREON has started the revision of the Code of Conduct. The practical execution has been entrusted to the MLC Foundation. Unlike its predecessor, the revised Code of Conduct will offer guidance for all data processing in health research, including, for example, data obtained in the context of a WMO study, from volunteers in nWMO research, or via biobanks. The poster describes:

• the reason(s) for the revision;

• the revision process and how the various stakeholders are involved in it;

• examples of the topics that will be addressed in the Code of Conduct;

• the structure of the Code of Conduct;

• the timeline.

Acknowledgements ZonMw

Keywords: AVG, GDPR, WMO, nWMO



Supporting your Research; Tools for Data Management and Processing

Project and Steering Committee Research ICT (1)

(1) Amsterdam UMC, Amsterdam, The Netherlands

In 2018, an ambitious four‐year plan was launched, aiming to boost and harmonize data management and IT support for researchers within Amsterdam UMC. Here is an overview of services, including dedicated workspaces, that have recently become available to researchers:

Data Management Support – A newly established helpdesk can help to review data management

plans and provides support on data collections and applications.

Research Data Platform – Clinical data from Amsterdam UMC patients gathered during care through Epic and other sources are currently collected in a research data platform and can be extracted in a GDPR‐compliant format – anonymous, encoded or identifiable, depending on the legal requirements.

CTcue Patientfinder – CTcue uses Epic to search for patients fitting study criteria based on up‐to‐date structured (e.g., diagnoses, medication, lab results) as well as unstructured (e.g., notes, reports, letters) data.

Castor Campus License – Castor enables researchers to easily capture and integrate data in a GDPR‐

compliant manner. A new campus license offers the Castor eCRF free of charge for non‐commercial

research activities at Amsterdam UMC.

Azure‐based DRE – The anDREa consortium led by Radboudumc is developing a user‐friendly digital

research environment where researchers can work with all their data, analytics, and tools in a secure

and self‐serviced manner. Researchers from Amsterdam UMC have access to this environment.

Research Cloud – The Amsterdam UMC Research Cloud is a platform for flexible and advanced computing, hosted within the SURFcumulus research environment. The cloud is designed for and available to IT‐experienced researchers.

Research Zone – The research zone is a network specifically set up for research and separated from

other networks, like the care domain network. The research zone facilitates connections to

(inter)national networks and shared resources, including data, storage, software, and high‐

performance computing.

Acknowledgements None

Keywords: Data management Support, Research Data Platform, CTcue, Castor, anDREa, Research

Cloud



FAIR Genomes: Standardizing a meta‐data schema for FAIRifying personal genome data workflows

Gurnoor Singh (1), K. Joeri van der Velde (2), Jeroen Beliën (4), Jasmin Böhmer (3), Daphne

Stemkens (5), Lisenka Vissers (1), Jeroen van Reeuwijk (1), Saskia Hiltemann (7), Lennart Johansson

(2), Nienke van der Stoep (6), Daoud Sie (4), Janneke Weiss (4), Geert Frederix (3), Marco Roos (6),

Erik van Iperen (8), Terry Vrijenhoek (3), Folkert W. Asselbergs (3), Joris van Montfrans (3), Rolf

Sijmons (2), Hanneke van Deutekom (3), Pieter Neerincx (2), Joep de Ligt (3), Fernanda de Andrade

(2), Anna Niehues (1), Hindrik H.D. Kerstens (10), Mark Thompson (6), Rajaram Kaliyaperuma (6),

Annika Jacobsen (6), Katy Wolstencroft (6, 14), Ies Nijman (3), Marcel Nelen (1), Ariaan Siezen (1),

Koen ten Hove (1), Nine Knoers (2), Christian Gilissen (1), Hans Scheffer (1), Stefan Willems (3),

Wendy van Zelst‐Stams (1), Helger IJntema (1), Kim Elsink (3), Bart de Koning (9), Bauke Ylstra (4),

Erik Sistermans (4), Patrick Kemmeren (10), Henne Holstege (4), Christine Staiger (11), Bastiaan

Tops (10), Susanne Rebers (12), David van Zessen (7), Valesca Retèl (12), Edwin Cuppen (13), Peter

van Tintelen (3), David van Enckevort (2), Lieneke Steeghs (1), Salome Scholtens (2), Jeroen Laros

(6), Leon Mei (6), Cor Oosterwijk (5), Andrew Stubbs (7), Peter A.C. ‘t Hoen (1), Mariëlle van Gijn

(2), Morris Swertz (2)

(1) Radboud University Medical Center, Nijmegen, The Netherlands, (2) University Medical Center Groningen, The Netherlands, (3) University Medical Center Utrecht, The Netherlands, (4) Amsterdam University Medical Centers, location VUmc, The Netherlands, (5) VSOP ‐ Dutch Patient Alliance for

Rare and Genetic Diseases, (6) Leiden University Medical Center, The Netherlands, (7) Erasmus

Medical Center, Rotterdam, The Netherlands, (8) Durrer Center for Cardiovascular Research, Utrecht,

The Netherlands, (9) Maastricht University Medical Center, The Netherlands, (10) Princess Máxima

Center for Pediatric Oncology, Utrecht, The Netherlands, (11) Dutch Techcentre for Life Sciences,

Utrecht, The Netherlands, (12) Netherlands Cancer Institute, Amsterdam, The Netherlands, (13)

Hartwig Medical Foundation, Amsterdam, The Netherlands, (14) Leiden Institute for Advanced

Computer Science, Leiden University, Leiden, The Netherlands

The increase in personal genome data generated in diagnostics and research holds great promise for

advancing personalized prevention and medicine. However, valuable genomic and associated clinical

data is fragmented across many healthcare providers and research organizations, making it difficult

to reuse due to lack of findability, accessibility and interoperability. This prevents us from exploiting

the potential information contained in these genomes for health benefit. FAIR Genomes aims to

provide guidelines that should increase reuse of genomic data while considering the needs of all

stakeholders and addressing ELSI issues.

We present a standardized meta‐data schema to harmonize genomic data workflows and their

reporting practices. This schema is broadly segmented into five categories: general information;

informed consent; personal and clinical information; material information and technical information.

In face‐to‐face and videoconference meetings, we work towards defining the schema, which is a list

of common and optional data elements with relationships and values mapped to existing ontologies

such as SNOMED, DUO, HPO, UMLS and EDAM. This project aims to make all data and meta‐data

elements findable and interoperable to increase FAIRness and standardization in capturing genomic

data. This meta‐data schema provides a strong basis for digital twin data in Dutch hospitals,

development of personal genetic lockers, and active Dutch participation in the European '1+ Million

Genomes' Initiative.
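To make the shape of such a schema concrete, the sketch below models one record with the five categories named above and with values mapped to ontology terms. It is only an illustration under stated assumptions: the class names, field names and example values are hypothetical and do not reproduce the actual FAIR Genomes schema, and the ontology codes are included purely as examples.

```python
from dataclasses import dataclass, field


@dataclass
class CodedValue:
    """A value mapped to an ontology term."""
    ontology: str  # e.g. "HPO", "SNOMED", "DUO", "EDAM"
    code: str      # illustrative code, not validated against the ontology
    label: str


@dataclass
class GenomeRecord:
    # One attribute per schema category named in the abstract; the elements
    # inside each category are hypothetical placeholders.
    general: dict = field(default_factory=dict)
    informed_consent: list = field(default_factory=list)   # e.g. DUO terms
    personal_clinical: list = field(default_factory=list)  # e.g. HPO terms
    material: dict = field(default_factory=dict)
    technical: dict = field(default_factory=dict)

    def ontologies_used(self):
        """Return the sorted ontologies referenced by the coded values."""
        coded = self.informed_consent + self.personal_clinical
        return sorted({value.ontology for value in coded})


record = GenomeRecord(
    general={"record_id": "example-001"},
    informed_consent=[CodedValue("DUO", "DUO:0000042", "general research use")],
    personal_clinical=[CodedValue("HPO", "HP:0001250", "Seizure")],
    material={"sample_type": "blood"},
    technical={"sequencing": "WGS"},
)
```

Keeping every coded element as an (ontology, code, label) triple is one simple way to keep values interoperable across institutions; the real schema may model this differently.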


The scope of this schema goes beyond next‐generation DNA sequencing data. We expect to expand into other *omics data types, as well as capturing analysis pipelines in FAIR terms. Hence, the

FAIR Genomes meta‐data framework could be used to develop other research‐based infrastructures

such as X‐omics, BBMRI, ELIXIR, Solve‐RD and European Joint Programme on Rare Diseases.

The FAIR Genomes meetings are open to receive input from anyone to achieve the highest quality

and usability of the resulting meta‐data framework. Join us at: https://github.com/fairgenomes.

Keywords: FAIR, datasharing, genomics, healthcare, meta‐data, ontologies, framework



BBMRI‐Omics: Valuable resource of multi‐omics data and analysis tools

Marian Beekman (1), Jurriaan Barkey Wolf (1), Davey Cats (1, 2), Joyce van Meurs (3), Lude Franke

(4), Bastiaan T. Heijmans (1), Morris Swertz (4), Leon Mei (1, 2), Cornelia van Duijn (5), Dorret I.

Boomsma (6), P. Eline Slagboom (1), GONL (7), BIOS Consortium (8), BBMRI Metabolomics

Consortium (9)

(1) Molecular Epidemiology, LUMC, Leiden, (2) Sequencing Analysis Support Core, Leiden University

Medical Center, Leiden, The Netherlands, (3) Internal Medicine, Erasmus University Medical Center,

Rotterdam, (4) Genetics, University Medical Center Groningen, University of Groningen, Groningen,

The Netherlands, (5) Epidemiology, Erasmus University Medical Center Rotterdam, Rotterdam, The

Netherlands, (6) Biological Psychology, Vrije Universiteit, Amsterdam, The Netherlands, (7) BBMRI‐NL

consortium Genome of the Netherlands, (8) BBMRI‐NL consortium Biobank‐based Integrative Omics

Studies, (9) BBMRI‐NL consortium Metabolomics

BBMRI‐Omics is the joint collection of omics data that has collaboratively been generated on

thousands of participants of 29 Dutch biobanks and that is made available for BBMRI researchers

focusing on integrative omics studies. BBMRI‐Omics is publicly available and has proven to facilitate

researchers in their discovery of novel biological mechanisms and biomarkers for health and disease.

BBMRI‐Omics comprises 4,000 individuals with integrative genomics data (genome, epigenome, transcriptome and metabolome), with an extension of the metabolome in 30,000 additional individuals and whole genome sequences in a selected group of 700 individuals. BBMRI‐Omics also provides tools (on GitLab and Bioconductor) and computational space, both facilitating omics data analysis. The summary statistics of cross‐omics association analyses, such as expression QTLs, methylation QTLs and metabolite QTLs, are browsable in the BBMRI‐Omics Atlas (bbmri.researchlumc.nl/atlas), where, for example, genes or genetic locations of interest can be browsed for association with DNA methylation, metabolite levels and/or gene expression in blood. Soon it will be possible to link available omics data to available brain imaging. The unique scale of the BBMRI‐Omics infrastructure, in the sense of

the number of individuals as well as the amount of data per individual, enables researchers to

investigate a broad spectrum of research questions.

Acknowledgements This work was financially supported by BBMRI‐NL, a Research Infrastructure financed by the Dutch

government (NWO, numbers 184.021.007 and 184.033.111).

Keywords: Whole genome sequences, transcriptome, methylome, metabolome, multi‐omics data,

analysis tools



Challenges in building a FAIR Data Station

Jack Broeren (1)

(1) stakeholders

We want to share information about what you can run into when you set up a FAIR data station for production purposes. It is relatively simple to set up a small demo project in a POC or pilot. If you then want to expand this to full scale, you run into all kinds of issues, such as performance, setting up load processes, update mechanisms, etc.

Acknowledgements stakeholders

Keywords: Big data; scaling; performance; architecture



EATRIS‐Plus ‐ a multi‐omic toolbox to support cross omic analysis and data integration in clinical samples

Ms Anne‐Charlotte Fauvel (1), Dr Florence Bietrix (1), Dr Andreas Scherer (2), Prof. Alain van Gool (3), Prof. Peter‐Bram 't Hoen (3), Prof. Marian Hajduch (4), Dr. Antonio Andreu (1)

(1) EATRIS, (2) FIMM, (3) Radboudumc, (4) IMTM

Efficient advancement of Personalised Medicine depends on the availability of validated patient‐targeted biomarkers. However, although our capacity to identify genetic variants associated with complex diseases is increasing, these variants do not fully recapitulate the resulting disease phenotypes, and a more precise understanding of the molecular profiles is needed. This realisation provides a rationale for the development of multi‐omic approaches. In order to turn the multi‐omic promises into a reality, systemic bottlenecks impacting the biomarker field need to be overcome:

• Poor levels of technological and analytical harmonisation;

• Poor data stewardship and compliance with the FAIR (Findable, Accessible, Interoperable, and Reusable) principles;

• Lack of understanding of the relationship between genomic biomarkers and downstream molecular markers (transcriptomic, proteomic, metabolomic, among others);

• Lack of reliable control reference values for these biomarkers in healthy populations; and

• Poor understanding of the clinical needs resulting in limited clinical adoption.

Tackling those issues in a systematic way is one of the objectives of EATRIS‐Plus, an H2020‐funded project starting in early 2020. With 19 partners across 13 countries, the consortium aims to deliver a multi‐omic toolbox to support cross omic analysis and data integration in clinical samples.

This toolbox will contain:

• Consensus‐based SOPs for omic technologies;

• Guidelines for omic analytical processes;

• Validated reference materials for analytical processes;

• Quality parameters for benchmarking quality assessment activities;

• Data analytical and FAIRification tools;

• Criteria for establishing reference values in population cohorts;

• Troubleshooting guidelines;

• Access to a repository of multi‐omic reference values.

The omic tools will be developed and tested with a real‐setting demonstrator, an already established cohort of 1,000 healthy individuals in Czechia on whom genomic sequencing has already been performed. Information available on this cohort of healthy individuals will be augmented during the project with transcriptomic, proteomic and metabolomic data.

By providing such a toolbox to the research community, EATRIS‐Plus will be the engine to enable high‐quality research in the context of patient stratification and accelerate the implementation of Personalised Medicine solutions.

EATRIS is the European Infrastructure for Translational Medicine providing services for accelerating

biomedical innovation.


Acknowledgements This project has received funding from the European Union's Horizon 2020 research and innovation

programme under grant agreement No 871096

Keywords: personalised medicine, multi‐omics, FAIR data, patient stratification, translational

research



Fully automatic construction of optimal radiomics workflows

Martijn P. A. Starmans (1, 2), Sebastian R. van der Voort (1, 2), Hakim Achterberg (1, 2), Marcel

Koek (1, 2), Razvan L. Micle (1), Milea J.M. Timbergen (3, 4), Melissa Vos (3, 4), Fatih Incekara (1, 5),

Maarten M.J. Wijnenga (6), Guillaume A. Padmos (1),

(1) Department of Radiology and Nuclear Medicine, Erasmus MC, Rotterdam, the Netherlands, (2) Department

of Medical Informatics, Erasmus MC, Rotterdam, the Netherlands, (3) Department of Surgical Oncology,

Erasmus MC, Rotterdam, the Netherlands, (4) Department of Medical Oncology, Erasmus MC, Rotterdam, the

Netherlands, (5) Department of Neurosurgery, Erasmus MC, Rotterdam, The Netherlands, (6) Brain Tumor

Center, Erasmus MC, Rotterdam, the Netherlands, (7) Department of Pathology, Erasmus MC, Rotterdam, the

Netherlands, (8) Faculty of Applied Sciences, Delft University of Technology, the Netherlands

Purpose: Radiomics uses combinations of imaging features to predict clinically relevant data. Many

radiomics methods have been described in the literature; however, there is no single method that

works for many applications. We present a Workflow for Optimal Radiomics Classification (WORC),

an open‐source solution to fully automatically construct an optimal workflow per application.

Methods and Materials: WORC formulates radiomics as a modular workflow, including multiple algorithms and their parameters for each component. During training, WORC automatically adapts itself by testing thousands of pseudo‐randomly defined radiomics workflows. The best workflows are combined into one optimal signature. To evaluate WORC, three experiments on different clinical applications were performed: (1) to classify primary liver tumors in 119 patients as benign or malignant on T2‐weighted magnetic resonance (MR) scans; (2) to predict the 1p/19q co‐deletion in 287 patients with presumed low‐grade gliomas on T1‐ and T2‐weighted MR; and (3) to distinguish liposarcomas from lipomas in 88 patients on T1‐weighted MR. Ground truth was obtained through pathology. Evaluation was implemented through a 100x random‐split cross‐validation, with 80% of the data used for training and 20% for testing. Performance is reported as 95% confidence intervals (CIs).

WORC requires an efficient infrastructure to host these datasets, integrate different software

solutions and execute a large number of workflows. WORC therefore uses the fastr workflow engine

for managing the execution of automated analysis pipelines. Datasets are stored on XNAT and

experiments executed on the SURFSara Cartesius cluster, for which fastr contains plugins.
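The evaluation scheme described in the Methods (100 random splits, 80% training and 20% testing) can be sketched in plain Python as follows. This is not WORC code: the function names, the toy scoring function and the data are hypothetical, and the simple percentile interval is an assumption, since the abstract does not state how its confidence intervals were computed.

```python
import random


def random_split_cv(samples, evaluate, n_splits=100, train_frac=0.8, seed=0):
    """Repeat a random train/test split and collect one score per split."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_splits):
        shuffled = samples[:]
        rng.shuffle(shuffled)
        cut = int(round(train_frac * len(shuffled)))
        train, test = shuffled[:cut], shuffled[cut:]
        scores.append(evaluate(train, test))
    return scores


def percentile_interval(scores, level=0.95):
    """Empirical interval covering `level` of the scores (an assumption)."""
    ordered = sorted(scores)
    tail = (1.0 - level) / 2.0
    low = ordered[int(tail * (len(ordered) - 1))]
    high = ordered[int(round((1.0 - tail) * (len(ordered) - 1)))]
    return low, high


def toy_evaluate(train, test):
    """Toy stand-in for training and scoring a workflow: the score is
    simply the fraction of positive labels in the test part."""
    return sum(label for _, label in test) / len(test)


data = [(i, i % 2) for i in range(50)]  # 50 hypothetical (patient, label) pairs
scores = random_split_cv(data, toy_evaluate)
low, high = percentile_interval(scores)
```

In a real experiment `toy_evaluate` would train a candidate workflow on `train` and return, for example, the area under the ROC curve on `test`.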

Results: The CIs of the area under the ROC curve were (0.86, 0.99) for liver tumors, (0.74, 0.85) for

brain tumors, and (0.74, 0.93) for lipomas/liposarcomas.

Conclusion: The results in three different applications demonstrate that WORC is a promising

approach for fully automatic construction of optimal radiomics workflows.

Acknowledgements Martijn Starmans acknowledges funding from the research program STRaTeGy (project number

14929‐14930), which is (partly) financed by the Netherlands Organisation for Scientific Research

(NWO). Sebastian R. van der Voort acknowledges funding from the Dutch Cancer Society (Koningin

Wilhelmina Fonds (KWF) project number EMCR 2015‐7859).

Keywords: workflow optimization, automatic algorithm selection, radiomics, machine learning,

oncology



Fastr workflow engine for reproducible and managed large‐scale processing

Hakim Achterberg (1), Marcel Koek (1), Adriaan Versteeg (1), Mahlet Birhanu (1), Martijn Starmans

(1), Thomas Kroes (2), Esther Bron (1), Wiro Niessen (1, 3)

(1) Biomedical Imaging Group Rotterdam, Department of Radiology & Nuclear Medicine, Erasmus

Medical Center, Rotterdam, the Netherlands, (2) Division of Image Processing (LKEB), Department of

Radiology, Leiden University Medical Center, Leiden, the Netherlands, (3) Department of Imaging

Physics, Faculty of Applied Sciences, Delft University of Technology, Delft, the Netherlands

Within the life‐science domain, much of the processing is not a single executable that is run, but a

combination of many executables that need to be run in a specific environment. Traditionally, this

was handled with shell scripts. However, with the increasingly complex analyses and size of the data,

this solution has reached its limits. There is a strong trend towards distributed execution of pipelines

in processing environments such as a cluster or cloud, which requires special orchestration.

Workflow engines like Fastr formalize how data flows between processing steps. This allows for validation of the workflow before and during execution.

Fastr has been designed with consolidated workflows in mind. To this end there are a number of

important features: 1) management of tool versions, 2) data‐provenance, 3) workflow and

intermediate results validation to pinpoint errors on occurrence, and 4) visualization of the execution of a workflow using PIM. Fastr allows tracking and using different versions of tools, which ensures reproducibility even in the future, when tools have been updated. The provenance model ensures that there is a complete audit trail for all processed data. The complete definition of tools allows the system to check whether the workflow is valid before it is run, by comparing the data types of connected steps. During

execution the system automatically checks if valid results have been created for each step to detect

errors early on instead of propagating them. Finally, Fastr has a plugin to submit progress of a

workflow run to Pipeline Inspection and Monitoring (PIM) to give a visual representation of the run

via a web interface. Fastr has been used for processing >10⁵ imaging sessions, leading to >10⁷ jobs

being executed on a cluster. In conclusion, Fastr is a robust workflow system enabling reproducible,

managed workflows.
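The idea of validating a workflow before it is run, by comparing the data types of connected steps, can be illustrated with a small stand‐alone sketch. This deliberately does not use the Fastr API: the tool names, port names, data type names and the `validate` function are all hypothetical.

```python
# Hypothetical tool definitions: each tool declares typed inputs and outputs.
TOOLS = {
    "dicom_to_nifti": {"inputs": {"scan": "DicomDir"},
                       "outputs": {"image": "NiftiImage"}},
    "brain_mask": {"inputs": {"image": "NiftiImage"},
                   "outputs": {"mask": "NiftiImage"}},
    "volume": {"inputs": {"mask": "NiftiImage"},
               "outputs": {"ml": "Float"}},
}


def validate(links):
    """Check every link (src_tool, output, dst_tool, input) for matching
    data types and return a list of human-readable errors."""
    errors = []
    for src, out, dst, inp in links:
        out_type = TOOLS[src]["outputs"].get(out)
        in_type = TOOLS[dst]["inputs"].get(inp)
        if out_type is None or in_type is None:
            errors.append(f"{src}.{out} -> {dst}.{inp}: unknown port")
        elif out_type != in_type:
            errors.append(
                f"{src}.{out} ({out_type}) -> {dst}.{inp} ({in_type}): type mismatch"
            )
    return errors


# A valid chain, and a link that connects a Float output to an image input.
good = [("dicom_to_nifti", "image", "brain_mask", "image"),
        ("brain_mask", "mask", "volume", "mask")]
bad = [("volume", "ml", "brain_mask", "image")]
```

Running `validate(good)` returns an empty list, while `validate(bad)` reports the mismatch before any job would be submitted.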

Acknowledgements BBMRI‐NL 2.0

Keywords: pipeline, processing, HPC



Quantitative Imaging Biomarker Storage and Compute Infrastructure

Marcel Koek (1), Hakim Achterberg (1), Adriaan Versteeg (1), Mahlet Birhanu (1), Henri Vrooman (1), Thomas Phil (1), Thomas Kroes (2), Aad van der Lugt (1), Wiro Niessen (1,3)

(1) Department of Radiology & Nuclear Medicine, Erasmus MC, Rotterdam, the Netherlands, (2)

Division of Image Processing (LKEB), Department of Radiology, Leiden University Medical Center,

Leiden, the Netherlands, (3) Department of Imaging Physics, Faculty of Applied Sciences, Delft

University of Technology, Delft, the Netherlands

For extracting Quantitative Imaging Biomarkers (QIBs) from population or cohort based imaging

studies, a storage and compute infrastructure is essential. To make these QIBs meaningful, they can

be related to other data in (biobank) repositories. For this, we developed an infrastructure where

standardized image‐analysis pipelines can be managed and manual annotations and inspections can

be performed on large medical imaging datasets. The infrastructure is currently being used in

multiple population imaging studies. The QIBs and processed data can be linked to study and genetic

data, creating more comprehensive biobank repositories.

The infrastructure can be divided into three main components:

* Medical imaging data storage using XNAT

* Compute infrastructure

* Data and workflow management services

The compute infrastructure is developed to work in cloud environments and HPC clusters. The Fastr

workflow engine and PIM inspection and monitoring are the key components. These tools interface

with each other through REST APIs. Fastr is able to interact with the cluster and cloud environments for executing jobs, and with data storage directly, through extensions.

The data and workflow management services are a collection of tools and services to manage the

data flow, including manual interaction with the data, in an automated fashion. The key components

are the Study‐Governor for automatically managing the data flow, the Task‐Manager for task‐based manual interaction with the data, and the ViewR, which researchers use to interact with the data on XNAT based on the tasks served by the Task‐Manager. These components interface with each other through REST APIs.

Our infrastructure can greatly benefit personalized medicine by making pipelines for imaging

biomarker extraction available to researchers and clinicians. Additionally, we create a reference

database for different imaging biomarkers, which can be used to compare an individual against the

general population. This will enable improved re‐use of imaging data for diagnostics and prognostics.
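As a minimal sketch of what comparing an individual against a reference database could look like, the snippet below computes the percentile rank of one biomarker value within a set of reference values. The function name and all numbers are hypothetical; a real reference database would typically also account for age, sex and acquisition differences.

```python
from bisect import bisect_right


def percentile_rank(reference_values, value):
    """Percentage of reference values that are less than or equal to value."""
    ordered = sorted(reference_values)
    return 100.0 * bisect_right(ordered, value) / len(ordered)


# Hypothetical reference biomarker values, e.g. a structure volume in ml.
reference = [410.0, 425.0, 430.0, 445.0, 450.0,
             460.0, 470.0, 480.0, 495.0, 510.0]

# Three of the ten reference values are at or below 432.0.
rank = percentile_rank(reference, 432.0)
```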

Acknowledgements BBMRI‐NL2.0, EPI2, EuroBioImaging

Keywords: Quantitative Imaging Biomarker, IT infrastructure, Population Imaging



Streamlining manual tasks in large medical imaging studies

Adriaan Versteeg (1), Hakim Achterberg (1), Mahlet Birhanu (1), Henri Vrooman (1), Marcel Koek

(1), Aad van der Lugt (1), Wiro Niessen (1,2)

(1) Department of Radiology & Nuclear Medicine, Erasmus MC, Rotterdam, the Netherlands, (2)

Department of Imaging Physics, Faculty of Applied Sciences, Delft University of Technology, Delft, the

Netherlands

Medical imaging studies, even with all the automated methods, still require manual work: to assure the quality of input data (QA), to control the output quality of the automated pipelines (QC), and to create annotations that can be used to develop machine/deep learning algorithms.

We developed two applications to streamline the manual tasks, leading to increased productivity, reproducibility and quality. These applications (ViewR and Task Manager) work together: the Task Manager keeps track of the work to be performed and the ViewR is used by users to perform the work.

This ensures that the tasks are performed by a specific person or by a person with a specific skill set

and that this person always has the correct images and tools available for these tasks.

Each task contains the location of the data and a ViewR template. The ViewR uses this template to set up the layout, editor capabilities and electronic Case Report Form to be filled in. Furthermore, to

improve ease of use, tasks are loaded by the press of a single button and all images are preloaded so

loading a new task takes seconds instead of minutes.
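The task concept described above (a data location plus a ViewR template, assigned to a specific person or to anyone with a specific skill) could be modelled roughly as follows. This is an illustrative sketch only: the `Task` fields, the `next_task` function, the URIs and the template names are hypothetical and do not reflect the actual Task Manager data model or API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    data_location: str                     # where the images live (hypothetical URI)
    template: str                          # name of the ViewR template to apply
    assigned_user: Optional[str] = None    # restrict the task to one person
    required_skill: Optional[str] = None   # or to anyone with this skill
    done: bool = False


def next_task(tasks, user, skills):
    """Return the first open task this user may perform, or None."""
    for task in tasks:
        if task.done:
            continue
        if task.assigned_user is not None and task.assigned_user != user:
            continue
        if task.required_skill is not None and task.required_skill not in skills:
            continue
        return task
    return None


queue = [
    Task("xnat://project/subject001", "qc_template", required_skill="radiologist"),
    Task("xnat://project/subject002", "annotation_template", assigned_user="alice"),
]
```

A viewer would then request `next_task(queue, user, skills)`, load the images at `data_location`, and configure itself from `template`.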

The combination of the Task Manager and the ViewR can be used for:

▪ QA/QC in large cohort studies

▪ Incidental Findings

▪ Manual annotations for Machine learning

▪ Inter/Intra rater comparison

The Task Manager and ViewR combination has been successfully used in the Rotterdam Scan Study to perform both incidental finding review and QC on approximately 2000 subjects. It has been used for the CVON Heart Brain Connection project to mark/outline infarcts on 500+ subjects. It has been used for inspection tasks for more than 50,000 scans.

Acknowledgements BBMRI‐NL2.0, EPI2, EuroBioImaging

Keywords: Population Imaging, Infrastructure, Machine learning



Linkage of Lifelines and PALGA data: Enhancing multidisciplinary research

Rinus Voorham PhD (1), Annette Gijsbers PhD (1), Aafje Dotinga PhD (2), Trynke de Jong PhD (2)

(1) PALGA, (2) Lifelines cohort study

Multiple longitudinal data‐collections exist in the Netherlands with different levels of maturity, either

developed specifically for scientific research purposes (such as Lifelines) or for patient care purposes

(such as PALGA). Data linkage between these collections on the individual level is a powerful tool to

combine general health and lifestyle information with specialized clinical results, creating new

possibilities for multidisciplinary research.

Lifelines is a large, population‐based cohort study and biobank with 167,000 participants in the northern part of the Netherlands, including three‐generation families. Lifelines allows linkage between its own data and datasets from other data registries, including PALGA, in order to facilitate scientific research in the field of healthy ageing.

PALGA, the nationwide network and registry of histo‐ and cytopathology in the Netherlands, contains over 85 million pathology records, malignant and benign, from all Dutch pathology laboratories. Furthermore, PALGA is the link between the investigator and the laboratories in case of requests for pathology materials.

Data linkage with respect to privacy

Lifelines, PALGA, and ZorgTTP designed pseudonymization and encryption procedures that make linkage through Lifelines and PALGA personal identifiers possible in compliance with Dutch privacy protection laws. The linked (FAIR) dataset, consisting of PALGA and Lifelines personal data matched

at the individual level by ZorgTTP, via PALGA pseudonyms and devoid of any Personal Identifiers, is

available on request through the secure remote access environment of Lifelines.
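The general principle of linking two datasets on pseudonyms rather than on personal identifiers can be sketched as below. This is not the procedure used by ZorgTTP, Lifelines or PALGA, whose details are not given here; it is a generic illustration that assumes a keyed one‐way hash (HMAC‐SHA256) computed with a secret held by a trusted third party. All identifiers, field names and the key are hypothetical.

```python
import hashlib
import hmac


def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """Keyed one-way hash of a personal identifier (HMAC-SHA256)."""
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def link(cohort_rows, registry_rows, secret_key):
    """Join two datasets on the pseudonym; identifiers are not in the output."""
    registry = {pseudonymize(pid, secret_key): rec for pid, rec in registry_rows}
    linked = []
    for pid, rec in cohort_rows:
        pseudonym = pseudonymize(pid, secret_key)
        if pseudonym in registry:
            linked.append({"pseudonym": pseudonym, **rec, **registry[pseudonym]})
    return linked


key = b"held-by-the-trusted-third-party"  # hypothetical secret
cohort = [("id-1", {"smoker": False}), ("id-2", {"smoker": True})]
registry = [("id-2", {"diagnosis": "benign"}), ("id-3", {"diagnosis": "malignant"})]
result = link(cohort, registry, key)
```

Only the person present in both datasets ends up in `result`, and the linked record carries the pseudonym instead of the original identifier.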

Useful for multidisciplinary research

The linkage of the PALGA pathology dataset with the Lifelines data on health, lifestyle and

demographics enables 1) linkage on request, 2) continuous update of data and 3) required privacy

protection of participants and patients. Ultimately, the linkage of Lifelines and PALGA databases will

strongly stimulate (prospective) multidisciplinary research aiming at personalized medicine and

health, leading to improvements in health care and disease prevention.

Acknowledgements ZorgTTP

Keywords: Data linkage, Lifelines, PALGA, multidisciplinary research



Towards FAIR Data Steward as profession for the Lifesciences

Salome Scholtens (1), Mijke Jetten (2), Jasmin Böhmer (3), Christine Staiger (4), Inge Slouwerhof

(2), Marije van der Geest (1), Margreet Bloemers (5), Ingeborg Verheul (6), Celia W.G van Gelder (4)

(1) Genomic Coordination Centre, UMCG, Groningen, (2) Radboud University Nijmegen, (3) UMC

Utrecht, (4) Dutch Techcentre for Life Sciences (DTL), Utrecht, (5) ZonMw, (6) LCRDM

Data stewardship expertise is essential in research. However, the lack of consensus on the function

profile of data stewards hampers adequate data steward capacity building in organisations. In our

ZonMw funded project entitled “Towards FAIR Data Steward as profession for the

Lifesciences”(Aug18‐Jul19), we delivered community‐endorsed job descriptions (including

responsibilities and tasks) and an agreement on the required knowledge, skills and abilities (KSAs) for

data stewards. To enable tailored data stewardship training, detailed learning

objectives were also formulated.

Our analysis of the data stewardship landscape shows three different, partly overlapping, data

stewardship roles which all have their own focus: policy, research and infrastructure

(https://doi.org/10.5281/zenodo.3243909). Furthermore we have identified 8 competence areas for

the data steward: Policy/strategy, Compliance, Alignment with FAIR data principles, Services,

Infrastructure, Knowledge management, Network, Data archiving. We have formulated the

responsibilities, tasks, KSAs and learning objectives per competence area. Our three matrices (one

for each data stewardship role) can be found at https://doi.org/10.5281/zenodo.3239079. Our final

report and all other documents are available at https://zenodo.org/communities/nl‐ds‐pd‐ls/.

We have formulated recommendations related to a) embedding data stewardship roles in university

function profiles, b) developing a self‐assessment tool for the competencies, KSAs and learning

objectives, and c) developing and implementing training. Since September 2019, we are continuing

our work in the context of the National Platform Open Science NPOS

(https://www.openscience.nl/en/projects/project‐f‐education‐and‐training‐in‐open‐science‐and‐

datastewardshop). This new project focuses on professionalizing data stewardship competences and

training. We will build on our previous work as well as on the outcomes of another recent Dutch data

stewardship project from LCRDM (https://doi.org/10.5281/zenodo.2669149). Major partners in the

project are ZonMw, LCRDM, VSNU, Vereniging van Hogescholen, PNN, and SURF.

Acknowledgements This project is funded by the ZonMw Personalised Medicine Programme (Dossier number: 80‐84600‐98‐3007),

Zilveren Kruis and KWF Kankerbestrijding. Additional funding was provided by UMCG,

UMCU, Radboud University Nijmegen, Radboudumc and DTL/ELIXIR‐Netherlands

Keywords: Data stewardship, Research Data Management, Training, Capacity Building, Competences

Skills, FAIR



Self‐initiated donation to a biobank. Should and could biobanks offer

this option?

E. Vermeulen (1, 2), T. Schaaij‐Visser, E. Eijdems (2), S. Rebers (2), M. Kaatee (2), H. Schipper (2), G.

Remmers (2) on behalf of the Maatschappelijke Raad Biobankonderzoek.

(1) VSOP, (2) MAB

Introduction: When asked, citizens are very willing to donate tissues to biobanks. Due to increasing

awareness of biobank research and ‘citizen science’ with data tracking, people also consider self‐

initiated donation to biobanks.

Normally, most biobanks do not inform the public about the option of self‐initiated donation. By

providing (international) examples and starting a discussion, the Patient & Public Advisory Council

(PPAC) for Biobank Research (installed by BBMRI.nl) would like to stimulate Dutch biobanks to offer

(information and a procedure about) this option to the public.

Materials and methods: This work was initiated by questions posed by rare disease patient

representatives in the PPAC. Their question resonated with other patient organisations, such as

MD|OG (http://mdog.nl/). A horizon scan was done to collect information and possible procedures,

and to make an inventory of Dutch and other biobanks that welcome self‐initiated donation.

Results: Only a few Dutch biobanks facilitate self‐initiated donation. Self‐initiated donation is possible

under specific circumstances. Some biobanks offer information on donating, for example, brain tissue

post mortem (https://www.mscnn.nl/doneren/).

Some biobanks offer donors the option to donate while they are alive. The International

Fibrodysplasia Ossificans Progressiva Association offers people the option to donate

(https://www.ifopa.org/biobank). The Luxembourg biobank (IBBL) offers the option for ‘healthy

controls’ to donate: https://www.biobank.lu/research‐programmes/general‐population/?lang=en.

Tools that can be used to inform citizens are Orphanet and RD‐connect.

Discussion: We conclude that there are several, yet still limited, options to offer information and

procedures for self‐initiated donations.

Advantages of self‐initiated donation are:

‐ it increases inclusion of citizens

‐ advertisement of the work of the biobank

‐ increased engagement of citizens in biobank research

To further discuss the possibilities and limitations, we would like to invite all biobanks to visit our poster and

provide their feedback.

Acknowledgements We would like to thank BBMRI.NL

Keywords: Public, patients, donation, engagement



CBS Microdata Services

Fatima El Messlaki (1), Anouk de Rijk (1)

(1) CBS

Statistics Netherlands (CBS) collects data from people, companies and institutions. Upon receipt of

these data, all directly identifying personal details are removed as soon as possible and replaced by a

pseudo key. CBS uses these so‐called pseudonymised data to conduct statistical research. CBS will

never supply identifiable data to third parties. However, (academic) research institutions may, under

strict conditions, be given access to pseudonymised microdata. Microdata are linkable data at the

level of individuals, companies and addresses.

CBS offers a wide range of administrative health care data that can be reused for microdata research.

Combined with surveys and other available data sources with demographic and socio‐economic

variables this offers numerous possibilities for research on health, life style and use of care.

CBS Microdata Services facilitates this use of microdata by employees of authorized research

institutions by giving them access to the data from a secure workplace via a secure internet

connection. The requirements for obtaining an authorization are that the institute’s primary

objective is doing statistical or scientific research and that results of their research are accessible for

the general public. Researchers get access to the datasets necessary to answer their research

question, do their analyses and save their results. Under specified conditions it may also be possible

to link your own microdata set on persons or companies.

Within the RA, research and applications are possible in health care economics, epidemiology and public

health, and the evaluation of diagnostic and treatment protocols.

Via the ODISSEI Microdata Access Discount, employees of ODISSEI can get a discount of up to 50% on

their CBS Microdata projects. ODISSEI also supports the development of the ODISSEI Secure

Supercomputer, where researchers can analyse their data linked to CBS Microdata in SURFsara’s

high‐performance computing environment.

Acknowledgements not applicable

Keywords: Microdata reused data healthcare registers surveys Super computer CBS Statistics

Netherlands



Ethics review for non‐invasive (nWMO) health research: moving

towards a shared approach

Martin Boeckhout (1), Miriam Beusink (2), Susanne Rebers (2), Evert‐Ben van Veen (1), Lex Bouter

(3), Irith Kist (2), Marjanka K. Schmidt (2, 4)

(1) MLC Foundation, (2) NKI‐AVL, (3) Free University Amsterdam & AUMC‐Vumc, (4) LUMC

The Wet Medisch‐wetenschappelijk onderzoek met mensen (WMO) provides the regulatory

framework for invasive health research, mandating ethics review for all research falling under its

remit. However, non‐invasive and data‐driven health research generally fall outside the scope.

Upholding standards of research quality, privacy and data protection measures, as well as

protection of research and data subjects’ rights and interests, is just as important for so‐called

nWMO research. However, an overarching national framework for ethics review is currently lacking.

What kinds of research are included under the heading of nWMO research? How is ethics review for

such research currently organized and conducted? What issues are currently at stake, and how can

ethics review be improved upon? Using a mixed‐methods approach involving literature review,

ethical and legal analysis, a series of interviews with ethics committees and stakeholders as well as a

workshop, we set out to answer these questions for an exploratory report commissioned by the

Dutch Ministry of Health. This poster presentation will present the initial findings.

All main parties involved in Dutch health research subscribe to the importance of ethics review for

responsible, high‐quality health research. Ethics review is common, but suffers from high

inter‐institutional diversity and fragmentation. Potential ways forward include the drawing up of guidelines

for privacy and data protection, clarification of the burden research can demand from participants

outside the WMO as well as shared risk classifications on which to base organizational policies for

efficient and broadly comparable organization, procedures and formats for ethics review. Broadening

of the scope of the WMO to also include non‐invasive and data‐driven health research received much

support, but will require considerable time and preparation. In the meantime, concerted

collaboration and self‐regulation involving all parties involved in health research provide the most

promising way forward.

Acknowledgements This poster will present the findings of an exploratory report commissioned by the Dutch Ministry of

Health.

Keywords: ethics review, ELSI, privacy, WMO



Metadata matters

Evelien van der Schaaf‐de Wolf (1), Erik van Iperen (1), Joost Daams (1), Rudy Scholte (1), Silvia

Olabarriaga (1)

(1) Amsterdam UMC, location AMC

When a research project is started, funders and institutions request a Data Management Plan (DMP).

In such a DMP, researchers are asked to indicate which metadata standard they will use in order to

enhance data availability for reuse. However, so far no metadata standard has been widely accepted

and adopted for medical data. The DataCite metadata schema is available; however, it does not cover

specific information for medical research. The Amsterdam UMC is therefore creating a metadata

schema for medical data.

To comply with standards that already exist, our first step was to select items from the DataCite

metadata schema that were considered meaningful for medical research. The second step was to

identify domains that use similar data types within the Amsterdam UMC. For each domain, an expert

was asked to determine a minimal set of metadata.

This resulted in two main metadata categories: the data collection level and the domain‐specific

level. The data collection level consists of a subset of DataCite items supplemented with additional

items that are necessary as a useful description for medical data collections. The domain‐specific

level consists of items for the following: subject data, biosamples, genetic data, images and signals,

and questionnaires. To facilitate initial adoption it is not mandatory to use all items, but it is

recommended to indicate as many as possible.
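The two-level structure described above could be represented along the following lines. This is only a sketch: `title`, `creator` and `publication_year` mirror DataCite properties, but the abstract does not list which DataCite items were selected, and `study_design` as well as the example values are hypothetical placeholders.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Domain-specific levels named in the abstract.
DOMAINS = ("subject data", "biosamples", "genetic data", "images and signals", "questionnaires")


@dataclass
class CollectionMetadata:
    # Data collection level: a subset of DataCite items (assumed selection) ...
    title: str
    creator: str
    publication_year: Optional[int] = None
    # ... supplemented with an additional item for medical data (hypothetical).
    study_design: Optional[str] = None
    # Domain-specific level: free-form items per domain.
    domain_items: Dict[str, Dict[str, str]] = field(default_factory=dict)

    def describe_domain(self, domain: str, items: Dict[str, str]) -> None:
        """Attach domain-specific items; unknown domains are rejected."""
        if domain not in DOMAINS:
            raise ValueError(f"unknown domain: {domain}")
        self.domain_items[domain] = dict(items)

    def filled_optional_items(self) -> int:
        """Count optional collection-level items that have been provided."""
        optional = (self.publication_year, self.study_design)
        return sum(value is not None for value in optional)


record = CollectionMetadata(title="Hypothetical cohort study", creator="Hypothetical research group")
record.describe_domain("biosamples", {"sample_type": "plasma"})
```

Optional items default to `None`, which reflects the choice not to make every item mandatory during initial adoption.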

In the near future the metadata schema will be tested at the Amsterdam UMC through pilot projects.

Researchers who started their project about four years ago will be asked to use the metadata

schema to describe their study data. In the long term, the vision is to implement the metadata

schema in a research data management tool (e.g. iRODS) in which all Amsterdam UMC research data

can be archived.

Acknowledgements We acknowledge the contribution of the domain experts, Paul Groot and Aldo Jongejan, and the feedback

received from the Data Matters Expert Group of the Amsterdam UMC.

Keywords: metadata schema, data management plan (DMP), archiving, medical data



Getting started with trusted FAIR data lakes

Erik Flikkenschild (1), Marlon Domingus (2)

(1) LUMC, chairman NL‐SIG veilige datakoppelingen, (2) Erasmus University Rotterdam, Data

Protection Officer

Introduction

Building trustworthy multipurpose data lakes requires a multifunctional (legal, ethical and technical)

approach. Pooling pseudonymised data is complex and time‐consuming, and experts must deal with

substantial privacy risks. Regrettably, legal discussions do not start from community‐accepted

ethical viewpoints, and the discourse does not do justice to the different perspectives because their

different scopes go unrecognized. Expert IT working groups, for example, typically do not have a

commonly accepted legal basis to start with, and tend to solve domain‐specific challenges, thus

creating silos.

Strategy

The authors of this paper are convinced that progress will be made if we start with a community

accepted ethical viewpoint first, in which common interest and privacy concerns are balanced. These

ethical principles shall subsequently be validated against the GDPR norms and formulated

in a common interest NL GDPR manifest. IT architects can then add technical safeguards in order to

secure the required trust. Everyone has to take into account the different perspectives (Open

Science, Artificial Intelligence, Personalized Medicine, etc.).

Methods

Data lake design principles (a trusted linked data framework) are to be discussed based on the results of the

ethical guiding principles. With a POC one can take a first small step and create an IT architecture for

each perspective. Evaluation of these three designs by existing NL communities can validate this

collaborative approach and decide on suitable measures.

Results

A (national) trusted linked data framework and method that can be used to build trustworthy data

lakes.

Follow‐up

The creation of a national multi‐disciplinary multi‐domain working group that specifies the trusted

linked FAIR data framework.

Acknowledgements LCRDM working groups, SIG Veilige koppelingen members, NFU Good Research Practice working

group 6

Keywords: data infrastructures, Trusted data datalakes, ethics, personal data linkage, Open Science, Personalized Medicine, Artificial Intelligence, AVG, GDPR, data driven, ELSI



The real world nature of Prospective Dutch ColoRectal Cancer cohort

(PLCRC)

Jeroen W.G. Derksen (1), Merel Wassenaar (1), Marloes A.G. Elferink (2), Jeanine M.L. Roodhart

(1), Anne M. May (3), Miriam Koopman (1), Geraldine R. Vink (1,2), on behalf of the PLCRC Working

Group

(1) Department of Medical Oncology, University Medical Center Utrecht, Utrecht, Netherlands, (2)

Department of Research, Netherlands Comprehensive Cancer Organisation, Utrecht, Netherlands, (3)

Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht,

Netherlands

Background: Large high‐quality population‐based cohort studies are of tremendous value to support

real‐world data studies and improve treatment outcomes. In 2013, the Prospective Dutch ColoRectal

Cancer cohort (PLCRC) was initiated combining different data sources collecting longitudinal clinical

data, patient‐reported outcomes and biomaterial. PLCRC serves as an infrastructure for a wide range

of observational and interventional studies to improve outcomes of colorectal cancer (CRC) patients.

Here, we investigate whether PLCRC evolves in the direction of a nationwide cohort of real world

nature.

Methods: All CRC patients 18 years and older are eligible for PLCRC. Clinical and demographical data

of PLCRC participants, as collected in the Netherlands Cancer Registry, are compared with the total

Dutch CRC population over the period 2013‐2017 (reference population), also obtained from the

Netherlands Cancer Registry. Cohort characteristics are described and populations compared.

Results: In August 2019, 5,722 patients were enrolled, of whom 4,759 (83%) had a complete TNM

stage classification and were included in the analyses. Compared to participants enrolled in 2013‐2016

(N = 1,088), we found a small shift of the 2017‐2019 population (N = 3,671) towards the Dutch

reference population (N = 72,685) in terms of age at diagnosis (mean 64.6±10.2 years in 2013‐16,

65.0±10.2 in 2017‐19, and 69.3±10.8 in the ref. group), sex (65% males in 2013‐16, 61% in 2017‐19,

and 57% in the ref. population), location of primary tumor (56% rectum in 2013‐16, 39% in 2017‐19,

and 31% in the ref. population) and TNM stage (34% stage I‐II in 2013‐16, 42% in 2017‐19, and 49% in

the ref. population).

Conclusion: Over the past years, enrolment in PLCRC steeply increased. Improvements in

recruitment and multidisciplinary enrolment of patients have enhanced PLCRC’s representation of the

real world. This helps to learn from today’s patients to enable personalized therapy facilitating better

outcomes for future CRC patients.

Acknowledgements n.a.

Keywords: infrastructure, cohort, real‐world data, personalized treatment, data collection



HOVON Pathology Facility and Biobank: Making the right choices for

workflow and data

Nathalie J. Hijmering (1, 2), Erik van Iperen (3), Phylicia Stathi (1, 2), Dirk Veldman (4), Paula

Rinkens (4), Karin Aretz (4), Rita Azevedo (5), Martine Chamuleau (6), King H. Lam (7), D. De Jong

(1)

(1) Department of Pathology, AmsterdamUMC, location VUmc, Amsterdam, The Netherlands, (2)

HOVON Pathology Facility and Biobank, AmsterdamUMC, location VUmc, Amsterdam, The

Netherlands, (3) Durrer Center for cardiovascular research, Netherlands Heart Institute, Utrecht, The

Netherlands, (4) MEMIC, Maastricht University, Maastricht, The Netherlands, (5) Lygature, Utrecht,

The Netherlands, (6) Department of Hematology, AmsterdamUMC, location VUmc, Amsterdam, The

Netherlands, (7) Department of Pathology, Erasmus University Medical Center, Rotterdam, The

Netherlands

Background

Central pathology review and translational studies on tissue biopsy material are an integral part of

clinical trials for malignant lymphoma patients performed by HOVON (Haemato‐Oncology

Foundation for Adults in the Netherlands). The HOVON Pathology Facility and Biobank (HOP)

supports all pathology‐related activities from requesting and processing the pathology material until

support of side studies. Optimized logistics are required to improve the quality and speed of

pathology review and successfully accommodate translational research.

Methods and Results

We have developed a customized, web‐based platform to monitor requesting, dispatching, collecting

and processing of bio‐specimens and storage of all compiled biomarker data within the TraIT

(Translational Research IT) infrastructure. Four years of experience now shows that complete

pathology review results can be made available within weeks after completion of the trial, including

molecular information. However, the current system is limited by its frozen design after launch,

restricted options for integration with external tools (double data‐entry) and suboptimal data export

options. Therefore, we designed a future‐proof platform, based on our experience. We separated the

logistical platform (Ldot) from the data storage (Castor), thereby introducing flexibility. Both

platforms are highly suitable to be set up by the user with support from MEMIC and Castor. Ldot

provides a user‐friendly overview of ongoing actions supporting the daily workflow. Both platforms

are optimized for integration to receive and export data from/to external tools, such as ALEA,

EXCEL/SPSS.

Conclusions

Future‐proof workflow platforms such as the HOP benefit from a flexible, modular design that can be

fully set‐up and maintained by the user. Various tools within the design should be selected for user‐

friendly daily workflow support and integrative options with external systems for data import and

export to optimize performance for direct trial‐related actions as well as for future research

applications according to FAIR principles.

Acknowledgements No

Keywords: Workflow support, Clinical trials, Pathology, Lymphoma, TraIT, Ldot, Castor



Public radiomics data collections in an open access Semantic Web

(SPARQL) endpoint

Petros Kalendralis (1), Zhenwei Shi (1), Chong Zhang (1), Ananya Choudhury (1), Alberto Traverso

(1), Matthijs Sloep (1), Johan Van Soest (1), Rianne Fijten (1), Andre Dekker (1), Leonard Wee (1)

(1) GROW School for Oncology and Developmental Biology‐ Maastricht University Medical Centre+,

Department of Radiation Oncology MAASTRO, Maastricht, The Netherlands

Purpose or Objective

In a groundbreaking investigation, Aerts et al. (1) showed that quantitative imaging features

(radiomics) could potentially be used to decode information about tumour phenotype that is

relevant to disease prognosis. This publication has been the subject of intense interest ever since,

and there have been numerous requests for more information about the datasets – RIDER,

Interobserver, Lung1 and Head‐Neck1. To support research into repeatability, reproducibility,

generalizability and explainability in radiomics, we have now made the clinical follow‐up, extracted

pyRadiomics (2) features and DICOM metadata findable, accessible, interoperable and re‐usable

(FAIR) through a public semantic web access point (http://sparql.cancerdata.org).

Material and Methods

Overall survival intervals (days since start of radiotherapy) have been updated through the Dutch

national registry under an internal ethics board‐approved request. Spatially incorrect offsets of the

Primary Gross Tumour Volume (“GTV‐1”) regions of interest (ROIs) in the Lung1 set were amended in

The Cancer Imaging Archive (TCIA) collection. Image features were extracted using the

ontology‐guided radiomics workflow (3) (O‐RAW) and published in Resource Description Framework (RDF) format,

consistent with the Image Biomarker Standardization Initiative (IBSI) through an open Radiomics

Ontology. DICOM metadata as RDF was extracted using a research version of Semantic DICOM

(SoHard, GmbH, Fuerth; Germany). Clinical data was published in RDF using the Radiation Oncology

Ontology. Example queries were tested, which verified that the SPARQL endpoint was accessible.

Conclusion

We successfully generated separate RDF repositories of clinical, DICOM and radiomics data and

published these on an open access SPARQL endpoint. We can effortlessly cross‐reference into the

clinical, DICOM and radiomics data. Queries can be generated that simultaneously look in all three

repositories, thus taking advantage of the semantic linking between the data elements.
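As an illustration of how such an endpoint can be addressed, the sketch below prepares a SPARQL query as an HTTP GET request in the style of the SPARQL 1.1 Protocol. The query is deliberately generic because the predicates of the Radiomics Ontology and the Radiation Oncology Ontology are not given in this abstract, and using the bare address above as the query URL is an assumption; the actual query path may differ.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Generic query: list a few triples without assuming any ontology terms.
GENERIC_QUERY = """
SELECT ?subject ?predicate ?object
WHERE { ?subject ?predicate ?object }
LIMIT 10
"""


def build_sparql_request(endpoint: str, query: str) -> Request:
    """Prepare (but do not send) a SPARQL query as an HTTP GET request."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


# Assumed endpoint URL, taken from the address quoted in the abstract.
request = build_sparql_request("http://sparql.cancerdata.org", GENERIC_QUERY)
```

Passing `request` to `urllib.request.urlopen` would send the query; the response format depends on what the endpoint supports.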

References

1: PMID: 24892406

2: PMID: 29092951

3: PMID: 31580484

Acknowledgements Clinical Data Science group‐Maastricht University Medical Centre+, Department of Radiation

Oncology MAASTRO, Maastricht, The Netherlands.

Keywords: Radiomics, public datasets, reproducibility, FAIR data



The FAIRification of clinical data with modular knowledge graphs.

Matthijs Sloep (0000‐0003‐3602‐1885) (1), Petros Kalendralis (1), Johan van Soest (0000‐0003‐2548‐0330) (1),

Rianne Fijten (0000‐0002‐1964‐6317) (1), Andre Dekker (0000‐0002‐0422‐7996) (1)

(1) Department of Radiation Oncology (MAASTRO), GROW school for Oncology and Developmental

Biology, Maastricht University Medical Centre+, Maastricht, The Netherlands

Every day, physicians manually collect and enter information about their patients in electronic records

and turn it into some of the most expensive data available. Combining data from multiple centres is

vital for good research; unfortunately, linking it is not straightforward due to differences in treatment

protocols and clinical systems. Consequently, before we can solve any medical and research

problems, we first need to solve data integration problems. ProTRAIT is an effort to link patient

data from multiple radiotherapy centres to build a comprehensive national registry and research

database for proton beam therapy. We turned to the FAIR principles

(https://doi.org/10.1038/sdata.2016.18) to facilitate data integration. We address the FAIR principles

using Semantic Web technology and describe the challenges we faced and our solutions to

these issues.

Our approach to the FAIRification process is akin to the steps described by the Go‐FAIR initiative.

Radiotherapists first listed specific clinical items, then we manually created a knowledge graph and

annotated it with existing and new ontology classes using the Radiation Oncology Ontology

(https://doi.org/10.1002/mp.12879). Our approach means that much effort goes into creating and

maintaining the knowledge graphs. Each cancer type requires a separate, manually created

knowledge graph. To facilitate this process, we made full use of the considerable overlap between

information elements by creating separate Turtle files for small subsets of information elements to

leverage the modular characteristics of knowledge graphs. The challenge was to arrange these items

in subsets that are both logical and practical, but the big advantage of this modular approach is that

we can easily adapt and add more variables when needed to adjust our graphs to changes in the

clinic.
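The modular idea can be sketched as follows: small reusable modules of triples are composed into one graph per cancer type and serialized as Turtle. The sketch uses plain Python sets instead of an RDF library, and the `ex:` prefix, class names, module names and cancer types are all hypothetical; they are not terms from the Radiation Oncology Ontology.

```python
PREFIXES = "@prefix ex: <http://example.org/> ."

# Each module is a small, reusable set of (subject, predicate, object) triples.
MODULES = {
    "patient": {
        ("ex:Patient", "ex:hasAge", "ex:Age"),
        ("ex:Patient", "ex:hasSex", "ex:Sex"),
    },
    "tumour": {
        ("ex:Patient", "ex:hasTumour", "ex:Tumour"),
        ("ex:Tumour", "ex:hasStage", "ex:Stage"),
    },
    "treatment": {
        ("ex:Patient", "ex:receives", "ex:Treatment"),
    },
}

# Each cancer type reuses the modules it needs.
GRAPHS = {
    "cancer_type_a": ["patient", "tumour", "treatment"],
    "cancer_type_b": ["patient", "treatment"],
}


def compose(module_names):
    """Union the triples of the selected modules into one graph."""
    triples = set()
    for name in module_names:
        triples |= MODULES[name]
    return triples


def to_turtle(triples):
    """Serialize triples as simple Turtle statements, one per line."""
    lines = [PREFIXES, ""]
    lines += [f"{s} {p} {o} ." for s, p, o in sorted(triples)]
    return "\n".join(lines)


graph_a = compose(GRAPHS["cancer_type_a"])
graph_b = compose(GRAPHS["cancer_type_b"])
turtle_a = to_turtle(graph_a)
```

Adding a variable then means editing one module, after which every graph that includes that module picks up the change.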

Acknowledgements KWF Kankerbestrijding

Keywords: FAIR data, Linked Data, Modular, knowledge graphs



Privacy Sensitive Distributed Analysis of Dementia Cohorts from

Hospitals in The Netherlands

Stuti Nayak (1), Ananya Choudhury (1), Matthijs Sloep (1), Inigo Bermejo (1), Johan Van Soest (1),

Andre Dekker (1)

(1) Department of Radiation Oncology (MAASTRO), GROW school for Oncology and Developmental

Biology, Maastricht University Medical Centre+, Maastricht, The Netherlands

Dementia is a multi‐factorial disease that affects around 35 million people around the world. The

prediction is that in the Netherlands the absolute number of patients will increase by 115% in the next 20

years (volksgezondheidenzorg.info). It imposes an enormous burden on society with both the

suffering of patients and their caregivers and the tremendous financial costs.

Although there is ample data around, not all data can be used to predict better outcomes for

patients. Not only technical bottlenecks but also ethical, legal and societal issues impede data sharing

and hinder researchers from using data to its full potential. Also, there is minimal standardization of

data and as such, often data from different hospitals are syntactically and semantically not

interoperable with each other.

However, with the Personal Health Train (PHT), the focus shifts from sharing data to sharing

algorithms. The PHT is agnostic of the actual data and relies heavily on the Findable, Accessible,

Interoperable and Reusable (FAIR) data principles. The current project aims at establishing a FAIR

data‐sharing infrastructure, the Personal Health Train, which connects three hospitals in the Netherlands,

namely EMC (Rotterdam), MUMC (Maastricht) and LUMC (Leiden). The infrastructure will enable

usability of sensitive dementia cohorts from each of these hospitals, without data having to leave the

hospital.

As a proof of concept, we set up two mock FAIR data stations containing a minimal set of variables

such as age, gender, diagnosis, MMSE, smoking, cholesterol, diabetes, BMI and hypertension. We

designed the Dementia Cohort Analysis (DCA) train to analyze the distributed cohort and find

correlations between the variables. It has been shown that we can use sensitive data from multiple

sources using PHT and still adhere to the ELSI of data sharing. These correlations would further lead

to designing and training the machine learning models for risk prediction in dementia patients.
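One way a correlation can be computed across stations without moving patient-level data is to exchange only aggregate sums. The sketch below illustrates that idea for a Pearson correlation; it is not the DCA train itself, and the station contents and function names are hypothetical.

```python
import math


def local_summary(xs, ys):
    """Computed inside one data station; only these aggregates leave it."""
    return {
        "n": len(xs),
        "sum_x": sum(xs),
        "sum_y": sum(ys),
        "sum_xx": sum(x * x for x in xs),
        "sum_yy": sum(y * y for y in ys),
        "sum_xy": sum(x * y for x, y in zip(xs, ys)),
    }


def pooled_correlation(summaries):
    """Combine station aggregates into the Pearson correlation of the pooled cohort."""
    total = {key: sum(s[key] for s in summaries) for key in summaries[0]}
    n = total["n"]
    covariance = total["sum_xy"] - total["sum_x"] * total["sum_y"] / n
    var_x = total["sum_xx"] - total["sum_x"] ** 2 / n
    var_y = total["sum_yy"] - total["sum_y"] ** 2 / n
    return covariance / math.sqrt(var_x * var_y)


# Hypothetical mock stations holding age and MMSE score per patient.
station_1 = {"age": [62, 70, 75, 81], "mmse": [29, 27, 24, 20]}
station_2 = {"age": [58, 66, 79], "mmse": [30, 26, 22]}
summaries = [local_summary(s["age"], s["mmse"]) for s in (station_1, station_2)]
r = pooled_correlation(summaries)
```

The result equals the correlation that would be obtained on the pooled records, although very small stations could still disclose information through their aggregates.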

Acknowledgements MEMORABEL Project, Department of Radiation Oncology (MAASTRO)

Keywords: Personal Health Train, FAIR, distributed learning, dementia



Only one copy

H. Pieterman (1)

(1) ErasmusMC Rotterdam

To do their job safely, physicians need access to all medical information of their patients. In fact,

diagnosis and treatment should be based on interpretation of both actual and all historical

information. It should also be possible to relate all these different data to one another.

All data should therefore ideally be displayed along one timeline. However, the medical history of a

patient is not presented in the form of chronological data, but the data are traditionally more or less

hidden in documents in the various patients’ files, each with its own timeline. Nowadays, nearly 25%

of patients visit more than one physician (general practitioner not included) in parallel, and in

different hospitals, causing both fragmentation and duplication of patient files. A solution could be a

national electronic patient file. However, until today this is politically unacceptable in the

Netherlands. As a result, much time and money is spent in developing a platform solution for

searching, viewing and eventually copying the medical data.

A better solution, which avoids searching, is to store all patient information (in the format of data)

immediately after its generation in a personal database hosted in one of multiple dedicated

datacenters. The data in such a “health vault” can only be accessed by the patients themselves or by

those who have an actual treatment relationship with the patient. Because the general practitioner

has a lifetime treatment relation with his patient he could serve as a steward or keyholder.

In doing so, data will be findable, accessible, interoperable and re‐usable (FAIR) for healthcare. With

commitment of the patients these datacenters can also function as stations for personal health

trains (PHT) for research.

Acknowledgements none

Keywords: timeline, patient files, database, FAIR, PHT



Personal Health Train Coalition

I.M. Tharun (1)

(1) Lygature

Acknowledgements

Keywords:



Implementation and Deployment of a Federated Logistic Regression

Arturo Moncada‐Torres (1), Frank Martin (1), Katja Aben (1),

Stefan Willems (2), Rinus Voorham (2), Paul Seegers (2), Gijs Geleijnse (1)

(1) Netherlands Comprehensive Cancer Organization (IKNL), Eindhoven, NL, (2) PALGA Foundation,

Houten, NL

At the Netherlands Comprehensive Cancer Organization (IKNL), we are dedicated to continuously

improving care for cancer patients. For this purpose, we curate the Netherlands Cancer Registry, a

population‐wide registry containing data of nearly all cancer cases in the country since 1989.

Complementing IKNL’s clinical data with (molecular) pathological data from the National Registry of

Histo‐ and Cytopathology (PALGA) has been proven to be of unprecedented value for observational

research. This is evidenced by the more than 30 joint projects between these two institutions

annually.

For these projects, data are typically obtained by aligning both datasets and generating a centralized

copy of the data, which can then be used for analysis. However, the latter is not ideal, since it could

potentially compromise patient data privacy. As a matter of fact, the General Data Protection

Regulation (GDPR) has established strict rules and limitations in this matter. Moreover, centralizing

data poses several organizational, operational, social, and political challenges.

In this project, we used our open‐source priVAcy preserviNg federaTed leArninG infrastructurE

(VANTAGE) for jointly analyzing IKNL and PALGA data without them leaving their original location.

Namely, we implemented a federated logistic regression model (based on the work by Li et al., 2015)

to investigate the relation between structured pathological reporting and survival of prostate cancer

patients. The federated model’s coefficients were equivalent to their centralized counterparts. The

results of this project show the potential for Federated Learning using VANTAGE as a cornerstone of

the Personal Health Train.
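To illustrate why federated and centralized coefficients can coincide on vertically partitioned data, the sketch below runs plain gradient descent in which each party keeps its own feature columns and exchanges only partial scores and residuals. This is a simplified stand-in, not the method of Li et al. (2015) and not the VANTAGE implementation; exchanging residuals in the clear is not privacy-preserving, and all data and names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def partial_scores(rows, weights):
    """Each party computes its share of the linear predictor locally."""
    return [sum(w * x for w, x in zip(weights, row)) for row in rows]


def local_gradient(rows, residuals):
    """Each party computes the gradient for its own feature columns."""
    n = len(rows)
    return [sum(r * row[j] for r, row in zip(residuals, rows)) / n
            for j in range(len(rows[0]))]


def fit_vertical(party_rows, labels, steps=200, learning_rate=0.5):
    """Gradient descent where every party holds different columns of the same patients."""
    weights = [[0.0] * len(rows[0]) for rows in party_rows]
    for _ in range(steps):
        scores = [partial_scores(rows, w) for rows, w in zip(party_rows, weights)]
        totals = [sum(parts) for parts in zip(*scores)]
        residuals = [sigmoid(z) - y for z, y in zip(totals, labels)]
        for k, rows in enumerate(party_rows):
            gradient = local_gradient(rows, residuals)
            weights[k] = [w - learning_rate * g for w, g in zip(weights[k], gradient)]
    return weights


# Hypothetical toy data: party A holds an intercept and one feature, party B one feature.
party_a = [[1.0, 0.2], [1.0, 1.5], [1.0, 0.7], [1.0, 2.1], [1.0, 1.1], [1.0, 0.4]]
party_b = [[1.0], [0.0], [1.0], [1.0], [0.0], [0.0]]
labels = [0, 1, 0, 1, 0, 1]

federated = fit_vertical([party_a, party_b], labels)
pooled_rows = [row_a + row_b for row_a, row_b in zip(party_a, party_b)]
centralized = fit_vertical([pooled_rows], labels)
```

Flattening `federated` gives the same coefficients as `centralized[0]`, because both runs perform identical updates on the same columns.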


Keywords: distributed learning, federated learning, personal health train, vertically‐partitioned data



Secure Log Rank Test in Survival Analysis on Vertically Partitioned Data

using Multi‐Party Computation

Rooijakkers, T.A. (Thomas) (1), Kamphorst, B. (Bart) (1), l’Isle, N.A.F. (Natasja) van de (1),

Cellamare, M. (Matteo) (2), Knoors, D. (Daan) (2)

(1) TNO, (2) IKNL

The growing complexity of cancer diagnosis and treatment requires data sets that are larger and

richer than currently available in a single cancer registry. However, sharing patient data is difficult

due to patient privacy and data protection needs. Secure Multi‐Party Computation (MPC) has the

potential to overcome these limitations. MPC is an umbrella term for cryptographic techniques that

allow several different entities to jointly perform analysis on data without sharing their actual data.

IKNL and TNO are collaborating to develop solutions using these technologies to enable

privacy‐preserving training of survival analysis models (e.g. Kaplan‐Meier estimator, Log Rank Test,

Cox regression).
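
As one illustration of the general MPC idea described above, the sketch below shows additive secret sharing, a common building block of MPC. It is an assumption made for illustration only and is not claimed to be the technique used in this project; the modulus and all function names are hypothetical choices.

```python
import secrets

PRIME = 2**61 - 1  # modulus for the shares; an illustrative choice

def share(value, n_parties):
    """Split an integer into n additive shares modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares

def reconstruct(shares):
    """Recombine additive shares into the original value."""
    return sum(shares) % PRIME

def secure_sum(private_values):
    """Sum the parties' private integers without any party revealing its own value."""
    n = len(private_values)
    # Each party splits its value and hands one share to every party.
    all_shares = [share(v, n) for v in private_values]
    # Party j adds up the j-th share of every input; a single share looks random.
    partial_sums = [sum(s[j] for s in all_shares) % PRIME for j in range(n)]
    # Only the partial sums are published and combined.
    return reconstruct(partial_sums)
```

For example, three registries holding the counts 12, 30 and 7 could learn the total of 49 without any of them disclosing its own count.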

The Kaplan–Meier estimator is a non‐parametric statistic used to estimate the survival function from

lifetime data. To compare survival between groups we can use the log rank test. The log rank test is

a statistical procedure that compares two (or more) survival distributions by testing, at each

observed event time, whether the hazard functions of the groups are different. A direct application

could be to test whether treatment X has a greater effect on the longevity of a patient compared to

another treatment Y.
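
For reference, the log rank test described above can be sketched on ordinary (non‐encrypted, centralized) data as follows. This is a minimal two‐group version with the usual hypergeometric variance, written for illustration; it is not the MPC protocol itself, and the function name and argument layout are assumptions.

```python
import math

def log_rank_test(times, events, groups):
    """Two-group log rank test.

    times:  follow-up time per patient
    events: 1 if the event was observed, 0 if censored
    groups: 0 or 1 (e.g. treatment Y or treatment X)
    Returns (chi-square statistic with 1 degree of freedom, p-value).
    """
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        # Patients still at risk just before time t, overall and in group 1.
        n = sum(1 for ti in times if ti >= t)
        n1 = sum(1 for ti, g in zip(times, groups) if ti >= t and g == 1)
        # Events occurring exactly at time t, overall and in group 1.
        d = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        d1 = sum(1 for ti, e, g in zip(times, events, groups)
                 if ti == t and e == 1 and g == 1)
        # Observed minus expected events in group 1 under equal hazards.
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

In the vertically partitioned setting, the `times` and `events` columns would sit with one party and the `groups` column with another, which is why the per‐time‐point counts cannot simply be computed by either party alone.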

We present an MPC protocol that performs the log rank test on vertically partitioned data.

In particular, we focus on cases where, for a group of patients, party A owns data on the patients’

survival (diagnosis date, death date, censoring, etc.) and party B has access to the treatment

information.

Acknowledgements 1. Veugen, P.J.M. (Thijs)

Keywords: Multi‐Party Computation, MPC, Survival Analysis, Kaplan‐Meier, Log Rank Test



Characterization of depression symptoms using large scale

questionnaire data in the Dutch population: a BBMRI‐BIONIC study

Marije van Haeringen (1)#, Sarah R Vreijling (1)#, Floris Huider (2), Mariska Bot (1), Yuri Milaneschi

(1), Najaf Amin (3), Joline W. Beulens (4, 5, 6), Marijke A. Bremmer (1), Petra J. Elders (5, 7), Tessel

E. Galesloot (8), Lambertus A. Kiemeney (8), Hanna M. van Loo (9), H. Susan J. Picavet (10), Femke

Rutters (4, 5), Ashley van der Spek (3), Anne M. van de Wiel (11), Cornelia van Duijn (3, 13), Edith

J.M. Feskens (11), Catharina A. Hartman (9), Albertine J. Oldehinkel (9), Jan H. Smit (1), W. M.

Monique Verschuren (6, 10), Gonneke Willemsen (2), Eco JC de Geus (2), Brenda WJH Penninx (1),

Dorret I Boomsma (2), Femke Lamers (1)*, Rick Jansen (1)* # shared first author, *shared last

author

(1) Department of Psychiatry, Amsterdam UMC, Vrije Universiteit Amsterdam, Amsterdam Public

Health Research Institute and Amsterdam Neuroscience, Amsterdam, Netherlands, (2) Department of

Biological Psychology, Vrije Universiteit Amsterdam, Amsterdam, Netherlands, (3) Department of

Epidemiology, Erasmus University Medical Center, Rotterdam, The Netherlands, (4) Department of

Epidemiology and Biostatistics, Amsterdam University Medical Centres, location VUMC, The

Netherlands, (5) Amsterdam Public Health Research Institute, The Netherlands, (6) Julius Center for

Health Sciences and Primary Care, University Medical Center Utrecht, University Utrecht, Utrecht, The

Netherlands, (7) Department of General Practice, Amsterdam University Medical Centres, The

Netherlands, (8) Radboud university medical center, Radboud Institute for Health Sciences, Nijmegen,

The Netherlands, (9) Department of Psychiatry, University of Groningen, University Medical Center

Groningen, Groningen, Netherlands, (10) Centre for Nutrition, Prevention and Health Services,

National Institute for Public Health and the Environment, Bilthoven, The Netherlands, (11) Division of

Human Nutrition and Health, Wageningen University, Wageningen, The Netherlands, (12)

Amsterdam Public Health and Amsterdam Neuroscience, The Netherlands, (13) Nuffield Department

of Population Health, University of Oxford, Oxford, UK

Background

Depression is a highly heterogeneous disease with diverse symptom profiles. In current clinical

practice, a personalized approach based on symptoms or biomarkers is lacking. The BIObanks

Netherlands Internet Collaboration (BIONIC) within the BBMRI infrastructure is a large scale online

survey on lifetime depression in seven Dutch population‐based and clinical cohorts. Our aim here is

to explore the consistency of single symptom prevalence across cohorts and determine demographic,

clinical and lifestyle characteristics of single symptoms.

Methods

Data were obtained from seven cohorts (N≈75,000, 60% female, age 16‐100) with the online Lifetime

Depression Assessment Self‐report (LIDAS). Lifetime depression was defined according to DSM‐5

criteria. The LIDAS contains the 8 DSM symptoms for depression and additionally demographic and

physical characteristics, information on age of onset, number and duration of episodes, past

treatment, and comorbidity with other psychiatric conditions. Using linear or logistic models, for

each of the 8 symptoms we compared participants with and without a specific depression symptom

in these demographic, clinical and lifestyle characteristics.

Results


Preliminary analyses based on ~4,000 individuals with lifetime depression revealed that, besides the

highly prevalent core DSM symptoms ‘depressed mood’ and ‘loss of interest’, the symptoms ‘trouble

concentrating’ and ‘energy loss’ had the highest prevalence (>80%). Cohorts showed very similar

prevalence rates of depression symptoms. The symptom ‘increased appetite or weight’ (prevalence:

20%) appeared to have the most distinctive demographic and clinical profile as compared to other

symptoms: individuals with this symptom were younger (P<1e‐7), more often female (P<1e‐7), more

often had recurrent episodes (P<1e‐17), and were younger during their first episode (P<1e‐8).

Conclusion

We found support for the consistency in endorsement of individual depressive symptoms across

cohorts. Patients with increased appetite or weight are most different from other patients, which

may indicate a partially unique underlying pathophysiology. We will use this information in our

future research to investigate the role of individual depression symptoms in personalized medicine

approaches.

Acknowledgements This research was financially supported by BBMRI‐NL, a Research Infrastructure financed by the

Dutch government (NWO 184.021.007). We would like to acknowledge all researchers involved in the BBMRI‐NL

project ‘Phenomics 2.0 ‐ proof of principle for major depressive disorder’.

Keywords: Online survey, depression



What is a “digital” patient? An ontological approach.

L.P. Ter Meer (1)

(1) Erasmus School of Health Policy and Management, Rotterdam

There is a need to define essential objects so we can describe role, function and purpose and be able

to register these objects in a uniform way. “Patient” is one of the most frequently used objects in

healthcare but it lacks a uniform ontological description. Patient is most often used as a subject of a

disease instead of having its own identity and related elements. We describe an ontological model

for a patient encompassing two classes of elements: objective and subjective patient elements.

When applied, the model may help researchers, care providers, software developers and users

of the terminology to approach the patient more consistently. The advantage for the patient is

that the model offers an easy and complete overview of the components influencing their health and,

when values are out of range, of how these can be related to disease. The model may also facilitate the

design and process of data exchange.

Acknowledgements Dr. M. de Mul; Dr. E. Veringa

Keywords: Patient; Electronic medical record; data exchange; ontological definition; patient model



FAIRification at DNA, RNA and protein level in studying colorectal

tumor progression

Menno de Vries (1), Malgorzata A. Komor (1), Mariska Bierkens (1), Annemieke C. Hiemstra (1),

Stefan Parayble (2), Wibo Pipping (2), Jan Hudecek (1), Guido Jenster (3), Gerrit A. Meijer (1),

Remond J.A. Fijneman (1)

(1) Department of Pathology, The Netherlands Cancer Institute, Amsterdam, (2) The Hyve, Utrecht, (3)

Department of Urology, Erasmus Medical Center Rotterdam

Colorectal adenomas, carcinomas and normal adjacent colorectal tissues were characterized at the

DNA, RNA and protein level as part of the NGS‐ProToCol study, in order to get a better understanding

of colorectal tumor progression. Large amounts of data were generated during this project, and

efforts were made to work towards FAIRification (Findable, Accessible, Interoperable and Reusable)

of the different data types, from raw data to processed or ‘final’ data, as well as accompanying

metadata.

Several applications were used to accommodate the FAIRification of diverse data types collected

during the study. The raw sequencing files (FASTQ) and accompanying phenotypic metadata were

uploaded to the European Genome‐phenome Archive (EGA); digital images of Tissue Microarrays

(TMA) were uploaded to the SlideScore server, hosted by the NKI; and all processed or ‘final’ data

(clinical, pathology, biosample and molecular data) were imported to the national instance of

tranSMART, a data‐integration platform specialized in cohort‐centric data selection/exploration, and

partly to cBioPortal. In tranSMART, metadata and hyperlinks were used to link to the raw data in

EGA, and to individual TMA cores in SlideScore. As a result, an overview of all existing data is

available in a user‐friendly way. In the near future, the full study will also be uploaded to

cBioPortal, another data‐integration platform hosted by Health‐RI. It has complementary view and

query capabilities, being specialized in sample, longitudinal and gene‐centric views and queries

combined with clear visualizations.

In conclusion, as a result of the FAIRification of both its raw and processed data, the rich dataset of

NGS‐ProToCol can be re‐used for scientific research.

References

‐ European Genome‐phenome Archive, https://www.ebi.ac.uk/ega/home

‐ SlideScore, www.slidescore.com

‐ tranSMART, https://trait.health-ri.nl/trait-tools/transmart

Acknowledgements All people who were involved with NGS‐ProToCol

Keywords: Colon, Carcinoma, EGA, European Genome‐phenome Archive, tranSMART, SlideScore, NGS‐ProToCol



The Handbook for Adequate Natural Data Stewardship (HANDS)

Els Swennen (1), Pascal Suppers (1), Tom Delnoy (1), Petra van Overveld (2), Marco Roos (2), Sonja

Meeuwsen (2), Salome Scholtens (3), Bert van Ooijen (4), Chantal Steegers (5), Klaudia Onnasch (5),

Erik van Ieperen (5), Rudy Scholte (5), Jeroen Belien (5), Sander de Ridder (5), Ronald van Schijndel

(5), Gepke Uiterdijk (5), Susanne Rebers (6), Ingeborg Verheul (7), Jan‐Willem Boiten (8), Linda van

den Berg (9) and Paula Jansen (10)

(1) Maastricht UMC+, (2) LUMC, (3) UMCG, (4) Erasmus MC, (5) Amsterdam UMC, (6) ELSI

Servicedesk/NKI, (7) LCRDM, (8) D4LS/Lygature, (9) WASHOE Life Science Communications, (10) UMC

Utrecht

Data stewardship refers to sustainable care for research data as an integral part of the research process.

It covers all actions necessary to make digital research data Findable, Accessible, Interoperable and

Reusable (FAIR) during and after your research project, including data management, archiving and

reuse by third parties.

The Handbook for Adequate Natural Data Stewardship (HANDS) provides researchers at the eight

Dutch University Medical Centres (UMCs) with guidelines on data stewardship as well as lists of

practical steps to take for each stage of the research data life cycle. HANDS is one of the services and

tools that are offered on the Health‐RI website. A toolbox in HANDS refers to additional expertise and

resources within or outside your UMC. It offers information for all people involved in data

stewardship, from researchers and data stewards, to policy makers and developers of IT

infrastructure.

Acknowledgements Contributors to the 2015 version: Peter Doorn (DANS, KNAW), Rob Hooft (DTL), Evert van Leeuwen

(Radboudumc), Leendert Looijenga (Federa), Barend Mons (DTL, LUMC), Arnoud van der Maas

(Radboudumc), Ronald Brand (LUMC), Morris Swertz (UMCG), Jan Jurjen Uitterdijk (UMCG), Pieter

Neerincx (UMCG), Jan Hazelzet (Erasmus MC), Linda Mook (Erasmus MC), Thijs Spigt (TTO, Erasmus

MC), Evert Ben van Veen (MedLawConsult), Margreet Bloemers (ZonMw), Jan Willem Boiten (CTMM‐

TraIT), Cor Oosterwijk (VSOP), Tessa van der Valk (VSOP), Jaap Verweij (Erasmus MC).

Keywords: Guidelines, data stewardship, FAIR, Toolbox



SURF Research Access Management, an authorisation and

authentication service optimised for researchers

Rogier de Jong (1), Raoul Teeuwen (1), Michiel Schok (1)

(1) https://www.surf.nl/en/pilot-authentication-en-autorisation-for-research-services

SURF is a cooperative association of Dutch educational and research institutions in which its

members join forces. The members are the owners of SURF.

Researchers often experience problems logging in to research services. In order to make logging in

safe, easy and efficient, SURF has been conducting pilots with approximately 10 institutions with an

authorisation and authentication service optimised for researchers: SURF Research Access

Management (SRAM).

The service will be officially launched in Q2 2020 and is also applicable to existing and future Health‐RI

services.

SRAM tries to solve a number of specific challenges faced by researchers:

‐ How do you arrange consent and logging of access so that you comply with the GDPR?

‐ How can 'guests' from other institutions, companies or outside the Netherlands make use of

research services?

‐ How is group management arranged?

‐ How do we deal with specific research attributes?

‐ How can institutions limit the administrative workload resulting from having to create guest

accounts, 0‐hour contracts, etc.?

‐ How do you arrange access to non‐web resources (such as SSH) based on an institution account?

More info: https://www.surf.nl/en/pilot-authentication-en-autorisation-for-research-services

Acknowledgements SURF

Keywords: Identity & Access Management for research, Trust and Identity, SURF



Administration of research logistics

Dirk Veldman (1), Annemie Mordant (1), Jacqueline Pisters (1), Luc Linden (1), Alfons Schroten (1)

(1) Maastricht University, MEMIC, Centre for data and information management

With large quantities of patients, extensive calling lists, SMS reminders and more, keeping track of a

research project can be challenging.

By using Ldot, you can build your own schedule that fits your research needs and preferences

perfectly. The Ldot Study Builder helps you simplify the execution of daily tasks in large studies

and/or complex protocols.

Benefits of Using Ldot

1. Gain immediate, proactive insight into study status and visualize progress

2. Standardize protocol execution

3. Minimize required team efforts

4. Secure central storage of logistical data

5. Separate storage of personal data

6. Integration with data collection tools like Castor EDC, OpenClinica, and Qualtrics

Features

Ldot also offers a wide range of different features. These features include:

• GCP compliant (Good Clinical Practice guidelines)

• GDPR compliant (General Data Protection Regulation)

• Secure data storage (ISO 27001 certified)

• Secure and extended user role management

• Compatible for multicentre projects

Acknowledgements TraIT

Keywords: Logistics, GDPR, SelfService, Multicentre



Cardio‐metabolic profiling ‐ Association of EFV with increased levels of

circulating lipid metabolites

Fariba Ahmadizar (1), Maxime Bos (1), Daniel Bos (1,2), Arfan Ikram (1), Mohsen Ghanbari (1),

Maryam Kavousi (1)

(1) Department of Epidemiology, Erasmus University Medical Center, Rotterdam, The Netherlands,

(2) Department of Radiology and Nuclear Medicine, Erasmus MC ‐ University Medical Center,

Rotterdam, the Netherlands

Background/Objectives. Recent evidence highlights a link between larger epicardial fat volume (EFV)

and an unfavorable cardio‐metabolic profile. We explored the association of plasma lipid metabolites

with EFV in the general population. We also performed the analyses in a subset of subjects with

type 2 diabetes (T2D).

Methods. We included a total of 695 participants from the population‐based Rotterdam Study.

Plasma samples were collected between 2002 and 2005 and metabolites were measured by proton

nuclear magnetic resonance (NMR). The assessment of EFV was through cardiac and extracardiac

multidetector computed tomography (MDCT), quantified in millilitres. Linear regression analysis,

adjusted for age, sex, BMI, lipid‐lowering medications and smoking and corrected for multiple testing

(significance threshold: 0.05/142 independent lipid metabolites = 3.5 × 10⁻⁴), was used to assess cross‐sectional

associations between EFV and 202 lipid metabolites.
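
The multiple testing threshold quoted above follows from a Bonferroni‐style division of the significance level by the number of independent tests. A minimal sketch of that arithmetic is given below; the function names are illustrative and not taken from the study.

```python
def bonferroni_threshold(alpha, n_independent_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_independent_tests

def is_significant(p_value, threshold):
    """True if a p-value passes the corrected threshold."""
    return p_value < threshold

# 0.05 divided by 142 independent lipid metabolites, i.e. about 3.5e-4.
threshold = bonferroni_threshold(0.05, 142)
```

With this threshold, a p‐value of 1.6 × 10⁻⁴ still counts as significant, whereas a nominal p‐value of 0.01 would not.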

Results. After correction for multiple testing, 102 lipid metabolites were independently associated

with EFV; the strongest positive associations were with phospholipids in large VLDL (beta: 0.1;

SE: 0.01; p‐value: 9.9 × 10⁻¹⁶), triglycerides, lipids in VLDL subclasses, and apolipoprotein B. Higher

levels of circulating phospholipids in large VLDL were significantly associated with larger EFV in

individuals with T2D (beta: 0.07; SE: 0.03; p‐value: 1.6 × 10⁻⁴). In individuals free of diabetes, the lipid

profile was similar to that of the general population, except for phospholipids, where the association was not

significant.

Conclusions. Larger EFV was associated with increased levels of circulating lipid metabolites, mainly

phospholipids in large VLDL. Phospholipid metabolism has been shown to play a central role in the pathogenesis of

metabolic disease and is associated with insulin resistance and T2D. Associating lipidomic

signatures with fat deposits may help provide more biological insight into risk stratification for metabolic

outcomes.

Acknowledgements The dedication, commitment, and contribution of inhabitants, general practitioners, and pharmacists

of the Ommoord district to the Rotterdam Study are gratefully acknowledged.

Keywords: Epicardial fat, lipidomics, type 2 diabetes



Amsterdam UMC expertise center for high performance computing

Bob W. van Dijk (1), Paul F.C. Groot (1), Martijn D. Steenwijk (1), Daoud Sie (1), Ronald van

Schijndel (1)

(1) Amsterdam University Medical Centers, Amsterdam, The Netherlands

A number of trends make it urgent to improve high performance computing (HPC) services at

Amsterdam UMC:

• The extreme growth of data use in health care and research,

• The increasing demand for sharing data and analysis methods,

• The stricter regulations regarding data safety and privacy,

• The trend toward open science and FAIR data management,

• And the growing interest in artificial intelligence and machine learning.

As a result of these developments, between 300 and 400 researchers in Amsterdam UMC demand

better computing facilities. In surveys, these researchers reported a need for more compute

power, more data storage capacity, more flexibility in application use, and better ways to share data

than are available in the standardized ICT work environment. This mismatch between the centralized

ICT facilities and research needs has led to a fragmented landscape of self‐managed local solutions

for HPC that lacks cost efficiency and carries many risks.

To provide tailored HPC solutions, a center of expertise has been set up consisting of an HPC‐facility,

an HPC‐hub and an HPC‐community. The HPC expertise center has a front‐office and is positioned

within Research Support.

Amsterdam UMC aims to provide researchers with adequate ICT facilities for HPC that are scalable

and flexible as well as compliant with regulations and guidelines regarding privacy protection and

FAIR data management. An HPC program plan describes three principal solutions for HPC: the on‐

premises Research Zone, the Amsterdam UMC Research Cloud (through Surf Cumulus) and the Surf

HPC services. Realization of this detailed plan should ensure that from 2021 on HPC will be a basic

service for each researcher at Amsterdam UMC.

Acknowledgements not relevant

Keywords: HPC, Researchsupport



EPTRI ‐ European Paediatric Translational Research Infrastructure: a

bridge towards the future of paediatric medicine

Tessa van der Geest (1), Valery Elie (2), Miriam G. Mooij (1, 3), Donato Bonifazi (4), Doriana

Filannino (4), Annalisa Landi (5), Mariangela Lupo (6), Lucia Ruggieri (5), Ales Stuchlik (7), Evelyne

Jacqz‐Aigrain (2), Saskia N. de Wildt (1, 8)

(1) Department of Pharmacology and Toxicology, Radboud University Medical Center, Nijmegen, The

Netherlands, (2) Paris Diderot University ‐ University Hospital Robert Debré – Paediatric Pharmacology and

Pharmacogenetics, 48 boulevard Sérurier ‐ 75019 Paris, France, (3) Department of Pediatrics, Leiden University

Medical Center, Leiden, The Netherlands, (4) Consorzio per Valutazioni Biologiche e Farmacologiche, Via

Putignani 178 ‐ 700122 Bari, Italy, (5) Gianni Benzi Pharmacological Research Foundation, Via Putignani, 133 ‐

70121 Bari, Italy, (6) TEDDY European Network of Excellence for Paediatric Clinical Research, Via Luigi Porta 14 ‐

27100 Pavia, Italy, (7) Institute of Physiology, Czech Academy of Sciences, Prague, Vídeňská 1083 ‐ 142 20

Praha, Czech Republic, (8) Intensive Care and Department of Paediatric Surgery, Erasmus MC Sophia Children’s

Hospital, Rotterdam, the Netherlands.

The European Paediatric Translational Research Infrastructure (EPTRI) aims to propose

developmental models for a future basic Research Infrastructure (RI) fostering high level basic and

applied research from drug discovery to paediatric formulation. The future RI will be complementary

to the existing RIs by bringing together and networking all the available competences and

technologies useful for improving research on paediatric medicines.

For this purpose, EPTRI is preparing a Conceptual Design Report (CDR) describing the scientific and

technical requirements as well as the key components of the future RI. This CDR will represent a

relevant strategy for the design and future set‐up of the new RI. In addition, the project covers the

main areas of need in paediatric medicines technology, creating five technical and scientific domains

including 4 thematic platforms: 1) paediatric medicines discovery; 2) paediatric biomarkers and

biosamples; 3) developmental pharmacology; 4) paediatric medicines formulations and medical

devices (see Figure 1 and 2); and the scientific domain underpinning medicines development to

paediatric studies. EPTRI is coordinated by Consorzio per Valutazioni Biologiche e Farmacologiche

(CVBF) and involves 29 partners from 21 EU/Associated countries including existing RIs and the major

paediatric expertise to cover the scientific topics in the proposal. Moreover, EPTRI has received

relevant support from several national/regional authorities, patient associations, academia and

health institutions, thus demonstrating a favourable framework for the future technology‐driven RI

focused on paediatrics.

Creating a framework for a future paediatric RI will help to accelerate paediatric drug development

processes, resulting in a substantial improvement of children’s quality of life. EPTRI will help to

increase knowledge and research within the technical and scientific domains and facilitate transfer of

innovations to the clinics for the benefit of children.

Acknowledgements This project has received funding from the European Union’s Horizon 2020 Research and Innovation

Programme under Grant Agreement n. 777554.

Keywords: Paediatric medicines; research infrastructure; children; biomarkers;

pharmacology; formulation



Public‐private partnerships in biobanking and biobank‐related

research

Van der Stijl, R. (1, 4), Manders, P. (2), Scheerder, B. (1), Broeks, A. (3), Schaaij‐Visser, T. (4), van

Nuland, R. (5), Eijdems, E.W.H.M (1, 4)

(1) University Medical Center Groningen, (2) Radboud University Medical Center, (3) Netherlands

Cancer Institute, (4) BBMRI‐NL, (5) Health‐RI

Introduction

Biobanks and similar research infrastructures are responsible for their own sustainability. However,

they are also dependent on their surrounding macro‐environment. We have to create an

environment that enables and promotes sustainable biobanking. To do this we should strive for

appropriate overarching preconditions on a legislative, policy, organisational, and financial level. As a

first step we gathered input from biobanks and their users on current challenges and possible

solutions. This input from the supply and demand perspective is a starting point for further

discussions with relevant stakeholders.

Methods

We organised a workshop with 20 biobanks and data infrastructures. In addition, we organised four

focus groups with biobank users from academia and pharmaceutical industry.

Results

Biobanks and users indicated facing challenges on quality; accessibility; visibility; ethical, legal, and

societal issues; and financing. Sample and data issuance policies are complex and differ across

institutions. In addition, it became clear that the current incentives do not promote sharing and

sustainability. Further improvements could be made on the image of biobanks and clarity about their

impact. The potential for public‐private partnerships in biobanking and biobank‐related research

could be better utilized by bringing both parties to the table at an earlier stage. All parties indicated

the growing importance of data for research.

Discussion

The input of both biobanks and biobank users will be combined into recommendations for

overarching preconditions aimed to increase the use and sustainability of biobanks and similar

infrastructures. Setting suitable preconditions is only possible through the combined actions of all

stakeholders, including biobanks, researchers, research institutions, policymakers, and funders. Only

by collaborating can we work towards sustainable solutions, for the benefit of medical research,

health care, and the Dutch population.

Acknowledgements We would like to acknowledge the input of all workshop and focus group participants

Keywords: Sustainable biobanking, biobanks, stakeholders, recommendations



Recommendations for sustainable biobanking

Van der Stijl, R. (1, 2), Scheerder, B. (1), Eijdems, E.W.H.M (1, 2)

(1) University Medical Center Groningen, (2) BBMRI‐NL

Introduction

To be sustainable, biobanks and other similar research infrastructures need to be operative, effective,

and competitive over their expected lifetime. However, sustainability is complicated and requires

finding the right balance on a social, financial, and operational level within a dynamic environment.

Sample and data infrastructures can improve their sustainability by following these nine

recommendations, in the context of their own individual situation.

Methods

Through literature research, case study analysis, workshops, and focus groups we were able to

extract good practices and translate these into nine recommendations, with a focus on the financial

dimension, which might help biobanks in their search for sustainability.

Results

The nine recommendations are:

1. The right start by drafting a business plan

2. Adopt a user‐centred perspective

3. Know and show your value

4. Choose a business model and stick to it

5. Get a grip on your costs

6. Find multiple sources of funding

7. Engage with your key stakeholders

8. Become an attractive partner to industry

9. Make sure samples and data are (re)used

Discussion

There are things to learn from the business world. However, academic research infrastructures

should not forget that they are primarily a not‐for‐profit endeavour, operating in a complex medical‐

ethical environment. As there is a large diversity in biobanks and similar research infrastructures

there are no one‐size‐fits‐all solutions. These recommendations should therefore be applied from

one’s own perspective.

Acknowledgements We would like to acknowledge the support from Health‐RI, BBMRI‐NL and our BBMRI‐NL WP6

members. In particular, we want to acknowledge Tieneke Schaaij‐Visser for providing input and

support, and Rick van Nuland for supporting our workshops and focus groups.

Keywords: Sustainability, research infrastructure, recommendations, sustainable biobanking,

biobank



COREON – Committee on Regulation and Research

P. Manders (1), M.K. Schmidt (2), M. Paardekooper (3), E.J. de Gaag (4), E.B. van Veen (5), L.M.

Bouter (6)

(1) Radboud Biobank, Radboudumc, Nijmegen, The Netherlands, (2) Division of Molecular Pathology,

NKI‐AVL, Amsterdam, The Netherlands, (3) EMGO+ Institute, Amsterdam UMC, Amsterdam, The

Netherlands, (4) Pharmo Institute, Utrecht, The Netherlands, (5) MLC Foundation, Den Haag, The

Netherlands, (6) Department of Epidemiology and Biostatistics, Amsterdam UMC, Amsterdam, The

Netherlands

Why?

The goal of COREON, the Committee on Regulation and Research of the Federa, is to encourage

careful and responsible research with health data and human tissues, aiming for a balance between

the public interest in such research, the interests of the participants involved and those of researchers. COREON

positions itself as the intersection between observational researchers and the regulation of such

research and as the platform where such issues are discussed amongst researchers.

Who?

COREON was established in 2003, first as a subcommittee of the VvE (the Dutch Epidemiological Society),

and later under the umbrella of the Federa (www.federa.org). COREON consists of people who are

active within observational health research. They represent a broad range of Dutch academic and

other research centres. The activities of COREON are funded by annual contributions from the

participating organizations.

What and how?

Via its plenary meetings (3‐4 times a year) COREON participants discuss scientific, legal and ethical

issues of observational research. Each year COREON organises a session at WEON, the yearly

convention of epidemiological researchers. COREON initiates working groups on specific questions

and publishes statements to guide researchers. COREON was responsible for the Code of Conduct on

health research with patient data of 2004 and the Code of Conduct on responsible use of human

tissue of 2011. COREON commissioned the revision of both Codes of Conduct and will be responsible

for submitting the new Code of Conduct on health research with patient data for approval to the Dutch

Data Protection Authority.


Keywords: Regulation, observational research, code of conduct, health data, human tissue



Galaxy in education using the SURF Research Cloud

M. J. Brandt (1), J. Koehorst (2), I. Nooren (1)

(1) SURF, (2) Wageningen University and Research

At Wageningen University and Research (WUR) the Galaxy environment is used as a tool for

education to give students access to data intensive biomedical research tools in a user‐friendly

environment. It would be an advantage for the environment to be accessible from any computer by

starting it in the cloud. However, setting up a Galaxy cloud environment for a group of students can

be complicated and repetitive work.

The SURF Research Cloud (RSC) is a SURF service to make using the cloud easier for the SURF

members, i.e. the Dutch research and education institutes. What RSC adds to the existing

research cloud offerings is a single entry point to all cloud needs of a researcher, with

all the state‐of‐the‐art technologies tied together. In addition to the in‐house SURF IaaS offering

“HPC‐Cloud” and connections to other institute clouds, many of the available public cloud platforms such

as AWS and Azure are integrated into RSC, allowing existing users of these platforms to manage their

applications with just a few clicks through the RSC portal.

For a research application like the Galaxy environment the RSC platform offers building blocks such

as starting with a specific configuration and tool set every time, inviting users via their federated

identity and linking to datasets and persistent storage. This means the Galaxy service only has to be

configured once and can be restarted through our user‐friendly portal with a few clicks. Managing a

separate server is not needed anymore.

The users of the Galaxy environment can log in to the portal with their institute account and start using

a fully set‐up Galaxy environment by pressing the access button.

The next step after this pilot is to connect a fully configured Galaxy environment to scalable compute

for running more complex pipelines for research needs.

Acknowledgements: SURF, Wageningen University and Research

Keywords: Galaxy, Cloud, data intensive biomedical research tools



European Paediatric Translational Research Infrastructure (EPTRI): a

survey to map the expertise of the excellence of developmental

pharmacology in pan‐European countries

Tessa Van der Geest (1), Valery Elie (2), Miriam G. Mooij (1, 3), Donato Bonifazi (4), Doriana

Filannino (4), Annalisa Landi (5), Mariangela Lupo (6), Lucia Ruggieri (5), Ales Stuchlik (7), Evelyne

Jacqz‐Aigrain (2), Saskia N. de Wildt (1)

(1) Department of Pharmacology and Toxicology, Radboud University Medical Center, Nijmegen, The

Netherlands, (2) Paris Diderot University ‐ University Hospital Robert Debré – Paediatric Pharmacology and

Pharmacogenetics, 48 boulevard Sérurier ‐ 75019 Paris, France, (3) Department of Pediatrics, Leiden University

Medical Center, Leiden, The Netherlands, (4) Consorzio per Valutazioni Biologiche e Farmacologiche, Via

Putignani 178 ‐ 700122 Bari, Italy, (5) Gianni Benzi Pharmacological Research Foundation, Via Putignani, 133 ‐

70121 Bari, Italy, (6) TEDDY European Network of Excellence for Paediatric Clinical Research, Via Luigi Porta 14 ‐

27100 Pavia, Italy, (7) Institute of Physiology, Czech Academy of Sciences, Prague, Vídeňská 1083 ‐ 142 20

Praha, Czech Republic, (8) Intensive Care and Department of Paediatric Surgery, Erasmus MC Sophia Children’s

Hospital, Rotterdam, the Netherlands

INTRODUCTION: Currently, the European landscape of developmental pharmacology

appears scattered, with low awareness of available services and facilities in this field, resulting in

overlapping initiatives and inefficient use of financial, instrumental, and human resources. European

Paediatric Translational Research Infrastructure (EPTRI) aims to design the framework of a paediatric

Research Infrastructure (RI) intended to enhance technology‐driven paediatric drug discovery. Within

the project, 5 technical and scientific domains have been identified, including the developmental

pharmacology platform, which aims to enhance knowledge on developmental changes affecting drug

disposition. We here present the developmental pharmacology platform.

MATERIALS AND METHODS: Within EPTRI, a survey was launched among selected research centres in

the field of developmental pharmacology to map the expertise within paediatric pharmacology in

pan‐European countries and identify the possible gaps in the available paediatric research services

and facilities. Firstly, the survey was delivered to 74 recipients between April‐June 2018. Later on, to

have a wider map of the European research units and services, the survey was re‐opened and

distributed among 153 recipients between January‐April 2019.

RESULTS: 38 service providers answered the survey, of which 8 came from the UK, 7 from Italy and 6

from The Netherlands. The analysis made it possible to define a map of services to be provided within the

developmental pharmacology platform, represented in Figure 1. Relevant expertise has been

identified, such as analytical labs capable of setting up sensitive drug assays, paediatric omics facilities,

pharmacometrics expertise, large databases adapted to paediatric pharmacoepidemiology, as well as

placental platforms.

CONCLUSION: This analysis made it possible to map the research units and services that will be provided by

the developmental pharmacology platform within EPTRI. Likewise, it provided a point of

reflection for the scientific community on the strengths and weaknesses of this research area and

the relevance of EPTRI in filling these gaps.


Acknowledgements: This project has received funding from the European Union’s Horizon 2020 Research and Innovation

Programme under Grant Agreement n. 777554.

Keywords: Paediatric medicines; research infrastructure; children; biomarkers; pharmacology;

formulation



Thyroid function and metabolomics: from observational research in

BBMRI cohorts to causal inference through Mendelian Randomization

Nicolien A. van Vliet (1), Maxime M. Bos (1, 2), Fariba Ahmadizar (2), Marian Beekman (3), Mariska

Bot (4), Layal Chaker (2, 5, 6), Christian Delles (7), Mohsen Ghanbari (2), Antonius E. van

Herwaarden (8), Evelyn Houtman (3), M. Arfan Ikram (2), Martin Jaeger (9, 10), J. Wouter Jukema

(11), Margreet Kloppenburg (12, 13), Jennifer Meessen (3, 14), Ingrid Meulenbelt (3), Yuri

Milaneschi (4), Simon P. Mooijaart (1), Mihai Netea (10), Romana Netea‐Maier (9), Robin P.

Peeters (5,6), Brenda Penninx (4), Naveed Sattar (7), Eline Slagboom (3), Carisha S. Thesing (4),

Stella Trompet (1), Raymond Noordam (1), Diana van Heemst (1), BBMRI Metabolomics

Consortium

(1) Section of Gerontology and Geriatrics, Department of Internal medicine, Leiden University Medical

Center, Leiden, the Netherlands, (2) Department of Epidemiology, Erasmus University Medical Center,

Rotterdam, The Netherlands, (3) Department of Biomedical Data Sciences, Section Molecular

Epidemiology, Leiden University Medical Center, Leiden, The Netherlands, (4) Department of

Psychiatry, Amsterdam Public Health research institute, Amsterdam UMC, Vrije Universiteit

Amsterdam, Amsterdam, the Netherlands, (5) Department of Internal Medicine and Academic Center

for Thyroid Diseases, Erasmus Medical Center, Rotterdam, the Netherlands, (6) Academic Center for

Thyroid Diseases, Erasmus Medical Center, Rotterdam, the Netherlands, (7) Institute of

Cardiovascular and Medical Sciences, College of Medical, Veterinary and Life Sciences, University of

Glasgow, United Kingdom, (8) Department of Laboratory Medicine, Radboud Laboratory for

Diagnostics (RLD), Radboud University Medical Center, Nijmegen, the Netherlands, (9) Department of

Internal Medicine, Division of Endocrinology, Radboud University Medical Center, Nijmegen, The

Netherlands, (10) Department of Internal Medicine and Radboud Center for Infectious Diseases,

Radboud University Medical Center, Nijmegen, The Netherlands, (11) Department of Cardiology,

Leiden University Medical Center, Leiden, the Netherlands, (12) Department of Rheumatology, Leiden

University Medical Center, Leiden, The Netherlands, (13) Department of Clinical Epidemiology, Leiden

University Medical Centre, Leiden, The Netherlands, (14) Department of Orthopaedics, Leiden

University Medical Center, Leiden, The Netherlands.

Thyroid hormones affect lipid metabolism, though it is unknown which specific lipid subclasses are

affected. Here we conducted an observational multi‐cohort study within the BBMRI framework and a

two‐sample Mendelian randomization (MR) study of thyrotropin (TSH) and free thyroxine (fT4) levels

within the reference range on Nightingale metabolomics.

We conducted an observational study in 6 cohorts (N=9,353) and MR analyses using published

summary‐level data from a genome‐wide association study (N=24,925). For the observational

analyses, we used linear regression adjusted for age, sex, BMI and smoking, subsequently meta‐

analyzed using random effects models. As genetic instruments for the MR studies we used 57 genetic

variants for TSH and 30 genetic variants for fT4 (explained variance 9.4% and 4.8% respectively).

Associations between the genetic instruments for TSH and for fT4 and the metabolites were modeled

using the inverse‐variance weighted (IVW) method. All analyses took into account multiple testing for 37

uncorrelated metabolites (P < 1.34 × 10⁻³).
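As an illustration of the IVW step, the following minimal sketch computes a fixed‐effect inverse‐variance weighted estimate from per‐variant summary statistics. It is not the authors' analysis code: the function name `ivw_estimate` and all numbers are hypothetical, and the simple first‐order weights (ignoring uncertainty in the variant‐exposure effects) are an assumption.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted (IVW) Mendelian randomization estimate.

    beta_exposure: per-variant effects on the exposure (e.g. TSH)
    beta_outcome:  per-variant effects on the outcome (e.g. a metabolite)
    se_outcome:    standard errors of the per-variant outcome effects
    Uses first-order weights, i.e. ignores uncertainty in beta_exposure (assumption).
    """
    numerator = sum(
        bx * by / (se * se)
        for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome)
    )
    denominator = sum(
        bx * bx / (se * se) for bx, se in zip(beta_exposure, se_outcome)
    )
    estimate = numerator / denominator
    std_error = math.sqrt(1.0 / denominator)
    return estimate, std_error


# Made-up summary statistics for three hypothetical variants (illustration only).
beta_tsh = [0.10, 0.20, 0.15]
beta_metabolite = [-0.010, -0.020, -0.015]
se_metabolite = [0.005, 0.005, 0.005]

estimate, std_error = ivw_estimate(beta_tsh, beta_metabolite, se_metabolite)
ci_low = estimate - 1.96 * std_error
ci_high = estimate + 1.96 * std_error
```

The estimate is the weighted average of the per‐variant ratio estimates, which is why variants with precisely estimated outcome effects and strong exposure effects dominate the result.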

Observationally, TSH was associated with 52/161 metabolite concentrations (mainly VLDL, fatty acids

and kidney function), and fT4 was associated with 21/161 metabolite concentrations across all


lipoprotein subclasses, fatty acids and ketone bodies. In the subset of 123 metabolites reported in

the data used for MR, genetically determined higher TSH levels were associated with lower

concentration of very large HDL only (IVW −0.09 SD, 95% CI −0.14 to −0.05, P = 1.66 × 10⁻⁴), while

genetically determined higher fT4 levels were associated with higher glycoprotein acetyls only (IVW

0.12 SD, 95% CI 0.05 to 0.19, P = 7.88 × 10⁻⁴). Sensitivity analyses yielded similar results.

Variation in thyroid status within the reference range is associated with a distinct metabolic profile,

though causality is not yet ascertained. Possible explanations for the discrepancy in the results

between the observational and MR analyses include differences in power, residual confounding or

different biological mechanisms for the observed compared to the genetically determined thyroid

status.

Acknowledgements: This work was performed within the framework of the BBMRI Metabolomics Consortium funded by

BBMRI‐NL, a research infrastructure financed by the Dutch government through Netherlands

Organisation for Scientific Research (NWO) (Grant Nos. 184.021.007 and 184033111) and supported

by the European Commission project THYRAGE (Horizon 2020 research and innovation programme,

666869).

Keywords: BBMRI, thyroid, metabolomics, lipoproteins, Mendelian randomization



Applying the FAIR Data principles to a Rare Disease registry: a case

study of the VASCA registry

Bruna dos Santos Vieira (1, 2), Karlijn Groenen (1), Annika Jacobsen (3), Martijn G. Kersloot (4, 5),

Rajaram Kaliyaperumal (3), Ronald Cornet (4), Peter A. C. ’t Hoen (2), Marco Roos (3), Leo Schultze

Kool (1, 6)

(1) Dept. of Radiology and Nuclear Medicine, Radboud university medical center, Nijmegen, The

Netherlands, (2) Center for Molecular and Biomolecular Informatics, Radboud university medical

center, Nijmegen, The Netherlands, (3) Dept. of Human Genetics, Leiden University Medical Center,

Leiden, The Netherlands, (4) Amsterdam UMC, University of Amsterdam, Dept. of Medical

Informatics, Amsterdam Public Health Research Institute, Amsterdam, The Netherlands, (5) Castor

EDC, Amsterdam, The Netherlands, (6) VASCERN VASCA European Reference Centre

Registries of rare disease (RD) patients are extremely useful for establishing genotype‐phenotype

relationships, natural history studies and selection of patients for clinical trials. Each of the (local)

registries for a given disease may only contain a limited number of patients. Analyses across different

RD registries would help to increase patient numbers, but these analyses are usually difficult because

each registry is set up differently and data may not be accessible, or at least not in a uniform way.

Therefore, applying FAIR data principles to RD registries is vital. Aiming at increasing FAIRness among

registries, the Platform on Rare Disease Registration (EU RD Platform) defined a set of Common Data

Elements (CDEs). This abstract describes how we implemented the CDEs and the FAIR data principles

in the Registry of Vascular Anomalies (VASCA).

A semantic model defining the CDEs and the relationships between them was created, adapted and

finalized by peer feedback. This model was then transformed into a Resource Description Framework

(RDF) template. Subsequently, an application was developed to feed a Twig template, which in turn

populates the RDF template with data entered in Castor EDC’s electronic Case Report Form (eCRF).

The RDF is accessible within a FAIR Data Point (FDP), allowing researchers to query and re‐use the

data in real‐time. During this implementation, several stakeholders were involved, including a patient

organization and domain, data and ontology experts. Currently, VASCA is published in the EU RD

Platform metadata repository (ERDRI.mdr) and directory of registries (ERDRI.dor).
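The template‐filling step described above can be sketched as follows. This is a simplified stand‐in, not the VASCA implementation: it uses Python's `string.Template` instead of Twig, and the base URI, predicate names and record fields (`patient_id`, `sex`, `year_of_birth`) are hypothetical placeholders rather than the actual CDE semantic model.

```python
from string import Template

# Hypothetical Turtle template for one registry record. The URIs and
# predicates are placeholders, not the real CDE semantic model.
TURTLE_TEMPLATE = Template(
    '<$base/patient/$patient_id> a <$base/vocab/Patient> ;\n'
    '    <$base/vocab/sex> "$sex" ;\n'
    '    <$base/vocab/yearOfBirth> "$year_of_birth"'
    '^^<http://www.w3.org/2001/XMLSchema#gYear> .\n'
)


def record_to_turtle(record, base="https://example.org/registry"):
    """Fill the RDF (Turtle) template with values from one eCRF record.

    Note: values are inserted as-is; a real implementation would need to
    escape literals and validate them against the data model.
    """
    return TURTLE_TEMPLATE.substitute(base=base, **record)


# Made-up eCRF record (illustration only).
ecrf_record = {"patient_id": "p001", "sex": "female", "year_of_birth": "1990"}
turtle = record_to_turtle(ecrf_record)
```

Keeping the template separate from the data entry system means the semantic model can be revised without changing how data are collected in the eCRF.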

In conclusion, we successfully set up the infrastructure for a FAIR RD registry based on the CDEs. The

next step entails actual data collection within the participating centers. Also, we will investigate

interoperability by performing federated SPARQL queries between multiple registries.

Acknowledgements: One author of this abstract is a member of the Vascular Anomalies Working Group (VASCA WG) of

the European Reference Network for Rare Multisystemic Vascular Diseases (VASCERN) ‐ Project ID:

769036.

Keywords: Vascular anomalies, rare disease registry, federated queries, RDF, SPARQL, Twig

template, FAIR data, common data elements, data model



The metabolic profile of arterial calcification in the multi‐cohort

BBMRI setting

Maxime M. Bos (1), Nicolien A. van Vliet (2), Marian Beekman (3), Eline Slagboom (3), BBMRI

Metabolomics Consortium, Meike Vernooij (4), Jeroen van der Grond (5), Fariba Ahmadizar (1),

Mohsen Ghanbari (1), Arfan Ikram (1), Diana van Heemst (2), Daniel Bos (1), Maryam Kavousi (1)

(1) Department of Epidemiology, Erasmus University Medical Center, Rotterdam, The Netherlands,

(2) Section of Gerontology and Geriatrics, Department of Internal medicine, Leiden University Medical

Center, Leiden, The Netherlands, (3) Department of Biomedical Data Sciences, Section Molecular

Epidemiology, Leiden University Medical Center, Leiden, The Netherlands, (4) Department of

Radiology, Erasmus University Medical Center, Rotterdam, The Netherlands, (5) Department of

Radiology, Leiden University Medical Center, Leiden, The Netherlands

Increasing evidence shows that greater arterial calcification leads to elevated risk of atherosclerotic

cardiovascular disease. However, the underlying biological mechanism of site‐specific calcification is

largely unknown. Within the BBMRI framework, we performed a multi‐cohort study on the

associations of the metabolic profile with calcification of coronary arteries (CAC), aortic arch (AAC)

and the aortic valve (AVC).

We included a total of 1114 participants from the population‐based Rotterdam Study and 390 from

the Leiden Longevity Study. Blood samples were used to determine a wide range of plasma

metabolites by proton nuclear magnetic resonance (NMR). Participants underwent non‐contrast

computed tomography to quantify CAC, AAC and AVC. Linear regression modelling adjusted for

relevant covariates was used to assess the associations of 166 metabolites with CAC, AAC and AVC.

Correction for multiple testing was based on 33 independent metabolites (p‐value threshold 0.05/33 = 1.5 × 10⁻³).
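The multiple‐testing threshold above is a Bonferroni‐type correction over the number of independent metabolites. The short sketch below reproduces the arithmetic and applies it to the two p‐values reported in the results of this abstract; the variable names are illustrative only.

```python
ALPHA = 0.05                 # nominal significance level
N_INDEPENDENT = 33           # independent metabolites, as stated in the abstract

# Bonferroni-type threshold: alpha divided by the number of independent tests.
threshold = ALPHA / N_INDEPENDENT

# p-values reported in the results of this abstract.
reported_p_values = {
    "alpha1-acid glycoprotein vs AAC": 9.5e-4,
    "acetate vs CAC (women)": 4.1e-4,
}

# An association counts as significant when its p-value is below the threshold.
significant = {name: p < threshold for name, p in reported_p_values.items()}
```

Dividing by the 33 independent metabolites rather than all 166 measured metabolites avoids over‐correcting for metabolites that are highly correlated with one another.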

One standard deviation (SD) increase in concentration of α1‐acid glycoprotein was associated with a

0.10 SD increase in AAC (standard error (SE) = 0.03, p‐value = 9.5 × 10⁻⁴). When considering sex‐specific

effects, we observed an association of acetate with CAC (beta = −0.09, SE = 0.03, p‐value = 4.1 × 10⁻⁴)

in women.

Higher levels of circulating glycoproteins were associated with increased AAC. Moreover, acetate was

associated with CAC only among women indicating differences in metabolic profile of CAC between

men and women. These results provide evidence for location‐specific differences and sex‐specific

effects in the etiology of atherosclerosis.

Acknowledgements: This work was performed within the framework of the BBMRI Metabolomics Consortium funded by

BBMRI‐NL, a research infrastructure financed by the Dutch government through Netherlands

Organisation for Scientific Research (NWO) (Grant Nos. 184.021.007 and 184033111). NvV and DvH

were supported by the European Commission project THYRAGE (Horizon 2020 research and

innovation programme, 666869).

Keywords: BBMRI, arterial calcifications, metabolomics, multi‐cohort, cardiovascular diseases



e/MTIC ‐ Health Data Portal initiative

(1) Eindhoven University of Technology, (2) Catharina Hospital, (3) Kempenhaeghe Epilepsy and Sleep

Center, (4) Royal Philips Eindhoven, (5) the Maxima Medical Center


The Eindhoven MedTech Innovation Center (e/MTIC) is a large‐scale research collaboration aimed at

improving public healthcare through high‐tech health innovations.

Within this consortium with Catharina Hospital, the Maxima Medical Center, Kempenhaeghe Epilepsy

and Sleep Center, Eindhoven University of Technology and Royal Philips Eindhoven, we are

developing the Health Data Portal to facilitate and enable joint research projects.

The Health Data Portal (HDP) is a scalable collaboration platform that builds on existing initiatives to

provide an infrastructure where researchers can bring together and work safely with medical data. It

brings together medical institutes, academia and commercial partners to provide a fast track to

innovation.

In the design of the HDP, we have given highest priority to GDPR compliance, without compromising

the research requirements. By using a trusted organisation as independent third party to process and

handle data requests, we can build trust and confidence that records are held securely and data is

de‐identified appropriately. The HDP also provides secure solutions for medical studies that require

combining data sets from multiple institutions and sources, stimulating collaboration across institutions.

Healthcare innovations are increasingly based and dependent on data‐driven research.

The HDP will gradually build up a rich metadata catalogue of medical information from the e/MTIC

partners, first within the domains of cardiovascular‐care, perinatal‐care and sleep‐care. This central

catalogue can grow by including other health domains and institutes in the future.

These large medical datasets, available between the data science platform members, will enable

scientists to develop and test new hypotheses about human health. The data sets can enable new AI

applications for machine learning to revolutionize healthcare.

We have the ambition to grow into a national platform that can lead to faster life sciences and health

innovation.

Acknowledgements: The Eindhoven MedTech Innovation Center (e/MTIC)

Keywords: IT Architecture, Healthcare, Portal, Platform, Collaboration, Data analysis, Data Science,

Artificial Intelligence (AI), Anonymization, Pseudonymization, Medical Engineering, Diagnostics,

Value‐based.



The Pregnancy And Childhood Epigenetics (PACE) Consortium ‐ A

platform for epigenome‐wide association meta‐analyses

Janine F. Felix (1,2), Stephanie J. London (3), on behalf of the PACE Consortium

(1) The Generation R Study Group, Erasmus MC, University Medical Center Rotterdam, Rotterdam, the

Netherlands, (2) Department of Pediatrics, Erasmus MC, University Medical Center Rotterdam,

Rotterdam, the Netherlands, (3) Epidemiology Branch / Genetics, Environment & Respiratory Disease

Group, National Institute of Environmental Health Sciences, Research Triangle Park, NC, USA

Differential DNA methylation represents a potential mechanism underlying associations between early‐life

exposures and later‐life health. In recent years, many pregnancy, birth and childhood studies have

initiated research on DNA methylation, using Illumina 450K or 850K EPIC arrays. These data can be

used in epigenome‐wide association studies. As individual studies are usually underpowered for such

studies, collaboration between studies and combined meta‐analyses are needed to optimize the use

of resources and increase the likelihood of detecting DNA methylation differences.

The global PACE Consortium brings together 40 studies with DNA methylation data in over 30,000

pregnant women, newborns, and children. Its primary aim is to identify differential DNA methylation

related to exposures and outcomes pertinent to health in pregnancy and childhood through joint

analysis of DNA methylation data. Secondary aims are to perform functional annotation‐based

analyses, to study causality of DNA methylation differences for child health phenotypes, to

contribute to methodologic development, and to exchange knowledge and skills. Studies include

participants from various backgrounds in terms of ethnicity, age, and living environment, enabling

testing of identified associations across different settings.

Findings to date include associations of prenatal maternal smoking, body mass index and air

pollution exposure with offspring DNA methylation at birth and in childhood, as well as associations

of DNA methylation with child asthma and lung function. Ongoing work focuses on further

gestational exposures, such as maternal stress and nutritional exposures, as well as child health

outcomes including cardio‐metabolic and neuro‐developmental phenotypes. A current overview of

published papers can be found at:

http://www.niehs.nih.gov/research/atniehs/labs/epi/pi/genetics/pace/index.cfm. The PACE

Consortium is an open, dynamic collaboration. Additional research groups are welcome to join. It

offers a strong platform to study the role of DNA‐methylation in the associations of early‐life

exposures and later health outcomes and to contribute to the field of population epigenetics.

Acknowledgements: NA

Keywords: DNA methylation, consortium, epigenetics



The DANS services for sharing, cataloguing and archiving your health

data

Cees Hof, Ingrid Dillo, Heidi Berkhout

Data Archiving and Networked Services (DANS)

DANS (Data Archiving and Networked Services) is the Netherlands institute for permanent access to

digital research resources. DANS encourages researchers to make their research data and related

digital outputs Findable, Accessible, Interoperable and Reusable (FAIR). To realise our mission, DANS

provides expert advice and certified services. DataverseNL is the DANS service for short‐term data

management, EASY our long‐term data archive, and NARCIS the national catalogue service for

scholarly information. Training and consultancy services are provided for generic Research Data

Management and Data Management Planning. More specific training sessions focus on repository

certification, metadata standards, software sustainability and knowledge organisation systems. The

(coordinating) activities of DANS in (inter)national projects and networks ensure constant innovation

and state‐of‐the‐art knowledge of infrastructural data developments.

Although the roots of DANS are within the humanities and social sciences, most DANS services are

generic services relevant for nearly all scientific disciplines, including the life and health sciences. As

part of the Dutch national e‐infrastructure for research data, DANS is involved in several projects and

initiatives around health data, often acting at the crossroads between the life and social sciences.

Also, the DANS training activities touch upon the developments around health data. Cataloguing the

Dutch “zorggegevens” (healthcare data) in NARCIS and the DANS training modules in the Helis Academy FAIR data

stewardship course are examples of specific DANS contributions to the life and health sciences.

The DANS poster presentation provides an overview of the DANS services of interest to the owners

and custodians of health data, including examples of relevant recent projects. DANS invites

participants of the Health‐RI 2020 conference to probe how DANS could support the sharing,

cataloguing and archiving of their health data.


Keywords: FAIR data, services, training, archiving, data sharing, data catalogue



Metabolic risk scores: from metabolome to phenotype and back

D. Bizzarri (1, 2), M.J.T. Reinders (2, 3), P.E. Slagboom (1,4), E.B van den Akker (1, 2, 3)

1) Molecular Epidemiology, LUMC, Leiden, The Netherlands, 2) LCBC, LUMC, Leiden, The Netherlands,

3) Delft Bioinformatics Lab, TU Delft, Delft, The Netherlands, 4) Max Planck Institute for the Biology of

Ageing, Cologne, Germany

Introduction: The blood metabolome incorporates cues of the environmental and genetic

background of an individual, potentially offering a holistic view of their health status. Different types of

diseases have similar impacts on the blood metabolome, suggesting that the blood metabolome might

not be disease‐specific. On this premise, we tried to identify novel metabolic states representing

the risk for multiple related cardio‐metabolic outcomes, using metabolic predictors for biomarkers

typically used in the clinic.

Methods: We will use the data available in BBMRI‐NL (comprising 1H‐NMR serum metabolomics

for 29 cohorts), to investigate the metabolic component of the available risk factors. We will use a

penalized regression model to automatically select a subset of metabolites whose linear

combination best predicts each risk factor. We will employ two evaluation procedures: a 5‐fold

cross‐validation and a leave‐one‐biobank‐out validation (holding out one cohort to use it as a test

set). From the trained models, we will obtain metabolic surrogate risk factors, which we will combine

by training penalized Cox regression models on 2 cohorts (enriched with cardiometabolic diseases) to

predict different cardio‐metabolic outcomes in these individuals.
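The second evaluation procedure, holding out one complete cohort per round, can be sketched as a simple index generator. This is a minimal illustration under stated assumptions, not the authors' code: the function name `leave_one_biobank_out` and the biobank labels are hypothetical, and model fitting is left out.

```python
def leave_one_biobank_out(biobank_labels):
    """Yield (held_out, train_indices, test_indices) for each biobank.

    biobank_labels: one label per sample, naming the cohort it comes from.
    Each round holds out every sample of one cohort as the test set and
    uses all remaining samples for training.
    """
    for held_out in sorted(set(biobank_labels)):
        train = [i for i, b in enumerate(biobank_labels) if b != held_out]
        test = [i for i, b in enumerate(biobank_labels) if b == held_out]
        yield held_out, train, test


# Made-up labels for six samples from three hypothetical cohorts.
labels = ["A", "A", "B", "C", "C", "C"]
splits = list(leave_one_biobank_out(labels))
```

Compared with random 5‐fold splitting, this scheme tests whether a model transfers to a cohort it has never seen, which is a stricter check when cohorts differ in sampling or measurement conditions.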

Results: In an exploratory analysis, we investigated which penalized regression method could deliver

the best metabolic prediction, using the data of one of the BBMRI cohorts (LLS‐P/O). We performed a

Two‐Deep‐Cross‐Validation analysis for 12 risk factors (both continuous and binary) to evaluate the

accuracy of Ridge (RR), Lasso (LR) and Elastic Net (EN) regression. In this setting, we obtained

similar accuracy scores for the three methods, and we obtained good performances in particular for

the prediction of gender (AUC ≈ 0.92), type 2 diabetes (AUC ≈ 0.88) and statin use (AUC ≈ 0.76).
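For readers less familiar with the AUC values quoted above, the sketch below computes the area under the ROC curve as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The function name `roc_auc` and the toy data are illustrative and unrelated to the study data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for positive cases, 0 for negative cases
    scores: predicted scores, higher meaning more likely positive
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Toy example: three of the four positive/negative pairs are ranked correctly.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(toy_labels, toy_scores)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so the reported values of roughly 0.76 to 0.92 indicate moderate to strong discrimination.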

Conclusions: We observed that the 1H‐NMR blood metabolomics could be used to accurately predict

several clinically relevant variables using penalized regression models.

Acknowledgements MOLEPI, LCBC, BBMRI‐NL

Keywords: metabolome, risk factors, metabotypes, BBMRI‐NL, regression models



Towards precision diagnostics: Untargeted metabolomics for the

diagnosis of inborn errors of metabolism in individual patients

Purva Kulkarni 1, Albert Gerritsen 1,2, Udo F.H. Engelke 1, Brechtje Hoegen 2, Siebolt de Boer 1, Ed

van der Heeft 1, Marleen C.D.G. Huigen 1, Leo A.J. Kluijtmans 1, Karlien L.M. Coene 1

(1) Department of Laboratory Medicine, Translational Metabolic Laboratory (TML), Radboud

University Medical Center, Geert Groote Plein Zuid 10, 6525 GA Nijmegen, The Netherlands.

(2) Department of Human Genetics, Radboud University Medical Center, Nijmegen, The Netherlands.

Introduction

Inborn Errors of Metabolism (IEM) are inherited conditions caused by genetic defects in enzymes or their

cofactors, resulting in a specific metabolite fingerprint showing accumulation of substrate or lack of

end‐product in patient body fluids (1). Untargeted metabolomics offers a comprehensive readout of

metabolic status on an individual patient basis. This makes it a promising tool for diagnostic

screening and treatment monitoring of IEM patients, especially when clinical presentations are non‐

specific.

Technological and methodological innovation

We have previously established Next‐Generation Metabolic Screening (NGMS) (2) as a metabolomics‐based

diagnostic tool for individual IEM‐suspected patients. To fully exploit the clinical potential of NGMS,

we have developed an automated computational pipeline to streamline analysis of complex data and

make it reproducible. The pipeline features a GUI that converts raw data, detects and aligns features

across samples and annotates them to identify significant deviations in patients as compared to

controls.
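The final step, identifying features that deviate in a patient compared to controls, can be illustrated with a per‐feature z‐score. The abstract does not state which statistic the pipeline uses, so the z‐score, the cutoff of 3 and all names and values below are assumptions made purely for illustration.

```python
from statistics import mean, stdev


def feature_z_scores(patient, controls):
    """Z-score of each patient feature relative to the control distribution.

    patient:  {feature_name: intensity} for one patient
    controls: {feature_name: [intensities in control samples]}
    Features with zero variance in the controls are skipped.
    """
    z_scores = {}
    for feature, value in patient.items():
        reference = controls[feature]
        spread = stdev(reference)
        if spread == 0:
            continue
        z_scores[feature] = (value - mean(reference)) / spread
    return z_scores


def flag_deviations(z_scores, cutoff=3.0):
    """Keep features whose absolute z-score reaches the (assumed) cutoff."""
    return {f: z for f, z in z_scores.items() if abs(z) >= cutoff}


# Made-up intensities for two hypothetical aligned features.
control_intensities = {
    "feature_A": [10, 11, 9, 10, 10],
    "feature_B": [5, 6, 4, 5, 5],
}
patient_intensities = {"feature_A": 10.5, "feature_B": 12}

z = feature_z_scores(patient_intensities, control_intensities)
flagged = flag_deviations(z)
```

In this toy example only `feature_B` is flagged, mirroring how an accumulated substrate would stand out against the control range while unaffected metabolites do not.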

Results and impact

Using our automated computational pipeline, we have advanced the application of metabolomics in

the clinical diagnostic setting to the next level. Our pipeline ensures reproducible and time‐efficient

metabolomics data management, processing and analysis. To validate this pipeline, we tested

samples of IEM patients, including several diagnoses that were not yet measured with NGMS, for

example L‐2‐hydroxyglutaric aciduria. Our results further expand the clinical applicability and IEM

portfolio of NGMS.

References

1. A. Tebani et al., International Journal of Molecular Sciences. 17 (2016)

2. K. L. M. Coene et al., Journal of Inherited Metabolic Disease. 41, 337–353 (2018)

Acknowledgements: N/A

Keywords: metabolomics, diagnostics, rare diseases, big data, Inborn Errors of Metabolism, data

analysis, computational pipeline, Biomarkers, Mass spectrometry
