cover page (dtu) emblrandd.defra.gov.uk/...comparepartb1-3.pdf · compare 1 cover page compare:...




COVER PAGE

COMPARE: COllaborative Management Platform for detection and Analyses of (Re-) emerging and foodborne outbreaks in Europe

Participant No    Participant Organisation Name    Country
1     Technical University of Denmark (DTU)    DK
2     Erasmus Medical Center (EMC)    NL
3     Statens Serum Institut (SSI)    DK
4     Friedrich-Loeffler-Institute (FLI)    G
5     Agence nationale de sécurité sanitaire de l’alimentation, de l’environnement et du travail (ANSES)    F
6     Robert Koch-Institut (RKI)    G
7     European Molecular Biology Laboratory (EMBL)    G
8     Instituto Superiore di Sanita (ISS)    IT
9     RijksInstituut voor Volksgezondheid en Milieu (RIVM)    NL
10    Animal Health and Veterinary Laboratories Agency (AHVLA)    UK
11    University of Edinburgh (UEDIN)    UK
12    Universitäts Klinikum Bonn (UK-Bonn)    G
13    Academic Medical Center (AMC)    NL
14    Universiteit Antwerpen (UA)    B
15    Artemis Wildlife Health BV (Artemis)    NL
16    University of Cambridge (UCAM)    UK
17    Tierärztliche Hochschule Hannover (TIHO)    G
18    Universidad Castilla de la Mancha (UCLM)    ES
19    Fondation Mérieux (FMER)    F
20    Aristotle University Thessaloniki (AUTH)    GR
21    L'Institut Français de Récherche pour l'Exploitation de la Mer (IFREMER)    F
22    Erasmus Universiteit Rotterdam (EUR)    NL
23    Australian National University (ANU)    AU
24    Magyar Tudomanyos Akademia Wigner Fizikai kutatokozpont (WIGNER)    HU
25    Civic Consulting Alleweldt & Kara Gbr (CIVIC)    G
26    Responsible Technology (RT)    F
27    University of Bologna (UNIBO)    IT
28    Leibniz-Institut DSMZ GmbH (DSMZ)    G
29    Wellcome Trust Sanger Institute (WTSI)    UK

1. EXCELLENCE
1.1 Objectives
1.2 Relation to the work programme
1.3 Concept and approach

2. IMPACT
2.1 Expected impacts
2.2 Measures to maximise impact

3. IMPLEMENTATION
3.1 Work plan - Work packages, deliverables and milestones
3.2 Management structure and procedures
3.3 Consortium as a whole
3.4 Resources to be committed


1. EXCELLENCE

1.1 OBJECTIVES

COMPARE (COllaborative Management Platform for detection and Analyses of (Re-) emerging and foodborne outbreaks in Europe) is a collaboration of founding members of the Global Microbial Identifier (GMI) initiative (http://www.globalmicrobialidentifier.org) and institutions with hands-on experience in outbreak detection and response. GMI was established in 2011 with the vision to develop the potential of breakthrough sequencing technologies for the field of infectious diseases through a joint research and development agenda, with applications in clinical and public health laboratories across the world. In order to achieve that long-term goal, the GMI group aims to promote development and deployment of novel applications, data sharing and analysis systems across the diversity of pathogens, domains and sectors. The COMPARE project is set up to put this vision into action in Europe, addressing the following main objective:

• TO IMPROVE RAPID IDENTIFICATION, CONTAINMENT AND MITIGATION OF EMERGING INFECTIOUS DISEASES AND FOODBORNE OUTBREAKS,

• BY DEVELOPING A CROSS-SECTOR AND CROSS-PATHOGEN ANALYTICAL FRAMEWORK AND GLOBALLY LINKED DATA AND INFORMATION SHARING PLATFORM,

• THAT INTEGRATES STATE-OF-THE-ART STRATEGIES, TOOLS, TECHNOLOGIES AND METHODS FOR COLLECTING, PROCESSING AND ANALYZING SEQUENCE-BASED PATHOGEN DATA IN COMBINATION WITH ASSOCIATED (CLINICAL, EPIDEMIOLOGICAL AND OTHER) DATA,

• FOR THE GENERATION OF ACTIONABLE INFORMATION TO RELEVANT AUTHORITIES AND OTHER USERS IN THE HUMAN HEALTH, ANIMAL HEALTH AND FOOD SAFETY DOMAINS;

Next Generation Sequencing (NGS), used for Whole Genome Sequencing (WGS) or Whole Community Sequencing (WCS or metagenomics), is opening and dominating a new field of data generation and connection, revolutionizing pathogen detection and typing in human and animal health just as much as in food science. These technologies are rapidly dropping in cost and are coming within reach of routine clinical and public health laboratories (Köser et al. 2012, Didelot et al. 2012). NGS/WGS/WCS enables generation of the complete genomic information from the isolate or sample, independent of both the sector (public health, veterinary health, food safety) and the type of pathogen (viruses, bacteria, parasites). The outputs (sequence data) provide one common language that can be exchanged and compared between laboratories and over time, in combination with other associated data, defined here as “metadata”, including contextual data (e.g. data on sample type and process, clinical, microbiological, epidemiological and other data), primary data (raw sequence reads) and derived data (e.g. genomic alignments of reads, assemblies and functional annotation data sets). COMPARE aims to harness the rapid advances in these technologies to improve identification and mitigation of emerging infectious diseases and foodborne outbreaks. To this end, COMPARE will establish a “One serves all” analytical framework and data exchange platform that will allow real-time analysis and interpretation of sequence-based pathogen data in combination with associated data.
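The three kinds of associated data named above (contextual, primary and derived) can be pictured as one record per sample. The following is a minimal sketch only: the class names, field names and example values are hypothetical and are not taken from any COMPARE specification.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ContextualData:
    """Context of the sample (hypothetical fields, for illustration only)."""
    sample_type: str       # e.g. "stool", "poultry meat"
    sector: str            # "public health", "veterinary health" or "food safety"
    country: str           # country where the sample was collected
    collection_date: str   # ISO 8601 date string


@dataclass
class SampleRecord:
    """One sample with its contextual, primary and derived data."""
    record_id: str
    contextual: ContextualData
    # Primary data: references to raw sequence read files.
    primary_reads: List[str] = field(default_factory=list)
    # Derived data: name of the product -> reference to it
    # (e.g. "assembly", "alignment", "annotation").
    derived: Dict[str, str] = field(default_factory=dict)

    def has_derived(self, product: str) -> bool:
        """Return True if a derived data product of this name is attached."""
        return product in self.derived


record = SampleRecord(
    record_id="sample-001",
    contextual=ContextualData("stool", "public health", "DK", "2014-02-01"),
    primary_reads=["sample-001_R1.fastq", "sample-001_R2.fastq"],
    derived={"assembly": "sample-001.fasta"},
)
```

Keeping the sector as an ordinary field, rather than a separate record type per sector, reflects the idea that the same structure serves human health, animal health and food safety samples alike.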

Figure 1: Genomic Information as the pathogen independent language across locations, sectors and time


In order to achieve this, COMPARE will pursue specific objectives A through G, to be realized within the five-year duration of COMPARE. Together these 7 project objectives shape the overall structure of COMPARE – as depicted in figure 2 – showing the different components of the COMPARE Analytical Framework.

The project follows a common trajectory necessary to obtain high quality information for stakeholders to improve rapid identification, containment and mitigation of emerging infectious diseases and foodborne outbreaks.

A. Starting from the design of risk-assessment models and risk-based sampling and data collection strategies that enhance our capacity to detect potential disease outbreaks;

B. From samples and associated metadata to comparable data: The process to obtain high quality and comparable sequence data from and metadata associated with a specimen;

C. From comparable data to actionable information: The downstream analyses involved in turning comparable data into actionable information for addressing questions in frontline diagnostics, food-borne infections and (re-) emerging infections. “Actionable Information” is defined as information that enables users generating/receiving this information to take well-informed decisions and actions in pursuit of:

• Pathogen identification and characterization: Pathogen identification, genotyping and phenotyping, (e.g., detection of relevant antimicrobial resistance, virulence, epidemiological markers);

• Outbreak detection: Detection of putative clusters by examining strain-specific clusters in time, place and host (person, animal and food);

• Outbreak investigation: Rapid interrogation for given molecular strains to identify the potential origin of internationally distributed clones that may result in outbreaks; analysis tools to monitor the extent of spread based on sequence diversity in relation to control measures;

• Outbreak prediction: Automatic analyses for predicting risk of emergence of pathogens with outbreak potential.

D. Researchers focusing on these steps team up with system developers that will build a data and information platform supporting rapid sharing, integration and analysis of sequence-based pathogen data in combination with other contextual metadata;

FIGURE 2: the COMPARE analytical framework and its main components from sample and data collection to generating actionable information to stakeholders in the human, animal and food sectors.


E. Risk communication tools will be developed enabling authorities in the human and animal health and food safety sectors to effectively communicate to their stakeholders results obtained with the new analytical workflows;

F. The development of the analytical framework is underpinned by a set of supporting research, dissemination and communication activities promoting the acceptance of the system and its components. These activities encompass (i) consultations with our stakeholders serving on expert advisory panels throughout the project to maintain a prominent focus on user needs, (ii) studies on the barriers (ethical, regulatory, administrative, logistical, political) to the implementation and widespread use of open data-sharing platforms, and (iii) dissemination and training activities;

G. Finally, COMPARE will include the development of a framework for estimating the cost-effectiveness of the COMPARE system, including the value of safety.

The specific objectives are listed below, and are further detailed in section 1.3, which describes the overall concept and approach of COMPARE. All objectives will take into consideration the inputs and feedback as gathered in the user stakeholder consultations and from existing networks and projects (see section 1.3) to avoid any duplication and to improve the input and output of available information.

A. To develop risk assessment (RA) models and risk-based sampling and data collection strategies for NGS-based analyses of food-borne and (re-)emerging infections.

This work will build on experience of advanced RA modeling in the field of food-borne illnesses and for emerging infections, redesigning these RA models to include outputs from NGS analyses and developing matching risk-based sampling and data collection strategies. This involves the following set of specific objectives:

1. To develop a methodology for RA that incorporates NGS outputs, including:

• A generic and spatial RA framework to identify which regions and species are at an increased risk of incursion and further spread of novel pathogens;

• A generic food chain risk assessment framework based on NGS data;

• Tools for epidemiological transmission modeling and rapid spatial RA.

2. To develop risk-based sampling and data collection strategies for early detection of pathogens and for investigation of unusual patterns of infectious disease outbreaks, including risk-based sampling algorithms and protocols for:

• Unusual patterns of clinical symptoms in humans and domestic animals in medical and veterinary practice;

• Early detection of emerging and re-emerging infections coming from wild or feral animals;

• Detection of human pathogen circulation in the absence of recognized illness;

• Food level sampling strategies for surveillance as well as food-borne outbreak investigation.

B. From samples and associated metadata to comparable data: To develop harmonised analytical workflows for generation of high quality NGS data in combination with relevant metadata for pathogen detection and typing across sample types, pathogens and domains.

This objective addresses the need for developing harmonized standards in the processes involved in generating high quality and comparable sequence data from a variety of sample types (and associated metadata) collected in different domains and sectors including:

• Methods to optimize and harmonize sample handling for NGS;

• Standardized protocols for sample processing for different sample types and viruses, bacteria and parasites;

• Standardized sequencing protocols for the different pathogens as well as for different purposes (surveillance, diagnostics, single isolates, metagenomics);

• Improved sequence analyses, including novel bioinformatics tools for metagenomics, isolate typing, and de novo reference-less identification of pathogen related sequences;

• Protocols and tools for sequence curation and storage;

• Historic and prospective reference biobanks;

• A scheme for ring trials and external quality assurance systems.


C. From comparable data to actionable information: To develop and apply innovative analytical tools and methods for sequence-based pathogen and outbreak detection and analyses;

The challenge in designing an analytical workflow that should generate ‘actionable information’ for these purposes is that the perspective on what is ‘actionable information’ can be quite different for different COMPARE users depending on their disciplinary background and focus. Objective C is therefore further divided into three complementary objectives, targeting:

• Frontline diagnostics in human and veterinary clinical microbiology (C1);

• Detection and mitigation of foodborne outbreaks (C2);

• Detection and mitigation of (re-) emerging diseases (C3).

Within each of these areas, the activities are divided in development of the essential analytical workflows and specific underpinning research studies, addressing key research questions, data, and output needs to pilot the developing workflow across pathogens, sectors and domains.

C1. To develop an analytical workflow for the use of single isolate and metagenomic NGS in human and veterinary clinical microbiology including:

• General work-flows for integration of NGS in clinical laboratory diagnostics;

• A framework for prediction of phenotypic antimicrobial susceptibility based on the presence or absence of genes and mutations in sequence data;

• Tools for identification of hospital clusters and nosocomial transmission.

The development of the analytical workflow is supported by underpinning research aimed at piloting this workflow and assessing the feasibility of NGS/WGS/WCS for clinical diagnostic use and hospital epidemiology with the specific objectives:

• To validate application of NGS for diagnostics and hospital epidemiology;

• To test feasibility of NGS for prediction of antimicrobial resistance phenotype to guide treatment;

• To evaluate the use of NGS for syndromic surveillance based on data from hospitalized patients.
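The presence/absence idea named under C1 can be illustrated with a small lookup. This is a sketch under stated assumptions, not the framework the project would deliver: the marker table holds only three illustrative entries, the marker naming (such as `gyrA_S83L`) is a hypothetical convention, and a real predictor would rely on a curated, validated resistance database and report more than a binary call.

```python
# Illustrative marker -> antimicrobial table (hypothetical, far from complete).
RESISTANCE_MARKERS = {
    "mecA": "oxacillin",           # acquired gene
    "blaCTX-M-15": "cefotaxime",   # acquired gene
    "gyrA_S83L": "ciprofloxacin",  # chromosomal point mutation
}


def predict_phenotype(detected_markers):
    """Predict "R" (resistant) or "S" (susceptible) per antimicrobial.

    An antimicrobial is called resistant if at least one marker linked to it
    was detected in the sequence data; otherwise it is called susceptible.
    Markers that are not in the table are ignored.
    """
    prediction = {drug: "S" for drug in RESISTANCE_MARKERS.values()}
    for marker in detected_markers:
        drug = RESISTANCE_MARKERS.get(marker)
        if drug is not None:
            prediction[drug] = "R"
    return prediction


result = predict_phenotype({"mecA", "some_unlisted_gene"})
```

A rule of this kind can only be as good as its marker table, which is why the proposal pairs the framework with a feasibility study against phenotypic testing.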

C2. To develop an analytical workflow for population-based disease surveillance, outbreak detection and epidemiological modeling of food-borne infections, including (i) cross-sector and cross-pathogen methods for sequence-based surveillance for food-borne pathogens and (ii) methods and tools to support outbreak detection, outbreak investigations and epidemiological analysis. This will include:

• An analytical framework for routine sequence-based surveillance of current priority pathogens in food-borne infections;

• A framework for the application of the developed NGS and analysis tools in the epidemiological handling of and response to food-borne outbreaks in Europe;

• Tools for source attribution based on NGS-based routine surveillance of food-borne pathogens.

The development of the analytical workflow will be supported by underpinning research on the use of NGS/WGS-based pathogen genome data in the context of the surveillance activities for the main food-borne bacteria, viruses and parasites, i.e. Salmonella, E. coli, Listeria, Cryptosporidium, norovirus, and hepatitis A. In this supporting research we will:

• Establish robust analytic procedures for NGS/WGS data for Salmonella, STEC/EHEC, Listeria, norovirus, hepatitis A, and Cryptosporidium to be able to identify epidemiologically linked isolates and differentiate these from similar unrelated isolates;

• Develop guidelines for interpretation criteria for defining clusters of disease and linking of isolates from various sources and reservoirs;

• Enable backward compatibility to important previous nomenclature and typing methods (e.g. serotypes, species, MLST);

• Perform studies in collaboration with global partners and networks including ECDC and EFSA.

C3. To develop an analytical workflow encompassing cross-sector and cross-pathogen methods for emerging pathogen identification and characterization in support of outbreak investigations and epidemiological analysis. This will include:

• An analytical framework for detection of (re-) emerging pathogens in metagenomic datasets;

• Analytical tools for rapid sequence-based detection of strain-specific clusters in time, place and host for the main emerging pathogen classes;


• Analytical tools for fast and robust phylogenetic and phylogeographic analysis;

• Analytical tools for detecting single nucleotide polymorphisms in pathogen NGS data;

• Computational methods for the prediction of phenotype changes related to antigenicity, drug-resistance, virulence, transmission, and other traits from nucleotide sequence data.

The development of the analytical workflow is supported by underpinning research using the tools developed under C3, in the following studies:

• Early detection and surveillance of enteric pathogens and genes through strategic sampling and metagenomic analysis;

• “Hot-spot” based syndromic surveillance in animals and humans;

• Early detection and surveillance of emerging zoonoses from wildlife reservoirs through strategic sampling and metagenomic analysis;

• Early detection of changes in pathogen traits enhancing risk of outbreaks and pandemics.
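To make the notions of single nucleotide polymorphism detection and strain-specific clusters concrete, the sketch below counts SNP differences between pre-aligned sequences and groups isolates by single-linkage clustering. It is an illustration under simplifying assumptions: the sequences are taken to be already aligned and of equal length, the threshold `max_snps` is a hypothetical parameter with no recommended value, and the time, place and host dimensions mentioned in the objectives are left out.

```python
from itertools import combinations


def snp_distance(seq_a, seq_b):
    """Count positions where two aligned sequences carry different bases.

    Positions where either sequence has a non-ACGT symbol (e.g. "N" or a
    gap) are skipped rather than counted as differences.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    bases = set("ACGT")
    return sum(
        1 for a, b in zip(seq_a, seq_b)
        if a != b and a in bases and b in bases
    )


def cluster_isolates(sequences, max_snps):
    """Single-linkage clustering: link isolates at most max_snps apart."""
    parent = {name: name for name in sequences}

    def find(name):
        # Follow parent links to the cluster root, compressing the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(sequences, 2):
        if snp_distance(sequences[a], sequences[b]) <= max_snps:
            parent[find(a)] = find(b)

    clusters = {}
    for name in sequences:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


isolates = {
    "iso1": "ACGTACGT",
    "iso2": "ACGTACGA",  # 1 SNP away from iso1
    "iso3": "TTTTACGA",  # 3 SNPs away from iso2, 4 from iso1
}
clusters = cluster_isolates(isolates, max_snps=1)
```

Single linkage is chosen here only because it is short to write; it can chain distant isolates together through intermediates, so an operational tool would need a considered choice of method and threshold per pathogen.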

D. To develop and apply a COMPARE data and information platform supporting rapid sharing, integration and analysis of sequence-based pathogen data in combination with other contextual metadata (clinical, microbiological, epidemiological, additional gene- and transcriptome-based analyses and other data) integrating both publicly available and confidential data;

This objective is at the heart of COMPARE as it deals with developing and implementing the core system architecture and software that will enable COMPARE users to store, access, share and analyze sequence-based pathogen data and associated metadata in support of the generation of actionable information. The system will include open and private/restricted areas, will be scalable and linked to appropriate existing systems and databases, and provide compute capacity, tools and interfaces in support of the COMPARE partners and future users.

E. To develop and apply novel risk communication tools and strategies for relevant authorities in the human health, animal health and food safety domains;

Building on the EU-funded TELL-ME project, the COMPARE framework will include risk communication tools (repertory of message templates, best cases, and guidelines for message creation in severe infectious disease events) in support of development of communication messages about findings, outbreaks, and new opportunities discovered and/or generated through COMPARE, addressing different sub-populations, in diverse EIDs and geographical, cultural, and temporal contexts.

F. To promote the (future) sustainable uptake of the analytical workflow or components thereof by the key user-stakeholders of COMPARE by means of user consultations, dissemination and training activities.

The five-year project duration is regarded by COMPARE as the first start-up phase in the future wide-scale deployment of the system by its main users. Therefore, COMPARE will invest heavily in ensuring the direct involvement of its users and other stakeholders right from the start of the project by:

• Organizing structured periodical User-Consultations promoting open and direct dialogue between envisaged users of COMPARE, serving on Expert Advisory Panels (EAPs);

• Identifying, clarifying and, as far as feasible, developing practical solutions for Political, Ethical, Administrative, Regulative and Legal (PEARL) barriers that hamper the timely and open sharing of data through COMPARE;

• Providing future users of COMPARE with adequate and easily accessible training programmes, combining help-desking, practical workshops, individual trainings and e-learning tools;

• Informing the potential users of COMPARE and their stakeholders (patients, public) on the rationale for COMPARE and its added value for rapid outbreak containment and response;

• Direct involvement of the European Nucleotide Archive (ENA) and thereby the International Nucleotide Sequence Database, of which ENA is the European node. Thus, all publicly available data will automatically receive global presentation through US-based NCBI’s GenBank, and the Asian counterpart DDBJ, hosted in Japan. Importantly, using ENA technology and data infrastructures will assure that non-COMPARE data sets, which will nonetheless be critical for COMPARE consumers, are integrated into the system and made available.

G. To develop a standardised framework for estimating the cost-effectiveness of the COMPARE system and related methods and tools, including the value of safety;


The adoption of COMPARE by users will largely be influenced by the cost-effectiveness of the tools made available in COMPARE. COMPARE will develop a framework for estimating the cost-effectiveness of the COMPARE system, including the value of safety, addressing the following specific objectives:

• To identify the important elements in calculating costs and benefits of COMPARE and related methods and tools (both regarding the system itself and from a societal perspective);

• To identify and where necessary develop state-of-the-art costing methodologies for the different elements in the framework;

• To develop and apply a methodology to value safety (provided through rapid identification of pathogens through COMPARE) in several countries;

• Using the outcomes of the three objectives above, to estimate the cost-effectiveness of COMPARE and related methods and tools using case studies;

• To assess options for refining selected elements of COMPARE in view of improving the overall cost-effectiveness of the system.
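One widely used summary measure in cost-effectiveness analysis is the incremental cost-effectiveness ratio (ICER): the extra cost of a new approach divided by its extra effect compared with a reference approach. The sketch below shows the arithmetic only. The proposal does not state that the ICER is the measure COMPARE would adopt, and the function name, the choice of effect unit and all figures are hypothetical.

```python
def icer(cost_new, cost_ref, effect_new, effect_ref):
    """Incremental cost-effectiveness ratio of a new approach vs a reference.

    Costs are in one currency unit; effects are in one effect unit
    (for example, cases averted). Returns cost per additional unit of effect.
    """
    delta_effect = effect_new - effect_ref
    if delta_effect == 0:
        raise ValueError("ICER is undefined when both approaches have equal effect")
    return (cost_new - cost_ref) / delta_effect


# Hypothetical figures: the new approach costs 20,000 more and averts
# 5 more cases than the reference approach.
ratio = icer(cost_new=120000.0, cost_ref=100000.0, effect_new=15.0, effect_ref=10.0)
```

With these invented figures the ratio is 4,000 per additional case averted; whether that is acceptable depends on a willingness-to-pay threshold, which is where a valuation of safety would enter.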

1.2 RELATION TO THE WORK PROGRAMME

COMPARE is designed to meet the requirements of topic PHC-7-2014: Improving the control of infectious epidemics and foodborne outbreaks through rapid identification of pathogens.

Challenge

Project proposals in response to PHC-7-2014 are expected to address the following challenge:

Specific Challenge: Human and animal health worldwide is increasingly threatened by potential epidemics caused by existing, new and emerging infectious diseases (including from antimicrobial resistant pathogens), placing a burden on health and veterinary systems, reducing consumer confidence in food, and negatively affecting trade, food chain sustainability and food security.

The increasing incidence and more rapid spread of such diseases are facilitated by modern demographic, environmental, technological, economic and societal conditions. Many of these infections are zoonoses, necessitating an integrated, cross-border, ‘one health’ approach to research and public health measures in the human and veterinary field, including the food chain.

All partners in COMPARE address this challenge, directly or indirectly, on a daily basis as national and international diagnostic, research, surveillance, and/or preparedness and response reference centers for emerging infections and/or food-borne infections. A major concern in addressing this challenge is the increasing incidence of these infections and the increasing demand for rapid high quality data, coupled with global disparity in disease detection methods and capacities, leading to ongoing spread of pathogens that potentially could have been controlled if detected earlier (Braden et al. 2013; Jonges et al. 2013). More specifically, the challenges are as summarized below.

For human health, current routine disease surveillance typically captures the tip of the iceberg, as most systems are based on clinical cases observed at doctor’s visits, of which an often very limited number are reported to health authorities (Gibbons et al. 2014). Furthermore, the current routine is based on individual pathogens, not necessarily capturing all emerging disease threats, for instance antimicrobial resistance genes that are present in the commensal bacterial flora and thus go undetected (Aarestrup 2006). Regional and global surveillance networks share this limited coverage, and the time delay for outbreak confirmation and reporting has been estimated at a median of 23 days (Chan et al. 2010). As a consequence, novel or re-emerging infectious agents are often not detected before they have spread locally and globally.

For animal health, well-defined surveillance systems exist, focusing on diseases with massive economic impact because of immediate restrictions to movement of production animals and food products if detected (e.g., avian influenza, foot and mouth disease). The vector-borne diseases bluetongue and Schmallenberg illustrate the highly unpredictable nature of disease emergence, and the challenges in controlling their spread given the density and connectedness of animal populations. Another major challenge is that novel human diseases often are rooted in the animal world, but may not affect animal health, thus escaping attention. This includes the recent emergence of avian influenza in China, and MERS CoV in the Middle East, that cause no or marginal symptoms in poultry and camels, respectively, but lethal human disease.


Wildlife play a substantial role in the emergence of infectious diseases in humans and production animals, illustrated by SARS, influenza, West Nile fever, or alveolar echinococcosis. Wildlife disease investigations, in contrast, focus on diseases that play an important role in the conservation and management of wild animal populations, for instance chytrid fungi and the decline of amphibian populations and die-offs of marine mammals caused by Morbillivirus infections. This illustrates an important challenge: early detection of emerging diseases that are a threat to other species is not a primary focus of the existing sector-specific surveillance networks, and requires cross sector collaboration with shared objectives. For food safety, an integrated system for monitoring of specific food safety threats exists in Europe, which involves sampling and pathogen characterization across the food chain, and linking and analysis of these data to study trends, detect diffuse outbreaks, and monitor effects of control measures (see ECDC/EFSA report on zoonoses and foodborne disease, feb. 2014)1. Molecular typing plays a crucial role in this system, but relies among others on the willingness of clinicians to refer patients for laboratory diagnostics and of these laboratories to refer isolates to public health laboratories for typing. The changing clinical practice, with rapid transition from culture-based methods to molecular detection, challenges this decade-old model of disease surveillance (Jones and Gerner-Schmidt 2012). In addition, these surveillance systems are less suited to capture the “new generation” of outbreaks, related with the global food market, as illustrated by recent examples of international diffuse food-borne outbreaks showing the vulnerability of the European population and industry for novel food-borne diseases (Newell et al. 2010; Aboutaleb et al. 2014; Petrignani et al. 2014: Grad et al. 2012). 
The currently used microbiological control criteria are not suitable for monitoring of presence or absence of emerging disease risks, and recent studies have shown vast underestimation of levels of contamination for many human pathogens, but also raise questions about the interpretation of molecular detection data in relation to consumer risk (EFSA 2013; Stals et al, 2011). COMPARE addresses these challenges by developing and implementing novel integrated approaches to pathogen and outbreak detection and mitigation, making use of the breakthroughs offered by the rapidly developing sequencing technologies. Prominent recent examples of this potential include detection and rapid in-depth characterization of the German E. coli outbreak strain in 2011 (Mellmann et al. 2011), the influenza A/H7N9 strains in China (Lam et al. 2013) and detection of Schmallenberg virus as a cause of severe malformations in ruminants in Europe (Garigliany et al. 2012). In surveillance and outbreak investigations, the use of NGS/WGS/WCS – if combined with smart sampling and collection of defined metadata (see text box 1) can greatly increase insight into sources, modes of transmission, virulence and pathogenesis differences between pathogen types (viruses, bacteria, and parasites), with examples from studies of emerging infections, and food-borne disease outbreaks (Haagmans et al. 2014; Mellou et al. 2013; Petrignani et al. 2012; Volz and Frost 2013;Gherasim et al. 2012). An interesting potential application is also the use of NGS/WGS/WCS analysis in studies of pathogen behavior in relation to storage and processing conditions, as well as spoilage. However, for wider use of this novel technology, important hurdles need to be addressed. 
In addition to ICT challenges and bioinformatics, described below, questions need to be addressed about quality of sequence data produced by different platforms and analytical programs, and the influence of sample type, sample storage, and extraction methods on these outputs. Compatibility with current methods used to characterize pathogens, such as serotyping, MLST-typing, single gene identification (epidemiological markers) and PFGE is essential for acceptability. For all of these applications, research is needed to understand the opportunities and pitfalls of the novel technology. This includes understanding how sequence-based methods can be used to support diagnostics, outbreak detection and investigations and advance quantitative risk assessment and source attribution models (Pires and Hald 2010; Muellner et al. 2013; Verhoef et al. 2011). COMPARE seeks to address these outstanding questions through targeted research studies, defined by user groups coming from different sectors, but all joining forces to develop and implement novel integrated approaches to pathogen and outbreak detection and mitigation that can be used across pathogens and sectors. COMPARE addresses this challenge by developing and implementing advanced IT solutions where public and private data can be compared without compromising the need for privacy of either sequence data or contextual metadata. This will build on the vast experience from other large-scale sequencing projects, especially within human genomics and cancer research.

1  http://www.ecdc.europa.eu/en/press/news/_layouts/forms/News_DispForm.aspx?List=8db7286c-fe2d-476c-9133-18ff4cb1b568&ID=956  


Current databases are not capable of handling the rich data produced by these platforms, and novel informatics solutions are needed to be able to analyze the data in a timeframe compatible with uses of the data in support of outbreak investigations. In addition, given the broad-ranging potential application across sectors, a flexible infrastructure is needed that can handle and analyze the data in connection with data from other sources, e.g., external databases from public health authorities or local research databases. Thirdly, the infrastructure developed by COMPARE will offer both research environments, supporting for instance the research groups that are developing specific analytical workflows, and open, broader user environments that provide access to fully developed workflows and datasets.

Scope

The objectives and associated activities proposed in COMPARE will cover the full scope of topic PHC-7-2014 as presented in the Work Programme. How COMPARE will address the scope is summarised below per item included in the scope of the call-topic. For more details on the activities we refer to section 1.3.

“Sequence based data for pathogens should be generated, stored and analysed in combination with clinical, microbiological, epidemiological, additional gene- and transcriptome-based analyses and other data (e.g.: differing responses in women and men) for risk assessment (RA) in an appropriate information system for all sectors (public health, food, animal health)”

Text box 1: Examples of associated metadata to be included in COMPARE:

Sample data, e.g.:
• Source of sample: human, animal, food (matrix specified by species), water, specified using international scientific nomenclature;
• Type of sample: e.g. blood, cerebrospinal fluid, respiratory;
• Date: of sampling and of sequencing;
• Reason for sampling: e.g. random, research, surveillance, outbreak, monitoring;
• Location of sampling: (how specific: country, coordinates (e.g. format Geodata GIS), country of origin of matrix production and of matrix sampling);
• In case of samples from domestic animals: information about husbandry, number of animals on farm, etc.;

Sequence based typing data, e.g.:
• Raw sequence data in appropriate formats (e.g. FASTQ/CRAM/BAM), ideally aligned to an appropriate reference;
• Sequence assembly data (contigs, scaffolds or whole chromosomes, segments etc.) in appropriate formats;
• Pathogen species identification, with type referring to agreed nomenclature;
• Pathogen clonal subtype;
• Length in nucleotides;
• Subtype / epidemiological markers, official PFGE pulsotype number, MLVA/MLST data (to allow comparison of NGS data to the current gold standard for molecular typing);
• Data on the sequencing methods/process applied: how the sequence was obtained (direct or after culture); preparation methods (pathogen concentration, extraction, enrichment, amplification, primers used etc.); information on cultivation and passaging (how many passages in what cell type); adaptors and tags used; and the trimming and filtering of the reads;

Contextual metadata covering “clinical, microbiological, epidemiological, and other data”, e.g.:
• Clinical data:
  o Healthcare setting (e.g. General Practitioner, Outpatient department, Hospital, Intensive Care Unit);
  o Age and gender of patient;
  o Clinical syndrome(s) (e.g. central nervous system, gastrointestinal, respiratory);
  o Comorbidities;
  o Date of illness onset;
  o Outcome of disease (fatal or non-fatal if known);
  o Country of infection;
  o History of travel;
  o History of vaccination;
  o Antimicrobial treatment (at time of sample collection);
  o Other information (free format) such as: contacts with animals (domestic and wild), drinking water (public, private);
• Pathological data:
  o Tissue/organ sampled, using an established anatomical system as used by CCWHC and Snowvet;
  o Pathological diagnosis using an established classification system either for human pathology or veterinary pathology;
• Phenotypic data and additional gene- and transcriptome-based data, e.g.:
  o Antimicrobial susceptibility genes/markers (if relevant);
  o Resistance genes/markers;
  o Virulence/Pathogenicity genes/markers;
  o Genetic epidemiological signatures/markers;
• Provider related data, e.g.:
  o Identifier of provider;
  o ‘Owner’ (researcher responsible) of the sample and/or sequence (with name, institute and email);
  o Info on whether the sequence(s) is (are) free to be used in publications etc. or if the owner should be consulted during the preparation of publications etc.
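
As an illustration only, a small subset of the metadata in text box 1 could be captured in a structured record along the following lines. This is a minimal sketch, not a proposed COMPARE schema: the class, field and function names (`SampleRecord`, `SampleSource`, `problems`) and the specific consistency checks are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import List, Optional


class SampleSource(Enum):
    """Source categories named in text box 1 (hypothetical encoding)."""
    HUMAN = "human"
    ANIMAL = "animal"
    FOOD = "food"
    WATER = "water"


# Reasons for sampling and raw data formats given as examples in text box 1.
SAMPLING_REASONS = {"random", "research", "surveillance", "outbreak", "monitoring"}
RAW_DATA_FORMATS = {"FASTQ", "CRAM", "BAM"}


@dataclass
class SampleRecord:
    """Hypothetical record holding a few of the sample-level metadata items."""
    source: SampleSource
    sample_type: str            # e.g. "blood", "cerebrospinal fluid"
    sampling_date: date
    sequencing_date: date
    reason: str                 # one of SAMPLING_REASONS
    country: str                # country of sampling
    raw_data_format: str = "FASTQ"
    pathogen_species: Optional[str] = None
    owner_email: Optional[str] = None

    def problems(self) -> List[str]:
        """Return a list of simple consistency problems (empty if none)."""
        issues = []
        if self.reason not in SAMPLING_REASONS:
            issues.append(f"unknown reason for sampling: {self.reason}")
        if self.sequencing_date < self.sampling_date:
            issues.append("sequencing date precedes sampling date")
        if self.raw_data_format not in RAW_DATA_FORMATS:
            issues.append(f"unsupported raw data format: {self.raw_data_format}")
        return issues
```

A record would be built with, for example, `SampleRecord(SampleSource.HUMAN, "blood", date(2014, 3, 1), date(2014, 3, 5), "outbreak", "DK")` and checked with `.problems()` before submission; the values shown are invented for illustration.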


The COMPARE Analytical Framework to be developed together with a series of innovative underpinning research studies will cover the generation, storage and analyses of sequence-based data in combination with the associated metadata as listed in text box 1 above.

“Proposals should improve pathogen monitoring by rapid identification, comparison, and geographical mapping, including bio-tracing approaches. Proposals should include predictive models on RA, to identify ‘high-risk’ areas and disease-emergence patterns”

The ultimate result of COMPARE is to build, maintain and moderate a “one serves all” analytical framework that integrates state-of-the-art technologies, tools, strategies and methods for collecting, processing and analysing sequence-based data in combination with other (meta-) data for the purpose of more rapid pathogen identification, characterisation, and outbreak detection, and for improved outbreak investigation. COMPARE will develop novel bioinformatics tools for rapid analysis of outputs from NGS/WGS/WCS and comparison across huge datasets. Novel algorithms will be developed for detection of spatial and temporal signals indicating both outbreaks of known pathogens and the emergence of novel ones. COMPARE will also develop novel phylogenomic analysis predicting evolution and spread, as well as potential virulence of the pathogens. In addition, state-of-the-art risk assessment, epidemiological and source-attribution models will be redesigned to enable inclusion of NGS/WGS/WCS data into the analyses, and applied to determine the most important sources, unravel transmission pathways and identify areas for intervention. These models will also be used to provide suggestions for additional risk-based sampling for determining the size and sources of epidemics and outbreaks. All analytic tools and pipelines will be centrally and locally available, as both a pre-defined standardized option and a customizable version.
“Proposals should ensure links and consistency with existing networks and databases (TESSY, RASFF, EWRS, EFSA/ECDC molecular testing database) and data protection requirements. Access to the system should be granted to relevant animal, food safety and human health service stakeholders”

As regular users of and data-providers for these systems and databases, many of COMPARE’s partners are quite familiar with the current standards as well as the need for data protection in view of possible privacy issues and ethical and legal barriers. Two work packages are specifically dedicated to research and piloting of NGS/WGS/WCS for suitability in comparison with existing surveillance. The organisations coordinating these networks and databases and others (see section 1.3) will be invited to serve on our External Advisory Panels (WP11) to give direct input to COMPARE’s developers to ensure that COMPARE is making full use of existing networks and databases. The COMPARE data and information platform (objective D) will be developed by ensuring broad interoperability with existing data resources, tools, workflows and domain-specific portals (for more details see WP9).

“Harmonised standards for sampling, sequencing, sex-disaggregated representative (meta-) data collection, management and sharing should be developed. Likewise, better management tools for authorities, businesses and citizens and risk communication tools for authorities should be developed”

The need for harmonisation across the process is clearly recognized by the COMPARE consortium. Objective A is geared towards developing risk-based strategies for the collection of samples and associated (meta-) data (for more details see WP1). Objective B is geared towards developing harmonised standards for the processing and sequencing of samples, in pursuit of generating comparable data across domains, time and locations (for more details see WP2).
The analytical workflow developed in COMPARE is designed to function as a management tool for authorities (in human and animal health) and businesses (in food). In addition, objective E is specifically geared towards developing risk communication strategies for these authorities in communicating the public health risks to their stakeholders, notably European citizens (for more details see WP10).

“The cost effectiveness of the tools and methods should be assessed”

As the system to be developed through the COMPARE project will be serving multiple user groups (clinicians, scientists, epidemiologists, risk assessors etc.) and sectors, determining the cost-effectiveness requires a comprehensive approach, considering both direct costs and benefits and indirect (including societal) costs and benefits. Therefore, WP12 is entirely dedicated to objective G, encompassing the development of a framework for estimating the cost-effectiveness of the COMPARE system, including the value of safety, and the use of this framework to estimate the cost-effectiveness of COMPARE and related methods and tools using case studies (for more details see WP12).

“The improved understanding of outbreaks in regions with little or no surveillance systems, mass migration settings or post-disaster settings may require special attention for emerging and re-emerging pathogens”


Partners in COMPARE have established track records in studying foodborne outbreaks, emerging and re-emerging pathogens and clinical and public health responses to emerging disease outbreaks, in collaboration with their colleagues in resource-limited regions in Africa and Asia, and in interaction with several global networks. This part is covered across several WPs, but especially by the strong focus on risk mapping and risk-based sampling strategies in WP1, and the development of generally applicable technologies and analytical pipelines in WPs 1-5. Importantly, the open source model used in the system design (WP9) will allow for access from anywhere in the world to first-class analytic and storage systems, potentially enabling analysis for future hand-held sequencing systems (e.g. Oxford Nanopore) without the need to have accompanying bioinformatics programs or expertise.

1.3 CONCEPT AND APPROACH

Overall concepts underpinning the project

In order to capture and develop the potential of breakthrough sequencing technologies for the field of infectious diseases across sectors, COMPARE aims to develop a data-sharing and analysis system, for deployment in a manner which cuts across pathogens, domains and sectors, and with equity in access and use, enabling cost-effective improvements in human, animal, environmental and plant health. COMPARE will do this by bringing together partners from academia and public health (human, veterinary, food) with expertise in emerging diseases and foodborne diseases, ICT and bio-informatics, as well as expertise in risk assessment, risk communication, health economics, and public health law. This team will work through a specific set of user-driven research projects developing and piloting potential applications of novel sequencing technologies to address key clinical, research and public health questions in the fields of emerging and foodborne diseases.
COMPARE uses these insights, data and analytical workflows to jointly build a core bio-informatics and data-sharing system with harmonized protocols, reference sets and standards accessible for all users in medical, veterinary and food safety domains. In developing the overall platform COMPARE will adhere to the following main principles:
• COMPARE is a sector, domain and pathogen-independent system;
• COMPARE is a user-driven system, designed with the information needs of its intended diverse group of future users and other stakeholders in mind;
• COMPARE will make optimal use of existing and future complementary systems, networks and databases, ensuring compatibility where needed;
• COMPARE is a flexible, scalable and open-source based information-sharing platform.

Designed for users across sectors, domains and pathogens

In developing the COMPARE analytical framework we will take into account the great diversity in perspectives across the large heterogeneous user population, operating in different sectors, domains, disciplines, expertise and levels of operation and having different information needs, interests, opinions and resistances or hesitations when using COMPARE. COMPARE participants have already done much work to identify these needs in the context of GMI and related initiatives, which are at the basis of this proposal. However, the identification of the needs and concerns of users of COMPARE and the translation into desired functionalities is not a one-off exercise. It is an ongoing process, to be continued throughout the development of COMPARE and crucial to promote broad European and global sustainable uptake of COMPARE. For this reason COMPARE will build in an ongoing, transparent and efficient process of user consultations, safeguarding that the perspectives of users from all sectors, domains, pathogens, and roles are adequately incorporated into the design of COMPARE. The user consultation process will involve users covering the different perspectives:

• Data providers and information users: As with any system that is designed for generating and sharing information between many users, the future success of COMPARE depends on the support of both the data providers and the information users. Any user can fulfill both roles, but the emphasis on one or the other role can vary between users. Users from both ends of the spectrum need to recognize each other’s roles and value to COMPARE. Thus, we have gone to great lengths in ensuring adequate representation across all domains, sectors, and pathogens in the COMPARE consortium.

• Actors across the sectors of human health, food safety and animal health, including wildlife: As mentioned in the call-topic, COMPARE should be an information system for all sectors: human health, food safety and animal health, based on the realization that the effective detection and mitigation of (re-) emerging infectious diseases and foodborne outbreaks requires collaboration across all three sectors. In the composition of the COMPARE consortium, all three sectors are represented to ensure that these different perspectives and areas of expertise are adequately represented in the design of COMPARE. This multi-sector coverage will also be followed in the user consultation process (see below).

• Actors across the field of virology, bacteriology and parasitology: COMPARE will also have to span the fields specializing in different groups of infectious organisms, including virology, bacteriology and parasitology. The traditional boundaries between these fields are apparent looking at the use of different technologies, standards, tools and methods in research. This especially applies to the use of advanced molecular typing tools. COMPARE will safeguard that the system adequately takes these differences into account.

• Molecular-, patient and population level researchers: Fourth, future COMPARE users encompass:

- Molecular level researchers whose primary interest in COMPARE is its added value for the identification of pathogens and the genotypic and phenotypic characterization of their ability to infect their host, cause disease and transmit to new hosts. This group primarily consists of users that focus their research on elucidating the (evolutionary) mechanisms underlying the pathogenesis and transmissibility of pathogens in humans and animals (domestic and wildlife) or in food microbiology studying behavior of pathogens and spoilage in food (products) and food production environments. Users in this domain are a primary source of sequence-based data;

- Host level researchers studying infectious diseases at the patient or host (humans and animals) level in pursuit of improving diagnosis and treatment and knowledge of different aspects of the causative agent. Their primary interest in COMPARE is in its added value for more rapid diagnosis and better treatment of infected patients. This group primarily consists of clinical researchers in hospital settings (human and veterinary), GPs, veterinary practitioners and clinical laboratory researchers. Users in this domain are a primary source of clinical and pathological data and constitute the users of actionable information at clinical level generated with the use of COMPARE;

- Population level researchers studying infectious diseases at the level of populations (human population or animal populations). This group primarily consists of public health researchers, food microbiologists, veterinary practitioners, regional and national veterinary institutes, epidemiologists and risk assessment modelers with a primary interest in the added value of COMPARE for providing timely and relevant information in support of outbreak detection and containment. Users in this domain are a primary source of epidemiologically relevant sequence data (reference strains, clusters, epidemiological metadata), and constitute the majority of users of the actionable information generated with the system.

In addition to having all these sectors, domains and fields of expertise represented directly in the COMPARE consortium, we will form Expert Advisory Panels (EAPs, see WP11 and sections 2.2 and 3.2 for more details) and online user panels that will provide for a continuous interaction between the future users and the designers of COMPARE in structured Stakeholder Consultations (covered by WP11). This will also offer a continuous dissemination platform for the COMPARE systems throughout the project period.

Complementarity to and compatibility with other databases and information and communication systems

COMPARE needs to be accepted by the community and anchored in the existing European reality in order to be successful. This means that COMPARE should be built to enable establishment of links with existing data collection, surveillance and signal communication systems such as the ECDC Food- & Waterborne Epidemiology Intelligence Platform (FWD-EPIS), The European Surveillance System managed by ECDC (TESSy), The European Commission Early Warning and Response System (EWRS) and Rapid Alert System for Food and Feeds (RASFF), and WHO networks among others (see below for a short description). To accomplish this will not be easy. It will require a great deal of flexibility from the COMPARE system developers, and will involve extensive consultations and discussions with network owners in the above-mentioned User Consultations of WP11. As COMPARE comprises a number of scientists who in their daily work use, interact with and provide input to these systems, we believe that chances are high that the developed tools will be well received. COMPARE strives to develop tools that are actually implemented into practical use on the European outbreak scene. Examples of the existing systems to which COMPARE needs to develop liaisons are:

TESSY (ECDC)

The European Surveillance System (TESSy) is a flexible metadata-driven system for collection, validation, cleaning, analysis and dissemination of data. Its key aims are data analysis and production of outputs for public health action. All EU Member States and EEA countries report their available data on communicable diseases as described in Decision No 2119/98/EC to the system. ECDC is to explore the potential for inclusion of molecular data into the TESSY reporting.


RASFF (DG SANCO)

Rapid Alert System for Food and Feed (RASFF) (DG Health and Consumer) is an IT tool that facilitates the cross-border flow of confidential information between national authorities responsible for food safety. Through the RASFF network, food safety authorities in Europe are rapidly informed of potential hazards found in food and feed, in pursuit of initiating rapid and coordinated responses to emerging health threats.

EWRS (ECDC and national PH authorities)

The Early Warning and Response System (EWRS) is a web-based system linking the Commission, the public health authorities responsible for measures to control communicable diseases and the ECDC. The aim is to promote cooperation and coordination between the Member States, with a view to improving the prevention and control, in the Community, of communicable diseases. Novel information on emerging infectious diseases is shared confidentially through this channel, but also critically appraised for reliability before release to the public.

EPIS (ECDC)

The Epidemic Intelligence System of ECDC is an online communication tool for the exchange of non-structured and semi-structured information regarding current or emerging public health threats with a potential impact in the EU amongst risk assessment bodies.

Compatibility of the analytical workflows and the data and information sharing system will be specifically investigated, in order to ensure optimal use of EU funding. Currently, the only system that is similar to the system COMPARE envisages is GenomeTrakr, developed by the Food and Drug Administration (FDA) and the National Center for Biotechnology Information (NCBI) in the USA, although GenomeTrakr focuses solely on foodborne bacteria. COMPARE partners have close and structural collaborations with these institutions. In addition to these existing systems, there is a multitude of other existing (inter)national databases and networks that have in common that they are widely accepted and used by the scientific and public health community and authorities for exchange of sequence-based data and other relevant structured and semi-structured information of relevance to human health, animal health and/or food safety. None of these is currently capable of handling the complex data from NGS/WGS/WCS, but ensuring interoperability of these databases with COMPARE is pivotal for the user acceptance of COMPARE. Partners of COMPARE already have well-established and close links with the majority of these existing systems and databanks, and representatives of the database and network hubs will be invited to serve in the External Advisory Panels of COMPARE. This way COMPARE is building in direct communication lines with these systems in pursuit of harvesting the potential for building synergies. Other relevant systems and databases will be identified in the course of the project through the extensive professional networks of the COMPARE partners and external advisory panels.
Networks with big data and sequence focus

It is important to note that collection, storage and especially rapid searches and analysis of the vast amount of sequence information created using NGS technologies require special technologies and expertise, which have so far only been explored in a few places around the world. Today generation of data can take place in a few hours, but, except in a few places, analysis of these data will often take weeks to months. COMPARE will have a special focus on collaboration with centers and projects that have experience with handling big data as listed below.

INSDC

The International Nucleotide Sequence Database Collaboration consists of the three organizations that capture, present and exchange data arriving from all major sequencing centres in the world:
• NIH/NCBI: GenBank® is the NIH genetic sequence database, an annotated collection of all publicly available DNA sequences;
• Partner 7 EMBL-EBI: The European Nucleotide Archive (ENA) is made up of a number of distinct databases that include EMBL-Bank, the newly established Sequence Read Archive (SRA) and the Trace Archive, each with their own data formats and standards;
• Japan DDBJ: DNA DataBank of Japan collects nucleotide sequence data as a member of INSDC and provides freely available nucleotide sequence data and a supercomputer system to support research activities in life science.

ELIXIR

ELIXIR (www.elixir-europe.org) is an inter-governmental organization harboring European data resources and services in a node and hub model (single hub located at COMPARE partner EMBL-EBI). ELIXIR links hundreds of biological databases into a larger bioinformatics infrastructure, connecting them with each other and with tools enabling researchers to interpret the data they contain.

FDA’s GenomeTrakr Network

GenomeTrakr is a network of US State and Federal Public Health Laboratories collecting and sharing genomic data from foodborne pathogens. The data are submitted to and stored at NCBI. Researchers around the US will be able to analyze and compare data in real time, speeding up investigations and contamination control.


NIH/NIAID Influenza Research Database

The Influenza Research Database (IRD, http://www.fludb.org) is a free, open, publicly accessible resource funded by the U.S. National Institute of Allergy and Infectious Diseases. IRD provides a comprehensive, integrated database and analysis resource for influenza sequence, surveillance, and research data, including user-friendly interfaces for data retrieval, visualization and comparative genomics analysis, together with personal log in-protected 'workbench' spaces for saving data sets and analysis results.

GISAID EpiFlu™ database

The GISAID platform provides a publicly accessible database (EpiFlu™) for the sharing of all influenza type virus sequences, related clinical and epidemiological data associated with human isolates, and geographic and species-specific data associated with avian and other animal isolates. Germany is the official host of the GISAID platform. Quality control of the data is performed by FLI, partner in COMPARE.

NORONET and HAVNET

NoroNet (www.rivm.nl/en/Topics/N/NoroNet) and HAVNET are two networks of scientists working in public health institutes and/or in universities sharing virological, epidemiological and molecular data on norovirus and hepatitis A virus, respectively, coordinated by RIVM, partner in COMPARE. It also includes norovirus typing tools facilitating early recognition of globally emerging strains or indications of common sources.

METAHIT (www.metahit.eu)

Metagenomics of the human intestinal tract (Metahit) is a highly successful FP7 funded project that has sequenced entire microbial communities from the human gut and identified >4 million genes. Metahit is the EU participant in the International Human Microbiome Consortium. COMPARE partner DTU is involved in the bioinformatics of Metahit and has the entire sequence database stored.

Other relevant networks

OIE World Animal Health Information Database (WAHID)

A comprehensive range of information is available in OIE's new World Animal Health Information System (WAHID) including (i) immediate notifications and follow-up reports submitted by Country/Territory Members notifying exceptional epidemiological events in their territory, (ii) six-monthly reports stating the health status of OIE-listed diseases in each Country/Territory and (iii) annual reports providing health information and information on the veterinary staff, laboratories and vaccines, etc.

WHO – Global Foodborne Diseases network (GFN)

The GFN (http://www.who.int/gfn/en/) is a capacity-building program that promotes integrated, laboratory based surveillance and intersectoral collaboration among human health, veterinary and food-related disciplines. GFN is part of WHO's endeavors to strengthen the capacities of its Member States in the surveillance and control of major foodborne diseases and to contribute to the global effort of containment of antimicrobial resistance in foodborne pathogens. In May 2012 GFN had 1,062 members from 184 Member States and territories. COMPARE partner DTU is co-founder, steering committee member and responsible for global ring trials of GFN.

CDC PulseNet (International)

More than 120 laboratories from over 80 countries currently participate in PulseNet International, which is dedicated to tracking foodborne infections worldwide and for which SSI is the European coordinating lab. All participating laboratories utilize standardized genotyping methods. Subtyping and epidemiological information are shared in real time between the participating laboratories. PulseNet International provides early warning of international food and waterborne disease outbreaks through the detection of case clusters associated with a particular subtype.

FAO EMPRES-i

The Global Animal Disease Information System (EMPRES-i) is a web-based application that has been designed to support veterinary services by facilitating the organization of and access to regional and global disease information. Timely and reliable disease information enhances early warning and response to transboundary and high impact animal diseases, including emergent zoonoses, and supports prevention, improved management and a progressive approach to control.

EU Animal Disease Notification System (ADNS)

The ADNS is a notification system of the European Union designed to register and document the evolving situation of important infectious animal diseases in Europe. It is a management tool that ensures immediate notification of alert messages as well as detailed information about outbreaks of these animal diseases in the connected countries. This permits immediate access to information about important animal disease outbreaks and enables a prompt response for controlling the epidemiological situation. While ADNS is not directly related to food safety, it has an impact on public health in relation to all zoonotic diseases within its scope.

National notification and syndromic surveillance systems

Next to these international systems and databases, links will be established with the national notification systems and national syndromic surveillance systems of the EU Member States through the COMPARE partners. Contacts at these notification and surveillance systems (for the latter see also http://www.syndromicsurveillance.eu) will be identified and approached by COMPARE in light of the Stakeholder Consultation process.

International (EU funded) networks and research activities

The third category of international research activities that will be linked to COMPARE includes a number of EU funded research networks in the area of emerging epidemics research in the human, animal and food sectors. Partners in COMPARE have coordinating or leading roles in these networks, which can serve as a basis for COMPARE (building on their results and data) and/or as an outlet for COMPARE activities and results (e.g. as first movers in the use of COMPARE for their research activities). A selection of these EU-funded projects is presented here.

EMPERIE (223498) www.emperie.eu

EMPERIE is a large-scale integrated project funded in FP7 Health, coordinated by EMC and involving the COMPARE partners FLI, AMC, UK-Bonn, UCAM and WTSI. EMPERIE has built Europe’s leading pathogen identification and characterization platform using state of the art sequencing technologies. COMPARE will build from this expertise and the available biobanks, as well as access the wider network for stakeholder consultations.

ANTIGONE (278976) www.antigonefp7.eu

ANTIGONE is a large-scale integrated project funded in FP7 Health, coordinated by EMC and involving COMPARE partners AMC, AHVLA, UCLM, UK-Bonn, AUTH and UCAM. Its main purpose is to identify the factors (at the molecular, host or population level) that promote the pandemic potential of zoonotic bacteria and viruses. The pathogen identification efforts in COMPARE can feed into ANTIGONE for subsequent research and the promoting factors identified in ANTIGONE can inform the sampling strategies in COMPARE.

PREPARE (602525): www.prepare-europe.eu

PREPARE is a large-scale integrated project, coordinated by COMPARE partners UA and AMC (deputy), and with EMC and UK-Bonn as partners. The purpose of PREPARE is to transform Europe’s response to severe epidemics or pandemics by providing infrastructure, co-ordination and integration of existing clinical research networks, both in community and hospital settings. The clinical research networks in PREPARE can make use of COMPARE (e.g. the communication toolbox, linked to the dissemination of clinical guidelines developed in PREPARE). Vice versa, the findings in PREPARE can inform the sampling strategies in COMPARE and can be used as the platform for further testing COMPARE’s results.

EVA: European Virus Archive www.european-virus-archive.com

EVA is an EU funded infrastructures project that mobilises a European network of scientific centres with expertise in virology to collect, characterize, standardize and distribute viruses and derived products. UK-Bonn, EMC, FLI, and AHVLA are partners in EVA. This will give COMPARE direct access to a European repository of reference strains and sequences, and the COMPARE system will be a potential hub for storage of the data generated by EVA.

Med-Vet-Net http://www.mvnassociation.org

The Med-Vet-Net association plays a central role in one-health by increasing and disseminating scientific knowledge on zoonoses through collaboration between Public Health and Animal Health organisations across Europe. The Association’s key goal is the provision and dissemination of knowledge and expertise in order to support the scientific community in combating zoonoses. Six COMPARE partners are Med-Vet-Net members.

Microbial research Resources Infrastructure (MIRRI) www.mirri.org

The mirror image of EVA for non-viral pathogens is MIRRI, an EU funded infrastructures project with the goal to improve access to the microbial resources and services that are needed to accelerate research and discovery processes. MIRRI is building a Pan-European distributed research infrastructure that gives access to reference strains, their derivatives and associated data for research, development and application. COMPARE partner DSMZ is a central partner in MIRRI.

RAPIDIA-FIELD www.rapidia.eu

RAPIDIA-FIELD (Rapid Field Diagnostics and Screening in Veterinary Medicine) is an EU FP7 funded project that aims to develop rapid diagnostic systems for livestock, companion animal and wildlife diseases. For molecular test systems, NGS data are generated and used as a basis for assay design and confirmation of results. COMPARE partners FLI and ANSES are central partners and will ensure close links with RAPIDIA-FIELD.

RiskSUR http://www.fp7-risksur.eu/

RiskSUR (Risk Based Animal Health Surveillance) is an EU FP7 funded project, with AHVLA as a partner, that aims to develop and validate conceptual and decision support frameworks and associated tools for designing efficient risk-based animal health surveillance systems. It will support the Community Animal Health Policy and EFSA activities relating to cost-effective animal disease surveillance programmes. It will inform COMPARE on current and future animal disease surveillance strategies and structures, including economic aspects.


EPI-SEQ (219235) www.epi-seq.eu

EPI-SEQ is an EMIDA ERA-NET funded network that aims to exploit NGS technologies to generate improved tools that can be used during epidemics of viral diseases threatening livestock industries in Europe. FLI is a partner in EPI-SEQ. Experience gained and data generated within EPI-SEQ will help COMPARE to optimize NGS data management from the outset.

APHAEA http://www.aphaea.org

APHAEA (“harmonised Approaches in monitoring wildlife Population Health, And Ecology and Abundance”) is an EMIDA ERA-NET project, coordinated by COMPARE partner IREC and including investigators from DTU, FLI, ISVZ and Artemis. Among other things, APHAEA aims to establish harmonized procedures for wildlife population abundance estimation, sampling and diagnosis, and to provide data on key host abundance and key pathogens in their wild hosts. As such, its results will feed primarily into WPs 1 and 2.

TELL-ME (278723) www.tellmeproject.eu

The EU FP7 funded TELL ME project is an integrated research project involving experts in social and behavioral sciences, communication and media, health professionals at various levels and specialties and representatives of civil society organizations to develop an evidence-based behavioral and communication package to respond to major epidemic outbreaks, notably flu pandemics. The TELL ME Integrated Communication Kit for Outbreak Communication and simulation software will be at the basis of the Communication Toolkit of COMPARE, brought by partner 26 RT.

GESTURE

GESTURE was an EU funded project led by COMPARE partners EMC and RIVM exploring barriers to sharing of sequence data in emerging disease outbreaks under the framework of the International Health Regulations. The detailed background investigation of the current legal and ethical framework, law and regulations, as well as experiences of a broad range of stakeholders, will be used as input for the COMPARE legal and ethical barriers and solutions.

EFFORT www.effort-against-amr.eu/

Ecology from Farm to Fork Of microbial drug Resistance and Transmission (EFFORT) is a newly started (2014) EU project that will study the complex epidemiology and ecology of antimicrobial resistance between bacterial communities, commensals and pathogens in animals, the food chain and the environment. Antimicrobial resistance will mainly be measured through meta-genomic analysis coordinated by COMPARE partner DTU.

FLURISK

This recently completed EFSA project developed and validated an influenza risk assessment framework (IRAF) for ranking animal influenza A strains according to their potential to cross the species barrier and cause human infection. AHVLA and EMC were partners in FLURISK.

BASELINE www.baselineeurope.eu

BASELINE is an EU FP7 large integrated project coordinated by COMPARE partner UNIBO that aims to improve sampling plans for microbial risk analysis by 1) definition of fit-for purpose sampling plans; 2) proposal of Food Safety Criteria in different animal food chains (egg, meat and milk); 3) harmonization and validation of new, fast and sensitive molecular methods for detection and quantification of foodborne pathogens (bacteria and viruses).

Flexible, scalable and open source based information sharing platform

The analytical framework and linked data and information sharing platform will have generic functionalities that are accessible to all users, as well as functionalities that are tailored to the specific needs of different user communities. Likewise, the system will distinguish different authorisation levels. In addition to this compartmentalised approach, COMPARE will be designed and built for the long term: it will be able to deal with currently unforeseen amounts of data (petabytes) and with new technologies and associated data (both in type and scale), and it will work from large infrastructural platforms that are already established in Europe (EBI, ELIXIR). Where possible, COMPARE will be open-source based, to ensure long-term sustainability and sufficient independence from private sector ICT developers (thus avoiding the inherent risk of international takeover), and to allow for the desired flexibility, scalability and compatibility with other initiatives. COMPARE will, however, also allow for local download and confined local use to ensure confidentiality for the biotechnological and food industry as well as public health, along the lines of analytic pipelines developed for human genomes and cancer research.

1.4 AMBITION

The ambition of COMPARE is to establish a “One System serves all” platform that will allow for real time comparison, detection, analysis and interpretation of pathogen and disease signals in a truly integrated inter-sectorial, multi-discipline, cross-border “one health” approach. The COMPARE platform should enable linking
research and frontline laboratories to public health, across human and animal health and the food chain in Europe and beyond, thus maximizing the laboratory capacity to detect and respond to disease outbreaks. We will do so by capitalizing on the huge promise of NGS/WGS/WCS technologies through an integrated research and development agenda, and translation of insights and analytical approaches from these studies into an open-source and durable European ICT infrastructure for long term future use. Rapid identification of emerging and foodborne pathogens and subsequent provision of timely insights into the modes of transmission and prevention, pathogenesis, and clinical impact of such diseases is essential to reduce the impact and costs of disease outbreaks. As many emerging diseases are zoonoses, such response activities require both inter-sectorial collaboration and international coordination, promoted as the global One Health concept. The US Institute of Medicine also included food safety into the One health concept.2 This concept is however far from reality: preparedness and response to (food-borne) infectious diseases are still largely handled through sectorial, national and regional laboratories using conventional microbial methodologies with a specific focus on a specific domain (animals, food, and public health) and bacterial, viral or parasitic pathogens despite the promotion of the One-Health concept during the last decade. A potential breakthrough is offered by the ongoing revolution in genome technology, leading to increasing speed and reducing costs of sequencing (next generation sequencing, NGS). As the common denominator to all pathogens and hosts, regardless of species and domain, is the presence of a genome, the ability to rapidly determine the genome sequence provides a common language by which data on pathogens can be compared. 
Such a single technology applicable to different disciplines (e.g., bacteriology, virology, parasitology) and domains (human, food, animal, environment) would facilitate global cross-cutting collaboration and information exchange (integrated surveillance), leading to rapid and coordinated responses to novel and known health threats as they emerge. Conditional to this success is the capacity to generate and analyze the complex genome data in a manner that addresses clinical and public health questions reliably and timely. Early adopters are state-of-the-art laboratories involved in research activities or outbreak investigations. Some of these laboratories are presently running high-resolution sequencing activities, but key to success in applying sequence technology in clinical and public health practice is to capture data, needs and expectations from all involved stakeholders, beginning with clinicians and point-of care laboratories, and public health experts in charge of national surveillance and risk assessment. By linking their needs to those for national, regional, and global public health, a coordinated technology shift is possible, providing the opportunity for a novel and integrated system supporting disease detection and response activities that would be unique in the world (Aarestrup et al. 2012). COMPARE has brought together a carefully selected group of partners from the different disciplines and domains, to work out the essential components of a data-sharing and analysis system that would do exactly this: cut across pathogens, domains and sectors and provide actionable information for clinic and public health. This team will execute user-driven research projects developing and piloting cutting-edge applications of novel sequencing technologies to address key clinical, research and public health questions in the fields of emerging and foodborne diseases. 
COMPARE uses these insights, data and analytical workflows to jointly build a core bio-informatics and data-sharing system with harmonized protocols, reference sets and standards accessible for all users in medical, veterinary and food safety domains. The system will enable integration of data from clinical symptoms to meta-genomic analysis and across all pathogens and reservoirs. Such a system will be unprecedented and will drastically improve local and global preparedness to detect infectious diseases threatening public and animal health. NGS generates a huge amount of data and in the near future it will become necessary to rapidly search and analyze petabytes of data. Therefore, we aim to develop novel technological solutions that cannot be achieved with current locally available data management and informatics systems, using insights from information technology, bioinformatics and astrophysics. In addition, whereas most analyses are currently done for research purposes, we aim to address the need for much more comprehensive analyses and searches to be completed within hours for outbreak detection, also in situations where we do not know a priori which disease signals we are looking for, and for less experienced users. COMPARE will build directly on the experience from other large-scale biological projects, such as human genetics. It will utilize the advantages and flexibilities of cloud-based informatics in combination with the computational infrastructure at EMBL-EBI, which is unique in the EU, to provide a flexible informatics system in which users retain individual control over their own data while allowing rapid searches and comparison against all available data.
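The notion of the genome sequence as a common language for comparison can be illustrated with a small sketch. It shows a generic k-mer based Jaccard similarity and is not a description of the COMPARE analytical workflows. The function names `kmers` and `jaccard`, the value k = 4 and the toy sequences are assumptions made for illustration only.

```python
def kmers(seq, k=4):
    """Return the set of all length-k substrings (k-mers) of a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(seq_a, seq_b, k=4):
    """Jaccard similarity of two sequences' k-mer sets, between 0.0 and 1.0."""
    set_a, set_b = kmers(seq_a, k), kmers(seq_b, k)
    if not set_a and not set_b:
        return 1.0  # two sequences shorter than k: treat as identical
    return len(set_a & set_b) / len(set_a | set_b)


# Toy sequences (invented): the same comparison applies regardless of
# whether a sequence comes from a human, animal or food sample.
identical = jaccard("ACGTACGTGA", "ACGTACGTGA")
unrelated = jaccard("AAAAAAAA", "CCCCCCCC")
```

Because such a comparison depends only on the sequence itself, it applies in the same way to bacterial, viral or parasitic genomes, which is the property the text relies on. Searching petabytes of data would require indexing and approximation techniques well beyond this sketch.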

2 IOM. Improving food safety through a One Health approach, September 2012; http://www.iom.edu/reports/2012/improving-food-safety-through-a-one-health-approach.aspx#sthash.RuPgDKMc.dpuf


2. IMPACT

2.1 EXPECTED IMPACTS

COMPARE’s impact on containment and mitigation of epidemics by competent authorities on the basis of a shared information system and global standards for rapid pathogen identification

Rapid population growth and urbanization, deforestation, invasion of previously pristine habitats for agriculture, and increasing demand for animal protein all likely contribute to increased emergence of novel infectious disease threats, while climate change and increasing global connectedness and mobility facilitate their global spread (Jones et al., 2008). Consequently, the pattern of disease outbreaks has changed, from localized clusters of disease in confined populations to dispersed outbreaks with excellent opportunity for further transmission. Similarly, a transition is observed from localized food-borne epidemics to diffuse international food-borne outbreaks due to globalization of the food market (Verhoef et al. 2011). COMPARE meets this challenge by providing a “One System serves all” solution to (emerging and foodborne) disease detection. Using the COMPARE system, once validated, should greatly increase our ability to detect outbreaks, in part through automated signals, reduce the time to detection of diffuse food-borne outbreaks, and reduce the time needed to deploy a response to emerging infectious diseases. In addition, the COMPARE framework will allow for more precise predictions of the size and scope of outbreaks, suggest more precise sampling and identify cost-effective measures to mitigate outbreaks. This will result in a significant reduction in the mortality and morbidity associated with these diseases, as well as reduced costs for withdrawal of food products when outbreaks are identified and controlled much earlier.

COMPARE’s impact on improved resource efficiency and reduction of economic impact of outbreaks, facilitation of international trade, increasing competitiveness of European food and agricultural sector; reinforcement of food chain sustainability and enhancement of food security, reduced carbon footprint

Emerging disease outbreaks or widespread food-borne outbreaks can cause huge economic losses even when the human health impact is limited. The global cost of the emergence of SARS coronavirus, which affected approximately 8000 persons, was estimated to be 40 billion US dollars, due to travel alerts and reduced international travel and trade, amongst others. This was despite reassurances by public health authorities that such measures were not necessary. Although quantification of the impact of infectious diseases is complex, and estimates are therefore rarely uncontroversial, the costs of a number of large scale outbreaks of infectious animal diseases are well documented. For example, the costs of the outbreak of Classical Swine Fever in the Netherlands (1997-1998) are estimated at 2 billion Euro, and the total costs of the outbreak of Foot-and-Mouth Disease in the United Kingdom (2001) were put by the UK National Audit Office at 8.5 billion GBP (13.6 billion Euro). There are also detailed and recent estimates of the cost of foodborne illness in the US, which indicate the impact of foodborne illnesses in monetary terms: Hoffmann et al. (2012) estimate that the cost of illness for 14 major pathogens in the US alone could range from as low as $4.4 billion to as high as $33.0 billion. Another estimate for the US, covering 30 pathogens, put the cost as high as $77.7 billion (range $28.6 to $144.6 billion) (Scharff 2012). In Europe, only fragmentary estimates regarding the costs of food borne illnesses exist. For example, following the German 2011 E. coli outbreak the EU offered compensation to farmers of up to 210 million € (http://www.dw.de/eu-boosts-e-coli-compensation-offer-for-farmers/a-15141284-1), even though it has been claimed that losses were more than twice that amount. In addition to this come expenses for human illness, which already during the outbreak were estimated to be as high as $3.6 billion (http://www.marlerblog.com/case-news/e-coli-o104h4-death-toll-hits-37-with-3335-ill-and-817-with-hus/), as well as expenses for the outbreak investigation and risk handling. Rapid resolution of outbreaks, and particularly of the sources of infection and modes of spread, is crucial to reducing these indirect effects of outbreaks (Lee and McKibbin 2004). COMPARE will facilitate much earlier identification and control of such outbreaks as well as correct identification of the potential source(s). This will result in considerable savings for the European primary and secondary food industry as well as reduced illness for consumers. It will also help increase the safety of European food products and thus support the global competitiveness of the European food and agricultural sector. In addition, the early detection and control of potentially notifiable disease outbreaks in European farm animals will drastically reduce the costs of control as well as the consequences of restricted trade.

COMPARE’s impact on improving innovation capacity and the integration of new knowledge

NGS technology is considered a potential game-changing technology that could fundamentally change the way disease detection and monitoring will be done in the future. COMPARE, if successful, would put Europe, its scientific community and public health community in a globally leading position to develop, explore and utilize
this technology and the generated results, thereby also setting standards and providing leadership in the global debate on the future of “big data” and its impact on human and animal health. Compiling and providing access to a large infrastructure of organized genomic data in combination with relevant meta-data will also provide an invaluable resource for data-mining for European researchers and facilitate the development of rapid diagnostics, vaccines and novel therapeutics, as also put forward by the ESFRI-roadmap for the ELIXIR-infrastructure. This is expected to trigger major spin-off for many sectors, including food, agricultural, and health.

COMPARE’s impact on the implementation of the ‘Global Research Collaboration for Infectious Disease Preparedness’ (GloPID-R) and its objectives

GloPID-R is a network of research funders set up with the aim to facilitate a global research response to any significant outbreak of a new or re-emerging infectious disease. The main objectives of GloPID-R are to:

• Address scientific, logistical, legal, regulatory, ethical and financial challenges to the rapid mounting of such a response;

• Establish a strategic agenda to address the above-mentioned challenges and identify gaps. It should lead to the building of a sound research capacity to ensure a rapid international research response in case of an outbreak;

• Facilitate exchange of information between funders;

• Connect the existing and future research networks in this area.

The objectives of COMPARE are all supportive of and aligned to the overall aim of GloPID-R. As described above, COMPARE will have a durable impact on initiating coordinated international research responses to severe infectious disease outbreaks. The main results of COMPARE will all be instrumental in enabling such a coordinated research response. The analytical workflows developed and piloted by COMPARE, spanning from risk-based strategies for sample and data collection, via harmonized standards for sample processing and sequencing, to state-of-the-art novel analytical tools for sequence-based pathogen and outbreak detection and analyses, could serve as the basis for launching this desired coordinated international research response. The underlying COMPARE data resource can link all relevant actors across the domains of human health, animal health and food safety to a single system for accessing and analyzing sequence based data and accompanying data of relevance, using state-of-the-art analytical tools accessible through the system. This will have a durable impact on the ambition to connect existing and future networks in the area of infectious diseases research, allowing for exchange, analyses and comparison of data and information across domains, time, disciplines, sectors and databases. Moreover, the studies done in COMPARE on barriers to rapid and open data sharing (see below and WP12) can be linked to the efforts of GloPID-R on addressing the legal, regulatory and ethical challenges to rapid initiation of a research response to emerging outbreaks.
COMPARE is, through its partners, already extremely well connected with other research projects focusing on emerging infections, such as the clinical research consortium PREPARE, the ANTIGONE project addressing barriers to disease emergence, and EPI-SEQ, amongst others.

Barriers and framework conditions influencing the achievement of the expected impacts

In achieving the above-mentioned impacts, COMPARE is dependent on broad and sustained user uptake across sectors, domains, pathogens and geography. The willingness of all users (specifically those that provide samples and associated meta-data and that generate sequence-based information) to rapidly share their data with others is a crucial prerequisite for achieving the impact COMPARE can have on rapid pathogen/outbreak detection and mitigation. There are various barriers or bottlenecks to rapid and open sharing of sequence-based data and contextual metadata that influence the impact of COMPARE:

• Publication priorities: First publication of the analysis of one’s own data is a fundamental scientific incentive. At least in emergency situations the prompt sharing of data is of utmost importance and should not be delayed, but the pre-publication release of data may block the acceptance of scientific articles for publication.

• Protection of Foreground: Time consuming patenting processes, to ensure possible economic benefits through valorisation of knowledge and securing of intellectual property rights, may delay rapid sharing of newly found molecular data in emergency health situations.

• Exploitation of Foreground: Some authorities may wish to withhold samples and/or data from research for fear of losing possible long-term national economic benefits from these biological materials. Lower resource countries have experienced that samples and DNA information from their communities are commercialised by industry for the production of life-saving pharmaceutical products, which they themselves cannot afford to buy for their own population.

• Capacity and capability gaps: Inequalities in laboratory and analytical capacity in different countries may hamper rapid sharing of samples and/or data. In many situations, low (but in some cases also high)
capacity countries are in fear of losing control over their outbreak management and decision making when they dispatch samples and data abroad for diagnostics and analyses they cannot perform themselves.

• Reputation/economic damages: Healthcare providers and food producers are expected to withhold consent for international sharing of their data, for fear of traceability and, as a consequence, irreparable economic damage. In Europe the privacy of medical information of individuals is generally well guaranteed and guarded by law. This is not the case when data and information of business companies and (private) institutions are involved. Still, many existing surveillance systems are fully dependent on their voluntary cooperation; in addition, WGS opens doors to new, life-saving surveillance and tracking systems, but on the condition of access to these business-sensitive data.

• Other barriers to be identified: e.g. patient’s DNA in samples; biosecurity issues; liability issues; metadata protection etc.

Each barrier is a problem on its own, touching on (public) health law and varying fields of specialised legal expertise and politics, with different target groups and partners involved, and consequently different levels of discussion and possible solutions. On a virtual scale we see at one end problems that are purely managerial/content driven and can be analyzed, discussed and solved by the users of COMPARE themselves. At the other end of the scale we see barriers that are of a complex international political nature and need addressing on the agenda of supra-national organizations, such as WHO, the World Trade Organisation (WTO)/WIPO and the World Organisation for Animal Health (OIE), or, at the European level, the European Commission/European Parliament and ECDC/EFSA. In between are barriers that need the involvement of private international associations, e.g. the International Committee of Medical Journal Editors, the international associations of microbiology ECCMID and IUMS, or branch organizations like FoodDrinkEurope and others. The WHO International Health Regulations (IHR) constitute the only legally binding global agreement relating to global public health security. According to the IHR, all WHO Member States have accepted responsibility for sharing relevant laboratory data related to important public health events; such data will in the future clearly include genome sequences. COMPARE also needs to take into account other global discussions, i.e. about the implications of the Convention on Biological Diversity, the Doha Declaration on the TRIPS agreement concerning IP rights and public health, biosecurity issues and their regulation by the Australia Group, the effects on the International Health Regulations, and, returning to Europe, the new EU Declaration on information sharing when confronted with cross-border health threats.
For these reasons, COMPARE will include in-depth studies that focus on defining barriers delaying or hampering the swift and open-source exchange of whole genome sequence data and connected data, and will develop interim solutions while contributing to the global discussions (see WP12 Barriers).

2.2 MEASURES TO MAXIMISE IMPACT

a) Dissemination and exploitation of results

The results of COMPARE can be divided into direct results and indirect results. The direct results of COMPARE are the components that comprise the overall analytical framework, to be used by relevant users in human and animal health and food safety for the purpose of improving pathogen identification and characterization and outbreak detection and mitigation, including foodborne outbreaks. The analytical framework comprises several complementary components, as presented in sections 1.1 and 1.3, that are direct results from COMPARE:

• Risk-Assessment models and the risk-based strategies for sampling and data collection (objective A);

• Harmonised standards for sample processing and sequencing (objective B);

• The analytical tools for cross-sectorial and cross-pathogen sequence-based surveillance and emerging pathogen detection, outbreak investigations and epidemiological analysis (objective C);

• The components of the Data and Information Platform (objective D): COMPARE data resource, generic workflows engine, portal, and the user-configurable virtual test and private working environments;

• The Risk Communication tools (objective E);

• The Cost Effectiveness framework for determining the cost-effectiveness of the above-mentioned components (objective G);

• Data sharing standards and MTAs.

Next to these direct results, COMPARE will lead to a range of indirect results encompassing the results from the underpinning research performed in WPs 6, 7 and 8 using (components of) the overall analytical framework:

• Samples and associated (meta-) data collected;

Page 21: COVER PAGE (DTU) EMBLrandd.defra.gov.uk/...COMPAREpartB1-3.pdf · COMPARE 1 COVER PAGE COMPARE: COllaborative Management Platform for detection and Analyses of (Re-) emerging and

COMPARE 21

• Sequence-based molecular typing data generated, including data on hitherto unknown (subtypes) of pathogens;

• Pathogen isolates resulting from the sampling performed, including novel (sub) types; • Knowledge generated using the collected and analyzed data and the analytical tools integrated in COMPARE

(e.g. identification of a new outbreak); All these results are potentially relevant to the direct stakeholders of COMPARE as well as to the general public (citizens, patients, consumers and farmers). Some of the indirect results can be sensitive in nature and will require appropriate coordinating and consultation mechanisms to ensure that the results are adequately screened for exploitation potential and considered for dissemination beyond COMPARE using the proper communication channels (e.g. in case of detection of notifiable diseases). The Executive Board of COMPARE is the appropriate coordinating mechanism for ensuring that opportunities for exploitation are recognized and adequately acted upon in a timely manner. To this end, the EAPs, established for the purpose of User Consultations can also serve as independent advisory bodies to discuss and advise on the appropriate means of dissemination of results. COMPARE will be committed to establish open and direct dialogue with its stakeholders, the future data providers and information users in particular. In line with its overall purpose, (components of) the COMPARE analytical workflows and the data and information sharing platform should be adopted and applied by the envisaged users across the human and animal health and food safety sectors. These users are (see also WP11 User Consultation and section 3.2 on Management): • International public and veterinary health and Food safety authorities (e.g., ECDC, EFSA, WHO, OIE); • National public and veterinary health and food safety authorities (e.g. public health institutes of Member States

and national reference laboratories); • Food (processing) industry with sequencing capacities (e.g. Unilever, Nestlé); • Research institutes and their individual researchers (e.g., molecular, clinical, epidemiological, bioinformatics)

across the human and animal sectors and covering virology, bacteriology and protozoology; • Clinicians covering primary and hospital care (human and veterinary); Realizing a broad and sustained user base across these users is pivotal for achieving the desired breadth of the system in terms of covering human health, animal health and food safety. With this in mind, the results of COMPARE are not being pursued with the primary goal of commercial exploitation. However, in view of the (self-) sustainability of (the core components of) the Data and Information Sharing Platform beyond the five-year project period, COMPARE will explore alternative exploitation strategies to generate sufficient financial income to maintain the platform. Alternatives for exploitation of (components of) COMPARE to be further explored by COMPARE are: • (Voluntary) membership-fees (e.g., such as those for Web of Science), for users in the public domain and

industry for maintaining the publicly available services of COMPARE (e.g. data resource and basic informatics tools);

• Fees (following various pricing models) for users/organisations willing to hire private virtual working environments within the Data and Information sharing platform, for (developing, testing and applying) advanced proprietary analytical tools, on the backbone of the COMPARE engines;

• On-demand access fees and/or course fees to make use of advanced non-public analytical tools within private defined virtual environments, requiring in-depth expertise of the tool for application acquired by practical training and professional assistance form the developers of the analytical tools;

In exploring these alternatives, any potential detrimental effects on the desired widespread adoption by the envisaged users will be taken into account.

Management of data generated and collected during the project

COMPARE will collect and generate a range of data and information (see text Box 1), depending on the sector, domain, and geography of the users. The COMPARE Data and Information platform will, as it evolves, store (or provide access to) data on samples (e.g. source, type, date, location (ID)), sequence-based data (e.g. raw sequence data, assembly data, pathogen species ID, clonal subtype), contextual data (e.g. clinical data (patient age, gender, diagnosis), pathology data, phenotypic data, and gene- and transcriptomic data (e.g. markers)) and data on the provider of the data. Standards for collecting, curating and preserving the data are the primary subject of WP1 (risk-based sample and data collection), WP2 (harmonized standards for sample processing and sequencing) and WP9 (COMPARE data and information platform). Where appropriate, we will leverage existing standards for data representation, and continue to engage with their developers, including the GMI Minimal Information for Identifications, the BAM/CRAM/SAM/VCF stack of read alignment tools, the MINSEQE minimal reporting standard for sequence-based experiments and the MIxS family of standards. Three classes of availability of COMPARE data will exist: public data, managed access data and private data. Public data will be made permanently and freely available through the EMBL-EBI ENA permanent data resource, which will also take responsibility for global data sharing through the INSDC. Managed access data will be stored under strict security provisions, and access to these data sets will be provided to bona fide researchers able to demonstrate that they meet the requirements for privacy protection and security as assessed by the relevant data access committee (typically drawn from the ethical body overseeing consent-giving procedures). Private data (expected to be a minor share of COMPARE data) will be made available only to defined closed consortia that will retain all access management functions.

Strategy for knowledge management and protection

COMPARE will conclude a Consortium Agreement (CA) detailing the rights and obligations of the beneficiaries with regard to the management of intellectual property, compliance with privacy rules and ethics, and contributions to the project (amongst others). The CA will detail the rights and obligations and associated procedures regarding issues such as access rights to background and ownership, protection, exploitation and dissemination of results, in compliance with the relevant articles in section 3 of the H2020 Grant Agreement. All Background will be identified by the beneficiaries and will be included in the CA. Coordination of the Knowledge Management activities and monitoring of compliance with the CA and the relevant articles in the EC H2020 Grant Agreement (e.g. on Access to Background, Ownership, Protection and Exploitation of Results) will fall under the direct responsibility of the COMPARE Executive Board in WP15 Management, involving the General Assembly as appropriate (e.g. in case of conflict resolution). The general principles underpinning the IP policy of COMPARE are that:
- All principal investigators will review their results obtained in COMPARE for exploitation potential and patentability and communicate without delay to the Executive Board of COMPARE any decision to protect results generated in COMPARE;
- Adequate protection of IP shall take into account the need for rapid and proper dissemination of the results as mentioned above to the appropriate stakeholders;
- All beneficiaries will involve their local Technology Transfer Offices, or equivalent staff with expertise in the legal aspects of IP, in the overall IP management of COMPARE.
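The three availability classes described under data management (public, managed access and private) can be illustrated with a minimal sketch. It is not the design of the COMPARE platform: the class names, the `approved_by_dac` flag and the consortium membership set are hypothetical, and a real data access committee would decide per data set rather than per user.

```python
from dataclasses import dataclass
from enum import Enum


class Availability(Enum):
    PUBLIC = "public"
    MANAGED = "managed access"
    PRIVATE = "private"


@dataclass(frozen=True)
class Dataset:
    name: str
    availability: Availability
    # Hypothetical: members of the closed consortium that manages a private data set.
    consortium: frozenset = frozenset()


@dataclass(frozen=True)
class User:
    name: str
    # Hypothetical simplification: a single flag standing in for approval
    # by the relevant data access committee.
    approved_by_dac: bool = False


def can_access(user: User, dataset: Dataset) -> bool:
    """Return True if the user may read the data set under its availability class."""
    if dataset.availability is Availability.PUBLIC:
        return True  # public data are freely available
    if dataset.availability is Availability.MANAGED:
        return user.approved_by_dac  # bona fide researchers only
    return user.name in dataset.consortium  # private: consortium members only


public_ds = Dataset("public_reads", Availability.PUBLIC)
managed_ds = Dataset("clinical_isolates", Availability.MANAGED)
private_ds = Dataset("industry_batch", Availability.PRIVATE, frozenset({"alice"}))

alice = User("alice")
bob = User("bob", approved_by_dac=True)
```

In this sketch `can_access(bob, managed_ds)` is true while `can_access(bob, private_ds)` is false, which mirrors the rule that closed consortia retain all access management for private data.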

b) Communication activities

Interactions with our user-stakeholders are built into the overall design of COMPARE as depicted in figure 3. The activities of COMPARE involving communication are divided over three complementary work packages:
• WP10 RISK COMMUNICATION TOOLS: designed to offer our (future) user-stakeholders the tools to communicate with their respective stakeholders (e.g. patients, consumers, farmers, general public) about issues related to emerging infectious disease outbreak (response) measures;
• WP11 USER CONSULTATIONS: Key user-stakeholder organizations and individual experts representative of these user-stakeholders are invited to serve on Expert Advisory Panels (EAPs) of COMPARE, which will be consulted throughout the duration of COMPARE through structured User Consultations (WP11) to inform COMPARE of the needs, concerns and questions to be addressed by the methods, tools, strategies and technologies developed in COMPARE. In COMPARE we will install seven EAPs as depicted at the bottom left of figure 3. One EAP is specifically focused on advising WP12 Barriers, which will focus on identifying the political, ethical, administrative, regulatory and legal (PEARL) barriers to open data sharing. The persons and organizations serving on these EAPs are listed in section 3.2 and include ECDC, EFSA, DG SANCO, FAO, WHO, OIE, US-CDC, China-CDC and NCBI;
• WP13 DISSEMINATION & TRAINING: designed to achieve visibility, awareness and acceptance of COMPARE’s activities, results and their added value among our key user-stakeholders and the broader general public.

Figure 3: communication-related activities in COMPARE

All three complementary sets of communications promote visibility and awareness beyond the COMPARE consortium. The specific objectives of these WPs are mentioned in the respective WP tables. In this paragraph, we will focus on the communication activities of WP13 DISSEMINATION AND TRAINING, as these are directly geared to promoting the project and its findings beyond the consortium during the period of the grant. COMPARE’s dissemination activities will comprise:

• Developing and maintaining a user-stakeholder database: In collaboration with WP10 RISK COMMUNICATION we will build and maintain an up-to-date contact database of key organizations and contact persons to be informed regularly by COMPARE about its activities, its results and the added value thereof for these organizations and persons. These stakeholders will encompass the organizations represented in the Expert Advisory Panels and similar organizations at national Member State level. Considering the combined professional networks of the lead investigators in COMPARE, we are confident about the outreach of COMPARE to the relevant scientific community and to public and veterinary health authorities at national and international level;

• Producing promotion materials applying consistent COMPARE identity and visibility: COMPARE will develop and apply a ‘corporate identity’ that will be applied in all internal and external communications of COMPARE, in compliance with article 29.4 of the H2020 grant agreement. This will involve a leaflet, periodic newsletters, poster presentations and presentation slides as appropriate;

• Online and social media presence: COMPARE will develop a public website, linked to the COMPARE Data Resource, and linked to a COMPARE Twitter account where interested parties can find background information on COMPARE. Other social media such as YouTube and LinkedIn will also be explored. These social media will be primarily used to raise public awareness about COMPARE and its added value to public and veterinary health and food safety in Europe and beyond. COMPARE will also strive to link its public websites to other websites/portals of the linked network (e.g. GloPID-R website, Globe network, ANTIGONE website, GMI website);

• Presentations at national and international conferences, seminars and symposia: The lead investigators in COMPARE are frequently invited as speakers at national and international conferences, seminars and symposia. Many of our stakeholders are also present at these professional gatherings. The lead investigators will use these occasions to raise awareness of COMPARE among these stakeholders by presenting COMPARE, its results and its added value to public and veterinary health and food safety;

• Publishing in peer-reviewed scientific journals: COMPARE will actively seek to publish its results in peer-reviewed scientific journals, acknowledging COMPARE and the European Commission (in compliance with article 29.4 of the H2020 Grant Agreement). As can be seen from the partners’ descriptions in section 4, the COMPARE lead investigators frequently publish their results in high-ranking journals with a broad audience.

General publication policy

The COMPARE CA will detail the rights and obligations of the partners in COMPARE concerning publication of results obtained in COMPARE. All principal investigators will strive to publish their results in open-access journals (Gold model). Where publication in open-access journals is not feasible, the publications will be made available via the COMPARE website, respecting any embargo periods of the journal (Green model).

COMPARE training activities

Next to the communication activities, COMPARE will contain a strong training component to promote the impact of the analytical framework and the data and information-sharing platform on the pathogen detection and outbreak analysis and mitigation efforts of the professional communities in human and animal health and food safety. This training component consists of the following activities:

E-learning material

Each WP will produce e-learning material that can be accessed online via the COMPARE portal, including SOPs, guidance documents, manuals, demonstration slides or videos, FAQs as appropriate, and how to contact the COMPARE helpdesk (see below) for questions and hands-on assistance. This training material will be developed and updated throughout the project to reflect the latest status of the developments in COMPARE.

COMPARE helpdesk

COMPARE will establish a virtual helpdesk, consisting of the developers of the analytical workflows, methods, and tools developed in COMPARE. This helpdesk will function as first-line assistance to COMPARE users for accessing and applying the analytical tools and workflows available through COMPARE and for interpreting their results. This direct interaction between users and developers allows the COMPARE user community to be trained ‘on the job’ in applying the tools and methods integrated in COMPARE, and ensures that full and adequate use is made of these tools for the purposes for which they were built.

COMPARE workshops for human and animal health and food safety authorities

Making use of the EAPs established in WP11, we will develop a modular COMPARE workshop in which principal investigators will present the analytical tools and software, tailored to the specific needs of the organizations represented in the EAPs. The detailed programme of the workshops will be designed in consultation with the respective EAPs. We will plan and budget a total of ten workshops, with the objective of organizing at least one workshop for each category of user-stakeholders as organized into the EAPs (e.g. at least one for public health researchers, at least one for clinicians, at least one for food companies, etc.). The workshops will be advertised on the COMPARE portal and those of other related networks (e.g. ANTIGONE, PREPARE, GMI, TELLME).

Expert (practical) courses and researcher exchanges

The final training component designed to promote the impact of COMPARE consists of practical courses and researcher exchanges. Where first-line helpdesk support does not suffice and users would like to receive more in-depth training in the analytical workflows developed in COMPARE, ad hoc training can be given on a case-by-case basis by the lead investigators responsible for the development of the tools. This can involve practical on-site training of researchers from outside the COMPARE consortium or researcher exchanges between COMPARE partners. Partners in COMPARE can also leverage existing renowned courses that they have run for many years, making optimal use of the existing infrastructures. By applying the train-the-trainer concept, the expertise and capabilities needed to apply the COMPARE analytical workflows can grow exponentially across Europe and beyond. The number and content of these practical courses and researcher exchanges will be based on the demand and needs of the users.

3. IMPLEMENTATION

3.1 WORK PLAN — WORK PACKAGES, DELIVERABLES AND MILESTONES

Figure 4: Work packages in COMPARE and their interrelationships

The activities in COMPARE are organized into 15 Work Packages (WPs). The division into WPs is based on the overall approach as introduced in sections 1.1 and 1.3 and as depicted in figure 4.

WP1 covers the development of the risk-assessment models and risk-based sample and data collection strategies as covered by objective A. WP2 covers the development of harmonized standards for sample processing and sequencing as covered by objective B. The development of analytical tools and methods for sequence-based pathogen and outbreak detection and analyses as covered by objective C is addressed in WPs 3-5, where WP 3 addresses objective C1 (application for frontline diagnostics), WP4 addresses objective C2 (application for foodborne outbreaks) and WP5 addresses objective C3 (application for (re-) emerging outbreaks). WPs 6, 7 and 8 encompass the underpinning research studies using the analytical frameworks of WPs 1-5. The development of the data information sharing platform (objective D) and the risk communication toolbox are covered by WPs 9 and 10 respectively. Finally, the supporting activities of objectives F and G are covered by WPs 11-14. This set of 14 WPs is supported by a central WP15 on Management (not depicted above). The timing of the WPs and the main tasks is depicted below.

[Gantt chart: timing of the WPs and main tasks over Years 1-5, by quarter]

WP1 RA and risk-based sample and data collection
1.1 Generic and spatial risk assessment framework
1.2 Generic food chain risk assessment framework
1.3 Tools for epid. transmission modeling and rapid RA
2.1 Risk based sampling for unusual clinical symptoms
2.2 Targeted sampling for early detection
2.3 Risk-based sampling algorithms and protocols
2.4 Food level sampling strategies for surveillance

WP2 Harmonised standards sample processing and sequencing
1. Harmonized standards for sample handling
2. Standardised processes for sample processing
3. Selection of Next Generation Sequencing platforms
4. Basic bioinformatics toolbox
5. Harmonized standards for metadata collection
6. Historic and prospective biobanks as reference
7. Quality management and ring trials

WP3 Analytical workflows frontline diagnostics
1. Workflows for integration of NGS in clin. lab diagnostics
2. Framework for prediction
3. Tools for id of hospital clusters and nosoc. transmission

WP4 Analytical workflows foodborne outbreaks
1. Methods for seq. based surv. food-borne pathogens
2.1 Framework NGS and analysis tools food-borne outbreaks
2.2 Tools for source attribution

WP5 Additional tools for detection and response, (re-)emerging infections
1. Analytical framework for detection of (re)emerging pathogens in meta-genomic datasets
2. Tools for rapid sequence based detection of strain specific clusters
3. Tools phylogenetic and phylogeographic analysis
4. Tools for detecting SNPs in pathogen NGS data
5. Prediction of phenotype changes

WP6 Underpinning research frontline
1. Feasibility of NGS for diagnostics in the clinical diagnostic laboratory
2. Feasibility of NGS for prediction of antimicrobial resistance phenotype to guide treatment
3. To evaluate the use of NGS for syndromic surveillance based on data from hospitalized patients

WP7 Underpinning research foodborne outbreaks
1. To develop and validate a workplan for NGS based food-borne outbreak investigation
2. International study

WP8 Underpinning research (re-)emerging outbreaks
1. Mining metagenomic data for surveillance purposes
2. Early detection of surveillance of emerging zoonoses
3. Early detection of changes in pathogen traits

WP9 Data and Information sharing platform
1. COMPARE data resource component
2. COMPARE workflow engine
3. User spaces for COMPARE workflow development
4. COMPARE portal
5. Long-term sustainability

WP10 Risk Communication
1. Stakeholder Analysis
2. Targeted messages
3. COMPARE risk communication tool box

WP11 User-Stakeholder Consultations
1. Establishment of the Expert Advisory Panels
2. Implementation of EAP meetings (by Executive Board)
3. Establishment and consultation of Online User Panel

WP12 Barriers
1. Establishment of the Expert Advisory Panel Barriers
2. Legal limitations, conditions in open data sharing
3. Constructing an ethical framework and charter
3. Developing standard procedures for data shipment
4. Data sharing guidelines
5. Support for COMPARE and workpackage leaders

WP13 Dissemination & Training
1. Developing and maintaining user-stakeholder database
2. Producing and distributing promotion materials
3. COMPARE public website and Twitter account
4. International conferences, seminars and symposia
1. Production and distribution of e-learning material
2. Establishment of COMPARE helpdesk
3. COMPARE workshops
4. Expert (practical) courses and researcher exch.

WP14 Cost effectiveness
1. Identify the important elements
2. Identify develop costing methodologies
3. Define the baseline
4. Value safety methodology in several countries
5. Estimate the cost-effectiveness of COMPARE
6. Assess options for refining selected elements of COMPARE

WP15 Consortium Management
1. Installment of management bodies
2. Implementation of internal procedures conform DoW and CA
3. Periodic technical and financial reporting to the Commission
4. Knowledge and contract management

Work package number 1 Starting date or Starting Event Month 1

WP title RISK ASSESSMENT AND RISK-BASED STRATEGIES FOR SAMPLE AND DATA COLLECTION

Participant #: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29

Short name of Participant: DTU, EMC, SSI, FLI, ANSES, RKI, EMBL, ISS, RIVM, AHVLA, UEDIN, UK-Bonn, AMC, UA, Artemis, UCAM, TIHO, UCLM, FMER, AUTH, IFREMER, EUR, ANU, WIGNER, CIVIC, RT, UNIBO, DSMZ, WTSI

Person months per participant

21 27 1 14 5 2 57 31 8 8 17 30 2 1 4

Overall objective: A. To develop risk assessment models and risk-based sampling and data collection strategies for NGS-based analyses of food-borne and (re-)emerging infections:
1. To develop methodology for risk assessment including NGS outputs:
1.1 To develop a generic and spatial risk assessment framework to identify which regions and species are at an increased risk of incursion and further spread of novel pathogens;
1.2 To develop a generic food chain risk assessment framework based on NGS data;
1.3 To develop tools for epidemiological transmission modeling and rapid spatial risk assessment.
2. To develop risk-based sampling and data collection strategies for early detection and investigation of unusual patterns of infectious disease outbreaks:
2.1 To develop risk-based sampling algorithms and protocols for unusual clinical symptoms in humans and domestic animals in medical and veterinary practice;
2.2 To develop risk-based sampling algorithms and protocols for early detection of emerging and re-emerging infections coming from wild or feral animals;
2.3 To develop risk-based sampling algorithms and protocols for detection of human pathogen circulation in the absence of recognized illness;
2.4 To develop food-level sampling strategies for surveillance as well as food-borne outbreak investigation.

All objectives and tasks will take into consideration the inputs and feedback gathered in the WP11 User Consultations and from existing networks (see section 1.3, e.g. PREPARE, EMPERIE) to avoid any duplication and to improve the input of available information.

Description of Work

Risk assessment of outbreaks has no established methodology. Based on all the samples taken, one wants to predict not only 1) whether an outbreak might occur, or 2) with what probability an outbreak might occur, but also 3) what the severity and consequences of this outbreak will be. Once a pathogen, disease, or outbreak is detected, population-level analyses of NGS data can provide important insights into the evolution of outbreaks, sources of infection, modes of transmission, changes in virulence and transmission dynamics, and the effects of control measures. Hence, for an optimal outbreak risk assessment, the probability of occurrence is predicted, surveillance is designed based on these predictions, and methods are developed to prepare for outbreak investigations that determine the size of the outbreak, the affected population, the potential for further spread, and the (potential) modes of transmission. This framework also allows prediction and measurement of the effect of interventions at different points, in order to advise risk managers on where and how to intervene, and to prioritize such interventions. In this work package, insights and methods from risk assessment for food-borne illnesses and for emerging infections will be redesigned to include outputs from NGS analyses across COMPARE, generating a novel model framework for risk assessment. In addition, matching sampling strategies to capture the essential specimens and metadata will be designed, and methods for rapid initial assessment of disease outbreaks including NGS data will be developed.
Task 1: To develop a methodology for risk assessment inputting NGS data

Task 1.1: To develop a generic and spatial risk assessment framework to identify which regions and species are at an increased risk of incursion and further spread of novel pathogens (AHVLA, FLI)

The aim of this task is to develop a generic spatial risk assessment framework that will assess the risk to humans and animals from emerging (zoonotic) pathogens using GIS and spatial network methods, inputting COMPARE data and other readily available non-disease data, e.g. trade data, population data (livestock/wildlife/human) and habitat data. By describing the spatially dependent interactions between species (animal to animal and animal to human), such a risk assessment framework is applicable across a wide range of pathogens and hosts, and would hence allow a rapid assessment of the (zoonotic) outbreak potential of multiple pathogens and host species. The framework will enable the identification of regions, member states, livestock production types and age groups that are at an increased risk from any identified outbreak external to or within the EU’s borders. Therefore, the key output of the risk assessment will be the identification of hotspots for incursion, and the potential for the disease to spread if and when it enters via such a hotspot. An advantage of the generic risk assessment is that it relies on readily available system data to infer the potential for disease spread and thus will be applicable to countries which do not have comprehensive research and surveillance activities. The risk assessment will be developed as a user-friendly tool that will be made publicly available.

Task 1.2: To develop a generic food chain risk assessment framework based on NGS data (DTU, UEDIN, RKI, FLI)

From the perspective of food safety risk assessment, NGS data will improve hazard identification by increased specificity and potentially by a fundamental change in the definition of the hazard: this is no longer a “species”, but a specific virulent strain or even a (set of) gene(s). In other words, NGS may be able to provide the knowledge and information needed to quantify the difference between various subtypes/strains with regard to causing human illness and thereby assist with the characterisation of ‘pathotypes’. This requires a novel approach to risk assessment (Brul et al. 2012), where the role of NGS data in the improvement of exposure assessment and hazard characterization (dose-response modeling) is as yet unclear. The aim of this task is therefore to develop a generic risk assessment framework that will assess the risk to humans from foodborne pathogens using stochastic modeling techniques, inputting COMPARE NGS data and other available data including consumption data and trade data. Challenges include the novel definition of the “unit” of the hazard at the genomic level and the current need for quantitative (enumeration) data in risk assessment (Havelaar et al. 2008). For most pathogens, quantitative data are also needed for estimating growth and/or inactivation through the production chain. Novel methods for rapid risk assessment will be developed by studying how and when existing risk assessment approaches can be simplified, and how complications in the model can be avoided without compromising the validity of the assessments. The focus will be on how NGS data can be optimally used, including how quantitative data from genome sequencing can be applied in risk assessments. We will develop different modules, each representing a specific part of the food-production chain and applicable over a wide range of foodborne pathogens and food categories, which will allow users to conduct and compile their own rapid risk assessment depending on their needs and food safety questions.
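As an illustration of the modular, stochastic approach described in Task 1.2, the sketch below chains hypothetical food-chain modules in a Monte Carlo simulation and applies an exponential dose-response model with a strain-specific parameter. All distributions, parameter values and strain names are assumptions chosen for illustration only; they are not COMPARE outputs, and the exponential model is one common dose-response form, not necessarily the one the task will adopt.

```python
import math
import random


def retail_contamination(rng):
    """Hypothetical starting concentration at retail, in log10 CFU per gram."""
    return rng.gauss(1.0, 0.8)


def storage_growth(log_conc, rng):
    """Hypothetical module: growth during storage (0 to 2 log10 increase)."""
    return log_conc + rng.uniform(0.0, 2.0)


def cooking_inactivation(log_conc, rng):
    """Hypothetical module: inactivation by cooking (3 to 6 log10 reduction)."""
    return log_conc - rng.uniform(3.0, 6.0)


def illness_probability(dose_cfu, r):
    """Exponential dose-response model: P(ill) = 1 - exp(-r * dose)."""
    return -math.expm1(-r * dose_cfu)


def mean_risk_per_serving(modules, r, serving_g=100.0, n=10_000, seed=1):
    """Monte Carlo estimate of the mean probability of illness per serving."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        log_conc = retail_contamination(rng)
        for module in modules:  # each module represents one step of the chain
            log_conc = module(log_conc, rng)
        dose = serving_g * 10.0 ** log_conc  # ingested dose in CFU
        total += illness_probability(dose, r)
    return total / n


# Hypothetical strain-specific dose-response parameters, standing in for
# pathotype information that NGS data might eventually provide.
STRAIN_R = {"strain_A": 1e-3, "strain_B": 1e-5}
CHAIN = [storage_growth, cooking_inactivation]
risks = {strain: mean_risk_per_serving(CHAIN, r) for strain, r in STRAIN_R.items()}
```

Because each step is a separate function, a user could swap or drop modules (for example, omit `cooking_inactivation` for a ready-to-eat product) to compile a rapid assessment for a different food category.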
Task 1.3: To develop tools for epidemiological transmission modeling and rapid risk assessment to support outbreak investigations (UEDIN, FLI)

A variety of epidemiological modeling tools are available to help identify infectious disease outbreaks (e.g. Bogich et al. 2013), estimate transmission parameters (e.g. Fraser et al. 2009) and make short- and long-term predictions of rates of spread (Matthews and Woolhouse 2005). Model-fitting in real time has proven potential to inform risk assessment and policy responses to outbreaks, provided there is rapid access to epidemiological and sequence data (e.g. Volkova et al. 2011). Data inputs include the spatiotemporal distribution of cases (and non-cases), contact tracing and individual risk factors. Models are available for various infections, for instance foot and mouth disease, avian and human influenza, and MRSA, but formal validation in an epidemic situation is an iterative process. In contrast to infectious disease outbreak models, where disease transmission primarily occurs between animals and humans or from human to human, risk assessment of food-borne outbreaks faces the challenge that there is a vehicle, food, that “disturbs” the estimation of transmission parameters. In order to predict the potential spread of a food-borne outbreak, and thereby the exposed population, models therefore need to include the food vehicle, i.e. the type of food and the food-distribution and food-consumption patterns. Starting with models that have been calibrated to previous events, we will update and re-parameterize the models to prepare for food-borne and emerging disease investigations (and optimized sampling schemes) in order to rapidly determine the size of the outbreak, the affected population, the potential for further spread, and the modes of transmission. Since some outbreaks will have a mixture of transmission modes, e.g. food-borne and human-to-human transmission, the two types of transmission models will be integrated.
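The integration of food-borne and human-to-human transmission mentioned above can be sketched with a deliberately simple model: a point-source food exposure seeds the primary cases, after which a discrete-time SIR model describes secondary spread. This is a minimal sketch under stated assumptions, not one of the calibrated models referred to in the task; the function names and all parameter values are hypothetical.

```python
def simulate_outbreak(population, food_exposed, attack_rate, beta, gamma, days):
    """Discrete-time SIR model seeded by a food-borne point-source exposure.

    population   -- total number of people
    food_exposed -- number of people who consumed the contaminated food
    attack_rate  -- fraction of exposed consumers who become infected
    beta         -- human-to-human transmission rate per day (0 = food-borne only)
    gamma        -- recovery rate per day (1 / infectious period in days)
    Returns a list of (susceptible, infectious, recovered) tuples, one per day.
    """
    primary = food_exposed * attack_rate  # cases caused by the food vehicle
    s, i, r = population - primary, primary, 0.0
    history = [(s, i, r)]
    for _ in range(days):
        # Secondary infections through contact, capped at the remaining susceptibles.
        new_infections = min(s, beta * s * i / population)
        recoveries = gamma * i
        s -= new_infections
        i += new_infections - recoveries
        r += recoveries
        history.append((s, i, r))
    return history


def outbreak_size(history):
    """Cumulative number of people infected by the end of the simulation."""
    start_total = sum(history[0])
    return start_total - history[-1][0]


# Hypothetical scenario: 500 consumers exposed, 20% attack rate, 4-day infectious period.
food_only = simulate_outbreak(10_000, 500, 0.2, beta=0.0, gamma=0.25, days=60)
mixed = simulate_outbreak(10_000, 500, 0.2, beta=0.5, gamma=0.25, days=60)
```

Comparing `outbreak_size(food_only)` with `outbreak_size(mixed)` shows how much of the outbreak would be attributable to secondary spread under these assumed parameters; a realistic model would also represent food distribution and consumption patterns.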
Phylogenetic approaches (implemented in the BEAST software platform) can provide estimates of changes in effective population size and transmission parameters (R0) (see WP5). Combining these approaches can significantly improve both the accuracy and precision of parameter estimates and projections in both time and space, especially in the crucial earliest stages of an outbreak. This is most straightforwardly achieved by providing informative priors based on epidemiological information for phylodynamic analysis (e.g. initial estimates of infectious period and case-fatality rate, which can be updated as new information becomes available). These models will be further developed and refined through COMPARE. Combined, such a novel framework for rapid risk assessment of food-borne and emerging disease outbreaks would also allow for prediction and measurement of the effect of control measures at different points, in order to advise risk managers on where and how to intervene.
Task 2: To develop risk-based sampling and data collection strategies for early detection and investigation of unusual patterns of infectious disease outbreaks
Task 2.1: To develop risk-based sampling for unusual clinical symptoms in humans and domestic animals (AMC, UA, FMER, RKI, FLI, AUTH, UK-Bonn, TIHO)
Early recognition of emergence and spread of food-borne or (re-)emerging infectious diseases is essential for early detection and mitigation of outbreaks, but is extremely challenging. However, many human and veterinary health care settings routinely produce and store clinical and diagnostic microbiological data. While the type and resolution of such routine data will differ per region or setting (figure 5), these data are potentially highly relevant for monitoring disease and pathogen emergence patterns in real time.


We will explore the utility of these routinely collected data in sentinel (human and veterinary) health care settings located in ‘hotspot’ regions for the emergence of novel or re-emerging pathogens (as identified in task 1.1). For follow-up of such signals, risk-based and epidemiologically proven sampling strategies for different scenarios, hosts (human, domestic animals/livestock) and pathogens will be developed and validated. Where available, this will be done following agreed protocols (ISARIC, WHO, OIE, EVA, ENIVD), including details of collection, devices, media, shipment and storage conditions that will not interfere with the methods and techniques used for identifying the infectious agent or its components. The work builds on expertise in the PREPARE clinical research consortium, where standardized protocols for sampling known diseases are developed, and of which founding members are partners in COMPARE (AMC, UA, EMC, UK-Bonn). The analytical approach will be piloted through WP6.
Task 2.2: Targeted sampling for early detection of emerging and re-emerging infections coming from wild or feral animals (EMC, FLI, RKI, Artemis, TIHO, IREC)
Wildlife is a peculiar component of the One Health triad because its distribution, population size and health status are largely unknown compared to the human and domestic animal components. Disease emergence in wildlife can be detected through an increase in reported mortality or disease events, but often the only sign is a sudden population change (e.g. peaks of voles and hares in tularemia, or severe declines in birds during flavivirus outbreaks), and adequate systems to maximize the information derived from wildlife monitoring are still missing in most countries. Consequently, targeted sampling efforts are usually necessary upon the suspicion of the involvement of a wildlife species in an emerging infection in humans or domestic animals. Such suspicion could either come from unusual clinical manifestations or be guided by risk mapping, and sampling needs to be guided by knowledge about pathogenesis.
Smart sampling algorithms and protocols for the detection and characterization of relevant pathogens in wildlife and vector reservoirs will be designed (this WP) and piloted through WP8. This includes sampling strategies for vectors for transmission of pathogens from reservoir species to humans and production animals.
Task 2.3: To develop risk-based sampling algorithms and protocols for detection of human pathogen circulation in the absence of recognized illness (EMC, SSI, DTU, AMC)
Potential sources of introduction of novel pathogens (and therefore potential surveillance targets) are travelers, imported foods with a high risk of environmental contamination, specific species of wildlife, and trade (legal and illegal) of animals. Each of these is amenable to early warning surveillance if generic methods for metagenome profiling become widely available. Therefore, we will evaluate the potential for improved surveillance targeting hotspots for disease introduction (seized animals from illegal trade, travelers, selected wildlife samples) and amplification (health care settings), combined with the potential for rapid profiling of the extent of spread of an emerging pathogen through sewage sampling. Outputs will be sampling algorithms and protocols that will be piloted in WP8.
Task 2.4: To develop food level sampling strategies for surveillance as well as foodborne outbreak investigation (DTU, IFREMER, UNIBO, ISS)
The epidemiological considerations concerning detection and analysis of emerging diseases also apply, to a large extent, to foodborne diseases. However, the food chain poses some special challenges that need to be addressed to ensure early detection of potential problems as well as optimal identification of the routes of transmission in outbreak situations. In this task we will specifically address sampling strategies along the entire food chain aimed at the most important potential contamination events (e.g. sewage/manure contamination versus food handler contamination).
Deliverables
D1.1 Automated tools for rapid assessment of key transmission parameters and rates of spread estimates (M24)
D1.2 Risk-based surveillance plans, sampling algorithms and protocols for surveillance of emerging and food-borne diseases and pathogens not covered by existing surveillance (M40)
D1.3 Generic risk assessment framework to target surveillance activities and outbreak investigations (M46)

Figure 5: tiers of routine data: at present NGS/WGS data are produced in a (geographically) limited (but expanding) number of specialized health care settings while collection of diagnostic and typing data are more widespread and clinical data are routinely produced in most healthcare settings. If an outbreak is suspected in a region based on clinical or diagnostic data in sentinel sites (indicated by ‘explosions’), NGS/WGS efforts can be directed appropriately (indicated by arrows).


Work package number: 2
Starting date or starting event: Month 1
WP title: HARMONISED STANDARDS FOR SAMPLE PROCESSING AND SEQUENCING
Participants (number and short name): 1 DTU; 2 EMC; 3 SSI; 4 FLI; 5 ANSES; 6 RKI; 7 EMBL; 8 ISS; 9 RIVM; 10 AHVLA; 11 UEDIN; 12 UK-Bonn; 13 AMC; 14 UA; 15 Artemis; 16 UCAM; 17 TIHO; 18 UCLM; 19 FMER; 20 AUTH; 21 IFREMER; 22 EUR; 23 ANU; 24 WIGNER; 25 CIVIC; 26 RT; 27 UNIBO; 28 DSMZ; 29 WTSI
Person months per participant: 30 26 54 15 17 5 15 10 4 8 17 19 7 1 26 24 72

Objectives
B. To develop harmonized analytical workflows for generation of high quality NGS data in combination with relevant meta-data for pathogen detection and typing across sample types, pathogens and domains.
Specific objectives:
1. To optimize and harmonize sample handling for NGS and related methods;
2. To develop standardized protocols for sample processing for different sample types and viruses, bacteria and parasites;
3. To develop standardized sequencing protocols for the different pathogens as well as for different purposes (surveillance, diagnostics, single isolates, meta-genomics);
4. To improve sequence analyses, including novel bioinformatics tools for metagenomics, isolate typing, and de novo reference-less identification of pathogen related sequences;
5. To develop sequence curation and storage protocols and tools;
6. To develop historic and prospective biobanks as reference;
7. To develop a scheme for ring trials and external quality assurance systems.

All objectives and tasks will take into consideration the inputs and feedback as gathered in the WP11 Stakeholder Consultations and from existing networks (see section 1.3) to avoid any duplication and to improve the input of available information.
Description of Work
All partners indicated above will work on tasks 1 through 6.
Task 1: Developing harmonized standards for sample handling
An essential element of quality of information for clinical and public health applications is the sample handling. The composition of microbial communities can change significantly when the targeted samples are drawn, as microbes immediately adapt to changes in their environment. These adaptations also occur when samples are transported over an extended timespan under sub-optimal conditions. All adaptations in turn will change the outcome of the sequencing analysis. Therefore, there is the need to establish optimized standards for sample collection and transport (including ensuring the appropriate cold chain). The effects of sampling, transport, and storage conditions on results of sequencing (raw data quantity and quality, quality of processed data, depth and coverage of obtained genome sequences, compositions of communities) will be compared in order to develop standard procedures compatible with the defined surveillance objectives for the pilot organisms.
Task 2: Developing standardized processes for sample processing
The necessary sample processing includes pathogen inactivation, nucleic acid (DNA/RNA) extraction, and subsequent processing such as cDNA preparation from RNA, amplification and enrichment protocols. Several different methods for preparation of RNA and DNA from various sample matrices (human (clinical); environmental samples; food samples (animal or plant origin); animal samples (domestic and wildlife, insect vectors)) are available. These available procedures will be evaluated and compared with regard to their performance, yield, and quality.
Based on the obtained data, an inventory of methods and protocols will be prepared and first standardization data will be generated and provided to all partners. Finally, an optimal standard protocol for all steps of sample processing for sequencing will be available. This will be used independently of the surveillance objectives (e.g. pathogen detection versus typing of virulence traits or cluster detection). In addition, the development of methods targeting nucleic acid extraction and hence sequencing of specific pathogens in order to improve their detection will be included.
Task 3: Selection of Next Generation Sequencing platforms
As different NGS technologies (e.g., Illumina, Ion Torrent, SOLiD, Pacific Biosciences) with different characteristics (throughput, read length, multiplexing capacities, within- and between-run contamination) are currently available and new ones are soon to come, criteria for selection of the proper platform for the different surveillance objectives will be defined. While for WGS a combination of different platforms might be necessary, metagenomic analyses may require longer read lengths or, conversely, if the aim is to identify specific signals, greater sequencing depth with shorter reads. Therefore, minimum standards will be established for the different tasks. The broad expertise of the consortium partners will be pooled in order to define the main criteria for platform selection. This task will finally result in standardized protocols for NGS and WGS, starting from the prepared DNA (task 2) through library preparation and sequencing, to generate raw data sets of the highest possible quality.
Task 4: Developing a basic bioinformatics toolbox
SOPs for NGS data analyses are crucial for reliable results and conclusions. Therefore, this task will deal with the analysis of the raw sequence data sets from the different platforms. The use of existing standard sequence formats (FASTQ, SFF and mapped BAM files) will enable the processing of sequence data from all available platforms. A bioinformatics toolbox for optimized quality assurance, trimming (platform-specific adapter and vector sequences), identification and elimination of contaminations resulting from sample processing, first level analysis, and annotation will be developed based on available software. Many such components are also available as research code from COMPARE partners: for example, the WTSI and partners have developed the programs QUASR (Watson et al. 2013), xcompare (Oude Munnink et al. 2013) and SLIM (Cotten et al. 2014), and DTU and partners have developed several online tools for detection of antibiotic resistance genes (Zankari et al. 2012), virulence genes (Joensen et al.
2014), phylogenetic analysis (Leekitcharoenphon et al. 2012), MLST determination (Larsen et al. 2012) and meta-genomic analysis (Hasman et al. 2014) based on whole genome and whole community sequencing. These will run through a software development cycle to produce robust, transferable and platform-independent programs to act as plug-and-play resources for COMPARE. Software stacks will be made available through WP9. This will ensure reproducibility and sustainability of the system and avoid duplication. Evaluation of the toolbox with available real sequencing data and artificial standard data sets will prove functionality prior to the piloting phase. There are now many datasets of pathogen genome sequences produced by NGS with COMPARE, and the WTSI has sequenced over 5000 virus genomes, accumulating 7 Tb of data. A structured subset of these data from different sequencing platforms will be made available to assess data processing pipelines.
Task 5: Developing harmonized standards for metadata collection
A standard for epidemiological and other metadata collection will be defined based on current surveillance (EFSA 2010), but also building on expertise from previous projects (Dugan et al., Standardized Metadata for Human Pathogen/Vector Genomic Sequences, 2014 submitted; Field et al. 2008, 2011). A minimal threshold will be set for rapid sequence submissions, including source of sequence (host, sample type, submitter), technical information (platform used, protocol), time and geographic region. Annotation to a taxonomic level relevant for the surveillance objectives will be added through typing tools relevant for the given pathogen. Compatibility of NGS based pathogen characterization with existing typing tools will be investigated (e.g., MLST, MLVA, partial genome sequencing), and when needed, new standards will be developed. Data storage (from raw data to analyzed and curated data) will be tested and optimized in combination with WP9.
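The minimal metadata threshold described in Task 5 could be enforced at submission time with a simple completeness check. The sketch below is only an illustration: the field names are hypothetical stand-ins for the categories named above (source of sequence, technical information, time and geographic region), and the actual standard is still to be defined in this task.

```python
# Hypothetical field names, one per metadata category named in Task 5.
REQUIRED_FIELDS = (
    "host", "sample_type", "submitter",   # source of sequence
    "platform", "protocol",               # technical information
    "collection_date", "region",          # time and geographic region
)


def missing_fields(record):
    """Return the required metadata fields that are absent or empty."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


def accept_submission(record):
    """A rapid submission is accepted only if the minimal set is complete."""
    return not missing_fields(record)
```

Returning the list of missing fields, rather than a bare yes/no, lets a submission interface tell the submitter exactly what to add before the sequence can be accepted.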
Task 6: Historic and prospective biobanks as reference
For full validation of the potential uses of NGS for the different purposes addressed in this proposal, well-defined biobanks are needed. These include reference strains and clinical samples for the main pathogens, syndromes, and reservoirs that are studied in WPs 1-8. Importantly, the WTSI in collaboration with Public Health England will sequence 3500 genomes from the National Collection of Type Cultures (NCTC) (bacteria) and the National Collection of Pathogenic Viruses (NCPV) over the next three years using NGS. These data will be publicly available and will be integrated into COMPARE. In addition, during outbreaks of newly emerging infectious diseases in certain areas, the question is always raised whether the infection is really recently introduced, or has existed before within the human or animal population involved. The availability of historical serum banks of different age cohorts of animals and humans may prove to be of utmost importance. Similarly, historical tissue banks can serve for the identification of infectious agents by whole agent, antigen or nucleic acid identification. For this purpose, such historical and prospectively collected serum and tissue banks will be identified, safeguarded and prospectively collected. For this, we seek collaboration with the EVA network, in which members of COMPARE participate (UK-Bonn, EMC, FLI).


Task 7: Quality management and ring trials (FLI, WTSI)
Quality management will be ensured and regularly tested using standardized sample matrices as well as reference strains and reference datasets from humans, animals, and food. The newly developed protocols and workflows from previous tasks and related work packages will be evaluated and validated by a series of (1) sample preparation, (2) sequencing and (3) data analysis ring trials. It is the intention that such established ring trials should form the basis for future formal external quality assurance systems that may be part of future ISO accreditation or FDA approval of different frontline diagnostic laboratories.
Deliverables (brief description and months of delivery)
D2.1 Matrix-dependent sample handling protocols for human, animal, wildlife/vectors and food samples (M12)
D2.2 Standard protocol for sample processing (M24)
D2.3 Sequencing workflow including relevant NGS platforms (M36)
D2.4 Evaluated and documented data analysis pipeline (M24 and onwards)
D2.5 Testing results of molecular analytical workflow in ring trials with reference materials (M36, M48, M60)

Work package number: 3
Starting date or starting event: Month 1
WP title: FROM COMPARABLE DATA TO ACTIONABLE INFORMATION: ANALYTICAL WORKFLOWS FOR FRONTLINE DIAGNOSTICS
Participants (number and short name): 1 DTU; 2 EMC; 3 SSI; 4 FLI; 5 ANSES; 6 RKI; 7 EMBL; 8 ISS; 9 RIVM; 10 AHVLA; 11 UEDIN; 12 UK-Bonn; 13 AMC; 14 UA; 15 Artemis; 16 UCAM; 17 TIHO; 18 UCLM; 19 FMER; 20 AUTH; 21 IFREMER; 22 EUR; 23 ANU; 24 WIGNER; 25 CIVIC; 26 RT; 27 UNIBO; 28 DSMZ; 29 WTSI
Person months per participant: 6 18 15 14 2 20 17 6 9 1 9 12 36

Overall objective:
C1. To develop an analytical workflow for the use of single isolate and metagenomic NGS in human and veterinary clinical microbiology.
Specific objectives:
1. To develop general workflows for integration of NGS in clinical laboratory diagnostics;
2. To develop a framework for prediction of phenotypic antimicrobial susceptibility based on the presence or absence of genes and mutations in sequence data;
3. To develop tools for identification of hospital clusters and nosocomial transmission.

Description of Work
Rapid diagnostic identification and characterization of infectious pathogens are essential to guide therapy, to predict disease outcomes, and to detect transmission events or treatment failures. Depending on the pathogen, this procedure may take 1 to 7 days for culture, an additional 1 to 2 days for species identification and susceptibility testing, and weeks for molecular typing. As WGS of primary isolates combines identification, molecular typing and (potentially) prediction of antimicrobial susceptibility and virulence, it can theoretically reduce the time to a complete set of relevant results to 1 to 2 days for culture and around 12 h or less for sequencing and analysis. Further substantial reductions in time to result can potentially be achieved by metagenomic or targeted WGS directly on clinical samples, which is particularly suitable for pathogens that cannot be cultured. NGS technologies are already under implementation in several frontline laboratories, mainly hospitals, worldwide. The expected global increase in the use of NGS/WGS in routine diagnostic settings will pave the way for increasing deposition of whole-genome and metagenomic sequences in the COMPARE platform from veterinary and human clinical settings in real time and with increasing geographical coverage, providing increasing opportunities for early recognition and mitigation of outbreaks, and timely insights into evolution, pathogenicity and transmission. For information to flow from frontline diagnostics to national and global detection and surveillance, it is, however, essential to provide added value for these laboratories, and this will be explored in this WP.
Task 1: To develop general workflows for integration of NGS in clinical laboratory diagnostics (AMC, EMC, FLI, TIHO, FMER, WTSI)
Subtask 1.1. Developing a sampling, sample treatment, sequencing and analytical framework for routine sequence-based identification and characterization of clinical isolates of priority pathogens in the main clinical specimen types in a timeframe that permits clinical decision-making.
Building on current practices in clinical microbiology and in close collaboration with WP2, a framework will be constructed which will include all the essential elements required for performing diagnostic NGS, with the aim to rapidly detect and characterize relevant pathogens in the appropriate clinical samples, and to assess sensitivity, specificity and costs of this approach in comparison with current routine methods for pathogen detection. The focus will initially be primarily on human clinical samples, but where relevant, specific issues related to veterinary clinical microbiology will be included.
Subtask 1.2. Developing a sampling, sample treatment, sequencing and analytical framework for analysis of pathogen sequences directly in clinical specimens using a meta-genomics approach in a timeframe that permits clinical decision-making.
NGS is applicable not only to single isolates but also to complete sequencing of entire bacterial and viral communities. Meta-genomic sequencing combined with rapid and validated bioinformatics could potentially provide a "one test detects all pathogens" approach. Prior work from COMPARE partners has shown that it is possible to use metagenomic approaches to identify and characterize known and unknown viral pathogens directly from clinical specimens (including CSF and respiratory specimens). However, this is limited to exceptional samples, and with turnaround times that are not yet compatible with clinical application. Similarly, research by COMPARE partners has shown that identification of bacterial pathogens, including their antimicrobial resistance pattern and clonal relationship, directly in urine specimens from patients with urinary tract infections may be possible, but little is known of the validity of this approach for direct diagnostic applications. Building on the above efforts, the analytical workflow for the use of metagenomic approaches for clinical specimens will be further developed and validated.
Task 2.
To develop a framework for prediction of phenotypic antimicrobial and antiviral susceptibility based on the presence or absence of genes and mutations in sequence data (AMC, DTU, ANSES, RKI, UA, FMER, DSMZ)
Resistance against antimicrobial drugs in bacterial and viral pathogens is the result of a multitude of mechanisms, often acting in concert and encoded by a range of different genetic factors, located on the genome (DNA, RNA) and/or on mobile elements such as plasmids, transposons and integrons. These factors can be detected and identified using NGS/WGS. Pilot data show that the antibiotic resistance profile of bacterial isolates can be accurately and rapidly predicted from the genome sequence (Zankari et al. 2013; Stoesser et al. 2013). A framework which allows prediction of phenotypic susceptibility against different antimicrobial drug classes and individual drugs will need to include algorithms that incorporate all relevant genes and mutations within these genes, as well as factors that may influence expression of these genes. An analytical workflow for prediction of phenotypic susceptibility will be developed for priority organisms, such as the Enterobacteriaceae, Staphylococcus aureus, Enterococci, Mycobacterium tuberculosis, and influenza. Prediction algorithms will be built upon existing knowledge of genetics of antimicrobial/antiviral resistance, including pathways identified as important for antimicrobial resistance. The workflows will be developed in collaboration with WP2 and WP4 (cluster detection), and piloted as described in WP6.
Task 3: To develop tools for identification of hospital clusters and nosocomial transmission (EMC, AHVLA, AMC, UA, FMER, WTSI, RKI)
Retrospective use of benchtop sequencing for selected isolates of methicillin-resistant Staphylococcus aureus (MRSA), M. tuberculosis, Clostridium difficile, norovirus and influenza has indicated the great potential of the technology for understanding and potentially limiting intra-hospital transmission of these important pathogens. We will establish harmonized standards for the identification, characterization and typing of the most relevant health-care associated pathogens based on NGS/WGS, in collaboration with WP4, where this is done for the most common food-borne pathogens. For bacterial pathogens, this will be done with a focus on their antimicrobial-resistant variants: methicillin-resistant Staphylococcus aureus, vancomycin-resistant enterococci, extended-spectrum beta-lactamase- and carbapenemase-producing Enterobacteriaceae (E. coli, Klebsiella), Clostridium difficile, and multidrug-resistant Acinetobacter. For viruses, algorithms developed in WPs 1 and 4 (influenza and norovirus, respectively) will be used.
Deliverables (brief description and months of delivery)
D3.1 Analytical workflow for clinical diagnostic application (M18)
D3.2 Prediction algorithm for antimicrobial resistance markers in sequence data (M24)
D3.3 Standardized protocols for detection of clusters of healthcare associated infections (M42)
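For Task 2 of this work package, the simplest form of a presence/absence prediction can be sketched as a lookup of detected genes and point mutations against a marker catalogue. The catalogue below holds three well-known markers purely as illustrative entries; it is not a curated database, the function and variable names are hypothetical, and the sketch ignores expression effects and interactions between mechanisms, which the planned framework will need to address.

```python
# Illustrative catalogue only: drug -> acquired genes and point mutations
# (gene, amino-acid change) taken as evidence of resistance.
CATALOGUE = {
    "oxacillin": {"genes": {"mecA"}, "mutations": set()},
    "vancomycin": {"genes": {"vanA"}, "mutations": set()},
    "ciprofloxacin": {"genes": set(), "mutations": {("gyrA", "S83L")}},
}


def predict_resistance(detected_genes, detected_mutations, catalogue=CATALOGUE):
    """Predict a phenotype per drug from detected genes and mutations.

    A drug is called 'resistant' if at least one catalogued marker is
    present; otherwise 'susceptible' (a simplification: absence of a
    known marker does not prove susceptibility).
    """
    genes = set(detected_genes)
    mutations = set(detected_mutations)
    result = {}
    for drug, markers in catalogue.items():
        evidence = sorted(markers["genes"] & genes)
        evidence += sorted(f"{gene}:{change}"
                           for gene, change in markers["mutations"] & mutations)
        result[drug] = {
            "predicted": "resistant" if evidence else "susceptible",
            "evidence": evidence,
        }
    return result
```

Reporting the supporting markers alongside each call keeps the prediction auditable, which matters when genotype-based results are compared against phenotypic susceptibility testing during validation.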


Work package number: 4
Starting date or starting event: Month 1
WP title: FROM COMPARABLE DATA TO ACTIONABLE INFORMATION: ANALYTICAL WORKFLOWS FOR FOODBORNE PATHOGEN SURVEILLANCE, OUTBREAK DETECTION, AND EPIDEMIOLOGICAL ANALYSIS
Participants (number and short name): 1 DTU; 2 EMC; 3 SSI; 4 FLI; 5 ANSES; 6 RKI; 7 EMBL; 8 ISS; 9 RIVM; 10 AHVLA; 11 UEDIN; 12 UK-Bonn; 13 AMC; 14 UA; 15 Artemis; 16 UCAM; 17 TIHO; 18 UCLM; 19 FMER; 20 AUTH; 21 IFREMER; 22 EUR; 23 ANU; 24 WIGNER; 25 CIVIC; 26 RT; 27 UNIBO; 28 DSMZ; 29 WTSI
Person months per participant: 18 7 38 12 19 3 18 12 5 1

Objectives
C2. To develop a general analytical workflow for population-based disease surveillance, outbreak detection and epidemiological modeling of food-borne infections.
Specific objectives:
1. To develop cross-sector and cross-pathogen methods for sequence-based surveillance for food-borne pathogens:
1.1. To develop an analytical framework for routine sequence-based surveillance of current priority pathogens;
2. To develop cross-sector and cross-pathogen methods and tools to support outbreak detection, outbreak investigations and epidemiological analysis:
2.1. To develop a framework for the application of the developed NGS and analysis tools in the epidemiological handling of and response to food-borne outbreaks in Europe;
2.2. To develop tools for source attribution based on NGS based routine surveillance and outbreak data for food-borne pathogens (DTU).
The general framework will be developed through a number of pilot studies in WP7 and then be tested together with the system designed in WP9. Methods for detection of emerging threats, like new virulence traits, antimicrobial resistance traits or new combinations of known traits, will be developed in the context of WP5.
Description of Work
One of the areas where the NGS technologies may have major impact is the routine surveillance for priority public health concerns, as agreed across the EU/EEA member states. The agreed list of 49 communicable diseases is monitored by the European Centre for Disease Prevention and Control (ECDC) and the European Food Safety Authority (EFSA) and forms the basis of EU disease surveillance. This includes (molecular) typing for many topics, leading to the decision to start including molecular data in the ECDC surveillance strategy. ECDC’s multi-annual strategic plan (2014-2020) acknowledges the rapid development of NGS and the impact of rapid molecular tests on the traditional culture-based surveillance approaches, already affecting routine surveillance of Salmonella, influenza and enteroviruses (http://www.ecdc.europa.eu/en/publications/Publications/Strategic-multiannual-programme-2014-2020.pdf). The work described here focuses on these developments, and therefore prepares for piloting of NGS based analyses for surveillance, including epidemiological analyses and risk assessments.
Task 1: To develop cross-sector and cross-pathogen methods for sequence based surveillance for food-borne pathogens (RIVM, RKI, UNIBO, DTU, SSI, ANSES, ISS)
For the food-borne pathogens currently under surveillance, reference strains, samples, and sequences will be selected from biobanks available throughout the network. In WP1, additional novel sampling strategies will be explored (e.g. hot-spot and sewage-based surveillance).
Where available, NGS datasets from large scale sequencing projects will be used, such as the 100K project funded by the US FDA, the NCBI short read archive, EMBL-EBI, and data from previous projects in which COMPARE partners participated (see section 1.3). Analytical workflows will be developed for rapid annotation of all current priority food-borne pathogens, and for translation of these data to currently used methods for typing. The most common DNA-based typing methods either generate banding patterns (e.g. PFGE, MLVA) or DNA sequences (e.g. MLST). Often these methods have been developed specifically to characterize very closely related isolates (e.g. in outbreak investigations) or, in contrast, to compare very distantly related isolates (e.g. in evolutionary studies) (Barco et al., 2013). Since NGS methods have a high discriminatory level resulting in a high specificity, but low sensitivity, an important prerequisite to the use of NGS generated data is to redefine how meaningful (groups of) subtypes can be assigned for the different downstream questions (trend monitoring, cluster detection, source attribution). In addition, a measure for quality of surveillance information will be developed using novel analytical approaches to estimate surveillance depth from systematically sampled sequence data, as these may inform the need for additional sampling (Chan and Rabadan 2013). These approaches have been developed to quantify completeness of genomic surveillance and can be used with minimal metadata (time of sequence collection) and knowledge of a pathogen's evolutionary rate.
Task 2: To develop cross-sector and cross-pathogen methods for support of outbreak investigations and epidemiological analysis
Task 2.1. To develop a framework for the application of the developed NGS and analysis tools in the epidemiological handling of and response to food-borne outbreaks in Europe (SSI, RKI, RIVM, EMC, AHVLA, IFREMER)
This task will quantify how NGS can assist in a better and more timely outbreak response. This includes algorithms for outbreak detection and for measuring timeliness of response, as well as support for assessing whether a cluster of cases constitutes an actual outbreak, making an efficient and operational case definition, finding cases, performing efficient descriptive epidemiological analyses to generate hypotheses as to the vehicle and source of the outbreak, and, not least, providing microbiological proof for suspected sources of outbreaks. Automated analytical tools as described in WP1 task 1.3 will be incorporated into a user-friendly workflow that enables temporal investigation of count data for the mandatory minimum dataset of current food-borne pathogens from available surveillance data for pathogens and pathogen traits, against historic backgrounds and external European and international typing databases where available. The tools will be integrated in a practical epidemiological workflow, which will be tested in the context of the pilots in WP6 and prospectively during new outbreaks, and by reviewing and collecting feedback from users.
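The temporal investigation of count data against historic backgrounds can be illustrated with a very basic aberration-detection rule: flag the current count for a subtype if it exceeds the historical mean by more than a chosen number of standard deviations. This is a minimal sketch with a hypothetical function name and an arbitrary default threshold; operational surveillance algorithms typically also account for trend, seasonality and reporting delays.

```python
import statistics


def exceeds_baseline(current_count, historic_counts, z=2.0):
    """Flag a count that exceeds mean + z * SD of the historic background.

    historic_counts: counts for comparable past periods (at least two).
    Returns (flag, threshold).
    """
    mean = statistics.mean(historic_counts)
    sd = statistics.stdev(historic_counts)  # sample standard deviation
    threshold = mean + z * sd
    return current_count > threshold, threshold
```

For example, with historic weekly counts of `[4, 5, 6, 5, 4, 6]` the threshold is about 6.8 cases, so a week with 12 cases of the same subtype would be flagged for follow-up while a week with 6 would not.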
This work will be linked, through the epidemiology partners in COMPARE, to the network of European epidemiologists working with outbreak investigations in the FWD network of ECDC. A particularly important task is to develop guidance about when to respond to a detected cluster.

Task 2.2. To develop tools for source attribution based on NGS based routine surveillance and outbreak data for food-borne pathogens (DTU, RKI, RIVM)

Efforts to quantify the importance of specific sources (including foods) and animal reservoirs for human illness have been gathered under the term ‘source attribution’ (Pires et al. 2009). The principle is to compare the distribution of subtypes in potential sources (e.g. animals and food) with the distribution in humans by applying mathematical modeling. The most commonly used models are the frequency-matching models (Hald et al. 2004) and the population-genetic models (Wilson et al. 2008), which require a collection of temporally and spatially related isolates from various sources and humans, as generated through COMPARE, and validated subtyping algorithms, as developed in task 1. A complicating factor is that most bacteria do not conform strictly to clonal models, but exhibit variable rates of horizontal gene transfer. This distorts the genetic relationships among isolates, yet may be important to consider in source attribution studies. It is a challenge for the currently existing population-genetic methods, which mainly consider genetic relationships. In contrast, the frequency-matching attribution methods do not consider the genetic relationship between isolates, but require a “perfect match” of types for isolates to be considered related. In COMPARE, we will use NGS based surveillance data and outbreak data to define meaningful clusters for source attribution for each pathogen in task 1, and use these data as inputs to redefine and, if possible, integrate existing source attribution models, including the approach based on foodborne outbreak data described by Pires et al. 2012. Novel approaches will be built in a Bayesian framework to combine information on the pathogen (phylogenetic information, plasmid occurrence, virulence, etc.), the source (exposure, prevalence, import-export, etc.) and the human host (consumption, travel history, etc.).
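The principle of comparing subtype distributions in sources and in humans can be sketched with a deliberately simple proportional allocation. This is not the Bayesian model of Hald et al. (2004) nor the population-genetic model of Wilson et al. (2008); it only illustrates the "perfect match" requirement of frequency matching. The function name `attribute_cases`, the source names and all counts are hypothetical.

```python
def attribute_cases(human_cases, source_counts):
    """Allocate human cases to sources in proportion to subtype frequencies.

    human_cases:   {subtype: number of human cases}
    source_counts: {source: {subtype: number of isolates from that source}}
    Returns ({source: expected number of cases}, unattributed cases).
    Cases of a subtype never seen in any source remain unattributed,
    which mirrors the "perfect match" requirement of frequency matching.
    """
    # Relative frequency of each subtype within each source.
    rel = {}
    for source, counts in source_counts.items():
        total = sum(counts.values())
        rel[source] = {st: n / total for st, n in counts.items()} if total else {}

    attributed = {source: 0.0 for source in source_counts}
    unattributed = 0.0
    for subtype, cases in human_cases.items():
        weights = {s: rel[s].get(subtype, 0.0) for s in source_counts}
        denom = sum(weights.values())
        if denom == 0:
            unattributed += cases
            continue
        for source, weight in weights.items():
            attributed[source] += cases * weight / denom
    return attributed, unattributed


# Hypothetical data: three subtypes in humans, two candidate sources.
human_cases = {"A": 60, "B": 30, "C": 10}
source_counts = {
    "poultry": {"A": 8, "B": 2},
    "pork": {"A": 2, "B": 8},
}
attributed, unattributed = attribute_cases(human_cases, source_counts)
print(attributed, unattributed)
```

In this example subtype C is absent from both sources, so its 10 cases stay unattributed. Cluster definitions that tolerate small genetic differences, as proposed in this task, would change which isolates count as a match.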
This analytical framework will be piloted in WP6, and based on the initial results and data we will also attempt to identify novel markers for host association to further improve the distinction between sources of human infection.

Deliverables (brief description and months of delivery)
D4.1 Reference sequence database based on already available datasets (M6)
D4.2 Algorithm for detection of informative (sub)types for epidemiological analysis and RA for the main food-borne pathogens (M32)
D4.3 Analytical workflow for epidemiological handling of and response to food-borne outbreaks in Europe (M48)
D4.4 Validated SA model for NGS data (M48)


Work package number 5 Starting date or Starting Event Month 1

WP title FROM COMPARABLE DATA TO ACTIONABLE INFORMATION: ADDITIONAL TOOLS FOR DETECTION OF AND RESPONSE TO (RE-) EMERGING INFECTIONS

Participant # and short name: 1 DTU; 2 EMC; 3 SSI; 4 FLI; 5 ANSES; 6 RKI; 7 EMBL; 8 ISS; 9 RIVM; 10 AHVLA; 11 UEDIN; 12 UK-Bonn; 13 AMC; 14 UA; 15 Artemis; 16 UCAM; 17 TIHO; 18 UCLM; 19 FMER; 20 AUTH; 21 IFREMER; 22 EUR; 23 ANU; 24 WIGNER; 25 CIVIC; 26 RT; 27 UNIBO; 28 DSMZ; 29 WTSI

Person months per participant

22 18 9 1 6 14 2 6 36 8 5

Overall objective: C3. To develop cross-sector and cross-pathogen methods for support of emerging pathogen identification and characterization in support of outbreak investigations and epidemiological analysis
1 To develop an analytical framework for detection of (re-)emerging pathogens in metagenomic datasets;
2 To develop tools for rapid sequence based detection of strain specific clusters in time, place and host for the main emerging pathogen classes;
3 To develop tools for fast and robust phylogenetic and phylogeographic analysis;
4 To develop tools for detecting single nucleotide polymorphisms in pathogen NGS data;
5 To improve the prediction of phenotype changes related to antigenicity, drug-resistance, virulence, transmission, and other traits from nucleotide sequence data.
The general framework will be developed through a number of underpinning research studies in WP8 and in collaboration with the partners focusing on foodborne disease.

Description of Work
The analytical workflows developed in this work package aim to answer essential basic questions such as:
• Which pathogen is causing the emergence or outbreak and what are its characteristics (e.g. virulence, transmission routes, toxins, antibiotic resistance)?
• Where and what is the original source of the pathogen?
• How do people get infected, and do they pass it on to others?
• What can we do to stop the emergence/outbreak or limit its impact?

While some of these are essentially the same questions as the ones addressed in WP4, the pathogens targeted here are entirely unknown, or are not or only partially covered by routine diagnostics and surveillance programs. Therefore, while WP3 and 4 focus on the potential added value of NGS/WGS based approaches in the context of existing diagnostics and surveillance, with current or improved sampling strategies designed for these surveillance systems (ECDC, EFSA, WHO), WP5 seeks to take a step further by working on novel applications to enhance emerging disease detection and investigation. This will be combined with innovative, risk based sampling as described in WP1.

Task 1. To develop an analytical framework for detection of (re-)emerging pathogens in metagenomic datasets (EMC, FLI, DTU, RKI)

In this task the focus will be on analytical workflows that can be used on outputs from novel risk-based sampling design strategies (WP1) combined with metagenomic analyses, in addition to the pathogen specific and initial bioinformatics described in WP2. The rationale is that the increasing volume of metagenomic data forms a growing body of potential early warning surveillance data. Currently, however, such data are not mined for this type of information, and analytical tools that allow profiling of metagenomic data for a package of “high threat” pathogens do not exist. In addition, in most metagenomic projects only a minority of the sequences can be mapped to known host, bacterial, viral or other already sequenced species. We will develop methods for mining metagenomic data for early warning surveillance signals for emerging infections. These will focus on detection of novel pathogens (e.g. based on matching with conserved sequences from known pathogen classes) and of important genes or variants thereof (antimicrobial resistance, virulence), building on experience within the COMPARE partnership (see section 1.3).

Task 2. To develop tools for rapid sequence based detection of strain specific clusters in time, place and host for the main emerging pathogen classes (UEDIN, UCAM, FLI, ISS)


While emerging infections are by nature unpredictable, a careful review of emerging disease outbreaks has shown that some pathogen classes are more likely to (re-)emerge and spread. For instance, a recent study classified a list of 86 emerging zoonoses relevant for Europe according to 7 criteria, in order to help prioritize surveillance and preparedness planning (http://ezips.rivm.nl/). Algorithms for NGS/WGS/WCS-based epidemiological analysis, including assessment of discriminatory power and robustness, will be implemented for the main emerging pathogen classes (viral, bacterial and protozoan). These sequence-based cluster analysis tools and combined analyses of sequence data with metadata will be based on expertise with current sequence-based typing techniques (minimum spanning trees, strength and directionality of the correspondence or congruence with current reference typing methods and epidemiological cluster detection methods).

Task 3. To develop tools for fast and robust phylogenetic and phylogeographic analysis (UEDIN, UCAM, FLI, RKI, UK-BONN)

The availability of pathogen genome sequences allows much deeper analysis of pathogen diversity through phylogenetic reconstruction. Systematically collected sequence data, when analyzed in combination with time of sampling and geographic location, can be used to investigate important epidemiological, immunological, and evolutionary processes, such as epidemic spread, new introductions into the population, and antigenic drift (Volz et al. 2013). How much information can be derived from phylodynamic analyses depends in part on pathogen traits such as generation times, mutation rates, and pathogen ecology (e.g. periods of stasis due to environmental persistence), and therefore differs significantly across pathogens. Here, we will explore the use of phylodynamic analyses for the different pilot studies in WP6-8, comparing them against previous analyses based on partial genome sequences or other typing data. The analytical options will include phylogeny tools that allow the generation of “evergreen trees”, which use curated reference sequence alignments that are updated automatically when new sequence collections become available. Taxa in the evergreen trees can be labeled based on time, location, host, drug resistance, other phenotypes, as well as other metadata, to facilitate tracing and to visualize pathogen spread and evolution. In anticipation of new threats, we will (in Years 1 and 2) prepare spatiotemporal phylogenies of currently recognized emerging disease threats for Europe, e.g. hanta-, MERS corona-, Chikungunya and FMD viruses, into which sequences from incursions into Europe could rapidly be incorporated and characterized.

Task 4. To develop tools for detecting single nucleotide polymorphisms (SNPs) in pathogen NGS data (EMC, AMC, UCAM, FLI, UNIBO)

An important added value of NGS/WGS methods is the potential to detect SNPs associated with specific pathogen traits, including drug resistance and virulence, not only in consensus pathogen genome sequences, but also as (minor) variants among heterogeneous mixtures of sequences (e.g. quasispecies). Finding such minority variants can advance the detection of relevant changes by days or even weeks. Improving the signal-to-noise ratio through pre-analytical steps (WP2), as well as downstream quality-controlled bioinformatic analyses, is crucial for reliable application of NGS to the identification of informative polymorphisms. While current practice allows minor SNP variants to be quantified reliably down to ~1% (e.g. Linster et al. Cell 2014), for some applications there is a clear need to develop technologies beyond this threshold (Russell et al., Science 2014). This task will focus on optimizing quantitative output on SNPs by developing appropriate laboratory methods and analysis tools. Recently, approaches that reduce NGS errors and allow SNP detection with unprecedented accuracy were described (Acevedo et al. 2014). COMPARE will further develop such tools for use in surveillance and outbreak scenarios. For ease of interpretation, SNP identification tools will be combined with the phylogenetic tree tools of task 1.1, which will allow identification of the SNP changes along the branches of trees and visualization of patterns of SNP accumulation.

Task 5. To improve the prediction of phenotype changes related to antigenicity, virulence, transmission, and other traits (EMC, UCAM, FLI, AHVLA)

The value of NGS data for risk analysis is increased further if phenotypic changes of the pathogens can be inferred from the sequences. The CDC “H5N1 genetic changes inventory” (http://www.cdc.gov/flu/pdf/avianflu/h5n1-inventory.pdf) is an excellent example of how, based on limited experimental data, potential phenotypes can be inferred from influenza virus genome sequences and mutations therein. However, it is generally believed that there are multiple evolutionary pathways that would change important phenotypic traits of pathogens. Laboratory studies have for instance implicated convergent evolution of certain molecular determinants for pathogenicity and host specificity for influenza viruses in nature. Under this task, we plan to investigate the extent to which functionally equivalent substitutions are found in nature, using model pathogens and phenotypes. Although experimental data remain the gold standard when relating sequence changes to phenotypic changes, providing experimental data for each SNP detected in pathogen genome sequences will be unfeasible. We therefore further propose a computational structural approach to understand the underlying principles for functionally equivalent substitutions already identified, and to use these to identify possible other functionally equivalent substitutions. Although the initial focus is on existing experimentally derived structures of the influenza HA molecule, the methods will be applicable to other pathogens for which structure information for key proteins is available. Models of receptor binding of influenza viruses will be predicted using comparative modeling techniques, building on work done during the 2009 influenza A(H1N1) pandemic to determine viral characteristics including receptor binding properties (Garten et al. 2009; Chutinimitkul et al. 2010). These initial studies on HA receptor binding will pave the way for studies on other phenotypes, proteins, and pathogens.

Deliverables (brief description and months of delivery)
D5.1 Update of pathogen repositories for pilot and research studies (M18)
D5.2 Tools for rapid sequence based detection of strain specific clusters in time, place and host for the main emerging pathogen classes (M30)
D5.3 Phylogenetic and phylogeographic tools for epidemiological investigations and rapid risk assessment (M30)
D5.4 Tools for detecting single nucleotide polymorphisms and analyses within and between hosts (M36)
D5.5 Methods for prediction of pathogen phenotype from genotype data and structure models (M42)
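The dependence of minority variant detection (Task 4 of this work package) on the sequencing error rate can be illustrated with a simple binomial test: a variant is called only if the number of reads supporting it is very unlikely to arise from errors alone. This is a sketch under stated assumptions, not the method of the cited studies; the function names, the per-base error rates and the significance level `alpha` are all hypothetical.

```python
from math import exp, lgamma, log


def log_binom_pmf(k, n, p):
    """Log of the binomial probability P(X = k) for X ~ Binomial(n, p), 0 < p < 1."""
    return (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
            + k * log(p) + (n - k) * log(1 - p))


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed directly over the upper tail."""
    return sum(exp(log_binom_pmf(i, n, p)) for i in range(k, n + 1))


def is_minor_variant(alt_reads, depth, error_rate, alpha=1e-3):
    """Call a variant if sequencing errors alone are unlikely to explain alt_reads."""
    return binomial_tail(alt_reads, depth, error_rate) < alpha


# A variant supported by 10 of 1000 reads (1% frequency), hypothetical numbers.
# With a 0.5% error rate the signal cannot be separated from noise ...
print(is_minor_variant(10, 1000, error_rate=0.005))   # False
# ... but with a tenfold lower error rate the same observation is significant.
print(is_minor_variant(10, 1000, error_rate=0.0005))  # True
```

The same read counts lead to opposite conclusions depending on the error rate, which is why noise reduction in the pre-analytical and bioinformatic steps determines how far below the ~1% threshold variants can be called.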

Work package number 6 Starting date or Starting Event Month 18

WP title UNDERPINNING RESEARCH: FRONTLINE DIAGNOSTICS USING THE COMPARE ANALYTICAL FRAMEWORK

Participant # and short name: 1 DTU; 2 EMC; 3 SSI; 4 FLI; 5 ANSES; 6 RKI; 7 EMBL; 8 ISS; 9 RIVM; 10 AHVLA; 11 UEDIN; 12 UK-Bonn; 13 AMC; 14 UA; 15 Artemis; 16 UCAM; 17 TIHO; 18 UCLM; 19 FMER; 20 AUTH; 21 IFREMER; 22 EUR; 23 ANU; 24 WIGNER; 25 CIVIC; 26 RT; 27 UNIBO; 28 DSMZ; 29 WTSI

Person months per participant

9 36 5 20 19 27 14 12

Overall objective: To assess feasibility of NGS/WGS/WCS for clinical diagnostic use and hospital epidemiology
Specific objectives:
1. To validate application of NGS for diagnostics and hospital epidemiology
2. To test feasibility of NGS for prediction of antimicrobial resistance phenotype to guide treatment
3. To evaluate the use of NGS for syndromic surveillance based on data from hospitalized patients

Description of Work
Acceptance of NGS in the clinical laboratory relies heavily on performance criteria relevant to clinicians, such as speed, cost, positive and negative predictive value, and relevance for treatment guidance. The potential benefits of NGS in clinical diagnostics for individual patient care and for provision of important data for basic and public health research, as outlined in WP3, will be tested in a series of pilot studies using the analytical workflows developed in WPs 2 and 3.

Task 1: Feasibility of NGS for diagnostics in the clinical diagnostic laboratory (AMC, EMC, AUTH, RKI)

The optimized analytical workflows for metagenomic analysis developed in WP2 and 3 will be piloted for diagnostic application. For this, we will profile the genome content of the most common sample types collected from patients with neurological symptoms, enteric infections and respiratory infections. These syndromes are chosen because they are common causes of hospitalization, may be caused by a range of infections, and remain without an etiological diagnosis in a substantial proportion of cases despite extensive diagnostic efforts. Embedded within these syndromes are also pathogens that have epidemic potential and are associated with high morbidity and mortality, including pathogens covered by (mandatory) surveillance, for instance vaccine-preventable pathogens (e.g. meningococci, pneumococci, influenza, rotavirus) for which sequence information is essential for public health purposes. Feasibility of the use of NGS based diagnostics for hospital epidemiology will be piloted. Initial exploratory pilot studies will be performed using existing biobanks available at COMPARE partner institutes that have been characterized clinically and microbiologically in detail, in order to fine-tune workflows and platforms developed in WPs 1-3 for clinical diagnostic use. Subsequently, prospective pilot studies using the entire COMPARE analytical framework will be performed in CNS, enteric and respiratory infections. These pilot studies will be performed within existing networks and prospective studies, e.g. those in PREPARE, and will focus on a comparison of NGS based diagnostics with current gold standard methods.

Task 2: Feasibility of NGS for prediction of antimicrobial resistance phenotype to guide treatment (AMC, UA, DSMZ, DTU)

The feasibility of NGS to guide treatment will be tested using collections of well-characterized bacterial strains with known antimicrobial susceptibility profiles for which key resistance mechanisms have been identified. Whole genome sequences of these strains will be used to test the prediction algorithms. In addition, we will similarly evaluate the predictive value of WGS data generated directly from the specimens from which these strains were isolated, and will use metagenomic approaches directly on clinical specimens to explore the reservoir of antimicrobial resistance genes in a given specimen. A second phase will be based on an ecological study of the reservoir of AMR genes in stool samples from healthy returned travelers. The repository, collected during the COMBAT study (ZonMW funded, 2000 travelers prospectively followed), includes isolates of ESBL-producing Enterobacteriaceae; collected total growth on MacConkey agar, including an enrichment of Gram-negative microorganisms; and a stool sample from each participating traveler. This repository will therefore allow a stepwise analysis from single isolates to mixed bacterial growth to the stool samples from which these bacteria were grown, and a comparison of WGS and metagenomics using the prediction framework developed in WP3. Sequencing data will be compared with available phenotypic and microarray-generated genotypic data on AMR. Depending on the sample size, a risk assessment for carriage of AMR genes in stool on the basis of travel destination and history will be performed.

Task 3: To evaluate the use of NGS for syndromic surveillance based on data from hospitalized patients (AMC, EMC, AUTH, FMER, RKI)

We will use risk mapping to identify ‘hotspot’ regions for the emergence of novel or re-emerging pathogens as determined through WP1. Primary and secondary healthcare settings in hotspots in Europe and beyond, from established clinical and laboratory research networks such as PREPARE, ISARIC, GABRIEL and RESAOLAB, will be contacted for systematic sampling and analysis through the COMPARE analytical workflow to evaluate patients with unexplained illness. In case of unusual disease events, enhanced sampling will be done using the protocol developed in WP2 and the analytical protocols and workflows developed in WP3, 4 and 5. In case of emerging outbreaks in regions not represented by sentinel sites, immediate efforts will be undertaken to include these regions (from sites in our established networks in Europe through PREPARE and globally through GABRIEL and ISARIC).

Deliverables (brief description and months of delivery)
D6.1 Report on NGS based diagnostics in comparison to gold standard methods (M42)
D6.2 Report on WGS and NGS based detection of antimicrobial resistance in stool samples in patients and travelers (M50)
D6.3 Reports on WGS and NGS for syndromic surveillance (M60)
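The comparison of sequence-based resistance predictions with known susceptibility profiles (Task 2 of this work package) comes down to the performance criteria named in the description of work, such as positive and negative predictive value. A minimal sketch of that calculation follows; the function name `predictive_values` and the example data for ten strains are hypothetical.

```python
def predictive_values(predicted, observed):
    """Compare predicted resistance (from sequence) with the observed phenotype.

    predicted, observed: equal-length lists of booleans, True = resistant.
    Returns a dict with sensitivity, specificity, PPV and NPV
    (None where a denominator is zero).
    """
    pairs = list(zip(predicted, observed))
    tp = sum(1 for p, o in pairs if p and o)          # predicted R, phenotype R
    fp = sum(1 for p, o in pairs if p and not o)      # predicted R, phenotype S
    fn = sum(1 for p, o in pairs if not p and o)      # predicted S, phenotype R
    tn = sum(1 for p, o in pairs if not p and not o)  # predicted S, phenotype S

    def ratio(num, den):
        return num / den if den else None

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Hypothetical results for ten strains and one antimicrobial.
predicted = [True, True, True, True, False, False, False, False, False, False]
observed = [True, True, True, False, False, False, False, False, False, True]

print(predictive_values(predicted, observed))
```

Predictive values depend on how common resistance is in the tested collection, so figures obtained on a selected strain collection would not transfer directly to routine clinical specimens.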

Work package number 7 Starting date or Starting Event Month 18

WP title UNDERPINNING RESEARCH ON FOODBORNE OUTBREAKS USING THE COMPARE ANALYTICAL FRAMEWORK

Participant # and short name: 1 DTU; 2 EMC; 3 SSI; 4 FLI; 5 ANSES; 6 RKI; 7 EMBL; 8 ISS; 9 RIVM; 10 AHVLA; 11 UEDIN; 12 UK-Bonn; 13 AMC; 14 UA; 15 Artemis; 16 UCAM; 17 TIHO; 18 UCLM; 19 FMER; 20 AUTH; 21 IFREMER; 22 EUR; 23 ANU; 24 WIGNER; 25 CIVIC; 26 RT; 27 UNIBO; 28 DSMZ; 29 WTSI

Person months per participant

15 14 41 12 21 10 27 12 11 23 1 17

Objectives:
• To establish robust analytic procedures of NGS/WGS data for Salmonella, STEC/EHEC, Listeria, norovirus, hepatitis A, and Cryptosporidium to be able to identify epidemiologically linked isolates and differentiate these from similar unrelated isolates;
• To develop guidelines for interpretation criteria for defining clusters of disease and linking of isolates from various sources and reservoirs;
• To enable backward compatibility with important previous nomenclature (e.g. serotypes, species, MLST);


• To perform pilot studies in collaboration with global partners and networks including ECDC and EFSA.
In collaboration with WPs 3, 4 and 5:
• To evaluate analysis tools developed in WP3 and WP5 to predict phenotypic traits such as antimicrobial resistance, invasiveness and virulence;
• To improve the evolutionary reconstruction of the organisms and thus provide a rational framework for investigations into the population structure, genetic diversity, epidemiology and transmission, and identification of spatial-temporal clusters;
• To provide the input for the epidemiological, source attribution and risk assessment models developed in WP4.

Description of Work (where appropriate, broken down into tasks, lead partner and role of participants)
Here, we will pilot the use of NGS/WGS based pathogen genome data in the context of these surveillance activities for the main food-borne bacteria, viruses and parasites, i.e. Salmonella, E. coli, Listeria, Cryptosporidium, norovirus, and hepatitis A. The research studies will use well defined biobanks with affiliated metadata that are essential for outbreak detection, source tracing, and analysis of spread in time and geography. The partners involved are national and international reference laboratories operating in relevant international networks. The potential use in real time requires the use of metadata that are often highly sensitive (person-identifiable, or pointing at specific food companies) and can therefore generally not be freely shared. For this, the pilot team will collaborate with WP12 Barriers to explore differential access rights to the data, or combined analysis of sequence data and metadata stored in separate systems (e.g. TESSy, WHO, GenomeTrakr, NoroNet, and EU-RL databases). The discriminatory power and robustness of NGS-based epidemiological analysis will be determined for the different pilot pathogens and surveillance questions, e.g. trend analysis, cluster detection, SNP detection, detection of specific virulence genes and antimicrobial resistance determinants, or presence of certain plasmids. This requires in-depth analysis of pilot datasets, selected based on experience with current surveillance standards and curated against the factors that may influence data quality (WP1 and 2). Sequence-based cluster analysis tools and combined analyses of sequence data with metadata will be compared with current sequence-based typing techniques (minimum spanning trees, strength and directionality of the correspondence or congruence with current reference typing methods and epidemiological cluster detection methods).
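To make the minimum spanning tree comparison concrete, the sketch below builds such a tree from pairwise SNP (Hamming) distances between aligned sequences and then cuts edges above a distance threshold to obtain clusters, which is equivalent to single-linkage clustering. The choice of Prim's algorithm, the threshold of 2 SNPs and the toy sequences are assumptions for illustration; the proposal does not fix any of them.

```python
def hamming(a, b):
    """Number of differing positions between two aligned, equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def minimum_spanning_tree(seqs):
    """Prim's algorithm on the complete graph of pairwise distances.

    seqs: {isolate name: aligned sequence}
    Returns a list of edges (u, v, distance).
    """
    names = list(seqs)
    in_tree = {names[0]}
    edges = []
    while len(in_tree) < len(names):
        # Shortest edge that connects the tree to an isolate not yet in it.
        d, u, v = min(
            (hamming(seqs[u], seqs[v]), u, v)
            for u in in_tree for v in names if v not in in_tree
        )
        edges.append((u, v, d))
        in_tree.add(v)
    return edges


def clusters(seqs, edges, max_dist):
    """Groups of isolates that stay connected after removing edges longer than max_dist."""
    parent = {name: name for name in seqs}

    def find(name):
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for u, v, d in edges:
        if d <= max_dist:
            parent[find(u)] = find(v)
    groups = {}
    for name in seqs:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical aligned core-genome fragments for five isolates.
seqs = {
    "iso1": "ACGTACGTAC",
    "iso2": "ACGTACGTAT",
    "iso3": "ACGTACGAAT",
    "iso4": "TTGTACCTGC",
    "iso5": "TTGTACCTGG",
}
edges = minimum_spanning_tree(seqs)
print(edges)
print(clusters(seqs, edges, max_dist=2))
```

With these toy data the tree links iso1, iso2 and iso3 by single SNP steps and joins iso4 and iso5 to them through one 4-SNP edge, so a threshold of 2 SNPs yields two clusters. Choosing that threshold per pathogen is exactly what the estimation of within-outbreak variation in this work package is meant to inform.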
Background on pathogens and collections in the pilots. Partners contributing to this WP are selected because they:
• Are national reference laboratories for the different pathogens, familiar with the existing surveillance set-up and methods
• Have access to well defined biobanks and preliminary data, ensuring that the work done in COMPARE is not redundant
• Perform high quality research on the different pathogens.
In addition to large collections of representative samples and isolates for each of the pathogens, as well as paired samples from known outbreaks, essential for validation of NGS/WGS for cluster detection, the following samples and datasets are unique in COMPARE:
• RKI has about 1000 characterized EHEC samples and a rich set of metadata from the outbreak caused by a virulent, rare EHEC strain of serovar O104:H4 that was identified as the cause of the E. coli outbreaks that struck Germany and France in the spring and summer of 2011. The outbreak in Germany was the country’s biggest food-borne bacterial outbreak for 60 years;
• ANSES is the European reference laboratory for listeriosis, coordinating a network of National Reference Laboratories (NRLs) throughout Europe. The biobank includes samples from a multinational outbreak due to Quargel cheese affecting Austria, Germany and the Czech Republic, with 34 cases and 9 fatalities (Fretz et al. 2010a, 2010b);
• ISS is the European reference laboratory for foodborne parasites, and will contribute rich data on cryptosporidia;
• RIVM and EMC coordinate the global NoroNet and HAVnet, international networks of laboratories developing sequence based surveillance for early detection of diffuse food-borne outbreaks of norovirus and hepatitis A. Recent examples for which specimens are available are the outbreaks linked to semi-dried tomatoes, with cases in France, The Netherlands and Australia, and to strawberries, with cases in several Nordic countries (Petrignani et al. 2012; Nordic outbreak investigation team 2013).

Workplan
Task 1: To develop and validate a work plan for NGS based food-borne outbreak investigation (SSI, ANSES, ISS, AHVLA, DTU, EMC, IFREMER, UNIBO, RKI)

Year 1: A representative set of samples/isolates of the different pathogens from different sources and locations, including samples from well-defined point-source and diffuse outbreaks as well as sporadic “background” isolates, will be selected for NGS/WGS analyses. Publicly available sequence data will be used when possible and supplemented with sequencing in this project. The selection will cover relevant diversity for each of the pathogens, to allow comparison of NGS/WGS based typing with reference methods and thereby establish backward compatibility. An expected 1,000 isolates of each target pathogen will be sequenced. The data will be used for structuring the phylogenetic analysis and cluster definitions in WP4.

Year 2: Based on the data provided in year 1, various NGS/WGS analysis procedures will be evaluated and optimised for use in surveillance, outbreak detection and cross-sector comparison of strains. Based on previous well-defined outbreaks, the level of variation within outbreaks and among other epidemiologically linked isolates will be estimated. Relevant existing nomenclature and markers (e.g. serotype, MLST, antimicrobial resistance) will be extracted to enable backward compatibility. If relevant, new nomenclature will be proposed and integrated in the COMPARE platform.

Year 3-5: Prospective sequencing of samples/isolates from patients, food and animals. Further development of the analytic tools developed in WP4 to include accessory genes, plasmids and prophages in addition to the phylogeny-based core genome analysis developed in years 1-2. Integration of the pilot data with epidemiological modelling, source attribution and risk assessment. Integration of COMPARE in national and international surveillance and its use for prospective outbreak detection and investigation. Processing of input from external users of COMPARE and further development based on this. Definition of markers for host specificity that can be used for source attribution and risk assessment.

Task 2: International study in collaboration with global partners and networks (EMC, DTU, SSI, ANSES, RIVM)

Based on the work performed during the first years of the COMPARE project, we will during year 5, in global collaboration, perform a large scale pilot study for supranational surveillance and detection of foodborne disease outbreaks. The details cannot be planned at this stage, since they will depend on the initial work as well as on the global development and availability of sequencing technologies around the world. It is, however, expected at this stage that it will be possible to execute a large pilot collaboration based on several foodborne organisms that are by then routinely sequenced, involving laboratories in more than 40 countries on all continents, in which data are generated, analysed and compared in real time and automated epidemiological analysis provides real-time information to those who need to take action. It is further expected that the pilot will integrate data generated from both single isolates and metagenomic data from both clinical and environmental samples.

Deliverables (brief description and months of delivery)
D7.1 Database of reference genomes of an epidemiologically relevant selection of each of the food-borne organisms in the pilot (M16)
D7.2 Report on comparison between NGS based analysis and reference methods for cluster detection for all pilot organisms (M36)
D7.3 Database of markers for host-association and ecophysiology (M36)
D7.4 Improved guidelines for interpretation criteria for defining clusters of disease and linking of isolates from various sources and reservoirs (M48)

Work package number 8 Starting date or Starting Event Month 12

WP title UNDERPINNING RESEARCH: NOVEL APPROACHES TO (RE-)EMERGING DISEASE DETECTION AND OUTBREAK RESPONSE USING THE COMPARE ANALYTICAL FRAMEWORK

Participant # and short name: 1 DTU; 2 EMC; 3 SSI; 4 FLI; 5 ANSES; 6 RKI; 7 EMBL; 8 ISS; 9 RIVM; 10 AHVLA; 11 UEDIN; 12 UK-Bonn; 13 AMC; 14 UA; 15 Artemis; 16 UCAM; 17 TIHO; 18 UCLM; 19 FMER; 20 AUTH; 21 IFREMER; 22 EUR; 23 ANU; 24 WIGNER; 25 CIVIC; 26 RT; 27 UNIBO; 28 DSMZ; 29 WTSI

Person months per participant

9 48 18 18 4 8 30 22 10 20 29 20 62 25 8 8 1


Overall objectives:
1. To develop and pilot early detection and surveillance of enteric pathogens and genes through strategic sampling and metagenomic analysis;
2. To develop and pilot “hot-spot” based syndromic surveillance in animals and humans;
3. To develop and pilot early detection and surveillance of emerging zoonoses from wildlife reservoirs through strategic sampling and metagenomic analysis;
4. To develop and pilot early detection of changes in pathogen traits enhancing the risk of outbreaks and pandemics.

Description of Work
Current European (and global) detection and surveillance of emerging diseases rely heavily on case ascertainment by clinicians and are therefore subject to important selection biases. Here, we will pilot the use of NGS/WGS-based pathogen genome data in the context of early detection of emerging infectious diseases for bacteria, viruses and parasites, using alternative approaches. The proposed research will help to define, refine and pilot the analytical workflows developed in WPs 1 and 5 while addressing important research questions in different domains. The workplans have been developed and will be executed in close collaboration with existing networks, including the specific networks for food-borne pathogens (see section 1.3). Since emerging infections are by definition unknown a priori, the pilots have not been chosen according to pathogen, but to reflect different modes of emergence.

Task 1 - Mining metagenomic data from stool and environmental samples for surveillance purposes (EMC, RIVM, AMC, ISS, DTU, SSI, IFREMER, UNIBO, FLI)
The increasing use of metagenomic sequencing produces potentially interesting sources of surveillance information that are currently not used for that purpose. Surveillance data provide essential background for outbreaks by serving as a reference for what is “normal” in the population. We will explore this potential as a novel approach to surveillance for pathogens that are transmitted by the fecal-oral route, using strategically chosen samples that are already collected or can be collected with minimal effort.
The choice of sites will be guided by the risk mapping exercise in WP1, and samples will be analyzed using optimized protocols (WP2), followed by mining with the workflows developed in WPs 3, 4 and 5 for the presence of antimicrobial resistance genes, common food-borne pathogens, and emerging pathogens, respectively. This will be based on the analysis of the following samples.

Stool samples collected for population screening for cancer
Population-based testing for early detection of colon cancers is currently implemented in several countries in Europe. This screening program relies on self-sampling by adults and testing of these specimens for the presence of hemoglobin in screening laboratories, after which the samples are discarded. In collaboration with the persons responsible for the rollout of this screening program in The Netherlands, the potential use of these samples for surveillance will be investigated.

Stool samples from travelers and airline sewage
An increasing part of the world’s population is travelling through airports, and this is known to be a major route of transmission for infectious diseases (WHO 2007). In 2012 approximately 1.11 billion passengers travelled on an international flight, a number expected to increase to 1.45 billion in 2016 (http://dtxtq4w60xqpw.cloudfront.net/sites/all/files/pdf/unwto_barom13_01_jan_excerpt_0.pdf). Metagenomic sequencing of toilet waste from 18 international flights into Copenhagen showed significant differences between flights from SEA and North America in the presence of AMR genes and selected enteric bacterial and viral pathogens, indicating that this could be a target for surveillance. Similarly, the cohort described in WP3 task 2 will allow a pilot ecological study of metagenomic detection of enteric pathogens and commensals (viral, bacterial, parasitic) in stool samples from healthy returned travelers, and of the potential association of travel destination, travel-related enteric disease history, and risk behavior with the presence of species-specific sequences.

Sewage collected at slaughterhouses
An increasing part of the world’s population is connected to sewage systems; in Western Europe up to 90%. From a surveillance point of view, sewage is thus an attractive matrix, combining material from a large healthy population that would otherwise not be feasible to monitor. Sewage sampling is today part of polio monitoring and has been piloted for measles virus as well. In addition, as slaughtering of animals is concentrated in larger slaughterhouses, there is a potential to use their sewage systems to quantify all pathogens present. In this pilot project we will sample the effluent of major slaughterhouses and compare this to the prevalence of pathogens in animals (e.g. using hepatitis E as a model).

Task 2: To develop and pilot early detection and surveillance of emerging zoonoses from wildlife reservoirs through strategic sampling and metagenomic analysis (EMC, IREC, UK-Bonn, Artemis, TIHO, FLI, AUTH)
Important strategic sampling sites for wildlife are wildlife health centres (WHCs) for endemic wildlife and border inspection posts (BIPs) for exotic wildlife. Regarding endemic wildlife, a survey of WHCs from 25 European countries indicated that every year about 18 000 wild animals are examined by general surveillance (testing for cause of death or illness by autopsy, supplemented by histopathology, bacteriology, parasitology and, less frequently, virology) and >50 000 wild animals are examined by targeted surveillance (testing for a specific pathogen). Regarding exotic wildlife, annual records indicate that legal imports into the EU via BIPs include millions of captive-bred wild birds and small mammals (such as prairie dogs and meerkats), increasing numbers of non-human primates, as well as 1.5 billion ornamental fish and 10 million reptiles. Quarantine requirements at BIPs apply only to certain species, and monitoring takes place only for a small selection of pathogens. In this task, we will determine the suitability of sampling protocols (WP1), protocols for handling and processing of samples (WP2), and analyses of obtained data for routine surveillance, epidemiological investigations, and rapid risk assessment (WP1 and 5) for at least one representative WHC and BIP in each participating country. The initial part of the task will involve retrospective NGS analysis of known cases of wildlife morbidity and mortality where the primary cause has been established to be of viral, bacterial, or protozoal origin, and where suitable samples of tissue, respiratory excreta, feces, blood, and/or arthropod vectors are available.

Task 3: To develop and pilot early detection of changes in pathogen traits enhancing risk of outbreaks and pandemics (EMC, AMC, UK-Bonn, UCAM, UEDIN, FLI, AHVLA)
In this task, we will pilot the analytical workflows for monitoring of pathogen traits during emerging disease outbreaks by focusing on viruses from three families that have representatives both in humans and in animals, and for which cross-species transmission events have occurred in the past (influenza viruses, coronaviruses, flaviviruses [hepaciviruses and pestiviruses]). We will use sample collections from WP1 (bats, rodents, humans) and experimentally infected animals, processed through the pipelines of WP2, to test and optimize the analytical tools described in WP5. Initially, we will focus on zoonotic viruses associated with respiratory disease or hepatitis (influenza A viruses, coronaviruses, hepaciviruses) and on the economically important pestiviruses, because excellent sample collections are available (influenza viruses in birds, humans, and ferrets; coronaviruses and hepaciviruses in bats and rodents) and there are key scientific questions of relevance to pathogen emergence to be addressed. If successful, follow-up work may include other pathogens that are associated with additional diseases (e.g. enteric/foodborne pathogens, including bacteria).

Initially, we will work on prediction of traits associated with host range and organ tropism from NGS data. While bat-associated coronaviruses show fecal-oral transmission patterns (Drexler et al. 2011), a large clade of coronaviruses detected recently in rodents (Drexler, unpublished) shows systemic infection with signs of hepatitis. When influenza A viruses are transmitted from birds to mammals, they switch from replicating in the intestinal tract (birds) to the respiratory tract (some birds and mammals), which is associated with a switch in receptor specificity and additional changes associated with the new host environment (e.g. body temperature, pH) (Herfst et al. 2012, Russell et al. 2012, Linster et al. 2014, in press). We will investigate the evolutionary mechanisms underlying the switches in host range and tissue tropism in relation to pathogen emergence and transmission from their original hosts. We will use (sequential) samples from infected humans and animals to characterize within-host and between-host virus evolution, including evolution in response to various selection pressures, and to identify the evolutionary relevance of the (minority) variation observed. This analysis will provide information about the required depth and frequency of deep sequencing in future routine surveillance studies. We will test the prediction of phenotypes from genotypes extensively for influenza A viruses, including prediction using structural homology models for genotypes not detected previously. For the hepaciviruses and coronaviruses of rodents and bats, this pilot will include NGS-based attempts to identify novel virus lineages in bats and rodents and to analyze these lineages in relation to numerous (metadata) variables, such as organ tropism, transmission patterns, or host traits (e.g. social group size, host phylogeny).

This pilot thus aims to bring together the full spectrum of NGS approaches (WGS, metagenomics, and SNP detection by deep sequencing) with extensive genetic/phylogeny/phylogeography analysis, in association with histopathologic analyses and various analyses of metadata. We anticipate that this pilot will identify and predict key drivers of host-switching, transmission, and pathogen emergence, and will be key to further developing, and identifying the limits of, the analytical tools developed in COMPARE.

Deliverables (brief description and months of delivery)
D8.1 NGS sequencing data on influenza A viruses, coronaviruses, pestiviruses, and hepaciviruses from various hosts (month 36)

D8.2 Identification of key drivers of evolution of influenza A viruses, coronaviruses, pestiviruses and hepaciviruses in relation to virus emergence (month 56)

D8.3 Report on results of pilot study of metagenomic pathogen detection e.g. in stool samples from healthy travelers, before and after travel (month 60)

Work package number 9 Starting date or Starting Event Month 1

WP title COMPARE DATA AND INFORMATION PLATFORM
Participant # 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
Short name of Participant: DTU, EMC, SSI, FLI, ANSES, RKI, EMBL, ISS, RIVM, AHVLA, UEDIN, UK-Bonn, AMC, UA, Artemis, UCAM, TIHO, UCLM, FMER, AUTH, IFREMER, EUR, ANU, WIGNER, CIVIC, RT, UNIBO, DSMZ, WTSI

Person months per participant

120 20 12 131

1 60 12

Objectives
An important barrier to routine application of NGS/WGS/WCS of pathogens in clinical and public health laboratories is the current capacity for bioinformatics and data management. This “big data” challenge has been recognized internationally, and calls for new solutions for data storage and rapid sharing that will be capable of handling the expected massive increase in data in the coming years. Here, we develop and pilot the core infrastructure for such future routine applications, building from existing European ICT infrastructure that has served the needs of the wider research and public health community for the past 30 years, but linking this to frontline developments in bioinformatics and NGS/WGS/WCS. The system will support the spectrum of sequence-based analyses of relevance to pathogen detection and characterization, surveillance, outbreak detection and investigation, from single-locus approaches, through whole-genome methods, to metagenomics studies. Data types will include contextual meta-data, primary data (sequence reads), and derived data (such as genomic alignments of reads, assemblies and functional annotation). Supporting the public health, clinical, research and tools development communities, the system will be scalable and sustainable beyond the duration of funding for COMPARE. Full technical specifications will be provided to allow the system to be replicated in alternative hardware infrastructures according to future need and capacity.

The objectives are:
• To create and operate the COMPARE Data Resource, workflow engine and portal;
• To create user spaces for COMPARE workflow development and pilot projects;
• To ensure the long-term sustainability of the tools developed and data generated.

Description of Work
The above indicated partners will design, prototype, construct and deploy the COMPARE Data and Information Sharing Platform (see figure 6).

Figure 6: System outline for COMPARE Data and Information Sharing Platform, showing the existing Embassy cloud computing environment, connected data sources (second box from the left), an environment in which the core data analysis pipelines will operate (the workflow engine), the workflow development environments relating to WPs 3-5 (three boxes) and the COMPARE end-user portal.

An informatics infrastructure will be constructed in which a virtualized continuous data and compute environment will be provided for the use of the COMPARE Consortium, leveraging existing EMBL-EBI and DTU informatics systems and the existing ‘Embassy’ cloud system at EMBL-EBI. This environment will offer four key benefits. First, large sequence data sets of relevance to COMPARE (including but not limited to COMPARE Consortium-provided data) will be physically co-located with substantial dedicated computational capacity. Second, the 'elasticity' offered by cloud technologies will allow access, at times of peak demand, to existing computational capacity at EMBL-EBI and DTU beyond the capacity dedicated to COMPARE. Third, the globally comprehensive public data set (including COMPARE, NCBI, INSDC and US-FDA) will be available directly within the system for immediate access and use. Fourth, existing mechanisms and data resources for archiving and global sharing of sequence data (see footnote 3) will be leveraged; such mechanisms and resources have been operated by EMBL for over three decades, were key drivers for the establishment of EMBL-EBI 20 years ago, and now serve as a critical infrastructure within ELIXIR (http://www.elixir-europe.org/), the hub of which is provided by EMBL-EBI.

While COMPARE will have an operational focus on COMPARE analytical methods, the broad community beyond COMPARE will be able to present their data for use with COMPARE tools, and their tools for use with COMPARE data, through i) the provision of publicly accessible application programming interfaces (APIs), ii) specifications of these interfaces for deployment on data sources and portals external to COMPARE, and iii) the COMPARE Registry of data sources.
Task 1: To create and operate the COMPARE Data Resource
The data resource component (see figure 6: 'Sources') will contain data under three classes of availability: unrestricted public data, managed access data (accessible to all users complying with access agreements and applying appropriate security measures) and private data (for closed consortia). Public data will be exchanged globally and automatically through INSDC. Because the EMBL-EBI mandate includes the maintenance of public domain research data, covering public data and managed access data generated under COMPARE, data management and storage requirements for this sizeable proportion of data will be met outside COMPARE, representing a significant in-kind contribution. For core data management technology, we will leverage existing technology that is used across EMBL-EBI's sequence data resources (such as the data validation and indexing framework and the CRAM sequence data compression software; https://www.ebi.ac.uk/ena/about/cram_toolkit), while we will specifically leverage the institute's European Nucleotide Archive (ENA) for the management of public data, and the access management and security provisions in use in the institute's European Genome-phenome Archive (EGA).

Application programming interfaces
To support programmatic entry of data, we will construct a data input Application Programming Interface (API). While taking advantage of design lessons learnt in developing the existing ENA data submissions API, this new API will be constructed so that it can support high-throughput batch uploads of sequence data at the high speeds appropriate for data reporting in public health systems (i.e. on a scale of a few hours for hundreds of isolate data sets). The API will support uploads of accompanying meta-data.
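As an illustration only, the three availability classes and a pre-upload metadata check for a batch submission could be sketched as follows. All class names, field names and the list of required metadata fields are hypothetical; the proposal does not specify the API at this level of detail.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Availability(Enum):
    PUBLIC = "public"      # unrestricted, exchanged through INSDC
    MANAGED = "managed"    # requires compliance with an access agreement
    PRIVATE = "private"    # restricted to a closed consortium


@dataclass
class User:
    name: str
    has_access_agreement: bool = False
    consortia: set = field(default_factory=set)


@dataclass
class DataSet:
    accession: str                     # hypothetical identifier
    availability: Availability
    consortium: Optional[str] = None   # only meaningful for PRIVATE data
    metadata: dict = field(default_factory=dict)


def can_access(user: User, data_set: DataSet) -> bool:
    """Apply the access rule that matches the data set's availability class."""
    if data_set.availability is Availability.PUBLIC:
        return True
    if data_set.availability is Availability.MANAGED:
        return user.has_access_agreement
    return data_set.consortium in user.consortia


def validate_batch(batch, required=("sample_description", "pathogen_taxonomy")):
    """Return {accession: [missing fields]} for data sets lacking metadata."""
    problems = {}
    for ds in batch:
        missing = [name for name in required if not ds.metadata.get(name)]
        if missing:
            problems[ds.accession] = missing
    return problems
```

Checking metadata before transfer is one possible design choice for a high-throughput input API: incomplete submissions are rejected before any large sequence files are moved.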
We will work with providers of sequencing platforms to integrate the data produced by their platforms as smoothly as possible with the COMPARE system, for example with Illumina's BaseSpace cloud analytical platform, for which the vendors have already confirmed support for our CRAM sequence data compression format. The system will support transfer of all three availability classes of COMPARE data and will support scenarios where the user operates a local meta-data store outside the COMPARE system. The data output API will provide the technical back end for all interfaces and workflows that require access to COMPARE data. It will support both querying and retrieval of all COMPARE data and all publicly available data from beyond COMPARE (e.g. NCBI and US-FDA). Querying services will include look-up by annotated contextual data (e.g. sample description, pathogen taxonomy, pathogenicity profile), will support hierarchy-aware queries (e.g. to search for defined taxonomic groups), numeric range queries and geospatial functions, and will provide sequence similarity search methods. Data retrieval functions will be supported that include whole data set retrieval (e.g. sequence data file retrieval for a given reported data set), or data ‘slices’ (e.g. partitions representing stacks of assembled sequences for a given locus across multiple samples). Data retrieval will provide multiple output format options to facilitate downstream use with different software programs.
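The kinds of queries listed for the output API (hierarchy-aware taxonomic look-up, numeric ranges, geospatial filters) can be sketched over in-memory records as below. This is a toy model under stated assumptions: the taxonomy table, the record fields and the `query` function are illustrative and are not part of any existing COMPARE or ENA interface.

```python
from dataclasses import dataclass

# Toy taxonomy stored as child -> parent links (illustrative only).
PARENT = {
    "Salmonella enterica": "Salmonella",
    "Salmonella": "Enterobacteriaceae",
    "Escherichia coli": "Escherichia",
    "Escherichia": "Enterobacteriaceae",
    "Enterobacteriaceae": None,
}


@dataclass
class Record:
    accession: str   # hypothetical identifier
    taxon: str
    year: int
    lat: float
    lon: float


def in_taxon(taxon, group):
    """True if `taxon` equals `group` or descends from it in PARENT."""
    while taxon is not None:
        if taxon == group:
            return True
        taxon = PARENT.get(taxon)
    return False


def query(records, group=None, year_range=None, bbox=None):
    """Return accessions matching all given filters.

    year_range is (first, last), inclusive.
    bbox is (min_lat, min_lon, max_lat, max_lon), inclusive.
    """
    hits = []
    for r in records:
        if group is not None and not in_taxon(r.taxon, group):
            continue
        if year_range is not None and not (year_range[0] <= r.year <= year_range[1]):
            continue
        if bbox is not None and not (
            bbox[0] <= r.lat <= bbox[2] and bbox[1] <= r.lon <= bbox[3]
        ):
            continue
        hits.append(r.accession)
    return hits
```

A hierarchy-aware query for "Salmonella" here also returns records annotated as "Salmonella enterica", which is the behaviour the text describes for searches over defined taxonomic groups.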

3 This global sequence data sharing exists under the International Nucleotide Sequence Database Collaboration (INSDC), a long-standing data sharing initiative between the NCBI, the DDBJ and EMBL in which all public domain sequence data are shared globally and rapidly to serve the scientific community (see http://www.insdc.org/).

Registry
We will operate a registry service in which providers of data resources external to COMPARE who have committed to support the COMPARE Data Resource output API will provide information about the content that they offer. The registry will support querying of data across a global ecosystem of data sources.

Data coordination
Data coordination will be provided for data emerging from COMPARE research projects in WPs 6-8. This will include support in data reporting, with a focus on training and assistance in the use of COMPARE reporting tools and APIs, ensuring the operation of appropriate generic workflows and the provision and reporting of analytical outputs to pilot working groups.

Task 2: To create and operate the COMPARE Workflow Engine
The COMPARE Workflow Engine, represented in the 'Processes' section in figure 6, will support data analysis by provision of compute capacity supporting the COMPARE quality-assured workflows developed in WP1-5 and piloted in WP6-8. These workflows will generate systematic data products, such as assemblies and annotation sets, which will be stored in and made available from the COMPARE Data Resource. Early target workflows will include processes using tools available within and beyond the COMPARE consortium, such as quality filtering of raw sequence data; assembly from reads into sequence-overlap contigs using VELVET; species identification from mixed samples; MLST determination from isolates; detection and quantification of resistance and virulence genes using CBS-tools, the ARDB antibiotic resistance gene database and the Virulence Factor Database (VFDB), among others (http://molbiol-tools.ca/Genomics.htm); functional annotation using PROKKA; metagenomic functional analysis using InterProScan, MG Portal, MG-RAST and MG-Mapper; QIIME-based taxonomic profiling based on in silico filtered rRNA; assembly of viral populations using ALLPATHS; gene prediction for viruses using FGENESV; and phylogenetic analysis using e.g. SNPtree.
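The idea of a workflow that chains processing steps and keeps each intermediate data product can be sketched as follows. This is a minimal sketch, not the COMPARE Workflow Engine: the two steps are toy stand-ins written for this example (they do not call VELVET, PROKKA or any other tool named above), and the function and product names are hypothetical.

```python
def quality_filter(reads, min_length=5):
    """Toy stand-in for read filtering: drop short reads and reads containing N."""
    return [r for r in reads if len(r) >= min_length and "N" not in r]


def gc_content(reads):
    """Toy stand-in for a downstream analysis: overall GC fraction of the reads."""
    total = sum(len(r) for r in reads)
    gc = sum(r.count("G") + r.count("C") for r in reads)
    return gc / total if total else 0.0


def run_workflow(steps, raw_input):
    """Run steps in order; each step consumes the previous step's output.

    Returns every intermediate product keyed by its name, so that products
    could later be stored and served by a data resource.
    """
    products = {}
    current = raw_input
    for name, func in steps:
        current = func(current)
        products[name] = current
    return products


# A workflow is an ordered list of (product name, step function) pairs.
WORKFLOW = [("filtered_reads", quality_filter), ("gc_content", gc_content)]
```

Keeping every named product, rather than only the final result, mirrors the statement that workflows generate systematic data products that are stored and made available afterwards.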
Task 3: To create user spaces for COMPARE workflow development and pilot projects

Virtual user instances
We will make user-configurable virtual compute environments available to COMPARE Consortium partners to support the development of analytical methods and workflows. These will be provided from the Embassy infrastructure. Users will be given full root access to these environments and will be able to configure operating systems of choice and whichever tools and supporting software they find to be appropriate.

Integration of workflows
In a further compute environment under Embassy, we will harden, configure and provide iterative feedback on emerging workflows so that they are appropriate for the COMPARE workflow engine, deriving our input from requirements gathered from the pilots and user/stakeholder consultations.

Task 4: To create and operate the COMPARE portal
COMPARE will create and operate a portal in which partners and outside researchers and users will report their new data sets into the system and will query, analyse and visualise COMPARE data, global public data of relevance and connected external data resources. Users reporting data into the system will have access to a simple web interface, a spreadsheet upload facility and (linking more directly to the COMPARE Data Resource input API) a full programmatic reporting service, reflecting the need of COMPARE to support all scales of data generated. The portal will be tailored to the needs of the main user groups (clinical, public health, research), and it is expected that the majority of consumer end-users engaging with COMPARE will use this portal. The COMPARE Consortium will customize the exact functions to be made available in the portal through the WP11 User Consultations. Advanced users will be provided with an API on the portal that will support programmatic access to portal functions. For both web and programmatic uses, full access to all data types will be supported.
Overall, the COMPARE Data and Information Sharing Platform will provide and support, as an open, publicly available service, the combined APIs of the Registry, COMPARE Data Resource, Workflow Engine and portal. These combined APIs will constitute full back-end functionality for those within and beyond the COMPARE Consortium to, for example, consume a number of data sources through the registry and apply their own analysis for their users or, alternatively, provide their own data back-end and take advantage of analysis routines in the generic workflow engine.

Task 5: To ensure long-term sustainability
The informatics configuration selected for the COMPARE project places long-term sustainability beyond the funded period as a central priority. Substantial leverage of existing well-supported resources means that even in the event that no operational support is available beyond the funded period, public data and managed access data will remain available for the long term. In addition, EMBL and DTU see the continued support for the capture and analysis of new data as part of their core mandate. While commitment beyond the funded period to analysis workflows and other COMPARE user services will require the availability of new funding, the past performance of these partner institutions shows decades of provision of service and informatics supporting the scientific community. Therefore, we consider it likely that continued support will become available.

As an additional safeguard for long-term sustainability, we will make available to the public, as part of the project, full technical documentation (covering such areas as system architecture, file format descriptions, API protocols and data coordination workflows), software and pre-configured virtual machines (see D8, for example), offering all components freely and under open source principles, using, for example, the Apache 2.0 license for software. This will not only allow others to recreate such components as the COMPARE Workflow Engine in the event that our system ceases to be supported (note here the convenience of using cloud virtualisation technologies to support simple reinstatement of virtual machines); if continued funding is secured, it will also enable a diversity of groups to benefit from the work under COMPARE, thus developing an ecosystem of COMPARE-enabled resources. In addition, the outline of the system description will be submitted to an appropriate peer-reviewed journal for publication; this publication will also serve as the entry point for the technical specifications for the system. At the end of the COMPARE project, we will explore future cost-recovery strategies to generate sufficient financial income to continue the development of the platform (see section 2.2).

Deliverables (brief description and months of delivery)
D9.1 Hardware and cloud environment infrastructure (Month 12)
D9.2 COMPARE registry and data resource component, including input and output APIs, supporting read, alignment, assembly and annotation data types (Month 18)
D9.3 Generic workflow engine supporting assembly and functional annotation workflows (Month 36)
D9.4 Integrated analytical workflows from WPs 2, 3, 4 and 5 (Month 48)
D9.5 Full technical specifications and publication submission (Month 60)

Work package number 10 Starting date or Starting Event Month 1

WP title COMPARE RISK COMMUNICATION TOOLS
Participant # 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
Short name of Participant: DTU, EMC, SSI, FLI, ANSES, RKI, EMBL, ISS, RIVM, AHVLA, UEDIN, UK-Bonn, AMC, UA, Artemis, UCAM, TIHO, UCLM, FMER, AUTH, IFREMER, EUR, ANU, WIGNER, CIVIC, RT, UNIBO, DSMZ, WTSI

Person months per participant

60

Objectives
E. To design and develop appropriate risk communication tools and strategies for stakeholders involved in the process, as well as a set of management tools for authorities to handle uncertainties and available options according to the input received through the COMPARE system.

Description of Work (where appropriate, broken down into tasks, lead partner and role of participants)

Task 1: Stakeholder Analysis
Before developing COMPARE risk communication tools (CRCT), it is paramount to carry out a stakeholder analysis in order to identify those stakeholders who are affected by or benefit from the COMPARE system, directly or indirectly (i.e. intended users), as well as those actors who can contribute relevant information and data of any sort to the system (e.g. clinical, epidemiological, microbiological, etc.). We will build on previous research projects on public communication in infectious diseases (see section 1.3) and will inventory both standard syndromic surveillance systems and informal surveillance systems (e.g. ProMED-mail, EpiSPIDER, HealthMap etc.). The main outcome of this task will be a comprehensive database (DB) of the key actors in the risk communication processes relevant to COMPARE. The DB will include stakeholder characteristics (e.g. position, interests, knowledge, alliances, resources, power, leadership, trust, etc.), which will be summarised for easy reference in stakeholder tables with indexes, graphs, and maps.

Task 2: Targeted messages

Preparation of appropriate risk messages for dissemination to, and communication with, specific audiences is the second step, which follows directly from the stakeholder analysis. The notion of targeted messages includes three further concepts: 1) timing (i.e. the right message delivered at the wrong moment could become useless or even counterproductive); 2) contents (i.e. messages are always interpreted by the receiver, so they must be tailored to take such an avoidable distortion into consideration); 3) potential gaps or barriers in the risk communication processes. A FAO/WHO report (footnote 4) on risk communication states that, depending on what is to be communicated and to whom, risk communication messages require consideration of various elements to be effective. The following elements need to be considered for the design and development of communication and management tools in the process of risk assessment for novel pathogens and infectious disease outbreaks: 1) the nature of the risk (footnote 5); 2) the nature of the benefits (footnote 6); 3) uncertainties in risk assessment (footnote 7); 4) risk management options (footnote 8). The main outcome of this task will be a repertory of message templates, best cases, and guidelines for message creation in EIDs, developed in consultation with the WP leaders for WP1-8. These templates and guidelines will take into consideration the current complexity of risk communication, notably uncertainties, conflicting information, misinformation and false alarms, and cultural, ethnic and religious variables. A report will describe, for each of the stakeholder categories, 1) timing for an effective message; 2) contents of an effective message; 3) potential gaps or barriers in the risk communication processes.

Task 3: COMPARE risk communication tool box (CRCT box)
Targeted messages will then be embodied in the overall COMPARE platform, which is the third step of the Communication Workpackage. In this task we will develop risk communication strategies and tools fully compliant with and embedded in the COMPARE system. This will imply the creation of full interoperability at various levels: strategic, syntactic, semantic, conceptual, legal, and so on. The main outcome of this task will be the creation of the COMPARE RISK COMMUNICATION TOOL BOX (CRCT Box), which will support the development of communication messages about findings, outbreaks, and new opportunities discovered and/or generated through COMPARE, addressing different sub-populations in diverse EIDs and geographical, cultural, and temporal contexts. The CRCT Box will be built from the Framework Model and the Communication Kit developed by the TELL ME project, and will be largely based on new media and Internet communication. Its main features, originally targeted at influenza pandemics, will be extended to a universal risk communication strategy covering the various types of EIDs and a wide range of pathogens, with the “One Health” paradigm at the centre of the overall approach.

Deliverables (brief description and months of delivery)
D10.1: Stakeholder Analysis Report including a comprehensive database, with stakeholder characteristics – Month 12
D10.2: Targeted Message Report describing for each main stakeholder category timing and contents of an effective message and potential gaps or barriers in the risk communication processes – Month 24
D10.3: Initial version COMPARE Risk Communication Tool box V1 (crct box V1) – Month 36
D10.4: Beta version COMPARE Risk Communication Tool box V2 integrated and interoperable with the COMPARE platform (crct box V2) – Month 48

 

[4] http://www.fao.org/docrep/005/x1271e/x1271e00.HTM
[5] The characteristics and importance of the hazard of concern; the magnitude and severity of the risk; the urgency of the situation; whether the risk is becoming greater or smaller (trends); the probability of exposure to the hazard; the distribution of exposure; the amount of exposure that constitutes a significant risk; the nature and size of the population at risk; who is at the greatest risk.
[6] The actual or expected benefits associated with each risk; who benefits and in what ways; where the balance point is between risks and benefits; the magnitude and importance of the benefits; the total benefit to all affected populations combined.
[7] The methods used to assess the risk; the importance of each of the uncertainties; the weaknesses of, or inaccuracies in, the available data; the assumptions on which estimates are based; the sensitivity of the estimates to changes in assumptions; the effect of changes in the estimates on risk management decisions.
[8] The action(s) taken to control or manage the risk; the action individuals may take to reduce personal risk; the justification for choosing a specific risk management option; the effectiveness of a specific option; the benefits of a specific option; the cost of managing the risk, and who pays for it; the risks that remain after a risk management option is implemented.


Work package number: 11
Starting date or starting event: Month 1
WP title: USER CONSULTATIONS
Participant #: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
Short name of participant: DTU, EMC, SSI, FLI, ANSES, RKI, EMBL, ISS, RIVM, AHVLA, UEDIN, UK-Bonn, AMC, UA, Artemis, UCAM, TIHO, UCLM, FMER, AUTH, IFREMER, EUR, ANU, WIGNER, CIVIC, RT, UNIBO, DSMZ, WTSI

Person months per participant

3 3 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

Objectives
To design the COMPARE system’s analytical workflow and its main components based on the expert inputs and associated information needs of its intended future users and other stakeholders working in human, animal and wildlife health and food safety.

Description of Work (where appropriate, broken down into tasks, lead partner and role of participants)

Task 1: Establishment of the External Advisory Panels (by Executive Board)
Task 1 involves the formal establishment of the External Advisory Panels (EAPs). External advisors (see section 3.2) will be formally invited to serve on the appropriate EAP. They will have an advisory role in COMPARE, with full (confidential) access to all COMPARE documentation necessary to perform their tasks in the EAP. The EAP meetings will be chaired and organised by the relevant Working Group leader, who is, by virtue of that role, also responsible for the preparation and follow-up of the EAP meetings. The EAPs serve as a forum to bring in the wider stakeholder perspective, to discuss the primary choices made and to provide input on further development of the system.

Task 2: Implementation of EAP meetings (by Executive Board)
The EAPs will be invited to a total of five EAP meeting rounds. Each meeting will be structured around the following main questions, to be addressed from the perspective of the experts:
• “What would trigger you to use the COMPARE system, what are the main questions you would want to address, what information should COMPARE provide for this purpose? What formats and timescales would you consider useful?”
• “What data does COMPARE need to facilitate in terms of capture, storage, access, and how should these be standardized? Which analytic tools should be included in COMPARE to provide these desired outputs? Are these covered by the current set-up of the COMPARE project and system?”
• “To which other national or international research and innovation activities, databases and systems should COMPARE be linked and how?”
• “What do you consider to be the non-technological barriers in using/implementing a system such as COMPARE given the desired outputs, inputs and analytic tools?”
EAP meetings will be held in parallel, to allow for adequate interactions between the EAPs. The starting point of the first round of EAP meetings is the information presented in this proposal and related background documentation. During the course of COMPARE a total of five consecutive EAP rounds is planned, with each round building on the results of the previous period. In preparation for each EAP meeting round (except the first), the EAP chair and WP leader will collect updates on the main issues to be discussed from the other EAP members, and use these to develop and circulate draft reports to be discussed at the next face-to-face EAP meeting.

Task 3: Establishment and consultation of Online User Panel (Executive Board)
The EAPs established under Task 1 will identify a broader group of international peers to serve on an Online User Panel that will be invited to provide expert input on the results of COMPARE. This is done to ensure outreach to a broader group of intended future users of COMPARE. For this purpose, periodic online questionnaires will be administered to the Online User Panel, in collaboration with WP13 Training and Dissemination. The results of the online user consultations will be provided as feedback to the other WPs.

Deliverables (brief description and months of delivery)
D11.1 Combined EAP report 1st cycle (month 3)
D11.2 Combined EAP report 2nd cycle (month 12)
D11.3 Combined EAP report 3rd cycle (month 24)
D11.4 Combined EAP report 4th cycle (month 36)
D11.5 Combined EAP report 5th cycle (month 48)


Work package number: 12
Starting date or starting event: Month 1
WP title: BARRIERS to open data sharing
Participant #: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
Short name of participant: DTU, EMC, SSI, FLI, ANSES, RKI, EMBL, ISS, RIVM, AHVLA, UEDIN, UK-Bonn, AMC, UA, Artemis, UCAM, TIHO, UCLM, FMER, AUTH, IFREMER, EUR, ANU, WIGNER, CIVIC, RT, UNIBO, DSMZ, WTSI

Person months per participant

5 2 2 5 48 5

Overall objective: To identify, clarify and, as far as feasible, develop practical solutions for Political, Ethical, Administrative, Regulatory and Legal (PEARL) barriers that hamper the timely and open sharing of data through COMPARE.

Description of Work
COMPARE will contribute, from a European perspective, to the long-term development of legally sustainable regulations for the sharing of WGS data and metadata on a global scale. Within the project timeframe the main focus of COMPARE will be on the development of interim solutions to PEARL barriers that contribute to open-source sharing of sequence-based data and contextual metadata (see text Box 1 in section 1.2), in the tradition, and for the preservation, of the microbial commons, building on the GESTURE project. This will involve the following tasks, which will be performed in consultation with WP11 User Consultations. All tasks below will be performed by Partners 9 (RIVM) and 1 (DTU) in consultation with the EAP Barriers, the COMPARE Ethics Advisory Board and the other six EAPs established in WP11.

Task 1: Establishment of the External Advisory Panel on Barriers (RIVM, EUR, DTU)
Task 1 involves the formal establishment of the External Advisory Panel on Barriers (EAP Barriers). External advisors (see section 3.2) will be formally invited to serve on the EAP Barriers. They will have an advisory role in COMPARE, with full access to all relevant COMPARE documentation necessary to perform their tasks in the EAP. The EAP meetings will be chaired and organised by the Working Group leader of this WP. The meetings of the EAP Barriers will be held in parallel with the other EAP meetings of WP11 to allow for adequate interactions with the other EAPs.

Task 2: Identifying the legal limitations, conditions and obligations in open data sharing (RIVM, EUR, DTU)
This task describes the current situation, focusing on identification and analysis of the existing legal limitations, conditions and obligations under which samples, isolates, WGS data and contextual metadata can be shared through an open-source international database. This will be done through desk research to analyse the legal basis, interviews, consultation of other EAPs, and expert meeting(s) with the EAP Barriers. We will focus on:
• Identification of barriers under national legislation (to bringing data outside national jurisdiction);
• Clarification of the existing European acquis communautaire on (public) health, data sharing and data protection (in the human / food / veterinary domains);
• Inventory of public obligations and commitments under the International Health Regulations, the EU Decision on cross-border health threats, and EU food and veterinary regulations on notification of hazards and incidents.

Task 3: Constructing an ethical framework and charter of principles for sharing of NGS data at a European level (RIVM, EUR, DTU)
The next step will be the development of a framework for sharing of NGS data. For this, we will:
• Study and clarify the Microbial Commons concept (Uhlir 2009, Dedeurwaerdere/IJC 2010/13);
• Do contextual research on the Declaration of Helsinki and international ethical guidelines on biomedical research (CIOMS/WHO 2002);
• Do a case study of the Pandemic Influenza Preparedness Framework (WHO resolution 64.5, 24 May 2011);
• Organise structured consensus discussions among stakeholders and producers of data;
• Organise a mirror discussion at a global level through GMI conferences and members.
In addition to the approaches described in task 1, we will use structured international Delphi discussions.

Task 4: Developing standard procedures for data shipment (RIVM, EUR, DTU)


This task involves the development of standardized procedures for timely shipment of data, while safeguarding publication rights and the protection of foreground by securing intellectual property rights and the possible valorization of knowledge. This will include:
• Comprehensive problem analysis;
• Evaluative study of the 1996 Bermuda Principles, the 2003 Fort Lauderdale declaration, the 2009 Toronto Statement and the 2009 GESTURE recommendations;
• Case studies and inventory of best practices (e.g. human genome project / Wellcome Trust/Sanger);
• Identification and consultation of IP stakeholders (e.g. ICMJE editors, World Trade Organisation, World Intellectual Property Organisation);
• Laboratory and real-time speed pilots of research-driven stakeholders (COMPARE partners).
Methodology: desk study and legal analysis, identification and consultation of IP experts, piloting possible procedures.

Task 5: Data sharing guidelines (RIVM, EUR, DTU)
This task is geared towards developing guidelines for data analysts and (public health) authorities to prevent them from losing (or fearing to lose) control over their outbreak management and decision making when they dispatch samples and/or data abroad for diagnostics, analyses and monitoring.
• Identification of stakeholders: national, public supra-national and private international;
• Study and analysis of international law on signaling and notification;
• Case studies of known incidents and examples.
Methodology: desk study and analysis, interviewing national focal points, consultation of supra-national organisations and COMPARE user panels, desktop exercises with stakeholders.

Task 6: Legal, administrative and ethical support for COMPARE and work package leaders (RIVM, EUR, DTU)
To clarify existing rules of law and ethics, and to develop tailor-made solutions for COMPARE on request:
• Factsheets and management tools;
• Proposals for guidelines and codes of conduct;
• Model Material Transfer Agreements and model Data Transfer Agreements;
• Memoranda of understanding and model contracts;
• Concrete ethical principles and legal conditions for the database infrastructure.
This will be done in close collaboration with the Executive Board and the Ethics Advisory Board (management section 3.2).
Methodology: quick scans, tailor-made advice, expert consultations and meetings, discussion papers.

Deliverables (brief description and months of delivery)
D12.1 Report on the legal limitations, conditions and obligations in open data sharing (task 2) (month 24)
D12.2 Ethical framework and charter of principles for sharing of NGS data at European level (month 48)
D12.3 Final report and recommendations on data-sharing guidelines (task 4) (month 56)

Work package number: 13
Starting date or starting event: Month 1
WP title: DISSEMINATION AND TRAINING
Participant #: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
Short name of participant: DTU, EMC, SSI, FLI, ANSES, RKI, EMBL, ISS, RIVM, AHVLA, UEDIN, UK-Bonn, AMC, UA, Artemis, UCAM, TIHO, UCLM, FMER, AUTH, IFREMER, EUR, ANU, WIGNER, CIVIC, RT, UNIBO, DSMZ, WTSI

Person months per participant

30 2 3 6 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 5 1 1 12

Overall objective: To ensure that relevant stakeholders of COMPARE are adequately informed about COMPARE’s progress and results and have access to the training they need in order to apply the harmonized workflows, analytical tools and data resources developed and implemented by COMPARE in their pathogen detection and outbreak response activities.

Description of Work
This WP serves to establish dialogue with our stakeholders to create awareness and acceptance among our users and


other stakeholders of COMPARE’s results and their added value for their roles in pathogen and outbreak detection and response. This WP will make use of the contacts established with the representatives of our stakeholders in the EAPs of WP11 and of the stakeholder analyses, communication strategies and tools developed in WP10 RISK COMMUNICATION TOOLS. It consists of two main, mutually reinforcing components: Dissemination and Training. Considering the strategic importance of this WP for the sustainability of COMPARE, and the cross-cutting nature of its activities, the coordination of this WP will fall under the direct responsibility of the Executive Board of COMPARE, consisting of the Working Group leaders and co-leaders.

A. Dissemination

Task 1: Developing and maintaining user-stakeholder database
In collaboration with WP10 RISK COMMUNICATION, this WP will build and maintain an up-to-date contact database of key organizations and persons to be informed regularly by COMPARE about its activities, its results and the added value thereof for these organizations and persons. The database will group the organizations and persons according to their specific interests in the activities and results of COMPARE. This grouping will be the same as the grouping applied in the EAPs of WP11, to attain consistency across the WPs. In addition to this personalized user-stakeholder database, the dissemination activities of COMPARE will also be used for communication with the general public as the ultimate recipients and stakeholders of COMPARE.

Task 2: Producing and distributing promotion materials applying consistent COMPARE identity and visibility
COMPARE will develop a ‘corporate identity’ that will be applied in all internal and external communications of COMPARE, in compliance with article 29.4 of the H2020 grant agreement. The COMPARE identity will include a logo, standard color schemes and typefaces, worked into templates for a leaflet, periodic newsletters, poster presentations and presentation slides as appropriate.

Task 3: Developing and moderating COMPARE public website and Twitter account
COMPARE will develop a public website, linked to the Data and Information platform of WP9, where interested parties can find background information on COMPARE, the consortium partners and the results obtained. Linked to the website will be a COMPARE Twitter account, moderated by the COMPARE Project Office, where COMPARE can engage in direct communication with the public. Other social media such as YouTube and LinkedIn will also be explored. These social media will be used primarily to raise public awareness about COMPARE and its added value to public and veterinary health and food safety in Europe and beyond. COMPARE will also strive to link its public website to other websites/portals of the linked network (e.g. GloPID-R website, ANTIGONE website, GMI website).

Task 4: Participation in national and international conferences, seminars and symposia
The lead investigators in COMPARE are all well embedded in national and international networks, advisory committees, professional societies and conferences. As such they will function as the ambassadors of COMPARE within these large networks. COMPARE will present the progress of its activities at national and international conferences.

Task 5: Publishing in peer-reviewed scientific journals
COMPARE will actively seek to publish its results in peer-reviewed scientific journals, acknowledging COMPARE and the European Commission (in compliance with article 29.4 of the H2020 Grant Agreement). All principal investigators will strive to publish their results in open-access journals (Gold model). If publication in open-access journals is not feasible, the publications will be made available via the COMPARE website, respecting any embargo periods of the journal (Green model).

B. Training
The training component of this WP is made up of the following tasks:

Task 1: Production and distribution of e-learning material (Executive Board)
Each Working Group will produce e-learning material that can be accessed online via the COMPARE portal, including SOPs, guidance documents, manuals, demonstration slides or videos and FAQs as appropriate, and instructions on how to contact the helpdesk for questions and hands-on assistance. This training material will be developed and updated throughout the project to reflect the latest status of the developments in COMPARE.

Task 2: Establishment of COMPARE helpdesk (Executive Board)
COMPARE will establish a virtual helpdesk, consisting of the developers of the analytical workflows, methods and tools developed in WPs 1-10 of COMPARE. This helpdesk will function as first-line assistance to COMPARE users for accessing and applying the analytical tools and workflows available through COMPARE and for interpreting their results. This direct helpdesk support and interaction between users and developers enables the user community of COMPARE to be trained ‘on the job’ in applying the tools and methods


integrated in COMPARE and to ensure that full and adequate use is made of these tools for the purposes for which they have been built.

Task 3: Development and implementation of COMPARE workshops (Executive Board)
Whereas tasks 1 and 2 above rely on the initiative of a user to contact COMPARE, this task is meant to proactively reach out to the (potential) users of (elements of) COMPARE. Making use of the EAPs established in WP11, this WP will produce a one-day modular COMPARE workshop (with modules following the overall WP structure of COMPARE) for the organizations represented in the EAPs (e.g. ECDC, EFSA, WHO, OIE, private sector food companies; for the full list see section 3.2) and others (upon request). In this workshop, principal investigators will present the analytical tools and software, tailored to the specific needs of the workshop audience. The detailed programme of the workshops can be designed in consultation with the respective EAPs. For the duration of the COMPARE project we will plan and budget a total of ten workshops, with the objective of organizing at least one workshop for each category of user-stakeholders as organized into the EAPs (e.g. at least one for public health researchers, at least one for clinicians, at least one for food companies, etc.). The workshops will be advertised on the COMPARE portal and on those of other related networks (e.g. ANTIGONE, PREPARE, GMI, TELL ME).

Task 4: Expert (practical) courses and researcher exchanges
The final training component, designed to promote the impact of COMPARE, consists of practical courses and researcher exchanges. Where first-line helpdesk support does not suffice and users would like to receive more in-depth training in the analytical workflows developed in COMPARE, ad hoc training can be given on a case-by-case basis by the lead investigators responsible for the development of the tools. This can involve practical on-site training of researchers from outside the COMPARE consortium or researcher exchanges between COMPARE partners. Partners in COMPARE can also leverage existing renowned courses that they have run or organized for many years, making optimal use of the existing infrastructures. The number and content of these practical courses and researcher exchanges will be based on the demand and needs of the users.

Deliverables (brief description and months of delivery)
D13.1 Stakeholder contact database (continuously updated) (month 3);
D13.2 Leaflet and templates for promotional material in COMPARE corporate style (month 3);
D13.3 COMPARE public website (month 3);
D13.4 e-learning materials (continuously updated) (month 12);
D13.5 COMPARE workshop programme (10 planned, continuously updated) (month 12);
D13.6 COMPARE acknowledged publications and presentations (month 60).

Work package number: 14
Starting date or starting event: Month 1
WP title: COST EFFECTIVENESS FRAMEWORK
Participant #: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
Short name of participant: DTU, EMC, SSI, FLI, ANSES, RKI, EMBL, ISS, RIVM, AHVLA, UEDIN, UK-Bonn, AMC, UA, Artemis, UCAM, TIHO, UCLM, FMER, AUTH, IFREMER, EUR, ANU, WIGNER, CIVIC, RT, UNIBO, DSMZ, WTSI

Person months per participant

60 56

Overall objective: To develop a standardised framework for estimating the cost-effectiveness of the COMPARE system and related methods and tools, including the value of safety.

Specific objectives:
1. To identify the important elements in calculating costs and benefits of COMPARE and related methods and tools (both regarding the system itself and from a societal perspective);
2. To identify and where necessary develop state-of-the-art costing methodologies for the different elements in the framework;
3. To develop and apply a methodology to value safety (provided through rapid identification of pathogens through COMPARE) in several countries;
4. Using 1-3, to estimate the cost-effectiveness of COMPARE and related methods and tools using case studies;
5. Based on the results, to assess options for refining selected elements of COMPARE in view of improving the overall cost-effectiveness of the system.

Description of Work
All activities will be done by partners 22 (EUR) and 25 (CIVIC).

Task 1: Identify the important elements in calculating costs and benefits of COMPARE and related methods


and tools. This stage, related to objective 1, will identify the important items to be included in the final framework for assessing the cost-effectiveness of COMPARE. First, this includes important elements of the COMPARE system itself, as a system that uses and facilitates NGS for the identification of pathogens. Second, it includes costs from a societal perspective: costs of treatments, vaccinations and policies resulting from ‘early warnings’, and societal costs such as productivity losses, losses related to food withdrawals and reduced international trade. We will conduct a literature review in this stage, also looking at previous cost-benefit and cost-effectiveness analyses of current pathogen identification systems and outbreak prevention, considering both human and animal health. Expert opinions will also be collected through participation in the EAPs of WP11. Specific sub-tasks include:
• 1.1: Identify the important elements of the COMPARE system, including:
  o Smart sampling (risk-based sample and data collection)
  o Sequencing (from samples to comparable data)
  o Making data actionable (from comparable data to actionable information)
  o Risk assessment and communication
• 1.2: Identify important elements related to the wider framework. These include costs of treatments / vaccinations / policies resulting from ‘early warnings’, and societal costs such as productivity losses, losses related to food withdrawals and reduced international trade.

Task 2: Identify and where necessary develop state-of-the-art costing methodologies for the different elements in the framework.
For some elements in the framework, clear measurement and valuation methodologies will be readily available, but this will not be the case for all elements. Based on the literature and economic theory, methods will be proposed or developed for the elements identified under Task 1. Here, we will draw, where possible, on existing guidelines for cost-effectiveness and cost-benefit analysis, and adapt these, where needed, for the current purpose. Where necessary, new methods and guidance will be developed. Specific sub-tasks include:
• 2.1: Costing methodologies of elements related to system components, and selected methods and tools;
• 2.2: Costing methodologies of elements related to the wider framework.

Task 3: Define the baseline (current system and standard methods).
To provide a clear benchmark against which the costs of COMPARE can be compared in order to assess cost-effectiveness, this task will determine the baseline costs and benefits, i.e. the costs and benefits of relevant elements of the current systems and methods.
• 3.1: Baseline related to system components, and selected methods and tools;
• 3.2: Baseline related to the wider framework.

Task 4: Develop and apply a methodology to value safety in several countries.
Health protection is an important goal of early warning systems. Moreover, such systems provide a sense of safety to populations. The latter element is very difficult to value, yet highly relevant from the perspective of policy makers as well as from a welfare economic perspective. The feeling of being unsafe, felt throughout countries in the face of a potential outbreak such as SARS, can be highly disruptive and unsettling. It is well known that the value of safety, or of avoiding losses, can be high (e.g. high expenditures for blood safety), but methods to determine these values are lacking. Here we will develop and apply a method to estimate this value of safety in different European countries. This will be done by developing an innovative Willingness to Pay experiment (e.g. Bobinac et al. 2011, 2014), which will be tested, refined and subsequently used in several countries to establish the value of safety associated with interventions like COMPARE.
• 4.1: Development of a methodology to value safety;
• 4.2: Implementation of a methodology to value safety in selected countries;
• 4.3: Analysing results and relating them to the cost-effectiveness of the COMPARE system.

Task 5: Estimate the cost-effectiveness of COMPARE and related methods and tools using case studies (scenario/pilot/retrospective studies).
• The case studies will be retrospective studies (e.g. in relation to past outbreaks such as SARS or H1N1 (swine flu)), scenario studies (e.g. covering specific outbreak scenarios in countries with less developed surveillance systems), and pilot studies relating to actual uses of the COMPARE system. In addition, case studies will focus on a given element of the approach, situation, pathogen, user situation/perspective, or region (developed or developing).
• 5.1: Selection of case studies and case study methodology;
• 5.2: Implementation of case studies;
• 5.3: Analysis of case study results and relating them to the cost-effectiveness of the COMPARE system.

Task 6: To assess options for refining selected elements of COMPARE in view of improving the overall cost-effectiveness of the system.
On the basis of the case studies’ results, various options will emerge for refining selected elements of COMPARE in view of improving the overall cost-effectiveness. Each of these will be assessed, and recommendations will be provided as to priority areas for improvement of cost-effectiveness.
• 6.1: Assessing options for refining selected elements of COMPARE in view of improving the overall cost-effectiveness of the system;
• 6.2: Providing guidance to implement options, where relevant.

Deliverables
D14.1 Framework for assessing cost-effectiveness of COMPARE in which key costs and benefits are highlighted. This will include a section on the system itself and on the wider framework (Month 18);
D14.2 State-of-the-art methodologies for the measurement and valuation of the elements specified in the framework. This will include a section on the system itself and on the wider framework (Month 30);
D14.3 A scientific paper describing the methodology and results of estimating the value of safety, with the results from several European countries (Month 45);
D14.4 Report on the (potential) cost-effectiveness of COMPARE, based on the case studies (scenario/pilot/retrospective studies). Each case study presented will include a section on elements related to the system and on the wider framework (Month 54);
D14.5 A report on the assessment of options for refining selected elements of COMPARE in view of improving the overall cost-effectiveness of the system, with recommendations (Month 60).
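As an illustration of how the baseline defined in Task 3 of WP14 and the case-study estimates of Task 5 could be brought together, the sketch below computes an incremental cost-effectiveness ratio (ICER), a standard summary measure in health economics. The proposal does not prescribe this measure: the `Scenario` class, the `icer` function and all figures are hypothetical and serve only to make the comparison against a baseline concrete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """Total cost and total effect of one surveillance option.

    Hypothetical placeholders: cost in euros, effect in a single
    outcome unit (for example, cases averted).
    """
    cost: float
    effect: float


def icer(baseline: Scenario, alternative: Scenario) -> float:
    """Incremental cost per extra unit of effect versus the baseline."""
    delta_cost = alternative.cost - baseline.cost
    delta_effect = alternative.effect - baseline.effect
    if delta_effect == 0:
        raise ValueError("Effects are equal; the ICER is undefined.")
    return delta_cost / delta_effect


# Invented numbers, for illustration only.
current_system = Scenario(cost=1_000_000.0, effect=200.0)
with_compare = Scenario(cost=1_300_000.0, effect=260.0)
ratio = icer(current_system, with_compare)
print(f"{ratio:.0f} euros per additional case averted")
```

Under these assumptions, a willingness-to-pay value of the kind to be estimated in Task 4 could serve as the threshold against which such a ratio is judged.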

Work package number: 15
Starting date or starting event: Month 1
WP title: MANAGEMENT
Participant #: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
Short name of participant: DTU, EMC, SSI, FLI, ANSES, RKI, EMBL, ISS, RIVM, AHVLA, UEDIN, UK-Bonn, AMC, UA, Artemis, UCAM, TIHO, UCLM, FMER, AUTH, IFREMER, EUR, ANU, WIGNER, CIVIC, RT, UNIBO, DSMZ, WTSI

Person months per participant

70 18

Overall objectives: The overall objective of WP15 is to implement the appropriate organizational structures and processes to ensure COMPARE’s compliance with the EC Grant Agreement and the COMPARE Consortium Agreement (CA). This includes the following specific objectives:
- To maintain the COMPARE CA;
- To implement the project management structure and decision-making processes as agreed in the DoW and CA (see section 3.2);
- To coordinate the financial-administrative processes at project level;
- To coordinate the ethics management at project level (see section 5);
- To coordinate the management of intellectual property at project level (see section 2.2).

Description of Work
The project structure and decision-making processes to be implemented in COMPARE and to be detailed in the COMPARE Consortium Agreement are described in section 3.2. COMPARE will install a project management structure that is typical for EU collaborative projects of this scale and scope, with a General Assembly (GA) as the highest decision-making unit, an Executive Board (EB) as the central coordinating unit of COMPARE, and Working Groups at operational WP level. A central project support office supports the COMPARE structure. Seven External Advisory Panels will be installed for the purpose of the User Consultations described in WP11. Finally, an independent external Ethics Advisory Board will be formed to advise the COMPARE GA on ethics management. The Consortium Management of COMPARE consists of the following groups of tasks, to be coordinated by DTU (F. Aarestrup) with the assistance of the other lead investigators and their supporting staff, and the above-mentioned management and advisory bodies.

Task 1: Installation of management bodies and formal decision-making, meeting and reporting procedures (DTU)
- Installing the GA, EB, EAPs, Working Groups and Ethics Advisory Board;
- Developing and distributing a project management manual stipulating the roles and responsibilities of the management bodies and the decision-making processes and mandates;
- Maintenance of the central COMPARE contact list (who’s who);


Task 2: Implementation of internal meeting and reporting procedures in conformity with the DoW and CA (DTU)
- Organization and administrative handling of periodic GA and EB meetings;
- Developing meeting agendas and minutes, and distributing background documentation for meetings;
- Developing and distributing COMPARE internal progress reporting templates;
- Periodic reporting of EB to GA, describing the progress of work in COMPARE towards its main objectives, any bottlenecks encountered and measures taken (or proposed);

Task 3: Periodic technical and financial reporting to the Commission (All partners)
- Maintenance of the electronic EU reporting system (SESAM and FORCE): contact per partner, uploading deliverable reports, publications and patents (management of knowledge) and periodic reports (Forms C and Core of the Project reports);
- Distribution of EC contribution to partners;
- Assisting partners in providing their partner level inputs to the technical and financial reporting to the Commission;

Task 4: Knowledge and contract management of COMPARE at project level (DTU, EMC)
- Maintenance of the COMPARE Consortium Agreement (e.g. exclusion of background) and administrative management of Material Transfer Agreements and Non-disclosure Agreements;
- Legal management of Intellectual Property related issues and development of exploitation strategy;
- Administrative handling of amendments to the EC H2020 grant agreement (DoW) or the CA;
- Management of ethics and gender aspects in COMPARE;

Deliverables (brief description and months of delivery)
D15.1 Consortium Agreement (month 0);
D15.2 Project Management manual (month 2);
D15.3 Internal reporting templates (month 2);
D15.4 Plan for the Dissemination and Exploitation of Results (month 6);

Table 3.1 b: List of work packages

WP No | Work Package Title | Lead Participant No | Lead Participant Short Name | Person-Months | Start Month | End Month
1 | Risk Assessment and Risk-based strategies for sample and data collection | 10 | AHVLA | 228 | 1 | 60
2 | Harmonised standards for sample processing and sequencing | 4 | FLI | 349 | 1 | 60
3 | From comparable data to actionable information: Analytical workflows for frontline diagnostics | 14 | UA | 165 | 1 | 60
4 | From comparable data to actionable information: Analytical workflows for foodborne pathogen surveillance, outbreak detection, and epidemiological analysis | 3 | SSI | 142 | 1 | 60
5 | From comparable data to actionable information: Additional tools for detection of and response to (re-) emerging infections | 2 | EMC | 128 | 1 | 60
6 | Underpinning research on frontline diagnostics using the COMPARE analytical framework | 14 | UA | 142 | 18 | 60
7 | Underpinning research on foodborne outbreaks using the COMPARE analytical framework | 3 | SSI | 204 | 18 | 60
8 | Underpinning research: novel approaches to (re-) emerging disease detection and outbreak response using the COMPARE analytical framework | 2 | EMC | 340 | 12 | 60
9 | COMPARE Data and information platform | 7 | EMBL | 356 | 1 | 60
10 | COMPARE Risk Communication tools | | RT | 60 | 1 | 60
11 | User-Stakeholder Consultations | 2 | EMC | 30 | 1 | 60
12 | Barriers to open data sharing | 9 | RIVM | 67 | 1 | 60
13 | Dissemination and Training | 1 | DTU | 80 | 1 | 60
14 | Cost Effectiveness Framework | | EUR | 116 | 1 | 60
15 | Management | 1 | DTU | 88 | 1 | 60
Total | | | | 2492 | |


Table 3.1 c: List of Deliverables

# | Deliverable name | WP | Short name of lead participant | Type | Diss. level | Delivery date
1.1 | Automated tools for rapid assessment of key transmission parameters | 1 | AHVLA | OTH | PU | 24
1.2 | Risk based surveillance plans, sampling algorithms and protocols | 1 | AHVLA | R | PU | 40
1.3 | Generic RA framework | 1 | AHVLA | OTH | PU | 46
2.1 | Matrix-dependent sample handling protocols | 2 | FLI | R | PU | 12
2.2 | Standard protocol sample processing | 2 | FLI | R | PU | 24
2.3 | Sequencing workflow | 2 | FLI | R | PU | 36
2.4 | Data analysis pipeline | 2 | FLI | OTH | PU | 24
2.5 | Testing results of molecular analytical workflow in ring trials | 2 | FLI | R | CO | 36, 48, 60
3.1 | Analytical workflow for clinical diagnostic application | 3 | UA | OTH | PU | 18
3.2 | Prediction algorithm AMR markers | 3 | UA | R | PU | 24
3.3 | Standardized protocols | 3 | UA | R | PU | 48
4.1 | Reference sequence database | 4 | SSI | | CO | 6
4.2 | Algorithm for detection of informative (sub)types | 4 | SSI | R | PU | 32
4.3 | Analytical workflow | 4 | SSI | OTH | PU | 48
4.4 | Validated RA model for NGS data | 4 | SSI | OTH | PU | 48
5.1 | Update of pathogen repositories | 5 | EMC | OTH | CO | 18
5.2 | Tools for rapid sequence based detection of strain specific clusters | 5 | EMC | OTH | PU | 30
5.3 | Phylogenetic and phylogeographic tools | 5 | EMC | OTH | PU | 30
5.4 | Tools for detecting SNPs | 5 | EMC | OTH | PU | 36
5.5 | Methods for prediction of pathogen phenotype from genotype | 5 | EMC | OTH | PU | 42
6.1 | Report on NGS based diagnostics in comparison to gold standard methods | 6 | UA | R | PU | 42
6.2 | Report on WGS and NGS based detection of AMR in stool samples in patients and travelers | 6 | UA | R | CO | 50
6.3 | Reports on WGS and NGS based hospital outbreak detection and monitoring | 6 | UA | R | CO | 60
7.1 | Database of reference genomes | 7 | SSI | OTH | CO | 16
7.2 | Report for cluster detection for all pilot organisms | 7 | SSI | R | CO | 36
7.3 | Database of markers for host-association and ecophysiology | 7 | SSI | | CO | 36
7.4 | Improved guidelines for interpretation criteria for defining clusters of disease | 7 | SSI | R | CO | 48
8.3 | Report on results of pilot study of metagenomic pathogen detection | 8 | EMC | R | CO | 60
8.1 | NGS sequencing data | 8 | EMC | R | CO | 36
8.2 | Identification of key drivers of evolution | 8 | EMC | OTH | CO | 56
9.1 | Hardware and cloud environment | 9 | EMBL | OTH | CO | 12
9.2 | Registry and data resource component | 9 | EMBL | OTH | CO | 18
9.3 | Generic workflow engine | 9 | EMBL | OTH | PU | 36
9.4 | Integrated analytical workflows | 9 | EMBL | OTH | PU | 48
9.5 | Full technical specifications | 9 | EMBL | R | PU | 60
10.1 | Stakeholder Analysis Report | 10 | DTU | R | CO | 12
10.2 | Targeted Message Report | 10 | DTU | R | CO | 24
10.3 | Initial version COMPARE RCT box V1 | 10 | DTU | OTH | PU | 36
10.4 | Beta version COMPARE RCT box V2 | 10 | DTU | OTH | PU | 48
11.1 | EAP report 1st cycle | 11 | EMC | R | CO | 3
11.2 | EAP report 2nd cycle | 11 | EMC | R | CO | 12
11.3 | EAP report 3rd cycle | 11 | EMC | R | CO | 24
11.4 | EAP report 4th cycle | 11 | EMC | R | CO | 36
11.5 | EAP report 5th cycle | 11 | EMC | R | CO | 48
12.1 | Report legal limitations, conditions and obligations | 12 | RIVM | R | CO | 24
12.2 | Ethical framework and charter | 12 | RIVM | R | CO | 48
12.3 | Report data-sharing guidelines | 12 | RIVM | R | CO | 56
13.1 | Contact database | 13 | DTU | DEC | CO | 3
13.2 | Leaflet and templates | 13 | DTU | DEC | PU | 3
13.3 | Public website | 13 | DTU | DEC | PU | 3
13.4 | E-learning materials | 13 | DTU | DEC | PU | 12
13.5 | Workshop programme | 13 | DTU | R | PU | 12
13.6 | Publications and presentations | 13 | DTU | R | PU | 60
14.1 | CE Framework | 14 | EUR | OTH | PU | 18
14.2 | Methodologies report | 14 | EUR | R | PU | 30
14.3 | Paper on methodology and results of estimating the value of safety | 14 | EUR | R | PU | 45
14.4 | Case study cost-effectiveness report | 14 | EUR | R | PU | 54
14.5 | Cost-effectiveness report | 14 | EUR | R | PU | 60
15.1 | Consortium Agreement | 15 | DTU | R | CO | 0
15.2 | Project Management manual | 15 | DTU | R | CO | 2
15.3 | Internal reporting templates | 15 | DTU | R | CO | 2
15.4 | Plan for the Dissemination and Exploitation of Results | 15 | DTU | R | CO | 6

Type: R = Document, report (excluding the periodic or final report); DEC = Websites, patent filings, market studies, press & media actions, videos; OTH = Other (software, technical diagram, etc.).

Dissemination level: PU = Public, fully open, e.g. web / CO = Confidential, restricted under conditions set out in Model Grant Agreement/ CI = Classified, information as referred to in Commission Decision 2001/844/EC.

3.2 MANAGEMENT STRUCTURE AND PROCEDURES

Project management structure
COMPARE will install a project organization structure as depicted in figure 6. The COMPARE project is complex and ambitious. It involves a large number of research groups in a large number of beneficiaries, aligned to a range of user-stakeholder organizations, active across various sectors and domains and covering a wide variety of expertise. To ensure that the strength of this complementary partnership is fully utilized, the dedication and commitment of all partners and users are essential. Therefore, we have chosen to adopt a management structure in which all partners have direct influence on, and take common responsibility for, all major scientific, financial and management decisions. In our experience, such a participatory organization model is preferable to more monolithic structures with centralized authority in very large and scientifically complex projects where optimal commitment is essential.

Figure 6: COMPARE Project Management Structure

The project structure is divided into six responsibility areas:
• The COMPARE General Assembly: the highest authority in COMPARE and the central forum for the strategic discussions in COMPARE, responsible for the overall performance of COMPARE in compliance with the EC Grant Agreement and its annexes and the COMPARE Consortium Agreement;
• The COMPARE Executive Board: the central executive level coordinating body of COMPARE, responsible for the implementation of the activities as planned and budgeted in the COMPARE Work packages;
• The COMPARE Working Groups: the teams responsible for the implementation of the respective work package tasks at operational level;
• The COMPARE Expert Advisory Panels: external expert advisors to the Working Groups, providing their expert opinions and feedback on the planned activities and obtained results in COMPARE;
• The COMPARE Ethics Advisory Board: external ethics advisors, providing their expert advice on the ethics management in COMPARE;
• The COMPARE Support staff: staff secretariat, providing organizational, secretarial, financial, legal and administrative support to the Executive Board and individual COMPARE partners.

The organizational structure and the accompanying progress monitoring and decision-making procedures will be formalised in the COMPARE Consortium Agreement. The composition, responsibilities and accompanying tasks of each of these COMPARE bodies are described in more detail below.

COMPARE General Assembly (GA)
The GA is the highest decision-making body in COMPARE. The GA consists of the principal investigators of the beneficiaries that have acceded to the COMPARE EC Grant Agreement. The Coordinator of COMPARE, Prof. Frank Aarestrup, chairs the GA. Prof. Marion Koopmans of P2 EMC is the Deputy Coordinator of COMPARE and as such the vice-Chair of the GA. The size and complexity of COMPARE warrant the appointment of a Deputy Coordinator (Marion Koopmans) who will assist the Coordinator in the overall scientific coordination of COMPARE. Both have been at the basis of building this proposal and consortium and are well-established leaders in their fields (see section 4.1 for more details). As the formal Coordinator, Frank Aarestrup will be responsible for the formal tasks of the Coordinator, including the coordination of the financial-administrative, legal and gender aspects, and will function as the formal and first point of contact between the Commission services and COMPARE. As Deputy Coordinator, Marion Koopmans will be responsible for the scientific coordination of COMPARE.

For the composition of the GA we refer to section 4.1, which lists the lead investigators per beneficiary. Each beneficiary will have one vote in the GA, regardless of the number of persons of that beneficiary in the GA. The GA will oversee the project’s progress and provide a forum for discussions on the strategic orientation and development of the project and for decision-making on issues at strategic project level.

As the highest hierarchical body, the GA has the sole authority to decide on issues that necessitate changes in the EC Grant Agreement (in consultation with the European Commission) and/or the COMPARE Consortium Agreement. This includes decisions on issues such as (but not limited to) changes in the Consortium (new partners, replacement of partners), changes in the overall objectives or approach of COMPARE, conflicts between partners, significant changes in partners’ budgets, or changes in the composition of the Management Bodies.

COMPARE Executive Board (EB) and working groups
The scientific coordination of the activities in COMPARE is delegated by the GA to the COMPARE Executive Board. As such, the EB is the central scientific coordinating unit in COMPARE, consisting of the leaders and co-leaders of the COMPARE Working Groups and the Chair and Co-Chair of the General Assembly as listed in the table below. In the selection of EB members (Working Group leaders and co-leaders), COMPARE has strived for a composition of the EB that reflects the full breadth of domains, sectors, disciplines and pathogen backgrounds brought together in the consortium. In addition, also in light of future uptake of COMPARE across Europe, COMPARE has strived for a balanced EB in terms of gender and nationalities.

Working Group | Work package(s) | Working Group Leader | WP Co-leader(s)
Risk Assessment and risk-based strategies for sample and data collection | 1 | Emma Snary | Christian Gortazar
Harmonised standards for sample processing and sequencing | 2 | Martin Beer | Simone Caccio, Paul Kellam
Sequence based Frontline diagnostics | 3 and 6 | Surbhi Malhotra | Menno de Jong, Anne Pohlmann
Analytical tools for detection and analysis of foodborne outbreaks | 4 and 7 | Eva Moller Nielsen | Tine Hald, Anne Brissabois
Analytical tools for detection and analysis of re-emerging outbreaks | 5 and 8 | Ron Fouchier | Mark Woolhouse
System Architecture and Software | 9 | Guy Cochrane | Ole Lund, Istvan Csabai
Risk Communication tools | 10 | Emilio Mordini | -
Barriers | 12 | George Haringhuizen | Jørgen Schlundt
Cost Effectiveness Framework | 14 | Werner Brouwer | Nicholas McSpedden-Brown
User Consultations, Dissemination and Training, Management | 11, 13, 15 | Frank Aarestrup | Marion Koopmans

The EB is responsible for monitoring the overall scientific quality of the project’s deliverables and the alignment of the activities across the 10 Working Groups. It has the authority to take appropriate measures to ensure that the activities across the Working Groups are synergistic, such as reprioritization of activities and associated reallocation of resources and budgets. The EB reports to the GA, and the EB members are responsible for ensuring the implementation of EB decisions in their respective Working Groups. Any EB decision that could necessitate amendments to the EC Grant Agreement and/or Consortium Agreement should be proposed to the GA for discussion and final decision (or rejection).

COMPARE Working Groups
At operational level, the activities in COMPARE are structured into the above-mentioned Working Groups, in line with the organization of the activities in the Work packages as described in section 3.1. Each Working Group is responsible for the implementation of EB decisions and the activities in the relevant Work Package(s), according to the timelines and budgets agreed in the EC Grant Agreement and its annexes. As such, the Working Group is the appropriate decision making level for issues that are at WP level (and that do not affect other WPs). Any decisions by Working Groups that (potentially) affect other WPs should be proposed to the Executive Board for discussion and final decision. Each Working Group consists of the lead investigators that are directly involved in the supervision and/or execution of the activities in the respective WPs. The Working Group leader represents the Working Group ‘qualitate qua’ on the Executive Board (see above). The leaders of larger Working Groups, in which several lead investigators and their teams from different domains and sectors work together, will be assisted by co-leaders, who also ‘qualitate qua’ take a seat on the Executive Board. This will ensure a balanced representation of the different domains, sectors and disciplines in the COMPARE Executive Board as described above.

COMPARE Expert Advisory Panels (EAPs)
The Expert Advisory Panels play a crucial role in COMPARE as they function as the main link between COMPARE and its future users. In the User Consultations of WP11, the EAPs will provide the Working Groups with their expert feedback as future users and stakeholders on the objectives, approaches, results and activities of the respective Work Packages. Given the diversity of the users, as described in previous sections of this proposal, we have organized the User Consultations into 7 EAPs. As with any organization into expert panels, the chosen distinction into the seven EAPs listed below is subject to debate, as most experts do not fall exclusively into one of these EAPs. However, grouping the experts into panels allows for more in-depth discussions amongst the experts and their counterpart Working Groups in COMPARE on common questions, issues, challenges and concerns to be addressed with support of the COMPARE system. This has resulted in seven Expert Advisory Panels as listed in the tables below, cross-cutting the Working Groups of COMPARE. The methods and approach of the User Consultation process are described in WP11.

Figure 7: the seven COMPARE Expert Advisory panels for User-stakeholder consultations in WP11


Expert Advisory Panel Public Health
Expert(s) | Organisation
Marc Struelens* | European Center for Disease Control (ECDC)
Danilo Lo Fo Wong* | World Health Organisation (WHO)
Tom Chiller*, Peter Gerner Schmidt* | Center for Disease Control (CDC)
tbd | Directorate Health and Consumers (DG SANCO)
David Heymann* | Centre on Global Health Security, Chatham House, UK
Pathom Sawanpanyalert** | Ministry of Public Health, Thailand
Dr. Xu Jianguo* | Director, Chinese Center for Disease Control and Prevention
Dr Changwen Ke* | Director of Virology, CDC Guangdong, China

Expert Advisory Panel Food Safety
Expert(s) | Organisation
Ernesto Liebiani* | European Food Safety Authority (EFSA)
Henk Jan Ormel*, Masami T. Takeuchi** | Food and Agricultural Organisation (FAO)
Marc Allard*, Eric Brown* | Food and Drug Administration (FDA)
Alisdair Wotherspoon** | Food Standards Agency
tbd | European Centre for Expertise & Research on Microbial Agents (CEERAM)
Lourens Heres* | VION food, Netherlands
Marthi Balkumar* | Unilever
Leen Baert*, John Donaghy* | Nestlé

Expert Advisory Panel Clinical Health
Expert(s) | Organisation
Dr. Elisabeth Bell (tbc) | European Respiratory Society (ERS)
Gunnar Kahlmeter** | European Society for Clinical Microbiology and Infectious Diseases (ESCMID)
Jean-Daniel Chiche (tbc) | European Society for Intensive Care Medicine (ESICM)
Alex van Belkum* | BioMérieux
Gibson Kibiki* | Kilimanjaro Clinical Research Institute, Tanzania

Expert Advisory Panel Domestic Animals
Expert(s) | Organisation
Vincenzo Caporale** | World Organisation for Animal Health (OIE)
Dr. Jan Vaarten (tbc) | Federation of Veterinarians in Europe (FVE)
Dr. Michael Day (tbc) | World Small Animal Veterinary Association (WSAVA), Canada
Dr. Clark Fobian (tbc) | American Veterinary Medical Association (AVMA)
Gwenelle Dauphin / Keith Hamilton | FAO
Herman Unger | IAEA

Expert Advisory Panel Wildlife
Expert(s) | Organisation
Dr William Karesh (tbc) | World Organisation for Animal Health (OIE)
Dr Erik Agren, co-chair EWDA board (tbc) | European Wildlife Disease Association (EWDA)
Dr Willem Schaftenaar, Dr Sjaak Kaandorp (tbc) | European Association of Zoo and Wildlife veterinarians (EAZWV)
Peter Daszak, President (tbc) | EcoHealthAlliance

Expert Advisory Panel Databases
Expert(s) | Organisation
Jim Ostell*, Eric Brown*, David Lipman* | NCBI
tbd | DNA DataBank of Japan (DDBJ)
George Weinstock* | George Washington University, St. Louis
Susan Knowles* | Illumina

Expert Advisory Panel PEARL Barriers
Expert(s) | Organisation
Prof. Tom Dedeurwaerdere* | University of Leuven, Belgium
Martin Buijsen*, André den Exter* | Erasmus Univ. R’dam (iBMG)
Gunnar Kahlmeter** | ECCMID
Danilo Lo Fo Wong* | WHO Europe
Paul Kellam* (as partner) | Sanger Institute
tbd | World Trade Organization (WTO)
David Heymann* | London School of Hygiene and Tropical Medicine

* = confirmed; ** = to be confirmed; tbc = to be contacted; tbd = to be determined

COMPARE Ethics Advisory Board
COMPARE will install an external Ethics Advisory Board that will be responsible for providing the COMPARE GA with solicited and unsolicited expert advice on the ethics management in COMPARE. This Ethics Board does not replace the tasks of local Ethics Committees and Review Boards at partner level, but is a consortium level external advisory body. The Ethics Board will include experts on the ethical issues related to research involving humans, data protection and privacy, research involving animals, and research with non-EU countries (benefit sharing). For a detailed overview of the ethical issues relevant to COMPARE we refer to section 5. The Ethics Board will have access to all relevant internal documentation (e.g. copies of ethics approvals) necessary to form their opinion on the ethics management in COMPARE.


COMPARE Support Office
The Support Office will provide financial-administrative, secretarial, operational and organizational support to the EB members and Working Group leaders. Activities of the Support Office will include organization and follow-up of the formal Consortium meetings (see below), development and distribution of a project management manual functioning as a project quick guide for all participants (e.g. contact lists, meeting dates, important milestones, project management structures and decision making, reporting formats, references to H2020 guidance documents), and the set-up and maintenance of the Consortium’s electronic documentation dossier. The Support Office will assist the members of the Consortium in the formal reporting to the Commission, including the maintenance of the project documentation in the Commission’s Participant Portal and the development and submission of the technical and financial reports in compliance with articles 19 and 20 of the EC Grant Agreement. The Support Office will be adequately staffed with an experienced project manager and secretarial staff, and will have direct access to the central legal and financial offices of the Coordinator DTU.

Decision making procedures, Meetings and Reporting
The COMPARE GA will adopt internal decision making rules and procedures, supported by consortium meeting and reporting cycles, formalized in the COMPARE Consortium Agreement. The overall principle, to be further detailed in the Consortium Agreement, is that each Management Body (GA, EB and Working Group) will take decisions on the basis of consensus. If a Working Group cannot reach consensus, the decision is forwarded to the EB for decision-making. If the GA or EB cannot reach consensus, the decision is made on the basis of a two-thirds majority of votes. In the GA each Beneficiary has one vote (regardless of the number of principal investigators). In the EB, each member of the EB has one vote.
Members have the right to veto a decision in case their legitimate interests would be severely affected by the decision. In case the GA has to decide on issues involving a conflict between two or more beneficiaries or the underperformance of a beneficiary, and a solution to the conflict requires a vote, the beneficiary(ies) concerned are not allowed to vote. The COMPARE Consortium Agreement will include a set of agreed articles on the decision making procedures (e.g., voting by proxy, quorum) to ensure that all beneficiaries are clear about their rights and obligations concerning the formal decision making process in COMPARE.

Internal meeting and reporting cycle
COMPARE is a highly complex project, involving many research groups, working in many different Working Groups and having different expertise and associated interests. Ensuring coherence and consistency across the activities of each research group will necessitate frequent communications between research groups to ‘calibrate’ their activities. At Working Group level, this is the responsibility of the Working Group leader and co-leaders, through organizing frequent bilateral and multilateral teleconferences and/or Skype meetings. This way, quick and efficient exchange of opinions is established in support of smooth Working Group level decision-making. At project level, COMPARE will install a formal meeting and reporting cycle that supports monitoring of the overall scientific quality of the project’s results, the alignment of the activities across the various Working Groups, and the annual technical and financial reporting of COMPARE to the Commission. The COMPARE Consortium Agreement will include articles on the agreed procedures involved (e.g., periodicity, timing of invitations, agenda setting, acceptance of minutes, representation at formal face-to-face meetings). Subject to this further detailing, the COMPARE GA will adopt a formal meeting and reporting cycle as listed below.

Hierarchical level | Meeting frequency | Type | Reports
General Assembly | Annual | Face-to-face | Technical and Financial reports to Commission in compliance with articles 19 and 20 of H2020 Grant Agreement.
Executive Board | Quarterly | Face-to-face and teleconference (iterative) | Quarterly technical project-level progress update to General Assembly.
Working Group | Frequent | At discretion of Working Group leader | Quarterly technical Working-Group level progress update to Executive Board (status of objectives, deliverables and milestones; see table 3.2a below). Upload of Deliverable report(s) on EC participant portal concerning deliverables resulting from the Work package.
Beneficiary local team(s) | At discretion of principal investigators | At discretion of principal investigators | Quarterly status update of beneficiary-level budget utilization to Project Coordinator. Annual individual financial statement (Annex 4 of H2020 MGA) to Project Coordinator (via EC participant portal).
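The decision rules described under "Decision making procedures, Meetings and Reporting" (consensus first, escalation from Working Group to EB, two-thirds majority in the GA or EB, veto, and exclusion of conflicted beneficiaries) can be illustrated with a minimal Python sketch. It is an illustration only, not part of the Consortium Agreement: the names Ballot and decide are hypothetical, abstentions, proxies and quorum are not modelled, "two-thirds majority" is assumed to mean at least two thirds of the eligible votes cast, and a veto is assumed to block the decision outright.

```python
from dataclasses import dataclass, field
from fractions import Fraction


@dataclass
class Ballot:
    """One vote per GA beneficiary or EB member (True = in favour)."""
    votes: dict[str, bool]
    vetoes: set[str] = field(default_factory=set)    # voters invoking a veto
    excluded: set[str] = field(default_factory=set)  # parties to a conflict (GA only)


def decide(body: str, ballot: Ballot) -> str:
    """Apply the described rules for body in {"WG", "EB", "GA"} (illustrative)."""
    # Beneficiaries concerned by a conflict are not allowed to vote.
    eligible = {v: yes for v, yes in ballot.votes.items() if v not in ballot.excluded}
    if not eligible:
        return "no eligible votes"
    # Assumption: a veto by an eligible voter blocks the decision.
    if ballot.vetoes & set(eligible):
        return "vetoed"
    # Every management body first tries to decide by consensus.
    if all(eligible.values()):
        return "adopted by consensus"
    # A Working Group without consensus forwards the decision to the EB.
    if body == "WG":
        return "forwarded to EB"
    # GA and EB fall back to a two-thirds majority of votes.
    in_favour = sum(eligible.values())
    if Fraction(in_favour, len(eligible)) >= Fraction(2, 3):
        return "adopted by two-thirds majority"
    return "rejected"
```

For example, a GA ballot of {"DTU": True, "EMC": True, "SSI": False} has no consensus but two of three votes in favour, so under these assumptions it is adopted by two-thirds majority; the same split in a Working Group would instead be forwarded to the EB.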

For the internal reports the Project Support Office will develop and distribute reporting templates as appropriate to ensure compliance with the internal reporting procedures as detailed and agreed in the COMPARE Consortium Agreement. The internal (and external) monitoring and reporting of the progress of work of the Working Groups will be based on the main project milestones listed in table 3.2a below.

Milestone number | Milestone name | Related work package(s) | Estimated project month | Means of verification
M1 | When the CA has been signed by all partners | WP15 | 0 | CA
M2 | GA and EB formally installed | WP15 | 1 | Kick off minutes
M3 | When all EAPs have been formally installed | WP11, 12 | 3 | First EAP meetings
M4 | When first EAP report is finished | WP11 | 3 | EAP report
M5 | When the COMPARE public website is live | WP13 | 3 | Internet
M6 | Common protocol drafted with target pathogens and transmission modes for WP1 and WP5 model development | WP1, WP5 | 12 | Protocol available
M7 | Inventory of existing sampling and storage protocols completed | WP1 | 12 | Inventory shared amongst partners
M8 | Hardware cloud environment ready | WP9 | 12 | Access to cloud
M9 | When the first COMPARE workshop has been held | WP13 | 12 | Workshop attendees
M10 | Performance of available sequencing platforms and protocols for COMPARE study questions reviewed | WP2, WP3 | 18 | WP report to EB
M11 | Cost Effectiveness Framework ready | WP14 | 18 | WP14 report
M12 | COMPARE registry and data resource component ready | WP9 | 18 | Access to data resource
M13 | Joint protocol for WP8 task 1 studies | WP8 | 24 | Protocol available
M14 | First version COMPARE RCT live | WP10 | 24 | Access via internet
M15 | Set of core specimens prepared to be used as controls across consortium for assay comparison | WP2, WP6, WP7 | 30 | WP reports, specimens
M16 | Inventory of available cluster detection algorithms finalized | WP4 | 30 | WP4 report
M17 | Review completed of known genetic markers for virulence and resistance for target pathogens in WP3 and 5 | WP3, WP5 | 30 | WP reports, publications
M18 | Methodologies report ready | WP14 | 30 | WP14 report
M19 | Validation protocol for NGS based clinical and public health diagnostics | WP6, WP7 | 36 | Protocol available
M20 | List of samples and metadata for WP7 studies finalized, based on task 1 criteria | WP7 | 36 | WP7 report
M21 | Protocol for wildlife surveillance and sampling submitted to pilot sites | WP8 | 36 | Protocol available
M22 | Review of pathogen traits associated with enhanced transmissibility completed | WP8 | 36 | Publication
M23 | Beta version COMPARE RCT live | WP10 | 36 | Beta version accessible via internet
M24 | Safety data collected and analysed | WP14 | 40 | WP14 report
M25 | Workflows from WPs 2, 3, 4 and 5 integrated into COMPARE platform | WP9 | 48 | COMPARE platform
M26 | CE study COMPARE reported | WP14 | 54 | Draft report WP14

Innovation Management
Innovation management in H2020 is defined as “a process, which requires an understanding of both market and technical problems, with a goal of successfully implementing appropriate creative ideas. A new or improved product, service or process is its typical output. It also allows a consortium to respond to an external or internal opportunity”. Implementing an effective innovation management process is highly relevant to COMPARE, as the development and broad user uptake of the COMPARE system is pivotal for realizing its long-term objectives. The user consultation process described in WP11, involving frequent interactions with a broad group of external experts representative of the many users and stakeholders of COMPARE, is the primary tool to achieve this. It ensures that the novel technologies (tools, processes, software, methods, knowledge) developed in COMPARE are developed with an up-to-date and direct view of their practical applicability and of the demands and concerns of their relevant (future) users. Innovations are for the most part achieved at the intersections of different sectors, domains and associated technologies. Through the User Consultations, the COMPARE partners will be able to interact directly with other stakeholders, ensuring that the deliverables achieved in COMPARE meet client demands and are responsive to client needs and concerns.

Critical risks related to project implementation
COMPARE has identified risks that could threaten or delay the achievement of its objectives. The critical risks identified (‘critical’ meaning risks having a high potential impact on COMPARE) and the proposed risk-mitigation measures are summarized in table 3.2b below. Together with the milestone table and the partner budgets, they form the focus of attention in monitoring the overall progress of work at Working Group and project level, supported by the internal meeting and progress-reporting cycle as described above.
Table 3.2b: Critical risks for implementation

Description of risk: Severe infectious disease outbreak necessitating urgent reallocation of (some or most of) the partners’ resources to mounting a research and public health response.
Work packages involved: All
Proposed risk-mitigation: In the event of a severe ID outbreak at the earlier stages of COMPARE, relevant planned underpinning research in COMPARE could be adapted to allow for incorporating the outbreak response in COMPARE. Such a flexibility clause is also included in comparable emerging infectious disease projects such as EMPERIE, ANTIGONE and PREPARE.

Description of risk: There are insufficient resources available for one or more tasks to be completed.
Work packages involved: All
Proposed risk-mitigation: Needed re-allocation of resources will be handled as described under management. Additional funding will be sought.

Description of risk: External Experts not accepting the COMPARE invitation to serve on an EAP.
Work packages involved: WP11 User-Stakeholder Consultation
Proposed risk-mitigation: In addition to the listed experts, a larger candidate list, drawing from the large professional scientific networks of the principal investigators, will be developed to ensure adequately sized and populated EAPs. In the unlikely event that an EAP has too few experts to warrant a separate EAP, the EAP concerned can be merged with another EAP that is complementary (e.g. combining Food with Public Health or Wildlife with Domestic Animals).


3.3 CONSORTIUM AS A WHOLE

The COMPARE consortium harbors 29 partners from 10 different EU Member States and 1 associated country. The logic behind the composition of the COMPARE consortium is that it reflects the heterogeneous group of future users and stakeholders of COMPARE. In addition to the user-stakeholder consultations in WP11, this is the most powerful and effective way to ensure that the results obtained in COMPARE will be accepted and adopted by its intended users. COMPARE brings together public and animal health organizations, food safety institutions and research groups from different sectors, domains, disciplines and backgrounds into a unique consortium of unprecedented scope in Europe, as depicted in figure 8 below.

Multidisciplinary expertise in human health, animal health and food safety; Focused on objectives A, B and C and work packages 1-8

The largest cluster, depicted on the top right-hand side of figure 8, consists of national public health and food safety research institutes, hospitals and academic research groups that together will be responsible for the development of the analytical workflows of WPs 1-8. This cluster harbors scientists from different disciplinary backgrounds, covering the domains, sectors and pathogens that need to work together in developing COMPARE. Bringing these different fields of expertise together in one consortium holds the promise of COMPARE harvesting the added value of true interdisciplinary collaboration across these domains, sectors and disciplines. We firmly believe that this interdisciplinary approach is paramount for timely detection of, and effective responses to, infectious disease outbreaks.

Fragmenting our efforts along the traditional boundaries between human health, animal health and food safety, or maintaining the divide between public health research, clinical research and basic research, or even between bacteria, viruses and parasites, does not do justice to the complexity of this task. Therefore, in our view, any consortium that addresses topic PHC-7 should reflect this complexity in its composition, or else face the risk of addressing only part of the problem and not taking into account the needs of a large proportion of its envisaged data providers and users. This complementarity is summarized in the following table, which highlights where each of the COMPARE partners fits in the above-mentioned areas of expertise. For a more detailed description of the expertise of the consortium partners we refer to section 4.1.

Table 3.2b: Critical risks for implementation (continued)

Description of risk: For unforeseen reasons a partner has to leave COMPARE.
Work packages involved: All
Proposed risk-mitigation: In this event the COMPARE executive board, in collaboration with the general assembly, will seek a replacement within the COMPARE consortium and, if this is not feasible, an external partner.

Description of risk: Too many External Experts do not foresee general acceptance of the model where users submit data to a central database.
Work packages involved: WP9, WP11
Proposed risk-mitigation: This is considered highly unlikely, since it would contradict more than 20 years of experience with the global exchange of biological data, run against EU strategies for increased exploration of big data, be highly costly because of the need for several local supercomputer and storage facilities, and directly contradict the intentions of the current call, in which rapid exchange and comparison of data is essential. However, should this occur, COMPARE will enable the entire core workflow to be installed as local pipelines, with only the development phase remaining central.

Description of risk: Sample and/or data collection delayed or hampered.
Work packages involved: WPs 1-8
Proposed risk-mitigation: Partners in these WPs have ample experience with managing sampling and data collection activities as proposed. They have several alternative options for accessing the needed samples and data for the proposed work.

Description of risk: Heterogeneity in scientific backgrounds and disciplines in COMPARE hampering quick and smooth collaborations and exchanges of ideas.
Work packages involved: All WPs
Proposed risk-mitigation: A strong management structure, in which each managerial layer (EB, WP) has teams of different backgrounds and expertise to promote direct interactions at all levels, mitigates this risk.


Partner | Main disciplines | Sector (Human health, Animal health, Food safety) and pathogen (Viruses, Bacteria, Protozoa) | Working Group(s)
1 DTU | Veterinary & food bacteriology, Bioinformatics, Meta-genomics, Foodborne pathogens, Risk Assessment, Risk modeling, Foodborne virus | X X X X X | 1, 2, 4, 9, 11, 13, 15
2 EMC | Public Health Epidemiology, Molecular virology, Comparative pathology, bioinformatics | X X X | 1-9, 11, 12, 13, 15
3 SSI | Public health epidemiology, Foodborne pathogens | X X X | 1, 4, 7, 8, 11, 13
4 FLI | Veterinary Virology/Zoonoses, Epidemiology, Databases, Data analysis and curation, Next generation sequencing | X X X X | 1, 2, 3, 5, 8, 11, 13
5 ANSES | Molecular bacteriology, Foodborne pathogens, Epidemiology | X X X X X | 2, 3, 4, 7, 11, 13
6 RKI | Molecular diagnostics, epidemiology, biochemistry | X X | 1-7, 11, 13
8 ISS | Molecular biology and epidemiology, Parasitology | X X X | 1, 2, 4, 5, 7, 8, 11, 13
9 RIVM | Veterinary virology, epidemiology | X X X | 4, 5, 8, 12
10 AHVLA | Veterinary bacteriology and virology, bioinformatics, epidemiology, risk assessment modeling | X X X | 1-5, 7, 8, 9, 11, 13
11 UEDIN | Epidemiology, pathogen evolution | X X X X | 1, 5, 8, 11, 13
12 UK Bonn | Virology, molecular diagnostics, virus evolution | X X | 1, 2, 5, 8, 11, 13
13 AMC | Microbiology, bacteriology, clinical diagnostics | X X X | 2, 3, 5, 6, 8, 11, 13
14 UA | Microbiology, bacteriology, clinical diagnostics | X X | 3, 6, 11, 13
15 Artemis | Virology, wildlife health, virus discovery | X X | 1, 2, 8, 11, 13
16 UCAM | Mathematical modeling, pathogen evolution | X X X X | 5, 7, 8, 11, 13
17 TIHO | Veterinary pathology | X | 1, 2, 3, 8, 11, 13
18 IREC | Wildlife health | X X | 1, 8, 11, 13
19 FMER | Microbiology, pathogen discovery | X | 3, 5, 6, 8, 11, 13
20 AUTH | Microbiology, molecular epidemiology, veterinary virology | X X X | 1, 2, 6, 7, 8, 11, 13
21 IFREMER | Molecular bacteriology | X X | 1, 2, 4, 7, 8, 11, 13
23 ANU | Epidemiology | X X X | 2, 3, 4, 7, 11
27 UNIBO | Veterinary & food bacteriology | X X X | 1, 2, 3, 5, 7, 8, 11, 13
28 DSMZ | Bacterial population genomics | X X | 2, 3, 6, 11, 13
29 Sanger | Genomics, molecular microbiology | X X X X | 2, 3, 11, 13

Expertise in System Architecture & Software, bioinformatics and computing; Focused on objective D and work package 9

The core of COMPARE is formed by the lead investigators that have been at the forefront of advocating and practicing the establishment of a common interactive microbiological genomic platform for pathogen identification and outbreak analysis and mitigation. Driven by the advent of whole genome sequencing and supporting information technology resources, lead investigators of partner 1 DTU, partner 2 EMC and partner 7 EMBL were at the basis of the GMI (http://www.globalmicrobialidentifier.org/) initiative in 2010. Building the core data and information sharing platform of COMPARE requires state-of-the-art expertise, resources and experience in designing, programming, testing and maintaining IT architecture and software for capturing, storing, curating and analyzing datasets in huge volumes. With the inclusion of EMBL-EBI in COMPARE we are assured of Europe’s leading experts and resources in this area and of a direct and already established link to the EU ESFRI infrastructure and global DNA repositories. EMBL is supported by DTU, Wigner, RIVM and WTSI in the development of the WP9 data and information platform.

Risk Communication expertise; Focused on objective E and work package 10

The information generated by COMPARE will be used to communicate risks and threats to public health to various stakeholders, including the public. In the future, COMPARE will likely identify more problems, more rapidly, and with a need for more urgent action and public involvement than in the current situation.
This communication should be managed carefully and in a coordinated fashion, supported by appropriate tools for the public health actors primarily responsible for this communication. The design and building of these tools requires specialised expertise, which is provided by partner 26 RT, who can build on its achievements in the EU TELL-ME project.

Public Health and Animal Health Economics and cost effectiveness: Focused on objective G and WP14

Assessing the cost effectiveness of the tools and procedures developed in COMPARE demands the inclusion of experts in the field of public health and animal health economics. Cost-benefit analysis of where and when to use which detection and characterization technology or sampling scheme needs to be addressed, as well as the economic implications of performing public interventions, also based on the epidemiological predictions of the size and magnitude of outbreaks. This is covered by partners 22 EUR and 25 CIVIC, who together will be responsible for achieving objective G in WP14. Combined, the COMPARE consortium is ideally suited to realize its project objectives.

Industrial and commercial involvement

In the COMPARE consortium, three Small and Medium Sized Enterprises (SMEs) with specific expertise in areas of high relevance to COMPARE take part as full consortium members (beneficiaries). These SMEs are P15 Artemis, P25 CIVIC and P26 RT. Artemis will be involved in designing and testing the analytical tools in COMPARE from the perspective of wildlife as one of the important sources of emerging infectious disease outbreaks. CIVIC and RT bring in their experience and expertise in cost-effectiveness studies and in risk communication tools and strategies, respectively.

The second way in which the private sector is involved in COMPARE is through the involvement of clinical, food producing and processing actors in the WP11 User Consultations. Through their membership in the Expert Advisory Panel on Food safety, they will have direct influence on the design of COMPARE. Their involvement in the design of COMPARE is crucial to optimally position COMPARE for future take-up of the system by the food and feed producing and processing organizations from the private sector.

We see the development of standards for genomic and metadata information exchange and the development of proof-of-concept software as a pre-competitive step, and therefore do not assume that companies want to commit large resources to this. In the first half of the project we will therefore involve companies mainly as stakeholders, but companies that want to follow the work more closely are welcome to join the efforts. We plan to release all the developed software under an Apache 2.0 license, a permissive open source license often preferred by companies since it allows them to adapt the code to their own use without being obliged to publish the revised version of the code. In the second half of the project we expect to reach companies as well as academic groups through outreach projects in which we seek to educate about the standards we have developed and how to implement them in software.

Figure 8: the COMPARE consortium


COMPARE has made the fundamental choice not to involve the private sector in the development of its core system architecture and software, covered by partners EMBL, DTU, RIVM, WIGNER and AHVLA. This fundamental choice stems from two main considerations:

- The data and information-sharing platform central to COMPARE cannot afford to be dependent for its long-term sustainability on private sector companies that are subject to the volatile ICT market. Crucial components of the system’s software and/or hardware should be in the hands of the public sector, with adequate safeguards against private sector events such as mergers, (hostile) takeovers or bankruptcy that can compromise the independent nature of the system;

- The long-term sustainability of the system depends highly on its users (from both the public sector and the private sector) trusting their data to the system. Having core elements of the system maintained by private sector companies can be detrimental to this trust, in light of the abovementioned (perceived) risks.

Other countries

All the partners in the COMPARE consortium are based in EU member states, with the exception of partner 23 ANU from Australia. The involvement of partner 23 springs from the wish to directly include the expertise of Prof. Martyn Kirk, who is employed by P23 and funded nationally; for details we refer to section 4.1.

3.4 RESOURCES TO BE COMMITTED

Table 3.4a Summary of staff effort shows the estimated person months per partner/WP. WP Leaders are indicated in bold. WP co-leaders are indicated in italics.

Partner WP1 WP2 WP3 WP4 WP5 WP6 WP7 WP8 WP9 WP10 WP11 WP12 WP13 WP14 WP15 Total
1 DTU 21 30 0 18 0 9 15 9 120 0 3 5 30 0 70 330
2 EMC 27 26 6 7 22 36 14 48 20 0 3 2 2 0 18 231
3 SSI 1 0 0 38 0 0 41 18 0 0 1 0 3 0 0 101
4 FLI 14 54 18 0 18 0 0 18 12 0 1 2 6 0 0 143
5 ANSES 0 15 15 12 0 0 12 0 0 0 1 0 1 0 0 56
6 RKI 5 17 14 19 9 5 21 0 0 0 1 0 1 0 0 92
7 EMBL 0 0 0 0 0 0 0 0 131 0 1 0 1 0 0 133
8 ISS 2 5 0 3 1 0 10 4 0 0 1 0 1 0 0 27
9 RIVM 0 0 0 27 0 0 27 0 0 0 1 5 1 0 0 61
10 AHVLA 57 15 2 12 6 0 12 8 1 0 1 0 1 0 0 115
11 UEDIN 31 0 0 0 14 0 0 30 0 0 1 0 1 0 0 77
12 UK-Bonn 8 10 0 0 2 0 0 22 0 0 1 0 1 0 0 44
13 AMC 0 4 20 0 6 20 0 10 0 0 1 0 1 0 0 62
14 UA 0 0 17 0 0 19 0 0 0 0 1 0 1 0 0 38
15 Artemis 8 8 0 0 0 0 0 20 0 0 1 0 1 0 0 38
16 UCAM 0 0 0 0 36 0 0 29 0 0 1 0 1 0 0 67
17 TIHO 17 17 6 0 0 0 0 20 0 0 1 0 1 0 0 62
18 UCLM 30 0 0 0 0 0 0 62 0 0 1 0 1 0 0 94
19 FMER 0 0 9 0 8 27 0 25 0 0 1 0 1 0 0 71
20 AUTH 2 19 0 0 0 14 11 8 0 0 1 0 1 0 0 56
21 IFREMER 1 7 0 5 0 0 23 8 0 0 1 0 1 0 0 46
22 EUR 0 0 0 0 0 0 0 0 0 0 0 48 1 60 0 109
23 ANU 0 1 1 1 0 0 1 0 0 0 1 0 0 0 0 5
24 WIGNER 0 0 0 0 0 0 0 0 60 0 0 0 1 0 0 61
25 CIVIC 0 0 0 0 0 0 0 0 0 0 0 0 1 56 0 57
26 RT 0 0 0 0 0 0 0 0 0 60 1 5 5 0 0 71
27 UNIBO 4 26 9 0 5 0 17 1 0 0 1 0 1 0 0 62
28 DSMZ 0 24 12 0 0 12 0 0 0 0 1 0 1 0 0 50
29 WTSI 0 72 36 0 0 0 0 0 12 0 1 0 12 0 0 133
Totals 228 349 165 142 128 142 204 340 356 60 30 67 80 116 88 2492


Table 3.4b: ‘Other direct cost’ items

The table below justifies the other direct cost items for those partners whose other direct costs exceed 15% of their direct personnel costs. The majority of these costs are related to the sequencing activities in WPs 1-8, which give rise to other direct costs on top of the regular laboratory and research consumables.

1 DTU | Cost (€) | Justification
Travel | 298.000 | 5 annual GA meetings, EB meetings, 10 workshops WP13
Other goods & services (G&S) | 298.000 | Software licenses, consumables and sequencing, computer and data storage, website, leaflets, training materials and dissemination materials
Total | 596.000

2 EMC | Cost (€) | Justification
Travel | 280.000 | GA/EB meetings and reimbursement of 30 experts for annual EAP meetings (WP11)
Other (G&S) | 277.000 | Molecular biology reagents, sequencing reagents, PCR reagents and kits; extraction kits, chemicals, laboratory reagents and buffers, plastic materials; cell culture reagents, antibodies, sampling materials and kits, diagnostic kits; costs for open access publications, audit certificate
Total | 557.000

4 FLI | Cost (€) | Justification
Travel | 32.200 | Annual project meetings/WP meetings, international congresses
Other (G&S) | 187.000 | See EMC justification for other G&S
Total | 219.200

5 ANSES | Cost (€) | Justification
Travel | 38.000 | Annual project meetings/WP meetings, international congresses
Other (G&S) | 173.000 | Consumables for molecular biology, library preparation and NGS sequencing, and conventional typing of microorganisms (Listeria)
Total | 211.000

7 EMBL | Cost (€) | Justification
Travel | 40.500 | Cost of travel within Europe (average 8 trips per year), including site visits to analysis development and submitting partners; cost of travel outside of Europe (average 1 trip per year), including GMI meetings and INSDC meetings relevant to COMPARE
Other (G&S) | 407.508 | Embassy cloud services are provided on the basis of a single annual charge of €38,838 for the overall supported environment offering a starting 10GHz CPU, 32GB RAM and 0.5TB storage, with an annual upscale charge of €460 per GHz CPU, €45.3 per GB RAM and €1,505 per TB storage. Beyond the compute requested from the Embassy cloud service, we also request support for a storage area for private data. We provide costings for the storage of 40TB of private data at €1,505 per TB, totalling €60,214
Total | 448.008

8 ISS | Cost (€) | Justification
Travel | 28.000 | Annual project meetings/WP meetings, international congresses
Other (G&S) | 333.000 | See EMC justification for other G&S
Total | 361.000

9 RIVM | Cost (€) | Justification
Travel | 40.000 | Annual project meetings/WP meetings, EAP barriers reimbursement for experts
Other (G&S) | 78.000 | See EMC justification for other G&S
Total | 118.000

10 AHVLA | Cost (€) | Justification
Travel | 39.286 | Annual project meetings/WP meetings, international congresses
Other (G&S) | 115.461 | See EMC justification, plus costs for purchase and handling of laboratory animals (ferrets and poultry)
Total | 154.747

11 UEDIN | Cost (€) | Justification
Travel | 26.000 | Annual project meetings/periodic WP meetings, international congresses
Other (G&S) | 109.200 | Laboratory and research consumables, computer software, audit certificate
Total | 135.200

12 UK-BONN | Cost (€) | Justification
Travel | 10.000 | Annual project meetings/periodic WP meetings, international congresses
Other (G&S) | 85.000 | See EMC justification for other G&S


Total | 95.000

13 AMC | Cost (€) | Justification
Travel | 10.000 | Annual project meetings/periodic WP meetings, international congresses
Other (G&S) | 104.000 | See EMC justification for other G&S
Total | 114.000

14 UA | Cost (€) | Justification
Travel | 9.000 | Annual project meetings/periodic WP meetings, international congresses
Other (G&S) | 103.000 | See EMC justification for other G&S
Total | 112.000

15 Artemis | Cost (€) | Justification
Travel | 5.000 | Annual project meetings/periodic WP meetings, international congresses
Other (G&S) | 40.000 | Costs for necropsy, histology, immunohistology, in situ hybridization, generation of probes, transmission microscopy
Total | 45.000

17 TIHO | Cost (€) | Justification
Travel | 5.000 | Annual project meetings/periodic WP meetings, international congresses
Other (G&S) | 48.000 | See 15 Artemis justification of costs + audit certificate
Total | 53.000

18 UCLM | Cost (€) | Justification
Travel | 20.000 | Annual project meetings/periodic WP meetings, international congresses
Equipment | 26.200 | Depreciation costs for microscope
Other (G&S) | 41.000 | Field sampling equipment and consumables, lab. diagnostics consumables
Total | 87.200

19 FMER | Cost (€) | Justification
Travel | 15.000 | Annual project meetings/WP meetings, international congresses
Other (G&S) | 80.000 | See EMC justification for other G&S
Total | 95.000

20 AUTH | Cost (€) | Justification
Travel | 30.000 | Annual project meetings/periodic WP meetings, sampling
Equipment | 15.000 | Real time PCR thermal cycler
Other (G&S) | 136.000 | See EMC justification for other G&S
Total | 181.000

21 IFREMER | Cost (€) | Justification
Travel | 23.000 | Annual project meetings/periodic WP meetings, international congresses
Equipment | 25.500 | Digital PCR (used for the project at 25% of the time) in WP2
Other (G&S) | 72.500 | See EMC justification for other G&S
Total | 121.000

24 WIGNER | Cost (€) | Justification
Travel | 20.000 | Annual project meetings/periodic WP meetings, international congresses
Other (G&S) | 63.000 | Consumables, publication cost, conference fees, minor computer hardware replacement/extension/repair; PC, computer server extension modules, CFS
Total | 83.000

25 Civic | Cost (€) | Justification
Travel | 29.600 | Travel to project meetings, travel for case study (retrospective/pilot/scenario study) of WP14
Other (G&S) | 28.000 | Data collection costs, costs for audit certificate
Total | 57.600

27 UNIBO | Cost (€) | Justification
Travel | 20.000 | Annual project meetings/periodic WP meetings, international congresses
Equipment | 20.000 | Computer hardware for WP9 Data Resource
Other (G&S) | 114.000 | See EMC justification for other G&S
Total | 154.000

28 DSMZ | Cost (€) | Justification
Travel | 4.000 | Annual project meetings/periodic WP meetings, international congresses
Other (G&S) | 47.000 |
Total | 51.000
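
The Embassy cloud charging scheme quoted in the 7 EMBL justification can be illustrated with a short calculation. This is a minimal sketch and not EMBL's actual costing method: the function names are hypothetical, and it assumes that the upscale rates apply per year to every unit above the starting allocation (10 GHz CPU, 32 GB RAM, 0.5 TB storage). For 40 TB of private data at the quoted €1,505 per TB the sketch gives €60,200, slightly below the €60,214 stated in the justification; the source does not explain the difference.

```python
from decimal import Decimal

# Rates quoted in the 7 EMBL justification (euros per year).
BASE_CHARGE = Decimal("38838")     # covers the starting allocation below
BASE_CPU_GHZ = Decimal("10")
BASE_RAM_GB = Decimal("32")
BASE_STORAGE_TB = Decimal("0.5")
RATE_PER_GHZ = Decimal("460")
RATE_PER_GB_RAM = Decimal("45.3")
RATE_PER_TB = Decimal("1505")


def _extra(requested, included):
    """Units requested above the included allocation (never negative)."""
    return max(Decimal(str(requested)) - included, Decimal("0"))


def annual_cloud_charge(cpu_ghz, ram_gb, storage_tb):
    """Hypothetical helper: base charge plus upscale for units above the base.

    Assumption (not stated in the source): only resources beyond the
    starting allocation are billed at the upscale rates.
    """
    return (
        BASE_CHARGE
        + _extra(cpu_ghz, BASE_CPU_GHZ) * RATE_PER_GHZ
        + _extra(ram_gb, BASE_RAM_GB) * RATE_PER_GB_RAM
        + _extra(storage_tb, BASE_STORAGE_TB) * RATE_PER_TB
    )


def private_storage_charge(storage_tb):
    """Hypothetical helper: private data storage billed per TB."""
    return Decimal(str(storage_tb)) * RATE_PER_TB


if __name__ == "__main__":
    print(annual_cloud_charge(10, 32, 0.5))   # starting allocation only
    print(private_storage_charge(40))         # 40 TB of private data
```

Using `Decimal` rather than floats keeps the euro amounts exact, which matters when comparing computed totals against the figures in the table.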