
Pollock 3/30/11


The Biostatistician’s Role in Managing Clinical Translational Research Data


Brad Pollock, MPH, PhD

Chairman, Department of Epidemiology and Biostatistics

University of Texas Health Science Center at San Antonio

Main Campus

Biostatistics and Informatics Core, Cancer Therapy & Research Center

Biostatistics and Research Design Core, Institute for the Integration of Medicine and Science (CTSA)

Biomedical Informatics Core, Institute for the Integration of Medicine and Science (CTSA)

Children’s Oncology Group Community Clinical Oncology Program (CCOP) Research Base

• Cooperative group statistician for the Pediatric Oncology Group and its successor, the Children’s Oncology Group

• Cancer center biostatistics core director

• GCRC and CTSA biostatistics and informatics core director

• Biostatistics cores: computational infrastructure and data management

• Data quality

• Discipline roles and responsibilities

• Projects

• Trends

Biostatistical Support Units


Types of Biostatistical Support Units

• CTSA Biostatistics, Epidemiology, Research Design (BERD) units

• Cancer Center Support Grants (P30)

• Data Coordinating Centers

• Statistical and Data Centers for NCI cooperative groups

Academic Homes for Biostatistical Support Units

• Units can be based in:

– Divisions

– Departments

– Schools/colleges

– Centers/institutes

– Administrative units of universities

– External coordinating centers

Biostatistics Core Functions

• Design studies

– Clarify hypotheses and objectives

– Define endpoints

– Select study/experimental design

– Sample size/power calculations

– Develop analytic plans

• Monitor studies

– Efficacy/futility

– Safety

• Analyze studies

– Statistical analysis

– Writing reports/manuscripts
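As an illustration of the sample size/power item above, here is a minimal sketch in Python. It is not taken from the slides. It assumes a two-sided, two-sample comparison of means with equal group sizes and a known common standard deviation, and it uses the normal approximation; the function name and the example effect size are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta: float, sigma: float,
                alpha: float = 0.05, power: float = 0.80) -> int:
    """Per-group sample size for a two-sided, two-sample comparison of means.

    Normal approximation: n = 2 * ((z_{1-alpha/2} + z_{power}) * sigma / delta)^2.
    delta is the difference to detect; sigma is the assumed common SD.
    """
    if delta == 0 or sigma <= 0:
        raise ValueError("delta must be nonzero and sigma must be positive")
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = std_normal.inv_cdf(power)          # quantile for the target power
    return ceil(2 * ((z_alpha + z_power) * sigma / delta) ** 2)


# Hypothetical design: detect a 5-unit difference when the SD is 10.
print(n_per_group(delta=5.0, sigma=10.0))  # 63 per group
```

A t-based calculation or a dedicated power package would give a slightly larger answer for small samples; the point here is only to show which design inputs the calculation needs.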

Computation

Who should define, manage, and oversee clinical translational research data operations?

Premise

• With some exceptions, computation in biostatistics has been heavily focused on analysis

• With the CTSAs, managing data for clinical translational research may be shifting toward the biomedical informatics discipline

WIKIPEDIA

• Biostatistics: …biostatistics encompasses the design of biological experiments, especially in medicine and agriculture; the collection, summarization, and analysis of data from those experiments; and the interpretation of, and inference from, the results.

• Biomedical Informatics: …at the intersection of information science, computer science, and health care. It deals with the resources, devices, and methods required to optimize the acquisition, storage, retrieval, and use of information in health and biomedicine. Health informatics tools include not only computers but also clinical guidelines, formal medical terminologies, and information and communication systems.


Biomedical Informatics

Focus areas:

– Ontologies

– Vocabulary/terminology

– Data models

– Human-machine interface

– Natural language processing

– Electronic health records

– Data repositories

Who should define, manage, and oversee clinical translational research data operations?

It depends…

Answer:

• No brainer if you are a:

– NCI cooperative group statistician

– Director of an NIH-funded Data Coordinating Center (DCC)

– Director of a structured biostatistics core: e.g., Centers for AIDS Research (CFAR), Alzheimer’s Disease Core Centers, etc.

• Often a requirement of the RFA

Answer:

• Less clear if you are a:

– Director of a CTSA BERD unit

– Director of a CCSG P30 Biostatistics Core

– Director of an institutional biostatistics support unit with a separate informatics group

• Clinical informatics

• Bioinformatics

– National Children’s Study center

DATA MANAGEMENT

What is Data Management?

• The development, execution and supervision of plans, policies, programs and practices that control, protect, deliver, and enhance the value of data and information assets*

*Data Management Association, Data Management Body of Knowledge (DAMA-DMBOK), 2008


Who’s Involved in Data Management

Subjects, Participants, Patients

Investigators

Clinicians, Research Staff, Clinical Staff

Statisticians, Epidemiologists, Analytic Staff

Central IT: CIO, ISO, SNO

Research IT: Analysts, Programmers, DBAs

End-to-End Process

Data Management within the Research Process

Figure labels: Protocol Development; Data Management Process; Final Statistical Analysis; IT Involvement

Data Management Changing Within the Research Process

Figure labels: Protocol Development; Data Management Process; Final Statistical Analysis

Data management considerations are beginning to influence the science

Storage and long term utilization affect the data long after the protocol’s final analysis

Data Management Responsibilities

• Maintain a functional, flexible, scalable, cost-efficient resource to handle a variety of data:

– Demographic

– Clinical/laboratory

– Bioinformatics

– Environmental

• Data quality and compliance with regulatory requirements

– HIPAA

– 21 CFR Part 11

– FISMA

• Planning for:

– Long time horizons (e.g., NCS)

– Interoperability and federation (e.g., caTissue Suite, caGRID, OpenMDR)

Database Management Functions

• Database design

– Data elements

– Relationships (data model)

– Access control/security/integrity

• Application development

– Data capture

– Data curation

– Querying

– Reporting

– Audit

• Database operation
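The database functions listed above (data elements, relationships, integrity, audit) can be sketched with Python's built-in sqlite3 module. This is an illustrative toy, not the design of any system named in this talk; the table names, column names, and values are all hypothetical.

```python
import sqlite3

# Hypothetical schema: two related data tables plus an audit table.
SCHEMA = """
CREATE TABLE participant (
    participant_id INTEGER PRIMARY KEY,
    birth_year     INTEGER NOT NULL CHECK (birth_year BETWEEN 1900 AND 2100)
);
CREATE TABLE lab_result (
    result_id      INTEGER PRIMARY KEY,
    participant_id INTEGER NOT NULL REFERENCES participant(participant_id),
    test_name      TEXT NOT NULL,
    value          REAL NOT NULL
);
CREATE TABLE audit_log (
    audit_id   INTEGER PRIMARY KEY,
    result_id  INTEGER NOT NULL,
    old_value  REAL,
    new_value  REAL,
    changed_at TEXT DEFAULT CURRENT_TIMESTAMP
);
-- Audit: record every change to a stored lab value.
CREATE TRIGGER lab_result_audit AFTER UPDATE OF value ON lab_result
BEGIN
    INSERT INTO audit_log (result_id, old_value, new_value)
    VALUES (OLD.result_id, OLD.value, NEW.value);
END;
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with integrity checks switched on."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforce the relationship
    conn.executescript(SCHEMA)
    return conn


conn = build_demo_db()
conn.execute("INSERT INTO participant VALUES (1, 2004)")
conn.execute("INSERT INTO lab_result VALUES (1, 1, 'hemoglobin', 12.9)")
conn.execute("UPDATE lab_result SET value = 13.1 WHERE result_id = 1")

audit_rows = conn.execute(
    "SELECT result_id, old_value, new_value FROM audit_log"
).fetchall()
print(audit_rows)  # [(1, 12.9, 13.1)]
```

In this sketch the constraints and the trigger live in the database itself, so the rules apply regardless of which data-capture application writes to it. A production audit trail would also need to record who made each change.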

How Data Are Handled

• Paper forms (CRFs) and keypunch

• Client-server DBMS and networked DBMS

• Web front-end DBMS

– Pediatric Oncology Group replaced paper in 1998

• Web front-end

• Oracle back-end

• Clinical Trials Management System (CTMS)

Advancing Technology


Clinical Trials Management Systems

IMPACT® CTMS

• Maintain and manage: Planning, preparation, performance, and reporting of clinical trials

• Up-to-date contact information for participants

• Tracking deadlines and milestones

• Regulatory approval

• Progress reports

IDEAS

DATA QUALITY

Criteria for Reproducible Epidemiologic Research*

Research Component: Requirement

Data: Analytical data set is available.

Methods: Computer code underlying figures, tables, and other principal results is made available in a human-readable form. In addition, the software environment necessary to execute that code is available.

Documentation: Adequate documentation of the computer code, software environment, and analytical data set is available to enable others to repeat the analyses and to conduct other similar ones.

Distribution: Standard methods of distribution are used for others to access the software, data, and documentation.

*from Peng, Dominici, Zeger. Am J Epidemiol 2006;163:783–789

Little emphasis on how we get to this point!
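One small, concrete way to support the Data, Methods, and Documentation criteria is to store a manifest alongside each analysis. The sketch below is illustrative only and is not described by Peng et al. or by these slides. It fingerprints the analytic data set and records the software environment; the function name, the script name, and the demo data are hypothetical.

```python
import hashlib
import json
import platform


def build_manifest(dataset_bytes: bytes, script_name: str) -> dict:
    """Describe an analysis run: which data, which code, which environment."""
    return {
        # A checksum lets others confirm they hold the same analytic data set.
        "dataset_sha256": hashlib.sha256(dataset_bytes).hexdigest(),
        "script": script_name,
        "python_version": platform.python_version(),
        "platform": platform.platform(),
    }


# Hypothetical analytic data set, held in memory for the demonstration.
demo_data = b"id,arm,outcome\n1,A,0\n2,B,1\n"
manifest = build_manifest(demo_data, "analysis.py")
print(json.dumps(manifest, indent=2))
```

A manifest like this documents the end state. It does not capture how the data were collected, cleaned, and curated before they became the analytic data set.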


Endgame

• Our goal is to do meaningful analyses to address study hypotheses

• Ethical analysis requires quality data

– Gelfond et al. “Principles for the Ethical Analysis of Clinical and Translational Research” (resubmitted to Statistics in Medicine)

Information vs. Analytical Quality

The features that make information useful are directly related to the features that make statistical analyses useful.

1. Statistical analyses should preserve the good qualities of the data.

2. The value of statistical analysis heavily depends on the information quality.

Information Quality

Statistical Quality

DISCIPLINE ROLES AND RESPONSIBILITIES

Biostatistics Cores


Computational Disciplines for Clinical Translational Research

Research IT

Computer Science

Biomedical Informatics

Clinical Translational Research Enterprise

Computational and Biostatistics Disciplines for Clinical Translational Research

Research IT

Computer Science

Biomedical Informatics

Clinical Translational Research Enterprise

Biostatistics

University of Texas Health Science Center at San Antonio

Informatics Data Exchange and Acquisition System (IDEAS)

• Began database development in 2001 for the San Antonio Cancer Institute (P30) using the general design approach of the POG Data System

• Extended to support the GCRC in 2002

• Adapted to caBIG requirements in 2007

caBIG

• NCI’s cancer Biomedical Informatics Grid launched in 2004

• Goals

– Connect scientists and practitioners through a shareable and interoperable infrastructure

– Develop standard rules and a common language to more easily share information

– Build or adapt tools for collecting, analyzing, integrating, and disseminating information associated with cancer research and care.


Informatics Data Exchange and Acquisition System (IDEAS)

• Began database development in 2001 for the San Antonio Cancer Institute (P30) using the general design approach of the POG Data System

• Extended to support the GCRC in 2002

• Adapted to caBIG requirements in 2007

• Extended to support the Institute for the Integration of Medicine and Science (CTSA) in 2008

– Single Point of Contact portal

• Practice-Based Research Network (PBRN) support added in 2010

IDEAS Design Philosophy

• Open-development: Tools and infrastructure developed through an open, participatory process.

• Open-access: Resources are freely obtainable…to ensure broad data-sharing and collaboration.

• Open-source: Source code is available to view, alter, and redistribute.

• Federated: Software and resources are widely distributed, interlinked, and available.

Complexity Encapsulation

• Object-based templates

• Common business objects

• Custom object libraries

• Standard Interfaces

User Interface

Data

Business Rules

Web Programmers

Domain experts and Informatics analysts

DBA

Informatics Data Exchange and Acquisition System

The IDEAS Framework: An interwoven structure of interdependent components

IDEAS Three Tier MVC Framework

Figure labels: Security Application; Data Collection Database; Web, Interface, Batch; Pathology & Genetics; Security; Protocols; Patient

IDEAS Interoperable Components

• IDEAS Custom Meta-data generator

• Shibboleth: Federated Single Sign-On Authentication Service

• caTissue Suite

• Patient Study Calendar (PSC)

• Qualtrics

IDEAS and the IIMS-Affiliated Practice-Based Research Networks (PBRNs)

1. StarNet (family practice) PBRN

2. Psychiatry PBRN

3. Dental PBRN

4. VA PBRN


Genetics and Biology of Liver Tumorigenesis in Children

• Bioinformatics and Biostatistics Core (BIBSC)

• Bring together disparate data:

– Pediatric Oncology Group, Children’s Cancer Group, Children’s Oncology Group, the Cooperative Human Tissue Network (CHTN), Baylor pathology reference lab

– Bioinformatics data from a range of high throughput platforms: Illumina, Affy, NextGen Sequencing, etc.

– Demographic and clinical information

Human Studies Database Project

The Human Studies Database (HSDB) Project

• Premise: Study results and design information should be made computable for large-scale data mining, synthesis, re-analysis, and reuse

• HSDB: A CTSA multi-institutional project to federate study design descriptors and results of the human research portfolio over a grid-based architecture.

HSDB Use Cases

• Inform the design of new studies

• Facilitate systematic reviews/meta-analyses

• Identify potential collaborators by:

• Disease, population, bio-specimens available, analytic method of interest, etc.

• Aid in research management:

– Portfolio management (inventory of studies by design type)

– Comparison of human research portfolios across institutions

– Subject recruitment and community engagement

Ontology of Clinical Research (OCRe)

• HSDB is being developed using the Ontology of Clinical Research (OCRe) and common clinical vocabularies to standardize the storage of information

• Focus on:

– Study design (Study Design Classifier), interventions, exposures, and analytic methods of individual-human studies

– Any design type, for any intent, in any clinical domain

– Federation across CTSAs


TRENDS

Twenty-five years from now…

…we will almost certainly have:

– New programming languages

– New methods for data management

– New architectures

– Completely different applications

– Unforeseen revolutions in how we create, store, manage, and process information

We Need to Expand Computing Education in the Statistics Curricula

• Nolan and Temple Lang*

– “Computational literacy and programming are as fundamental to statistical practice and research as is mathematics”

– “Statisticians must be able to access data from various sources”

*Nolan D, Temple Lang D. American Statistician, 2010, 64:97-107

Computation in Biostatistical Education

• Focus has been on statistical packages, statistical programming

• In 1990, UCLA set up a new concentration in Data Management for the MS in Biostatistics degree (~fizzled)

• Emerging trend is increased training in bioinformatics and statistical genetics/genomics in curricula

Comparative Effectiveness Research (CER) Concerns

• Death of randomized controlled clinical trials

• Weak analytic designs without appropriate control for bias

• Poor data quality from existing data stores (e.g. EMR)

• Data mining and data dredging

– Data dredging is the inappropriate (sometimes deliberately so) use of data mining to uncover misleading relationships in data.
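The data-dredging concern can be made concrete with a small simulation. The sketch below is illustrative and not from the slides. It assumes every comparison is pure noise (both groups drawn from the same distribution), uses a large-sample z-test, and counts how many comparisons still come out "significant" at the 0.05 level. The function names, seed, and group sizes are arbitrary choices.

```python
import random
from statistics import NormalDist, mean, stdev


def two_sample_p(a, b):
    """Two-sided p-value from a large-sample z-test for a difference in means."""
    se = (stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b)) ** 0.5
    z = (mean(a) - mean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def count_false_positives(n_tests=200, n_per_group=100, alpha=0.05, seed=2011):
    """Run n_tests comparisons where no true difference exists."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_tests):
        # Both groups come from the same N(0, 1) distribution.
        a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        if two_sample_p(a, b) < alpha:
            hits += 1
    return hits


hits = count_false_positives()
print(f"{hits} of 200 null comparisons were 'significant' at 0.05")
print("Bonferroni-adjusted threshold:", 0.05 / 200)
```

With 200 unadjusted tests at the 0.05 level, about 10 false positives are expected by chance alone. This is why pre-specified hypotheses, or an explicit multiplicity adjustment, matter when mining large clinical data stores.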

“Turning Dross into Gold”

• Just as alchemists tried turning dross into gold, the use of massive repositories of clinical data (from electronic medical records) does not necessarily yield valid and meaningful inferences

• Data quantity is no substitute for quality


Need to Increase Interactions with Bioinformatics and Cross Train

• NIH’s 2000 definition

– Bioinformatics: Research, development, or application of computational tools and approaches for expanding the use of biological, medical, behavioral or health data, including those to acquire, store, organize, archive, analyze, or visualize such data.

– Computational Biology: The development and application of data-analytical and theoretical methods, mathematical modeling and computational simulation techniques to the study of biological, behavioral, and social systems.

http://www.bisti.nih.gov/docs/CompuBioDef.pdf

Interaction of the Disciplines (Bayat A., BMJ, 324:324-27)

Clinical Research

Biostatistics

Clinical Informatics

Need to Increase Interactions with Bioinformatics and Cross Train

• This is already beginning to happen in some places:

– Organizational structures

– Curricula

SUMMARY

Take Home Points

• Computational technologies for managing data are changing faster than technologies for analysis

• Data management → Data quality

• Data quality → Analytic quality

Take Home Points (continued)

• We need to think beyond the immediate project when designing our databases

• Future proof them

– Will the data collected by tomorrow’s technology be scientifically comparable with data collected by today’s technology if the technology is vastly different?

– Considerations:

• Software

• Hardware platform

• Database content


Take Home Points (continued)

• Databases should be designed specifically with the analysis plan in mind

• Proper statistical analysis is still the main study goal, not creating the “perfect” database

Other Take Home Points

• Ramp up the data side of computation into the biostatistics curriculum

• CER efforts should focus on:

– Hypothesis testing vs. data mining

– Use of complete, high quality data

– Use of appropriate data models and analysis methods

Other Take Home Points (continued)

• Take advantage of opportunities to partner with biomedical informaticians:

– Development of translational research which melds biological data with clinical/population data

– Adaptive design methods in clinical trials

– National research networking

– Future-proofing and repurposing our databases…

“Databases for Clinical Translational Research: Re-Purposing and Designing for Unanticipated Needs”

2:30 PM – 3:45 PM Thursday, April 28, 2011