alexander bisignano, recombine // the path to personalized medicine

Upload: firstmark-capital

Post on 07-Jan-2017

Alexander Bisignano Co-Founder & CEO, Recombine [email protected] @alxbz

2/17/15 Copyright Recombine 2015 1

Data Science & Genomics

The Path to Personalized Medicine

Historical Approaches to Generating & Analyzing Genetic Data

Genetic Data

WHAT IS GENETIC MATERIAL? A DIGITAL DATA STORAGE SYSTEM

A LITTLE BIT ABOUT THE HUMAN GENOME

GENETICS 101
§ 23 Pairs of Chromosomes
§ 2 Sex Chromosomes (X/Y)
§ Chromosomes Made of Nucleic Acids
§ 4 Nucleotide Bases in DNA: A, T, C, & G
  § Adenine
  § Cytosine
  § Guanine
  § Thymine
§ Chromosomes contain Genes
§ Genes are instructions for Proteins
§ ~3 Billion Bases
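The "digital data storage" framing can be made concrete with a little arithmetic: with 4 possible bases, each position carries 2 bits, so ~3 billion bases come to roughly 750 MB. Below is a minimal sketch, assuming a simple 2-bit packing. The `encode`, `decode`, and `genome_megabytes` helpers and the bit assignments are illustrative and not from the slides.

```python
# Illustrative 2-bit encoding of DNA bases; the bit assignments are arbitrary.
BASES = "ACGT"
CODE = {base: index for index, base in enumerate(BASES)}
BITS_PER_BASE = 2             # log2(4 possible bases)
GENOME_BASES = 3_000_000_000  # ~3 billion bases, per the slide


def encode(sequence: str) -> int:
    """Pack a DNA string into an integer, 2 bits per base."""
    value = 0
    for base in sequence:
        value = (value << BITS_PER_BASE) | CODE[base]
    return value


def decode(value: int, length: int) -> str:
    """Unpack `length` bases from an integer produced by encode()."""
    bases = []
    for _ in range(length):
        bases.append(BASES[value & 0b11])
        value >>= BITS_PER_BASE
    return "".join(reversed(bases))


def genome_megabytes(n_bases: int = GENOME_BASES) -> float:
    """Storage for n_bases at 2 bits per base, in megabytes (10^6 bytes)."""
    return n_bases * BITS_PER_BASE / 8 / 1_000_000


if __name__ == "__main__":
    packed = encode("GATTACA")
    print(decode(packed, 7), genome_megabytes())
```

The length has to be passed to `decode` because leading "A" bases encode as zero bits and would otherwise be lost.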

GENETIC DISEASE: WHAT DO WE UNDERSTAND?

ANEUPLOIDY: CHROMOSOMAL ABNORMALITIES
§ Ex. Trisomy 21: Down Syndrome
§ Errors made during Meiosis (Formation of Sperm or Eggs)
§ INCIDENCE: 1/500 Live Births

GENETIC DISEASE: WHAT DO WE UNDERSTAND?

SINGLE GENE DISORDERS: INHERITED DISEASES
§ Ex. Cystic Fibrosis: Broken CFTR Gene
§ Carriers of 1 Broken Copy are not affected; individuals with 2 Broken Copies are affected
§ INCIDENCE: 1/300 Live Births
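The carrier rule above implies the familiar 1-in-4 outcome for two carrier parents. Below is a minimal sketch of that calculation, assuming simple Mendelian inheritance in which each parent passes on either copy with equal probability. The allele labels `N` (working copy) and `b` (broken copy) are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

OUTCOMES = ("unaffected non-carrier", "unaffected carrier", "affected")


def offspring_outcomes(parent_a: str, parent_b: str) -> dict:
    """Probability of each outcome for one recessive single-gene disorder.

    Each parent is a 2-character genotype: 'N' = working copy,
    'b' = broken copy (hypothetical labels).
    """
    counts = Counter()
    # Enumerate the four equally likely allele pairings (a Punnett square).
    for allele_a, allele_b in product(parent_a, parent_b):
        broken_copies = (allele_a + allele_b).count("b")
        counts[OUTCOMES[broken_copies]] += 1
    total = sum(counts.values())
    return {outcome: Fraction(n, total) for outcome, n in counts.items()}


if __name__ == "__main__":
    for outcome, probability in offspring_outcomes("Nb", "Nb").items():
        print(f"{outcome}: {probability}")
```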

Decreasing Costs & Increasing Information

The Genomic Revolution

COST OF A HUMAN GENOME RAPIDLY DECLINING

DECREASING COSTS LEAD TO EXPONENTIALLY MORE DATA

[Chart: time to analyze a single sample, by technology; log-scale axis with ticks from 1 to 6.104E+09]

FISH: 15 Minutes
aCGH: 3.47 Days
SNPs: 208.33 Days
EXOME: 117.96 Years
WGS: 5.85 Millennia

*If each data point takes ~1 minute to analyze, how long will a single sample take?
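The footnote's arithmetic is easy to reproduce. The sketch below converts a data-point count into analysis time at ~1 minute per point. The per-technology counts are assumptions back-calculated from the times on the slide (the slide itself does not list them), and a 365-day year is assumed.

```python
MINUTES_PER_HOUR = 60
MINUTES_PER_DAY = 24 * MINUTES_PER_HOUR
MINUTES_PER_YEAR = 365 * MINUTES_PER_DAY  # assumes a 365-day year

# Assumed data points per sample, back-calculated from the slide's times.
ASSUMED_DATA_POINTS = {
    "FISH": 15,
    "aCGH": 5_000,
    "SNPs": 300_000,
    "EXOME": 62_000_000,
    "WGS": 3_075_000_000,
}


def analysis_time(data_points: int, minutes_per_point: float = 1.0):
    """Return (value, unit) for the total analysis time of one sample."""
    minutes = data_points * minutes_per_point
    if minutes < MINUTES_PER_HOUR:
        return minutes, "minutes"
    if minutes < MINUTES_PER_DAY:
        return minutes / MINUTES_PER_HOUR, "hours"
    if minutes < MINUTES_PER_YEAR:
        return minutes / MINUTES_PER_DAY, "days"
    years = minutes / MINUTES_PER_YEAR
    if years < 1000:
        return years, "years"
    return years / 1000, "millennia"


if __name__ == "__main__":
    for technology, points in ASSUMED_DATA_POINTS.items():
        value, unit = analysis_time(points)
        print(f"{technology}: {value:.2f} {unit}")
```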

DATA CHALLENGES IN GENOMICS: DEALING WITH ALL THE DATA

DATA STORAGE
§ While great database systems exist, standard data storage remains a problem
§ ‘Laboratory’ diagnostic companies do not employ CS or Data Engineers

DATA INTEGRITY
§ There are >20 major databases with Genomic Annotation
§ Database differences are ubiquitous (mutation names, conventions, etc.)

DATA SHARING IS STILL CHALLENGING
§ Many Universities & Companies concerned over ‘forfeiting IP’
§ Platforms for sharing are still very ‘young’
§ Data quality varies enormously
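As one illustration of the naming differences mentioned under DATA INTEGRITY, the same protein-level change can appear as `p.Y231X`, `p.Tyr231*`, or `p.Tyr231Ter` depending on the source. Below is a minimal sketch of normalizing such names to one spelling before comparing records. The function name, the alias list, and the choice of the three-letter form as canonical are assumptions for illustration, and only simple single-residue substitutions are handled.

```python
import re

# One-letter to three-letter amino acid codes. Treating "X" and "*" as a stop
# codon ("Ter") is an assumption that matches the p.Y231X style of naming.
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "X": "Ter", "*": "Ter",
}
THREE_LETTER = set(ONE_TO_THREE.values())
SUBSTITUTION = re.compile(r"^p\.([A-Za-z*]+)(\d+)([A-Za-z*]+)$")

# Hypothetical spellings of one variant as it might appear in different sources.
ALIASES = ["p.Y231X", "p.Tyr231*", "p.Tyr231Ter"]


def _residue(token: str) -> str:
    """Convert a one- or three-letter residue token to three-letter form."""
    if len(token) == 1 and token.upper() in ONE_TO_THREE:
        return ONE_TO_THREE[token.upper()]
    if token.capitalize() in THREE_LETTER:
        return token.capitalize()
    raise ValueError(f"unrecognized residue: {token!r}")


def canonical_protein_name(name: str) -> str:
    """Normalize a simple protein substitution to three-letter form."""
    match = SUBSTITUTION.match(name.strip())
    if match is None:
        raise ValueError(f"not a simple protein substitution: {name!r}")
    ref, position, alt = match.groups()
    return f"p.{_residue(ref)}{int(position)}{_residue(alt)}"


if __name__ == "__main__":
    print({alias: canonical_protein_name(alias) for alias in ALIASES})
```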

How do we model, predict & clinically act upon complex disease?

Complex Genetic Diseases

MODELING COMPLEX DISEASE: WILL THE SINGLE-GENE PARADIGM HOLD?

HOW DO MANY GENES INTERACT?
§ Complex Signal-Transduction pathways
§ How do they affect a single outcome?
§ Can be influenced by environmental factors
§ Non-100% inheritance pattern

COMPLEX GENETIC TRAITS: ANALYSIS NO LONGER POSSIBLE BY HUMANS ALONE

Gene    Variants    Progressive Combinations
FSH     6           6
PKA     4           24
GAB2    5           120
R2C2    7           840
IRS1    4           3360
AKT     3           10080

The FSH signal transduction pathway in granulosa cells leads to follicle recruitment.
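The "Progressive Combinations" column is a running product of the variant counts. A minimal sketch follows, assuming each gene contributes one variant independently of the others. The function name is illustrative.

```python
from itertools import accumulate
from operator import mul

# Variant counts per gene, copied from the table above.
VARIANTS_PER_GENE = {"FSH": 6, "PKA": 4, "GAB2": 5, "R2C2": 7, "IRS1": 4, "AKT": 3}


def progressive_combinations(variant_counts):
    """Running product: combinations possible after adding each gene in turn."""
    return list(accumulate(variant_counts, mul))


if __name__ == "__main__":
    totals = progressive_combinations(VARIANTS_PER_GENE.values())
    for gene, total in zip(VARIANTS_PER_GENE, totals):
        print(f"{gene:<6}{VARIANTS_PER_GENE[gene]:>3}{total:>8}")
```

Six genes with a handful of variants each already give more than ten thousand combinations, which is the slide's argument for automated analysis.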

SAMPLE APPLICATIONS
§ Predicting patient response to drugs
§ Predicting patient disease
§ Forecasting treatment success rate
§ Predicting optimal treatment pathway

COMPLEX GENETIC TRAITS: COMMUNICATING OUTCOME & MAKING MEDICAL DECISIONS

AVERAGE POPULATION RISK VS. PERSONAL RISK
§ Ex. Breast Cancer: Average vs. BRCA1 Mutation Carrier

AVERAGE PATIENT: 12.5% LIFETIME RISK

BRCA1 CARRIER: 60% LIFETIME RISK
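The two figures can be compared in relative and absolute terms, which is often how the difference is communicated to patients. A minimal sketch using only the numbers on the slide follows. The function names are illustrative.

```python
AVERAGE_LIFETIME_RISK = 0.125  # 12.5%, from the slide
BRCA1_LIFETIME_RISK = 0.60     # 60%, from the slide


def relative_risk(personal: float, baseline: float) -> float:
    """How many times higher the personal risk is than the baseline."""
    return personal / baseline


def absolute_risk_increase(personal: float, baseline: float) -> float:
    """Difference between personal and baseline risk, as a fraction."""
    return personal - baseline


if __name__ == "__main__":
    rr = relative_risk(BRCA1_LIFETIME_RISK, AVERAGE_LIFETIME_RISK)
    ari = absolute_risk_increase(BRCA1_LIFETIME_RISK, AVERAGE_LIFETIME_RISK)
    print(f"relative risk: {rr:.1f}x, absolute increase: {ari:.1%}")
```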

STEPS TOWARDS PERSONALIZATION: OUTLIERS STILL INFORM THE MEAN

Step 1: Identify & Predict Extreme Cases
• Ex. Single-Gene, Population Outliers

Step 2: Segment the Slightly More Complex
• Ex. Multi-Gene, Population Segments

Step 3: Predict & Personalize
• Ex. Multi-Gene, Personal Predictions

[Chart; y-axis: Number of Individuals]

An Unexpected Challenge due to the Unknown Unknowns of Genomics

Automation of Data Analysis

DNA MICROARRAYS: TEST 1000s OF MUTATIONS AT ONCE

Hybridized Genomic DNA: …TAACTGCTATTTTCGTACCA…
Synthetic Probe: ATTGACGATAAAAGCATGG_
Single Base Extension: T
Coupled Light Reaction
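The single-base-extension step in the diagram can be checked by hand: the probe pairs base-for-base with the genomic DNA, and the base added at the open position is the complement of the next genomic base. Below is a minimal sketch using the two sequences from the slide. Both are read left to right as drawn, and strand orientation is ignored, which is a simplification.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def single_base_extension(template: str, probe: str) -> str:
    """Return the base added to the end of a probe hybridized to template.

    Position i of the probe pairs with position i of the template, as the
    two sequences are drawn on the slide.
    """
    if len(probe) >= len(template):
        raise ValueError("template must extend past the end of the probe")
    for position, (t_base, p_base) in enumerate(zip(template, probe)):
        if COMPLEMENT[t_base] != p_base:
            raise ValueError(f"probe does not pair at position {position}")
    # The extension base pairs with the first template base past the probe.
    return COMPLEMENT[template[len(probe)]]


if __name__ == "__main__":
    genomic = "TAACTGCTATTTTCGTACCA"
    probe = "ATTGACGATAAAAGCATGG"
    print(single_base_extension(genomic, probe))
```

For the slide's sequences the extension base is T, matching the "T" shown in the diagram.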

SAMPLE MUTATION CLUSTER: CAN DIFFERENTIATE 2 NUCLEOTIDES

[Chart: Number of Individuals vs. Wavelength of Light, with three clusters labeled AA, AC, and CC]

DNA MICROARRAYS: ONLY 2 LIGHT CHANNELS

2 LIGHT CHANNELS, 4 NUCLEOTIDES. HUH?
§ With only 2 light channels, how do we differentiate all 4 nucleotides?
§ Paradigm 1:
  § Only test for highly conserved, bi-allelic Single Nucleotide Polymorphisms (SNPs) (bi-allelic SNPs are thought to differ between only two of the four nucleotides)
  § i.e. there is an ASSUMPTION that targeted mutations only differ between 2 of the 4 nucleotides

But…

REAL LIFE EXAMPLE: CANAVAN DISEASE MUTATION

A single mutation breaks the ASPA Gene
§ Mutation known as p.Y231X (c.693C>A)
§ Causes a premature Termination Codon (aka STOP)
§ Assumption of allele frequencies based on known literature:
  § C (>99%)
  § A (<1%)
§ Reality of allele frequencies in CERTAIN POPULATIONS:
  § C (~72%)
  § A (<1%)
  § G (~27%)
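To get a feel for how many samples an unexpected third allele would touch, the allele frequencies above can be turned into expected genotype frequencies. The sketch below assumes Hardy-Weinberg equilibrium, which the slides do not state, and rounds the "<1%" A allele to 1% so that the frequencies sum to 1.

```python
def genotype_frequencies(allele_freqs: dict) -> dict:
    """Expected genotype frequencies under Hardy-Weinberg equilibrium."""
    alleles = sorted(allele_freqs)
    frequencies = {}
    for index, first in enumerate(alleles):
        for second in alleles[index:]:
            p, q = allele_freqs[first], allele_freqs[second]
            # Homozygotes occur at p^2, heterozygotes at 2pq.
            frequencies[first + second] = p * p if first == second else 2 * p * q
    return frequencies


# Approximate frequencies from the slide; A is rounded up to 1% (assumption).
OBSERVED = {"C": 0.72, "A": 0.01, "G": 0.27}

if __name__ == "__main__":
    freqs = genotype_frequencies(OBSERVED)
    with_g = sum(f for genotype, f in freqs.items() if "G" in genotype)
    print(f"samples carrying at least one G: {with_g:.1%}")
```

Under these assumptions, close to half of the samples in such a population would carry at least one allele that the two-allele assay was never designed to see.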

DNA MICROARRAYS: WHAT ABOUT POLYMORPHISMS?

[Chart: Number of Individuals vs. Wavelength of Light, with clusters labeled AA; AG and AC; GG and CC]
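The cluster labels above suggest why a third allele is a problem for a two-channel readout: genotypes containing G land with the genotypes containing C. Below is a minimal sketch of that collision. The channel names and the assumption that an unexpected G lights up the same channel as C are hypothetical, chosen only to reproduce the grouping on the slide.

```python
# Hypothetical channel assignment for a probe designed for an A/C SNP.
# Assumption: an unexpected G is read in the same channel as C.
CHANNEL_OF_BASE = {"A": "channel_1", "C": "channel_2", "G": "channel_2"}

# The assay only knows three possible calls, one per signal pattern.
CALL_FOR_SIGNAL = {
    frozenset({"channel_1"}): "AA",
    frozenset({"channel_1", "channel_2"}): "AC",
    frozenset({"channel_2"}): "CC",
}


def called_genotype(true_genotype: str) -> str:
    """Genotype the two-channel assay would report for a true genotype."""
    signal = frozenset(CHANNEL_OF_BASE[base] for base in true_genotype)
    return CALL_FOR_SIGNAL[signal]


if __name__ == "__main__":
    for genotype in ("AA", "AC", "CC", "AG", "GG", "CG"):
        print(f"true {genotype} -> called {called_genotype(genotype)}")
```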

ACCOUNTING FOR POLYMORPHISMS: OVERLAP PRIMERS

ATCGCCTATAGACCCA_
5’ ATCGTAGCGGATATCTGGGTCAAATCATGATCAAGGATA 3’
3’ TAGCATCGCCTATAGACCCAGTTTAGTACTAGTTCCTAT 5’
_CAAATCATGATCAAGG
_TAAATCATGATCAAGG

SOME PROBES ARE DESIGNED TO FAIL
§ If an unplanned polymorphism exists, certain of our primers simply will not show any result.
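The "designed to fail" idea can be sketched with the sequences above: two probes differ only in their first base (C vs. T), so at most one of them matches a given sample, and a sample with some other base at that position matches neither. The code below uses exact substring matching as a stand-in for hybridization, which is a deliberate simplification. The function names are illustrative.

```python
# Top strand and the two overlapping probes, copied from the slide.
REFERENCE = "ATCGTAGCGGATATCTGGGTCAAATCATGATCAAGGATA"
PROBES = {"C": "CAAATCATGATCAAGG", "T": "TAAATCATGATCAAGG"}

# Position of the base at which the two probes differ.
SITE = REFERENCE.index(PROBES["C"])


def with_base_at_site(base: str) -> str:
    """Copy of the reference strand with `base` substituted at SITE."""
    return REFERENCE[:SITE] + base + REFERENCE[SITE + 1:]


def call_allele(strand: str) -> str:
    """Report which probe matches; any other outcome is a no-call."""
    hits = [allele for allele, probe in PROBES.items() if probe in strand]
    return hits[0] if len(hits) == 1 else "no call"


if __name__ == "__main__":
    for base in "CTG":
        print(f"base {base} at site -> {call_allele(with_base_at_site(base))}")
```

A no-call is the useful outcome here: it flags the sample for follow-up instead of silently reporting a wrong genotype.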

ASSUMPTIONS BASED ON INCOMPLETE DATA: PLAN FOR ALL POSSIBILITIES

Takeaways
§ Assumptions about the genome are often disproved
§ Even great genomic technologies need sanity checking

855-OUR-GENES – [email protected] – www.recombine.com