Functional Mapping of QTL and Recent Developments
Chang-Xing Ma, Department of Biostatistics, University at Buffalo, [email protected]
Rongling Wu, University of Florida

Upload: coleen-nash

Post on 13-Dec-2015



TRANSCRIPT

  • Slide 1

Functional Mapping of QTL and Recent Developments
Chang-Xing Ma, Department of Biostatistics, University at Buffalo, [email protected]
Rongling Wu, University of Florida

  • Slide 2

Outline
Interval Mapping
Functional Mapping
Functional Mapping Demo
Recent Developments
Conclusion

  • Slide 3

Gene, Allele, Genotype, Phenotype
Chromosomes from Father and Mother
Gene A, with two alleles A and a

Genotype   Phenotype
           Height   IQ
AA         185      100
AA         182      104
Aa         175      103
Aa         171      102
aa         155      101
aa         152      103

  • Slide 4

Regression model for estimating the genotypic effect
Phenotype = Genotype + Error
$y_i = x_i \mu_j + e_i$
$x_i$ is the indicator for QTL genotype
$\mu_j$ is the mean for genotype $j$
$e_i \sim N(0, \sigma^2)$

  • Slide 5

The genotypes for the trait are not observable and should be predicted from linked neutral molecular markers (M).
[Diagram: markers $M_1, M_2, M_3, \ldots, M_m$ along a chromosome, with a QTL among them]
Our task is to construct a statistical model that connects the QTL genotypes and marker genotypes through observed phenotypes.
The genes that lead to the phenotypic variation are called Quantitative Trait Loci (QTL).

  • Slide 6

Data Structure

Subject   Marker (M)                 Phenotype   Genotype frequency
          M_1      M_2     ... M_m   (y)         QQ(2)   Qq(1)   qq(0)
1         AA(2)    BB(2)   ...       y_1
2         AA(2)    BB(2)   ...       y_2
3         Aa(1)    Bb(1)   ...       y_3
4         Aa(1)    Bb(1)   ...       y_4
5         Aa(1)    Bb(1)   ...       y_5
6         Aa(1)    bb(0)   ...       y_6
7         aa(0)    Bb(1)   ...       y_7
8         aa(0)    bb(0)   ...       y_8

$n = n_{22} + n_{21} + n_{20} + n_{12} + n_{11} + n_{10} + n_{02} + n_{01} + n_{00}$
Cross: Parents AA × aa; F1 Aa; F2 AA, Aa, aa

  • Slide 7

Finite mixture model for estimating genotypic effects
$y_i \sim p(y_i \mid \mu, \sigma^2) = \omega_{2|i} f_2(y_i) + \omega_{1|i} f_1(y_i) + \omega_{0|i} f_0(y_i)$

QTL genotype (j)   QQ   Qq   qq
Code               2    1    0

where $f_j(y_i)$ is a normal distribution density with mean $\mu_j$ and variance $\sigma^2$, and $\mu = (\mu_2, \mu_1, \mu_0)$.

  • Slide 8

$\omega_{j|i}$ is the conditional (prior) probability of QTL genotype $j$ (= 2, 1, 0) given marker genotypes for subject $i$ (= 1, ..., n).
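The mixture density of Slide 7, weighted by the conditional probabilities just defined, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the genotype means, the variance, and the prior probabilities are all hypothetical and do not come from the slides.

```python
import math

def normal_pdf(y, mean, var):
    """Density of N(mean, var) evaluated at y."""
    return math.exp(-(y - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def mixture_density(y, omega, mu, var):
    """p(y_i) = sum over j of omega_{j|i} * f_j(y_i), for j = 2 (QQ), 1 (Qq), 0 (qq).

    omega and mu are dicts keyed by the genotype code.
    """
    return sum(omega[j] * normal_pdf(y, mu[j], var) for j in (2, 1, 0))

# Hypothetical values for a single subject (assumed, not from the slides).
mu = {2: 185.0, 1: 173.0, 0: 154.0}   # genotype means
var = 25.0                            # common residual variance
omega = {2: 0.90, 1: 0.09, 0: 0.01}   # prior probabilities given the markers
density = mixture_density(182.0, omega, mu, var)
```

Because the QTL genotype is unobserved, each subject contributes a weighted sum of three normal densities rather than a single one; summing the logarithm of this quantity over subjects gives the log-likelihood that is maximized.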
Likelihood function based on the mixture model:
$L(\omega, \mu, \sigma^2 \mid M, y) = \prod_{i=1}^{n} \sum_{j=0}^{2} \omega_{j|i} f_j(y_i)$

  • Slide 9

We model the parameters contained within the mixture model using particular functions:
QTL genotype frequency: $\omega_{j|i} = g_j(\Omega_p)$
Mean: $\mu_j = h_j(\Omega_m)$
Variance: $\sigma^2 = l(\Omega_v)$
$\Omega_p$ contains the population genetic parameters.
$\Omega_q = (\Omega_m, \Omega_v)$ contains the quantitative genetic parameters.

  • Slide 10

F2 QTL genotype frequency
Marker order M, Q, N, with recombination fraction a between M and Q, b between Q and N, and r = a + b - 2ab between M and N.

Marker M  Marker N  Freq             QQ(2)                   Qq(1)                           qq(0)
MM(2)     NN(2)     (1-r)^2/4        (1/4)(1-a)^2(1-b)^2     (1/2)a(1-a)b(1-b)               (1/4)a^2 b^2
          Nn(1)     (1-r)r/2         (1/2)(1-a)^2 b(1-b)     (1/2)a(1-a)(2b^2-2b+1)          (1/2)a^2 b(1-b)
          nn(0)     r^2/4            (1/4)(1-a)^2 b^2        (1/2)a(1-a)b(1-b)               (1/4)a^2(1-b)^2
Mm(1)     NN(2)     (1-r)r/2         (1/2)a(1-a)(1-b)^2      (1/2)b(1-b)(1-2a+2a^2)          (1/2)a(1-a)b^2
          Nn(1)     1/2 - (1-r)r     a(1-a)b(1-b)            (1/2)(2b^2-2b+1)(1-2a+2a^2)     a(1-a)b(1-b)
          nn(0)     (1-r)r/2         (1/2)a(1-a)b^2          (1/2)b(1-b)(1-2a+2a^2)          (1/2)a(1-a)(1-b)^2
mm(0)     NN(2)     r^2/4            (1/4)a^2(1-b)^2         (1/2)a(1-a)b(1-b)               (1/4)(1-a)^2 b^2
          Nn(1)     (1-r)r/2         (1/2)a^2 b(1-b)         (1/2)a(1-a)(2b^2-2b+1)          (1/2)(1-a)^2 b(1-b)
          nn(0)     (1-r)^2/4        (1/4)a^2 b^2            (1/2)a(1-a)b(1-b)               (1/4)(1-a)^2(1-b)^2

The three entries in each row sum to the marker genotype frequency in the Freq column; dividing an entry by that frequency gives the conditional probability $\omega_{j|i}$, for example $\omega_{2|22}$ or $\omega_{1|12}$.

  • Slide 11

Log-Likelihood Function
$\log L = \sum_{i=1}^{n} \log \left[ \sum_{j=0}^{2} \omega_{j|i} f_j(y_i) \right]$

  • Slide 12

The EM algorithm
E step: Calculate the posterior probability of QTL genotype j for individual i that carries a known marker genotype.
M step: Solve the log-likelihood equations.
Iterations are made between the E and M steps until convergence.

  • Slide 13

Interval Mapping Program - Type of Study
[Screenshot: Genetic Design]

  • Slide 14

Interval Mapping Program - Data and Options
[Screenshot fields: Names of Markers (optional); Cumulative Marker Distance (cM); Map Function; QTL Searching Step (cM); Parameters Here for Simulation Study Only]

  • Slide 15

Interval Mapping Program - Data
[Screenshot: Put Markers and Trait Data into box below]

  • Slide 16

Interval Mapping Program - Analyze Data
[Screenshot field: Trait]

  • Slide 17

Interval Mapping Program - Profile

  • Slide 18

Interval Mapping Program - Permutation Test
[Screenshot fields: number of tests; cut-off point at a chosen level, based on the number of tests]
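The E and M steps of Slide 12 can be sketched for the three-genotype normal mixture as follows. This is a minimal illustration, not the program shown in the screenshots: the function names, starting values, and prior probabilities are assumptions, and the priors are treated as known at a fixed test position. Only the six heights are taken from Slide 3.

```python
import math

def normal_pdf(y, mean, var):
    """Density of N(mean, var) evaluated at y."""
    return math.exp(-(y - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def em_mixture(y, omega, mu, var, max_iter=500, tol=1e-8):
    """EM for a normal mixture with known subject-specific weights.

    y     : list of phenotypes
    omega : per-subject prior probabilities [w_QQ, w_Qq, w_qq], each summing to 1
    mu    : starting genotype means [mu_QQ, mu_Qq, mu_qq]
    var   : starting common variance
    Returns the means, the variance, and the log-likelihood evaluated at the
    parameters used in the last E step.
    """
    n, k = len(y), len(mu)
    mu = list(mu)
    previous = -math.inf
    loglik = previous
    for _ in range(max_iter):
        # E step: posterior probability of each QTL genotype for each subject.
        posterior = []
        loglik = 0.0
        for yi, wi in zip(y, omega):
            joint = [wi[j] * normal_pdf(yi, mu[j], var) for j in range(k)]
            total = sum(joint)
            loglik += math.log(total)
            posterior.append([v / total for v in joint])
        # M step: posterior-weighted means, then the pooled variance.
        for j in range(k):
            weight = sum(p[j] for p in posterior)
            if weight > 0.0:
                mu[j] = sum(p[j] * yi for p, yi in zip(posterior, y)) / weight
        var = sum(p[j] * (yi - mu[j]) ** 2
                  for p, yi in zip(posterior, y) for j in range(k)) / n
        if abs(loglik - previous) < tol:
            break
        previous = loglik
    return mu, var, loglik

# Heights from Slide 3; the prior probabilities below are hypothetical.
heights = [185.0, 182.0, 175.0, 171.0, 155.0, 152.0]
priors = [[0.90, 0.09, 0.01]] * 2 + [[0.05, 0.90, 0.05]] * 2 + [[0.01, 0.09, 0.90]] * 2
mu_hat, var_hat, loglik = em_mixture(heights, priors, mu=[180.0, 170.0, 160.0], var=50.0)
```

In interval mapping this fit would be repeated at each test position along the chromosome, with the prior probabilities recomputed from the flanking markers at every step.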
  • Slide 19

Functional Mapping
An innovative model for genetic dissection of complex traits by incorporating mathematical aspects of biological principles into a mapping framework.
Provides a tool for cutting-edge research at the interplay between gene action and development.

  • Slide 20

Data Structure

Subject   Marker (M)       Phenotype (y)                      Genotype frequency
          1    2   ... m   1        2        ...  T           QQ(2)   Qq(1)   qq(0)
1         2    2   ...     y_1(1)   y_1(2)   ...  y_1(T)
2         2    2   ...     y_2(1)   y_2(2)   ...  y_2(T)
3         1    1   ...     y_3(1)   y_3(2)   ...  y_3(T)
4         1    1   ...     y_4(1)   y_4(2)   ...  y_4(T)
5         1    1   ...     y_5(1)   y_5(2)   ...  y_5(T)
6         1    0   ...     y_6(1)   y_6(2)   ...  y_6(T)
7         0    1   ...     y_7(1)   y_7(2)   ...  y_7(T)
8         0    0   ...     y_8(1)   y_8(2)   ...  y_8(T)

$n = n_{22} + n_{21} + n_{20} + n_{12} + n_{11} + n_{10} + n_{02} + n_{01} + n_{00}$
Cross: Parents AA × aa; F1 Aa; F2 AA, Aa, aa

  • Slide 21

The Finite Mixture Model
Observation vector: $y_i = [y_i(1), \ldots, y_i(T)] \sim MVN(u_j, \Sigma)$
Mean vector: $u_j = [u_j(1), u_j(2), \ldots, u_j(T)]$
(Co)variance matrix: $\Sigma$

  • Slide 22

Modeling the Mean Vector
Parametric approach
    Growth trajectories: Logistic curve
    HIV dynamics: Bi-exponential function
    Biological clock: Van der Pol equation
    Drug response: Emax model
Nonparametric approach
    Legendre function (orthogonal polynomial)
    B-spline

  • Slide 23

[Figure: Stem diameter growth in poplar trees. Ma, Casella & Wu: Genetics 2002]

  • Slide 24

Logistic Curve of Growth: A Universal Biological Law (West et al.: Nature 2001)
Modeling the genotype-dependent mean vector with the logistic curve:
$u_j = [u_j(1), u_j(2), \ldots, u_j(T)]$, with $u_j(t) = \dfrac{a_j}{1 + b_j e^{-r_j t}}$
Instead of estimating $u_j$, we estimate the curve parameters $\Omega_m = (a_j, b_j, r_j)$.

Number of parameters to be estimated in the mean vector

Time points   Traditional approach   Our approach
5             3 × 5 = 15             3 × 3 = 9
10            3 × 10 = 30            3 × 3 = 9
50            3 × 50 = 150           3 × 3 = 9

  • Slide 25

Modeling the Variance Matrix
Stationary parametric approach
    Autoregressive (AR) model
Nonstationary parametric approach
    Structured antedependence (SAD) model
    Ornstein-Uhlenbeck (OU) process
Nonparametric approach
    Legendre function

  • Slide 26

Autoregressive model AR(1)
$\Sigma = \sigma^2 \left[ \rho^{|s-t|} \right]_{s,t=1}^{T}$
$\Omega_q = (a_j, b_j, r_j, \rho, \sigma^2)$

  • Slide 27

Box-Cox Transformation
[Figure: Differences in growth across ages, poplar data. Panels: Untransformed; Log-transformed]

  • Slide 28

EM Algorithm (Ma et al. 2002, Genetics)
Estimate $(a_j, b_j, r_j; \rho, \sigma^2)$

  • Slide 29

An example of a forest tree
The study material used was derived from the triple hybridization of Populus (poplar). A Populus deltoides clone (designated I-69) was used as a female parent to mate with an interspecific P. deltoides x P. nigra clone (designated I-45) as a male parent (Wu et al. 1992). In the spring of 1988, a total of 450 1-year-old rooted three-way hybrid seedlings were planted at a spacing of 4 x 5 m at a forest farm near Xuchou City, Jiangsu Province, China. The total stem heights and diameters measured at the end of each of 11 growing seasons are used in this example. A genetic linkage map has been constructed using 90 genotypes randomly selected from the 450 hybrids with random amplified polymorphic DNAs (RAPDs) (Yin 2002).

  • Slide 30

[Figure: Functional mapping incorporated by logistic curves and AR(1) model, with the QTL position marked]

  • Slide 31

[Figure: The dynamic pattern of QTL expression]

  • Slide 32

Functional Mapping - Data
[Screenshot fields: Genetic Design; Curve; Marker Place; Time Point; Search Step (cM); Map Function. Parameters Here for Simulation Study: QTL Position; Curve Parameters; Sample Size; Sigma^2; Correlation rho]

  • Slide 33

Functional Mapping - Data
[Screenshot: Put Markers and Trait Data into box below]

  • Slide 34

Functional Mapping - Data Curves

  • Slide 35

Functional Mapping - Profile
[Screenshot: Initiate Values]

  • Slide 36

Functional Mapping - Profile

  • Slide 37

Functional Mapping - Data Curves

  • Slide 38

Recent Developments
Transform-both-sides logistic model. Wu, Ma, et al.: Biometrics 2004
Multiple genes: epistatic gene-gene interactions.
Wu, Ma, et al.: Genetics 2004
Multiple environments: genotype × environment. Zhao, Zhu, Gallo-Meagher & Wu: Genetics 2004
Multiple traits: trait correlations. Zhao et al.: Biometrics 2005
Genotype × sex interactions. Zhao, Ma, Cheverud & Wu: Physiological Genomics 2004

  • Slide 39

Transform-both-sides logistic model (Wu, Ma, Lin, Wang & Casella: Biometrics 2004)
Developmental pattern of genetic effects
Timing at which the QTL is switched on

  • Slide 40

Functional mapping for epistasis in poplar (Wu, Ma, Lin & Casella: Genetics 2004)
[Figure: QTL 1 and QTL 2]

  • Slide 41

Functional mapping for epistasis in poplar
[Figure: The growth curves of four different QTL genotypes for two QTL detected on the same linkage group D16]

  • Slide 42

Genotype × environment interaction in rice (Zhao, Zhu, Gallo-Meagher & Wu: Genetics 2004)

  • Slide 43

[Figure: Plant height growth trajectories in rice affected by QTL in two contrasting environments. Red: subtropical Hangzhou. Blue: tropical Hainan. Genotypes: QQ, qq]

  • Slide 44

Functional mapping: Genotype × sex interaction (Zhao, Ma, Cheverud & Wu: Physiological Genomics 2004)

  • Slide 45

[Figure: Body weight growth trajectories affected by QTL in male and female mice. Red: male mice. Blue: female mice. Genotypes: QQ, Qq, qq]

  • Slide 46

Functional mapping for trait correlation (Zhao, Hou, Littell & Wu: Biometrics, submitted)

  • Slide 47

[Figure: Growth trajectories for stem height and diameter affected by a pleiotropic QTL. Red: diameter. Blue: height. Genotypes: QQ, Qq]

  • Slide 48

Functional Mapping: toward high-dimensional biology
A new conceptual model for genetic mapping of complex traits
A systems approach for studying sophisticated biological problems
A framework for testing biological hypotheses at the interplay among genetics, development, physiology and biomedicine

  • Slide 49

Functional Mapping: Simplicity from complexity
Estimating fewer biologically meaningful parameters that model the mean vector
Modeling the structure of the variance matrix by developing powerful statistical methods, leading to few parameters to be estimated
The reduction of dimension increases the power and precision of parameter estimation
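The dimension reduction summarized on Slide 49 can be made concrete with a short Python sketch of the two ingredients from Slides 24 to 26: a logistic mean curve with three parameters per genotype, and an AR(1) covariance with two parameters. Several things here are assumptions rather than content from the slides: the logistic form a / (1 + b e^{-rt}), equally spaced time points, the covariance sigma^2 rho^{|s-t|}, the function names, and every numeric value.

```python
import math

def logistic_mean(times, a, b, r):
    """Mean vector u(t) = a / (1 + b * exp(-r * t)) at each time point (assumed form)."""
    return [a / (1.0 + b * math.exp(-r * t)) for t in times]

def ar1_covariance(n_times, rho, sigma2):
    """AR(1) covariance matrix with entries sigma2 * rho**|s - t|."""
    return [[sigma2 * rho ** abs(s - t) for t in range(n_times)] for s in range(n_times)]

def ar1_mvn_logpdf(y, mean, rho, sigma2):
    """Log-density of MVN(mean, Sigma) for the AR(1) Sigma above.

    Uses the first-order Markov factorization, so no matrix inverse is needed:
    the first residual is N(0, sigma2) and each later residual, given the
    previous one, is N(rho * previous, sigma2 * (1 - rho**2)).
    """
    resid = [yi - mi for yi, mi in zip(y, mean)]
    logp = -0.5 * (math.log(2.0 * math.pi * sigma2) + resid[0] ** 2 / sigma2)
    cond_var = sigma2 * (1.0 - rho ** 2)
    for t in range(1, len(resid)):
        innovation = resid[t] - rho * resid[t - 1]
        logp -= 0.5 * (math.log(2.0 * math.pi * cond_var) + innovation ** 2 / cond_var)
    return logp

# Eleven yearly measurements, as in the poplar example; parameter values are hypothetical.
times = list(range(1, 12))
curve_params = {2: (25.0, 9.0, 0.60), 1: (22.0, 9.0, 0.55), 0: (19.0, 9.0, 0.50)}
means = {j: logistic_mean(times, *p) for j, p in curve_params.items()}

# A made-up trajectory lying close to the QQ curve.
y = [m + 0.3 for m in means[2]]
logliks = {j: ar1_mvn_logpdf(y, means[j], rho=0.8, sigma2=1.0) for j in means}
```

With 11 time points, the three genotype mean vectors are described by 9 curve parameters instead of 33 separate means, and the 11 × 11 covariance matrix by 2 parameters instead of 66 distinct entries. These per-genotype log-densities would replace the univariate normal densities in the mixture likelihood.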