![Page 1: An animal breeders introduction to HMM · An animal breeders introduction to HMM @hickeyjohn. Relationships between haplotypes. Hidden Markov models •Lets think of Genetics –Alleles](https://reader030.vdocuments.mx/reader030/viewer/2022041110/5f0fc1127e708231d445b7d3/html5/thumbnails/1.jpg)
An animal breeders introduction to HMM
www.alphagenes.roslin.ed.ac.uk
@hickeyjohn
![Page 2: An animal breeders introduction to HMM · An animal breeders introduction to HMM @hickeyjohn. Relationships between haplotypes. Hidden Markov models •Lets think of Genetics –Alleles](https://reader030.vdocuments.mx/reader030/viewer/2022041110/5f0fc1127e708231d445b7d3/html5/thumbnails/2.jpg)
Relationships between haplotypes
![Page 3: An animal breeders introduction to HMM · An animal breeders introduction to HMM @hickeyjohn. Relationships between haplotypes. Hidden Markov models •Lets think of Genetics –Alleles](https://reader030.vdocuments.mx/reader030/viewer/2022041110/5f0fc1127e708231d445b7d3/html5/thumbnails/3.jpg)
Hidden Markov models
• Let's think of genetics
  – Alleles (A=0, a=1)
  – Genotype data (0, 1, or 2)
10100111011100111001110011
11110222111111111111121021
01010111100011000110011010
The phasing problem – split diplotype into a pair of haplotypes
The founder haplotype mosaic problem
![Page 4: An animal breeders introduction to HMM · An animal breeders introduction to HMM @hickeyjohn. Relationships between haplotypes. Hidden Markov models •Lets think of Genetics –Alleles](https://reader030.vdocuments.mx/reader030/viewer/2022041110/5f0fc1127e708231d445b7d3/html5/thumbnails/4.jpg)
Missing pedigree
• One parent may be known and genotyped
  – Use heuristics
• The other parent is not
  – Use an HMM, as it is pedigree free
• Logic can be extended to missing grandparents
![Page 5: An animal breeders introduction to HMM · An animal breeders introduction to HMM @hickeyjohn. Relationships between haplotypes. Hidden Markov models •Lets think of Genetics –Alleles](https://reader030.vdocuments.mx/reader030/viewer/2022041110/5f0fc1127e708231d445b7d3/html5/thumbnails/5.jpg)
We start with a simple HMM
• HMM – Hidden Markov Model
  – Hidden = it models processes that are not directly observed
  – Markov = given time n-1, what exists at time n is conditionally independent of everything that went on before time n-1
  – Model = can make predictions on the basis of the model
• Genetics is a perfect fit
  – Gametes are hidden
  – Markers are linked to the ones that went before
  – Lots of things to predict (e.g. missing markers, QTL)
![Page 6: An animal breeders introduction to HMM · An animal breeders introduction to HMM @hickeyjohn. Relationships between haplotypes. Hidden Markov models •Lets think of Genetics –Alleles](https://reader030.vdocuments.mx/reader030/viewer/2022041110/5f0fc1127e708231d445b7d3/html5/thumbnails/6.jpg)
HMM
• In genetics
  – Gary Churchill
  – Beagle
  – fastPHASE
  – Impute2
  – AlphaPhase (combined with heuristics)
  – AlphaImpute (combined with heuristics)
  – MaCH
• In other fields
  – Speech recognition
  – Weather
  – Checking casinos
![Page 7: An animal breeders introduction to HMM · An animal breeders introduction to HMM @hickeyjohn. Relationships between haplotypes. Hidden Markov models •Lets think of Genetics –Alleles](https://reader030.vdocuments.mx/reader030/viewer/2022041110/5f0fc1127e708231d445b7d3/html5/thumbnails/7.jpg)
In speech recognition
![Page 8: An animal breeders introduction to HMM · An animal breeders introduction to HMM @hickeyjohn. Relationships between haplotypes. Hidden Markov models •Lets think of Genetics –Alleles](https://reader030.vdocuments.mx/reader030/viewer/2022041110/5f0fc1127e708231d445b7d3/html5/thumbnails/8.jpg)
Real improvements in accuracy of modeling weather
![Page 9: An animal breeders introduction to HMM · An animal breeders introduction to HMM @hickeyjohn. Relationships between haplotypes. Hidden Markov models •Lets think of Genetics –Alleles](https://reader030.vdocuments.mx/reader030/viewer/2022041110/5f0fc1127e708231d445b7d3/html5/thumbnails/9.jpg)
HMM - Weather example
• I am locked in an office without any windows
• I want to predict what the weather is each day
• Each day my office mate “Andreas” comes in
  – But we don’t talk
• Can I extract information from the behavior of Andreas?
  – Andreas likes ice-cream
  – He eats a different number of ice-creams each day
  – Could I use that to predict the weather outside?
![Page 10: An animal breeders introduction to HMM · An animal breeders introduction to HMM @hickeyjohn. Relationships between haplotypes. Hidden Markov models •Lets think of Genetics –Alleles](https://reader030.vdocuments.mx/reader030/viewer/2022041110/5f0fc1127e708231d445b7d3/html5/thumbnails/10.jpg)
Reality underlying the data
• Data
  – For 30 days I record the number of ice-creams Andreas eats
• “Biological” knowledge
  – Just two weather states (Sunny or Cloudy)
  – Weather today is correlated with weather tomorrow
  – Correlation dissipates over time
  – Ice-cream intake is correlated with weather
![Page 11: An animal breeders introduction to HMM · An animal breeders introduction to HMM @hickeyjohn. Relationships between haplotypes. Hidden Markov models •Lets think of Genetics –Alleles](https://reader030.vdocuments.mx/reader030/viewer/2022041110/5f0fc1127e708231d445b7d3/html5/thumbnails/11.jpg)
Weather example
• Hidden Markov process
• Markov
  – The state at time n depends only on the state at time n-1
• Hidden
  – We see ice-cream, but we are really modeling the weather
![Page 12: An animal breeders introduction to HMM · An animal breeders introduction to HMM @hickeyjohn. Relationships between haplotypes. Hidden Markov models •Lets think of Genetics –Alleles](https://reader030.vdocuments.mx/reader030/viewer/2022041110/5f0fc1127e708231d445b7d3/html5/thumbnails/12.jpg)
Hidden Markov models
• Let's think of genetics
  – Alleles (A=0, a=1)
  – Genotype data (0, 1, or 2)
10100111011100111001110011
11110222111111111111121021
01010111100011000110011010
The phasing problem – split diplotype into a pair of haplotypes
The founder haplotype mosaic problem
![Page 13: An animal breeders introduction to HMM · An animal breeders introduction to HMM @hickeyjohn. Relationships between haplotypes. Hidden Markov models •Lets think of Genetics –Alleles](https://reader030.vdocuments.mx/reader030/viewer/2022041110/5f0fc1127e708231d445b7d3/html5/thumbnails/13.jpg)
Discrete Markov process
• A system which may be described at any time as being in one of K distinct states (k1, k2, …, kK)
• At regularly spaced discrete times, the system undergoes a change of state according to a set of probabilities
• Time = n
• Actual state at time n = xn
• Transition probabilities = aij
![Page 14: An animal breeders introduction to HMM · An animal breeders introduction to HMM @hickeyjohn. Relationships between haplotypes. Hidden Markov models •Lets think of Genetics –Alleles](https://reader030.vdocuments.mx/reader030/viewer/2022041110/5f0fc1127e708231d445b7d3/html5/thumbnails/14.jpg)
Weather example
• States
  – kC = Cloudy
  – kS = Sunny
• Time = days
• Transition matrix = A

| From \ To | Cloudy | Sunny |
|---|---|---|
| Cloudy | 0.9 | 0.1 |
| Sunny | 0.7 | 0.3 |
![Page 15: An animal breeders introduction to HMM · An animal breeders introduction to HMM @hickeyjohn. Relationships between haplotypes. Hidden Markov models •Lets think of Genetics –Alleles](https://reader030.vdocuments.mx/reader030/viewer/2022041110/5f0fc1127e708231d445b7d3/html5/thumbnails/15.jpg)
Weather example
• States
  – kC = Cloudy
  – kS = Sunny
• Time = days
• Transition matrix = A

| From \ To | Cloudy | Sunny |
|---|---|---|
| Cloudy | 0.9 | 0.1 |
| Sunny | 0.7 | 0.3 |

$$p[x_n = k_i \mid x_{n-1} = k_j, x_{n-2} = k_h, \ldots] = p[x_n = k_i \mid x_{n-1} = k_j] = a_{i,j}$$
![Page 16: An animal breeders introduction to HMM · An animal breeders introduction to HMM @hickeyjohn. Relationships between haplotypes. Hidden Markov models •Lets think of Genetics –Alleles](https://reader030.vdocuments.mx/reader030/viewer/2022041110/5f0fc1127e708231d445b7d3/html5/thumbnails/16.jpg)
Observable Markov process
• We have been describing an observable Markov process
• Output of the process is a set of states at each instant of time
• Each state relates directly to a physical (observable) event– Andreas eats x ice-creams each day
![Page 17: An animal breeders introduction to HMM · An animal breeders introduction to HMM @hickeyjohn. Relationships between haplotypes. Hidden Markov models •Lets think of Genetics –Alleles](https://reader030.vdocuments.mx/reader030/viewer/2022041110/5f0fc1127e708231d445b7d3/html5/thumbnails/17.jpg)
We can enumerate
• For 7 consecutive days the weather is:
  – S, C, S, C, C, C, S (S = Sunny = kS; C = Cloudy = kC)
  – More formally
    • O = observation sequence
    • O = {kS, kC, kS, kC, kC, kC, kS}
• What is the probability of this happening?
  – Pr(O | Model) = Pr(kS, kC, kS, kC, kC, kC, kS | Model)
• Get our transition matrix
• Initial state probability π = (kS = 0.5, kC = 0.5)
• πS × aS->C × aC->S × aS->C × aC->C × aC->C × aC->S
• 0.5 × 0.7 × 0.1 × 0.7 × 0.9 × 0.9 × 0.1 = 0.00198
• The probability of any given realization is low (imagine how low it would be with 1 million SNPs), but we are concerned only with relative probabilities

| From \ To | Cloudy | Sunny |
|---|---|---|
| Cloudy | 0.9 | 0.1 |
| Sunny | 0.7 | 0.3 |
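The enumeration above can be checked with a few lines of Python. This is a minimal sketch: the function name `sequence_probability` and the dictionary layout are illustrative choices, and the matrix rows are read as "from" states (Sunny to Cloudy = 0.7), which is what the worked product on the slide implies.

```python
# Transition matrix from the weather example: outer key = state on day n-1,
# inner key = state on day n (an assumed reading, consistent with aS->C = 0.7).
A = {
    "C": {"C": 0.9, "S": 0.1},
    "S": {"C": 0.7, "S": 0.3},
}
pi = {"S": 0.5, "C": 0.5}  # initial state probabilities


def sequence_probability(states, A, pi):
    """Probability of a fully observed state sequence under a first-order Markov chain."""
    prob = pi[states[0]]
    # One transition per pair of consecutive days: 7 days give 6 transitions.
    for prev, curr in zip(states, states[1:]):
        prob *= A[prev][curr]
    return prob


weather = ["S", "C", "S", "C", "C", "C", "S"]
p = sequence_probability(weather, A, pi)
print(f"{p:.5f}")  # 0.00198
```

With a million markers the product underflows, so real implementations usually work with log-probabilities or rescale at every step.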
![Page 18: An animal breeders introduction to HMM · An animal breeders introduction to HMM @hickeyjohn. Relationships between haplotypes. Hidden Markov models •Lets think of Genetics –Alleles](https://reader030.vdocuments.mx/reader030/viewer/2022041110/5f0fc1127e708231d445b7d3/html5/thumbnails/18.jpg)
Hidden Markov models
• Rather than observing each state directly
• Extend so that each observed state is a probabilistic function of an unobserved event
  – i.e. Andreas eats ice-cream
  – But not perfectly correlated with the weather
  – Latent variables versus observed variables
[Figure labels: Emission probabilities; Transition probabilities]
![Page 19: An animal breeders introduction to HMM · An animal breeders introduction to HMM @hickeyjohn. Relationships between haplotypes. Hidden Markov models •Lets think of Genetics –Alleles](https://reader030.vdocuments.mx/reader030/viewer/2022041110/5f0fc1127e708231d445b7d3/html5/thumbnails/19.jpg)
Hidden Markov models
• For 7 consecutive days I observe Andreas eating
  – 3, 1, 3, 1, 2, 1, 3 (where n is the number of ice-creams eaten per day)
  – More formally
    • O = observation sequence
    • O = {x1=3, x2=1, x3=3, x4=1, x5=2, x6=1, x7=3}
    • Called observed variables
• The weather is:
  – S, C, S, C, C, C, S (S = Sunny = k1; C = Cloudy = k2)
  – More formally
    • St = state sequence
    • Latent variables
    • St = {z1=S, z2=C, z3=S, z4=C, z5=C, z6=C, z7=S}
• Emission probabilities
  – How many ice-creams Cloudy emits
  – How many ice-creams Sunny emits
![Page 20: An animal breeders introduction to HMM · An animal breeders introduction to HMM @hickeyjohn. Relationships between haplotypes. Hidden Markov models •Lets think of Genetics –Alleles](https://reader030.vdocuments.mx/reader030/viewer/2022041110/5f0fc1127e708231d445b7d3/html5/thumbnails/20.jpg)
Let's show this visually
![Page 21: An animal breeders introduction to HMM · An animal breeders introduction to HMM @hickeyjohn. Relationships between haplotypes. Hidden Markov models •Lets think of Genetics –Alleles](https://reader030.vdocuments.mx/reader030/viewer/2022041110/5f0fc1127e708231d445b7d3/html5/thumbnails/21.jpg)
Genetic interpretation
![Page 22: An animal breeders introduction to HMM · An animal breeders introduction to HMM @hickeyjohn. Relationships between haplotypes. Hidden Markov models •Lets think of Genetics –Alleles](https://reader030.vdocuments.mx/reader030/viewer/2022041110/5f0fc1127e708231d445b7d3/html5/thumbnails/22.jpg)
Elements of a HMM
• K – the number of states in the model
• M – the number of distinct observation symbols per state
– Can be discrete (e.g. 1, 2, or 3 ice-creams)
• A – the transition probabilities
• Φ – parameters controlling the distribution of the emission probabilities
  – Emission probabilities = P(xn | zn, Φ)
  – i.e. the probability of emitting a certain symbol at time n, given that the system is in state kj
• π - the initial state probabilities
• λ = (A, Φ, π)
![Page 23: An animal breeders introduction to HMM · An animal breeders introduction to HMM @hickeyjohn. Relationships between haplotypes. Hidden Markov models •Lets think of Genetics –Alleles](https://reader030.vdocuments.mx/reader030/viewer/2022041110/5f0fc1127e708231d445b7d3/html5/thumbnails/23.jpg)
3 aspects to be solved
• Given the observation sequence (O = O1, O2, …, OT) and the model (λ = (A, Φ, π))
  – How do we efficiently compute P(O|λ), the probability of the observation sequence given the model?
  – Solved via the Forward-Backward algorithm
• Given the observation sequence (O = O1, O2, …, OT) and the model (λ = (A, Φ, π))
  – How do we choose a corresponding state sequence (Q = q1, q2, …, qT) which is optimal in some meaningful way?
  – Solved via the Viterbi algorithm
• How do we adjust the model parameters (λ = (A, Φ, π)) to maximize P(O|λ)?
  – Solved via the Baum-Welch algorithm
  – (A special case of the EM algorithm)
![Page 24: An animal breeders introduction to HMM · An animal breeders introduction to HMM @hickeyjohn. Relationships between haplotypes. Hidden Markov models •Lets think of Genetics –Alleles](https://reader030.vdocuments.mx/reader030/viewer/2022041110/5f0fc1127e708231d445b7d3/html5/thumbnails/24.jpg)
Unfold A to get Trellis
Trellis structure gives computational efficiency
![Page 25: An animal breeders introduction to HMM · An animal breeders introduction to HMM @hickeyjohn. Relationships between haplotypes. Hidden Markov models •Lets think of Genetics –Alleles](https://reader030.vdocuments.mx/reader030/viewer/2022041110/5f0fc1127e708231d445b7d3/html5/thumbnails/25.jpg)
Forward-Backward Algorithm
[Figure: trellis annotated with forward probabilities (α), backward probabilities (β), emissions, and data]
• Compute P(O|λ) by summing over every possible state sequence of length n
• The trellis lets us do this without enumerating the sequences one by one
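To make the efficiency argument concrete, here is a minimal sketch that computes P(O|λ) twice: once by brute-force enumeration of all 2^n state sequences, and once with the forward recursion over the trellis. As an assumption for illustration, it combines the transition matrix from the weather example with the emission table from the "Starting values" slide and a 0.5/0.5 initial distribution; the function names are hypothetical.

```python
from itertools import product
from math import isclose

STATES = ("C", "S")
pi = {"C": 0.5, "S": 0.5}                                   # assumed initial probabilities
A = {"C": {"C": 0.9, "S": 0.1}, "S": {"C": 0.7, "S": 0.3}}  # A[from][to]
B = {"C": {1: 0.7, 2: 0.2, 3: 0.1},                         # B[state][ice-creams]
     "S": {1: 0.1, 2: 0.2, 3: 0.7}}


def brute_force_likelihood(obs):
    """Sum the joint probability over every possible state sequence: K**n terms."""
    total = 0.0
    for path in product(STATES, repeat=len(obs)):
        p = pi[path[0]] * B[path[0]][obs[0]]
        for t in range(1, len(obs)):
            p *= A[path[t - 1]][path[t]] * B[path[t]][obs[t]]
        total += p
    return total


def forward_likelihood(obs):
    """Forward recursion: K*K work per time step instead of K**n overall."""
    alpha = {k: pi[k] * B[k][obs[0]] for k in STATES}
    for x in obs[1:]:
        alpha = {k: B[k][x] * sum(alpha[j] * A[j][k] for j in STATES)
                 for k in STATES}
    return sum(alpha.values())


obs = [3, 1, 3, 1, 2, 1, 3]
print(isclose(forward_likelihood(obs), brute_force_likelihood(obs)))  # True
```

Both routes give the same number, but the enumeration doubles in cost with every extra day while the forward pass grows linearly.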
![Page 26: An animal breeders introduction to HMM · An animal breeders introduction to HMM @hickeyjohn. Relationships between haplotypes. Hidden Markov models •Lets think of Genetics –Alleles](https://reader030.vdocuments.mx/reader030/viewer/2022041110/5f0fc1127e708231d445b7d3/html5/thumbnails/26.jpg)
Viterbi Algorithm
• Forward algorithm sums over all paths
• Viterbi algorithm finds the most likely path
  – The most likely path is the shortest route through the trellis (the smallest sum of negative log-probabilities)
  – Not a fan of Viterbi, prefer to use all paths weighted by their probability
  – Genetics = gives the most likely haplotype
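The "shortest route" view can be sketched as follows. Each edge in the trellis is given a cost equal to the negative log of its transition probability times its emission probability, and Viterbi keeps only the cheapest route into each state. The parameters are the same illustrative assumption as before (weather-example transitions, "Starting values" emissions, 0.5/0.5 start), and `viterbi` and `path_cost` are hypothetical names.

```python
from itertools import product
from math import exp, isclose, log

STATES = ("C", "S")
pi = {"C": 0.5, "S": 0.5}                                   # assumed initial probabilities
A = {"C": {"C": 0.9, "S": 0.1}, "S": {"C": 0.7, "S": 0.3}}  # A[from][to]
B = {"C": {1: 0.7, 2: 0.2, 3: 0.1},
     "S": {1: 0.1, 2: 0.2, 3: 0.7}}


def path_cost(path, obs):
    """Negative log of the joint probability of one state path and the data."""
    cost = -log(pi[path[0]]) - log(B[path[0]][obs[0]])
    for t in range(1, len(obs)):
        cost += -log(A[path[t - 1]][path[t]]) - log(B[path[t]][obs[t]])
    return cost


def viterbi(obs):
    """Return the cheapest (most probable) state path and its cost."""
    cost = {k: -log(pi[k]) - log(B[k][obs[0]]) for k in STATES}
    back = []  # back[t][k] = best predecessor of state k at step t+1
    for x in obs[1:]:
        new_cost, pointers = {}, {}
        for k in STATES:
            best_prev = min(STATES, key=lambda j: cost[j] - log(A[j][k]))
            new_cost[k] = cost[best_prev] - log(A[best_prev][k]) - log(B[k][x])
            pointers[k] = best_prev
        back.append(pointers)
        cost = new_cost
    last = min(STATES, key=lambda k: cost[k])
    path = [last]
    for pointers in reversed(back):  # trace the stored pointers backwards
        path.append(pointers[path[-1]])
    return path[::-1], cost[last]


obs = [3, 1, 3, 1, 2, 1, 3]
best_path, best_cost = viterbi(obs)
brute_cost = min(path_cost(p, obs) for p in product(STATES, repeat=len(obs)))
print(best_path, exp(-best_cost))
```

Viterbi returns a single path and discards the probability mass of all the others, which is the reason given above for preferring forward-backward weights.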
![Page 27: An animal breeders introduction to HMM · An animal breeders introduction to HMM @hickeyjohn. Relationships between haplotypes. Hidden Markov models •Lets think of Genetics –Alleles](https://reader030.vdocuments.mx/reader030/viewer/2022041110/5f0fc1127e708231d445b7d3/html5/thumbnails/27.jpg)
Starting values
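Emission probabilities

| n Ice-creams | 1 | 2 | 3 |
|---|---|---|---|
| Cloudy emission | 0.7 | 0.2 | 0.1 |
| Sunny emission | 0.1 | 0.2 | 0.7 |

Transition probabilities

| From \ To | Cloudy | Sunny |
|---|---|---|
| Cloudy | 0.49 | 0.51 |
| Sunny | 0.51 | 0.49 |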
![Page 28: An animal breeders introduction to HMM · An animal breeders introduction to HMM @hickeyjohn. Relationships between haplotypes. Hidden Markov models •Lets think of Genetics –Alleles](https://reader030.vdocuments.mx/reader030/viewer/2022041110/5f0fc1127e708231d445b7d3/html5/thumbnails/28.jpg)
Back to the ice-cream
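• Data

| Day n | Ice-creams |
|---|---|
| 1 | 2 |
| 2 | 3 |
| 3 | 3 |
| 4 | 2 |
| 5 | 3 |
| 6 | 2 |
| 7 | 3 |
| 8 | 2 |
| 9 | 2 |
| 10 | 3 |
| 11 | 1 |
| 12 | 3 |
| 13 | 3 |
| 14 | 1 |
| 15 | 1 |
| 16 | 1 |
| 17 | 2 |
| 18 | 1 |
| 19 | 1 |
| 20 | 1 |
| 21 | 2 |
| 22 | 1 |
| 23 | 1 |
| 24 | 1 |
| 25 | 2 |
| 26 | 3 |
| 27 | 3 |
| 28 | 2 |
| 29 | 3 |
| 30 | 2 |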
![Page 29: An animal breeders introduction to HMM · An animal breeders introduction to HMM @hickeyjohn. Relationships between haplotypes. Hidden Markov models •Lets think of Genetics –Alleles](https://reader030.vdocuments.mx/reader030/viewer/2022041110/5f0fc1127e708231d445b7d3/html5/thumbnails/29.jpg)
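Results

| Day n | Ice-creams | ProbC | ProbS |
|---|---|---|---|
| 1 | 2 | 0.06 | 0.94 |
| 2 | 3 | 0.00 | 1.00 |
| 3 | 3 | 0.00 | 1.00 |
| 4 | 2 | 0.00 | 1.00 |
| 5 | 3 | 0.00 | 1.00 |
| 6 | 2 | 0.00 | 1.00 |
| 7 | 3 | 0.00 | 1.00 |
| 8 | 2 | 0.01 | 0.99 |
| 9 | 2 | 0.01 | 0.99 |
| 10 | 3 | 0.00 | 1.00 |
| 11 | 1 | 0.10 | 0.90 |
| 12 | 3 | 0.00 | 1.00 |
| 13 | 3 | 0.00 | 1.00 |
| 14 | 1 | 0.92 | 0.08 |
| 15 | 1 | 0.99 | 0.01 |
| 16 | 1 | 1.00 | 0.00 |
| 17 | 2 | 0.98 | 0.02 |
| 18 | 1 | 1.00 | 0.00 |
| 19 | 1 | 1.00 | 0.00 |
| 20 | 1 | 1.00 | 0.00 |
| 21 | 2 | 0.98 | 0.02 |
| 22 | 1 | 1.00 | 0.00 |
| 23 | 1 | 0.99 | 0.01 |
| 24 | 1 | 0.95 | 0.05 |
| 25 | 2 | 0.33 | 0.67 |
| 26 | 3 | 0.00 | 1.00 |
| 27 | 3 | 0.00 | 1.00 |
| 28 | 2 | 0.00 | 1.00 |
| 29 | 3 | 0.00 | 1.00 |
| 30 | 2 | 0.04 | 0.96 |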
![Page 30: An animal breeders introduction to HMM · An animal breeders introduction to HMM @hickeyjohn. Relationships between haplotypes. Hidden Markov models •Lets think of Genetics –Alleles](https://reader030.vdocuments.mx/reader030/viewer/2022041110/5f0fc1127e708231d445b7d3/html5/thumbnails/30.jpg)
Parameters
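Emission probabilities

| n Ice-creams | 1 | 2 | 3 |
|---|---|---|---|
| Cloudy emission | 0.79 | 0.21 | 0.00 |
| Sunny emission | 0.06 | 0.37 | 0.57 |

Transition probabilities

| From \ To | Cloudy | Sunny |
|---|---|---|
| Cloudy | 0.89 | 0.11 |
| Sunny | 0.07 | 0.93 |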
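Per-day state probabilities of the kind shown in the Results table come from combining forward and backward passes. The sketch below runs a scaled forward-backward pass over the 30-day record using the fitted emission and transition values from this slide. The 0.5/0.5 initial distribution and the function names are assumptions, and the slides do not say exactly how their numbers were produced, so the output should not be expected to match the table digit for digit.

```python
STATES = ("C", "S")
pi = {"C": 0.5, "S": 0.5}  # assumption: the deck does not give the fitted pi
A = {"C": {"C": 0.89, "S": 0.11}, "S": {"C": 0.07, "S": 0.93}}  # A[from][to]
B = {"C": {1: 0.79, 2: 0.21, 3: 0.00},
     "S": {1: 0.06, 2: 0.37, 3: 0.57}}
ice_creams = [2, 3, 3, 2, 3, 2, 3, 2, 2, 3, 1, 3, 3, 1, 1,
              1, 2, 1, 1, 1, 2, 1, 1, 1, 2, 3, 3, 2, 3, 2]


def normalise(d):
    """Rescale a dict of non-negative weights so that they sum to 1."""
    total = sum(d.values())
    return {k: v / total for k, v in d.items()}


def posterior_state_probabilities(obs):
    """P(state on day t | all observations), via scaled forward-backward."""
    # Forward pass: alphas[t][k] is proportional to P(x_1..x_t, z_t = k).
    alphas = [normalise({k: pi[k] * B[k][obs[0]] for k in STATES})]
    for x in obs[1:]:
        prev = alphas[-1]
        alphas.append(normalise(
            {k: B[k][x] * sum(prev[j] * A[j][k] for j in STATES) for k in STATES}))
    # Backward pass: betas[t][j] is proportional to P(x_{t+1}..x_T | z_t = j).
    betas = [{k: 1.0 for k in STATES}]
    for x in reversed(obs[1:]):
        nxt = betas[-1]
        betas.append(normalise(
            {j: sum(A[j][k] * B[k][x] * nxt[k] for k in STATES) for j in STATES}))
    betas.reverse()
    # Combine and renormalise per day.
    return [normalise({k: a[k] * b[k] for k in STATES})
            for a, b in zip(alphas, betas)]


posterior = posterior_state_probabilities(ice_creams)
for day, probs in enumerate(posterior, start=1):
    print(day, round(probs["C"], 2), round(probs["S"], 2))
```

Rescaling at every step keeps the numbers away from underflow; the per-step constants cancel when each day's probabilities are renormalised.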
![Page 31: An animal breeders introduction to HMM · An animal breeders introduction to HMM @hickeyjohn. Relationships between haplotypes. Hidden Markov models •Lets think of Genetics –Alleles](https://reader030.vdocuments.mx/reader030/viewer/2022041110/5f0fc1127e708231d445b7d3/html5/thumbnails/31.jpg)
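How is this imputation?

Day 15 is masked (#) before fitting; the true record is shown alongside for comparison.

| Day n | Ice-creams (day 15 masked) | Ice-creams (true) | ProbC | ProbS |
|---|---|---|---|---|
| 1 | 2 | 2 | 0.06 | 0.94 |
| 2 | 3 | 3 | 0.00 | 1.00 |
| 3 | 3 | 3 | 0.00 | 1.00 |
| 4 | 2 | 2 | 0.00 | 1.00 |
| 5 | 3 | 3 | 0.00 | 1.00 |
| 6 | 2 | 2 | 0.00 | 1.00 |
| 7 | 3 | 3 | 0.00 | 1.00 |
| 8 | 2 | 2 | 0.01 | 0.99 |
| 9 | 2 | 2 | 0.01 | 0.99 |
| 10 | 3 | 3 | 0.00 | 1.00 |
| 11 | 1 | 1 | 0.10 | 0.90 |
| 12 | 3 | 3 | 0.00 | 1.00 |
| 13 | 3 | 3 | 0.00 | 1.00 |
| 14 | 1 | 1 | 0.92 | 0.08 |
| 15 | # | 1 | 0.99 | 0.01 |
| 16 | 1 | 1 | 1.00 | 0.00 |
| 17 | 2 | 2 | 0.98 | 0.02 |
| 18 | 1 | 1 | 1.00 | 0.00 |
| 19 | 1 | 1 | 1.00 | 0.00 |
| 20 | 1 | 1 | 1.00 | 0.00 |
| 21 | 2 | 2 | 0.98 | 0.02 |
| 22 | 1 | 1 | 1.00 | 0.00 |
| 23 | 1 | 1 | 0.99 | 0.01 |
| 24 | 1 | 1 | 0.95 | 0.05 |
| 25 | 2 | 2 | 0.33 | 0.67 |
| 26 | 3 | 3 | 0.00 | 1.00 |
| 27 | 3 | 3 | 0.00 | 1.00 |
| 28 | 2 | 2 | 0.00 | 1.00 |
| 29 | 3 | 3 | 0.00 | 1.00 |
| 30 | 2 | 2 | 0.04 | 0.96 |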
![Page 32: An animal breeders introduction to HMM · An animal breeders introduction to HMM @hickeyjohn. Relationships between haplotypes. Hidden Markov models •Lets think of Genetics –Alleles](https://reader030.vdocuments.mx/reader030/viewer/2022041110/5f0fc1127e708231d445b7d3/html5/thumbnails/32.jpg)
Imputation
| n Ice-creams | 1 | 2 | 3 |
|---|---|---|---|
| Cloudy emission | 0.79 | 0.21 | 0.00 |
| Sunny emission | 0.06 | 0.37 | 0.57 |

| Day | ProbC | ProbS |
|---|---|---|
| 15 | 0.99 | 0.01 |

Expected number of ice-creams Andreas eats on day 15 =

$$0.99 \times \left[(0.79 \times 1) + (0.21 \times 2) + (0.00 \times 3)\right] + 0.01 \times \left[(0.06 \times 1) + (0.37 \times 2) + (0.57 \times 3)\right] = 1.22$$

• True value = 1 ice-cream on day 15
• Predict on the basis of the hidden states
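The day-15 calculation is a probability-weighted average over the hidden states. A minimal sketch using only the numbers on this slide (variable names are illustrative) also shows the single most probable count, which is a different summary from the expected value:

```python
# Posterior state probabilities for day 15 and the fitted emission table.
state_prob = {"C": 0.99, "S": 0.01}
emission = {"C": {1: 0.79, 2: 0.21, 3: 0.00},
            "S": {1: 0.06, 2: 0.37, 3: 0.57}}

# Expected number of ice-creams: weight each state's mean emission
# by the probability of being in that state.
expected = sum(
    state_prob[k] * sum(n * p for n, p in emission[k].items())
    for k in state_prob
)

# Marginal probability of each count, and the most probable count.
marginal = {n: sum(state_prob[k] * emission[k][n] for k in state_prob)
            for n in (1, 2, 3)}
most_likely = max(marginal, key=marginal.get)

print(round(expected, 2))  # 1.22
print(most_likely)         # 1
```

The expected value (1.22) plays the role of a dosage in genotype imputation, while the most probable count (1) corresponds to a best-guess call.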
![Page 33: An animal breeders introduction to HMM · An animal breeders introduction to HMM @hickeyjohn. Relationships between haplotypes. Hidden Markov models •Lets think of Genetics –Alleles](https://reader030.vdocuments.mx/reader030/viewer/2022041110/5f0fc1127e708231d445b7d3/html5/thumbnails/33.jpg)
HMM for Genetics
• fastPHASE
• Beagle
• MaCH
• Impute2
• AlphaPhase (combined with heuristics)
• AlphaImpute (combined with heuristics)
• Haplotyping and imputation
![Page 34: An animal breeders introduction to HMM · An animal breeders introduction to HMM @hickeyjohn. Relationships between haplotypes. Hidden Markov models •Lets think of Genetics –Alleles](https://reader030.vdocuments.mx/reader030/viewer/2022041110/5f0fc1127e708231d445b7d3/html5/thumbnails/34.jpg)
fastPHASE
• The model
  – Haploid gametes underlie diploid genotypes
    • Hidden Markov process
  – Haploid gametes in the present population derive from ancient founder haplotypes
    • Hidden process
  – Founder haplotypes can be considered to be cluster medoids
  – fastPHASE can be considered to be analogous to a mixture model
![Page 35: An animal breeders introduction to HMM · An animal breeders introduction to HMM @hickeyjohn. Relationships between haplotypes. Hidden Markov models •Lets think of Genetics –Alleles](https://reader030.vdocuments.mx/reader030/viewer/2022041110/5f0fc1127e708231d445b7d3/html5/thumbnails/35.jpg)
Now put in the genetics
– Alleles (A=0, a=1)
– Genotype data (0, 1, or 2)
10100111011100111001110011
11110222111111111111121021
01010111100011000110011010
The phasing problem – split diplotype into a pair of haplotypes
The founder haplotype mosaic problem
![Page 36: An animal breeders introduction to HMM · An animal breeders introduction to HMM @hickeyjohn. Relationships between haplotypes. Hidden Markov models •Lets think of Genetics –Alleles](https://reader030.vdocuments.mx/reader030/viewer/2022041110/5f0fc1127e708231d445b7d3/html5/thumbnails/36.jpg)
The genetics
• Alleles are correlated along the haploid gametes
  – Closer alleles are more correlated
• fastPHASE is an IBD probability model
  – Each allele of each gamete has a probability of deriving from each founder haplotype
• With this information we can do lots of things
  – Phase
  – Impute
  – Build genomic relationship matrices
![Page 37: An animal breeders introduction to HMM · An animal breeders introduction to HMM @hickeyjohn. Relationships between haplotypes. Hidden Markov models •Lets think of Genetics –Alleles](https://reader030.vdocuments.mx/reader030/viewer/2022041110/5f0fc1127e708231d445b7d3/html5/thumbnails/37.jpg)
30 animal example
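Founder haplotypes

| | M1 | M2 | M3 | M4 | M5 | M6 | M7 | M8 | M9 | M10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Founder1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Founder2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| Founder3 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |
| Founder4 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 |

Genotypes of the 30 animals

| ID | M1 | M2 | M3 | M4 | M5 | M6 | M7 | M8 | M9 | M10 |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 1 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 1 |
| 2 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 |
| 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 |
| 5 | 0 | 2 | 0 | 2 | 0 | 2 | 0 | 1 | 0 | 1 |
| 6 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 1 | 1 |
| 7 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 |
| 8 | 0 | 2 | 0 | 2 | 0 | 2 | 0 | 1 | 0 | 0 |
| 9 | 0 | 2 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 |
| 10 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 |
| 11 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 |
| 12 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 |
| 13 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 14 | 0 | 1 | 0 | 2 | 0 | 1 | 0 | 0 | 0 | 0 |
| 15 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 |
| 16 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 17 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 |
| 18 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 19 | 1 | 2 | 1 | 2 | 1 | 2 | 1 | 2 | 1 | 1 |
| 20 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 |
| 21 | 1 | 2 | 1 | 2 | 1 | 2 | 1 | 2 | 0 | 1 |
| 22 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 |
| 23 | 1 | 2 | 1 | 2 | 1 | 2 | 0 | 1 | 0 | 0 |
| 24 | 0 | 1 | 0 | 0 | 0 | 2 | 1 | 1 | 0 | 0 |
| 25 | 2 | 2 | 2 | 2 | 2 | 2 | 1 | 2 | 1 | 2 |
| 26 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 |
| 27 | 0 | 1 | 0 | 1 | 0 | 1 | 1 | 2 | 0 | 1 |
| 28 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 29 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 |
| 30 | 0 | 2 | 0 | 2 | 0 | 2 | 0 | 1 | 0 | 0 |

Haplotypes of animals 1 to 4 (two gametes per animal)

| ID | M1 | M2 | M3 | M4 | M5 | M6 | M7 | M8 | M9 | M10 |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 |
| 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 |
| 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 |
| 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |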
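The link between the genotype and haplotype tables is simple: each genotype is the marker-by-marker sum of the animal's two gametes. The sketch below checks that for animals 1 to 4 and counts how many distinct phasings each genotype allows when nothing else is known. The helper names are hypothetical.

```python
# Genotypes (0/1/2) and the two gametes for animals 1 to 4 in the example.
genotypes = {
    1: [1, 1, 2, 2, 2, 2, 2, 2, 2, 1],
    2: [0, 0, 1, 1, 1, 1, 1, 1, 1, 0],
    3: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    4: [0, 1, 0, 1, 0, 1, 0, 0, 0, 0],
}
haplotypes = {
    1: ([0, 0, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1, 0]),
    2: ([0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 1, 1, 1, 1, 1, 1, 1, 0]),
    3: ([0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]),
    4: ([0, 1, 0, 1, 0, 1, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]),
}


def is_consistent(genotype, hap_a, hap_b):
    """A haplotype pair explains a genotype if the alleles sum to it at every marker."""
    return all(g == a + b for g, a, b in zip(genotype, hap_a, hap_b))


def n_phasings(genotype):
    """Number of distinct unordered haplotype pairs compatible with a genotype.

    Homozygous markers (0 or 2) are unambiguous; each heterozygous marker (1)
    can be split two ways, and swapping the two gametes gives the same pair.
    """
    n_het = genotype.count(1)
    return 1 if n_het == 0 else 2 ** (n_het - 1)


for animal, geno in genotypes.items():
    hap_a, hap_b = haplotypes[animal]
    print(animal, is_consistent(geno, hap_a, hap_b), n_phasings(geno))
```

The count doubles with every additional heterozygous marker, which is why phasing needs a model of how haplotypes are shared across the population.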
![Page 38: An animal breeders introduction to HMM · An animal breeders introduction to HMM @hickeyjohn. Relationships between haplotypes. Hidden Markov models •Lets think of Genetics –Alleles](https://reader030.vdocuments.mx/reader030/viewer/2022041110/5f0fc1127e708231d445b7d3/html5/thumbnails/38.jpg)
The parameters
• Slightly different to a standard HMM
  – α, r, Θ
• α and r are partitions of the transition matrix
  – r = recombination rate between markers
    • i.e. the probability of a transition happening
  – α = the frequency of each founder haplotype at each marker
    • i.e. given there is a transition, where do I transition to
• Θ = the emission probability
  – The frequency of allele 1 at position k in founder haplotype j
  – (Emit a 1 as opposed to a 0)
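One way to read "α and r are partitions of the transition matrix" is sketched below: with probability 1 - r the gamete stays on the same founder haplotype, and with probability r it jumps, landing on founder k with probability α_k. This follows the wording of the slide rather than the exact parameterisation used inside fastPHASE, so treat it as an assumption. The function name is hypothetical; the example values are the first R entry and the second Alpha column from the "HMM parameters" slide.

```python
def transition_matrix(r, alpha):
    """Build a K x K transition matrix between two adjacent markers.

    r     : probability that a transition (recombination) happens
    alpha : alpha[k] = probability of landing on founder k given a transition
    """
    K = len(alpha)
    return [
        [(1.0 - r) * (1.0 if i == j else 0.0) + r * alpha[j] for j in range(K)]
        for i in range(K)
    ]


r_m1_m2 = 0.0004                    # first entry of R (assumed to be the M1-M2 interval)
alpha_m2 = [0.01, 0.00, 0.97, 0.02]  # Alpha column for marker 2

T = transition_matrix(r_m1_m2, alpha_m2)
for row in T:
    print([round(v, 6) for v in row])
```

Storing r and α instead of a full K x K matrix per marker interval keeps the number of parameters linear in K.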
![Page 39: An animal breeders introduction to HMM · An animal breeders introduction to HMM @hickeyjohn. Relationships between haplotypes. Hidden Markov models •Lets think of Genetics –Alleles](https://reader030.vdocuments.mx/reader030/viewer/2022041110/5f0fc1127e708231d445b7d3/html5/thumbnails/39.jpg)
HMM parameters
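Theta

| Row | M1 | M2 | M3 | M4 | M5 | M6 | M7 | M8 | M9 | M10 |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.00 | 0.80 | 0.00 | 0.83 | 0.00 | 0.80 | 0.17 | 0.94 | 0.00 | 0.70 |
| 2 | 0.79 | 0.86 | 1.00 | 1.00 | 1.00 | 1.00 | 0.86 | 1.00 | 0.57 | 0.50 |
| 3 | 0.00 | 0.04 | 0.00 | 0.04 | 0.00 | 0.07 | 0.00 | 0.04 | 0.00 | 0.00 |
| 4 | 0.00 | 0.99 | 0.00 | 0.96 | 0.00 | 1.00 | 0.03 | 0.24 | 0.00 | 0.01 |

Alpha

| Row | M1 | M2 | M3 | M4 | M5 | M6 | M7 | M8 | M9 | M10 |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.24 | 0.01 | 0.00 | 0.00 | 0.14 | 0.18 | 0.88 | 0.90 | 0.07 | 0.06 |
| 2 | 0.25 | 0.00 | 0.09 | 0.00 | 0.00 | 0.00 | 0.00 | 0.04 | 0.00 | 0.11 |
| 3 | 0.28 | 0.97 | 0.90 | 0.99 | 0.61 | 0.10 | 0.02 | 0.01 | 0.27 | 0.26 |
| 4 | 0.23 | 0.02 | 0.01 | 0.01 | 0.25 | 0.72 | 0.10 | 0.05 | 0.66 | 0.58 |

R (one value per interval between adjacent markers)

| 0.0004 | 0.0004 | 0.0005 | 0.0001 | 0.0004 | 0.0017 | 0.0035 | 0.0004 | 0.0002 |
|---|---|---|---|---|---|---|---|---|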
![Page 40: An animal breeders introduction to HMM · An animal breeders introduction to HMM @hickeyjohn. Relationships between haplotypes. Hidden Markov models •Lets think of Genetics –Alleles](https://reader030.vdocuments.mx/reader030/viewer/2022041110/5f0fc1127e708231d445b7d3/html5/thumbnails/40.jpg)
Ancestral haplotypes
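Theta

| | M1 | M2 | M3 | M4 | M5 | M6 | M7 | M8 | M9 | M10 |
|---|---|---|---|---|---|---|---|---|---|---|
| F4 | 0.00 | 0.80 | 0.00 | 0.83 | 0.00 | 0.80 | 0.17 | 0.94 | 0.00 | 0.70 |
| F2 | 0.79 | 0.86 | 1.00 | 1.00 | 1.00 | 1.00 | 0.86 | 1.00 | 0.57 | 0.50 |
| F1 | 0.00 | 0.04 | 0.00 | 0.04 | 0.00 | 0.07 | 0.00 | 0.04 | 0.00 | 0.00 |
| F4 | 0.00 | 0.99 | 0.00 | 0.96 | 0.00 | 1.00 | 0.03 | 0.24 | 0.00 | 0.01 |

True founder haplotypes

| | M1 | M2 | M3 | M4 | M5 | M6 | M7 | M8 | M9 | M10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Founder1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Founder2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| Founder3 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |
| Founder4 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 |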
![Page 41: An animal breeders introduction to HMM · An animal breeders introduction to HMM @hickeyjohn. Relationships between haplotypes. Hidden Markov models •Lets think of Genetics –Alleles](https://reader030.vdocuments.mx/reader030/viewer/2022041110/5f0fc1127e708231d445b7d3/html5/thumbnails/41.jpg)
Impute missing marker
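• Combine output probabilities with the parameters of the model
  – With Theta, the emission probabilities

Individual 29, gamete 1: probability of deriving from each founder haplotype

| ID | Founder | M1 | M2 | M3 | M4 | M5 | M6 | M7 | M8 | M9 | M10 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 29 | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| 29 | 2 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| 29 | 3 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| 29 | 4 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |

Individual 29, gamete 2: probability of deriving from each founder haplotype

| ID | Founder | M1 | M2 | M3 | M4 | M5 | M6 | M7 | M8 | M9 | M10 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 29 | 1 | 0.79 | 0.79 | 0.79 | 0.79 | 0.79 | 0.79 | 0.85 | 0.98 | 0.98 | 0.98 |
| 29 | 2 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 | 0.01 | 0.01 |
| 29 | 3 | 0.21 | 0.21 | 0.21 | 0.21 | 0.21 | 0.21 | 0.15 | 0.01 | 0.01 | 0.01 |
| 29 | 4 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |

Theta

| | M1 | M2 | M3 | M4 | M5 | M6 | M7 | M8 | M9 | M10 |
|---|---|---|---|---|---|---|---|---|---|---|
| F4 | 0.00 | 0.80 | 0.00 | 0.83 | 0.00 | 0.80 | 0.17 | 0.94 | 0.00 | 0.70 |
| F2 | 0.79 | 0.86 | 1.00 | 1.00 | 1.00 | 1.00 | 0.86 | 1.00 | 0.57 | 0.50 |
| F1 | 0.00 | 0.04 | 0.00 | 0.04 | 0.00 | 0.07 | 0.00 | 0.04 | 0.00 | 0.00 |
| F4 | 0.00 | 0.99 | 0.00 | 0.96 | 0.00 | 1.00 | 0.03 | 0.24 | 0.00 | 0.01 |

True genotype of individual 29

| ID | M1 | M2 | M3 | M4 | M5 | M6 | M7 | M8 | M9 | M10 |
|---|---|---|---|---|---|---|---|---|---|---|
| 29 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 |

• Genotype of individual 29 at marker 1
• Gamete 1 comes from founder 3
• Gamete 2 is a combination of founders 1 and 3
• Founders 1 and 3 emit a 0
• True genotype is a 0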
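The combination step can be sketched as an expected allele dosage: for each gamete, weight each founder's Theta value by the probability that the gamete derives from that founder, then add the two gametes. The sketch assumes that rows 1 to 4 of the probability tables line up with rows 1 to 4 of Theta, and that the first table is gamete 1 and the second gamete 2; `expected_dosage` is a hypothetical helper.

```python
# Theta: probability that founder haplotype k emits allele 1 at each marker.
theta = [
    [0.00, 0.80, 0.00, 0.83, 0.00, 0.80, 0.17, 0.94, 0.00, 0.70],
    [0.79, 0.86, 1.00, 1.00, 1.00, 1.00, 0.86, 1.00, 0.57, 0.50],
    [0.00, 0.04, 0.00, 0.04, 0.00, 0.07, 0.00, 0.04, 0.00, 0.00],
    [0.00, 0.99, 0.00, 0.96, 0.00, 1.00, 0.03, 0.24, 0.00, 0.01],
]
# Individual 29: probability that each gamete derives from each founder, per marker.
gamete1 = [[0.0] * 10, [0.0] * 10, [1.0] * 10, [0.0] * 10]
gamete2 = [
    [0.79] * 6 + [0.85, 0.98, 0.98, 0.98],
    [0.00] * 7 + [0.01, 0.01, 0.01],
    [0.21] * 6 + [0.15, 0.01, 0.01, 0.01],
    [0.0] * 10,
]


def expected_dosage(marker):
    """Expected count of allele 1 at a marker (0-based index), summed over both gametes."""
    return sum(
        (gamete1[k][marker] + gamete2[k][marker]) * theta[k][marker]
        for k in range(len(theta))
    )


true_genotype = [0, 0, 0, 0, 0, 0, 0, 1, 0, 1]
for m, truth in enumerate(true_genotype):
    print(f"M{m + 1}: dosage {expected_dosage(m):.2f}, true genotype {truth}")
```

At marker 1 every contributing founder has Theta = 0, so the expected dosage is 0, in line with the true genotype given above.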
![Page 42: An animal breeders introduction to HMM · An animal breeders introduction to HMM @hickeyjohn. Relationships between haplotypes. Hidden Markov models •Lets think of Genetics –Alleles](https://reader030.vdocuments.mx/reader030/viewer/2022041110/5f0fc1127e708231d445b7d3/html5/thumbnails/42.jpg)
HMM of fastPHASE in a nutshell
[Figure: fastPHASE output with columns Index, ID, Paternal gamete, Maternal gamete, and Probabilities for marker 1]
For individual 29 it is highly probable that its two gametes derive from founder haplotype 2