IntroductiontoTriangulatedGraphs
TandyWarnow
Topicsfortoday
• Triangulatedgraphs:theoremsandalgorithms(Chapters11.3and11.9)
• Examplesoftriangulatedgraphsinphylogenyestimation(Chapters4.8,11.3-11.5)
Triangulated(i.e.,Chordal)Graphs
• Definition:Agraphistriangulatedifithasnosimplecyclesofsizefourormore.
DCMsareDivide-and-Conquerstrategies!
DCMsforphylogenyreconstruction
• Defineatriangulatedgraphsothatitsverticescorrespondto
theinputtaxa(orsequences)• Decomposethegraphintooverlappingsubgraphs,thus
decomposingthetaxaintooverlappingsubsets.• Applythe“basemethod” toeachsubsetoftaxa,toconstruct
asubsettree• Applyasupertreemethodtothesubsettreestoobtaina
singletreeonthefullsetoftaxa.
DCMs(Disk-CoveringMethods)
• DCMsforpolynomialtimemethodsimprovetopologicalaccuracy(empiricalobservation)andhaveprovabletheoreticalguaranteesunderMarkovmodelsofevolution.
• DCMsforhardoptimizationproblemsreducerunningtimeneededtoachievegoodlevelsofaccuracy(empiricalobservation)
DecomposingTriangulatedGraphs
Max Clique Decomposition Separator-component Decomposition
Given:TriangulatedgraphG=(V,E)Output:DecompositionoftheverticesintooverlappingsubsetsRequire:Polynomialtime!Technique:Usespecialpropertiesabouttriangulatedgraphs
SimplicialVertices
Definition:LetG=(V,E)beagraph,andletvbeavertexinV.Thenvissimplicialifitssetofneighbors(i.e.,Γ(v))isaclique.Todo:• Giveexampleofagraphthathasnosimplicialvertices.
• Giveexampleofagraphwhereeveryvertexissimplicial.
PerfectEliminationOrdering
Definition:LetG=(V,E)beagraphonnvertices.Aperfecteliminationorderingisanorderingoftheverticesv1,v2,…,vnofGsothateachvertexviissimplicialinthegraphinducedon{vi+1,vi+2,…,vn}.Theorems:• Everytriangulatedgraphhasasimplicialvertex.• Infact,everytriangulatedgraphthatisnotacliquehastwonon-adjacentsimplicialvertices.
PerfectEliminationOrdering
Definition:LetG=(V,E)beagraphonnvertices.Aperfecteliminationorderingisanorderingoftheverticesv1,v2,…,vnofGsothateachvertexviissimplicialinthegraphinducedon{vi+1,vi+2,…,vn}.Theorems(Rose1970):AgraphGistriangulatedifandonlyifithasaperfecteliminationordering.Furthermore,givenatriangulatedgraph,aperfecteliminationorderingcanbefoundinpolynomialtime.
Somepropertiesofchordalgraphs
• Theorem:EverychordalgraphG=(V,E)hasatmost|V|maximalcliques,andthesecanbefoundinpolynomialtime:– Maxcliquedecomposition.
• Provethisusingtheexistenceofaperfecteliminationordering.
Somepropertiesofchordalgraphs
• Theorem:EverychordalgraphG=(V,E)hasatmost|V|maximalcliques,andthesecanbefoundinpolynomialtime:– Maxcliquedecomposition.
• Provethisusingtheexistenceofaperfecteliminationordering.
Somepropertiesofchordalgraphs
• Everychordalgraphthatisnotacliquehasavertexseparatorthatisamaximalclique,anditcanbefoundinpolynomialtime:– Separator-componentdecomposition.
Somepropertiesofchordalgraphs
• Everychordalgraphhasatmostnmaximalcliques,andthesecanbefoundinpolynomialtime:Maxcliquedecomposition.
• Everychordalgraphthatisnotacliquehasavertexseparatorthatisamaximalclique,anditcanbefoundinpolynomialtime:Separator-componentdecomposition.
DecomposingTriangulatedGraphs
Max Clique Decomposition Separator-component Decomposition
Given:TriangulatedgraphG=(V,E)Output:DecompositionoftheverticesintooverlappingsubsetsRequire:Polynomialtime!Technique:Usespecialpropertiesabouttriangulatedgraphs
DCMsareDivide-and-Conquerstrategies!
Howtocombinesubsettrees?
• Everytriangulatedgraphhasaperfecteliminationordering:– enablesustomergecorrectsubtreesandgetacorrectsupertreeback,ifsubtreesarebigenough(sothattheycontainalltheshortquartettrees).
TriangulatedGraphsandTrees
Theorem(Gravil1974,Buneman1974):AgraphGistriangulatedifandonlyifGistheintersectiongraphofasetofsubtreesofatree.Proof:Onedirectioniseasy,andtheotherisnot…
ExamplesofTriangulatedGraphs
• ThresholdgraphsTG(D,q):Disadditiveandqisanyrealnumber,and(x,y)isanedgeifandonlyifD[x,y]<=q.
• ShortSubtreeGraphsSSG(T,w):Tisatreewithedge-weightingw,andeveryshortquartetcontributesa4-clique.
• Character-stateintersectiongraphsfromperfectphylogenies.
ExamplesofTriangulatedGraphs
• ThresholdgraphsTG(D,q):Disadditiveandqisanyrealnumber,and(x,y)isanedgeifandonlyifD[x,y]<=q.
• Theorem:ForalladditivematricesDandthresholdsq,TG(D,q)istriangulated.
• Proof:Usethefactthatallsubtreeintersectiongraphsaretriangulated.
DCM1-boostingdistance-basedmethods[Nakhlehetal.ISMB2001]
• Theorem(Warnowetal.,SODA2001):DCM1-NJconvergestothetruetreefrompolynomiallengthsequences
NJ DCM1-NJ
0 400 800 1600 1200 No. Taxa
0
0.2
0.4
0.6
0.8
Erro
r Rat
e
ExamplesofTriangulatedGraphs
• ShortSubtreeGraphsSSG(T,w):Tisatreewithedge-weightingw,andeveryshortquartetcontributesa4-clique.
• Theorem:ForalltreesTwithedgeweightingw,SSG(T,w)isadditive.
• Proof:Usethefactthatallsubtreeintersectiongraphsaretriangulated.
Rec-I-DCM3 significantly improves performance
Comparison of TNT to Rec-I-DCM3(TNT) on one large dataset
0
0.02
0.04
0.06
0.08
0.1
0.12
0.14
0.16
0.18
0.2
0 4 8 12 16 20 24
Hours
Average MP score above
optimal, shown as a percentage of
the optimal
Current best techniques (TNT)
Rec-I-DCM3(TNT)
ExamplesofTriangulatedGraphs
• Character-stateintersectiongraphsfromperfectphylogenies.
• Theorem:Forallperfectphylogenies,thecharacterstateintersectiongraph(wherenodescorrespondtocharacterstatesandedgescorrespondtoanytwostatesatanynodeinthetree–internalandleaf)istriangulated.
• Proof:Usethefactthatallsubtreeintersectiongraphsaretriangulated.
“Homoplasy-Free”Evolution(perfectphylogenies)
YESNO
PerfectPhylogeny
• AphylogenyTforasetSoftaxaisaperfectphylogenyifeachstateofeachcharacteroccupiesasubtree(nocharacterhasback-mutationsorparallelevolution)
30
Perfectphylogenies,cont.
• A=(0,0),B=(0,1),C=(1,3),D=(1,2)hasaperfectphylogeny!
• A=(0,0),B=(0,1),C=(1,0),D=(1,1)doesnothaveaperfectphylogeny!
Aperfectphylogeny
• A=00• B=01• C=13• D=12• E=03• F=13
A
B
C
D
Aperfectphylogeny
• A=00• B=01• C=13• D=12• E=03• F=13
A
B
C
D
E F
ThePerfectPhylogenyProblem
• GivenasetSoftaxa(species,languages,etc.)determineifaperfectphylogenyTexistsforS.
• TheproblemofdeterminingwhetheraperfectphylogenyexistsisNP-hard(McMorrisetal.1994,Steel1991).
TriangulatedGraphs
• Agraphistriangulatedifithasnosimplecyclesofsizefourormore.
TriangulatedGraphs
Theorem(Gravil1974,Buneman1974):AgraphGistriangulatedifandonlyifGistheintersectiongraphofasetofsubtreesofatree.Proof:Onedirectioniseasy,andtheotherisnot…
PerfectPhylogeniesandTriangulatedColoredGraphs
• SupposeMisacharactermatrixandTisaperfectphylogenyforM.
• ThenletM’betheextensionofMtoincludetheadditional“species”addedattheinternalnodes.
• CharacterStateIntersectionGraphGbasedonM’:– ForeachcharacteralphaandstateiinM’,giveavertexv(alpha,i)andcolorthevertexwiththecolorforalpha.
– Putedgesbetweentwoverticesiftheyshareanyspecies.– Gistriangulatedandproperlycolored.
PerfectPhylogeniesandTriangulatedColoredGraphs
• SupposeMisacharactermatrixandTisaperfectphylogenyforM.
• ThenletM’betheextensionofMtoincludetheadditional“species”addedattheinternalnodes.
• ThecharacterstateintersectiongraphGbasedonM’istriangulatedandproperlycolored.(Why?)
• ButifwehadbaseditonMitmightnothavebeentriangulated.(Why?)
Aperfectphylogeny
• A=00• B=01• C=13• D=12• E=03• F=13Drawthecharacterstateintersectiongraph.
A
B
C
D
E F
Matrixwithaperfectphylogeny
c1c2c3s1321s2122s3113s4211
Draw the perfect phylogeny and compute the sequences at the internal nodes.
Matrixwithaperfectphylogeny
c1c2c3s1321s2122s3113s4211
Draw the character state intersection graph for the extended matrix (including the sequences at the internal nodes).
Thepartitionintersectiongraph
“Yes”InstanceofPP:c1c2c3s1321s2122s3113s4211
Triangulatingcoloredgraphs
• LetG=(V,E)beagraphandcbeavertexcoloringofG.ThenGcanbec-triangulatedifasupergraphG’=(V,E’)existsthatistriangulatedandwherethecoloringcisproper.
• Inotherwords,Gcanbec-triangulatedifandonlyifwecanaddedgestoGtomakeittriangulatedwithoutaddingedgesbetweenverticesofthesamecolor.
Agraphthatcanbec-triangulated
Agraphthatcanbec-triangulated
A vertex-colored graph G=(V,E) can be c-triangulated if a supergraph G’=(V,E’) exists that is triangulated and where the coloring is proper.
Agraphthatcannotbec-triangulated
TriangulatingColoredGraphs(TCG)
TriangulatingColoredGraphs:givenavertex-coloredgraphG,determineifGcanbec-triangulated.
ThePPandTCGProblems
• Buneman’sTheorem:AperfectphylogenyexistsforasetSifandonlyiftheassociatedcharacterstateintersectiongraphcanbec-triangulated.
• ThePPandTCGproblemsarepolynomiallyequivalentandNP-hard.
Ano-instanceofPerfectPhylogeny
• A=00• B=01• C=10• D=11
0 1
0
1
Aninputtoperfectphylogeny(left)offoursequencesdescribedbytwocharacters,anditscharacterstateintersectiongraph.Notethatthecharacterstateintersectiongraphis2-colored.
SolvingthePPProblemUsingBuneman’sTheorem
“Yes”InstanceofPP:c1c2c3s1321s2122s3113s4211
SolvingthePPProblemUsingBuneman’sTheorem
“Yes”InstanceofPP:c1c2c3s1321s2122s3113s4211
Somespecialcasesareeasy
• Binarycharacterperfectphylogenysolvableinlineartime
• r-statecharacterssolvableinpolynomialtimeforeachr(combinatorialalgorithm)
• Twocharacterperfectphylogenysolvableinpolynomialtime(produces2-coloredgraph)
• k-characterperfectphylogenysolvableinpolynomialtimeforeachk(producesk-coloredgraphs--connectionstoRobertson-Seymourgraphminortheory)
EarlyHistory• LeQuesne(1969,1972,1974,1977):initialformulationofperfectphylogenies• Estabrook(1972),Estabrooketal.(1975):mathematicalfoundationsofperfect
phylogenies• McMorris(1997):binarycharactercompatibility• Felsenstein(1984):reviewpaper• Estabrook&Landrum,Fitch1975:compatibilityoftwocharacters• Steel(1992)andBodlaenderetal.(1992):NP-hardness• Buneman(1974):reducedperfectphylogenytotriangulatingcoloredgraphs• KannanandWarnow(1992):establishedequivalenceofTCGandPP
SeeT.Warnow,1993.Constructingphylogenetictreesefficientlyusingcompatibilitycriteria.NewZealandJournalofBotany,31:3,pp.239-248(linkedoffmyhomepage)forasurveyoftheearlyliteratureandthefullcitations.
Literaturesample• R.AgarwalaandD.Fernandez-Baca,1994.Apolynomial-timealgorithmfortheperfectphylogenyproblem
whenthenumberofcharacterstatesisfixed.SIAMJournalonComputing,23,1216–1224.• R.AgarwalaandD.Fernandez-Baca,1996.Simplealgorithmsforperfectphylogenyandtriangulating
coloredgraphs.InternationalJournalofFoundationsofComputerScience,7,11–21.• H.L.Bodlaender,M.R.Fellows,MichaelT.Hallett,H.ToddWareham,andT.Warnow.2000.Thehardnessof
perfectphylogeny,feasibleregisterassignmentandotherproblemsonthincoloredgraphs,TheoreticalComputerScience244(2000)167-188
• H.L.BodlaenderandT.KIoks,1993:Asimplelineartimealgorithmfortriangulatingthree-coloredgraphs.JournalofalgorithmsJ5:160-172.
• D.Fernandez-Baca,2000.Theperfectphylogenyproblem.Pages203–234of:Du,D.-Z.,andCheng,X.(eds),SteinerTreesinIndustries.KluwerAcademicPublishers.
• D.Gusfield,1991.Efficientalgorithmsforinferringevolutionarytrees.Networks21,19–28.• R.IduryandA.Schaffer.1993:Triangulatingthree-coloredgraphsinlineartimeandlinearspace.SIAM
journalondiscretemathematics6:289-294.• S.KannanandT.Warnow,1992.Triangulating3-coloredgraphs.SIAMJ.onDiscreteMathematics,Vol.5
No.2,pp.249-258(alsoSODA1991)• S.KannanandT.Warnow,1997.Afastalgorithmforthecomputationandenumerationofperfect
phylogenieswhenthenumberofcharacterstatesisfixed.SIAMJ.Computing,Vol.26,No.6,pp.1749-1763(alsoSODA1995)
• F.R.McMorris,T.Warnow,andT.Wimer,1994.TriangulatingVertexColoredGraphs.SIAMJ.onDiscreteMathematics,Vol.7,No.2,pp.296-306(alsoSODA1993).
ApplicationsofPerfectPhylogeny
• TumorPhylogenetics(MohammedEl-KebirwilltalkthisonApril10-17,2018)
• HistoricalLinguistics(IwilltalkaboutthisonMarch15,2018)
• PopulationgeneticsandHaplotypeinference