Adventures in Bioinformatics (2012) Leighton Pritchard

Upload: leighton-pritchard

Post on 15-Jun-2015


DESCRIPTION

An end-of-year personal update from 2012, given at JHI's Cellular and Molecular Sciences seminar series

TRANSCRIPT

Page 1: Adventures in Bioinformatics (2012)

Adventures in Bioinformatics (2012)

Leighton Pritchard

Page 2: Adventures in Bioinformatics (2012)

Summary

- Potato: NB-LRRs

- Bacteria: diagnostics

- Potato: phylogenetics

- Bacteria: genomics

Page 3: Adventures in Bioinformatics (2012)

Potato: NB-LRRs: Introduction

- Potato genome sequenced by PGSC (2011)

  - NB-LRR predictions thought to be incomplete

- NB-LRR

  - Large plant gene family

  - Modular: [nucleotide-binding: NB]-[leucine-rich repeat: LRR]

  - Several subclasses (modular variation)

  - R (resistance) genes are a subset of NB-LRRs

Page 4: Adventures in Bioinformatics (2012)

Potato: NB-LRRs: Method

- Computational identification

  - Modular/domain variation is an issue

  - Work within predicted gene complement (improved annotation)

[Figure: NB-LRR and non-NB-LRR sequences placed on a scale from ‘good’ score to ‘bad’ score; the aim is no false positives and no false negatives.]

Page 5: Adventures in Bioinformatics (2012)

Potato: NB-LRRs: Method

- Motif identification/composition

  - Train MEME (psp-gen) on positive and negative examples to build model

  - Use MAST to identify score thresholds and predictive performance

  - Could distinguish between positive and negative example sets absolutely on the basis of the MAST-reported E-value

[Figure: NB-LRR and non-NB-LRR example sets placed on a scale from ‘good’ score to ‘bad’ score.]
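
The idea of an absolute split on E-value can be sketched as below. This is a minimal illustration, not the code used for the talk: the function names are hypothetical, the E-values are invented, and parsing of real MAST output is left out.

```python
import math
from typing import Optional, Sequence, Tuple


def separating_threshold(
    positive_evalues: Sequence[float],
    negative_evalues: Sequence[float],
) -> Optional[float]:
    """Return an E-value cutoff that splits the two sets perfectly, or None.

    Lower E-values are better ('good') scores, so a perfect split needs every
    positive example to score below every negative example.
    """
    worst_positive = max(positive_evalues)
    best_negative = min(negative_evalues)
    if worst_positive >= best_negative:
        return None  # the sets overlap: no cutoff avoids all errors
    # Geometric mean: midway between the two values on a log scale.
    return math.sqrt(worst_positive) * math.sqrt(best_negative)


def confusion_counts(
    positive_evalues: Sequence[float],
    negative_evalues: Sequence[float],
    cutoff: float,
) -> Tuple[int, int, int, int]:
    """Count (true pos, false neg, false pos, true neg) at a given cutoff."""
    tp = sum(e <= cutoff for e in positive_evalues)
    fp = sum(e <= cutoff for e in negative_evalues)
    return tp, len(positive_evalues) - tp, fp, len(negative_evalues) - fp


if __name__ == "__main__":
    # Hypothetical E-values, invented for illustration only.
    positives = [1e-80, 3e-45, 2e-20]
    negatives = [0.5, 3.0, 12.0]
    cutoff = separating_threshold(positives, negatives)
    print(cutoff, confusion_counts(positives, negatives, cutoff))
```

When the two sets overlap the function returns None, which is the case where a trade-off between false positives and false negatives has to be chosen instead.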

Page 6: Adventures in Bioinformatics (2012)

Potato: NB-LRRs: Results

- Applied model to

  - predicted gene complement

  - gene models extended by 3 kbp to identify additional domains

  - manual correction

  - mapped to genome

  - clusters identified
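
Extending a gene model can be sketched as a coordinate operation. The slide does not say whether the 3 kbp was added on one or both sides; the sketch below assumes both sides, clamped to the ends of the sequence, and the `GeneModel` type is hypothetical.

```python
from typing import NamedTuple


class GeneModel(NamedTuple):
    """Hypothetical gene model record; coordinates are 1-based and inclusive."""

    seqid: str
    start: int
    end: int
    strand: str


def extend_gene_model(gene: GeneModel, seq_length: int, flank: int = 3000) -> GeneModel:
    """Extend a gene model by `flank` bases on each side (assumed), within bounds."""
    new_start = max(1, gene.start - flank)
    new_end = min(seq_length, gene.end + flank)
    return gene._replace(start=new_start, end=new_end)


if __name__ == "__main__":
    # Invented coordinates on an invented 6500 bp sequence.
    gene = GeneModel("chr01", 2000, 5000, "+")
    print(extend_gene_model(gene, seq_length=6500))
```

The extended region would then be scanned again for domains that the original gene prediction missed.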

Page 7: Adventures in Bioinformatics (2012)

Potato: NB-LRRs: Results

- NB-LRR model used to build gene enrichment bead ‘array’

  - Identified 338 additional candidate NB-LRRs

Page 8: Adventures in Bioinformatics (2012)

Potato: NB-LRRs: Summary

- Developed novel predictive model for NB-LRRs

  - Identified and located 438 NB-LRRs on the potato genome (~10% more than published annotation)

- Classified phylogenetically and on domain composition

- Models used to build enrichment arrays to identify 338 novel NB-LRRs

Jupe F, Pritchard L, Etherington GJ, MacKenzie K, Cock PJ, et al. (2012) Identification and localisation of the NB-LRR gene family within the potato genome. BMC Genomics 13: 75. doi:10.1186/1471-2164-13-75.

Page 9: Adventures in Bioinformatics (2012)

Bacteria: diagnostics: Introduction I

- Dickeya spp.

  - Major pan-European bacterial plant pathogen

  - Emerging threat: Dickeya “solani”

  - Existing diagnostic primers did not discriminate D. “solani”

    - ADE/pel: all Dickeya

    - Laurila, Nassar, Toth primers not specific at species level

  - Use draft genomes to develop diagnostic primers?

Toth IK, van der Wolf JM, Saddler G, Lojkowska E, Hélias V, et al. (2011) Dickeya species: an emerging problem for potato production in Europe. Plant Pathol doi:10.1111/j.1365-3059.2011.02427.x.

Page 10: Adventures in Bioinformatics (2012)

Bacteria: diagnostics: Method

Draft genomes (several species):

Design >1000 primers per genome (Primer3)

Pritchard L, Holden NJ, Bielaszewska M, Karch H, Toth IK (2012) Alignment-free design of highly discriminatory diagnostic primer sets for Escherichia coli O104:H4 outbreak strains. PLoS ONE 7: e34498. doi:10.1371/journal.pone.0034498.

- Design primers in bulk:

  - specific thermodynamics and amplicon size

Page 11: Adventures in Bioinformatics (2012)

Bacteria: diagnostics: Method

Classify (colour) primers by predicted ability to amplify only a subset of genome sequences in silico:

Group I (genus)

Group II

Group III (species)

Amplification of negative samples: discard

Group IV (species)

Group V (species)

Simultaneous design of primers specific to all subgroups of the input sequence set

Pritchard L, Holden NJ, Bielaszewska M, Karch H, Toth IK (2012) Alignment-free design of highly discriminatory diagnostic primer sets for Escherichia coli O104:H4 outbreak strains. PLoS ONE 7: e34498. doi:10.1371/journal.pone.0034498.

Validate in vitro: unseen data (performance estimates)
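
The classification step can be sketched as set comparison. This is not the published pipeline's code or API: the function, the genome identifiers and the primer names are all hypothetical, and the in silico amplification calls are taken as given input.

```python
from typing import Dict, List, Mapping, Set


def classify_primers(
    amplified: Mapping[str, Set[str]],
    groups: Mapping[str, Set[str]],
    negatives: Set[str],
) -> Dict[str, List[str]]:
    """Assign each primer set to the group whose genomes it amplifies exactly.

    amplified: primer id -> genomes predicted to amplify in silico
    groups:    group name -> genomes belonging to that group
    negatives: genomes that must never amplify
    """
    specific: Dict[str, List[str]] = {name: [] for name in groups}
    for primer_id, hits in amplified.items():
        if hits & negatives:
            continue  # amplification of negative samples: discard
        for name, members in groups.items():
            if hits == members:
                specific[name].append(primer_id)
    return specific


if __name__ == "__main__":
    # Invented genomes: g1-g4 belong to the genus, n1 is a negative sample.
    groups = {
        "genus": {"g1", "g2", "g3", "g4"},
        "species_A": {"g1", "g2"},
        "species_B": {"g3"},
    }
    amplified = {
        "primer_001": {"g1", "g2", "g3", "g4"},
        "primer_002": {"g1", "g2"},
        "primer_003": {"g3", "n1"},
        "primer_004": {"g2", "g3"},
    }
    print(classify_primers(amplified, groups, negatives={"n1"}))
```

A primer set that matches no group exactly (primer_004 here) is simply left unassigned, so one pass yields candidates for every subgroup at once.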

Page 12: Adventures in Bioinformatics (2012)

Bacteria: diagnostics: Method

Page 13: Adventures in Bioinformatics (2012)

Bacteria: diagnostics: Method

- Automated process, pipeline (open source) available at https://github.com/widdowquinn/find_differential_primers

  - (Forked by two other developers)

- Found a problem in GenBank!

  - Cross-amplifying primers suggested that species assignment of reference genomes was incorrect

Pritchard L, Humphris S, Saddler GS, Parkinson NM, Bertrand V, et al. (2012) Detection of phytopathogens of the genus Dickeya using a PCR primer prediction pipeline for draft bacterial genome sequences. Plant Pathol doi:10.1111/j.1365-3059.2012.02678.x.

Page 14: Adventures in Bioinformatics (2012)

Bacteria: diagnostics: Method

- Automated process, pipeline available at https://github.com/widdowquinn/find_differential_primers

  - (Forked by two other developers)

- Found a problem in GenBank!

  - Cross-amplifying primers suggested that species assignment of reference genomes was incorrect

  - Confirmed by recA maximum likelihood tree

Page 15: Adventures in Bioinformatics (2012)

Bacteria: diagnostics: Results I

- Specificity of D. “solani” (and D. dianthicola) primers confirmed in vitro

Pritchard L, Humphris S, Saddler GS, Parkinson NM, Bertrand V, et al. (2012) Detection of phytopathogens of the genus Dickeya using a PCR primer prediction pipeline for draft bacterial genome sequences. Plant Pathol doi:10.1111/j.1365-3059.2012.02678.x.

Page 16: Adventures in Bioinformatics (2012)

Bacteria: diagnostics: Introduction II

- E. coli EHEC O104:H4 outbreak, Europe 2011

- Unprecedented:

  - Scale of outbreak (3950 infected, >50 deaths, economic impact and international import restrictions)

  - Rapid, open production of sequence data

  - Crowdsourcing of assembly and annotation via collaborative revision control site: GitHub https://github.com/ehec-outbreak-crowdsourced/BGI-data-analysis/wiki

Rohde H, Qin J, Cui Y, Li D, Loman NJ, et al. (2011) Open-source genomic analysis of Shiga-toxin-producing E. coli O104:H4. N Engl J Med 365: 718–724. doi:10.1056/NEJMoa1107643.

Page 17: Adventures in Bioinformatics (2012)

Bacteria: diagnostics: Introduction II

- A changing paradigm? Beyond serotyping:

  - 4 PCRs to serotype: (O-antigen, flagellar locus, tellurite resistance, Shiga toxin)

Kwan et al. (2011) http://precedings.nature.com/documents/6663/version/1

Page 18: Adventures in Bioinformatics (2012)

Bacteria: diagnostics: Results II

- Direct experimental validation of primer candidates (Münster):

  - ‘Positive’ set = 21 clinical outbreak isolates

  - ‘Negative’ set = 32 HUSEC / EPEC isolates

  - Positive control = LB 226692

- Extremely good diagnostic performance

  - Specific at outbreak isolate (sub-species) level
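
Diagnostic performance on positive and negative panels is usually summarised as sensitivity and specificity. A minimal sketch follows. The function name is hypothetical, and the isolate names and outcomes are invented and deliberately unrelated to the Münster panels.

```python
from typing import Set, Tuple


def diagnostic_performance(
    amplified: Set[str],
    positive_set: Set[str],
    negative_set: Set[str],
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for one primer set.

    amplified: isolates that gave a PCR product with this primer set
    """
    true_positives = len(amplified & positive_set)
    false_positives = len(amplified & negative_set)
    sensitivity = true_positives / len(positive_set)
    specificity = (len(negative_set) - false_positives) / len(negative_set)
    return sensitivity, specificity


if __name__ == "__main__":
    # Invented toy panel: 4 positive isolates, 5 negative isolates.
    positives = {"p1", "p2", "p3", "p4"}
    negatives = {"n1", "n2", "n3", "n4", "n5"}
    observed = {"p1", "p2", "p3", "n1"}
    print(diagnostic_performance(observed, positives, negatives))
```

A primer set specific at the outbreak-isolate level would amplify every isolate in the positive set and none in the negative set, giving 1.0 for both values.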

Page 19: Adventures in Bioinformatics (2012)

Bacteria: diagnostics: Results II

[Figure: validation results; panel labels: positive, negative, +, -.]

Pritchard L, Holden NJ, Bielaszewska M, Karch H, Toth IK (2012) Alignment-free design of highly discriminatory diagnostic primer sets for Escherichia coli O104:H4 outbreak strains. PLoS ONE 7: e34498. doi:10.1371/journal.pone.0034498.

Page 20: Adventures in Bioinformatics (2012)

Bacteria: diagnostics: Summary II

- Developed novel rapid primer design technique

  - Draft genome design (default assemblies)

  - Simultaneous design of primers specific to all subgroups where possible

  - Can specify internal probes for TaqMan

- New primers for Dickeya spp. diagnostics

  - In active use in Europe

- Outbreak isolate-specific primers for E. coli O104:H4 outbreak

  - Specific diagnosis with two PCRs

- Discriminatory primers to distinguish Phytophthora effector gene family members for diversity studies (not shown)

Page 21: Adventures in Bioinformatics (2012)

Potato: phylogenetics: Introduction

- Potato NAC transcription factors are targeted by Phytophthora effectors

- Plant-specific, >100 family representatives in each organism

  - Implicated in many plant responses, not just defence/stress

- Definitive N-terminal domain (NAM domain), unique fold

Olsen AN, Ernst HA, Leggio LL, Skriver K (2005) NAC transcription factors: structurally distinct, functionally diverse. Trends Plant Sci 10: 79–87. doi:10.1016/j.tplants.2004.12.010

Page 22: Adventures in Bioinformatics (2012)

Potato: phylogenetics: Introduction

- Variable lengths with highly diverse C-terminal structure

- NAM domains are DNA-binding and associated with tissue-specific expression

- Some NACs have transmembrane (TM) domains in the C-terminus: (including our interactors) (other structural classes)

Jensen MK, Kjaersgaard T, Nielsen MM, Galberg P, Petersen K, et al. (2010) The Arabidopsis thaliana NAC transcription factor family: structure-function relationships and determinants of ANAC019 stress signalling. Biochem J 426: 183–196. doi:10.1042/BJ20091234.

Page 23: Adventures in Bioinformatics (2012)

Potato: phylogenetics: Methods

- Two potato target sequences of interest

- Two N. benthamiana homologues: what about other organisms?

- 2253 proteins from nr: HMM search with NAM domain model

- 137 proteins from PGSC potato annotation

- 104 proteins from tomato annotation

- 2552 proteins total

- 2200 non-redundant NAM-containing proteins
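
The slide does not say how redundancy was removed. The sketch below shows one simple option, collapsing exact duplicate sequences after pooling; that choice, the function name and the example records are all assumptions for illustration.

```python
from typing import Dict, Iterable, List, Tuple


def remove_redundant(records: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Keep the first record seen for each distinct protein sequence.

    records: (name, sequence) pairs pooled from several sources.
    Sequences are compared in upper case with any trailing stop ('*') removed.
    """
    first_name_for: Dict[str, str] = {}
    for name, sequence in records:
        key = sequence.upper().rstrip("*")
        first_name_for.setdefault(key, name)
    # Dictionaries preserve insertion order, so the input order is kept.
    return [(name, sequence) for sequence, name in first_name_for.items()]


if __name__ == "__main__":
    # Invented toy records standing in for the nr, potato and tomato sets.
    pooled = [("nr_hit_1", "MKV*"), ("potato_1", "mkv"), ("tomato_1", "MKL")]
    print(remove_redundant(pooled))
```

Near-identical sequences would need a clustering tool and an identity cutoff, which this sketch does not attempt.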

Page 24: Adventures in Bioinformatics (2012)

Potato: phylogenetics: Methods

1. Predict presence of transmembrane (TM) domain (TMHMM)

2. Restrict sequence to NAM domain (only common region)

3. Remove outliers with <40% amino acid identity: large set of 1700 sequences

4. Cluster sequences (MCL) to identify reduced set including potato target sequences: 406 sequences

5. Align 406-sequence set (M-COFFEE, HMMalign)

6. Back-translate to nucleotide sequence (more work than you might think…)
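
Step 6 can be sketched as threading each coding sequence onto its aligned protein, one codon per residue and three gap characters per alignment gap. This is an illustration under stated assumptions, not the method used for the talk: the function and its `first_residue` parameter are hypothetical, and the sketch does not check that each codon actually translates to its residue.

```python
def thread_codons(aligned_protein: str, cds: str, first_residue: int = 1) -> str:
    """Thread a coding sequence onto an aligned protein (sub)sequence.

    aligned_protein: protein alignment row, with '-' for gaps
    cds:             full coding sequence of the same gene
    first_residue:   1-based position in the full protein of the first
                     aligned residue (needed when only a domain is aligned)
    """
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    residue_count = len(aligned_protein.replace("-", ""))
    position = first_residue - 1
    if position < 0 or position + residue_count > len(codons):
        raise ValueError("coding sequence is too short for the aligned residues")
    threaded = []
    for symbol in aligned_protein:
        if symbol == "-":
            threaded.append("---")  # a protein gap spans one whole codon
        else:
            threaded.append(codons[position])
            position += 1
    return "".join(threaded)


if __name__ == "__main__":
    # Invented example: protein MKV, with a gap after M in the alignment.
    print(thread_codons("M-KV", "ATGAAAGTTTAA"))
```

Much of the extra work hinted at on the slide is plausibly in finding the matching nucleotide record for each protein and its domain coordinates, which is outside this sketch.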

Page 25: Adventures in Bioinformatics (2012)

[Figure: phylogenetic tree of the NAM domain sequence set. Tip labels are species names with GenBank identifiers, or PGSC (potato) and Solyc (tomato) gene model identifiers, each followed by domain coordinates; the target sequences appear as StNac1_5, StNac2_5, NbNac1_1 and NbNac2_1. Branch labels are support values (1 to 100). Scale bar: 0.3; axis: 0.0 to 3.0.]

Potato: phylogenetics: Methods
• Construct neighbour-joining (NJ) tree from 406-sequence set including target sequences (TOPALi)
• Infer ‘best’ evolutionary model from NJ tree structure (jModelTest)
• Construct maximum-likelihood tree with bootstrap (RAxML)
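The bootstrap step in the last bullet can be illustrated with a minimal sketch. This is not what RAxML does internally; it only shows the column-resampling idea behind bootstrap support. The function name `bootstrap_alignment` and the toy sequences are assumptions made for illustration.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate of a multiple sequence alignment.

    `alignment` maps sequence names to equal-length aligned strings.
    Columns are sampled with replacement, so the replicate has the same
    number of columns as the original alignment.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_cols = lengths.pop()
    # Draw column indices with replacement; the same draw is applied to
    # every sequence so that alignment columns stay intact.
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[i] for i in cols) for name, seq in alignment.items()}

# Toy alignment (hypothetical sequences, not data from the talk)
toy = {
    "seqA": "MKVLAT",
    "seqB": "MKILAS",
    "seqC": "MRVLGT",
}
replicate = bootstrap_alignment(toy, random.Random(0))
```

Each replicate would be passed to the tree-building program; the support value on a branch is the percentage of replicate trees that contain that branch.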

Page 26: Adventures in Bioinformatics (2012)

Potato: phylogenetics: Results

[Figure: phylogenetic tree with bootstrap support values on branches (scale bar 0.3; axis 0.0 to 2.5). Labelled sequences: NAC1, NAC2 and NTL1 to NTL13. Legend: Arabidopsis, Solanaceae, Cereals.]

• Target sequences: NAC1, NAC2
• Arabidopsis NACs: NTL1-NTL13
• TM-containing NACs in red
• Divergence pre-At/Sol split
• Novel At NAC classes
• At NAC family expansion
• TM-containing NACs cluster
• Cereals underrepresented

Page 27: Adventures in Bioinformatics (2012)

[Figure: full phylogenetic tree. Leaf labels are sequence identifiers with residue ranges (PGSC0003DMP and Solyc gene models, GenBank gi accessions, plus StNac1_5, StNac2_5, NbNac1_1 and NbNac2_1); scale bar 0.01.]

Potato: phylogenetics: Summary
• Largest phylogenetic survey of NAC transcription factors
• Observed expansion and greatest diversity in Arabidopsis
• TM domains conserved in the same clade across species
• Identified supported clades for two target sequences in potato
• NACs not strongly represented in cereals: single large clade captures most known sequences
• Recombination and domain composition clearly an issue…

Page 28: Adventures in Bioinformatics (2012)

Bacteria: genomics: Introduction I
• Dickeya spp.
  • Major pan-European bacterial plant pathogen
  • Emerging threat: Dickeya “solani”
• Four Dickeya genomes in GenBank
  • Three misidentified: (D. dadantii, D. chrysanthemi, D. zeae, D. paradisiaca)
  • 23 draft Dickeya genomes sequenced by JHI (D. dadantii, D. chrysanthemi, D. zeae, D. paradisiaca, D. dianthicola, D. “solani”, several unclassified)
  • Two draft genomes sequenced by ILVO (D. dianthicola, D. “solani”)
• Several technologies:
  • 454 PE and single-end
  • Illumina PE and single-end

Page 29: Adventures in Bioinformatics (2012)

Bacteria: genomics: Introduction I

ID               Species         Method             Bases    Contigs
Ddie_NCPPB_2976  dieffenbachiae  Newbler de novo    4804891  76
Dze_NCPPB_3531   zeae            Newbler de novo    4622472  27
Dze_NCPPB_2538   zeae            Newbler de novo    4556975  41
Dze_CSL_RW192    zeae            Newbler de novo    4692116  49
Dunk_CSL_RW240   unknown         Newbler de novo    4375181  73
Dch_NCPPB_3533   chrysanthemi    Minimus (meta)     4769655  49
Dso_IPO2222      solani          Newbler de novo    4863391  96 (8)
Dch_NCPPB_402    chrysanthemi    Newbler de novo    4718933  93 (12)
Dpa_NCPPB_2511   paradisiaca     Newbler reference  4627470  43
Dso_MK16         solani          Minimus (meta)     4868282  24
Dze_MK19         zeae            Newbler de novo    4668103  33
Ddi_IPO980       dianthicola     Newbler de novo    4825313  62
Dso_MK10         solani          Minimus (meta)     4931437  41
Dunk_MK7         unknown         Newbler de novo    4921532  57
Dch_NCPPB_516    chrysanthemi    Minimus (meta)     4614776  33
Dda_NCPPB_898    dadantii        Minimus (meta)     4933637  45
Dda_NCPPB_3537   dadantii        Minimus (meta)     4805222  38
Dunk_NCPPB_3274  unknown         Minimus (meta)     5110316  54
Dunk_DW0440      unknown         Newbler de novo    4330262  147
Dunk_NCPPB_569   unknown         Minimus (meta)     4215441  58
Dze_NCPPB_3532   zeae            Minimus (meta)     4555162  19
Ddi_GBBC2039     dianthicola     Minimus (meta)     4776142  237
Dso_GBBC2040     solani          CLCBio reference   4832847  224
Ddi_NCPPB_3534   dianthicola     MIRA (hybrid dn)   4831142  45
Ddi_NCPPB_453    dianthicola     Newbler de novo    4668151  46

Page 30: Adventures in Bioinformatics (2012)

Bacteria: genomics: Introduction I

ID               Species         Prodigal  Excess  tRNA
Ddie_NCPPB_2976  dieffenbachiae  4269      63 + 2  63
Dze_NCPPB_3531   zeae            4072      59      62
Dze_NCPPB_2538   zeae            4048      46      64
Dze_CSL_RW192    zeae            4205      64 + 1  60
Dunk_CSL_RW240   unknown         3953      31      64
Dch_NCPPB_3533   chrysanthemi    4211      51 + 1  65
Dso_IPO2222      solani          4226      62 + 2  63
Dch_NCPPB_402    chrysanthemi    4204      58      61
Dpa_NCPPB_2511   paradisiaca     4094      75 + 2  64
Dso_MK16         solani          4168      58 + 2  64
Dze_MK19         zeae            4179      47      65
Ddi_IPO980       dianthicola     4254      71 + 1  63
Dso_MK10         solani          4118      54 + 2  62
Dunk_MK7         unknown         4251      48 + 1  64
Dch_NCPPB_516    chrysanthemi    4190      60 + 1  69
Dda_NCPPB_898    dadantii        4330      43      62
Dda_NCPPB_3537   dadantii        4163      47      64
Dunk_NCPPB_3274  unknown         4439      63      69
Dunk_DW0440      unknown         3962      31      59
Dunk_NCPPB_569   unknown         3878      59      72
Dze_NCPPB_3532   zeae            4062      55      64
Ddi_GBBC2039     dianthicola     4286      67 + 2  57
Dso_GBBC2040     solani          4231      57 + 4  63
Ddi_NCPPB_3534   dianthicola     4333      64 + 2  77
Ddi_NCPPB_453    dianthicola     4263      60 + 2  64

Page 31: Adventures in Bioinformatics (2012)

Bacteria: genomics: Introduction I
• 25 BioProjects online
• Genome assembly submissions in progress

Page 32: Adventures in Bioinformatics (2012)

Bacteria: genomics
• Pairwise draft assembly comparisons (MegaBLAST, MUMmer)

[Figure: Dickeya zeae pairwise comparisons]

Page 33: Adventures in Bioinformatics (2012)

Bacteria: genomics
• Pairwise draft assembly comparisons (MegaBLAST, MUMmer)
  • Average Nucleotide Identity (ANI), unique regions

[Figures: "Dickeya pairwise unique regions" and "Dickeya average nucleotide identity (ANI)". Axis labels list 29 genomes: NCPPB_2538, NCPPB_3532, MK19, NC_013592, CSL_RW192, NCPPB_3531, NC_012912, NCPPB_3533, NCPPB_516, NCPPB_402, NCPPB_569, NC_012880, NCPPB_2511, DW_0440, CSL_RW240, MK16, MK10, IPO_2222, GBBC2040, NCPPB_3537, NCPPB_898, NC_014500, NCPPB_2976, NCPPB_3274, MK7, IPO_980, NCPPB_453, NCPPB_3534, GBBC2039.]

Page 34: Adventures in Bioinformatics (2012)

Bacteria: genomics
• Pairwise draft assembly comparisons (MegaBLAST, MUMmer)
  • Is D. “solani” a novel species? (95% ANI ≈ 70% DNA-DNA hybridisation)

[Figure: Dickeya average nucleotide identity (ANI) heatmap over the 29 genomes, with a second panel restricted to 12 genomes: GBBC2039, NCPPB_3534, IPO_980, NCPPB_453, NCPPB_2976, NCPPB_898, NCPPB_3537, NC_014500, GBBC2040, IPO_2222, MK10, MK16.]
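The species question on this slide rests on the rule of thumb that about 95% ANI corresponds to about 70% DNA-DNA hybridisation. A minimal sketch of applying that cutoff to a table of pairwise ANI values follows. The isolate names and ANI values are invented for illustration, and single-linkage grouping is an assumption, not necessarily the procedure used in the talk.

```python
def ani_groups(ani, threshold=0.95):
    """Group genomes whose pairwise ANI meets the threshold (single linkage).

    `ani` maps frozenset({a, b}) of two distinct genome names to an ANI
    fraction in [0, 1]. Returns a list of sets, one per candidate group.
    """
    genomes = sorted({g for pair in ani for g in pair})
    parent = {g: g for g in genomes}

    def find(g):
        # Follow parent pointers to the representative of g's group.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for pair, value in ani.items():
        a, b = tuple(pair)
        if value >= threshold:
            # Merge the two groups when the pair passes the cutoff.
            parent[find(a)] = find(b)

    groups = {}
    for g in genomes:
        groups.setdefault(find(g), set()).add(g)
    return sorted(groups.values(), key=lambda s: sorted(s))

# Hypothetical values for illustration only
example_ani = {
    frozenset({"isolate_1", "isolate_2"}): 0.99,
    frozenset({"isolate_1", "isolate_3"}): 0.93,
    frozenset({"isolate_2", "isolate_3"}): 0.92,
}
groups = ani_groups(example_ani)
```

A set of isolates that forms its own group, separate from every named species at this cutoff, would be a candidate novel species under the rule of thumb.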

Page 35: Adventures in Bioinformatics (2012)

Bacteria: genomics: Methods
• Gene complement comparisons
  • Reciprocal best BLAST hits (RBBH)

[Figure: CDS of Organism 1 and Organism 2 joined by best BLAST hits; legend distinguishes "best BLAST hit", "RBBH" and "not RBBH"]
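The RBBH idea can be sketched in a few lines: a pair of genes is kept only if each is the other's best hit. The hit tuples below stand in for parsed tabular BLAST output; the identifiers and scores are made up, and ranking by bitscore with first-seen tie-breaking is an assumption.

```python
def best_hits(hits):
    """Return {query: subject}, keeping the highest-bitscore hit per query.

    `hits` is an iterable of (query, subject, bitscore) tuples.
    On ties, the first hit seen is kept.
    """
    best = {}
    for query, subject, bitscore in hits:
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {q: s for q, (s, _) in best.items()}

def reciprocal_best_hits(hits_1_vs_2, hits_2_vs_1):
    """Return the set of (gene_in_1, gene_in_2) pairs that are mutual best hits."""
    fwd = best_hits(hits_1_vs_2)
    rev = best_hits(hits_2_vs_1)
    return {(a, b) for a, b in fwd.items() if rev.get(b) == a}

# Hypothetical hit tables (identifiers and scores are invented)
org1_vs_org2 = [
    ("a1", "b1", 500.0), ("a1", "b2", 120.0),
    ("a2", "b2", 300.0), ("a3", "b1", 90.0),
]
org2_vs_org1 = [
    ("b1", "a1", 495.0), ("b2", "a2", 310.0), ("b2", "a1", 110.0),
]
rbbh = reciprocal_best_hits(org1_vs_org2, org2_vs_org1)
```

Here `a3` has a best hit to `b1`, but `b1` prefers `a1`, so `(a3, b1)` is a best hit that is not an RBBH.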

Page 36: Adventures in Bioinformatics (2012)

Bacteria: genomics: Methods

[Figure panels: best BLAST hits; reciprocal best BLAST hits]

Page 37: Adventures in Bioinformatics (2012)

Bacteria: genomics

[Figure: D. solani IPO2222; legend: 70% identity, 100% identity, core, accessory]

Page 38: Adventures in Bioinformatics (2012)

Bacteria: genomics
Cliques:
• Core and accessory genomes from RBBH
  • Predicted Dickeya core genome: 2201 genes

Page 39: Adventures in Bioinformatics (2012)

Comparative Genomics
• Predicted Dickeya species-specific accessory genome sizes
• Accessory = RBBH with all other members of the species, but no other Dickeya
• Weak pruning: remove all RBBH with <80% identity, 40% coverage
• Full pruning: remove all RBBH until minimal ‘clique’ found

Species        Weak pruning  Full pruning
‘core’         2201          2201
chrysanthemi   32            36
dadantii       11            14
dianthicola    102           127
paradisiaca    404           441
solani         120           157
zeae           33            40
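One way to read the pruning and clique ideas above, as a hedged sketch: treat each RBBH as an edge between two genes, drop edges below the cutoffs, and accept a gene family as core only if every genome contributes one gene and every pair of genes is still linked (a clique). The data layout, the function names, and the reading of weak pruning as "keep identity ≥ 80% and coverage ≥ 40%" are assumptions, not details from the talk.

```python
from itertools import combinations

def weak_prune(rbbh_edges, min_identity=80.0, min_coverage=40.0):
    """Keep RBBH edges that meet both cutoffs.

    `rbbh_edges` is an iterable of (gene_a, gene_b, identity, coverage),
    where genes are (genome, locus) tuples and values are percentages.
    """
    return {
        frozenset((a, b))
        for a, b, identity, coverage in rbbh_edges
        if identity >= min_identity and coverage >= min_coverage
    }

def is_core_clique(genes, edges, genomes):
    """True if `genes` has one gene per genome and all pairs are RBBH-linked."""
    if len(genes) != len(genomes) or {g for g, _ in genes} != set(genomes):
        return False
    return all(frozenset(pair) in edges for pair in combinations(genes, 2))

# Hypothetical example: three genomes, one candidate gene family
genomes = ["G1", "G2", "G3"]
raw_edges = [
    (("G1", "x"), ("G2", "x"), 95.0, 98.0),
    (("G1", "x"), ("G3", "x"), 91.0, 97.0),
    (("G2", "x"), ("G3", "x"), 78.0, 99.0),  # fails the identity cutoff
]
edges = weak_prune(raw_edges)
family = [("G1", "x"), ("G2", "x"), ("G3", "x")]
family_is_core = is_core_clique(family, edges, genomes)
```

In this toy case the family is not counted as core, because one of its three RBBH links is removed by pruning and the remaining edges no longer form a clique.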

Page 40: Adventures in Bioinformatics (2012)

Bacteria: genomics
• Collinearity and conservation of synteny from RBBH
  • i-ADHoRe

Page 41: Adventures in Bioinformatics (2012)

Bacteria: genomics
• Collinearity and conservation of synteny from RBBH
  • i-ADHoRe

[Figure: collinearity plots labelled Ddi/Dda, Dze, Dso, Ddi/Dso and Dickeya]

Page 42: Adventures in Bioinformatics (2012)

Bacteria: SysBio: Methods
• 29 metabolic models (25 draft genomes, 4 published)
• Presence/absence of metabolic pathways
  • Substrate dependence/survival
• Flux Balance Analysis (FBA)
  • Steady-states/Elementary Modes
  • Association with phenotype
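For reference, FBA is conventionally posed as a linear programme over reaction fluxes at steady state. The symbols below are standard textbook notation rather than anything taken from the slides: $S$ is the stoichiometric matrix, $v$ the flux vector, $c$ the objective weights (for example, selecting a biomass reaction), and $v^{\mathrm{lb}}$, $v^{\mathrm{ub}}$ the flux bounds.

```latex
\begin{aligned}
\max_{v}\quad & c^{\mathsf{T}} v \\
\text{subject to}\quad & S\,v = 0, \\
& v^{\mathrm{lb}}_j \le v_j \le v^{\mathrm{ub}}_j \quad \text{for each reaction } j
\end{aligned}
```

Substrate dependence can then be explored by setting the upper bound of an uptake reaction to zero and checking whether the optimal objective value stays positive.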

Page 43: Adventures in Bioinformatics (2012)

Bacteria: SysBio: Results

[Figures: compound presence/absence table; reaction presence/absence table]
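A presence/absence table of this kind can be built directly from per-genome sets of reactions (or compounds). The sketch below assumes each metabolic model has already been reduced to a set of identifiers; the genome names and reaction identifiers are placeholders.

```python
def presence_absence(models):
    """Build a presence/absence table from per-genome item sets.

    `models` maps a genome name to the set of reaction (or compound)
    identifiers in its metabolic model. Returns (items, rows), where
    `rows` maps each genome to a list of 0/1 flags aligned with `items`.
    """
    items = sorted(set().union(*models.values()))
    rows = {
        genome: [1 if item in present else 0 for item in items]
        for genome, present in models.items()
    }
    return items, rows

# Hypothetical models; identifiers are placeholders, not real reactions
toy_models = {
    "genome_A": {"R1", "R2", "R3"},
    "genome_B": {"R1", "R3"},
    "genome_C": {"R2", "R4"},
}
items, rows = presence_absence(toy_models)
# Items present in every genome (columns that are all 1)
shared = [item for i, item in enumerate(items) if all(r[i] for r in rows.values())]
```

Columns that differ between genomes are the ones worth testing for association with phenotype.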

Page 44: Adventures in Bioinformatics (2012)

Bacteria: SysBio: Results
• Predicted substrate-dependent growth
  • Experimental verification

Sonia Humphris/Anne-Laure Lucquet

Page 45: Adventures in Bioinformatics (2012)

Dickeya Virtual Machine
• Aims:
  • A basis for collaboration (joint comparative genomics paper)
  • Share sequencing and analysis data
  • Provide tools for analysis
• Solution:
  • Ubuntu 12.04 as a virtual machine on USB stick

Page 46: Adventures in Bioinformatics (2012)

Dickeya  Virtual  Machine  

Page 47: Adventures in Bioinformatics (2012)

Bacteria: genomics: Introduction II
• Campylobacter spp.
  • Collaboration with University of Aberdeen (studentship available): >4200 genome sequences of outbreak isolates
  • Most prevalent (increasingly so) food-borne pathogen in Scotland: >6500 cases per annum; 60-80% of cases from chicken
  • Occasional water contamination
  • Multiple host-specific species (cattle, pig, chicken)
  • No magic bullet: hygiene is not sufficient

 

Page 48: Adventures in Bioinformatics (2012)

Bacteria: genomics
• Analysis underway (RBBH >34 day calculation on JHI cluster)

[Figure: Genecall count distribution (binsize=50); x-axis: genecall count (2000 to 8000); y-axis: density]

[Figure: Total assembly length distribution (binsize=1e4); x-axis: assembly length (2e+06 to 4e+06); y-axis: density]
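The two distribution plots use fixed-width bins with density on the y-axis. A minimal sketch of that binning follows, assuming density is defined as count / (n × binsize) so that the bar areas sum to 1. The gene-call counts are invented and are not the values behind the slide.

```python
from collections import Counter

def binned_density(values, binsize):
    """Return {bin_start: density} for fixed-width bins.

    Density is count / (n * binsize), so the bar areas sum to 1.
    """
    counts = Counter((v // binsize) * binsize for v in values)
    n = len(values)
    return {start: count / (n * binsize) for start, count in sorted(counts.items())}

# Hypothetical gene-call counts for six assemblies
genecalls = [1710, 1725, 1749, 1760, 1802, 1840]
density = binned_density(genecalls, binsize=50)
```

Assemblies falling far from the main peak in either plot are natural candidates for a closer quality check before comparative analysis.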

Page 49: Adventures in Bioinformatics (2012)

Bacteria: genomics: Summary
• 25 novel Dickeya draft genomes
  • Whole-genome differences (structure/similarity)
  • Gene complement differences
  • Core and accessory genomes
  • Metabolic reconstruction
  • Phenotypic differences
  • Dickeya VM for collaborators
• 1034 novel Campylobacter genomes
  • Influence of gene complement on host association

Page 50: Adventures in Bioinformatics (2012)

Acknowledgements
• Bacteria: diagnostics
  • Ian Toth, Nicola Holden, Sonia Humphris (CMS)
  • Martina Bielaszewska, Helge Karch, Nadine Brandt (University of Münster)
  • John Elphinstone, Neil Parkinson, Valerie Bertrand (FERA, York)
• Potato: NB-LRRs
  • Ingo Hein, Florian Jupe, Glenn Bryan, Sanjeev Sharma (CMS)
  • Peter Cock, Linda Milne (ICS)
  • Frank Wright, Katrin MacKenzie (BioSS)
  • Graham Etherington (TSL, Norwich)
  • Dan Bolser (University of Dundee/EBI)
• Bacteria: genomics
  • Ian Toth, Sonia Humphris, Nicola Holden, Emma Douglas, Anne-Laure Lucquet (CMS)
  • Peter Cock, Iain Milne, Sue Jones (ICS)
  • Ken Forbes, Norval Strachan (University of Aberdeen)
  • Gerry Saddler (SASA, Edinburgh)
  • Steve Baeyen, Martine Maes, Johan van Vaerenbergh (ILVO, Belgium)
  • John Elphinstone, Neil Parkinson (FERA, York)
  • Jan van der Wolf (PRI Wageningen)
  • Minna Pirhonen (University of Helsinki)
• Potato: phylogenetics
  • Paul Birch, Hazel McLellan (CMS)
  • Frank Wright (BioSS)