Building breeding databases in GDR, Genome Database for Rosaceae

Sook Jung, Taein Lee, Stephen Ficklin, Kate Evans, Cameron Peace and Dorrie Main

Post on 14-Dec-2015

TRANSCRIPT

Page 1: Sook Jung, Taein Lee, Stephen Ficklin, Kate Evans, Cameron Peace and Dorrie Main

Building breeding databases in GDR, Genome Database for Rosaceae

Sook Jung, Taein Lee, Stephen Ficklin, Kate Evans, Cameron Peace and Dorrie Main

Page 2

Introduction

GDR (Genome Database for Rosaceae): genomic, genetic and breeding data

Other databases: CottonGen, Citrus Genome Database, Cool Season Food Legume Database, Genome Database for Vaccinium

Using open source tools (Chado, Tripal, Drupal) for efficient and flexible database construction

Chado, with the recent Natural Diversity module, allows integration of complex biological data from widely different projects and species

Page 3

Outline

Part I: How to store data using Chado

Part II: Demo of GDR Breeding Database

Page 4

Chado: Modular, Generic and Ontology-driven schema

feature: feature_id, name, uniquename, type_id, organism_id, residues

feature_relationship: feature_relationship_id, subject_id, object_id, type_id

featureprop: featureprop_id, feature_id, type_id, value, rank

cvterm: cvterm_id, name, definition, cv_id, dbxref_id

cv: cv_id, name, definition

Example feature types (cvterm): gene, mRNA, marker, QTL, etc.

Example feature_relationship: Abc-mRNA (subject_id) part_of Abc-gene (object_id)

Example featureprop types: repeat_motif, product_size

Example cv: Sequence Ontology, Gene Ontology, etc.
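The feature example on this slide can be tried out in a small in-memory database. The sketch below is a simplified subset of the Chado tables named above, built with Python's sqlite3 module rather than the PostgreSQL schema Chado actually ships with; the column lists are trimmed and the cv names ("sequence", "relationship") are assumptions made for illustration.

```python
import sqlite3

# Simplified subset of the Chado tables on the slide (not the full Chado DDL).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cv (cv_id INTEGER PRIMARY KEY, name TEXT, definition TEXT);
CREATE TABLE cvterm (
    cvterm_id INTEGER PRIMARY KEY, name TEXT, definition TEXT,
    cv_id INTEGER REFERENCES cv(cv_id));
CREATE TABLE feature (
    feature_id INTEGER PRIMARY KEY, name TEXT, uniquename TEXT UNIQUE,
    type_id INTEGER REFERENCES cvterm(cvterm_id));
CREATE TABLE feature_relationship (
    feature_relationship_id INTEGER PRIMARY KEY,
    subject_id INTEGER REFERENCES feature(feature_id),
    object_id INTEGER REFERENCES feature(feature_id),
    type_id INTEGER REFERENCES cvterm(cvterm_id));
""")

# Controlled vocabularies and terms (cv names are illustrative assumptions).
conn.executemany("INSERT INTO cv (cv_id, name) VALUES (?, ?)",
                 [(1, "sequence"), (2, "relationship")])
conn.executemany("INSERT INTO cvterm (cvterm_id, name, cv_id) VALUES (?, ?, ?)",
                 [(1, "gene", 1), (2, "mRNA", 1), (3, "part_of", 2)])

# Two features typed by cvterm, and the relationship from the slide.
conn.executemany(
    "INSERT INTO feature (feature_id, name, uniquename, type_id) VALUES (?, ?, ?, ?)",
    [(1, "Abc-gene", "Abc-gene", 1), (2, "Abc-mRNA", "Abc-mRNA", 2)])
conn.execute(
    "INSERT INTO feature_relationship (subject_id, object_id, type_id) VALUES (2, 1, 3)")

# Resolve every type_id through cvterm to get a readable statement.
rows = conn.execute("""
    SELECT s.uniquename, st.name, rel.name, o.uniquename, ot.name
    FROM feature_relationship fr
    JOIN feature s  ON s.feature_id = fr.subject_id
    JOIN feature o  ON o.feature_id = fr.object_id
    JOIN cvterm rel ON rel.cvterm_id = fr.type_id
    JOIN cvterm st  ON st.cvterm_id = s.type_id
    JOIN cvterm ot  ON ot.cvterm_id = o.type_id
""").fetchall()

for subject, subject_type, relation, obj, obj_type in rows:
    print(f"{subject} ({subject_type}) {relation} {obj} ({obj_type})")
```

Because the feature type and the relationship type are both rows in cvterm, adding a new kind of feature or relationship means adding a term, not changing the schema.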

Page 5

Storing Stock (from samples to population; pedigree)

stock: stock_id, name, uniquename, type_id, organism_id, residues

stock_relationship: stock_relationship_id, subject_id, object_id, type_id

stockprop: stockprop_id, stock_id, type_id, value

cvterm: cvterm_id, name, definition, cv_id, dbxref_id

Example stock types (cvterm): population, cultivar, breeding line, clone, sample, etc.

Example stock_relationship: Gala-001 (subject_id) sample_of Gala (object_id)

Example stockprop types: description, population_size

Example pedigree: Gala maternal_parent_of Sonya
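The sample and pedigree examples above can be sketched the same way. This is a minimal illustration using sqlite3 and a trimmed version of the stock tables; the helper functions (term_id, add_stock, relate, subjects_of) are hypothetical names introduced here and are not part of Chado or GDR.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cvterm (cvterm_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE stock (
    stock_id INTEGER PRIMARY KEY, uniquename TEXT UNIQUE,
    type_id INTEGER REFERENCES cvterm(cvterm_id));
CREATE TABLE stock_relationship (
    stock_relationship_id INTEGER PRIMARY KEY,
    subject_id INTEGER REFERENCES stock(stock_id),
    object_id INTEGER REFERENCES stock(stock_id),
    type_id INTEGER REFERENCES cvterm(cvterm_id));
""")

def term_id(name):
    """Return the cvterm_id for a term name, creating the term if needed."""
    conn.execute("INSERT OR IGNORE INTO cvterm (name) VALUES (?)", (name,))
    row = conn.execute("SELECT cvterm_id FROM cvterm WHERE name = ?", (name,))
    return row.fetchone()[0]

def add_stock(uniquename, type_name):
    """Insert a stock whose type is a cvterm such as 'cultivar' or 'sample'."""
    conn.execute("INSERT INTO stock (uniquename, type_id) VALUES (?, ?)",
                 (uniquename, term_id(type_name)))

def relate(subject, relation, obj):
    """Store 'subject relation obj', e.g. Gala maternal_parent_of Sonya."""
    conn.execute("""
        INSERT INTO stock_relationship (subject_id, object_id, type_id)
        SELECT s.stock_id, o.stock_id, ?
        FROM stock s, stock o
        WHERE s.uniquename = ? AND o.uniquename = ?
    """, (term_id(relation), subject, obj))

def subjects_of(obj, relation):
    """List the stocks that stand in the given relation to obj."""
    rows = conn.execute("""
        SELECT s.uniquename
        FROM stock_relationship sr
        JOIN stock s  ON s.stock_id = sr.subject_id
        JOIN stock o  ON o.stock_id = sr.object_id
        JOIN cvterm t ON t.cvterm_id = sr.type_id
        WHERE o.uniquename = ? AND t.name = ?
        ORDER BY s.uniquename
    """, (obj, relation)).fetchall()
    return [r[0] for r in rows]

add_stock("Gala", "cultivar")
add_stock("Sonya", "cultivar")
add_stock("Gala-001", "sample")
relate("Gala-001", "sample_of", "Gala")
relate("Gala", "maternal_parent_of", "Sonya")

maternal_parents = subjects_of("Sonya", "maternal_parent_of")
samples = subjects_of("Gala", "sample_of")
print(maternal_parents, samples)
```

Samples, cultivars and populations all live in the one stock table; only the cvterm type and the relationship type distinguish a sample link from a pedigree link.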

Page 6

Storing phenotype data (from measurements to projects)

stock: stock_id, name, uniquename, type_id, organism_id, residues

nd_experiment: nd_experiment_id, nd_geolocation_id, type_id

phenotype: phenotype_id, uniquename, value, attr_id

project: project_id, name, description

cvterm: cvterm_id, name, definition, cv_id, dbxref_id

Linking tables: NE_stock (nd_experiment_stock), NE_phenotype (nd_experiment_phenotype), NE_project (nd_experiment_project), project_relationship

Example nd_experiment types (cvterm): phenotyping, genotyping, cross_experiment
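To show how a single measurement is reached from a project, the sketch below links stock, phenotype and project through nd_experiment in an in-memory SQLite database. The tables are trimmed to the columns needed here, and the project name, phenotype uniquename and measured value are made-up illustration data, not GDR records.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cvterm (cvterm_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE stock (stock_id INTEGER PRIMARY KEY, uniquename TEXT);
CREATE TABLE project (project_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE nd_experiment (
    nd_experiment_id INTEGER PRIMARY KEY,
    nd_geolocation_id INTEGER,
    type_id INTEGER REFERENCES cvterm(cvterm_id));
CREATE TABLE phenotype (
    phenotype_id INTEGER PRIMARY KEY, uniquename TEXT, value TEXT,
    attr_id INTEGER REFERENCES cvterm(cvterm_id));
-- Linking tables (NE_stock, NE_phenotype, NE_project on the slide).
CREATE TABLE nd_experiment_stock (nd_experiment_id INTEGER, stock_id INTEGER);
CREATE TABLE nd_experiment_phenotype (nd_experiment_id INTEGER, phenotype_id INTEGER);
CREATE TABLE nd_experiment_project (nd_experiment_id INTEGER, project_id INTEGER);
""")

# Hypothetical illustration data: one phenotyping experiment on one sample.
conn.executemany("INSERT INTO cvterm VALUES (?, ?)",
                 [(1, "phenotyping"), (2, "crisp")])
conn.execute("INSERT INTO stock VALUES (1, 'Gala-001')")
conn.execute("INSERT INTO project VALUES (1, 'demo_trial')")
conn.execute("INSERT INTO nd_experiment VALUES (1, NULL, 1)")
conn.execute("INSERT INTO phenotype VALUES (1, 'Gala-001_crisp_demo', '2.2', 2)")
conn.execute("INSERT INTO nd_experiment_stock VALUES (1, 1)")
conn.execute("INSERT INTO nd_experiment_phenotype VALUES (1, 1)")
conn.execute("INSERT INTO nd_experiment_project VALUES (1, 1)")

# Walk from project to measurement through the experiment and its linkers.
measurements = conn.execute("""
    SELECT p.name, s.uniquename, et.name, trait.name, ph.value
    FROM project p
    JOIN nd_experiment_project nep   ON nep.project_id = p.project_id
    JOIN nd_experiment e             ON e.nd_experiment_id = nep.nd_experiment_id
    JOIN cvterm et                   ON et.cvterm_id = e.type_id
    JOIN nd_experiment_stock nes     ON nes.nd_experiment_id = e.nd_experiment_id
    JOIN stock s                     ON s.stock_id = nes.stock_id
    JOIN nd_experiment_phenotype nph ON nph.nd_experiment_id = e.nd_experiment_id
    JOIN phenotype ph                ON ph.phenotype_id = nph.phenotype_id
    JOIN cvterm trait                ON trait.cvterm_id = ph.attr_id
    WHERE p.name = ?
""", ("demo_trial",)).fetchall()

print(measurements)
```

The experiment row is the hub: the same pattern with nd_experiment_genotype in place of nd_experiment_phenotype would serve a genotyping experiment.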

Page 7

Genotypic data integrated with genomic/genetic data

nd_experiment: nd_experiment_id, nd_geolocation_id, type_id

genotype: genotype_id, name, uniquename, description

feature: feature_id, name, uniquename, type_id, organism_id, residues

Linking tables: NE_genotype (nd_experiment_genotype), feature_genotype

Related tables: project, stock, map

Example genotype: uniquename: CPSCT038_190|192, description: 190:192

Example feature: uniquename: CPSCT038, type: microsatellite

Explore sequences around marker in GBrowse
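The example genotype uniquename appears to combine the marker uniquename with the observed alleles. Assuming the pattern is marker, underscore, then alleles separated by "|" (inferred only from the examples on these slides, not a documented GDR rule), a small parser might look like this; the function name is hypothetical.

```python
def parse_genotype_uniquename(uniquename):
    """Split a genotype uniquename into (marker, alleles).

    Assumes the form '<marker>_<allele>|<allele>', as in 'CPSCT038_190|192'.
    The marker part is everything before the last underscore.
    """
    marker, _, allele_part = uniquename.rpartition("_")
    if not marker or not allele_part:
        raise ValueError(f"unexpected genotype uniquename: {uniquename!r}")
    return marker, tuple(allele_part.split("|"))


marker, alleles = parse_genotype_uniquename("CPSCT038_190|192")
print(marker, alleles)
```

The marker part is what would match feature.uniquename (CPSCT038, a microsatellite), which is how a genotype record leads to map positions and to the surrounding sequence in GBrowse.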

Page 8

Relationship between genotype and phenotype (haplotype and haplotype effect)

nd_experiment: nd_experiment_id, nd_geolocation_id, type_id

genotype: genotype_id, name, uniquename, description

feature: feature_id, name, uniquename, type_id, organism_id, residues

phenotype: phenotype_id, uniquename, value, attr_id

phenstatement: phenstatement_id, type_id, genotype_id, phenotype_id, environment, pub

Linking tables: NE_genotype (nd_experiment_genotype), NE_phenotype (nd_experiment_phenotype), feature_genotype

Related tables: project, stock, map

Example genotype: uniquename: MA_H3|H4b, description: H3|H4b

Example feature: uniquename: Ma, type: MTL

Example phenotype: attr_id: crisp, value: 2.2

Germplasm with H3|H4b alleles of MA locus has a value of 2.2 for crisp
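The haplotype-effect statement can be reproduced with a phenstatement row that ties a genotype to a phenotype. The sketch below uses sqlite3 and keeps only the columns needed for the example; the environment and pub columns of phenstatement are left out, and the phenotype uniquename is a made-up value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cvterm (cvterm_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE feature (
    feature_id INTEGER PRIMARY KEY, uniquename TEXT,
    type_id INTEGER REFERENCES cvterm(cvterm_id));
CREATE TABLE genotype (
    genotype_id INTEGER PRIMARY KEY, uniquename TEXT, description TEXT);
CREATE TABLE feature_genotype (feature_id INTEGER, genotype_id INTEGER);
CREATE TABLE phenotype (
    phenotype_id INTEGER PRIMARY KEY, uniquename TEXT, value TEXT,
    attr_id INTEGER REFERENCES cvterm(cvterm_id));
-- Simplified: the environment and pub columns are omitted in this sketch.
CREATE TABLE phenstatement (
    phenstatement_id INTEGER PRIMARY KEY,
    genotype_id INTEGER REFERENCES genotype(genotype_id),
    phenotype_id INTEGER REFERENCES phenotype(phenotype_id));
""")

conn.executemany("INSERT INTO cvterm VALUES (?, ?)", [(1, "MTL"), (2, "crisp")])
conn.execute("INSERT INTO feature VALUES (1, 'Ma', 1)")
conn.execute("INSERT INTO genotype VALUES (1, 'MA_H3|H4b', 'H3|H4b')")
conn.execute("INSERT INTO feature_genotype VALUES (1, 1)")
# 'crisp_H3H4b_demo' is a hypothetical uniquename for the effect record.
conn.execute("INSERT INTO phenotype VALUES (1, 'crisp_H3H4b_demo', '2.2', 2)")
conn.execute("INSERT INTO phenstatement (genotype_id, phenotype_id) VALUES (1, 1)")

row = conn.execute("""
    SELECT g.description, f.uniquename, ph.value, trait.name
    FROM phenstatement ps
    JOIN genotype g          ON g.genotype_id = ps.genotype_id
    JOIN feature_genotype fg ON fg.genotype_id = g.genotype_id
    JOIN feature f           ON f.feature_id = fg.feature_id
    JOIN phenotype ph        ON ph.phenotype_id = ps.phenotype_id
    JOIN cvterm trait        ON trait.cvterm_id = ph.attr_id
""").fetchone()

alleles, locus, value, trait = row
statement = (f"Germplasm with {alleles} alleles of {locus} locus "
             f"has a value of {value} for {trait}")
print(statement)
```

Here the phenotype row records an effect attached to a haplotype rather than a measurement on one plant, so it is reached through phenstatement instead of through nd_experiment.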

Page 9

GDR Breeding Database Demo

Data Management (Browse, Search and Download)

Data Conversion (Generate Input files for Pedimap)

Decision Support: Cross Assist, Trait Locus Warehouse, Marker Converter

Page 10

Phenotypic Data Search

Page 11

Page 12

Genotypic Data Search

Page 13

Cross Assist

o A web interface to generate a list of parents and the number of seedlings needed to obtain progeny with desired traits

o Methods
  o “Phenotype” (uses only phenotypic information of individuals in the dataset)
  o “+Pedigree” (uses both phenotypic and pedigree information)
  o “+Ped+DNA” (uses phenotypic and pedigree information plus information provided by DNA-based functional genotypes)
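The slides do not say how Cross Assist arrives at the number of seedlings, so the following is only illustrative arithmetic, not the tool's method. It assumes each seedling from a cross independently meets the trait thresholds with a known probability, in which case the expected number of seedlings needed to obtain a target number of qualifying progeny is the target divided by that probability. The function name and its parameters are hypothetical.

```python
import math


def expected_seedlings(target_progeny, fraction_meeting_thresholds):
    """Expected seedlings to raise to obtain target_progeny qualifying plants.

    Assumes each seedling independently meets the thresholds with the given
    probability (an illustrative assumption, not Cross Assist's algorithm).
    """
    if target_progeny < 0:
        raise ValueError("target_progeny must be non-negative")
    if not 0 < fraction_meeting_thresholds <= 1:
        raise ValueError("fraction_meeting_thresholds must be in (0, 1]")
    return math.ceil(target_progeny / fraction_meeting_thresholds)


# Two hypothetical crosses, with made-up fractions of qualifying progeny.
for cross, fraction in [("cross A", 0.25), ("cross B", 0.5)]:
    print(cross, expected_seedlings(10, fraction))
```

Under this assumption, a cross whose progeny qualify half as often needs twice as many seedlings, which is one reason a required-seedlings figure is useful for ranking candidate parents.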

Page 14

Step 1: Select Method

Page 15

Step 2: Select target number and trait thresholds

Page 16

Step 3: Filter results by data completeness, required number of seedlings, and parentage

Page 17

Future Development

o Data
  o RosBreed QTLs and their genome positions
  o More breeding data and DNA-based functional genotypes
  o More re-sequencing data

o Functionality
  o Data management: online data submission and editing
  o Viewing data on screen and generating report pages
  o Decision support tools
    o Cross Assist: to accommodate more complex situations (selfing, cross compatibility, etc.) and to upload users’ own data
    o Further develop more tools

Page 18

Acknowledgement

Natural diversity module working group: Naama Menda, Seth Redmond, Robert M. Buels, Maren Friesen, Yuri Bendana, Lacey-Anne Sanderson, Hilmar Lapp, Taein Lee, Bob MacCallum, Kirstin E. Bett, Scott Cain, Dave Clements, Lukas A. Mueller and Dorrie Main

Main Lab team: Taein Lee, Stephen Ficklin, Chun-Huai Cheng, Ping Zheng, Anna Blenda, Sushan Ru, Dorrie Main, Jing Yu

All Project CoPIs (tfGDR, RosBreed and CottonGen)

Funding Sources: USDA NIFA SCRI, NSF Plant Genome Program, USDA-ARS, Washington Tree Fruit Research Commission, Cotton Incorporated, Washington State University, Clemson University, University of Florida, Boyce Thompson Institute, North Carolina State University

Page 19

Thank You! Any Questions?