SolGS workshop 2016

solGS: Web-based Genomic Selection Analysis Tool

Upload: solgenomics

Post on 19-Jan-2017

TRANSCRIPT

Page 1: SolGS workshop 2016

solGS: Web-based Genomic Selection Analysis Tool

Page 2: SolGS workshop 2016

Purposes

Gain an understanding of genomic selection (GS), GS model building, breeding value prediction, and assessing the quality of model input and output data.

Brainstorm ideas to make the tool better suit your research purposes.

Page 3: SolGS workshop 2016

Outline

Overview of GS and solGS

Demo, exercises, bug watching, feedback

Brainstorming

Page 4: SolGS workshop 2016

Genomic selection…

[Diagram: training population (phenotyped & genotyped individuals) -> prediction model -> predicted breeding values (GEBVs) of genotyped selection candidates]

Page 5: SolGS workshop 2016

GS advantages

Little or no phenotyping: reduced cost

Shorter breeding cycles: higher selection gain per unit time

Increased prediction accuracy

Page 7: SolGS workshop 2016

Challenges…

Data volume, storage

Data structuring, cleaning, imputation

Statistical analysis complexity

Visualization and sharing

Page 8: SolGS workshop 2016

solGS

http://cassavabase.org/solgs

Page 9: SolGS workshop 2016

What you can do with solGS…

Store data (Chado Natural Diversity schema)

Create training dataset

Build models and predict breeding values of selection candidates

Test model accuracy

Page 10: SolGS workshop 2016

What you can do with solGS…

Explore phenotype data

Evaluate population structure

Check the relationship between GEBVs and observed phenotypes

Calculate selection indices, correlation

Visualize data on interactive plots

Calculate selection response
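
As an illustration of the selection index item above, here is a minimal sketch that ranks genotypes by a weighted sum of standardized GEBVs. The slides do not say how solGS computes its index, so the standardization, the weights, and the function name `selection_index` are assumptions made for this example only.

```python
from statistics import mean, pstdev


def selection_index(gebvs, weights):
    """Rank genotypes by a weighted sum of standardized GEBVs.

    gebvs:   {trait: {genotype: GEBV}}   (hypothetical layout)
    weights: {trait: relative weight}; a negative weight selects against a trait.
    Assumes every trait has GEBVs for the same genotypes and non-zero spread.
    """
    index = {}
    for trait, weight in weights.items():
        values = list(gebvs[trait].values())
        mu, sd = mean(values), pstdev(values)
        for genotype, value in gebvs[trait].items():
            # Standardize so traits on different scales are comparable.
            index[genotype] = index.get(genotype, 0.0) + weight * (value - mu) / sd
    # Best genotypes first.
    return dict(sorted(index.items(), key=lambda item: item[1], reverse=True))


# Made-up GEBVs for two traits: favour high FRW, select against CMDS.
example = {
    "FRW": {"A": 1.0, "B": 2.0, "C": 3.0},
    "CMDS": {"A": 3.0, "B": 2.0, "C": 1.0},
}
ranking = selection_index(example, {"FRW": 1.0, "CMDS": -1.0})
```

Standardizing each trait before weighting keeps a trait measured on a large scale from dominating the index; other index definitions are equally possible.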

Page 11: SolGS workshop 2016

What is the statistical approach behind solGS?

Page 12: SolGS workshop 2016

…preparing phenotype data

Omits individuals completely missing phenotype values

Adjusts phenotype values for block effects

Averages across multiple trials after adjusting for block effects
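
The three steps above can be sketched roughly as follows. This is only an illustration: the block adjustment here simply subtracts each block's deviation from its trial mean, which is an assumption and may differ from the model solGS actually fits. The record layout and the name `prepare_phenotypes` are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def prepare_phenotypes(records):
    """records: iterable of (trial, block, genotype, value); value may be None.

    Returns {genotype: mean of block-adjusted values across trials}.
    """
    # Step 1: drop missing observations; genotypes with no data at all vanish.
    observed = [r for r in records if r[3] is not None]

    # Step 2: estimate block effects as block mean minus trial mean (assumption).
    trial_values, block_values = defaultdict(list), defaultdict(list)
    for trial, block, _, value in observed:
        trial_values[trial].append(value)
        block_values[(trial, block)].append(value)
    trial_mean = {t: mean(v) for t, v in trial_values.items()}
    block_effect = {
        key: mean(v) - trial_mean[key[0]] for key, v in block_values.items()
    }

    # Adjusted values per genotype within each trial.
    per_trial = defaultdict(list)
    for trial, block, genotype, value in observed:
        per_trial[(trial, genotype)].append(value - block_effect[(trial, block)])

    # Step 3: average within each trial, then across trials.
    per_genotype = defaultdict(list)
    for (_, genotype), values in per_trial.items():
        per_genotype[genotype].append(mean(values))
    return {g: mean(v) for g, v in per_genotype.items()}
```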

Isaak Tecle
Page 13: SolGS workshop 2016

…preparing genotype data

Removes monomorphic markers

Removes markers with > 60% missing values

Removes markers with MAF < 5%

Removes individuals with > 80% missing values

Imputes missing marker data (median substitution)
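
The filters listed above could look like this in code. It is a sketch under stated assumptions: genotypes are coded as allele dosages 0/1/2 with `None` for missing calls, the filters run in the order shown on the slide, and the per-individual missing rate is computed on the markers that survive the marker filters. The function name and data layout are hypothetical, not taken from solGS.

```python
from statistics import median


def filter_and_impute(genotypes, max_marker_missing=0.60, min_maf=0.05,
                      max_individual_missing=0.80):
    """genotypes: {individual: [dosage or None, ...]}, one entry per marker.

    Returns (kept marker indices, {individual: filtered and imputed dosages}).
    """
    names = list(genotypes)
    n_markers = len(next(iter(genotypes.values())))

    kept_markers = []
    for j in range(n_markers):
        calls = [genotypes[i][j] for i in names if genotypes[i][j] is not None]
        missing_rate = 1 - len(calls) / len(names)
        # Drop markers with too many missing calls or only one observed value.
        if missing_rate > max_marker_missing or len(set(calls)) < 2:
            continue
        allele_freq = sum(calls) / (2 * len(calls))
        if min(allele_freq, 1 - allele_freq) < min_maf:
            continue
        kept_markers.append(j)

    # Drop individuals missing too many of the remaining markers.
    rows = {}
    for i in names:
        row = [genotypes[i][j] for j in kept_markers]
        if row and sum(v is None for v in row) / len(row) > max_individual_missing:
            continue
        rows[i] = row

    # Median substitution, marker by marker.
    for col in range(len(kept_markers)):
        observed = [row[col] for row in rows.values() if row[col] is not None]
        if not observed:
            continue  # nothing to impute from; leave the gaps in place
        fill = median(observed)
        for row in rows.values():
            if row[col] is None:
                row[col] = fill
    return kept_markers, rows
```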

Page 14: SolGS workshop 2016

…statistical modeling

Univariate

Two-stage analysis

RR-BLUP (Endelman, Plant Genome (2010))

GBLUP: marker-based realized relationship matrix

Prediction accuracy: based on 10-fold cross-validation
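
To make the GBLUP and cross-validation items concrete, here is a small pure-Python sketch. It is not the solGS implementation: the variance ratio `lam` is fixed rather than estimated by REML, the mean is taken as the plain training mean, the relationship matrix uses one common scaling (Z Z' divided by 2 times the sum of p(1-p)), and accuracy is the correlation between observed values and pooled out-of-fold predictions. All function names are hypothetical.

```python
import random
from statistics import mean


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def relationship_matrix(dosages):
    """Marker-based realized relationship matrix from 0/1/2 dosages."""
    n, k = len(dosages), len(dosages[0])
    p = [sum(row[j] for row in dosages) / (2 * n) for j in range(k)]
    z = [[row[j] - 2 * p[j] for j in range(k)] for row in dosages]
    scale = 2 * sum(pj * (1 - pj) for pj in p)
    return [[sum(u * v for u, v in zip(zi, zj)) / scale for zj in z] for zi in z]


def gblup_predict(g, y, train, test, lam=1.0):
    """Predict test individuals as mu + G[test, train] (G[train, train] + lam I)^-1 (y - mu)."""
    mu = mean(y[i] for i in train)
    a = [[g[i][j] + (lam if i == j else 0.0) for j in train] for i in train]
    alpha = solve(a, [y[i] - mu for i in train])
    return [mu + sum(g[t][j] * alpha[k] for k, j in enumerate(train)) for t in test]


def pearson(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    syy = sum((v - my) ** 2 for v in y)
    return sxy / (sxx * syy) ** 0.5


def cross_validate(g, y, folds=10, lam=1.0, seed=1):
    """k-fold cross-validation; returns correlation of predictions with y."""
    order = list(range(len(y)))
    random.Random(seed).shuffle(order)
    predicted = [0.0] * len(y)
    for f in range(folds):
        test = order[f::folds]
        held_out = set(test)
        train = [i for i in order if i not in held_out]
        for i, value in zip(test, gblup_predict(g, y, train, test, lam)):
            predicted[i] = value
    return pearson(predicted, y)
```

In practice the variance components would be estimated from the data, for example with a mixed-model solver, rather than fixed as they are here.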

Page 15: SolGS workshop 2016

How does solGS work?

Page 16: SolGS workshop 2016

Websites for exercise

Cassava-devel.sgn.cornell.edu

Cassava-test.sgn.cornell.edu

Review.cassavabase.org

Cassavabase.org

https://iita-mirror.cassavabase.org

https://172.30.2.199

Username: sgn

Password: eggplant

Page 17: SolGS workshop 2016

Genomic selection steps…

[Diagram: training dataset (phenotyped & genotyped individuals) -> prediction model -> predicted breeding values (GEBVs) of selection candidates]

Page 18: SolGS workshop 2016

Demo: Part I

Create training data set & build model

Explore model input and output

Phenotype and genetic correlation

Population structure

Selection index

Page 19: SolGS workshop 2016

Things to consider when creating a training data set & building a model

Page 20: SolGS workshop 2016

Things to consider… phenotype data

Number of phenotyped individuals: minimum 20 clones

Relevant to target environment

Data quality: experimental design, measurement accuracy, missing values, outliers

Page 21: SolGS workshop 2016

Things to consider… genotype data

Marker number, genome distribution, polymorphism

Data quality: allele calling accuracy, missing values (per marker, per individual)

Minor alleles, heterozygosity, LD

Population structure

Page 22: SolGS workshop 2016

Let’s do stuff!

Page 23: SolGS workshop 2016

Single trial – single trait: create training data set and build model (trial method)

Search for trial ‘Cassava Ibadan 2002/03’

Create a training dataset with that trial

Check description, correlation

Build a model for FRW

Explore model input and output, model accuracy

Download GEBVs

Page 24: SolGS workshop 2016

Exercise: single trial – single trait: create training data set and build model

Search for your trial

Create a training dataset with that trial

Check description, correlation

Build a model for your trait

Explore model input and output, population structure, model accuracy

Download GEBVs

Page 25: SolGS workshop 2016

Single trial – multiple traits: create training data set and build models

Search for trial ‘Cassava Ibadan 2002/03’

Create a training dataset with that trial

Check description, correlation

Build models for FRW and CMDS

Explore model input and output for each model

Genetic correlation

Selection index

Page 26: SolGS workshop 2016

Exercise: single trial – multiple traits: create training data set and build models

Search for your trial

Create a training dataset with that trial

Check description, correlation

Build models for two traits at the same time

Explore model input and output for each model

Genetic correlation

Calculate and download selection index

Page 27: SolGS workshop 2016

Combined trials – single trait: create training data set and build models using two trials

Search for ‘cassava ibadan 02/03 & 01/02’

Create a training dataset with the trials

Check description, correlation

Build a model for FRW

Explore model input and output for the model

Population structure

Prediction accuracy

Download GEBVs

Page 28: SolGS workshop 2016

Exercise: combined trials – single trait: create training data set and build models using two trials

Search for your trials

Create a training dataset with the trials

Check description, correlation

Build a model for your trait

Explore model input and output for the model

Population structure

Prediction accuracy

Download GEBVs

Page 29: SolGS workshop 2016

Using a list – single trait: create training data set and build a model using a plots list

Using the search wizard, create a plots list from trial ‘cassava ibadan 2002/03 plots’

Create a training dataset with the list

Check description, correlation

Build a model for FRW

Explore model input and output for the model

Population structure

Prediction accuracy

Download GEBVs

Page 30: SolGS workshop 2016

Exercise: using a list – single trait: create training data set and build a model using a plots list

Using the search wizard, create a plots list from a trial (select all plots)

Create a training dataset with the list

Check description, correlation

Build a model for your trait

Explore model input and output for the model

Population structure

Prediction accuracy

Download GEBVs

Page 31: SolGS workshop 2016

Demo: Part II

Predict breeding values of selection populations

Genetic correlation

Selection index

Selection gain

Page 32: SolGS workshop 2016

Things to consider when applying a model to predict breeding values of selection populations

Page 33: SolGS workshop 2016

Things to consider… applying the model

Training population vs selection population genetic relationship

Target environment

Marker types used

Population structure

Page 34: SolGS workshop 2016

Predict GEBVs of a selection population

Create training data set & build model: Cassava Ibadan 2002/03, FRW

Search for a selection population: Cassava Ibadan 2003/04

Predict GEBVs for the selection population

Check selection response

Download GEBVs

Page 35: SolGS workshop 2016

Exercise: selection population prediction

Create training data set & build model: use one of the models you already built

Search for a selection population: related to the training population

Predict GEBVs for the selection population

Check selection response

Download GEBVs

Page 36: SolGS workshop 2016

Multiple traits: predict GEBVs of a selection population

Create training data set & build models: Cassava Ibadan 2002/03, FRW and CMDS

Search for a selection population: Cassava Ibadan 2003/04

Predict GEBVs for both traits for the selection population

Check selection response

Download GEBVs

Page 37: SolGS workshop 2016

Exercise: multiple traits selection population prediction

Create training data set & build models: use the previous two models from your training populations

Search for a selection population

Predict GEBVs for both traits for the selection population

Check genetic correlation

Calculate selection index

Page 38: SolGS workshop 2016

List: predict GEBVs of a selection population

Create training data set & build model: Cassava Ibadan 2002/03, FRW

Search for a selection candidates list: Cassava Ibadan 213 genotypes

Predict GEBVs for the selection population

Check selection response

Download GEBVs

Page 39: SolGS workshop 2016

Exercise: selection candidates list

Create training data set & build model: go to a previous model page

Create a selection candidates list: use the search wizard to create an accessions list

Using the model, predict GEBVs of the list

Check selection response

Download GEBVs

Page 40: SolGS workshop 2016

Demo: Part III

Trait search

Search for ‘fresh root weight’

Select trial ‘cassava ibadan 2002/03’

Check model output

Page 41: SolGS workshop 2016

Demo: Part III

PCA using accessions list

Page 42: SolGS workshop 2016

Brainstorm for new features

Make priority list

What features do you like in BMS?

What features would you like to be added to cassavabase?

Page 43: SolGS workshop 2016

Thanks to…

Page 46: SolGS workshop 2016

Composing a training population: fitting a prediction model... 3 options

Page 48: SolGS workshop 2016

Fitting a prediction model…

Option 1: Search using a trait name

Page 60: SolGS workshop 2016

Applying the model…

Estimating breeding values of selection candidates

Page 63: SolGS workshop 2016

Fitting a prediction model…

Option 2: Search for trials

Page 70: SolGS workshop 2016

Applying the models…

Estimating breeding values of selection candidates for multiple traits

Page 73: SolGS workshop 2016

Estimating genetic correlations

Page 75: SolGS workshop 2016

Calculating selection indices

Page 77: SolGS workshop 2016

Fitting a prediction model…

Option 3: use your own list of individuals

Page 80: SolGS workshop 2016

To sum up…

Store data

Build prediction models

Estimate breeding values

Additional analyses: correlation analysis, population structure, selection indices

http://cassavabase.org/solgs

Open source code

Page 81: SolGS workshop 2016

Thanks to…

Page 83: SolGS workshop 2016

Many thanks!!

Background image: nextgencassava.org