![Page 1: Computational Infrastructure for Systems Genetics Analysis Brian Yandell , UW-Madison](https://reader036.vdocuments.mx/reader036/viewer/2022062521/568156c7550346895dc458b9/html5/thumbnails/1.jpg)
1
Computational Infrastructure for Systems Genetics Analysis
Brian Yandell, UW-Madison
high-throughput analysis of systems dataenable biologists & analysts to share tools
UW-Madison: Yandell,Attie,Broman,KendziorskiJackson Labs: ChurchillU Groningen: Jansen,SwertzUC-Denver: TabakoffLabKey: Igra
![Page 2: Computational Infrastructure for Systems Genetics Analysis Brian Yandell , UW-Madison](https://reader036.vdocuments.mx/reader036/viewer/2022062521/568156c7550346895dc458b9/html5/thumbnails/2.jpg)
![Page 3: Computational Infrastructure for Systems Genetics Analysis Brian Yandell , UW-Madison](https://reader036.vdocuments.mx/reader036/viewer/2022062521/568156c7550346895dc458b9/html5/thumbnails/3.jpg)
typical workflow (from Mark Igra, LabKey)
3
FileStorage
DatabaseStorage
Cluster
1 Transfer files. 2 Configure Analysis
3
Bio-Web Server
Run analysis and load results
4 Review results, repeat steps 2-4 as required
![Page 4: Computational Infrastructure for Systems Genetics Analysis Brian Yandell , UW-Madison](https://reader036.vdocuments.mx/reader036/viewer/2022062521/568156c7550346895dc458b9/html5/thumbnails/4.jpg)
4
view results(R graphics,
GenomeSpace tools)
systems genetics portal
(PhenoGen)
collaborativeportal
(LabKey)
iterate many times
get data (GEO, Sage)
run pipeline(CLIO,XGAP,HTDAS)
![Page 5: Computational Infrastructure for Systems Genetics Analysis Brian Yandell , UW-Madison](https://reader036.vdocuments.mx/reader036/viewer/2022062521/568156c7550346895dc458b9/html5/thumbnails/5.jpg)
![Page 6: Computational Infrastructure for Systems Genetics Analysis Brian Yandell , UW-Madison](https://reader036.vdocuments.mx/reader036/viewer/2022062521/568156c7550346895dc458b9/html5/thumbnails/6.jpg)
data access model
closed limited open
publishedshared
collaborationin progress
patent issues
proprietaryinternal
Sage Commons
Company X
![Page 7: Computational Infrastructure for Systems Genetics Analysis Brian Yandell , UW-Madison](https://reader036.vdocuments.mx/reader036/viewer/2022062521/568156c7550346895dc458b9/html5/thumbnails/7.jpg)
7
analysis pipeline acts on objects(extends concept of GenePattern)
pipeline
checks
input output
settings
![Page 8: Computational Infrastructure for Systems Genetics Analysis Brian Yandell , UW-Madison](https://reader036.vdocuments.mx/reader036/viewer/2022062521/568156c7550346895dc458b9/html5/thumbnails/8.jpg)
8
pipeline is composed of many steps
Ai B
C
DEoD’
![Page 9: Computational Infrastructure for Systems Genetics Analysis Brian Yandell , UW-Madison](https://reader036.vdocuments.mx/reader036/viewer/2022062521/568156c7550346895dc458b9/html5/thumbnails/9.jpg)
9
causal model selection choices in context of larger, unknown network
focal trait
target trait
focal trait
target trait
focal trait
target trait
focal trait
target trait
causal
reactive
correlated
uncorrelated
![Page 10: Computational Infrastructure for Systems Genetics Analysis Brian Yandell , UW-Madison](https://reader036.vdocuments.mx/reader036/viewer/2022062521/568156c7550346895dc458b9/html5/thumbnails/10.jpg)
10
BxH ApoE-/- chr 2: causal architecture
hotspot
12 causal calls
![Page 11: Computational Infrastructure for Systems Genetics Analysis Brian Yandell , UW-Madison](https://reader036.vdocuments.mx/reader036/viewer/2022062521/568156c7550346895dc458b9/html5/thumbnails/11.jpg)
11
BxH ApoE-/- causal networkfor transcription factor Pscdbp
causal trait
![Page 12: Computational Infrastructure for Systems Genetics Analysis Brian Yandell , UW-Madison](https://reader036.vdocuments.mx/reader036/viewer/2022062521/568156c7550346895dc458b9/html5/thumbnails/12.jpg)
view results(R graphics,
GenomeSpace tools)
systems genetics portal
(PhenoGen)
collaborativeportal
(LabKey)
iterate many times
get data(GEO, Sage)
develop analysis models & algorithms
run pipeline(CLIO,XGAP,HTDAS)
updateperiodically
![Page 13: Computational Infrastructure for Systems Genetics Analysis Brian Yandell , UW-Madison](https://reader036.vdocuments.mx/reader036/viewer/2022062521/568156c7550346895dc458b9/html5/thumbnails/13.jpg)
13
platform for biologists and analysts
• create and extend pipeline steps• share algorithms–public library–private authentication
• compare methods on one platform• combine data from multiple studies
![Page 14: Computational Infrastructure for Systems Genetics Analysis Brian Yandell , UW-Madison](https://reader036.vdocuments.mx/reader036/viewer/2022062521/568156c7550346895dc458b9/html5/thumbnails/14.jpg)
14
pipeline
checks
input output
settings
rawcode
version controlsystem
R&D
![Page 15: Computational Infrastructure for Systems Genetics Analysis Brian Yandell , UW-Madison](https://reader036.vdocuments.mx/reader036/viewer/2022062521/568156c7550346895dc458b9/html5/thumbnails/15.jpg)
15
Model/View/Controller (MVC) software architecture
• isolate domain logic from input and presentation• permit independent development, testing, maintenance
ControllerInput/response
Viewrender for interaction
Modeldomain-specific logic
user changes
system actions
![Page 16: Computational Infrastructure for Systems Genetics Analysis Brian Yandell , UW-Madison](https://reader036.vdocuments.mx/reader036/viewer/2022062521/568156c7550346895dc458b9/html5/thumbnails/16.jpg)
16
perspectives for building a communitywhere disease data and models are shared
Benefits of wider access to datasets and models:1- catalyze new insights on disease & methods2- enable deeper comparison of methods & results
Lessons Learned:1- need quick feedback between biologists & analysts2- involve biologists early in development3- repeated use of pipelines leads to
documented learning from experienceincreased rigor in methods
Challenges Ahead:1- stitching together components as coherent system2- ramping up to ever larger molecular datasets
![Page 17: Computational Infrastructure for Systems Genetics Analysis Brian Yandell , UW-Madison](https://reader036.vdocuments.mx/reader036/viewer/2022062521/568156c7550346895dc458b9/html5/thumbnails/17.jpg)
17
www.stat.wisc.edu/~yandell/[email protected]
• UW-Madison– Alan Attie– Christina Kendziorski– Karl Broman– Mark Keller– Andrew Broman– Aimee Broman– YounJeong Choi– Elias Chaibub Neto– Jee Young Moon– John Dawson– Ping Wang– NIH Grants DK58037, DK66369, GM74244,
GM69430 , EY18869
• Jackson Labs (HTDAS)– Gary Churchill– Ricardo Verdugo– Keith Sheppard
• UC-Denver (PhenoGen)– Boris Tabakoff– Cheryl Hornbaker– Laura Saba– Paula Hoffman
• Labkey Software– Mark Igra
• U Groningen (XGA)– Ritsert Jansen– Morris Swertz– Pjotr Pins– Danny Arends
• Broad Institute– Jill Mesirov– Michael Reich
![Page 18: Computational Infrastructure for Systems Genetics Analysis Brian Yandell , UW-Madison](https://reader036.vdocuments.mx/reader036/viewer/2022062521/568156c7550346895dc458b9/html5/thumbnails/18.jpg)
18
Systems Genetics Analysis PlatformBrian Yandell, UW-Madison
high-throughput analysis of systems dataenable biologists & analysts to share toolsUW-Madison: Attie, Broman,KendziorskiJackson Labs: ChurchillU Groningen: Jansen, SwertzUC-Denver: TabakoffLabKey: Igra hotspot
causal trait