Copyright © 2008, SAS Institute Inc. All rights reserved.
Discovering Meaningful Patterns in Genomics Data with JMP Genomics
Jordan Hiller, JMP Genomics Application Scientist
2/19/2010
About SAS
SAS founded in 1976
• About 10,000 employees worldwide
• Over $2 billion in revenue
SAS provides statistical analysis solutions to pharma, financial and retail businesses, among others
• Provides robust statistical and data mining algorithms
• SAS programming language is powerful, but complex
About JMP
We are a division of SAS
Founded in 1989 by John Sall, co-founder of SAS
Originally Mac only, ported to PC in the 90s
Desktop software, highly visual
Provides easy-to-use statistical analysis software for non-SAS users
JMP Genomics
SAS on the back-end
• Using established analytic tools: SAS/STAT, IML, SAS/Genetics
• Powerful and flexible language for data manipulation and statistical analysis
• Can handle large genomic datasets with file-based analysis
JMP up front
• Point-and-click interface to SAS analytics
• Dynamic, interactive visual display of results
Best of both worlds
• Choose from JMP or SAS tools
Philosophy of JMP Genomics
We provide pre-built workflows for biologist users
…while retaining statistical flexibility and transparency demanded by analysts
We offer many options for exploring and visualizing complex data sets
JMP Genomics Architecture
[Diagram: “Run” submits the program to local SAS; SAS macros generate scripts and output tables]
New in 4.1: Option to submit job to SAS server
New in 4.1: JMP Genomics 64-bit edition
Highlights of Expression Workflows
Import, QC, and analyze large datasets
• Traditional arrays, summaries from next-gen
• Simplified workflows, advanced options for statisticians
Highlights of Genetics Workflows
GWAS (up to 1.5M SNPs x 15,000 for 4.0)
Adjustments for population structure, relatedness
QC to filter markers and samples
[Figure panels: Association Tests; Population Structure Assessment/Correction; Linkage Disequilibrium; Relationship Matrices (Individual and Group)]
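The marker and sample QC step mentioned on this slide can be illustrated with a minimal Python sketch. It is not JMP Genomics code: the function names, the 0/1/2 genotype coding, and the call-rate and minor-allele-frequency thresholds are all assumptions chosen for illustration, not the product's defaults.

```python
from typing import List, Optional

# Assumed coding: 0, 1 or 2 copies of the alternate allele; None = missing call.
Genotype = Optional[int]


def call_rate(calls: List[Genotype]) -> float:
    """Fraction of genotype calls that are not missing."""
    if not calls:
        return 0.0
    return sum(c is not None for c in calls) / len(calls)


def minor_allele_frequency(calls: List[Genotype]) -> float:
    """Frequency of the rarer allele among the non-missing calls."""
    observed = [c for c in calls if c is not None]
    if not observed:
        return 0.0
    alt_freq = sum(observed) / (2 * len(observed))
    return min(alt_freq, 1.0 - alt_freq)


def filter_markers(genotypes: List[List[Genotype]],
                   min_call_rate: float = 0.95,
                   min_maf: float = 0.05) -> List[int]:
    """Return indices of markers (rows) that pass both thresholds."""
    return [
        i for i, calls in enumerate(genotypes)
        if call_rate(calls) >= min_call_rate
        and minor_allele_frequency(calls) >= min_maf
    ]


def filter_samples(genotypes: List[List[Genotype]],
                   min_call_rate: float = 0.95) -> List[int]:
    """Return indices of samples (columns) with a sufficient call rate."""
    n_samples = len(genotypes[0]) if genotypes else 0
    keep = []
    for j in range(n_samples):
        column = [row[j] for row in genotypes]
        if call_rate(column) >= min_call_rate:
            keep.append(j)
    return keep
```

Here `genotypes[marker][sample]` is a small in-memory matrix. A real GWAS with millions of SNPs would need a file-based or chunked approach like the one the slides describe.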
Highlights of Copy Number Workflows
Import, QC, Normalization and ANOVA
Support SNP6, Illumina 1M SNP
ANOVA: Group vs. Group, Individual vs. Group
Copy Number Partition (Faster CBS)
Support for Next-Gen Data
Primary: Image analysis, Base-calling
Secondary: Sequence Alignment
Tertiary: Statistical Analysis of Summarized Data
National Center for Genome Resources
Mudge et al., PLoS One, 2008.
Check our Code
All SAS code and JSL (JMP Scripting Language) code underlying Genomics analytic procedures is accessible
Key Benefits
• Transparency: Examine underlying code to understand procedures
• Reproducibility: Use workflows to document analysis steps, standardize across data sets
• Extensibility: Adapt existing code or write your own
• Flexibility: Using our analytic toolkit, run analyses we didn’t think of
What’s New in JMP Genomics 4.1
Support for miRNA, Tiling Arrays, and imputed SNP data
New Population Measures and Relationship Matrix
Q-K Mixed Model process to test for SNP association while adjusting for family relatedness and population structure simultaneously
Manhattan Plots
Improved P-Value Browser visualizes statistical results and genomic information together
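For readers unfamiliar with Manhattan plots, the sketch below shows how the plotted coordinates are typically derived from association results. Each SNP's p-value becomes -log10(p) on the y-axis, and chromosomes are laid end to end on the x-axis. This is an illustrative Python sketch, not the JMP Genomics implementation. The function names and the `(chromosome, position, p_value)` tuple layout are assumptions, and each chromosome's length is approximated by the largest position observed on it.

```python
import math
from typing import Dict, List, Tuple

# Assumed record layout: (chromosome label, base-pair position, p-value)
SnpResult = Tuple[str, int, float]


def manhattan_coordinates(results: List[SnpResult],
                          chrom_order: List[str]) -> List[Tuple[int, float]]:
    """Map each SNP to (cumulative genome position, -log10 p-value)."""
    # Approximate each chromosome's length by its largest observed position.
    max_pos: Dict[str, int] = {}
    for chrom, pos, _ in results:
        max_pos[chrom] = max(max_pos.get(chrom, 0), pos)

    # Offset of each chromosome = total length of the chromosomes before it.
    offsets: Dict[str, int] = {}
    running = 0
    for chrom in chrom_order:
        offsets[chrom] = running
        running += max_pos.get(chrom, 0)

    # p-values must be strictly positive for the logarithm to be defined.
    return [(offsets[chrom] + pos, -math.log10(p)) for chrom, pos, p in results]


def bonferroni_line(n_tests: int, alpha: float = 0.05) -> float:
    """Height of a Bonferroni significance line on the -log10 scale."""
    return -math.log10(alpha / n_tests)
```

The returned pairs can be passed to any scatter-plot routine. Points above `bonferroni_line(len(results))` would be flagged under a simple Bonferroni correction, which is only one of several possible multiple-testing adjustments.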
Today’s Demonstration
Gene Expression Analysis
• Basic Expression Workflow
− QC Analysis
− ANOVA
Today’s Dataset
Affymetrix MPRO hourly dataset
MPRO cell line stimulated with retinoic acid
5 timepoints: 0, 1, 2, 4, and 8 hours
4 technical replicates at each timepoint
Murine U74A arrays
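With 5 timepoints and 4 replicates each, a per-gene one-way ANOVA compares between-timepoint variance to within-timepoint variance on (5 - 1, 20 - 5) = (4, 15) degrees of freedom. The sketch below computes that F statistic for a single gene in plain Python. It is a hedged illustration, not the model fitted by the JMP Genomics ANOVA process. The function name is an assumption, and the expression values are made-up placeholders rather than data from the MPRO experiment.

```python
from typing import List, Tuple


def one_way_anova_f(groups: List[List[float]]) -> Tuple[float, int, int]:
    """Return (F statistic, between-group df, within-group df).

    Raises ZeroDivisionError if there is no within-group variation.
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of individual observations around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical log2 expression values for one probe set: 5 timepoints x 4 replicates.
expression_by_hour = {
    0: [7.1, 7.0, 7.2, 6.9],
    1: [7.4, 7.6, 7.5, 7.3],
    2: [8.0, 8.2, 7.9, 8.1],
    4: [8.6, 8.4, 8.7, 8.5],
    8: [8.9, 9.1, 9.0, 8.8],
}
f_stat, df1, df2 = one_way_anova_f(list(expression_by_hour.values()))
print(f"F = {f_stat:.2f} on ({df1}, {df2}) degrees of freedom")
```

In a full analysis this calculation would be repeated for every probe set on the array, followed by a multiple-testing adjustment of the resulting p-values.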
Thank you
Contact Info:
Jordan Hiller
919-531-9809