My definition of Transcriptomics Research
Research on the profile of RNA transcripts and their abundance under specific conditions
High-throughput/high-dimensional data
Purposes:
♦ Define the RNA profile expressed in a specific tissue or cell type: the "transcriptome"
♦ Identify RNAs responding to treatment
♦ Identify RNAs responding to genetic differences
Additional Research Goals using Transcriptomic Data
Identify gene sets coordinately responding to treatment/genetic change
♦ Identify co-regulation
♦ Identify transcriptional regulatory proteins
♦ Identify regulatory pathways
♦ Identify dependencies among pathways
Integrate with QTL mapping to find QT loci regulating expression
Methods used to Generate Transcriptomic Data
1. High-volume cDNA sequencing (Expressed Sequence Tag (EST) projects)
2. Quantitative PCR
3. Serial Analysis of Gene Expression and new HT sequencing methods
4. Microarray-based methods and results
1. EST Sequencing: What is an EST and how can it be used to investigate the genome?
[Figure: cDNA library creation. Transcription creates an mRNA population specific to the cell; individual cDNA inserts are sequenced to generate ESTs]
Uses of ESTs:
♦ Comparative mapping
♦ Comparative sequence analysis
♦ Expression analysis
♦ Protein functional analysis
EST: 400-500 bp single-pass sequence from the expressed portion of the genome
♦ 97-99% accurate
EST Coverage in dbEST
http://www.ncbi.nlm.nih.gov/dbEST/dbEST_summary.html
Large Pig EST Projects (> 5,000 seq)*
*as of June 2006, total of 566,277 deposited; now 1.7 million

Institution                                                     Contact Name               ESTs Submitted
USDA-ARS Meat Animal Research Center                            Smith TPL                  197,149
National Institute of Agrobiological Sciences (Japan)           Uenishi H                  137,092
Roslin Institute (U.K.)                                         Anderson SI/Archibald A    56,364
University of Missouri-Columbia                                 Prather RS                 37,806
Institut National de la Recherche Agronomique (France)          Tosser-Klopp G/Bonnet A    24,956
Iowa State University                                           Tuggle CK                  20,983
Animal Technology Institute (Taiwan)                            Lee W-C                    14,266
USDA-Plum Island                                                Neilan JG                  14,240
Oklahoma State University                                       DeSilva U                  12,825
Michigan State University                                       Ernst C/Coussens P         12,804
Nevada Department of Agriculture                                Rink A                     11,556
National Chung-Hsing University (Taiwan)                        Huang M-C                  9,373
University of Nebraska-Lincoln                                  Pomp D                     5,414
Sino-Danish EST Data Released
♦ Largest single porcine EST project
♦ Contains 823,871 novel ESTs and 398,837 public ESTs
♦ Tissues came from both fetal and adult pigs
» Brain, eye, circulatory (heart, aorta), gut, bone marrow, cartilage, glandular (suprarenal, thyroid, mammary, lymphatic), muscle, mucosal membrane, and reproductive
♦ 97 different non-normalized libraries
♦ ~9,000 sequences/library on average
♦ They have estimated expression levels based on EST frequency within their dataset
♦ They have also identified putative SNPs within these sequences (libraries from multiple breeds)
♦ Current total is over 1,700,000 porcine ESTs in NCBI databases: O.C. starting point to annotate the pig Affymetrix array
www.piggenome.dk

Non-normalized versus normalized libraries
♦ Normalization of libraries is useful to more deeply sequence the RNA complement of a specific tissue
♦ RNA frequencies differ greatly among genes, spanning 4-5 orders of magnitude!
♦ Normalization uses hybridization among highly abundant members of the RNA pool to remove those sequences, bringing the RNA frequencies, and thus the sampling, closer to a "normal" distribution
♦ But only sequence data from non-normalized libraries can be used to estimate expression levels
Confirmation of Normalization in Pig cDNA Libraries - ISU

Library Name   Tissue Source**               ESTs generated   Clusters in Library   Unique Clusters   Fraction Unique
A1-A3          Ant. Pituitary                1,235            1,054                 560               0.45
NA             Normalized Ant. Pit.          963*             835*                  430*              0.51*
AY0            Term Placenta                 1,411            1,040                 563               0.40
AY1            Term Placenta-Norm.           3,565            2,899                 2,028             0.57
CP0            Uterus (D12/14)               1,669            1,229                 701               0.42
CP1            Uterus (D12/14)-Norm.         1,836            1,635                 1,057             0.58
E3-E6          Whole Embryo/Fetus            1,274            1,061                 603               0.47
H1-H5          Hypothalamus (0, 5, 12 DE)    1,104            1,031                 542               0.49
O1-O3          Ovary (0, 5, 12 DE)           1,429            1,289                 710               0.50
Totals                                       14,607*          12,048*               7,194*
*estimates based on analysis of first 560 sequences
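The effect of normalization in the table can be checked with a few lines of Python. This is a minimal sketch that assumes "Fraction Unique" is unique clusters divided by ESTs generated (inferred from the numbers, not stated on the slide); the function name is illustrative only.

```python
# Library figures copied from the table: (ESTs generated, unique clusters).
LIBRARIES = {
    "AY0 Term Placenta": (1411, 563),
    "AY1 Term Placenta-Norm.": (3565, 2028),
    "CP0 Uterus (D12/14)": (1669, 701),
    "CP1 Uterus (D12/14)-Norm.": (1836, 1057),
}


def fraction_unique(ests, unique_clusters):
    """Assumed definition: share of ESTs that fall in library-unique clusters."""
    return unique_clusters / ests


for name, (ests, unique) in LIBRARIES.items():
    print(f"{name}: {fraction_unique(ests, unique):.2f}")
```

Under that assumption the normalized placenta and uterus libraries reproduce the higher fractions reported in the table (0.57 and 0.58 versus 0.40 and 0.42).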
Use of mixed tagged libraries for efficient EST production
Production of Tissue-tagged cDNA libraries
[Figure: workflow diagram. Step labels: mRNA; anneal tag-T18 primer; create cDNA from mRNA; size selection to obtain best clones; ligate into pT7T3-Pac plasmid; insert new plasmids into bacteria to make clone libraries; create mixed normalized libraries; sequence]

Library Tag Identification
[Figure: layout of a sequencing read. Example read: ....TAAGCTTGCGGCCGCCAAACTTTTTTTTTT.... Labels: plasmid vector; Not I site; library tag (example); poly-T tail; cDNA; sequencing]
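Library tag identification can be sketched in Python. This is only an illustration under stated assumptions: that the tag lies between the Not I site (GCGGCCGC) and the poly-T tail, as the example read suggests, and that a run of at least 8 T's marks the tail. `MIN_POLY_T` and `extract_library_tag` are hypothetical names, not part of the original pipeline.

```python
import re

NOT_I_SITE = "GCGGCCGC"  # Not I recognition sequence
MIN_POLY_T = 8           # assumed minimum length of the poly-T run

# Lazy match: the shortest stretch after the Not I site that is followed by poly-T.
TAG_PATTERN = re.compile(NOT_I_SITE + r"([ACGT]+?)(T{%d,})" % MIN_POLY_T)


def extract_library_tag(read):
    """Return the assumed library tag from a read, or None if no tag is found."""
    match = TAG_PATTERN.search(read.upper())
    return match.group(1) if match else None


print(extract_library_tag("TAAGCTTGCGGCCGCCAAACTTTTTTTTTT"))
```

A tag that itself ends in T cannot be separated from the tail by this pattern, so a real implementation would more likely match reads against the list of known tags.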
3. Serial Analysis of Gene Expression: Digital Analysis
http://www.sagenet.org/findings/index.html
Serial analysis of gene expression (SAGE) is a method for comprehensive analysis of gene expression patterns - no cDNAs produced.
Three principles underlie the SAGE methodology:
1. A short sequence tag (10-14 bp) contains sufficient information to uniquely identify a transcript, provided that the tag is obtained from a unique position within each transcript (may not be universally true)
2. Sequence tags can be linked together and then cloned and sequenced
3. Counting the number of times a particular tag is observed provides the expression level of the corresponding transcript
The result is a digital analysis of expression
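The third principle (tag counts as expression levels) can be sketched as follows. This is a simplified illustration: it assumes tags of fixed length laid end to end, whereas real SAGE concatemers also contain anchoring-enzyme sites between ditags. The function names are hypothetical.

```python
from collections import Counter


def split_concatemer(concatemer, tag_length=10):
    """Cut a concatenated sequence into consecutive fixed-length tags."""
    last_start = len(concatemer) - tag_length
    return [concatemer[i:i + tag_length] for i in range(0, last_start + 1, tag_length)]


def tags_per_million(tags):
    """Convert raw tag counts into tags per million (TPM)."""
    counts = Counter(tags)
    total = sum(counts.values())
    return {tag: count * 1_000_000 / total for tag, count in counts.items()}


tags = split_concatemer("AAAAAAAAAACCCCCCCCCCAAAAAAAAAAGGGGGGGGGG")
print(tags_per_million(tags))
```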
Digital Analysis of Gene Expression
♦ A limitation of SAGE is that large numbers of tags must be sequenced to find rare transcripts.
♦ New extremely high throughput sequencing technologies may solve this problem.
Example: Illumina technology
♦ A single-copy transcript is estimated to occur at a frequency of about 1 per 350,000 transcripts.
♦ Thus a single-copy transcript will be "read" about 3 times per 1,000,000 reads, or 3 TPM
♦ With 4 million reads per run, every transcript should have about 12 reads or more
http://www.illumina.com/pagesnrn.ilmn?ID=70#234
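The read-depth arithmetic can be reproduced in a few lines. The Poisson estimate of missing a transcript entirely is an added assumption (reads sampled independently), not something stated on the slide.

```python
import math


def expected_reads(transcript_frequency, total_reads):
    """Expected number of reads for a transcript at the given frequency."""
    return transcript_frequency * total_reads


def prob_unobserved(transcript_frequency, total_reads):
    """Probability of zero reads, assuming Poisson sampling (an assumption)."""
    return math.exp(-expected_reads(transcript_frequency, total_reads))


single_copy = 1 / 350_000
print(round(expected_reads(single_copy, 1_000_000), 2))  # reads per million (TPM)
print(round(expected_reads(single_copy, 4_000_000), 1))  # reads in a 4-million-read run
print(prob_unobserved(single_copy, 4_000_000))
```

The unrounded values are about 2.9 TPM and about 11.4 reads per run; the slide rounds to 3 TPM first, which gives 12.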
Illumina Digital Analysis of Gene Expression: Overview of the extremely high throughput Illumina technology
♦ The process can be scaled as well: if you need more data on low-level transcripts, you can simply sequence a second time

Illumina Digital Analysis of Gene Expression: Reproducible results
♦ Comparable results with QPCR
2. Quantitative Real-time PCR: testing expression one gene at a time
Real time: refers to the fact that the amplification of the specific sequence is measured in real time, rather than by more traditional endpoint analyses on gels
Animation of fluorescent-probe-based real-time PCR:
http://www.sigmaaldrich.com/Life_Science/Molecular_Biology/PCR/Key_Resources/Probed_based_QPCR_Animation.html
[Figure: amplification plot. Axis label: fluorescence; the threshold cycle (CT) is marked]
Quantitative Real-time PCR: ΔΔCt method
Ref: http://pathmicro.med.sc.edu/pcr/realtime-home.htm
Normalization of data using a "control" gene; assumes that the control gene is not affected by treatment
Fold change calculation:
2^(ΔΔCt): 2,702-fold increase in IL1b RNA due to treatment
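A minimal sketch of the ΔΔCt calculation follows. The Ct values are hypothetical placeholders, not data from the slide, and the code assumes 100% amplification efficiency (a doubling per cycle). The sign convention here takes ΔΔCt as untreated ΔCt minus treated ΔCt, so that an increase in RNA gives a fold change above 1; many texts define ΔΔCt with the opposite sign and write 2^(-ΔΔCt).

```python
def delta_ct(ct_target, ct_control_gene):
    """Normalize the target gene Ct to the control gene Ct in the same sample."""
    return ct_target - ct_control_gene


def fold_change(ct_target_treated, ct_control_treated,
                ct_target_untreated, ct_control_untreated):
    """Fold change of the target due to treatment, assuming perfect doubling per cycle."""
    ddct = (delta_ct(ct_target_untreated, ct_control_untreated)
            - delta_ct(ct_target_treated, ct_control_treated))
    return 2.0 ** ddct


# Hypothetical Ct values for illustration only.
print(fold_change(20.0, 18.0, 25.0, 18.0))
```

With this convention, a ΔΔCt of about 11.4 cycles corresponds to the 2,702-fold increase quoted for IL1b.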
4. Transcriptional Profiling using microarrays
Two general methods to make microarrays
♦ In situ synthesis: Affymetrix GeneChip
♦ Spotting cDNA segments or oligos onto glass slides

Generic Two-color Microarray Procedure
[Figure: workflow adapted from Nuwaysir et al., 1999. Labels: oligos or cDNA fragments spotted; data analysis critical!]
Important aspects to consider when using microarray technology to reliably measure biology
♦ Sources of variability
♦ Experimental design
♦ Statistical analyses of data
♦ Validation
♦ Standardization
Sources of variability
For cDNA spotted arrays:
♦ Accuracy of clone-tracking and PCR amplification of cDNAs
♦ Spotting quality
♦ Spot detection/analysis
♦ Identity of cDNA spotted: annotation of sequence, which is often partial

Sources of variability
For both cDNA and oligonucleotide spotted arrays:
♦ Amount of nucleic acid spotted across arrays
♦ Spotting consistency and scanning quantification
♦ Variable hybridization protocols and results
♦ Inter-laboratory comparison
Affymetrix Pros and Cons
Benefits
♦ Consistency from chip to chip due to manufacturing technology and QC
♦ No clone tracking or spotting variation
♦ Multiple values collected for each transcript: data depth high
♦ Mismatch control improves specificity
♦ High level of coverage of genes (especially compared to currently available livestock spotted arrays)
Limitations
♦ Inflexible design; difficult to rapidly change feature content
♦ High cost may decrease critical biological replication
Experimental Design Issues
♦ Reference design versus Loop design
» Most relevant to spotted arrays
D. Nettleton
Reference Design
♦ Heavily used in initial pre-clinical settings to compare "normal" to "cancerous"
♦ Could be useful to determine "abnormal" expression pattern
♦ Problem: the "known" (reference) sample is measured the most!

Loop Design
♦ All samples measured the same number of times in "loop"
♦ Works well for multiple treatments, a logical series of treatments (concentrations of drug, etc.), as well as time series after treatment
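The difference between the two designs can be made concrete by counting how often each sample is hybridized. This is a minimal sketch with hypothetical sample names; each tuple stands for one two-color slide, and which element gets Cy3 or Cy5 is left open.

```python
from collections import Counter


def reference_design(samples, reference="Reference"):
    """Every sample is co-hybridized with the same reference sample."""
    return [(sample, reference) for sample in samples]


def loop_design(samples):
    """Each sample is paired with the next one, closing the loop at the end."""
    n = len(samples)
    return [(samples[i], samples[(i + 1) % n]) for i in range(n)]


def measurements_per_sample(slides):
    """Count how many times each sample appears across all slides."""
    return Counter(name for slide in slides for name in slide)


samples = ["0h", "8h", "24h", "48h"]  # hypothetical time series
print(measurements_per_sample(reference_design(samples)))
print(measurements_per_sample(loop_design(samples)))
```

With the same number of slides, the reference design spends half of all measurements on the reference, while the loop measures every experimental sample twice.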
Statistics
♦ Initial work simply compared levels of Cy3 and Cy5 signal and set an ad hoc 2-fold difference in expression as the cutoff (i.e., reference design).
♦ It is now clear that standard linear model ANOVA methods are more appropriate, and that the effect of multiple testing must also be taken into account
♦ False discovery rate calculations: described later
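One common way to account for multiple testing is the Benjamini-Hochberg procedure. The slides do not say which FDR method was used, so the sketch below is an assumption chosen for illustration; the p-values are made up.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Genes whose adjusted value falls below a chosen FDR level are then called significant.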
Validation
♦ To verify the expression patterns of key genes showing differential expression in the profiling experiment, one gene at a time
♦ Main tool here is real-time quantitative RT-PCR
♦ Labor-intensive; each assay must be developed
♦ Other approaches coming forward
Standardization
Data warehousing: public access (NCBI GEO)
Meta-analysis: improved power
Suggestions from NIST meeting* on the use of microarrays in the clinic
♦ RNA reference materials
» Known/verifiable set of RNAs for validation of methods
» Spike-in set of artificial RNAs for validation of specific hybridizations
* Cronin et al., Clin. Chem. 50:1464 (2004)
Application of transcriptomics: An RNA expression-based clinical test available commercially
van 't Veer et al., Gene expression profiling predicts clinical outcome of breast cancer, Nature, 2002 415: 530.
- Expression patterns of 70 genes in breast cancer tissue samples were found to accurately predict metastatic outcome (used historical tissue samples)
- This "gene expression signature" was found in a large expression profiling study comparing normal and cancerous tissue samples
- Microarray-based analyses used
"MammaPrint" diagnostic test for breast cancer prognosis
Validation of "Gene Expression Signature" in second cohort of patients
van de Vijver et al., A gene expression signature as a predictor of survival in breast cancer, New England Journal of Medicine, 2002 347: 19.
Project 1. Validating a New Porcine Oligonucleotide Array
Qiagen-Operon synthesized a large (~13,000) set of oligonucleotides last year with collaboration from a USDA-NRSP-8 committee.
We have validated this new microarray (Zhao et al., 2005 Genomics 86:618):
♦ Evaluated utility of sequence set for biology
» Identify number of spots with signal above background
» Determine expression pattern for four tissues
♦ Tested specificity and annotation of oligonucleotides
» Determine correlation of expression pattern for selected spots with the expected pattern found for the annotated match in human/mouse

Testing of Porcine Array
Four tissues: liver, lung, muscle, small intestine.
RNA labeled using Cy3 and Cy5.
Six biological replicates per tissue, each measured twice (once with each dye), for 48 measurements on 24 slides.
Data analysis:
♦ Normalization: LOWESS procedure
♦ Linear model ANOVA (dye and tissue as fixed effects; slide and animal as random effects)
Comparison of labeling same target RNA with Cy3 and Cy5
Using current protocols:
• Only 50 spots (0.38%) have greater than 2-fold difference between targets
• Technical reproducibility good
Further testing/use of Porcine Array
Evaluate utility of sequence set for biology
♦ Identify number of spots with signal above background
♦ Determine expression pattern for four tissues
Test specificity and annotation of oligonucleotides
♦ Determine correlation of expression pattern for selected spots with the expected pattern found for the annotated match in human/mouse
Example Slide Scan
Comparison of labeling different target RNA
[Figure: slide scan. Channels: small intestine (Cy5), liver (Cy3)]
How many spots represent expressed genes in each tissue?
Background: average of Arabidopsis spot signals

Tissue Selective Expression
Expected false positives: 13 per tissue
[Figure: tissue-selective spot counts. Tissue labels: Lung, Liver, Muscle, SI; values as extracted: 266, 147, 405, 538]
Real-time quantitative RT-PCR (Q-PCR)
Tissue expression level‡ by oligo; last column: agree with microarray results?

Oligo ID     Gene Symbol   CT* or ΔCT†   Liver          Lung           Muscle         Small Intestine   Agree?
SS00002529   NOS2A         CT ± SD       37.8 ± 0.7     32.5 ± 0.3     36.1 ± 0.8     36.6 ± 0.9
                           ΔCT ± SD      20.7 ± 0.6a    15.8 ± 0.5c    18.7 ± 0.4b    19.8 ± 1.1ab      yes
SS00010183   ICAM1         CT ± SD       30.4 ± 1.8     28.2 ± 0.7     30.1 ± 0.9     31.7 ± 0.4
                           ΔCT ± SD      13.4 ± 1.9ab   11.6 ± 1.0b    12.7 ± 0.8ab   14.9 ± 0.6a       yes
SS00000872   CASP1         CT ± SD       23.1 ± 0.8     20.3 ± 0.6     25.6 ± 0.2     20.3 ± 0.1
                           ΔCT ± SD      6.0 ± 0.2b     3.7 ± 0.3c     8.1 ± 0.3a     3.5 ± 0.2c        yes
SS00006633   INDO          CT ± SD       27.9 ± 1.3     21.1 ± 0.6     28.4 ± 0.8     27.7 ± 0.3
                           ΔCT ± SD      10.8 ± 1.2a    4.4 ± 0.4b     11.0 ± 0.8a    10.9 ± 0.5a       yes
SS00004427   STAT6         CT ± SD       27.7 ± 1.9     27.2 ± 1.1     27.9 ± 1.2     29.4 ± 0.4
                           ΔCT ± SD      10.6 ± 1.9a    10.5 ± 1.3a    10.5 ± 1.1a    12.2 ± 0.6a       no
SS00002396   IRF1          CT ± SD       22.2 ± 1.2     20.4 ± 0.5     23.6 ± 0.8     21.1 ± 0.2
                           ΔCT ± SD      5.1 ± 0.6b     3.7 ± 0.2c     6.2 ± 0.4a     4.4 ± 0.3bc       yes
SS00002273   IRF2          CT ± SD       23.8 ± 0.7     22.8 ± 0.5     25.0 ± 0.6     23.8 ± 0.2
                           ΔCT ± SD      6.7 ± 0.8ab    6.2 ± 0.2b     7.6 ± 0.5a     7.0 ± 0.3ab       yes
SS00007514   MAKP14        CT ± SD       20.2 ± 0.5     20.6 ± 0.6     21.8 ± 0.7     21.9 ± 0.2
                           ΔCT ± SD      3.1 ± 0.5c     3.9 ± 0.3b     4.4 ± 0.3b     5.2 ± 0.1a        yes
SS00008774   MAKP1         CT ± SD       21.0 ± 0.7     20.2 ± 0.4     21.6 ± 0.5     20.8 ± 0.4
                           ΔCT ± SD      3.9 ± 0.3ab    3.5 ± 0.3b     4.2 ± 0.1a     4.1 ± 0.9a        yes
SS00000832   TGFβ1         CT ± SD       28.4 ± 0.3     25.4 ± 0.6     28.7 ± 0.5     27.1 ± 0.3
                           ΔCT ± SD      11.3 ± 0.5a    8.8 ± 0.8b     11.3 ± 0.7a    10.4 ± 0.5a       no
SS00000662   TGFβ2         CT ± SD       24.7 ± 1.0     20.6 ± 0.6     22.2 ± 0.4     23.7 ± 0.1
                           ΔCT ± SD      7.6 ± 0.7a     3.9 ± 0.4b     4.8 ± 0.1b     6.9 ± 0.2a        yes
SS00004196   RPL32         CT ± SD       17.1 ± 0.7     16.7 ± 0.4     17.4 ± 0.5     16.8 ± 0.2        yes

9 of 11 genes agree with microarray results
- 9 show statistical difference
- 1 more has same direction as MA
Additional verification
How else can we check the oligo expression data?
♦ Check data against expression of ortholog in other species
♦ [Also check position of gene in genome: confirm conservation of gene order across species as well]
Example: pig to mouse patterns
[Figure: Pyruvate Kinase M2 expression, mouse data compared with pig data. Tissue labels: skeletal muscle, SI, liver, lung]
Project 2. Infection Response Transcriptomics
Current Purposes:
♦ To use bioinformatics to investigate expression profiles in porcine mesenteric lymph node during Salmonella infection
♦ Initiate characterization of the regulatory pathways controlling host response to Salmonella challenge
♦ Test new 23K Affymetrix Porcine GeneChip
Long-term Goal:
♦ Identify genes to target for improving disease resistance; few QTL studies have identified genome regions important for resistance to Salmonella or other bacteria in pigs
Why lymph node?
The lymph node is the place where the innate (early, non-specific) immune response talks to the adaptive (later, specific) immune system
Why mesenteric lymph node?
The mesenteric lymph node (MLN) is the place where Salmonella usually enters the body
Caveats for studying Immune Response at the MLN transcriptome level
♦ Advantage of sampling the mesenteric lymph node is that we are studying at least a portion of the real host response
♦ Disadvantage is that the results are the combined efforts of a number of cell types, which may or may not have responded to Salmonella
♦ We chose to study the gut-associated lymph tissue of specifically challenged animals: an experiment part-way between the most controlled type of study (single-cell, single-pathogen challenge) and the least controlled type of study (naturally challenged field population)
Experimental Design
• Pigs infected with 1 billion cfu S. Choleraesuis or S. Typhimurium.
• Lymph nodes collected: uninfected, and 8 h, 24 h, 48 h, and 21 d post-infection.
• Three pigs per time point
Wang et al., 2007
Affymetrix Probeset Annotation
1. Sequence-based similarity using Affymetrix consensus sequence (2004 porcine data)
♦ Used BLAST to RefSeq Database
♦ BLASTN used; criterion: e-10 maximum score
♦ TBLASTX used; criterion: e-5 maximum score
♦ Hit rate to human RefSeq: 14,949/23,937* or 62%
• *Low-value annotations (918):
» 480: Chr xx orf yy
» 111: FLJxxxxx hypothetical protein
» 159: KIAA cDNAs
» 120: LOCxxxxx annotations
» 48: MGC hypothetical protein
2. GO function/component/process annotation
♦ Used GO terms associated with mouse RefSeq at NCBI
♦ 1 or more GO terms were matched to 10,820 Affymetrix probesets
[Figure: two bar charts of the number of differentially expressed genes (up-regulated and down-regulated), one after Typhimurium infection and one after Choleraesuis infection. x-axis: time course comparisons 8h/0, 24h/0, 48h/0, 21d/0; y-axis: number of genes]
Summary of differentially expressed genes
• p < 0.01 and fold change > 2.0
• FDR ranges from 0.04 - 0.26
[Figure panels: differentially expressed in ST infection; differentially expressed in SC infection]
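The selection rule (p < 0.01 and fold change > 2.0) can be sketched as a simple filter. It is an assumption here that the fold-change cutoff applies in both directions, so that a ratio below 0.5 counts as down-regulated; the function name and the example values are hypothetical.

```python
def classify_gene(p_value, ratio, p_cutoff=0.01, fold_cutoff=2.0):
    """Return 'up', 'down', or None for an infected/uninfected expression ratio.

    ratio must be positive; a ratio of 0.25 is treated as a 4-fold decrease.
    """
    if p_value >= p_cutoff:
        return None
    if ratio > fold_cutoff:
        return "up"
    if ratio < 1.0 / fold_cutoff:
        return "down"
    return None


# Hypothetical (p-value, ratio) pairs for illustration only.
for p_value, ratio in [(0.005, 3.0), (0.005, 0.25), (0.02, 5.0), (0.001, 1.5)]:
    print(p_value, ratio, classify_gene(p_value, ratio))
```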
Analysis of differentially expressed gene expression patterns
• How to further study ~1,000+ differentially expressed genes?
• Recognize patterns of expression of sets of genes using clustering tools
• Correlate with known biology through annotation:
» to understand specific immune response(s)
» to establish benchmark patterns marking health/disease
» to identify and characterize transcriptional regulatory networks
Why transcription factors and regulatory networks
• Transcription factor (TF) function is very close to RNA expression data
• Emphasize comparative analysis to use wealth of information from human/mouse data
• Meta-analysis of pathogen response shows fundamental pathways in common across many host cell types and pathogens (Jenner and Young 2005)
[Figure: gene with promoter DNA; arrow marks the TF recognition sequence]
Conserved immune response networks (Jenner and Young 2005)
• Meta-analysis of 782 experiments and 77 different host-pathogen interactions studied
• Epithelia, endothelia, macrophage, PBMC, DC, liver, skin, fibroblast, stomach, T cell, B cell….
• 12+ viruses, 10+ bacteria, plus stimulants (LPS, etc.)
• Clustering analysis of data: direction of expression response across all experiments was compared
• 511 genes showed similar pattern of expression upon infection --> co-expressed
• Due to co-regulation?
Common host response networks
RG Jenner and RA Young, 2005
Hierarchical Clustering of Genes
• 848 differentially expressed genes in ST infection
• 1,853 differentially expressed genes in SC infection
• p<0.01; q < 0.24; fold change >2
• all pair-wise comparisons used
• Heat map was built
• genecluster3 and treeview software
Hierarchical Clustering of DE genes - ST infection
[Figure: heat map with up-regulated and down-regulated gene clusters marked]
Up-regulated in ST infection at 24 hr, but highest at 48 hr in SC infection: 105 genes
"Inflammatory immune response/NFκB" cluster contains: 6 chemokines, 6 interferons/interleukins, many NFκB targets
Q-PCR confirms Affymetrix MLN Gene Expression Patterns
Most are known NFκB target genes
Regulatory Networks Revealed by Time-Course “Co-Expression” Data
- Looked at genes induced at two stages of the acute SC infection (annotated only)
- Early response genes (E)- 83 genes; up at 8 and/or 24 hpi
- Late response (L)- 320 genes; induced only at 48 hpi
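The early/late grouping can be written as a small rule. This is a minimal sketch; the input format (a mapping from hours post-infection to whether the gene is induced) is a hypothetical choice, not the format used in the original analysis.

```python
def response_class(induced_at):
    """Classify a gene as early ('E'), late ('L'), or neither (None).

    induced_at maps hours post-infection (8, 24, 48) to True/False.
    Early: up at 8 and/or 24 hpi. Late: induced only at 48 hpi.
    """
    early = induced_at.get(8, False) or induced_at.get(24, False)
    if early:
        return "E"
    if induced_at.get(48, False):
        return "L"
    return None


print(response_class({8: True, 24: False, 48: True}))
print(response_class({8: False, 24: False, 48: True}))
```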
Text-mining to annotate Regulatory Networks
- Pathway Studio analysis of all common regulators of E genes
- A common regulator is a protein or complex that has a PubMed abstract with regulatory links to at least two genes in the list
- 50 E genes could be linked in this way to other members of the E group
- 20 of these 50 genes are known to be regulated by the NFκB complex
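The "at least two genes in list" criterion can be sketched independently of any particular tool. The code below is not Pathway Studio's API; it assumes the text-mined links are already available as (regulator, target) pairs, and all names in the example are placeholders.

```python
from collections import defaultdict


def common_regulators(links, gene_list, min_targets=2):
    """Return regulators linked to at least min_targets genes in gene_list."""
    genes = set(gene_list)
    targets = defaultdict(set)
    for regulator, target in links:
        if target in genes:
            targets[regulator].add(target)
    return {
        regulator: sorted(found)
        for regulator, found in targets.items()
        if len(found) >= min_targets
    }


# Placeholder links for illustration only.
links = [
    ("REG_A", "gene1"), ("REG_A", "gene2"), ("REG_A", "gene9"),
    ("REG_B", "gene3"),
]
print(common_regulators(links, ["gene1", "gene2", "gene3"]))
```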
Specific regulatory network: NFκB known co-regulated genes in E group
- 30 other "co-expressed" genes: potential novel NFκB targets?
- How to test this?
Evidence for TF Regulatory Network
Can we provide additional evidence that NFκB is regulating these gene sets in the pig?
Initial step: Use human promoter sequence as a surrogate to look for regulatory sequences known to mediate NFκB activity at target genes
DNA Motifs at DE Genes: Evidence for TF Regulatory Network
Workflow (from figure):
1. Data after MLN expression analysis
2. Group DE genes by expression similarity (clustering or other criteria)
3. Find human orthologous promoter (-1500 to +500 bp); orthology based on BLAST of human RefSeq; PERL scripts extract sequences from GenBank
4. Motif-finding software: TFM-Explorer identified shared "windows" across group that contain NFκB motifs
5. Promoter regions with over-represented NFκB motifs
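To make the motif step concrete, here is a minimal promoter scan. It is not how TFM-Explorer works (that tool scores position weight matrices and tests for over-representation); this sketch simply assumes the commonly cited simplified NFκB consensus GGGRNWYYCC and reports matches on both strands relative to the transcription start, for a promoter that begins 1500 bp upstream.

```python
import re

# IUPAC codes needed for the assumed consensus.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "W": "[AT]", "N": "[ACGT]"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def consensus_to_regex(consensus):
    """Build a lookahead regex so that overlapping matches are all reported."""
    body = "".join(IUPAC[base] for base in consensus)
    return re.compile("(?=(" + body + "))")


def scan_promoter(sequence, upstream=1500, consensus="GGGRNWYYCC"):
    """Return sorted (position relative to start site, strand) for each match."""
    sequence = sequence.upper()
    pattern = consensus_to_regex(consensus)
    hits = [(m.start() - upstream, "+") for m in pattern.finditer(sequence)]
    reverse = sequence.translate(COMPLEMENT)[::-1]
    for m in pattern.finditer(reverse):
        # Convert the reverse-strand match to its leftmost forward coordinate.
        start = len(sequence) - m.start() - len(consensus)
        hits.append((start - upstream, "-"))
    return sorted(hits)


toy_promoter = "A" * 10 + "GGGACTTTCC" + "A" * 10  # toy sequence, not a real promoter
print(scan_promoter(toy_promoter, upstream=10))
```

A palindromic site would be reported once per strand, and a plain consensus match says nothing about whether the motif is over-represented in a gene group.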
Evidence for Novel NFκB Regulatory Targets
[Figure: bar chart "Putative NFκB Target Genes found by Motif Analysis of DE Genes". y-axis: number of genes with significant windows with over-represented NFκB motifs (0 to 600); groups: Early, Late, All; categories: known NFκB targets found, known NFκB targets not found, unknown NFκB targets found, unknown NFκB targets not found]
Putative unrecognized NFκB target genes?
Example: Is UBD a putative NFκB target gene? Are there NFκB motifs at UBD?
[Figure: UBD promoters for human, mouse, and pig, each drawn from -1500 to +500 relative to position 0, with NFκB motifs marked. Position labels as extracted: -1150, 414; -1061, 341; -1180, 316]
Testing by in vitro binding
EMSA using porcine UBD motif and mouse macrophage cell nuclear extract
[Figure: EMSA gel with the NFκB-DNA complex marked. Lanes:
LPS          -    +    +    +
Competitor   -    -    SP   NSP]
All four porcine promoter motifs tested by EMSA were bound by nuclear proteins
Collaborators
Tuggle Lab
♦ Dr. Shu-hong Zhao
♦ Dr. Yan-fang Wang
♦ Oliver Couture
♦ Sarah Orley
♦ Sender Lkhagvadorj
Microarray development
NRSP8 Swine Genome User Committee
♦ Chris Tuggle, Co-Chair
♦ Daniel Pomp, Co-Chair
♦ Max Rothschild, Coordinator
♦ Jon Beever, Cathy Ernst, Diane Moody, Mike Murtaugh
Qiagen-Operon
♦ Sajeev Batra
University of Minnesota
♦ Vivek Kapur, Archana Deshpande
Dan Nettleton
♦ Justin Recknor
♦ Long Qu
Univ. Iowa Bioinformatics
♦ Dr. Tom Casavant
♦ Dr. Todd Scheetz
♦ Bart Brown
USDA-ARS-Beltsville
♦ Dr. Joan Lunney
♦ Dr. Harry Dawson
♦ Dr. Daniel Kuhar