Arrays against time
Transcriptomics ‘101’
Wuhan 2011 CCC
[Figure: blot lanes labelled WT, Mutant, Overexpressed, Other species]
Transcription assay: Northerns
Extract target RNA
YFG
Label probe + hybridise
Next gene
Quantitate
Problems with Northerns:
• Slow (time consuming)
• Hard (technically challenging)
• "Pick up the sesame seed, lose the watermelon" (you focus on one small thing and miss the big picture)
Systems biology networks. We want to look at lots of transcripts:
AtRegNet (gene regulation network)
Aracyc +other metabolomics data
Arabidopsis gene network (Ma et al. Genome Research 2007)
Arabidopsis merged network: proteins (red), metabolites (blue) and genes (green)
19392 nodes and 72715 edges
"Pick up the sesame seed, lose the watermelon"
[Figure: blot lanes labelled WT, Mutant, Overexpressed, Other species]
Northerns – a few genes at a time.
Extract target RNA
YFG
Label probe + hybridise
Next gene
Quantitate
Again and again and…
Mass transcript profiling: Transcriptomics
• Sequencing ESTs (Déjà vu?)
• Differential display (random 5’ primers + fixed polyA primers)
• Microarrays
Probe preparation
• Acquire or generate probes: ‘All the genes you want’
• Spot
Target preparation
• Extract RNA from your control AND your experimental plant
• Label cDNA from sample 1 RNA … and sample 2 RNA
Hybridise & Scan
• Identify ‘spots’, remove background
• Produce ‘red/green’ ratios
• Link ratio to relative abundance.
• Link spot to gene.
• Link genes to each other.
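The spot-processing steps (remove background, produce ‘red/green’ ratios, link ratio to relative abundance) can be sketched as follows. This is a minimal illustration, not the method of any particular scanner software. The spot values, gene names, the `floor` parameter and the choice of which sample is red are all assumptions made for the example.

```python
import math

def log_ratio(red_fg, red_bg, green_fg, green_bg, floor=1.0):
    """Background-subtract each channel and return log2(red / green).

    `floor` is a hypothetical safeguard: it stops a spot whose signal is
    at or below background from producing log(0) or a negative ratio.
    """
    red = max(red_fg - red_bg, floor)
    green = max(green_fg - green_bg, floor)
    return math.log2(red / green)

# Hypothetical spots: (gene, red foreground, red background,
#                      green foreground, green background)
spots = [
    ("geneA", 2100.0, 100.0, 600.0, 100.0),   # red 2000 vs green 500
    ("geneB", 900.0, 100.0, 900.0, 100.0),    # equal signal in both channels
    ("geneC", 350.0, 100.0, 1100.0, 100.0),   # red 250 vs green 1000
]

# Link each spot (gene) to its ratio, i.e. its relative abundance.
ratios = {gene: log_ratio(r, rb, g, gb) for gene, r, rb, g, gb in spots}

for gene, m in ratios.items():
    print(f"{gene}: log2(red/green) = {m:+.2f}")
```

On a log2 scale a value of +1 means twice as much signal in the red sample, -1 means half as much, and 0 means no difference.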
Arrays: How do you make them?
Arrayers
Pins
• Pin type: blunt, ring, quill, coated…
• Breaking: bending, sticking
• Consistency of spots: ‘coffee-cup’, splash, drip
• Contamination: carry-over, dust, hairs, crystals.
• Etc etc…
Slides
• Cracking
• Splitting
• Exfoliating
• Fluorescing
• Coatings: hydrophobic, hydrophilic, correctly aged poly-lysine (a bit of an art)
• Home-made vs bought (cost of internal vs external quality control).
• Scan before coating, scan after coating, scan after arraying, scan after hyb-ing: all part of QC
• Etc… etc…
The finished spotted array
Before processing, we have a LOT of spots
After processing, we have a LOT of objective data
Example Hybridisation
What biological questions can you answer with arrays?
Sorting out gene families (microarray)
5 hormone response gene family members in different experiments:
1. +hormone vs ctrl hyb
2. Normal vs mutant hyb
3. Root vs shoot hyb
What goes on the slide?
The original choice was:
Mass amplifications of cDNAs identified by partial sequence (ESTs)
However… duplication in genomes is a real problem
[Figure labels: Human, Plant, Yeast]
Apart from wholesale duplication:
Gene families (# of members, as a proportion of the genome)
Members      Unique   2       3     4      5      >5
Proportion   35%      12.5%   7%    4.4%   3.6%   37.4%
Conservation between genes:
• 37% of genes are highly conserved (TBLASTX E < 10^-30)
• 10% more are partially conserved (TBLASTX E < 10^-5)
Gene of interest
ESTs have inherent problems
Example EST sequence 1
Homologous EST sequence 2
Dissimilar EST sequence 3
On the slide (spots 1, 2 and 3):
Labelled target may hybridise similarly to each
Better solutions:
• GSTs (gene specific tags)
• Oligo arrays
• Affymetrix genechips
• RNA seq???
Selection of Expression Probes
[Figure labels: Sequence (5’ to 3’), Probes, Perfect Match, Mismatch, Chip]
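A Perfect Match / Mismatch pair can be sketched as below. The slide only names the two probe types; the detail that the Mismatch probe differs from the Perfect Match at the central (13th) base of the 25-mer comes from the general GeneChip design, not from this slide. The sequence and the function name `probe_pair` are hypothetical, and for simplicity the probe is cut straight from the sequence rather than from its complementary strand.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def probe_pair(sequence, start, length=25):
    """Return (perfect_match, mismatch) probes of `length` bases.

    The mismatch probe is the perfect match with its central base
    swapped for the complementary base.
    """
    pm = sequence[start:start + length]
    if len(pm) != length:
        raise ValueError("sequence too short for a probe at this position")
    mid = length // 2                      # index 12, the 13th base of 25
    mm = pm[:mid] + COMPLEMENT[pm[mid]] + pm[mid + 1:]
    return pm, mm

# Hypothetical stretch of sequence, for illustration only.
sequence = "ATGGCGTACGTTAGCCTAGGATCCGATTACA"
pm, mm = probe_pair(sequence, 0)
print("PM:", pm)
print("MM:", mm)
```

Signal on the Mismatch probe gives a rough estimate of non-specific hybridisation for the matching Perfect Match probe.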
Affymetrix Wafer and Chip Format
• Chip: 1.28 cm
• Feature: 5 - 50 µm x 5 - 50 µm
• Millions of identical oligonucleotide probes per feature
• 49 - 400 chips/wafer
• Up to ~3,000,000 features/chip
Probe cells of an Affymetrix GeneChip contain millions of identical 25-mers
Photolithographic Synthesis
[Figure labels: Lamp, Mask, Chip]
Synthesis of Ordered Oligonucleotide Arrays
One nucleotide at a time.
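The "one nucleotide at a time" idea can be illustrated with a toy simulation. This is a simplified sketch, not the actual manufacturing schedule. It assumes one mask per base per layer, so at most four masks for each position in the probe. The probe sequences and the function name `synthesise` are hypothetical.

```python
def synthesise(probes, bases="ACGT"):
    """Grow all probes in parallel, one base per masked exposure.

    For each position (layer) and each base, the 'mask' exposes only the
    features whose probe needs that base next; those features are then
    extended by that base. Returns the grown probes and the list of
    (position, base, exposed feature indices) steps.
    """
    length = len(probes[0])
    if any(len(p) != length for p in probes):
        raise ValueError("all probes must have the same length")
    grown = [""] * len(probes)
    steps = []
    for position in range(length):
        for base in bases:
            mask = [i for i, probe in enumerate(probes) if probe[position] == base]
            if not mask:
                continue                   # no feature needs this base in this layer
            for i in mask:
                grown[i] += base           # light deprotects, base couples
            steps.append((position, base, mask))
    return grown, steps

# Three hypothetical 4-mer features (real probes are 25-mers).
probes = ["ACGT", "AGGT", "TCGA"]
grown, steps = synthesise(probes)
for position, base, mask in steps:
    print(f"layer {position + 1}: add {base} to features {mask}")
```

Under this assumption a 25-mer array needs at most 4 x 25 = 100 exposure steps, regardless of how many features are on the chip.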
Procedures for Target Preparation
1. RNA (poly-A tailed): RNA quality control
2. cDNA
3. IVT (Biotin-UTP, Biotin-CTP): biotin-labeled transcripts
4. Fragment (heat, Mg2+): fragmented cRNA
5. Hybridise (16 hours)
6. Wash & Stain
7. Scan
GeneChip® Expression Analysis: Hybridization and Staining
Array
cRNA Target
Hybridized Array
Ab detection
Expression Measure
Affymetrix software derives the intensity for each probe from the 75% quantile of the pixel values in each box.
The intensities of the multiple probes within a probeset are combined into ONE measure of expression.
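A minimal sketch of the two steps above: a probe intensity taken as the 75% quantile of its pixel values, then several probe intensities combined into one number. The pixel values are made up, and the quantile uses simple linear interpolation. The summary step is only a stand-in (median of log2 intensities), since the slide does not say which summary the software uses; it is not the Affymetrix or RMA algorithm.

```python
import math
import statistics

def percentile(values, q):
    """q-th percentile (0 to 100), linear interpolation between sorted values."""
    ordered = sorted(values)
    k = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(k), math.ceil(k)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (k - lo)

def probe_intensity(pixels):
    """Intensity of one probe cell: the 75% quantile of its pixel values."""
    return percentile(pixels, 75)

def probeset_summary(probe_intensities):
    """Stand-in summary (an assumption): median of log2 probe intensities."""
    return statistics.median(math.log2(v) for v in probe_intensities)

# Hypothetical pixel values for one probe cell.
print(probe_intensity([50, 10, 40, 20, 30]))

# Hypothetical intensities for the probes of one probeset.
print(probeset_summary([256, 1024, 512]))
```

Using an upper quantile rather than the mean makes the probe value less sensitive to dark pixels at the edge of the box.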
Chips need to be normalised against each other.
Each chip is a different colour in this graph.
Their intensity distributions do not coincide.
To compare chips, their intensities need to be comparable.
RMA uses normalisation at the probe level.
Raw intensities by probe
         PA     PB     PC     PD     PE
Chip 1   1      2      4      3      5
Chip 2   7      2      5      3      1
Chip 3   5      3      4      2      9

Order by ranks
Chip 1   1      2      3      4      5
Chip 2   1      2      3      5      7
Chip 3   2      3      4      5      9

Average the intensities at each rank
Chip 1   1.33   2.33   3.33   4.67   7
Chip 2   1.33   2.33   3.33   4.67   7
Chip 3   1.33   2.33   3.33   4.67   7

Reorder by probe
         PA     PB     PC     PD     PE
Chip 1   1.33   2.33   4.67   3.33   7
Chip 2   7      2.33   4.67   3.33   1.33
Chip 3   4.67   2.33   3.33   1.33   7
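The three steps in this example (order by ranks, average at each rank, reorder by probe) can be written out as a short sketch. It uses the same three chips and five probes as the slide. The function name is hypothetical, and ties within a chip are broken by position, which is a simplification compared with real implementations.

```python
def quantile_normalise(chips):
    """Quantile-normalise a list of chips (each a list of probe intensities)."""
    n_probes = len(chips[0])
    # Step 1: order each chip's intensities by rank.
    sorted_chips = [sorted(chip) for chip in chips]
    # Step 2: average the intensities at each rank across chips.
    rank_means = [sum(column) / len(chips) for column in zip(*sorted_chips)]
    # Step 3: reorder by probe, giving each probe the mean for its rank.
    normalised = []
    for chip in chips:
        order = sorted(range(n_probes), key=lambda i: chip[i])
        row = [0.0] * n_probes
        for rank, probe_index in enumerate(order):
            row[probe_index] = rank_means[rank]
        normalised.append(row)
    return normalised

# Intensities for probes PA..PE on the three chips from the example.
chips = [
    [1, 2, 4, 3, 5],
    [7, 2, 5, 3, 1],
    [5, 3, 4, 2, 9],
]
for name, row in zip(["Chip 1", "Chip 2", "Chip 3"], quantile_normalise(chips)):
    print(name, [round(v, 2) for v in row])
```

After this step every chip has exactly the same set of intensity values, so differences between chips reflect only which probe holds which rank.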
R / BioConductor training
AffylmGUI training
Xspecies analysis training
Normalisation, filtering and annotation
.CDF, filtering, stats and annotation
RMA Normalisation
Sequencing: current / next gen / future
Sequencing is likely to complement arrays in the future
Standard (Sanger) sequencing
[Figure labels: Template, Primer]
Random ddNTP termination.
Label can be added to the:
• Primer
• ddNTP, or
• Incorporated dNTPs
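Random ddNTP termination can be illustrated with a toy simulation. This sketch assumes the label is on the ddNTP, so each terminated fragment reports its last base. The template, the termination probability and the number of molecules are all made-up values, and a fragment length that no molecule happened to produce would simply be missing from the read.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def sanger_fragments(template_3to5, n_molecules=5000, p_terminate=0.2, seed=0):
    """Simulate primer extension with random ddNTP termination.

    Returns {fragment length: labelled terminal base}. Each molecule is
    extended base by base; at every step there is a chance that the
    incorporated nucleotide is a ddNTP, which stops extension.
    """
    rng = random.Random(seed)
    new_strand = "".join(COMPLEMENT[b] for b in template_3to5)
    fragments = {}
    for _ in range(n_molecules):
        for length, base in enumerate(new_strand, start=1):
            if rng.random() < p_terminate:
                fragments[length] = base   # terminated here, label marks the base
                break
    return fragments

def read_sequence(fragments):
    """Separate fragments by size (as on a gel or capillary) and read the labels."""
    return "".join(fragments[length] for length in sorted(fragments))

template = "TACGGATCCATG"                  # hypothetical template, written 3' to 5'
print(read_sequence(sanger_fragments(template)))
```

Because termination is random, the population of molecules contains fragments ending at every position, and sorting them by length recovers the sequence of the newly synthesised strand.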
454 sequencing (images by Roche)
Sample Input and Fragmentation: Genomic DNA or BACs are fractionated into small, 300- to 800-basepair fragments.
Library Preparation: Short adaptors (A and B) - specific for both the 3' and 5' ends - are added to each single stranded fragment.
One Fragment = One Bead: Each fragment of the single-stranded DNA library is immobilized individually onto beads in a water-in-oil mixture.
emPCR (Emulsion PCR) Amplification: Each unique fragment is amplified in parallel to several million per bead.
One Bead = One Read: The clonally amplified fragments are loaded onto a PicoTiterPlate device for sequencing. Only one bead per well.
Auto fluidics flows individual nucleotides in a fixed order across the hundreds of thousands of wells containing one bead. Addition of a nucleotide results in a chemiluminescent signal.
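The fixed-order nucleotide flow can be sketched as a "flowgram": each flow gives a signal only if that nucleotide is incorporated, and a run of identical bases gives a proportionally stronger signal. The flow order `TACG`, the idealised integer signals and the function names are assumptions made for this illustration; real signals are noisy.

```python
def flowgram(read, flow_order="TACG"):
    """Return [(nucleotide, signal)] for a read, flowing bases in a fixed order.

    The signal is the number of bases incorporated in that flow: 0 if the
    next base does not match, n for a homopolymer run of length n.
    """
    unknown = set(read) - set(flow_order)
    if unknown:
        raise ValueError(f"bases not in flow order: {sorted(unknown)}")
    signals = []
    pos = 0
    while pos < len(read):
        for nucleotide in flow_order:
            run = 0
            while pos < len(read) and read[pos] == nucleotide:
                run += 1
                pos += 1
            signals.append((nucleotide, run))
    return signals

def decode(signals):
    """Rebuild the read from the flow signals."""
    return "".join(nucleotide * run for nucleotide, run in signals)

signals = flowgram("TTAGC")
print(signals)
print(decode(signals))
```

The homopolymer case (the `TT` at the start of the example) is where this chemistry is weakest in practice, because the run length has to be estimated from signal strength.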
Solexa sequencing I
Series of images taken from www.illumina.com
Solexa sequencing II
Solexa sequencing III
But the future may be even faster…
• http://www.pacificbiosciences.com/aboutus/video-gallery
• Note: a direct link may be disallowed by the server. Try pasting the URL directly into a browser, then click the SMRT Biology Overview in the video-gallery archive.
Rubber sequencing