Review of important points from the NCBI lectures


Post on 22-Dec-2015


• Review of important points from the NCBI lectures.
  – Example slides

• Review the two types of microarray platforms.
  – Spotted arrays
  – Affymetrix

• Specific examples that use microarray technology.
  – Gene expression - role of a transcription factor

[Figure: NCBI Web Access. Entrez (text), BLAST (sequence), VAST (structure)]

Translated BLAST

Program   Query                     Database
blastx    Nucleotide (translated)   Protein
tblastn   Protein                   Nucleotide (translated)
tblastx   Nucleotide (translated)   Nucleotide (translated)

Particularly useful for nucleotide sequences without protein annotations, such as ESTs or genomic DNA
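The translated searches compare protein sequences derived from nucleotide sequences. As a rough illustration of the translation step only (not of BLAST itself), the sketch below translates a DNA string in all six reading frames using the standard genetic code. The function names and the frame labels (+1 to +3, -1 to -3) are choices made for this example.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons; 'X' marks codons with ambiguous bases."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna):
    """Translate three frames on each strand; '*' marks a stop codon."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


if __name__ == "__main__":
    for frame, protein in six_frame_translation("ATGGCC").items():
        print(frame, protein)
```

For an EST or a stretch of genomic DNA with no annotation, the correct frame and strand are unknown in advance, which is why all six translations are compared.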

Position Specific Score Matrix (PSSM)

Pos  Res    A   R   N   D   C   Q   E   G   H   I   L   K   M   F   P   S   T   W   Y   V
206  D      0  -2   0   2  -4   2   4  -4  -3  -5  -4   0  -2  -6   1   0  -1  -6  -4  -1
207  G     -2  -1   0  -2  -4  -3  -3   6  -4  -5  -5   0  -2  -3  -2  -2  -1   0  -6  -5
208  V     -1   1  -3  -3  -5  -1  -2   6  -1  -4  -5   1  -5  -6  -4   0  -2  -6  -4  -2
209  I     -3   3  -3  -4  -6   0  -1  -4  -1   2  -4   6  -2  -5  -5  -3   0  -1  -4   0
210  S     -2  -5   0   8  -5  -3  -2  -1  -4  -7  -6  -4  -6  -7  -5   1  -3  -7  -5  -6
211  S      4  -4  -4  -4  -4  -1  -4  -2  -3  -3  -5  -4  -4  -5  -1   4   3  -6  -5  -3
212  C     -4  -7  -6  -7  12  -7  -7  -5  -6  -5  -5  -7  -5   0  -7  -4  -4  -5   0  -4
213  N     -2   0   2  -1  -6   7   0  -2   0  -6  -4   2   0  -2  -5  -1  -3  -3  -4  -3
214  G     -2  -3  -3  -4  -4  -4  -5   7  -4  -7  -7  -5  -4  -4  -6  -3  -5  -6  -6  -6
215  D     -5  -5  -2   9  -7  -4  -1  -5  -5  -7  -7  -4  -7  -7  -5  -4  -4  -8  -7  -7
216  S     -2  -4  -2  -4  -4  -3  -3  -3  -4  -6  -6  -3  -5  -6  -4   7  -2  -6  -5  -5
217  G     -3  -6  -4  -5  -6  -5  -6   8  -6  -8  -7  -5  -6  -7  -6  -4  -5  -6  -7  -7
218  G     -3  -6  -4  -5  -6  -5  -6   8  -6  -7  -7  -5  -6  -7  -6  -2  -4  -6  -7  -7
219  P     -2  -6  -6  -5  -6  -5  -5  -6  -6  -6  -7  -4  -6  -7   9  -4  -4  -7  -7  -6
220  L     -4  -6  -7  -7  -5  -5  -6  -7   0  -1   6  -6   1   0  -6  -6  -5  -5  -4   0
221  N     -1  -6   0  -6  -4  -4  -6  -6  -1   3   0  -5   4  -3  -6  -2  -1  -6  -1   6
222  C      0  -4  -5  -5  10  -2  -5  -5   1  -1  -1  -5   0  -1  -4  -1   0  -5   0   0
223  Q      0   1   4   2  -5   2   0   0   0  -4  -2   1   0   0   0  -1  -1  -3  -3  -4
224  A     -1  -1   1   3  -4  -1   1   4  -3  -4  -3  -1  -2  -2  -3   0  -2  -2  -2  -3

Serine is scored differently in these two positions

Active site nucleophile
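A small sketch of how a PSSM is read may help here. Unlike a fixed substitution matrix, the score for a residue depends on the position it is aligned to. The rows below are copied from the matrix on this slide (positions 210, 211 and 214 to 219); the helper names `pssm_score` and `score_segment` are made up for this example and are not part of any NCBI tool.

```python
# Column order of the matrix on the slide.
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"

# Selected rows copied from the slide: position -> scores in column order.
PSSM_ROWS = {
    210: [-2, -5, 0, 8, -5, -3, -2, -1, -4, -7, -6, -4, -6, -7, -5, 1, -3, -7, -5, -6],
    211: [4, -4, -4, -4, -4, -1, -4, -2, -3, -3, -5, -4, -4, -5, -1, 4, 3, -6, -5, -3],
    214: [-2, -3, -3, -4, -4, -4, -5, 7, -4, -7, -7, -5, -4, -4, -6, -3, -5, -6, -6, -6],
    215: [-5, -5, -2, 9, -7, -4, -1, -5, -5, -7, -7, -4, -7, -7, -5, -4, -4, -8, -7, -7],
    216: [-2, -4, -2, -4, -4, -3, -3, -3, -4, -6, -6, -3, -5, -6, -4, 7, -2, -6, -5, -5],
    217: [-3, -6, -4, -5, -6, -5, -6, 8, -6, -8, -7, -5, -6, -7, -6, -4, -5, -6, -7, -7],
    218: [-3, -6, -4, -5, -6, -5, -6, 8, -6, -7, -7, -5, -6, -7, -6, -2, -4, -6, -7, -7],
    219: [-2, -6, -6, -5, -6, -5, -5, -6, -6, -6, -7, -4, -6, -7, 9, -4, -4, -7, -7, -6],
}


def pssm_score(position, residue):
    """Score for aligning `residue` to the given PSSM position."""
    return PSSM_ROWS[position][AMINO_ACIDS.index(residue)]


def score_segment(start, segment):
    """Sum of position-specific scores for an ungapped segment."""
    return sum(pssm_score(start + i, aa) for i, aa in enumerate(segment))


if __name__ == "__main__":
    # The same residue (serine) gets a different score at each position.
    for position in (210, 211, 216):
        print(position, pssm_score(position, "S"))
    print(score_segment(214, "GDSGGP"))
```

In this excerpt serine scores 1, 4 and 7 at positions 210, 211 and 216 respectively, whereas a general-purpose matrix such as BLOSUM62 would give a serine-serine match the same score everywhere.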

PSI-BLAST

Create your own PSSM:

Confirming relationships of purine nucleotide metabolism proteins

[Figure labels: query; BLOSUM62 alignment; PSSM alignment]

Affymetrix vs. glass slide based arrays

• Affymetrix
  – Short oligonucleotides
  – Many oligos per gene
  – Single sample hybridized to chip

• Glass slide
  – Long oligonucleotides or PCR products
  – A single oligo or PCR product per gene
  – Two samples hybridized to chip

Bacterial DNA microarrays

• Small genome size

• Fully sequenced genomes, well annotated

• Ease of producing biological replicates

• Genetics

Applications of DNA microarrays

• Monitor gene expression
  – Study regulatory networks
  – Drug discovery - mechanism of action
  – Diagnostics - tumor diagnosis
  – etc.

• Genomic DNA hybridizations
  – Explore microbial diversity
  – Whole genome comparisons
  – Diagnostics - tumor diagnosis

• ?

Characterization of the stationary phase sigma factor regulon (σH) in Bacillus subtilis

• Robert A. Britton and Alan D. Grossman - Massachusetts Institute of Technology.

• Patrick Eichenberger, Eduardo Gonzalez-Pastor, and Richard Losick - Harvard University.

What is a sigma factor?

• Directs RNA polymerase to promoter sequences

• Bacteria use many sigma factors to turn on regulatory networks at different times.
  – Sporulation
  – Stress responses
  – Virulence

Wosten, 1998

Alternative sigma factors in B. subtilis sporulation

Kroos and Yu, 2000

The stationary phase sigma factor: σH

• most active at the transition from exponential growth to stationary phase

• mutants are blocked at stage 0 of sporulation

• known targets involved in:
  – phosphorelay (kinA, spo0F)
  – sporulation (sigF, spoVG)
  – cell division (ftsAZ)
  – cell wall (dacC)
  – general metabolism (citG)
  – phosphatase inhibitors (phr peptides)

Experimental approach

• Compare expression profiles of wt and ∆sigH mutant at times when sigH is active.

• Artificially induce the expression of sigH during exponential growth.
  – When Sigma-H is normally not active.
  – Might miss genes that depend on additional factors other than Sigma-H.

• Identify potential promoters using computer searches.

[Figure: Pspac-sigH construct]

Samples: ∆sigH and wild-type

• Grow cells
• Isolate RNA
• Make labeled cDNA
• Mix and hybridize
• Scan slide
• Analyze data

[Figure: wild type (Cy5) vs. sigH mutant (Cy3) at Hour -1, Hour 0 and Hour +1; labeled genes: citG, sacT]

Identifying differentially expressed genes

• Many different methods

• Arbitrary assignment of a fold-change cutoff is not a valid approach

• Statistical representation of the data
  – Iterative outlier analysis
  – SAM (significance analysis of microarrays)
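To illustrate why a fold-change cutoff alone is misleading, the sketch below computes a SAM-style statistic for replicate log2 ratios of a single gene: the mean divided by its standard error plus a small constant `s0`. This is a simplified illustration, not the SAM procedure itself: the permutation step that SAM uses to estimate a false discovery rate is omitted, `s0 = 0.1` is an arbitrary assumption, and the replicate values are invented.

```python
from math import sqrt
from statistics import mean, stdev


def moderated_statistic(log_ratios, s0=0.1):
    """Mean log2 ratio divided by (standard error + s0).

    s0 is a small constant that keeps genes with a tiny variance from
    getting an inflated score. Its value here is an assumption.
    """
    standard_error = stdev(log_ratios) / sqrt(len(log_ratios))
    return mean(log_ratios) / (standard_error + s0)


if __name__ == "__main__":
    # Invented replicate log2 ratios with the same mean (a 4-fold change).
    reproducible_gene = [2.0, 2.2, 1.8]
    noisy_gene = [4.0, 0.1, 1.9]
    print(moderated_statistic(reproducible_gene))
    print(moderated_statistic(noisy_gene))
```

Both invented genes show the same average fold change, but only the reproducible one gets a large statistic, which is the information a plain fold-change threshold discards.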

Data from a microarray are expressed as ratios

• Cy3/Cy5 or Cy5/Cy3

• Measuring differences in two samples, not absolute expression levels

• Ratios are often log2 transformed before analysis
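A brief sketch of the log2 transformation, using invented channel intensities. On the raw ratio scale, induction ranges from 1 to infinity while repression is squeezed between 0 and 1. After the transformation, a 4-fold increase and a 4-fold decrease are symmetric around zero (+2 and -2).

```python
from math import log2


def log2_ratio(cy5, cy3):
    """Return log2(Cy5 / Cy3) for one spot; intensities must be positive."""
    if cy5 <= 0 or cy3 <= 0:
        raise ValueError("intensities must be positive")
    return log2(cy5 / cy3)


if __name__ == "__main__":
    # Invented (Cy5, Cy3) intensities for three spots.
    spots = {"up 4-fold": (4000, 1000), "unchanged": (1500, 1500), "down 4-fold": (500, 2000)}
    for label, (cy5, cy3) in spots.items():
        print(label, log2_ratio(cy5, cy3))
```

Because the value is a ratio of two channels on the same spot, it says how the samples differ, not how abundant the transcript is in either one.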

Genes whose transcription is influenced by σH

• 433 genes were altered when comparing wt vs. ∆sigH.

• 160 genes were altered when sigH overexpressed.

• Which genes are directly regulated by Sigma-H?

Identifying sigH promoters

• Two bioinformatics approaches
  – Hidden Markov Model database (P. Fawcett)
    • HMMER 2.2 (hmm.wustl.edu)
  – Pattern searches (SubtiList)

• Identify 100s of potential promoters
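A pattern search of this kind can be sketched with a regular expression: two short motifs separated by a spacer of variable length. The slides do not give the Sigma-H consensus, so the example uses the well-known sigma-70-type consensus (TTGACA, a 16 to 18 bp spacer, TATAAT) purely as a stand-in. It does not reproduce the SubtiList search, and it scans only the strand it is given.

```python
import re

# Stand-in motif: sigma-70-type -35 and -10 boxes with a 16-18 bp spacer.
# This is NOT the Sigma-H consensus, which is not given in the slides.
# The lookahead lets overlapping candidates be reported as well.
PROMOTER_PATTERN = re.compile(r"(?=(TTGACA[ACGT]{16,18}TATAAT))")


def find_candidate_promoters(sequence):
    """Return (0-based start, matched text) for every candidate site."""
    return [
        (match.start(), match.group(1))
        for match in PROMOTER_PATTERN.finditer(sequence.upper())
    ]


if __name__ == "__main__":
    # Invented test sequence containing one site with a 17 bp spacer.
    sequence = "GG" + "TTGACA" + "A" * 17 + "TATAAT" + "CC"
    print(find_candidate_promoters(sequence))
```

A loose pattern like this matches by chance many times in a whole genome, which is consistent with the searches returning hundreds of potential promoters that then need to be filtered against the expression data.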

Correlate potential sigH promoters with genes identified with microarray data.

• Genes positively regulated by Sigma-H in a microarray experiment that have a putative promoter within 500bp of the gene.
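The filtering step described above can be sketched as a simple join between two lists. Everything in the example is hypothetical: the gene names, coordinates and record layout are invented, "within 500bp" is interpreted as up to 500 bp upstream of the gene start on the same strand, and the `positively_regulated` flag is assumed to come from the statistical analysis of the array data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    start: int                  # coordinate of the first base of the gene
    strand: str                 # "+" or "-"
    positively_regulated: bool  # outcome of the microarray analysis


@dataclass(frozen=True)
class Promoter:
    position: int
    strand: str


def upstream_distance(gene, promoter):
    """Distance from promoter to gene start; None if strands differ."""
    if gene.strand != promoter.strand:
        return None
    if gene.strand == "+":
        return gene.start - promoter.position
    return promoter.position - gene.start


def candidate_direct_targets(genes, promoters, max_distance=500):
    """Names of induced genes with a putative promoter close upstream."""
    targets = []
    for gene in genes:
        if not gene.positively_regulated:
            continue
        distances = (upstream_distance(gene, p) for p in promoters)
        if any(d is not None and 0 <= d <= max_distance for d in distances):
            targets.append(gene.name)
    return targets


if __name__ == "__main__":
    genes = [
        Gene("geneA", 1000, "+", True),
        Gene("geneB", 5000, "+", True),
        Gene("geneC", 9000, "-", True),
        Gene("geneD", 1200, "+", False),
    ]
    promoters = [Promoter(800, "+"), Promoter(4000, "+"), Promoter(9300, "-")]
    print(candidate_direct_targets(genes, promoters))
```

Requiring both lines of evidence removes chance motif matches (no expression change) as well as indirect effects (expression change but no nearby promoter).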

Directly controlled sigH genes

• 26 new sigH promoters controlling 54 genes

• Genes involved in key processes associated with the transition to stationary phase
  – generation of new food sources (i.e. proteases)
  – transport of nutrients
  – cell wall metabolism
  – cytochrome biogenesis

• Correctly identified nearly all known sigH promoters

• Complete sigH regulon:
  – 49 promoters controlling 87 genes.

• Identification of DNA regions bound by proteins.

Iyer et al. 2001 Nature, 409:533-538

Samples: Pathogen 1 and Pathogen 2

• Grow cells
• Isolate RNA
• Make labeled cDNA
• Mix and hybridize
• Scan slide
• Analyze data