thursday, 5 june 2008


Upload: sevita

Post on 19-Jan-2016



TRANSCRIPT

Page 1: Thursday, 5 June 2008

Thursday, 5 June 2008

• Problems in sequence analysis

• Identification by sequence similarity

Genes Determining Plant-Cyanobacterial Symbioses and Consideration of Blast

This demonstration is best viewed as a slide show, enabling you to simulate a session and make changes in cursor position more obvious.

To do this, click Slide Show on the top tool bar, then View show.

Click anywhere to go on to the next slide

Page 2: Thursday, 5 June 2008

[Figure panels: 10 mM nitrate | 0.1 mM nitrate]

Gland development is stimulated by N-limitation

What's special about the gland?

Gland suppressed by presence of fixed N

Plant starved for N makes gland to house cyanobacteria

What genes are specifically expressed in glands?

Page 3: Thursday, 5 June 2008

Construction of a cDNA library from Gunnera gland

mRNA ends with polyA tails

Use modified polyT to direct synthesis of DNA copy of mRNA

Reverse Transcriptase (RT) adds CCC to end.

Add 2nd adapter, using GGG to attach to CCC. Extend cDNA

Page 4: Thursday, 5 June 2008

Construction of a cDNA library from Gunnera gland

(Same protocol, but with real sequences)

5'-NNNNNNNNNN ... NNNNNNNNNNAAAAAAAAAAAAAAAAA...-3'
3'-TTTTTCTTTTTTCATGGCTGACGCTGAGACGCAACTATGGTGACGAA-5'

Use modified polyT adapter to direct synthesis of DNA copy of mRNA

Page 5: Thursday, 5 June 2008

Construction of a cDNA library from Gunnera gland

5'-NNNNNNNNNN ... NNNNNNNNNNAAAAAAAAAAAAAAAAA...-3'
3'-TTTTTCTTTTTTCATGGCTGACGCTGAGACGCAACTATGGTGACGAA-5'

Use modified polyT adapter to direct synthesis of DNA copy of mRNA

The adapter can bind at many positions in the polyA tail, resulting in variation in the number of T's in the cDNA sequence.

Page 6: Thursday, 5 June 2008

Construction of a cDNA library from Gunnera gland

5'-NNNNNNNNNN ... NNNNNNNNNNAAAAAAAAAAAAAAAAA...-3'
3'-TTTTTCTTTTTTCATGGCTGACGCTGAGACGCAACTATGGTGACGAA-5'

Use modified polyT adapter to direct synthesis of DNA copy of mRNA

The adapter can bind at many positions in the polyA tail, resulting in variation in the number of T's in the cDNA sequence.

Page 7: Thursday, 5 June 2008

Construction of a cDNA library from Gunnera gland

5'-NNNNNNNNNN ... NNNNNNNNNNAAAAAAAAAAAAAAAAA...-3'
3'-CCCNNNNNNNNNN ... NNNNNNNNNNTTTTTCTTTTTTCATGGCTGACGCTGAGACGCAACTATGGTGACGAA-5'

Reverse Transcriptase (RT) extends the adapter to the end of the mRNA and adds CCC to the 3' end.

Page 8: Thursday, 5 June 2008


Construction of a cDNA library from Gunnera gland

5'-NNNNNNNNNN ... NNNNNNNNNNAAAAAAAAAAAAAAAAA...-3'
3'-CCCNNNNNNNNNN ... NNNNNNNNNNTTTTTCTTTTTTCATGGCTGACGCTGAGACGCAACTATGGTGACGAA-5'

5'-AAGCAGTGGTATCAACGCAGAGTGGCCATTACGGCCGGG

A second adapter is added, which (with the help of antibodies to) uses three G's to bind to the three C's.

Page 9: Thursday, 5 June 2008


Construction of a cDNA library from Gunnera gland

5'-NNNNNNNNNN ... NNNNNNNNNNAAAAAAAAAAAAAAAAA...-3'
3'-CCCNNNNNNNNNN ... NNNNNNNNNNTTTTTCTTTTTTCATGGCTGACGCTGAGACGCAACTATGGTGACGAA-5'

5'-AAGCAGTGGTATCAACGCAGAGTGGCCATTACGGCCGGG

The cDNA sequence is extended to the left, using the second adapter as a template.

TTCGTCACCATAGTTGCGTCTCACCGGTAATGCCGG

Page 10: Thursday, 5 June 2008


Construction of a cDNA library from Gunnera gland

5'-NNNNNNNNNN ... NNNNNNNNNNAAAAAAAAAAAAAAAAA...-3'
3'-CCCNNNNNNNNNN ... NNNNNNNNNNTTTTTCTTTTTTCATGGCTGACGCTGAGACGCAACTATGGTGACGAA-5'

5'-AAGCAGTGGTATCAACGCAGAGTGGCCATTACGGCCGGGNNNNNNNNNN ...

The cDNA sequence is extended to the left, using the second adapter as a template…

…and then the second cDNA strand is synthesized left-to-right, using the first cDNA strand as the template.

TTCGTCACCATAGTTGCGTCTCACCGGTAATGCCGG

Page 11: Thursday, 5 June 2008


Construction of a cDNA library from Gunnera gland

5'-NNNNNNNNNN ... NNNNNNNNNNAAAAAAAAAAAAAAAAA...-3'
3'-CCCNNNNNNNNNN ... NNNNNNNNNNTTTTTCTTTTTTCATGGCTGACGCTGAGACGCAACTATGGTGACGAA-5'

5'-AAGCAGTGGTATCAACGCAGAGTGGCCATTACGGCCGGGNNNNNNNNNN ...

TTCGTCACCATAGTTGCGTCTCACCGGTAATGCCGG

Hundreds to thousands of nucleotides

To give some perspective, the adapters are about 50 nucleotides, while the mRNA itself can be as large as a couple of thousand nucleotides.
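
To make the bookkeeping of the construction steps concrete, here is a minimal Python sketch. It is not part of the original demonstration. The adapter sequences are copied from the slides, but the function name build_cdna and the example mRNA body are made up for illustration. The sketch assumes that the polyT adapter anneals exactly at the start of the polyA tail, it leaves out the tail itself, and it writes the mRNA with T in place of U.

```python
# Illustrative sketch only: string bookkeeping for the cDNA construction steps.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def reverse_complement(seq):
    """Return the reverse complement of a DNA string (5'->3' in, 5'->3' out)."""
    return seq.translate(COMPLEMENT)[::-1]

# Adapter sequences copied from the slides.
POLYT_ADAPTER_3TO5 = "TTTTTCTTTTTTCATGGCTGACGCTGAGACGCAACTATGGTGACGAA"  # as drawn, 3'->5'
POLYT_ADAPTER = POLYT_ADAPTER_3TO5[::-1]                                  # rewritten 5'->3'
SECOND_ADAPTER = "AAGCAGTGGTATCAACGCAGAGTGGCCATTACGGCCGGG"                # 5'->3'

def build_cdna(mrna_body):
    """Return (first_strand, second_strand), both written 5'->3'.

    mrna_body is the mRNA sequence upstream of the polyA tail (hypothetical input).
    """
    # Step 1: RT extends the polyT adapter to the 5' end of the mRNA and adds CCC.
    first = POLYT_ADAPTER + reverse_complement(mrna_body) + "CCC"
    # Step 2: the GGG of the second adapter pairs with the CCC, and the first
    # strand is extended using the rest of the second adapter as template.
    first += reverse_complement(SECOND_ADAPTER[:-3])
    # Step 3: the second strand is the complement of the completed first strand.
    second = reverse_complement(first)
    return first, second

first, second = build_cdna("ATGGCTTCTGGAAAC")  # made-up mRNA body
print(second)
```

Under these assumptions the second strand reads, from 5' to 3': the second adapter, the mRNA body, and then the complement of the polyT adapter.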

Page 12: Thursday, 5 June 2008

Construction of a cDNA library from Gunnera gland

Of course there are thousands of different mRNA's in a cell, leading to thousands of cDNA's in the library, all in multiple copies.

Page 13: Thursday, 5 June 2008

Sequencing of cDNA library

Limitations:

- Only from ends

- Only ~400 nt

It would be nice to be able to sequence the cDNA's from end to end, but that's not presently possible.

Sequencing has its limitations.

Page 14: Thursday, 5 June 2008

Sequencing of cDNA library

Limitations:

- Only from ends

- Only ~400 nt

Solution:

- Break the cDNA

The solution is to break up the cDNA so that there are multiple, overlapping ends from which to sequence. In this way, the full length of the cDNA can be sequenced.
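
A small Python sketch can show why breaking the cDNA helps. It is an illustration under stated assumptions, not the actual sequencing protocol: the function name covered, the 2000-nt cDNA length, and the fragment coordinates are all made up, and every fragment is assumed to be read inward from both of its ends for about 400 nt.

```python
READ_LENGTH = 400  # approximate read length quoted on the slide

def covered(length, fragments, read_length=READ_LENGTH):
    """Mark which positions of a cDNA are read when each fragment is
    sequenced inward from both of its ends.

    fragments is a list of (start, end) positions, with end exclusive.
    """
    seen = [False] * length
    for start, end in fragments:
        left_read = range(start, min(start + read_length, end))
        right_read = range(max(end - read_length, start), end)
        for pos in list(left_read) + list(right_read):
            seen[pos] = True
    return seen

# Hypothetical 2000-nt cDNA: intact, then with three extra overlapping fragments.
intact = covered(2000, [(0, 2000)])
broken = covered(2000, [(0, 2000), (300, 1100), (700, 1500), (1000, 1800)])
print(sum(intact), sum(broken))  # prints: 800 2000
```

With only the intact molecule, 800 of the 2000 positions are read. With the extra overlapping fragments, every position is read at least once.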

Page 15: Thursday, 5 June 2008

Sequencing of cDNA library

(1000's of cDNA's)

The broken fragments are read from either end (at random). If there are enough reads, it is possible to use overlaps to reassemble the original sequence.

Unfortunately, the adapters are also sequenced, and these complicate the assembly process, as they're interpreted as overlapping sequences, leading to misassembly.

They need to be removed.
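
The overlap idea, and the way adapters spoil it, can be sketched in a few lines of Python. This is a toy illustration, not the assembler actually used: the function names, the minimum overlap of 5, and the example reads are assumptions, and real assemblers tolerate sequencing errors, which this exact-match sketch does not.

```python
def overlap_length(left, right, min_overlap=5):
    """Length of the longest suffix of left that equals a prefix of right."""
    for n in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:n]):
            return n
    return 0

def merge_reads(left, right, min_overlap=5):
    """Join two reads at their overlap, or return None if they do not overlap."""
    n = overlap_length(left, right, min_overlap)
    if n == 0:
        return None
    return left + right[n:]

# Two reads from the same (made-up) cDNA reassemble correctly.
original = "ATGGCGTACCTGAAGGTCCATTAG"
read1, read2 = original[:14], original[9:]
print(merge_reads(read1, read2) == original)  # prints: True

# Reads from two unrelated cDNAs that both carry adapter sequence are joined
# as well, producing a chimera. ADAPTER_PART is the first 23 nt of the second
# adapter shown on the slides.
ADAPTER_PART = "AAGCAGTGGTATCAACGCAGAGT"
read_a = "GATTACAGATTACA" + ADAPTER_PART
read_b = ADAPTER_PART + "CCGGTTAACCGG"
chimera = merge_reads(read_a, read_b)
print(chimera)
```

The second call returns a single sequence even though the two reads come from different molecules, which is the misassembly the slide warns about.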

Page 16: Thursday, 5 June 2008

Sequencing of cDNA library

(1000's of cDNA's)

Given the number of sequences, the removal process obviously must be automated, but automated processes, while fast, are often stupid.

We need to check to make sure they worked.
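
One way such a check might look outside BioBIKE is sketched below in Python. It is only an illustration: the function names, the probe length of 12, and the example contigs are assumptions. It flags any contig that still contains a 12-nt stretch of the second adapter on either strand, using exact matching only.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def kmers(seq, k):
    """All substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def contigs_with_adapter(contigs, adapter, k=12):
    """Names of contigs containing any k-nt stretch of the adapter (either strand)."""
    probes = kmers(adapter, k) | kmers(reverse_complement(adapter), k)
    return [name for name, seq in contigs.items()
            if any(probe in seq for probe in probes)]

SECOND_ADAPTER = "AAGCAGTGGTATCAACGCAGAGTGGCCATTACGGCCGGG"  # from the slides

# Made-up contigs: one clean, one with leftover adapter, one with the
# adapter present on the opposite strand.
contigs = {
    "clean": "ATGAAACCCTTTGGGATATCTCTAGA",
    "dirty": SECOND_ADAPTER[:20] + "ATGAAACCCTTTGGG",
    "dirty-rc": "ATGAAACCCTTTGGG" + reverse_complement(SECOND_ADAPTER)[:20],
}
print(contigs_with_adapter(contigs, SECOND_ADAPTER))
```

Exact matching will miss adapter remnants that contain sequencing errors, so a similarity search that allows mismatches is the more thorough check.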

Page 17: Thursday, 5 June 2008

Identifying elements of cDNA library

The assembly process should, in theory, also remove duplicate sequences.

Page 18: Thursday, 5 June 2008

Identifying elements of cDNA library

The assembly process should, in theory, also remove duplicate sequences.

In practice, partial duplicates may remain, and it is necessary to keep an eye out for them.
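
As a rough illustration of what looking for partial duplicates can mean, the Python sketch below reports every contig that is wholly contained in a longer one, on either strand. The function name and the example contigs are made up. The match is exact, so duplicates that differ by a sequencing error would need a similarity search instead.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def contained_contigs(contigs):
    """Return (short_name, long_name) pairs where the shorter contig occurs
    inside the longer one, on either strand."""
    pairs = []
    for short_name, short in contigs.items():
        for long_name, long_seq in contigs.items():
            if short_name == long_name or len(short) > len(long_seq):
                continue
            # Report identical-length contigs once, not in both directions.
            if len(short) == len(long_seq) and short_name > long_name:
                continue
            if short in long_seq or reverse_complement(short) in long_seq:
                pairs.append((short_name, long_name))
    return pairs

# Made-up contigs: c2 lies inside c1, c3 lies inside c1 on the other strand,
# and c4 is unrelated.
contigs = {
    "c1": "ATGGCGTACCTGAAGGTCCATTAG",
    "c2": "GTACCTGAAGG",
    "c3": reverse_complement("CCTGAAGGTCCA"),
    "c4": "TTTTGGGGCCCCAAAA",
}
print(contained_contigs(contigs))
```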

Page 19: Thursday, 5 June 2008

Identifying elements of cDNA library

Predict function directly from sequence

How to go from cDNA sequence to predicted function for the sequences?

You might think that since we can readily predict a protein sequence from a DNA sequence, it should be possible to predict function as well.

Page 20: Thursday, 5 June 2008

Identifying elements of cDNA library

Predict function directly from sequence

Predict function from sequence similarity

Nope. At present that's impossible.

The best we can do is to compare sequences with sequences from other organisms where there is experimental evidence as to function.

Page 21: Thursday, 5 June 2008

Identifying elements of cDNA library

Predict function directly from sequence

Predict function from sequence similarity

Blast is a tool to do just that, comparing a given sequence against a database of known sequences.

It is important to understand the mind of Blast. But that is a subject for another time.

Page 22: Thursday, 5 June 2008

Genes Determining Plant-Cyanobacterial Symbioses and Consideration of Blast

We've identified many things that need to be done:

1. Determine if primers have been removed from the sequences.

2. Determine if the library contains duplicates.

3. Identify protein sequences similar to those encoded by the cDNAs.

4. (plus one extra) Find where in the cDNAs genes begin and end.

Page 23: Thursday, 5 June 2008

Genes Determining Plant-Cyanobacterial Symbioses and Consideration of Blast

These questions are ordinarily answered by high-powered computer types. But you can answer them yourself.

First you need to read in the data.

Go into StaphyloBIKE through the BioBIKE portal.
(Gunnera isn't a member of Staphylococcus, of course, but I put the cDNA sequences in that instance of BioBIKE.)

RUN-FILE "contig-resources.bike" SHARED
(This makes the cDNA sequences available to you as a variable called gunnera-contigs and also provides you with a possibly useful tool, READ-NAMED, to extract specific sequences.)

Page 24: Thursday, 5 June 2008

Genes Determining Plant-Cyanobacterial Symbioses and Consideration of Blast

Possibly useful functions:

SEQUENCE-SIMILAR-TO
Accesses BLAST, using as targets either internal data (e.g. gunnera-contigs) or external data (e.g. *GENBANK*). Also used to look for nearly identical sequences, using the MISMATCHES option.

READING-FRAMES-OF
Translates the sequence in all six possible reading frames.
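
For readers who want to see what translating in all six reading frames involves, here is a minimal Python sketch. It is not how READING-FRAMES-OF is implemented in BioBIKE. The function names are made up, the standard genetic code is assumed, and any codon containing a letter other than A, C, G, or T is shown as X.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def translate(seq):
    """Translate complete codons from the start of seq; '*' marks a stop."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))

def six_frames(seq):
    """Translations keyed by frame: 1, 2, 3 forward and -1, -2, -3 reverse."""
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[offset + 1] = translate(seq[offset:])
        frames[-(offset + 1)] = translate(rc[offset:])
    return frames

for frame, protein in sorted(six_frames("ATGGCCTAA").items()):
    print(frame, protein)
```

For the made-up input ATGGCCTAA, frame 1 gives MA* (a start codon, an alanine, and a stop). A long stretch without stops in one frame is a hint about where a gene begins and ends.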