Artificial Neural Networks for Secondary Structure Prediction CSC391/691 Bioinformatics Spring 2004 Fetrow/Burg/Miller (slides by J. Burg)

Artificial Neural Networks for Secondary Structure Prediction

CSC391/691 Bioinformatics

Spring 2004

Fetrow/Burg/Miller

(slides by J. Burg)

Artificial Neural Networks

A problem-solving paradigm modeled after the physiological functioning of the human brain.

Neurons in the brain are modeled by computational nodes, and the synapses connecting them by weighted links.

The firing of a neuron is modeled by input, output, and threshold functions.

The network “learns” based on problems to which answers are known (in supervised learning).

The network can then produce answers to entirely new problems of the same type.

Applications of Artificial Neural Networks

speech recognition

medical diagnosis

image compression

financial prediction

Existing Neural Network Systems for Secondary Structure Prediction

First systems were about 62% accurate.

Newer ones are about 70% accurate when they take advantage of information from multiple sequence alignment.

PHD

NNPREDICT

(web links given in your book)

Applications in Bioinformatics

Translational initiation sites and promoter sites in E. coli

Splice junctions

Specific structural features in proteins, such as α-helical transmembrane domains

Neural Networks Applied to Secondary Structure Prediction

Create a neural network (a computer program).

“Train” it using proteins with known secondary structure.

Then give it new proteins with unknown structure and predict their structure with the neural network.

Look to see if the prediction of a series of residues makes sense from a biological point of view – e.g., you need at least 4 amino acids in a row for an α-helix.
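The biological sanity check above can be sketched in Python. This is only an illustration, not the post-processing used by PHD or NNPREDICT: the function name `short_helix_runs`, the one-letter state codes (`H` helix, `E` strand, `C` coil), and the example prediction string are assumptions; the minimum run length of 4 comes from the slide.

```python
from itertools import groupby

def short_helix_runs(states, min_len=4):
    """Return (start, length) for every run of 'H' shorter than min_len."""
    flagged = []
    pos = 0
    for state, group in groupby(states):
        length = len(list(group))
        if state == "H" and length < min_len:
            flagged.append((pos, length))
        pos += length
    return flagged

# Hypothetical per-residue prediction; H = helix, E = strand, C = coil.
prediction = "CCHHCCCHHHHHHCCEEEECHC"
print(short_helix_runs(prediction))  # → [(2, 2), (20, 1)]
```

The flagged runs are candidates for manual review or for a smoothing rule; what to do with them is a separate design decision.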

Example Neural Network

[Figure: example neural network, from Bioinformatics by David W. Mount, p. 453. Labels in the figure: “Training pattern”; “One of n inputs, each with 21 bits”.]

Inputs to the Network

Both the residues and the target classes are encoded in unary format, for example:

Alanine: 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Cysteine: 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Helix: 1 0 0

Each pattern presented to the network requires n 21-bit inputs for a window of size n. (Twenty bits identify the amino acid; one extra bit per residue indicates when the window overlaps the end of the chain.)

The advantage of this sparse encoding scheme is that it does not impose any artificial ordering or similarity on the amino acids.

The main disadvantage is that it requires a lot of inputs.
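A minimal sketch of this unary encoding follows. The alphabetical ordering of one-letter codes is an assumption chosen to match the Alanine and Cysteine examples; the spacer symbol `-`, the names `ALPHABET` and `one_hot`, and the order of the strand and coil classes are likewise illustrative.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"   # 20 one-letter codes, alphabetical (assumed order)
SPACER = "-"                            # assumed symbol for positions past the chain end
ALPHABET = AMINO_ACIDS + SPACER         # 21 symbols, so 21 bits per residue
CLASSES = ["helix", "strand", "coil"]   # helix first, as on the slide; the rest is assumed

def one_hot(symbol, alphabet):
    """Return a list of bits with a single 1 at the position of symbol."""
    bits = [0] * len(alphabet)
    bits[alphabet.index(symbol)] = 1
    return bits

print(one_hot("A", ALPHABET))     # Alanine
print(one_hot("C", ALPHABET))     # Cysteine
print(one_hot("helix", CLASSES))  # target class
```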

Weights

Input values at each layer are multiplied by weights.

Weights are initially random.

Weights are adjusted after the output is computed, based on how close the output is to the “right” answer.

When the full training session is completed, the weights have settled on certain values.

These weights are then used to compute output for new problems that weren’t part of the training set.

Neural Network Training Set

A typical training set contains over 100 non-homologous protein chains comprising more than 15,000 training patterns.

The number of training patterns is equal to the total number of residues in the 100 proteins.

For example, if there are 100 proteins and 150 residues per protein there would be 15,000 training patterns.

Neural Network Architecture

A typical architecture has a window size of n = 17 and 5 hidden-layer nodes.*

A fully connected network of this kind is written 17(21)-5-3, i.e. a net with an input window of 17, five hidden nodes in a single hidden layer, and three outputs.

Such a network has 357 input nodes and 1,808 weights:

((17 * 21) * 5) + (5 * 3) + 5 + 3 = 1808

The first two terms count the input-to-hidden and hidden-to-output connections; the final 5 + 3 presumably adds one bias weight for each hidden and output node.

*This information is adapted from “Protein Secondary Structure Prediction with Neural Networks: A Tutorial” by Adrian Shepherd (UCL), http://www.biochem.ucl.ac.uk/~shepherd/sspred_tutorial/ss-index.html.
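The weight count for a fully connected network with one hidden layer can be checked with a few lines of Python. The function name `count_weights` is hypothetical, and treating the trailing 5 + 3 as one bias weight per hidden and output node is an assumption about how the 1,808 figure was reached.

```python
def count_weights(window, bits_per_residue, hidden, outputs, bias=True):
    """Return (input_nodes, total_weights) for a window(bits)-hidden-outputs net."""
    inputs = window * bits_per_residue
    weights = inputs * hidden + hidden * outputs
    if bias:
        # Assumed: one extra bias weight per hidden node and per output node.
        weights += hidden + outputs
    return inputs, weights

print(count_weights(17, 21, 5, 3))  # → (357, 1808)
```

Changing only the window size shows how quickly the input layer dominates the total, which is why the sparse encoding is described as requiring a lot of input.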

Window

The n-residue window is moved across the protein, one residue at a time.

Each time the window is moved, the center residue becomes the focus.

The neural network “learns” what secondary structure that residue is a part of. It keeps adjusting weights until it gets the right answer within a certain tolerance. Then the window is moved to the right.
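The sliding window can be sketched as follows. Padding both ends of the chain with a spacer symbol so that every residue can sit at the center is an assumption consistent with the extra end-of-chain bit; the names `windows` and `encode_window`, the spacer `-`, and the example sequence are illustrative.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
SPACER = "-"                      # assumed marker for positions past the chain end
ALPHABET = AMINO_ACIDS + SPACER   # 21 symbols

def windows(sequence, n=17):
    """Yield (center_index, window) for each residue; n is assumed to be odd."""
    half = n // 2
    padded = SPACER * half + sequence + SPACER * half
    for i in range(len(sequence)):
        yield i, padded[i:i + n]

def encode_window(window):
    """Concatenate the 21-bit unary code of every symbol in the window."""
    bits = []
    for symbol in window:
        code = [0] * len(ALPHABET)
        code[ALPHABET.index(symbol)] = 1
        bits.extend(code)
    return bits

seq = "MKTAYIAK"  # hypothetical short chain
for center, win in windows(seq, n=5):
    print(center, seq[center], win, len(encode_window(win)))
```

Each yielded window is one training pattern, so a chain of length L contributes L patterns.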

Artificial Neuron (aka “node”)

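[Figure: an artificial neuron. Input links feed the input function, whose result passes through the trigger function to give the output.]

Input function: $in_i = \sum_j W_{i,j}\, a_j$

Trigger function: $g$

Output: $a_i = g(in_i)$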
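A single node can be sketched in Python as a weighted sum followed by a trigger function. The function names and the example activations and weights are made up for illustration; `g` is passed in so that any trigger function can be used.

```python
def neuron_output(activations, weights, g):
    """Compute a_i = g(in_i), where in_i is the weighted sum of the inputs."""
    in_i = sum(w * a for w, a in zip(weights, activations))
    return g(in_i)

def step(s):
    """Simple threshold trigger: fire (1) only for a positive sum."""
    return 1 if s > 0 else 0

a = [1.0, 0.0, 1.0]      # activations arriving on the input links (hypothetical)
w = [0.5, -0.3, 0.25]    # one weight per input link (hypothetical)
print(neuron_output(a, w, lambda s: s))  # identity g exposes the raw weighted sum
print(neuron_output(a, w, step))         # the sum is positive, so the node fires
```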

Trigger Function

Each hidden-layer node sums its weighted inputs and “fires” an output accordingly.

A simple trigger function (called a threshold function): send 1 to the output if the inputs sum to a positive number; otherwise, send 0.

The sigmoid function is used more often:

$g(s_j) = \frac{1}{1 + e^{-k s_j}}$

$s_j$ is the sum of the weighted inputs.

As k increases, discrimination between weak and strong inputs increases.
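Both trigger functions can be written in a few lines. The sketch assumes the sigmoid has the standard logistic form 1 / (1 + e^(-k s)); the sample inputs of -0.5 and +0.5 and the values of k are arbitrary choices to show how a larger k separates sums on either side of zero more sharply.

```python
import math

def threshold(s):
    """Send 1 if the weighted sum is positive, otherwise 0."""
    return 1 if s > 0 else 0

def sigmoid(s, k=1.0):
    """Logistic trigger function; k controls how steep the transition is."""
    return 1.0 / (1.0 + math.exp(-k * s))

for k in (0.5, 1.0, 5.0):
    low, high = sigmoid(-0.5, k), sigmoid(0.5, k)
    print(f"k={k}: g(-0.5)={low:.3f}  g(+0.5)={high:.3f}  gap={high - low:.3f}")
```

Unlike the threshold function, the sigmoid is smooth, which is what lets the error be traced back to individual weights during training.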

Adjusting Weights With Back Propagation

The inputs are propagated through the system as described above.

The outputs are examined and compared to the right answer.

Each weight is adjusted according to its contribution to the error.

See page 455 of Bioinformatics by Mount.
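The three steps above can be sketched for a net with one hidden layer. This is the generic textbook back-propagation update, not necessarily the exact formulation in Mount: the squared-error measure, the learning rate of 0.5, the omission of bias weights, the tiny 4-3-3 network, and the single training pattern are all assumptions made to keep the example short.

```python
import math
import random

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def forward(x, w_hidden, w_out):
    """Propagate x through one hidden layer; return (hidden, output) activations."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    output = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w_out]
    return hidden, output

def train_step(x, target, w_hidden, w_out, rate=0.5):
    """Update the weights in place for one pattern; return the error before the update."""
    hidden, output = forward(x, w_hidden, w_out)
    # Output-layer error terms: difference from the right answer times the sigmoid slope.
    delta_out = [(t - o) * o * (1 - o) for t, o in zip(target, output)]
    # Hidden-layer error terms: each node's share of the output errors.
    delta_hidden = [
        h * (1 - h) * sum(delta_out[k] * w_out[k][j] for k in range(len(w_out)))
        for j, h in enumerate(hidden)
    ]
    # Adjust each weight according to its contribution to the error.
    for k, row in enumerate(w_out):
        for j in range(len(row)):
            row[j] += rate * delta_out[k] * hidden[j]
    for j, row in enumerate(w_hidden):
        for i in range(len(row)):
            row[i] += rate * delta_hidden[j] * x[i]
    return sum((t - o) ** 2 for t, o in zip(target, output))

random.seed(0)
n_in, n_hidden, n_out = 4, 3, 3
w_hidden = [[random.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
w_out = [[random.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]

x, target = [1, 0, 0, 1], [1, 0, 0]   # one made-up pattern with a "helix" target
errors = [train_step(x, target, w_hidden, w_out) for _ in range(200)]
print(round(errors[0], 4), round(errors[-1], 4))
```

Repeating the step on a single pattern drives its error down; real training cycles over all patterns and stops before the network merely memorizes them.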

Refinements or Variations of Method

Use more biological information

See http://www.biochem.ucl.ac.uk/~shepherd/sspred_tutorial/ss-pred-new.html#beyond_bioinf

Predictions Based on Output

Predictions are made on a winner-takes-all basis.

That is, the prediction is determined by the strongest of the three outputs. For example, the output (0.3, 0.1, 0.1) is interpreted as a helix prediction.
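Winner-takes-all is simply picking the index of the largest output. In the sketch below, helix is placed first to match the slide's example; the order of the other two classes and the name `winner_takes_all` are assumptions.

```python
CLASSES = ["helix", "strand", "coil"]  # assumed order of the three output nodes

def winner_takes_all(outputs):
    """Return the class whose output node has the largest value."""
    best = max(range(len(outputs)), key=lambda i: outputs[i])
    return CLASSES[best]

print(winner_takes_all([0.3, 0.1, 0.1]))  # → helix
```

On a tie, `max` returns the first of the tied indices, so a real system would need an explicit tie-breaking rule.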

Performance Measurements

How do you know if your neural network performs well?

Test it on proteins that are not included in the training set but whose structure is known.

Determine how often it gets the right answer.

What differentiates one neural network from another?

Its architecture – whether or not it has hidden layers, how many nodes are used.

Its mathematical functions – the trigger function, the back-propagation algorithm.
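"How often it gets the right answer" can be measured as the fraction of residues whose predicted state matches the known one. The sketch below assumes per-residue state strings (`H`, `E`, `C`) for a protein held out of the training set; both example strings are invented.

```python
def accuracy(predicted, observed):
    """Fraction of residues whose predicted state equals the observed state."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return correct / len(observed)

predicted = "CCHHHHHCCEEEECC"   # hypothetical network output
observed  = "CHHHHHHCCCEEECC"   # hypothetical known structure
print(f"{accuracy(predicted, observed):.1%}")
```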

Balancing Act in Neural Network Training

The network should NOT just memorize the training set.

The network should be able to generalize from the training set so that it can solve similar but not identical problems.

It’s a matter of balancing the # of training patterns vs. # network weights vs. # hidden nodes vs. # of training iterations

Disadvantages to Neural Networks

They are black boxes. They cannot explain why a given pattern has been classified as x rather than y. Unless we associate other methods with them, they don’t tell us anything about underlying principles.

Summary

Perceptrons (single-layer neural networks) can be used to predict protein secondary structure, but feed-forward multi-layer networks are used more often.

Two frequently used web sites for neural-network-based secondary structure prediction are PHD (http://www.embl-heidelberg.de/predictprotein/predictprotein.html) and NNPREDICT (http://www.cmpharm.ucsf.edu/~nomi/nnpredict.html).