
Selecting Restriction Enzymes for Terminal Restriction Fragment Length Polymorphism Based on Phylogenetic Distance

Presenter: Mei, Ko-Jen (N18981246), Mechanical Engineering, second-year PhD student

Outline

• Introduction
• Algorithm
– Skeleton
– Distinct Matrix Establishment for Clusters
– Distinct Matrix Establishment for Sample Vector of Restriction Enzymes
– Match and Weight
• Program Application
• Discussion
• Conclusion

Introduction

• What is T-RFLP?
• What is a phylogenetic distance matrix?
• What is a restriction enzyme?
• What is the concern when selecting restriction enzymes for T-RFLP?
• How was this problem solved in the past?

Algorithm - Skeleton

[Flowchart]
• Sample Vectors of Restriction Enzyme → Distinct Matrixes
• Cluster → Distinct Matrix
– By Phylogenetic Distance Matrix
– User-defined
• Match and Weight → Score
– By Artificial Weight

Algorithm - Distinct Matrix Establishment for Clusters

• Input: Phylogenetic Distance Matrix (n×n upper triangle matrix, values between 0 and 1)
• Input: Distance Cutoff (value between 0 and 1)
• Compare each element with the distance cutoff
• Output: Distinct Matrix for Cluster (n×n upper triangle boolean matrix, values within {SAME, DIFFERENT})
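The thresholding step can be sketched in Python as follows. This is a minimal illustration, not the presenter's program: the function name `cluster_distinct_matrix`, the dictionary keyed by `(i, j)` pairs for the upper triangle, and the choice that a distance strictly greater than the cutoff means DIFFERENT are all assumptions.

```python
SAME, DIFFERENT = "SAME", "DIFFERENT"


def cluster_distinct_matrix(distance, cutoff):
    """Threshold an n x n phylogenetic distance matrix into SAME/DIFFERENT.

    Only the upper triangle (i < j) is read; the result is a dict keyed
    by (i, j) pairs.
    """
    n = len(distance)
    distinct = {}
    for i in range(n):
        for j in range(i + 1, n):
            # Assumption: a pair farther apart than the cutoff belongs to
            # different clusters; the slides do not state the direction.
            distinct[(i, j)] = DIFFERENT if distance[i][j] > cutoff else SAME
    return distinct


# Hypothetical distances for three species (upper triangle only).
distance = [
    [0.0, 0.1, 0.6],
    [0.0, 0.0, 0.7],
    [0.0, 0.0, 0.0],
]
distinct = cluster_distinct_matrix(distance, cutoff=0.3)
print(distinct)
```

With a cutoff of 0.3, species 0 and 1 fall in the same cluster, and species 2 is separated from both.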

Algorithm - Distinct Matrix Establishment for Sample Vector of Restriction Enzymes (I)

• Input: Sample Vectors of Restriction Enzyme (m vectors of length n, all values are integers)
• Input: Degree of Combination
• Permutation
• Sample Vectors of Restriction Enzymes (m* vectors of length n, all values are integers)
• Build difference matrixes by pairwise differences within each sample vector
• Difference Matrixes of Restriction Enzymes (m* upper triangle matrixes of dimension n×n, all values are integers)
• Input: T-RFLP Cutoff (integer)
• Compare all elements of all matrixes with the T-RFLP cutoff
• Output: Distinct Matrixes of Restriction Enzymes (m* upper triangle boolean matrixes of dimension n×n, all values within {SAME, DIFFERENT})
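The difference and cutoff steps for a single sample vector can be sketched as below. The sketch makes several assumptions that the slides do not confirm: each vector element is treated as one integer measurement per species (for example a terminal fragment length), the pairwise difference is the absolute difference, and a difference strictly greater than the T-RFLP cutoff means DIFFERENT. The permutation step is not covered, and the function name is hypothetical.

```python
SAME, DIFFERENT = "SAME", "DIFFERENT"


def enzyme_distinct_matrix(sample_vector, trflp_cutoff):
    """Build the distinct matrix of one restriction enzyme sample vector.

    The upper triangle is stored as a dict keyed by (i, j) with i < j.
    """
    n = len(sample_vector)
    distinct = {}
    for i in range(n):
        for j in range(i + 1, n):
            # Difference matrix element (assumed to be an absolute difference).
            difference = abs(sample_vector[i] - sample_vector[j])
            # Assumption: values closer than the cutoff cannot be told apart.
            distinct[(i, j)] = DIFFERENT if difference > trflp_cutoff else SAME
    return distinct


# Hypothetical sample vector for three species.
sample_vector = [120, 122, 310]
enzyme_distinct = enzyme_distinct_matrix(sample_vector, trflp_cutoff=5)
print(enzyme_distinct)
```

Here the first two species differ by only 2, so this enzyme cannot separate them, while the third is clearly distinct.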

Algorithm - Distinct Matrix Establishment for Sample Vector of Restriction Enzymes (II)

Algorithm - Match and Weight (I)

Artificial Weight
• Inputs: Distinct Matrixes for Restriction Enzymes, Distinct Matrix for Clusters, Artificial Weight Matrix (2×2 float matrix)
• Match and weight by weight matrix
• Output: Score Matrixes (m* n×n upper triangle float matrixes)

Natural Weight
• Inputs: Distinct Matrixes for Restriction Enzymes, Distinct Matrix for Clusters, Phylogenetic Distance Matrix
• Match and dot phylogenetic distance matrix
• Output: Score Matrixes (m* n×n upper triangle float matrixes)

Artificial Weight Matrix

Restriction Enzyme\Cluster   SAME     DIFFERENT
SAME                         Weight   0
DIFFERENT                    0        1
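Both weighting schemes can be sketched in a few lines. The artificial variant follows the 2×2 weight table directly. The natural variant is an interpretation of "match and dot phylogenetic distance matrix": a pair contributes its phylogenetic distance when the enzyme and the cluster agree, and 0 otherwise. That reading, the function names, and the `(i, j)` dictionary layout are assumptions.

```python
SAME, DIFFERENT = "SAME", "DIFFERENT"


def artificial_score_matrix(enzyme, cluster, weight):
    """Score each pair with the 2x2 artificial weight table."""
    table = {
        (SAME, SAME): weight,          # identity within a cluster
        (SAME, DIFFERENT): 0.0,        # enzyme fails to separate clusters
        (DIFFERENT, SAME): 0.0,        # enzyme splits a cluster
        (DIFFERENT, DIFFERENT): 1.0,   # enzyme separates clusters
    }
    return {pair: table[(enzyme[pair], cluster[pair])] for pair in cluster}


def natural_score_matrix(enzyme, cluster, distance):
    """Score each matching pair with its phylogenetic distance (assumed reading)."""
    return {
        (i, j): distance[i][j] if enzyme[(i, j)] == cluster[(i, j)] else 0.0
        for (i, j) in cluster
    }


# Hypothetical inputs for three species.
cluster = {(0, 1): SAME, (0, 2): DIFFERENT, (1, 2): DIFFERENT}
enzyme = {(0, 1): SAME, (0, 2): DIFFERENT, (1, 2): SAME}
distance = [
    [0.0, 0.1, 0.6],
    [0.0, 0.0, 0.7],
    [0.0, 0.0, 0.0],
]

artificial = artificial_score_matrix(enzyme, cluster, weight=0.5)
natural = natural_score_matrix(enzyme, cluster, distance)
print(artificial)
print(natural)
```

Under the natural variant, pairs within a cluster have small distances, so a (SAME, SAME) match adds little to the score without any hand-set weight.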

Algorithm - Match and Weight (II)

• Input: Score Matrixes
• Sum all elements in each matrix, then divide the sum by the sum of the perfectly matched matrix to get the score
• Scores
• Sort and find the best restriction enzymes according to the highest scores
• Output: The Best Restriction Enzymes
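The normalization and ranking can be sketched as follows. The "perfectly matched matrix" is taken here to be the score matrix an enzyme would get if its distinct matrix equaled the cluster distinct matrix; that interpretation, the enzyme labels, and all numbers are hypothetical.

```python
def normalized_score(score_matrix, perfect_matrix):
    """Sum of a score matrix divided by the sum of the perfectly matched one."""
    perfect_total = sum(perfect_matrix.values())
    if perfect_total == 0:
        raise ValueError("perfectly matched matrix sums to zero")
    return sum(score_matrix.values()) / perfect_total


def rank_enzymes(scores):
    """Return enzyme labels sorted from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical score matrices for three candidates (upper triangle as dicts).
perfect = {(0, 1): 0.5, (0, 2): 1.0, (1, 2): 1.0}
score_matrixes = {
    "EnzymeA": {(0, 1): 0.5, (0, 2): 1.0, (1, 2): 0.0},
    "EnzymeB": {(0, 1): 0.5, (0, 2): 1.0, (1, 2): 1.0},
    "EnzymeC": {(0, 1): 0.0, (0, 2): 1.0, (1, 2): 1.0},
}

scores = {
    name: normalized_score(matrix, perfect)
    for name, matrix in score_matrixes.items()
}
ranking = rank_enzymes(scores)
print(scores)
print(ranking)
```

Dividing by the perfect total keeps every score between 0 and 1, so candidates stay comparable across different cluster definitions.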

Program Application

[Flowchart]
• Sample Vectors of Restriction Enzyme → Distinct Matrixes
• Cluster → Distinct Matrix
– User-defined
• Match and Weight → Score
– By Artificial Weight

Discussion (I)

• Find the best mask for a target vector from several candidate vectors
• Why is natural weight meaningful for identical elements?
– Differences between classes are higher than differences within a class
– (Cluster, Restriction Enzyme) = (SAME, SAME) does not influence the result significantly
– Distinguishability is retained

Discussion (II)

• How is the artificial weight matrix defined?
– Distinguishability for species
  • (Cluster, Restriction Enzyme) = (DIFFERENT, DIFFERENT)
  • (Cluster, Restriction Enzyme) = (DIFFERENT, SAME)
– Distinguishability for clusters
  • (Cluster, Restriction Enzyme) = (DIFFERENT, DIFFERENT)
– Identity within cluster
  • (Cluster, Restriction Enzyme) = (SAME, SAME)
  • Not so important; weight it!
  • Since it is less important than distinguishability, the weight value is usually within 0~1.

Restriction Enzyme\Cluster   SAME     DIFFERENT
SAME                         Weight   0
DIFFERENT                    0        1

Conclusion

• Choose the most similar vector
– from a set of sample vectors
– to match the given clusters
• Applied to choosing restriction enzymes
– according to the clusters
• Clusters can be obtained
– by user definition
– by phylogenetic distance matrix
• Weighting can be achieved
– according to phylogenetic distance matrix
– according to artificial weight
• The application provides an easy environment for manipulating clusters and selection methods.
