Selecting Restriction Enzymes for Terminal Restriction Fragment Length Polymorphism Based on Phylogenetic Distance
Presenter: Mei, Ko-Jen (N18981246), Mechanical Engineering, PhD year 2
Outline
• Introduction
• Algorithm
  – Skeleton
  – Distinct Matrix Establishment for Clusters
  – Distinct Matrix Establishment for Sample Vectors of Restriction Enzymes
  – Match and Weight
• Program Application
• Discussion
• Conclusion
Introduction
• What is T-RFLP?
• What is a phylogenetic distance matrix?
• What is a restriction enzyme?
• What is the concern in selecting restriction enzymes for T-RFLP?
• How was this problem solved in the past?
Algorithm - Skeleton
• Sample Vectors of Restriction Enzymes → Distinct Matrices
• Cluster → Distinct Matrix
  – By Phylogenetic Distance Matrix
  – By User-defined
• Match and Weight → Score
  – By Phylogenetic Distance Matrix
  – By Artificial Weight
Algorithm - Distinct Matrix Establishment for Clusters
• Input: Phylogenetic Distance Matrix (n×n upper triangular matrix, values between 0 and 1)
• Input: Distance Cutoff (value between 0 and 1)
• Step: Compare each element with the distance cutoff
• Output: Distinct Matrix for Clusters (n×n upper triangular boolean matrix, values in {SAME, DIFFERENT})
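The thresholding step for clusters can be sketched in a few lines of Python. This is only an illustration, not the presenter's program: the function name `cluster_distinct_matrix`, the use of `None` for unused cells, the choice to skip the diagonal, and the rule that a pair is DIFFERENT when its distance is strictly greater than the cutoff are all assumptions, since the slide does not state them.

```python
SAME, DIFFERENT = False, True  # boolean labels for a pair of species


def cluster_distinct_matrix(distance, cutoff):
    """Threshold an n x n phylogenetic distance matrix into SAME/DIFFERENT.

    Only the strict upper triangle (i < j) is filled; other cells stay None.
    Assumption: a pair is DIFFERENT when its distance exceeds the cutoff.
    """
    n = len(distance)
    distinct = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            distinct[i][j] = DIFFERENT if distance[i][j] > cutoff else SAME
    return distinct


# Made-up distances for three species; species 0 and 1 are close relatives.
distance = [
    [0.0, 0.05, 0.40],
    [0.0, 0.0, 0.35],
    [0.0, 0.0, 0.0],
]
result = cluster_distinct_matrix(distance, cutoff=0.10)
```

With these made-up numbers, species 0 and 1 fall in the same cluster and species 2 is separated from both.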
Algorithm - Distinct Matrix Establishment for Sample Vectors of Restriction Enzymes (I)
• Input: Sample Vectors of Restriction Enzymes (m vectors of length n, all values integers)
• Input: Degree of Combination
• Input: T-RFLP Cutoff (integer)
• Step: Permutation, giving Sample Vectors of Restriction Enzymes (m* vectors of length n, all values integers)
• Step: Build difference matrices by pairwise differencing within each sample vector, giving Difference Matrices of Restriction Enzymes (m* upper triangular matrices of dimension n×n, all values integers)
• Step: Compare every element of every difference matrix with the T-RFLP cutoff, giving Distinct Matrices of Restriction Enzymes (m* upper triangular boolean matrices of dimension n×n, all values in {SAME, DIFFERENT})
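A minimal Python sketch of the difference and cutoff steps for a single sample vector follows. The function names, the use of the absolute difference, and the rule that a pair is DIFFERENT when the difference is strictly greater than the T-RFLP cutoff are assumptions made for illustration. The combination step is left out because the slides do not say how several enzymes are merged into one vector.

```python
SAME, DIFFERENT = False, True  # boolean labels for a pair of species


def difference_matrix(fragment_lengths):
    """Pairwise absolute differences within one sample vector (upper triangle)."""
    n = len(fragment_lengths)
    diff = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            diff[i][j] = abs(fragment_lengths[i] - fragment_lengths[j])
    return diff


def enzyme_distinct_matrix(fragment_lengths, trflp_cutoff):
    """Label each pair SAME or DIFFERENT by comparing with the T-RFLP cutoff.

    Assumption: fragments whose lengths differ by more than the cutoff are
    treated as distinguishable (DIFFERENT); otherwise they are SAME.
    """
    diff = difference_matrix(fragment_lengths)
    n = len(fragment_lengths)
    distinct = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            distinct[i][j] = DIFFERENT if diff[i][j] > trflp_cutoff else SAME
    return distinct


# Made-up terminal fragment lengths for three species under one enzyme.
lengths = [120, 121, 300]
diff = difference_matrix(lengths)
distinct = enzyme_distinct_matrix(lengths, trflp_cutoff=2)
```

Running the same two functions over each of the m* sample vectors would give the m* distinct matrices described above.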
Algorithm - Distinct Matrix Establishment for Sample Vectors of Restriction Enzymes (II)
Algorithm - Match and Weight (I)
Artificial Weight
• Input: Distinct Matrices for Restriction Enzymes
• Input: Distinct Matrix for Clusters
• Input: Artificial Weight Matrix (2×2 float matrix)
• Step: Match and weight by the weight matrix
• Output: Score Matrices (m* n×n upper triangular float matrices)
Natural Weight
• Input: Distinct Matrices for Restriction Enzymes
• Input: Distinct Matrix for Clusters
• Input: Phylogenetic Distance Matrix
• Step: Match and dot with the phylogenetic distance matrix
• Output: Score Matrices (m* n×n upper triangular float matrices)
Artificial Weight Matrix
Restriction Enzyme \ Cluster    SAME      DIFFERENT
SAME                            Weight    0
DIFFERENT                       0         1
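Both weighting paths can be sketched as follows. The artificial path looks each (restriction enzyme, cluster) pair up in the 2×2 weight table. For the natural path, "match and dot" is read here as multiplying a 0/1 match indicator by the phylogenetic distance of that pair; that reading, the function names, and the example weight of 0.5 are assumptions, not details from the slides.

```python
SAME, DIFFERENT = False, True  # boolean labels for a pair of species


def score_artificial(enzyme, cluster, weight_table):
    """Score each pair by looking up (enzyme label, cluster label) in the table."""
    n = len(cluster)
    score = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            score[i][j] = weight_table[(enzyme[i][j], cluster[i][j])]
    return score


def score_natural(enzyme, cluster, distance):
    """Score each matching pair by its phylogenetic distance, else 0.

    Assumption: "match and dot" means match indicator times distance.
    """
    n = len(cluster)
    score = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            match = 1.0 if enzyme[i][j] == cluster[i][j] else 0.0
            score[i][j] = match * distance[i][j]
    return score


# Keys are (restriction enzyme label, cluster label); 0.5 is an example weight.
weight_table = {
    (SAME, SAME): 0.5,
    (SAME, DIFFERENT): 0.0,
    (DIFFERENT, SAME): 0.0,
    (DIFFERENT, DIFFERENT): 1.0,
}
cluster = [[None, SAME, DIFFERENT], [None, None, DIFFERENT], [None, None, None]]
enzyme = [[None, SAME, DIFFERENT], [None, None, SAME], [None, None, None]]
distance = [[0.0, 0.05, 0.40], [0.0, 0.0, 0.35], [0.0, 0.0, 0.0]]

artificial = score_artificial(enzyme, cluster, weight_table)
natural = score_natural(enzyme, cluster, distance)
```

In this made-up example the enzyme fails to separate species 1 and 2, so that pair scores 0 under both weightings.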
Algorithm - Match and Weight (II)
• Input: Score Matrices
• Step: Sum all elements in each matrix, then divide that sum by the sum of the perfectly matched matrix to get the score
• Output: Scores
• Step: Sort the scores and pick the restriction enzymes with the highest scores
• Output: The Best Restriction Enzymes
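The normalization and ranking step might look like the sketch below. It assumes the "perfectly matched matrix" is the score matrix obtained when an enzyme's distinct matrix equals the cluster distinct matrix. The enzyme names `enzyme_A` and `enzyme_B` and all numbers are hypothetical.

```python
def upper_sum(matrix):
    """Sum the strict upper triangle (i < j) of a square score matrix."""
    n = len(matrix)
    return sum(matrix[i][j] for i in range(n) for j in range(i + 1, n))


def rank_enzymes(score_matrices, perfect_matrix):
    """Normalize each matrix sum by the perfect-match sum and sort descending."""
    perfect_total = upper_sum(perfect_matrix)
    if perfect_total == 0:
        raise ValueError("perfectly matched matrix sums to zero")
    scores = {
        name: upper_sum(matrix) / perfect_total
        for name, matrix in score_matrices.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical score matrices for three species (upper triangle only).
perfect = [[None, 0.5, 1.0], [None, None, 1.0], [None, None, None]]
score_matrices = {
    "enzyme_A": [[None, 0.5, 1.0], [None, None, 0.0], [None, None, None]],
    "enzyme_B": [[None, 0.5, 1.0], [None, None, 1.0], [None, None, None]],
}
ranking = rank_enzymes(score_matrices, perfect)
best_name, best_score = ranking[0]
```

A score of 1.0 means the enzyme reproduces the cluster structure exactly; lower scores mean some pairs are mismatched.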
Program Application
[Figure: the skeleton diagram as used in the program. Sample Vectors of Restriction Enzymes → Distinct Matrices; Cluster (by phylogenetic distance matrix or user-defined) → Distinct Matrix; Match and Weight (by phylogenetic distance matrix or by artificial weight) → Score]
Discussion (I)
• Find the best mask for a target vector from several candidate vectors
• Why is natural weight meaningful for identical elements?
  – The difference between classes is higher than the difference within a class
  – (Cluster, Restriction Enzyme) = (SAME, SAME) does not influence the result significantly
  – Distinguishability is preserved
Discussion (II)
• How is the artificial weight matrix defined?
  – Distinguishability for species
    • (Cluster, Restriction Enzyme) = (DIFFERENT, DIFFERENT)
    • (Cluster, Restriction Enzyme) = (DIFFERENT, SAME)
  – Distinguishability for clusters
    • (Cluster, Restriction Enzyme) = (DIFFERENT, DIFFERENT)
  – Identity within cluster
    • (Cluster, Restriction Enzyme) = (SAME, SAME)
    • Not so important; weight it!
    • Since its importance is lower than that of distinguishability, the weight value is usually between 0 and 1.
Restriction Enzyme \ Cluster    SAME      DIFFERENT
SAME                            Weight    0
DIFFERENT                       0         1
Conclusion
• Choosing the best-matching vector
  – from a set of sample vectors
  – to match the given clusters
• Applied to choosing restriction enzymes
  – according to the clusters
• Clusters can be obtained
  – by user definition
  – from the phylogenetic distance matrix
• Weighting can be done
  – according to the phylogenetic distance matrix
  – according to an artificial weight
• The application provides an easy environment for manipulating clusters and selection methods.