Clustering the Temporal Sequences of 3D Protein Structure

Mayumi Kamada+*, Sachi Kimura, Mikito Toda‡, Masami Takata+, Kazuki Joe+
+: Graduate School of Humanities and Science, Information and Computer Sciences, Nara Women's University
‡: Department of Physics, Nara Women's University


Post on 03-Jan-2016



TRANSCRIPT

  • Clustering the Temporal Sequences of 3D Protein Structure
    Mayumi Kamada+*, Sachi Kimura, Mikito Toda‡, Masami Takata+, Kazuki Joe+
    +: Graduate School of Humanities and Science, Information and Computer Sciences, Nara Women's University
    ‡: Department of Physics, Nara Women's University

  • Outline
    Motivation
    Flexibility Docking
    Feature Extraction using Motion
    Analysis
    Conclusions and Future Work

  • Motivation
    Proteins among biological molecules
    Docking: a protein changes its own shape and combines with other molecules

    Prediction of docking leads to prediction of the resulting functions

  • Existing Docking Simulation
    Diagram: structure A and structure B from the PDB* (rigid structures) → docking simulation → predicted structures from docking
    * Protein Data Bank
    Proteins fluctuate in living cells, so prediction accuracy is low
    Docking simulation considering fluctuations

  • Flexibility Docking
    Diagram: structure A and structure B from the PDB → flexibility handling → docking simulation → predicted structures from docking
    Flexibility handling: considers the fluctuation of proteins in living cells
    Extraction of fluctuated structures
    Consideration of the structural fluctuation of proteins

  • Flexibility Handling
    Diagram: MD → output files → filter → representative structures
    Molecular dynamics (MD) simulation: simulation of the motion of molecules in a polyatomic system
    Filtering: selection of representative structures from similar structures

  • Filters using RMSD
    RMSD (Root Mean Square Deviation): compares the similarity of two structures

    Two proposed filtering algorithms:
      Maximum RMSD selection filter
      Below RMSD 1 deletion filter
    Result: useful for the thermal fluctuation condition
    Drawback of RMSD: topology information is unified into one value, so information is lost
    Feature extraction focusing on protein motion, not structure
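The RMSD comparison on this slide can be sketched as follows. This is a minimal illustration, not the authors' filter: it assumes the two conformations are already superimposed, and `drop_near_duplicates` is a hypothetical reading of a "delete below threshold" filter, with the threshold value chosen only for the example.

```python
import numpy as np

def rmsd(a, b):
    """RMSD between two conformations given as (n_atoms, 3) arrays.

    Assumption: the structures are already superimposed (no fitting is done).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("structures must have the same shape")
    # Mean of squared per-atom displacements, then square root
    return float(np.sqrt(np.mean(np.sum((a - b) ** 2, axis=1))))

def drop_near_duplicates(frames, threshold=1.0):
    """Hypothetical filter: keep a frame only if its RMSD to every
    previously kept frame is at least `threshold`."""
    kept = []
    for i, frame in enumerate(frames):
        if all(rmsd(frame, frames[j]) >= threshold for j in kept):
            kept.append(i)
    return kept
```

Because each pair of structures is reduced to a single number, two frames that move in very different ways can still receive the same RMSD, which is the loss of information the slide points to.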

  • Capture Protein Motion
    Pipeline: MD → wavelet transform → clustering
    Feature extraction: continuous wavelet transform with the Morlet wavelet (the frequency may change momentarily!)
    Selection of representative motions: clustering with Affinity Propagation
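A continuous wavelet transform with a Morlet wavelet can be sketched in plain NumPy as below. The function name `morlet_cwt`, the centre frequency `omega0 = 6`, the window of four scale widths, and the normalisation are all assumptions made for illustration; the slides do not state the parameters actually used.

```python
import numpy as np

def morlet_cwt(signal, scales, dt=1.0, omega0=6.0):
    """Continuous wavelet transform of a 1-D signal with a Morlet wavelet.

    Returns a complex array of shape (len(scales), len(signal)).
    """
    signal = np.asarray(signal, dtype=float)
    n = signal.size
    out = np.empty((len(scales), n), dtype=complex)
    for k, s in enumerate(scales):
        # Sample the wavelet on a window of +/- 4 scale widths
        half = int(np.ceil(4.0 * s / dt))
        t = np.arange(-half, half + 1) * dt
        psi = np.pi ** -0.25 * np.exp(1j * omega0 * t / s) * np.exp(-0.5 * (t / s) ** 2)
        # Correlating with conj(psi) equals convolving with its time reverse
        kernel = np.conj(psi[::-1]) * dt / np.sqrt(s)
        full = np.convolve(signal, kernel, mode="full")
        # Keep the part aligned with the original samples
        out[k] = full[half:half + n]
    return out
```

With this wavelet, a scale `s` responds most strongly to frequencies near `omega0 / (2 * pi * s)`, so `np.abs(morlet_cwt(x, scales))` gives a time-frequency picture in which a momentary change of frequency shows up as a shift between scales over time.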

  • Target Protein
    Protein: 1TIB (residue length: 269)
    MD simulation
      Software: AMBER
      Simulation run time: 2 ns
      Result data files: 200
      Data: spatial coordinates of Cα atoms

  • Singular Value Decomposition
    SVD (singular value decomposition)

    Definition: $A = U \Sigma V^{*}$

    Unitary matrix U: left-singular vectors (spatial motion)
    Unitary matrix V: right-singular vectors (frequency fluctuation)
    Matrix size of A: 807 × 199
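As a small sketch of the decomposition, the snippet below runs NumPy's SVD on a random placeholder matrix of size 807 × 199 (807 = 3 coordinates × 269 residues). The random data only stands in for the real matrix, which is not available here.

```python
import numpy as np

rng = np.random.default_rng(0)
# Placeholder with the same shape as the matrix on the slide (not real data)
A = rng.standard_normal((807, 199))

# Economy-size SVD: A = U @ diag(s) @ Vh
U, s, Vh = np.linalg.svd(A, full_matrices=False)

# Columns of U are the left-singular vectors, rows of Vh are the
# (conjugate-transposed) right-singular vectors, s is sorted descending.
print(U.shape, s.shape, Vh.shape)
```

With `full_matrices=False`, `U` keeps only the 199 columns that pair with singular values, which is enough to rebuild `A`.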

  • Verification of Reproducibility
    Singular values and principal components

    Left singular vectors (spatial motion)
    Right singular vectors (frequency fluctuation)

  • Reproducibility
    Using the eight principal components, the motion expressed by all 199 components can be reproduced.
    The reproduced motion almost matches the original.
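One way to check this kind of reproducibility is to rebuild the matrix from its leading components and measure the relative error. The helper names below are hypothetical, and the Frobenius norm is an assumed error measure; the slides compare the motions graphically.

```python
import numpy as np

def truncate_svd(A, k):
    """Rank-k approximation of A built from its k largest singular values."""
    U, s, Vh = np.linalg.svd(A, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vh[:k]

def relative_error(A, A_k):
    """Frobenius-norm error of the approximation, relative to A."""
    return float(np.linalg.norm(A - A_k) / np.linalg.norm(A))

# Usage on synthetic data that has exactly eight components
rng = np.random.default_rng(1)
demo = rng.standard_normal((807, 8)) @ rng.standard_normal((8, 199))
print(relative_error(demo, truncate_svd(demo, 8)))
```

For real data the error at k = 8 would not be zero; the claim on the slide corresponds to that error being small.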

  • Examination
    (1) Each of the singular values

    (2) The first singular value
    Accounts for more than about 30%
    The original motion can be expressed by six singular values
    The first singular value is useful
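The share attributed to each singular value can be computed as in the sketch below. Whether the slide's figure of about 30% refers to the singular values themselves or to their squares is not stated; this sketch assumes the plain singular values, and the function name is hypothetical.

```python
import numpy as np

def contribution_ratios(singular_values):
    """Share of each singular value in the total, plus the running total.

    Assumption: ratios are taken over the singular values themselves,
    not over their squares.
    """
    s = np.asarray(singular_values, dtype=float)
    ratio = s / s.sum()
    return ratio, np.cumsum(ratio)
```

The running total shows how many leading components are needed before the share approaches one, which is the kind of check behind choosing a small number of components.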

  • Clustering Analysis
    Focus on the first principal component
    Definition: similarities and preference

    Clustering by using the above values

  • Similarities (1)
    For left singular vectors
    Difference of spatial directions: inner products

    Similarity : C
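The similarity formula itself did not survive in this transcript. As one possible illustration of an inner-product similarity between directions, the sketch below uses the absolute cosine between vectors. Taking the absolute value is an assumption (the sign of a singular vector is arbitrary), and `direction_similarity` is a hypothetical name.

```python
import numpy as np

def direction_similarity(vectors):
    """Pairwise |cosine| between the columns of `vectors`.

    Assumption: directions v and -v are treated as the same direction.
    """
    V = np.asarray(vectors, dtype=float)
    V = V / np.linalg.norm(V, axis=0, keepdims=True)  # unit-length columns
    return np.abs(V.T @ V)
```

The result is a symmetric matrix with ones on the diagonal, where values near one mean nearly parallel spatial directions.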

  • Similarities (2)
    For right singular vectors
    Difference between spectrum distributions: Hellinger distance

    Similarity:
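The Hellinger distance between two discrete distributions can be computed as below. How the spectra are turned into distributions, and how the distance is mapped to a similarity, are not recoverable from the transcript; this sketch simply normalises non-negative inputs to sum to one.

```python
import numpy as np

def hellinger(p, q):
    """Hellinger distance between two non-negative spectra.

    Each input is normalised to sum to one before comparison.
    The result lies in [0, 1]: 0 for identical distributions,
    1 for distributions with no overlap.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)))
```

Since clustering here works on similarities, one simple option (an assumption, not the slide's definition) would be to use the negative distance as the similarity.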

  • Clustering Method
    Affinity propagation (AP)
    Brendan J. Frey and Delbert Dueck, "Clustering by Passing Messages Between Data Points," Science 315, 972-976 (2007)
    Obtains exemplars: cluster centers

    Preference
      Left singular vectors: average of similarities
      Right singular vectors: based on the minimum and maximum of similarities
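The message-passing scheme of affinity propagation can be sketched in NumPy as follows. This is a compact illustration of the published update rules, not the code used in the presentation; the damping factor, the fixed iteration count, and the single shared preference value are assumptions.

```python
import numpy as np

def affinity_propagation(S, preference, damping=0.5, max_iter=200):
    """Minimal affinity propagation on a similarity matrix S.

    Returns (exemplar indices, label per point as an exemplar index).
    """
    S = np.array(S, dtype=float)          # copy, input is left untouched
    n = S.shape[0]
    np.fill_diagonal(S, preference)       # self-similarity = preference
    R = np.zeros((n, n))                  # responsibilities
    A = np.zeros((n, n))                  # availabilities
    rows = np.arange(n)
    for _ in range(max_iter):
        # Responsibility: r(i,k) = s(i,k) - max_{k' != k} (a(i,k') + s(i,k'))
        AS = A + S
        idx = np.argmax(AS, axis=1)
        first = AS[rows, idx]
        AS[rows, idx] = -np.inf
        second = AS.max(axis=1)
        R_new = S - first[:, None]
        R_new[rows, idx] = S[rows, idx] - second
        R = damping * R + (1 - damping) * R_new
        # Availability: a(i,k) = min(0, r(k,k) + sum of positive r(i',k))
        Rp = np.maximum(R, 0)
        np.fill_diagonal(Rp, R.diagonal())
        A_new = Rp.sum(axis=0)[None, :] - Rp
        diag = A_new.diagonal().copy()    # a(k,k) is not clipped at zero
        A_new = np.minimum(A_new, 0)
        np.fill_diagonal(A_new, diag)
        A = damping * A + (1 - damping) * A_new
    exemplars = np.flatnonzero((A + R).diagonal() > 0)
    if exemplars.size == 0:
        return exemplars, np.full(n, -1)
    # Assign every point to its most similar exemplar
    labels = exemplars[np.argmax(S[:, exemplars], axis=1)]
    labels[exemplars] = exemplars
    return exemplars, labels
```

The preference controls how many exemplars appear: lower values give fewer clusters, which is why the choice of preference (average, minimum, or maximum of the similarities) matters for the results.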

  • Similarities between Left Singular Vectors

  • Clustering of Left Singular Vectors

  • Similarities between Right Singular Vectors

  • Clustering of Right Singular Vectors

  • Discussions
    Each of the motions
      Spatial motion: repetition of several similar spatial motions in time variation
      Frequency fluctuation: repetition of similar frequency patterns in time variation
    Relationship
      Characteristic frequency fluctuation and group transition in spatial motion

  • Conclusions and Future Work
    Flexibility docking
      Flexibility handling: MD and filter
    Feature extraction based on motion
      Wavelet analysis
      Analysis of motions: clustering
    Future work
      Collective motion
      Relationship
      Perform the docking simulation
