a biclustering method for rationalizing chemical biology mechanisms of action

Post on 23-Jun-2015

Chemical Interaction Matrix:

Gerald Lushington / LiS Consulting
http://geraldlushington.com / glushington@yahoo.com

Personalized Medicine

Comprehensive Biochemical & Chemical Biology Understanding

Informatics & Creativity

HTS & Chemical Proteomics

Big data: NGS, medical outcomes, etc.

Example Challenges:

● Toxicology: a single toxin may modulate several different biochemical processes

● Cancer: malignant cells have multiple biochemical sensitivities that may be targeted

● Spectral disorders (e.g., autism, Alzheimer's, etc.): distinct phenotypes produce similar symptoms

Discovery Paradigm:

Chemical screening: prospective hits

Chemical proteomics: prospective targets

How to attain comprehensive understanding?

Data Comprehension Reality

Targets / Compounds

How to make sense of diffuse multimode data?

Mechanism of Action (MOA) discovery: find compound subsets that conserve common mechanism

Excellent (but imperfect) example: TEST (Toxicology Estimation Software Tool)

http://www.epa.gov/nrmrl/std/qsar/qsar.html

TEST

Multiple data sets covering toxicity outcomes for numerous compounds

Predict toxicity of query compounds via on-the-fly training to similar pre-characterized analogs

Use Tanimoto distances over molecular fingerprints: no validated relevance to specific outcomes
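The analog lookup described above can be illustrated with a minimal sketch. It assumes each fingerprint is represented as a set of "on" bit positions; the function names and the `library` dictionary are hypothetical and are not taken from TEST itself.

```python
def tanimoto_similarity(fp_a, fp_b):
    """Tanimoto (Jaccard) similarity between two fingerprints given as sets of on-bits."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    if union == 0:
        return 1.0  # assumed convention: two empty fingerprints count as identical
    return len(a & b) / union


def tanimoto_distance(fp_a, fp_b):
    """Distance form: 0 for identical bit sets, 1 for no shared bits."""
    return 1.0 - tanimoto_similarity(fp_a, fp_b)


def nearest_analogs(query_fp, library, k=3):
    """Return the names of the k library compounds closest to the query.

    `library` is a hypothetical dict mapping compound name -> set of on-bits.
    """
    ranked = sorted(library.items(),
                    key=lambda item: tanimoto_distance(query_fp, item[1]))
    return [name for name, _ in ranked[:k]]
```

Nothing in this distance knows about the outcome being modeled: two compounds can be close in fingerprint space and still act through different mechanisms, which is the limitation noted above.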

MOA method: feature / compound selection

Procedure:

1. Assemble matrix of compounds vs. activity & features

2. Normalize

3. Fold activity into features as per: Ci = |Act* - Xi*|

Matrix values after folding: 0 = perfect correlation; 1 = perfect anticorrelation

4. Bicluster
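Steps 2 and 3 can be sketched as follows. The slides say only "Normalize"; min-max scaling to [0, 1] is assumed here because the stated endpoints (0 and 1) imply that range. The function names and the dict-of-columns layout are hypothetical.

```python
def min_max_normalize(values):
    """Scale a list of numbers to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def fold_activity(activity, features):
    """Compute Ci = |Act* - Xi*| for every feature column.

    activity: list of activity values, one per compound.
    features: dict mapping feature name -> list of values, one per compound.
    Returns a dict mapping feature name -> list of folded values.
    """
    act_n = min_max_normalize(activity)  # Act*
    folded = {}
    for name, column in features.items():
        col_n = min_max_normalize(column)  # Xi*
        folded[name] = [abs(a - x) for a, x in zip(act_n, col_n)]
    return folded
```

A feature that tracks activity exactly folds to 0 for every compound. For a perfectly anticorrelated feature the folded value reaches 1 only at the extremes of the activity range and passes through 0 at the midpoint, so the 0/1 reading applies per matrix element rather than per whole column.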

Clusters: contiguous correlative or anticorrelative regions of matrix

Within clusters: molecules may share MOA; features may correlate with activity

Confidence: correlative & predictive quality of model derived from cluster
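The slides do not name the biclustering algorithm, so the following is only a stand-in to make the idea concrete: a deliberately simple alternating selection that grows a block of compounds (rows) and features (columns) whose folded values are low, i.e. a correlative region. The function name, the seed-row strategy, and the threshold of 0.2 are all assumptions.

```python
def low_value_bicluster(matrix, seed_row, threshold=0.2, max_iter=20):
    """Grow a (rows, cols) block of low folded values around one seed compound.

    matrix: list of rows (compounds), each a list of folded values in [0, 1]
            (one per feature).
    Returns (row_indices, col_indices); both empty if nothing qualifies.
    """
    def mean(xs):
        return sum(xs) / len(xs)

    n_rows, n_cols = len(matrix), len(matrix[0])
    rows, cols = [seed_row], []
    for _ in range(max_iter):
        # Keep features whose average folded value over current compounds is low.
        new_cols = [j for j in range(n_cols)
                    if mean([matrix[i][j] for i in rows]) <= threshold]
        if not new_cols:
            return [], []
        # Keep compounds whose average folded value over those features is low.
        new_rows = [i for i in range(n_rows)
                    if mean([matrix[i][j] for j in new_cols]) <= threshold]
        if not new_rows:
            return [], []
        if new_rows == rows and new_cols == cols:
            break  # selection is stable
        rows, cols = new_rows, new_cols
    return rows, cols
```

An anticorrelative region could be searched the same way by applying the function to `1 - value` for every matrix element. A production analysis would more likely rely on an established biclustering implementation than on this sketch.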

Example: Oral Bioavailability

Oral uptake depends on:

● Polar solubility
● Membrane permeability
● Interaction with various transporters

Data (from Tingjun Hou): 773 molecules
http://modem.ucsd.edu/adme/databases/databases_bioavailability.htm

Descriptors (from VolSurf and DVS): 298 features passing information content and linear independence (R < 0.90) filters
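A descriptor filter of this kind might look like the sketch below. It makes two assumptions that the slides do not state: "information content" is approximated by dropping constant columns, and the linear-independence test compares the absolute Pearson correlation |R| against 0.90. The function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient; returns 0.0 if either column is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def drop_redundant_features(features, r_max=0.90):
    """Keep a feature only if it varies and is not too correlated with any kept one.

    features: dict mapping feature name -> list of values, one per compound.
    Features are examined in insertion order, so earlier columns take priority.
    """
    kept = {}
    for name, column in features.items():
        if len(set(column)) < 2:
            continue  # constant column: carries no information (assumed proxy)
        if all(abs(pearson_r(column, other)) < r_max for other in kept.values()):
            kept[name] = column
    return kept
```

Because the pass is greedy, the surviving set depends on column order; sorting columns by some relevance score first would be one way to make the choice less arbitrary.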

Preliminary Model (Weka: Bootstrap Aggregating / REPTree):

Q2(5-fold) = 0.4712

CFS & RF: reduced to 27 features

Q2(5-fold) = 0.4739
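For readers unfamiliar with the metric, a k-fold cross-validated Q2 can be computed as sketched below. This is not the Weka workflow: a one-variable least-squares line stands in for the bagged tree model, folds are assigned by interleaving rather than at random, and Q2 is taken as 1 - PRESS / SS_total with SS_total measured about the overall mean (definitions vary). All function names are hypothetical.

```python
def k_fold_indices(n, k):
    """Deterministic interleaved folds: fold i holds indices i, i+k, i+2k, ..."""
    return [list(range(i, n, k)) for i in range(k)]


def q2_cross_validated(xs, ys, fit, k=5):
    """Q2 = 1 - PRESS / SS_total, with predictions made only on held-out points.

    fit(train_x, train_y) must return a function mapping x -> predicted y.
    """
    n = len(ys)
    y_mean = sum(ys) / n
    ss_total = sum((y - y_mean) ** 2 for y in ys)
    if ss_total == 0:
        raise ValueError("Q2 is undefined when all activities are identical")
    press = 0.0  # predictive residual sum of squares
    for test_idx in k_fold_indices(n, k):
        held = set(test_idx)
        train_x = [xs[i] for i in range(n) if i not in held]
        train_y = [ys[i] for i in range(n) if i not in held]
        predict = fit(train_x, train_y)
        press += sum((ys[i] - predict(xs[i])) ** 2 for i in test_idx)
    return 1.0 - press / ss_total


def fit_line(train_x, train_y):
    """Stand-in model: ordinary least-squares line through the training points."""
    n = len(train_x)
    mx, my = sum(train_x) / n, sum(train_y) / n
    sxx = sum((x - mx) ** 2 for x in train_x)
    sxy = sum((x - mx) * (y - my) for x, y in zip(train_x, train_y))
    slope = sxy / sxx if sxx else 0.0
    intercept = my - slope * mx
    return lambda x: slope * x + intercept
```

A Q2 near 1 means held-out activities are predicted almost exactly, while a model no better than predicting the training mean scores at or below 0. On that scale, values near 0.47 indicate a model that captures only part of the variance.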

Biclustering: Before and After

Clusters as local training sets:

Condense to 18 high-quality clusters that cover almost the entire training space (omitting only 10 of 768 compounds)
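The condensation step can be pictured with a small sketch. The slides do not say how cluster quality was scored or thresholded, so the `(quality, members)` tuples, the `min_quality` cutoff, and both function names are hypothetical.

```python
def condense_clusters(clusters, min_quality):
    """Keep clusters whose local-model quality meets the cutoff, best first.

    clusters: list of (quality, set_of_compound_ids) tuples, where quality is
              some score for the model trained on that cluster (e.g. a Q2 value).
    """
    kept = [(q, members) for q, members in clusters if q >= min_quality]
    kept.sort(key=lambda c: c[0], reverse=True)
    return kept


def uncovered_compounds(all_compounds, kept_clusters):
    """Return the compounds that belong to none of the kept clusters."""
    covered = set()
    for _, members in kept_clusters:
        covered |= members
    return set(all_compounds) - covered
```

The size of the set returned by `uncovered_compounds` corresponds to the "omitted" count quoted above: compounds for which no sufficiently reliable local training set was found.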

Conclusions

Correlative & predictive performance of subset models gives strong confidence in MOA conservation in clusters

Head-to-head comparison with chemical proteomics data should provide strong basis for target identification

Questions / Suggestions? glushington@yahoo.com
