Identification of Candidate Drugs for Heart Failure using Tensor Decomposition-Based Unsupervised Feature Extraction Applied to Integrated Analysis of Gene Expression between Heart Failure and Drug Matrix Datasets

Y-h. Taguchi
Department of Physics, Chuo University, Tokyo 112-8551, Japan

DOI: 10.1101/117465
DOI: 10.1007/978-3-319-63312-1_45

Posted on 22-Jan-2018

Page 1: Identification of Candidate Drugs for Heart Failure using Tensor Decomposition-Based Unsupervised Feature Extraction Applied to Integrated Analysis of Gene Expression between Heart


Page 2:

Introduction

Drug discovery (DD) is an experimentally time-consuming and expensive process.

In silico drug discovery enables researchers to reduce the cost and time required for this complicated process.

There are two major in silico DD strategies: ligand based and structure based.

Page 3:

Pros and cons of ligand based and structure based DD

Ligand based:
(Relatively) easy to perform.
(Relatively) high accuracy.
But unable to identify compounds that lack similarity to known drug compounds.

Structure based:
No need for similarity to known drug compounds.
But needs (inferred) protein structures.
Computationally massive (cannot be applied to a library of more than a million compounds).

Page 4:

The 3rd in silico DD strategy: gene expression based

Instead of (structural) similarity between ligands, this strategy uses similarity between gene expression profiles (of drug-treated cell lines or model animals).

(Relatively) easy to perform.
Needs a massive (labeled) data set for training.

The purpose of this research: to propose gene expression based in silico DD without training data sets (via unsupervised learning).

Page 5:

Method: Tensor decomposition (TD) based unsupervised feature extraction (FE)

[Figure: an N × M matrix X (numerical values) with N features and M samples in categorical multiclasses is decomposed by SVD. Plot axes: 1st and 2nd singular value vectors, for samples and for genes, with + markers. Annotation: no distinction between classes.]

Example: singular value decomposition (SVD) applied to a matrix: samples and features are embedded into a Q-dimensional space.
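The embedding described above can be sketched with NumPy. This is only an illustration under assumptions: the matrix sizes, the random data, and the helper name `svd_embed` are hypothetical and do not come from the slides.

```python
import numpy as np

def svd_embed(X, Q):
    """Embed the rows (features) and columns (samples) of X into Q dimensions."""
    # Thin SVD: X = U @ diag(s) @ Vt
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    feature_coords = U[:, :Q]     # N x Q, one row per feature
    sample_coords = Vt[:Q, :].T   # M x Q, one row per sample
    return feature_coords, s[:Q], sample_coords

# Hypothetical data: N = 100 features, M = 20 samples
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))
features, singular_values, samples = svd_embed(X, Q=2)
print(features.shape, samples.shape)
```

Each feature and each sample receives Q coordinates, so both can be plotted in the same low-dimensional space without using any class labels.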

Page 6:

Synthetic example

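[Figure: synthetic data matrix with two groups of 10 samples and with 90 features and 10 features; entries are labeled N(0), N(μ), and [N(μ)+N(0)]/2, where N(μ) is a normal distribution with mean μ and SD 1/2. Scatter plot of the 1st vs 2nd singular value vectors; +: top 10 outliers.]

Thus, extracting outliers selects features distinct between two classes in an unsupervised way.

Accuracy (100 trials): 89.5% (...), 52.6% (...)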
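A simplified version of this synthetic experiment can be sketched as follows. Several details are assumptions rather than the slide's exact setup: two classes of 10 samples, 10 features with mean μ in the first class only, 90 pure-noise features, per-feature centering before SVD, and use of the first feature singular vector only. The mixed entries [N(μ)+N(0)]/2 from the slide are not reproduced, and the function names are hypothetical.

```python
import numpy as np

def make_synthetic(rng, mu, n_signal=10, n_noise=90, n_per_class=10, sd=0.5):
    """Features x samples matrix; only the first n_signal features differ by class."""
    n_features = n_signal + n_noise
    X = rng.normal(0.0, sd, size=(n_features, 2 * n_per_class))
    X[:n_signal, :n_per_class] += mu  # class 1 of the signal features follows N(mu)
    return X

def top_outliers(X, k=10):
    """Indices of the k features with the largest |component| in the 1st singular vector."""
    Xc = X - X.mean(axis=1, keepdims=True)  # centering is an assumption of this sketch
    U, _, _ = np.linalg.svd(Xc, full_matrices=False)
    return set(np.argsort(-np.abs(U[:, 0]))[:k].tolist())

rng = np.random.default_rng(1)
X = make_synthetic(rng, mu=2.0)
selected = top_outliers(X, k=10)
hits = len(selected & set(range(10)))
print(f"{hits} of 10 class-dependent features recovered without labels")
```

No class labels enter `top_outliers`; the class-dependent features stand out only because they dominate the leading singular vector.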

Page 7:

In this study, the tensor is generated from the mathematical product of gene expression between human diseases and drug-treated model animals.

[Figure: the human (genes × human samples) and animal (genes × animal samples) expression matrices are multiplied to give a tensor with human samples, animal samples, and genes as its modes; TD is then applied to this tensor.]

Page 8:

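$$x_{j_1 j_2 j_3 i} = x_{i j_1 j_2}\, x_{i j_3} = \sum_{l_1, l_2, l_3, l_4} G(l_1, l_2, l_3, l_4)\, x_{l_1 j_1}\, x_{l_2 j_2}\, x_{l_3 j_3}\, x_{l_4 i}$$

x_{i j_1 j_2}: gene expression of drug-treated animals; x_{i j_3}: gene expression of human samples.

j_1: animal samples (compounds)
j_2: animal samples (time points)
j_3: human samples (patients vs normal control)
i: gene

The tensor x_{j_1 j_2 j_3 i} is decomposed by TD.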
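The construction above can be sketched in NumPy. The array sizes and random data are hypothetical, and the decomposition shown is a plain higher-order SVD (one SVD per unfolded mode), assumed here as one way to obtain a core tensor G and singular value vectors of the form in the equation; the slides do not state which TD algorithm or software was used.

```python
import numpy as np

def build_product_tensor(animal, human):
    """animal: (genes, compounds, time points); human: (genes, human samples).
    Returns x[j1, j2, j3, i] = animal[i, j1, j2] * human[i, j3]."""
    return np.einsum("iab,ic->abci", animal, human)

def unfold(tensor, mode):
    """Matricize: rows index `mode`, columns index all remaining modes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def hosvd(tensor):
    """Return the core tensor G and one matrix of singular value vectors per mode."""
    factors = []
    for mode in range(tensor.ndim):
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u)
    core = tensor
    for mode, u in enumerate(factors):
        # Project this mode onto its singular value vectors
        core = np.moveaxis(np.tensordot(u.T, core, axes=(1, mode)), 0, mode)
    return core, factors

def reconstruct(core, factors):
    """Sum over l1..l4 of G(l1,l2,l3,l4) times the four singular value vectors."""
    out = core
    for mode, u in enumerate(factors):
        out = np.moveaxis(np.tensordot(u, out, axes=(1, mode)), 0, mode)
    return out

# Hypothetical sizes: 30 genes, 4 compounds, 3 time points, 5 human samples
rng = np.random.default_rng(0)
animal = rng.normal(size=(30, 4, 3))
human = rng.normal(size=(30, 5))
tensor = build_product_tensor(animal, human)
core, factors = hosvd(tensor)
print(tensor.shape, core.shape, [f.shape for f in factors])
```

Here `factors[0]`, `factors[1]`, `factors[2]`, and `factors[3]` hold the compound, time point, human sample, and gene singular value vectors as columns.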

Page 9:

Time points singular value vectors (SVV)

[Figure panels: First, Second, Third, Fourth]

Page 10:

Human sample singular value vectors (SVV)

[Figure label: two heart failure groups vs. normal]

Page 11:

Histogram of drug singular value vectors (SVV)

43 compounds

Page 12:

281 genes affected by drug treatments were also selected, using the 21st, 25th, 27th, 28th, 33rd, 36th, 37th, 38th, 41st, and 42nd gene SVVs.

Although I performed extensive biological evaluations of the genes and compounds, I cannot discuss them here due to lack of time. See my paper in the proceedings for more details.
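One simple way to pick genes from several SVVs at once is sketched below. The slides do not state the selection criterion, so the scoring rule here (sum of squared components over the chosen vectors, then keeping the largest scores) is an assumption, not the author's procedure. The data, the planted genes, and the name `select_genes` are hypothetical; only the list of SVV ordinals is taken from the slide.

```python
import numpy as np

# Ordinals of the gene singular value vectors named on the slide
SVV_ORDINALS = [21, 25, 27, 28, 33, 36, 37, 38, 41, 42]

def select_genes(gene_svv, ordinals, k):
    """gene_svv: (genes, vectors) matrix whose columns are gene SVVs.
    Rank genes by the summed squared components over the chosen columns."""
    cols = [o - 1 for o in ordinals]  # 1-based ordinals to 0-based columns
    score = np.sum(gene_svv[:, cols] ** 2, axis=1)
    return np.argsort(-score)[:k]

# Hypothetical example: 500 genes, 50 vectors, 5 planted outlier genes
rng = np.random.default_rng(0)
gene_svv = rng.normal(0.0, 0.01, size=(500, 50))
planted = [3, 40, 77, 120, 499]
gene_svv[planted, 20] = 1.0  # large component in the 21st vector
chosen = select_genes(gene_svv, SVV_ORDINALS, k=5)
print(sorted(chosen.tolist()))
```

In practice the cutoff `k` would have to come from some criterion on the scores; the slide reports 281 genes but not how that number was reached.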

Page 13:

Summary

I have developed an unsupervised method of in silico DD using gene expression.

I have successfully applied this method to the combination of drug-treated model animals (rats) and human disease (heart failure).

Biological evaluations of the identified genes are reasonable.

More extensive applications together with drug target genes are under consideration (already submitted).