Refined blood-borne miRNome of human diseases via PCA-based feature extraction

Refined blood-borne miRNome of human diseases via PCA-based feature extraction. Y-h. Taguchi, Department of Physics, Chuo University; Yoshiki Murakami, Center for Genomic Medicine, Kyoto University

Upload: y-h-taguchi

Post on 16-Jul-2015


TRANSCRIPT

Page 1: Refined blood-borne miRNome of human diseases via PCA-based feature extraction

Refined blood-borne miRNome of human diseases via PCA-based feature extraction

Y-h. Taguchi
Department of Physics, Chuo University

Yoshiki Murakami
Center for Genomic Medicine, Kyoto University

Page 2: Refined blood-borne miRNome of human diseases via PCA-based feature extraction

Caution:

The main results obtained in collaboration with Prof. Murakami are based on his own experiments (*), but those results relate to a planned patent application. We therefore present our method here as applied to alternative public data.

(*) to be submitted to Journal of Hepatology

Page 3: Refined blood-borne miRNome of human diseases via PCA-based feature extraction

1. The concept of PCA-based feature extraction

2. What is miRNA (will be skipped)?

3. Previous Work (Dry + Wet)

4. Proposed method + Results

5. Summary & Conclusion

Page 4: Refined blood-borne miRNome of human diseases via PCA-based feature extraction

1. The concept of PCA-based feature extraction

Why feature extraction?

・ Avoiding overfitting

・ Need for experimental validation: too many genes/proteins cannot be tested.

・ Several methods require fewer state variables than observations.

One of the problems: feature extraction itself rarely passes a cross-validation test.

Page 5: Refined blood-borne miRNome of human diseases via PCA-based feature extraction

Conventional (diagram)

Samples: Group1, Group2, Group3, divided into a Training Set and a Test Set

Boxes shown: Feature Extraction, Model Construction (each shown twice), Validation

Page 6: Refined blood-borne miRNome of human diseases via PCA-based feature extraction

Proposed (diagram)

Samples: Group1, Group2, Group3, divided into a Training Set and a Test Set

Boxes shown: Feature Extraction, Model Construction (shown twice), Validation

Annotation: without knowledge about classification/target variable

Page 7: Refined blood-borne miRNome of human diseases via PCA-based feature extraction

2. What is miRNA?

miRNA is a kind of non-coding RNA. miRNAs are believed to suppress target gene expression by degradation of mRNAs.

Important features:

・ Typically, hundreds of kinds of miRNAs are found in each species (ca. ≧1000 for human).

・ Each miRNA targets hundreds of genes or more.

・ miRNA mainly contributes to cell type change (e.g., cancer, differentiation, diseases).

・ The influence of miRNA on target gene expression is subtle (〜10%) and context dependent.

・ In spite of that, miRNA critically contributes to the related processes (e.g., induction of cell cycle arrest).

Page 8: Refined blood-borne miRNome of human diseases via PCA-based feature extraction

3. Previous Work (Dry + Wet)

"Toward the blood-borne miRNome of human diseases", A. Keller et al., Nature Methods (2011).

Discrimination between diseases using miRNA in blood

Feature (miRNA) selection : P-value (t test)

Discrimination: SVC with several types of kernels + grid-based optimal parameter search

Page 9: Refined blood-borne miRNome of human diseases via PCA-based feature extraction
Page 10: Refined blood-borne miRNome of human diseases via PCA-based feature extraction

cf. Nature Methods, 10 miRNAs

<0.7

Page 11: Refined blood-borne miRNome of human diseases via PCA-based feature extraction

4. Proposed method + Results

Data

PCA

Feature Selection(without classification information)

LDA
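The four stages above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: the function names (`pca_scores`, `fit_lda`, `pipeline`), the synthetic data, the use of the first two PCs and the distance from the origin to pick outlier miRNAs, and the equal-prior two-class Fisher LDA are all assumptions. For brevity the sketch predicts on the same samples it was fitted on and omits any cross-validation.

```python
import numpy as np

def pca_scores(M, n_components):
    """Rows of M are observations; return their scores on the leading PCs."""
    Mc = M - M.mean(axis=0, keepdims=True)           # center each column
    U, S, _ = np.linalg.svd(Mc, full_matrices=False)
    return U[:, :n_components] * S[:n_components]

def fit_lda(Z, y):
    """Two-class Fisher LDA with equal priors (an assumption of this sketch).
    Returns (w, threshold); a sample z is assigned to class 1 if z @ w > threshold."""
    Z0, Z1 = Z[y == 0], Z[y == 1]
    m0, m1 = Z0.mean(axis=0), Z1.mean(axis=0)
    # Pooled within-class covariance matrix
    Sw = ((Z0 - m0).T @ (Z0 - m0) + (Z1 - m1).T @ (Z1 - m1)) / (len(Z) - 2)
    w = np.linalg.solve(Sw, m1 - m0)
    threshold = w @ (m0 + m1) / 2
    return w, threshold

def pipeline(X, y, n_select=10, n_pcs=5):
    """X: (n_mirnas, n_samples) expression matrix; y: 0/1 label per sample."""
    # 1. PCA with miRNAs as observations. No class labels are used here.
    feature_scores = pca_scores(X, 2)
    # 2. Feature selection: keep the miRNAs farthest from the origin (outliers).
    dist = np.linalg.norm(feature_scores, axis=1)
    selected = np.argsort(dist)[::-1][:n_select]
    # 3. PCA with samples as observations, restricted to the selected miRNAs.
    Z = pca_scores(X[selected].T, n_pcs)
    # 4. LDA on the PC scores. Labels enter only at this stage.
    w, threshold = fit_lda(Z, y)
    pred = (Z @ w > threshold).astype(int)
    return selected, pred

# Synthetic example (hypothetical data, not the data used in the talk)
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 40))
y = np.array([0] * 20 + [1] * 20)
X[:10] += 10.0            # ten miRNAs with large values in every sample
X[:5, y == 1] += 4.0      # five of them also differ between the two classes
selected, pred = pipeline(X, y)
accuracy = float((pred == y).mean())
```

Note that steps 1 to 3 never see `y`, which is the point of the proposed scheme: the selected miRNAs do not change when the training/test division changes.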

Page 12: Refined blood-borne miRNome of human diseases via PCA-based feature extraction

PCA (samples: diseases/cancers)

Legend: ◯ Control, △ lung cancer

Panels: diseases, cancers

Page 13: Refined blood-borne miRNome of human diseases via PCA-based feature extraction

Feature extraction (miRNAs)

PCA (miRNAs)

10 outlier miRNAs

Why outliers? ⇒ They make the main contribution to the PCA embeddings of samples.

Why 10? ⇒ To compare with the results of the Nature Methods paper.

(Plot: each point is a miRNA.)
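A minimal sketch of the outlier selection described on this slide, assuming that "outlier" means a miRNA far from the origin in the plane of the first two PCs. That criterion, the function name `select_outlier_features`, and the synthetic data are assumptions made for illustration; the slide itself does not state the exact rule.

```python
import numpy as np

def select_outlier_features(X, n_select=10, n_components=2):
    """X: (n_mirnas, n_samples) expression matrix, one row per miRNA.

    Embeds the miRNAs with PCA (miRNAs are the observations, samples are
    the variables) and returns the indices of the n_select miRNAs that lie
    farthest from the origin in the leading PCs. No class labels are used.
    """
    Xc = X - X.mean(axis=0, keepdims=True)            # center each sample
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]   # miRNA embedding
    dist = np.linalg.norm(scores, axis=1)
    return np.argsort(dist)[::-1][:n_select]

# Synthetic example: 200 miRNAs, 30 samples, ten miRNAs with large values
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 30))
X[:10] += 15.0
selected = select_outlier_features(X, n_select=10)
```

Because miRNAs with large scores dominate the leading PCs, keeping only these miRNAs should preserve most of the structure seen in the sample embedding, which is the rationale given above for choosing outliers.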

Page 14: Refined blood-borne miRNome of human diseases via PCA-based feature extraction

PCA, again (samples after feature extraction)

Legend: ◯ Control, △ lung cancer

Panels: diseases, cancers

Page 15: Refined blood-borne miRNome of human diseases via PCA-based feature extraction

Control vs Lung Cancer
LDA with PCA (after feature extraction, up to the 5th PC)

                      Actual
Prediction       control   lung cancer
control             56          8
lung cancer         14         24

Accuracy     0.784
Specificity  0.800
Sensitivity  0.750
Precision    0.632

cf. Nature Methods, 250 miRNAs: 0.813, 0.844, 0.781
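The four performance figures follow directly from the confusion matrix, taking lung cancer as the positive class. The short check below recomputes them; the function name `binary_metrics` is only illustrative.

```python
def binary_metrics(tp, fn, fp, tn):
    """Standard two-class metrics from confusion-matrix counts."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        "specificity": tn / (tn + fp),   # fraction of actual controls predicted as control
        "sensitivity": tp / (tp + fn),   # fraction of actual lung cancers predicted as lung cancer
        "precision": tp / (tp + fp),     # fraction of lung cancer predictions that are correct
    }

# Counts from the slide (positive class = lung cancer):
#   predicted lung cancer & actual lung cancer = 24 (TP)
#   predicted control     & actual lung cancer =  8 (FN)
#   predicted lung cancer & actual control     = 14 (FP)
#   predicted control     & actual control     = 56 (TN)
metrics = binary_metrics(tp=24, fn=8, fp=14, tn=56)
rounded = {name: round(value, 3) for name, value in metrics.items()}
# rounded == {"accuracy": 0.784, "specificity": 0.8, "sensitivity": 0.75, "precision": 0.632}
```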

Page 16: Refined blood-borne miRNome of human diseases via PCA-based feature extraction

0.813 0.844 0.781   250 miRNAs   Relatively Best

0.867 0.867 0.844   150 miRNAs   Relatively Worst

(+)/(-): comparison with the 10-miRNA results in Nature Methods

>0.70

Page 17: Refined blood-borne miRNome of human diseases via PCA-based feature extraction

Selected miRNAs: diseases/cancers vs normal

(+)/(-): up/downregulated after the transformation by PCA+LDA

(*): not selected independence of diseases/cancers

Page 18: Refined blood-borne miRNome of human diseases via PCA-based feature extraction

Advantages of proposed method

・ No need for classification information in feature selection

・ Feature selection is independent of the training/test set division (and is thus stable)

5. Summary & Conclusion