Text Mining MEDLINE
A Simple Application to Mine MEDLINE

Raf Podowski
Life Sciences
Oracle Corporation
[email protected]


Outline

  Loading
  Retrieval
  Indexing
  Searching
  Tokens
  Themes
  Thesauri
  Clustering
  Classification
  Demo

Loading

Storage choice
  XML DB
  CLOB

Loading method
  SQL*Loader
  INSERT
  UPSERT

Initial Loading with 12M Records

Periodic Updates

Loading

    -- One row per citation: PubMed ID plus the citation XML
    create table medtab (pmid number primary key, text xmltype);

    -- SQL*Loader control file
    LOAD DATA
    INFILE 'medline.dat'
    BADFILE 'medline.bad'
    DISCARDFILE 'medline.discard'
    INTO TABLE medtab
    REPLACE
    FIELDS TERMINATED BY '\t'
    (pmid, text char(1000000))

    -- Check a loaded record
    select pmid,
           extractValue(text, '/MedlineCitation/Article/Abstract/AbstractText') Abstract
    from medtab
    where PMID='15129431';

Loading

    MERGE INTO table
    USING table/view/subquery
    ON ( condition )
    WHEN MATCHED THEN update clause
    WHEN NOT MATCHED THEN insert clause
    WHEN SKIP ( condition )
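The update-or-insert behaviour of the MERGE skeleton above can be sketched outside the database. This is a minimal illustration in Python, not Oracle's implementation: the target table is modelled as a dict keyed by PMID, and `merge_into` and the sample records are hypothetical names invented for the example.

```python
def merge_into(target, source):
    """Upsert every row of `source` into `target` (both dicts keyed by PMID).

    Returns (updated, inserted) counts.
    """
    updated, inserted = 0, 0
    for pmid, xml in source.items():
        if pmid in target:   # WHEN MATCHED THEN update
            updated += 1
        else:                # WHEN NOT MATCHED THEN insert
            inserted += 1
        target[pmid] = xml
    return updated, inserted


# Hypothetical data: one citation already loaded, a periodic update batch
# that revises it and adds a new one.
medtab = {3298569: "<MedlineCitation>old</MedlineCitation>"}
batch = {
    3298569: "<MedlineCitation>revised</MedlineCitation>",
    9115210: "<MedlineCitation>new</MedlineCitation>",
}
print(merge_into(medtab, batch))
```

A single pass handles both revised and new citations, which is why an upsert suits the periodic updates mentioned earlier.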

Retrieval

    select pmid,
           extractValue(text, '/MedlineCitation/Article/ArticleTitle') Title,
           extractValue(text, '/MedlineCitation/Article/Abstract/AbstractText') Abstract
    from medtab
    where PMID='3298569';

3298569

Estrogen receptor immunoreactivity in meningiomas. Comparison with the binding activity of estrogen, progesterone, and androgen receptors.

Estrogen receptor (ER) analysis was performed in 70 meningioma samples by means of two assays: an enzyme immunoassay that used monoclonal antibodies against human ER protein (estrophilin), and a sensitive radioligand binding assay that used iodine-125-labeled estradiol as the radioligand. Low levels of ER immunoreactivity were found in tumors from 51% of patients, whereas ER binding activity was demonstrated in 40% of the meningiomas examined. In eight (11%) of the tissue samples, multiple binding sites for estradiol were observed. The immunoreactive binding sites corresponded to those of the classic high-affinity ER. In ligand binding studies, however, measurement of classic ER was…

Retrieval

    select extract(text,
           '/MedlineCitation/MeshHeadingList/MeshHeading/DescriptorName').getStringVal()
    from medtab
    where PMID='3298569';

<DescriptorName MajorTopicYN="N">Comparative Study</DescriptorName>
<DescriptorName MajorTopicYN="N">Female</DescriptorName>
<DescriptorName MajorTopicYN="N">Human</DescriptorName>
<DescriptorName MajorTopicYN="N">Immunoenzyme Techniques</DescriptorName>
<DescriptorName MajorTopicYN="N">Male</DescriptorName>
<DescriptorName MajorTopicYN="N">Meningeal Neoplasms</DescriptorName>
<DescriptorName MajorTopicYN="N">Meningioma</DescriptorName>
<DescriptorName MajorTopicYN="N">Middle Aged</DescriptorName>
<DescriptorName MajorTopicYN="N">Radioligand Assay</DescriptorName>
<DescriptorName MajorTopicYN="N">Receptors, Androgen</DescriptorName>
<DescriptorName MajorTopicYN="N">Receptors, Estrogen</DescriptorName>
<DescriptorName MajorTopicYN="N">Receptors, Progesterone</DescriptorName>

Retrieval

    select extract(text,
           '/MedlineCitation/ChemicalList/Chemical/NameOfSubstance').getStringVal()
    from medtab
    where PMID='3298569';

<NameOfSubstance>Receptors, Androgen</NameOfSubstance>
<NameOfSubstance>Receptors, Estrogen</NameOfSubstance>
<NameOfSubstance>Receptors, Progesterone</NameOfSubstance>
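The retrieval queries above pull a single value (extractValue) or a set of repeated nodes (extract) out of the citation XML by path. The same idea can be sketched with Python's standard `xml.etree.ElementTree`. The sample record is a heavily trimmed, hypothetical citation, and `extract_value` and `extract_all` are illustrative helper names, not Oracle functions. ElementTree paths are relative to the root element, so the leading `/MedlineCitation` is dropped.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed citation record for illustration only.
SAMPLE = """
<MedlineCitation>
  <Article>
    <ArticleTitle>Example title</ArticleTitle>
    <Abstract><AbstractText>Example abstract.</AbstractText></Abstract>
  </Article>
  <MeshHeadingList>
    <MeshHeading><DescriptorName MajorTopicYN="N">Female</DescriptorName></MeshHeading>
    <MeshHeading><DescriptorName MajorTopicYN="N">Meningioma</DescriptorName></MeshHeading>
  </MeshHeadingList>
</MedlineCitation>
"""


def extract_value(root, path):
    """Text of the first node matching `path`, or None if nothing matches."""
    return root.findtext(path)


def extract_all(root, path):
    """Text of every node matching `path`, in document order."""
    return [node.text for node in root.findall(path)]


root = ET.fromstring(SAMPLE)
print(extract_value(root, "Article/ArticleTitle"))
print(extract_value(root, "Article/Abstract/AbstractText"))
print(extract_all(root, "MeshHeadingList/MeshHeading/DescriptorName"))
```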

Indexing

Filter
  File formats

Sectioner
  HTML
  XML

Tokenizer
  Lexer - tokenize
  Stoplists - mask

Indexing

Index Types

CONTEXT
  Text retrieval
  CONTAINS query operator

CTXCAT
  Item categories
  CATSEARCH query operator

CTXRULE
  Classification rules
  MATCHES query operator

Indexing

    -- Enable theme indexing
    exec ctx_ddl.create_preference('mylex','BASIC_LEXER');
    exec ctx_ddl.set_attribute('mylex','MIXED_CASE','NO');
    exec ctx_ddl.set_attribute('mylex','THEME_LANGUAGE','ENGLISH');
    exec ctx_ddl.set_attribute('mylex','index_themes','YES');
    exec ctx_ddl.set_attribute('mylex','index_text','YES');

    -- Create XML sections
    exec ctx_ddl.create_section_group('xmlgroup','auto_section_group');

    -- Index column 'text' of table 'medtab' for user 'hex'
    create index medtab_idx on medtab(text)
    indextype is ctxsys.context
    parameters('lexer mylex filter ctxsys.null_filter section group xmlgroup');

Searching

    COL Title FORMAT a60;
    COL S FORMAT 999;
    select score(1) s, pmid,
           extractValue(text, '/MedlineCitation/Article/ArticleTitle') Title
    from medtab
    where CONTAINS(text, 'aldose reductase WITHIN AbstractText', 1) > 0
    ORDER BY score(1) DESC;

   S       PMID TITLE
---- ---------- ------------------------------------------------------------
  59   14768008 Detection and identification of tumor-associated protein variants in human hepatocellular carcinomas.
  12    9537432 New member of aldose reductase family proteins overexpressed in human hepatocellular carcinoma.
  12   10322639 Developmental expression of urine concentration-associated genes and their altered expression in murine infantile-type polycystic kidney disease.
  12    9565553 Identification and characterization of a novel human aldose reductase-like gene.
  12   11261885 Overexpression of aldose reductase in liver cancers may contribute to drug resistance.

Searching

    select score(1) s, pmid,
           extractValue(text, '/MedlineCitation/Article/ArticleTitle') Title
    from medtab
    where contains(text,
          '<query><textquery>aldose reductase WITHIN AbstractText</textquery><score algorithm="COUNT"/></query>', 1) > 0;

   S       PMID TITLE
---- ---------- ------------------------------------------------------------
   5   14768008 Detection and identification of tumor-associated protein variants in human hepatocellular carcinomas.
   1    9537432 New member of aldose reductase family proteins overexpressed in human hepatocellular carcinoma.
   1   10322639 Developmental expression of urine concentration-associated genes and their altered expression in murine infantile-type polycystic kidney disease.
   1    9565553 Identification and characterization of a novel human aldose reductase-like gene.
   1   11261885 Overexpression of aldose reductase in liver cancers may contribute to drug resistance.
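With the COUNT algorithm the scores above appear to be raw occurrence counts of the phrase inside the named section, rather than the relevance scores of the previous query. A rough Python sketch of that idea follows. It assumes a document is already split into named sections, and `count_score` and the sample text are hypothetical; it does not reproduce Oracle Text's lexing or scoring.

```python
import re


def count_score(sections, section, phrase):
    """Count case-insensitive occurrences of `phrase` within one named section."""
    words = [re.escape(w) for w in phrase.split()]
    pattern = r"\b" + r"\s+".join(words) + r"\b"
    return len(re.findall(pattern, sections.get(section, ""), flags=re.IGNORECASE))


# Hypothetical document, split into the sections used by the query.
doc = {
    "ArticleTitle": "A new member of the aldose reductase family.",
    "AbstractText": ("Aldose reductase is overexpressed. "
                     "Inhibiting aldose reductase may help."),
}
print(count_score(doc, "AbstractText", "aldose reductase"))
print(count_score(doc, "ArticleTitle", "aldose reductase"))
```

Restricting the count to one section mirrors the WITHIN AbstractText clause: a hit in the title does not contribute to the abstract's score.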

Tokens

    declare
      the_tokens ctx_doc.token_tab;
    begin
      -- Fetch the indexed tokens of document 9115210 with their offsets
      ctx_doc.tokens('medtab_idx','9115210',the_tokens);
      for i in 1..the_tokens.count loop
        dbms_output.put_line(the_tokens(i).token||' '||the_tokens(i).offset);
      end loop;
    end;

…
MUTATIONS 176
TUMOR 193
SUPPRESSOR 199
GENE 210
PATCHED 215
PTC 224
FOUND 233
HUMAN 242
PATIENTS 248
BASAL 266
CELL 272
NEVUS 277
SYNDROME 283
DISEASE 295
…

Basal cell carcinomas in mice overexpressing sonic hedgehog.

Mutations in the tumor suppressor gene PATCHED (PTC) are found in human patients with the basal cell nevus syndrome, a disease causing developmental defects and tumors, including basal cell carcinomas. Gene regulatory relationships defined in the fruit fly Drosophila suggest that overproduction of Sonic hedgehog (SHH), the ligand for PTC, will mimic loss of ptc function. It is shown here that transgenic mice overexpressing SHH in the skin develop many features of basal cell nevus syndrome, demonstrating that SHH is sufficient to induce basal cell carcinomas in mice. These data suggest that SHH may have a role in human tumorigenesis.
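The token listing above pairs each indexed word with its character offset and skips stopwords such as "in" and "the". The sketch below imitates that behaviour in Python on the first sentence of the abstract. The five-word stoplist and the starting offset of 176 (the position at which the sentence is assumed to begin inside the stored document) are assumptions chosen for illustration, and `tokens_with_offsets` is a hypothetical helper, not part of CTX_DOC.

```python
import re

# Assumed, deliberately tiny stoplist; the real default stoplist is larger.
STOPWORDS = {"in", "the", "are", "with", "a"}


def tokens_with_offsets(text, base=0):
    """Return (TOKEN, offset) pairs for every non-stopword in `text`."""
    result = []
    for match in re.finditer(r"[A-Za-z0-9]+", text):
        word = match.group()
        if word.lower() in STOPWORDS:
            continue  # stopwords are masked out of the index
        result.append((word.upper(), base + match.start()))
    return result


SENTENCE = ("Mutations in the tumor suppressor gene PATCHED (PTC) are found "
            "in human patients with the basal cell nevus syndrome, a disease")

for token, offset in tokens_with_offsets(SENTENCE, base=176):
    print(token, offset)
```

Under these assumptions the printed pairs line up with the listing on the slide, from MUTATIONS 176 through DISEASE 295.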

Themes

    set serveroutput on format wrapped size 1000000;

    declare
      the_themes ctx_doc.theme_tab;
    begin
      -- Fetch up to 10 themes for document 9115210
      ctx_doc.themes('medtab_idx','9115210',the_themes,FALSE,10);
      for i in 1..the_themes.count loop
        dbms_output.put_line(the_themes(i).theme||' : '||the_themes(i).weight);
      end loop;
    end;

metabolisms:33
genetics:33
proteins:27
pathology:26
genes:25
United States:23
humankind:20
basal cells:19
SHH:18
suggestion:17

Basal cell carcinomas in mice overexpressing sonic hedgehog.

Mutations in the tumor suppressor gene PATCHED (PTC) are found in human patients with the basal cell nevus syndrome, a disease causing developmental defects and tumors, including basal cell carcinomas. Gene regulatory relationships defined in the fruit fly Drosophila suggest that overproduction of Sonic hedgehog (SHH), the ligand for PTC, will mimic loss of ptc function. It is shown here that transgenic mice overexpressing SHH in the skin develop many features of basal cell nevus syndrome, demonstrating that SHH is sufficient to induce basal cell carcinomas in mice. These data suggest that SHH may have a role in human tumorigenesis.

Thesauri

    -- Create a case-insensitive thesaurus named 'genes'
    begin
      ctx_thes.create_thesaurus('genes', FALSE);
    end;

    -- Gene identifiers and a gene symbol
    begin
      ctx_thes.create_phrase('genes','LLID_231');
      ctx_thes.create_phrase('genes','LLID_367');
      ctx_thes.create_phrase('genes','LLID_374');
      ctx_thes.create_phrase('genes','AR');
    end;

    -- Names and aliases
    begin
      ctx_thes.create_phrase('genes','androgen receptor');
      ctx_thes.create_phrase('genes','dihydrotestosterone receptor');
      ctx_thes.create_phrase('genes','HGNC:644');
      ctx_thes.create_phrase('genes','AIS');
      ctx_thes.create_phrase('genes','DHTR');
      ctx_thes.create_phrase('genes','HUMARA');
      ctx_thes.create_phrase('genes','KD');
      ctx_thes.create_phrase('genes','NR3C4');
      ctx_thes.create_phrase('genes','SBMA');
      ctx_thes.create_phrase('genes','SMAX1');
      ctx_thes.create_phrase('genes','TFM');
    end;

Thesauri

    -- Link each identifier to its narrower terms (NT)
    begin
      ctx_thes.create_relation('genes','gene','NT','LLID_231');
      ctx_thes.create_relation('genes','gene','NT','LLID_367');
      ctx_thes.create_relation('genes','gene','NT','LLID_374');
      ctx_thes.create_relation('genes','LLID_367','NT','AR');
      ctx_thes.create_relation('genes','LLID_367','NT','androgen receptor');
    end;

    -- Expand LLID_367 one level down its narrower terms
    declare
      synonyms varchar2(2000);
    begin
      synonyms := ctx_thes.nt('LLID_367',1,'genes');
      dbms_output.put_line('Thesaurus: genes');
      dbms_output.put_line('The synonym expansion for LLID_367 is: '||synonyms);
    end;

Thesaurus: genes
The synonym expansion for LLID_367 is: {LLID_367}|{ANDROGEN RECEPTOR}|{AR}|{DIHYDROTESTOSTERONE RECEPTOR}|{HGNC:644}|{AIS}|{DHTR}|{HUMARA}|{KD}|{NR3C4}|{SBMA}|{SMAX1}|{TFM}
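The narrower-term expansion above can be pictured as a walk down a small graph. The Python sketch below stores only the five relations created on this slide in a dict and expands a term a given number of levels, formatting the result in the same {TERM}|{TERM} style. `narrower_terms` is a hypothetical helper, and the ordering of the expanded terms is an artifact of this sketch, not a statement about how CTX_THES.NT orders its output.

```python
def narrower_terms(relations, term, levels=1):
    """Expand `term` with its narrower terms, `levels` deep."""
    expansion = [term]
    frontier = [term]
    for _ in range(levels):
        next_frontier = []
        for broader in frontier:
            for narrower in relations.get(broader, []):
                if narrower not in expansion:
                    expansion.append(narrower)
                    next_frontier.append(narrower)
        frontier = next_frontier
    return "|".join("{" + t.upper() + "}" for t in expansion)


# Broader term -> narrower terms, as created with create_relation above.
GENES = {
    "gene": ["LLID_231", "LLID_367", "LLID_374"],
    "LLID_367": ["AR", "androgen receptor"],
}

print(narrower_terms(GENES, "LLID_367"))
print(narrower_terms(GENES, "gene", levels=2))
```

Searching with the expanded string instead of the bare identifier lets one query reach documents that mention any alias of the gene.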

Document Clustering

Unsupervised Classification
  No training needed
  Good for initial overview of a group of documents
  Identifies shared attributes

CTX_CLS.CLUSTERING
  KMEAN
    requires setting the number of clusters
  TEXTK
    experimental hierarchical clustering

Document Clustering

Prepare database objects
  Collection table
  Clusters table
  Document results table

Set clustering preferences

Populate and index collection table

Run clustering

Clustering

Collection table

    create table collection (id number primary key, text clob);

Document results table

    create table restab (
      docid NUMBER,
      clusterid NUMBER,
      score NUMBER);

Clusters table

    create table clusters (
      clusterid NUMBER,
      descript varchar2(4000),
      label varchar2(200),
      sze NUMBER,
      quality_score NUMBER,
      parent NUMBER);

Clustering

Index collection table

    create index collectionx on collection(text)
    indextype is ctxsys.context
    parameters('STOPLIST CTXSYS.DEFAULT_STOPLIST');

Examine token TF and DF

    -- Assumes a table collection_stats with a single CLOB column already exists
    declare
      x clob := null;
    begin
      ctx_report.index_stats('collectionx',x);
      insert into collection_stats values (x);
      commit;
      dbms_lob.freetemporary(x);
    end;

    select * from collection_stats;

Clustering

CTX_REPORT.INDEX_STATS

Query: liver cancer
No. docs: 9, No. tokens: 702

Token      TF   DF   TFIDF
DNA        22    2   11.00
TISSUE     30    3   10.00
CELLS      37    5    7.40
TUMOR      29    4    7.25
AR         40    6    6.67
ENZYME     12    2    6.00
LIVER      50    9    5.56
PROTEIN    11    2    5.50
DIETS      11    2    5.50

Add stopwords to collection index

    ALTER INDEX collectionx REBUILD PARAMETERS ('ADD STOPWORD 1');
    ALTER INDEX collectionx REBUILD PARAMETERS ('ADD STOPWORD CANCER');
    ALTER INDEX collectionx REBUILD PARAMETERS ('ADD STOPWORD LIVER');
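The TFIDF column in the statistics above is consistent with a simple ratio of term frequency to document frequency (for example 22 / 2 = 11.00 for DNA). The short Python sketch below recomputes that column from the TF and DF values on the slide. Treating the column as TF/DF is an inference from the numbers, not something the slide states, and `tf_df_ratio` is a hypothetical helper.

```python
# (token, TF, DF) triples copied from the statistics on the slide.
STATS = [
    ("DNA", 22, 2), ("TISSUE", 30, 3), ("CELLS", 37, 5),
    ("TUMOR", 29, 4), ("AR", 40, 6), ("ENZYME", 12, 2),
    ("LIVER", 50, 9), ("PROTEIN", 11, 2), ("DIETS", 11, 2),
]


def tf_df_ratio(tf, df):
    """Term frequency divided by document frequency."""
    return tf / df


print(f"{'Token':<8}{'TF':>4}{'DF':>4}{'TF/DF':>8}")
for token, tf, df in sorted(STATS, key=lambda r: tf_df_ratio(r[1], r[2]), reverse=True):
    print(f"{token:<8}{tf:>4}{df:>4}{tf_df_ratio(tf, df):>8.2f}")
```

LIVER occurs in all 9 documents of a "liver cancer" result set, so it cannot separate them into groups, which is the motivation for adding it as a stopword before clustering.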

Clustering

Set clustering preferences

    BEGIN
      -- Drop any earlier definition (raises an error if none exists)
      ctx_ddl.drop_preference('my_cluster');
      ctx_ddl.create_preference('my_cluster','KMEAN_CLUSTERING');
      ctx_ddl.set_attribute('my_cluster','CLUSTER_NUM',5);
      ctx_ddl.set_attribute('my_cluster','MAX_FEATURES',200);
      ctx_ddl.set_attribute('my_cluster','MAX_DOCTERMS',20);
    END;

Cluster collection

    BEGIN
      ctx_cls.clustering('collectionx','id','restab','clusters','my_cluster');
    END;
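To give a feel for what a k-means preference with a fixed cluster count asks for, here is a textbook k-means loop in plain Python. It is a generic sketch, not Oracle's KMEAN_CLUSTERING implementation: documents are reduced to hypothetical two-term count vectors, the first k points serve as the initial centroids, and the `kmeans` function and its data are invented for illustration.

```python
def kmeans(points, k, iterations=10):
    """Plain k-means: returns (labels, centroids) after a fixed number of rounds."""
    centroids = [list(p) for p in points[:k]]  # naive init: first k points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for i, p in enumerate(points):
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            labels[i] = dists.index(min(dists))
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical documents as (count of "tumor", count of "diet") vectors.
docs = [(5, 0), (4, 1), (0, 6), (1, 5)]
labels, centroids = kmeans(docs, k=2)
print(labels)
print(centroids)
```

The number of clusters has to be supplied up front, which is the constraint noted for KMEAN earlier in the deck.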

Document Classification

Supervised Classification
  Needs training
  Can be applied to any document

Rule-based
  Manual rule creation

Decision Trees
  Automatic rule creation (editable)

SVM
  Automatic rule creation (opaque)

SVM Classification

Training Documents
  Create and populate training document table
  Generate CONTEXT index on documents

Categories
  Assign documents to categories

Set classifier preferences
  MAX_FEATURES

Train Classifier
  Create Rules Table
  Train
  Generate a CTXRULE index on the rules table

Classify New Documents

SVM Classification

Create and populate training documents table

    create table svmtrain (docid number primary key, text clob);

Create training document index

    create index svmtrainx on svmtrain(text)
    indextype is ctxsys.context
    parameters('STOPLIST CTXSYS.DEFAULT_STOPLIST');

SVM Classification

Create and populate the category table

    create table svmcats (
      docid number,
      cat_id number,
      catname varchar2(250));

Create the rules table

    create table svmtab (
      cat_id number,
      type number(3) NOT NULL,
      rule blob);

SVM Classification

Set SVM classifier preferences

    begin
      -- The preference must exist before its attributes can be set
      ctx_ddl.create_preference('mysvm','SVM_CLASSIFIER');
      ctx_ddl.set_attribute('mysvm','MAX_FEATURES','100');
    end;

Train SVM classifier

    begin
      ctx_cls.train('svmtrainx',
                    'docid','svmcats','docid','cat_id','svmtab','mysvm');
    end;

SVM Classification

Create rules index

    create index svmx on svmtab(rule)
    indextype is ctxsys.ctxrule
    parameters ('filter svmfilter classifier mysvm');

Classify unknown documents

    -- Match the title and abstract of one citation against the trained rules
    select cat_id, match_score(1) SCORE
    from svmtab
    where matches(rule, (
            select extractValue(text, '/MedlineCitation/Article/ArticleTitle')||' '||
                   extractValue(text, '/MedlineCitation/Article/Abstract/AbstractText')
            from medtab
            where pmid='7587903'), 1) > 50;

DEMO

Oracle Text Application

  Load MEDLINE documents
  Index with Oracle Text
  Thesauri
  Keyword Search
  Document Clustering
  Retrieve Document
  Cluster Visualization
  Stopwords
  SVM Classification
  Co-occurrence matrix
  Create SVM category
  Interactive UI

Search (3 slides)

Fetch (2 slides)

SVM (6 slides)

Clustering (7 slides)

Clustering Medline Abstracts

Comparing Document Collections

Analysis by Scatter Plot

InforSense Data Model Facilitates Cross-domain Data Analysis

  Data mining
  Data Mining
  Text Mining
  Spectrum Data Mining
  Chemical/sequence
  Data Model