text mining


Upload: prudence-fowler

Post on 01-Jan-2016


• text mining

https://store.theartofservice.com/the-text-mining-toolkit.html

Text mining

Typical text mining tasks include text categorization, text clustering, concept/entity extraction, production of granular taxonomies, sentiment analysis, document summarization, and entity relation modeling (i.e., learning relations between named entities).
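As an illustration of one of these tasks, text categorization, here is a minimal sketch in Python. It is not taken from the slides: the nearest-centroid approach, the function names and the toy training data are all assumptions chosen for brevity. Each category is represented by the summed word counts of its training documents, and a new text is assigned to the category with the highest cosine similarity.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train_centroids(labeled_docs):
    """Sum the term counts of all training documents that share a label."""
    centroids = {}
    for text, label in labeled_docs:
        centroids.setdefault(label, Counter()).update(tokenize(text))
    return centroids


def categorize(text, centroids):
    """Return the label whose centroid is most similar to the text."""
    vec = Counter(tokenize(text))
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


# Hypothetical toy training data, invented for illustration only.
TRAINING = [
    ("the gene encodes a protein expressed in tumor cells", "biomedical"),
    ("protein interactions regulate gene expression", "biomedical"),
    ("customers cancelled their subscription after the price rise", "business"),
    ("the company reported higher revenue from subscription sales", "business"),
]

centroids = train_centroids(TRAINING)
print(categorize("which protein does this gene produce", centroids))
```

Real systems typically add term weighting (for example TF-IDF), stop-word removal and far larger training sets; the sketch only shows the overall shape of the task.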

Text mining - Text mining and text analytics

The term 'text analytics' is now used more frequently in business settings, while 'text mining' is used in some of the earliest application areas, dating to the 1980s, notably life-sciences research and government intelligence.

Text mining - History

Labor-intensive manual text mining approaches first surfaced in the mid-1980s, but technological advances have enabled the field to progress during the past decade. Text mining is an interdisciplinary field that draws on information retrieval, data mining, machine learning, statistics, and computational linguistics. As most information (common estimates say over 80%) is currently stored as text, text mining is believed to have high potential commercial value.

Text mining - Security applications

Many text mining software packages are marketed for security applications, especially monitoring and analysis of online plain text sources such as Internet news, blogs, etc. for national security purposes. Text mining is also involved in the study of text encryption/decryption.

Text mining - Biomedical applications

A range of text mining applications in the biomedical literature has been described.

Text mining - Biomedical applications

One online text mining application in the biomedical literature is PubGene, which combines biomedical text mining with network visualization as an Internet service. TPX is a concept-assisted search and navigation tool for biomedical literature analyses; it runs on PubMed/PMC and can be configured, on request, to run on local literature repositories too.

Text mining - Software applications

Text mining methods and software are also being researched and developed by major firms, including IBM and Microsoft, to further automate the mining and analysis processes, and by different firms working in the area of search and indexing in general as a way to improve their results.

Text mining - Online media applications

Text mining is being used by large media companies, such as the Tribune Company, to clarify information and to provide readers with better search experiences, which in turn increases site stickiness and revenue. Additionally, on the back end, editors benefit by being able to share, associate and package news across properties, significantly increasing opportunities to monetize content.

Text mining - Marketing applications

Text mining is starting to be used in marketing as well, more specifically in analytical customer relationship management. Coussement and Van den Poel (2008) apply it to improve predictive analytics models for customer churn (customer attrition).

Text mining - Academic applications

Initiatives have been taken, such as Nature's proposal for an Open Text Mining Interface (OTMI) and the National Institutes of Health's common Journal Publishing Document Type Definition (DTD), that would provide semantic cues to machines to answer specific queries contained within text without removing publisher barriers to public access.

Text mining - Academic applications

Academic institutions have also become involved in the text mining initiative:

Text mining - Academic applications

With an initial focus on text mining in the biological and biomedical sciences, research has since expanded into the social sciences.

Text mining - Academic applications

* In the United States, the School of Information at the University of California, Berkeley is developing a program called BioText to assist biology researchers in text mining and analysis.

Text mining - Academic applications

Further, private initiatives also offer tools for academic text mining:

Text mining - Software and applications

Text mining computer programs are available from many commercial and open source companies and sources.

Text mining - Commercial

* AeroText – a suite of text mining applications for content analysis. Content used can be in multiple languages.

Text mining - Commercial

* Attensity – hosted, integrated and stand-alone text mining (analytics) software that uses natural language processing technology to address collective intelligence in social media and forums; the voice of the customer in surveys and emails; customer relationship management; e-services; research and e-discovery; risk and compliance; and intelligence analysis.

Text mining - Commercial

* Autonomy – text mining, clustering and categorization software

Text mining - Commercial

* Clarabridge – text analytics (text mining) software, including natural language processing (NLP), machine learning, clustering and categorization. Provides SaaS, hosted and on-premise text and sentiment analytics that enable companies to collect, listen to, analyze, and act on the Voice of the Customer (VOC) from both external sources (Twitter, Facebook, Yelp!, product forums, etc.) and internal sources (call center notes, CRM, enterprise data warehouse, BI, surveys, emails, etc.).

Text mining - Commercial

* WordStat - Content analysis and text mining add-on module of QDA Miner for analyzing large amounts of text data.

Text mining - Open source

* The programming language R provides a framework for text mining applications in the package tm.

Text mining - Open source

* KH Coder - For content analysis, text mining or corpus linguistics.

Text mining - Implications

Until recently, websites most often used text-based searches, which only found documents containing specific user-defined words or phrases. Now, through use of a semantic web, text mining can find content based on meaning and context (rather than just by a specific word).
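The limitation of purely word-based search can be seen in a small Python sketch. Everything here is hypothetical: the documents, the function names, and in particular the hand-written table of related terms, which is only a crude stand-in for meaning-based retrieval and not how a semantic web works in practice.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase a string and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_index(docs):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def keyword_search(index, query):
    """Return ids of documents that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


def expanded_search(index, query, related):
    """Like keyword_search, but a query word may match any related term."""
    words = tokenize(query)
    if not words:
        return set()
    results = None
    for word in words:
        hits = set()
        for term in {word} | related.get(word, set()):
            hits |= index.get(term, set())
        results = hits if results is None else results & hits
    return results


# Hypothetical documents and related-term table, for illustration only.
DOCS = {
    1: "Physicians recommend regular exercise",
    2: "The doctor prescribed rest",
    3: "Stock markets fell sharply",
}
RELATED = {"doctor": {"physician", "physicians"}}

index = build_index(DOCS)
print(keyword_search(index, "doctor"))
print(expanded_search(index, "doctor", RELATED))
```

The plain keyword search misses the document about physicians because the exact word "doctor" does not occur in it; the expanded search finds it only because the relation was supplied by hand.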

Text mining - Implications

Additionally, text mining software can be used to build large dossiers of information about specific people and events. For example, large datasets based on data extracted from news reports can be built to facilitate social network analysis or counter-intelligence. In effect, the text mining software may act in a capacity similar to an intelligence analyst or research librarian, albeit with a more limited scope of analysis.
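A minimal Python sketch of how data extracted from news reports might feed a social network analysis: it counts how often two names are mentioned in the same article. The list of names and the articles are invented, and matching names by plain substring is an assumption made for brevity; a real system would use named entity recognition instead.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(articles, known_names):
    """Count, for each pair of names, the articles that mention both."""
    edges = Counter()
    for text in articles:
        # Naive substring match; each name counts at most once per article.
        present = sorted(name for name in known_names if name in text)
        for pair in combinations(present, 2):
            edges[pair] += 1
    return edges


# Hypothetical names and articles, for illustration only.
NAMES = {"Alice Smith", "Bob Jones", "Carol Lee"}
ARTICLES = [
    "Alice Smith met Bob Jones in Geneva.",
    "Bob Jones and Carol Lee signed the deal.",
    "Alice Smith praised Bob Jones.",
]

edges = cooccurrence_edges(ARTICLES, NAMES)
for (first, second), weight in edges.most_common():
    print(first, "-", second, weight)
```

The resulting weighted pairs can be read as the edges of a graph whose nodes are the people mentioned.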

Text mining - Implications

Text mining is also used in some email spam filters as a way of determining the characteristics of messages that are likely to be advertisements or other unwanted material.
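One common way to learn such message characteristics is a naive Bayes classifier over word counts. The slides do not say which technique any particular filter uses, so the Python sketch below is only an assumed example, with invented class names and training messages.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of word frequencies
        self.doc_counts = Counter()  # label -> number of training messages
        self.vocab = set()           # every word seen during training

    def train(self, text, label):
        words = tokenize(text)
        self.word_counts.setdefault(label, Counter()).update(words)
        self.doc_counts[label] += 1
        self.vocab.update(words)

    def log_score(self, text, label):
        """Log prior plus the summed log likelihoods of the words."""
        total_docs = sum(self.doc_counts.values())
        score = math.log(self.doc_counts[label] / total_docs)
        counts = self.word_counts[label]
        denom = sum(counts.values()) + len(self.vocab)
        for word in tokenize(text):
            score += math.log((counts[word] + 1) / denom)
        return score

    def classify(self, text):
        """Return the label with the highest log score."""
        return max(self.doc_counts, key=lambda lbl: self.log_score(text, lbl))


# Hypothetical training messages, for illustration only.
spam_filter = NaiveBayes()
spam_filter.train("win a free prize now", "spam")
spam_filter.train("free money click now", "spam")
spam_filter.train("meeting agenda for monday", "ham")
spam_filter.train("please review the attached report", "ham")

print(spam_filter.classify("claim your free prize"))
```

Words that appear mostly in unwanted messages pull the score toward "spam"; the smoothing keeps words never seen in training from zeroing out a score.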

Intelligent text analysis - Text mining and text analytics

The term 'text analytics' describes a set of linguistic, statistical, and machine learning techniques that model and structure the information content of textual sources for business intelligence, exploratory data analysis, research, or investigation. [http://intelligent-enterprise.informationweek.com/blog/archives/2007/02/defining_text_a.html Defining Text Analytics] The term is roughly synonymous with text mining; indeed, Ronen Feldman modified a 2000 description of text mining [http://www.cs.cmu.edu/~dunja/CFPWshKDD2000.html KDD-2000 Workshop on Text Mining] in 2004 to describe text analytics. [http://www.ir.iit.edu/cikm2004/tutorials.html#T2 Text Analytics: Theory and Practice] The latter term is now used more frequently in business settings while text mining is used in some of the earliest application areas, dating to the 1980s, notably life-sciences research and government intelligence.

National Centre for Text Mining

The National Centre for Text Mining (NaCTeM) is a publicly funded text mining (TM) centre. It was established to provide support, advice, and information on TM technologies and to disseminate information from the larger TM community, while also providing tailored services and tools in response to the requirements of the United Kingdom academic community.

National Centre for Text Mining

The software tools and services which NaCTeM supplies allow researchers to apply text mining techniques to problems within their specific areas of interest - examples of these tools are highlighted below. In addition to providing services, the Centre is also involved in, and makes significant contributions to, the text mining research community both nationally and internationally in initiatives such as Europe PubMed Central.

National Centre for Text Mining - Resources

[http://www-tsujii.is.s.u-tokyo.ac.jp/GENIA/home/wiki.cgi?page=GENIA+corpus GENIA] - a collection of reference materials for the development of biomedical text mining systems

List of text mining software - Commercial

* AUTINDEX - a commercial text mining software package based on sophisticated linguistics by IAI (Institute for Applied Information Sciences), Saarbrücken.

List of text mining software - Commercial

* Eduworks – software and solutions providing analytics and text mining in education, competency management, and training.

Biomedical text mining

'Biomedical text mining' (also known as 'BioNLP') refers to text mining applied to texts and literature of the biomedical and molecular biology domain. It is a rather recent research field at the intersection of natural language processing, bioinformatics, medical informatics and computational linguistics.

Biomedical text mining

There is an increasing interest in text mining and information extraction strategies applied to the biomedical and molecular biology literature due to the increasing number of electronically available publications stored in databases such as PubMed.

Biomedical text mining - Main applications

Information extraction and text mining methods have been explored to extract information related to biological processes and diseases.

Biomedical text mining - Examples

* [http://u-compare.org/index.html U-Compare] - an integrated text mining/natural language processing system based on the UIMA Framework, with an emphasis on components for biomedical text mining.

Biomedical text mining - Examples

* [http://www-tsujii.is.s.u-tokyo.ac.jp/medie/ MEDIE] - an intelligent search engine to retrieve biomedical correlations from MEDLINE, based on indexing by natural language processing and text mining techniques

Biomedical text mining - Examples

* [http://www.nextbio.com NextBio] - Life sciences search engine with a text mining functionality that utilizes PubMed abstracts [http://www.nextbio.com/b/home/generalSearch.nb?q=breast+cancer (ex: literature search)] and clinical trials [http://www.nextbio.com/b/home/generalSearch.nb?q=breast+cancer#sitype=TRIALS (example)] to return concepts relevant to the query based on a number of heuristics including ontology relationships, journal impact, publication date, and authorship.

Biomedical text mining - Examples

* [http://brainarray.mbni.med.umich.edu/Brainarray/prototype/PubAnatomy/ PubAnatomy] - An interactive visual search engine that provides new ways to explore relationships among Medline literature, text mining results, anatomical structures, gene expression and other background information.

Biomedical text mining - Examples

* [http://anote-project.org @Note2] - A workbench for biomedical text mining (including Information Retrieval, Named Entity Recognition and Relation Extraction plugins)