![Page 1: Data & Text Mining](https://reader036.vdocuments.mx/reader036/viewer/2022062501/568164c3550346895dd6d8bf/html5/thumbnails/1.jpg)
Data & Text Mining
Abhay Ahluwalia, Chris Bruck, Christopher Stanton, Stefanie Felitto, Mike Paulus
BUAD 466: Introduction to Business Intelligence
November 30, 2011
![Page 2: Data & Text Mining](https://reader036.vdocuments.mx/reader036/viewer/2022062501/568164c3550346895dd6d8bf/html5/thumbnails/2.jpg)
Data Mining Background
Definition – the process of analyzing data from different perspectives and summarizing it into useful information
Data mining software (e.g., XL Miner) allows users to analyze data from many different dimensions, categorize it, and summarize the relationships identified
![Page 3: Data & Text Mining](https://reader036.vdocuments.mx/reader036/viewer/2022062501/568164c3550346895dd6d8bf/html5/thumbnails/3.jpg)
The Basics of Data Mining
Analyzes relationships and patterns in stored transaction data based on open-ended user queries
Classes: Stored data is used to locate data in predetermined groups
Clusters: Data items are grouped according to logical relationships or consumer preferences
Associations: Data can be mined to identify associations
Sequential patterns: Data is mined to anticipate behavior patterns and trends
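As a rough illustration of the "Associations" idea, the sketch below counts how often pairs of items appear together in a basket and reports simple rules with their support and confidence. The transactions, thresholds, and function name are hypothetical and are not taken from any of the tools discussed in this presentation.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transaction data: each set is one shopper's basket.
transactions = [
    {"diapers", "beer", "chips"},
    {"diapers", "beer"},
    {"milk", "bread"},
    {"diapers", "milk"},
    {"beer", "chips"},
]


def association_rules(baskets, min_support=0.4, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence) for item pairs."""
    n = len(baskets)
    # How many baskets contain each single item.
    item_counts = Counter(item for basket in baskets for item in basket)
    # How many baskets contain each pair of items.
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of all baskets containing both items
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            # Of the baskets containing lhs, what share also contain rhs?
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


rules = association_rules(transactions)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

Support measures how common a pair is overall, while confidence measures how often the second item appears given the first. Real data mining packages use more scalable algorithms, but the two measures are the same.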
![Page 4: Data & Text Mining](https://reader036.vdocuments.mx/reader036/viewer/2022062501/568164c3550346895dd6d8bf/html5/thumbnails/4.jpg)
Text Mining Background
Definition: the discovery by computer of previously unknown knowledge in text, by automatically extracting information from different written resources
Goal: to extract new, never-before-encountered information
Text mining can expand the ability of data mining to deal with textual materials
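A very small sketch of the first step of text mining, turning free text into countable terms, might look like the following. The comments, the stop-word list, and the function name are invented for illustration; real text mining tools add much more (stemming, phrase detection, entity extraction).

```python
import re
from collections import Counter

# Hypothetical free-text comments; in practice these might come from
# surveys, emails, or reports.
comments = [
    "The front desk staff was slow at check-in.",
    "Room was clean, but the front desk line was long.",
    "Loved the room and the view.",
]

# A tiny, hand-picked stop-word list (an assumption for this example).
STOP_WORDS = {"the", "was", "at", "but", "and", "a", "in"}


def term_counts(texts):
    """Count how many documents mention each term, ignoring stop words."""
    counts = Counter()
    for text in texts:
        # Lowercase the text and keep alphabetic tokens only.
        tokens = set(re.findall(r"[a-z]+", text.lower()))
        counts.update(tokens - STOP_WORDS)
    return counts


counts = term_counts(comments)
print(counts.most_common(3))
```

Once text is reduced to structured counts like these, the usual data mining techniques (classes, clusters, associations) can be applied to it.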
![Page 5: Data & Text Mining](https://reader036.vdocuments.mx/reader036/viewer/2022062501/568164c3550346895dd6d8bf/html5/thumbnails/5.jpg)
Data are Key to Business Value
DATA: Measures of variables in categories
Support Decision Making
Provide Basis for Forecasting
Important to:
Obtain data from new sources (text mining)
Integrate (mash) information from multiple sources
![Page 6: Data & Text Mining](https://reader036.vdocuments.mx/reader036/viewer/2022062501/568164c3550346895dd6d8bf/html5/thumbnails/6.jpg)
Software Example #1: VAIM (Value-Added Information Mash)
MINING: finding patterns in data (pattern-oriented, record-oriented searches)
MASHING: Integrating information mined from multiple resources
Useful in Hospitals and for Government Campaigns
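To make the idea of mashing concrete, here is a minimal sketch that merges records from two sources on a shared key. It does not use VAIM or reflect its interface; the record layouts, the patient IDs, and the `mash` function are all hypothetical.

```python
# Hypothetical records mined from two separate sources, keyed by a shared ID.
admissions = {
    "p1": {"ward": "oncology"},
    "p2": {"ward": "cardiology"},
}
survey_notes = {
    "p1": {"reported": "nausea"},
    "p3": {"reported": "fatigue"},
}


def mash(*sources):
    """Combine per-key fields from several sources into one record per key."""
    merged = {}
    for source in sources:
        for key, fields in source.items():
            # Create the record on first sight, then add this source's fields.
            merged.setdefault(key, {}).update(fields)
    return merged


combined = mash(admissions, survey_notes)
for key in sorted(combined):
    print(key, combined[key])
```

Keys that appear in only one source are kept with whatever fields are available, so gaps in one repository do not discard information found in another.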
![Page 7: Data & Text Mining](https://reader036.vdocuments.mx/reader036/viewer/2022062501/568164c3550346895dd6d8bf/html5/thumbnails/7.jpg)
Software Example #2: IBM SPSS
Assists in statistical analysis and in predicting trends
Categorizes data, performs statistical analysis
Multiple regressions to suggest causality
![Page 8: Data & Text Mining](https://reader036.vdocuments.mx/reader036/viewer/2022062501/568164c3550346895dd6d8bf/html5/thumbnails/8.jpg)
Software Example #3: XL Miner
Add-in for Microsoft Excel
Builds on software that companies already possess
Assists in predictive forecasting based on observed data trends
Demonstration
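The sketch below is not XL Miner output. It is a plain Python stand-in, under the assumption of a simple straight-line trend, showing what forecasting from observed data trends can mean in its most basic form. The sales figures and function names are hypothetical.

```python
def fit_trend(values):
    """Least-squares straight line through (0, v0), (1, v1), ...

    Returns (slope, intercept). Requires at least two observations.
    """
    n = len(values)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast(values, periods_ahead=1):
    """Extend the fitted line periods_ahead steps past the last observation."""
    slope, intercept = fit_trend(values)
    return intercept + slope * (len(values) - 1 + periods_ahead)


# Hypothetical monthly sales figures.
monthly_sales = [100, 110, 120, 130]
next_month = forecast(monthly_sales)
print(next_month)
```

Commercial tools offer far richer models, but each still fits a pattern to past observations and extends it forward.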
![Page 9: Data & Text Mining](https://reader036.vdocuments.mx/reader036/viewer/2022062501/568164c3550346895dd6d8bf/html5/thumbnails/9.jpg)
Business Value Example #1: Grocery Store
Data mining using Oracle
Analyzed buying patterns
Findings led to changes in marketing
Increased revenues
![Page 10: Data & Text Mining](https://reader036.vdocuments.mx/reader036/viewer/2022062501/568164c3550346895dd6d8bf/html5/thumbnails/10.jpg)
Value Example #2: University of Rochester Cancer Center
Using KnowledgeSEEKER software
Studied the effect of anxiety about chemotherapy on nausea
Analysis helped improve treatment of patients and improved quality of life
![Page 11: Data & Text Mining](https://reader036.vdocuments.mx/reader036/viewer/2022062501/568164c3550346895dd6d8bf/html5/thumbnails/11.jpg)
Value Example #3: MGM Grand Hotel
Analyzed customer satisfaction and probability of return stay
Found that the front desk and room were most important
Focused the next 6 months on improving both
10% improvement in attrition
Increased guest returns and profitability
![Page 12: Data & Text Mining](https://reader036.vdocuments.mx/reader036/viewer/2022062501/568164c3550346895dd6d8bf/html5/thumbnails/12.jpg)
Business Applications
Pros:
Extracts new information and combines human linguistic capabilities with the speed and accuracy of a computer
Can answer the ‘Why?’
Competitive advantage
Cons:
Expensive
Requires Training
Dependent on structure of warehouses and repositories
![Page 13: Data & Text Mining](https://reader036.vdocuments.mx/reader036/viewer/2022062501/568164c3550346895dd6d8bf/html5/thumbnails/13.jpg)
Complications & Concerns
Invasion of Privacy
According to Lita van Wel and Lambèr Royakkers in “Ethical issues in web data mining”, privacy is considered lost when information about an individual is obtained, used, or spread without that individual’s permission
![Page 14: Data & Text Mining](https://reader036.vdocuments.mx/reader036/viewer/2022062501/568164c3550346895dd6d8bf/html5/thumbnails/14.jpg)
More Complications
• Because data is made anonymous before it is gathered into profiles, there are no personal profiles; these applications therefore de-individualize users by judging them just by their mouse clicks
• De-individualization: the tendency to judge and treat people on the basis of group characteristics instead of their own individual characteristics
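A toy sketch of how de-individualization can arise in practice: once sessions are anonymized and grouped, every visitor in a group receives the same group-level estimate, whatever their own behavior was. The segment names, spend figures, and functions below are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical anonymized sessions: no names or IDs, only the kind of pages
# clicked (used as a segment label) and the amount spent.
sessions = [
    {"segment": "bargain_pages", "spend": 20},
    {"segment": "bargain_pages", "spend": 30},
    {"segment": "luxury_pages", "spend": 200},
    {"segment": "luxury_pages", "spend": 40},
]


def group_profiles(rows):
    """Build one profile per segment: the average spend of its sessions."""
    by_segment = defaultdict(list)
    for row in rows:
        by_segment[row["segment"]].append(row["spend"])
    return {segment: mean(spends) for segment, spends in by_segment.items()}


profiles = group_profiles(sessions)


def predicted_spend(segment):
    """Every visitor in a segment gets the same group-level estimate."""
    return profiles[segment]


print(predicted_spend("luxury_pages"))
```

In this made-up data, the visitor who spent 40 is treated exactly like the one who spent 200, because only the group average survives.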
![Page 15: Data & Text Mining](https://reader036.vdocuments.mx/reader036/viewer/2022062501/568164c3550346895dd6d8bf/html5/thumbnails/15.jpg)
More Concerns
Companies can claim to collect the data for one purpose and use it for another
The growing movement of selling personal data as a service encourages website owners to trade personal data obtained from their site
The companies that buy the data make it anonymous, and these companies assume ownership of the data that they release
http://www.youtube.com/watch?v=zdM6vzRHrG0
![Page 16: Data & Text Mining](https://reader036.vdocuments.mx/reader036/viewer/2022062501/568164c3550346895dd6d8bf/html5/thumbnails/16.jpg)
Even More Complications
Some web mining algorithms might use controversial characteristics to categorize individuals, such as sex, race, religion, or sexual orientation
This process could result in the refusal of a service or a privilege to an individual based on their race, religion, or sexual orientation.
![Page 17: Data & Text Mining](https://reader036.vdocuments.mx/reader036/viewer/2022062501/568164c3550346895dd6d8bf/html5/thumbnails/17.jpg)
Application Recommendations & Conclusion
Sync data repositories (VAIM Software)
Training
Use Data Mining and Text Mining together
![Page 18: Data & Text Mining](https://reader036.vdocuments.mx/reader036/viewer/2022062501/568164c3550346895dd6d8bf/html5/thumbnails/18.jpg)
![Page 19: Data & Text Mining](https://reader036.vdocuments.mx/reader036/viewer/2022062501/568164c3550346895dd6d8bf/html5/thumbnails/19.jpg)
Group Jeopardy

| Data and Text Mining Background | Business Applications | Complications with Mining | From the Examples |
|---|---|---|---|
| 100 | 100 | 100 | 100 |
| 200 | 200 | 200 | 200 |
| 300 | 300 | 300 | 300 |
![Page 20: Data & Text Mining](https://reader036.vdocuments.mx/reader036/viewer/2022062501/568164c3550346895dd6d8bf/html5/thumbnails/20.jpg)
Data and Text Mining Background for 100:
True or False: Clusters refer to data items that are grouped according to logical relationships or consumer preferences.
True.
![Page 21: Data & Text Mining](https://reader036.vdocuments.mx/reader036/viewer/2022062501/568164c3550346895dd6d8bf/html5/thumbnails/21.jpg)
Data and Text Mining Background for 200:
What is the name of the data mining software that allows users to analyze data from different dimensions, categorize it, and summarize the relationships identified, all within a familiar Microsoft Office program?
XL Miner
![Page 22: Data & Text Mining](https://reader036.vdocuments.mx/reader036/viewer/2022062501/568164c3550346895dd6d8bf/html5/thumbnails/22.jpg)
Data and Text Mining Background for 300:
Name either two pros or two cons of the business applications of data mining.
Pros: extracts new information, can answer the "why", creates a competitive advantage
Cons: expensive, requires training, dependent on structure of warehouses and repositories
![Page 23: Data & Text Mining](https://reader036.vdocuments.mx/reader036/viewer/2022062501/568164c3550346895dd6d8bf/html5/thumbnails/23.jpg)
Business Applications for 100:
What does VAIM stand for?
Value-Added Information Mashing
![Page 24: Data & Text Mining](https://reader036.vdocuments.mx/reader036/viewer/2022062501/568164c3550346895dd6d8bf/html5/thumbnails/24.jpg)
Business Applications for 200:
What is the difference between Text Mining and Text Mashing?
MINING: finding patterns in data (pattern-oriented, record-oriented searches)
MASHING: Integrating information mined from multiple resources
![Page 25: Data & Text Mining](https://reader036.vdocuments.mx/reader036/viewer/2022062501/568164c3550346895dd6d8bf/html5/thumbnails/25.jpg)
Business Applications for 300:
What is the greatest benefit of Text Mining for businesses?
It extracts new information and combines human linguistic capabilities with the speed and accuracy of a computer
![Page 26: Data & Text Mining](https://reader036.vdocuments.mx/reader036/viewer/2022062501/568164c3550346895dd6d8bf/html5/thumbnails/26.jpg)
Complications for 100:
True or False: Companies that buy the data and make it anonymous are not legally responsible for how they use the data.
False: they are responsible and can face serious legal action
![Page 27: Data & Text Mining](https://reader036.vdocuments.mx/reader036/viewer/2022062501/568164c3550346895dd6d8bf/html5/thumbnails/27.jpg)
Complications for 200:
What is the term for judging and treating individuals on the basis of group characteristics rather than their own individual characteristics?
De-individualization
![Page 28: Data & Text Mining](https://reader036.vdocuments.mx/reader036/viewer/2022062501/568164c3550346895dd6d8bf/html5/thumbnails/28.jpg)
Complications for 300:
Which two US Senators introduced the Commercial Privacy Bill of Rights?
John McCain (R-AZ) and John Kerry (D-MA)
![Page 29: Data & Text Mining](https://reader036.vdocuments.mx/reader036/viewer/2022062501/568164c3550346895dd6d8bf/html5/thumbnails/29.jpg)
From the Examples for 100:
When the grocery store analyzed men's buying trends, it found that men who purchased diapers also tended to buy what other item?
Beer
![Page 30: Data & Text Mining](https://reader036.vdocuments.mx/reader036/viewer/2022062501/568164c3550346895dd6d8bf/html5/thumbnails/30.jpg)
From the Examples for 200:
What software did the University of Rochester Cancer Center use to analyze the effects of chemotherapy treatments on nausea?
KnowledgeSEEKER
![Page 31: Data & Text Mining](https://reader036.vdocuments.mx/reader036/viewer/2022062501/568164c3550346895dd6d8bf/html5/thumbnails/31.jpg)
From the Examples for 300:
What did Text Mining identify as the two most important areas of the MGM Grand Hotel?
The Front Desk and the Room