COMP3740 CR32: Knowledge Management and Adaptive Systems
Data Mining outputs: What “knowledge” can Data Mining learn?
By Eric Atwell, School of Computing, University of Leeds
(including re-use of teaching resources from other sources, esp. Knowledge Management by Stuart Roberts, School of Computing, University of Leeds)
Data Mining, Knowledge Discovery, Text Mining
• Data mining is about discovering “knowledge”: patterns, correlations, predictive rules in a large data-set or corpus.
• For this we need:
– Data mining techniques, algorithms, tools, eg WEKA, R, MatLab, …
– A methodological framework to guide us in collecting data and finding “useful” models, eg CRISP-DM
• Data Mining was originally about “learning” patterns from DataBases, data structured as Records, Fields
• Knowledge Discovery is “exotic term” for DM???
• Increasingly, data is unstructured text (WWW), so
• Text Mining is a new subfield of DM/KD, focussing on Knowledge Discovery from unstructured text data
Data Mining: Overview
[Diagram: Concepts, Instances, Attributes → Data Mining → Concept Descriptions]
Each instance is an example of the concept to be learned or described. The instance is described by the values of its attributes.
Instances
• Input to a data mining algorithm is in the form of a set of examples, or instances.
• Each instance is represented as a set of features or attributes.
• Usually this set takes the form of a flat file; each instance is a record in the file, each attribute is a field in the record.
• In text mining, each instance is a word/term in a corpus.
• The concepts to be learned are formed from patterns discovered within the set of instances.
Concepts
The types of concepts we try to ‘learn’ include:
• Key “differences” between 2 (or more) data-sets
– Eg difference in sales by region this year compared to previous
– Eg terms important in one corpus but not another
• Clusters or ‘Natural’ partitions;
– Eg cluster customers according to their shopping habits;
– Eg semantic clusters: “synonyms” with similar COLLOCATIONS
• Rules for classifying examples into pre-defined classes.
– Eg successful PhD student?: Mature student, IS, AI3n, 2i/1st => PhD
– Eg predicting Part-of-Speech word-class of each word in a corpus:
  Adj + X + Verb => X=Noun;
  “to” + X + Adverb => X=Verb
More concepts
The types of concepts we try to ‘learn’ include:
• General Associations
– Eg “People who buy nappies are in general likely to also buy beer”
– Eg high-frequency terms tend to be “grammatical”, not “meaningful”
• Numerical prediction
– Eg look for rules to predict what salary a graduate will get, given A level results, age, gender, programme of study and degree result – this may give us an equation:
  Salary = a*A-level + b*Age + c*Gender + d*Prog + e*Degree
DB Example: weather to play?
/usr/local/weka-3-4-5/data/weather.arff
@relation weather
@attribute outlook {sunny,overcast,rainy}
@attribute temperature real
@attribute humidity real
@attribute windy {TRUE, FALSE}
@attribute play {yes, no}
@data
sunny,85,85,FALSE,no
sunny,80,90,TRUE,no
overcast,83,86,FALSE,yes
rainy,70,96,FALSE,yes
rainy,68,80,FALSE,yes
rainy,65,70,TRUE,no
overcast,64,65,TRUE,yes
sunny,72,95,FALSE,no
sunny,69,70,FALSE,yes
rainy,75,80,FALSE,yes
In general, any DB records can be ARFFed
• Save records as plain text file, comma-separated values (csv format)
• Add HEADER:
@relation <filename>
@attribute <name> <type>
@attribute <name> <type>
…
@data
… then the data (instances)
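The recipe above can be sketched in Python. This is only an illustration: the function name csv_to_arff and its parameters are made up for this example, and it assumes header-less CSV input whose values need no quoting.

```python
import csv
import io


def csv_to_arff(csv_text, relation, attributes):
    """Build ARFF text from header-less CSV text.

    attributes is a list of (name, type) pairs, where type is an ARFF
    type string such as "real" or a nominal set such as "{yes, no}".
    """
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    for row in rows:
        if len(row) != len(attributes):
            raise ValueError(f"expected {len(attributes)} fields, got {row}")
    # HEADER: relation name, then one line per attribute.
    lines = [f"@relation {relation}"]
    lines += [f"@attribute {name} {kind}" for name, kind in attributes]
    # Then the data (instances), one comma-separated record per line.
    lines.append("@data")
    lines += [",".join(row) for row in rows]
    return "\n".join(lines) + "\n"


# Two records from the weather example, used as sample input.
weather_csv = "sunny,85,85,FALSE,no\novercast,83,86,FALSE,yes\n"
weather_attributes = [
    ("outlook", "{sunny,overcast,rainy}"),
    ("temperature", "real"),
    ("humidity", "real"),
    ("windy", "{TRUE, FALSE}"),
    ("play", "{yes, no}"),
]
arff_text = csv_to_arff(weather_csv, "weather", weather_attributes)
print(arff_text)
```

The attribute names and types still have to be supplied by hand, since a plain CSV file does not say which fields are nominal and which are numeric.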
Concept-learning example
Start with a set of instances
Use a clustering algorithm to partition the set
Concept-learning example
Identify cluster centroids
Concept-learning example
Clusters, represented by centroids, are the learned concepts
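The slides do not name a particular clustering algorithm. As one possible illustration, here is a minimal k-means sketch: the points and the starting centroids are invented toy values, and the helper names assign and kmeans are chosen just for this example.

```python
import math


def assign(points, centroids):
    """Partition the points: each goes to the cluster of its nearest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)),
                      key=lambda i: math.dist(p, centroids[i]))
        clusters[nearest].append(p)
    return clusters


def kmeans(points, centroids, max_iterations=100):
    """Alternate partitioning and centroid updates until nothing changes."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iterations):
        clusters = assign(points, centroids)
        updated = [
            # New centroid = mean of the cluster; an empty cluster keeps its old one.
            tuple(sum(axis) / len(members) for axis in zip(*members)) if members else old
            for old, members in zip(centroids, clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(points, centroids)


# Toy instances with two numeric attributes each (made-up values).
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, [(1, 1), (8, 8)])
print(centroids)
print(clusters)
```

The returned centroids play the role of the learned concepts, and the returned clusters give the partition of the instances.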
Example use of clustering
• Point of sale data contains information about the buyer and the ‘basket’.
• We want to target advertising to different types of shopper.
• Cluster analysis groups shoppers into classes, each with distinctive characteristics.
• Cluster characteristics are examined to interpret what kind of advertising each group will respond to.
• Groups are then related to where they live.
Output: Clusters
• Output can take the form of:
– Classification of each instance according to the cluster number/name (like a dictionary/thesaurus)
– Cluster centroids
– Dendrogram depicting hierarchical partitioning:
  [Dendrogram figure; leaf labels: x c f y d p o k a e m b l s]
Example use: comparing data-sets
• Finding specialist terms, UK v US?
• Compare this month’s data with last month’s
• Compare with several previous months
• Notice new sales growth areas
• Trends – rise, fall, cyclical (eg turkey sales?)
• Key differences may denote clusters (eg ise/ize)
• “size/scale” of difference
• “Aligned”, “parallel” corpora used in Statistical Machine Translation, eg Google Translate
Output: differences between data-sets
• Key instances/attributes with most significant difference, eg highest Log-Likelihood score
• Groups or clusters of significant terms, eg names
• Trends over several data-sets: graphs
• Overall metrics of difference
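The slide does not give a formula for the Log-Likelihood score. The sketch below assumes the two-corpus form commonly used in corpus linguistics, which compares each observed term frequency with the frequency expected if the term were equally common in both corpora. The function name and the example counts are hypothetical.

```python
import math


def log_likelihood(freq_a, size_a, freq_b, size_b):
    """Log-likelihood score for one term seen freq_a times in corpus A
    (size_a words) and freq_b times in corpus B (size_b words)."""
    total_freq = freq_a + freq_b
    total_size = size_a + size_b
    # Expected frequencies if the term were equally common in both corpora.
    expected_a = size_a * total_freq / total_size
    expected_b = size_b * total_freq / total_size
    score = 0.0
    for observed, expected in ((freq_a, expected_a), (freq_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            score += observed * math.log(observed / expected)
    return 2 * score


# Made-up counts: one term in two corpora of 1000 words each.
same_rate = log_likelihood(10, 1000, 10, 1000)
small_gap = log_likelihood(15, 1000, 10, 1000)
large_gap = log_likelihood(30, 1000, 10, 1000)
print(same_rate, small_gap, large_gap)
```

Ranking all terms by this score puts the terms with the most significant difference at the top. A term used at the same rate in both corpora scores zero.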
Example use of classifying
• A large database of symptoms and diagnoses is available from medical records.
• We seek rules that will predict which disease someone has, given their symptoms.
Or
• Given information about physical environment and crop yields – seek rules that will help us understand why some areas give higher yields than others.
Output: decision tree
Outlook
  sunny: Humidity
    high: Play = ‘no’
    normal: Play = ‘yes’
  rainy: Windy
    true: Play = ‘no’
    false: Play = ‘yes’
About decision trees
• Non-leaf node represents a test on a particular attribute.
• Arcs represent the outcomes of the test.
• Tests on numerical attributes usually have binary outcome
• Tests on nominal attributes usually have one outcome for each element in the domain.
• The leaf nodes represent a class.
• Each path down the tree represents a prediction for assigning instances to classes
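As a rough sketch of these points, the weather tree can be stored as nested dictionaries and walked from root to leaf. The dictionary layout and the name classify are assumptions made for this example. The overcast branch does not appear in the tree text of this deck; it is taken from the classification rule "If outlook = overcast then play = yes".

```python
# Non-leaf nodes are dicts naming the attribute to test; arcs are the keys of
# "branches"; leaves are plain class labels.
WEATHER_TREE = {
    "attribute": "outlook",
    "branches": {
        "sunny": {
            "attribute": "humidity",
            "branches": {"high": "no", "normal": "yes"},
        },
        # Assumption: branch taken from the rule list, not from the tree text.
        "overcast": "yes",
        "rainy": {
            "attribute": "windy",
            "branches": {"true": "no", "false": "yes"},
        },
    },
}


def classify(tree, instance):
    """Follow one path down the tree and return the class at its leaf."""
    node = tree
    while isinstance(node, dict):
        value = instance[node["attribute"]]  # outcome of the test at this node
        node = node["branches"][value]       # follow the matching arc
    return node


example = {"outlook": "sunny", "humidity": "high", "windy": "false"}
print(classify(WEATHER_TREE, example))
```

Here every test is on a nominal attribute, so each node has one arc per value. A test on a numeric attribute would instead need a threshold comparison with two outcomes.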
Output: classification rules
If outlook = sunny and humidity = high then play = no
If outlook = rainy and windy = true then play = no
If outlook = overcast then play = yes
If humidity = normal then play = yes
Default play = yes
About Classification rules
• Alternative to decision trees:
– If <antecedent> then <consequent>
– Consequent indicates a class.
– Usually the antecedent is a conjunction of conditions on attribute values.
– Usually we interpret the set of rules to be a disjunction of the individual rules.
• Evaluation: Accuracy of a rule:
– Ratio of number of instances it predicts correctly to total number of instances that match the antecedent.
• Advantages of rules:
– Easier to read than trees
– Can be more compact
– Each rule represents a ‘nugget’ of knowledge, with its own accuracy
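The accuracy measure can be worked through on the ten weather.arff rows listed earlier in this deck. This is a sketch under assumptions: only outlook, windy and play are kept (the rules use nominal humidity, but the file stores humidity as a number), the name rule_accuracy is made up, and the third rule is a hypothetical one added to show an accuracy below 1.

```python
# (outlook, windy, play) for the ten weather.arff rows, in file order.
WEATHER = [
    {"outlook": "sunny", "windy": False, "play": "no"},
    {"outlook": "sunny", "windy": True, "play": "no"},
    {"outlook": "overcast", "windy": False, "play": "yes"},
    {"outlook": "rainy", "windy": False, "play": "yes"},
    {"outlook": "rainy", "windy": False, "play": "yes"},
    {"outlook": "rainy", "windy": True, "play": "no"},
    {"outlook": "overcast", "windy": True, "play": "yes"},
    {"outlook": "sunny", "windy": False, "play": "no"},
    {"outlook": "sunny", "windy": False, "play": "yes"},
    {"outlook": "rainy", "windy": False, "play": "yes"},
]


def rule_accuracy(conditions, predicted_class, instances):
    """Return (correct, matched) for 'if conditions then play = predicted_class'."""
    # The antecedent is a conjunction: every condition must hold.
    matched = [inst for inst in instances
               if all(inst[attr] == value for attr, value in conditions.items())]
    correct = sum(1 for inst in matched if inst["play"] == predicted_class)
    return correct, len(matched)


RULES = [
    ({"outlook": "rainy", "windy": True}, "no"),  # rule from the slide
    ({"outlook": "overcast"}, "yes"),             # rule from the slide
    ({"outlook": "sunny"}, "no"),                 # hypothetical rule, illustration only
]

for conditions, predicted in RULES:
    correct, matched = rule_accuracy(conditions, predicted, WEATHER)
    print(conditions, "=> play =", predicted, f"accuracy {correct}/{matched}")
```

Each rule gets its own ratio, which is what lets a single rule be read as a separate ‘nugget’ of knowledge.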
A variant: rules with exceptions
• General form:
– If A then B except if C then D
• Advantages:
– Can be more compact than rules without exceptions
– Closer to the way we organise our knowledge
– Scales well as new instances are introduced.
Output: association rules
• Given point of sales data, seek any kind of dependencies between data items that will help us understand shopping behaviour.
“People who live by the sea and buy pet food go on fewer holidays”
• ‘Learned’ rules may or may not be interesting!
About association rules
• Similar to classification rules, but now consequent can predict any attribute, not just the class.
• Evaluation: Coverage (or support) of a rule:
– The number of instances it correctly predicts
• Evaluation: Accuracy (or confidence) of a rule:
– Ratio of number of instances it predicts correctly to total number of instances that match the antecedent.
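Both measures can be computed directly from basket data. The following is a minimal sketch using the definitions above: the baskets are invented for illustration, and the name evaluate_rule is hypothetical.

```python
def evaluate_rule(antecedent, consequent, baskets):
    """Return (coverage, accuracy) for the rule 'antecedent => consequent'.

    antecedent and consequent are sets of items; each basket is a set of items.
    """
    matching = [b for b in baskets if antecedent <= b]   # antecedent holds
    correct = [b for b in matching if consequent <= b]   # consequent also holds
    coverage = len(correct)                              # support
    accuracy = coverage / len(matching) if matching else 0.0  # confidence
    return coverage, accuracy


# Made-up point-of-sale baskets.
baskets = [
    {"nappies", "beer", "milk"},
    {"nappies", "beer"},
    {"nappies", "bread"},
    {"beer", "crisps"},
    {"milk", "bread"},
]

coverage, accuracy = evaluate_rule({"nappies"}, {"beer"}, baskets)
print(f"nappies => beer: coverage {coverage}, accuracy {accuracy:.2f}")
```

Since the consequent may be any item, the same function can score every candidate rule. Sorting by coverage and accuracy is one simple way to decide which ‘learned’ rules are worth a closer look.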
Output: numerical prediction
• Best-fit equation
• e.g. linear: length = a + b*width + c*height
• Widely used in maths and stats
• ?not “really” data mining?
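A best-fit equation can be found by ordinary least squares. To keep the sketch short it fits only one predictor, length = a + b*width, rather than the two-predictor form on the slide. The measurements are made-up values and the name fit_line is hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Made-up measurements that lie exactly on length = 1 + 2*width.
widths = [1, 2, 3, 4]
lengths = [3, 5, 7, 9]

a, b = fit_line(widths, lengths)
predicted = a + b * 5  # numerical prediction for a new width of 5
print(a, b, predicted)
```

With more than one predictor the same idea applies, but the coefficients are usually obtained from a linear-algebra routine rather than written out by hand.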
Example use of numerical prediction
• Given numerical information about physical environment and crop yields – seek rules that will help us predict crop yields for some new set of conditions.
Key points
• Data Mining tools semi-automate the process of discovering patterns in data.
• Tools differ in terms of what concepts they discover (differences, clusters, decision-trees, rules, numerical prediction)…
• … and in terms of the output they provide (eg clustering algorithms provide a set of centroids or a dendrogram)
• Selecting the right tools for the job is based on business objectives: what is the USE for the knowledge discovered
Self-test
• You should be able to:
– Decide what attributes are relevant to the given data mining task
– Decide which is the appropriate data mining technique for a given problem defined in terms of business objectives.
– Decide which is the most appropriate form of output.