applying data mining

Upload: spotanand9941

Post on 05-Apr-2018


  • 7/31/2019 Applying Data Mining

    1/13

    Applying Data Mining

    By Susan L. Miertschin


    Is Data Mining Appropriate for the Problem at Hand?

    Can you clearly define the problem?

    Does potentially meaningful data exist?

    Does the data contain hidden knowledge, or is the data factual and useful for reporting purposes only?

    Will the cost of processing the data be less than the likely increase in profit from the knowledge gained from the data mining project?


    Determine if the Problem is Suitable for Data Mining

    Shallow Knowledge

    Multidimensional Knowledge

    Hidden Knowledge

    Deep Knowledge


    Four Types of Knowledge

    Shallow: Factual; can be stored and manipulated in a database

    Multidimensional: Factual; On-line Analytical Processing (OLAP) tools are used to manipulate multidimensional knowledge

    Hidden: Not easily found using database query; data mining algorithms can find patterns

    Deep: Can only be found if we are given some direction about what we are looking for


    Data Mining vs. Data Query: An Example

    Data query: You already almost know what you are looking for

    Data mining: Find regularities in data that are not obvious without the aid of tools

    Regularities may not be obvious because of:

    Amount of data

    Organization of data obscures patterns

    Limits of human capabilities to consider many things at once


    Expert system: A computer program that emulates the problem-solving skills of one or more human experts

    Knowledge engineer: A person trained to interact with an expert in order to capture the expert's implicit knowledge in explicit form


    Figure 1.2 Data mining vs. expert systems

    Data -> Data Mining Tool -> If Swollen Glands = Yes Then Diagnosis = Strep Throat

    Human Expert -> Knowledge Engineer -> Expert System Building Tool -> If Swollen Glands = Yes Then Diagnosis = Strep Throat


    What is simple search?

    Nearest neighbor classifier

    K-nearest neighbor classifier


    Create a table of instances with known classifications

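    This is the training data

    Get a new instance

    Compare it with each training instance, using the Euclidean distance metric for comparison purposes

    $d(x, y) = \sqrt{(x_1 - y_1)^2 + \cdots + (x_n - y_n)^2}$, where x is the new instance and y is a training instance, each with n attributes

    Find the instance in the training set that is closest, on the basis of the distance metric, to the new instance

    Classify the new instance the same way as the one closest to it in the training data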
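The nearest neighbor steps above can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: the function names, the two-attribute training table, and the class labels "A" and "B" are all hypothetical.

```python
import math

def euclidean_distance(a, b):
    """Euclidean distance between two equal-length numeric sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_neighbor_classify(training_data, new_instance):
    """Return the class of the training instance closest to new_instance.

    training_data is a list of (attribute_values, class_label) pairs.
    """
    closest_attributes, closest_label = min(
        training_data,
        key=lambda pair: euclidean_distance(pair[0], new_instance),
    )
    return closest_label

# Hypothetical training data: (attribute values, known classification).
training_data = [
    ((1.0, 1.0), "A"),
    ((1.5, 2.0), "A"),
    ((5.0, 5.0), "B"),
    ((6.0, 4.5), "B"),
]

print(nearest_neighbor_classify(training_data, (1.2, 1.4)))  # closest to (1.0, 1.0)
print(nearest_neighbor_classify(training_data, (5.5, 5.0)))  # closest to (5.0, 5.0)
```

Every classification scans the whole training table, so the work per new instance grows with the size of the training set.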


    Problems with Nearest Neighbor Classification

    Computation times will be large when the training set is large

    No differentiation of relevant from irrelevant attributes

    No way to tell which attributes differentiate among classes
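To make the irrelevant-attribute problem concrete, here is a small sketch under invented assumptions: the data, the labels, and the helper name `nearest_label` are hypothetical. The first attribute separates the two classes; the second is noise on a larger scale. Because the distance metric treats all attributes alike, the noise attribute can decide the outcome.

```python
import math

def nearest_label(training_data, new_instance):
    """Classify by the single closest training instance (Euclidean distance)."""
    return min(training_data, key=lambda pair: math.dist(pair[0], new_instance))[1]

# Hypothetical data: only the first attribute is related to the class.
relevant_only = [((1.0,), "A"), ((2.0,), "A"), ((8.0,), "B"), ((9.0,), "B")]

# Same instances with an added noise attribute that is unrelated to the class.
with_noise = [
    ((1.0, 900.0), "A"),
    ((2.0, 100.0), "A"),
    ((8.0, 850.0), "B"),
    ((9.0, 150.0), "B"),
]

label_without_noise = nearest_label(relevant_only, (1.4,))
label_with_noise = nearest_label(with_noise, (1.4, 860.0))

print(label_without_noise)  # decided by the relevant attribute alone
print(label_with_noise)     # decided mostly by the noise attribute
```

In this invented data the new instance sits among the class A values on the relevant attribute, yet the noise attribute pulls it closest to a class B instance.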


    Different algorithms are available for different data mining tasks

    Different tools exist that implement different algorithms and different versions of algorithms


    e.g., Algorithms Available in Microsoft's Analysis Services

    Decision Trees

    Linear Regression

    Naive Bayes

    Clustering Algorithms

    Association Rules

    Sequence Clustering

    Time Series Analysis

    Neural Networks

    Logistic Regression
