Prediction in Data Mining

Upload: nguyen-minh-tan

Post on 10-Apr-2018


  • 8/8/2019 Prediction in Data Mining

    1/12

    Hong Vit Lm

    Nguyen Minh Tan


    Outline

    • Data Mining Prediction in general
      • Definition
      • How does Prediction work?
      • Prediction Evaluation
    • Some current research in Data Mining prediction
      • Regression
      • SVM
      • Neural Network


    Definition

    • Prediction in data mining is the process of building a continuous-valued function that predicts future values from current data.
    • Difference from classification:
      • Classification: predicts categorical labels
      • Prediction: predicts continuous values
    • Fields of application:
      • Medicine, economics


    How does Prediction work?

    • Data preprocessing
      • Data cleaning
      • Relevance analysis
      • Data transformation and reduction
    • Learning phase
      • Input: training set
      • Output: continuous-valued function
    • Testing phase
      • Input: testing set, continuous-valued function
      • Output: accuracy evaluation
    • Suitable predictor -> Predictor
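The phases above can be sketched end to end in a few lines. This is only an illustration in plain Python, not the authors' implementation: the toy data, the helper names `clean`, `learn`, and `evaluate`, and the choice of a least-squares line as the learned function are all assumptions made for the example.

```python
import random

def clean(rows):
    """Data cleaning: drop tuples with a missing input or target."""
    return [(x, y) for x, y in rows if x is not None and y is not None]

def learn(train):
    """Learning phase: fit y = a + b*x by least squares, return it as a function."""
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in train)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in train)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return lambda x: a + b * x

def evaluate(predictor, test):
    """Testing phase: mean squared error of the predictor on the testing set."""
    return sum((y - predictor(x)) ** 2 for x, y in test) / len(test)

# Hypothetical toy data: y is roughly 3x + 2 plus small noise.
# One tuple has a missing target so that the cleaning step has work to do.
rng = random.Random(0)
raw = [(float(x), 3.0 * x + 2.0 + rng.uniform(-0.5, 0.5)) for x in range(20)]
raw.append((20.0, None))

data = clean(raw)
rng.shuffle(data)
train, test = data[:14], data[14:]   # training set and testing set

predictor = learn(train)             # continuous-valued function
test_mse = evaluate(predictor, test) # accuracy evaluation
print(f"test MSE: {test_mse:.4f}")
```

Whether a predictor counts as "suitable" depends on an error threshold chosen for the application; the sketch only reports the error and leaves that decision open.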


    Criteria for comparing prediction methods

    • Accuracy
    • Speed
    • Robustness
    • Scalability
    • Interpretability


    Predictor Error Measures

    • Loss functions:
      • Absolute error
      • Squared error
    • Test error rates:
      • Mean absolute error
      • Mean squared error
      • Relative absolute error
      • Relative squared error
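The slide lists these measures without formulas, so the sketch below uses their standard textbook definitions. The function names and sample values are illustrative, and taking the baseline mean over the same actual values that are being scored is an assumption.

```python
def mean_absolute_error(actual, predicted):
    """Average of the absolute errors |y - y'|."""
    return sum(abs(y - p) for y, p in zip(actual, predicted)) / len(actual)

def mean_squared_error(actual, predicted):
    """Average of the squared errors (y - y')**2; penalizes large errors more."""
    return sum((y - p) ** 2 for y, p in zip(actual, predicted)) / len(actual)

def relative_absolute_error(actual, predicted):
    """Total absolute error relative to always predicting the mean of the actual values."""
    mean_y = sum(actual) / len(actual)
    numerator = sum(abs(y - p) for y, p in zip(actual, predicted))
    denominator = sum(abs(y - mean_y) for y in actual)
    return numerator / denominator

def relative_squared_error(actual, predicted):
    """Total squared error relative to always predicting the mean of the actual values."""
    mean_y = sum(actual) / len(actual)
    numerator = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    denominator = sum((y - mean_y) ** 2 for y in actual)
    return numerator / denominator

# Made-up values for illustration only.
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.0, 5.0, 8.0, 11.0]

mae = mean_absolute_error(actual, predicted)
mse = mean_squared_error(actual, predicted)
rae = relative_absolute_error(actual, predicted)
rse = relative_squared_error(actual, predicted)
print(mae, mse, rae, rse)
```

A relative error below 1 means the predictor does better than the trivial baseline that always outputs the mean.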


    Evaluating the Accuracy of a Predictor

    • Holdout method and random subsampling
    • Cross-validation
    • Bootstrap
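The three evaluation schemes differ mainly in how tuples are assigned to training and testing sets. The sketch below shows one possible way to generate the index splits in plain Python; the function names, the 12-tuple data set size, and the parameter values are assumptions for illustration.

```python
import random

def holdout_split(n, test_fraction, rng):
    """Holdout: shuffle the indices once and reserve a fraction for testing.
    Random subsampling repeats this several times and averages the error."""
    idx = list(range(n))
    rng.shuffle(idx)
    n_test = int(round(n * test_fraction))
    return idx[n_test:], idx[:n_test]

def k_fold_splits(n, k, rng):
    """k-fold cross-validation: each fold is the testing set exactly once."""
    idx = list(range(n))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test

def bootstrap_split(n, rng):
    """Bootstrap: draw n training indices with replacement;
    the tuples that were never drawn form the testing set."""
    train = [rng.randrange(n) for _ in range(n)]
    drawn = set(train)
    test = [i for i in range(n) if i not in drawn]
    return train, test

rng = random.Random(42)
train_idx, test_idx = holdout_split(12, 0.25, rng)
cv_splits = list(k_fold_splits(12, 4, rng))
boot_train, boot_test = bootstrap_split(12, rng)
print(len(train_idx), len(test_idx), len(cv_splits), len(boot_test))
```

In each scheme the predictor is trained on the training indices only and its error is measured on the testing indices, so the estimate is not biased by tuples the model has already seen.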




    Regression

    • Statistical methodology developed by Sir Francis Galton
    • A good choice when all of the predictor variables are continuous-valued
    • Types of regression:
      • Linear regression / non-linear regression
      • Single variable / multiple variables
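For the single-variable linear case, the least-squares estimates can be written out explicitly. The slide gives no formulas, so the notation here ($w_0$, $w_1$, and $|D|$ training tuples $(x_i, y_i)$) is an assumption; the formulas themselves are the standard least-squares results.

```latex
\begin{align}
  y   &= w_0 + w_1 x \\
  w_1 &= \frac{\sum_{i=1}^{|D|} (x_i - \bar{x})(y_i - \bar{y})}
              {\sum_{i=1}^{|D|} (x_i - \bar{x})^2} \\
  w_0 &= \bar{y} - w_1 \bar{x}
\end{align}
% Multiple linear regression extends the model to n predictor variables:
\begin{equation}
  y = w_0 + w_1 x_1 + w_2 x_2 + \dots + w_n x_n
\end{equation}
% Some non-linear models reduce to the linear case by a change of variables,
% e.g. a cubic polynomial with x_1 = x, x_2 = x^2, x_3 = x^3:
\begin{equation}
  y = w_0 + w_1 x + w_2 x^2 + w_3 x^3
  \;\Longrightarrow\;
  y = w_0 + w_1 x_1 + w_2 x_2 + w_3 x_3
\end{equation}
```

Here $\bar{x}$ and $\bar{y}$ are the means of the $x_i$ and $y_i$ over the training tuples.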


    Support Vector Machines

    • SVM: uses a nonlinear mapping to transform the original training data into a higher dimension, then finds the linear optimal separating hyperplane in that space.
    • For regression, it can be used to learn the input-output relationship between the input training tuples and their corresponding continuous-valued outputs.
    • Pros and cons:
      • Training can be slow, but the resulting predictors are highly accurate
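Two building blocks make these ideas concrete: a kernel, which computes inner products in the higher-dimensional space without constructing the mapping explicitly, and a regression loss. The slide names neither, so the Gaussian RBF kernel and the epsilon-insensitive loss below are assumptions, chosen because they are common in support vector regression. This is not a full SVM trainer, and the default values of `gamma` and `epsilon` are arbitrary.

```python
import math

def rbf_kernel(u, v, gamma=0.5):
    """Gaussian RBF kernel: similarity of u and v, equal to their inner product
    after an implicit nonlinear mapping into a higher-dimensional space."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)

def epsilon_insensitive_loss(actual, predicted, epsilon=0.1):
    """Errors smaller than epsilon cost nothing; larger errors cost linearly."""
    return max(0.0, abs(actual - predicted) - epsilon)

same_point = rbf_kernel([1.0, 2.0], [1.0, 2.0])   # identical points
far_points = rbf_kernel([0.0, 0.0], [3.0, 4.0])   # squared distance 25
inside_tube = epsilon_insensitive_loss(5.0, 5.05)
outside_tube = epsilon_insensitive_loss(5.0, 5.5)
print(same_point, far_points, inside_tube, outside_tube)
```

The kernel value falls toward zero as points move apart, and predictions that land within `epsilon` of the target incur no loss, which is what lets the fitted function ignore small deviations.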


    Neural Network

    • Artificial neural networks: non-linear predictive models that learn through training and resemble biological neural networks in structure.
    • One of the most commonly used techniques in data mining
    • Pros:
      • High tolerance of noisy data and of patterns they have not been trained on
      • Well suited for continuous-valued inputs and outputs
      • Successful in a wide range of applications
    • Cons:
      • Long training times
      • Poor interpretability
    • Architecture:
      • Feed-forward
    • Algorithm:
      • Backpropagation
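A feed-forward network trained by backpropagation can be sketched compactly for the prediction setting. The example below is a minimal illustration under stated assumptions, not a reference implementation: one input, one hidden layer of five tanh units, one linear output, plain stochastic gradient descent, and a toy target y = x**2. The layer size, learning rate, and epoch count are arbitrary choices.

```python
import math
import random

def init_network(n_hidden, rng):
    """One input, one tanh hidden layer, one linear output unit."""
    return {
        "w1": [rng.uniform(-1, 1) for _ in range(n_hidden)],  # input -> hidden
        "b1": [0.0] * n_hidden,                               # hidden biases
        "w2": [rng.uniform(-1, 1) for _ in range(n_hidden)],  # hidden -> output
        "b2": 0.0,                                            # output bias
    }

def forward(net, x):
    """Feed-forward pass: hidden activations and the continuous-valued output."""
    hidden = [math.tanh(w * x + b) for w, b in zip(net["w1"], net["b1"])]
    output = sum(w * h for w, h in zip(net["w2"], hidden)) + net["b2"]
    return hidden, output

def backprop_step(net, x, target, lr):
    """One gradient step on the squared error 0.5 * (output - target)**2."""
    hidden, output = forward(net, x)
    delta_out = output - target  # error derivative at the output unit
    for j, h in enumerate(hidden):
        # Propagate the error back through w2; tanh'(z) = 1 - tanh(z)**2.
        delta_hidden = delta_out * net["w2"][j] * (1.0 - h * h)
        net["w2"][j] -= lr * delta_out * h
        net["w1"][j] -= lr * delta_hidden * x
        net["b1"][j] -= lr * delta_hidden
    net["b2"] -= lr * delta_out

def network_mse(net, data):
    """Mean squared error of the network over a list of (x, y) pairs."""
    return sum((forward(net, x)[1] - y) ** 2 for x, y in data) / len(data)

rng = random.Random(1)
data = [(k / 10.0, (k / 10.0) ** 2) for k in range(-10, 11)]  # toy target
net = init_network(n_hidden=5, rng=rng)

mse_before = network_mse(net, data)
for _ in range(2000):            # training epochs
    for x, y in data:
        backprop_step(net, x, y, lr=0.05)
mse_after = network_mse(net, data)
print(f"MSE before: {mse_before:.4f}, after: {mse_after:.4f}")
```

The many passes over even this tiny data set hint at the long training times listed among the cons, and the learned weights carry no obvious meaning, which is the interpretability problem.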
