
    International Journal of Information Technology & Decision Making, Vol. 10, No. 6 (2011) 1161–1174, © World Scientific Publishing Company

    DOI: 10.1142/S0219622011004750

    A FUZZY LINEAR PROGRAMMING-BASED

    CLASSIFICATION METHOD

    AIHUA LI

    School of Management Science and Engineering

    Central University of Finance and Economics

    Beijing 100080, P. R. China

    [email protected]

    YONG SHI

    Fictitious Economics and Data Technology Research Centre

    Chinese Academy of Science

    Beijing 100081, P. R. China

    College of Information Science and Technology

    University of Nebraska

    NE 68182, USA

    [email protected]

    JING HE and YANCHUN ZHANG

    Centre for Applied Informatics

    Victoria University, Melbourne City MC

    VIC 8001, Australia

    Fictitious Economics and Data Technology Research Centre

    Chinese Academy of Science

    Beijing 100081, P. R. China

    [email protected]@gmail.com

    Multiple criteria linear programming (MCLP) and multiple criteria quadratic programming classification models have been applied in financial risk analysis and credit risk control, for example in credit cardholder behavior analysis. In this paper, a fuzzy linear programming classification method with soft constraints and criteria is proposed, building on previous findings from other researchers. In this method, a satisfying result can be obtained by selecting the boundary variables $d_i$ of the constraints and criteria, respectively. A general framework of this method is also constructed. Two real-life datasets, one from a major US bank and the other from the KDD 99 database, are used to test the accuracy rate of the proposed method. The results show the feasibility of this method.

    Keywords: Classification; data mining; MCLP; fuzzy linear programming; membership function.

    Corresponding author.

    Int. J. Info. Tech. Dec. Mak. 2011.10:1161-1174. Downloaded from www.worldscientific.com by 85.74.84.134 on 10/23/12. For personal use only.

    1. Introduction

    Data mining has become an important technology with the development of databases and the internet. It extracts nontrivial, implicit, previously unknown, and potentially useful patterns or knowledge from databases. Classification, a kind of supervised learning, is one of the functions of data mining. The classification process has two steps.[1] First, a hidden pattern or discriminant function is derived from the training set. Second, the pattern or discriminant function is applied to classify the testing dataset. The training accuracy rate and testing accuracy rate are often used to evaluate the model.

    Classification methods initially employed artificial intelligence (AI), traditional statistics, and machine learning tools, such as decision trees,[2] linear discriminant analysis (LDA),[3] support vector machines (SVM),[4] and so on. They have been applied to real-life medical, communication, and strategic management problems. For datasets with different characteristics, classification methods show different advantages and disadvantages. For example, SVM or a neural network (NN) fits the output of some datasets well but may sometimes overfit. LDA works well when the data are normally distributed but is not a good choice otherwise.

    The linear programming (LP) classification method was first proposed in the 1980s,[5–7] and showed its potential for applications. In the 1990s, multiple criteria linear programming (MCLP) and multiple criteria quadratic programming (MCQP) classification models were developed,[8–10] and were later used successfully in credit cardholder behavior analysis[11–14] and network intrusion detection.[15] He et al.[16] proposed a fuzzy linear programming (FLP) model with soft criteria only, from which a satisfying solution could be obtained. In this paper, we propose an FLP classification method with soft constraints and criteria, based on the work of previous researchers.

    This paper is organized as follows: Sec. 2 reviews the LP, MCLP, and FLP methods. Section 3 proposes FLP with soft constraints and criteria, which means that the decision maker can choose a reasonable bound for the constraints in deriving a satisfying solution. Section 4 uses two examples, one from a major US bank and the other from the KDD 99 database,[17] to test the accuracy rate of the proposed method. Some remarks are given in Sec. 5.

    2. LP, MCLP, and FLP Classification Models

    In the LP classification method, the objectives of the initial forms can be categorized as MMD and MSD.[6] Here, MMD means maximizing the minimum distance of observations from the critical value, and MSD means minimizing the sum of the distances of the observations from the critical value. For example, in credit cardholder behavior analysis, a basic framework for two-class problems can be presented as follows.

    Given a set of $r$ variables (attributes) about a cardholder $a = (a_1, a_2, \ldots, a_r)$, let $A_i = (A_{i1}, A_{i2}, \ldots, A_{ir})$ be the development sample of data for the variables,


    where $i = 1, 2, \ldots, n$ and $n$ is the sample size. We want to determine the best coefficients of the variables, denoted by $X = (x_1, x_2, \ldots, x_r)^T$, and a boundary value $b$ (a scalar) to separate two classes: G (Good, for nonbankrupt accounts) and B (Bad, for bankrupt accounts), that is,

    $$A_i X \le b,\ A_i \in B\ (\text{Bad}), \qquad A_i X \ge b,\ A_i \in G\ (\text{Good}).$$

    To measure the separation of Good and Bad, we define:

    $\alpha_i$ = the overlapping of the two-class boundary for case $A_i$ (external measurement);

    $\alpha$ = the maximum overlapping of the two-class boundary for all cases $A_i$ ($\alpha_i < \alpha$);

    $\beta_i$ = the distance of case $A_i$ from its adjusted boundary (internal measurement);

    $\beta$ = the minimum distance for all cases $A_i$ from the adjusted boundary ($\beta_i > \beta$).

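    A simple version of Freed and Glover's model, which seeks MSD, can be written as

    $$\begin{aligned} &\text{Minimize } \sum_i \alpha_i, \\ &\text{Subject to: } A_i X \le b + \alpha_i,\ A_i \in B, \\ &\phantom{\text{Subject to: }} A_i X \ge b - \alpha_i,\ A_i \in G, \end{aligned} \tag{2.1}$$

    where $A_i$ are given, $X$ and $b$ are unrestricted, and $\alpha_i \ge 0$.

    The alternative to the above model is to find MMD as follows:

    $$\begin{aligned} &\text{Maximize } \sum_i \beta_i, \\ &\text{Subject to: } A_i X \ge b - \beta_i,\ A_i \in B, \\ &\phantom{\text{Subject to: }} A_i X \le b + \beta_i,\ A_i \in G, \end{aligned} \tag{2.2}$$

    where $A_i$ are given, $X$ and $b$ are unrestricted, and $\beta_i \ge 0$.

    A hybrid model[7] that combines models (2.1) and (2.2) can be written as follows:

    $$\begin{aligned} &\text{Minimize } \sum_i \alpha_i - \sum_i \beta_i, \\ &\text{Subject to: } A_i X = b + \alpha_i - \beta_i,\ A_i \in B, \\ &\phantom{\text{Subject to: }} A_i X = b - \alpha_i + \beta_i,\ A_i \in G, \end{aligned} \tag{2.3}$$

    where $A_i$ are given, $X$ and $b$ are unrestricted, and $\alpha_i, \beta_i \ge 0$, respectively.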
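To make the roles of the overlap and distance variables in the equality constraints of model (2.3) concrete, the following is a minimal sketch that computes them for a fixed candidate pair (X, b). It does not solve the LP. The function name `deviations`, the toy records, and the convention that at most one of the two variables is nonzero per record are illustrative assumptions, not part of the paper.

```python
from typing import List, Sequence, Tuple


def deviations(
    records: Sequence[Sequence[float]],
    labels: Sequence[str],
    X: Sequence[float],
    b: float,
) -> Tuple[List[float], List[float]]:
    """Split each record's signed gap from the boundary b into (alpha_i, beta_i).

    Bad ("B"):  A_i X = b + alpha_i - beta_i
    Good ("G"): A_i X = b - alpha_i + beta_i
    Assumption: at most one of alpha_i, beta_i is nonzero for each record.
    """
    alpha, beta = [], []
    for a, label in zip(records, labels):
        score = sum(a_j * x_j for a_j, x_j in zip(a, X))  # A_i X
        # Positive gap: the record lies on the wrong side of b (overlap).
        gap = score - b if label == "B" else b - score
        alpha.append(max(0.0, gap))
        beta.append(max(0.0, -gap))
    return alpha, beta


# Toy one-attribute data (hypothetical, not from the paper's datasets).
records = [(1.0,), (3.0,), (2.5,), (0.5,)]
labels = ["B", "B", "G", "G"]
alpha, beta = deviations(records, labels, X=(1.0,), b=2.0)
hybrid_objective = sum(alpha) - sum(beta)  # objective of model (2.3)
print(alpha, beta, hybrid_objective)
```

An LP solver would search over X and b to minimize this objective; the sketch only evaluates it at one point, which is enough to check a candidate solution by hand.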

    Shi et al.[8] applied the compromise solution of MCLP to minimize the sum of $\alpha_i$ and maximize the sum of $\beta_i$ simultaneously. A two-criteria LP model is stated


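    as follows:

    $$\begin{aligned} &\text{Minimize } \sum_i \alpha_i \ \text{ and } \ \text{Maximize } \sum_i \beta_i, \\ &\text{Subject to: } A_i X = b + \alpha_i - \beta_i,\ A_i \in B, \\ &\phantom{\text{Subject to: }} A_i X = b - \alpha_i + \beta_i,\ A_i \in G, \end{aligned} \tag{2.4}$$

    where $A_i$ are given, $X$ and $b$ are unrestricted, and $\alpha_i, \beta_i \ge 0$, respectively.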

    For this MCLP model, a systematic explanation and summary are presented in Refs. 18–20.

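    In the compromise solution approach,[21] the best trade-off between $-\sum_i \alpha_i$ and $\sum_i \beta_i$ is identified for an optimal solution. To explain this, assume the ideal value of $-\sum_i \alpha_i$ is $\alpha^* > 0$ and the ideal value of $\sum_i \beta_i$ is $\beta^* > 0$. Then, if $-\sum_i \alpha_i > \alpha^*$, the regret measure is defined as $-d_\alpha^+ = \sum_i \alpha_i + \alpha^*$; otherwise, it is defined as 0. If $-\sum_i \alpha_i < \alpha^*$, the regret measure is defined as $d_\alpha^- = \alpha^* + \sum_i \alpha_i$; otherwise, it is 0. Thus, the relationships among these measures are (i) $\alpha^* + \sum_i \alpha_i = d_\alpha^- - d_\alpha^+$, (ii) $|\alpha^* + \sum_i \alpha_i| = d_\alpha^- + d_\alpha^+$, and (iii) $d_\alpha^-, d_\alpha^+ \ge 0$. Similarly, we derive $\beta^* - \sum_i \beta_i = d_\beta^- - d_\beta^+$, $|\beta^* - \sum_i \beta_i| = d_\beta^- + d_\beta^+$, and $d_\beta^-, d_\beta^+ \ge 0$.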
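The regret measures above split a signed difference between an ideal value and an achieved value into two nonnegative parts. The sketch below illustrates relationships (i) to (iii) numerically. The function name `regret_measures` and the numbers used are hypothetical and chosen only for illustration.

```python
from typing import Tuple


def regret_measures(ideal: float, achieved: float) -> Tuple[float, float]:
    """Return (d_minus, d_plus) with d_minus - d_plus = ideal - achieved.

    Both parts are nonnegative and at most one of them is nonzero, so
    d_minus + d_plus equals |ideal - achieved|.
    """
    diff = ideal - achieved
    return max(0.0, diff), max(0.0, -diff)


# Hypothetical values: ideal alpha* = 0.5 for -sum(alpha), with sum(alpha) = 2.5,
# and ideal beta* = 4.0 for sum(beta), with sum(beta) = 1.5.
alpha_star, sum_alpha = 0.5, 2.5
beta_star, sum_beta = 4.0, 1.5

d_alpha_minus, d_alpha_plus = regret_measures(alpha_star, -sum_alpha)
d_beta_minus, d_beta_plus = regret_measures(beta_star, sum_beta)

# Sum of all four regret measures, the quantity a compromise model minimizes.
total_regret = d_alpha_minus + d_alpha_plus + d_beta_minus + d_beta_plus
print(d_alpha_minus, d_alpha_plus, d_beta_minus, d_beta_plus, total_regret)
```

Note that `ideal - achieved` for the first criterion is `alpha_star + sum_alpha`, which matches relationship (i).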

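    An MCLP model for two-class separation is presented as

    $$\begin{aligned} &\text{Minimize } d_\alpha^- + d_\alpha^+ + d_\beta^- + d_\beta^+ \\ &\text{Subject to: } \alpha^* + \sum_i \alpha_i = d_\alpha^- - d_\alpha^+, \\ &\phantom{\text{Subject to: }} \beta^* - \sum_i \beta_i = d_\beta^- - d_\beta^+, \\ &\phantom{\text{Subject to: }} A_i X = b + \alpha_i - \beta_i,\ A_i \in B, \\ &\phantom{\text{Subject to: }} A_i X = b - \alpha_i + \beta_i,\ A_i \in G, \end{aligned} \tag{2.5}$$

    where $A_i$, $\alpha^*$, and $\beta^*$ are given, $X$ and $b$ are unrestricted, and $\alpha_i, \beta_i, d_\alpha^-, d_\alpha^+, d_\beta^-, d_\beta^+ \ge 0$.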

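    In an FLP approach with soft criteria,[16] membership functions for the criteria Minimize $\sum_i \alpha_i$ and Maximize $\sum_i \beta_i$ were expressed respectively by

    $$F_1(x) = \begin{cases} 1, & \text{if } \sum_i \alpha_i \ge y_{1U}, \\ \dfrac{\sum_i \alpha_i - y_{1L}}{y_{1U} - y_{1L}}, & \text{if } y_{1L} < \sum_i \alpha_i < y_{1U}, \\ 0, & \text{if } \sum_i \alpha_i \le y_{1L}, \end{cases}$$

    [...] $> 0$, respectively.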


    For the model (2.2), we similarly define the membership function as follows:

    $$F_2(x) = \begin{cases} 1, & \text{if } \sum_i \beta_i \ge y_{2U}, \\ \dfrac{\sum_i \beta_i - y_{2L}}{y_{2U} - y_{2L}}, & \text{if } y_{2L} < \sum_i \beta_i < y_{2U}, \\ 0, & \text{if } \sum_i \beta_i \le y_{2L}, \end{cases}$$

    [...] $> 0$, (3.2)

    where $A_i$ are given, $X$ and $b$ are unrestricted, and $\beta_i, d_3, d_4 > 0$, respectively.

    In order to unify the notation of the models in this research, we use the membership function $F_1$ from model (3.1) and $F_2$ from model (3.2) in place of those in model (2.6), so model (2.6) is changed into the following format:

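    $$\begin{aligned} &\text{Maximize } \xi, \\ &\text{Subject to: } \xi \le \frac{\sum_i \alpha_i - y_{1U}}{y_{1L} - y_{1U}}, \\ &\phantom{\text{Subject to: }} \xi \le \frac{\sum_i \beta_i - y_{2L}}{y_{2U} - y_{2L}}, \\ &\phantom{\text{Subject to: }} A_i X = b + \alpha_i - \beta_i,\ A_i \in G, \\ &\phantom{\text{Subject to: }} A_i X = b - \alpha_i + \beta_i,\ A_i \in B, \end{aligned} \tag{3.3}$$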


    where $A_i$ is known, $X$ and $b$ are unrestricted, $\alpha_i, \beta_i, \xi \ge 0$, and $y_{1L}$, $y_{1U}$, $y_{2L}$, and $y_{2U}$ are the same as in models (3.1) and (3.2).
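The two criterion constraints of model (3.3) bound the satisfaction level from above by two linear membership expressions. The following is a minimal sketch of how that bound could be evaluated for given criterion sums. The function name `satisfaction_level`, the clipping to the interval [0, 1], and the numeric bounds are assumptions made for illustration and are not values from the paper.

```python
def satisfaction_level(
    sum_alpha: float,
    sum_beta: float,
    y1L: float,
    y1U: float,
    y2L: float,
    y2U: float,
) -> float:
    """Largest level allowed by the two criterion constraints of model (3.3)."""
    f1 = (sum_alpha - y1U) / (y1L - y1U)  # equals 1 at y1L and 0 at y1U
    f2 = (sum_beta - y2L) / (y2U - y2L)   # equals 0 at y2L and 1 at y2U
    # Clipping to [0, 1] is an assumption of this sketch.
    return max(0.0, min(1.0, f1, f2))


# Hypothetical bounds: sum(alpha) ranges over [1, 5], sum(beta) over [0, 4].
xi_balanced = satisfaction_level(2.0, 3.0, y1L=1.0, y1U=5.0, y2L=0.0, y2U=4.0)
xi_low_beta = satisfaction_level(1.0, 1.0, y1L=1.0, y1U=5.0, y2L=0.0, y2U=4.0)
print(xi_balanced, xi_low_beta)
```

In the second call the overlap criterion is fully satisfied, but the small distance sum caps the overall level, which is the max-min behavior that maximizing a single level over both constraints produces.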

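    To identify a fuzzy model for model (2.4), we first relax the equality constraints of model (2.4) to inequality constraints. Then, supposing $d_1 = d_2 = d_1^*$ and $d_3 = d_4 = d_2^*$, a fuzzy model combining (3.1) and (3.2) for the relaxed model (2.4) is

    $$\begin{aligned} &\text{Maximize } \xi, \\ &\text{Subject to: } \xi \le \frac{\sum_i \alpha_i - y_{1U}}{y_{1L} - y_{1U}}, \\ &\phantom{\text{Subject to: }} \xi \le \frac{\sum_i \beta_i - y_{2L}}{y_{2U} - y_{2L}}, \\ &\phantom{\text{Subject to: }} \xi \le 1 + \frac{A_i X - (b + \alpha_i - \beta_i)}{d_1^*},\ A_i \in B, \\ &\phantom{\text{Subject to: }} \xi \le 1 - \frac{A_i X - (b + \alpha_i - \beta_i)}{d_1^*},\ A_i \in B, \\ &\phantom{\text{Subject to: }} \xi \le 1 + \frac{A_i X - (b - \alpha_i + \beta_i)}{d_2^*},\ A_i \in G, \\ &\phantom{\text{Subject to: }} \xi \le 1 - \frac{A_i X - (b - \alpha_i + \beta_i)}{d_2^*},\ A_i \in G, \\ &\phantom{\text{Subject to: }} 1 \ge \xi > 0, \end{aligned} \tag{3.4}$$

    where $A_i$ are given, $X$ and $b$ are unrestricted, and $\alpha_i, \beta_i > 0$, respectively. The values $d_i^* > 0$, $i = 1, 2$, are fixed in the computation. The definitions of $y_{1L}$, $y_{1U}$, $y_{2L}$, and $y_{2U}$ are the same as those in models (3.1) and (3.2), respectively.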
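Each pair of soft constraints in model (3.4) limits the satisfaction level by how far a record's score deviates from its adjusted boundary, measured in units of the tolerance d. The sketch below evaluates that limit for a fixed candidate solution. The function name `max_satisfaction` and the two toy records are hypothetical. The tolerances 1 and 1.5 and b = 0.5 are the settings reported in the experiments, and the criterion constraints are left out for brevity.

```python
from typing import Sequence


def max_satisfaction(
    scores: Sequence[float],
    labels: Sequence[str],
    alpha: Sequence[float],
    beta: Sequence[float],
    b: float,
    d1_star: float,
    d2_star: float,
) -> float:
    """Largest level permitted by the soft constraints of model (3.4).

    The two inequalities xi <= 1 + r/d and xi <= 1 - r/d, with residual
    r = A_i X - target, are together equivalent to xi <= 1 - |r|/d.
    """
    xi = 1.0
    for score, label, a_i, b_i in zip(scores, labels, alpha, beta):
        if label == "B":
            target, d = b + a_i - b_i, d1_star
        else:
            target, d = b - a_i + b_i, d2_star
        xi = min(xi, 1.0 - abs(score - target) / d)
    return xi


# Two hypothetical records: scores A_i X with candidate alpha_i and beta_i.
xi = max_satisfaction(
    scores=[2.25, 1.25],
    labels=["B", "G"],
    alpha=[2.0, 0.0],
    beta=[0.0, 0.0],
    b=0.5,
    d1_star=1.0,
    d2_star=1.5,
)
print(xi)
```

A larger tolerance d makes a given residual cost less satisfaction, which is one way to read the statement that the boundary can move flexibly with the chosen distances.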

    There are two differences between models (3.4) and (2.4). First, instead of an optimal solution, a satisfying solution is obtained based on the membership function from the FLP. Second, with these soft constraints added to model (2.4), the boundary $b$ can be moved flexibly between an upper bound and a lower bound by the separation distance $d_i$, $i = 1, 2, 3, 4$, according to the characteristics of the data.

    4. Experimental Studies

    Two datasets are used here to test the accuracy rate of the proposed fuzzy classification method with both soft criteria and constraints. The first dataset came from a major US bank and has 65 attributes, which include the credit cardholder's over-limit fee, overcharge fee, and other information from the credit card usage history. There are 6000 records in total in the dataset. Here we compare the proposed FLP with both soft criteria and constraints against the MSD, MMD, and MCLP models.

    We randomly select 1400 records, 700 Good (nonbankrupt) and 700 Bad (bankrupt), from the dataset for training, and the remaining 4600 are used to test the classifier accuracy, following the cross-validation method. In the experiment,


    $b$ is given as 0.5 for all models, and $d_1 = d_2 = d_1^* = 1$, $d_3 = d_4 = d_2^* = 1.5$ for fuzzy model (3.4). The training results of five groups are listed in Table 1 and the testing results in Table 2. In Tables 1–3, we define

    $$\text{Absolute accuracy rate of Good} = \text{Sensitivity} = \frac{t\_\text{Good}}{\text{Good}},$$

    $$\text{Absolute accuracy rate of Bad} = \text{Specificity} = \frac{t\_\text{Bad}}{\text{Bad}},$$

    $$\text{Catch rate} = \text{Accuracy} = \text{Sensitivity} \times \frac{\text{Good}}{\text{Good} + \text{Bad}} + \text{Specificity} \times \frac{\text{Bad}}{\text{Good} + \text{Bad}},$$

    where t_Good is the number of Good records that were correctly classified as such, Good is the number of Good records, t_Bad is the number of Bad records that were correctly classified as such, and Bad is the number of Bad records. In this case, catching a Bad cardholder is more important than catching a Good cardholder, in order to avoid defaults.
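As a quick check of the three rate definitions, the sketch below computes them from raw counts and confirms that the catch rate reduces to the overall fraction of correctly classified records. The function name `rates` and the counts are hypothetical. The counts were chosen so that a balanced 700/700 set gives rates of 0.62, 0.82, and 0.72.

```python
from typing import Tuple


def rates(t_good: int, good: int, t_bad: int, bad: int) -> Tuple[float, float, float]:
    """Return (sensitivity, specificity, catch_rate) from raw counts."""
    sensitivity = t_good / good   # absolute accuracy rate of Good
    specificity = t_bad / bad     # absolute accuracy rate of Bad
    total = good + bad
    catch_rate = sensitivity * good / total + specificity * bad / total
    return sensitivity, specificity, catch_rate


# Hypothetical counts for a training set of 700 Good and 700 Bad records.
sensitivity, specificity, catch_rate = rates(t_good=434, good=700, t_bad=574, bad=700)
print(round(sensitivity, 2), round(specificity, 2), round(catch_rate, 2))
```

Because the catch rate weights each class by its share of the records, on an unbalanced test set it is pulled toward the rate of the larger class.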

    Tables 1 and 2 show that model (2.5) (MCLP), fuzzy model (3.3) (FLP1), and the proposed model (3.4) (FLP2) are better than model (2.1) (MSD) and model (2.2) (MMD). Although MMD is the best at catching Bad, it cannot be selected because of its poor Good catching and its instability in the experiment. MCLP offers a trade-off, with balanced Good and Bad accuracy rates. FLP1 works well for the overall catch rate and is a little worse than FLP2 at catching Bad. Thus, among MSD, MCLP, and FLP, if we give importance to catching Bad cardholders while keeping a satisfactory absolute accuracy rate, fuzzy model (3.4) would be a good choice. Table 3 shows that the choice of the boundary value $d_i$ in model (3.4) affects the classification result. By adjusting the value of $d_i$, we can obtain a satisfactory classification result in the training process.

    The second dataset came from KDD 99. Here a connection is a sequence of TCP packets with a start and an end, between which data flows from a source IP address to a target IP address under some well-defined protocol. Each connection is labeled as either normal or as an attack; here Dos is one specific attack type. In this task, we select the 38 attributes needed. There are 1,060,078 records in the dataset used in this example: 812,812 Normal records and 247,266 Dos records. First, 4000 records were selected randomly from the dataset for training, 2000 labeled Normal and the other 2000 labeled Dos. Second, the remaining records, 810,812 Normal and 245,266 Dos, were used for testing. Tables 4 and 5 show the training and testing results. In Tables 4 and 5, we use:

    $$\text{Absolute accuracy rate of Normal} = \text{Sensitivity} = \frac{t\_\text{Normal}}{\text{Normal}},$$


    Table 1. Training results of 1400 records (absolute accuracy rates for Good and Bad, and catch rate).

    Group    Model 1 (MSD)         Model 2 (MMD)         Model 5 (MCLP)        Fuzzy Model 3 (FLP1)  Fuzzy Model 4 (FLP2)
             Good   Bad    Catch   Good   Bad    Catch   Good   Bad    Catch   Good   Bad    Catch   Good   Bad    Catch
    Group 1  0.67   0.68   0.68    0.06   0.93   0.50    0.74   0.74   0.74    0.77   0.80   0.78    0.62   0.82   0.72
    Group 2  0.70   0.70   0.70    0.04   0.90   0.47    0.79   0.79   0.79    0.75   0.78   0.76    0.68   0.83   0.75
    Group 3  0.69   0.70   0.70    0.04   0.93   0.49    0.77   0.77   0.77    0.74   0.78   0.76    0.67   0.79   0.73
    Group 4  0.69   0.70   0.69    0.06   0.91   0.48    0.76   0.75   0.76    0.75   0.79   0.77    0.62   0.84   0.73
    Group 5  0.72   0.69   0.71    0.28   0.60   0.44    0.73   0.78   0.75    0.28   0.60   0.44    0.58   0.84   0.71

    Table 2. Testing results of 4600 records (absolute accuracy rates for Good and Bad, and catch rate).

    Group    Model 1 (MSD)         Model 2 (MMD)         Model 5 (MCLP)        Fuzzy Model 3 (FLP1)  Fuzzy Model 4 (FLP2)
             Good   Bad    Catch   Good   Bad    Catch   Good   Bad    Catch   Good   Bad    Catch   Good   Bad    Catch
    Group 1  0.70   0.77   0.70    0.03   0.92   0.08    0.75   0.74   0.75    0.72   0.78   0.73    0.61   0.84   0.62
    Group 2  0.69   0.71   0.69    0.03   0.91   0.08    0.73   0.79   0.74    0.74   0.74   0.74    0.63   0.79   0.64
    Group 3  0.72   0.73   0.72    0.03   0.90   0.08    0.75   0.72   0.75    0.76   0.75   0.76    0.67   0.78   0.67
    Group 4  0.69   0.70   0.69    0.04   0.88   0.09    0.77   0.68   0.76    0.74   0.70   0.74    0.65   0.82   0.66
    Group 5  0.70   0.68   0.70    0.28   0.63   0.30    0.75   0.72   0.75    0.72   0.63   0.72    0.61   0.78   0.62


    Table 3. Training and testing results of 1400 records for fuzzy model 4 with different $d_i$ (absolute accuracy rates for Good and Bad, and catch rate).

    d1 = d2   d3 = d4   Training                  Testing
                        Good    Bad     Catch     Good    Bad     Catch
    1         3         0.547   0.897   0.722     0.538   0.896   0.558
    1         2         0.574   0.863   0.719     0.570   0.877   0.588
    1         1.5       0.619   0.823   0.721     0.607   0.838   0.620
    1         1         0.673   0.746   0.709     0.670   0.777   0.676

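    $$\text{Absolute accuracy rate of Dos} = \text{Specificity} = \frac{t\_\text{Dos}}{\text{Dos}},$$

    $$\text{Catch rate} = \text{Accuracy} = \text{Sensitivity} \times \frac{\text{Normal}}{\text{Normal} + \text{Dos}} + \text{Specificity} \times \frac{\text{Dos}}{\text{Normal} + \text{Dos}},$$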

    where t_Normal is the number of Normal records that were correctly classified as such, Normal is the number of Normal records, t_Dos is the number of Dos records that were correctly classified as such, and Dos is the number of Dos records. In this experimental study, MMD shows the same behavior as in the credit cardholder dataset analysis. The comparison is not very clear from the training and testing results of the separate groups, so we compute the average values to analyze the classification efficiency. The averages show that MCLP and FLP2 achieve better catch rates in testing. MSD works well for Dos catching, and fuzzy model (3.4) (FLP2) does a little worse than that.
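The averaging step described above can be reproduced from the per-group testing catch rates in Table 5. The sketch below is only an arithmetic check. The dictionary name `table5_catch` is illustrative, and the values are the four group catch rates reported for each model in Table 5.

```python
from statistics import mean

# Testing catch rates for Groups 1-4, copied from Table 5.
table5_catch = {
    "MSD": [0.935, 0.975, 0.940, 0.931],
    "MMD": [0.595, 0.476, 0.424, 0.451],
    "MCLP": [0.977, 0.973, 0.943, 0.968],
    "FLP2": [0.961, 0.965, 0.972, 0.962],
}

averages = {model: mean(values) for model, values in table5_catch.items()}
# Models ordered from highest to lowest average testing catch rate.
ranking = sorted(averages, key=averages.get, reverse=True)

for model in ranking:
    print(model, averages[model])
```

The unrounded averages for MCLP and FLP2 differ only in the fourth decimal place, which is why both appear as 0.965 in the Average row of Table 5.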

    In this paper, we compared the proposed FLP classification method only with MMD, MSD, and MCLP on two real-life datasets. For reference, readers can find previous work comparing MCLP and FLP with soft criteria against decision trees and neural networks in Refs. 9, 10 and 17. Thus, we shall not elaborate on the comparison of this FLP method with other classification methods.

    5. Remarks

    In this paper, an FLP classification method with both soft criteria and constraints is proposed, based on the work of previous researchers. The relationship between this model and other related models was discussed. Two real-life datasets, one from a real US bank and the other from KDD 99, have been used to evaluate the accuracy rate of classification. The results show the feasibility of this method. Moreover, the general framework of FLP for classification has been described systematically and evaluated for the first time. However, some new research work remains to be considered and continued in this line of research. For example, how does the value of $d_i$ affect the classification result? How can ensemble analysis be used to improve the selection of the best classifier? We shall report the significant results of these ongoing projects in the near future.


    Table 4. Training results of 4000 records (absolute accuracy rates for Normal and Dos, and catch rate).

    Group    Model 1 (MSD)         Model 2 (MMD)         Model 5 (MCLP)        Fuzzy Model 4 (FLP2)
             Normal Dos    Catch   Normal Dos    Catch   Normal Dos    Catch   Normal Dos    Catch
    Group 1  0.989  0.997  0.993   0.508  0.919  0.713   0.998  0.993  0.995   0.989  0.997  0.993
    Group 2  0.991  0.995  0.993   0.269  0.972  0.621   0.992  0.998  0.995   0.99   0.996  0.993
    Group 3  0.987  0.998  0.992   0.232  0.992  0.612   0.993  0.998  0.995   0.987  0.997  0.992
    Group 4  0.989  0.997  0.993   0.263  0.982  0.622   0.994  0.997  0.995   0.990  1.000  0.995
    Average  0.989  0.997  0.993   0.318  0.966  0.642   0.994  0.997  0.995   0.989  0.998  0.993

    Table 5. Testing results of other records (absolute accuracy rates for Normal and Dos, and catch rate).

    Group    Model 1 (MSD)         Model 2 (MMD)         Model 5 (MCLP)        Fuzzy Model 4 (FLP2)
             Normal Dos    Catch   Normal Dos    Catch   Normal Dos    Catch   Normal Dos    Catch
    Group 1  0.918  0.990  0.935   0.499  0.916  0.595   0.975  0.983  0.977   0.953  0.988  0.961
    Group 2  0.971  0.986  0.975   0.323  0.980  0.476   0.968  0.989  0.973   0.959  0.988  0.965
    Group 3  0.925  0.989  0.940   0.254  0.988  0.424   0.930  0.987  0.943   0.966  0.988  0.972
    Group 4  0.914  0.989  0.931   0.290  0.982  0.451   0.963  0.985  0.968   0.954  0.988  0.962
    Average  0.932  0.989  0.945   0.342  0.967  0.487   0.959  0.986  0.965   0.958  0.988  0.965


    Acknowledgments

    The authors would like to thank Professor S. Cheng for his patience and encouragement on this work. They also express their thanks to Mr. G. Kou and P. Zhang for their constructive comments in preparing this paper. This research is partially supported by grants (70531040, 70472074 and 70921061) from the National NSFC, the third 211 construction funding, and the Program for Innovation Research in CUFE.

    References

    1. J. Han and M. Kamber, Data Mining: Concepts and Techniques (Academic Press, Beijing, 2001), p. 28.

    2. J. R. Quinlan, Induction of decision trees, Machine Learning 1 (1986) 81–106.

    3. R. A. Fisher, The use of multiple measurements in taxonomic problems, Annals of Eugenics 7 (1936) 179–188.

    4. V. Vapnik, Statistical Learning Theory (Wiley, New York, 1998).

    5. N. Freed and F. Glover, Simple but powerful goal programming models for discriminant problems, European Journal of Operational Research 7 (1981) 44–60.

    6. N. Freed and F. Glover, Evaluating alternative linear programming models to solve the two-group discriminant problem, Decision Sciences 17 (1986) 151–162.

    7. F. Glover, Improved linear programming models for discriminant analysis, Decision Sciences 21 (1990) 771–785.

    8. Y. Shi, M. Wise, M. Luo and Y. Lin, Data mining in credit card portfolio management: A multiple criteria decision making approach, in Multiple Criteria Decision Making in the New Millennium (Springer, Berlin, 2001), pp. 427–436.

    9. Y. Shi, Y. Peng, X. Xu and X. Tang, Data mining via multiple criteria linear programming: Applications in credit card portfolio management, International Journal of Information Technology and Decision Making 1 (2002) 145–166.

    10. G. Kou, X. Liu, Y. Peng, Y. Shi, M. Wise and W. Xu, Multiple criteria linear programming approach to data mining: Models, algorithm designs and software development, Optimization Methods and Software 18 (2003) 453–473.

    11. Y. Peng, G. Kou, Y. Shi and Z. Chen, A descriptive framework for the field of data mining and knowledge discovery, International Journal of Information Technology and Decision Making 7 (2008) 639–682.

    12. A. Li, Y. Shi and J. He, MCLP-based methods for improving Bad catching rate in credit cardholder behavior analysis, Applied Soft Computing 8 (2008) 1259–1265.

    13. Y. Peng, G. Kou, Y. Shi and Z. Chen, A multi-criteria convex quadratic programming model for credit data analysis, Decision Support Systems 44 (2008) 1016–1030.

    14. J. He, Y. Zhang, Y. Shi and G. Huang, Domain-driven classification based on multiple criteria and multiple constraint-level programming for intelligent credit scoring, IEEE Transactions on Knowledge and Data Engineering 22 (2010) 826–838.

    15. G. Kou, Y. Peng, Z. Chen and Y. Shi, Multiple criteria mathematical programming for multi-class classification and application in network intrusion detection, Information Sciences 179 (2009) 371–381.

    16. J. He, X. Liu, Y. Shi, W. Xu and N. Yan, Classifications of credit cardholder behavior by using fuzzy linear programming, International Journal of Information Technology and Decision Making 3 (2004) 633–650.

    17. http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html.

    18. Y. Shi, The research trend of information technology and decision making in 2009, International Journal of Information Technology and Decision Making 9 (2010) 1–8.


    19. Y. Shi, Multiple criteria optimization-based data mining methods and applications: A systematic survey, Knowledge and Information Systems 24 (2010) 369–391.

    20. Y. Shi, Current research trend: Information technology and decision making in 2008, International Journal of Information Technology and Decision Making 8 (2009) 1–5.

    21. P. L. Yu, Multiple Criteria Decision Making: Concepts, Techniques and Extensions (Plenum Press, New York, 1985).

    22. A. Charnes and W. W. Cooper, Management Models and Industrial Applications of Linear Programming (Wiley, New York, 1961).

    23. P. H. Lindsay and D. A. Norman, Human Information Processing: An Introduction to Psychology (Academic Press, New York, 1972).

    24. H. J. Zimmermann, Fuzzy programming and linear programming with several objective functions, Fuzzy Sets and Systems 1 (1978) 45–55.

    25. D. Dubois and H. Prade, Fuzzy Sets and Systems: Theory and Application (Academic Press, New York, 1980), pp. 242–248.

    26. S. Wang and C. Lee, A fuzzy real option valuation approach to capital budgeting under uncertainty environment, International Journal of Information Technology and Decision Making 5 (2010) 695–713.

    27. A. Nachev, S. Hill, C. Barry and B. Stoyanov, Fuzzy, distributed, instance counting, and default artmap neural networks for financial diagnosis, International Journal of Information Technology and Decision Making 9 (2010) 959–978.
