


Advances in Information Mining, ISSN: 0975–3265, Volume 2, Issue 1, 2010, pp-08-12

Copyright © 2010, Bioinfo Publications, Advances in Information Mining, ISSN: 0975–3265, Volume 2, Issue 1, 2010

A new fuzzy MADM approach used for finite selection problem

Muley A.A. and Bajaj V.H.* *Department of Statistics, Dr. B. A. M. University, Aurangabad (M.S.)-431004, India

[email protected], [email protected]

Abstract- This paper proposes a new approach to product configuration by applying the theory of Fuzzy Multiple Attribute Decision Making (FMADM), which focuses on the uncertain and fuzzy requirements that the customer submits to the product supplier. The proposed method can be used on e-commerce websites, where it makes it easy for a customer to obtain his preferred product according to the utility value computed over all attributes. The main concern of this paper is that the requirements the customer submits for the configuration of a television are vague. We further verify the validity and feasibility of the proposed method by comparing it with the Weighted Product Method (WPM). Finally, the television is taken as an example to demonstrate the proposed method.

Keywords- MADM, Fuzzy, Triangular fuzzy number, T.V., Uncertainty

Introduction
Real-world problems often require a decision maker (DM) to rank discrete alternatives or, at least, to select the best one. MADM theory was developed to help the DM solve such problems, and MADM has been one of the fastest growing areas during the last decades, driven by changes in the business sector; Hwang & Yoon [1], Turban [4]. We focus on MADM as used in a finite 'selection' or choice problem, where it plays a most important role in real-world applications. Nowadays the television is common in every person's life, so we take the selection of a television configuration as our application. Since common people generally purchase a 21" set for home use, we choose this most common size.
Mass customization, as a business strategy, aims at meeting diverse customer needs while maintaining near mass-production efficiency; it can realize both economies of scale and of scope for an enterprise, and has become a goal that companies pursue; Zhu & Jiang [7].
In order to reach this goal, companies are often forced to adopt a differentiation strategy, offering customers more choices of products to meet the growing individualization of demand by taking a more customer-centric role. Configuration approaches based on rules usually depend on experts' experience to establish. Configuration is one of the most important ways to realize rapid product customization. But in business, particularly over the internet, a customer normally develops in his mind some sort of ambiguity when given a choice of similar products. Our main concern is that the customers' requirements with respect to the configuration of a television are vague. The television is taken as an example to verify the validity and feasibility of the proposed method, which is compared with the WPM of Millar & Starr [2].

Framework of product configuration based on uncertain customer requirements
Each attribute has a finite set of possible values, and a variant is defined by its attributes and attribute values. Together, all attributes and attribute values describe the complete range of the product family. Products in the same product family vary according to their attributes and attribute values, so choosing a product can be considered a process of choosing its attributes and attribute values. Generally, however, it is difficult for a customer to express his requirements in a clear and unambiguous way, often because he is not thoroughly familiar with the product the supplier offers. The requirements are therefore often vague and fuzzy, and the preference weight varies across product attributes.
We describe the customers' vague and uncertain requirements in the form of fuzzy numbers, using the representation methods of fuzzy set theory; this design also solves the configuration problem in an uncertain environment. There are various attributes in different products, but some of them, such as color, shape, and so on, are not suitable to be represented as fuzzy numbers. These attributes are usually clear in the customer's mind, and the customer can select the attribute value by viewing a virtual product model in a browser environment. In a realistic configuration system, this is achieved by letting the customer directly select the attribute value he prefers. Using the theory of fuzzy MADM, the requirements the customer specifies for the corresponding product attributes can be regarded as an ideal product. First, each uncertain attribute value the customer specifies is represented as a triangular fuzzy number or interval fuzzy number, the most common way to handle uncertain, imprecise quantities. Moreover, since the attribute values of the alternative products offered to the customer are determinate, definite, and known, it is impossible to measure directly the distance or similarity degree between the ideal product the customer wants and the alternative products. Each definite attribute value is therefore converted into the form of a fuzzy number so that the distance between two fuzzy numbers can be computed. When choosing a product from a number of similar alternatives, a customer


normally develops some sort of ambiguity. The ambiguity is mainly due to two questions: first, which product to finally purchase, and second, on what basis the other products should be rejected. In order to answer these questions, the customer may like to classify the products into different preference levels, preferably through some numerical strength of preference; Mohanty & Bhasker [3]. We adopt the triangular fuzzy number to represent the vague requirements provided by the customer, as shown in Fig. (1).

$$\mu_{\tilde A}(x)=\begin{cases}0, & x<a\\[2pt] \dfrac{x-a}{b-a}, & a\le x\le b\\[2pt] \dfrac{c-x}{c-b}, & b\le x\le c\\[2pt] 0, & x>c\end{cases}\qquad (1)$$
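Equation (1) can be evaluated directly; a minimal sketch in Python (the function name is ours, and the sample numbers are the "Speakers" requirement (2, 5, 8) used later in the case study):

```python
def tri_membership(x, a, b, c):
    """Triangular fuzzy membership of Eq. (1): support [a, c], peak at b."""
    if x < a or x > c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)  # rising edge on [a, b]
    return (c - x) / (c - b)      # falling edge on [b, c]

print(tri_membership(5.0, 2, 5, 8))  # peak value -> 1.0
print(tri_membership(3.5, 2, 5, 8))  # halfway up the rising edge -> 0.5
```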

Fuzzy MADM methodology
As we know, when a customer chooses his preferred product from many candidates, he does so by comparing the attributes that describe product performance in different aspects, and by ranking the products according to his subjective preference. The customer's requirements are usually uncertain and vague because he cannot understand the product specifications comprehensively. On the other hand, the attribute values or specifications of the products offered by manufacturers are determinate and known. The model of fuzzy MADM was first introduced by Yang & Chou [6]. The general MADM model can be described as follows:

• Let $X=\{X_i \mid i=1,2,\ldots,m\}$ denote a finite discrete set of $m\ (\ge 2)$ possible alternatives (courses of action, candidates);
• Let $A=\{A_j \mid j=1,2,\ldots,n\}$ denote a finite set of $n\ (\ge 2)$ attributes according to which the desirability of an alternative is to be judged;
• Let $\omega=(\omega_1,\omega_2,\ldots,\omega_n)^T$ be the vector of weights, where $\sum_{j=1}^{n}\omega_j=1$, $\omega_j\ge 0$, $j=1,2,\ldots,n$, and $\omega_j$ denotes the weight of attribute $A_j$;
• Let $R=(r_{ij})_{m\times n}$ denote the $m\times n$ decision matrix, where $r_{ij}\ (\ge 0)$ is the performance rating of alternative $X_i$ with respect to attribute $A_j$.

Normally, there are two basic types of attributes in a MADM problem: the first is of 'cost' nature and the second of 'benefit' nature. Since the attributes are generally incommensurate, the decision matrix needs to be normalized so as to transform the various attribute values into comparable ones. A common method of normalization is given as

$$Z_{ij}=\frac{r_{ij}-r_j^{\min}}{r_j^{\max}-r_j^{\min}},\quad i=1,\ldots,m;\ j=1,\ldots,n,\qquad (2)$$
for a benefit attribute, and
$$Z_{ij}=\frac{r_j^{\max}-r_{ij}}{r_j^{\max}-r_j^{\min}},\quad i=1,\ldots,m;\ j=1,\ldots,n,\qquad (3)$$
for a cost attribute,

where $Z_{ij}$ is the normalized attribute value, and $r_j^{\max}$ and $r_j^{\min}$ are given by
$$r_j^{\max}=\max(r_{1j},r_{2j},\ldots,r_{mj}),\quad j=1,\ldots,n;\qquad (4)$$
$$r_j^{\min}=\min(r_{1j},r_{2j},\ldots,r_{mj}),\quad j=1,\ldots,n.\qquad (5)$$
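Equations (2)-(5) amount to column-wise min-max scaling of the decision matrix; a small sketch (the function name and the two-alternative example data are ours, for illustration only):

```python
def normalize(R, is_benefit):
    """Min-max normalize an m x n crisp decision matrix per Eqs. (2)-(5).

    R          -- list of m rows of n crisp ratings
    is_benefit -- list of n booleans: True for a benefit attribute, False for cost
    """
    n = len(R[0])
    cols = list(zip(*R))
    r_max = [max(c) for c in cols]  # Eq. (4)
    r_min = [min(c) for c in cols]  # Eq. (5)
    Z = []
    for row in R:
        Z.append([
            (row[j] - r_min[j]) / (r_max[j] - r_min[j]) if is_benefit[j]  # Eq. (2)
            else (r_max[j] - row[j]) / (r_max[j] - r_min[j])              # Eq. (3)
            for j in range(n)
        ])
    return Z

# Two alternatives: one benefit attribute (channels) and one cost attribute (price)
print(normalize([[100, 9000], [200, 12000]], [True, False]))
# [[0.0, 1.0], [1.0, 0.0]]
```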

Let $Z=(Z_{ij})_{m\times n}$ be the normalized decision matrix. According to the SAW method, the overall weighted assessment value of alternative $X_i$ is
$$d_i=\sum_{j=1}^{n} Z_{ij}\,\omega_j,\quad i=1,\ldots,m,\qquad (6)$$
where $d_i$ is a linear function of the weight variables, and the greater the value of $d_i$, the better the alternative $X_i$. The aim of MADM is to rank the alternatives or to determine the best alternative, the one with the highest degree of desirability with respect to all relevant attributes. So the best alternative is the one with the greatest overall weighted assessment value. The classic MADM techniques assume


all $r_{ij}$ values are crisp numbers. In practical MADM problems, the $r_{ij}$ values can be crisp and/or fuzzy data. Fuzzy MADM methods have been developed because of the lack of precision in assessing the performance ratings of alternatives with respect to an attribute, in which case the $r_{ij}$ values are often linguistic terms or fuzzy numbers. The configuration approach based on fuzzy MADM is now introduced in detail as an algorithm comprising the following steps:
Step 1: Representation of fuzzy requirements. When choosing a product from a number of similar alternatives, a customer normally develops in his mind some sort of ambiguity; his vague requirements are represented as triangular fuzzy numbers, as in Eq. (1).
Step 2: Similarity measure. In Step 1, the customer's requirements have been described as triangular fuzzy numbers with respect to the different product attributes. In this step, we take the requirement vector as the ideal product the customer really wants and measure its similarity degree to the existing product vectors, whose specification values are known and determinate. As we know, fuzzy numbers cannot be compared with crisp ones directly; the crisp numbers first have to be transformed into the form of fuzzy numbers. For example, for a crisp number b, the form of its triangular fuzzy number can be written as follows:

$$\tilde b=(b^L,b^M,b^U),\qquad (7)$$
where $b^L=b^M=b^U=b$. The similarity measure between two triangular fuzzy numbers can be calculated with Eq. (8); Xu [5]:

$$s(\tilde a,\tilde b)=\frac{a^L b^L+a^M b^M+a^U b^U}{\max\!\big((a^L)^2+(a^M)^2+(a^U)^2,\ (b^L)^2+(b^M)^2+(b^U)^2\big)},\qquad (8)$$

where the two triangular fuzzy numbers are $\tilde a=(a^L,a^M,a^U)$ and $\tilde b=(b^L,b^M,b^U)$, respectively. In a realistic configuration system, some attribute values are simply selected by the customer from the given alternate options. The similarity measure for this type of attribute is defined as follows:

$$s(a',b')=\begin{cases}1, & a'=b'\\ 0, & a'\ne b'\end{cases}\qquad (9)$$
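Equations (7)-(9) can be sketched together as follows; the numeric check reproduces one entry of the case study's decision matrix, the ideal "Speakers" requirement (2, 5, 8) against product P1's crisp value 6 (function names are ours):

```python
def to_tfn(b):
    """Eq. (7): embed a crisp number as a degenerate triangular fuzzy number."""
    return (b, b, b)

def similarity(a, b):
    """Eq. (8): similarity of triangular fuzzy numbers a=(aL,aM,aU), b=(bL,bM,bU)."""
    num = sum(x * y for x, y in zip(a, b))
    den = max(sum(x * x for x in a), sum(y * y for y in b))
    return num / den

def crisp_similarity(a, b):
    """Eq. (9): exact-match similarity for non-numeric attributes such as color."""
    return 1.0 if a == b else 0.0

s = similarity((2, 5, 8), to_tfn(6))
print(round(s, 4))  # 0.8333 -- the P1/Speakers entry of the decision matrix
```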

Step 3: Construction of the Decision Matrix (DM). The computed similarity measures between the alternate products and the ideal product can be expressed concisely in a matrix, called the decision matrix in MADM problems, in which columns indicate product attributes and rows alternate products. Thus, an element $S_{ij}$ in Eq. (10) denotes the similarity degree of the $i$th product to the ideal product with respect to the $j$th attribute.

$$DM=(S_{ij})=\begin{pmatrix}S_{11} & S_{12} & \cdots & S_{1n}\\ S_{21} & S_{22} & \cdots & S_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ S_{m1} & S_{m2} & \cdots & S_{mn}\end{pmatrix}\qquad (10)$$

Step 4: Normalization. In order to eliminate the differences of dimension among attributes, normalization is needed to transform the various attribute dimensions into non-dimensional ones. Here, we adopt Eqs. (11) and (12) to normalize the fuzzy numbers.

$$\tilde r_i=\left(\frac{a_i}{c^{\max}},\ \frac{b_i}{b^{\max}},\ \frac{c_i}{a^{\max}}\right)\wedge 1\quad\text{for a benefit attribute},\qquad (11)$$
$$\tilde r_i=\left(\frac{c^{\min}}{c_i},\ \frac{c^{\min}}{b_i},\ \frac{c^{\min}}{a_i}\right)\wedge 1\quad\text{for a cost attribute},\qquad (12)$$
where $(\cdot)^{\max}=\max_i\{(\cdot)_i\}$ and $(\cdot)^{\min}=\min_i\{(\cdot)_i\}$.

Step 5: Ranking of the alternate products. The element $S_{ij}$ in the decision matrix reflects the closeness of the $i$th alternate product to the ideal product with respect to the $j$th attribute. In this step we use the SAW method, which is widely used in MADM, to calculate a utility value over all attributes, from which the ranking order of the alternate products is obtained; we consider the product with the highest utility value the closest to what the customer requires. The utility value of the $i$th alternate product is calculated with Eq. (13):

$$U_i=\sum_{j=1}^{n} x_{ij}\,\omega_j,\quad i=1,2,\ldots,m,\qquad (13)$$

and the maximum utility value can be written as Eq. (14):
$$U_{\max}=\max_i \sum_{j=1}^{n} x_{ij}\,\omega_j,\quad i=1,2,\ldots,m.\qquad (14)$$

Here, we also compare with the WPM of Millar & Starr [2] and check the feasibility of the customer's requirements:

$$U_i=\prod_{j=1}^{n} x_{ij}^{\ \omega_j},\quad i=1,2,\ldots,m.\qquad (15)$$
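Equations (13) and (15) differ only in how the weighted attribute scores are aggregated (sum versus product); a minimal sketch (function names and the two-attribute data are illustrative, not from the case study):

```python
def saw_utility(x, w):
    """Eq. (13): simple additive weighting of a row of similarity scores."""
    return sum(xi * wi for xi, wi in zip(x, w))

def wpm_utility(x, w):
    """Eq. (15): weighted product aggregation of the same row."""
    u = 1.0
    for xi, wi in zip(x, w):
        u *= xi ** wi
    return u

x, w = [0.5, 1.0], [0.4, 0.6]
print(round(saw_utility(x, w), 4))  # 0.8
print(round(wpm_utility(x, w), 4))  # 0.7579  (= 0.5**0.4)
```

Note that WPM penalizes a low score on any attribute more harshly than SAW, since the scores multiply.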

Case study
In this section, we take the television as an example to illustrate the method described above. Table 1 shows the televisions that could


be used to configure products for different customers with respect to different attributes; the corresponding attributes are described as follows:

Table 1: Configuration of Television

Sr. No.  Speakers  Watt  Channels  Price
P1       6         1800  200       10300
P2       2         110   100       9790
P3       5         500   200       11990
P4       4         1200  200       12400
P5       2         200   200       9400
P6       2         400   100       11490
P7       2         250   200       9300
P8       4         500   200       9900

Suppose the ideal product the customer wants with respect to the above attributes, together with the corresponding preference weights, is as shown in Table 2.

Table 2: The ideal product and attribute weights

Attributes  Ideal   Lower  Upper   Weight
Speakers    5       2      8       0.25
Watt        1000    200    2400    0.20
Channels    150     100    250     0.25
Price       10,000  9,000  12,000  0.30

The vector of the ideal product can be represented in the following triangular-fuzzy-number form:
$$\tilde C=[(2,5,8),\ (200,1000,2400),\ (100,150,250),\ (9000,10000,12000)],$$
and the corresponding vector of attribute weights can be written as
$$\omega=(0.25,0.20,0.25,0.30).$$

The decision matrix, which shows the similarity degree with respect to each attribute between the ideal television the customer desires and the candidate ones, is computed using Eqs. (8)-(12) and shown in Table 3.

Table 3: Decision Matrix

Sr. No.  U1      U2      U3      U4
P1       0.8333  0.6666  0.8333  0.9824
P2       0.3225  0.0582  0.5263  0.7380
P3       0.8064  0.2647  0.8333  0.8618
P4       0.6451  0.6329  0.8333  0.8333
P5       0.3225  0.1058  0.8333  0.8966
P6       0.3225  0.2117  0.5263  0.8970
P7       0.3225  0.1323  0.8333  0.8870
P8       0.6451  0.2647  0.8333  0.9443

The utility values of all candidate products with respect to all attributes are calculated by Eq. (13); the final results are given below.

Table 4: Utility value of each product configuration by SAW

P1      P2      P3      P4      P5      P6      P7      P8
0.8447  0.5039  0.7214  0.7462  0.5791  0.5243  0.5815  0.7058

Table 4 presents the final utility values, with which the customer can rank the candidate products according to his preferences over the attributes; the order showing the closeness degree to the customer's requirements is
$$P_1 > P_4 > P_3 > P_8 > P_7 > P_5 > P_6 > P_2.$$

Here, we compare the above method with the WP method and check the feasibility of the customer's requirements using Eq. (15); we get

Table 5: Utility value of each product configuration by WPM

P1      P2      P3      P4      P5      P6      P7      P8
0.8373  0.3560  0.6637  0.7398  0.4446  0.4558  0.4635  0.6452

$$P_1 > P_4 > P_3 > P_8 > P_7 > P_6 > P_5 > P_2.$$
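The utility values in Tables 4 and 5 follow mechanically from the decision matrix in Table 3 and the weight vector; a sketch that recomputes the P1 utilities and confirms the optimal choice (recomputed values may differ from the printed tables in the last digits):

```python
import math

# Decision matrix from Table 3 (rows P1..P8) and the weight vector from Table 2
S = [
    [0.8333, 0.6666, 0.8333, 0.9824],  # P1
    [0.3225, 0.0582, 0.5263, 0.7380],  # P2
    [0.8064, 0.2647, 0.8333, 0.8618],  # P3
    [0.6451, 0.6329, 0.8333, 0.8333],  # P4
    [0.3225, 0.1058, 0.8333, 0.8966],  # P5
    [0.3225, 0.2117, 0.5263, 0.8970],  # P6
    [0.3225, 0.1323, 0.8333, 0.8870],  # P7
    [0.6451, 0.2647, 0.8333, 0.9443],  # P8
]
w = [0.25, 0.20, 0.25, 0.30]

saw = [sum(s * wj for s, wj in zip(row, w)) for row in S]         # Eq. (13)
wpm = [math.prod(s ** wj for s, wj in zip(row, w)) for row in S]  # Eq. (15)

best = max(range(len(S)), key=lambda i: saw[i])  # Eq. (14): argmax of utility
print(round(saw[0], 4), round(wpm[0], 4))  # P1: 0.8447 0.8373
print(f"P{best + 1}")  # P1 is optimal under both aggregations
```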

Due to the uncertainty of the customers' requirements, and the fact that different algorithms may yield different results, a realistic configuration system should present the several products with the highest similarity degree to the customer's requirements, so as to satisfy those requirements to the greatest degree.

Conclusion
This paper proposes an approach to realize product-level configuration according to fuzzy and uncertain customer requirements using the theory of fuzzy MADM. The television is taken as an example to demonstrate the feasibility of the proposed method for handling uncertain customer requirements. When the results of SAW and WPM are compared, we get the same preferences for our problem, and the optimal selection of television is P1.

References
[1] Hwang C.L. and Yoon K.P. (1981) Springer, Berlin.
[2] Millar D.W. and Starr M.K. (1969) Prentice Hall, Englewood Cliffs, New Jersey.
[3] Mohanty B.K. and Bhasker B. (2005) Decision Support Systems, 38, 611-619.
[4] Turban E. (1988) Macmillan, New York.


[5] Xu Z. S. (2002) Systems Engineering and Electronics, 124, 9–12.

[6] Yang T. & Chou P. (2005) Mathematics and Computers in Simulation, 68, 9–21.

[7] Zhu B. & Jiang P. Y. (2005) The International Journal of Product Development, 2, 155–169.


Advances in Information Mining, ISSN: 0975–3265, Volume 2, Issue 1, 2010, pp-13-17

Copyright © 2010, Bioinfo Publications, Advances in Information Mining, ISSN: 0975–3265, Volume 2, Issue 1, 2010

Ant based rule mining with parallel fuzzy cluster

Sankar K.¹ and Krishnamoorthy K.²

1Department of Master of Computer Applications, KSR College of Engineering, Tiruchengode, [email protected]

2Department of Computer Science and Engineering, SONA College of Technology, Salem, [email protected]

Abstract- Ant-based techniques, in computer science, are those designed by taking biological inspiration from the behavior of these social insects. Data clustering techniques are classification algorithms with a wide range of applications, from biology to image processing and data presentation. Since real-life ants do perform clustering and sorting of objects among their many activities, we expect that a study of ant colonies can provide new insights for clustering techniques. The aim of clustering is to separate a set of data points into self-similar groups, such that the points that belong to the same group are more similar to one another than to points belonging to different groups; each group is called a cluster. Data may be clustered using an iterative version of the Fuzzy C Means (FCM) algorithm, but the drawback of the FCM algorithm is that it is very sensitive to cluster center initialization, because the search is based on a hill-climbing heuristic. The ant-based algorithm provides a relevant partition of the data without any knowledge of the initial cluster centers. In the past, researchers have used ant-based algorithms based on stochastic principles coupled with the k-means algorithm. The proposed system in this work uses the Fuzzy C Means algorithm as the deterministic algorithm for ant optimization. The proposed model is used after reformulation, and the partitions obtained from the ant-based algorithm are better optimized than those from randomly initialized Hard C Means. The proposed technique executes the fuzzy ants in parallel for multiple clusters, which enhances the speed and accuracy of cluster formation for the given problem.

1. INTRODUCTION
Research in using the social insect metaphor for solving problems is still in its infancy. The systems developed using swarm intelligence principles emphasize distributiveness, direct or indirect interactions among relatively simple agents, flexibility, and robustness [4]. Successful applications have been developed in the communication networks, robotics, and combinatorial optimization fields.

1.1 ANT COLONY OPTIMIZATION
Many species of ants cluster dead bodies to form cemeteries and sort their larvae into several piles [4]. This behavior can be simulated using a simple model in which agents move randomly in space and pick up and deposit items on the basis of local information. The clustering and sorting behavior of ants can be used as a metaphor for designing new algorithms for data analysis and graph partitioning. The objects can be considered items to be sorted; objects placed next to each other have similar attributes. This sorting takes place in two-dimensional space, offering a low-dimensional representation of the objects. Most swarm clustering work has followed the above model. In this work, there is implicit communication among the ants making up a partition, and the ants also have memory. However, they do not pick up and put down objects; rather, they place summary objects in locations and remember the locations that are evaluated as having good objective function values. The objects represent single dimensions of the multidimensional cluster centroids which make up a data partition.
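The random pick-up/deposit model described above is usually formalized with threshold probabilities; a sketch of the classic Deneubourg-style rule (the constants k1 and k2 are tunable and are chosen here purely for illustration):

```python
def pick_probability(f, k1=0.1):
    """Probability that an unladen ant picks up an item whose local
    neighborhood similarity is f (low f -> isolated item -> likely pick-up)."""
    return (k1 / (k1 + f)) ** 2

def drop_probability(f, k2=0.15):
    """Probability that a laden ant deposits its item
    (high f -> surrounded by similar items -> likely drop)."""
    return (f / (k2 + f)) ** 2

# An isolated item is picked up far more readily than one in a dense, similar pile,
# and items are dropped preferentially where similar items already sit.
print(pick_probability(0.01) > pick_probability(0.9))  # True
print(drop_probability(0.9) > drop_probability(0.01))  # True
```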

1.2 CLUSTERING
The aim of cluster analysis is to find groupings or structures within unlabeled data [5]. The partitions found should result in similar data being assigned to the same cluster and dissimilar data to different clusters. In most cases the data are real-valued vectors, and the Euclidean distance is one measure of similarity for such data sets. Clustering techniques can be broadly classified into a number of categories [6]. Hard C Means (HCM) is one of the simplest unsupervised clustering algorithms for a fixed number of clusters. The basic idea of the algorithm is to initially guess the centroids of the clusters and then refine them. Cluster initialization is crucial because the algorithm is very sensitive to it; a good choice for the initial cluster centers is to place them as far away from each other as possible. The nearest-neighbor rule is then used to assign each example to a cluster, and new cluster centroids are calculated from the clusters obtained. These steps are repeated until there is no significant change in the centroids. Hard clustering algorithms assign each example to one and only one cluster. This model is inappropriate for real data sets in which the boundaries between clusters may not be well defined. Fuzzy algorithms can partially assign data to multiple clusters, with the strength of membership in a cluster depending on the closeness of the example to the cluster center. The Fuzzy C Means (FCM) algorithm allows an example to be a partial member of more than one cluster. The FCM algorithm is based on


minimizing an objective function. The drawback of clustering algorithms like FCM and HCM, which are based on a hill-climbing heuristic, is that prior knowledge of the number of clusters in the data is required, and they have significant sensitivity to cluster center initialization. This work moves in the direction of constructing C fuzzy means clustering with ant colony optimization (parallel ant agents) to evolve efficient rule mining techniques. The proposal introduces the problem of combining multiple partitionings of a set of objects without accessing the original features. The system first identifies several application scenarios for the resultant 'knowledge reuse' framework, which we call cluster ensembles. The cluster ensemble problem is then formalized as a combinatorial optimization problem in terms of shared mutual information in building rule mining techniques. In addition to a direct maximization approach, the system proposes three effective and efficient techniques for obtaining high-quality combiners.

2. RELATED WORKS
Andrea Baraldi and Palma Blonda [1] propose an equivalence between the concepts of fuzzy clustering and soft competitive learning in clustering algorithms on the basis of the existing literature; moreover, a set of functional attributes is selected for use as dictionary entries in the comparison of clustering algorithms. Alfred Ultsch observes that systems for clustering with collectives of autonomous agents follow either the ant approach of picking up and dropping objects or the DataBot approach of identifying the data points with artificial-life creatures; in DataBot systems the clustering behaviour is controlled by movement programs. Julia Handl and Bernd Meyer note that sorting and clustering methods inspired by the behavior of real ants are among the earliest methods in ant-based meta-heuristics.
Their system revisits these methods in the context of a concrete application and introduces some modifications that yield significant improvements in terms of both quality and efficiency; in particular, they re-examine the methods' capability to simultaneously perform a combination of clustering and multi-dimensional scaling. For J. Handl, J. Knowles and M. Dorigo, ant-based clustering and sorting is a nature-inspired heuristic for general clustering tasks. It has been applied variously, from problems arising in commerce, to circuit design, to text mining, all with some promise. However, although early results were broadly encouraging, there has been very limited analytical evaluation of the algorithm. Alexander Strehl and Joydeep Ghosh introduce the problem of combining multiple partitionings of a set of objects into a single consolidated clustering without accessing the features or algorithms that determined these partitionings. The system first

identifies several application scenarios for the resultant `knowledge reuse' framework that we call cluster ensembles. The cluster ensemble problem is then formalized as a combinatorial optimization problem in terms of shared mutual information. In addition to a direct maximization approach, the system proposes three effective and efficient techniques for obtaining high-quality combiners (consensus functions). The first combiner induces a similarity measure from the partitionings and then reclusters the objects. The second combiner is based on hypergraph partitioning. The third one collapses groups of clusters into meta-clusters, which then compete for each object to determine the combined clustering. Due to the low computational costs of the techniques, it is quite feasible to use a supra-consensus function that evaluates all three approaches against the objective function and picks the best solution for a given situation. The system evaluates the effectiveness of cluster ensembles in three qualitatively different application scenarios: (i) where the original clusters were formed based on non-identical sets of features, (ii) where the original clustering algorithms worked on non-identical sets of objects, and (iii) where a common data set is used and the main purpose of combining multiple clusterings is to improve the quality and robustness of the solution. Promising results are obtained in all three situations for synthetic as well as real data sets. Nicolas Labroche, Nicolas Monmarché and Gilles Venturini introduce a method to solve the unsupervised clustering problem, based on a modeling of the chemical recognition system of ants. This system allows ants to discriminate between nestmates and intruders, and thus to create homogeneous groups of individuals sharing a similar odor by continuously exchanging chemical cues.
This phenomenon, known as "colonial closure", inspired the development of a new clustering algorithm, which was then compared to a well-known method such as the K-means method. The literature on fuzzy clustering reviewed above highlights the following points. The first work handles the functional attributes with theoretical analysis. The second and third deal with cluster-object movement issues on synthetic data sets. The fourth and fifth deal with a heuristic ant optimization model with trial repetition. The sixth and seventh authors utilized unsupervised clustering with class-tree structuring. The final one uses c-fuzzy-means clustering in a sequential way. This motivates us to proceed with our proposal of ACO with c-fuzzy means. Based on the sequential C-Fuzzy ACO clustering problem, we derived a parallel fuzzy ant clustering model to improve the attribute accuracy rate and obtain faster execution on the proposed problem domain.
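The c-fuzzy-means (FCM) alternating updates that the proposal builds on can be sketched in a few lines. This is a generic 1-D illustration; the data, the two clusters, and the fuzzifier m = 2 are arbitrary demonstration choices, not values from the paper:

```python
def fcm_1d(points, m=2.0, iters=40):
    """Minimal fuzzy 2-means on 1-D data: alternate the membership update
    u_ik = 1 / sum_j (d_ik / d_jk)^(2/(m-1)) and the weighted center update."""
    centers = [min(points), max(points)]          # deterministic initialization
    for _ in range(iters):
        u = []
        for x in points:
            d = [abs(x - v) + 1e-12 for v in centers]
            u.append([1.0 / sum((d[i] / d[j]) ** (2.0 / (m - 1.0)) for j in range(2))
                      for i in range(2)])
        # v_i = sum_k u_ik^m x_k / sum_k u_ik^m
        centers = [sum(u[k][i] ** m * points[k] for k in range(len(points))) /
                   sum(u[k][i] ** m for k in range(len(points)))
                   for i in range(2)]
    return sorted(centers)

print(fcm_1d([0.0, 0.1, 0.2, 0.9, 1.0, 1.1]))
```

The initialization sensitivity discussed above enters through the `centers` starting values; the ant-based proposal replaces this single start with many cooperating search agents.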

Page 8: Advances in Information Mining 2010 Issue1

Sankar K and Krishnamoorthy K

Copyright © 2010, Bioinfo Publications, Advances in Information Mining, ISSN: 0975–3265, Volume 2, Issue 1, 2010

15

3. FUZZY ANT CLUSTERING
Ant-based clustering algorithms are usually inspired by the way ants cluster dead nest mates into piles, without negotiating about where to gather the corpses. These algorithms are characterized by the lack of centralized control or a priori information, which makes them very appropriate candidates for the task at hand. Since the fuzzy ants algorithm needs neither an initial partitioning of the data nor a predefined number of clusters, it is very well suited for the Web People Search task, where the system does not know in advance how many clusters (or individuals) correspond to a particular document set (or person name). A detailed description of the algorithm is given by Schockaert et al. It involves a pass in which ants can only pick up one item, as well as a pass during which ants can only pick up an entire heap. A fuzzy ant-based clustering algorithm was introduced in which the ants are endowed with a level of intelligence in the form of IF/THEN rules that allow them to do approximate reasoning. As a result, at any time the ants can decide for themselves whether to pick up a single item or an entire heap, which makes a separation of the clustering into different passes superfluous. The system experimented with different numbers of ant runs and fixed the number of runs to 800,000 for the experiments. In addition, the system also evaluated different values for the parameters that determine the probability that a document or heap of documents is picked up or dropped by the ants, and kept the following values for the experiments:

Table 1: Parameter settings for fuzzy clustering

    n1   probability of dropping one item        1
    m1   probability of picking up one item      1
    n2   probability of dropping an entire heap  5
    m2   probability of picking up a heap        5

3.1 Hierarchical Clustering
The second clustering algorithm the system applies is an agglomerative hierarchical approach. This clustering algorithm builds a hierarchy of clusterings that can be represented as a tree (called a dendrogram), which has singleton clusters (individual documents) as leaves and a single cluster containing all documents as root. An agglomerative clustering algorithm builds this tree from the leaves to the top, in each step merging the two clusters with the largest similarity. Cutting the tree at a given height gives a clustering with a selected number of clusters. The system has opted to cut the tree at different similarity thresholds between the document pairs, with intervals of 0.1 (e.g. for threshold 0.2 all document pairs with similarities
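The threshold cut described above can be sketched as follows. This is a simplified single-linkage variant over a hypothetical similarity matrix; Agnes itself merges by a linkage criterion such as average linkage, so treat this only as an illustration of cutting at similarity 0.2:

```python
def agglomerate(sims, n, threshold):
    """Repeatedly merge the two clusters whose best inter-cluster similarity
    exceeds `threshold` (single linkage), mimicking a dendrogram cut."""
    clusters = [{i} for i in range(n)]
    while True:
        best, pair = threshold, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                s = max(sims[i][j] for i in clusters[a] for j in clusters[b])
                if s > best:
                    best, pair = s, (a, b)
        if pair is None:                      # no pair above the threshold: stop
            return [sorted(c) for c in clusters]
        a, b = pair
        clusters[a] |= clusters[b]
        del clusters[b]

# documents 0 and 1 are near-duplicates; document 2 is unrelated
sims = [[1.0, 0.8, 0.1],
        [0.8, 1.0, 0.15],
        [0.1, 0.15, 1.0]]
print(agglomerate(sims, 3, 0.2))  # → [[0, 1], [2]]
```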

above 0.2 are clustered together). For the experiments, the system has used an implementation of Agnes (Agglomerative Nesting) that is fully described elsewhere.

3.2 Fuzzy Ant Parallel System
Clustering approaches are typically quite sensitive to initialization. In this thesis, the system examines a swarm-inspired approach to building clusters which allows for a more global search for the best partition than iterative optimization approaches. The approach is described with cooperating ants as its basis. The ants participate in placing cluster centroids in feature space. They produce a partition which can be utilized as is or further optimized; the further optimization can be done via a focused iterative optimization algorithm. Experiments were done both with deterministic algorithms, which assign each example to one and only one cluster, and with fuzzy algorithms, which partially assign examples to multiple clusters. The algorithms are from the c-means family. These algorithms were integrated with swarm intelligence concepts to produce clustering approaches that are less sensitive to initialization.

4. EXPERIMENTAL SIMULATION ON ANT BASED PARALLEL CLUSTER
The system implementation of the fuzzy ant based parallel clustering algorithm for rule mining used three real data sets obtained from the UCI repository: the Iris Data Set, the Wine Recognition Data Set, and the Glass Identification Data Set. The simulation, conducted in MATLAB, normalizes the feature values between 0 and 1. The normalization is linear: the minimum value of a dataset-specific feature is mapped to 0 and the maximum value of the feature is mapped to 1. The ants are initialized with random initial values and random directions. There are two directions, positive and negative. The positive direction means the ant is moving in the feature space from 0 to 1; the negative direction means the ant is moving from 1 to 0. The initial memory is cleared. The ants are initially assigned to a particular feature within a particular cluster of a particular partition; the ants never change the feature, cluster or partition assigned to them.

Repeat
  For one epoch  /* one epoch is n iterations of random ant movement */
    For all ants
      With a probability Prest the ant rests for this epoch
      If the ant is not resting, then with a probability Pcontinue the ant continues in the same direction; else it changes direction
      With a value between Dmin and Dmax the ant moves in the selected direction
    The new Rm value is calculated using the new cluster centers calculated by recording the


position of the ants, which are known to move the features of clusters for a given partition.
    If the partition is better than any of the old partitions in memory, then the worst partition is removed from the memory and this new partition is copied to the memories of the ants making up the partition
    If the partition is not better than any of the old partitions in memory, then
      With a probability PContinueCurrent the ant continues with the current partition
      Else, with a probability 0.6 the ant moves to the best known partition, with a probability 0.2 to the second best known partition, with a probability 0.1 to the third best, with a probability 0.075 to the fourth best, and with a probability 0.025 to the worst known partition
Until the stopping criterion is met
The stopping criterion is the number of epochs.
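A heavily simplified sketch of the epoch loop above, assuming 1-D data in [0, 1], one candidate partition (a pair of centers) per ant, and a hard 2-means cost standing in for Rm; the five-partition memory and the 0.6/0.2/0.1/0.075/0.025 jump rule are omitted for brevity, so this is an illustration, not the paper's algorithm:

```python
import random

def ant_search(data, n_ants=10, epochs=30, p_rest=0.01, p_cont=0.75,
               d_min=0.001, d_max=0.01, seed=1):
    """Each ant random-walks a candidate pair of 1-D cluster centers through
    [0, 1]; the best partition found (lowest hard 2-means cost) is remembered."""
    rng = random.Random(seed)

    def cost(centers):
        # hard c-means objective: squared distance of each point to its nearest center
        return sum(min((x - v) ** 2 for v in centers) for x in data)

    ants = [[rng.random(), rng.random()] for _ in range(n_ants)]
    dirs = [[rng.choice((-1, 1)), rng.choice((-1, 1))] for _ in range(n_ants)]
    best = list(min(ants, key=cost))
    for _ in range(epochs):
        for ant, d in zip(ants, dirs):
            if rng.random() < p_rest:          # the ant rests this epoch
                continue
            for k in range(2):
                if rng.random() >= p_cont:     # occasionally reverse direction
                    d[k] = -d[k]
                ant[k] = min(1.0, max(0.0, ant[k] + d[k] * rng.uniform(d_min, d_max)))
            if cost(ant) < cost(best):         # keep the best partition seen so far
                best = list(ant)
    return sorted(best), cost(best)

centers, rm = ant_search([0.05, 0.1, 0.15, 0.85, 0.9, 0.95])
print(centers, rm)
```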

Table 2: Parameter values

    Number of ants        30 * c * #features
    Memory per ant        5
    Iterations per epoch  50
    Epochs                1000
    Prest                 0.01
    Pcontinue             0.75
    PContinueCurrent      0.20
    Dmin                  0.001
    Dmax                  0.01

Note that the multiplier 30 in the number of ants allows for 30 partitions. Three data sets — the Glass, Wine, and Iris data sets — were evaluated from a mixture of five Gaussians. The probability distribution across all the data sets is the same, but the means and standard deviations of the Gaussians are different. Of the three data sets, two had 500 instances each and the remaining one had 1000 instances. Each instance had two attributes. To visualize the Iris data set, the Principal Component Analysis (PCA) algorithm was used to project the data points into 2D and 3D space.

5. RESULTS AND DISCUSSIONS
The ants move the cluster centers in feature space to find a good partition for the data. There are fewer controlling parameters than in previous ant-based clustering algorithms, which typically group the objects on a two-dimensional grid. Results from 18 data sets show the superiority of the algorithm over the randomly initialized FCM and HCM algorithms. For comparison purposes, Table 3 shows the frequency of occurrence of different extrema for the ant-initialized and the randomly initialized FCM and HCM algorithms.

Table 3: Frequency of different extrema from parallel fuzzy ant clustering, for the Glass (2 class), Iris and Wine data sets

    Data Set         Extrema   Frequency HCM,   Frequency HCM,   Sequential C-Fuzzy   Parallel C-Fuzzy
                               ant init.        random init.     ACO (Existing)       ACO (Proposed)
    Glass (2 class)  34.1320   19               3                31                   27.8
                     34.1343   11               19               32.12                28.5
                     34.1372   19               15               32.36                29.1
                     34.1658   1                5                32.89                29.82
    Iris             6.9981    50               23               5.3938               4.23
                     7.1386    0                14               5.8389               4.3658
                     10.9083   0                5                8.3746               5.3256
                     12.1437   0                8                10.6434              8.2356
    Wine             9.3645    20               2                5.2369               3.2567
                     11.3748   15               20               8.2356               5.236
                     13.8483   12               18               10.2356              8.3656

The ant-initialized parallel ant fuzzy algorithm always finds better extrema for the Iris data set, and for the Wine data set the ant-initialized algorithm finds the better extrema 49 out of 50 times. The ant-initialized HCM algorithm always finds better extrema for the Iris data set, and for the Glass (2 class) data set a majority of the time. For the different Iris extrema, the ant-initialized parallel algorithm finds a better extremum most of the time. When the ACO approach was used to optimize the clustering criteria, the ant approach for parallel c-means found better extrema 64% of the time for the Iris data set. The ant-initialized parallel C-Fuzzy ACO finds better extrema all the time. The number of ants is an important parameter of the algorithm. This number only increases when more partitions are searched for at the same time, as ants are (currently) added in increments (Graph 1 and Graph 2). The quality of the final partition improves with an increase in the number of ants, but the improvement comes at the expense of increased execution time.

Graph 1: Number of Iterations vs. Time

[Graph 1 plots time against the number of iterations for the Ant Fuzzy Parallel and Ant Fuzzy Sequential algorithms.]


Graph 2: Time Vs Path Length

[Graph 2 plots path length against time for the Ant Fuzzy Sequential and Ant Fuzzy Parallel algorithms.]

7. CONCLUSION
The system discussed a swarm-inspired optimization algorithm to partition data, i.e., to create clusters, and described it using the ant paradigm. The approach is to have a coordinated group of ants position cluster centroids in feature space. The algorithm was evaluated with a soft clustering formulation utilizing the fuzzy c-means objective function and a hard clustering formulation utilizing the hard c-means objective function. The presented clustering approach seems clearly advantageous for data sets where many local extrema are expected. The cluster discovery aspect of the algorithm provides the advantage of obtaining a partition at the same time as it indicates the number of clusters. That partition can be further optimized or accepted as is. This is in contrast to some other schemes, which require partitions to be created with different numbers of clusters and then evaluated. The result is generally a better optimized partition (objective function value) than obtained with FCM/HCM. A large number of random initializations is needed to be competitive, in terms of skipping some of the poor local extrema, with what was achieved by the ant-based algorithm. It has provided better final partitions on average than a previously introduced evolutionary computation clustering approach for several data sets, and it results in generally better partitions than a single random initialization of the c-means family. The parallel version of the ants algorithm operates much faster than the sequential implementation, thereby making it a clear choice for minimizing the chance of finding a poor extremum when doing c-means clustering. This algorithm should scale better for large numbers of examples than grid-based ant clustering algorithms.

REFERENCES
[1] Baraldi A. and Blonda P. (1999) IEEE Transactions on Systems, Man, and Cybernetics, 29(6), 778-785.
[2] Kanade P.M. and Hall L.O. (2003) IEEE Transactions on Fuzzy Systems, 11(2), 227-232.
[3] Handl J. and Meyer B. (2002) Springer-Verlag, LNCS 2439, 913-923.
[4] Handl J., Knowles J. and Dorigo M. (2003) IOS Press, Amsterdam, The Netherlands, 204-213.
[5] Strehl A. and Ghosh J. (2002) Journal of Machine Learning Research, 3, 583-617.
[6] Labroche N., Monmarche N. and Venturini G. (2002) IOS Press, France, 345-349.


Advances in Information Mining, ISSN: 0975–3265, Volume 2, Issue 1, 2010, pp-01-07

Fuzzy multi-objective multi-index transportation problem

Lohgaonkar M.H.1, Bajaj V.H.1*, Jadhav V.A.2 and Patwari M.B.1
*1Department of Statistics, Dr. B. A. M. University, Aurangabad, MS, [email protected], [email protected]
2Department of Statistics, Science College, Nanded, MS

Abstract- The aim of this paper is to present a fuzzy multi-objective multi-index transportation problem and to develop a multi-objective multi-index fuzzy programming model. This model can not only satisfy more of the actual requirements of the integral system but is also more flexible than conventional transportation problems. Furthermore, it can offer more information to the decision maker (DM) for reference, and thus raise the quality of decision-making. In this paper, we use special types of linear and non-linear membership functions to solve the multi-objective multi-index transportation problem. This yields an optimal compromise solution.
Keywords- Transportation problem, multi-objective transportation problem, multi-index, linear membership function, non-linear membership function

Introduction
Fuzzy set theory was proposed by L. A. Zadeh and has found extensive application in various fields. Bellman and Zadeh [2] were the first to consider the application of fuzzy set theory to optimization problems in a fuzzy environment; these investigators showed that both the objective function and the constraints in the model could be represented by corresponding fuzzy sets and should be treated in the same manner. The earliest applications to transportation problems include Prade [11], O'he'igeartaigh [10] and Chanas et al. [4]. These researchers, however, emphasized investigating theory and algorithms, and the above investigations are illustrated with simple instances lacking actual cases of application. Moreover, these models involve only a single objective and are classical two-index transportation problems. In an actual transportation problem, multiple objective functions are generally considered, including the average delivery time of the commodities, minimum cost, etc. Zimmermann [15] applied fuzzy set theory to the linear multi-criteria decision-making problem.
He used the linear fuzzy membership function and presented an application to the fuzzy linear vector maximum problem. He showed that solutions obtained by fuzzy linear programming are always efficient, and also yield an optimal compromise solution. Aneja and Nair [1] studied the bicriteria transportation problem model. Multi-index transportation problems are extensions of conventional transportation problems and are appropriate for solving transportation problems with multiple supply points and multiple demand points, as well as problems using diverse modes of transportation or delivering different kinds of merchandise. Thus, the problem considered here is more complicated than conventional transportation problems. Junginger [9], who proposed a set of logic problems to solve multi-index transportation problems, also conducted a detailed investigation of the characteristics of the multi-index transportation problem model.

Rautman et al. [12] used a multi-index transportation problem model to solve a shipping scheduling problem and suggested that the employment of such a model can not only improve transportation efficiency but also optimize the integral system.

Mathematical Model
Multi-objective Multi-index Transportation Problem

Let $a_{ijl}$, $1 \le i \le m$, $1 \le j \le n$, $1 \le l \le k$, be a multi-dimensional array and let $A = (a_{ij})$, $B = (b_{jl})$, $C = (c_{il})$ be multi-matrices. Then the multi-index transportation problem is defined as follows:

Minimize $Z = \sum_{i}\sum_{j}\sum_{l} a_{ijl} X_{ijl}$   (1)

Subject to
$\sum_{l} X_{ijl} = a_{ij} \quad \forall (i,j)$
$\sum_{j} X_{ijl} = c_{il} \quad \forall (i,l)$
$\sum_{i} X_{ijl} = b_{jl} \quad \forall (j,l)$
$X_{ijl} \ge 0 \quad \forall (i,j,l)$   (2)

It is immediate that

$\sum_{i} a_{ij} = \sum_{l} b_{jl}, \qquad \sum_{j} a_{ij} = \sum_{l} c_{il}, \qquad \sum_{j} b_{jl} = \sum_{i} c_{il}$   (3)

are three necessary conditions; however, they are not sufficient. The multi-objective double transportation problem (DTP) is as follows.


Minimize $Z_p = \sum_{i=1}^{m}\sum_{j=1}^{n} k_{ij}^{(1)} x_{ij}^{(1)} + \sum_{i=1}^{m}\sum_{j=1}^{n} k_{ij}^{(2)} x_{ij}^{(2)}$   (4)

Subject to
$\sum_{j=1}^{n} x_{ij}^{(1)} = a_{1i} \quad \forall i$   (5)
$\sum_{j=1}^{n} x_{ij}^{(2)} = a_{2i} \quad \forall i$   (6)
$\sum_{i=1}^{m} x_{ij}^{(1)} = b_{1j} \quad \forall j$   (7)
$\sum_{i=1}^{m} x_{ij}^{(2)} = b_{2j} \quad \forall j$   (8)
$x_{ij}^{(1)} + x_{ij}^{(2)} = c_{ij} \quad \forall (i,j)$   (9)
$x_{ij}^{(1)},\, x_{ij}^{(2)} \ge 0 \quad \forall (i,j)$   (10)

It may easily be seen that, for the existence of a solution, the following set of conditions is necessary:

$\sum_{j=1}^{n} c_{ij} = a_{1i} + a_{2i} \quad \forall i$   (11)
$\sum_{i=1}^{m} c_{ij} = b_{1j} + b_{2j} \quad \forall j$   (12)
$\sum_{i=1}^{m} a_{1i} = \sum_{j=1}^{n} b_{1j}$   (13)
$\sum_{i=1}^{m} a_{2i} = \sum_{j=1}^{n} b_{2j}$   (14)
$c_{ij} \le \min(a_{1i}, b_{1j}) + \min(a_{2i}, b_{2j}) \quad \forall (i,j)$   (15)
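Conditions (11)-(15) can be checked mechanically; the sketch below does so for the data of Example 1 given later in the paper:

```python
def check_necessary(C, a1, a2, b1, b2):
    """Check the necessary conditions (11)-(15) of the double transportation problem."""
    m, n = len(C), len(C[0])
    row_ok = all(sum(C[i]) == a1[i] + a2[i] for i in range(m))                       # (11)
    col_ok = all(sum(C[i][j] for i in range(m)) == b1[j] + b2[j] for j in range(n))  # (12)
    bal_ok = sum(a1) == sum(b1) and sum(a2) == sum(b2)                               # (13), (14)
    cap_ok = all(C[i][j] <= min(a1[i], b1[j]) + min(a2[i], b2[j])                    # (15)
                 for i in range(m) for j in range(n))
    return row_ok and col_ok and bal_ok and cap_ok

# pair-sum matrix C, supplies a1, a2 and demands b1, b2 of Example 1
C = [[5, 7, 3], [8, 4, 9], [4, 1, 6], [2, 8, 3]]
a1, a2 = [9, 14, 6, 7], [6, 7, 5, 6]
b1, b2 = [14, 12, 10], [5, 8, 11]
print(check_necessary(C, a1, a2, b1, b2))  # → True
```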

It may easily be seen that the DTP is composed of two transportation tables, $T_1$ and $T_2$ (with cost matrices $C_1 = (k_{ij}^{(1)})$ and $C_2 = (k_{ij}^{(2)})$), and one matrix $C$, as given below:

$T_1 = \begin{pmatrix} k_{11}^{(1)} & k_{12}^{(1)} & \cdots & k_{1n}^{(1)} & a_{11} \\ k_{21}^{(1)} & k_{22}^{(1)} & \cdots & k_{2n}^{(1)} & a_{12} \\ \vdots & & & \vdots & \vdots \\ k_{m1}^{(1)} & k_{m2}^{(1)} & \cdots & k_{mn}^{(1)} & a_{1m} \\ b_{11} & b_{12} & \cdots & b_{1n} & \end{pmatrix}$   (16)

$T_2 = \begin{pmatrix} k_{11}^{(2)} & k_{12}^{(2)} & \cdots & k_{1n}^{(2)} & a_{21} \\ k_{21}^{(2)} & k_{22}^{(2)} & \cdots & k_{2n}^{(2)} & a_{22} \\ \vdots & & & \vdots & \vdots \\ k_{m1}^{(2)} & k_{m2}^{(2)} & \cdots & k_{mn}^{(2)} & a_{2m} \\ b_{21} & b_{22} & \cdots & b_{2n} & \end{pmatrix}$   (17)

and $C = (c_{ij})_{m \times n}$.   (18)

Fuzzy Algorithm to solve the multi-objective multi-index transportation problem
Step 1: Solve the multi-objective multi-index transportation problem as a single-objective transportation problem P times, taking one of the objectives at a time.
Step 2: From the results of Step 1, determine the corresponding values of every objective at each solution derived. From each solution and the value of every objective, we can construct the pay-off matrix as follows:

               $Z_1(X)$   $Z_2(X)$   ...   $Z_p(X)$
    $X^{(1)}$  $Z_{11}$   $Z_{12}$   ...   $Z_{1p}$
    $X^{(2)}$  $Z_{21}$   $Z_{22}$   ...   $Z_{2p}$
      ...
    $X^{(P)}$  $Z_{p1}$   $Z_{p2}$   ...   $Z_{pp}$

where $X^{(1)}, X^{(2)}, \ldots, X^{(P)}$ are the isolated optimal solutions of the P different transportation problems for the P different objective functions, and $Z_{ij} = Z_j(X^{(i)})$ ($i = 1,2,\ldots,P$; $j = 1,2,\ldots,P$) is the i-th row, j-th column element of the pay-off matrix.
Step 3: From Step 2, find for each objective the worst ($U_p$) and the best ($L_p$) values corresponding to the set of solutions, where
$U_p = \max(Z_{1p}, Z_{2p}, \ldots, Z_{pp})$ and $L_p = Z_{pp}$, $p = 1,2,\ldots,P$.
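Step 3 amounts to a column scan of the pay-off matrix. A sketch, using the 2 × 2 pay-off values derived in the numerical example later in this paper:

```python
def bounds(payoff):
    """Step 3: U_p is the worst (largest) entry in column p of the pay-off
    matrix; L_p is the diagonal entry Z_pp (the individual optimum)."""
    P = len(payoff)
    U = [max(payoff[i][p] for i in range(P)) for p in range(P)]
    L = [payoff[p][p] for p in range(P)]
    return U, L

payoff = [[300, 291],   # Z1, Z2 evaluated at X(1)
          [330, 283]]   # Z1, Z2 evaluated at X(2)
print(bounds(payoff))   # → ([330, 291], [300, 283])
```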

An initial fuzzy model of the problem (4)-(10) can be stated as:
Find $X_{ij}$, $i = 1,2,\ldots,m$, $j = 1,2,\ldots,n$, so as to satisfy $Z_p \lesssim L_p$, $p = 1,2,\ldots,P$, subject to (4)-(10).

Step 4: Case (i): Define the membership function for the p-th objective function as follows:

$\mu_p(X) = \begin{cases} 1 & \text{if } Z_p(X) \le L_p \\[4pt] \dfrac{U_p - Z_p(X)}{U_p - L_p} & \text{if } L_p < Z_p(X) < U_p \\[4pt] 0 & \text{if } Z_p(X) \ge U_p \end{cases}$   (20)
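The linear membership (20) translates directly into code; the sample value below uses $L_1 = 300$ and $U_1 = 330$ from the numerical example that follows:

```python
def mu_linear(z, L, U):
    """Linear membership (20): 1 below L, 0 above U, linear in between."""
    if z <= L:
        return 1.0
    if z >= U:
        return 0.0
    return (U - z) / (U - L)

print(mu_linear(309.3902, 300, 330))  # ~0.687: membership of Z1 = 309.3902
```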

Step 5: Find an equivalent crisp model by using the linear membership function for the initial fuzzy model:

Maximize $\lambda$
subject to $\lambda \le \dfrac{U_p - Z_p(X)}{U_p - L_p}$, $p = 1,2,\ldots,P$,   (21)
and subject to (5)-(10).


Step 6: Solve the crisp model by an appropriate mathematical programming algorithm:

Maximize $\lambda$
subject to $Z_p(X) + \lambda (U_p - L_p) \le U_p$, $p = 1,2,\ldots,P$,   (22)
and subject to (5)-(10).

Now, using the hyperbolic membership function for the p-th objective function:

$\mu_p^H(Z_p(x)) = \begin{cases} 1 & \text{if } Z_p \le L_p \\[4pt] \dfrac{1}{2}\, \dfrac{e^{\{\frac{U_p+L_p}{2} - Z_p(x)\}\alpha_p} - e^{-\{\frac{U_p+L_p}{2} - Z_p(x)\}\alpha_p}}{e^{\{\frac{U_p+L_p}{2} - Z_p(x)\}\alpha_p} + e^{-\{\frac{U_p+L_p}{2} - Z_p(x)\}\alpha_p}} + \dfrac{1}{2} & \text{if } L_p < Z_p < U_p \\[4pt] 0 & \text{if } Z_p \ge U_p \end{cases}$   (23)

where $\alpha_p = \dfrac{3}{(U_p - L_p)/2} = \dfrac{6}{U_p - L_p}$.

The crisp model for this fuzzy model can be formulated as:

Maximize $\lambda$
subject to $\lambda \le \dfrac{1}{2}\, \dfrac{e^{\{\frac{U_p+L_p}{2} - Z_p(x)\}\alpha_p} - e^{-\{\frac{U_p+L_p}{2} - Z_p(x)\}\alpha_p}}{e^{\{\frac{U_p+L_p}{2} - Z_p(x)\}\alpha_p} + e^{-\{\frac{U_p+L_p}{2} - Z_p(x)\}\alpha_p}} + \dfrac{1}{2}$,   (24)
subject to (5)-(10) and $\lambda \ge 0$.

Solve the crisp model as:

Maximize $X_{mn+1}$
subject to $\alpha_p Z_p(x) + X_{mn+1} \le \alpha_p (U_p + L_p)/2$, $p = 1,2,\ldots,P$,   (25)
subject to (5)-(10) and $X_{mn+1} \ge 0$, where $X_{mn+1} = \tanh^{-1}(2\lambda - 1)$.
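The hyperbolic membership (23) and the substitution $X_{mn+1} = \tanh^{-1}(2\lambda - 1)$ can be sketched as follows; the sample objective value is illustrative only:

```python
import math

def mu_hyperbolic(z, L, U):
    """Middle branch of the hyperbolic membership (23):
    0.5 * tanh(((U + L)/2 - z) * alpha) + 0.5 with alpha = 6/(U - L).
    The saturated branches (1 below L, 0 above U) are approached asymptotically."""
    alpha = 6.0 / (U - L)
    return 0.5 * math.tanh(((U + L) / 2.0 - z) * alpha) + 0.5

lam = mu_hyperbolic(310.0, 300, 330)   # illustrative objective value Z = 310
x_mn1 = math.atanh(2 * lam - 1)        # the substitution X_{mn+1} = tanh^-1(2*lambda - 1)
print(round(lam, 4), round(x_mn1, 4))
```

The substitution makes constraint (25) linear in the decision variables, which is what allows an LP solver to handle the hyperbolic case.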

Now, the exponential membership function for the p-th objective function is defined as:

$\mu_p^E(Z_p(x)) = \begin{cases} 1 & \text{if } Z_p \le L_p \\[4pt] \dfrac{e^{-S\Psi_p(X)} - e^{-S}}{1 - e^{-S}} & \text{if } L_p < Z_p < U_p \\[4pt] 0 & \text{if } Z_p \ge U_p \end{cases}$   (26)
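The exponential membership (26) can likewise be sketched; $S = 1$ matches the choice made in the worked example below, and the sample objective value is illustrative:

```python
import math

def mu_exponential(z, L, U, S=1.0):
    """Exponential membership (26); S is the nonzero, DM-chosen shape parameter
    and psi is the normalized distance (z - L)/(U - L) of Psi_p."""
    if z <= L:
        return 1.0
    if z >= U:
        return 0.0
    psi = (z - L) / (U - L)
    return (math.exp(-S * psi) - math.exp(-S)) / (1.0 - math.exp(-S))

print(round(mu_exponential(309.3902, 300, 330), 4))
```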

where $\Psi_p(X) = \dfrac{Z_p - L_p}{U_p - L_p}$, $p = 1,2,\ldots,P$, and S is a nonzero parameter prescribed by the decision maker.

Numerical Examples
Example 1

The cost matrices $C_1 = (k_{ij}^{(1)})$, $C_2 = (k_{ij}^{(2)})$ and the pair-sum matrix $C$, with supplies $a_{1i}$, $a_{2i}$ alongside and demands $b_{1j}$, $b_{2j}$ below:

    C1:  4  3  5     C2:  8  6  3     C:  5  7  3     a1 = (9, 14, 6, 7)
         8  6  2          5  4  1         8  4  9     a2 = (6, 7, 5, 6)
         7  4  1          9  2  6         4  1  6     b1 = (14, 12, 10)
         9 10 12          4  9  3         2  8  3     b2 = (5, 8, 11)        (27)

Example 2

    T1:  5  6  7     T2: 10  9  9     C:  5  7  3     a1 = (9, 14, 6, 7)
         4  5  2          7  9  2         8  4  9     a2 = (6, 7, 5, 6)
         1  3  4          8  7  9         4  1  6     b1 = (14, 12, 10)
         4  2  3          8  4  5         2  8  3     b2 = (5, 8, 11)        (28)

Example 1 is simplified as:

Minimize $Z_1 = 4X_{11}^{(1)}+3X_{12}^{(1)}+5X_{13}^{(1)}+8X_{21}^{(1)}+6X_{22}^{(1)}+2X_{23}^{(1)}+7X_{31}^{(1)}+4X_{32}^{(1)}+X_{33}^{(1)}+9X_{41}^{(1)}+10X_{42}^{(1)}+12X_{43}^{(1)}+8X_{11}^{(2)}+6X_{12}^{(2)}+3X_{13}^{(2)}+5X_{21}^{(2)}+4X_{22}^{(2)}+X_{23}^{(2)}+9X_{31}^{(2)}+2X_{32}^{(2)}+6X_{33}^{(2)}+4X_{41}^{(2)}+9X_{42}^{(2)}+3X_{43}^{(2)}$   (29)

Subject to
$X_{11}^{(1)}+X_{12}^{(1)}+X_{13}^{(1)} = 9$
$X_{21}^{(1)}+X_{22}^{(1)}+X_{23}^{(1)} = 14$
$X_{31}^{(1)}+X_{32}^{(1)}+X_{33}^{(1)} = 6$
$X_{41}^{(1)}+X_{42}^{(1)}+X_{43}^{(1)} = 7$
$X_{11}^{(2)}+X_{12}^{(2)}+X_{13}^{(2)} = 6$
$X_{21}^{(2)}+X_{22}^{(2)}+X_{23}^{(2)} = 7$
$X_{31}^{(2)}+X_{32}^{(2)}+X_{33}^{(2)} = 5$
$X_{41}^{(2)}+X_{42}^{(2)}+X_{43}^{(2)} = 6$
$X_{11}^{(1)}+X_{21}^{(1)}+X_{31}^{(1)}+X_{41}^{(1)} = 14$
$X_{12}^{(1)}+X_{22}^{(1)}+X_{32}^{(1)}+X_{42}^{(1)} = 12$
$X_{13}^{(1)}+X_{23}^{(1)}+X_{33}^{(1)}+X_{43}^{(1)} = 10$
$X_{11}^{(2)}+X_{21}^{(2)}+X_{31}^{(2)}+X_{41}^{(2)} = 5$
$X_{12}^{(2)}+X_{22}^{(2)}+X_{32}^{(2)}+X_{42}^{(2)} = 8$
$X_{13}^{(2)}+X_{23}^{(2)}+X_{33}^{(2)}+X_{43}^{(2)} = 11$
$X_{11}^{(1)}+X_{11}^{(2)} = 5$
$X_{12}^{(1)}+X_{12}^{(2)} = 7$
$X_{13}^{(1)}+X_{13}^{(2)} = 3$
$X_{21}^{(1)}+X_{21}^{(2)} = 8$
$X_{22}^{(1)}+X_{22}^{(2)} = 4$
$X_{23}^{(1)}+X_{23}^{(2)} = 9$
$X_{31}^{(1)}+X_{31}^{(2)} = 4$
$X_{32}^{(1)}+X_{32}^{(2)} = 1$
$X_{33}^{(1)}+X_{33}^{(2)} = 6$
$X_{41}^{(1)}+X_{41}^{(2)} = 2$
$X_{42}^{(1)}+X_{42}^{(2)} = 8$
$X_{43}^{(1)}+X_{43}^{(2)} = 3$   (30)


Example 2 is simplified as:

Minimize $Z_2 = 5X_{11}^{(1)}+6X_{12}^{(1)}+7X_{13}^{(1)}+4X_{21}^{(1)}+5X_{22}^{(1)}+2X_{23}^{(1)}+X_{31}^{(1)}+3X_{32}^{(1)}+4X_{33}^{(1)}+4X_{41}^{(1)}+2X_{42}^{(1)}+3X_{43}^{(1)}+10X_{11}^{(2)}+9X_{12}^{(2)}+9X_{13}^{(2)}+7X_{21}^{(2)}+9X_{22}^{(2)}+2X_{23}^{(2)}+8X_{31}^{(2)}+7X_{32}^{(2)}+9X_{33}^{(2)}+8X_{41}^{(2)}+4X_{42}^{(2)}+5X_{43}^{(2)}$   (31)

subject to the same constraint set as in (30).   (32)

For objective $Z_1$, we find the optimal solution
$X_{11}^{(1)}=5$, $X_{12}^{(1)}=4$, $X_{21}^{(1)}=8$, $X_{22}^{(1)}=2$, $X_{23}^{(1)}=4$, $X_{33}^{(1)}=6$, $X_{41}^{(1)}=1$, $X_{42}^{(1)}=6$; $X_{12}^{(2)}=3$, $X_{13}^{(2)}=3$, $X_{22}^{(2)}=2$, $X_{23}^{(2)}=5$, $X_{31}^{(2)}=4$, $X_{32}^{(2)}=1$, $X_{41}^{(2)}=1$, $X_{42}^{(2)}=2$, $X_{43}^{(2)}=3$ (all other variables zero), with
$Z_1 = 300$.
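The stated optimum can be verified mechanically: recomputing $Z_1$ from the listed basic variables (all others zero) and the cost coefficients of (29) reproduces $Z_1 = 300$.

```python
# cost coefficients of objective (29), Example 1
K1 = [[4, 3, 5], [8, 6, 2], [7, 4, 1], [9, 10, 12]]   # k_ij^(1)
K2 = [[8, 6, 3], [5, 4, 1], [9, 2, 6], [4, 9, 3]]     # k_ij^(2)

# listed basic variables of the optimal solution for Z1, as 0-based (i, j) indices
X1 = {(0, 0): 5, (0, 1): 4, (1, 0): 8, (1, 1): 2, (1, 2): 4,
      (2, 2): 6, (3, 0): 1, (3, 1): 6}
X2 = {(0, 1): 3, (0, 2): 3, (1, 1): 2, (1, 2): 5, (2, 0): 4,
      (2, 1): 1, (3, 0): 1, (3, 1): 2, (3, 2): 3}

z1 = (sum(K1[i][j] * v for (i, j), v in X1.items()) +
      sum(K2[i][j] * v for (i, j), v in X2.items()))
print(z1)  # → 300
```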

For objective $Z_2$, we find the optimal solution
$X_{11}^{(1)}=4$, $X_{12}^{(1)}=5$, $X_{21}^{(1)}=8$, $X_{22}^{(1)}=4$, $X_{23}^{(1)}=2$, $X_{31}^{(1)}=1$, $X_{33}^{(1)}=5$, $X_{41}^{(1)}=1$, $X_{42}^{(1)}=3$, $X_{43}^{(1)}=3$; $X_{11}^{(2)}=1$, $X_{12}^{(2)}=2$, $X_{13}^{(2)}=3$, $X_{23}^{(2)}=7$, $X_{31}^{(2)}=3$, $X_{32}^{(2)}=1$, $X_{33}^{(2)}=1$, $X_{41}^{(2)}=1$, $X_{42}^{(2)}=5$ (all other variables zero), with
$Z_2 = 283$.

Now, for $X^{(1)}$ we can find $Z_2$: $Z_2(X^{(1)}) = 291$. For $X^{(2)}$ we can find $Z_1$: $Z_1(X^{(2)}) = 330$.

The pay-off matrix is

               $Z_1$   $Z_2$
    $X^{(1)}$   300     291
    $X^{(2)}$   330     283

From this matrix, $U_1 = 330$, $U_2 = 291$, $L_1 = 300$, $L_2 = 283$.

Find $X_{ij}$, $i = 1,2,3,4$, $j = 1,2,3$, so as to satisfy $Z_1 \lesssim 300$ and $Z_2 \lesssim 283$. Define the membership functions for the objective functions $Z_1(X)$ and $Z_2(X)$, respectively:


$\mu_1(X) = \begin{cases} 1 & \text{if } Z_1(X) \le 300 \\[4pt] \dfrac{330 - Z_1(X)}{330 - 300} & \text{if } 300 < Z_1(X) < 330 \\[4pt] 0 & \text{if } Z_1(X) \ge 330 \end{cases}$ ;
$\mu_2(X) = \begin{cases} 1 & \text{if } Z_2(X) \le 283 \\[4pt] \dfrac{291 - Z_2(X)}{291 - 283} & \text{if } 283 < Z_2(X) < 291 \\[4pt] 0 & \text{if } Z_2(X) \ge 291 \end{cases}$

Find an equivalent crisp model:
Maximize $\lambda$ subject to $30\lambda + Z_1(X) \le 330$ and $8\lambda + Z_2(X) \le 291$.

Solve the crisp model by using an appropriate mathematical algorithm.

$4X_{11}^{(1)}+3X_{12}^{(1)}+5X_{13}^{(1)}+8X_{21}^{(1)}+6X_{22}^{(1)}+2X_{23}^{(1)}+7X_{31}^{(1)}+4X_{32}^{(1)}+X_{33}^{(1)}+9X_{41}^{(1)}+10X_{42}^{(1)}+12X_{43}^{(1)}+8X_{11}^{(2)}+6X_{12}^{(2)}+3X_{13}^{(2)}+5X_{21}^{(2)}+4X_{22}^{(2)}+X_{23}^{(2)}+9X_{31}^{(2)}+2X_{32}^{(2)}+6X_{33}^{(2)}+4X_{41}^{(2)}+9X_{42}^{(2)}+3X_{43}^{(2)}+30\lambda \le 330$

$5X_{11}^{(1)}+6X_{12}^{(1)}+7X_{13}^{(1)}+4X_{21}^{(1)}+5X_{22}^{(1)}+2X_{23}^{(1)}+X_{31}^{(1)}+3X_{32}^{(1)}+4X_{33}^{(1)}+4X_{41}^{(1)}+2X_{42}^{(1)}+3X_{43}^{(1)}+10X_{11}^{(2)}+9X_{12}^{(2)}+9X_{13}^{(2)}+7X_{21}^{(2)}+9X_{22}^{(2)}+2X_{23}^{(2)}+8X_{31}^{(2)}+7X_{32}^{(2)}+9X_{33}^{(2)}+8X_{41}^{(2)}+4X_{42}^{(2)}+5X_{43}^{(2)}+8\lambda \le 291$

Subject to (30). The optimal compromise solution of the problem is
$\lambda = 0.6521$;
$X_{11}^{(1)}=5$, $X_{12}^{(1)}=2.2608$, $X_{13}^{(1)}=1.7391$, $X_{21}^{(1)}=8$, $X_{22}^{(1)}=3.7391$, $X_{23}^{(1)}=2.2608$, $X_{33}^{(1)}=6$, $X_{41}^{(1)}=1$, $X_{42}^{(1)}=6$; $X_{12}^{(2)}=4.7391$, $X_{13}^{(2)}=1.2608$, $X_{23}^{(2)}=6.7391$, $X_{31}^{(2)}=4$, $X_{32}^{(2)}=1$, $X_{41}^{(2)}=1$, $X_{42}^{(2)}=2$, $X_{43}^{(2)}=3$;
$Z_1^* = 309.3902$ and $Z_2^* = 283.4329$.
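The compromise solution can be verified the same way: recomputing both objectives from the listed variables (all others zero) and the cost coefficients of (29) and (31) reproduces the stated $Z_1^*$ and $Z_2^*$.

```python
K1_1 = [[4, 3, 5], [8, 6, 2], [7, 4, 1], [9, 10, 12]]   # k^(1) costs of Z1
K1_2 = [[8, 6, 3], [5, 4, 1], [9, 2, 6], [4, 9, 3]]     # k^(2) costs of Z1
K2_1 = [[5, 6, 7], [4, 5, 2], [1, 3, 4], [4, 2, 3]]     # k^(1) costs of Z2
K2_2 = [[10, 9, 9], [7, 9, 2], [8, 7, 9], [8, 4, 5]]    # k^(2) costs of Z2

# listed variables of the compromise solution, as 0-based (i, j) indices
X1 = {(0, 0): 5, (0, 1): 2.2608, (0, 2): 1.7391, (1, 0): 8, (1, 1): 3.7391,
      (1, 2): 2.2608, (2, 2): 6, (3, 0): 1, (3, 1): 6}
X2 = {(0, 1): 4.7391, (0, 2): 1.2608, (1, 2): 6.7391, (2, 0): 4, (2, 1): 1,
      (3, 0): 1, (3, 1): 2, (3, 2): 3}

def objective(Ka, Kb):
    return (sum(Ka[i][j] * v for (i, j), v in X1.items()) +
            sum(Kb[i][j] * v for (i, j), v in X2.items()))

z1, z2 = objective(K1_1, K1_2), objective(K2_1, K2_2)
print(round(z1, 4), round(z2, 4))  # → 309.3902 283.4329
```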

If we use the hyperbolic membership function, with
$\alpha_1 = \dfrac{6}{U_1 - L_1} = \dfrac{6}{330 - 300} = \dfrac{6}{30}$ ; $\alpha_2 = \dfrac{6}{U_2 - L_2} = \dfrac{6}{291 - 283} = \dfrac{6}{8}$ ;
$\dfrac{U_1 + L_1}{2} = \dfrac{630}{2} = 315$ ; $\dfrac{U_2 + L_2}{2} = \dfrac{574}{2} = 287$,
then the membership functions $\mu_1^H(Z_1)$ and $\mu_2^H(Z_2)$ for the objectives $Z_1$ and $Z_2$, respectively, are defined as follows:

$\mu_1^H(Z_1(x)) = \begin{cases} 1 & \text{if } Z_1(X) \le 300 \\[4pt] \dfrac{1}{2}\tanh\left\{\left(315 - Z_1(X)\right)\dfrac{6}{30}\right\} + \dfrac{1}{2} & \text{if } 300 < Z_1(X) < 330 \\[4pt] 0 & \text{if } Z_1(X) \ge 330 \end{cases}$

$\mu_2^H(Z_2(x)) = \begin{cases} 1 & \text{if } Z_2(X) \le 283 \\[4pt] \dfrac{1}{2}\tanh\left\{\left(287 - Z_2(X)\right)\dfrac{6}{8}\right\} + \dfrac{1}{2} & \text{if } 283 < Z_2(X) < 291 \\[4pt] 0 & \text{if } Z_2(X) \ge 291 \end{cases}$

We get an equivalent crisp model:

Maximize $X_{mn+1}$
subject to $\alpha_1 Z_1(X) + X_{mn+1} \le \alpha_1 (U_1 + L_1)/2$, i.e.
$\dfrac{6}{30}\left(4X_{11}^{(1)}+3X_{12}^{(1)}+5X_{13}^{(1)}+8X_{21}^{(1)}+6X_{22}^{(1)}+2X_{23}^{(1)}+7X_{31}^{(1)}+4X_{32}^{(1)}+X_{33}^{(1)}+9X_{41}^{(1)}+10X_{42}^{(1)}+12X_{43}^{(1)}+8X_{11}^{(2)}+6X_{12}^{(2)}+3X_{13}^{(2)}+5X_{21}^{(2)}+4X_{22}^{(2)}+X_{23}^{(2)}+9X_{31}^{(2)}+2X_{32}^{(2)}+6X_{33}^{(2)}+4X_{41}^{(2)}+9X_{42}^{(2)}+3X_{43}^{(2)}\right) + X_{mn+1} \le \dfrac{6}{30}\,315$

or, multiplied through by 30,
$24X_{11}^{(1)}+18X_{12}^{(1)}+30X_{13}^{(1)}+48X_{21}^{(1)}+36X_{22}^{(1)}+12X_{23}^{(1)}+42X_{31}^{(1)}+24X_{32}^{(1)}+6X_{33}^{(1)}+54X_{41}^{(1)}+60X_{42}^{(1)}+72X_{43}^{(1)}+48X_{11}^{(2)}+36X_{12}^{(2)}+18X_{13}^{(2)}+30X_{21}^{(2)}+24X_{22}^{(2)}+6X_{23}^{(2)}+54X_{31}^{(2)}+12X_{32}^{(2)}+36X_{33}^{(2)}+24X_{41}^{(2)}+54X_{42}^{(2)}+18X_{43}^{(2)}+30X_{mn+1} \le 1890$

and
$\dfrac{6}{8}\left(5X_{11}^{(1)}+6X_{12}^{(1)}+7X_{13}^{(1)}+4X_{21}^{(1)}+5X_{22}^{(1)}+2X_{23}^{(1)}+X_{31}^{(1)}+3X_{32}^{(1)}+4X_{33}^{(1)}+4X_{41}^{(1)}+2X_{42}^{(1)}+3X_{43}^{(1)}+10X_{11}^{(2)}+9X_{12}^{(2)}+9X_{13}^{(2)}+7X_{21}^{(2)}+9X_{22}^{(2)}+2X_{23}^{(2)}+8X_{31}^{(2)}+7X_{32}^{(2)}+9X_{33}^{(2)}+8X_{41}^{(2)}+4X_{42}^{(2)}+5X_{43}^{(2)}\right) + X_{mn+1} \le \dfrac{6}{8}\,291$

or, multiplied through by 8,
$30X_{11}^{(1)}+36X_{12}^{(1)}+42X_{13}^{(1)}+24X_{21}^{(1)}+30X_{22}^{(1)}+12X_{23}^{(1)}+6X_{31}^{(1)}+18X_{32}^{(1)}+24X_{33}^{(1)}+24X_{41}^{(1)}+12X_{42}^{(1)}+18X_{43}^{(1)}+60X_{11}^{(2)}+54X_{12}^{(2)}+54X_{13}^{(2)}+42X_{21}^{(2)}+54X_{22}^{(2)}+12X_{23}^{(2)}+48X_{31}^{(2)}+42X_{32}^{(2)}+54X_{33}^{(2)}+48X_{41}^{(2)}+24X_{42}^{(2)}+30X_{43}^{(2)}+8X_{mn+1} \le 1746$

Subject to (30). The problem was solved by using the linear interactive and discrete optimization (LINDO) software; the optimal compromise solution is


X =1.9608mn+1

(1) (1) (1) (1)X =5; X =3.1304 ; X =8;X =2.8695;11 12 21 22

(1) (1) (1) (1)X =3.1304; X =6 ; X =1;X 6;

23 33 41 42

(2) (2) (2)X =3.8695;X =2.1304;X =1.1304;

12 13 22

(2) (2) (2) (2) (2)X =5.8695; X =4; X =1 ; X =1;X 2; X

23 31 32 41 42 4(*)

X =

=

=(2)

33

* * Z =300.8683 and Z =282.30241 2

=0.9804

λ

=
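As a quick numerical check (an illustrative sketch, not part of the original solution procedure), the reported optimum can be substituted back into the hyperbolic membership functions: both membership values should reach at least the compromise level λ = 0.9804, and λ should be consistent with the change of variable X_{mn+1} = tanh⁻¹(2λ − 1).

```python
import math

# Hyperbolic membership for Z1 (L1 = 300, U1 = 330) and Z2 (L2 = 283, U2 = 291),
# using alpha_p = 6 / (U_p - L_p) as in the text.
def mu_h(z, lo, hi):
    if z <= lo:
        return 1.0
    if z >= hi:
        return 0.0
    alpha = 6.0 / (hi - lo)
    return 0.5 * math.tanh(((hi + lo) / 2.0 - z) * alpha) + 0.5

z1_star, z2_star = 300.8683, 282.3024   # reported compromise objective values
x_aux, lam = 1.9608, 0.9804             # reported X_{mn+1} and lambda

# lambda and X_{mn+1} are linked by X_{mn+1} = tanh^{-1}(2*lambda - 1)
assert abs((1.0 + math.tanh(x_aux)) / 2.0 - lam) < 1e-3

# both objectives reach at least the compromise satisfaction level
assert mu_h(z1_star, 300, 330) >= lam
assert mu_h(z2_star, 283, 291) >= lam
print(mu_h(z1_star, 300, 330), mu_h(z2_star, 283, 291))
```

Note that Z₂* lies below L₂ = 283, so its membership is exactly 1.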

Now, using the exponential membership function (with parameter $S=1$), the membership functions for the two objectives are:

$$\mu^{E}_{1}(Z_{1}(x))=\begin{cases}1, & \text{if } Z_{1}\le 300\\[4pt] \dfrac{e^{-\Psi_{1}(X)}-e^{-1}}{1-e^{-1}}, & \text{if } 300< Z_{1}< 330\\[4pt] 0, & \text{if } Z_{1}\ge 330\end{cases}\;;\qquad
\mu^{E}_{2}(Z_{2}(x))=\begin{cases}1, & \text{if } Z_{2}\le 283\\[4pt] \dfrac{e^{-\Psi_{2}(X)}-e^{-1}}{1-e^{-1}}, & \text{if } 283< Z_{2}< 291\\[4pt] 0, & \text{if } Z_{2}\ge 291\end{cases}$$

Then an equivalent crisp model for the fuzzy model can be formulated as

Maximize λ

subject to

$$\lambda \le \frac{e^{-S\Psi_{p}(x)}-e^{-S}}{1-e^{-S}},\qquad p=1,2,\ldots,P,$$

and subject to (7)-(9), where

$$\Psi_{1}(X)=\frac{Z_{1}-L_{1}}{U_{1}-L_{1}}=\frac{Z_{1}-300}{330-300}=\frac{Z_{1}-300}{30}$$

and

$$\Psi_{2}(X)=\frac{Z_{2}-L_{2}}{U_{2}-L_{2}}=\frac{Z_{2}-283}{291-283}=\frac{Z_{2}-283}{8}.$$

In full,

$$\Psi_{1}(X)=\big(4X^{(1)}_{11}+3X^{(1)}_{12}+5X^{(1)}_{13}+8X^{(1)}_{21}+6X^{(1)}_{22}+2X^{(1)}_{23}+7X^{(1)}_{31}+4X^{(1)}_{32}+X^{(1)}_{33}+9X^{(1)}_{41}+10X^{(1)}_{42}+12X^{(1)}_{43}+8X^{(2)}_{11}+6X^{(2)}_{12}+3X^{(2)}_{13}+5X^{(2)}_{21}+4X^{(2)}_{22}+X^{(2)}_{23}+9X^{(2)}_{31}+2X^{(2)}_{32}+6X^{(2)}_{33}+4X^{(2)}_{41}+9X^{(2)}_{42}+3X^{(2)}_{43}-300\big)/30$$

and

$$\Psi_{2}(X)=\big(5X^{(1)}_{11}+6X^{(1)}_{12}+7X^{(1)}_{13}+4X^{(1)}_{21}+5X^{(1)}_{22}+2X^{(1)}_{23}+X^{(1)}_{31}+3X^{(1)}_{32}+4X^{(1)}_{33}+4X^{(1)}_{41}+2X^{(1)}_{42}+3X^{(1)}_{43}+10X^{(2)}_{11}+9X^{(2)}_{12}+9X^{(2)}_{13}+7X^{(2)}_{21}+9X^{(2)}_{22}+2X^{(2)}_{23}+8X^{(2)}_{31}+7X^{(2)}_{32}+9X^{(2)}_{33}+8X^{(2)}_{41}+4X^{(2)}_{42}+5X^{(2)}_{43}-283\big)/8.$$

Then the problem is

$$\lambda \le \frac{e^{-\Psi_{1}(x)}-e^{-1}}{1-e^{-1}}\quad\text{and}\quad \lambda \le \frac{e^{-\Psi_{2}(x)}-e^{-1}}{1-e^{-1}},$$

subject to (30). The problem can be simplified as

Maximize λ

subject to

$$e^{-S\Psi_{p}(X)}-(1-e^{-S})\lambda \ge e^{-S},\qquad p=1,2,\ldots,P,$$

subject to (30) and λ ≥ 0. With S = 1:

$$e^{-\Psi_{1}(X)}-(1-e^{-1})\lambda \ge e^{-1}\;\Rightarrow\; e^{-\Psi_{1}(X)}-(0.6321)\lambda \ge 0.368,$$
$$e^{-\Psi_{2}(X)}-(1-e^{-1})\lambda \ge e^{-1}\;\Rightarrow\; e^{-\Psi_{2}(X)}-(0.6321)\lambda \ge 0.368.$$

The problem is solved by the general interactive optimization (LINGO) software:

$\lambda=0.7084$,

$X^{(1)}_{11}=5;\; X^{(1)}_{12}=2.3703;\; X^{(1)}_{13}=1.6296;\; X^{(1)}_{21}=8;\; X^{(1)}_{22}=4;\; X^{(1)}_{23}=2;\; X^{(1)}_{33}=6;\; X^{(1)}_{41}=1;\; X^{(1)}_{42}=5.6296;\; X^{(1)}_{43}=0.3703;$
$X^{(2)}_{12}=4.6296;\; X^{(2)}_{13}=1.3703;\; X^{(2)}_{31}=4;\; X^{(2)}_{32}=1;\; X^{(2)}_{41}=1;\; X^{(2)}_{42}=2.3703;\; X^{(2)}_{43}=2.6296,$

with $Z^{*}_{1}=306.1085$ and $Z^{*}_{2}=270.6274$.
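Again as an illustrative check (not from the original solution procedure), substituting the reported optimum into the exponential membership functions with S = 1 reproduces λ ≈ 0.7084 as the smaller of the two membership values:

```python
import math

def mu_e(z, lo, hi, s=1.0):
    # exponential membership with Psi = (z - lo)/(hi - lo);
    # values at or below lo are fully satisfactory
    if z <= lo:
        return 1.0
    if z >= hi:
        return 0.0
    psi = (z - lo) / (hi - lo)
    return (math.exp(-s * psi) - math.exp(-s)) / (1.0 - math.exp(-s))

z1_star, z2_star = 306.1085, 270.6274  # reported optimal objective values
m1 = mu_e(z1_star, 300, 330)           # Z1 lies inside (300, 330)
m2 = mu_e(z2_star, 283, 291)           # Z2 is below L2 = 283, so mu = 1
lam = min(m1, m2)                      # max-min satisfaction level
assert abs(lam - 0.7084) < 1e-3
print(m1, m2, lam)
```

The binding objective is Z₁; Z₂ is already below its best value 283 and so is fully satisfied.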

Conclusion
In this paper the multi-objective multi-index transportation problem is defined and solved by a fuzzy programming technique using linear, hyperbolic and exponential membership functions. The multi-index transportation problem can represent different modes between origins and destinations, or a set of intermediate warehouses. If the hyperbolic membership function is used, the crisp model becomes linear. The optimal compromise solution obtained with the hyperbolic membership function differs significantly from the solution obtained with the linear membership function, whereas the optimal compromise solution obtained with the exponential membership function does not differ significantly from the linear one.

References
[1] Aneja V.P. and Nair K.P.K. (1979) Management Science, 25, 73-78.
[2] Bellman R.E. and Zadeh L.A. (1970) Management Science, 17, 141-164.
[3] Bit A.K., Biswal M.P. and Alam S.S. (1993) Industrial Engineering Journal, XXII(6), 8-12.
[4] Chanas S., Kolodzejczyk W. and Machaj A. (1984) Fuzzy Sets and Systems, 13, 211-221.
[5] Gwo-Hshiung Tzeng, Dusan Teodorovic and Ming-Jiu Hwang (1996) European Journal of Operational Research, 95, 62-72.
[6] Haley K.B. (1963) Operations Research, 10, 448-463.
[7] Haley K.B. (1963) Operations Research, 11, 369-379.
[8] Haley K.B. (1965) Operations Research, 16, 471-474.
[9] Junginger W. (1993) European Journal of Operational Research, 66, 353-371.
[10] Oheigeartaigh M. (1982) Fuzzy Sets and Systems, 8, 235-243.
[11] Prade H. (1980) Fuzzy Sets: Theory and Applications to Policy Analysis and Information Systems, Plenum Press, New York, 155-170.
[12] Rautman C.A., Reid R.A. and Ryder E.E. (1993) Operations Research, 41, 459-469.
[13] Verma Rakesh, Biswal M.P. and Biswas A. (1997) Fuzzy Sets and Systems, 91, 37-43.
[14] Waiel F. Abd El-Wahed (2001) Fuzzy Sets and Systems, 117, 26-33.
[15] Zimmermann H.J. (1978) Fuzzy Sets and Systems, 1, 45-55.


Advances in Information Mining, ISSN: 0975–3265, Volume 2, Issue 1, 2010, pp-18-22


Data mining- A Mathematical Realization and cryptic application using variable key

Chakrabarti P. Sir Padampat Singhania University, Udaipur-313601, Rajasthan, India, [email protected]

Abstract- In this paper we depict various mathematical models based on the themes of data mining. The numerical representations of regression and linear models are explained. We also show the prediction of a datum in the light of statistical approaches, namely the probabilistic approach, data estimation and dispersion theory. The paper also deals with the efficient generation of shared keys required for direct communication among co-processors without active participation of a server. Hence minimization of time complexity, proper utilization of resources and an environment for parallel computing can be achieved with higher throughput in a secured fashion. The techniques involved are cryptic methods based on support analysis, confidence rules, resource mining, sequence mining and feature extraction. A new approach towards realizing the variable-key concept in the Wide-Mouth Frog, Yahalom and SKEY protocols is also depicted in this context.

Keywords- data mining, regression, dispersion theory, sequence mining, variable key

Regression based data-mining techniques
A. Concept
We point out the scenario where the dependency of a datum at time instant t1 on another at t2 can be computed. If we assume d1 is the datum at t1 and d2 the datum at t2,

then we can write the equation d2 = a + b·d1 ... (1), where a, b are constants. Data prediction based on the linear regression model is considered.

B. Linear representation
As per statistical prediction, let the predicted value of a datum d be ∆1 and its original value be ∆2. As per the data-mining based regression model, we can denote ∆i = d2,i − (a + b·d1,i) as the error in taking a + b·d1,i for d2,i; this is known as the error of estimation.

Prediction based on probabilistic approach
Suppose the observed data k1, k2, ..., km have respective probabilities p1, p2, ..., pm. When ∑ pi = 1 (i = 1 to m), then E(k) = ∑ ki pi (i = 1 to m) ... (2), provided it is finite. Here we use a bivariate probability based on K (k1, k2, ..., km), the set of observed data, and Q (q1, q2, ..., qn), the set of predictive values (1 < m < n).

Theorem 1: If the observed data set value and the predicted data set value are two jointly distributed random variables, then E(K + Q) = E(K) + E(Q).
Proof: K assumes values k1, k2, ..., km and Q assumes values q1, q2, ..., qm.

Let P(K = ki, Q = qj) = pij. Then

E(K + Q) = ∑i ∑j (ki + qj) pij = ∑i ∑j ki pij + ∑i ∑j qj pij = ∑i ki ∑j pij + ∑j qj ∑i pij,

so E(K + Q) = E(K) + E(Q) ... (3). Similarly, when K and Q are independent, E(K·Q) = E(K)·E(Q) ... (4).

Prediction based on datum estimation
Let the data space be (k1, k2, ..., kn), and let the distribution function f1(k) of the random variable k involve a parameter θ whose value is unknown; we have to choose a value of θ on the basis of the observed data space (k1, k2, ..., km), where m < n. We select θ̂ = f2(k1, k2, ..., km); it is basically a number, taken as a guess for the value of θ. Hence θ̂ is an estimator of θ, and the value of θ̂ obtained from the observed data space is an estimate of θ. The difference between θ̂ and θ should be negligible for successful prediction of a datum. Now, we can represent the datum assumption criteria as below:

E(θ̂) = θ for the true value of θ ... (5), and Var(θ̂) ≤ Var(Ψ) ... (6),

for the true value of θ, Ψ being any other estimator satisfying equation (5). Hence the data prediction has been pointed out on the basis of the property of unbiasedness (equation (5)) and the property of minimum variance (equation (6)).

Prediction based on dispersion theory and pattern analysis
The values of the data for different sessions are not all equal. In some cases the values are close to one another, while in other cases they deviate highly from one another. In order to get a proper idea about the overall nature of a given set of values, it is necessary to know, besides the average, the extent to which the data differ


among themselves or, equivalently, how they are scattered about the average. Let the values k1, k2, ..., km be the obtained data and c be the average of the original values km+1, km+2, ..., kn. The mean deviation of k about c is given by

MD_c = (1/(n−m)) ∑ |ki − c|, i = 1 to n−m ... (7)

In particular, when c = k̄, the mean deviation about the mean is given by

MD_k̄ = (1/(n−m)) ∑ |ki − k̄|, i = 1 to n−m ... (8)

B. Pattern matching
We want to study the trend analysis of future events based on prediction using previously observed data. If the event delivers some numerical data estimation, then we can predict it in certain forms. We assume dp to be the predicted datum and do the observed datum. If dp and do are linearly related, then dp = a + b·do ... (9). If they are exponentially related, the equation will be of the form dp = a·b^do ... (10). If a logarithmic-transformation based prediction rule is observed, then the equation will be Dp = A + B·do ... (11), where Dp = log dp, A = log a and B = log b. In the case of data merging towards obtaining meaningful information, the convention rule is as follows: di => d(i+k) mod n, where di ∈ D, k is the offset value and n is the number of sensed data elements, i.e. the number of elements of the set D. The value of k varies from stage to stage.

Communication based on support
A. Scheme
A and B are two parties. k1, k2, k3, k4, k5, k6 are keys which are known to A and B only. A sends messages m1, m2, m3, m4, m5, m6 in encrypted form with the help of one or more keys. A third party would have to decipher each message by trial and error and form sets. The key having maximum support is the shared key between A and B. If the number of keys with maximum support is more than one, then one is primary while the others are candidates to it. Here we will find a shared key such that the third party will not be able to decipher the message.

B. Mathematical Analysis
Message: encrypted form
m1: ek1 = f(k1,k3,k4,k6) = k1^k3^k4^k6
m2: ek2 = f(k3,k5) = k3^k5
m3: ek3 = f(k4,k5,k6) = k4^k5^k6
m4: ek4 = f(k2,k3,k5) = k2^k3^k5
m5: ek5 = f(k1,k2) = k1^k2
m6: ek6 = f(k1,k2,k3,k6) = k1^k2^k3^k6

So k3 appears in 4 of the 6 key sets; the support of k3 is 66.6%. Hence the shared key of A and B is k3. If a hacker obtains k1, k2, ..., k6, then by trial and error it will get the shared key. So the concept of an automatic variable shared key is proposed: shared key = (key having maximum support) XOR (XOR of the messages in which that key is not used). Here k3 is the key having maximum support, and m3, m5 are the messages encrypted without k3; therefore, shared key = k3^m3^m5. This scheme is not revealed to the hacker, so it will hack k3 instead of the modified value of the shared key.
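The support computation above can be sketched in a few lines of Python (an illustrative sketch; the integer stand-ins for k3, m3 and m5 at the end are arbitrary placeholders, not values from the text):

```python
from functools import reduce

# key sets used to encrypt each message (as in the analysis above)
key_sets = {
    "m1": {"k1", "k3", "k4", "k6"},
    "m2": {"k3", "k5"},
    "m3": {"k4", "k5", "k6"},
    "m4": {"k2", "k3", "k5"},
    "m5": {"k1", "k2"},
    "m6": {"k1", "k2", "k3", "k6"},
}

# support of a key = fraction of messages whose key set contains it
support = {k: sum(k in s for s in key_sets.values()) / len(key_sets)
           for k in {"k1", "k2", "k3", "k4", "k5", "k6"}}
best = max(support, key=support.get)
assert best == "k3" and abs(support["k3"] - 4 / 6) < 1e-9

# variable shared key: XOR the max-support key with the messages not using it
k3, m3, m5 = 0b1011, 0b0110, 0b0011   # hypothetical bit patterns
shared = reduce(lambda a, b: a ^ b, [k3, m3, m5])
print(best, round(support[best], 3), bin(shared))
```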

Communication based on confidence rule
A. Scheme
Input: m1, m2, m3, m4, m5, m6 to A; k1, k2, k3, k4, k5, k6 to A and B.
Step 1: A encrypts each of the messages with a combination of the keys and sends them to B.
Step 2: B finds a key association key1 => key2 with a confidence level of 100%, i.e. whenever key1 is present, key2 is also present.
Step 3: The shared key is key1.
Step 4 (only for enhancing the security level): shared key = (key1) XOR (key-new), where key-new is chosen such that the confidence of key-new => key1 is minimum.
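A minimal sketch of Steps 2-4, reusing the six key sets from the support example above (purely illustrative):

```python
# key sets associated with the six messages
sets = [
    {"k1", "k3", "k4", "k6"},
    {"k3", "k5"},
    {"k4", "k5", "k6"},
    {"k2", "k3", "k5"},
    {"k1", "k2"},
    {"k1", "k2", "k3", "k6"},
]
keys = {"k1", "k2", "k3", "k4", "k5", "k6"}

def confidence(a, b):
    """Fraction of sets containing `a` that also contain `b`."""
    with_a = [s for s in sets if a in s]
    return sum(b in s for s in with_a) / len(with_a)

# Step 2: the only non-trivial association with 100% confidence is k4 => k6
full = [(a, b) for a in keys for b in keys if a != b and confidence(a, b) == 1.0]
assert full == [("k4", "k6")]

# Step 4: key-new minimises confidence(key-new => k4); here k2 (confidence 0)
key_new = min(keys - {"k4"}, key=lambda k: confidence(k, "k4"))
assert key_new == "k2" and confidence("k2", "k4") == 0.0
print(full, key_new)
```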

B. Mathematical Analysis
Message: keys associated
m1: sk1 = (k1,k3,k4,k6) = (k1^k3^k4^k6)
m2: sk2 = (k3,k5) = (k3^k5)
m3: sk3 = (k4,k5,k6) = (k4^k5^k6)
m4: sk4 = (k2,k3,k5) = (k2^k3^k5)
m5: sk5 = (k1,k2) = (k1^k2)
m6: sk6 = (k1,k2,k3,k6) = (k1^k2^k3^k6)

Only k4 => k6 has a confidence level of 100%, so the shared key = k4 (up to Step 3).

Association: probability
k1 => k4: 1/3
k2 => k4: 0
k3 => k4: 1/4
k5 => k4: 1/2
k6 => k4: 2/3

So key-new = k2, since it has the least probability. Hence, shared key = k4 XOR k2.

Statistical approaches of resource mining
A. Based on prediction of the most frequent word
The most frequent key can be obtained from Max(f1, f2, ..., fn), where f1, f2, ..., fn are relative frequencies and n is the total number of keys.
B. Based on prediction of a variable within an interval


We can predict the value of a variable key if we can measure the interval properly. We can apply this scheme in hacking.

Theorem 2: If a variable key V changes over time t in an exponential manner, then the value of the variable at the centre point of an interval (a1, a2) is the geometric mean of its values at a1 and a2.
Proof: Let V_a = m·n^a. Then V_a1 = m·n^a1 and V_a2 = m·n^a2. Now, the value of V at (a1 + a2)/2 is

m·n^((a1+a2)/2) = [m²·n^(a1+a2)]^(1/2) = [(m·n^a1)(m·n^a2)]^(1/2) = (V_a1·V_a2)^(1/2).
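Theorem 2 can be checked numerically for arbitrary m and n (a sketch with made-up values):

```python
import math

def V(a, m=2.0, n=1.5):
    # exponential growth of the key value: V(a) = m * n**a
    return m * n ** a

a1, a2 = 3.0, 7.0
mid = V((a1 + a2) / 2)            # value at the centre of the interval
gm = math.sqrt(V(a1) * V(a2))     # geometric mean of the endpoint values
assert abs(mid - gm) < 1e-9
print(mid, gm)
```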

C. Based on prediction of interrelated variables
In a message there may be a variable which depends on another through some equation; in that case extraction can be made.
Theorem 3: If a variable m is related to another variable n in the form m = a·n, where a is a constant, then the harmonic mean of m is related to that of n by the same equation.
Proof: Let x be the number of given values. Then
m_HM = x / (∑ 1/m_i), i = 1 to x
= x / (∑ 1/(a·n_i))   [since m_i = a·n_i]
= x / ((1/a) ∑ 1/n_i)
= a · (x / (∑ 1/n_i)), i = 1 to x
= a·n_HM.

Shared key generation in the light of sequence mining
Let us suppose that four users, U1, U2, U3 and U4, are in a network. Each of U1, U2, U3 transmits three messages to U4 in successive sessions.

Sender | Key | Operation
U1 | 110110 | U1(m1) → U4
U2 | 100101 | U2(m1) → U4
U3 | 001010 | U3(m1) → U4
U1 | 001100 | U1(m2) → U4
U2 | 000011 | U2(m2) → U4
U3 | 100001 | U3(m2) → U4
U1 | 111100 | U1(m3) → U4
U2 | 000001 | U2(m3) → U4
U3 | 110100 | U3(m3) → U4

A. Algorithm
Step 1: Designate each bit of the key as a character.
Step 2: If the character's bit value is 1, include it in the sequence.
Step 3: Else ignore the value.
Step 4: Identify the pattern decided by the communicating party and fetch the combination.
Step 5: The shared key for each user will be based on the combined result.
Step 6: Repeat steps 1 to 5 for the other users.

Step 7: The final shared key will be based on the shared keys of U1/U2/U3 in combined form and the computation scheme.

B. Analysis
The bits can be denoted by A, B, C, D, E, F.

Combined sequence of U1: (A,B,D,E) → (C,D) → (A,B,C,D)

Table 1 - Combined sequence for U1
Sequence | Session | A B C D E F
1 | 1 | 1 1 0 1 1 0
2 | 4 | 0 0 1 1 0 0
3 | 7 | 1 1 1 1 0 0

Combined sequence of U2: (A,D,F) → (E,F) → (F)

Table 2 - Combined sequence for U2
Sequence | Session | A B C D E F
1 | 2 | 1 0 0 1 0 1
2 | 5 | 0 0 0 0 1 1
3 | 8 | 0 0 0 0 1 1

Combined sequence of U3: (C,E) → (A,F) → (A,B,D)

Table 3 - Combined sequence for U3
Sequence | Session | A B C D E F
1 | 3 | 0 0 1 0 1 0
2 | 6 | 1 0 0 0 0 1
3 | 9 | 1 1 0 1 0 0

C. Method 1
Communicating parties: U1 and U4 (say). The sequence counts of AB and D are as follows: AB = 2, D = 3. Therefore x1 = 2 and x2 = 3. U1 will compute ((A.M. of 2 and 3) × (H.M. of 2 and 3))^(1/2) and U4 will compute the G.M. of 2 and 3. So the shared key = 6^(1/2). If any occurrence becomes null, then that parameter value is treated as zero.

D. Method 2
Communicating parties: U3 and U4 (say). In the case of U3, the union becomes C E A F B D. So the shared key of U3 and U4 is C E A F B D.

E. Method 3
Communicating parties: U2 and U4 (say). The shared key is based on intersection, and it is F.

Key using feature based method
Let six messages be sent by the sender; these have to be encrypted by a combination of one or more keys using some function.
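The identity underlying Method 1 above — that the square root of (arithmetic mean × harmonic mean) of two positive numbers equals their geometric mean, so U1 and U4 arrive at the same shared key — is easy to verify (illustrative sketch):

```python
import math

x1, x2 = 2, 3                      # sequence counts from Method 1 (AB = 2, D = 3)
am = (x1 + x2) / 2                 # arithmetic mean, computed by U1
hm = 2 / (1 / x1 + 1 / x2)         # harmonic mean, computed by U1
gm = math.sqrt(x1 * x2)            # geometric mean, computed by U4

# U1's combined value agrees with U4's: both equal sqrt(6)
assert abs(math.sqrt(am * hm) - gm) < 1e-12
assert abs(gm - math.sqrt(6)) < 1e-12
print(math.sqrt(am * hm), gm)
```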


Table 4 - Association of keys against each message
Message | Keys associated
M1 | SK1 = (K1, K3, K4, K6)
M2 | SK2 = (K3, K5)
M3 | SK3 = (K4, K5, K6)
M4 | SK4 = (K2, K3, K5)
M5 | SK5 = (K1, K2)
M6 | SK6 = (K1, K2, K3, K6)

Table 5 - Determination of count and value
Key | Initial value | Count | Value | (Value)²
K1 | 0.1 | 3 | 0.3 | 0.09
K2 | 0.2 | 3 | 0.6 | 0.36
K3 | 0.3 | 4 | 1.2 | 1.44
K4 | 0.4 | 2 | 0.8 | 0.64
K5 | 0.5 | 3 | 1.5 | 2.25
K6 | 0.6 | 3 | 1.8 | 3.24

Now CF = (x, y, z), where x = number of elements, y = linear sum of the elements and z = sum of the squares of the elements:
CF1 = (4, 4.1, 5.41); CF2 = (2, 2.7, 3.69); CF3 = (3, 4.1, 6.13); CF4 = (3, 3.3, 4.05); CF5 = (2, 0.9, 0.45); CF6 = (4, 3.9, 5.13).
So CFnet = accumulation of the maximum of each tuple = (4, 4.1, 6.13), and the shared key = floor of the modulus of (4.1 − 6.13) = 2.

Wide-Mouth Frog using variable key
Both Alice and Bob share a secret key with a trusted server, say Trent. The keys are used only for key distribution and not to encrypt any actual messages between users. The proposed algorithm is as follows:

1) Alice concatenates a timestamp, Bob’s name and a technique to deduce random session key based on timestamp and Bob’s name. She then encrypts the whole message with the key she shares with Trent. She sends this to Trent along with her name. Alice sends:- A,EKA(TA, B, f).

2) Trent decrypts the message. For enhanced security, he concatenates a new timestamp, Alice’s name, function “f” and the difference between TB and TA. He then encrypts the whole message with the key he shares with Bob. Trent sends: EKB(TB, A, f, d). Hence, f is automatic variable based on TB, d.

3) Bob decrypts it. He then first verifies the sender's name, and computes TA from TA = TB − d.

4) Then he computes "f" based on TA and the binary form of the ASCII value of his name.

5) Thus he computes K, i.e. the session key with which he will communicate with Alice.

6) In the next iteration TA, KA will be changed and hence “f” and so on.

The main advantage is that the session key K is never transmitted.

Yahalom protocol using variable key
Both Bob and Alice share a secret key with Trent. Let RA = nonce chosen by Alice; NB = number chosen by Bob based on RA and A; KA = shared key between Alice and Trent; KB = shared key between Bob and Trent; A = Alice's name; B = Bob's name; K = random session key.

1. Alice concatenates her name and a random

number and sends it to Bob. 2. Bob computes NB = RA + (binary form of

ASCII value of Alice). He sends Trent B, EKB (A, RA, f), where

f= offset which when applied on NB yields RA. 3. Trent generates two messages to Alice EKA

(B, K', RA, f, d) and EKB (A, K', d), where the random session key K = f(K', d).

4. Alice decrypts the first message and extracts K using f(K', d). Alice sends Bob two messages: EKB (A, K', d) and EK (RA, f).

5. Bob decrypts A, K', d and extracts K as K = f(K', d). Then he extracts NB using f(RA, f) ≡ NB. It is to be remembered that the functions f(K', d) and f(RA, f) should be reversible. Bob then checks whether NB has the same value. At the end, both Alice and Bob are convinced that they are talking to each other and not to a third party. The advantage is that NB and K are never transmitted; the demerit is the calculation of NB and K using the functions specified.

Analysis of SKEY using variable key
SKEY is mainly a program for authentication, based on a one-way function. The proposed algorithm is as follows:
1. The host computes a Bernoulli trial with a biased

coin, for which p = probability of obtaining 1 and q = (1 − p) = probability of obtaining 0. Let the number of trials be n. Assume n = 6 and string = 110011.

2. The host sends the string to Alice.
3. Alice modifies her own public key so that the new public key = previous key + (binary equivalent of the number of 1s present in the string).

4. Alice creates a Shared Key.


5. Alice modifies the public key, along with the modification scheme, using the shared key.

6. Alice then encrypts the string with her private key and sends it back to the host along with her name.

7. The host first decrypts the public key, accordingly fetches Alice's record from the database, and computes the result.

8. If a match is found, it performs another level of verification by decrypting the string with the new value of Alice's public key.

9. If that also matches, then the authentication of Alice is certified.
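Steps 1-3 of the proposed SKEY variant can be sketched as follows (the bias p, seed and key value are illustrative stand-ins; only the sample string 110011 and the "add the count of 1s" update come from the text, with the count interpreted as an integer addition):

```python
import random

def host_string(n=6, p=0.7, seed=42):
    # Step 1: a biased Bernoulli trial per bit (P(bit = 1) = p)
    rng = random.Random(seed)
    return "".join("1" if rng.random() < p else "0" for _ in range(n))

def update_public_key(prev_key, string):
    # Step 3: new public key = previous key + (number of 1s in the string)
    return prev_key + string.count("1")

s = "110011"                              # sample string from the text (n = 6)
assert update_public_key(100, s) == 104   # four 1-bits in 110011
print(host_string(), update_public_key(100, s))
```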

Conclusion
The techniques involved for data prediction in this paper are the regression rule, the probabilistic approach, datum estimation analysis and dispersion theory. We have also shown how pattern matching can be sensed. Several approaches to shared-key computation on the basis of data mining techniques have been discussed in detail with the relevant mathematical analysis. The variable-key concept has also been applied to the Wide-Mouth Frog, Yahalom and SKEY protocols in cryptic data mining.


Advances in Information Mining, ISSN: 0975–3265, Volume 2, Issue 1, 2010, pp-01-07


Fuzzy multi-objective multi-index transportation problem
Lohgaonkar M.H.1, Bajaj V.H.1*, Jadhav V.A.2 and Patwari M.B.1
*1Department of Statistics, Dr. B. A. M. University, Aurangabad, MS, [email protected], [email protected]
2Department of Statistics, Science College, Nanded, MS

Abstract- The aim of this paper is to present a fuzzy multi-objective multi-index transportation problem and to develop a multi-objective multi-index fuzzy programming model. This model can not only satisfy more of the actual requirements of the integral system but is also more flexible than conventional transportation problems. Furthermore, it can offer more information to the decision maker (DM) for reference, and thus raise the quality of decision-making. In this paper, we use special types of linear and non-linear membership functions to solve the multi-objective multi-index transportation problem. It gives an optimal compromise solution.

Keywords- Transportation problem, multi-objective transportation problem, multi-index, linear membership function, non-linear membership function

Introduction
Fuzzy set theory was proposed by L. A. Zadeh and has found extensive application in various fields. Bellman and Zadeh [2] were the first to consider the application of fuzzy set theory to optimization problems in a fuzzy environment; they suggested that both the objective function and the constraints of the model could be represented by corresponding fuzzy sets and should be treated in the same manner. The earliest applications of it to transportation problems include Prade [11], O'he'igeartaigh [10] and Chanas et al. [4]. But these researchers emphasized investigating theory and algorithms, and the above investigations are illustrated with simple instances lacking actual application cases. Moreover, these models have only a single objective and are classical two-index transportation problems. In actual transportation problems, multiple objective functions are generally considered, including the average delivery time of the commodities, minimum cost, etc. Zimmermann [15] applied fuzzy set theory to the linear multicriteria decision-making problem. He used the linear fuzzy membership function and presented the application of the fuzzy linear vector maximum problem. He showed that solutions obtained by fuzzy linear programming always provide efficient solutions, as well as an optimal compromise solution. Aneja and Nair [1] studied the bicriteria transportation problem. Multi-index transportation problems are extensions of conventional transportation problems and are appropriate for transportation problems with multiple supply points and multiple demand points, as well as problems using diverse modes of transportation or delivering different kinds of merchandise. Thus, the problem considered here is more complicated than conventional transportation problems. Junginger [9], who proposed a set of logic problems to solve multi-index transportation problems, also conducted a detailed investigation of the characteristics of the multi-index transportation problem model. Rautman et al. [12] used a multi-index transportation model to solve a shipping-scheduling problem and suggested that the employment of such a model not only improves transportation efficiency but also optimizes the integral system.

Mathematical Model
Multi-objective Multi-index Transportation Problem

Let $a_{ijl}$, $1\le i\le m$, $1\le j\le n$, $1\le l\le k$, be a multi-dimensional array and let $A=(a_{ij})$, $B=(b_{jl})$, $C=(c_{il})$ be multi-matrices. Then the multi-index transportation problem is defined as follows:

$$\text{Minimize } Z=\sum_{i}\sum_{j}\sum_{l}a_{ijl}X_{ijl}\qquad(1)$$

Subject to

$$\sum_{l}X_{ijl}=a_{ij}\;\;\forall(i,j);\qquad \sum_{j}X_{ijl}=c_{il}\;\;\forall(i,l);\qquad \sum_{i}X_{ijl}=b_{jl}\;\;\forall(j,l);\qquad X_{ijl}\ge 0\;\;\forall(i,j,l)\qquad(2)$$

It is immediate that

$$\sum_{i}a_{ij}=\sum_{l}b_{jl};\qquad \sum_{j}a_{ij}=\sum_{l}c_{il};\qquad \sum_{j}b_{jl}=\sum_{i}c_{il}\qquad(3)$$

are three necessary conditions; however, they are noted to be non-sufficient. The multi-objective double transportation problem is as follows:


$$\text{Minimize } Z_{p}=\sum_{i=1}^{m}\sum_{j=1}^{n}k^{(1)}_{ij}x^{(1)}_{ij}+\sum_{i=1}^{m}\sum_{j=1}^{n}k^{(2)}_{ij}x^{(2)}_{ij}\qquad(4)$$

Subject to

$$\sum_{j=1}^{n}x^{(1)}_{ij}=a_{1i}\quad\forall i\qquad(5)$$
$$\sum_{j=1}^{n}x^{(2)}_{ij}=a_{2i}\quad\forall i\qquad(6)$$
$$\sum_{i=1}^{m}x^{(1)}_{ij}=b_{1j}\quad\forall j\qquad(7)$$
$$\sum_{i=1}^{m}x^{(2)}_{ij}=b_{2j}\quad\forall j\qquad(8)$$
$$x^{(1)}_{ij}+x^{(2)}_{ij}=c_{ij}\quad\forall i,j\qquad(9)$$
$$x^{(1)}_{ij},\,x^{(2)}_{ij}\ge 0\quad\forall i,j\qquad(10)$$

It may easily be seen that for the existence of a solution the following set of conditions is necessary:

$$\sum_{j=1}^{n}c_{ij}=a_{1i}+a_{2i}\quad\forall i\qquad(11)$$
$$\sum_{i=1}^{m}c_{ij}=b_{1j}+b_{2j}\quad\forall j\qquad(12)$$
$$\sum_{i=1}^{m}a_{1i}=\sum_{j=1}^{n}b_{1j}\qquad(13)$$
$$\sum_{i=1}^{m}a_{2i}=\sum_{j=1}^{n}b_{2j}\qquad(14)$$
$$c_{ij}\le \min(a_{1i},b_{1j})+\min(a_{2i},b_{2j})\quad\forall(i,j)\qquad(15)$$

It may easily be seen that the DTP is composed of two transportation tables and one C matrix, as given below. Writing $C_{1}=(k^{(1)}_{ij})$ and $C_{2}=(k^{(2)}_{ij})$ for the two cost matrices ((16), (17)), the transportation tables are

$$T_{1}=\begin{array}{cccc|c}k^{(1)}_{11} & k^{(1)}_{12} & \cdots & k^{(1)}_{1n} & a_{11}\\ k^{(1)}_{21} & k^{(1)}_{22} & \cdots & k^{(1)}_{2n} & a_{12}\\ \vdots & \vdots & & \vdots & \vdots\\ k^{(1)}_{m1} & k^{(1)}_{m2} & \cdots & k^{(1)}_{mn} & a_{1m}\\ \hline b_{11} & b_{12} & \cdots & b_{1n} & \end{array}\qquad(18)$$

$$T_{2}=\begin{array}{cccc|c}k^{(2)}_{11} & k^{(2)}_{12} & \cdots & k^{(2)}_{1n} & a_{21}\\ k^{(2)}_{21} & k^{(2)}_{22} & \cdots & k^{(2)}_{2n} & a_{22}\\ \vdots & \vdots & & \vdots & \vdots\\ k^{(2)}_{m1} & k^{(2)}_{m2} & \cdots & k^{(2)}_{mn} & a_{2m}\\ \hline b_{21} & b_{22} & \cdots & b_{2n} & \end{array}$$

and $C=(c_{ij})_{m\times n}$. (19)

Fuzzy algorithm to solve the multi-objective multi-index transportation problem
Step 1: Solve the multi-objective multi-index transportation problem as a single-objective transportation problem P times, taking one of the objectives at a time.
Step 2: From the results of Step 1, determine the corresponding values for every objective at each solution derived. According to each solution and value for every objective, we can form the pay-off matrix:

$$\begin{array}{c|cccc} & Z_{1}(X) & Z_{2}(X) & \cdots & Z_{P}(X)\\ \hline X^{(1)} & Z_{11} & Z_{12} & \cdots & Z_{1P}\\ X^{(2)} & Z_{21} & Z_{22} & \cdots & Z_{2P}\\ \vdots & \vdots & \vdots & & \vdots\\ X^{(P)} & Z_{P1} & Z_{P2} & \cdots & Z_{PP}\end{array}$$

where $X^{(1)},X^{(2)},\ldots,X^{(P)}$ are the isolated optimal solutions of the P different transportation problems for the P different objective functions, and $Z_{ij}=Z_{j}(X^{(i)})$ ($i=1,2,\ldots,P$; $j=1,2,\ldots,P$) is the i-th row, j-th column element of the pay-off matrix.

Step 3: From Step 2, find for each objective the worst ($U_{p}$) and the best ($L_{p}$) values corresponding to the set of solutions, where

$$U_{p}=\max(Z_{1p},Z_{2p},\ldots,Z_{Pp})\quad\text{and}\quad L_{p}=Z_{pp},\qquad p=1,2,\ldots,P.$$
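Step 3 can be sketched in a few lines; the pay-off entries below are hypothetical numbers (chosen to be consistent with the bounds used in the worked example later), and only the U_p/L_p extraction follows the text:

```python
# pay-off matrix Z[i][j] = value of objective j at the solution optimal for objective i
payoff = [
    [300.0, 291.0],   # X(1): optimal for Z1
    [330.0, 283.0],   # X(2): optimal for Z2
]
P = len(payoff)

# U_p = worst (largest) value in column p; L_p = Z_pp, the best value for objective p
U = [max(payoff[i][p] for i in range(P)) for p in range(P)]
L = [payoff[p][p] for p in range(P)]
assert U == [330.0, 291.0] and L == [300.0, 283.0]
print(U, L)
```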

An initial fuzzy model of the problem (4)-(10) can be stated as:

Find $X_{ij}$, $i=1,2,\ldots,m$; $j=1,2,\ldots,n$, so as to satisfy $Z_{p}\,\tilde{\le}\,L_{p}$, $p=1,2,\ldots,P$, subject to (4)-(10).

Step 4: Case (i). Define the membership function for the p-th objective function as follows:

$$\mu_{p}(X)=\begin{cases}1, & \text{if } Z_{p}(X)\le L_{p}\\[4pt] \dfrac{U_{p}-Z_{p}(X)}{U_{p}-L_{p}}, & \text{if } L_{p}< Z_{p}(X)< U_{p}\\[4pt] 0, & \text{if } Z_{p}(X)\ge U_{p}\end{cases}\qquad(20)$$
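A direct transcription of (20) as a function (a sketch; the numeric arguments used below are arbitrary):

```python
def mu_linear(z, lo, hi):
    """Linear membership (20): 1 at or below the best value lo,
    0 at or above the worst value hi, linear in between."""
    if z <= lo:
        return 1.0
    if z >= hi:
        return 0.0
    return (hi - z) / (hi - lo)

# membership falls linearly from 1 at lo to 0 at hi
assert mu_linear(300, 300, 330) == 1.0
assert mu_linear(315, 300, 330) == 0.5
assert mu_linear(330, 300, 330) == 0.0
print([mu_linear(z, 300, 330) for z in (300, 310, 320, 330)])
```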

Step 5: Find an equivalent crisp model by using the linear membership function for the initial fuzzy model:

Maximize λ

subject to

$$\lambda \le \frac{U_{p}-Z_{p}(X)}{U_{p}-L_{p}},\qquad p=1,2,\ldots,P,\qquad(21)$$

subject to (5)-(10).


Step 6: Solve the crisp model by an appropriate mathematical programming algorithm:

Maximize λ

subject to

$$\sum_{i}\sum_{j}C^{p}_{ij}X_{ij}+\lambda\,(U_{p}-L_{p})\le U_{p},\qquad p=1,2,\ldots,P,\qquad(22)$$

subject to (5)-(10).

Now, using the hyperbolic membership function for the p-th objective function,

$$\mu^{H}_{p}(Z_{p}(x))=\begin{cases}1, & \text{if } Z_{p}\le L_{p}\\[4pt] \dfrac{1}{2}\,\dfrac{e^{\{\frac{U_{p}+L_{p}}{2}-Z_{p}(x)\}\alpha_{p}}-e^{-\{\frac{U_{p}+L_{p}}{2}-Z_{p}(x)\}\alpha_{p}}}{e^{\{\frac{U_{p}+L_{p}}{2}-Z_{p}(x)\}\alpha_{p}}+e^{-\{\frac{U_{p}+L_{p}}{2}-Z_{p}(x)\}\alpha_{p}}}+\dfrac{1}{2}, & \text{if } L_{p}< Z_{p}< U_{p}\\[4pt] 0, & \text{if } Z_{p}\ge U_{p}\end{cases}\qquad(23)$$

where

$$\alpha_{p}=\frac{3}{(U_{p}-L_{p})/2}=\frac{6}{U_{p}-L_{p}}.$$

The crisp model for the fuzzy model can be formulated as:

Maximize λ

subject to

$$\lambda \le \frac{1}{2}\,\frac{e^{\{\frac{U_{p}+L_{p}}{2}-Z_{p}(x)\}\alpha_{p}}-e^{-\{\frac{U_{p}+L_{p}}{2}-Z_{p}(x)\}\alpha_{p}}}{e^{\{\frac{U_{p}+L_{p}}{2}-Z_{p}(x)\}\alpha_{p}}+e^{-\{\frac{U_{p}+L_{p}}{2}-Z_{p}(x)\}\alpha_{p}}}+\frac{1}{2},\qquad(24)$$

subject to (5)-(10) and λ ≥ 0.

Solve the crisp model as:

Maximize $X_{mn+1}$

subject to

$$\alpha_{p}Z_{p}(x)+X_{mn+1}\le \alpha_{p}(U_{p}+L_{p})/2,\qquad p=1,2,\ldots,P,\qquad(25)$$

subject to (5)-(10) and $X_{mn+1}\ge 0$, where $X_{mn+1}=\tanh^{-1}(2\lambda-1)$.
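The ratio of exponentials in (23) is just tanh, which is what makes the crisp model (25) linear; the change of variable X_{mn+1} = tanh⁻¹(2λ − 1) can be checked numerically (an illustrative sketch with the bounds from the example below):

```python
import math

def mu_h_exp(z, lo, hi):
    # hyperbolic membership (23) written with exponentials
    a = 6.0 / (hi - lo)
    t = ((hi + lo) / 2.0 - z) * a
    return 0.5 * (math.exp(t) - math.exp(-t)) / (math.exp(t) + math.exp(-t)) + 0.5

def mu_h_tanh(z, lo, hi):
    # same membership written with tanh
    a = 6.0 / (hi - lo)
    return 0.5 * math.tanh(((hi + lo) / 2.0 - z) * a) + 0.5

lo, hi = 300.0, 330.0
for z in (301.0, 315.0, 329.0):
    assert abs(mu_h_exp(z, lo, hi) - mu_h_tanh(z, lo, hi)) < 1e-12

# lambda <-> X_{mn+1}: setting mu = lambda gives X_{mn+1} = atanh(2*lambda - 1)
lam = mu_h_tanh(305.0, lo, hi)
x_aux = math.atanh(2.0 * lam - 1.0)
assert abs(0.5 * math.tanh(x_aux) + 0.5 - lam) < 1e-12
print(lam, x_aux)
```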

Now, the exponential membership function for the p-th objective function is defined as

$$\mu^{E}_{p}(Z_{p}(x))=\begin{cases}1, & \text{if } Z_{p}\le L_{p}\\[4pt] \dfrac{e^{-S\Psi_{p}(X)}-e^{-S}}{1-e^{-S}}, & \text{if } L_{p}< Z_{p}< U_{p}\\[4pt] 0, & \text{if } Z_{p}\ge U_{p}\end{cases}\qquad(26)$$

where

$$\Psi_{p}(X)=\frac{Z_{p}-L_{p}}{U_{p}-L_{p}},\qquad p=1,2,\ldots,P.$$

S is a non-zero parameter, prescribed by the decision maker.

Numerical Examples
Example 1 consists of the cost matrices $C_{1}$, $C_{2}$ and the matrix $C$, with supplies $a_{1}=(9,14,6,7)$, $a_{2}=(6,7,5,6)$ and demands $b_{1}=(14,12,10)$, $b_{2}=(5,8,11)$:

$$C_{1}=\begin{pmatrix}4 & 3 & 5\\ 8 & 6 & 2\\ 7 & 4 & 1\\ 9 & 10 & 12\end{pmatrix},\quad C_{2}=\begin{pmatrix}8 & 6 & 3\\ 5 & 4 & 1\\ 9 & 2 & 6\\ 4 & 9 & 3\end{pmatrix},\quad C=\begin{pmatrix}5 & 7 & 3\\ 8 & 4 & 9\\ 4 & 1 & 6\\ 2 & 8 & 3\end{pmatrix}\qquad(27)$$

Example 2 uses the cost tables $T_{1}$, $T_{2}$ with the same matrix $C$ and the same supplies and demands:

$$T_{1}=\begin{pmatrix}5 & 6 & 7\\ 4 & 5 & 2\\ 1 & 3 & 4\\ 4 & 2 & 3\end{pmatrix},\quad T_{2}=\begin{pmatrix}10 & 9 & 9\\ 7 & 9 & 2\\ 8 & 7 & 9\\ 8 & 4 & 5\end{pmatrix}\qquad(28)$$
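The necessary conditions (11)-(14) can be checked for the data of the examples (a sketch using the arrays reconstructed from the objective and constraint coefficients):

```python
# data of Examples 1-2: C matrix, supplies a1, a2 and demands b1, b2
C = [[5, 7, 3], [8, 4, 9], [4, 1, 6], [2, 8, 3]]
a1, a2 = [9, 14, 6, 7], [6, 7, 5, 6]
b1, b2 = [14, 12, 10], [5, 8, 11]

# (11): row sums of C equal a1i + a2i
assert all(sum(C[i]) == a1[i] + a2[i] for i in range(4))
# (12): column sums of C equal b1j + b2j
assert all(sum(C[i][j] for i in range(4)) == b1[j] + b2[j] for j in range(3))
# (13)-(14): total supply equals total demand for each layer
assert sum(a1) == sum(b1) == 36 and sum(a2) == sum(b2) == 24
print("conditions (11)-(14) hold")
```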

Example 1 is simplified as

$$\text{Minimize } Z_{1}=4X^{(1)}_{11}+3X^{(1)}_{12}+5X^{(1)}_{13}+8X^{(1)}_{21}+6X^{(1)}_{22}+2X^{(1)}_{23}+7X^{(1)}_{31}+4X^{(1)}_{32}+X^{(1)}_{33}+9X^{(1)}_{41}+10X^{(1)}_{42}+12X^{(1)}_{43}+8X^{(2)}_{11}+6X^{(2)}_{12}+3X^{(2)}_{13}+5X^{(2)}_{21}+4X^{(2)}_{22}+X^{(2)}_{23}+9X^{(2)}_{31}+2X^{(2)}_{32}+6X^{(2)}_{33}+4X^{(2)}_{41}+9X^{(2)}_{42}+3X^{(2)}_{43}\qquad(29)$$

Subject to

$$X^{(1)}_{11}+X^{(1)}_{12}+X^{(1)}_{13}=9;\quad X^{(1)}_{21}+X^{(1)}_{22}+X^{(1)}_{23}=14;\quad X^{(1)}_{31}+X^{(1)}_{32}+X^{(1)}_{33}=6;\quad X^{(1)}_{41}+X^{(1)}_{42}+X^{(1)}_{43}=7$$
$$X^{(2)}_{11}+X^{(2)}_{12}+X^{(2)}_{13}=6;\quad X^{(2)}_{21}+X^{(2)}_{22}+X^{(2)}_{23}=7;\quad X^{(2)}_{31}+X^{(2)}_{32}+X^{(2)}_{33}=5;\quad X^{(2)}_{41}+X^{(2)}_{42}+X^{(2)}_{43}=6$$
$$X^{(1)}_{11}+X^{(1)}_{21}+X^{(1)}_{31}+X^{(1)}_{41}=14;\quad X^{(1)}_{12}+X^{(1)}_{22}+X^{(1)}_{32}+X^{(1)}_{42}=12;\quad X^{(1)}_{13}+X^{(1)}_{23}+X^{(1)}_{33}+X^{(1)}_{43}=10$$
$$X^{(2)}_{11}+X^{(2)}_{21}+X^{(2)}_{31}+X^{(2)}_{41}=5;\quad X^{(2)}_{12}+X^{(2)}_{22}+X^{(2)}_{32}+X^{(2)}_{42}=8;\quad X^{(2)}_{13}+X^{(2)}_{23}+X^{(2)}_{33}+X^{(2)}_{43}=11$$
$$X^{(1)}_{11}+X^{(2)}_{11}=5;\; X^{(1)}_{12}+X^{(2)}_{12}=7;\; X^{(1)}_{13}+X^{(2)}_{13}=3;\; X^{(1)}_{21}+X^{(2)}_{21}=8;\; X^{(1)}_{22}+X^{(2)}_{22}=4;\; X^{(1)}_{23}+X^{(2)}_{23}=9;$$
$$X^{(1)}_{31}+X^{(2)}_{31}=4;\; X^{(1)}_{32}+X^{(2)}_{32}=1;\; X^{(1)}_{33}+X^{(2)}_{33}=6;\; X^{(1)}_{41}+X^{(2)}_{41}=2;\; X^{(1)}_{42}+X^{(2)}_{42}=8;\; X^{(1)}_{43}+X^{(2)}_{43}=3\qquad(30)$$

Fuzzy multi-objective multi-index transportation problem

Example 2 is simplified as

Minimize Z_2 = 5X_11^(1)+6X_12^(1)+7X_13^(1)+4X_21^(1)+5X_22^(1)+2X_23^(1)+X_31^(1)+3X_32^(1)+4X_33^(1)+4X_41^(1)+2X_42^(1)+3X_43^(1)+10X_11^(2)+9X_12^(2)+9X_13^(2)+7X_21^(2)+9X_22^(2)+2X_23^(2)+8X_31^(2)+7X_32^(2)+9X_33^(2)+8X_41^(2)+4X_42^(2)+5X_43^(2)   (31)

subject to the same constraint set as Example 1, i.e. constraints (30).   (32)

For objective Z_1, we find the optimal solution

X_11^(1)=5, X_12^(1)=4, X_21^(1)=8, X_22^(1)=2, X_23^(1)=4, X_33^(1)=6, X_41^(1)=1, X_42^(1)=6,
X_12^(2)=3, X_13^(2)=3, X_22^(2)=2, X_23^(2)=5, X_31^(2)=4, X_32^(2)=1, X_41^(2)=1, X_42^(2)=2, X_43^(2)=3

(all other variables zero), with Z_1 = 300.

For objective Z_2, we find the optimal solution

X_11^(1)=4, X_12^(1)=5, X_21^(1)=8, X_22^(1)=4, X_23^(1)=2, X_31^(1)=1, X_33^(1)=5, X_41^(1)=1, X_42^(1)=3, X_43^(1)=3,
X_11^(2)=1, X_12^(2)=2, X_13^(2)=3, X_23^(2)=7, X_31^(2)=3, X_32^(2)=1, X_33^(2)=1, X_41^(2)=1, X_42^(2)=5

(all other variables zero), with Z_2 = 283.

Now, evaluating the second objective at X^(1) gives Z_2(X^(1)) = 291, and evaluating the first objective at X^(2) gives Z_1(X^(2)) = 330.

The pay-off matrix is

            Z_1    Z_2
    X^(1)   300    291
    X^(2)   330    283

From this matrix, U_1 = 330, U_2 = 291, L_1 = 300, L_2 = 283.

Find {X_ij, i = 1, ..., 4; j = 1, 2, 3} so as to satisfy Z_1 ≤ 300 and Z_2 ≤ 283.

Define membership functions for the objective functions Z_1(X) and Z_2(X), respectively:

           1,                     if Z_1(X) ≤ 300
μ_1(X) =   (330 - Z_1(X))/30,     if 300 < Z_1(X) < 330
           0,                     if Z_1(X) ≥ 330

           1,                     if Z_2(X) ≤ 283
μ_2(X) =   (291 - Z_2(X))/8,      if 283 < Z_2(X) < 291
           0,                     if Z_2(X) ≥ 291

Find an equivalent crisp model:

Maximize λ
subject to 30λ + Z_1(X) ≤ 330 and 8λ + Z_2(X) ≤ 291.

Solve the crisp model by using an appropriate mathematical algorithm. Written out in full:

4X_11^(1)+3X_12^(1)+5X_13^(1)+8X_21^(1)+6X_22^(1)+2X_23^(1)+7X_31^(1)+4X_32^(1)+X_33^(1)+9X_41^(1)+10X_42^(1)+12X_43^(1)+8X_11^(2)+6X_12^(2)+3X_13^(2)+5X_21^(2)+4X_22^(2)+X_23^(2)+9X_31^(2)+2X_32^(2)+6X_33^(2)+4X_41^(2)+9X_42^(2)+3X_43^(2)+30λ ≤ 330

5X_11^(1)+6X_12^(1)+7X_13^(1)+4X_21^(1)+5X_22^(1)+2X_23^(1)+X_31^(1)+3X_32^(1)+4X_33^(1)+4X_41^(1)+2X_42^(1)+3X_43^(1)+10X_11^(2)+9X_12^(2)+9X_13^(2)+7X_21^(2)+9X_22^(2)+2X_23^(2)+8X_31^(2)+7X_32^(2)+9X_33^(2)+8X_41^(2)+4X_42^(2)+5X_43^(2)+8λ ≤ 291

subject to (30). The optimal compromise solution of the problem is

λ = 0.6521

X_11^(1)=5, X_12^(1)=2.2608, X_13^(1)=1.7391, X_21^(1)=8, X_22^(1)=3.7391, X_23^(1)=2.2608, X_33^(1)=6, X_41^(1)=1, X_42^(1)=6,
X_12^(2)=4.7391, X_13^(2)=1.2608, X_23^(2)=6.7391, X_31^(2)=4, X_32^(2)=1, X_41^(2)=1, X_42^(2)=2, X_43^(2)=3,

with Z_1* = 309.3902 and Z_2* = 283.4329.
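As a cross-check, the crisp model above (the 24 flow variables, the index constraints (30), and the two weighted membership goals) can be rebuilt and handed to an off-the-shelf LP solver. This is an illustrative sketch assuming SciPy's `linprog` is available; the paper itself used LINDO.

```python
import numpy as np
from scipy.optimize import linprog

# Cost layers from (27)-(28) and the index totals used in constraints (30).
C1 = np.array([[4,3,5],[8,6,2],[7,4,1],[9,10,12]])   # Z1 costs, layer 1
C2 = np.array([[8,6,3],[5,4,1],[9,2,6],[4,9,3]])     # Z1 costs, layer 2
T1 = np.array([[5,6,7],[4,5,2],[1,3,4],[4,2,3]])     # Z2 costs, layer 1
T2 = np.array([[10,9,9],[7,9,2],[8,7,9],[8,4,5]])    # Z2 costs, layer 2
a = [[9,14,6,7],[6,7,5,6]]                           # origin totals per layer
b = [[14,12,10],[5,8,11]]                            # destination totals per layer
e = np.array([[5,7,3],[8,4,9],[4,1,6],[2,8,3]])      # cross-layer cell totals

m, n = 4, 3
nv = 2*m*n + 1                                       # 24 flows plus lambda
def idx(k, i, j): return k*m*n + i*n + j

A_eq, b_eq = [], []
for k in range(2):
    for i in range(m):                               # row sums per layer
        r = np.zeros(nv); r[[idx(k,i,j) for j in range(n)]] = 1
        A_eq.append(r); b_eq.append(a[k][i])
    for j in range(n):                               # column sums per layer
        r = np.zeros(nv); r[[idx(k,i,j) for i in range(m)]] = 1
        A_eq.append(r); b_eq.append(b[k][j])
for i in range(m):                                   # cell totals across layers
    for j in range(n):
        r = np.zeros(nv); r[idx(0,i,j)] = 1; r[idx(1,i,j)] = 1
        A_eq.append(r); b_eq.append(e[i,j])

z1 = np.concatenate([C1.ravel(), C2.ravel()]).astype(float)
z2 = np.concatenate([T1.ravel(), T2.ravel()]).astype(float)
# Linear membership goals: Z1 + 30*lambda <= 330 and Z2 + 8*lambda <= 291.
A_ub = np.vstack([np.append(z1, 30.0), np.append(z2, 8.0)])
b_ub = [330.0, 291.0]

c = np.zeros(nv); c[-1] = -1.0                       # maximize lambda
res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, None)]*(nv-1) + [(0, 1)], method="highs")
lam = res.x[-1]
print(round(lam, 4), round(z1 @ res.x[:-1], 2), round(z2 @ res.x[:-1], 2))
```

The paper's compromise point (λ = 0.6521) is feasible for this LP, so the solver's optimal λ is at least that value.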

If we use the hyperbolic membership function, then

α_1 = 6/(U_1 - L_1) = 6/(330 - 300) = 6/30;  α_2 = 6/(U_2 - L_2) = 6/(291 - 283) = 6/8

(U_1 + L_1)/2 = 630/2 = 315;  (U_2 + L_2)/2 = 574/2 = 287

Then the membership functions μ_1^H(Z_1) and μ_2^H(Z_2) for the objectives Z_1 and Z_2, respectively, are defined as follows:

                1,                                         if Z_1(X) ≤ 300
μ_1^H(Z_1(x)) = (1/2)tanh{(315 - Z_1(X))·(6/30)} + 1/2,    if 300 < Z_1(X) < 330
                0,                                         if Z_1(X) ≥ 330

                1,                                         if Z_2(X) ≤ 283
μ_2^H(Z_2(x)) = (1/2)tanh{(287 - Z_2(X))·(6/8)} + 1/2,     if 283 < Z_2(X) < 291
                0,                                         if Z_2(X) ≥ 291
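As a quick sanity check on these definitions, the hyperbolic membership can be evaluated at the bounds and midpoints above (an illustrative sketch; 300/330 and 283/291 are the L_p/U_p from the pay-off matrix):

```python
import math

# Hyperbolic membership: mu = 1/2*tanh(alpha*((U+L)/2 - Z)) + 1/2 inside (L, U),
# with alpha = 6/(U - L); clipped to 1 below L and to 0 above U.
def mu_h(Z, L, U):
    if Z <= L:
        return 1.0
    if Z >= U:
        return 0.0
    alpha = 6.0 / (U - L)
    return 0.5 * math.tanh(alpha * ((U + L) / 2.0 - Z)) + 0.5

print(mu_h(315, 300, 330))       # midpoint of [300, 330] -> 0.5
print(mu_h(282.3, 283, 291))     # below L2 -> 1.0
print(round(mu_h(300.87, 300, 330), 4))
```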

We get an equivalent crisp model:

Maximize X_{mn+1}
subject to
α_1 Z_1(X) + X_{mn+1} ≤ α_1 (U_1 + L_1)/2, i.e.

(6/30)(4X_11^(1)+3X_12^(1)+5X_13^(1)+8X_21^(1)+6X_22^(1)+2X_23^(1)+7X_31^(1)+4X_32^(1)+X_33^(1)+9X_41^(1)+10X_42^(1)+12X_43^(1)+8X_11^(2)+6X_12^(2)+3X_13^(2)+5X_21^(2)+4X_22^(2)+X_23^(2)+9X_31^(2)+2X_32^(2)+6X_33^(2)+4X_41^(2)+9X_42^(2)+3X_43^(2)) + X_{mn+1} ≤ (6/30)·315,

or, multiplying through by 30,

24X_11^(1)+18X_12^(1)+30X_13^(1)+48X_21^(1)+36X_22^(1)+12X_23^(1)+42X_31^(1)+24X_32^(1)+6X_33^(1)+54X_41^(1)+60X_42^(1)+72X_43^(1)+48X_11^(2)+36X_12^(2)+18X_13^(2)+30X_21^(2)+24X_22^(2)+6X_23^(2)+54X_31^(2)+12X_32^(2)+36X_33^(2)+24X_41^(2)+54X_42^(2)+18X_43^(2)+30X_{mn+1} ≤ 1890,

and
α_2 Z_2(X) + X_{mn+1} ≤ α_2 (U_2 + L_2)/2, i.e.

(6/8)(5X_11^(1)+6X_12^(1)+7X_13^(1)+4X_21^(1)+5X_22^(1)+2X_23^(1)+X_31^(1)+3X_32^(1)+4X_33^(1)+4X_41^(1)+2X_42^(1)+3X_43^(1)+10X_11^(2)+9X_12^(2)+9X_13^(2)+7X_21^(2)+9X_22^(2)+2X_23^(2)+8X_31^(2)+7X_32^(2)+9X_33^(2)+8X_41^(2)+4X_42^(2)+5X_43^(2)) + X_{mn+1} ≤ (6/8)·287,

or, multiplying through by 8,

30X_11^(1)+36X_12^(1)+42X_13^(1)+24X_21^(1)+30X_22^(1)+12X_23^(1)+6X_31^(1)+18X_32^(1)+24X_33^(1)+24X_41^(1)+12X_42^(1)+18X_43^(1)+60X_11^(2)+54X_12^(2)+54X_13^(2)+42X_21^(2)+54X_22^(2)+12X_23^(2)+48X_31^(2)+42X_32^(2)+54X_33^(2)+48X_41^(2)+24X_42^(2)+30X_43^(2)+8X_{mn+1} ≤ 1722,

subject to (30). The problem was solved by using the linear interactive and discrete optimization (LINDO) software; the optimal compromise solution is


X_{mn+1} = 1.9608, λ = 0.9804

X_11^(1)=5, X_12^(1)=3.1304, X_21^(1)=8, X_22^(1)=2.8695, X_23^(1)=3.1304, X_33^(1)=6, X_41^(1)=1, X_42^(1)=6,
X_12^(2)=3.8695, X_13^(2)=2.1304, X_22^(2)=1.1304, X_23^(2)=5.8695, X_31^(2)=4, X_32^(2)=1, X_41^(2)=1, X_42^(2)=2, X_43^(2)=3,

with Z_1* = 300.8683 and Z_2* = 282.3024.

With the exponential membership function (taking S = 1):

                1,                                       if Z_1 ≤ 300
μ_1^E(Z_1(x)) = (e^{-Ψ_1(X)} - e^{-1})/(1 - e^{-1}),     if 300 < Z_1 < 330
                0,                                       if Z_1 ≥ 330

                1,                                       if Z_2 ≤ 283
μ_2^E(Z_2(x)) = (e^{-Ψ_2(X)} - e^{-1})/(1 - e^{-1}),     if 283 < Z_2 < 291
                0,                                       if Z_2 ≥ 291
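These piecewise definitions can be checked directly with a small illustrative sketch (S = 1, bounds from the pay-off matrix):

```python
import math

# Exponential membership (26): mu = (e^{-S*Psi} - e^{-S}) / (1 - e^{-S}),
# with Psi = (Z - L)/(U - L), clipped to 1 below L and to 0 above U.
def mu_e(Z, L, U, S=1.0):
    if Z <= L:
        return 1.0
    if Z >= U:
        return 0.0
    psi = (Z - L) / (U - L)
    return (math.exp(-S * psi) - math.exp(-S)) / (1.0 - math.exp(-S))

print(mu_e(300, 300, 330))               # at L1 -> 1.0
print(mu_e(330, 300, 330))               # at U1 -> 0.0
print(round(mu_e(306.11, 300, 330), 4))  # close to the lambda = 0.7084 reported below
```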

Then an equivalent crisp model for the fuzzy model can be formulated as:

Maximize λ
subject to
λ ≤ (e^{-S·Ψ_p(x)} - e^{-S})/(1 - e^{-S}),  p = 1, 2, ..., P,

subject to (7)-(9), where

Ψ_1(X) = (Z_1 - L_1)/(U_1 - L_1) = (Z_1 - 300)/30  and  Ψ_2(X) = (Z_2 - L_2)/(U_2 - L_2) = (Z_2 - 283)/8.

In expanded form,

Ψ_1(X) = (4X_11^(1)+3X_12^(1)+5X_13^(1)+8X_21^(1)+6X_22^(1)+2X_23^(1)+7X_31^(1)+4X_32^(1)+X_33^(1)+9X_41^(1)+10X_42^(1)+12X_43^(1)+8X_11^(2)+6X_12^(2)+3X_13^(2)+5X_21^(2)+4X_22^(2)+X_23^(2)+9X_31^(2)+2X_32^(2)+6X_33^(2)+4X_41^(2)+9X_42^(2)+3X_43^(2) - 300)/30

Ψ_2(X) = (5X_11^(1)+6X_12^(1)+7X_13^(1)+4X_21^(1)+5X_22^(1)+2X_23^(1)+X_31^(1)+3X_32^(1)+4X_33^(1)+4X_41^(1)+2X_42^(1)+3X_43^(1)+10X_11^(2)+9X_12^(2)+9X_13^(2)+7X_21^(2)+9X_22^(2)+2X_23^(2)+8X_31^(2)+7X_32^(2)+9X_33^(2)+8X_41^(2)+4X_42^(2)+5X_43^(2) - 283)/8

Then the problem is

λ ≤ (e^{-Ψ_1(x)} - e^{-1})/(1 - e^{-1})  and  λ ≤ (e^{-Ψ_2(x)} - e^{-1})/(1 - e^{-1}),

subject to (30). The problem can be simplified as

Maximize λ
subject to
e^{-S·Ψ_p(X)} - (1 - e^{-S})λ ≥ e^{-S},  p = 1, 2, ..., P,
subject to (30), ∀ i, j, and λ ≥ 0.

With S = 1 this becomes

e^{-Ψ_1(X)} - (1 - e^{-1})λ ≥ e^{-1}  ⇒  e^{-Ψ_1(X)} - (0.6321)λ ≥ 0.368
e^{-Ψ_2(X)} - (1 - e^{-1})λ ≥ e^{-1}  ⇒  e^{-Ψ_2(X)} - (0.6321)λ ≥ 0.368

The problem is solved by the general interactive optimization (LINGO) software:

λ = 0.7084

X_11^(1)=5, X_12^(1)=2.3703, X_13^(1)=1.6296, X_21^(1)=8, X_22^(1)=4, X_23^(1)=2, X_33^(1)=6, X_41^(1)=1, X_42^(1)=5.6296, X_43^(1)=0.3703,
X_12^(2)=4.6296, X_13^(2)=1.3703, X_31^(2)=4, X_32^(2)=1, X_41^(2)=1, X_42^(2)=2.3703, X_43^(2)=2.6296,

with Z_1* = 306.1085 and Z_2* = 270.6274.

Conclusion
In this paper, the multi-objective multi-index transportation problem is defined and solved by a fuzzy programming technique with linear, hyperbolic, and exponential membership functions. The multi-index transportation problem can represent different modes between origins and destinations, or it may represent a set of intermediate warehouses. If we use the hyperbolic membership function, the crisp model becomes linear. The optimal compromise solution under the hyperbolic membership function changes significantly compared with the solution obtained under the linear membership function, whereas the optimal compromise solution under the exponential membership function does not change significantly compared with the linear case.

References
[1] Aneja V.P. and Nair K.P.K. (1979)

Management Science, 25, 73-78.
[2] Bellman R.E. and Zadeh L.A. (1970) Management Science, 17, 141-164.
[3] Bit A.K., Biswal M.P. and Alam S.S. (1993) Industrial Engineering Journal, XXII(6), 8-12.
[4] Chanas S., Kolodzejczyk W. and Machaj A. (1984) Fuzzy Sets and Systems, 13, 211-221.
[5] Gwo-Hshiung Tzeng, Dusan Teodorovic and Ming-Jiu Hwang (1996) European Journal of Operations Research, 95, 62-72.


[6] Haley K. B. (1963) Operations Research 10, 448-463.

[7] Haley K. B. (1963) Operations Research 11, 369-379.

[8] Haley K. B. (1965) Operations Research 16, 471-474.

[9] Junginger W. (1993) European Journal of Operational Research 66, 353-371.

[10] Oheigeartaigh M. (1982) Fuzzy Sets and Systems, 8, 235-243.
[11] Prade H. (1980) Fuzzy Sets: Theory and Applications to Policy Analysis and Information Systems, Plenum Press, New York, 155-170.
[12] Rautman C.A., Reid R.A. and Ryder E.E. (1993) Operations Research, 41, 459-469.
[13] Verma Rakesh, Biswal M.P. and Biswas A. (1997) Fuzzy Sets and Systems, 91, 37-43.
[14] Waiel F. Abd El-Wahed (2001) Fuzzy Sets and Systems, 117, 26-33.
[15] Zimmermann H.J. (1978) Fuzzy Sets and Systems, 1, 45-55.


Advances in Information Mining, ISSN: 0975–3265, Volume 2, Issue 1, 2010, pp-08-12

Copyright © 2010, Bioinfo Publications, Advances in Information Mining, ISSN: 0975–3265, Volume 2, Issue 1, 2010

A new fuzzy MADM approach used for finite selection problem

Muley A.A. and Bajaj V.H.* *Department of Statistics, Dr. B. A. M. University, Aurangabad (M.S.)-431004, India

[email protected], [email protected]
Abstract- This paper proposes a new approach to product configuration by applying the theory of Fuzzy Multiple Attribute Decision Making (FMADM), which focuses on the uncertain and fuzzy requirements the customer submits to the product supplier. The proposed method can be used in e-commerce websites, with which it is easy for customers to get their preferred product according to the utility value with respect to all attributes. The main concern of this paper is that the requirements the customer submits for the configuration of a television are vague. We further verify the validity and feasibility of the proposed method by comparison with the Weighted Product Method (WPM). Finally, the television is taken as an example to demonstrate the proposed methods.
Keywords- MADM, Fuzzy, Triangular fuzzy number, T.V., Uncertainty
Introduction
Real-world problems often require a decision maker (DM) to rank discrete alternatives or, at least, to select the best one. MADM theory was developed to help the DM solve such problems. MADM has been one of the fastest growing areas during the last decades, reflecting changes in the business sector; Hwang & Yoon [1], Turban [4]. We focus on MADM as used in a finite 'selection' or choice problem, where it plays a most important role. Nowadays the television is common in every person's life, so we take the selection of a television configuration as our application. Generally, people purchase the 21" size for home use; therefore we choose this most common size.
Mass customization, as a business strategy, aims at meeting diverse customer needs while maintaining near mass-production efficiency; it can realize both economies of scale and of scope for an enterprise and has become a goal that companies pursue; Zhu & Jiang [7]. In order to reach this goal, companies are often forced to adopt a differentiation strategy, offering customers more choices of products to meet the growing individualization of demand by taking a more customer-centric role. Configuration approaches based on rules usually depend on experts' experience to establish. Configuration is one of the most important ways to realize quick product customization. But in business, particularly over the internet, a customer normally develops in his mind some sort of ambiguity, given the choice of similar products. The main concern here is that the customers' requirements with respect to the configuration of a television are vague. The television is taken as an example to further verify the validity and feasibility of the proposed method, compared with the WPM of Millar & Starr [2].
Framework of product configuration based on uncertain customer requirements
Each attribute has a finite set of possible values; a variant is defined by means of attributes and attribute values. Together, all attributes and attribute values describe the complete range of the product family. Products in the same product family vary according to different attributes and attribute values, so choosing a product can be considered as a process of choosing its attributes and attribute values. But, generally, it is difficult for a customer to express his requirements in a clear and unambiguous way, often because he is not thoroughly familiar with the product that the supplier offers. So the requirements are often vague and fuzzy, and the preference weight varies with respect to different product attributes. We describe the customers' vague and uncertain requirements in the form of fuzzy numbers, using the representation methods of fuzzy set theory; this design also solves the configuration problem in an uncertain environment. Different products have various attributes, but some attributes, such as color and shape, are not suitable to be represented as fuzzy numbers. These attributes are usually clear in the customer's mind, and the customer can select the attribute value by viewing a virtual product model in a browser environment. In a realistic configuration system, this is achieved by directly selecting the corresponding attribute value that the customer prefers. By using the theory of fuzzy MADM, the requirements the customer specifies for the corresponding product attributes can be regarded as an ideal product. First, each uncertain attribute value the customer specifies is represented as a triangular fuzzy number or an interval fuzzy number, the most common way to handle uncertain, imprecise quantities. Moreover, since the attribute values of the alternate products offered for selection are determinate and known, it is impossible to measure directly the distance or similarity degree between the ideal product the customer wants and the alternate products.
The definite attribute value is therefore converted into the form of a fuzzy number so that the distance between two fuzzy numbers can be computed. When choosing a product from a number of similar alternatives, a customer


normally develops some sort of ambiguity. The ambiguity is mainly due to two reasons: first, how to make a final product choice to purchase, and second, on what basis the other products will be rejected. In order to answer these questions, the customer may like to classify the products into different preference levels, preferably through some numerical strength of preference; Mohanty & Bhasker [3]. We adopt the triangular fuzzy number to represent the vague requirements provided by the customers, as shown in Fig. (1).

The triangular membership function is

           0,                 if x < a
μ_Ã(x) =   (x - a)/(b - a),   if a ≤ x ≤ b
           (c - x)/(c - b),   if b ≤ x ≤ c       (1)
           0,                 if x > c
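A minimal sketch of the triangular membership function (1), evaluated on the "speakers" requirement (2, 5, 8) that appears later in the case study:

```python
# Triangular membership function (1) for a fuzzy requirement A = (a, b, c).
def tri(x, a, b, c):
    if x < a or x > c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

print(tri(5, 2, 5, 8), tri(3.5, 2, 5, 8), tri(9, 2, 5, 8))  # 1.0 0.5 0.0
```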

Fuzzy MADM methodology
When a customer chooses his preferred product from many candidate products, he does so, in effect, by comparing the different attributes that describe product performance in different aspects and ranking the products according to his subjective preference. The customer's requirements for products are usually uncertain and vague because he is unable to understand product specifications comprehensively. On the other hand, the attribute values or specifications of products offered by manufacturers are determinate and known. The model of fuzzy MADM was first introduced by Yang & Chou [6]. The general MADM model can be described as follows:

• Let X = {X_i | i = 1, 2, ..., m} denote a finite discrete set of m (≥ 2) possible alternatives (courses of action, candidates);
• Let A = {A_j | j = 1, 2, ..., n} denote a finite set of n (≥ 2) attributes according to which the desirability of an alternative is to be judged;
• Let ω = (ω_1, ω_2, ..., ω_n)^T be the vector of weights, where Σ_{j=1}^n ω_j = 1, ω_j ≥ 0, j = 1, 2, ..., n, and ω_j denotes the weight of attribute A_j;
• Let R = (r_ij)_{m×n} denote the m×n decision matrix, where r_ij (≥ 0) is the performance rating of alternative X_i with respect to attribute A_j.

Normally, there are two types of attributes in a MADM problem: the first is of 'cost' nature and the second of 'benefit' nature. Since the attributes are generally incommensurate, the decision matrix needs to be normalized so as to transform the various attribute values into comparable ones. A common method of normalization is

Z_ij = (r_ij - r_j^min)/(r_j^max - r_j^min),  i = 1, ..., m; j = 1, ..., n,  for a benefit attribute   (2)

and

Z_ij = (r_j^max - r_ij)/(r_j^max - r_j^min),  i = 1, ..., m; j = 1, ..., n,  for a cost attribute   (3)

where Z_ij is the normalized attribute value, and r_j^max and r_j^min are given by

r_j^max = max(r_1j, r_2j, ..., r_mj),  j = 1, ..., n   (4)
r_j^min = min(r_1j, r_2j, ..., r_mj),  j = 1, ..., n   (5)
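Eqs. (2)-(5) amount to column-wise min-max normalization; a short sketch using the "Watt" column of Table 1 from the case study as sample data:

```python
# Min-max normalization of one decision-matrix column, Eqs. (2)-(5).
def normalize(col, benefit=True):
    rmax, rmin = max(col), min(col)     # Eqs. (4) and (5)
    if benefit:
        return [(r - rmin) / (rmax - rmin) for r in col]   # Eq. (2)
    return [(rmax - r) / (rmax - rmin) for r in col]       # Eq. (3)

watts = [1800, 110, 500, 1200, 200, 400, 250, 500]  # "Watt" column of Table 1
print([round(z, 3) for z in normalize(watts)])
```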

Let Z = (Z_ij)_{m×n} be the normalized decision matrix. According to the SAW method, the overall weighted assessment value of alternative X_i is

d_i = Σ_{j=1}^n Z_ij ω_j,  i = 1, ..., m   (6)

where d_i is a linear function of the weight variables; the greater the value of d_i, the better the alternative X_i. The aim of MADM is to rank the alternatives or to determine the best alternative, with the highest degree of desirability with respect to all relevant attributes. So the best alternative is the one with the greatest overall weighted assessment value. The classic MADM techniques assume


all r_ij values are crisp numbers. In practical MADM problems, r_ij values can be crisp and/or fuzzy data. Fuzzy MADM methods have been developed because of the lack of precision in assessing the performance ratings of alternatives with respect to an attribute, in which the r_ij values are often represented by linguistic terms or fuzzy numbers. The configuration approach based on fuzzy MADM is introduced in detail; the algorithm includes the following steps:
Step 1: Representation of fuzzy requirements. When choosing a product from a number of similar alternatives, a customer normally develops in his mind some sort of ambiguity.
Step 2: Similarity measure. In Step 1, the customer's requirements have been described as triangular fuzzy numbers with respect to different product attributes. In this step, we take the requirement vector as the ideal product the customer really wants, in order to measure its similarity degree with the existing product vectors, whose specification values are known and determinate. Fuzzy numbers cannot be compared with crisp ones directly; the crisp numbers first have to be transformed into fuzzy numbers. For example, for a crisp number b, its triangular fuzzy form can be written as

b̃ = (b^L, b^M, b^U)   (7)

where b^L = b^M = b^U = b. The similarity measure

between two triangular fuzzy numbers can be calculated with Eq. (8); Xu [5],

s(ã, b̃) = (a^L b^L + a^M b^M + a^U b^U) / max{(a^L)^2 + (a^M)^2 + (a^U)^2, (b^L)^2 + (b^M)^2 + (b^U)^2}   (8)

where the two triangular fuzzy numbers are ã = (a^L, a^M, a^U) and b̃ = (b^L, b^M, b^U), respectively. In a realistic configuration system, this could be achieved by selecting an attribute value the customer prefers from the given alternate options. The similarity measure for this type of attribute is defined as follows:

respectively. In realistic configuration system, it could be achieved by selecting an attribute value the customer prefers from the given alternate options. The similarity measure of this type of attributes is defined as follows:

' '

' '

' '

1,( , )

0,

a bs a b

a b

==

≠(9)
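A sketch of the similarity measures (8) and (9); the first call reproduces the 0.8333 that appears as the P1 "Speakers" entry of Table 3 in the case study (ideal (2, 5, 8) against the crisp value 6 embedded via Eq. (7)):

```python
# Similarity of two triangular fuzzy numbers, Eq. (8).
def sim(a, b):
    num = sum(x * y for x, y in zip(a, b))
    den = max(sum(x * x for x in a), sum(y * y for y in b))
    return num / den

# Exact-match similarity for discrete attributes such as color, Eq. (9).
def sim_discrete(a, b):
    return 1.0 if a == b else 0.0

print(round(sim((2, 5, 8), (6, 6, 6)), 4))  # 0.8333
print(sim_discrete("black", "black"), sim_discrete("black", "silver"))
```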

Step 3: Construction of the Decision Matrix (DM). The calculated similarity measures between the alternate products and the ideal product can be concisely expressed in a matrix format, called the decision matrix in MADM problems, in which columns indicate product attributes and rows alternate products. Thus, an element S_ij in Eq. (10) denotes the similarity degree to the ideal product of the i-th product with respect to the j-th attribute.

              | S_11  S_12  ...  S_1n |
DM = (S_ij) = | S_21  S_22  ...  S_2n |   (10)
              | ...   ...   ...  ...  |
              | S_m1  S_m2  ...  S_mn |

Step 4: Normalization: In order to eliminate the difference of dimension among different attributes, the operation of normalization is needed to transform various attribute dimensions into the non-dimensional attribute. Here, we adopt the Eqs. (11) and (12) to complete normalization of the fuzzy number.

r̃_i = (a_i/c_i^max, b_i/b_i^max, c_i/a_i^max) ∧ 1   for a benefit attribute   (11)

r̃_i = (c_i^min/c_i, c_i^min/b_i, c_i^min/a_i) ∧ 1   for a cost attribute   (12)

where (·)_i^max = max_i{(·)_i} and (·)_i^min = min_i{(·)_i}.

Step 5: Ranking of the alternate products
The element S_ij in the decision matrix reflects the closeness degree of the ideal product to the i-th alternate product with respect to the j-th attribute. In this step, we can use the SAW method, widely used in MADM, to calculate the utility value with respect to all attributes, from which the ranking order of the alternate products according to utility value can be obtained. The product with the highest utility value can be considered the closest to what the customer requires. The utility value of the i-th alternate product can be calculated with Eq. (13):

U_i = Σ_{j=1}^n S_ij ω_j,  i = 1, 2, ..., m   (13)

and the maximum utility value can be written as

U_max = max_i Σ_{j=1}^n S_ij ω_j,  i = 1, 2, ..., m   (14)

Here, we compare with the WPM of Millar & Starr [2] and check the feasibility of the customer's requirement:

U_i = Π_{j=1}^n S_ij^{ω_j},  i = 1, 2, ..., m   (15)

Case study
In this section, we take the television as an example to illustrate the method described above. Table 1 shows the televisions that could


be used to configure for the different customers with respect to different attributes, in which the corresponding attributes are described as follows: Table 1 Configuration of Television

Sr. No.

Speakers Watt Channels Price

P1 6 1800 200 10300

P2 2 110 100 9790

P3 5 500 200 11990

P4 4 1200 200 12400

P5 2 200 200 9400

P6 2 400 100 11490

P7 2 250 200 9300

P8 4 500 200 9900

Suppose that the ideal product the customer wants according to the above attributes and the corresponding preference weight are shown in Table 2. Table 2 The ideal product and attribute weight

Attributes Ideal Lower Upper Weight

Speakers 5 2 8 0.25

Watt 1000 200 2400 0.2

Channels 150 100 250 0.25

Price 10,000 9,000 12,000 0.30

The vector of the ideal product can be represented in the following form of triangular fuzzy numbers:

C̃ = [(2, 5, 8), (200, 1000, 2400), (100, 150, 250), (9000, 10000, 12000)]

The corresponding vector of attribute weights can be written as follows:

ω = (0.25, 0.20, 0.25, 0.30)

The decision matrix, which shows the similarity degree with respect to each attribute between the ideal television that the customer desired and the candidate ones, is shown in Table 3 by using Eqs. (8)-(12).

Table 3 Decision Matrix Sr. No.

U1 U2 U3 U4

P1 0.8333 0.6666 0.8333 0.9824

P2 0.3225 0.0582 0.5263 0.738

P3 0.8064 0.2647 0.8333 0.8618

P4 0.6451 0.6329 0.8333 0.8333

P5 0.3225 0.1058 0.8333 0.8966

P6 0.3225 0.2117 0.5263 0.897

P7 0.3225 0.1323 0.8333 0.887

P8 0.6451 0.2647 0.8333 0.9443

The utility value of all candidate products with respect to all attributes can be calculated by Eq. (13) and the final calculated results are given below:

Table 4 Utility value of each product configuration by SAW P1 P2 P3 P4 P5 P6 P7 P8

0.8447 0.5039 0.7214 0.7462 0.5791 0.5243 0.5815 0.7058

The Table 4 presents the final utility value, with which the customer can rank the candidate products according to his preference to different attributes, and the order that shows the closeness degree to the customer requirements can be written as follows:

P1 > P4 > P3 > P8 > P7 > P5 > P6 > P2

Here, we compare the above method with the WPM and check the feasibility of the customer requirement calculated by Eq. (15); we get

Table 5 Utility value of each product configuration by WPM
P1 P2 P3 P4 P5 P6 P7 P8
0.8373 0.356 0.6637 0.7398 0.4446 0.4558 0.4635 0.6452

P1 > P4 > P3 > P8 > P7 > P6 > P5 > P2
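Both utility tables can be reproduced directly from the Table 3 similarity row and the Table 2 weights; a sketch for product P1:

```python
# SAW utility (13) and WPM utility (15) for product P1.
w  = [0.25, 0.20, 0.25, 0.30]               # attribute weights (Table 2)
p1 = [0.8333, 0.6666, 0.8333, 0.9824]       # P1 row of the decision matrix (Table 3)

saw = sum(wj * sj for wj, sj in zip(w, p1))  # Eq. (13)
wpm = 1.0
for wj, sj in zip(w, p1):
    wpm *= sj ** wj                          # Eq. (15)

print(round(saw, 4), round(wpm, 4))  # 0.8447 0.8373, matching Tables 4 and 5
```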

Due to the uncertainty of the customers' requirements, and because different algorithms may yield different results, a realistic configuration system can present several products with the highest similarity degrees to the customer's requirements, letting the customer choose among them so as to satisfy his requirements to the greatest degree.
Conclusion
This paper proposes an approach to product-level configuration according to fuzzy and uncertain customer requirements, using the theory of fuzzy MADM. The television is taken as an example to demonstrate the feasibility of the proposed method for handling uncertain customer requirements. When the results of SAW and WPM are compared, we obtain the same preferences for our problem, and the optimal selection of television is P1.

References
[1] Hwang C.L. and Yoon K.P. (1981) Springer, Berlin.
[2] Millar D.W. and Starr M.K. (1969) Prentice Hall, Englewood Cliffs, New Jersey.
[3] Mohanty B.K. and Bhasker B. (2005) Decision Support Systems, 38, 611-619.
[4] Turban E. (1988) Macmillan, New York.


[5] Xu Z. S. (2002) Systems Engineering and Electronics, 124, 9–12.

[6] Yang T. & Chou P. (2005) Mathematics and Computers in Simulation, 68, 9–21.

[7] Zhu B. & Jiang P. Y. (2005) The International Journal of Product Development, 2, 155–169.

Advances in Information Mining, ISSN: 0975–3265, Volume 2, Issue 1, 2010, pp-13-17

Ant based rule mining with parallel fuzzy cluster

Sankar K.¹ and Krishnamoorthy K.²

1Department of Master of Computer Applications, KSR College of Engineering, Tiruchengode, [email protected]

2Department of Computer Science and Engineering, SONA College of Technology, Salem, [email protected]

Abstract- Ant-based techniques in computer science are those that take their inspiration from the behavior of social insects. Data clustering techniques are classification algorithms with a wide range of applications, from biology to image processing and data presentation. Since real ants perform clustering and sorting of objects among their many activities, we expect that a study of ant colonies can provide new insights for clustering techniques. The aim of clustering is to separate a set of data points into self-similar groups, such that points belonging to the same group are more similar to each other than to points belonging to different groups. Each group is called a cluster. Data may be clustered using an iterative version of the Fuzzy C Means (FCM) algorithm, but the drawback of the FCM algorithm is that it is very sensitive to cluster center initialization, because the search is based on a hill-climbing heuristic. The ant-based algorithm provides a relevant partition of the data without any knowledge of the initial cluster centers. Past researchers have used ant-based algorithms based on stochastic principles coupled with the k-means algorithm. The proposed system in this work uses the Fuzzy C Means algorithm as the deterministic algorithm for ant optimization. The proposed model is applied after reformulation, and the partitions obtained from the ant-based algorithm were better optimized than those from randomly initialized Hard C Means. The proposed technique executes the ant fuzzy in parallel for multiple clusters, which enhances the speed and accuracy of cluster formation for the required system problem.

1. INTRODUCTION Research in using the social insect metaphor for solving problems is still in its infancy. The systems developed using swarm intelligence principles emphasize distributiveness, direct or indirect interactions among relatively simple agents, flexibility and robustness [4]. Successful applications have been developed in the communication networks, robotics and combinatorial optimization fields. 1.1 ANT COLONY OPTIMIZATION Many species of ants cluster dead bodies to form cemeteries, and sort the larvae into several piles [4]. This behavior can be simulated using a simple model in which the agents move randomly in space and pick up and deposit items on the basis of local information. The clustering and sorting behavior of ants can be used as a metaphor for designing new algorithms for data analysis and graph partitioning. The objects can be considered as items to be sorted. Objects placed next to each other have similar attributes. This sorting takes place in two-dimensional space, offering a low-dimensional representation of the objects. Most swarm clustering work has followed the above model. In the work, there is implicit communication among the ants making up a partition. The ants also have memory. However, they do not pick up and put down objects but rather place summary objects in locations and remember the locations that are evaluated as having good objective function values. The objects represent single dimensions of multidimensional cluster centroids which make up a data partition.

1.2 CLUSTERING
The aim of cluster analysis is to find groupings or structures within unlabeled data [5]. The partitions found should result in similar data being assigned to the same cluster and dissimilar data assigned to different clusters. In most cases the data is in the form of real-valued vectors, and the Euclidean distance is one measure of similarity for these data sets. Clustering techniques can be broadly classified into a number of categories [6]. Hard C Means (HCM) is one of the simplest unsupervised clustering algorithms for a fixed number of clusters. The basic idea of the algorithm is to initially guess the centroids of the clusters and then refine them. Cluster initialization is very crucial because the algorithm is very sensitive to it. A good choice for the initial cluster centers is to place them as far away from each other as possible. The nearest neighbor algorithm is then used to assign each example to a cluster, and new cluster centroids are calculated from the clusters obtained. The above steps are repeated until there is no significant change in the centroids. Hard clustering algorithms assign each example to one and only one cluster. This model is inappropriate for real data sets in which the boundaries between the clusters may not be well defined. Fuzzy algorithms can partially assign data to multiple clusters; the strength of membership in a cluster depends on the closeness of the example to the cluster center. The Fuzzy C Means (FCM) algorithm allows an example to be a partial member of more than one cluster. The FCM algorithm is based on


Ant based rule mining with parallel fuzzy cluster


minimizing the objective function. The drawback of clustering algorithms like FCM and HCM, which are based on the hill-climbing heuristic, is that prior knowledge of the number of clusters in the data is required and that they are significantly sensitive to cluster center initialization. This work moves in the direction of constructing C fuzzy means clustering with ant colony optimization (parallel ant agents) for evolving efficient rule mining techniques. The proposal introduces the problem of combining multiple partitionings of a set of objects without accessing the original features. The system first identifies several application scenarios for the resultant 'knowledge reuse' framework, which the system calls cluster ensembles. The cluster ensemble problem is then formalized as a combinatorial optimization problem in terms of shared mutual information for building rule mining techniques. In addition to a direct maximization approach, the system proposes three effective and efficient techniques for obtaining high-quality combiners.
2. RELATED WORKS
Andrea Baraldi and Palma Blonda [1] describe an equivalence between the concepts of fuzzy clustering and soft competitive learning in clustering algorithms, proposed on the basis of the existing literature. Moreover, a set of functional attributes is selected for use as dictionary entries in the comparison of clustering algorithms. According to Alfred Ultsch, systems for clustering with collectives of autonomous agents follow either the ant approach of picking up and dropping objects or the DataBot approach of identifying the data points with artificial life creatures; in DataBot systems the clustering behaviour is controlled by movement programs. For Julia Handl and Bernd Meyer [3], sorting and clustering methods inspired by the behavior of real ants are among the earliest methods in ant-based meta-heuristics.
The system revisits these methods in the context of a concrete application and introduces some modifications that yield significant improvements in terms of both quality and efficiency. Firstly, they re-examine their capability to simultaneously perform a combination of clustering and multi-dimensional scaling. For J. Handl, J. Knowles and M. Dorigo [4], ant-based clustering and sorting is a nature-inspired heuristic for general clustering tasks. It has been applied variously, from problems arising in commerce, to circuit design, to text-mining, all with some promise. However, although early results were broadly encouraging, there has been very limited analytical evaluation of the algorithm. Alexander Strehl and Joydeep Ghosh [5] introduce the problem of combining multiple partitionings of a set of objects into a single consolidated clustering without accessing the features or algorithms that determined these partitionings. The system first

identifies several application scenarios for the resultant 'knowledge reuse' framework that we call cluster ensembles. The cluster ensemble problem is then formalized as a combinatorial optimization problem in terms of shared mutual information. In addition to a direct maximization approach, the system proposes three effective and efficient techniques for obtaining high-quality combiners (consensus functions). The first combiner induces a similarity measure from the partitionings and then reclusters the objects. The second combiner is based on hypergraph partitioning. The third one collapses groups of clusters into meta-clusters, which then compete for each object to determine the combined clustering. Due to the low computational costs of the techniques, it is quite feasible to use a supra-consensus function that evaluates all three approaches against the objective function and picks the best solution for a given situation. The system evaluates the effectiveness of cluster ensembles in three qualitatively different application scenarios: (i) where the original clusters were formed based on non-identical sets of features, (ii) where the original clustering algorithms worked on non-identical sets of objects, and (iii) where a common data-set is used and the main purpose of combining multiple clusterings is to improve the quality and robustness of the solution. Promising results are obtained in all three situations for synthetic as well as real data-sets. Nicolas Labroche, Nicolas Monmarché and Gilles Venturini [6] introduce a method to solve the unsupervised clustering problem, based on a modeling of the chemical recognition system of ants. This system allows ants to discriminate between nestmates and intruders, and thus to create homogeneous groups of individuals sharing a similar odor by continuously exchanging chemical cues.
This phenomenon, known as "colonial closure", inspired the development of a new clustering algorithm, which is then compared to a well-known method such as the K Means method. The prior literature on fuzzy clustering depicted above turns on the following parameters. The first work handles functional attributes with theoretical analysis. The second and third deal with cluster object movement issues on synthetic data sets. The fourth and fifth deal with a heuristic ant optimization model with trial repetition. The sixth and seventh utilize unsupervised clustering with class tree structuring. The final one uses c-fuzzy means clustering in a sequential way. This motivates us to proceed with our proposal of ACO with c-fuzzy means. Based on the sequential C-fuzzy ACO clustering problem, we derive a parallel fuzzy ant clustering model to improve the attribute accuracy rate and obtain faster execution on the proposed problem domain.
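Since the proposal couples ACO with fuzzy c-means, a minimal one-dimensional sketch of the FCM updates it relies on may help. This is illustrative code under stated assumptions, not the authors' implementation; the function name `fcm` and the random initialization strategy are hypothetical.

```python
import random

def fcm(data, c, m=2.0, iters=100, seed=0):
    """Minimal fuzzy c-means on 1-D points: alternate membership and center updates."""
    rng = random.Random(seed)
    centers = rng.sample(data, c)  # random initialization: the sensitivity the paper targets
    for _ in range(iters):
        # membership update: u_ij = 1 / sum_k (d_ij / d_ik)^(2/(m-1))
        u = []
        for x in data:
            d = [max(abs(x - v), 1e-12) for v in centers]
            u.append([1.0 / sum((d[j] / d[k]) ** (2.0 / (m - 1)) for k in range(c))
                      for j in range(c)])
        # center update: v_j = sum_i u_ij^m x_i / sum_i u_ij^m
        centers = [sum(u[i][j] ** m * data[i] for i in range(len(data))) /
                   sum(u[i][j] ** m for i in range(len(data)))
                   for j in range(c)]
    return sorted(centers)
```

For two well-separated groups, `fcm([0.1, 0.2, 0.15, 0.9, 1.0, 0.95], 2)` returns centers near 0.15 and 0.95; the random initialization on the first line is exactly what the ant-based search is meant to make harmless.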


Sankar K and Krishnamoorthy K


3. FUZZY ANT CLUSTERING
Ant-based clustering algorithms are usually inspired by the way ants cluster dead nest mates into piles, without negotiating about where to gather the corpses. These algorithms are characterized by the lack of centralized control or a priori information, which makes them very appropriate candidates for the task at hand. Since the fuzzy ants algorithm needs neither an initial partitioning of the data nor a predefined number of clusters, it is very well suited for the Web People Search task, where the system does not know in advance how many clusters (or individuals) correspond to a particular document set (or person name, in this case). A detailed description of the algorithm is given by Schockaert et al. It involves a pass in which ants can only pick up one item as well as a pass during which ants can only pick up an entire heap. A fuzzy ant-based clustering algorithm was introduced where the ants are endowed with a level of intelligence in the form of IF/THEN rules that allow them to do approximate reasoning. As a result, at any time the ants can decide for themselves whether to pick up a single item or an entire heap, which makes a separation of the clustering into different passes superfluous. The system has experimented with different numbers of ant runs and fixed the number of runs at 800,000 for the experiments. In addition, the system has also evaluated different values for the parameters that determine the probability that a document or heap of documents is picked up or dropped by the ants, and kept the following values for the experiments:
Table 1: Parameter settings for fuzzy clustering

    Parameter   Meaning                                  Value
    n1          probability of dropping one item         1
    m1          probability of picking up one item       1
    n2          probability of dropping an entire heap   5
    m2          probability of picking up a heap         5
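The paper's ants replace raw probabilities with fuzzy IF/THEN rules, but the pick/drop parameters above play the role of the threshold constants in the classic Lumer-Faieta model this family of algorithms descends from. A sketch of those classic probabilities (the constants `k1` and `k2` are illustrative values, not the paper's settings):

```python
def pick_prob(f, k1=0.1):
    """Classic Lumer-Faieta pick-up probability: high when local similarity f is low."""
    return (k1 / (k1 + f)) ** 2

def drop_prob(f, k2=0.15):
    """Classic Lumer-Faieta drop probability: high when local similarity f is high."""
    return (f / (k2 + f)) ** 2
```

An isolated item (local similarity near 0) is picked up almost surely, while an item sitting in a dense heap of similar items (similarity near 1) is very likely to be dropped there.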

3.1 Hierarchical Clustering
The second clustering algorithm the system applies is an agglomerative hierarchical approach. This clustering algorithm builds a hierarchy of clusterings that can be represented as a tree (called a dendrogram), which has singleton clusters (individual documents) as leaves and a single cluster containing all documents as root. An agglomerative clustering algorithm builds this tree from the leaves to the top, in each step merging the two clusters with the largest similarity. Cutting the tree at a given height gives a clustering with a selected number of clusters. The system has opted to cut the tree at different similarity thresholds between the document pairs, with intervals of 0.1 (e.g. for threshold 0.2 all document pairs with similarities

above 0.2 are clustered together). For the experiments, the system has used an implementation of Agnes (Agglomerative Nesting) that is fully described in the literature.
3.2 Fuzzy Ant Parallel System
Clustering approaches are typically quite sensitive to initialization. In this work, the system examines a swarm inspired approach to building clusters which allows for a more global search for the best partition than iterative optimization approaches. The approach is described with cooperating ants as its basis. The ants participate in placing cluster centroids in feature space. They produce a partition which can be utilized as is or further optimized; the further optimization can be done via a focused iterative optimization algorithm. Experiments were done both with deterministic algorithms, which assign each example to one and only one cluster, and with fuzzy algorithms, which partially assign examples to multiple clusters. The algorithms are from the C-means family. These algorithms were integrated with swarm intelligence concepts to yield clustering approaches that are less sensitive to initialization.
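The dendrogram cut of Section 3.1 can be sketched as single-link merging over pairwise similarities. The helper below is hypothetical (the actual system uses an Agnes implementation); `sim` maps index pairs `(i, j)` with `i < j` to similarities in [0, 1]:

```python
def agglomerate(sim, n, threshold):
    """Single-link agglomerative clustering: merge while some cross-cluster
    similarity exceeds `threshold`, mirroring a dendrogram cut at that height."""
    clusters = [{i} for i in range(n)]

    def best_pair():
        # find the most similar pair of clusters, if its similarity beats the threshold
        best, arg = threshold, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                s = max(sim.get((min(i, j), max(i, j)), 0.0)
                        for i in clusters[a] for j in clusters[b])
                if s > best:
                    best, arg = s, (a, b)
        return arg

    while True:
        pair = best_pair()
        if pair is None:
            return clusters
        a, b = pair
        clusters[a] |= clusters[b]  # merge the two most similar clusters
        del clusters[b]
```

With `sim = {(0, 1): 0.9, (1, 2): 0.8, (3, 4): 0.85, (2, 3): 0.1}` and threshold 0.2, the five items fall into the two clusters {0, 1, 2} and {3, 4}, exactly the "document pairs above the threshold end up together" behaviour described above.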

4. EXPERIMENTAL SIMULATION ON ANT BASED PARALLEL CLUSTER
The implementation of the fuzzy ant based parallel clustering algorithm for rule mining used three real data sets obtained from the UCI repository: the Iris Data Set, the Wine Recognition Data Set and the Glass Identification Data Set. The simulation, conducted in Matlab, normalizes the feature values between 0 and 1. The normalization is linear: the minimum value of a data-set-specific feature is mapped to 0 and the maximum value of the feature is mapped to 1. The ants are initialized with random initial values and random directions. There are two directions, positive and negative: a positive direction means the ant is moving through the feature space from 0 to 1, a negative direction from 1 to 0. The initial memory is cleared. The ants are initially assigned to a particular feature within a particular cluster of a particular partition, and they never change the feature, cluster or partition assigned to them.

Repeat
  For one epoch    /* one epoch is n iterations of random ant movement */
    For all ants
      With a probability Prest the ant rests for this epoch
      If the ant is not resting then with a probability Pcontinue the ant
        continues in the same direction, else it changes direction
      With a value between Dmin and Dmax the ant moves in the selected direction
  The new Rm value is calculated using the new cluster centers obtained by recording the


position of the ants assigned to the features of the clusters of a given partition.
  If the partition is better than any of the old partitions in memory then
    the worst partition is removed from memory and the new partition is copied
    to the memories of the ants making up the partition
  Else
    with a probability PContinueCurrent the ant continues with the current
    partition; otherwise the ant moves to the best known partition with
    probability 0.6, to the second best with probability 0.2, to the third best
    with probability 0.1, to the fourth best with probability 0.075 and to the
    worst known partition with probability 0.025
Until the stopping criterion is met
The stopping criterion is the number of epochs.
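The inner movement loop of the listing above can be sketched in Python. This is a simplified, hypothetical rendering: each ant carries one centroid coordinate in [0, 1], parameter names follow Table 2, and the Rm evaluation and partition-memory bookkeeping are omitted.

```python
import random

def move_ants(positions, directions, rng,
              p_rest=0.01, p_continue=0.75, d_min=0.001, d_max=0.01):
    """One iteration of random ant movement. `directions` holds +1 or -1."""
    for i in range(len(positions)):
        if rng.random() < p_rest:
            continue                          # ant rests
        if rng.random() >= p_continue:
            directions[i] = -directions[i]    # ant changes direction
        step = rng.uniform(d_min, d_max)      # move between Dmin and Dmax
        positions[i] = min(1.0, max(0.0, positions[i] + directions[i] * step))
    return positions, directions
```

Calling this for `iterations_per_epoch` iterations, then scoring the resulting cluster centers and updating the shared partition memory, corresponds to one epoch of the listing.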

Table 2- Parameter Values

    Parameter              Value
    Number of ants         30 * c * #features
    Memory per ant         5
    Iterations per epoch   50
    Epochs                 1000
    Prest                  0.01
    Pcontinue              0.75
    PContinueCurrent       0.20
    Dmin                   0.001
    Dmax                   0.01

Note that the multiplier 30 in the number of ants allows for 30 partitions. Three data sets (the Glass Data Set, the Wine Data Set and the Iris Data Set) were evaluated from a mixture of five Gaussians. The probability distribution across all the data sets is the same, but the means and standard deviations of the Gaussians are different. Of the three data sets, two had 500 instances each and the remaining one had 1000 instances. Each instance had two attributes. To visualize the Iris data set, the Principal Component Analysis (PCA) algorithm was used to project the data points into 2D and 3D space.

5. RESULTS AND DISCUSSIONS
The ants move the cluster centers in feature space to find a good partition for the data. There are fewer controlling parameters than in previous ant-based clustering algorithms, which typically group the objects on a two-dimensional grid. Results from 18 data sets show the superiority of the algorithm over the randomly initialized FCM and HCM algorithms. For comparison purposes, Table 3 shows the frequency of occurrence of different extrema for the ant initialized FCM and HCM algorithms and the randomly initialized FCM and HCM algorithms.

Table 3- Frequency of different extrema from parallel fuzzy based ant clustering, for the Glass (2 class), Iris and Wine data sets

    Data Set         Extrema   Freq. HCM,          Freq. HCM,             Sequential C-Fuzzy   Parallel C-Fuzzy
                               ant initialization  random initialization  ACO (Existing)       ACO (Proposed)
    Glass (2 class)  34.1320   19                  3                      31                   27.8
                     34.1343   11                  19                     32.12                28.5
                     34.1372   19                  15                     32.36                29.1
                     34.1658   1                   5                      32.89                29.82
    Iris             6.9981    50                  23                     5.3938               4.23
                     7.1386    0                   14                     5.8389               4.3658
                     10.9083   0                   5                      8.3746               5.3256
                     12.1437   0                   8                      10.6434              8.2356
    Wine             9.3645    20                  2                      5.2369               3.2567
                     11.3748   15                  20                     8.2356               5.236
                     13.8483   12                  18                     10.2356              8.3656

The ant initialized parallel ant fuzzy algorithm always finds better extrema for the Iris data set, and for the Wine data set the ant initialized algorithm finds the better extrema 49 out of 50 times. The ant initialized HCM algorithm always finds better extrema for the Iris data set, and for the Glass (2 class) data set a majority of the time. For the different Iris data, the ant initialized parallel algorithm finds a better extremum most of the time. When the ACO approach was used to optimize the clustering criteria, the ant approach for parallel C Means found better extrema 64% of the time for the Iris data set. The ant initialized parallel C-fuzzy ACO finds better extrema all the time. The number of ants is an important parameter of the algorithm. This number only increases when more partitions are searched for at the same time, as ants are (currently) added in increments (Graph 1 and Graph 2). The quality of the final partition improves with an increase in ants, but the improvement comes at the expense of increased execution time.

Graph 1: Number of Iterations vs. Time [line plot of time (0-25) over iterations 1-10, comparing Ant Fuzzy Parallel and Ant Fuzzy Sequential]


Graph 2: Time vs. Path Length [line plot of path length (0-35) over time 1-10, comparing Ant Fuzzy Sequential and Ant Fuzzy Parallel]

7. CONCLUSION
The system discussed a swarm inspired optimization algorithm to partition data into clusters, described using the ant paradigm. The approach is to have a coordinated group of ants position cluster centroids in feature space. The algorithm was evaluated with a soft clustering formulation utilizing the fuzzy c-means objective function and a hard clustering formulation utilizing the hard c-means objective function. The presented clustering approach seems clearly advantageous for data sets where many local extrema are expected. The cluster discovery aspect of the algorithm provides the advantage of obtaining a partition at the same time as it indicates the number of clusters. That partition can be further optimized or accepted as is. This is in contrast to some other schemes, which require partitions to be created with different numbers of clusters and then evaluated. The result is generally a better optimized partition (objective function) than obtained with FCM/HCM: one needs a large number of random initializations to be competitive with the ant-based algorithm in terms of skipping some of the poor local extrema. It has provided better final partitions on average than a previously introduced evolutionary computation clustering approach for several data sets. Random initialization has been the standard approach for the c-means family, and the ant clustering algorithm results in generally better partitions than a single random initialization. The parallel version of the ants algorithm can operate much faster than the sequential implementation, making it a clear choice for minimizing the chance of finding a poor extremum when doing c-means clustering. This algorithm should scale better for large numbers of examples than grid-based ant clustering algorithms.

REFERENCES
[1] Baraldi A. and Blonda P. (1999) IEEE Transactions on Systems, Man, and Cybernetics, 29(6), 778-785.
[2] Kanade P.M. and Hall L.O. (2003) IEEE Transactions on Fuzzy Systems, 11(2), 227-232.
[3] Handl J. and Meyer B. (2002) Springer-Verlag, LNCS 2439, 913-923.
[4] Handl J., Knowles J. and Dorigo M. (2003) IOS Press, Amsterdam, The Netherlands, 204-213.
[5] Strehl A. and Ghosh J. (2002) Journal of Machine Learning Research, 3, 583-617.
[6] Labroche N., Monmarche N. and Venturini G. (2002) IOS Press, France, 345-349.


Advances in Information Mining, ISSN: 0975–3265, Volume 2, Issue 1, 2010, pp-18-22


Data mining- A Mathematical Realization and cryptic application using variable key

Chakrabarti P. Sir Padampat Singhania University, Udaipur-313601, Rajasthan, India, [email protected]

Abstract- In this paper we have depicted various mathematical models based on the themes of data mining. The numerical representations of regression and linear models have been explained. We have also shown the prediction of a datum in the light of statistical approaches, namely the probabilistic approach, data estimation and dispersion theory. The paper also deals with the efficient generation of shared keys required for direct communication among co-processors without active participation of the server. Hence minimization of time complexity, proper utilization of resources and an environment for parallel computing can be achieved with higher throughput in a secured fashion. The techniques involved are cryptic methods based on support analysis, confidence rule, resource mining, sequence mining and feature extraction. A new approach towards realizing the variability concept of the key in the Wide-Mouth Frog Protocol, Yahalom Protocol and SKEY Protocol has been depicted in this context.
Keywords- data mining, regression, dispersion theory, sequence mining, variable key

Regression based data-mining techniques
A. Concept
We have pointed out the scenario where the dependency of a datum at time instant t1 on another at t2 can be computed. If we assume d1 as the datum at t1 and d2 as the datum at t2,

then we can write the following equation:
  d2 = a + b d1 ...(1)
where a and b are constants. Data prediction based on the linear regression model has been concentrated on.
B. Linear representation
As per statistical prediction, let the predicted value of a datum d be ∆1 and assume that its original value is ∆2. As per the data mining based regression model, we can denote ∆i = d2,i − (a + b d1,i) as the error in taking a + b d1,i for d2,i, and this is known as the error of estimation.

Prediction based on probabilistic approach
Suppose the observed data k1, k2, ..., km have respective probabilities p1, p2, ..., pm. When Σ_{i=1..m} pi = 1, then
  E(K) = Σ_{i=1..m} ki pi ...(2)
provided it is finite. Here we use a bivariate probability based on K = (k1, k2, ..., km), the set of observed data, and Q = (q1, q2, ..., qn), the set of predictive values (1 < m < n).
Theorem 1
If the observed data set value K and the predicted data set value Q are two jointly distributed random variables, then E(K + Q) = E(K) + E(Q).
Proof: Let K assume values k1, ..., km and Q assume values q1, ..., qm, with P(K = ki, Q = qj) = pij for i = 1 to m and j = 1 to m. Then
  E(K + Q) = Σi Σj (ki + qj) pij
           = Σi Σj ki pij + Σi Σj qj pij
           = Σi ki Σj pij + Σj qj Σi pij
           = E(K) + E(Q) ...(3)
Similarly, when K and Q are independent, E(K * Q) = E(K) * E(Q) ...(4)

Prediction based on datum estimation
Let the data space be (k1, k2, ..., kn), and let the distribution function f1(k) of the random variable k involve a parameter θ whose value is unknown, which we must estimate on the basis of the observed data space (k1, k2, ..., km), where m < n. We select θ̂ = f2(k1, k2, ..., km); it is basically a number, taken as given for the value of θ. Hence θ̂ is an estimator of θ, and the value of θ̂ obtained from the observed data space is an estimate of θ. The difference θ̂ − θ should be negligible for successful prediction of the datum. Now we can represent the datum assumption criteria as below:
  E(θ̂) = θ for the true value of θ ...(5)
and
  Var(θ̂) <= Var(Ψ) ...(6), for

the true value of θ, Ψ being any other estimator satisfying equation (5). Hence data prediction has been pointed out on the basis of the property of unbiasedness (equation (5)) and the property of minimum variance (equation (6)).
Prediction based on dispersion theory and pattern analysis
The values of the data for different sessions are not all equal. In some cases the values are close to one another, while in other cases they deviate highly from one another. In order to get a proper idea about the overall nature of a given set of values, it is necessary to know, besides the average, the extent to which the data differ


among themselves or, equivalently, how they are scattered about the average. Let the values k1, k2, ..., km be the obtained data and c be the average of the original values km+1, km+2, ..., kn. The mean deviation of k about c is given by
  MDc = (1/(n−m)) Σ_{i=1..n−m} |ki − c| ...(7)
In particular, when c = k̄, the mean deviation about the mean is given by
  MDk̄ = (1/(n−m)) Σ_{i=1..n−m} |ki − k̄| ...(8)
B. Pattern matching
We want to study the trend analysis of future events based on prediction using previously observed data. If the event delivers some numerical data estimation, then we can predict it in certain forms. We assume dp to be the predicted datum and do the observed datum.
If dp and do are linearly related, then
  dp = a + b do ...(9)
If exponentially related, the equation takes the form
  dp = a b^do ...(10)
If a logarithmic transformation based prediction rule is observed, then the equation will be
  Dp = A + B do ...(11)
where Dp = log dp, A = log a and B = log b. In the case of data merging towards obtaining meaningful information, the convention rule is as follows: di => d(i+k) mod n, where di ∈ D, k is the offset value and n is the number of sensed data elements, i.e. the number of elements of the set D. The value of k varies from stage to stage.

Communication based on support
A. Scheme
A and B are two parties. K1, K2, K3, K4, K5, K6 are keys known to A and B only. A sends messages m1, m2, m3, m4, m5, m6 in encrypted form with the help of one or more keys. A third party would have to decipher each message by trial and error and form sets. The key having maximum support is the shared key between A and B. If the number of shared keys is more than one, then one is primary while the other is a candidate to it. Here we will find the shared key so that the third party will not be able to decipher the message.

B. Mathematical Analysis

    Message   Encrypted Key
    m1        ek1 = f(k1,k3,k4,k6) = k1^k3^k4^k6
    m2        ek2 = f(k3,k5) = k3^k5
    m3        ek3 = f(k4,k5,k6) = k4^k5^k6
    m4        ek4 = f(k2,k3,k5) = k2^k3^k5
    m5        ek5 = f(k1,k2) = k1^k2
    m6        ek6 = f(k1,k2,k3,k6) = k1^k2^k3^k6

So k3 is supported by 4 out of the 6 key sets; the support of k3 is 66.6%. Hence the shared key of A and B is k3. However, if a hacker obtains k1, k2, ..., k6, then by trial and error it can recover the shared key. So the concept of an automatic variable shared key is proposed: shared key = (key having maximum support) XOR (XOR of the messages where the support is not available). Here k3 is the key having maximum support, and m3, m5 are the messages encrypted without k3. Therefore, shared key = k3^m3^m5. This scheme is not revealed to the hacker, so it will hack k3 instead of the modified value of the shared key.
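The support computation and the variable-key XOR described above can be sketched as follows. Indices 0 to 5 stand for K1 to K6, and the small integers standing in for keys and messages are illustrative, not values from the paper.

```python
from functools import reduce

def variable_shared_key(key_sets, keys, messages):
    """Support-based variable shared key: the most-supported key,
    XOR-ed with every message whose key set does not contain it."""
    support = {k: sum(k in s for s in key_sets) for k in range(len(keys))}
    best = max(support, key=support.get)            # key with maximum support
    uncovered = [messages[i] for i, s in enumerate(key_sets) if best not in s]
    return reduce(lambda a, b: a ^ b, [keys[best]] + uncovered)
```

With the key sets of the table (m1 uses {K1,K3,K4,K6}, and so on), the most-supported key is K3 with support 4/6, the uncovered messages are m3 and m5, and the result is k3 ^ m3 ^ m5, as in the text.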

Communication based on confidence rule
A. Scheme
Input: m1, m2, m3, m4, m5, m6 to A; K1, K2, K3, K4, K5, K6 to A and B.
Step 1: A encrypts each of the messages with a combination of the keys and sends it to B.
Step 2: B finds the key pair with a confidence level of 100%, i.e. key1 => key2: if key1 is present, then key2 is also present, so the confidence of key1 => key2 is 100%.
Step 3: The shared key is key1.
Step 4 (applied only for enhancing the security level): shared key = (key1) XOR (key-new), where key-new is chosen such that the confidence of key-new => key1 is minimum.
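Steps 2 and 4 both reduce to measuring the association confidence of one key implying another over the per-message key sets. A sketch (indices 0 to 5 stand for K1 to K6; the function name is illustrative):

```python
def confidence(key_sets, k1, k2):
    """Confidence of the association k1 => k2: the fraction of key sets
    containing k1 that also contain k2."""
    with_k1 = [s for s in key_sets if k1 in s]
    if not with_k1:
        return 0.0
    return sum(k2 in s for s in with_k1) / len(with_k1)
```

With the key sets of the analysis below, confidence(K4 => K6) is 1.0 (Step 2's 100% pair), while confidence(K2 => K4) is 0, making K2 the key-new of Step 4.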

B. Mathematical Analysis

    Message   Encrypted Keys
    m1        Sk1 = (k1,k3,k4,k6) = (k1^k3^k4^k6)
    m2        Sk2 = (k3,k5) = (k3^k5)
    m3        Sk3 = (k4,k5,k6) = (k4^k5^k6)
    m4        Sk4 = (k2,k3,k5) = (k2^k3^k5)
    m5        Sk5 = (k1,k2) = (k1^k2)
    m6        Sk6 = (k1,k2,k3,k6) = (k1^k2^k3^k6)

Only k4 => k6 has a confidence level of 100%. So, shared key = k4 (up to Step 3).

    Association   Probability
    k1 => k4      1/3
    k2 => k4      0
    k3 => k4      1/4
    k5 => k4      1/3
    k6 => k4      2/3

So key-new = k2, since it has the least probability. Hence, shared key = k4 XOR k2.

Statistical approaches of resource mining
A. Based on prediction of the most frequent word
The most frequent key can be obtained from Max(f1, f2, ..., fn), where f1, f2, ..., fn are relative frequencies and n is the total number of keys.
B. Based on prediction of a variable within an interval


We can predict the value of a variable key if we can measure the interval properly. We can apply this scheme in hacking.
Theorem 2
If a variable key V changes over time t in an exponential manner, then the value of the variable at the centre point of an interval (a1, a2) is the geometric mean of its values at a1 and a2.
Proof: Let Va = m n^a. Then Va1 = m n^(a1) and Va2 = m n^(a2). Now, the value of V at (a1 + a2)/2
  = m n^((a1+a2)/2)
  = [m^2 n^(a1+a2)]^(1/2)
  = [(m n^(a1)) (m n^(a2))]^(1/2)
  = (Va1 Va2)^(1/2)
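Theorem 2 can be checked numerically; the constants m = 2 and n = 3 below are illustrative, not values from the paper.

```python
def v(a, m=2.0, n=3.0):
    """V(a) = m * n**a, the exponential growth assumed in Theorem 2."""
    return m * n ** a

# the value at the midpoint of (a1, a2) equals the geometric mean of the endpoint values
a1, a2 = 1.0, 5.0
mid = v((a1 + a2) / 2)          # V(3) = 2 * 27 = 54
gm = (v(a1) * v(a2)) ** 0.5     # sqrt(6 * 486) = 54
```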

C. Based on prediction of interrelated variables
In a message there may be a variable which depends on another variable through some equation; in that case extraction can be made.
Theorem 3
If a variable m is related to another variable n in the form m = an, where a is a constant, then the harmonic mean of m is related to that of n by the same equation.
Proof: Let x be the number of given values. Then
  mHM = x / (Σ 1/mi), i = 1 to x
      = x / (Σ 1/(a ni))        [since mi = a ni]
      = x / ((1/a) Σ 1/ni)
      = a (x / (Σ 1/ni)), i = 1 to x
      = a nHM

Shared key generation in the light of sequence mining
Let us suppose that four users, U1, U2, U3 and U4, are in a network. Each of U1, U2 and U3 transmits three messages to U4 in successive sessions.

    Sender   Key      Operation
    U1       110110   U1(m1) -> U4
    U2       100101   U2(m1) -> U4
    U3       001010   U3(m1) -> U4
    U1       001100   U1(m2) -> U4
    U2       000011   U2(m2) -> U4
    U3       100001   U3(m2) -> U4
    U1       111100   U1(m3) -> U4
    U2       000001   U2(m3) -> U4
    U3       110100   U3(m3) -> U4

A. Algorithm
Step 1: Designate each bit of the key as a character.
Step 2: If the character's bit value is 1, include it in the sequence.
Step 3: Else ignore the value.
Step 4: Identify the pattern that is decided by the communicating party and fetch the combination.
Step 5: The shared key for each user will be based on the combined result.
Step 6: Repeat steps 1 to 5 for the other users.

Step 7: The final shared key will be based on the shared keys of U1/U2/U3 in combined form and the computation scheme.
B. Analysis
The bits can be denoted by A, B, C, D, E, F. Combined sequence of U1: (A,B,D,E) -> (C,D) -> (A,B,C,D)

Table 1- Combined sequence for U1

    Sequence   Session   A B C D E F
    1          1         1 1 0 1 1 0
    2          4         0 0 1 1 0 0
    3          7         1 1 1 1 0 0

Combined sequence of U2: (A,D,F) -> (E,F) -> (F)

Table 2- Combined sequence for U2

    Sequence   Session   A B C D E F
    1          2         1 0 0 1 0 1
    2          5         0 0 0 0 1 1
    3          8         0 0 0 0 0 1

Combined sequence of U3: (C,E) -> (A,F) -> (A,B,D)

Table 3- Combined sequence for U3

    Sequence   Session   A B C D E F
    1          3         0 0 1 0 1 0
    2          6         1 0 0 0 0 1
    3          9         1 1 0 1 0 0
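The combined sequences of Tables 1 to 3 follow mechanically from the key bits: each 6-bit key is mapped to the letters A to F of its set bits. A sketch (the function name is illustrative):

```python
def combined_sequence(keys):
    """Map each 6-bit key string to the letters (A..F) of its set bits."""
    letters = "ABCDEF"
    return [tuple(letters[i] for i, b in enumerate(k) if b == "1") for k in keys]
```

For U1's keys, `combined_sequence(["110110", "001100", "111100"])` yields (A,B,D,E) -> (C,D) -> (A,B,C,D), matching Table 1.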

C. Method 1
Communicating parties: U1 and U4 (say). The sequence counts of AB and D are as follows: AB = 2, D = 3; therefore x1 = 2 and x2 = 3. U1 will compute ((A.M. of 2 and 3) * (H.M. of 2 and 3))^(1/2) and U4 will compute the G.M. of 2 and 3. So, shared key = 6^(1/2). If any occurrence becomes null, then that parameter value is treated as zero.
D. Method 2
Communicating parties: U3 and U4 (say). In the case of U3, the union becomes C E A F B D. So, the shared key of U3 and U4 is C E A F B D.
E. Method 3
Communicating parties: U2 and U4 (say). The shared key is based on intersection, and it is F.

Key using feature based method
Let six messages be sent by the sender, encrypted by a combination of one or more keys using some function.
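Method 1 relies on the identity G.M.^2 = A.M. * H.M. for two numbers, so U1 and U4 arrive at the same shared key independently. A numeric check for x1 = 2, x2 = 3 (the function name is illustrative):

```python
def shared_key_method1(x1, x2):
    """U1's computation: sqrt(A.M. * H.M.), which equals U4's G.M. of x1 and x2."""
    am = (x1 + x2) / 2                 # arithmetic mean
    hm = 2 * x1 * x2 / (x1 + x2)       # harmonic mean
    return (am * hm) ** 0.5

gm = (2 * 3) ** 0.5                    # U4's geometric mean of 2 and 3
```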


Table 4- Association of keys against each message

    Message   Keys associated
    M1        SK1 = (K1,K3,K4,K6)
    M2        SK2 = (K3,K5)
    M3        SK3 = (K4,K5,K6)
    M4        SK4 = (K2,K3,K5)
    M5        SK5 = (K1,K2)
    M6        SK6 = (K1,K2,K3,K6)

Table 5 - Determination of count and value
Key   Initial value   Count   Value   (Value)^2
K1    0.1             3       0.3     0.09
K2    0.2             3       0.6     0.36
K3    0.3             4       1.2     1.44
K4    0.4             2       0.8     0.64
K5    0.5             3       1.5     2.25
K6    0.6             3       1.8     3.24
(Here Count is the number of messages with which a key is associated in Table 4, and Value = Initial value × Count.)
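Table 5 above, and the CF-based shared key derived from it in the text that follows, can be reproduced with a short sketch (variable names are ours; the associations and initial values come from Tables 4 and 5):

```python
from math import floor

# Keys associated with each message (Table 4) and initial key values (Table 5).
associations = {
    "M1": ["K1", "K3", "K4", "K6"],
    "M2": ["K3", "K5"],
    "M3": ["K4", "K5", "K6"],
    "M4": ["K2", "K3", "K5"],
    "M5": ["K1", "K2"],
    "M6": ["K1", "K2", "K3", "K6"],
}
initial = {"K1": 0.1, "K2": 0.2, "K3": 0.3, "K4": 0.4, "K5": 0.5, "K6": 0.6}

# Count = number of messages a key occurs in; Value = initial value * count.
count = {k: sum(k in keys for keys in associations.values()) for k in initial}
value = {k: initial[k] * count[k] for k in initial}

# CF of a key set = (number of elements, linear sum, sum of squares of the Values).
def cf(keys):
    vals = [value[k] for k in keys]
    return (len(vals), sum(vals), sum(v * v for v in vals))

cfs = [cf(keys) for keys in associations.values()]

# CFnet accumulates the maximum of each tuple position.
cf_net = tuple(max(t[i] for t in cfs) for i in range(3))

shared_key = floor(abs(cf_net[1] - cf_net[2]))  # floor(|4.1 - 6.13|) = 2
print(cf_net, shared_key)                       # ≈ (4, 4.1, 6.13) and 2
```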

Now CF = (x, y, z), where x = number of elements, y = linear sum of the elements and z = sum of the squares of the elements, computed over the Values of the keys in each SKi:
CF1 = (4, 4.1, 5.41)
CF2 = (2, 2.7, 3.69)
CF3 = (3, 4.1, 6.13)
CF4 = (3, 3.3, 4.05)
CF5 = (2, 0.9, 0.45)
CF6 = (4, 3.9, 5.13)
So CFnet = accumulation of the maximum of each tuple position = (4, 4.1, 6.13).
So the shared key = floor of the modulus of (4.1 - 6.13) = floor(2.03) = 2.

Wide-Mouth Frog protocol using variable key
Both Alice and Bob share a secret key with a trusted server, Trent. These keys are used only for key distribution and not to encrypt any actual messages between users. The proposed algorithm is as follows:

1) Alice concatenates a timestamp TA, Bob's name B and a technique f for deducing a random session key based on the timestamp and Bob's name. She then encrypts the whole message with the key she shares with Trent and sends it to Trent along with her name. Alice sends: A, EKA(TA, B, f).

2) Trent decrypts the message. For enhanced security, he concatenates a new timestamp TB, Alice's name, the function f and the difference d between TB and TA. He then encrypts the whole message with the key he shares with Bob. Trent sends: EKB(TB, A, f, d). Hence, f is an automatic variable based on TB and d.

3) Bob decrypts it. He first verifies the sender's name and computes TA as TA = TB - d.

4) He then computes f based on TA and the binary form of the ASCII value of his name.

5) Thus he computes K, i.e. the session key with which he will communicate with Alice.

6) In the next iteration TA and KA will be changed, and hence f, and so on.
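The agreement in the steps above can be illustrated with a toy simulation (this is not real cryptography: encryption is omitted, and the derivation of f from the timestamp and Bob's name is a stand-in of our own):

```python
# Toy model of the variable-key Wide-Mouth Frog exchange.
# f is modelled as a simple function of a timestamp and a name
# (our own stand-in for the technique Alice chooses in step 1).
def derive_f(timestamp, name):
    # Session-key derivation from the timestamp and the name's ASCII values.
    return timestamp + sum(ord(c) for c in name)

TA = 1000                      # Alice's timestamp (illustrative)
f_alice = derive_f(TA, "Bob")  # step 1: Alice fixes the technique f

TB = 1003                      # Trent's later timestamp (illustrative)
d = TB - TA                    # step 2: Trent forwards the offset d

# Steps 3-4: Bob recovers TA from TB and d, then recomputes f himself.
TA_bob = TB - d
f_bob = derive_f(TA_bob, "Bob")
assert f_bob == f_alice        # session key agreed without ever sending it
```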

The main advantage is that the key K is never transmitted.

Yahalom protocol using variable key
Both Bob and Alice share a secret key with Trent. Let:
RA = nonce chosen by Alice
NB = number chosen by Bob based on RA and A
KA = shared key between Alice and Trent
KB = shared key between Bob and Trent
A = Alice's name
B = Bob's name
K = random session key

1. Alice concatenates her name and a random number RA and sends it to Bob.
2. Bob computes NB = RA + (binary form of the ASCII value of Alice's name). He sends Trent: B, EKB(A, RA, f), where f = the offset which, when applied on NB, yields RA.
3. Trent generates two messages for Alice: EKA(B, K', RA, f, d) and EKB(A, K', d), where the random session key K = f(K', d).
4. Alice decrypts the first message and extracts K using f(K', d). Alice then sends Bob the two messages EKB(A, K', d) and EK(RA, f).
5. Bob decrypts A, K', d and extracts K as f(K', d) = K. He then extracts NB using f(RA, f) ≡ NB. It is to be remembered that the functions f(K', d) and f(RA, f) should be reversible. Bob then matches whether NB has the same value.

At the end, both Alice and Bob are convinced that they are talking to each other and not to a third party. The advantage is that neither NB nor K is transmitted. The demerit is the calculation of NB and K through the functions specified.

Analysis of SKEY using variable key
SKEY is mainly a program for authentication and is based on a one-way function. The proposed algorithm is as follows:
1. The host computes a Bernoulli trial with a biased

coin, for which p = probability of getting 1 and q = (1 - p) = probability of getting 0. Let the number of trials be n. Assume n = 6 and string = 110011.

2. The host sends the string to Alice.

3. Alice modifies her own public key such that the new public key = previous key + (binary equivalent of the number of 1's present in the string).

4. Alice creates a Shared Key.

Chakrabarti P

5. Alice modifies the public key along with the modification scheme using the shared key.

6. Alice then encrypts the string with her private key and sends it back to the host along with her name.

7. The host first decrypts the public key, accordingly fetches Alice's record from the database and computes the result.

8. If a match is found, it performs another level of verification by decrypting the string with the new value of Alice's public key.

9. If that also matches, then authentication of Alice is certified.
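The first steps of this scheme can be sketched as follows (the Bernoulli sampling and the count-of-ones key update follow the text; the previous public key is our own illustrative choice):

```python
import random

# Step 1: the host draws n biased Bernoulli trials (p = probability of a 1).
def host_string(n, p, seed=0):
    rng = random.Random(seed)
    return "".join("1" if rng.random() < p else "0" for _ in range(n))

# Step 3: new public key = previous key + (number of 1s in the string).
# The count is added as an integer; the text adds the binary equivalent
# of the count, which is the same quantity.
def modified_public_key(prev_key, string):
    return prev_key + string.count("1")

string = "110011"   # the example string from the text (n = 6)
prev_key = 41       # illustrative previous public key (our choice)
new_key = modified_public_key(prev_key, string)
print(new_key)      # 41 + 4 = 45
```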

Conclusion
The techniques employed for data prediction in this paper are regression rules, probabilistic approaches, datum estimation analysis and dispersion theory. We have also shown how pattern matching can be sensed. Several approaches to shared key computation on the basis of data mining techniques have been discussed in detail with relevant mathematical analysis. The variable-key concept has also been applied in cryptic data mining through the Wide-Mouth Frog, Yahalom and SKEY protocols.