
Journal of the History of Science

PHLOGISTON 23

Belgrade, 2015
UDC 001 (091)  ISSN 035-6640


Contents

9  Pierre Hansen, Rita Macedo, Nenad Mladenović, Statistical Tests of Data Classifiability with Respect to a Clustering Criterion

[The remaining contents entries, in Serbian Cyrillic, are illegible in this scan.]

scientific review
UDC 004.62

Pierre Hansen1, GERAD, HEC Montréal, 3000 chemin de la Côte-Sainte-Catherine, Montreal, Canada H3T 2A7

Rita Macedo2, Institut de Recherche Technologique Railenium, F-59300, Famars, France

Nenad Mladenović3, Mathematical Institute of the Serbian Academy of Sciences and Arts, Belgrade

STATISTICAL TESTS OF DATA CLASSIFIABILITY WITH RESPECT TO A CLUSTERING CRITERION

    Abstract

We propose a new test for data classifiability that takes into account a clustering criterion. Our test answers the question of whether the data have a structure when a certain clustering criterion is used. To demonstrate its abilities, we develop a classifiability test with respect to the single linkage clustering criterion. With our test we are also able to answer the question of how many clusters there are in the data set. The quality of our test is checked on two well-known data sets from the literature, the Ruspini and Fisher data with 75 and 150 entities, respectively.

    Keywords: classifiability, clustering, statistical test

    1. Introduction

Data analysis is a discipline whose importance is rising due to the increasing abundance of available data, the increasing role of the Internet and the increasing ability to store huge amounts of data in computers. Data Mining and, more recently, Big Data analysis are areas that basically aim to efficiently extract knowledge from large data sets.4

1 [email protected]
2 [email protected]
3 [email protected]
4 Oded Maimon and Lior Rokach, Data Mining and Knowledge Discovery Handbook, second edition (Berlin: Springer, 2010).


Clustering is an important and well-known data reduction method in Data Mining. It is a procedure that classifies data according to their similarity (or dissimilarity). Many different clustering approaches exist in the literature. One of its paradigms is partitioning: the data are divided into a set of groups, called clusters, whose entities are more similar to each other, and/or more dissimilar to entities of other groups. This means that a good clustering method should produce groups with a high intra-cluster similarity and/or a large inter-cluster dissimilarity (see e.g.5 for an introduction).

However, it is not a good idea to perform clustering if the method cannot give any valuable answer. In addition, data can have structure with respect to one clustering criterion, but not with respect to another. Thus, the first question that one should answer before finding groups of entities is whether or not there is a structure in the data set (without explicitly identifying it) when a certain clustering criterion is used. And, if there is a structure, for how many clusters is that structure recognized?

Therefore, one needs to design a statistical test of classifiability. In this paper, we try to answer both questions (for a given (dis)similarity matrix between any two entities): is there a structure in the data, and for what number of groups? As far as we are aware, there are some procedures of this type in the literature, referred to as clustering tendency tests, but usually they are not connected to a clustering criterion. Surveys on these tests can be found in6.

In this paper, we propose a general framework for building a classifiability test with respect to (w.r.t.) a given clustering criterion. Our approach is based on the simple fact that uniformly distributed entities cannot be clustered, and that such sets therefore do not have a structure. We build a test by comparing our data set with a corresponding uniformly distributed data set that has the same features. As an example of a clustering criterion, we use the popular and polynomial hierarchical single linkage clustering method. We apply our classifiability test to this case and test it on two well-known data sets from the literature: the Ruspini data set7 with 75 points in the 2-dimensional Euclidean space (and four or five known clusters), and the Fisher iris data set8 with 150 points in the 4-dimensional Euclidean space (and three known clusters).

5 John A. Hartigan, Clustering Algorithms (New York: John Wiley and Sons, 1975); A. Gordon, Classification: Methods for the Exploratory Analysis of Multivariate Data (London: Chapman and Hall, 1981); Anil K. Jain and Richard C. Dubes, Algorithms for Clustering Data (Englewood Cliffs, New Jersey: Prentice Hall, 1988).
6 Helena Cristina Mendes Silva, Metodos de particao e validacao em analise classificatoria baseados em teoria de grafos (PhD thesis, Departamento de Matematica Aplicada, Faculdade de Ciencias da Universidade do Porto, 2005); Sergios Theodoridis and Konstantinos Koutroumbas, ed., Pattern Recognition, fourth edition (Burlington: Elsevier/Academic Press, 2009).
7 Enrique H. Ruspini, Numerical methods for fuzzy clustering, Information Sciences, 2, 3 (1970): 319-350.

These two data sets are typically used for testing clustering techniques. With our approach, we easily recognized the structure and the number of clusters in both test instances.

The paper is organized as follows. In Section 2, we describe all the steps of our generic algorithm that builds a statistical test w.r.t. a clustering criterion. In Section 3, we give more details of our algorithm for the case of the single linkage clustering criterion, illustrating the details on the Ruspini and Fisher data sets. Section 4 concludes the paper.

    2. General framework

In this section, we present our Statistical Test on Data Classifiability (STDC) algorithm. We first define clustering problems and present the possible parameters of our data classifiability test. Then, we describe the steps for building such tests w.r.t. some clustering criterion. We assume that the set of entities V to be clustered belongs to the space S (V ⊆ S) and has cardinality n (|V| = n). We also assume that the dissimilarity d(i, j) between any two entities i and j is known.

    2.1. Clustering problem

    Clustering is a popular way of finding a structure in the data9, i.e. to find an intrinsic grouping in a set of unlabeled data (unsupervised classification).

    Clustering methods detect the presence of distinct groups whose entities are more similar to each other, and/or more dissimilar to entities

8 Ronald Aylmer Fisher, The use of multiple measurements in taxonomic problems, Annals of Eugenics, 7, 2 (1936-37): 179-188.
9 Boris Mirkin, Mathematical classification and clustering: From how to what and why, in Classification, Data Analysis, and Data Highways, ed. Ingo Balderjahn, Rudolf Mathar and Martin Schader (Berlin; Heidelberg: Springer, 1998), 172-181; Anil K. Jain, M. Narasimha Murty and Patrick J. Flynn, Data clustering: a review, ACM Computing Surveys (CSUR), 31, 3 (1999): 264-323; Guojun Gan, Chaoqun Ma and Jianhong Wu, Data Clustering: Theory, Algorithms, and Applications (Philadelphia: Society for Industrial and Applied Mathematics, 2007); Rui Xu and Donald C. Wunsch, Survey of clustering algorithms, IEEE Transactions on Neural Networks, 16, 3 (2005): 645-678; Rui Xu and Donald C. Wunsch, Clustering (Oxford: Wiley-IEEE, 2009); Sergios Theodoridis and Konstantinos Koutroumbas, ed., Pattern Recognition, fourth edition (Burlington: Elsevier/Academic Press, 2009); Lior Rokach, A survey of clustering algorithms, in Data Mining and Knowledge Discovery Handbook, ed. Oded Maimon and Lior Rokach (New York; London: Springer, 2010), 269-298.


of other groups of the data set. In other words, clustering criteria measure homogeneity and/or separation among entities. There are many clustering paradigms (e.g., hierarchical, partitioning, sequential, density-based, grid-based and model-based methods) and methods that use different criteria (or objective functions) within each paradigm.10

    2.2. Parameters of the statistical test

    The following parameters have to be taken into account in our clas-sifiability test:

(1) Clustering criterion. In designing the general steps of the STDC algorithm, one can notice three levels of generality: one is based on the nature of the objects in S that we are classifying, and the other two are based on the clustering criterion and the solution method used, respectively. In other words, we can choose the same set S at the first level, but another criterion at the second level or another method for the same criterion at the third level. Alternatively, we can use the same clustering criterion, e.g. the minimum sum of squares, for grouping different object types. In both cases, one needs to develop an STDC algorithm taking into account specific knowledge and distribution types regarding both the data type and the clustering criterion. In this paper we assume that the method used at the third level is an exact one.

Besides the clustering criterion, we need to fix two other parameters before performing our statistical test: the data set type S and the shape.

(2) Data set type S. Let S be the set we want to find clusters in. From a mathematical point of view, this set can contain objects (points or entities) of different kinds. They can be vectors that belong to the Euclidean space Rd, vertices or edges of a graph, vehicle routes, or any other combinatorial objects.

(3) Shape. Usually, the domain of the data set we want to cluster does not take its values from the whole space S. The domain could be some subset D ⊆ S defining the shape of the data. For example, D could be the set of points in Rn that belong to a hypercube: aj ≤ xj ≤ bj. Following the notation from above, it is easy to see that V ⊆ D ⊆ S.

    2.3. Statistical test on data classifiability

    We present the steps of our STDC algorithm. It is based on the idea that uniformly distributed points generated in D do not have a structure, i.e., they cannot be clustered successfully, independently of any of the above parameters. Therefore, we begin by generating points in D with uniform distribution, and repeat this N times. During these repetitions

    10 Pierre Hansen and Brigitte Jaumard, Cluster analysis and mathematical programming, Mathematical Programming, 79, 1-3 (1997):191-215.


we collect statistics of the event that should be related to the clustering criterion we want to check. The next step is then to decide on the distribution type of the event, to estimate the parameters of that distribution and to check whether those parameters fit well (goodness-of-fit test).

After collecting all information regarding the distribution of the non-classifiable data set with the same size, type and shape as our data, we need to find the same distribution parameter values for our data set and compare them. If they are close, there is no structure in our data w.r.t. the given clustering criterion. Otherwise, there is a structure in the data set considered. The algorithm has the following steps:

Algorithm 1 STDC-general (N, n, V, D, S)
1. Repeat N times
   a) Generate n points from S with uniform distribution within a given surface or volume (given D);
   b) On each such random set of entities, compute statistics to estimate the parameters of chosen random variables (r.v.) with known distribution types; note that the chosen r.v. should relate to a clustering criterion;
   End Repeat
2. Compute the mean and standard deviation values of the r.v. from above;
3. Compute the same values for the given data set and plot them on the same diagram;
4. The comparison of these values with the above curves should indicate if and where there is some structure;
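The steps above can be sketched as a small Monte Carlo loop. The following Python sketch is illustrative only (the function and parameter names are ours, not the authors'); the clustering-related statistic is passed in as a callback:

```python
import numpy as np

def stdc_general(sample_uniform, statistic, data, N=1000):
    """Sketch of STDC-general: compare a clustering-related statistic on the
    data with its distribution under N uniform (structureless) samples."""
    draws = np.array([statistic(sample_uniform()) for _ in range(N)])  # step 1
    mean = draws.mean(axis=0)                     # step 2: mean over samples
    std = draws.std(axis=0, ddof=1)               # ... and standard deviation
    observed = np.asarray(statistic(data))        # step 3: statistic on data
    # step 4: values far outside [mean - std, mean + std] suggest structure
    return observed, mean, std
```

Here `sample_uniform` plays the role of step 1.a) and `statistic` of step 1.b); the caller compares `observed` against the `mean ± std` band.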

    3. STDC w.r.t. single linkage clustering

In this section, we use specific knowledge regarding the single linkage hierarchical clustering method to build a statistical test.

    3.1. Single linkage clustering criterion

Single linkage clustering belongs to the agglomerative hierarchical clustering family.11 Agglomerative hierarchical clustering methods create a hierarchy of nested partitions over the initial data set. Each entity of the set is initially placed in its own cluster, and clusters are then merged iteratively until all entities belong to a single cluster. This structure can be visualized using a dendrogram. At each step of the algorithm, two clusters are merged. This requires defining an aggregation link (single link, complete link, average link, Ward link, ...). For single linkage clustering, first introduced in12, the two clusters to be merged are the ones with the smallest pairwise dissimilarity, i.e., the ones with the minimum distance between two of their entities. In other words, the distance between two clusters is defined as the minimum distance between any two entities of the two clusters.

11 Fionn Murtagh, A survey of recent advances in hierarchical clustering algorithms, The Computer Journal, 26, 4 (1983): 354-359; Fionn Murtagh and Pedro Contreras, Algorithms for hierarchical clustering: an overview, Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 2, 1 (2012): 86-97.

It is well known that the clusters obtained by the single linkage criterion may easily be derived if the Minimum Spanning Tree (MST) of the graph is known.13 By deleting the largest edge of the MST, we obtain two clusters; the deletion of the second largest edge produces three clusters, and so on. After the final step, each entity belongs to its own cluster.
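The MST edge-deletion procedure described here can be sketched as follows. This is a minimal illustration using SciPy's MST routine; the function name and union-find details are ours, not the authors':

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import pdist, squareform

def single_linkage_clusters(points, k):
    """Partition entities into k clusters by deleting the k-1 largest
    edges of the MST of the complete weighted graph (single linkage)."""
    d = squareform(pdist(np.asarray(points, dtype=float)))
    mst = minimum_spanning_tree(d).toarray()      # the n-1 MST edges
    rows, cols = np.nonzero(mst)
    order = np.argsort(mst[rows, cols])
    keep = order[: len(order) - (k - 1)]          # drop the k-1 largest edges
    parent = list(range(len(points)))             # union-find over kept edges
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for e in keep:
        parent[find(int(rows[e]))] = find(int(cols[e]))
    return [find(i) for i in range(len(points))]  # one root label per entity
```

Entities sharing a root label lie in the same cluster, exactly as in the edge-deletion description above.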

    3.2. Algorithm

Without loss of generality, we will assume that the clustering objects are points from the Euclidean space Rd (S = Rd). We need to find the MST of each randomly generated set of objects. In order to do that, we find the distances dij between each pair of entities in Rd to get the complete weighted graph G(V, E). The STDC algorithm w.r.t. the single linkage criterion, with a given set type and a given shape, is as follows (Algorithm 2):

    3.3. Illustration on Ruspini example

Our first experiment with this algorithm was done with the Ruspini data set14 (75 points in R2). The points are located in the rectangle [0,120] × [0,160], and the parameters used were the following: n = 75, d = 2, ntest = 10000, p = 2. The results are presented in Table 1 and Figure 1.

The first column (i) of Table 1 represents the index of the elements. Columns 2–4 and 6–8 refer, respectively, to statistics (the average μ and the standard deviation σ) of the distances and ranks of the points generated at step 1.a) of STDC-1. Finally, columns 5 and 9 present the value of the ith Ruspini data set element. The four curves of the two graphs of Figure 1 represent the values of the columns of Table 1, with respect to i.

Regarding the r.v. distance, presented in Figure 1(a) and the left-hand side of Table 1, the results can be interpreted in the following way: the curve that represents the ranked lengths of the MST edges in the Ruspini data intersects the interval [μ − σ, μ + σ] for i between 71 and 72. Note that the value of 44.944 corresponds to the largest edge of the MST.

12 K. Florek et al., Sur la Liaison et la Division des Points d'un Ensemble Fini, Colloquium Mathematicum, 2 (1951): 282-285.
13 J. Gower and G. Ross, Minimum spanning trees and single linkage cluster analysis, Journal of the Royal Statistical Society, Series C, 18, 1 (1969): 54-64.
14 See footnote 7.


Algorithm 2 STDC-1 (ntest, n, d, p, shape)
1. Repeat ntest times
   a) Generate n points with uniform distribution in Rd within a given surface or volume (shape);
   b) Compute the matrix of distances between all pairs of these points (an lp-norm is used);
   c) Rank these distances by order of increasing values;
   d) Consider the following two random variables: the distance associated with the edges of the MST of the complete graph built on the n points, and the corresponding ranks;
   e) Plot the mean values and the corresponding deviations (μ − σ and μ + σ) on two diagrams;
   End Repeat
2. Compute the mean values and standard deviations of both random variables, distances and ranks;
3. Compute the same distances and ranks for the given data set (steps 1.b)–1.d)) and plot them in the same diagrams;
4. The comparison of the values of the above random variables with the curves indicates if and where there is some structure. Details regarding this step will be discussed later;
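The Monte Carlo part of STDC-1 can be sketched as follows, assuming the l2 norm and the unit hypercube as shape (function names and defaults are ours, not the authors'):

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import pdist, squareform

def mst_edge_stats(points):
    """Sorted MST edge lengths and their ranks among all pairwise distances
    (steps 1.b)-1.d) of STDC-1)."""
    d = pdist(points)                         # l2 distances, condensed form
    mst = minimum_spanning_tree(squareform(d)).toarray()
    edges = np.sort(mst[mst > 0])             # the n-1 MST edge lengths
    ranks = np.searchsorted(np.sort(d), edges) + 1  # 1-based ranks
    return edges, ranks

def stdc1(n, d, ntest=1000, rng=None):
    """Monte Carlo means/standard deviations of MST edge lengths and ranks
    for n uniform points in the unit hypercube of R^d (step 1 of STDC-1)."""
    if rng is None:
        rng = np.random.default_rng(0)
    E, R = [], []
    for _ in range(ntest):
        e, r = mst_edge_stats(rng.random((n, d)))
        E.append(e)
        R.append(r)
    E, R = np.array(E), np.array(R)
    return E.mean(0), E.std(0, ddof=1), R.mean(0), R.std(0, ddof=1)
```

Running `mst_edge_stats` on the given data set and overlaying the result on the `stdc1` bands reproduces steps 3 and 4 of the algorithm.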

          Distances                          Ranks
  i    μ−σ      μ      μ+σ     Rusp      μ−σ       μ       μ+σ     Rusp
  5    2.571   3.295   4.020   2.236    4.808    5.064    5.320      5
 10    4.067   4.806   5.545   3.000    9.754   10.411   11.069     10
 15    5.289   6.023   6.756   3.162   14.995   16.094   17.193     15
 20    6.374   7.111   7.847   3.606   20.588   22.177   23.766     20
 25    7.394   8.124   8.854   4.123   26.572   28.706   30.841     25
 30    8.380   9.110   9.840   4.243   32.952   35.768   38.584     32
 35    9.351  10.082  10.813   4.472   39.929   43.465   47.000     37
 40   10.329  11.065  11.802   5.385   47.561   51.989   56.417     48
 45   11.339  12.085  12.832   5.831   56.085   61.559   67.034     61
 50   12.395  13.160  13.926   6.403   65.661   72.513   79.364     81
 55   13.552  14.348  15.145   7.071   76.886   85.505   94.123     93
 60   14.842  15.702  16.562   8.485   90.475  101.491  112.508    126
 65   16.401  17.381  18.362  10.630  108.143  123.023  137.903    201
 70   18.555  19.870  21.185  13.601  134.985  158.357  181.729    292
 71   19.160  20.629  22.099  19.000  142.961  169.933  196.905    444
 72   19.899  21.611  23.322  24.042  152.914  185.452  217.990    550
 73   20.814  23.060  25.306  40.497  165.493  209.658  253.824    696
 74   22.186  25.898  29.609  44.944  183.370  261.402  339.434    720

    Table 1. Results for Ruspini data


Figure 1. Ruspini data: (a) r.v. distance; (b) r.v. rank

    3.4. Distribution of ranks

In order to build a precise test, we consider in more detail the distribution of the values of the rank for some edge e of the MST. The last two steps of the algorithm above are replaced with a new step:

Algorithm 3 STDC-2 (ntest, n, d, p, shape)
1. Repeat ntest times
   a) Steps 1.a)–1.e) of Algorithm 2;
   End Repeat
2. For each i = 1 to n − 1, estimate empirically the density function of the r.v. rank(i) (i.e., find the number of cases in each rank interval).

We run STDC-2(10000, 100, 2, 2, unit square). Figure 2 shows the empirical distributions obtained for i = 20, 40, 60, 80, 96 and 99. For i = 20, 40 the rank intervals were set to unity, for i = 60, 80 they were set to 5, and for i = 96, 99 to 20.

    From Figure 2, we conclude that the distribution might be Weibull, which has probability density function

f(x) = (α/λ) ((x − v)/λ)^(α−1) e^(−((x − v)/λ)^α),   if x ≥ v,

    and cumulative distribution function

F_W(x) = 1 − e^(−((x − v)/λ)^α),   if x ≥ v.

    The estimation of Weibull distribution parameters, together with the goodness-of-fit test is given in Appendix.
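As a quick numerical sanity check (ours, not from the paper), the pdf and cdf above coincide with SciPy's three-parameter Weibull, with shape α = `c`, location v = `loc` and scale λ = `scale`:

```python
import numpy as np
from scipy import stats

alpha, lam, v = 1.7, 25.0, 4.0      # illustrative shape, scale, location
x = np.linspace(v, v + 100.0, 201)

# pdf and cdf exactly as written in the text
f_closed = (alpha / lam) * ((x - v) / lam) ** (alpha - 1) \
    * np.exp(-((x - v) / lam) ** alpha)
F_closed = 1.0 - np.exp(-((x - v) / lam) ** alpha)

# they coincide with scipy's weibull_min(c, loc, scale)
f_scipy = stats.weibull_min.pdf(x, c=alpha, loc=v, scale=lam)
F_scipy = stats.weibull_min.cdf(x, c=alpha, loc=v, scale=lam)
```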

Figure 2. Distribution of the ranks: (a) i = 20, i = 40; (b) i = 60, i = 80; (c) i = 96; (d) i = 99

3.4.1. Influence of the shape on the distribution of the ranks

The next series of experiments was performed to check whether the shape (area) in which the random points are generated has some influence on the distribution of the ranks rank(i), i = 1, …, n − 1. Again, one hundred random points were generated 10000 times in R2, on four different shapes with unit surface: (i) square; (ii) rectangle; (iii) circle; (iv) triangle. We concluded that the shape has no influence on the distribution (at least for d = p = 2), for each i = 1, …, n − 1.

    3.4.2. Influence of dimension on distribution of the ranks

The dimension parameter d was changed in Algorithm STDC-2 from 2 to 10 in the next series of experiments, i.e.,

1. For each d = 2, …, 10 do
   a) STDC-2(10000, 100, d, 2, unit square);
   b) Plot the empirical density function (for each i = 1, …, n − 1);
   c) Estimate the parameters of the Weibull distribution (for each i = 1, …, n − 1);
   d) Apply the Kolmogorov-Smirnov goodness-of-fit test;
   End for
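Step d) can be sketched as follows. This is an illustrative check only, with the location parameter fixed at 0 for simplicity (the Appendix instead estimates the location as the sample minimum):

```python
import numpy as np
from scipy import stats

# Simulated stand-in for the ranks of one MST edge (illustrative parameters)
sample = stats.weibull_min.rvs(1.8, loc=0, scale=30.0, size=5000,
                               random_state=0)

# Fit the shape and scale (location fixed at 0), then apply the
# Kolmogorov-Smirnov goodness-of-fit test against the fitted law
c, loc, scale = stats.weibull_min.fit(sample, floc=0)
ks_stat, p_value = stats.kstest(sample, "weibull_min", args=(c, loc, scale))
# a small KS statistic (large p-value) means the Weibull fit is not rejected
```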

  • 18

Figure 3 shows how the dimension of the space in which the points are generated at random influences the distribution of the ranks. It is clear that the type of the distribution remains the same (Weibull), but its parameters differ as d changes.

3.5. Test for classifiability

We are now able to build the test for classifiability. The test will answer two questions: is there any structure in the given data, and, if yes, for which edges of the minimum spanning tree does it occur?

1. Simple test: The first classifiability test does not use information about the probability distribution of the r.v. rank(i), i = 1, …, n − 1. We simply keep the maximum ranks rmax(i), i = 1, …, n − 1 obtained in a sample. Those values depend on the parameters n and d (the influence of the parameter p should be checked later), i.e., for each size n and dimension d of the data, n − 1 critical values rmax(i) are given in the table. If rgiven(i) < rmax(i) for all i = 1, …, n − 1, then there is no structure in the data according to this test. Figure 4(a) and Table 2 show the result of this test on the Ruspini data (1970).
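The simple test amounts to an element-wise comparison of the observed MST-edge ranks with the tabulated maxima; a minimal sketch (function name ours):

```python
import numpy as np

def simple_test(r_given, r_max):
    """Simple classifiability test sketch: structure is reported for the MST
    edges whose observed rank reaches the maximum rank seen in the uniform
    sample, i.e. where r_given(i) < r_max(i) fails."""
    r_given, r_max = np.asarray(r_given), np.asarray(r_max)
    structured = r_given >= r_max
    return bool(structured.any()), np.nonzero(structured)[0]
```

For example, with the Ruspini-like values rgiven = (5, 10, 444) against rmax = (8, 16, 321), only the third edge violates the bound, so structure is reported there.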

Figure 3. Influence of dimension on the distribution: (a) d = 2, …, 10, i = 40; (b) d = 2, …, 10, i = 95; (c) d = 2, …, 5, i = 99; (d) d = 2, …, 10, i = 99

  • 19

2. Classifiability test: In this test we compare rγ(i) and rgiven(i), for given values of γ. The values of rγ(i) are functions of the three parameters of the Weibull distribution. In other words, once the parameters α(i), λ(i) and v(i) are estimated in the way explained in Section 5, the critical values rγ(i) are obtained from 1 − FW(rγ(i)) < γ. Similarly to the Simple test, we test (now with a given significance γ) classifiability by checking whether

rgiven(i) < rγ(i),   ∀ i = 1, …, n − 1.

If yes, there is no structure in the data (for the given tolerance γ). If no, there is a structure in the data. Table 2 and Figure 4 show rγ(i) for γ = 10^-2, 10^-3, …, 10^-8 and n = 75.
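Assuming fitted parameters (α(i), λ(i), v(i)) for the rank of edge i, the critical value can be read off the Weibull quantile function; a hypothetical sketch (function name ours):

```python
import math
from scipy import stats

def critical_rank(alpha, lam, v, gamma):
    """Smallest rank r with 1 - F_W(r) < gamma, i.e. the (1 - gamma)-quantile
    of the fitted three-parameter Weibull, rounded up to an integer rank."""
    q = stats.weibull_min.ppf(1.0 - gamma, c=alpha, loc=v, scale=lam)
    return math.ceil(q)
```

For instance, with α = 1 (the exponential special case), λ = 10, v = 0 and γ = 10^-2, the quantile is 10 ln(100) ≈ 46.05, so the critical rank is 47.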

3. Clustering: Moreover, we give a procedure to obtain an optimal partition of the set of n entities into n − i* + 1 clusters, where i* represents the index i for which 1 − FW(rgiven(i)) is minimal.

a) Find the index i* such that 1 − FW(rgiven(i)) is minimal;
b) Delete all edges of the minimum spanning tree whose weights are greater than or equal to the weight of edge i*;
c) Entities in the same cluster are connected by the remaining edges of the minimum spanning tree.
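The choice of i*, and hence of the number of clusters, can be sketched as follows (a hypothetical illustration; pi(i) stands for 1 − FW(rgiven(i)), and the parameter tuples are assumed already fitted):

```python
import numpy as np
from scipy import stats

def number_of_clusters(r_given, weibull_params):
    """Pick the MST edge i* whose observed rank is most extreme under the
    fitted Weibull law; cutting edges i*, ..., n-1 (1-based) then yields
    n - i* + 1 clusters for the n entities."""
    pi = np.array([stats.weibull_min.sf(r, c=a, loc=v, scale=lam)  # 1 - F_W
                   for r, (a, lam, v) in zip(r_given, weibull_params)])
    i_star = int(np.argmin(pi)) + 1       # 1-based index of the cut edge
    n = len(r_given) + 1                  # a tree on n entities has n-1 edges
    return n - i_star + 1, i_star, pi
```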

Figure 4. Tests on the Ruspini data set (n = 75, d = 2, p = 2): (a) simple classifiability test; (b) classifiability test

The last column of Table 2 presents π(i) = 1 − FW(rgiven(i)) for the Ruspini data set. It appears that partitioning the 75 entities into 3, 4, 5, 7 and 8 clusters is equally good (π(3) = π(4) = … = π(8) = 0).

4. Conclusions

In the clustering literature, clustering methods are sometimes applied without knowing whether there is a structure in the data, and therefore the question of whether to perform clustering at all should be posed. In addition, a given clustering criterion sometimes recognizes similar groups in the data and sometimes does not. Thus, we need to know which criterion will be used, and then investigate whether there is a structure in the data with respect to that particular criterion. In this paper we propose a classifiability test with respect to a given clustering criterion and show its steps for single linkage hierarchical clustering. Our test also answers the question: what is the best number of clusters in the given data? Our methodology is illustrated on classical test instances from the literature (the Ruspini and Fisher data). Future work may include the development of our classifiability test with respect to other criteria, such as minimum sum-of-squares clustering.

  i   10^-2  10^-3  10^-4  10^-5  10^-6  10^-7  10^-8   rgiven   rmax      π(i)
  5      6      6      6      6      6      7      7        5      8   1.00000000000
 10     12     13     14     15     16     17     18       10     16   1.00000000000
 15     21     23     26     28     31     33     35       15     25   1.00000000000
 20     28     30     33     35     37     39     40       20     33   1.00000000000
 25     35     38     40     42     44     46     47       25     44   1.00000000000
 30     44     47     49     51     53     54     55       32     57   0.92247009277
 35     53     56     59     61     62     64     65       37     65   0.98115360737
 40     64     67     70     73     75     77     78       48     76   0.80652469397
 45     76     80     84     87     90     92     94       61     91   0.52944910526
 50     90     96    100    104    107    109    111       81    108   0.12502062321
 55    107    114    119    124    128    131    133       93    131   0.20884108543
 60    129    138    145    150    155    159    162      126    164   0.01840615273
 61    135    144    152    158    164    168    171      130    173   0.02484440804
 62    140    149    157    163    169    173    176      138    176   0.01360338926
 63    146    157    165    172    178    182    186      166    191   0.00005775690
 64    152    163    171    178    184    189    193      171    198   0.00008833408
 65    161    173    183    192    199    205    209      201    207   0.00000047684
 66    169    183    193    202    210    216    221      202    211   0.00000959635
 67    178    192    203    213    221    228    233      225    224   0.00000029802
 68    189    205    217    227    237    244    249      271    251   0.00000000000
 69    202    219    233    245    255    263    270      283    276   0.00000000000
 70    218    238    253    266    278    287    294      292    298   0.00000005960
 71    240    263    281    297    310    321    329      444    321   0.00000000000
 72    272    301    325    345    363    377    388      550    403   0.00000000000
 73    327    368    402    432    458    479    495      696    548   0.00000000000
 74    470    550    619    679    733    775    809      720    818   0.00000178814

Table 2. Ruspini data. Critical ranks rγ(i) for different tolerances γ

Figure 5. Tests on the Fisher data set: (a) classifiability test; (b) classifiability test

  i   10^-2  10^-3  10^-4  10^-5  10^-6  10^-7  10^-8   rgiven   rmax     π(i)
  5      6      6      6      6      6      6      6        5      7   0.10E+01
 10     11     11     12     12     12     13     13       10     14   0.10E+01
 15     17     18     19     20     20     21     22       15     21   0.10E+01
 20     24     25     27     28     30     31     33       22     27   0.55E-01
 25     31     33     36     39     41     43     45       28     34   0.72E-01
 30     37     40     42     44     47     48     50       33     41   0.17E+00
 35     43     46     48     50     52     54     55       39     50   0.17E+00
 40     50     52     55     57     58     60     61       55     60   0.36E-04
 45     56     59     61     63     65     66     67       65     70   0.54E-06
 50     63     66     68     70     72     73     74       75     77   0.00E+00
 60     78     81     83     85     87     89     90       96     93   0.00E+00
 70     93     97     99    101    103    105    106      120    108   0.00E+00
 80    111    115    119    122    124    126    127      148    129   0.00E+00
 90    131    137    141    145    148    150    152      192    155   0.00E+00
100    153    159    164    168    171    174    176      252    180   0.00E+00
110    179    187    193    198    202    206    208      295    211   0.00E+00
120    215    225    234    241    247    252    256      343    250   0.00E+00
130    267    283    295    306    315    322    327      451    314   0.00E+00
140    377    404    425    444    459    472    481      852    502   0.00E+00
145    528    579    619    654    685    709    727     1250    797   0.00E+00
146    594    659    713    759    799    830    855     1337    997   0.00E+00
147    685    767    835    893    944    984   1015     1651   1148   0.00E+00
148    855    975   1074   1160   1236   1297   1343     1973   1354   0.00E+00
149   1379   1670   1924   2152   2359   2526   2657     4470   2563   0.00E+00

Table 3. Fisher data. Critical ranks rγ(i) for different tolerances γ

    5. Appendix

    5.1 Fitting the parameters of the Weibull distribution

We estimate the shape, scale and location parameters (α, λ, v) of the Weibull distribution, for each rank i = 1, …, n − 1, in the following way:

v(i) = min{X1(i), X2(i), …, Xntest(i)} = Xmin(i).

We estimate the mean values (μ(i)) and variances (σ²(i)) by

μ(i) = (Σ_{j=1}^{ntest} Xj(i)) / ntest,   σ²(i) = (Σ_{j=1}^{ntest} (Xj(i) − μ(i))²) / (ntest − 1),   i = 1, …, n − 1.

Then, fitting α(i) and λ(i) is done in the following way. α(i) can be obtained from

(μ(i) − v(i))² / (σ²(i) + (μ(i) − v(i))²) = Γ(1 + 1/α) Γ(1 + 1/α) / Γ(1 + 2/α),

    where () is the gamma function. In order to facilitate this computa-tion, Table 4 provides values of [1 + z][1 + z]/[1 + 2z] for values of z between 0 and 1. If z exceeds 1, Table 4 is extended using the known equation (z + 1) = z(z), i.e., let us denote with

    squares clustering.

    5 Appendix

    5.1 Fitting the parameters of the Weibull distri-bution

    We estimate the shape, scale and location parameters (, , v)of the Weibull distribution, for each rank i = 1, . . . , n 1),on the following way:

    v(i) = min{X1(i), X2(i), . . . , Xntest(i)} = Xmin(i).

    We estimate mean values ((i)) and variances (2(i)) by

    (i) =

    ntestj=1 Xj(i)

    ntest, 2(i) =

    ((i)Xj(i))2(ntest 1)2 , i = 1, . . . , n1.

    Then, fitting (i) and (i) is done in the following way. (i)can be obtained from

    ((i) v(i))22(i) + ((i) v(i))2 =

    [1 + 1 ][1 +1 ]

    [1 + 2 ],

    where () is the gamma function. In order to facilitate thiscomputation, Table 4 provides values of [1+z][1+z]/[1+2z] for values of z between 0 and 1. If z exceeds 1, Table 4 isextended using the known equation (z+1) = z(z), i.e., letus denote with

    k(z) =[1 + (k + z)][1 + (k + z)]

    [1 + 2(k + z)], k = 0, 1, 2, . . . .

    The values of o(z), z [0, 1) are given in the Table 4. Byusing (z + 1) = z(z), we derive the recurrent relation

    k(z) =k + z

    4(k + z) 2k1(z), k = 1, 2, . . . ; z [0, 1).

    19

    The values of o(z), z [0,1) are given in the Table 4. By using (z + 1) = z(z), we derive the recurrent relation

    squares clustering.

    5 Appendix

    5.1 Fitting the parameters of the Weibull distri-bution

    We estimate the shape, scale and location parameters (, , v)of the Weibull distribution, for each rank i = 1, . . . , n 1),on the following way:

    v(i) = min{X1(i), X2(i), . . . , Xntest(i)} = Xmin(i).

    We estimate mean values ((i)) and variances (2(i)) by

    (i) =

    ntestj=1 Xj(i)

    ntest, 2(i) =

    ((i)Xj(i))2(ntest 1)2 , i = 1, . . . , n1.

    Then, fitting (i) and (i) is done in the following way. (i)can be obtained from

    ((i) v(i))22(i) + ((i) v(i))2 =

    [1 + 1 ][1 +1 ]

    [1 + 2 ],

    where () is the gamma function. In order to facilitate thiscomputation, Table 4 provides values of [1+z][1+z]/[1+2z] for values of z between 0 and 1. If z exceeds 1, Table 4 isextended using the known equation (z+1) = z(z), i.e., letus denote with

    k(z) =[1 + (k + z)][1 + (k + z)]

    [1 + 2(k + z)], k = 0, 1, 2, . . . .

    The values of o(z), z [0, 1) are given in the Table 4. Byusing (z + 1) = z(z), we derive the recurrent relation

    k(z) =k + z

    4(k + z) 2k1(z), k = 1, 2, . . . ; z [0, 1).

z      0.00   0.01   0.02   0.03   0.04   0.05   0.06   0.07   0.08   0.09

0.00  1.0000 0.9998 0.9993 0.9985 0.9975 0.9961 0.9945 0.9928 0.9906 0.9884
0.10  0.9858 0.9830 0.9801 0.9768 0.9735 0.9699 0.9664 0.9625 0.9585 0.9545
0.20  0.9502 0.9458 0.9412 0.9367 0.9319 0.9271 0.9221 0.9170 0.9119 0.9067
0.30  0.9015 0.8961 0.8906 0.8852 0.8796 0.8741 0.8683 0.8626 0.8568 0.8512
0.40  0.8453 0.8395 0.8336 0.8274 0.8215 0.8156 0.8095 0.8035 0.7975 0.7914
0.50  0.7854 0.7793 0.7732 0.7672 0.7611 0.7550 0.7489 0.7428 0.7367 0.7307
0.60  0.7246 0.7186 0.7125 0.7064 0.7004 0.6944 0.6885 0.6825 0.6765 0.6706
0.70  0.6647 0.6588 0.6529 0.6470 0.6412 0.6354 0.6296 0.6239 0.6182 0.6125
0.80  0.6068 0.6012 0.5955 0.5899 0.5844 0.5789 0.5734 0.5679 0.5625 0.5571
0.90  0.5518 0.5464 0.5411 0.5359 0.5306 0.5254 0.5203 0.5152 0.5101 0.5050

Table 4: Values of Γ(1 + z)Γ(1 + z)/Γ(1 + 2z)

Once an estimate of β has been determined, we can obtain an estimate of α through the formula

α(i) = (μ(i) − v(i)) / Γ(1 + 1/β(i)).

    For calculating the gamma function, Table 5 has been used.

z     0.00   0.01   0.02   0.03   0.04   0.05   0.06   0.07   0.08   0.09

1.0  1.0000  .9943  .9888  .9835  .9784  .9735  .9687  .9642  .9597  .9555
1.1   .9514  .9474  .9436  .9399  .9364  .9330  .9298  .9267  .9237  .9209
1.2   .9182  .9156  .9131  .9108  .9085  .9064  .9044  .9025  .9007  .8990
1.3   .8975  .8960  .8946  .8934  .8922  .8912  .8902  .8893  .8885  .8879
1.4   .8873  .8868  .8864  .8860  .8858  .8857  .8856  .8856  .8857  .8859
1.5   .8862  .8866  .8870  .8876  .8882  .8889  .8896  .8905  .8914  .8924
1.6   .8935  .8947  .8959  .8972  .8986  .9001  .9017  .9033  .9050  .9068
1.7   .9086  .9106  .9126  .9147  .9168  .9191  .9214  .9238  .9262  .9288
1.8   .9314  .9341  .9368  .9397  .9426  .9456  .9487  .9518  .9551  .9584
1.9   .9618  .9652  .9688  .9724  .9761  .9799  .9837  .9877  .9917  .9958

Table 5: Values of the gamma function Γ(z)

Again, the relation Γ(z + 1) = zΓ(z) is used to extend Table 5 for z ≥ 2. In both Table 4 and Table 5, linear interpolation is used when the required value falls between two successive tabulated values.

Table 6 shows the obtained results for all three parameters, for some values of i (i = 5, 10, . . . , 95, 96, . . . , 99).


i    v(i)     β(i)      α(i)        i    v(i)      β(i)      α(i)

5    5.000    0.99111   0.04980     65   72.000    3.07997   23.4661
10   10.000   0.99447   0.29480     70   79.000    3.24667   29.0302
15   15.000   0.99845   0.77690     75   91.000    2.88167   31.2256
20   20.000   1.16393   1.58893     80   98.000    3.22581   41.3808
25   25.000   1.48851   2.82744     85   109.000   3.22477   51.5357
30   30.000   1.85467   4.41614     90   122.000   3.26499   67.0146
35   35.000   2.21231   6.41959     95   147.000   2.77521   90.1005
40   40.000   2.55912   8.83501     96   151.000   2.71648   102.7310
45   45.000   2.93430   11.7331     97   155.000   2.62628   121.6200
50   51.000   3.02372   14.1914     98   161.000   2.41911   151.6667
55   58.000   2.92699   16.3426     99   165.000   1.97237   228.9171
60   64.000   3.15736   20.4163

Table 6: Fitting the Weibull distribution to data


    5.2 Goodness-of-fit test

In order to make a probability judgment about our choice of distribution, we do the following:

1. compare the empirical cumulative distribution S_ntest(k(i)), k(i) = r_min(i), . . . , r_max(i); i = 1, . . . , n − 1, with the cumulative Weibull distribution F_W(k(i)), where r_min(i) and r_max(i) are the minimum and maximum values of the ranks obtained in the sample (ntest = 10000), respectively. We found that for n = 100,

max |S_ntest(k(i)) − F_W(k(i))| < 0.05.

2. use the Kolmogorov–Smirnov goodness-of-fit test (as suggested, for example, in Golden and Alt15). The hypothesis was accepted for every i.

15 Bruce L. Golden and Frank B. Alt, Interval estimation of a global optimum for large combinatorial problems, Naval Research Logistics Quarterly, 26, 1 (1979): 69-77.


    References

1. Fisher, Ronald Aylmer. The use of multiple measurements in taxonomic problems. Annals of Eugenics, 7, 2 (1936-37): 179-188.

2. Florek, K., Jan Lukaszewicz, J. Perkal and S. Zubrzycki. Sur la Liaison et la Division des Points d'un Ensemble Fini. Colloquium Mathematicae, 2 (1951): 282-285.

3. Gan, Guojun, Chaoqun Ma and Jianhong Wu. Data Clustering: Theory, Algorithms, and Applications. Philadelphia: Society for Industrial and Applied Mathematics, 2007.

4. Golden, Bruce L. and Frank B. Alt. Interval estimation of a global optimum for large combinatorial problems. Naval Research Logistics Quarterly, 26, 1 (1979): 69-77.

5. Gordon, A. Classification: Methods for the Exploratory Analysis of Multivariate Data. London: Chapman and Hall, 1981.

6. Gower, J. and G. Ross. Minimum spanning trees and single linkage cluster analysis. Journal of the Royal Statistical Society, Series C, 18, 1 (1969): 54-64.

7. Hansen, Pierre and Brigitte Jaumard. Cluster analysis and mathematical programming. Mathematical Programming, 79, 1-3 (1997): 191-215.

8. Hartigan, John A. Clustering Algorithms. New York: John Wiley and Sons, 1975.

9. Jain, Anil K. and Richard C. Dubes. Algorithms for Clustering Data. Englewood Cliffs, New Jersey: Prentice Hall, 1988.

10. Jain, Anil K., M. Narasimha Murty and Patrick J. Flynn. Data clustering: a review. ACM Computing Surveys (CSUR), 31, 3 (1999): 264-323.

11. Maimon, Oded and Lior Rokach. Data Mining and Knowledge Discovery Handbook, second edition. Berlin: Springer, 2010.

12. Mirkin, Boris. Mathematical classification and clustering: From how to what and why. In Classification, Data Analysis, and Data Highways, edited by Ingo Balderjahn, Rudolf Mathar and Martin Schader, 172-181. Berlin; Heidelberg: Springer, 1998.

13. Murtagh, Fionn. A survey of recent advances in hierarchical clustering algorithms. The Computer Journal, 26, 4 (1983): 354-359.

14. Murtagh, Fionn and Pedro Contreras. Algorithms for hierarchical clustering: an overview. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 2, 1 (2012): 86-97.

15. Rokach, Lior. A survey of clustering algorithms. In Data Mining and Knowledge Discovery Handbook, edited by Oded Maimon and Lior Rokach, 269-298. New York; London: Springer, 2010.

16. Ruspini, Enrique H. Numerical methods for fuzzy clustering. Information Sciences, 2, 3 (1970): 319-350.

17. Silva, Helena Cristina Mendes. Métodos de partição e validação em análise classificatória baseados em teoria de grafos. PhD thesis, Departamento de Matemática Aplicada, Faculdade de Ciências da Universidade do Porto, 2005.

18. Theodoridis, Sergios and Konstantinos Koutroumbas. Pattern Recognition, fourth edition. Burlington: Elsevier/Academic Press, 2009.

19. Xu, Rui and Donald C. Wunsch. Survey of clustering algorithms. IEEE Transactions on Neural Networks, 16, 3 (2005): 645-678.

20. Xu, Rui and Donald C. Wunsch. Clustering. Oxford: Wiley-IEEE, 2009.


Pierre Hansen
GERAD, HEC Montréal, 3000 chemin de la Côte-Sainte-Catherine, Montreal, Canada H3T 2A7

Rita Macedo
Institut de Recherche Technologique Railenium, F-59300, Famars, France

Nenad Mladenović
Mathematical Institute of the Serbian Academy of Sciences and Arts, Belgrade

STATISTICAL TESTS OF DATA CLASSIFIABILITY WITH RESPECT TO A CLUSTERING CRITERION

We propose a new test for data classifiability that takes a given clustering criterion into account. The test answers the question of whether the data have a structure under a certain clustering criterion or not. To demonstrate its abilities, we develop a classifiability test with respect to the single linkage clustering criterion. The test can also answer the question of how many clusters there are in the data set. Its quality is checked on two well known data sets from the literature, the Ruspini and Fisher data, with 75 and 150 entities, respectively.

Keywords: data classifiability, clustering criteria, single linkage method

    Accepted for Publication December 8th 2015.


UDC 66.011:004

Ivan Juranić1
University of Belgrade, Faculty of Chemistry, Institute of Chemistry, Technology and Metallurgy, Belgrade

WHAT IS THE VALUE OF COMPUTATIONAL MODELING OF CHEMICAL PROCESSES AND STRUCTURES?

This article gives a short overview of modeling in science, and in chemistry in particular. Research in the sciences largely amounts to defining and studying models; here, computational modeling is considered for the most part. Several examples show how molecular modeling is useful in chemical research and in the design of new materials.

Keywords: models in science, quantum-chemical molecular modeling, multiscale (molecular) modeling

    1.

    . - . 19. , , - . , ( ) -, , .

    1 [email protected]


    - . -, - .

    2.

    .2 , . . , , , , , . .

    , : - 3 .4

    ( 1).5 - , - - . , . , .

    . , . , - , .6

2 Arturo Rosenblueth and Norbert Wiener, The Role of Models in Science, Philosophy of Science, 4, 12 (1945): 316-321, preuzeto 10.06.2011, http://www.jstor.org/stable/184253.
3 Alisa Bokulich, How Scientific Models Can Explain, Synthese, 1, 180 (2011): 33-45.
4 John K. Gilbert, Models And Modelling: Routes To More Authentic Science Education, International Journal of Science and Mathematics Education, 2 (2004): 115-130.
5 Peter Godfrey-Smith, Models and Fictions in Science, Philosophical Studies, 143 (2009): 101-116. (Proceedings of the 2008 Oberlin Colloquium in Philosophy).
6 Jim Bull, Models are the Building Blocks of Science, in Scientific Decision-Making, 46, preuzeto 08.02.2015, https://www.utexas.edu/courses/bio301d/Topics/Models/Text.html.


    1. (Ronald Giere), 19887

    . . - 1. , - (top-down), , - e , - (bottom-up).

    1.

    , , ,

    Newton- , , , , .

    , , ,

    , ,

    ,

    ( ), , .

    7 Ronald N. Giere, Explaining Science: A Cognitive Approach (Chicago: Chicago University Press, 1988), 321.


    , .

    , , , .8 : , -, , , /.

    : , - , , , , , .

    , , -; , - .9

    , , - . - , . , -, ( 2).

    8 Herbert Stachowiak, ed., Modelle - Konstruktion der Wirklichkeit (Mnc-hen: Wilhelm Fink Verlag, 1983), 17-86, preuzeto08.02.2015, http://www.muellerscience.com/MODELL/Begriffsgeschichte/GeschichtedesModellden-kens1978-79.htm.9 Jack Simons, The Roles of Theoretical and Computational Chemistry in the Chemistry Curriculum and in Research, preuzeto08.02.2015, http://www.go-ogle.hr/url?url=http://simons.hec.utah.edu/TheoryPage/WhatisTheoretical-Chem.ppt&rct=j&q=&esrc=s&sa=U&ei=lubXVNnuDMnWauzCgKAD&ved=-0CBgQFjAA&sig2=cKuVgqsczMMCkp3LXfebdQ&usg=AFQjCNE1BthDSN-hBU7fXORzXo-4jvIwTDw.

    , - . .

    19. , -. (, , -,...) . , .


    : Rosarium Philosophorum,10

    . : 11

    F (BoNTF), (VAMP, vesicle-associated

    membrane protein).

    10 18- , (John Ferguson) . 210, Rosarium Philosophorum, 10, 18.10.2014, http://special.lib.gla.ac.uk/exhibns/month/april2009.html. 11 Rakhi Agarwal, James J. Schmidt, Robert G. Stafford and Subramanyam Swaminathan, Mode of VAMP substrate recognition and inhibition of Clo-stridium botulinum neurotoxin F, Nature Structural & Molecular Biology, 16,7 (2009): 789-794.

    2.


    -, .

    , , ; , , .

    , - . - :

    ,

    ,

    , , ,

    - .

    , , .

    ? - . -, . : 1) - , (Isaac Newton) - , a; 2) - , - ! -, .

    , - , . , .12 , -

    12 Joshua A. Lerman et al, In silico method for modelling metabolism and gene product expression at genome scale, Nature Communications, 3, 929 (2012): 1-10, preuzeto 08. 02. 2015, http://www.nature.com/ncomms/journal/v3/n7/full/ncomms1928.html.


    , .13

    , - , : . , .

    () . , - - .14 , , - . , (Richard Feynman): , .15

    . - . , 80 - (), .

    , - (Erwin Rudolf Josef Alexander Schrdinger) , - . :

HΨ = EΨ,

    13 Philipp Trster, Konstantin Lorenzen and Paul Tavan, Polarizable Six-Point Water Models from Computational and Empirical Optimization, The Journal of Chemical Physics B, 118 (2014): 15891602. 14 Paul Adrien Maurice Dirac, Quantum Mechanics of Many-Electron Sys-tems, Proceedings of the Royal Society of London A, 123 (1929): 714-733. The general theory of quantum mechanics is now complete... , . 15 Richard P. Feynman, The Character of Physical Law, preuzeto 08.02.2015, http://www.openculture.com/2012/08/the_character_of_physical_law_ric-hard_feynmans_legendary_lecture_series_at_cornell_1964.html; Tony Hey and Patrick Walters, The New Quantum Universe (Cambridge:Cambridge Uni-versity Press, 2003), 19.


    - (), () . () e ( ). ,16 .17 , .

    , , - , .

    - , , , - , . - . -, , (basis set). , . ( -), - .

    , - , -. , , . ( - ) .

    - . , - - , ( -) , , .

    16 , - .17 : , , Phlogiston, 22 (2014): 35-62.


    , - , .

    3.

    - .

    - . :

    , , -- , (-) - .

    - ( ).

    (QSPR QSAR).

    , .

    ().

    : , - ; - , , ( 3).

    3. - . : ; :

    . .


    . , C(CO3N)4 Cn(CO3N)2n+2.18 - C(C(O)-O-N=O)4 ( 4).

    , - , .

    , , . , . - , ( , - ).

    - , . , - , CCH, CHNO, 2,... , .19 , -

    18 Robert W. Zoellnera, Clara L. Lazen and Kenneth M. Boehr, A computati-onal study of novel nitratoxycarbon, nitritocarbonyl, and nitrate compounds and their potential as high energy materials, Computational and Theoretical Chemistry, 979 (2012), 3337. 19 CCH . 4. Edward L. Cochran, Frank J. Adrian and Vernon A. Bowers, ESR Study of Ethynyl and Vinyl Free Radicals, Journal of Chemical Physics, 40, 213 (1964)

    A) ) a 4. A) ,

    )


    ( ), - .20 .

    - . - 21 - .22

    - 23 .

    CHNO, : - , , ( ). 2, , .20 Stanka Jerosimi, Ljiljana Stojanovi and Miljenko Peri, Ab initio study of the 12-2 electronic transition of C2As, Journal of Chemical Physics, 133 (2010): 024307; Ljiljana Stojanovi, Stanka Jerosimi and Miljenko Peri, An ab initio study on the ground and low lying doublet electronic states of linear C2As, Chemical Physics, 379 (2011): 57.21 Vesna D. Vitnik et al., Quantum mechanical and spectroscopic (FT-IR, 13C, 1H NMR and UV) investigations of potent antiepileptic drug 1-(4-chlo-ro-phenyl)-3-phenyl-succinimide, Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy, 117 (2014): 4253, preuzeto 08.02.2015, http://dx.doi.org/10.1016/j.saa.2013.07.099; Ivan Jurani, Simple Method for the Estimation of pKa of Amines, Croatica Chemica Acta 87, 4 (2014): 343347, preuzeto 08.02.2015, http://dx.doi.org/10.5562/cca2462. 22 Maja D. Vitorovi-Todorovi et al., Structural modifications of 4-aryl-4-oxo-2-ami nylbutanamides and their acetyl- and butyrylcholinesterase inhibitory activity. Investigation of AChE-ligand interactions by docking calculations and molecular dynamics simulations, European Journal of Medicinal Chemi-stry, 81 (2014): 158-175, preuzeto 08.02.2015, http://dx.doi.org/10.1016/j.ejmech.2014.05.008. 23 (Proxy) . , , . , (), ( ), . , -, , , 08.02.2015. , http://en.wiktionary.org/wiki/proxy.


    ( 5) , - - - .

    , , - , () . , - ( , -)24 2013. .25

    , , , - . ( 6), - , . - , ( ) , . ( ), .

    24 Martin Karplus, Michael Levitt Arieh Warshel.25 08.02.2015, http://www.nobelprize.org/nobel_prizes/chemistry/laureates/2013/press.html.

    5. - .

    , .

    .


    , , .

    , . - , . .

    , , - - , .

    (Multiscale modeling) , . . : - ( ), -

    6. . .

    (QM - ; MD / - ; MS

    ; FM , .)


    - ( ), ( ), , . - ( 6). - , .

    - : - , (Royal Swedish Academy of Sciences) - 2013. . - , , , .

    - . -, - (Karl Irikura): . . .26

1. Agarwal, Rakhi, James J. Schmidt, Robert G. Stafford and Subramanyam Swaminathan. Mode of VAMP substrate recognition and inhibition of Clostridium botulinum neurotoxin F. Nature Structural & Molecular Biology, 16, 7 (2009): 789-794.

2. Bokulich, Alisa. How Scientific Models Can Explain. Synthese, 1, 180 (2011): 33-45.

3. Bull, Jim. Models are the Building Blocks of Science. In Scientific Decision-Making, 46. Preuzeto 08.02.2015. https://www.utexas.edu/courses/bio301d/SDM.pdf.

4. Chen, Sheng and Hoi Ying Wan. Molecular mechanisms of substrate recognition and specificity of botulinum neurotoxin serotype F. Biochemical Journal, 433 (2011): 277-284.

26 Preuzeto 08.02.2015, http://www.pubfacts.com/author/Karl+K+Irikura; http://kwanty.wchuwr.pl/.

5. Dirac, Paul Adrien Maurice. Quantum Mechanics of Many-Electron Systems. Proceedings of the Royal Society of London A, 123 (1929): 714-733.

6. Feynman, Richard. The Character of Physical Law. Preuzeto 08.02.2015. http://www.openculture.com/2012/08/the_character_of_physical_law_richard_feynmans_legendary_lecture_series_at_cornell_1964.html.

7. Gilbert, John K. Models And Modelling: Routes To More Authentic Science Education. International Journal of Science and Mathematics Education, 2 (2004): 115-130.

8. Godfrey-Smith, Peter. Models and Fictions in Science. Philosophical Studies, 143 (2009): 101-116.

9. Giere, Ronald N. Explaining Science: A Cognitive Approach. Chicago: Chicago University Press, 1988.

10. Hey, Tony and Patrick Walters. The New Quantum Universe. Cambridge: Cambridge University Press, 2003.

11. Juranić, Ivan. Simple Method for the Estimation of pKa of Amines. Croatica Chemica Acta, 87, 4 (2014): 343-347. Preuzeto 08.02.2015. http://dx.doi.org/10.5562/cca2462.

12. Jerosimić, Stanka, Ljiljana Stojanović and Miljenko Perić. Ab initio study of the 12-X2 electronic transition of C2As. Journal of Chemical Physics, 133 (2010) 024307: 1-10.

13. Lerman, Joshua A., Daniel R. Hyduke, Haythem Latif, Vasiliy A. Portnoy, Nathan E. Lewis, Jeffrey D. Orth, Alexandra C. Schrimpe-Rutledge, Richard D. Smith, Joshua N. Adkins, Karsten Zengler and Bernhard O. Palsson. In silico method for modelling metabolism and gene product expression at genome scale. Nature Communications, 3, 929 (2012): 1-10. Preuzeto 08.02.2015. http://www.nature.com/ncomms/journal/v3/n7/full/ncomms1928.html.

14. Models are the Building Blocks of Science. Preuzeto 08.02.2015. https://www.utexas.edu/courses/bio301d/Topics/Models/Text.html.

15. Preuzeto 08.02.2015. http://www.muellerscience.com/MODELL/Begriffsgeschichte/GeschichtedesModelldenkens1978-79.htm.

16. Preuzeto 18.10.2014. http://special.lib.gla.ac.uk/exhibns/month/april2009.html.

17. Preuzeto 08.02.2015. http://www.nobelprize.org/nobel_prizes/chemistry/laureates/2013/press.html.

18. Preuzeto 08.02.2015. http://www.pubfacts.com/author/Karl+K+Irikura; http://kwanty.wchuwr.pl/.

19. Rosenblueth, Arturo and Norbert Wiener. The Role of Models in Science. Philosophy of Science, 4, 12 (1945): 316-321. Preuzeto 10.06.2011. http://www.jstor.org/stable/184253.

20. Stojanović, Ljiljana, Stanka Jerosimić and Miljenko Perić. An ab initio study on the ground and low lying doublet electronic states of linear C2As. Chemical Physics, 379 (2011): 57.

21. Simons, Jack. The Roles of Theoretical and Computational Chemistry in the Chemistry Curriculum and in Research. Preuzeto 08.02.2015. http://www.google.hr/url?url=http://simons.hec.utah.edu/TheoryPage/WhatisTheoreticalChem.ppt&rct=j&q=&esrc=s&sa=U&ei=lubXVNnuDMnWauzCgKAD&ved=0CBgQFjAA&sig2=cKuVgqsczMMCkp3LXfebdQ&usg=AFQjCNE1BthDSNhBU7fXORzXo-4jvIwTDw.

22. Stachowiak, Herbert, ed. Modelle - Konstruktion der Wirklichkeit. München: Wilhelm Fink Verlag, 1983. Preuzeto 08.02.2015. http://www.muellerscience.com/MODELL/Begriffsgeschichte/GeschichtedesModelldenkens1978-79.htm.

23. Tröster, Philipp, Konstantin Lorenzen and Paul Tavan. Polarizable Six-Point Water Models from Computational and Empirical Optimization. The Journal of Chemical Physics B, 118 (2014): 1589-1602.

24. Vitnik, Vesna D., Željko J. Vitnik, Nebojša R. Banjac, Nataša V. Valentić, Gordana S. Ušćumlić and Ivan O. Juranić. Quantum mechanical and spectroscopic (FT-IR, 13C, 1H NMR and UV) investigations of potent antiepileptic drug 1-(4-chloro-phenyl)-3-phenyl-succinimide. Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy, 117 (2014): 42-53. Preuzeto 08.02.2015. http://dx.doi.org/10.1016/j.saa.2013.07.099.

25. Vitorović-Todorović, Maja D., Catherine Koukoulitsa, Ivan O. Juranić, Ljuba M. Mandić and Branko J. Drakulić. Structural modifications of 4-aryl-4-oxo-2-aminylbutanamides and their acetyl- and butyrylcholinesterase inhibitory activity. Investigation of AChE-ligand interactions by docking calculations and molecular dynamics simulations. European Journal of Medicinal Chemistry, 81 (2014): 158-175. Preuzeto 08.02.2015. http://dx.doi.org/10.1016/j.ejmech.2014.05.008.

26. Zoellner, Robert W., Clara L. Lazen and Kenneth M. Boehr. A computational study of novel nitratoxycarbon, nitritocarbonyl, and nitrate compounds and their potential as high energy materials. Computational and Theoretical Chemistry, 979 (2012): 33-37.


    Ivan JuraniUniversity of Belgrade, the Faculty of Chemistry, Institute for Chemistry, Technology and Metallurgy, Belgrade

WHAT IS THE VALUE OF COMPUTATIONAL MODELING OF CHEMICAL PROCESSES AND STRUCTURES?

In this article we present a short overview of modeling in science, particularly in chemistry. It is emphasized that research in the sciences mostly comprises the definition and the study of models. For the most part, computational modeling is considered. Computational chemistry has several sub-disciplines ranging from electronic databases (cheminformatics) to molecular modeling. It is demonstrated on several examples how molecular modeling is useful in chemistry research and in the design of new materials. Particular attention should be paid to the modeling of the interaction of small molecules (e.g. drugs, or potential drugs) with biomacromolecules (e.g., enzymes). The article concisely presents a novel approach of multiscale modeling, which makes it possible to address the most complex problems in chemistry and technology.

Keywords: Models in Science, Quantum-chemical Molecular Modeling, Multiscale (Molecular) Modeling

Accepted for Publication December 8th 2015.


UDC 531.61

    1 , ,

    oo : , , , . - , (2 -) , - , - . .

    : eja, ( eeja), , , ao oaa

    1.

    ( = y, ov= ), . 3 . , 19. ,

    1 [email protected] Johann Bernoulli.3 a) h, ., I-II (Moa: , 1972); b) , o I-II (Moa: , 1977); ( o o je a aoj eae).


    .

    , , , - . 16. 17. .4 je ( = mv2, - ), 19. , (Isaac Newton) - ( ).

    - , (Ren Descartes) 1664. , -. 5 ( , 16. )6 - ( , 17. ),7 - , .

    -, (Gottfried Leibnitz)8 - , -. 1695. - , . , .

    2.

    - . - , , eoa a Oje (Leonhard Paul Euler)9 1751. , , - (

    4 Ibid., a) 83-121; b) 90-106.5 Ibid.6 Galileo Galilei.7 Christiaan Huygens.8 Ibid.9 Milorad Mladjenovi, Razvoj fizike - mehanika i gravitacija (Beograd: Gradje-vinska knjiga, 1986), 343-345.


    ). (Lazare Carnot)10 1783. , P ds=udt dt

    , (2.1)

(moment d'activité). , - je . (Lucien Poncelet) - (Gaspard G. Coriolis)11 1827 - . , , , 19. .

    , , : ,

    , (2.2)

    . M0 M ,

    , (2.3)

    . ,

    . (2.4)

    , - , .

    , (2.5)

    10 Ibid., 344.11 6.


    , a - , ,

    ,

    . (2.6)

    , , , , - , .

    3.

    18. , - . (Alexis Claude Clairaut)1743.12 , , - , , eoa Oje 1761. -- (Joseph-Louis Lagrange) 1777. - .

    , - ( )

    (3.1)

    U , - ( ) , .

    (3.2)

    12 Mladjenovi, Razvoj fizike, 345.


    , .

    , 19. , - ,

    (3.3)

    . , ,13 - (Julius Robert von Mayer, 1845) - (Hermann Helmholtz, 1847), . (3.3)

    . (3.4)

    - M M0, = 0, M, , .

    (Thomas Young)14 1808. , (William Rankin)15 1855. , . : - , - . , - (William Thomson)16 , 19. , .

    , :

    13 6.14 Ibid.15 Ibid.16 Ibid.


    , je

    ,

    , -

    (3.5)

    . -

    (3.6)

    -

    (3.7)

    U -. (2.4)

    (3.8)

    U1 1 , a U2 2 - .

    , - , - , .

    4. ()

    18. (1720-1730) 17 , ( , ).

    17 h, ., I-II, 125-126.


    , (Daniel Bernoulli)18 - - () , . , - (1738),19 - , . - , , . - , - , , , , . - , , -, , - . , , pS S .

    (4.1)

    - - , - , .20 .

18 Ibid.
19 a) Daniel Bernoulli, Hydrodynamika sive de viribus et motibus fluidorum (Strasburg: Johann Reinhold Dulsseker, 1738); b) e , , , a . o, . (Moa: Aae a , 1959).
20 2.


    , , pS, , . , 12. - : -.21 , , , . - , ( ), - . , - , , .

    , - ,22 , , . - , . ,

    (4.2)

    ( ) i- - . - , .

22 Daniel Bernoulli, Remarques sur le principe de la conservation des forces vives, pris dans un sens général, Histoire de l'Académie Royale de Berlin (Berlin: 1748/50), 356-364.


    , - , , , U .

    5.

    - , - , , . - , - , - , . ( -) L , - x O ( 1). - , ( ), . -

    (5.1)

    dx

    ,

    . (5.2)

    2

    (5.3)

    x0 0 , x .


    1.

    - mi, Li x i i . -

    (5.4)

    ,

    . (5.5)

    , , . , - , (5.5). - () , ( ), .

    - (4.2), - , -, (5.5) - .


    6.

    , ,23 -. , .

    - m (. 2) , O. A m = a, C r O , O - . -, , - , AO B r O, r - O C .

    2.

    (5.5) ,

    . (6.1)

    , , O, - . ,

23 Bernoulli, Remarques sur le principe, 356-364; a . o, -., (1700-1782), , - (: Aae a , 1959), 433-501.


    (6.2)

    v - C.

    ,

    . (6.3)

    (6.4)

    , , .

    , m M , . - , x- a, r, - , - , a r. - , (5.5)

    (6.5)

    . -

    (6.6)

    .


    - . N , 0 ( 2) , , (6.3). -

    (6.7)

    ri , a ai 0,

    (6.8)

    7. -

    , , - , , - (5.5) -. - (. 3) S1 S2 . x - ( ) - , 0, m = V, , V=St. - S1 t S2 , -

    (7.1)


    3.

    : .

    , z x

    . (7.2)

    , - pS pS , p p . , - (Taylor)

    (7.3)

    .

    (7.4)

    (5.5) S1, p1, 1 h1, S2, p2, 2, h2


    (7.5)

    , (7.1)

    . (7.6)

    , -

    2m

    (7.7)

    , , . , , () , .

    8.

    (5.5) , - . , - 2

    (8.1)

    . , - ( - ) - . , ( ) - (), -


    . , , - , . , (8.1) - , . , -

    , , , , - U

    (8.2)

    - (8.1), , - . , - , , , , .

    , (8.1)

    (8.3)

    . (8.4)

    = U,

    (8.5)

    .


    , (8.3) (8.5) - , - . , - , - . (6.1), , G , , . (6.4) -

    (8.6)

    (-) , -

    . (8.7)

    , (6.6) , , -,

    .

    - (7.4), -, , , , . , (7.7) - () , - - . , , , - , , , , - .

    , , - , -


    , ,24 - . , (7.4) - ,

    (8.8)

    - (8.3). -, , ,

    m

    (8.9)

    .

    , - , , .

    (8.10)

    T ( ) , U1 U2 . - U1 U2 U (x, y, z) - , 1743. . -

    (8.11)

    - ( ),

    (8.12)

24 Đorđe Mušicki, Uvod u teorijsku fiziku: Teorijska mehanika, tom I, IV dopunjeno izdanje (Beograd: Odsek za fiz. i met. nauke, 1987).


    , - -, , , , - . - , , (8.11), - .

    9.

    (2.6), , -

    . (9.1)

    , (3.7) -

    (9.2)

    , ,

    (9.3)

    . (9.4)

    , 19. ( , 1855), ,

    (9.5)

    , , - , , - .


1. Bernoulli, Daniel. Hydrodynamica sive de viribus et motibus fluidorum. Strasburg: Johann Reinhold Dulsseker, 1738.

    2. , e. , - . , a . o. Moa: - Aae a , 1959.

3. Bernoulli, Daniel. Remarques sur le principe de la conservation des forces vives, pris dans un sens général. Histoire de l'Académie Royale de Berlin. Berlin: 1748/50.

4. Mladjenović, Milorad. Razvoj fizike - Mehanika i gravitacija. Beograd: Gradjevinska knjiga, 1983.

5. Mušicki, Đorđe. Uvod u teorijsku fiziku: Teorijska mehanika, tom I, IV dopunjeno izdanje. Beograd: Odsek za fiz. i met. nauke, 1987.

    6. , . o I-II. Moa: oa, 1977. ( o o je a aoj eae).

    7. o, a ., ee. (1700-1782). , . Mo-a: Aae a , 1959.

    8. , -, . .I-II (Moa: aa,1972).


Đorđe Mušicki
University of Belgrade, Faculty of Physics, and Mathematical Institute of the Serbian Academy of Sciences and Arts, Belgrade

    DEVELOPMENT OF THE MAIN CONCEPTS OF MECHANICS AND THE FIRST FORMULATIONS OF THE LAW

    OF CONSERVATION OF ENERGY

The first part of this article gives an overview of the development of the main concepts of mechanics: the concept of energy, living force (in the sense of kinetic energy), work, the force function and potential energy.

The second, main part presents the original formulation of the expanded law of conservation of living force given by Daniel Bernoulli; this law is then formulated in the more general sense in which, we assume, Bernoulli himself implicitly understood it in its applications to the mechanics of systems of particles.

After this, the article presents the application of this generalized Bernoulli's law to the mechanics of systems of particles and to hydromechanics, in order to obtain the so-called Bernoulli's equation, which Bernoulli himself never formulated in that form. Namely, it is explained that Bernoulli could not have formulated it in its present form, since the pressure of the liquid never appeared explicitly in his work, because of its complex definition by means of the height of the column of liquid that corresponds to that pressure.

When solving one problem in hydromechanics, Bernoulli himself obtained a relation involving the columns of liquid that correspond to the pressure of the liquid and to the square of its velocity. From this it is concluded that as the square of the velocity increases the pressure of the liquid drops, and vice versa, which may be regarded as equivalent to the main part of Bernoulli's equation; its present-day form was probably given by one of his students.
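For reference, the present-day form of Bernoulli's equation alluded to in this summary can be written, for steady flow of an incompressible fluid (p pressure, ρ density, v flow velocity, h height, g gravitational acceleration):

```latex
p + \tfrac{1}{2}\rho v^{2} + \rho g h = \mathrm{const}
```

At fixed height, an increase in v² must be compensated by a drop in p, which is precisely the qualitative relation between the liquid columns that Bernoulli obtained.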

Finally, these results are compared with the current formulation of the law of conservation of energy, and it is shown that Bernoulli's law of conservation of living force, generalized in this way, can be interpreted in two different ways: as a time-type energy integral or as a space-type energy integral, which is illustrated by Bernoulli's equation.

Keywords: energy, living force (i.e. kinetic energy), force function, potential energy, the law of conservation of living force

    8. 12. 2015.


UDC 521.1

Vladan Čelebonović1
Institute of Physics, Belgrade

PAVLE SAVIĆ, RADIVOJE KAŠANIN AND MATTER UNDER

HIGH PRESSURE2

    eee oa oo ea aae ae a3 aoje aa4 aoe a a eoj o oaa aejaa o o o oe oaje ee ea. oo aa je a ae ooe e eje oe eoje, ee eae e ea ae a oo ae aoja.

Keywords: high pressure, matter, origin of rotation, Pavle Savić, Radivoje Kašanin

    1 [email protected] OI171017 -, . 3 1909. , 1994. . 23 , . 31. . . , 1935. , . oa aea a ej 1945. , . 1966. . - ( ). 1960. . , . - . eo a, a 1971. 1981. ee (A). 4 1892. , 1989. -. ,


    1. o

    aja 1994. je eoa o ae a, oa a a eae ae , oa aea a ej, eo a, a jeo ee ee (). a e eo oe o eaa oje je oao ae ao ea a aj (Institut du Radium) a-, a eo (Irene Curie) e o (Frdric Joliot-Curie), .5 Mao je eoajao a a ee o eaa oje oeo aa.

    aje o ooo oo a oaao aa aa. aaj oe, a a a oj a a e-a je aao aoe aa, ao je o a ao -oj eae e, e eae oje je oao eoo eeja aje, a oj e ooe a oaae aejaa o o o. oo aa je a ae ooe -e eje, eae ea oo ae aoja oe eoje. ea ee, oj e aa, ojae ajaa . ae eo aa oee je a oo eja a oja je eoja aoaa.

    2. aae eoje

    Teo ja, je ooe oa a aa, oe e o ae ao eoja. oj e oaj a o-ea: oaae aejaa o o o oeo oaje ee ea. aje eo aa eo oj e oo- a oaj ee ea.

    . 1920. , 1924. . 1922, 1939, 1955. . 1946. .5 M , ., o a aa aa ooo eaeeooe oea (eoa: A, 1980).


    2.1. oeo oaje

    oea a aa o oaa aejaa o o o jaa e aaa e, aooje, a ee aa. e oooj oaaj aaa ao ea, e aea, eeee jaa eaa,... Ooa a eja eoje je a e oa a, oe je aeja o, a e eoa a ooo ea eeoa a aejaa, a o oo o oo oea e, o e o eeea aa . Oa eja aoaa je a a oa: eeea eaa oa o eo oo ea (ao ao aa)6 aa oa oaaa o e aa aea.7

    a oee ooj oea a je aao a VIII eea oo-aea aa A 28. ooa 1960. , a ojae je aee oe.8 oe oea oee o oe oaje, oaj a je aaja o oe o je e ea jeoaa oa oja oeje ee e aea a eo o a. oa a

    . (1)

    aoa 4/3 eaa e a, ae CGS e, oj je o oe oo oa. SI e, oj e aa o, ea a a oa 0 = 1333 kg / m3.

    Taea 1. , (1)

    0.65 1.41 1.34 1.36 1.32 5.52 3.94 5.21 5.6

    -1 0 0 0 0 2 2 2 2

    1 0.66 1.33 1.33 1.33 1.33 5.32 5.32 5.32 5.32

6 Percy W. Bridgman, Collected Experimental Papers, 7 (Cambridge: Harvard University Press, 1964).
7 a, O aa oaje ea oje ee ea, Oeea oo-aea aa, 21 (1961): 35-43.
8 Ibid.


    o oja oaee ojee e Tae 1 aj eea aea:

    - ea a eeo ea aaa ee e-o ea ae

    - eo eea o (1)

    1- ea a ojea aaa o o (1).

    oeee eo 1 oaje a eae ee eo 1 ae oe ea 2 3 oea. oa (1) eea je aao ee a oa aea oa je aa ea oo aa. ea a eoe a eooje eo, aaa je aoa a ao oea eeoe e aoa oea o ejo oo a. oojae oae eaje oe e o-a ae eae oaa, a e, a oao e- ao o je jeoeoa oejaa jaa oae e.9 eeao je a je eaa aa eoja aja oae a a eeo o ea a-jea jeo aao aj e aje oo ea. a je o ojaaae o eaa a o o-a a aa a eee o aa10 ( aj ao). eeaa eoja aaa oo oea, oja eaaa ae aja oo a a ooa oe, aaa e ooj.

    oo aeo a aaa e eoja oja oje eoj eeao oaa aaeo o ooo oja ej eeeaa jeea. Eoj oao o-a a e a aoa: eoa eaja eo aa ea a eeje oo aee. Oa a oea ooe o oaa e, a e aee oaa a oa o eaje joaje aoa oea oj e oa aoj, o eaa oea oe eaje eeo eee oa o-e oae a. a eoj eo a, oja je eo aa o ejo aaa oaa, eooj -ao ee e a ea: oa e aa ea-

9 Vladan Čelebonović, Pressure Excitation and Ionisation: a Simple One-Dimensional Example, Physics of Low-Dimensional Structures, 7/8 (2001): 127-132.
10 Ma Dong-Ping, Zheng Xi-te, Xu Yi-Sun and Zhang Zheng-Gang, Theoretical calculations of the R1 red shift of ruby under high pressure, Physics Letters, 115 A (1986): 245-248.


    o oaa aoa oea e ae eooee ae. ea e oe eo a e a eo o o eaje, a oa joaje o aoa oea oj aj aje eo oejaa joa-je. Oa aa e aoj eeoo aa, ea jooa aoa oea. eoj a-aa a oaa eeo a a aeo oe oje je e jeao .11 ooa eoae oo a e eae, aeo o eeoo aa, ae oe joa oea e oje aae-o o ejo eje oea e e oe a oa. : - (Einstein-de Haas) , . - .

    eoo a a je aa o oea oaje ee ea oa eoje. eaa a-aaa, aeea aoa,12 oaj a eaa oj eo aje ooee eae a e oaje jae ae oa ee ea aeo ea, oje a a ee eoa oje e oae oa aa o ao oa. a e, eea eo e oaje a eaoa o = 2.9*10-6 rad / s o aje (1.2*10-6


    a aa eo, oa o oa a eoja aje ea o eo e oaje, a e je . o ae oje e jaa oo e eoje je oe oaje Meea. ao - a e a je ee eoje Meea oo ee jeao eoo ee oaje oo oee oe. ooj eoj, oaja eeo ea je oea eaje, a joaje o o aoa / oea oj ae eo aa. aa a-a oe ae e Meea, oe e oe a aejaa o oa e Mee aoj (A = 71), a, aoe, o oaj ej eeeaa oja aa eo A. ea a Mee je, o a, e a a jooao ao o o ejo eeea oj o o eaa a aa Meea. Me, e o e a aa Meea o oe eeeaa o jeea, a a je oe-ja joaje oea eo a o oejaa joaje oje aoa. ae, a e Meea je, a, o-ao oe o joaje ea aejaa ( o ) oa-ja Meea je e oaa oa.

    2.2. Maeja o o o

    aae aejaa o ejo oo a / eeae je eeaa, ea, a e oe. Tea e a e ee, a jee oe. eeea je, a e, eoa ao a oa, eo oa oo aea jaaoj e. jee eoe o o oa o je ao eao o eo ooo oja ea, e ooa oe a aee ea ee Aoaoo oja. aa oea oao eo oja ea je a e aoja ao ea e oe eao, oo aa e ea o 1 o , e je oj ea. ao-ja ao ea a ee o o:

    (2)

    e o aj oja aaa aea. o a ae e, ooj jea eao oaa e eaa o ee aa, a aa ao j, a oa a oo e oe e


    eoae oejae ea. Oaa oa o ea, a oa a aea eoaa.

    eo oa, oj a oe aoa a 6 e-eeao aoa oaa jeo eeoo a.

    ea eea aeo eaa je eajo:

    , (3)

    e NA oaaa Aoao oj, e , a A e ao a aejaa. ae oja oj e je

    aaa eeja o eeo, eaa ao

    (4)

    (e - aeeae eeoa). Moe e oaa 14 a je oao eao ee aojae oa e-ajoo (Wigner-Seitz) aja, oj a joo aeeae Z. oa a oja je aoaa eoja :

    1. a aejaa je aa ja a oe je aeja oe.

    2. a oao e, ao aeja eaa e a- eaa. aa oja e aaa oj a o a a aa. ae e e eooj eo i, ao a a aoj a a

    (5)

    , (6)

    e je oaa aoj a.

    3. Maae e ae i ae i + 1 oeae eajo (7)

    oja e oe e eaa15 eoa a je

    .

14 Chun-ying Leung and Banggu Zhang, Physics of Dense Matter (Beijing: Science Press and Singapore: World Scientific, 1984).
15 o 5.


    4. Maeae jeoao a, o e eoaa a aae eeje ea ao aaa oje oo

    . (8)

    ea aaaa oaje e a je ao

    i = (9)

    5. aja a e ae o

    , (10)

    e oaaa oa ae aejaa.

    6. oe eoa 3 oe e oaa a je

    . (11)

    oae o eoa 1 - 6, o e e eee eaje

    (12)

    a eoa ao ae oaee: oe e e ao a ae eae o ea, ae oe oja e eaa oea . eja aj oao e o a ae eae o ea, a je o ae jo e ooeo. Moe e oa ae aa a oj e ooj eoj aaa a ao eaa.


    Oeje e a W oj oa a a oej aejaa aj a e jeaaa a oeo aae eeje.

    ae,

    (13)

    jeaaa e a oeo aae eeje E

    . (14)

    oe jeae (12)-(14) oja e ee a a a-a a a a i :

    (15)

    (16)

    eo oae a, oea a aoe a ea ae i a i + 1, aa je ao

    ptr (17)

    o e oe aoa ee oa o

    ptr , (18)

    e je ja aa aa (15) - (16). Jeaa (18) aje a a oj e eoj o aa aea o a eaa. Me, e oe eo aa oe. oe eo aa a eaa ae eeo ao

    (19)

    oe o E1 oaaa oeja joaje. oeja jo-aje je a oe aejae eeeao oa, a e aaaj aa (4) (12) oe aoaj a = r x 10-10 m.
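The quantities that survive in the text around Eqs. (2)-(4) — a number density n = ρN_A/A (N_A Avogadro's number, ρ the mass density, A the mean atomic mass) and a mean interparticle distance a of the order of 10⁻¹⁰ m — can be illustrated with a minimal numerical sketch. The estimate a ≈ n^(-1/3) and the choice of iron as the example are illustrative assumptions, not taken from the original derivation:

```python
# Illustrative sketch of the relations around Eq. (3):
# number density n = rho * N_A / A, and (as an assumption)
# mean interparticle distance a ~ n**(-1/3).
N_A = 6.022e23  # Avogadro's number [1/mol]

def number_density(rho_kg_m3, A_g_mol):
    """Particles per cubic metre for mass density rho and mean atomic mass A."""
    rho_g_cm3 = rho_kg_m3 / 1000.0
    n_per_cm3 = rho_g_cm3 / A_g_mol * N_A
    return n_per_cm3 * 1e6  # convert 1/cm^3 -> 1/m^3

def mean_distance(n_per_m3):
    """Assumed order-of-magnitude estimate: a = n**(-1/3)."""
    return n_per_m3 ** (-1.0 / 3.0)

# Example (hypothetical choice): iron at normal density,
# rho = 7860 kg/m^3, A = 55.85 g/mol.
n = number_density(7860.0, 55.85)
a = mean_distance(n)
print(n, a)  # a comes out on the order of 10^-10 m, as the text notes
```

For iron this gives a ≈ 2.3 × 10⁻¹⁰ m, consistent with the statement in the text that a = r × 10⁻¹⁰ m with r of order unity.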


    Oa oa ee je 16 a 19 a aejaa, a oje eeeaa oa a oja e eaaj a ea. Je ej o aejaa a je aoa oo oaaa aoooj ea. ea eoa oaj. ojeo je a eaa oaa, aa o ee eo, oe e o a 30 40 %. eo ae aa je o oe aejaa. oa o o oe- aa. a ooa ee eeae aa o eeo oejaa.

    e a eoja o oo (Charles-Augustin de Coulomb) oeja, o je e ea a aoaja.17

    aa a aa ao eo ee oe eoj je ae oaaa jeae aa aejaa, a a oaoj eea T oja je jeaa . o -oo o, oa eoja a a ae o-eje. ae jeae aa a T > 0.18 oee a a a eej o ea o je aee o eoje oa aae e o aa. ee je ee a oj oeje e-ea o ea19

    . (20)

    .

    aea 2 aje aoe aa, a e-eaa j e o oe ee. ojeo je a e ee eeaa oe 7000 , o je o e-aa ea eeeaa.20

16 Vladan Čelebonović, High pressure phase transitions - examples of classical predictability, Earth, Moon and Planets, 58 (1992): 203-213.
17 Ibid.
18 Vladan Čelebonović, A note on the thermal component of the equation of state of solids, Earth, Moon and Planets, 54 (1995): 145-149.
19 Ibid.
20 Simone Anzellini et al., Melting of Iron at Earth's Inner Core Boundary Based on Fast X-ray Diffraction, Science, 340 (2013): 464-466.


Table 2.

Depth (km)       0-39    39-2900    2900-4980    4980-6371
ρ [kg/m³]        3000    6000       12000        19740
p_max [MBar]     0.25    1.29       2.89         3.7
T [K]            1300    2700       4100         7000

A = 26.56

    o aaae aeo je a a Meea.21 o o oe, eaa eeaa Meea o ea 800 , a ea aoa aa aejaa o oa e Mee aoj je A = 71.

    ae oe ooe eoje a oj oa oaaa jeoao oee aaaa. oa oa aa e ao eeo ea oje e oeje. ao ea oja e: oj ea ojea ao, aoea aa, e eeae ao, jaa aeo oa ooe ea ao a oaje. oeo je eeao o o eoja aje e ao a A aejaa o oa e ojea aoj. Oaj oe aaea je je oj je eeeao oe.eea aea a eo A a e ojee a oje je oa eoja o aa eea.22

    21 Pavle Savi, The internal stucture of the planets Mercury, Venus, Mars and Jupiter according to the Savi-Kaanin theory, Advances In Space Researcs,1 (1981); Vladan elebonovi, The origin of rotation, dense matter physics and all that: a tribute to Pavle Savi, Bulletin Astronomique de Belgrade, 151 (1995): 37-43.22 Vladan elebonovi, The origin of rotation, dense matter physics and all that: a tribute to Pavle Savi, Bulletin Astronomique de Belgrade, 151 (1995): 37-43.


    Taea 3.

    1.4 71

    113 1 70

    28.12 2 71

    26.56 3 18

    69 4 19

    1 96 1 38

    1.55 2 43

    - 3 44

    6.5 4 32

    7.26 5 32

    - 67

    o J1 o J4 oaaaj aee Jea, o 1 o 5 ae aa. To je ae ea. oo je oea a eea ea, oja ooo a, aj a ej aa, a o a ea o ejo aaa aaa ooo eoa aeao ea. Me, jeoao aao oe aee o e e ae a eeaa aa. Me aeo ee aj e eo A, a a a a Ma To. oao je a o e ojea eo e oo a o eoo ae aaa, a aee e a aj a ej aa. Taee o o o e a o aa aoj oj aeo ea. ae, oo je oe a a Me ee aa eoo oaa, a a je oa e aaoa a ao ee a oo e a, a ojoj e aa aa. a aa a a Ma To. o aa eaja aa oo oa aa oo a eoje e aa, a o a ee e.


    3. Moo ae

    eoja a oo a a aoj. aeo eoo ea. Jeaa (1) eea je oe oae oj o aeaa oa 1960. . o eeao aaa aeee oae ooo e o jea. Jo jea oa ae o oa oee ee e eaje a ee aea. a a oe e eaa a ea oee je a eeea oaa a ao 19 aejaa. eoa je ao a oo oeee a o e aejaa. Oo oeee je eo eao a aoaoje oae, ae je.

    ea ea eoje a aooe oee oe eo, e ao o oeaa, a jao je a eo o eaa ae a oao oja aejaa a oje je aoaoj oa eoja eea.

    ooe o oe oea oaje ee ea, e o a oa eoja aje ea o ao a oaje. eeao o e oaj eoj oja ooa a e oaj ea a je eo. O eoo ea e j oo a aeae eoje.

    4. Eo

    oo aa o je a ae ooe eje eae eoje o oaa aejaa o o o, oj eo oo ea eo ae a aoje aa. O , , : . , . . , . Pe , , , . . eea aoaoj aaa, a jo e oeoa ae e ee ea. ea e oaj ao oaj. a ja aa


    aa a eea eoa oeo oaee oa ooaa je aoaoja a aaa oaaa aejaa o o o. ae je aoa oj e oo a eooj o aa. a aoa, oo je o e ae a, a je oao ooo oe oeo 2015. . aoaoja je eea, oojea oea je eaaa, oao e aaa. aa aeeoa a e eaa oe e oa ao a ej ae aee 1.

1. Anzellini, Simone, Agnès Dewaele, Mohamed Mezouar, Paul Loubeyre and Guillaume Morard. Melting of Iron at Earth's Inner Core Boundary Based on Fast X-ray Diffraction. Science, 340 (2013): 464-466.

    2. Bridgman, Percy W. Collected Experimental Papers, 7. Cambridge: Harvard University Press, 1964.

3. Čelebonović, Vladan. High pressure phase transitions - examples of classical predictability. Earth, Moon and Planets, 58 (1992): 203-213.

4. Čelebonović, Vladan. A note on the thermal component of the equation of state of solids. Earth, Moon and Planets, 54 (1995): 145-149.

5. Čelebonović, Vladan. The origin of rotation, dense matter physics and all that: a tribute to Pavle Savić. Bulletin Astronomique de Belgrade, 151 (1995): 37-43.

6. Čelebonović, Vladan. Pressure Excitation and Ionisation: a Simple One-Dimensional Example. Physics of Low-Dimensional Structures, 7/8 (2001): 127-132.

7. Chun-ying Leung and Banggu Zhang. Physics of Dense Matter. Beijing: Science Press and Singapore: World Scientific, 1984.