ARTICLE IN PRESS
0952-1976/$ - see front matter © 2006 Elsevier Ltd. All rights reserved.
doi:10.1016/j.engappai.2006.01.015
Engineering Applications of Artificial Intelligence 19 (2006) 361–362
www.elsevier.com/locate/engappai
Editorial
Recent advances in data mining
Data mining methods have been successfully introduced in many fields. Data mining is still a research topic, but it is also of tremendous interest to industry for solving real-world problems. Consequently, research in data mining is not driven by theoretical aspects alone. More than any other field, it is influenced by practical problems, and researchers are taking up the special needs of industry and incorporating them into their research.
Therefore, this special issue publishes new research on distributed data clustering, incremental clustering and pattern mining that should inspire the community to further developments.
Clustering is still a topic of tremendous interest. Distributed clustering that preserves data privacy is becoming more and more important. Under recent computer technology trends, data sources for one specific problem are created in different places; each represents the data of a specific workplace and is valuable as such, but would be more valuable still if it could be set into a much broader context. Combining several data sources that represent the same problem at different places would therefore yield results that are more useful in application. A particular field of application is medicine, where data on a disease are collected and kept in one hospital, but combining the data from different places would yield a much larger database and could provide more valuable results. To ensure that the owners of the data allow their data to be used for a specific analysis, we need to guarantee the privacy of the data without losing accuracy or the explanation capability of the results. The paper by da Silva and Klusch (2006) deals with the development of clustering methods that can work under these requirements.
The data streams continuously created by the World Wide Web or by automatic data acquisition systems, such as image scanners in medicine or quality monitoring systems in manufacturing processes, require incremental data-analysis methods that can analyse the data as they arrive in temporal sequence, without restarting the analysis from scratch after each new sample. Bouguila and Ziou (2006) present an on-line clustering method based on the Dirichlet distribution and the minimum message length principle. Incremental graph-clustering methods for case-base maintenance are presented by Perner (2006). Eick et al. (2006) use clustering to learn distance functions for supervised similarity assessment.
Mining patterns in large collections of data is becoming more and more important. The problem of finding paired itemsets with high correlation in one database is known as Discovery of Correlation and has been studied because highly correlated itemsets are characteristic of the database. However, even non-characteristic paired itemsets are meaningful, provided that their degree of correlation increases significantly in the local database compared with the global one. This problem is studied by Taniguchi and Haraguchi (2006).
Medical applications are still of great interest to the data mining community as well as to practitioners. Accordingly, the use of data mining methods for the segmentation of medical images continues to be developed. Shuo Li et al. (2006) present a method based on principal component analysis and support vector machines.
New applications for intrusion detection and medical literature mining advance the application of data mining.
The explosion of knowledge in many fields leads to a huge amount of literature and records, and retrieving the desired information from a literature database requires concept knowledge. This concept knowledge can be built automatically by using concept mining methods, as described by Bichindaritz and Akkineni (2006).
Network intrusion detection is an emerging topic, driven by the tremendous need to ensure the security of networks and data. Automatic detection methods are necessary to monitor the huge amount of traffic data and to identify novel situations. Perdisci et al. (2006) present recent results in their paper.
All the papers in this special issue are selected papers from the Industrial Conference on Data Mining ICDM-Leipzig 2005 (www.data-mining-forum.de) and the International Conference on Data Mining MLDM 2005 (www.mldm.de). The programs of these two events show once more that they have developed over the years into the leading meeting places for data mining researchers in pattern recognition and industry.
References
Bichindaritz, I., Akkineni, S., 2006. Concept mining for indexing medical
literature. Engineering Applications of Artificial Intelligence, in this
special issue, doi:10.1016/j.engappai.2006.01.009.
Bouguila, N., Ziou, D., 2006. Online clustering via finite mixtures of
Dirichlet and minimum message length. Engineering Applications of
Artificial Intelligence, in this special issue,
doi:10.1016/j.engappai.2006.01.012.
Eick, Chr. F., Rouhana, A., Bagherjeiran, A., Vilalta, R., 2006. Using
clustering to learn distance functions for supervised similarity
assessment. Engineering Applications of Artificial Intelligence, in this
special issue, doi:10.1016/j.engappai.2006.01.004.
Perdisci, R., Giacinto, G., Roli, F., 2006. Alarm clustering for intrusion
detection systems in computer networks. Engineering Applications of
Artificial Intelligence, in this special issue,
doi:10.1016/j.engappai.2006.01.003.
Perner, P., 2006. Case base maintenance by conceptual clustering of
graphs. Engineering Applications of Artificial Intelligence, in this
special issue, doi:10.1016/j.engappai.2006.01.014.
Shuo Li, Fevens, Th., Krzyzak, A., Li, S., 2006. Automatic clinical image
segmentation using pathological modelling, PCA and SVM. Engineering
Applications of Artificial Intelligence, in this special issue,
doi:10.1016/j.engappai.2006.01.011.
da Silva, J.C., Klusch, M., 2006. Inference in distributed data clustering.
Engineering Applications of Artificial Intelligence, in this special issue,
doi:10.1016/j.engappai.2006.01.013.
Taniguchi, T., Haraguchi, M., 2006. Discovery of hidden correlations
in a local transaction database based on differences of correlations.
Engineering Applications of Artificial Intelligence, in this special issue,
doi:10.1016/j.engappai.2006.01.006.
Petra Perner
Institute of Computer Vision and Applied Computer Sciences, IBaI, Kornerstr. 10, 04107 Leipzig, Germany
E-mail addresses: [email protected],[email protected].