international journal of computational intelligence and information security
8/2/2019 International Journal of Computational Intelligence and Information Security
leaks. We demonstrate several synthetic as well as real-world examples of heap dumps for which our approach provides more insight into the problem than state-of-the-art tools such as Eclipse's MAT.

Memory leaks are a frequent source of bugs in applications that use dynamic memory allocation. They occur if programmers' mistakes prevent the deallocation of memory that is no longer used. Undetected memory leaks cause slowdowns and eventually the exhaustion of all available memory, triggering out-of-memory conditions that usually lead to application crashes. These crashes significantly affect availability, particularly of long-running server applications, which is why memory leaks are one of the most frequently reported types of bugs against server frameworks.

Memory leaks are challenging to identify and debug for several reasons. First, the observed failure may be far removed from the error that caused it, requiring the use of heap analysis tools that examine the state of the reachability graph when a failure occurred. Second, real-world applications usually make heavy use of several layers of frameworks whose implementation details are unknown to the developers debugging the memory leaks they encounter. Often, these developers cannot distinguish whether an observed reference chain is legitimate (such as when objects are kept in a cache in anticipation of future uses) or represents a leak. Third, the sheer size of the heap (large-scale server applications can easily contain tens of millions of objects) makes manual inspection of even a small subset of objects difficult or impossible.

Our key contributions can be summarized as follows:
1. Although analysis techniques are widely used in heap analysis, our work is the first to employ graph mining for detecting leaking candidates. Specifically, we demonstrate that graph grammar mining used in an offline manner can detect both seeded and known memory leaks in real applications.
2. Compared to other offline analysis techniques, our approach does not require any a priori knowledge about which classes are containers, or about their internal structure. It captures containers even when these are embedded into application classes, such as ad-hoc lists or arrays.
3. Our approach can identify leaks even if the leaks' locations within the graph do not share a common ancestor node, or if the paths from that ancestor to the instances are difficult to find by the manual examination that is required in existing tools such as Eclipse Memory Analyzer (MAT).
4. Graph grammar mining can find recursive structures, giving a user insight into the data structures used in a program. For instance, linked lists and trees can be identified by their distinct signatures.
5. Finally, the ability to combine subgraph frequency with location information makes our algorithm robust to the presence of object structures that occur naturally with high frequency without constituting a leak.
International Journal of Computational Intelligence and Information Security, December 2011, Vol. 2, No. 12
SECTION IV
4. Graph Mining Based on Anomaly Detection
GAD is a graph-based approach to finding anomalies in data by searching for three factors: modifications, insertions, and deletions of vertices and edges. Each factor has its own algorithm, which finds a normative substructure and then attempts to find the substructures that are similar but not completely identical to it. A normative substructure is a recurring subgraph of vertices and edges that, when coalesced into a single vertex, most compresses the overall graph.
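The idea of a normative substructure can be illustrated with a deliberately small sketch. It is not the GAD algorithm itself: it only considers single-edge patterns, measures graph size as vertices plus edges, and picks instances greedily. The function name `normative_edge_pattern` and the example labels are hypothetical, chosen for illustration only.

```python
from collections import defaultdict


def normative_edge_pattern(vertex_labels, edges):
    """Return (pattern, compressed_size) for the single-edge pattern that
    most compresses the graph.

    vertex_labels: dict mapping vertex id -> label
    edges: list of (source, target, edge_label) tuples
    Assumption: graph size is simply |V| + |E|.
    """
    # Group edges by (source label, edge label, target label).
    instances = defaultdict(list)
    for u, v, edge_label in edges:
        pattern = (vertex_labels[u], edge_label, vertex_labels[v])
        instances[pattern].append((u, v))

    graph_size = len(vertex_labels) + len(edges)
    best_pattern, best_size = None, graph_size

    # Sorted iteration keeps the result deterministic when patterns tie.
    for pattern, pairs in sorted(instances.items()):
        used = set()
        collapsed = 0
        # Greedily pick non-overlapping instances; self-loops are skipped.
        for u, v in pairs:
            if u != v and u not in used and v not in used:
                used.update((u, v))
                collapsed += 1
        # Each collapsed instance turns 2 vertices + 1 edge into 1 vertex,
        # which shrinks the graph by 2 under the size measure above.
        compressed = graph_size - 2 * collapsed
        if compressed < best_size:
            best_pattern, best_size = pattern, compressed

    return best_pattern, best_size


labels = {"a": "User", "b": "Session", "c": "User",
          "d": "Session", "e": "User", "f": "Log"}
edge_list = [("a", "b", "owns"), ("c", "d", "owns"), ("e", "f", "writes")]
print(normative_edge_pattern(labels, edge_list))
```

In this toy graph the pattern User -owns-> Session occurs twice and compresses the graph more than the single User -writes-> Log edge, so it plays the role of the normative substructure. Instances that are similar to it but not identical would then be the anomaly candidates.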
4.1. Categories of Intrusion Detection Systems
Intrusion detection is classified into two types: 1) misuse detection and 2) anomaly detection. Misuse detection uses well-defined patterns of attacks that exploit weaknesses in system and application software to identify intrusions (Kumar and Spafford 1995). These patterns are encoded in advance and matched against user behavior to detect intrusions. Anomaly detection identifies deviations
from the normal usage behavior patterns to identify the intrusion. The normal usage patterns are constructed from statistical measures of system features, for example the CPU and I/O activities of a particular user or program. The behavior of the user is observed, and any deviation from the constructed normal behavior is detected as an intrusion.
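The contrast between the two categories can be sketched in a few lines. This is an illustrative toy, not a method from the paper: the signature list, the function names, and the three-standard-deviation threshold are all assumptions made for the example.

```python
from statistics import mean, stdev

# Hypothetical attack signatures, encoded in advance (misuse detection).
KNOWN_ATTACK_PATTERNS = ["rm -rf /", "/etc/passwd"]


def misuse_detect(command):
    """Flag a command if it matches any pre-encoded attack pattern."""
    return any(pattern in command for pattern in KNOWN_ATTACK_PATTERNS)


def anomaly_detect(history, observed, threshold=3.0):
    """Flag an observation that deviates from the normal usage profile.

    history: past measurements of one feature (e.g. CPU percent) for a user
    threshold: assumed cut-off in standard deviations
    """
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # A perfectly constant profile: any change counts as a deviation.
        return observed != mu
    return abs(observed - mu) / sigma > threshold


cpu_history = [10, 12, 11, 9, 10, 11]
print(misuse_detect("cat /etc/passwd"))   # matches a known signature
print(anomaly_detect(cpu_history, 60))    # far outside the usual range
print(anomaly_detect(cpu_history, 11))    # within the usual range
```

Misuse detection can only flag what has been encoded beforehand, while the anomaly check needs no signatures but depends on how well the history represents normal behavior.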
4.2. What Is an Anomaly
Anomaly detection refers to detecting patterns in a given data set that do not conform to an established normal behavior. The patterns thus detected are called anomalies and translate to critical and actionable information in several application domains. Anomalies are also referred to as outliers, surprises, deviations, etc.

Most anomaly detection algorithms require a set of purely normal data to train the model, and they implicitly assume that anomalies can be treated as patterns not observed before. Since an outlier may be defined as a data point which is very different from the rest of the data, based on some measure, we employ several detection schemes in order to see how efficiently these schemes deal with the problem of anomaly detection. The statistics community has studied the concept of outliers quite extensively. In these techniques, the data points are modeled using a stochastic distribution, and points are determined to be outliers depending upon their relationship with this model. With increasing dimensionality, however, it becomes increasingly difficult and inaccurate to estimate the multidimensional distributions of the data points. The recent outlier detection algorithms that we utilize in this study are instead based on computing the full dimensional distances of the points from one another, as well as on computing the densities of local neighborhoods.

The deviation measure is our extension of the traditional method of discrepancy detection. As in discrepancy detection, comparisons are made between predicted and actual sensor values, and differences are interpreted to be indications of anomalies. This raw discrepancy is entered into a normalization process identical to that used for the value change score, and it is this representation of relative discrepancy which is reported. The deviation score for a sensor is minimum if there is no discrepancy and maximum if the discrepancy between predicted and actual is the greatest seen to date on that sensor. Deviation requires that a simulation be available in any form for generating sensor value predictions. However, the remaining sensitivity and cascading alarms measures require the ability to simulate and reason with a causal model of the system being monitored.
Sensitivity and cascading alarms
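The deviation score described above can be sketched as follows. The text does not spell out the normalization used for the value change score, so this sketch assumes the simplest reading: the raw discrepancy is divided by the largest discrepancy seen so far on that sensor, giving 0 for no discrepancy and 1 for a new maximum. The class name `DeviationScorer` is hypothetical.

```python
class DeviationScorer:
    """Tracks, per sensor, the largest discrepancy seen to date."""

    def __init__(self):
        self.max_seen = {}

    def score(self, sensor, predicted, actual):
        # Raw discrepancy between the simulated and the measured value.
        raw = abs(predicted - actual)
        # Update the running maximum for this sensor.
        largest = max(self.max_seen.get(sensor, 0.0), raw)
        self.max_seen[sensor] = largest
        # Assumed normalization: relative to the largest discrepancy so far.
        return 0.0 if largest == 0 else raw / largest


scorer = DeviationScorer()
print(scorer.score("temp", 20.0, 20.0))  # no discrepancy
print(scorer.score("temp", 20.0, 24.0))  # largest discrepancy so far
print(scorer.score("temp", 20.0, 22.0))  # half of the largest seen
```

Because the maximum is tracked per sensor, a discrepancy that is large for one sensor does not dilute the scores of another.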
An appealing way to assess whether current behavior is anomalous is via comparison to past behavior. This is the essence of the surprise measure. It is designed to highlight a sensor which [18] behaves other than it has historically. Specifically, surprise uses the historical frequency distribution for the sensor in two ways: to estimate how likely the sensor's current value is, and to examine the relative likelihoods of different values of the sensor. It is those sensors which display unlikely values when other values of the sensor are more likely which get a high surprise [19] score. Surprise is not high if the only reason a sensor's value is unlikely is that there are many possible values for the sensor, all equally unlikely.

CONCLUSION V
Trends obtained through data mining, intended to be used for marketing or for some other ethical purpose, may be misused. Unethical businesses may use the information obtained through data mining to take advantage of vulnerable people or to discriminate against a certain group of people. In addition, data mining techniques are not 100 percent accurate; mistakes do happen and can have serious consequences. Although it is against the law to sell or trade personal information between different organizations, the selling of personal information has occurred. For example, according to the Washington Post, in 1998 CVS sold its patients' prescription purchases to a different company. In addition, American Express sold its customers' credit card purchases to another company. What CVS and American Express did clearly violated privacy law, because they were selling personal information without the consent of their customers. The selling of personal information may also bring harm to these customers, because they do not know what the other companies are
planning to do with the personal information that they have purchased. In this paper we briefly discussed the process of graph mining and the techniques used in graph management. Enhancements to our outcomes come through the novel algorithm implementation, which targets misuse detection and privacy.
References
[1] C. Aggarwal, N. Ta, J. Feng, J. Wang, M. J. Zaki. XProj: A Framework for Projected Structural Clustering of XML Documents, KDD Conference, 2007.
[2] R. Agrawal, A. Borgida, H. V. Jagadish. Efficient Maintenance of Transitive Relationships in Large Data and Knowledge Bases, ACM SIGMOD Conference, 1989.
[3] D. Chakrabarti, Y. Zhan, C. Faloutsos. R-MAT: A Recursive Model for Graph Mining, SDM Conference, 2004.
[4] J. Cheng, J. Xu Yu, X. Lin, H. Wang, and P. S. Yu. Fast Computing Reachability Labelings for Large Graphs with High Compression Rate, EDBT Conference, 2008.
[5] J. Cheng, J. Xu Yu, X. Lin, H. Wang, and P. S. Yu. Fast Computation of Reachability Labelings in Large Graphs, EDBT Conference, 2006.
[6] E. Cohen. Size-estimation framework with applications to transitive closure and reachability, Journal of Computer and System Sciences, v.55 n.3, p.441–453, Dec. 1997.
[7] E. Cohen, E. Halperin, H. Kaplan, and U. Zwick. Reachability and distance queries via 2-hop labels, ACM Symposium on Discrete Algorithms, 2002.
[8] D. Cook, L. Holder. Mining Graph Data, John Wiley & Sons Inc, 2007.
[9] D. Conte, P. Foggia, C. Sansone, and M. Vento. Thirty years of graph matching in pattern recognition, Int. Journal of Pattern Recognition and Artificial Intelligence, 18(3):265–298, 2004.
[10] M. Faloutsos, P. Faloutsos, C. Faloutsos. On Power Law Relationships of the Internet Topology, SIGCOMM Conference, 1999.
[11] G. Flake, R. Tarjan, M. Tsioutsiouliklis. Graph Clustering and Minimum Cut Trees, Internet Mathematics, 1(4), 385–408, 2003.
[12] D. Gibson, R. Kumar, A. Tomkins. Discovering Large Dense Subgraphs in Massive Graphs, VLDB Conference, 2005.
[13] M. Hay, G. Miklau, D. Jensen, D. Towsley, P. Weis. Resisting Structural Re-identification in Social Networks, VLDB Conference, 2008.
[14] H. He, A. K. Singh. Graphs-at-a-time: Query Language and Access Methods for Graph Databases. In Proc. of SIGMOD 08, pages 405–418, Vancouver, Canada, 2008.
[15] H. He, H. Wang, J. Yang, P. S. Yu. BLINKS: Ranked keyword searches on graphs. In SIGMOD, 2007.
[16] H. Kashima, K. Tsuda, A. Inokuchi. Marginalized Kernels between Labeled Graphs, ICML, 2003.
[17] L. Backstrom, C. Dwork, J. Kleinberg. Wherefore Art Thou R3579X? Anonymized Social Networks, Hidden Patterns, and Structural Steganography.
[18] T. Kudo, E. Maeda, Y. Matsumoto. An Application of Boosting to Graph Classification, NIPS Conf., 2004.
[19] J. Leskovec, J. Kleinberg, C. Faloutsos. Graph Evolution: Densification and Shrinking Diameters, ACM Transactions on Knowledge Discovery from Data (ACM TKDD), 1(1), 2007.
N. Swapna Goud received her B.Tech in Computer Science & Information Technology from Vijay Rural Engineering College and her M.Tech in Computer Science Engineering from Anurag Group of Institutions (CVSR Engineering College). She is currently working as an Asst. Prof. at CVSR Engineering College, has seven and a half years of academic experience, and has guided many UG and PG students. Her research areas include Data Mining, Design and Analysis of Algorithms, and Service Oriented Architecture, with current work focusing on Graph Mining.

S. Vaishnavi is pursuing her M.Tech in Computer Science Engineering at JNTUH and holds a B.Tech in Computer Science & Engineering. She is currently working as an Asst. Prof. at Narayanamma Institute of Technology & Science, has seven and a half years of academic experience, and has guided many UG students. Her research areas include Data Mining, Design and Analysis of Algorithms, and Service Oriented Architecture, with current work focusing on Graph Mining.