malware detection using machine learning techniques

Malware detection using Machine learning: A review Submitted To: Maam Tahira Mehboob Presented By: Anum Nisa Sumaiya Arshad MAY 18, 2016 | Machine Learning

Upload: arshadraja786

Post on 17-Feb-2017



ABOUT MALWARE & ITS DETECTION TECHNIQUES:

INTRODUCTION:


Malware is …
* Malicious software
* Virus, Spam, …

Increasing threats
* Continuous and increased attacks on infrastructure
* Threats to business, national security & personal security of PCs

Attacks are becoming more advanced and sophisticated!


MALWARE Executables

Host vs. network based approaches

Limitations of existing techniques
- Signature-based approach
  * Fails to detect zero-day attacks.
  * Fails to detect threats with evolving capabilities such as metamorphic and polymorphic malware.
- Anomaly-based approach
  * Produces a high false positive rate.
- Supervised learning based approach
  * Poor performance on new and evolving malware.
  * Building a classifier model is challenging due to the diversity of malware classes, imbalanced distribution, data imperfection issues, etc.
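The signature-based limitation can be illustrated with a small sketch. It assumes, for simplicity, that a "signature" is a SHA-256 digest of the whole file; real antivirus signatures are usually byte patterns, and the sample bytes and names below are hypothetical.

```python
import hashlib

def signature(payload: bytes) -> str:
    """SHA-256 digest, used here as a stand-in for an antivirus signature."""
    return hashlib.sha256(payload).hexdigest()

# Hypothetical known sample and the signature database built from it.
known_sample = b"\x90\x90PAYLOAD-v1\xcc"
signature_db = {signature(known_sample)}

def is_known_malware(payload: bytes) -> bool:
    """Flag a payload only if its digest is already in the database."""
    return signature(payload) in signature_db

# A variant that differs from the known sample by one inserted byte.
variant = b"\x90\x90\x90PAYLOAD-v1\xcc"

print(is_known_malware(known_sample))  # True
print(is_known_malware(variant))       # False
```

The variant is missed because nothing resembling it was in the database beforehand, which is the same reason a zero-day sample is missed.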


Red Hocks (Viruses)


Our Goal

Machine learning based approach, two levels:
* Supervised learning approach to detect malicious flows and further identify the specific type
* Combine unsupervised learning with supervised learning to address the new class discovery problem


Two level malware detection framework:

Macro-level classifier  

Used to isolate malicious flows from the non-malicious ones.

Micro-level classifier

Further categorizes the malicious flows into one of the preexisting malware classes or a new malware class.
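The macro/micro split can be sketched as follows. This is only an illustration under stated assumptions: a nearest-centroid rule stands in for the actual classifiers, a fixed distance threshold stands in for new class discovery, and the flow features, class names and `NEW_CLASS_RADIUS` value are all hypothetical.

```python
import math

def centroid(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def nearest(x, centroids):
    """Return (label, distance) of the centroid closest to x."""
    return min(((label, math.dist(x, c)) for label, c in centroids.items()),
               key=lambda pair: pair[1])

# Hypothetical training flows: [mean packet size (KB), connections per minute].
benign = [[1.0, 2.0], [1.2, 1.8], [0.9, 2.2]]
malware = {
    "worm":   [[0.2, 9.0], [0.3, 8.5]],
    "trojan": [[3.0, 6.0], [3.2, 6.4]],
}

# Macro level: benign vs. malicious (all malware classes pooled together).
macro = {
    "benign": centroid(benign),
    "malicious": centroid([row for rows in malware.values() for row in rows]),
}
# Micro level: one centroid per known malware class.
micro = {name: centroid(rows) for name, rows in malware.items()}

NEW_CLASS_RADIUS = 2.0  # assumed threshold, not taken from the slides

def classify(flow):
    """Run the macro-level decision first, then the micro-level one."""
    label, _ = nearest(flow, macro)
    if label == "benign":
        return "benign"
    family, distance = nearest(flow, micro)
    # A malicious flow far from every known class is reported as a new class.
    return family if distance <= NEW_CLASS_RADIUS else "new malware"

print(classify([1.1, 2.1]))    # benign
print(classify([0.25, 9.2]))   # worm
print(classify([6.0, 12.0]))   # new malware
```

Keeping the two levels separate means the micro-level model only ever sees flows already judged malicious, so it can be retrained when a new class is added without touching the macro-level decision.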

Proposed Framework


Proposed Framework Block diagram


Classification Process

Machine learning, data mining, and text classification & detection methods used to detect malicious executables include:

Classifying files as unknown or malicious using ML algorithms:
* Random Forest classifier
* Boosted J48 decision tree
* KNN, Naïve Bayes, SVM, Multilayer Perceptron (MLP)
* Mal-ID basic detection algorithm

Both the Bayes network and random forest classifiers produced more accurate readings, but the boosted decision tree (J48) is the best classifier.

Experimental Evaluation

Our analysis shows that among the three major forms of malware (computer viruses, Internet worms and Trojan horses), Trojans are the most dangerous.


ANALYSIS


This section introduces analysis techniques for mobile and PC malware. It transfers well-known techniques from the common computer world to the platforms of mobile devices.

The main idea of dynamic analysis is to execute a given sample in a controlled environment, monitor its behavior, and obtain information about its nature and purpose.

This is especially important in the field of malware research because a malware analyst must be able to assess a program’s threat and create proper counter-measures.

While static analysis might provide more precise results, the sheer mass of newly emerging malware each day makes it impossible to conduct a static analysis for even a small portion of today’s malware.

ANALYSIS OF PARAMETERS:

To analyze malware detection techniques, some evaluation parameters are used to assess quality factors (non-functional requirements):
* Category/type of virus
* Detection techniques
* Algorithm/technology/mechanism
* Best classification methodology
* Evaluation criterion
* Implementation tools
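As an example of what an evaluation criterion might look like in practice, the sketch below computes accuracy, detection rate and false positive rate from true and predicted labels. The slides do not say which criteria were used, so these three metrics, the function name `evaluate` and the toy labels are assumptions.

```python
def evaluate(y_true, y_pred, positive="malware"):
    """Compute common detection metrics from true and predicted labels."""
    tp = fp = tn = fn = 0
    for truth, guess in zip(y_true, y_pred):
        if guess == positive:
            if truth == positive:
                tp += 1  # malware correctly flagged
            else:
                fp += 1  # benign file wrongly flagged
        else:
            if truth == positive:
                fn += 1  # malware missed
            else:
                tn += 1  # benign file correctly passed
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "detection_rate": tp / (tp + fn) if tp + fn else 0.0,
        "false_positive_rate": fp / (fp + tn) if fp + tn else 0.0,
    }

# Hypothetical labels for eight files.
y_true = ["malware"] * 3 + ["benign"] * 5
y_pred = ["malware", "malware", "benign", "malware",
          "benign", "benign", "benign", "benign"]

metrics = evaluate(y_true, y_pred)
print(metrics["accuracy"])             # 0.75
print(metrics["false_positive_rate"])  # 0.2
```

Reporting the false positive rate next to accuracy matters on imbalanced data, where a classifier can score high accuracy while still raising many false alarms.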


J48 is an extension of ID3.

The additional features of J48 are:
* accounting for missing values
* decision tree pruning
* continuous attribute value ranges
* derivation of rules, etc.

In the WEKA data mining tool, J48 is an open source Java implementation of the C4.5 algorithm.
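To show how a continuous attribute can be handled, here is a simplified sketch of threshold selection by gain ratio, the split measure associated with C4.5. It is not WEKA's J48 code: candidate thresholds are taken as midpoints between sorted values, missing values and pruning are ignored, and the feature (number of imported API functions) and the data are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def gain_ratio(values, labels, threshold):
    """Gain ratio of the binary split value <= threshold vs. value > threshold."""
    left = [l for v, l in zip(values, labels) if v <= threshold]
    right = [l for v, l in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0  # a split that separates nothing is useless
    total = len(labels)
    weights = [len(left) / total, len(right) / total]
    gain = (entropy(labels)
            - weights[0] * entropy(left)
            - weights[1] * entropy(right))
    split_info = -sum(w * math.log2(w) for w in weights)
    return gain / split_info

def best_threshold(values, labels):
    """Pick the midpoint between adjacent distinct values with the best gain ratio."""
    points = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(points, points[1:])]
    return max(candidates, key=lambda t: gain_ratio(values, labels, t))

# Hypothetical continuous feature: number of imported API functions per file.
api_imports = [12, 15, 20, 48, 55, 60]
labels = ["benign", "benign", "benign", "malware", "malware", "malware"]

print(best_threshold(api_imports, labels))  # 34.0
```

ID3 only handles categorical attributes, so turning a numeric attribute into a binary test like this is one of the extensions listed above.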

Boosted J 48 Decision Tree


Conclusion:

We proposed an effective malware detection framework based on data mining & machine learning techniques:
* Two-level ML-based classifier
* New class detection
* Encrypted data

A tree-based kernel for SVM was proposed to handle the data imperfection issue in network flow data.

The boosted J48 decision tree classifier was found to be the best among a number of different classifiers.


Conclusion Contd:

This paper compares the efficiency of different malware detection techniques, including KNN, Naïve Bayes, boosted J48, and SVM (Support Vector Machine). We explain the feasibility of some detection methods and highlight the major causes of the increasing number of malware files, but more research is necessary.


Future Works

Develop a hierarchical multi-class learning method to enhance the testing efficiency when the number of malware classes becomes extremely large.

Malware detection accuracy can be improved through further research into classification algorithms and into ways to label malware data more accurately. Most of the classifiers used are not optimized for hardware operations or applications; hardware-oriented algorithm design could increase accuracy and efficiency.


Extra

Metamorphic malware is rewritten with each iteration so that each succeeding version of the code is different from the preceding one. The code changes make it difficult for signature-based antivirus software programs to recognize that different iterations are the same malicious program.

Polymorphic malware also makes changes to code to avoid detection. It has two parts, but one part remains the same with each iteration, which makes the malware a little easier to identify.
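The "one part remains the same" point can be shown with a harmless toy. It assumes the changing part is a body XOR-encoded with a different key each time and the constant part is a fixed marker; `STUB`, `make_variant` and the placeholder body are hypothetical names used only for illustration.

```python
import hashlib

STUB = b"[decoder-stub]"  # hypothetical constant part shared by all variants

def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with a single-byte key (applying it twice restores the data)."""
    return bytes(b ^ key for b in data)

def make_variant(body: bytes, key: int) -> bytes:
    """Constant stub, then the key byte, then the XOR-encoded body."""
    return STUB + bytes([key]) + xor_bytes(body, key)

body = b"harmless placeholder body"
v1 = make_variant(body, 0x21)
v2 = make_variant(body, 0x5A)

# Whole-file digests differ, so a hash signature of v1 does not match v2.
same_hash = hashlib.sha256(v1).digest() == hashlib.sha256(v2).digest()
# A pattern taken from the constant part still matches both variants.
stub_found = STUB in v1 and STUB in v2

print(same_hash, stub_found)  # False True
```

A metamorphic sample has no such constant part, which is why pattern matching on a fixed region does not carry over to it.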

Can you imagine that a piece of malware code can change its shape and signature each time it appears, making it extremely hard for signature-based antivirus to detect? This is called polymorphic or metamorphic malware.

software. Trojans can be employed by cyber-thieves and hackers trying to gain access to users' systems. Users are typically tricked by some form of social engineering into loading and executing Trojans on their systems. Once activated, Trojans can enable cyber-criminals to spy on you, steal your sensitive data, and gain backdoor access to your system. These actions can include:

* Deleting data
* Blocking data
* Modifying data
* Copying data
* Disrupting the performance of computers or computer networks