MAvRK7/Malware-detection-using-MLmodels


Bridging Legacy and Modern Threat Detection

Machine Learning and Deep Learning Models on EMBER2018 & CIC-EvasivePDF2022

Python Framework License: MIT

🚀 Overview

Malware detection remains a critical challenge as attackers constantly evolve tactics.
This project benchmarks traditional ML, deep learning, and ensemble methods for malware detection across two generations of attacks:

  • EMBER2018 → legacy malware (2006–2018)
  • CIC-EvasivePDF2022 → recent evasive PDF malware

The aim is to highlight strengths, weaknesses, and trade-offs between models, focusing on accuracy, adaptability, and robustness to imbalanced datasets.


🔑 Key Features

  • Datasets:

    • EMBER2018 (structured malware features, legacy attacks)
    • CIC-EvasivePDF2022 (modern evasive samples, PDFs)
  • Algorithms Tested:

    • Traditional ML: Random Forest, XGBoost, AdaBoost, Logistic Regression, KNN
    • Deep Learning: CNN, MLP, RNN–LSTM, Transformer
    • Ensembles: Stacking, Voting classifiers (hybrid ML + DL)
  • Challenges Tackled:

    • Class imbalance handling (resampling, weighting)
    • Comparative evaluation of structured vs evasive malware detection
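The two imbalance strategies named above (weighting and resampling) can be sketched in plain Python. This is a minimal illustration on made-up toy labels, not the repository's implementation: the helper names `balanced_class_weights` and `random_oversample` are hypothetical. The weighting formula, n_samples / (n_classes * class_count), is the same heuristic scikit-learn applies for `class_weight="balanced"`.

```python
import random
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class rows at random until all classes are equal in size."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_x, out_y = list(samples), list(labels)
    for cls, cnt in counts.items():
        pool = [x for x, y in zip(samples, labels) if y == cls]
        extra = [rng.choice(pool) for _ in range(target - cnt)]
        out_x.extend(extra)
        out_y.extend([cls] * len(extra))
    return out_x, out_y


# Toy data (assumption): 8 benign (0) and 2 malicious (1) samples.
labels = [0] * 8 + [1] * 2
samples = [[i] for i in range(10)]

weights = balanced_class_weights(labels)           # minority class gets the larger weight
res_x, res_y = random_oversample(samples, labels)  # both classes now have 8 rows
```

If resampling is used, it would normally be applied to the training split only, so that duplicated rows do not leak into the evaluation data.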

📊 Results Summary

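| Model / Method  | EMBER2018 Accuracy | CIC-EvasivePDF2022 Accuracy |
|-----------------|--------------------|-----------------------------|
| Random Forest   | 99.6%              | 99.3%                       |
| XGBoost         | 99.7%              | 99.1%                       |
| CNN             | 78.4%              | 98.1%                       |
| RNN–LSTM        | 50.4%              | 96.2%                       |
| Transformer     | 76.7%              | 97.4%                       |
| Voting Ensemble | 99.5%              | 99.1%                       |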
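The table reports accuracy only. Because the project also targets imbalanced datasets, it can help to look at accuracy next to balanced accuracy and F1. The sketch below is a hypothetical helper (`binary_metrics` is not part of the repository) run on invented labels, and it shows how a classifier that never flags malware can still reach high accuracy on a skewed test set.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, balanced accuracy, and F1 for labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    recall_pos = tp / (tp + fn) if (tp + fn) else 0.0
    recall_neg = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + recall_pos
    return {
        "accuracy": (tp + tn) / len(pairs),
        "balanced_accuracy": (recall_pos + recall_neg) / 2,  # mean per-class recall
        "f1": 2 * precision * recall_pos / denom if denom else 0.0,
    }


# Invented example: 95 benign, 5 malicious, and a model that predicts benign for everything.
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100
metrics = binary_metrics(y_true, y_pred)
```

Here accuracy is high while balanced accuracy sits at chance level and F1 is zero, which is why accuracy figures on imbalanced data are usually read together with a per-class metric.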

👉 Key Insight:

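  • Tree-based ML methods (Random Forest, XGBoost) score highest on both datasets and are strongest on structured, legacy malware (EMBER2018: 99.6–99.7%).
  • DL models are far more competitive on evasive PDF malware (CIC-EvasivePDF2022: 96.2–98.1%) than on EMBER2018 (50.4–78.4%).
  • The voting ensemble stays within 0.2 percentage points of the best single model on both datasets (99.5% and 99.1%).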

πŸ› οΈ Tech Stack

  • Python 3.9+
  • Scikit-learn, PyTorch, XGBoost
  • Pandas, NumPy, Matplotlib/Seaborn for analysis

⚡ Quick Start

  1. Clone repo
    git clone https://github.com/MAvRK7/Bridging-Legacy-Modern-Threat-Detection.git
    cd Bridging-Legacy-Modern-Threat-Detection

About

A cybersecurity project that detects and classifies malware.