Repo for the paper "An Empirical Study on the Robustness and Generalization of Machine Learning for Malware Analysis"
This script provides a basis for automating the collection and extraction of data from malware samples.
The code is deliberately simple, so it can be adapted to different requirements with little effort.
dataset/
├── .env The directories that will be used are specified here.
├── extract.py Run this script after adapting the code
├── gen_samples.py The actual data extractor
└── parse_bazar.py Parse MalwareBazaar file naming style

git clone https://github.com/unipr-xAI-lab/ml-malware-dataset-extractor.git
cd ml-malware-dataset-extractor/dataset
pip install -r requirements.txt
python3 extract.py
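
Since the directories used by the scripts are specified in `.env`, it can help to check that file before running `extract.py`. The following is a minimal sketch of how such a file could be read with the Python standard library only. The key names `SAMPLES_DIR` and `OUTPUT_DIR` and the helper functions `load_env` and `resolve_dirs` are hypothetical and are not taken from this repo; replace them with the entries your `.env` actually defines.

```python
from pathlib import Path


def load_env(path):
    """Parse KEY=VALUE lines from a .env-style file into a dict."""
    settings = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a KEY=VALUE line, ignore it
        # Drop surrounding whitespace and optional quotes around the value.
        settings[key.strip()] = value.strip().strip("'\"")
    return settings


def resolve_dirs(settings, keys=("SAMPLES_DIR", "OUTPUT_DIR")):
    """Return the configured directories as Path objects.

    The default key names are placeholders (an assumption), not the
    names used by this repo.
    """
    missing = [k for k in keys if k not in settings]
    if missing:
        raise KeyError(f"missing .env entries: {', '.join(missing)}")
    return {k: Path(settings[k]).expanduser() for k in keys}
```

Calling `resolve_dirs(load_env(".env"))` from the `dataset/` directory would then either return the configured paths or fail early with the names of the missing entries, which is usually easier to diagnose than an error in the middle of an extraction run.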