GAGE: Genetic Algorithm-based Graph Explainer for Malware Analysis

Publication

M. Saqib, B. C. M. Fung, P. Charland, and A. Walenstein. GAGE: genetic algorithm-based graph explainer for malware analysis. In Proceedings of the 40th IEEE International Conference on Data Engineering (ICDE), pages 2258-2270, Utrecht, Netherlands: IEEE Computer Society, May 2024.

@inproceedings{SFCW24icde,
author = "M. Saqib and B. C. M. Fung and P. Charland and A. Walenstein",
title = "{GAGE}: Genetic Algorithm-based Graph Explainer for Malware Analysis",
booktitle = "Proc. of the 40th IEEE International Conference on Data Engineering (ICDE)",
pages = "2258-2270",
address = "Utrecht, Netherlands",
month = "May",
year = "2024",
publisher = "IEEE Computer Society",
}

Overview

GAGE is a malware analysis tool that uses a genetic algorithm to explain graph structures. The repository provides components for constructing Canonical Executable Graphs (CEGs), training Graph Convolutional Networks (GCNs), and performing quantitative and robustness analysis.

Repository Structure

Main Scripts

CEG.py: Constructs CEGs from disassembled binaries.
AED_training.py: Trains embeddings for functions.
GA.py: Implements the Genetic Algorithm-based Graph Explainer (GAGE) algorithm.
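
To illustrate the general idea behind a genetic-algorithm-based graph explainer, the sketch below evolves a binary node mask so that a scoring function is maximized. It is not the code in GA.py: the names evolve_mask and toy_score, all parameters, and the toy scoring function standing in for a trained classifier are assumptions made for this example.

```python
import random


def evolve_mask(num_nodes, score, keep, pop_size=30, generations=40,
                mutation_rate=0.1, seed=0):
    """Search for a mask with exactly `keep` selected nodes that maximizes `score`.

    A mask is a tuple of 0/1 flags, one per node. Hypothetical helper for
    illustration only, not the GAGE implementation.
    """
    rng = random.Random(seed)

    def to_mask(chosen):
        return tuple(1 if i in chosen else 0 for i in range(num_nodes))

    def random_mask():
        return to_mask(set(rng.sample(range(num_nodes), keep)))

    def repair(bits):
        # Force exactly `keep` ones so every candidate subgraph has the same size.
        ones = [i for i, b in enumerate(bits) if b]
        zeros = [i for i, b in enumerate(bits) if not b]
        rng.shuffle(ones)
        rng.shuffle(zeros)
        while len(ones) > keep:
            zeros.append(ones.pop())
        while len(ones) < keep:
            ones.append(zeros.pop())
        return to_mask(set(ones))

    def crossover(a, b):
        # Uniform crossover: each gene is copied from one of the two parents.
        return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]

    def mutate(bits):
        # Flip each gene independently with probability `mutation_rate`.
        return [1 - b if rng.random() < mutation_rate else b for b in bits]

    population = [random_mask() for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        parents = population[: pop_size // 2]  # truncation selection with elitism
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(repair(mutate(crossover(a, b))))
        population = parents + children
    return max(population, key=score)


# Toy stand-in for a model score: reward masks that keep nodes 1, 4, and 6.
IMPORTANT = {1, 4, 6}


def toy_score(mask):
    return sum(1 for i in IMPORTANT if mask[i])


best = evolve_mask(num_nodes=8, score=toy_score, keep=3)
print(best, toy_score(best))
```

In a real explainer the scoring function would typically query the trained model on the masked graph. The fixed-size repair step is one possible design choice for keeping explanations comparable in size.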

GCN Related

GCN_AccuracyEnhancement.py: Enhances the accuracy of the GCN model.
GCN_predict.py: Uses the trained GCN model for predictions.
GCN_without_edgeFeatures.py: GCN model without edge features.
GCN_withoutDuplicateNodes.py: GCN model excluding duplicate nodes.
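
For readers unfamiliar with GCNs, the following minimal sketch shows a single graph-convolution step in its standard form, ReLU(D^-1/2 (A + I) D^-1/2 X W). It uses only the standard library and assumes a small undirected graph given as a dense adjacency matrix. The function name gcn_layer is hypothetical, and the scripts above may structure their models differently.

```python
import math


def gcn_layer(adj, features, weights):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 X W).

    adj: n x n adjacency matrix (assumed symmetric), features: n x f,
    weights: f x h. All arguments are lists of lists of floats.
    """
    n = len(adj)
    # Add self-loops so each node also keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    # Symmetric normalization by the degrees of both endpoints.
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

    def matmul(a, b):
        return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
                for row in a]

    out = matmul(matmul(norm, features), weights)
    # ReLU non-linearity.
    return [[max(0.0, v) for v in row] for row in out]


# Two connected nodes with one feature each and a 1 x 1 weight matrix.
adj = [[0.0, 1.0], [1.0, 0.0]]
features = [[1.0], [3.0]]
hidden = gcn_layer(adj, features, [[1.0]])
print(hidden)
```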

GA for Experiments

GA_for_experiment_cfgexplainer.py: GA script tailored for CFG explainer experiments.
GA_for_experiment_gage_cfg.py: GA script configured for GAGE experiments.

Quantitative and Robustness Analysis

quantitative_analysis.py: Script for performing quantitative analysis on the generated graphs.
robustness_MMD_1perm.py: Calculates robustness using the Maximum Mean Discrepancy (MMD) metric with one permutation.
robustness_subgraph.py: Analyzes the robustness of subgraphs.
robustness_calculation.py: General robustness calculation script.
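
As background for the MMD-based robustness script, the sketch below computes a biased estimate of the squared Maximum Mean Discrepancy between two samples of feature vectors using an RBF kernel. The function names, the kernel choice, and the gamma parameter are assumptions for illustration. The repository's scripts may use a different kernel, estimator, or input representation.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def mmd_squared(xs, ys, gamma=1.0):
    """Biased estimate of squared MMD between samples xs and ys.

    MMD^2 = mean k(x, x') + mean k(y, y') - 2 * mean k(x, y)
    """
    def mean_kernel(a, b):
        total = sum(rbf_kernel(p, q, gamma) for p in a for q in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


# Hypothetical embeddings: two well-separated groups of 2-D vectors.
group_a = [[0.0, 0.0], [1.0, 0.0]]
group_b = [[5.0, 5.0], [6.0, 5.0]]
same = mmd_squared(group_a, group_a)
different = mmd_squared(group_a, group_b)
print(same, different)
```

A value near zero suggests the two samples are hard to tell apart under the chosen kernel, while larger values indicate a bigger distributional gap.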

Plotting and Visualization

myplot/CFGExplainer.py: Visualization for CFG explainer.
myplot_benign_4.py: Plots benign samples.
myplot_bladabindi_emotet.py: Plots Bladabindi and Emotet malware samples.
myplot_gex_3.py: General plotting for experiments.
myplot_gex_4.py: Additional plotting for benign samples.

Utility Scripts

extract_subgraph.py: Extracts subgraphs from the main graph.
blockFeatureGenerator.py: Generates block features for the graphs.
utility.py: General utility functions used across different scripts.
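
As a rough illustration of subgraph extraction, the sketch below keeps only the edges whose endpoints both belong to a chosen node set (an induced subgraph). The edge-list representation and the function name induced_subgraph are assumptions for this example and do not describe the interface of extract_subgraph.py.

```python
def induced_subgraph(edges, keep_nodes):
    """Return the edges whose source and target are both in keep_nodes."""
    keep = set(keep_nodes)
    return [(u, v) for u, v in edges if u in keep and v in keep]


# Hypothetical four-block chain; keep the last three blocks.
edges = [("a", "b"), ("b", "c"), ("c", "d")]
sub = induced_subgraph(edges, ["b", "c", "d"])
print(sub)
```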

Acknowledgments

GAGE was developed by Mohd Saqib under the supervision of Benjamin C. M. Fung, in the McGill Data Mining and Security Lab in Canada. It is distributed under the Creative Commons Attribution-NonCommercial 4.0 International License (see the LICENCE file in the repository for details). This research is supported by BlackBerry Limited (ALLRP 561035), Defence Research & Development Canada, and NSERC Alliance Grants (ALLRP 561035-20). Special thanks to Philippe Charland and Andrew Walenstein.

Disclaimer

The software is provided as-is with no warranty or support. We do not take any responsibility for any damage, loss of income, or any problems you might experience from using our software. If you have questions, you are encouraged to consult the paper and the source code. If you find our software useful, please cite our paper above.

