Skip to content

A package implementing the NEURAL FINGERPRINTS FOR ADVERSARIAL ATTACK DETECTION methode

License

Notifications You must be signed in to change notification settings

HaimFisher/fingerprints-armor

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Neural Fingerprints for Adversarial Attack Detection

Neural Fingerprints is a method for detecting adversarial attacks on deep neural networks. This project implements the key components of the Neural Fingerprints approach as described in the paper Neural Fingerprints for Adversarial Attack Detection.

The module consists of two main components:

  1. Data base creator - Generates a database of activations from the last layers of a neural network model, for both original images and adversarial examples.
  2. Fingerprints armor - Implements the Neural Fingerprints technique to detect potential adversarial attacks by analyzing the activations in the database.

Data base creator

The Data Base Creator module contains scripts and modules for creating adversarial examples activations database from the last layers of a neural network model. The goal is to build a database to test our adversarial fingerprint protection mechanism.

Input Structure

The input should be structured as an ImageNet dataset:

images/
├── category_1/
│   ├── image_1.JPEG
│   ├── image_2.JPEG
│   └── ...
├── category_2/
│   ├── image_1.JPEG
│   ├── image_2.JPEG
│   └── ...
└── ...

Output Format

The output will be pickle files containing the activations of the last X layers of the model for the original images and the adversarial examples generated from them.

Example directory structure:

database/
├── category_1/
│   ├── orig
│   │   ├── image_1.pkl
│   │   ├── image_2.pkl
│   │   └── ...
│   ├── attack
│   │   ├── image_1_0_to_1_ifgsm_3_01.pkl
│   │   ├── image_2_0_to_31_ifgsm_1_01.pkl
│   │   └── ...
├── category_2/
│   ├── orig
│   │   ├── image_1.pkl
│   │   ├── image_2.pkl
│   │   └── ...
│   ├── attack
│   │   ├── image_1_0_to_1_ifgsm_96_01.pkl
│   │   ├── image_2_0_to_78_ifgsm_9_01.pkl
│   │   └── ...
└── ...

Supported Attacks: The module supports generating adversarial examples using Fast Gradient Sign Method (FGSM) and Projected Gradient Descent (PGD) attacks.

Files

This module contains the Attack class, which provides methods for performing adversarial attacks on input tensors. It includes methods for Fast Gradient Sign Method (FGSM) and Projected Gradient Descent (PGD) attacks.

This module contains the ModelLastLayers class, which is used to extract activations from the last X FC layers of a model. It registers hooks to capture the activations during the forward pass.

The main script to create the DB. It uses the ModelLastLayers class to extract activations and the Attack class to generate adversarial examples.

Exsample

see db_creator_test for an exsample how to run the db_creator file.

Fingerprints Armor

The folder fingerprints_armor contains scripts and modules that implement the Neural Fingerprints method to detect adversarial attacks. The goal is to create fingerprints such that they can be used by the following methods: vote, anomaly, and ll_ratio to detect patterns that match an adversarial attack.

The fingerprint method is implemented via SingleClassFingerprintsArmor to find the fingerprint for every class to detect the adversarial attack on it.

The input should be the output from Data Base Creator and the output will supply functions:

  • vote: Predicted class labels based on the voting mechanism.
  • anomaly (likelihood): Anomaly score for each data point based on likelihood anomaly detection.
  • likelihood_ratio: Likelihood ratio value used to determine if a data point is an adversarial attack or not.

These functions will take a DataFrame containing the activations from the last x layers of the model after a suspicious image has been run through it as input and will return detection results indicating whether the image is attacked or not.

Files

This module handles the input and output operations for processing the activations database.

The main script that implements the activation fingerprint detection algorithm to identify potential adversarial attacks on neural networks. It uses statistical methods for fingerprint analysis.

Example

See fingerprints_armor_test for an example of how to run the fingerprints file and get the results.

In this test, each class tested will result in a CSV file with the following structure:

  • File Naming Convention: Each CSV file is named after the class being analyzed (e.g., class_name.csv).

  • File Structure: The CSV file contains the following columns:

    y vote anomaly ll_ratio cls
    0 0 0.2 1.5 class_1
    1 1 0.3 2.1 class_1
    0 0 0.1 1.2 class_1
    1 1 0.4 2.5 class_1
    ... ... ... ... ...

    Each row represents a data point, with the corresponding values for the actual label, predicted class label (vote), anomaly score, and likelihood ratio.

Aggregated Results:

After running the tests for all classes, the individual CSV files are concatenated into a single DataFrame (data). The aggregated results provide a comparison across multiple classes, showing how the defense mechanism performed in terms of voting, anomaly detection, and likelihood ratio.

Metric Columns:

  • vote: Based on the voting mechanism (e.g., majority vote from multiple fingerprint models).
  • anomaly: Anomaly scores that help identify unusual data points.
  • ll_ratio: Likelihood ratio for detecting adversarial examples.
  • Ground Truth (y): The ground truth label (0 for original data and 1 for attacked data) to compare against the predicted values.

Confusion Matrix:

For each metric (vote, anomaly, and ll_ratio), the confusion matrix is calculated. The confusion matrix compares the predicted values (from vote, anomaly, or ll_ratio) against the actual ground truth (y). This helps evaluate the performance of the adversarial defense mechanism.

Confusion Matrix Structure:

Predicted: 0 Predicted: 1
True: 0 True Negative (TN) False Positive (FP)
True: 1 False Negative (FN) True Positive (TP)

Where:

  • True Negative (TN): The number of original data points classified correctly as original.
  • False Positive (FP): The number of attacked data points classified incorrectly as original.
  • False Negative (FN): The number of original data points classified incorrectly as attacked.
  • True Positive (TP): The number of attacked data points classified correctly as attacked.

Quantile-Based Decision:

For each of the metrics (vote, anomaly, ll_ratio), the output is analyzed using a threshold based on the 1% quantile of the attack data. If a prediction for a particular metric exceeds this threshold, it is classified as an adversarial attack. This helps in assessing the robustness of the defense mechanism at different thresholds.

Configuration and Settings

The project uses a settings approach to manage configurations, allowing you to set paths, thresholds, and parameters directly in the code. This means you don’t have to specify configurations through command-line arguments each time you run the script, which simplifies setup and execution.

Open the settings script and adjust the values as needed.

Setup

pip

Run:

pip install fingerprints-armor

Then use it as follow:

import db_creator.db_creator
import fingerprints_armor.fingerprints

Docker

To ensure a consistent and reproducible environment, a Dockerfile is provided. The Docker setup installs all necessary dependencies and sets up the environment for running the scripts.

Building the Docker Image

To build the Docker image, run the following command in the root directory of the repository:

docker build -t neural-fingerprints .

Running the Docker Container

To run the Docker container, use the following command:

docker run -v /path/to/imagenet:/image_net_data/ -v /path/to/output:/neural_fingerprints_db/ -it neural-fingerprints

Replace /path/to/imagenet with the path to your ImageNet dataset and /path/to/output with the path where you want to save the output DB.

Running Without Docker

If you prefer not to use Docker, you can run the project directly on your local machine by setting up a Python environment and installing the required dependencies. Follow the steps below to get started.

  1. Clone the Repository Start by cloning the repository to your local machine:
git clone https://github.com/yourusername/neural-fingerprints.git
cd neural-fingerprints
  1. Set Up a Virtual Environment It's recommended to create a virtual environment to isolate the project dependencies. You can do this with venv (or virtualenv if preferred).

For Linux/macOS:

python3 -m venv venv
source venv/bin/activate

For Windows:

python -m venv venv
.\venv\Scripts\activate
  1. Install Dependencies Once the virtual environment is activated, install the required dependencies using the requirements.txt file:
pip install -r requirements.txt

This will install all the necessary Python libraries, including any dependencies needed for adversarial attack detection.

  1. Running the Project After the dependencies are installed, you can run the project scripts directly.

To create the activations database (original and adversarial examples), use:

python src/db_creator/db_creator.py

To run the Neural Fingerprints defense on the database and detect adversarial attacks:

python src/fingerprints_armor/fingerprints.py

This will process the activations from the database and output detection results to the specified directory.

About

A package implementing the NEURAL FINGERPRINTS FOR ADVERSARIAL ATTACK DETECTION methode

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published