This repository contains the code and implementation details for the research paper titled Neural Network Diffusion.
The paper explores novel paradigms in deep learning, specifically focusing on diffusion models for generating high-performing neural network parameters.
- Kai Wang¹, Dongwen Tang¹, Boya Zeng², Yida Yin³, Zhaopan Xu¹, Yukun Zhou, Zelin Zang¹, Trevor Darrell³, Zhuang Liu⁴*, and Yang You¹* (*equal advising)
- ¹National University of Singapore, ²University of Pennsylvania, ³University of California, Berkeley, and ⁴Meta AI
Abstract: Diffusion models have achieved remarkable success in image and video generation. In this work, we demonstrate that diffusion models can also generate high-performing neural network parameters. Our approach is simple, utilizing an autoencoder and a diffusion model. The autoencoder extracts latent representations of a subset of the trained neural network parameters. Next, a diffusion model is trained to synthesize these latent representations from random noise. This model then generates new representations, which are passed through the autoencoder's decoder to produce new subsets of high-performing network parameters. Across various architectures and datasets, our approach consistently generates models with comparable or improved performance over trained networks, with minimal additional cost. Notably, we empirically find that the generated models are not memorizing the trained ones. Our results encourage more exploration into the versatile use of diffusion models.
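To make the two-stage pipeline in the abstract concrete, here is a minimal, illustrative sketch of the idea: an autoencoder compresses flattened parameter vectors into latents, a diffusion model learns to denoise those latents, and new parameters are decoded from denoised random noise. All module shapes, names, and the single-step losses below are assumptions for exposition, not the repository's actual implementation.

```python
# Minimal sketch of the p-diff idea; shapes and modules are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

param_dim, latent_dim = 2048, 128  # size of the flattened parameter subset (assumed)

# Stage 1: an autoencoder compresses flattened parameter vectors into latents.
encoder = nn.Sequential(nn.Linear(param_dim, 512), nn.ReLU(), nn.Linear(512, latent_dim))
decoder = nn.Sequential(nn.Linear(latent_dim, 512), nn.ReLU(), nn.Linear(512, param_dim))

params = torch.randn(64, param_dim)           # stand-in for a batch of trained checkpoints
ae_loss = F.mse_loss(decoder(encoder(params)), params)

# Stage 2: a diffusion model learns to denoise the latents (one epsilon-prediction step).
denoiser = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, latent_dim))
z = encoder(params).detach()
noise = torch.randn_like(z)
alpha_bar = torch.rand(z.size(0), 1)          # stand-in for a proper noise schedule
z_noisy = alpha_bar.sqrt() * z + (1 - alpha_bar).sqrt() * noise
diff_loss = F.mse_loss(denoiser(z_noisy), noise)

# Generation: start from random noise, denoise, and decode into new parameters.
with torch.no_grad():
    z_gen = torch.randn(8, latent_dim)
    for _ in range(10):                       # crude stand-in for a full DDPM sampling loop
        z_gen = z_gen - 0.1 * denoiser(z_gen)
    new_params = decoder(z_gen)               # new subsets of network parameters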
We support all versions of `pytorch>=2.0.0`, but we recommend `python==3.11` and `pytorch==2.5.1`, which we have fully tested.
```bash
conda create -n pdiff python=3.11
conda activate pdiff
conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
```

```bash
git clone https://github.com/NUS-HPC-AI-Lab/Neural-Network-Parameter-Diffusion.git --depth=1
cd Neural-Network-Parameter-Diffusion
pip install -r requirements.txt
```
This will run three steps sequentially: preparing the dataset, training p-diff, and evaluating. The results will be saved in the root directory, and checkpoints will be saved in `./checkpoint`.
```bash
cd workspace
bash run_all.sh main cifar100_resnet18 0
# bash run_all.sh <category> <tag> <device>
```
Prepare the checkpoint dataset.
```bash
cd ./dataset/main/cifar100_resnet18
rm performance.cache  # optional
CUDA_VISIBLE_DEVICES=0 python train.py
CUDA_VISIBLE_DEVICES=0 python finetune.py
```
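These two scripts populate `./checkpoint` with many `.pth` files. As a hedged sketch of the general pattern (the repository's `train.py`/`finetune.py` differ in their details), one can save a state dict at regular intervals during fine-tuning so that p-diff sees many nearby, high-performing models:

```python
# Hedged sketch of producing a checkpoint dataset; the actual scripts differ in details.
# Each saved file is a dictionary of parameters (a state dict).
import torch
import torch.nn.functional as F
from torchvision import datasets, transforms
from torchvision.models import resnet18

device = "cuda" if torch.cuda.is_available() else "cpu"
model = resnet18(num_classes=100).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)

loader = torch.utils.data.DataLoader(
    datasets.CIFAR100("./data", train=True, download=True,
                      transform=transforms.ToTensor()),
    batch_size=128, shuffle=True)

for step, (images, labels) in enumerate(loader):
    loss = F.cross_entropy(model(images.to(device)), labels.to(device))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    # Save one checkpoint per step to build the dataset of trained parameters.
    torch.save(model.state_dict(), f"./checkpoint/checkpoint{step:03d}.pth")
```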
Train p-diff and generate models.
```bash
cd ../../../workspace
bash launch.sh main cifar100_resnet18 0
# bash launch.sh <category> <tag> <device>
CUDA_VISIBLE_DEVICES=0 python generate.py main cifar100_resnet18
# CUDA_VISIBLE_DEVICES=<device> python generate.py <category> <tag>
```
Test the original and generated checkpoints and their similarity.
```bash
CUDA_VISIBLE_DEVICES=0 python evaluate.py main cifar100_resnet18
# CUDA_VISIBLE_DEVICES=<device> python evaluate.py <category> <tag>
```
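In the paper, the similarity between two models is measured via the overlap of their predictions on the test set (the IoU of their wrong predictions). The sketch below illustrates that idea; the repository's `evaluate.py` may compute it differently:

```python
# Hedged sketch of a checkpoint-similarity measure in the spirit of the paper:
# the IoU of two models' wrong predictions on the test set.
import torch

@torch.no_grad()
def wrong_indices(model, loader, device="cpu"):
    """Return the set of test-sample indices the model misclassifies.
    The loader must not shuffle, so indices are consistent across models."""
    model.eval().to(device)
    wrong, offset = set(), 0
    for images, labels in loader:
        preds = model(images.to(device)).argmax(dim=1).cpu()
        wrong.update(offset + i for i in (preds != labels).nonzero().flatten().tolist())
        offset += labels.size(0)
    return wrong

def similarity(model_a, model_b, loader, device="cpu"):
    wrong_a = wrong_indices(model_a, loader, device)
    wrong_b = wrong_indices(model_b, loader, device)
    return len(wrong_a & wrong_b) / max(len(wrong_a | wrong_b), 1)  # IoU in [0, 1]
```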
All available `<category>` and `<tag>` values can be found as directories under `./dataset/<category>/<tag>`.
- Create a directory that mimics the dataset folder and contains three items:
```bash
mkdir ./dataset/main/<tag>
cd ./dataset/main/<tag>
```
- `checkpoint`: a directory containing many `.pth` files, each holding a dictionary of parameters.
- `generated`: an empty directory, where the generated models will be stored.
- `test.py`: a test script for evaluating the checkpoints (a minimal sketch is given after this list). It should be callable as follows:

```bash
CUDA_VISIBLE_DEVICES=0 python test.py ./checkpoint/checkpoint001.pth
# CUDA_VISIBLE_DEVICES=<device> python test.py <checkpoint_file>
```
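A minimal `test.py` conforming to this interface might look as follows; the architecture (ResNet-18) and dataset (CIFAR-100) are illustrative assumptions and should match whatever your checkpoints were trained on:

```python
# Hedged sketch of a conforming test.py: loads the given .pth state dict into a
# fixed architecture and prints test accuracy. Architecture/dataset are examples.
import sys
import torch
from torchvision import datasets, transforms
from torchvision.models import resnet18

def main(checkpoint_path):
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = resnet18(num_classes=100).to(device)
    model.load_state_dict(torch.load(checkpoint_path, map_location=device))
    model.eval()

    loader = torch.utils.data.DataLoader(
        datasets.CIFAR100("./data", train=False, download=True,
                          transform=transforms.ToTensor()),
        batch_size=256)
    correct = total = 0
    with torch.no_grad():
        for images, labels in loader:
            preds = model(images.to(device)).argmax(dim=1).cpu()
            correct += (preds == labels).sum().item()
            total += labels.size(0)
    print(f"accuracy: {correct / total:.4f}")

if __name__ == "__main__":
    main(sys.argv[1])  # called as: python test.py <checkpoint_file>
```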
- Register a dataset. Add a class at the end of the dataset file:
```bash
cd ../../../dataset
vim __init__.py  # this __init__.py is the dataset file
```

```diff
# on line 392
+ class <Tag>(MainDataset): pass
```
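For illustration, with a hypothetical tag `cifar10_vgg16`, the added line would read as below; judging from the existing `Cifar100_ResNet18`, the lowercased class name is assumed to map to the tag directory:

```python
class Cifar10_VGG16(MainDataset): pass  # hypothetical tag: cifar10_vgg16
```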
- Create your launch script. You can also adjust other hyperparameters in this script.
```bash
cd ../workspace/main
cp cifar10_resnet18.py main_<tag>.py
vim main_<tag>.py
```

```diff
# on line 33
- from dataset import Cifar100_ResNet18 as Dataset
+ from dataset import <Tag> as Dataset
```
- Train p-diff and generate models, following the section "Detail Usage".
- Test the original and generated checkpoints and their similarity, following the section "Detail Usage".
We thank Kaiming He, Dianbo Liu, Mingjia Shi, Zheng Zhu, Bo Zhao, Jiawei Liu, Yong Liu, Ziheng Qin, Zangwei Zheng, Yifan Zhang, Xiangyu Peng, Hongyan Chang, Zirui Zhu, Dave Zhenyu Chen, Ahmad Sajedi and George Cazenavette for valuable discussions and feedback.
If you find our work useful, please consider citing us:
```bibtex
@misc{wang2024neural,
      title={Neural Network Diffusion},
      author={Kai Wang and Dongwen Tang and Boya Zeng and Yida Yin and Zhaopan Xu and Yukun Zhou and Zelin Zang and Trevor Darrell and Zhuang Liu and Yang You},
      year={2024},
      eprint={2402.13144},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}
```