
Revisiting Residual Networks for Adversarial Robustness: An Architectural Perspective [arXiv]

Overview

This work presents a holistic study of the impact of architectural choice on adversarial robustness.

(Left) Impact of architectural components on adversarial robustness on CIFAR-10, relative to that of adversarial training methods. (Right) Chronological progress of SotA robust accuracy on CIFAR-10 against AutoAttack, without additional data, under perturbations of ε = 8/255.

Impact of Block-level Design

The design of a block primarily comprises its topology, type of convolution and kernel size, choice of activation, and normalization. We examine these elements independently through controlled experiments and propose a novel residual block, dubbed RobustResBlock, based on our observations. An overview of RobustResBlock is provided below:

[Figure: overview of the RobustResBlock design]
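To make the ingredients concrete, here is a minimal PyTorch sketch of a pre-activation residual block with squeeze-and-excitation, two of the components RobustResBlock builds on. This is an illustrative simplification, not the repository's implementation: the real block in models/resnet.py additionally uses multi-scale (`scales`) and grouped (`cardinality`) convolutions.

  import torch
  import torch.nn as nn

  class SqueezeExcite(nn.Module):
      """Channel re-weighting via global pooling + a small bottleneck MLP."""
      def __init__(self, chs, reduction=64):
          super().__init__()
          hidden = max(chs // reduction, 1)
          self.fc1 = nn.Linear(chs, hidden)
          self.fc2 = nn.Linear(hidden, chs)

      def forward(self, x):
          s = x.mean(dim=(2, 3))                      # global average pool -> (B, C)
          s = torch.sigmoid(self.fc2(torch.relu(self.fc1(s))))
          return x * s[:, :, None, None]              # re-weight each channel

  class PreActBlockSketch(nn.Module):
      """Pre-activation residual block (BN -> ReLU -> conv) with SE."""
      def __init__(self, in_chs, out_chs, stride=1):
          super().__init__()
          self.bn1 = nn.BatchNorm2d(in_chs)
          self.conv1 = nn.Conv2d(in_chs, out_chs, 3, stride, 1, bias=False)
          self.bn2 = nn.BatchNorm2d(out_chs)
          self.conv2 = nn.Conv2d(out_chs, out_chs, 3, 1, 1, bias=False)
          self.se = SqueezeExcite(out_chs)
          self.shortcut = (nn.Identity() if stride == 1 and in_chs == out_chs
                           else nn.Conv2d(in_chs, out_chs, 1, stride, bias=False))

      def forward(self, x):
          out = self.conv1(torch.relu(self.bn1(x)))   # pre-activation ordering
          out = self.conv2(torch.relu(self.bn2(out)))
          out = self.se(out)                          # SE before the residual add
          return out + self.shortcut(x)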

Table 1. White-box adversarial robustness of WRN with RobustResBlock

| Model | # Params | # FLOPs | PGD20 | CW40 | Checkpoint |
|---|---|---|---|---|---|
| D = 4, W = 10 | 39.6M | 6.00G | 57.70 | 54.71 | [BaiduDisk] |
| D = 5, W = 12 | 70.5M | 10.6G | 58.46 | 55.56 | [BaiduDisk] |
| D = 7, W = 14 | 133M | 19.6G | 59.41 | 56.62 | [BaiduDisk] |
| D = 11, W = 16 | 270M | 39.3G | 60.48 | 57.78 | [BaiduDisk] |

Impact of Network-level Design

Independent Scaling by Depth (D1 : D2 : D3 = 2 : 2 : 1)

We allow the depth D_i of each stage i ∈ {1, 2, 3} to vary among {2, 3, 4, 5, 7, 9, 11}; details and pre-trained checkpoints for all 7³ = 343 depth settings are available from here.

[Figure: adversarial robustness under independent depth scaling]

Independent Scaling by Width (W1 : W2 : W3 = 2 : 2.5 : 1)

We allow the width (in terms of widening factor) W_i of each stage i ∈ {1, 2, 3} to vary among {4, 6, 8, 10, 12, 14, 16, 20}; details and pre-trained checkpoints for all 8³ = 512 width settings are available from here.

[Figure: adversarial robustness under independent width scaling]
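For reference, the two search grids above are straightforward to enumerate; a minimal sketch (the choice lists are taken from the text, the variable names are ours):

  from itertools import product

  depth_choices = [2, 3, 4, 5, 7, 9, 11]
  width_choices = [4, 6, 8, 10, 12, 14, 16, 20]
  depth_grid = list(product(depth_choices, repeat=3))  # (D1, D2, D3): 7**3 = 343 settings
  width_grid = list(product(width_choices, repeat=3))  # (W1, W2, W3): 8**3 = 512 settings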

Interplay between Depth and Width (D_i : W_i = 7 : 3)

[Figure: compound scaling of depth and width]

[Figure: comparison of independent and compound scaling]

Table 2. Performance of independent scaling (D or W) and compound scaling (D & W)

| # FLOPs target | Scale by | D1 | W1 | D2 | W2 | D3 | W3 | # Params | # FLOPs | PGD20 | CW40 | Checkpoint |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5G | D | 5 | 10 | 5 | 10 | 2 | 10 | 24.0M | 5.25G | 56.05 | 53.14 | [BaiduDisk] |
| 5G | W | 4 | 11 | 4 | 13 | 4 | 6 | 24.5M | 5.71G | 56.89 | 53.87 | [BaiduDisk] |
| 5G | D & W | 14 | 5 | 14 | 7 | 7 | 3 | 17.7M | 5.09G | 57.49 | 54.78 | [BaiduDisk] |
| 10G | D | 6 | 12 | 6 | 12 | 3 | 12 | 48.5M | 9.59G | 56.42 | 53.91 | [BaiduDisk] |
| 10G | W | 5 | 13 | 5 | 16 | 5 | 7 | 44.4M | 10.5G | 57.06 | 54.29 | [BaiduDisk] |
| 10G | D & W | 17 | 7 | 17 | 9 | 8 | 4 | 39.3M | 9.74G | 58.06 | 55.45 | [BaiduDisk] |
| 20G | D | 9 | 14 | 8 | 14 | 4 | 14 | 90.4M | 18.6G | 57.11 | 54.48 | [BaiduDisk] |
| 20G | W | 7 | 16 | 7 | 18 | 7 | 8 | 81.7M | 20.4G | 58.02 | 55.34 | [BaiduDisk] |
| 20G | D & W | 22 | 8 | 22 | 11 | 11 | 5 | 74.8M | 20.3G | 58.47 | 56.14 | [BaiduDisk] |
| 40G | D | 14 | 16 | 13 | 16 | 11 | 16 | 185M | 38.8G | 57.90 | 55.79 | [BaiduDisk] |
| 40G | W | 11 | 18 | 11 | 21 | 11 | 9 | 170M | 42.7G | 58.48 | 56.15 | [BaiduDisk] |
| 40G | D & W | 27 | 10 | 28 | 14 | 13 | 6 | 147M | 40.4G | 58.76 | 56.59 | [BaiduDisk] |

Adversarially Robust Residual Networks (RobustResNets)

We use the proposed compound scaling rule to scale RobustResBlock and present a portfolio of adversarially robust residual networks.

Table 3. Comparison to SotA methods with additional 500K data

| Method | Model | # Params | # FLOPs | AA | Checkpoint |
|---|---|---|---|---|---|
| RST | WRN-28-10 | 36.5M | 5.20G | 59.53 | |
| AWP | WRN-28-10 | 36.5M | 5.20G | 60.04 | |
| HAT | WRN-28-10 | 36.5M | 5.20G | 62.50 | |
| Gowal et al. | WRN-28-10 | 36.5M | 5.20G | 62.80 | |
| Huang et al. | WRN-34-R | 68.1M | 19.1G | 62.54 | |
| Ours | RobustResNet-A1 | 19.2M | 5.11G | 63.70 | [BaiduDisk] |
| Ours | WRN-A4 | 147M | 40.4G | 65.79 | [BaiduDisk] |

How to use

1. Use our RobustResNets

  from models.resnet import PreActResNet

  # Per-stage depths (D1, D2, D3) and channels derived from the widening factors (W1, W2, W3)
  depth = [D1, D2, D3]
  channels = [16, 16*W1, 32*W2, 64*W3]
  block_types = ['robust_res_block', 'robust_res_block', 'robust_res_block']

  # Syntax
  model = PreActResNet(
    depth_configs=depth,
    channel_configs=channels,
    block_types=block_types,
    scales=8,
    base_width=10,
    cardinality=4,
    se_reduction=64,
    num_classes=10)  # 10 classes for CIFAR-10/SVHN/MNIST
  # See the "D & W" rows of Table 2 for (D1, D2, D3) and (W1, W2, W3); examples below
  RobustResNet_A1 = PreActResNet(
    depth_configs=[14, 14, 7],
    channel_configs=[16, 16*5, 32*7, 64*3],    # widening factors (5, 7, 3)
    ...)
  RobustResNet_A2 = PreActResNet(
    depth_configs=[17, 17, 8],
    channel_configs=[16, 16*7, 32*9, 64*4],    # widening factors (7, 9, 4)
    ...)
  RobustResNet_A3 = PreActResNet(
    depth_configs=[22, 22, 11],
    channel_configs=[16, 16*8, 32*11, 64*5],   # widening factors (8, 11, 5)
    ...)
  RobustResNet_A4 = PreActResNet(
    depth_configs=[27, 28, 13],
    channel_configs=[16, 16*10, 32*14, 64*6],  # widening factors (10, 14, 6)
    ...)
  
  # If you prefer WRN's basic block but with our scalings
  WRN_A1 = PreActResNet(
    depth_configs=[14, 14, 7],
    channel_configs=[16, 16*5, 32*7, 64*3],    # widening factors (5, 7, 3)
    block_types=['basic_block', 'basic_block', 'basic_block'],
    ...)
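Once the elided keyword arguments (the `...` above) are filled in as in the Syntax block, a quick shape check might look like this (hypothetical usage):

  import torch

  logits = RobustResNet_A1(torch.randn(2, 3, 32, 32))  # a CIFAR-sized batch
  print(logits.shape)  # expected: torch.Size([2, 10])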

2. Just want to use our block RobustResBlock

  from models.resnet import RobustResBlock

  # See Table 1 above for the performance of RobustResBlock
  block = RobustResBlock(
    in_chs, out_chs,  # input / output channel counts
    kernel_size=3,
    scales=8,
    base_width=10,
    cardinality=4,
    se_reduction=64,
    activation='ReLU',
    normalization='BatchNorm')
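Hypothetical usage of a single block, assuming the constructor signature shown above; the channel counts are placeholders:

  import torch

  block = RobustResBlock(
    160, 160,  # in_chs == out_chs, so the output shape matches the input
    kernel_size=3, scales=8, base_width=10, cardinality=4,
    se_reduction=64, activation='ReLU', normalization='BatchNorm')
  out = block(torch.randn(2, 160, 32, 32))
  print(out.shape)  # expected: torch.Size([2, 160, 32, 32])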

3. Use our compound scaling rule, RobustScaling, to scale your custom models

Please see examples/compound_scaling.ipynb
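The notebook is the authoritative reference; as rough intuition for what compound scaling does, here is a hypothetical sketch that allocates a FLOP budget between depth and width. It assumes FLOPs grow linearly with depth and quadratically with width, and uses a 70/30 depth-to-width split; the function name and exact allocation are ours, not the repository's.

  def compound_scale(base_depths, base_widths, flop_multiplier,
                     d_share=0.7, w_share=0.3):
      """Scale per-stage depths/widths so d * w**2 ~= flop_multiplier."""
      d = flop_multiplier ** d_share         # depth factor (FLOPs linear in depth)
      w = flop_multiplier ** (w_share / 2)   # width factor (FLOPs quadratic in width)
      depths = [max(1, round(di * d)) for di in base_depths]
      widths = [max(1, round(wi * w)) for wi in base_widths]
      return depths, widths

  # e.g. doubling the FLOP budget of a (5, 5, 2) x (10, 10, 10) baseline:
  print(compound_scale([5, 5, 2], [10, 10, 10], 2.0))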

How to evaluate pre-trained models

  • Download the checkpoints, which should contain the following:
    arch_xxx/
      -arch_xxx.log  # training log
      -arch_xxx.yaml  # configuration file 
      -checkpoints/
        -arch_xxx.pth  # last epoch checkpoint
        -arch_xxx_best.pth  # checkpoint for best robust acc on valid set
    
  • Run the following lines to evaluate adversarial robustness
  python eval_robustness.py \
    --data "path to data" \
    --config_file_path "path to configuration yaml file" \
    --checkpoint_path "path to checkpoint pth file" \
    --save_path "path to file for logging evaluation" \
    --attack_choice [FGSM/PGD/CW/AA] \
    --num_steps [1/20/40/0] \  # 1 for FGSM, 20 for PGD, 40 for CW, 0 for AA
    --batch_size 100  # batch size for evaluation, adjust according to your GPU memory
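To evaluate all four attacks in one go, a hypothetical Python wrapper around the documented flags (all paths are placeholders to fill in):

  import subprocess

  for attack, steps in [("FGSM", 1), ("PGD", 20), ("CW", 40), ("AA", 0)]:
      subprocess.run([
          "python", "eval_robustness.py",
          "--data", "/path/to/data",
          "--config_file_path", "arch_xxx/arch_xxx.yaml",
          "--checkpoint_path", "arch_xxx/checkpoints/arch_xxx_best.pth",
          "--save_path", f"arch_xxx_eval_{attack}.log",
          "--attack_choice", attack,
          "--num_steps", str(steps),
          "--batch_size", "100",
      ], check=True)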

CIFAR-10 (TRADES)

| Model | # Params | # FLOPs | Clean | PGD20 | CW40 | AA | Checkpoint |
|---|---|---|---|---|---|---|---|
| WRN-28-10 | 36.5M | 5.20G | 84.62 | 55.90 | 53.15 | 51.66 | [BaiduDisk] |
| RobNet-large-v2 | 33.3M | 5.10G | 84.57 | 52.79 | 48.94 | 47.48 | [BaiduDisk] |
| AdvRush | 32.6M | 4.97G | 84.95 | 56.99 | 53.27 | 52.90 | [BaiduDisk] |
| RACL | 32.5M | 4.93G | 83.91 | 55.98 | 53.22 | 51.37 | [BaiduDisk] |
| RRN-A1 (ours) | 19.2M | 5.11G | 85.46 | 58.47 | 55.72 | 54.42 | [BaiduDisk] |
| WRN-34-12 | 66.5M | 9.60G | 84.93 | 56.01 | 53.53 | 51.97 | [BaiduDisk] |
| WRN-34-R | 68.1M | 19.1G | 85.80 | 57.35 | 54.77 | 53.23 | [BaiduDisk] |
| RRN-A2 (ours) | 39.0M | 10.8G | 85.80 | 59.72 | 56.74 | 55.49 | [BaiduDisk] |
| WRN-46-14 | 128M | 18.6G | 85.22 | 56.37 | 54.19 | 52.63 | [BaiduDisk] |
| RRN-A3 (ours) | 75.9M | 19.9G | 86.79 | 60.10 | 57.29 | 55.84 | [BaiduDisk] |
| WRN-70-16 | 267M | 38.8G | 85.51 | 56.78 | 54.52 | 52.80 | [BaiduDisk] |
| RRN-A4 (ours) | 147M | 39.4G | 87.10 | 60.26 | 57.90 | 56.29 | [BaiduDisk] |

CIFAR-100 (TRADES)

| Model | # Params | # FLOPs | Clean | PGD20 | CW40 | AA | Checkpoint |
|---|---|---|---|---|---|---|---|
| WRN-28-10 | 36.5M | 5.20G | 56.30 | 29.91 | 26.22 | 25.26 | [BaiduDisk] |
| RobNet-large-v2 | 33.3M | 5.10G | 55.27 | 29.23 | 24.63 | 23.69 | [BaiduDisk] |
| AdvRush | 32.6M | 4.97G | 56.40 | 30.40 | 26.16 | 25.27 | [BaiduDisk] |
| RACL | 32.5M | 4.93G | 56.09 | 30.38 | 26.65 | 25.65 | [BaiduDisk] |
| RRN-A1 (ours) | 19.2M | 5.11G | 59.34 | 32.70 | 27.76 | 26.75 | [BaiduDisk] |
| WRN-34-12 | 66.5M | 9.60G | 56.08 | 29.87 | 26.51 | 25.47 | [BaiduDisk] |
| WRN-34-R | 68.1M | 19.1G | 58.78 | 31.17 | 27.33 | 26.31 | [BaiduDisk] |
| RRN-A2 (ours) | 39.0M | 10.8G | 59.38 | 33.00 | 28.71 | 27.68 | [BaiduDisk] |
| WRN-46-14 | 128M | 18.6G | 56.78 | 30.03 | 27.27 | 26.28 | [BaiduDisk] |
| RRN-A3 (ours) | 75.9M | 19.9G | 60.16 | 33.59 | 29.58 | 28.48 | [BaiduDisk] |
| WRN-70-16 | 267M | 38.8G | 56.93 | 29.76 | 27.20 | 26.12 | [BaiduDisk] |
| RRN-A4 (ours) | 147M | 39.4G | 61.66 | 34.25 | 30.04 | 29.00 | [BaiduDisk] |

CIFAR-10 (SAT)

| Model | # Params | # FLOPs | PGD20 | CW40 | Checkpoint |
|---|---|---|---|---|---|
| WRN-28-10 | 36.5M | 5.20G | 52.44 | 50.97 | [BaiduDisk] |
| RRN-A1 (ours) | 19.2M | 5.11G | 57.62 | 56.06 | [BaiduDisk] |
| WRN-34-12 | 66.5M | 9.60G | 52.85 | 51.36 | [BaiduDisk] |
| RRN-A2 (ours) | 39.0M | 10.8G | 58.39 | 56.99 | [BaiduDisk] |
| WRN-46-14 | 128M | 18.6G | 53.67 | 52.95 | [BaiduDisk] |
| RRN-A3 (ours) | 75.9M | 19.9G | 58.81 | 57.60 | [BaiduDisk] |
| WRN-70-16 | 267M | 38.8G | 54.12 | 50.52 | [BaiduDisk] |
| RRN-A4 (ours) | 147M | 39.4G | 59.01 | 57.85 | [BaiduDisk] |

CIFAR-10 (MART)

| Model | # Params | # FLOPs | PGD20 | CW40 | Checkpoint |
|---|---|---|---|---|---|
| WRN-28-10 | 36.5M | 5.20G | 57.69 | 52.88 | [BaiduDisk] |
| RRN-A1 (ours) | 19.2M | 5.11G | 59.34 | 54.42 | [BaiduDisk] |
| WRN-34-12 | 66.5M | 9.60G | 57.40 | 53.11 | [BaiduDisk] |
| RRN-A2 (ours) | 39.0M | 10.8G | 60.33 | 55.51 | [BaiduDisk] |
| WRN-46-14 | 128M | 18.6G | 58.43 | 54.32 | [BaiduDisk] |
| RRN-A3 (ours) | 75.9M | 19.9G | 60.95 | 56.52 | [BaiduDisk] |
| WRN-70-16 | 267M | 38.8G | 58.15 | 54.37 | [BaiduDisk] |
| RRN-A4 (ours) | 147M | 39.4G | 61.88 | 57.55 | [BaiduDisk] |

How to train

Baseline adversarial training

python -m torch.distributed.launch \
  --nproc_per_node=2 --master_port 24220 \  # use a random port number
  main_dist.py \
  --config_path ./configs/CIFAR10 \
  --exp_name ./exps/CIFAR10 \  # path to where you want to store training stats
  --version [WRN-A1/A2/A3/A4] \  # you may also change it to RobustResNet-A1/A2/A3/A4
  --train \
  --data_parallel \
  --apex-amp

Advanced adversarial training

Please download the additional pseudolabeled data from Carmon et al., 2019.

python -m torch.distributed.launch \
  --nproc_per_node=8 --master_port 14226 \  # use a random port number
  adv-main_dist.py \
  --log-dir ./checkpoints/ \  # path to where you want to store training stats
  --config-path ./configs/Advanced_CIFAR10 \
  --version [WRN-A1/A2/A3/A4] \
  --desc drna4-basic-silu-apex-500k \  # name of the folder for storing training stats
  --apex-amp --adv-eval-freq 5 \  # evaluating too frequently will significantly slow down training
  --start-eval 310 \  # start evaluating after N epochs
  --advnorm --adjust_bn True \
  --num-adv-epochs 400 --batch-size 1024 --lr 0.4 --weight-decay 0.0005 --beta 6.0 \
  --data-dir /datasets/ --data cifar10s \
  --aux-data-filename /datasets/ti_500K_pseudo_labeled.pickle \  # path to the downloaded pseudolabeled data
  --unsup-fraction 0.7

Requirements

The code has been implemented and tested with Python 3.8.5, PyTorch 1.8.0, and NVIDIA Apex (used for mixed-precision acceleration).

Part of the code is based on other open-source repositories.
