Adversarial Training Using the Fast Gradient Sign Method (FGSM)

[Paper] [Notebook] [TF Implementation] [Torch Implementation]

In this example, we will demonstrate how to train a model to resist adversarial attacks constructed using the Fast Gradient Sign Method (FGSM). For more background on adversarial attacks, see the paper linked above.

Import the required libraries

In [1]:

import tempfile
import os

import numpy as np

import fastestimator as fe
from fastestimator.architecture.tensorflow import LeNet
from fastestimator.backend import argmax
from fastestimator.dataset.data import cifair10
from fastestimator.op.numpyop.univariate import Normalize
from fastestimator.op.tensorop import Average
from fastestimator.op.tensorop.gradient import Watch, FGSM
from fastestimator.op.tensorop.loss import CrossEntropy
from fastestimator.op.tensorop.model import ModelOp, UpdateOp
from fastestimator.trace.io import BestModelSaver
from fastestimator.trace.metric import Accuracy
from fastestimator.util import BatchDisplay, GridDisplay, to_number

In [2]:

# training parameters
epsilon=0.04  # The strength of the adversarial attack
epochs=10
batch_size=50
train_steps_per_epoch=None
eval_steps_per_epoch=None
save_dir=tempfile.mkdtemp()

Step 1 - Data and `Pipeline` preparation

In this step, we will load ciFAIR10 training and validation datasets and prepare FastEstimator's pipeline.

Load dataset

We use a FastEstimator API to load the ciFAIR10 dataset and then get a test set by splitting 50% of the data off of the evaluation set.
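
As a quick sanity check on the split, the sketch below works through the resulting set sizes. It assumes the standard ciFAIR-10 layout of 50,000 training and 10,000 test images, which this notebook does not state explicitly. With the batch size of 50 used here, the numbers are consistent with the 1000 training steps and 100 evaluation steps per epoch reported in the training logs.

```python
# Assumed dataset sizes (standard ciFAIR-10 layout; not stated in this notebook).
n_train, n_eval_full = 50_000, 10_000
batch_size = 50
split_fraction = 0.5  # fraction moved from the evaluation set into the test set

n_test = int(n_eval_full * split_fraction)
n_eval = n_eval_full - n_test

train_steps = n_train // batch_size  # steps per training epoch
eval_steps = n_eval // batch_size    # steps per evaluation pass
print(n_eval, n_test, train_steps, eval_steps)
```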

In [3]:

from fastestimator.dataset.data import cifair10

train_data, eval_data = cifair10.load_data()
test_data = eval_data.split(0.5)

Prepare the `Pipeline`

We will use a simple pipeline that just normalizes the images using per-channel mean and standard deviation values.
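
To make the normalization concrete, here is a minimal NumPy sketch of per-channel standardization. It is an illustration rather than FastEstimator's implementation: the `normalize` helper is hypothetical, and it assumes pixel values are first divided by a maximum value of 255 before the mean and standard deviation are applied.

```python
import numpy as np

# Per-channel statistics, matching the values passed to Normalize in this notebook.
mean = np.array([0.4914, 0.4822, 0.4465], dtype=np.float32)
std = np.array([0.2471, 0.2435, 0.2616], dtype=np.float32)

def normalize(image_uint8, max_pixel_value=255.0):
    """Hypothetical helper: scale to [0, 1], then standardize each channel."""
    scaled = image_uint8.astype(np.float32) / max_pixel_value  # assumed scaling step
    return (scaled - mean) / std  # broadcasts over the trailing channel axis

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)  # stand-in image
normalized = normalize(image)
print(normalized.shape, normalized.dtype)
```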

In [4]:

pipeline = fe.Pipeline(
        train_data=train_data,
        eval_data=eval_data,
        test_data=test_data,
        batch_size=batch_size,
        ops=[
            Normalize(inputs="x", outputs="x", mean=(0.4914, 0.4822, 0.4465), std=(0.2471, 0.2435, 0.2616))
        ])

Step 2 - `Network` construction

Model Construction

Here we will leverage the LeNet implementation built into FastEstimator.

In [5]:

model = fe.build(model_fn=lambda: LeNet(input_shape=(32, 32, 3)), optimizer_fn="adam", model_name="adv_model")

2023-03-22 12:21:03.745710: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:306] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2023-03-22 12:21:03.745734: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:272] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)

Metal device set to: Apple M1 Max

`Network` definition

This is where the adversarial attack will be implemented. To perform an FGSM attack, we first need to monitor gradients with respect to the input image. This can be accomplished in FastEstimator using the Watch TensorOp. We then will run the model forward once, compute the loss, and then pass the loss value into the FGSM TensorOp in order to create an adversarial image. We will then run the adversarial image through the model, compute the loss again, and average the two results together in order to update the model.
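
The steps above can be sketched on a toy problem. The example below is not FastEstimator's implementation: it uses a hand-written logistic-regression "model" (the weights `w`, bias `b`, and sample `x` are made-up values) so that the gradient of the loss with respect to the input can be written out by hand. It applies the standard FGSM update, x_adv = x + epsilon * sign(dL/dx), and then averages the clean and adversarial losses as the `Average` op does.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce(p, y):
    """Binary cross-entropy for a single prediction p and label y."""
    return -(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

# Toy logistic-regression model and sample (illustrative values only).
w = np.array([0.5, -1.0, 2.0])
b = 0.1
x = np.array([0.2, 0.4, -0.3])
y = 1.0
epsilon = 0.04

# 1. Forward pass on the clean input and base loss.
p = sigmoid(w @ x + b)
base_ce = bce(p, y)

# 2. Gradient of the loss with respect to the input (closed form for this model).
grad_x = (p - y) * w

# 3. FGSM: step by epsilon in the direction of the gradient's sign.
x_adverse = x + epsilon * np.sign(grad_x)

# 4. Forward pass on the adversarial input, then average the two losses.
adv_ce = bce(sigmoid(w @ x_adverse + b), y)
avg_ce = 0.5 * (base_ce + adv_ce)
print(base_ce, adv_ce, avg_ce)
```

Because every element of the input moves by exactly `epsilon` in the direction that increases the loss, the adversarial loss is larger than the clean loss for this model.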

In [6]:

network = fe.Network(ops=[
        Watch(inputs="x"),
        ModelOp(model=model, inputs="x", outputs="y_pred"),
        CrossEntropy(inputs=("y_pred", "y"), outputs="base_ce"),
        FGSM(data="x", loss="base_ce", outputs="x_adverse", epsilon=epsilon),
        ModelOp(model=model, inputs="x_adverse", outputs="y_pred_adv"),
        CrossEntropy(inputs=("y_pred_adv", "y"), outputs="adv_ce"),
        Average(inputs=("base_ce", "adv_ce"), outputs="avg_ce"),
        UpdateOp(model=model, loss_name="avg_ce")
    ])

Step 3 - `Estimator` definition and training

In this step, we define the Estimator to connect the Network with the Pipeline and set the traces which will compute accuracy (Accuracy) and save the best model (BestModelSaver) along the way. We will compute accuracy both with respect to the clean input images ('clean accuracy') as well as with respect to the adversarial input images ('adversarial accuracy'). At the end, we use Estimator.fit to trigger the training.

In [7]:

traces = [
    Accuracy(true_key="y", pred_key="y_pred", output_name="clean_accuracy"),
    Accuracy(true_key="y", pred_key="y_pred_adv", output_name="adversarial_accuracy"),
    BestModelSaver(model=model, save_dir=save_dir, metric="base_ce", save_best_mode="min"),
]
estimator = fe.Estimator(pipeline=pipeline,
                         network=network,
                         epochs=epochs,
                         traces=traces,
                         train_steps_per_epoch=train_steps_per_epoch,
                         eval_steps_per_epoch=eval_steps_per_epoch,
                         monitor_names=["base_ce", "adv_ce"],
                         log_steps=1000)

In [8]:

estimator.fit()

    ______           __  ______     __  _                 __            
   / ____/___ ______/ /_/ ____/____/ /_(_)___ ___  ____ _/ /_____  _____
  / /_  / __ `/ ___/ __/ __/ / ___/ __/ / __ `__ \/ __ `/ __/ __ \/ ___/
 / __/ / /_/ (__  ) /_/ /___(__  ) /_/ / / / / / / /_/ / /_/ /_/ / /    
/_/    \__,_/____/\__/_____/____/\__/_/_/ /_/ /_/\__,_/\__/\____/_/     
                                                                        

FastEstimator-Start: step: 1; logging_interval: 1000; num_device: 1;
WARNING:tensorflow:From /Users/skynet/.pyenv/versions/miniforge3-4.10.3-10/envs/femos38/lib/python3.8/site-packages/tensorflow/python/autograph/pyct/static_analysis/liveness.py:83: Analyzer.lamba_check (from tensorflow.python.autograph.pyct.static_analysis.liveness) is deprecated and will be removed after 2023-09-23.
Instructions for updating:
Lambda fuctions will be no more assumed to be used in the statement where they are used, or at least in the same block. https://github.com/tensorflow/tensorflow/issues/56089
FastEstimator-Train: step: 1; adv_ce: 2.5932403; avg_ce: 2.500473; base_ce: 2.4077055;
FastEstimator-Train: step: 1000; adv_ce: 1.8527539; avg_ce: 1.6701212; base_ce: 1.4874886; steps/sec: 100.77;
FastEstimator-Train: step: 1000; epoch: 1; epoch_time(sec): 10.91;
Eval Progress: 1/100;
Eval Progress: 33/100; steps/sec: 133.06;
Eval Progress: 66/100; steps/sec: 117.13;
Eval Progress: 100/100; steps/sec: 74.83;
FastEstimator-BestModelSaver: Saved model to /var/folders/lx/drkxftt117gblvgsp1p39rlc0000gn/T/tmpv5xrd_mi/adv_model_best_base_ce.h5
FastEstimator-Eval: step: 1000; epoch: 1; adv_ce: 1.6358021; adversarial_accuracy: 0.3874; avg_ce: 1.4667553; base_ce: 1.2977084; clean_accuracy: 0.543; min_base_ce: 1.2977084; since_best_base_ce: 0;
FastEstimator-Train: step: 2000; adv_ce: 1.4819262; avg_ce: 1.2961853; base_ce: 1.1104442; steps/sec: 92.93;
FastEstimator-Train: step: 2000; epoch: 2; epoch_time(sec): 10.78;
Eval Progress: 1/100;
Eval Progress: 33/100; steps/sec: 128.45;
Eval Progress: 66/100; steps/sec: 130.29;
Eval Progress: 100/100; steps/sec: 115.47;
FastEstimator-BestModelSaver: Saved model to /var/folders/lx/drkxftt117gblvgsp1p39rlc0000gn/T/tmpv5xrd_mi/adv_model_best_base_ce.h5
FastEstimator-Eval: step: 2000; epoch: 2; adv_ce: 1.5952863; adversarial_accuracy: 0.4086; avg_ce: 1.3943998; base_ce: 1.1935132; clean_accuracy: 0.5844; min_base_ce: 1.1935132; since_best_base_ce: 0;
FastEstimator-Train: step: 3000; adv_ce: 1.5492551; avg_ce: 1.3347089; base_ce: 1.1201626; steps/sec: 92.02;
FastEstimator-Train: step: 3000; epoch: 3; epoch_time(sec): 10.86;
Eval Progress: 1/100;
Eval Progress: 33/100; steps/sec: 126.13;
Eval Progress: 66/100; steps/sec: 120.81;
Eval Progress: 100/100; steps/sec: 124.64;
FastEstimator-BestModelSaver: Saved model to /var/folders/lx/drkxftt117gblvgsp1p39rlc0000gn/T/tmpv5xrd_mi/adv_model_best_base_ce.h5
FastEstimator-Eval: step: 3000; epoch: 3; adv_ce: 1.5346261; adversarial_accuracy: 0.4336; avg_ce: 1.3141936; base_ce: 1.093761; clean_accuracy: 0.6146; min_base_ce: 1.093761; since_best_base_ce: 0;
FastEstimator-Train: step: 4000; adv_ce: 1.3029621; avg_ce: 1.08198; base_ce: 0.86099803; steps/sec: 95.21;
FastEstimator-Train: step: 4000; epoch: 4; epoch_time(sec): 10.51;
Eval Progress: 1/100;
Eval Progress: 33/100; steps/sec: 105.34;
Eval Progress: 66/100; steps/sec: 122.6;
Eval Progress: 100/100; steps/sec: 123.17;
FastEstimator-BestModelSaver: Saved model to /var/folders/lx/drkxftt117gblvgsp1p39rlc0000gn/T/tmpv5xrd_mi/adv_model_best_base_ce.h5
FastEstimator-Eval: step: 4000; epoch: 4; adv_ce: 1.5180533; adversarial_accuracy: 0.4364; avg_ce: 1.2781447; base_ce: 1.038236; clean_accuracy: 0.6398; min_base_ce: 1.038236; since_best_base_ce: 0;
FastEstimator-Train: step: 5000; adv_ce: 1.3933566; avg_ce: 1.1443093; base_ce: 0.8952619; steps/sec: 93.05;
FastEstimator-Train: step: 5000; epoch: 5; epoch_time(sec): 10.74;
Eval Progress: 1/100;
Eval Progress: 33/100; steps/sec: 123.29;
Eval Progress: 66/100; steps/sec: 132.07;
Eval Progress: 100/100; steps/sec: 116.19;
FastEstimator-BestModelSaver: Saved model to /var/folders/lx/drkxftt117gblvgsp1p39rlc0000gn/T/tmpv5xrd_mi/adv_model_best_base_ce.h5
FastEstimator-Eval: step: 5000; epoch: 5; adv_ce: 1.4809489; adversarial_accuracy: 0.4582; avg_ce: 1.2281222; base_ce: 0.97529566; clean_accuracy: 0.6596; min_base_ce: 0.97529566; since_best_base_ce: 0;
FastEstimator-Train: step: 6000; adv_ce: 1.3028104; avg_ce: 1.0527444; base_ce: 0.8026783; steps/sec: 94.44;
FastEstimator-Train: step: 6000; epoch: 6; epoch_time(sec): 10.59;
Eval Progress: 1/100;
Eval Progress: 33/100; steps/sec: 110.63;
Eval Progress: 66/100; steps/sec: 108.65;
Eval Progress: 100/100; steps/sec: 106.96;
FastEstimator-BestModelSaver: Saved model to /var/folders/lx/drkxftt117gblvgsp1p39rlc0000gn/T/tmpv5xrd_mi/adv_model_best_base_ce.h5
FastEstimator-Eval: step: 6000; epoch: 6; adv_ce: 1.4869825; adversarial_accuracy: 0.4452; avg_ce: 1.226828; base_ce: 0.9666735; clean_accuracy: 0.664; min_base_ce: 0.9666735; since_best_base_ce: 0;
FastEstimator-Train: step: 7000; adv_ce: 1.3652477; avg_ce: 1.1188885; base_ce: 0.8725292; steps/sec: 93.72;
FastEstimator-Train: step: 7000; epoch: 7; epoch_time(sec): 10.68;
Eval Progress: 1/100;
Eval Progress: 33/100; steps/sec: 116.1;
Eval Progress: 66/100; steps/sec: 133.15;
Eval Progress: 100/100; steps/sec: 128.46;
FastEstimator-BestModelSaver: Saved model to /var/folders/lx/drkxftt117gblvgsp1p39rlc0000gn/T/tmpv5xrd_mi/adv_model_best_base_ce.h5
FastEstimator-Eval: step: 7000; epoch: 7; adv_ce: 1.4863496; adversarial_accuracy: 0.4602; avg_ce: 1.2245204; base_ce: 0.96269095; clean_accuracy: 0.6634; min_base_ce: 0.96269095; since_best_base_ce: 0;
FastEstimator-Train: step: 8000; adv_ce: 1.32106; avg_ce: 1.0630496; base_ce: 0.80503905; steps/sec: 93.0;
FastEstimator-Train: step: 8000; epoch: 8; epoch_time(sec): 10.75;
Eval Progress: 1/100;
Eval Progress: 33/100; steps/sec: 114.21;
Eval Progress: 66/100; steps/sec: 121.23;
Eval Progress: 100/100; steps/sec: 126.59;
FastEstimator-BestModelSaver: Saved model to /var/folders/lx/drkxftt117gblvgsp1p39rlc0000gn/T/tmpv5xrd_mi/adv_model_best_base_ce.h5
FastEstimator-Eval: step: 8000; epoch: 8; adv_ce: 1.4645156; adversarial_accuracy: 0.4636; avg_ce: 1.1880443; base_ce: 0.91157293; clean_accuracy: 0.6918; min_base_ce: 0.91157293; since_best_base_ce: 0;
FastEstimator-Train: step: 9000; adv_ce: 1.5193198; avg_ce: 1.2024301; base_ce: 0.8855404; steps/sec: 88.37;
FastEstimator-Train: step: 9000; epoch: 9; epoch_time(sec): 11.32;
Eval Progress: 1/100;
Eval Progress: 33/100; steps/sec: 110.76;
Eval Progress: 66/100; steps/sec: 130.51;
Eval Progress: 100/100; steps/sec: 132.61;
FastEstimator-Eval: step: 9000; epoch: 9; adv_ce: 1.4988724; adversarial_accuracy: 0.4574; avg_ce: 1.2090085; base_ce: 0.9191448; clean_accuracy: 0.6834; min_base_ce: 0.91157293; since_best_base_ce: 1;
FastEstimator-Train: step: 10000; adv_ce: 1.451181; avg_ce: 1.1657407; base_ce: 0.8803004; steps/sec: 92.99;
FastEstimator-Train: step: 10000; epoch: 10; epoch_time(sec): 10.75;
Eval Progress: 1/100;
Eval Progress: 33/100; steps/sec: 108.18;
Eval Progress: 66/100; steps/sec: 120.07;
Eval Progress: 100/100; steps/sec: 121.8;
FastEstimator-BestModelSaver: Saved model to /var/folders/lx/drkxftt117gblvgsp1p39rlc0000gn/T/tmpv5xrd_mi/adv_model_best_base_ce.h5
FastEstimator-Eval: step: 10000; epoch: 10; adv_ce: 1.4784583; adversarial_accuracy: 0.4698; avg_ce: 1.1881733; base_ce: 0.89788854; clean_accuracy: 0.696; min_base_ce: 0.89788854; since_best_base_ce: 0;
FastEstimator-Finish: step: 10000; adv_model_lr: 0.001; total_time(sec): 121.55;
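
The `min_base_ce` and `since_best_base_ce` fields in the log above follow a keep-the-minimum rule. The sketch below replays that bookkeeping on the evaluation `base_ce` values from the log. It is an illustration only, not FastEstimator's code, and `track_best` is a hypothetical helper.

```python
def track_best(values):
    """Hypothetical helper: track the running minimum and epochs since it improved."""
    best = float("inf")
    since_best = 0
    history = []
    for epoch, value in enumerate(values, start=1):
        if value < best:
            best, since_best, saved = value, 0, True  # new best: a save would happen here
        else:
            since_best += 1
            saved = False
        history.append((epoch, best, since_best, saved))
    return history

# Evaluation base_ce per epoch, copied from the training log above.
eval_base_ce = [1.2977084, 1.1935132, 1.093761, 1.038236, 0.97529566,
                0.9666735, 0.96269095, 0.91157293, 0.9191448, 0.89788854]

history = track_best(eval_base_ce)
for epoch, best, since_best, saved in history:
    print(epoch, best, since_best, saved)
```

Epoch 9 is the only epoch that does not improve on the running minimum, which matches the missing `BestModelSaver` message for that epoch in the log.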

Model Testing

Let's start by reloading the weights from the best model, since the model may have overfit during training.

In [9]:

model.load_weights(os.path.join(save_dir, "adv_model_best_base_ce.h5"))

In [10]:

estimator.test()

FastEstimator-Test: step: 10000; epoch: 10; adv_ce: 1.4894997; adversarial_accuracy: 0.4588; avg_ce: 1.201717; base_ce: 0.91393447; clean_accuracy: 0.6842;

In spite of our training the network using adversarially crafted images, the adversarial attack is still effective at reducing the accuracy of the network. This does not, however, mean that the efforts were wasted.

Comparison vs Network without Adversarial Training

To see whether training using adversarial hardening was actually useful, we will compare it to a network which is trained without considering any adversarial images. The setup will be similar to before, but we will only use the adversarial images for evaluation purposes and so the second CrossEntropy Op as well as the Average Op can be omitted.

In [11]:

clean_model = fe.build(model_fn=lambda: LeNet(input_shape=(32, 32, 3)), optimizer_fn="adam", model_name="clean_model")
clean_network = fe.Network(ops=[
        Watch(inputs="x"),
        ModelOp(model=clean_model, inputs="x", outputs="y_pred"),
        CrossEntropy(inputs=("y_pred", "y"), outputs="base_ce"),
        FGSM(data="x", loss="base_ce", outputs="x_adverse", epsilon=epsilon, mode="!train"),
        ModelOp(model=clean_model, inputs="x_adverse", outputs="y_pred_adv", mode="!train"),
        UpdateOp(model=clean_model, loss_name="base_ce")
    ])
clean_traces = [
    Accuracy(true_key="y", pred_key="y_pred", output_name="clean_accuracy"),
    Accuracy(true_key="y", pred_key="y_pred_adv", output_name="adversarial_accuracy"),
    BestModelSaver(model=clean_model, save_dir=save_dir, metric="base_ce", save_best_mode="min"),
]
clean_estimator = fe.Estimator(pipeline=pipeline,
                         network=clean_network,
                         epochs=epochs,
                         traces=clean_traces,
                         train_steps_per_epoch=train_steps_per_epoch,
                         eval_steps_per_epoch=eval_steps_per_epoch,
                         log_steps=1000)
clean_estimator.fit()

    ______           __  ______     __  _                 __            
   / ____/___ ______/ /_/ ____/____/ /_(_)___ ___  ____ _/ /_____  _____
  / /_  / __ `/ ___/ __/ __/ / ___/ __/ / __ `__ \/ __ `/ __/ __ \/ ___/
 / __/ / /_/ (__  ) /_/ /___(__  ) /_/ / / / / / / /_/ / /_/ /_/ / /    
/_/    \__,_/____/\__/_____/____/\__/_/_/ /_/ /_/\__,_/\__/\____/_/     
                                                                        

FastEstimator-Start: step: 1; logging_interval: 1000; num_device: 1;
FastEstimator-Train: step: 1; base_ce: 2.3047411;
FastEstimator-Train: step: 1000; base_ce: 1.1804659; steps/sec: 150.06;
FastEstimator-Train: step: 1000; epoch: 1; epoch_time(sec): 7.17;
Eval Progress: 1/100;
Eval Progress: 33/100; steps/sec: 104.44;
Eval Progress: 66/100; steps/sec: 115.12;
Eval Progress: 100/100; steps/sec: 110.92;
FastEstimator-BestModelSaver: Saved model to /var/folders/lx/drkxftt117gblvgsp1p39rlc0000gn/T/tmpv5xrd_mi/clean_model_best_base_ce.h5
FastEstimator-Eval: step: 1000; epoch: 1; adversarial_accuracy: 0.3022; base_ce: 1.1240247; clean_accuracy: 0.6064; min_base_ce: 1.1240247; since_best_base_ce: 0;
FastEstimator-Train: step: 2000; base_ce: 0.8876241; steps/sec: 137.26;
FastEstimator-Train: step: 2000; epoch: 2; epoch_time(sec): 7.3;
Eval Progress: 1/100;
Eval Progress: 33/100; steps/sec: 106.95;
Eval Progress: 66/100; steps/sec: 113.4;
Eval Progress: 100/100; steps/sec: 127.38;
FastEstimator-BestModelSaver: Saved model to /var/folders/lx/drkxftt117gblvgsp1p39rlc0000gn/T/tmpv5xrd_mi/clean_model_best_base_ce.h5
FastEstimator-Eval: step: 2000; epoch: 2; adversarial_accuracy: 0.3038; base_ce: 0.98229593; clean_accuracy: 0.6534; min_base_ce: 0.98229593; since_best_base_ce: 0;
FastEstimator-Train: step: 3000; base_ce: 0.8849621; steps/sec: 132.52;
FastEstimator-Train: step: 3000; epoch: 3; epoch_time(sec): 7.55;
Eval Progress: 1/100;
Eval Progress: 33/100; steps/sec: 103.08;
Eval Progress: 66/100; steps/sec: 130.25;
Eval Progress: 100/100; steps/sec: 125.19;
FastEstimator-BestModelSaver: Saved model to /var/folders/lx/drkxftt117gblvgsp1p39rlc0000gn/T/tmpv5xrd_mi/clean_model_best_base_ce.h5
FastEstimator-Eval: step: 3000; epoch: 3; adversarial_accuracy: 0.2696; base_ce: 0.9033165; clean_accuracy: 0.6916; min_base_ce: 0.9033165; since_best_base_ce: 0;
FastEstimator-Train: step: 4000; base_ce: 0.95111996; steps/sec: 140.33;
FastEstimator-Train: step: 4000; epoch: 4; epoch_time(sec): 7.12;
Eval Progress: 1/100;
Eval Progress: 33/100; steps/sec: 109.92;
Eval Progress: 66/100; steps/sec: 122.95;
Eval Progress: 100/100; steps/sec: 128.74;
FastEstimator-BestModelSaver: Saved model to /var/folders/lx/drkxftt117gblvgsp1p39rlc0000gn/T/tmpv5xrd_mi/clean_model_best_base_ce.h5
FastEstimator-Eval: step: 4000; epoch: 4; adversarial_accuracy: 0.2716; base_ce: 0.8884863; clean_accuracy: 0.691; min_base_ce: 0.8884863; since_best_base_ce: 0;
FastEstimator-Train: step: 5000; base_ce: 0.8336683; steps/sec: 140.4;
FastEstimator-Train: step: 5000; epoch: 5; epoch_time(sec): 7.12;
Eval Progress: 1/100;
Eval Progress: 33/100; steps/sec: 101.26;
Eval Progress: 66/100; steps/sec: 129.9;
Eval Progress: 100/100; steps/sec: 118.06;
FastEstimator-BestModelSaver: Saved model to /var/folders/lx/drkxftt117gblvgsp1p39rlc0000gn/T/tmpv5xrd_mi/clean_model_best_base_ce.h5
FastEstimator-Eval: step: 5000; epoch: 5; adversarial_accuracy: 0.2754; base_ce: 0.82907814; clean_accuracy: 0.7166; min_base_ce: 0.82907814; since_best_base_ce: 0;
FastEstimator-Train: step: 6000; base_ce: 0.5674667; steps/sec: 139.48;
FastEstimator-Train: step: 6000; epoch: 6; epoch_time(sec): 7.17;
Eval Progress: 1/100;
Eval Progress: 33/100; steps/sec: 106.19;
Eval Progress: 66/100; steps/sec: 132.69;
Eval Progress: 100/100; steps/sec: 127.4;
FastEstimator-BestModelSaver: Saved model to /var/folders/lx/drkxftt117gblvgsp1p39rlc0000gn/T/tmpv5xrd_mi/clean_model_best_base_ce.h5
FastEstimator-Eval: step: 6000; epoch: 6; adversarial_accuracy: 0.2522; base_ce: 0.81661767; clean_accuracy: 0.72; min_base_ce: 0.81661767; since_best_base_ce: 0;
FastEstimator-Train: step: 7000; base_ce: 0.78401643; steps/sec: 141.15;
FastEstimator-Train: step: 7000; epoch: 7; epoch_time(sec): 7.09;
Eval Progress: 1/100;
Eval Progress: 33/100; steps/sec: 110.63;
Eval Progress: 66/100; steps/sec: 134.3;
Eval Progress: 100/100; steps/sec: 132.15;
FastEstimator-Eval: step: 7000; epoch: 7; adversarial_accuracy: 0.2384; base_ce: 0.8509378; clean_accuracy: 0.7172; min_base_ce: 0.81661767; since_best_base_ce: 1;
FastEstimator-Train: step: 8000; base_ce: 0.664769; steps/sec: 142.86;
FastEstimator-Train: step: 8000; epoch: 8; epoch_time(sec): 7.0;
Eval Progress: 1/100;
Eval Progress: 33/100; steps/sec: 105.74;
Eval Progress: 66/100; steps/sec: 112.19;
Eval Progress: 100/100; steps/sec: 116.31;
FastEstimator-Eval: step: 8000; epoch: 8; adversarial_accuracy: 0.2188; base_ce: 0.96747774; clean_accuracy: 0.6962; min_base_ce: 0.81661767; since_best_base_ce: 2;
FastEstimator-Train: step: 9000; base_ce: 0.40541548; steps/sec: 134.91;
FastEstimator-Train: step: 9000; epoch: 9; epoch_time(sec): 7.41;
Eval Progress: 1/100;
Eval Progress: 33/100; steps/sec: 110.06;
Eval Progress: 66/100; steps/sec: 120.28;
Eval Progress: 100/100; steps/sec: 131.44;
FastEstimator-Eval: step: 9000; epoch: 9; adversarial_accuracy: 0.2194; base_ce: 0.87976956; clean_accuracy: 0.7232; min_base_ce: 0.81661767; since_best_base_ce: 3;
FastEstimator-Train: step: 10000; base_ce: 0.74041855; steps/sec: 135.69;
FastEstimator-Train: step: 10000; epoch: 10; epoch_time(sec): 7.38;
Eval Progress: 1/100;
Eval Progress: 33/100; steps/sec: 96.99;
Eval Progress: 66/100; steps/sec: 124.55;
Eval Progress: 100/100; steps/sec: 118.18;
FastEstimator-Eval: step: 10000; epoch: 10; adversarial_accuracy: 0.2048; base_ce: 0.91465575; clean_accuracy: 0.7202; min_base_ce: 0.81661767; since_best_base_ce: 4;
FastEstimator-Finish: step: 10000; clean_model_lr: 0.001; total_time(sec): 86.08;

As before, we will reload the best weights and then test the model.

In [12]:

clean_model.load_weights(os.path.join(save_dir, "clean_model_best_base_ce.h5"))

In [13]:

print("Normal Network:")
normal_results = clean_estimator.test("normal")
print("The whitebox FGSM attack reduced accuracy by {:.2f}".format(list(normal_results.history['test']['clean_accuracy'].values())[0] - list(normal_results.history['test']['adversarial_accuracy'].values())[0]))
print("-----------")
print("Adversarially Trained Network:")
adversarial_results = estimator.test("adversarial")
print("The whitebox FGSM attack reduced accuracy by {:.2f}".format(list(adversarial_results.history['test']['clean_accuracy'].values())[0] - list(adversarial_results.history['test']['adversarial_accuracy'].values())[0]))
print("-----------")

Normal Network:
FastEstimator-Test: step: 10000; epoch: 10; adversarial_accuracy: 0.2556; base_ce: 0.8501647; clean_accuracy: 0.7136;
The whitebox FGSM attack reduced accuracy by 0.46
-----------
Adversarially Trained Network:
FastEstimator-Test: step: 10000; epoch: 10; adv_ce: 1.4894997; adversarial_accuracy: 0.4588; avg_ce: 1.201717; base_ce: 0.91393447; clean_accuracy: 0.6842;
The whitebox FGSM attack reduced accuracy by 0.23
-----------

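As we can see, the normal network is significantly less robust against adversarial attacks than the one which was trained to resist them: the white-box FGSM attack cut its test accuracy by 0.46 (from 0.7136 to 0.2556), versus 0.23 (from 0.6842 to 0.4588) for the adversarially trained network. The downsides are that the adversarially trained network gives up some clean accuracy, requires more epochs of training to converge, and trains more slowly since each step requires a second forward pass on the adversarial images (in this run, roughly 10.7 seconds per epoch versus 7.2). It is also interesting to note that as the regular model was training, its adversarial accuracy on the evaluation set got progressively worse, falling from 0.3022 after epoch 1 to 0.2048 after epoch 10. This may be an indication that the network is developing very brittle decision boundaries.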
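
The comparison can be reproduced directly from the test metrics printed above. The short sketch below hard-codes those reported values (it does not re-run the models) and computes the accuracy drop for each network, along with the clean-accuracy cost of adversarial training. The `accuracy_drop` helper is a hypothetical name used only for illustration.

```python
# Test metrics copied from the output above.
normal = {"clean_accuracy": 0.7136, "adversarial_accuracy": 0.2556}
hardened = {"clean_accuracy": 0.6842, "adversarial_accuracy": 0.4588}

def accuracy_drop(results):
    """Hypothetical helper: accuracy lost when the inputs are attacked."""
    return results["clean_accuracy"] - results["adversarial_accuracy"]

normal_drop = accuracy_drop(normal)
hardened_drop = accuracy_drop(hardened)

# Clean accuracy given up by the adversarially trained network.
clean_cost = normal["clean_accuracy"] - hardened["clean_accuracy"]

print(f"normal drop: {normal_drop:.2f}")
print(f"hardened drop: {hardened_drop:.2f}")
print(f"clean accuracy cost: {clean_cost:.2f}")
```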

Visualizing Adversarial Samples

Let's visualize some images generated by these adversarial attacks to make sure that everything is working as we would expect. The first step is to get some sample data from the pipeline:

In [14]:

class_dictionary = {
    0: "airplane", 1: "car", 2: "bird", 3: "cat", 4: "deer", 5: "dog", 6: "frog", 7: "horse", 8: "ship", 9: "truck"
}
batch = pipeline.get_results(mode="test")

Now let's run our sample data through the network and then visualize the results.

In [15]:

batch = clean_network.transform(batch, mode="test")

In [16]:

n_samples = 10
y = np.array([class_dictionary[clazz.item()] for clazz in to_number(batch["y"][0:n_samples])])
y_pred = np.array([class_dictionary[clazz.item()] for clazz in to_number(argmax(batch["y_pred"], axis=1)[0:n_samples])])
y_adv = np.array([class_dictionary[clazz.item()] for clazz in to_number(argmax(batch["y_pred_adv"], axis=1)[0:n_samples])])

GridDisplay([BatchDisplay(image=batch["x"][0:n_samples], title="x"),
             BatchDisplay(image=batch["x_adverse"][0:n_samples], title="x_adv"),
             BatchDisplay(text=y, title="y"),
             BatchDisplay(text=y_pred, title="y_pred"),
             BatchDisplay(text=y_adv, title="y_adv")
            ]).show()

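[Figure: grid of 10 test samples showing the original images (x), the adversarial images (x_adv), the true labels (y), the predictions on the original images (y_pred), and the predictions on the adversarial images (y_adv).]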

As you can see, the adversarial images appear very similar to the unmodified images, and yet they are often able to modify the class predictions of the network. Note that if a network's prediction is already wrong, the attack is unlikely to change the incorrect prediction, but rather to increase the model's confidence in its incorrect prediction.
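
A rough calculation suggests why the adversarial images look so similar to the originals. In this notebook the attack is applied to the normalized images, so a step of `epsilon` in normalized space corresponds to `epsilon * std` in the [0, 1] pixel range. The sketch below converts that to 8-bit intensity levels, assuming a maximum pixel value of 255 and no clipping of the adversarial image (both are assumptions for illustration).

```python
import numpy as np

epsilon = 0.04                            # attack strength used in this notebook
std = np.array([0.2471, 0.2435, 0.2616])  # per-channel std from the pipeline
max_pixel_value = 255.0                   # assumed 8-bit pixel scale

# Per-channel shift of each pixel, expressed in 8-bit intensity levels.
pixel_shift = epsilon * std * max_pixel_value
print(np.round(pixel_shift, 2))
```

Under these assumptions, each pixel moves by only about two to three intensity levels out of 255, which is hard to notice by eye.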