Restarts and Ensembles: Evaluating Robustness with Multiple PGD Runs

2.3. Restarts and Ensembles: Evaluating Robustness with Multiple PGD Runs

This tutorial shows how to:

Run PGD with multiple random initializations (restarts).
Aggregate multiple fixed-ε attack runs into ensemble evaluation metrics.

We will:

Load the CIFAR-10 test set and a pretrained model.
Run PGD multiple times with random_start=True.
Compute robust accuracy (RA) and attack success rate (ASR) across runs using ensemble metrics.
Use FixedEpsilonEnsemble to select, for each sample, the strongest adversarial example across runs.

%%capture --no-stdout
try:
    import secmlt
except ImportError:
    %pip install secml-torch[foolbox,adv_lib]

# Imports
import torch
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader, Subset

from secmlt.metrics.classification import (
    Accuracy,
    AttackSuccessRate,
    AccuracyEnsemble,
    EnsembleSuccessRate,
)
from secmlt.adv.backends import Backends
from secmlt.models.pytorch.base_pytorch_nn import BasePyTorchClassifier
from secmlt.adv.evasion.perturbation_models import LpPerturbationModels
from secmlt.adv.evasion.pgd import PGD
from secmlt.adv.evasion.aggregators.ensemble import FixedEpsilonEnsemble


device = "cuda" if torch.cuda.is_available() else "cpu"
dataset_path = "data/datasets/"  # relative to this notebook's folder
print(f"Using device: {device}")

Using device: cpu

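2.3.1. Data and Pretrained Model (CIFAR-10)

We load a small CIFAR-10 test subset and a pretrained ResNet-20 from the chenyaofo/pytorch-cifar-models hub repository, then wrap it with SecML‑Torch’s BasePyTorchClassifier. This is a standard-trained (non-robust) model, so its robust accuracy under an L∞ attack is expected to be low.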

%%capture --no-stdout

# Load CIFAR-10 test subset
transform = transforms.Compose([transforms.ToTensor()])
test_dataset = torchvision.datasets.CIFAR10(
    root=dataset_path, train=False, download=True, transform=transform
)
num_samples = 20
batch_size = num_samples // 2
test_subset = Subset(test_dataset, list(range(num_samples)))
test_loader = DataLoader(test_subset, batch_size=batch_size, shuffle=False)
print(f"Loaded {len(test_subset)} samples from CIFAR-10 test set")

# Load a pretrained CIFAR-10 ResNet-20 from torch.hub
net = torch.hub.load(
    "chenyaofo/pytorch-cifar-models",
    "cifar10_resnet20",
    pretrained=True,
    trust_repo=True,
)
net = net.to(device)
net.eval()

# Wrap the model with SecML-Torch's BasePyTorchClassifier.
# The data loader yields images in [0, 1]; normalization is passed as preprocessing.
model = BasePyTorchClassifier(
    net,
    preprocessing=transforms.Normalize(
        (0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)
    ),
)

# Baseline accuracy on clean data
clean_acc = Accuracy()(model, test_loader)
print(f"Clean accuracy: {clean_acc.item():.4f} ({clean_acc.item() * 100:.2f}%)")

Loaded 20 samples from CIFAR-10 test set
Clean accuracy: 0.9500 (95.00%)

2.3.2. PGD with Random Initialization

Multiple random starts can help PGD escape poor local optima of the non-convex loss landscape, and they often increase attack success. Here we configure a fixed-ε L∞ PGD attack with random_start=True, so that each run begins from a different random perturbation, and we run it several times.
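To make the role of the random start concrete, here is a minimal plain-PyTorch sketch of untargeted L∞ PGD. It is an illustration under stated assumptions, not the SecML-Torch implementation: the function name pgd_linf_sketch, the uniform initialization inside the ε-ball, and the toy linear model are all hypothetical choices made for this example.

```python
import torch
import torch.nn.functional as F


def pgd_linf_sketch(net, x, y, epsilon, step_size, num_steps, random_start=True):
    """Untargeted L-inf PGD for inputs in [0, 1] (illustrative sketch only)."""
    delta = torch.zeros_like(x)
    if random_start:
        # Assumption: draw the starting perturbation uniformly in the epsilon-ball.
        delta.uniform_(-epsilon, epsilon)
    # Keep the starting point inside the valid pixel range.
    delta = (x + delta).clamp(0.0, 1.0) - x
    for _ in range(num_steps):
        delta.requires_grad_(True)
        loss = F.cross_entropy(net(x + delta), y)
        (grad,) = torch.autograd.grad(loss, delta)
        # Signed gradient ascent step, then projection onto the epsilon-ball.
        delta = (delta.detach() + step_size * grad.sign()).clamp(-epsilon, epsilon)
        # Projection onto the valid pixel range.
        delta = (x + delta).clamp(0.0, 1.0) - x
    return (x + delta).detach()


# Toy usage with a hypothetical stand-in model (not the CIFAR-10 network).
torch.manual_seed(0)
toy_net = torch.nn.Linear(8, 3)
x_toy = torch.rand(4, 8)
y_toy = torch.tensor([0, 1, 2, 0])
x_adv_toy = pgd_linf_sketch(
    toy_net, x_toy, y_toy, epsilon=4 / 255, step_size=1 / 255, num_steps=20
)
```

Calling the function repeatedly without resetting the seed gives a different starting point each time, which is what makes restarts explore different regions of the ε-ball.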

# PGD configuration (L∞)
epsilon = 4 / 255     # Max L∞ perturbation
num_steps = 20        # PGD iterations
step_size = 1 / 255   # Step size per iteration
perturbation_model = LpPerturbationModels.LINF

print("Attack configuration:")
print(f"  - Epsilon: {epsilon:.4f} ({epsilon * 255:.0f}/255)")
print(f"  - Steps:   {num_steps}")
print(f"  - Step sz: {step_size:.4f} ({step_size * 255:.0f}/255)")

pgd = PGD(
    perturbation_model=perturbation_model,
    epsilon=epsilon,
    num_steps=num_steps,
    step_size=step_size,
    random_start=True,
    backend=Backends.NATIVE,
)
print("PGD (native) ready with random_start=True")

Attack configuration:
  - Epsilon: 0.0157 (4/255)
  - Steps:   20
  - Step sz: 0.0039 (1/255)
PGD (native) ready with random_start=True

# Single PGD run
adv_loader_single = pgd(model, test_loader)
acc_single = Accuracy()(model, adv_loader_single)
asr_single = AttackSuccessRate()(model, adv_loader_single)
print("=== Single-run PGD ===")
print(f"RA (single):  {acc_single.item():.4f} ({acc_single.item() * 100:.2f}%)")
print(f"ASR (single): {asr_single.item():.4f} ({asr_single.item() * 100:.2f}%)")

=== Single-run PGD ===
RA (single):  0.0000 (0.00%)
ASR (single): 1.0000 (100.00%)

2.3.2.1. Multiple Restarts: Evaluation Across Runs

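We now perform several runs (restarts) and compute ensemble metrics across them:

AccuracyEnsemble reports robust accuracy across runs: a sample counts as robust only if it is classified correctly in every run.
EnsembleSuccessRate reports attack success rate across runs: a sample counts as a success if at least one run fools the model.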
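As a rough illustration of what aggregating across runs means, the sketch below works on made-up per-run outcomes in plain Python. It assumes the worst-case convention, where one successful run is enough to count a sample as broken. The names and the data are hypothetical and are not part of secmlt.

```python
# Hypothetical outcomes: correct[r][i] is True when the model still classifies
# sample i correctly after attack run r (3 runs, 5 samples).
correct = [
    [True, True, False, True, False],
    [True, False, False, True, True],
    [True, True, False, False, True],
]

n_toy = len(correct[0])
# Worst case over runs: a sample is robust only if it survives every run.
robust = [all(run[i] for run in correct) for i in range(n_toy)]
# An attack succeeds on a sample if at least one run fools the model.
fooled = [any(not run[i] for run in correct) for i in range(n_toy)]

per_run_ra = [sum(run) / n_toy for run in correct]
ra_across_runs = sum(robust) / n_toy
asr_across_runs = sum(fooled) / n_toy
print(per_run_ra, ra_across_runs, asr_across_runs)
# prints: [0.6, 0.6, 0.6] 0.2 0.8
```

In this toy data each run alone leaves 60% accuracy, while the aggregate over runs leaves only 20%, because different runs break different samples.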

num_runs = 3
adv_loaders = []
for i in range(num_runs):
    print(f"Running PGD restart {i + 1}/{num_runs}...")
    adv_loader = pgd(model, test_loader)
    adv_loaders.append(adv_loader)
    # Use a separate name so the earlier single-run result is not overwritten
    acc_run = Accuracy()(model, adv_loader)
    print(f"Single run: accuracy {acc_run.item():.4f} ({acc_run.item() * 100:.2f}%)")

ra_ensemble = AccuracyEnsemble()(model, adv_loaders)
asr_ensemble = EnsembleSuccessRate()(model, adv_loaders)

print("=== Ensemble over multiple PGD runs ===")
print(f"RA (ensemble across runs):  {ra_ensemble.item():.4f} ({ra_ensemble.item() * 100:.2f}%)")
print(f"ASR (ensemble across runs): {asr_ensemble.item():.4f} ({asr_ensemble.item() * 100:.2f}%)")

Running PGD restart 1/3...
Single run: accuracy 0.0000 (0.00%)
Running PGD restart 2/3...
Single run: accuracy 0.0000 (0.00%)
Running PGD restart 3/3...
Single run: accuracy 0.0000 (0.00%)
=== Ensemble over multiple PGD runs ===
RA (ensemble across runs):  0.0000 (0.00%)
ASR (ensemble across runs): 1.0000 (100.00%)

2.3.3. Fixed-ε Ensembling: Select Strongest Adversarial per Sample

For fixed-ε attacks like PGD, we can build a per-sample ensemble that picks, for each sample, the adversarial example with the highest loss among the runs (maximize=True). The result is a new dataloader holding one selected adversarial example per sample.
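The selection rule described above can be sketched on raw tensors as follows. This is a simplified illustration, not the FixedEpsilonEnsemble implementation: the helper select_strongest_sketch, the toy model, and the random "adversarial" candidates are hypothetical, and the per-sample cross-entropy loss is an assumption chosen to match the loss used in this tutorial.

```python
import torch
import torch.nn.functional as F


def select_strongest_sketch(net, runs, y):
    """Pick, per sample, the candidate with the highest cross-entropy loss.

    runs: list of tensors, one per attack run, each of shape (batch, ...).
    Returns the selected candidates and the index of the chosen run per sample.
    """
    candidates = torch.stack(runs)  # (num_runs, batch, ...)
    with torch.no_grad():
        losses = torch.stack(
            [F.cross_entropy(net(x_adv), y, reduction="none") for x_adv in runs]
        )  # (num_runs, batch)
    best_run = losses.argmax(dim=0)  # strongest run for each sample
    batch_idx = torch.arange(candidates.shape[1])
    return candidates[best_run, batch_idx], best_run


# Toy usage with a hypothetical stand-in model and random candidates.
torch.manual_seed(0)
toy_net = torch.nn.Linear(8, 3)
y_toy = torch.tensor([0, 1, 2, 0])
toy_runs = [torch.rand(4, 8) for _ in range(3)]
best_toy, best_run_toy = select_strongest_sketch(toy_net, toy_runs, y_toy)
```

Because the choice is made independently for each sample, the selected set is at least as strong, in terms of loss, as any single run.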

criterion = FixedEpsilonEnsemble(loss_fn=torch.nn.CrossEntropyLoss(), maximize=True, y_target=None)
best_advs_loader = criterion(model, test_loader, adv_loaders)

ra_best = Accuracy()(model, best_advs_loader)
asr_best = AttackSuccessRate()(model, best_advs_loader)

print("=== Fixed-ε ensemble selection (per-sample strongest) ===")
print(f"RA (best-advs):  {ra_best.item():.4f} ({ra_best.item() * 100:.2f}%)")
print(f"ASR (best-advs): {asr_best.item():.4f} ({asr_best.item() * 100:.2f}%)")

=== Fixed-ε ensemble selection (per-sample strongest) ===
RA (best-advs):  0.0000 (0.00%)
ASR (best-advs): 1.0000 (100.00%)