Generate realistic synthetic datasets of sensor observations with controlled adversarial perturbations using simulation and GAN-based augmentation to evaluate detection and inference pipelines before hardware deployment.
Monte Carlo · Simulation · GAN · Feasibility
Source: Roadmap / Ideate
Chapter 1 – AOI-GBE Foundations & Data Collection
Why model first
Provides a diverse, controllable set of perturbations to benchmark detection and inference algorithms, reducing costly field data collection and enabling early validation of robustness metrics.
What Is Modelled
Multi‑modal sensor observation streams (camera, LiDAR, IMU, radar) from autonomous agents, together with a parametric adversarial perturbation model that injects realistic noise, spoofing, and semantic manipulation into the raw sensor data.
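The sketch below illustrates one way such a parametric perturbation model could look in Python, assuming one NumPy array per sensor frame; the names PerturbationParams and apply_perturbation are illustrative, not part of an existing codebase, and the two parameters mirror the table under "Key Parameters & What They Affect".

```python
# Minimal sketch of a parametric perturbation model: additive noise plus
# probabilistic spoofing. Names and the spoofing signal are illustrative.
from dataclasses import dataclass

import numpy as np


@dataclass
class PerturbationParams:
    perturbation_magnitude: float  # 0.0-0.5, normalized sensor value
    spoofing_probability: float    # 0.0-1.0, per-frame replacement chance


def apply_perturbation(frame: np.ndarray,
                       params: PerturbationParams,
                       rng: np.random.Generator) -> np.ndarray:
    """Inject additive Gaussian noise and, with some probability, spoof the frame."""
    # Additive Gaussian noise scaled by the normalized magnitude.
    noisy = frame + rng.normal(0.0, params.perturbation_magnitude, frame.shape)
    # Replace the whole frame with a fabricated signal (a crude stand-in here).
    if rng.random() < params.spoofing_probability:
        noisy = rng.uniform(frame.min(), frame.max(), frame.shape)
    return noisy.astype(frame.dtype)
```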
Objectives
Create a large‑scale, labeled dataset of clean and perturbed observations that covers the full spectrum of attack vectors identified in Chapter 1 (noise, spoofing, semantic, and communication sabotage).
Train a conditional GAN (CC‑GAN) that can reconstruct missing or corrupted sensor streams and generate realistic adversarial samples for downstream detection and inference testing.
Develop a Monte Carlo sampling engine that systematically explores perturbation parameter space (magnitude, frequency, modality mix) and produces a balanced distribution of attack scenarios.
Provide reproducible evaluation scripts that benchmark detection accuracy, policy robustness, and inference latency on the synthetic dataset.
Success Criteria
Dataset size ≥ 200 k observation tuples with ≥ 30 % labeled perturbations, covering at least 5 distinct attack types.
CC‑GAN reconstruction MAE < 5 % on held‑out perturbed data and a Fréchet Inception Distance (FID) < 10 for image modalities.
Monte Carlo engine produces a perturbation distribution that matches the empirical distribution of real logs (KL‑div < 0.1).
Detection pipeline achieves ≥ 90 % F1 on synthetic attacks and ≥ 80 % F1 when transferred to a held‑out real‑world test set.
Output Form
A versioned dataset in Parquet/CSV with accompanying metadata (sensor specs, perturbation parameters, timestamps), trained CC‑GAN checkpoints, a Monte Carlo configuration file, and a Python package containing evaluation utilities and a reproducible Docker image.
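As a concrete illustration of this output form, here is a minimal sketch of writing one Parquet shard with per‑row perturbation metadata; the column names, versioned path, and the convention of storing raw sensor arrays in side files are assumptions, not a fixed schema.

```python
# Illustrative sketch: one Parquet shard per batch of observation tuples,
# with perturbation parameters stored alongside each row.
# Requires pyarrow (or fastparquet) for pandas' to_parquet().
import pandas as pd

records = [
    {
        "timestamp_ns": 1_700_000_000_000_000_000,
        "sensor": "lidar_front",
        "is_perturbed": True,
        "attack_type": "spoofing",
        "perturbation_magnitude": 0.12,
        "spoofing_probability": 0.3,
        "observation_path": "obs/000001.npz",  # raw arrays kept in side files
    },
]

df = pd.DataFrame.from_records(records)
# Assumes the versioned directory already exists.
df.to_parquet("dataset/v0.1.0/observations.parquet", index=False)
```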
Key Parameters & What They Affect
| Parameter | Range / Units | Affects | Notes |
| --- | --- | --- | --- |
| perturbation_magnitude | 0.0–0.5 (normalized sensor value) | quality, reliability | Controls the amplitude of additive Gaussian or Poisson noise; higher values increase attack severity. |
| spoofing_probability | 0.0–1.0 | security, cost | Probability that a given sensor frame is replaced with a fabricated signal. |
Data Sources
Environment maps (3D meshes, semantic labels) for simulation.
Natural Sources (from the project)
Test rig runs described in Chapter 1 – AOI‑GBE Foundations & Data Collection.
Supplier data from UAV manufacturers (flight logs, telemetry).
Prior simulation outputs from the AOI‑GBE prototype (Section 2.2).
Acquired Sources
KITTI, nuScenes, and Waymo Open Dataset for baseline clean imagery and point clouds.
OpenStreetMap and OpenSceneGraph for realistic urban terrain.
Publicly available adversarial benchmark datasets (e.g., D‑REX, XSTest) for semantic attack templates.
Synthesised Sources
Physics‑based simulation in AirSim or CARLA to generate clean and perturbed sensor streams.
GAN‑generated perturbations seeded by a DOE‑style Latin‑Hypercube of attack parameters (a sampling sketch follows this list).
Synthetic point‑cloud augmentations using Open3D and PyTorch3D.
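A minimal sketch of the DOE‑style Latin‑Hypercube sampling, assuming SciPy's qmc module and the parameter bounds from the table above; the output CSV is the scenario file the Monte Carlo engine (guidance step below) would consume, and the sample count is an arbitrary placeholder.

```python
# Latin-Hypercube sampling of attack parameters, written out as a CSV of
# scenarios. Bounds follow the parameter table; file names are assumptions.
import pandas as pd
from scipy.stats import qmc

BOUNDS = {  # (low, high) per parameter
    "perturbation_magnitude": (0.0, 0.5),
    "spoofing_probability": (0.0, 1.0),
}

sampler = qmc.LatinHypercube(d=len(BOUNDS), seed=42)
unit_samples = sampler.random(n=1000)          # points in [0, 1)^d
lows, highs = zip(*BOUNDS.values())
scaled = qmc.scale(unit_samples, lows, highs)  # map to the real ranges

scenarios = pd.DataFrame(scaled, columns=list(BOUNDS))
scenarios.to_csv("attack_scenarios.csv", index=False)
```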
Engineer / Scientist Guidance
Set up a reproducible AirSim/CARLA environment with the target vehicle and sensor suite; export clean telemetry logs.
Define a perturbation taxonomy (noise, spoofing, semantic) and encode each as a parameterized function.
Implement a Monte Carlo engine in Python that samples perturbation parameters across the defined ranges and writes a CSV of attack scenarios; the Latin‑Hypercube sketch under Synthesised Sources shows one way to generate that CSV.
Pre‑process clean logs to create paired datasets (clean, perturbed) for GAN training; normalize sensor data and align timestamps.
Configure a CC‑GAN architecture in PyTorch: a conditional generator with a GRU encoder for temporal context and a multi‑head discriminator for each modality (a minimal sketch follows this list).
Use Optuna or Ray Tune to perform hyperparameter search over GAN settings (latent_dim, learning rate, batch size) and over perturbation generation strategies.
Train the GAN on a GPU cluster; monitor reconstruction loss and FID; save checkpoints every 10 k iterations.
Generate synthetic perturbations by sampling the trained generator with random latent vectors conditioned on clean observations.
Validate synthetic data by comparing statistical distributions (mean, variance, KL‑divergence) to real logs; perform detection model benchmarking.
Package the dataset, GAN checkpoints, and evaluation scripts into a Docker image; publish to a private registry for downstream teams.
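A minimal PyTorch sketch of the CC‑GAN architecture from the guidance above: a GRU encodes the clean temporal context, the generator decodes a perturbed frame from (context, latent), and the discriminator has one head per modality. Layer sizes, dimensions, and class names are placeholders rather than tuned or prescribed values.

```python
# Skeleton of the conditional generator and multi-head discriminator.
# All dimensions are illustrative defaults, not tuned hyperparameters.
import torch
import torch.nn as nn


class ConditionalGenerator(nn.Module):
    def __init__(self, obs_dim: int, latent_dim: int = 256, hidden: int = 512):
        super().__init__()
        self.encoder = nn.GRU(obs_dim, hidden, batch_first=True)
        self.decoder = nn.Sequential(
            nn.Linear(hidden + latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, obs_dim),
        )

    def forward(self, clean_seq: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
        # clean_seq: (batch, time, obs_dim); z: (batch, latent_dim)
        _, h = self.encoder(clean_seq)           # h: (1, batch, hidden)
        context = h.squeeze(0)
        return self.decoder(torch.cat([context, z], dim=-1))


class MultiHeadDiscriminator(nn.Module):
    def __init__(self, modality_dims: dict[str, int], hidden: int = 256):
        super().__init__()
        self.heads = nn.ModuleDict({
            name: nn.Sequential(
                nn.Linear(dim, hidden), nn.LeakyReLU(0.2),
                nn.Linear(hidden, 1),            # real/fake logit per modality
            )
            for name, dim in modality_dims.items()
        })

    def forward(self, obs: dict[str, torch.Tensor]) -> dict[str, torch.Tensor]:
        return {name: head(obs[name]) for name, head in self.heads.items()}
```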
Recommended Tools
AirSim (C++/Python) or CARLA (Python) for simulation
Gazebo + ROS2 as an alternative physics engine
PyTorch 2.0 for GAN implementation
TensorFlow 2.12 for alternative training
Optuna 3.x or Ray Tune 2.x for hyperparameter optimization
Weights & Biases for experiment tracking
NumPy, Pandas, OpenCV, Open3D for data processing
Docker, NVIDIA Container Toolkit for reproducibility
Apache Parquet for dataset storage
Scikit‑learn for classification metrics (F1); SciPy for KL‑divergence; FID via torchmetrics or a custom implementation
pytest for unit tests
Validation & Verification
The dataset will be validated by (1) statistical comparison of clean vs. perturbed distributions against real logs (KL‑div < 0.1), (2) reconstruction error of the CC‑GAN on a held‑out test set (MAE < 5 %), (3) detection pipeline performance on synthetic attacks (F1 ≥ 90 %) and on a separate real‑world test set (F1 ≥ 80 %), and (4) inference latency measured on target edge hardware (≤ 50 ms).
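A sketch of the statistical comparison in step (1), assuming SciPy: both streams are histogrammed on a shared support before computing KL(real ‖ synthetic). The 0.1 threshold comes from the success criteria above; the binning choice is an assumption.

```python
# Empirical KL divergence between real and synthetic sensor value streams.
import numpy as np
from scipy.stats import entropy


def kl_divergence(real: np.ndarray, synthetic: np.ndarray, bins: int = 100) -> float:
    lo = min(real.min(), synthetic.min())
    hi = max(real.max(), synthetic.max())
    p, _ = np.histogram(real, bins=bins, range=(lo, hi), density=True)
    q, _ = np.histogram(synthetic, bins=bins, range=(lo, hi), density=True)
    eps = 1e-12  # avoid division by zero in empty bins
    return float(entropy(p + eps, q + eps))  # entropy(p, q) = KL(p || q)

# Validation gate from the success criteria:
# assert kl_divergence(real_log_values, synthetic_values) < 0.1
```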
Expected Impact
Quality
Provides high‑fidelity, attack‑rich data that improves detection and inference robustness by exposing models to a broader spectrum of perturbations.
Timescale
Reduces field data collection from 6–12 months to 2–3 months by leveraging simulation and GAN augmentation.
Cost
Cuts hardware procurement and test‑bed maintenance costs by ~40 % through virtual experimentation.
Risk Retired
Mitigates deployment risk by enabling early validation of detection pipelines and policy resilience against unseen adversarial scenarios.
Software Tool Development Prompts
Drop these into a coding assistant to scaffold the supporting software for this modelling task.
Create a Python script that sets up an Optuna study to tune a conditional GAN for multimodal sensor data. The study should explore latent_dim ∈ [128, 512], learning_rate ∈ [1e-4, 1e-3], batch_size ∈ [32, 128], and generator depth ∈ [2, 4]. For each trial, train for 10 k steps on a GPU, evaluate reconstruction MAE on a validation set, and report the best hyperparameters. Include code to log metrics to Weights & Biases and to save the best model checkpoint.
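For orientation, a minimal sketch of the kind of Optuna scaffold this prompt should produce; train_and_evaluate is a stub standing in for the 10 k‑step GAN training run, and only the search space mirrors the prompt exactly.

```python
# Minimal Optuna study skeleton; the training/evaluation body is stubbed.
import optuna


def train_and_evaluate(latent_dim, learning_rate, batch_size, gen_depth) -> float:
    ...  # train the CC-GAN for 10k steps, return validation MAE
    return 0.0


def objective(trial: optuna.Trial) -> float:
    latent_dim = trial.suggest_int("latent_dim", 128, 512)
    learning_rate = trial.suggest_float("learning_rate", 1e-4, 1e-3, log=True)
    batch_size = trial.suggest_categorical("batch_size", [32, 64, 128])
    gen_depth = trial.suggest_int("generator_depth", 2, 4)
    return train_and_evaluate(latent_dim, learning_rate, batch_size, gen_depth)


study = optuna.create_study(direction="minimize")  # minimize validation MAE
study.optimize(objective, n_trials=50)
print("best hyperparameters:", study.best_params)
```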
Write a Dockerfile that installs AirSim, ROS2, PyTorch, and all dependencies, copies a pre‑trained CC‑GAN checkpoint, and exposes a REST API endpoint that accepts a clean observation JSON, applies the GAN to generate a perturbed observation, and returns the result. The container should run under NVIDIA runtime and expose port 8000.
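A hedged sketch of the API half of this prompt, using FastAPI as one reasonable framework choice (the prompt does not mandate one); the request schema and the placeholder perturbation are illustrative, and a real service would condition the CC‑GAN checkpoint on the incoming frame instead.

```python
# Minimal REST endpoint: accept a clean observation, return a perturbed one.
import numpy as np
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()


class Observation(BaseModel):
    sensor: str
    values: list[float]  # flattened clean sensor frame


@app.post("/perturb")
def perturb(obs: Observation) -> dict:
    frame = np.asarray(obs.values, dtype=np.float32)
    # Placeholder: stand-in noise where the CC-GAN generator would be sampled.
    perturbed = frame + np.random.normal(0.0, 0.05, frame.shape)
    return {"sensor": obs.sensor, "values": perturbed.tolist()}

# Run with: uvicorn service:app --host 0.0.0.0 --port 8000
```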
Risks & Assumptions
Assumption: The physics‑based simulator accurately reproduces sensor noise characteristics; discrepancies may reduce realism.
Risk: GAN training may suffer from mode collapse, leading to low diversity in synthetic perturbations.
Risk: Monte Carlo sampling may not cover rare but critical attack scenarios if parameter ranges are too narrow.
Assumption: Real‑world logs used for validation are representative of the target deployment environment.
Risk: Computational cost of hyperparameter search may exceed budget if not properly constrained.
Risk: Distribution shift between simulated and real sensor data could cause detection models to overfit to synthetic artifacts.