Build a causal‑graph discovery and diffusion‑based manifold projection pipeline to generate counterfactual explanations and evaluate their robustness against adversarial perturbations in a simulated environment.
Causal Discovery · Diffusion Models · Monte Carlo Simulation · Adversarial Testing · Feasibility
Enables early assessment of explanation fidelity and robustness, guiding design of the recourse engine before deployment.
What Is Modelled
A data‑driven pipeline that learns a causal graph from multimodal interaction logs, projects observations onto a learned manifold using diffusion models, generates counterfactual explanations that respect the causal structure, and evaluates the fidelity and robustness of those explanations under synthetically injected adversarial perturbations.
Objectives
Accurately recover a causal graph from noisy, partially observed interaction logs.
Train a diffusion‑based manifold projector that can reconstruct perturbed observations while preserving causal semantics.
Generate counterfactual explanations that satisfy causal plausibility constraints and are minimally invasive in the latent space.
Quantify robustness of counterfactuals against a spectrum of adversarial perturbations (e.g., sensor noise, semantic tampering, feature poisoning).
Integrate the pipeline into a simulation environment (e.g., OpenAI Gym or AirSim) to validate end‑to‑end performance under realistic attack scenarios.
Success Criteria
Causal graph precision/recall ≥ 0.85 on held‑out synthetic benchmarks.
Diffusion reconstruction MAE ≤ 5% of clean data on perturbed samples.
Counterfactual fidelity (change in outcome probability) ≥ 0.7 while altering ≤ 10% of features.
Simulation runs complete within 2 h per episode with a ≤ 1% failure rate.
Output Form
A reproducible Python package containing: (1) a causal‑graph estimator, (2) a diffusion‑based manifold projector, (3) a counterfactual generator, (4) an adversarial perturbation generator, and (5) a robustness evaluation dashboard. Outputs include JSON logs of counterfactuals, robustness metrics, and a Jupyter notebook for visual inspection.
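A minimal sketch of one JSON log entry as the pipeline might write it; every field name here is an illustrative assumption rather than a fixed schema.

```python
import json

# Hypothetical log entry for a single counterfactual; all field names are
# illustrative assumptions, not a fixed schema.
log_entry = {
    "episode": 12,
    "step": 340,
    "observation": [0.12, -0.53, 0.88],      # original feature vector
    "counterfactual": [0.12, -0.21, 0.88],   # minimally edited version
    "changed_features": ["feature_1"],       # which features were altered
    "outcome_prob_before": 0.18,
    "outcome_prob_after": 0.91,              # a 0.73 shift meets the fidelity bar
    "causal_plausibility": 0.94,
    "adversarial_noise_level": 0.1,          # L2 budget of the applied perturbation
    "robust": True,
}

with open("counterfactuals.json", "a") as f:
    f.write(json.dumps(log_entry) + "\n")    # one JSON object per line
```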
Key Parameters & What They Affect
| Parameter | Range / Units | Affects | Notes |
| --- | --- | --- | --- |
| causal_discovery_method | enum: ['PC', 'GES', 'FCI', 'LiNGAM'] | quality | Choice influences graph sparsity and orientation accuracy. |
| pc_alpha | 0.01 – 0.2 | quality | Significance threshold for PC; lower values increase precision. |
| diffusion_steps | 50 – 200 | speed, quality | Number of denoising steps; higher values improve fidelity but increase latency. |
| guidance_scale | 0.0 – 5.0 | quality | Controls adherence to causal constraints during sampling. |
| adversarial_noise_level | 0.0 – 0.5 (L2 norm) | robustness | Magnitude of perturbation applied to observations. |
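These parameters could be gathered into one configuration object, as in the sketch below; the defaults are mid‑range picks from the table above, not tuned values.

```python
from dataclasses import dataclass

@dataclass
class PipelineConfig:
    # Causal discovery
    causal_discovery_method: str = "PC"   # one of PC, GES, FCI, LiNGAM
    pc_alpha: float = 0.05                # significance threshold for PC
    # Diffusion manifold projector
    diffusion_steps: int = 100            # denoising steps: fidelity vs. latency
    guidance_scale: float = 2.0           # strength of causal-constraint guidance
    # Adversarial evaluation
    adversarial_noise_level: float = 0.1  # L2 norm of injected perturbations
```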
External Sources
OpenAI Gym environments with known causal structure (e.g., FetchReach)
Stable Diffusion pre‑trained weights for image modalities
CleverHans/Foolbox adversarial libraries
Synthesised Sources
Synthetic causal graphs generated via DAG simulation (networkx); see the sketch after this list
Synthetic perturbations created with PGD or FGSM on observation vectors
Simulated sensor noise injected into recorded logs using NumPy
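A minimal sketch of two of the synthesis steps above (DAG simulation and sensor-noise injection); the edge probability and noise scale are arbitrary assumptions.

```python
import networkx as nx
import numpy as np

rng = np.random.default_rng(0)

def random_dag(n_nodes: int = 10, edge_prob: float = 0.3) -> nx.DiGraph:
    """Sample a random DAG by only allowing edges that respect a fixed node order."""
    g = nx.DiGraph()
    g.add_nodes_from(range(n_nodes))
    for i in range(n_nodes):
        for j in range(i + 1, n_nodes):   # i -> j keeps the graph acyclic
            if rng.random() < edge_prob:
                g.add_edge(i, j)
    return g

def inject_sensor_noise(obs: np.ndarray, sigma: float = 0.05) -> np.ndarray:
    """Simulated sensor noise: additive Gaussian jitter on recorded logs."""
    return obs + rng.normal(0.0, sigma, size=obs.shape)

dag = random_dag()
assert nx.is_directed_acyclic_graph(dag)
```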
Engineer / Scientist Guidance
Set up a reproducible Python environment (conda or venv) with PyTorch ≥1.12, CausalNex, DoWhy, and Diffusers.
Load interaction logs and split into training/validation/test sets; ensure temporal ordering is preserved.
Run causal discovery using the chosen algorithm (PC by default). Tune alpha via cross‑validation to achieve ≥0.85 precision (an edge‑scoring sketch follows this list).
Train a domain‑specific DDPM (or DPM‑Solver) on clean observations; freeze the encoder and fine‑tune the denoiser for 50–100 steps.
Implement a counterfactual sampler that: (a) samples latent codes via the diffusion model, (b) applies causal constraint guidance (e.g., via a penalty term on edge directions), and (c) decodes to counterfactual observations (see the guided‑sampling sketch after this list).
Generate adversarial perturbations using PGD with a fixed L2 budget; apply them to validation data (a minimal PGD sketch follows this list).
Evaluate counterfactuals on perturbed data: compute outcome probability shift, feature change ratio, and causal plausibility score.
Aggregate robustness metrics across perturbation levels; plot robustness curves.
Integrate the pipeline into a Gym environment: at each step, feed the current observation, generate a counterfactual, and log the explanation and robustness score (a loop sketch follows this list).
Automate the entire workflow with a Makefile or CI pipeline; store results in a SQLite database for later analysis.
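For the causal‑discovery step, a sketch of edge‑level precision/recall scoring; `run_pc` is a placeholder for whichever discovery backend is chosen (e.g., the PC implementation in causal‑learn, since CausalNex's structure learner is NOTEARS‑based rather than PC).

```python
import networkx as nx

def edge_precision_recall(learned: nx.DiGraph, truth: nx.DiGraph):
    """Score directed edges of a learned graph against a ground-truth DAG."""
    learned_edges = set(learned.edges())
    true_edges = set(truth.edges())
    tp = len(learned_edges & true_edges)
    precision = tp / len(learned_edges) if learned_edges else 0.0
    recall = tp / len(true_edges) if true_edges else 0.0
    return precision, recall

# Hypothetical usage with a placeholder discovery call:
# learned = run_pc(df, alpha=cfg.pc_alpha)   # swap in your PC backend here
# p, r = edge_precision_recall(learned, ground_truth)
# assert p >= 0.85 and r >= 0.85, "graph below the success criterion"
```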
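The counterfactual sampler can follow a guidance pattern: estimate the clean sample at each reverse step and nudge it down the gradient of a causal penalty. The sketch below is schematic; it assumes a noise‑predicting `denoiser(x, t)` and a differentiable `causal_penalty`, and hand‑rolls a DDIM‑style loop that a production version would delegate to a Diffusers scheduler.

```python
import torch

def counterfactual_sample(denoiser, x_obs, causal_penalty, alphas_cumprod,
                          guidance_scale=2.0):
    """Schematic guided reverse diffusion (DDIM-style, eta = 0).
    denoiser(x, t) is assumed to predict the added noise at integer step t;
    causal_penalty(x) is a differentiable scalar that is low when x
    respects the learned causal graph."""
    T = len(alphas_cumprod)
    # Noise the observation to the final step so sampling stays anchored to it
    x = torch.sqrt(alphas_cumprod[-1]) * x_obs \
        + torch.sqrt(1 - alphas_cumprod[-1]) * torch.randn_like(x_obs)
    for t in reversed(range(T)):
        a_t = alphas_cumprod[t]
        a_prev = alphas_cumprod[t - 1] if t > 0 else torch.ones(())
        eps = denoiser(x, t).detach()
        # Estimate the clean sample implied by the predicted noise
        x0 = (x - torch.sqrt(1 - a_t) * eps) / torch.sqrt(a_t)
        # Causal guidance: nudge the clean estimate down the penalty gradient
        x0 = x0.detach().requires_grad_(True)
        grad, = torch.autograd.grad(causal_penalty(x0), x0)
        x0 = (x0 - guidance_scale * grad).detach()
        # Deterministic DDIM step to the next (lower) noise level
        x = torch.sqrt(a_prev) * x0 + torch.sqrt(1 - a_prev) * eps
    return x
```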
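For the perturbation step, CleverHans or Foolbox can be used directly; the sketch below hand‑rolls a minimal L2 PGD so the projection logic is explicit. Step size and iteration count are arbitrary assumptions.

```python
import torch

def pgd_l2(model, x, y, eps=0.1, step_size=0.02, n_iter=20):
    """Minimal L2 PGD: maximise the loss within an L2 ball of radius eps
    around x, in the spirit of the CleverHans/Foolbox attacks."""
    delta = torch.zeros_like(x, requires_grad=True)
    loss_fn = torch.nn.CrossEntropyLoss()
    for _ in range(n_iter):
        loss = loss_fn(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        # Normalised gradient ascent step
        grad_norm = grad.flatten(1).norm(dim=1).clamp_min(1e-12)
        delta = delta + step_size * grad / grad_norm.view(-1, *[1] * (x.dim() - 1))
        # Project back onto the L2 ball of radius eps
        d_norm = delta.flatten(1).norm(dim=1).clamp_min(1e-12)
        factor = (eps / d_norm).clamp(max=1.0)
        delta = (delta * factor.view(-1, *[1] * (x.dim() - 1))).detach().requires_grad_(True)
    return (x + delta).detach()
```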
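For the Gym integration step, the per‑step loop might look as follows; `FetchReach-v1` and the 4‑tuple `step` return assume a pre‑Gymnasium gym release, and the commented calls refer to the sketches above.

```python
import gym

env = gym.make("FetchReach-v1")   # any environment with known causal structure
obs = env.reset()
for step in range(200):
    action = env.action_space.sample()   # stand-in for the real policy
    obs, reward, done, info = env.step(action)
    # cf = counterfactual_sample(denoiser, to_tensor(obs), penalty, alphas)
    # log_entry = {...}  # explanation + robustness score, as in the JSON sketch
    if done:
        obs = env.reset()
```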
Recommended Tools
Python 3.11
PyTorch 2.0
Diffusers (HuggingFace) for DDPM/DPM‑Solver
CausalNex / DoWhy for causal discovery
CleverHans / Foolbox for adversarial attacks
OpenAI Gym / AirSim for simulation
Ray Tune / Optuna for hyper‑parameter search
TensorBoard / Weights & Biases for logging
JupyterLab for interactive exploration
Validation & Verification
Validate the causal graph against synthetic ground truth using edge precision/recall.
Validate diffusion reconstruction via MAE on held‑out perturbed samples.
Validate counterfactuals by checking that the outcome probability shifts by ≥0.7 while altering ≤10% of features, and that causal‑graph constraints are respected (no edge reversals).
Verify robustness by measuring the counterfactual success rate across 10 adversarial perturbation levels and requiring a success rate ≥0.8 at every level.
Cross‑validate with an independent dataset (e.g., the Tübingen cause–effect pairs).
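One way to run the 10‑level success‑rate check above, with placeholder results standing in for the real evaluation loop:

```python
import numpy as np

noise_levels = np.linspace(0.0, 0.5, 10)   # the 10 perturbation levels
# success[i, j]: did counterfactual j stay valid at noise level i?
# (placeholder data; in practice this comes from the evaluation loop)
success = np.random.default_rng(0).random((10, 500)) < 0.9
rate_per_level = success.mean(axis=1)
for eps, rate in zip(noise_levels, rate_per_level):
    print(f"eps={eps:.2f}  success rate={rate:.3f}")
if (rate_per_level >= 0.8).all():
    print("robustness criterion met (>= 0.8 at every level)")
```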
Expected Impact
Quality: Provides counterfactual explanations that remain faithful under adversarial noise, improving operator trust and decision support.
Timescale: Reduces the design cycle for the recourse engine by an estimated 30% by exposing robustness issues early.
Cost: Avoids costly post‑deployment debugging by catching brittle explanations in simulation.
Risk Retired: Mitigates the risk of misleading explanations in adversarial environments, reducing safety incidents.
Software Tool Development Prompts
Drop these into a coding assistant to scaffold the supporting software for this modelling task.
Create a Python module that loads a CSV of interaction logs, performs PC causal discovery with a user‑specified alpha, and returns a graph in networkx format. Include functions to compute precision/recall against a provided ground‑truth graph.
Implement a diffusion‑based counterfactual generator that takes a trained DDPM model, a causal graph, and an observation vector, and outputs a counterfactual. The generator should accept a guidance_scale parameter and enforce causal constraints by adding a penalty term to the latent sampling loss. Provide a CLI that accepts --obs_path, --graph_path, --guidance_scale, and outputs the counterfactual to a JSON file.
Risks & Assumptions
Assumes that the interaction logs contain enough variability to learn a faithful causal graph; sparse data may lead to over‑fitting.
Diffusion model may not fully capture multimodal dependencies if trained on limited data; consider multimodal diffusion architectures.
Adversarial perturbations generated synthetically may not reflect real‑world attack patterns; supplement with domain‑specific attack templates.
Causal discovery algorithms are sensitive to hidden confounders; incorporate FCI or latent variable extensions if necessary.
Computational cost of diffusion sampling may be high; use DPM‑Solver or DDIM for faster inference.
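On the last point, the usual Diffusers pattern is to swap the pipeline's scheduler for DPM‑Solver; the model id below is only an example.

```python
from diffusers import DDPMPipeline, DPMSolverMultistepScheduler

# Illustrative: load any DDPM pipeline, then swap in DPM-Solver so sampling
# needs ~20 steps instead of DDPM's default 1000.
pipe = DDPMPipeline.from_pretrained("google/ddpm-cifar10-32")
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
samples = pipe(num_inference_steps=20).images
```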