Build a highly resilient explanation engine for multi‑agent AI, combining state‑of‑the‑art explainers with adversarial training so that blame signals are hard for malicious agents or operators to game.
You’ll develop a novel adversarially steered explanation‑weighting algorithm that jointly optimizes model accuracy and explanation stability, a technique that, to our knowledge, has not previously been applied to multi‑agent blame attribution.
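For concreteness, a minimal sketch of such a joint objective in PyTorch is shown below. It assumes a differentiable classifier, uses a plain input‑gradient attribution as a stand‑in explainer, and applies a random perturbation in place of a full adversarial inner loop; the names gradient_explanation, joint_loss, and lambda_stability are illustrative assumptions, not part of the project.

```python
import torch
import torch.nn.functional as F

def gradient_explanation(model, x):
    # Input-gradient attribution, used here as a simple stand-in explainer.
    x = x.clone().detach().requires_grad_(True)
    score = model(x).max(dim=1).values.sum()
    grad, = torch.autograd.grad(score, x, create_graph=True)
    return grad

def joint_loss(model, x, y, epsilon=0.05, lambda_stability=1.0):
    # Task loss plus a penalty on how far the explanation drifts under a perturbation.
    task_loss = F.cross_entropy(model(x), y)
    # Placeholder perturbation; an adversarial inner loop would instead search for
    # the perturbation that maximizes explanation drift.
    x_pert = x + epsilon * torch.randn_like(x)
    drift = (gradient_explanation(model, x) - gradient_explanation(model, x_pert)).abs().mean()
    return task_loss + lambda_stability * drift
```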
Adversarial‑Robust Explanation Engine
From: Misattribution of Blame in Cooperative Multi‑Agent Systems
The explanation ensemble must remain stable under adversarial perturbations to preserve operator trust and prevent manipulation of blame signals.
An adversarially trained ensemble of SHAP, LIME, and Integrated Gradients, with a learned weighting scheme that penalizes explanation drift and outputs a robustness‑score‑augmented blame manifold.
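A minimal sketch of how a drift‑penalizing weighting scheme could look, assuming attribution vectors for each explainer have already been computed (for example with the shap, lime, and captum libraries) on clean and perturbed inputs. The exponential weighting rule and the function names are illustrative assumptions, not the project's specified design.

```python
import numpy as np

def drift(attr_clean: np.ndarray, attr_perturbed: np.ndarray) -> float:
    # Explanation drift as a normalized L1 distance between attribution vectors.
    return float(np.abs(attr_clean - attr_perturbed).sum() / (np.abs(attr_clean).sum() + 1e-8))

def ensemble_blame(attrs_clean: dict, attrs_perturbed: dict, temperature: float = 1.0):
    # Weight each explainer by exp(-drift / T), so high-drift explainers are downweighted;
    # return the blended attribution plus a robustness score for the combined explanation.
    names = list(attrs_clean)
    drifts = np.array([drift(attrs_clean[n], attrs_perturbed[n]) for n in names])
    weights = np.exp(-drifts / temperature)
    weights /= weights.sum()
    blended = sum(w * attrs_clean[n] for w, n in zip(weights, names))
    robustness = float(1.0 - drifts @ weights)  # 1 = stable, lower = more drift
    return blended, robustness, dict(zip(names, weights))
```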
PhD or Master’s in Computer Science, AI, or a related field, with a focus on explainability or adversarial ML.
Deliver a production‑ready explanation engine that reduces explanation drift by 70% under adversarial attacks, enabling operators to trust blame scores in a live multi‑agent logistics simulation with 99% uptime.
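One plausible way to operationalize the 70% figure is as a relative reduction in mean explanation drift versus a non‑robust baseline explainer under the same attacks; the metric below is an illustrative assumption, not the posting's defined evaluation protocol.

```python
import numpy as np

def mean_drift(attribution_pairs):
    # attribution_pairs: list of (clean, attacked) attribution vectors for the same decisions.
    return float(np.mean([
        np.abs(clean - attacked).sum() / (np.abs(clean).sum() + 1e-8)
        for clean, attacked in attribution_pairs
    ]))

def drift_reduction(baseline_pairs, robust_pairs):
    # Relative reduction in mean drift; the stated target corresponds to a value >= 0.70.
    baseline, robust = mean_drift(baseline_pairs), mean_drift(robust_pairs)
    return (baseline - robust) / baseline if baseline > 0 else 0.0
```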
Lead the company’s explainability research portfolio, mentor a team of ML engineers, and shape industry standards for adversarial‑robust interpretability.
If this sounds like the challenge you have been looking for, we want to hear from you. We value what you can build over where you have been.