Lead the frontier of adversarial curriculum design, combining large-language-model semantics with multi-agent reinforcement learning. Your work will set the standard for provably robust theory-of-mind (ToM) policies that withstand evolving sabotage tactics in real-time, distributed environments.
You will pioneer a bi-level Stackelberg game in which an LLM oracle continuously mutates deceptive messages, creating an ever-shifting threat space that forces agents to learn anticipatory reasoning. The result will be the first end-to-end, provably robust ToM curriculum to operate at scale.
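For candidates who want the shape of the game up front, here is a minimal sketch; the notation is illustrative, not the project's settled formalism. The LLM oracle acts as the Stackelberg leader, committing to a deception strategy σ from a strategy space Σ, while the HTMAD policy π best-responds by training against it:

    \[
    \sigma^{\star} \in \arg\max_{\sigma \in \Sigma} \; \mathcal{L}\big(\pi^{\star}(\sigma), \sigma\big)
    \quad \text{subject to} \quad
    \pi^{\star}(\sigma) \in \arg\min_{\pi \in \Pi} \; \mathcal{L}(\pi, \sigma),
    \]

where L(π, σ) denotes the agents' expected loss under σ-corrupted communication. Robustness guarantees then take the form of bounds on the worst-case loss against the adaptive leader.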
Role: Adversarial Curriculum-Driven Theory-of-Mind (AC-ToM)
From: Theory of Mind Defenses Against Communication Sabotage
Mission: Design and operationalize the LLM-driven curriculum that generates adaptive deceptive scenarios, formalize the bi-level Stackelberg game, and prove robustness guarantees for the HTMAD policy.
Deliverables: A scalable AC-ToM training pipeline, an automated scenario generator, a toolkit for provable robustness analysis, and a benchmark suite that quantifies resilience against unseen sabotage tactics.
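To make the pipeline concrete, below is a toy, self-contained sketch of the curriculum loop. Every name in it (llm_mutate_scenario, train_and_evaluate, the scenario fields) is hypothetical, standing in for the real LLM oracle and MARL trainer; treat it as a schematic of the leader-follower alternation, not an implementation.

    # Toy sketch of the AC-ToM curriculum loop. All names are hypothetical
    # stand-ins for the LLM oracle and the MARL training/evaluation stack.
    import random

    def llm_mutate_scenario(scenario: dict, failure_rate: float) -> dict:
        """Stand-in for the LLM oracle: harden the deceptive scenario,
        escalating where the agents currently fail most."""
        mutated = dict(scenario)
        mutated["deception_strength"] = min(
            1.0, scenario["deception_strength"] + 0.1 * failure_rate
        )
        mutated["message_template"] = scenario["message_template"] + " (mutated)"
        return mutated

    def train_and_evaluate(policy: dict, scenario: dict) -> float:
        """Stand-in for one MARL training step plus evaluation: returns the
        agents' failure rate on the current scenario (lower is better)."""
        policy["robustness"] = min(1.0, policy["robustness"] + 0.05)
        gap = scenario["deception_strength"] - policy["robustness"]
        return max(0.0, min(1.0, 0.5 + gap + random.uniform(-0.05, 0.05)))

    def ac_tom_curriculum(n_rounds: int = 20) -> None:
        scenario = {"deception_strength": 0.2, "message_template": "base lure"}
        policy = {"robustness": 0.0}
        failure_rate = 0.5
        for t in range(n_rounds):
            # Leader move: the LLM adversary adapts the deceptive scenario.
            scenario = llm_mutate_scenario(scenario, failure_rate)
            # Follower move: the agents best-respond by training against it.
            failure_rate = train_and_evaluate(policy, scenario)
            print(f"round {t:02d}  deception={scenario['deception_strength']:.2f}  "
                  f"failure_rate={failure_rate:.2f}")

    if __name__ == "__main__":
        ac_tom_curriculum()

The design point the sketch isolates: the adversary escalates only where the agents currently fail, which is what keeps the curriculum adaptive rather than uniformly hard.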
Qualifications: PhD in Computer Science, Machine Learning, Robotics, or a related field.
First 12 months: Deliver a curriculum that boosts adversarial robustness by ≥30% over baseline HTMAD agents, publish a landmark paper, and integrate the training pipeline into the company's production MARL stack.
Growth path: Lead a growing curriculum research team, mentor junior scientists, and expand the framework to new domains such as autonomous vehicles and industrial IoT.
If this sounds like the challenge you have been looking for, we want to hear from you. We value what you can build over where you have been.