Lead the frontier of adversarial curriculum design, combining large-language-model semantics with multi-agent reinforcement learning. Your work will set the standard for provably robust theory-of-mind (ToM) policies that withstand evolving sabotage tactics in real-time, distributed environments.
You will pioneer a bi-level Stackelberg game in which an LLM oracle continuously mutates deceptive messages, creating an ever-shifting threat space that forces agents to learn anticipatory reasoning. The result will be the first end-to-end, provably robust ToM curriculum to operate at scale.
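For candidates who want the shape of the game up front, here is a minimal sketch; the notation is illustrative, not the project's settled formalism. The LLM oracle acts as the Stackelberg leader, committing to a deception strategy σ from a strategy space Σ, while the HTMAD policy π best-responds by training against it:

    \[
    \sigma^{\star} \in \arg\max_{\sigma \in \Sigma} \; \mathcal{L}\big(\pi^{\star}(\sigma), \sigma\big)
    \quad \text{subject to} \quad
    \pi^{\star}(\sigma) \in \arg\min_{\pi \in \Pi} \; \mathcal{L}(\pi, \sigma),
    \]

where L(π, σ) denotes the agents' expected loss under σ-corrupted communication. Robustness guarantees then take the form of bounds on the worst-case loss against the adaptive leader.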
Role: Adversarial Curriculum-Driven Theory-of-Mind (AC-ToM)
From: Theory of Mind Defenses Against Communication Sabotage
Mission: Design and operationalize the LLM-driven curriculum that generates adaptive deceptive scenarios, formalize the bi-level Stackelberg game, and prove robustness guarantees for the HTMAD policy.
Deliverables: A scalable AC-ToM training pipeline, an automated scenario generator, a toolkit for provable robustness analysis, and a benchmark suite that quantifies resilience against unseen sabotage tactics.
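To make the pipeline concrete, below is a toy, self-contained sketch of the curriculum loop. Every name in it (llm_mutate_scenario, train_and_evaluate, the scenario fields) is hypothetical, standing in for the real LLM oracle and MARL trainer; treat it as a schematic of the leader-follower alternation, not an implementation.

    # Toy sketch of the AC-ToM curriculum loop. All names are hypothetical
    # stand-ins for the LLM oracle and the MARL training/evaluation stack.
    import random

    def llm_mutate_scenario(scenario: dict, failure_rate: float) -> dict:
        """Stand-in for the LLM oracle: harden the deceptive scenario,
        escalating where the agents currently fail most."""
        mutated = dict(scenario)
        mutated["deception_strength"] = min(
            1.0, scenario["deception_strength"] + 0.1 * failure_rate
        )
        mutated["message_template"] = scenario["message_template"] + " (mutated)"
        return mutated

    def train_and_evaluate(policy: dict, scenario: dict) -> float:
        """Stand-in for one MARL training step plus evaluation: returns the
        agents' failure rate on the current scenario (lower is better)."""
        policy["robustness"] = min(1.0, policy["robustness"] + 0.05)
        gap = scenario["deception_strength"] - policy["robustness"]
        return max(0.0, min(1.0, 0.5 + gap + random.uniform(-0.05, 0.05)))

    def ac_tom_curriculum(n_rounds: int = 20) -> None:
        scenario = {"deception_strength": 0.2, "message_template": "base lure"}
        policy = {"robustness": 0.0}
        failure_rate = 0.5
        for t in range(n_rounds):
            # Leader move: the LLM adversary adapts the deceptive scenario.
            scenario = llm_mutate_scenario(scenario, failure_rate)
            # Follower move: the agents best-respond by training against it.
            failure_rate = train_and_evaluate(policy, scenario)
            print(f"round {t:02d}  deception={scenario['deception_strength']:.2f}  "
                  f"failure_rate={failure_rate:.2f}")

    if __name__ == "__main__":
        ac_tom_curriculum()

The design point the sketch isolates: the adversary escalates only where the agents currently fail, which is what keeps the curriculum adaptive rather than uniformly hard.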
Qualifications: PhD in Computer Science, Machine Learning, Robotics, or a related field.
First 12 months: Deliver a curriculum that boosts adversarial robustness by ≥30% over baseline HTMAD agents, publish a landmark paper, and integrate the training pipeline into the company's production MARL stack.
Growth path: Lead a growing curriculum research team, mentor junior scientists, and expand the framework to new domains such as autonomous vehicles and industrial IoT.
If this sounds like the challenge you have been looking for, we want to hear from you. We value what you can build over where you have been.