You will pioneer a new class of belief‑aware abstractions that fuse information‑theoretic regularization with hierarchical policy decomposition. Your work will directly tackle the credit‑assignment bottleneck in decentralized RL, enabling agents to reason about uncertainty at multiple temporal scales.
By embedding a variational bottleneck in belief space, you will create an end‑to‑end differentiable pipeline that learns to discard spurious observations while preserving essential coordination cues—an approach that has not yet been demonstrated at scale in MARL.
Hierarchical Belief‑Aware Abstraction
From: Partial Observability Amplification of Misalignment
To design and train a multi‑scale belief hierarchy that compresses sensory embeddings through a variational bottleneck conditioned on observation history and a shared world‑model prior, thereby preserving task‑relevant modalities and reducing credit‑assignment errors in partially observable MARL.
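The objective above can be made concrete with a minimal NumPy sketch of the core mechanism: a recurrent summary of the observation history is squeezed through a variational bottleneck, yielding a stochastic belief code plus a KL penalty toward a standard‑normal prior (standing in for the shared world‑model prior). This is an illustrative sketch only; the function names, the toy tanh recurrence replacing a real GRU, and all dimensions are assumptions, not the project's actual design.

```python
import numpy as np

rng = np.random.default_rng(0)

def history_encoder(obs_history, W):
    """Toy recurrent summary of the observation history (placeholder for a GRU)."""
    h = np.zeros(W.shape[0])
    for o in obs_history:
        h = np.tanh(W @ np.concatenate([h, o]))
    return h

def variational_bottleneck(h, W_mu, W_logvar):
    """Map the history summary to a Gaussian posterior over the belief code z."""
    mu = W_mu @ h
    logvar = W_logvar @ h
    # Reparameterization trick keeps the sample differentiable w.r.t. mu, logvar.
    z = mu + np.exp(0.5 * logvar) * rng.standard_normal(mu.shape)
    # KL( N(mu, diag(sigma^2)) || N(0, I) ): the pressure to discard spurious detail.
    kl = 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar)
    return z, kl

# Illustrative dimensions: 4-dim observations, 8-dim hidden state, 3-dim belief code.
obs_history = [rng.standard_normal(4) for _ in range(5)]
W = 0.1 * rng.standard_normal((8, 12))        # recurrent weights over [h; o]
W_mu = 0.1 * rng.standard_normal((3, 8))
W_logvar = 0.1 * rng.standard_normal((3, 8))

z, kl = variational_bottleneck(history_encoder(obs_history, W), W_mu, W_logvar)
```

In a trained system the KL term would be weighted against the task loss, trading off compression of the belief code against preservation of coordination‑relevant signal.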
A modular belief‑hierarchy framework, training pipelines, evaluation metrics, and integration modules that expose belief divergence signals to the rest of the BAAC stack.
PhD in Computer Science, Electrical Engineering, or Machine Learning with a focus on RL or probabilistic modeling.
Within 12 months, deliver a benchmark‑competitive BAAC agent that reduces credit‑assignment error by ≥30% on SMAC and MPE, open‑source the belief‑hierarchy code, and publish a paper on belief‑aware abstraction.
Lead a growing research team of 5–7 scientists, shape the product roadmap for BAAC, and transition from prototype to production‑grade deployment.
If this sounds like the challenge you have been looking for, we want to hear from you. We value what you can build over where you have been.