Lead the design and implementation of a token‑budgeted reasoning engine that lets MARL agents ask for counterfactual explanations on the fly, cutting inference cost while keeping explanations audit‑ready. Your work will be the linchpin that turns theoretical CoT ideas into a production‑grade, low‑latency system.
You will pioneer a hybrid RL–transformer architecture that learns to allocate a hard token budget in real time, a capability that, to our knowledge, has not yet been demonstrated at scale in multi‑agent settings.
Token‑Budgeted Chain‑of‑Thought Decomposition for Sample‑Efficient MARL
Formerly: Explainability Budget Optimization for Sample Efficiency
This role is critical to operationalizing the token‑budgeted CoT mechanism that balances explanation depth against compute limits, a core enabler of the 40% sample‑efficiency gains.
You will build three core components: a reinforcement‑learning controller that learns when to invoke token‑budgeted CoT; a token‑budget scheduler that allocates the hard per‑decision budget; and a pruning engine that eliminates filler tokens while preserving reasoning fidelity.
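As one way to picture how these three pieces might fit together, here is a minimal sketch. Everything in it is illustrative: the class names (`BudgetScheduler`, `CoTController`), the uncertainty-driven allocation rule, and the filler-word list are all assumptions, not the actual system design.

```python
# Illustrative sketch only: names, thresholds, and the filler list are
# hypothetical, not the production architecture.
FILLER = {"well", "basically", "actually", "um", "like"}


class BudgetScheduler:
    """Allocates a hard per-decision token budget from a global pool."""

    def __init__(self, total_budget: int, min_alloc: int = 8):
        self.remaining = total_budget
        self.min_alloc = min_alloc

    def allocate(self, uncertainty: float) -> int:
        # Spend more tokens when the agent is uncertain (uncertainty in [0, 1]).
        alloc = int(self.min_alloc + uncertainty * 4 * self.min_alloc)
        alloc = min(alloc, self.remaining)
        self.remaining -= alloc
        return alloc


def prune(tokens: list[str], budget: int) -> list[str]:
    """Drop filler tokens first, then truncate to the hard budget."""
    kept = [t for t in tokens if t.lower() not in FILLER]
    return kept[:budget]


class CoTController:
    """Invokes token-budgeted CoT only when uncertainty crosses a threshold."""

    def __init__(self, scheduler: BudgetScheduler, threshold: float = 0.5):
        self.scheduler = scheduler
        self.threshold = threshold

    def step(self, uncertainty: float, reasoning_tokens: list[str]) -> list[str]:
        if uncertainty < self.threshold:
            return []  # act directly; no explanation requested, no budget spent
        budget = self.scheduler.allocate(uncertainty)
        return prune(reasoning_tokens, budget)
```

In this toy version, a confident decision skips reasoning entirely, while an uncertain one draws a budget proportional to its uncertainty and prunes filler before emitting the chain of thought; the real controller would of course learn both the invocation policy and the allocation rule rather than use fixed heuristics.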
PhD or Master’s in Computer Science, Machine Learning, or Robotics with a focus on RL.
Within 12 months, deliver a token‑budgeted CoT engine that reduces sample complexity by ≥30% on benchmark MARL tasks while keeping inference latency under 50 ms per decision.
Lead a cross‑functional team that expands token‑budgeted reasoning to other modalities (vision, speech) and scales the architecture to thousands of concurrent agents.
If this sounds like the challenge you have been looking for, we want to hear from you. We value what you can build over where you have been.