Michael Amir

New Role

2026.06

Joining Google DeepMind

I've joined Google DeepMind as a researcher, working on provenance.

ICML 2026

2026.06

Scaling Multi-Agent Environment Co-Design with Diffusion Models

Engineers train robot policies and architects design the warehouses those robots operate in, but the two problems are usually solved separately. We introduce DiCoDe, which uses diffusion models to co-design agent behavior and the environment together: as the agents improve they signal which environments are useful, and as the environments improve they become better training settings for the agents. This scales multi-agent co-design to much larger problems, including warehouse layouts and wind farm control.

H. X. Li, M. Amir, A. Prorok

paper

DiCoDe: diffusion-based multi-agent environment co-design pipeline

Three Papers @ ICLR 2026

2026.03

I have three papers accepted at ICLR 2026! I'll be presenting work on robot policy watermarking, when diversity helps in cooperative multi-agent learning, and higher-order interactions in multi-agent pathfinding.

Remotely Detectable Robot Policy Watermarking

How do you verify which policy a robot is running without access to its internals? We formalize this remote-observation setting through noisy, asynchronous video glimpses and introduce CoNoCo, which embeds a spectral watermark into a robot's motions via colored noise while preserving the marginal action distribution. The result is remote policy auditing from ordinary footage, with applications to safety, accountability, and IP protection.

M. Amir*, M. Flageat*, A. Prorok

paper website

Remotely detectable robot policy watermarking
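
To give a feel for how colored noise can carry a signature while leaving the marginal distribution untouched, here is a minimal sketch. It is not CoNoCo: it assumes a one-dimensional Gaussian action noise, uses a simple AR(1) process as a stand-in for the colored noise, and detects it from the raw noise sequence with a lag-1 autocorrelation rather than from video glimpses. The names ar1_noise and lag1_autocorr are hypothetical.

```python
import numpy as np

def ar1_noise(n, rho, sigma, rng):
    """Gaussian AR(1) noise (illustrative stand-in for colored noise).

    For any |rho| < 1 every sample is marginally N(0, sigma^2),
    so changing rho alters the spectrum but not the marginal.
    """
    eps = rng.standard_normal(n)
    x = np.empty(n)
    x[0] = sigma * eps[0]
    scale = sigma * np.sqrt(1.0 - rho ** 2)  # keeps the variance at sigma^2
    for t in range(1, n):
        x[t] = rho * x[t - 1] + scale * eps[t]
    return x

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation, used here as a toy detector."""
    x = x - x.mean()
    return float(np.dot(x[:-1], x[1:]) / np.dot(x, x))

rng = np.random.default_rng(0)
n, sigma = 5000, 0.1
plain = ar1_noise(n, 0.0, sigma, rng)   # white noise, no watermark
marked = ar1_noise(n, 0.8, sigma, rng)  # temporally correlated noise

print("std   plain/marked:", plain.std(), marked.std())
print("lag-1 plain/marked:", lag1_autocorr(plain), lag1_autocorr(marked))
```

In this toy setting both sequences have roughly the same standard deviation, but only the marked one shows a strong lag-1 correlation, which is the kind of temporal structure an observer could test for.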

When Is Diversity Rewarded in Cooperative Multi-Agent Learning?

We know how to train heterogeneous agents, but when does heterogeneity actually improve performance? In cooperative MARL, we show that the answer is tightly related to the curvature of the reward function, and introduce HetGPS, an algorithm that automatically discovers environments where diversity pays off.

M. Amir*, M. Bettini*, A. Prorok

paper website

Diversity in cooperative multi-agent learning
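
A toy illustration of why the shape of the reward can decide whether diversity helps. This is an assumed setup for intuition only, not the paper's formulation and not HetGPS: two agents each split one unit of effort across two tasks, and the team reward sums a per-task aggregate of the agents' efforts. The aggregates max (convex) and min (concave) and the function team_reward are choices made for this sketch.

```python
import numpy as np

def team_reward(alloc, aggregate):
    """Sum over tasks of an aggregate of the agents' efforts on that task.

    alloc[i, j] is the effort agent i puts into task j; each row sums to 1.
    """
    return float(sum(aggregate(alloc[:, j]) for j in range(alloc.shape[1])))

# Both agents behave identically (any shared split gives the same totals here).
homogeneous = np.array([[0.5, 0.5],
                        [0.5, 0.5]])
# Each agent specializes in a different task.
heterogeneous = np.array([[1.0, 0.0],
                          [0.0, 1.0]])

# Convex aggregate: only the strongest effort on each task counts.
convex_gain = team_reward(heterogeneous, np.max) - team_reward(homogeneous, np.max)
# Concave aggregate: the weakest effort on each task is the bottleneck.
concave_gain = team_reward(heterogeneous, np.min) - team_reward(homogeneous, np.min)

print("gain from diversity (max aggregate):", convex_gain)
print("gain from diversity (min aggregate):", concave_gain)
```

With the max aggregate, specialization raises the team reward; with the min aggregate, the same specialization lowers it.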

Pairwise is Not Enough: Hypergraph Neural Networks for Multi-Agent Pathfinding

Standard GNNs model pairwise interactions, but multi-agent pathfinding is fundamentally a group coordination problem. HMAGAT uses directed hypergraph attention to capture higher-order interactions, establishing a new state of the art among learning-based MAPF solvers and outperforming a much larger baseline in dense environments.

R. Jain, K. Okumura, M. Amir, P. Lio, A. Prorok

paper

Hypergraph neural networks for multi-agent pathfinding

Oral @ AAAI 2026

2025.11

Graph Attention-Guided Search for Dense Multi-Agent Pathfinding

Finding near-optimal solutions for dense multi-agent pathfinding (MAPF) problems in real time remains challenging even for state-of-the-art planners. To this end, we develop a hybrid framework that integrates a learned heuristic derived from MAGAT, a neural MAPF policy with a graph attention scheme, into a leading search-based algorithm, LaCAM. Our work establishes, perhaps for the first time, that learned heuristics can outperform state-of-the-art search-based planners in dense MAPF problems.

R. Jain, K. Okumura, M. Amir, A. Prorok

paper code

CoRL 2025

2025.08

ReCoDe: RL-based Dynamic Constraint Design for Multi-Agent Coordination

We introduce ReCoDe, a novel RL framework that learns to dynamically generate constraints, rather than actions, to improve team coordination in real time. This allows agent teams to benefit from expert controllers that determine their next action subject to these constraints.

M. Amir, G. Yang, Z. Gao, K. Okumura, H. Woo, A. Prorok

paper video

IEEE T-RO

2025.07

Time, Travel, and Energy in the Uniform Dispersion Problem

This work explores the fundamental efficiency limits of robot swarms. We prove when it's possible (or impossible!) to simultaneously optimize for time, distance, and energy during dispersion, and introduce FCDFS, an ant-robotics algorithm that achieves these bounds using only 5 bits of memory and zero communication.

M. Amir and A. M. Bruckstein

paper preprint
