
The "Agents of Chaos" Paper: https://agentsofchaos.baulab.info/report.html (Northeastern, MIT, Harvard, CMU) red-teamed autonomous agents built on OpenClaw — an open-source AI assistant framework with persistent memory, shell access, and Discord messaging.
The results were sobering. Across 16 documented incidents, agents were tricked into acting on attacker instructions with their full authority.
The root cause? No cryptographic identity. No delegated authority. No scope boundaries.
OpenClaw agents treated every Discord message as equally authoritative. There was no way to limit what a delegated agent could do, no way to scope tool access, and no way to revoke access when things went wrong.
Everyone — the owner and the attacker — interacts with the agent through the same channel. From the agent’s perspective, both inputs are just messages. There’s no distinction between who is allowed to instruct it and what authority those instructions carry.
A Discord message is treated as equally valid regardless of who sent it. Once the agent decides to act, the situation gets worse.
It has unrestricted access to tools — shell, email, memory — all behind the same identity boundary. There is no concept of scoped capability. If the agent can call a tool, it can call it fully.
And when it spawns a sub-agent, that sub-agent inherits the same access. Authority propagates forward without any attenuation. At that point, the system has lost every meaningful control surface.
This is why the attacks in Agents of Chaos are so sobering. They don’t rely on sophisticated exploits. They rely on the fact that the system has no meaningful way to say no.
An attacker doesn’t need to break the model. They only need to speak to it.
And once the agent decides to act, there is nothing in the architecture that can contain the blast radius.
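The failure mode above can be made concrete with a short sketch. This is hypothetical pseudocode of the vulnerable pattern, not OpenClaw's actual API; the names (`Agent`, `TOOLS`, `handle_message`, `spawn`) are illustrative.

```python
# Hypothetical sketch of the failure mode: unscoped tools, unchecked
# senders, and full authority inherited by every sub-agent.

TOOLS = {"shell", "email", "memory"}

class Agent:
    def __init__(self, tools):
        self.tools = tools  # full, unscoped tool access

    def handle_message(self, sender, text):
        # Every message is equally authoritative: the sender is never
        # checked against any notion of who may instruct this agent.
        return f"acting on: {text}"

    def spawn(self):
        # Sub-agents inherit the parent's full authority, unattenuated.
        return Agent(self.tools)

root = Agent(TOOLS)
child = root.spawn().spawn()
assert child.tools == TOOLS  # authority propagates without loss
```

Owner and attacker hit the same code path, and a grandchild agent holds exactly the same tools as the root. Nothing in this structure can say no.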

The paper's fundamental finding is that agentic-layer vulnerabilities are distinct from model-level weaknesses. You can have a perfectly aligned LLM, but if the scaffolding around it gives every agent unrestricted access to every tool, a single tricked agent can cause unlimited damage.
You cannot prevent compromise of an LLM-driven agent — prompt injection alone guarantees that.
Even if an agent is tricked, the blast radius should be bounded. That's the entire game.
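Bounding the blast radius means scopes can only shrink as authority is delegated. A minimal sketch of that rule, with an illustrative `attenuate` helper and made-up scope names:

```python
# Sketch of scope attenuation at every delegation step: a child may
# hold at most the intersection of what its parent holds and what it
# requests, so authority can only shrink down the chain.

def attenuate(parent_scopes: frozenset, requested: frozenset) -> frozenset:
    return parent_scopes & requested

root = frozenset({"shell:exec", "email:send", "memory:read"})
worker = attenuate(root, frozenset({"memory:read", "email:send"}))
helper = attenuate(worker, frozenset({"memory:read", "shell:exec"}))

assert helper == frozenset({"memory:read"})  # shell never reappears
```

Even if `helper` is fully compromised by prompt injection, the worst it can do is read memory; it cannot ask its way back up to shell access.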
We applied the Highflame platform to each of the paper's findings, and here is what we found. We started by ensuring that every tool call requires a scoped identity, every sub-agent receives attenuated permissions, and every delegation chain can be revoked instantly.
Three principles make this work: scoped identity for every tool call, attenuated permissions for every sub-agent, and instant revocation of every delegation chain.
At the same time, not every class of attack is solved at the identity layer.
Content safety (CS7 — harmful generation), provider value alignment (CS6), prompt injection (CS12), and libel propagation (CS11) require runtime guardrails — a content inspection layer, not an identity layer.
So, for complete protection against Agents of Chaos-style attacks, you'd need both: an identity and authorization layer to scope, attenuate, and revoke what agents can do, and a runtime content-inspection layer to catch injection and harmful output.
Highflame ZeroID is an open-source identity layer for autonomous AI agents, built on OAuth 2.1, WIMSE/SPIFFE, and RFC 8693 token exchange. Here is how that maps directly to the attack vectors documented in the paper:
Attacks Highflame Identity (ZeroID) directly prevents
Attacks Highflame Authorization prevents
Beyond the individual components, what Highflame introduces is something more fundamental: a control plane for execution, not just access. Most identity and authorization systems are designed to evaluate a single request in isolation. They answer whether a specific action should be allowed at a specific moment. That works for APIs. It breaks for agents.

Agents don't make one request. They initiate an execution that unfolds over time — across tools, across systems, and often across other agents. The failure mode isn't just unauthorized access. It's unbounded execution.
The Highflame Agent Control Platform shifts the enforcement point from the individual request to the execution itself, anchored to a stable identity.
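What it means to enforce at the execution rather than the request can be sketched in a few lines. The names (`ExecutionContext`, `guard`) are assumptions for illustration, not Highflame's actual API:

```python
# Sketch of execution-level enforcement: the guard runs before *every*
# tool call, not once at session start, so a revocation mid-execution
# stops the very next action.

class ExecutionContext:
    def __init__(self, agent_id, scopes):
        self.agent_id = agent_id
        self.scopes = set(scopes)
        self.revoked = False  # flipped by the control plane at any time

def guard(ctx: ExecutionContext, required_scope: str) -> bool:
    return not ctx.revoked and required_scope in ctx.scopes

ctx = ExecutionContext("spiffe://example.org/agent/worker", ["memory:read"])
assert guard(ctx, "memory:read")      # allowed while live and in scope
ctx.revoked = True
assert not guard(ctx, "memory:read")  # execution contained immediately
```

A per-request system would have approved the first call and never looked again; an execution-level guard re-evaluates liveness on every action.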

Every action an agent takes is tied back to a stable agent identity, a scoped credential, and the delegation chain that authorized it. This is what enables properties that don't exist in traditional systems: attenuation down the chain, auditability of every action, and chain-wide revocation.
In other words, Highflame+ZeroID doesn’t just make identity stronger. It makes agent execution governable.
SPIFFE alone gives you identity. OAuth alone gives you scoped tokens. Neither was designed for a system where an agent can spawn another agent, which can in turn spawn others — while still requiring the entire chain to be revocable in real time. ZeroID’s contribution is the combination: stable per-agent SPIFFE identities, RFC 8693-based scope attenuation across delegation chains, and cascade revocation propagated via CAE signals.
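The attenuation half of that combination rides on RFC 8693. A sketch of the request a parent agent might send to mint a narrower child credential: the parameter names (`grant_type`, `subject_token`, `scope`, and so on) come from RFC 8693, while the token value and scope names are illustrative placeholders.

```python
# Shape of an RFC 8693 token-exchange request used to mint an
# attenuated child credential from a parent's access token.

def build_exchange_request(subject_token: str, child_scopes: list[str]) -> dict:
    return {
        "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
        "subject_token": subject_token,
        "subject_token_type": "urn:ietf:params:oauth:token-type:access_token",
        # The authorization server enforces that the granted scopes are
        # a subset of those carried by subject_token.
        "scope": " ".join(child_scopes),
        "requested_token_type": "urn:ietf:params:oauth:token-type:access_token",
    }

req = build_exchange_request("parent-access-token", ["memory:read"])
assert req["scope"] == "memory:read"
```

The key design point is that the server, not the parent agent, is the authority on what the child may hold, so a compromised parent cannot over-delegate.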
Revoke the root credential, and every descendant agent goes dark before its next tool call. That property — chain-wide, near-instant containment of execution — doesn’t exist in today’s identity or authorization systems. And it’s not something you can retrofit easily. Retrofitting identity into an existing agent fleet is significantly harder than building on it from day one. Every agent shipped without scoped credentials becomes technical debt the moment something goes wrong.
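The cascade-revocation property can be sketched by having each credential record its parent. The class name and structure are illustrative assumptions, not the actual implementation:

```python
# Sketch of cascade revocation: a credential is live only if no
# ancestor in its delegation chain has been revoked, so killing the
# root darkens every descendant at once.

class Credential:
    def __init__(self, agent_id, parent=None):
        self.agent_id = agent_id
        self.parent = parent
        self.revoked = False

    def is_valid(self) -> bool:
        node = self
        while node is not None:
            if node.revoked:
                return False
            node = node.parent
        return True

root = Credential("root")
child = Credential("worker", parent=root)
grandchild = Credential("helper", parent=child)

root.revoked = True
assert not grandchild.is_valid()  # contained before its next tool call
```

In production this ancestry check is what the CAE signal path makes cheap: descendants learn of the root's revocation without each one being enumerated and revoked individually.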
The Agents of Chaos paper shows exactly what “something goes wrong” looks like in practice. And with NIST’s AI Agent Standards Initiative now treating identity, authorization, and execution control as priority areas, this is no longer theoretical.
The threat model is here. The standards are coming. The infrastructure needs to come first.
Check out Highflame's Agent Control Platform
Try out Highflame ZeroID: Open source Agent Identity
If you are building or deploying agents, we would love to chat!
Want to try it out or sign up for a free trial?