5 Comments
OpenClaw

Great info as usual, Vinoth!

Pawel Jozefiak

Layer 8 (identity, trust, policy, approvals) is the one that actually breaks in production. I ran into this building an agent marketplace: the trust layer I defined for my agent working inside my own systems didn't transfer when that agent started transacting with external services.

Your framing of "tool tells the model what it may ask for, execution surface determines what actually happens" is exactly right. But there's a gap between the tool layer and the trust layer in your stack: who governs which agents are allowed to be on the execution surface at all? In a marketplace context that's a verification and onboarding problem nobody's solved yet.

The layer 8 section feels like it assumes a closed system. What does trust policy look like when you're the third party, not the platform owner?

Vinoth Govindarajan

This is a really good push. I agree Layer 8 gets much harder once you leave a closed system. In that world, trust splits into at least two problems: runtime authorization for a specific action, and admission/governance over which agents are allowed anywhere near the execution surface in the first place.

In a marketplace, that becomes onboarding, verification, attestation, scope design, and ongoing enforcement, not just tool visibility or per-call approval. So yes, the Part 1 framing is more compressed than complete there.

My view is that open ecosystems force a separate governance question that closed systems can often hide inside the platform.
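To make that split concrete, here's a minimal sketch of the two checks as separate layers: an admission registry that gates which agents are allowed near the execution surface at all, and a per-action runtime authorization check on top of it. All names (`AgentRegistry`, `authorize`, the scope strings) are hypothetical illustrations, not anything from the article.

```python
from dataclasses import dataclass, field

@dataclass
class AgentRegistry:
    """Admission/governance layer: which agents may be on the execution surface at all."""
    admitted: dict = field(default_factory=dict)  # agent_id -> granted scopes

    def admit(self, agent_id: str, scopes: set):
        # In a marketplace, verification/attestation/onboarding would gate this call.
        self.admitted[agent_id] = scopes

    def is_admitted(self, agent_id: str) -> bool:
        return agent_id in self.admitted

def authorize(registry: AgentRegistry, agent_id: str, action: str) -> bool:
    """Runtime authorization layer: a per-action check, separate from admission."""
    if not registry.is_admitted(agent_id):
        return False  # never passed admission, so no per-action question arises
    return action in registry.admitted[agent_id]

registry = AgentRegistry()
registry.admit("agent-a", {"read_orders"})
print(authorize(registry, "agent-a", "read_orders"))   # True
print(authorize(registry, "agent-a", "issue_refund"))  # False: admitted, out of scope
print(authorize(registry, "agent-b", "read_orders"))   # False: never admitted
```

The point of the separation is that a closed platform can collapse both checks into one internal gate, while an open marketplace has to run them at different times, by different parties, with different evidence.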

Emanuel Maceira

This stack model is exactly what the edge AI world needs to adopt. We keep calling everything an "edge agent" without distinguishing which layer owns what -- and it leads to the same haunted debugging you describe.

One thing that changes dramatically when you push this stack onto an edge device (gateway, industrial controller, drone): layers 2, 3, and 8 become fundamentally harder.

Session ownership on a cloud agent is a database lookup. On an edge device with intermittent connectivity, the device IS the session owner -- and it may need to maintain session state across network partitions that last hours or days. Durable execution isn't just about Temporal-style retries; it's about the agent continuing to operate autonomously when the control plane is unreachable.
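One way to picture the device-as-session-owner idea: the agent journals its actions locally and reconciles with the control plane only when connectivity returns. This is a rough sketch under that assumption; the class and method names are hypothetical, and a real device would persist the journal to flash rather than memory.

```python
import time
from collections import deque

class EdgeSession:
    """Sketch: the device owns its session state and journals actions locally,
    so it keeps operating through network partitions that last hours or days."""

    def __init__(self, session_id: str):
        self.session_id = session_id
        self.journal = deque()  # append-only action log; in-memory here for brevity

    def record(self, action: str):
        # Executed and logged locally, regardless of control-plane reachability.
        self.journal.append({"action": action, "ts": time.time()})

    def reconcile(self, upload) -> int:
        """When the control plane is reachable again, drain the journal upstream."""
        sent = 0
        while self.journal:
            upload(self.journal.popleft())
            sent += 1
        return sent

session = EdgeSession("valve-controller-7")
session.record("open_valve")
session.record("close_valve")
synced = session.reconcile(lambda entry: None)  # stand-in for the real uplink
print(synced)  # 2
```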

The capability/execution boundary also gets interesting. On an edge device, the execution surface might be a physical actuator -- opening a valve, triggering an alarm, adjusting HVAC. The blast radius isn't data corruption; it's physical-world consequences. Your approval boundary needs to work without a round-trip to the cloud.
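An offline-capable approval boundary might degrade like this: data-plane actions proceed locally, while physical-world actions either take the normal cloud approval round-trip or, when partitioned, fall back to a conservative local policy instead of silently proceeding. A sketch only; the action names, the operator fallback, and `remote_approval` are all invented for illustration.

```python
# Hypothetical set of actions with physical-world blast radius.
PHYSICAL_ACTIONS = {"open_valve", "trigger_alarm", "adjust_hvac"}

def remote_approval(action: str) -> bool:
    # Stand-in for the real control-plane approval call.
    return True

def approve(action: str, cloud_reachable: bool, local_operator_ok: bool) -> bool:
    if action not in PHYSICAL_ACTIONS:
        return True  # data-plane action: worst case is data cleanup, allow locally
    if cloud_reachable:
        return remote_approval(action)  # normal path: round-trip approval
    # Partitioned: require an on-site operator rather than defaulting open.
    return local_operator_ok

print(approve("log_reading", cloud_reachable=False, local_operator_ok=False))  # True
print(approve("open_valve", cloud_reachable=False, local_operator_ok=False))   # False
print(approve("open_valve", cloud_reachable=False, local_operator_ok=True))    # True
```

The key design choice is that the fallback is deny-by-default for physical actions: losing connectivity should shrink the agent's authority, not expand it.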

Curious whether Part 2 on infrastructure and inference will touch on constrained environments. The substrate layer looks completely different when your inference budget is 700MB of RAM and your connectivity is a cellular modem.