What Do Agents Actually Run On? Toward an Agentic Reference Architecture, Part 2

Years ago I read Thomas Friedman’s The World Is Flat, and what stayed with me wasn’t any particular argument so much as the feeling it left behind. The feeling was optimism. The idea that connectivity was dissolving geography, that talent anywhere could plug into everywhere, that someone in Waterloo or Bangalore or Nairobi could participate in the same economy as anyone else and the old walls just wouldn’t matter very much anymore. I bought it. I still mostly buy it. It shaped how I think about building on the internet, and about who gets to build at all.

So I’ll be honest that the current rush toward “sovereign AI” makes me recoil a little. Trusted partner lists, export controls on models, data that has to stay inside a border, a map of who is permitted to use which intelligence, drawn along national lines. It reads like the exact inverse of the flat world, like we spent twenty years tearing walls down and have now found a more powerful material to build them back out of. My gut says that this cuts against everything that was good about the last two decades.

I’ve come around to thinking that instinct is right about the word and wrong about the thing…

Start with what forced the issue. Back in March I wrote Where Do Agents Actually Live?, a sketch of where intelligence belongs in a system, the execution layer, the reasoning layer, the agency layer, and the patterns for how agents inhabit the seams between them. I still think it holds up. But re-reading it now, a few months of building later, and after years of thinking about enterprise architecture patterns, I notice it assumed something - the ground. The reference diagram had a model sitting in it doing the reasoning, and I treated that model the way you treat electricity, a thing that is simply there when you reach for it. The only real variable I gave it was token cost.

That assumption broke this month. On June 9th, Anthropic released Fable and Mythos, the most capable models it had ever shipped. Three days later, Anthropic said it had received a US government export-control directive requiring it to suspend access to Fable 5 and Mythos 5 for foreign nationals. Because Anthropic can’t reliably enforce that distinction at global product scale on short notice, the only way to comply in practice was to switch the models off for all customers, everywhere, with no warning. A frontier model that existed on Friday was gone by the following week. Two weeks later the government partially reversed course and let Mythos back out, but only to a list of roughly a hundred approved US institutions, the same day OpenAI shipped GPT-5.6 to its own short list of government-approved partners. Fable, as I write this, is still in limbo.

I don’t want to make this post about that one incident, because the incident will keep moving; it may well have moved again by the time you read this. The durable thing, the thing actually worth building around, is the shift the incident exposed. Model access is now a contingent dependency. It can be granted or revoked by a government you don’t answer to, on a timeline you don’t control, for reasons you may never be told. In an enterprise architecture diagram, that means the model is not just a service dependency. It is a policy-mediated dependency.

It landed in the same few weeks that Canada published AI for All, a national strategy built around that same loaded word, sovereignty. I read it closely, because building agentic systems for regulated Canadian organizations means this is a document I have to implement, not just have opinions about. And what struck me is that the substance underneath the word isn’t really about walls. Infrastructure under Canadian control, sure, but the load-bearing promise is the freedom to choose which models you run and whose hardware runs them. That isn’t a smaller border. It’s optionality, written as policy. The strategy reaches for the word sovereignty and keeps describing mobility: choices in tools, control over infrastructure, trusted partnerships, and the ability to keep operating on Canadian terms.

So this is Part 2. Part 1 asked where the agent lives in the system. Part 2 asks what the system itself runs on, and what happens when the ground moves. The thing worth designing is not sovereignty in the sense of walls and borders and approved-partner lists. That version really is backwards, and the flat-world instinct is right to fight it. The thing worth designing for is the freedom to move. The ability to keep working with the whole world’s models and tools and people, while making sure no single gatekeeper, in any country, can cut you off from what you depend on. “Sovereign” is the word we have, because it’s the word the policy conversation handed us. Mobility is the thing we actually mean. The gap between those two is the entire point, and it turns out to be an architecture problem before it’s a political one.

The Substrate Layer

The three layers from the first post are still the right way in, so a quick recap. The execution layer is deterministic, code running, jobs firing, files getting written, traceable and cheap. The reasoning layer is where the LLM lives, handling ambiguity and judgment. The agency layer is the loop, an LLM with memory and tools pursuing a goal across steps. The whole argument was that intelligence belongs at the seams, the places where structure breaks down, and that everything else should stay as deterministic code.

What I didn’t draw was the plane underneath all three. Call it the substrate. It’s the model doing the reasoning, the runtime executing the loop, the store holding the memory, and the identity the agent acts under. In March I treated that plane as implied and static. It shouldn’t be. It’s a set of dependencies, and every one of them can be locked to a provider, a cloud, or a country in a way that hands your agent’s continued existence to someone else.

Which gives me the one line this whole post is really about. An agent is only as free to move as its most locked-in dependency. You can run on the most open model in the world, and if your loop only behaves on a frontier model, or your memory lives in a managed service pinned to a foreign region, or your agent’s identity is just an account on someone else’s platform, then your freedom to move is whatever your weakest link allows. It’s a minimum function, not an average.
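To make the "minimum, not average" point concrete, here is a minimal sketch. The Dependency type, the 0-to-1 mobility scale, and every score below are illustrative assumptions of mine, not measurements from a real system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dependency:
    name: str
    mobility: float  # assumed scale: 0.0 = fully locked in, 1.0 = movable at will


def agent_mobility(deps: list[Dependency]) -> float:
    """The agent is only as free to move as its most locked-in dependency."""
    return min(d.mobility for d in deps)


def weakest_link(deps: list[Dependency]) -> Dependency:
    """The dependency that currently sets the ceiling on mobility."""
    return min(deps, key=lambda d: d.mobility)


# Illustrative scores only: an open model and self-hosted memory,
# but a loop that only behaves on a frontier model.
substrate = [
    Dependency("model", 0.9),
    Dependency("loop", 0.3),
    Dependency("memory", 0.8),
    Dependency("identity", 0.6),
]

average = sum(d.mobility for d in substrate) / len(substrate)
```

The average here looks comfortable, but agent_mobility(substrate) is set entirely by the loop. Improving any other dependency changes nothing until the weakest one moves.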

I said at the top that mobility is what I actually mean when I reach for the word sovereignty. It’s worth being precise about the principles that deliver it, because I like to think in architecture principles and I want the ones holding this argument up to be the right ones.

Interoperability, the old reliable non-functional requirement - can your components talk across boundaries regardless of vendor. MCP, A2A, an OpenAI-compatible endpoint, an agent card. It’s a property of interfaces.

Substitutability is the one that I’m not really sure I’d ever explicitly named as an architecture principle before - can you pull a dependency out of its role and drop a different one in without re-architecting around it. It’s a property of dependencies.

Portability is the third - can your own assets, your memory, your identity, move with you to a new substrate instead of staying behind. It’s a property of the things you actually own.

The relationship between them is worth getting right. Interoperability is the enabler, the open interface that makes the other two cheap instead of catastrophic. Substitutability and portability are what it buys you. And the freedom to move, the thing the policy conversation insists on calling sovereignty, is just what you have when all three hold across every dependency that matters, when no single model provider, platform vendor, or jurisdictional gatekeeper can unilaterally strand the system. Interoperability on its own is the trap, you can be perfectly standards-compliant and still completely captured, because an open endpoint is worthless if the only model worth running behind it is the one you’ve just been cut off from. A well-shaped door is not the same thing as another room to walk into.

So the rest of this is the four dependencies, and which of those principles each one really leans on.

The Model: Routing Is the Architecture Now

The instinct, once you accept that model access is contingent, is to go looking for the sovereign model. The one Canadian-hosted, Canadian-governed model you can standardize on and stop worrying. I don’t think that instinct survives contact with the work. There isn’t one model that is best, available, affordable, and trustworthy across every task, and pinning everything to a single one just relocates the lock-in rather than removing it.

The architecture that actually holds up is routing. Frontier closed models for the genuinely hard reasoning, cheaper hosted open weights for the high-volume work, and self-hosted models for the things that have to stay in jurisdiction or have to keep working no matter what. In Agent Foundry we run Strands pointed at OpenRouter, at Cohere, and at our own self-hosted endpoints, and the model is a configuration choice per workload, not a foundational commitment. That isn’t an elegant ideal I’m recommending from a distance. It’s the configuration that lets me keep running an agent for a client when the model it was using yesterday is gone today.
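A routing table of that kind can be sketched in a few lines. This is not the Agent Foundry configuration or any Strands API; the provider labels, model ids, and the pick_route helper are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRoute:
    provider: str          # hypothetical provider label, not a real endpoint
    model: str             # hypothetical model id
    in_jurisdiction: bool  # True if inference stays on infrastructure we control


# Ordered by preference per workload; earlier entries are tried first.
ROUTING_TABLE: dict[str, list[ModelRoute]] = {
    "hard_reasoning": [
        ModelRoute("frontier-api", "frontier-large", False),
        ModelRoute("hosted-open", "open-weights-large", False),
        ModelRoute("self-hosted", "open-weights-medium", True),
    ],
    "bulk_extraction": [
        ModelRoute("hosted-open", "open-weights-small", False),
        ModelRoute("self-hosted", "open-weights-small", True),
    ],
}


def pick_route(
    workload: str,
    unavailable: frozenset[str] = frozenset(),
    require_jurisdiction: bool = False,
) -> ModelRoute:
    """Return the first preferred route that is still usable for this workload."""
    for route in ROUTING_TABLE[workload]:
        if route.provider in unavailable:
            continue  # provider was switched off or is otherwise unreachable
        if require_jurisdiction and not route.in_jurisdiction:
            continue  # data must not leave infrastructure we control
        return route
    raise LookupError(f"no usable model for workload {workload!r}")
```

When a provider goes dark, the change is one entry in the unavailable set: pick_route("hard_reasoning", frozenset({"frontier-api"})) falls through to the next route without touching agent code.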

This is the substitutability principle in its purest form, and it’s worth noticing that interoperability is what makes it nearly free. Strands being model-agnostic, the OpenAI-compatible endpoint becoming a de facto standard, these are interoperability wins, and the entire payoff of those wins is that swapping the model underneath is a config change rather than a rewrite. The interface standard isn’t the goal. The thing it buys you, the ability to actually move, is the goal.

You don’t have to take my word for any of this, because the cleanest demonstration of it showed up within days of the Fable shutoff, and not from me. OpenRouter pushed out a thing called Fusion right as Fable went dark. Instead of handing you one replacement model, Fusion fans your prompt across a panel of models, lets a judge model work out where they agree, disagree, and miss things, and has a synthesizer write a single answer out of the result. On Perplexity’s DRACO research benchmark a Fusion panel actually edged out solo Fable, 69 percent against 65, and a cheaper panel built partly from open Chinese models came within a point of it at roughly half the cost. One benchmark, one class of task, and the usual caveats apply, mixture-of-agents is not a new idea and long-horizon work was still Fable’s home turf. But sit with what happened. A frontier model the US government had switched off was, inside of a week, substantially replaced by a router standing on top of a panel of other models, and for a lot of real work you could barely feel the absence. The replacement for the model wasn’t a model. It was the routing.
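The panel, judge, synthesizer shape is simple enough to sketch. To be clear, this is my own toy version of the pattern as described above, not OpenRouter's implementation or API; the fuse function, the prompt wording, and the stub models are all assumptions, and a model here is just any callable from prompt to text.

```python
from typing import Callable

Model = Callable[[str], str]  # assumption: a model is any prompt -> text callable


def fuse(prompt: str, panel: dict[str, Model], judge: Model, synthesizer: Model) -> str:
    """Fan a prompt across a panel, have a judge compare, then synthesize one answer."""
    # 1. Fan out: every panel member answers independently.
    drafts = {name: model(prompt) for name, model in panel.items()}
    listing = "\n".join(f"[{name}] {text}" for name, text in drafts.items())

    # 2. Judge: work out where the drafts agree, disagree, and miss things.
    critique = judge(
        f"Question: {prompt}\nCandidate answers:\n{listing}\n"
        "List agreements, disagreements, and gaps."
    )

    # 3. Synthesize: write a single answer from the drafts plus the critique.
    return synthesizer(
        f"Question: {prompt}\nCandidate answers:\n{listing}\n"
        f"Critique:\n{critique}\nWrite one final answer."
    )


# Stub models so the sketch runs without any network access.
panel = {
    "model_a": lambda p: "4",
    "model_b": lambda p: "4",
    "model_c": lambda p: "5",
}
judge = lambda p: "majority: 4"
synthesizer = lambda p: "4" if "majority: 4" in p else "unsure"

answer = fuse("What is 2 + 2?", panel, judge, synthesizer)
```

The point of the shape is that no panel member is special. Removing one entry from panel changes the quality of the result, not whether the call works.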

And the open side of that routing table has been getting very strong, very quickly. The current open-weight frontier is mostly coming out of Chinese labs, GLM-5.2, DeepSeek V4, Kimi K2.7, Qwen’s coder line, near-MIT licensed, self-hostable, and priced somewhere between five and thirty times below the Western frontier APIs. That price gap matters more than it looks, because it’s the thing that turns “self-host in Canada for sovereignty” from an expensive principle into a sensible default for a large share of the work.

I’ll come back to the uncomfortable part of that sentence later. For now, consider this - the era where you picked a model and went with it is over. Instead, you build a table of models, and you keep it current.

The Loop: Where the Bill Comes Due

Strands, like most of the current generation of agent frameworks (think OpenClaw, Hermes, Nanobot…), is model-driven. You hand the loop a goal and a set of tools, and the model itself decides which tool to call, in what order, when to reflect, and when it’s done. It’s a genuinely good design, and on a frontier model it’s close to magic. You write very little orchestration and the loop just figures it out. The framework is also, to its credit, model-agnostic, so swapping the underlying model is a config change, not a rewrite.

But here’s what I’ve seen many times now: a model-driven loop can be excellent on a frontier model and fall apart completely on a weaker one. The exact same agent, same tools, same prompts, will go from reliable to useless by changing nothing but the model underneath it. On a frontier model it plans, recovers from errors, and stops at the right time. On a lower-tier model it loops, it forgets what it was doing, it calls the wrong tool, it declares victory early. Same loop. The intelligence the loop depends on was never in the loop. It was in the model.

So the convenience has a hidden price. The more of your loop you delegate to the model’s own judgment, the more capable you are today and the less substitutable you are tomorrow. Model-driven autonomy and substitutability sit on opposite ends of the same lever. The most autonomous, least-scaffolded loop is also the one that assumes permanent frontier access, which is the one assumption June just taught us not to make.

And this closes a loop of its own, back to the first post. In March, I argued to push everything that can be deterministic down out of the prompt and into code, and to reserve the model for the judgment that needs it. I framed that as being about cost and debuggability. It turns out to be the same thing that provides sovereignty. Every part of the loop you encode as an explicit, tested capability instead of leaving to the model’s improvisation is a part that does not degrade when you’re forced down to a weaker model. In Agent Foundry we’ve been building our own skills and capabilities rather than leaning on the loop to improvise, and I used to think of that as a quality decision. It’s also insurance. Every encoded skill is one less thing that breaks when the substrate changes under you.
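Here is a minimal sketch of what "encoded skill first, model second" can look like. It is not Agent Foundry code; the SKILLS registry, the skill decorator, run_step, and the invoice example are hypothetical names I made up for illustration.

```python
import re
from typing import Callable

# Registry of deterministic, testable capabilities, keyed by step name.
SKILLS: dict[str, Callable[[str], str]] = {}


def skill(name: str):
    """Decorator that registers a deterministic skill under a step name."""
    def register(fn: Callable[[str], str]) -> Callable[[str], str]:
        SKILLS[name] = fn
        return fn
    return register


@skill("extract_invoice_total")
def extract_invoice_total(text: str) -> str:
    """Pull a total like 'Total: $1,234.50' out of text without calling a model."""
    match = re.search(r"Total:\s*\$?([\d,]+\.\d{2})", text)
    if match is None:
        raise ValueError("no total found")
    return match.group(1).replace(",", "")


def run_step(step: str, payload: str, model: Callable[[str], str]) -> tuple[str, str]:
    """Prefer an encoded skill; fall back to the model only where structure breaks down."""
    handler = SKILLS.get(step)
    if handler is not None:
        try:
            return "skill", handler(payload)
        except ValueError:
            pass  # the deterministic path could not handle this input
    return "model", model(f"{step}: {payload}")


# A deliberately weak stand-in model: the skill path does not depend on it.
weak_model = lambda prompt: "???"
```

With this shape, run_step("extract_invoice_total", "Total: $1,234.50", weak_model) never consults the model at all, so the step behaves the same whichever model the router lands on.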

Memory: Portable, or It Isn’t Yours

If the loop is where lock-in hides in plain sight, memory is where it hides in the boring infrastructure decisions nobody writes blog posts about.

An agent’s memory is its continuity, the thing that lets it be the same agent across sessions rather than a fresh stranger every time. And notice that the principle here is different from the model. You don’t substitute your memory the way you substitute a model, you don’t pull out your agent’s history and drop in someone else’s. You carry your own. Memory is a portability problem, not a substitutability one, and the question that matters is simple. If you had to move this agent to a different model, in a different country, next week, does its memory come with it, or does it stay behind in someone else’s system?

For a lot of teams the honest answer is that it stays behind, because they let the memory live in a provider’s hosted threads or a managed vector service. That’s convenient right up until it’s a cage. Your agent’s entire history is now sitting in a jurisdiction and a vendor you don’t control, which means it fails both the portability test and, for a regulated Canadian client, the data residency one.

In Agent Foundry the memory is Postgres with pgvector. There’s nothing clever about that and that’s exactly the point. It’s open, it’s self-hostable, it sits in whatever jurisdiction I put it in, and it carries no proprietary gravity, no special API I’d have to unwind to leave. This is one of the few places where the sovereign choice is also the obvious, commodity, slightly boring choice, and the only real discipline is not throwing that away by reaching for something managed and sticky because it saved an afternoon of setup.

There is one seam worth naming, because it got me thinking. Your stored vectors are only as portable as the embedding model that produced them. Swap that model and the old index doesn’t mean the same thing anymore, the geometry shifts under you and your similarity search degrades. So the store is substitutable, the schema is substitutable, but the embedding model is a dependency too, and it probably belongs in your routing table alongside the reasoning models, not treated as a fixed feature of the database. Portable memory means portable embeddings, or it doesn’t really mean portable.
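One way to keep that seam visible is to record which embedding model produced each vector and keep the source text so it can be re-embedded. A minimal in-memory sketch follows; the MemoryRow shape, the helper names, and the stub embedder are my assumptions. In a Postgres and pgvector setup this would presumably be an extra column beside the vector column, but that mapping is also an assumption, not a description of any real schema.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class MemoryRow:
    text: str                  # source text is kept so the row can be re-embedded
    vector: tuple[float, ...]
    embedding_model: str       # which model produced `vector`


def rows_needing_reembed(rows: Sequence[MemoryRow], active_model: str) -> list[MemoryRow]:
    """Vectors from a different embedding model live in a different geometry."""
    return [r for r in rows if r.embedding_model != active_model]


def reembed(
    rows: Sequence[MemoryRow],
    active_model: str,
    embed: Callable[[str], Sequence[float]],
) -> list[MemoryRow]:
    """Return rows where every vector was produced by the active embedding model."""
    migrated = []
    for row in rows:
        if row.embedding_model == active_model:
            migrated.append(row)  # already in the right vector space
        else:
            migrated.append(MemoryRow(row.text, tuple(embed(row.text)), active_model))
    return migrated


# Stub embedder and hypothetical model names, for illustration only.
stub_embed = lambda text: [float(len(text))]
rows = [
    MemoryRow("client prefers email", (0.1, 0.2), "embed-old"),
    MemoryRow("renewal due in March", (21.0,), "embed-new"),
]
migrated = reembed(rows, "embed-new", stub_embed)
```

The design choice that matters is keeping the text next to the vector. Without it, changing the embedding model means losing the memory rather than re-indexing it.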

Identity: What Survives the Swap

The model, the loop, and the memory are the three things you have to be able to change underneath the agent. Identity is the thing that makes all that changing coherent.

Here’s the problem you hit if you take the rest of this post seriously. If you’re routing across models, hosting your own memory, and prepared to move substrate under pressure, then what exactly stays constant? What makes the agent on GLM today the same agent that was on a frontier model yesterday and might be on a self-hosted model tomorrow?

This is why I keep coming back to portable agent identity, and why I registered lorielowell.eth as a real one rather than a thought experiment. An identity anchored to something you control, an ENS name, a DID, an agent card in the shape of something like ERC-8004, is an identity that doesn’t belong to whichever model or cloud the agent happens to be running on this week. It makes the agent an entity rather than a session. And an entity can change what it runs on without ceasing to be itself.

Identity is a portability problem like memory, but it’s also the cleanest example of the whole hierarchy working at once. ERC-8004 is doing double duty here. As a standard it’s interoperability, it’s how other agents and systems recognize you across boundaries. And that very interoperability is what delivers the portability, because an identity that everyone can recognize independently of your host is an identity you can carry to a new host. The open standard underneath is what lets you keep being yourself somewhere else.

Note that I don’t mean ERC-8004 is “the” enterprise answer. It is early, messy, and reputation systems are easy to game. But it points at the right architectural shape: identity outside the model host.

I spent years in Canadian payments and digital identity infrastructure before any of this, so I’ll admit a bias here, identity is a layer I reach for. But the reasoning is structural, not sentimental. Substitutability at the model and the loop, portability at the memory, none of it adds up to an agent that’s actually yours unless there’s a stable identity sitting above all of it to hold the swaps together. Otherwise you haven’t built a movable agent. You’ve built a sequence of disposable ones that happen to share a name.
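The structural claim, that identity sits above the substrate and survives every swap, can be shown with a very small sketch. The Agent and Substrate types, the move helper, and the example identifier and URLs are hypothetical; nothing here implements ENS, DIDs, or ERC-8004.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Substrate:
    model: str       # whatever the router currently selects
    runtime: str     # where the loop executes
    memory_url: str  # where the agent's memory is hosted


@dataclass(frozen=True)
class Agent:
    identity: str    # an identifier the operator controls, e.g. an ENS name or DID
    substrate: Substrate


def move(agent: Agent, **changes: str) -> Agent:
    """Swap parts of the substrate; the identity is never an argument here."""
    return replace(agent, substrate=replace(agent.substrate, **changes))


# Hypothetical values for illustration only.
yesterday = Agent(
    identity="example-agent.eth",
    substrate=Substrate("frontier-large", "cloud-a", "postgres://memory.example/agent"),
)
today = move(yesterday, model="open-weights-large", runtime="self-hosted")
```

Because move only ever touches the substrate, today.identity equals yesterday.identity by construction, and the memory location carries over unless it is changed explicitly.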

A Note on Token Cost, Revisited

In the first post I argued that token cost is a design signal, that an agent burning tokens on something a for-loop could do is telling you that work belongs in code. I still believe that. I just think the signal now points at a second thing as well.

Token cost is certainly a money-and-latency question. But it’s also a sovereignty question. Optimizing hard for one provider’s caching behavior and pricing quirks deepens your dependence on that provider, so the cheapest path and the most locked-in path are often the same path, and that’s a tension worth being awake to. Meanwhile the open-weight models being so much cheaper is precisely what makes self-hosting economically real instead of a principle you can’t afford. The cost signal and the sovereignty signal have started pointing in the same direction, which is convenient, but only if you’re reading both.

Trading Exposures

I’ve spent this whole post making the case for substitutability, and the most available escape hatch from US-provider risk right now is largely Chinese open-weight models. I’m not going to pretend that’s a clean answer.

Trading dependence on a US provider that can be switched off by a government for dependence on models whose provenance, training, and embedded behavior we can’t fully inspect is not the clean win it looks like, it’s a different exposure with a different shape. These models carry their own constraints and their own questions, and pretending otherwise to make my argument tidier would be dishonest.

The thing that dissolves the paradox, for me, is that the goal isn’t to find a model I can finally trust all the way and settle on. The goal is to never need one. If I’m routing across a US frontier model, a Canadian model, a Chinese open-weight model, and something self-hosted, and I can drop any one of them the day it turns into a liability, then no single one of them owns me, and the trust question stops being load-bearing. The exposure doesn’t vanish. It gets spread thin enough that nobody holds the off switch. That’s also why running Cohere alongside everything else matters to me beyond the maple leaf on it, it’s a Canadian-built option inside a portfolio whose entire point is that no option is the only one. Autarky, picking one model from the right country and committing to it, would throw all of that away and land somewhere expensive and mediocre. The win was never the right master. It was not having one.

Toward a Substrate-Aware Architecture

Substrate Aware Agentic Architecture

Pull this together and the reference architecture from March grows a floor it didn’t have. Underneath the execution, reasoning, and agency layers there’s a substrate plane, and on that plane sit four dependencies that each have to be free in their own way for the agent above them to be genuinely yours. A model and a loop you can substitute, swapping in a different one without re-architecting. A memory and an identity you can carry, moving them with you to a new substrate instead of leaving them behind. Interoperability is the open interface that makes all four cheap rather than catastrophic, and sovereignty, the real kind, is just what you have left when no single party can switch any of them off. A model you can route away from. A loop scaffolded enough to survive a weaker one. A memory you host and can carry. An identity that outlives whatever it’s running on.

None of this is exotic. It’s mostly commodity infrastructure and a handful of deliberate choices made the slightly harder way. Postgres instead of a managed service. Encoded skills instead of an improvising loop. A routing table instead of a favorite model. An identity you own instead of an account you rent. Individually they look like ordinary engineering decisions. Together they’re the difference between an agent you operate and an agent you’re merely renting access to.

Which brings me back to where I started, and to the recoil I’ve been carrying since the trusted-partner lists showed up. I don’t think the flat world is over. I think we’re finding out it always had a basement, and the basement has a landlord, and the people who own the valves have remembered they can close them. In enterprise terms, that means model routing, memory ownership, identity portability, and fallback execution paths are no longer implementation details. They are resilience requirements. That’s the unwelcome part. But the answer to a world that’s re-learning how to wall itself off was never to build a smaller wall and call it sovereignty. The answer is to build so that we can keep reaching across every wall anyone puts up, so that no single one of them can trap us on the wrong side. Substitutability and portability are not retreat from the flat world. They’re how a small shop stays flat when the powers above it are busy adding terrain. That’s not the opposite of what Friedman promised. It’s the grown-up version of it, the one that admits the network has owners and architects around them anyway.

In March I wrote that the systems that age well will be the ones where the deterministic and the agentic are cleanly separated. I’d add one more now. The systems that survive will be the ones where the substrate is something you can change. Sovereignty is the word we use. The freedom to move is the thing we mean and want. The seam you can’t see is the one that breaks you, the dependency you don’t control is the one that owns you, and the real design work, this time around, is making sure you can always move the ground.

By the way - I’m working on a platform that addresses the gaps described in this article: model routing across providers, Canadian jurisdiction, and privacy and compliance by default. Find out more here: Canadian Agents

This article was written over a number of days. And yes, Claude, ChatGPT and Gemini helped me pull it all together.
