Something has been nagging at me since I built the AgentVista prototype.

I built it as a capability proxy: a single endpoint where autonomous agents can acquire real-world capabilities at runtime. Send an SMS. Host a file. Create a payment link. No pre-configured credentials. No developer account. Pay with a crypto wallet, get a result.

The architecture felt right. Then I started pulling on a thread that complicated things.

Will this layer need to exist independently? Or will the agent platforms just absorb it?


The Bake-It-In Path

Look at what’s happening right now. Perplexity Computer launched this week with 19 integrated AI models, persistent memory, and hundreds of connectors, all under one roof. OpenClaw has a community skills library that keeps growing. Every major platform is racing to become the complete environment for agentic work: reasoning, execution, and capabilities bundled together.

The logic is obvious. The more you bake in, the stickier the product. Users don’t want to think about which SMS provider has the best Canadian delivery rates. They want to tell their agent to send a message and have it work. So platforms integrate Twilio, or build their own equivalent, and capability access becomes a feature of the subscription.

This is the walled garden path. It’s probably where most enterprise deployments end up in the near term. IT departments want managed environments, known boundaries, and auditability. Perplexity Computer’s controlled cloud model is explicitly designed for that pitch.

The problem is the same as it’s always been with walled gardens: capability catalogs get defined by platform commercial relationships, not by what agents actually need. You get the integrations the platform bothered to build. And every agent on that platform hits the same ceiling at the same time.


The Agentic AWS Path

Here’s the more interesting question: does something different emerge? Not a capability feature of an agent platform, but a capability infrastructure layer that any agent on any platform can call into?

This model already exists for one capability class. fal.ai, Replicate, and Runware are all essentially generative AI proxies: one endpoint, many underlying models, pay per execution, no lock-in to a specific provider. Need Flux for image generation? Stable Diffusion? A fine-tuned model on Replicate? You call one API and they route it. The abstraction layer is the product.
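To make the routing idea concrete, here is a minimal sketch of the proxy pattern: one entry point, several interchangeable backends, and a per-execution charge. Everything in it is hypothetical. The backend functions are local stubs, the prices are made up, and none of it reflects the actual APIs of fal.ai, Replicate, or Runware.

```python
from dataclasses import dataclass
from typing import Callable, Dict


# Hypothetical backends: local stand-ins, not real provider calls.
def flux_backend(prompt: str) -> str:
    return f"[flux image for: {prompt}]"


def sd_backend(prompt: str) -> str:
    return f"[stable-diffusion image for: {prompt}]"


@dataclass
class Route:
    handler: Callable[[str], str]  # function that fulfils the request
    price_usd: float               # illustrative per-execution price


class ModelProxy:
    """One endpoint that routes a request to whichever backend is named."""

    def __init__(self) -> None:
        self._routes: Dict[str, Route] = {}

    def register(self, model: str, route: Route) -> None:
        self._routes[model] = route

    def run(self, model: str, prompt: str) -> dict:
        if model not in self._routes:
            raise KeyError(f"unknown model: {model}")
        route = self._routes[model]
        return {
            "model": model,
            "output": route.handler(prompt),
            "charged_usd": route.price_usd,
        }


proxy = ModelProxy()
proxy.register("flux", Route(flux_backend, 0.03))
proxy.register("stable-diffusion", Route(sd_backend, 0.01))

result = proxy.run("flux", "a lighthouse at dusk")
```

The caller only ever sees `run`. Swapping a provider means changing one `register` call, which is the whole appeal of the abstraction.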

That’s exactly the model. It just covers only generative capabilities. Nobody has done the same thing for operational capabilities: SMS, email, file hosting, sandboxed compute, and payment rails. The genAI aggregators proved the proxy model works. The gap is everything that isn’t generative.

Think about what AWS actually is. Not a product, a primitives marketplace. S3 doesn’t care what language your application is written in. Lambda doesn’t care which framework you’re using. The value is that every service is composable, available to every application, and the infrastructure is neutral to whatever’s running above it.

An agentic equivalent would look like capability primitives (send_sms, execute_python, host_file, geocode, generate_image) exposed through a standard interface that OpenClaw agents, Perplexity Computer agents, Claude Code agents, and agents that don’t exist yet all call into. Pay per execution. No platform lock-in.
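One way to picture that standard interface is a small catalog that any agent can list and invoke through the same envelope, regardless of which platform it runs on. The sketch below is an assumption about what such an interface could look like, not a description of AgentVista or of any existing service. The `capability` decorator, the `describe` and `invoke` helpers, the parameter lists, and the prices are all invented for illustration, and the primitives are stubs that contact no real provider.

```python
from typing import Any, Callable, Dict, List

# Hypothetical in-memory catalog: capability name -> metadata and handler.
CAPABILITIES: Dict[str, dict] = {}


def capability(name: str, params: List[str], price_usd: float) -> Callable:
    """Register a function as a callable capability primitive."""
    def wrap(fn: Callable) -> Callable:
        CAPABILITIES[name] = {"fn": fn, "params": params, "price_usd": price_usd}
        return fn
    return wrap


@capability("send_sms", params=["to", "body"], price_usd=0.02)
def send_sms(to: str, body: str) -> dict:
    # Stub: a real implementation would hand off to an SMS provider here.
    return {"status": "queued", "to": to, "chars": len(body)}


@capability("host_file", params=["filename", "content"], price_usd=0.001)
def host_file(filename: str, content: str) -> dict:
    # Stub: a real implementation would write to object storage here.
    return {"status": "stored", "filename": filename,
            "bytes": len(content.encode("utf-8"))}


def describe() -> List[dict]:
    """Return the catalog so an agent can discover what it may call."""
    return [
        {"name": name, "params": cap["params"], "price_usd": cap["price_usd"]}
        for name, cap in sorted(CAPABILITIES.items())
    ]


def invoke(name: str, args: Dict[str, Any]) -> dict:
    """Uniform call envelope: same request and response shape for every primitive."""
    cap = CAPABILITIES.get(name)
    if cap is None:
        return {"ok": False, "error": f"unknown capability: {name}"}
    missing = [p for p in cap["params"] if p not in args]
    if missing:
        return {"ok": False, "error": f"missing params: {missing}"}
    # Pass only the declared parameters through to the handler.
    result = cap["fn"](**{p: args[p] for p in cap["params"]})
    return {"ok": True, "result": result, "charged_usd": cap["price_usd"]}


response = invoke("send_sms", {"to": "+15550100", "body": "hello"})
```

Nothing in `invoke` knows or cares which agent framework produced the request, which is the neutrality property the AWS comparison hinges on.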

This requires the capability layer to be genuinely neutral. And here’s the uncomfortable part of the AWS analogy: AWS worked because Amazon’s primary business was retail. They had no incentive to lock you into a particular framework. Every major agent platform right now IS a framework company. Their incentive runs exactly the wrong direction.

So the neutral infrastructure layer either comes from a startup with no framework to protect, or from the existing cloud providers extending what they already have. And this is the part that doesn’t get said enough: AWS, Azure, and GCP are already most of the way there. SNS does SMS. S3 does file hosting. Lambda does sandboxed execution. They have the underlying services, the billing infrastructure, the enterprise relationships, and, critically, the neutrality. An “AWS for agents” that routes capability requests through their own infrastructure isn’t a stretch; it’s an obvious product extension. They just haven’t framed it that way yet.

If that happens, the architecture I’ve been thinking through turns out to be exactly right. Built by Amazon.


The Third Path Nobody’s Talking About

There’s a scenario underneath both of the above.

What if the capability layer disappears into the model itself?

The trend in foundation models is more native tooling with every release. Web search went native. Code execution went native. Image generation went native. The boundary between “in the model” and “called by the model” keeps shifting inward. If next year’s models natively support sending an SMS, the same way they natively support web search today, then a capability proxy becomes redundant. The model is the capability endpoint.

This is the real question for anything positioned as a proxy layer, and it’s worth being honest about rather than pretending the risk doesn’t exist.

My read: it changes the shape of the opportunity rather than eliminating it. Even if models get native SMS capability, the questions of who authorized that SMS, under whose verified identity, against what budget, with what audit trail don’t go away. The accountability and identity layer remains necessary regardless of where the capability primitives live.
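Those four questions (who authorized it, under whose identity, against what budget, with what audit trail) can be sketched as a check that runs before any capability executes, wherever the capability itself lives. This is a minimal illustration under my own assumptions. The `Grant` and `AuditLog` types, the field names, and the use of integer cents are hypothetical choices, not an existing standard or the prototype's actual design.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Set


@dataclass
class Grant:
    """What a verified principal has consented to for one agent (hypothetical shape)."""
    principal: str      # verified human or organisation behind the agent
    agent_id: str
    allowed: Set[str]   # capabilities the principal consented to
    budget_cents: int   # total spend authorized
    spent_cents: int = 0


@dataclass
class AuditLog:
    entries: List[dict] = field(default_factory=list)

    def record(self, grant: Grant, capability: str, cost_cents: int, decision: str) -> None:
        # Every decision is attributed to both the agent and its principal.
        self.entries.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "principal": grant.principal,
            "agent_id": grant.agent_id,
            "capability": capability,
            "cost_cents": cost_cents,
            "decision": decision,
        })


def authorize(grant: Grant, capability: str, cost_cents: int, log: AuditLog) -> bool:
    """Check consent and budget, then log the outcome whether allowed or denied."""
    if capability not in grant.allowed:
        decision = "denied:not_consented"
    elif grant.spent_cents + cost_cents > grant.budget_cents:
        decision = "denied:over_budget"
    else:
        grant.spent_cents += cost_cents
        decision = "allowed"
    log.record(grant, capability, cost_cents, decision)
    return decision == "allowed"


log = AuditLog()
grant = Grant(principal="alice@example.com", agent_id="agent-1",
              allowed={"send_sms"}, budget_cents=5)

first = authorize(grant, "send_sms", 2, log)    # within consent and budget
second = authorize(grant, "send_sms", 2, log)   # still within budget
third = authorize(grant, "send_sms", 2, log)    # would exceed the 5 cent budget
fourth = authorize(grant, "host_file", 1, log)  # never consented to
```

Denials are logged as carefully as approvals, since the record of what an agent tried to do is part of the accountability story too.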

Which means the moat for a capability proxy isn’t the capabilities themselves. It’s the trust infrastructure: the identity framework, the attribution model, and the record of what was consented to and by whom.


Where This Leaves Things

Agent platforms will bake in capabilities for their own ecosystems. Cloud providers will probably build neutral infrastructure layers that work across platforms. Model providers will keep expanding what’s natively available. All three of those things can be true simultaneously and probably will be.

The piece that doesn’t get built by default by any of those players is the identity and accountability layer for autonomous agents operating across platforms. Who verified the human behind this agent. What they consented to. What budget they authorized. What every consequential action was attributed to.

That’s not a capability problem. It’s a trust infrastructure problem. And trust infrastructure, historically, doesn’t get built by the platforms it governs.