What Does an AI Agent Actually Need?

I’ve been spending the last few days thinking my way through a prototype called AgentVista. I say “thinking my way through” because that’s really what it was — less coding, more chasing a question that kept evolving.

It started simple: what would a directory of MCP servers look like for agents?

It ended somewhere more interesting: a permissionless capability proxy, USDC micro-payments, and a conversation about how existing digital identity infrastructure — built for payments — might be exactly what autonomous agents need for accountability.

But I need to back up, because I also got some things wrong along the way. And the wrong turns are actually the interesting part.

I Was Wrong About the Problem

My first instinct was that the core friction for agents using external capabilities was the install problem. An agent discovers it needs to search the web. It finds a Brave Search MCP server in a directory. Now what — it has to install something? Mid-task?

Except that’s not really true anymore. Modern agentic runtimes — Claude Code, OpenClaw, and others — can write Python, install packages, execute shell commands. A capable agent with filesystem access can bootstrap new tooling mid-task. The install problem is largely solved.

The real problem is the line that comes right after install.

The agent installs the Twilio SDK. Now it needs an API key. Where does that come from? A human had to create a Twilio account. A human had to agree to terms of service. A human had to set up billing, generate the key, and inject it into the agent’s environment — and all of that had to happen before the session started. Five human actions, per service. And it’s not just Twilio. Brave Search requires its own account. ElevenLabs, Stripe, Clearbit, Mapbox — each one is a separate credential relationship, a separate billing setup, a separate key rotation headache.

This isn’t a niche problem. A Cloud Security Alliance survey of 285 IT and security professionals found that organizations are genuinely struggling to move agentic programs from experimentation into production because of it. 44% of teams are using static API keys shared across agents. Another 35% depend on shared service accounts. Only 18% of security leaders feel confident their systems can handle agent identities properly, and only 23% have a formal enterprise-wide strategy for agent identity management at all. The industry has a name for this: the identity crisis.

The Solutions That Exist (And the Gap They Leave)

Once I framed it as a credential problem rather than an install problem, I found companies already working on it. Composio is probably the most complete — managed OAuth flows, token storage, automatic refresh, 500+ app integrations. They literally call it “the Authentication Wall” and have built a real business around helping developers get past it. Nango and Auth0’s Token Vault are in the same space.

These are good solutions. But they share an assumption that limits them.

Every one of them requires a developer who has already built a product on top of the platform. A developer has a Composio account. A developer has configured which integrations their product supports. An end user might go through an OAuth flow, but a developer had to construct that experience first. The agent operates within a pre-built product wrapper, not autonomously.

For that use case — building a product where users connect their accounts and an agent acts on their behalf — Composio works well. But that’s not the only use case that matters.

What about a truly autonomous agent with no developer-built product underneath it? What about an experimental agent a solo developer spins up at 11pm to try something? What about agents that discover capability needs mid-task that nobody anticipated when the product was designed?

The gap isn’t credential management. It’s permissionless access — the ability for an agent to acquire and use a capability without any pre-existing account relationship on either side.

The Proxy Model

The architecture I landed on: don’t build a directory, don’t build another credential management layer. Build a proxy with a permissionless payment rail.

The agent holds a USDC wallet. It calls one endpoint, describes what it needs in natural language, and a micro-payment goes with the request. AgentVista — the proxy — holds all the downstream credentials. The Twilio account. The Brave Search key. The ElevenLabs API. The Stripe integration. It routes the request to the right service, executes it, normalizes the output, and returns a result plus a receipt.

The agent never sees an API key. It never manages auth. It never knows which underlying provider handled the request. It just describes a need and gets a result.

No developer pre-configuration required. No OAuth setup. No platform account. The entire credential layer is invisible, and access is gated only by the wallet balance and whatever identity tier the user has verified.
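To make the shape concrete, here is a minimal sketch of the proxy's request path. Everything in it is hypothetical: the `CapabilityProxy` class, the stub handlers, and the keyword routing (a crude stand-in for whatever natural-language matching a real proxy would use) are assumptions for illustration, not AgentVista's implementation. No real provider is called, and the credential values are placeholders.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Result:
    capability: str  # catalog entry that handled the request
    output: str      # normalized output returned to the agent
    cost_usd: float  # price charged for this call


class CapabilityProxy:
    """Hypothetical proxy: holds downstream credentials, exposes one entry point."""

    def __init__(self) -> None:
        # Credentials live only inside the proxy (placeholder values).
        self._credentials: Dict[str, str] = {
            "web_search": "proxy-held-search-key",
            "sms": "proxy-held-sms-key",
        }
        # keyword -> (capability name, price in USD, handler)
        self._routes = {
            "search": ("web_search", 0.002, self._search),
            "sms": ("sms", 0.008, self._sms),
        }

    def _search(self, need: str, key: str) -> str:
        # A real handler would call the provider using `key`; this is a stub.
        return f"stub search results for: {need}"

    def _sms(self, need: str, key: str) -> str:
        return "stub sms queued"

    def execute(self, need: str) -> Result:
        """Route a natural-language need to a capability and run it."""
        lowered = need.lower()
        for keyword, (name, price, handler) in self._routes.items():
            if keyword in lowered:
                output = handler(need, self._credentials[name])
                return Result(name, output, price)
        raise LookupError(f"no capability matches: {need!r}")


proxy = CapabilityProxy()
result = proxy.execute("Search the web for USDC on Base fees")
print(result.capability, result.cost_usd)
```

The point of the sketch is the boundary: the caller passes a description and gets a `Result` back, while the credential dictionary never leaves the proxy object.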

The Stripe analogy is useful here. Stripe doesn’t require merchants to have a pre-existing relationship with Visa before processing their first payment. The abstraction is the product. AgentVista is that abstraction for agent capabilities — and the permissionless model is the thing Composio doesn’t do.

What’s Actually in the Catalog

If you’re building a capability proxy, you have to be honest about what belongs in it. The existing MCP directories are full of local tools — filesystem, GitHub, SQLite, database connectors. Useful for developer agents running in a configured environment. Not useful for an autonomous agent that needs to do something in the world.

The right question is: what would a capable human assistant be able to do in five minutes that an agent currently can’t do without pre-configured credentials?

Send an SMS. Make a phone call. Generate audio from text. Download a YouTube video or extract its transcript. Host a file and return a public URL so a human can actually receive it. Create a Stripe payment link. Run a Python script in a sandbox. Geocode an address. Validate that a contact is real.

These aren’t exotic. They’re the connective tissue between reasoning and action. And right now, every single one of them requires credential relationships that most agents simply don’t have unless a developer set them up in advance.

The catalog that matters isn’t “connect to your local database.” It’s “send this,” “host this,” “convert this,” “generate this.” Real-world actions, available on demand.

Why the Payment Rail Has to Be Crypto

Here’s the thing about permissionless access: it only works if the payment model is also permissionless.

A credit card requires a human account. Pre-loaded credits require a human to sign up and deposit. Both reintroduce the same dependency problem through the billing door. You’ve solved the credential wall and immediately erected a payment wall in its place.

A crypto wallet doesn’t have this problem. An agent can hold a USDC wallet without anyone’s permission. No account. No signup. The developer funds the wallet once and sets a budget cap, and from then on the agent spends from it per execution with no human in the loop. Entirely machine-to-machine.
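The budget cap is easy to picture as bookkeeping. The sketch below is an in-memory stand-in built around a hypothetical `AgentWallet` class; it does not touch a chain or any real USDC contract. Amounts are tracked as integer millionths of a dollar, which avoids floating-point drift and happens to match USDC's six decimal places.

```python
class BudgetExceeded(Exception):
    """Raised when a spend would break the cap or overdraw the wallet."""


class AgentWallet:
    """In-memory stand-in for a funded wallet with a developer-set budget cap."""

    def __init__(self, balance_micro: int, cap_micro: int) -> None:
        self.balance_micro = balance_micro  # funds available, in millionths of a dollar
        self.cap_micro = cap_micro          # most the agent may spend in total
        self.spent_micro = 0                # running total of what it has spent

    def spend(self, amount_micro: int) -> int:
        """Debit one capability call and return the remaining balance."""
        if amount_micro <= 0:
            raise ValueError("amount must be positive")
        if self.spent_micro + amount_micro > self.cap_micro:
            raise BudgetExceeded("budget cap reached")
        if amount_micro > self.balance_micro:
            raise BudgetExceeded("insufficient balance")
        self.balance_micro -= amount_micro
        self.spent_micro += amount_micro
        return self.balance_micro


# $1.00 funded, $0.50 cap, then one $0.002 web search.
wallet = AgentWallet(balance_micro=1_000_000, cap_micro=500_000)
remaining = wallet.spend(2_000)
print(remaining)
```

Checking the cap before the balance means a well-funded wallet still stops at the limit the developer chose, which is the property that makes it safe to hand to an agent.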

This is one of those rare cases where the actual properties of cryptocurrency — permissionless, programmable, no issuing authority required — match a real use case rather than being applied by analogy. USDC on Base is the practical choice: sub-cent transaction fees, stable value, fast confirmation.

The economics also make sense at agent scale. Web search: $0.002. SMS: $0.008. Browser automation: $0.010. An agent running a hundred capability calls in a session might spend $0.50. That’s a rounding error compared to what the LLM inference costs for the same session.
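As a rough check on that figure, here is the arithmetic for one hypothetical session. The per-call prices are the ones quoted above; the mix of calls (60 searches, 20 SMS messages, 20 browser automations) is an assumption chosen purely for illustration.

```python
from decimal import Decimal

# Per-call prices quoted in the text, in USD.
PRICES = {
    "web_search": Decimal("0.002"),
    "sms": Decimal("0.008"),
    "browser_automation": Decimal("0.010"),
}

# Assumed mix of 100 calls in one session (illustrative only).
session_calls = {"web_search": 60, "sms": 20, "browser_automation": 20}

total = sum(PRICES[name] * count for name, count in session_calls.items())
call_count = sum(session_calls.values())
print(f"{call_count} calls cost ${total:.2f}")  # 100 calls cost $0.48
```

A search-heavy session would come in lower and an automation-heavy one higher, but a hundred calls stays in the range of tens of cents either way.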

The Identity Problem I Didn’t See Coming

Once the proxy model clicked, I thought I was done. One endpoint, one wallet, describe a need, get a result. Clean.

Then I started thinking through what’s actually in the capability catalog. Sending SMS. Making phone calls. Creating payment links. And a problem became obvious.

You can’t let a fully anonymous agent send bulk SMS messages. That’s spam. Potentially illegal under CASL or TCPA depending on jurisdiction. You can’t let an anonymous agent create payment links, because someone receiving that link has a reasonable expectation that a real accountable party is behind it. The permissionless model is great until the agent does something consequential, and then you need to know who is responsible.

This is actually a broader problem in the agent ecosystem right now. With 44% of teams sharing static API keys across agents and another 35% depending on shared service accounts, there’s essentially no accountability chain. If an agent misbehaves, you know it used service-api-key-prod. You don’t know which session triggered it, which user authorized it, or who bears responsibility.

The consequences are already showing up in real numbers. SailPoint’s research found that 80% of organizations report AI agents have already taken unintended actions: 39% accessed unauthorized systems, and 33% accessed sensitive data they shouldn’t have. And 23% report their agents were tricked into revealing access credentials. Gravitee’s 2026 State of AI Agent Security report found 88% of organizations had confirmed or suspected AI agent security incidents within the past year.

The audit trail problem is just as bad. The same Cloud Security Alliance survey found only 28% of organizations can reliably trace agent actions back to a human sponsor across all environments. Only 21% maintain a real-time inventory of active agents, meaning most organizations don’t even know how many agents are running at any given moment. 84% doubt their ability to pass a compliance audit focused on agent behavior. Gartner predicts that by 2028, 25% of enterprise breaches will be traced back to AI agent abuse.

That’s not a security posture you can defend to a regulator, and regulators are coming.

The solution I landed on is tiered identity, where what you’ve verified determines what your agent can do.

Level 1 — wallet signature. Anonymous. Sufficient for read-only, non-consequential operations. Web search, fetch a URL, check the weather. Nobody gets contacted, nothing moves.

Level 2 — OTP verification. Phone or email. You’ve proven you control a real contact point. Now the agent can send communications on your behalf — email, SMS, webhooks.

Level 3 — government-grade KYC. A verified identity credential, issued by a bank or government identity provider, asserting that you are a real, verified person in a real jurisdiction. The credential can travel with an agent session. The agent acts under your verified identity, and every consequential action — payment links, voice calls, financial transactions — is attributed to you.
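The three levels above can be read as a simple policy table. The sketch below assumes a hypothetical mapping from capability names to the minimum tier each requires. The capability identifiers and the `is_allowed` helper are illustrative; only the tier assignments come from the description above.

```python
from enum import IntEnum


class Tier(IntEnum):
    WALLET = 1  # Level 1: wallet signature, anonymous
    OTP = 2     # Level 2: phone or email verified
    KYC = 3     # Level 3: verified identity credential


# Minimum tier per capability (names are hypothetical catalog identifiers).
REQUIRED_TIER = {
    "web_search": Tier.WALLET,
    "fetch_url": Tier.WALLET,
    "weather": Tier.WALLET,
    "email": Tier.OTP,
    "sms": Tier.OTP,
    "webhook": Tier.OTP,
    "payment_link": Tier.KYC,
    "voice_call": Tier.KYC,
}


def is_allowed(capability: str, verified: Tier) -> bool:
    """Return True if the session's verified tier covers the capability."""
    required = REQUIRED_TIER.get(capability)
    if required is None:
        return False  # unknown capabilities are denied by default
    return verified >= required


print(is_allowed("sms", Tier.WALLET))        # False
print(is_allowed("payment_link", Tier.KYC))  # True
```

Using an ordered enum keeps the rule to a single comparison: a higher tier automatically includes everything the lower tiers permit.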

The liability shift matters enormously. Without verified identity, a capability proxy bears responsibility for everything agents do through it. With Level 3 identity, liability shifts back to the verified individual. The proxy becomes infrastructure, not the responsible party — the same model underlying how digital payment rails work. The rail is not responsible for the transaction. The verified account holder is.

What I find genuinely elegant about this is that it’s a cleaner model of human oversight than what most people propose for AI agents. Instead of “a human must approve every action,” you get “a human verifies their identity once, sets a budget cap, and the agent operates within those economic and identity constraints autonomously.” The human is in the loop exactly twice. Everything in between is genuinely autonomous and fully attributed.

Where This Landed

AgentVista, as I’ve been building it, is a verified capability execution proxy with a permissionless payment rail.

One endpoint. One session token that encodes your identity tier and points to a funded USDC wallet. The agent describes what it needs in natural language. Gets a normalized result back, plus a receipt: transaction hash, cost, balance remaining. The entire credential layer — every downstream API key, every OAuth token, every billing relationship — is managed by the proxy and invisible to the agent.
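One way to picture the session token and the receipt is as two small records. The field names below follow the description above, but the HMAC-signed token format, the `sign_token` and `verify_token` helpers, the demo secret, and the placeholder wallet address and transaction hash are all assumptions for illustration, not a claim about how AgentVista encodes sessions.

```python
import base64
import hashlib
import hmac
import json
from dataclasses import dataclass

SECRET = b"demo-secret-not-for-production"  # hypothetical proxy-side signing key


def sign_token(identity_tier: int, wallet_address: str) -> str:
    """Encode the identity tier and wallet pointer, then sign them."""
    payload = json.dumps(
        {"tier": identity_tier, "wallet": wallet_address}, sort_keys=True
    ).encode()
    signature = hmac.new(SECRET, payload, hashlib.sha256).digest()
    return (
        base64.urlsafe_b64encode(payload).decode()
        + "."
        + base64.urlsafe_b64encode(signature).decode()
    )


def verify_token(token: str) -> dict:
    """Return the token's claims, or raise ValueError if it was altered."""
    payload_b64, signature_b64 = token.split(".")
    payload = base64.urlsafe_b64decode(payload_b64)
    expected = hmac.new(SECRET, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, base64.urlsafe_b64decode(signature_b64)):
        raise ValueError("bad signature")
    return json.loads(payload)


@dataclass(frozen=True)
class Receipt:
    tx_hash: str                # transaction hash of the micro-payment
    cost_usd: str               # what this call cost
    balance_remaining_usd: str  # wallet balance after the call


token = sign_token(identity_tier=2, wallet_address="0xDEMO")
claims = verify_token(token)
receipt = Receipt(tx_hash="0xPLACEHOLDER", cost_usd="0.008", balance_remaining_usd="0.992")
print(claims["tier"], receipt.cost_usd)
```

Signing the tier into the token means the agent cannot upgrade its own identity level by editing the payload; the proxy rejects any token whose signature no longer matches.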

The thing that makes it different from Composio isn’t the capabilities or the routing. It’s that no developer pre-configuration is required. Any agent with a funded wallet and a session token can access any capability in the catalog, subject only to what identity tier has been verified. Permissionless.

Is this a real product? Not yet. There’s real work to do — regulatory compliance for communications capabilities, identity provider integrations, the actual infrastructure for holding and managing all those downstream credentials securely. None of that is trivial.

But I think the architecture is right. And I think the specific gap it’s filling — permissionless capability access for autonomous agents, with a verifiable identity and accountability model — is real and not currently addressed by what’s out there.

The credential wall is the thing keeping agents from being genuinely autonomous. Solving it for developer-built products is valuable. Solving it permissionlessly, for any agent, anywhere, is a different thing entirely.

Still figuring out what to do with this. If you’re building agents and hitting exactly this wall — or if you’re thinking about agent identity and accountability and have thoughts on the regulatory angle — I’d genuinely like to talk.
