Employable Agents

You wouldn't hand a new intern the keys to your office without explicit intent, boundaries and controls. Why do it with AI?

2 months ago • 4 min read

By Dr Shaun Conway

Let’s make this real for a second.

Imagine a brilliant new graduate walks into your office. They’re sharp, eager, and clearly talented. They ask for an internship.

Do you immediately hand them the keys to the building, the password to the corporate bank account, and full admin access to your production database?

Of course not. That would be negligent.

You sign an agreement first. You define their role. You give them a temporary badge that only opens the lobby and the staffroom. You check their work before it goes out to a client.

Yet, right now, this is exactly how we are treating AI Agents.

We are so excited that we figured out how to give Agents "tools" (via MCPs and API calls) that we forgot to build the infrastructure of employment.

We’ve solved capability—the intern can do the work. But we haven’t solved coordination—how to ensure they do it safely, correctly, and accountably.

Here is the "HR Department" we need to build for the Agent economy.

The Employment Contract (Interop & Protocols)

When you hire a human, you don't just say "go work." You give them a job description.

In the Agent world, we need more than a system prompt. We need a standard way to define the Job Envelope.

Intent: What are you trying to achieve?
Budget: How much compute/money can you burn?
Settlement: What does "done" look like?
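To make the Job Envelope concrete, here is a minimal sketch in Python. The class name, field names and the example job are illustrative assumptions, not a Qi API or a published standard. The point is only that intent, budget and settlement become explicit, checkable fields rather than sentences in a system prompt.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class JobEnvelope:
    """Hypothetical job description handed to an agent (not a Qi API)."""
    intent: str                          # what the job is trying to achieve
    budget_usd: float                    # hard ceiling on spend for this job
    is_settled: Callable[[dict], bool]   # what "done" looks like, as a check on state


def within_budget(job: JobEnvelope, spent_usd: float) -> bool:
    """True while the agent has not exceeded the envelope's budget."""
    return spent_usd <= job.budget_usd


# Illustrative job: the settlement rule inspects state, not the agent's words.
job = JobEnvelope(
    intent="File the Q3 expense report",
    budget_usd=5.00,
    is_settled=lambda state: state.get("report_status") == "filed",
)
```

Because settlement is a function of state, the same envelope can be checked by whoever hired the agent, not just by the agent itself.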

We are moving from building single "super agents" to building supply chains of specialists. Your "General Manager" agent needs to be able to hire a "Specialist" agent without giving them the keys to the whole castle. That requires a protocol for delegation, not just tool use.

The Shared Office (State as Truth)

If your intern works remotely and tells you on Slack, "I filed those documents," do you believe them? Maybe. But you’d rather check the filing cabinet yourself.

Right now, we have a massive problem with "Chat-driven drift." An agent says in the chat window that it updated a record, but the actual database hasn't changed. The transcript is polite, but the work is fake.

We need to stop treating the chat log as reality. The only thing that matters is Shared State.

In Qi, we view Flow as a state machine.
Agents don't just "talk"; they propose state transitions.
If the state doesn't update, the work didn't happen.
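Here is a rough sketch of that idea, assuming a toy three-state flow. The state names, the `Flow` class and the `propose` method are hypothetical and chosen for illustration; they are not Qi's actual Flow implementation. The agent can say whatever it likes in chat, but only an accepted transition changes the record.

```python
from enum import Enum


class FlowState(Enum):
    DRAFT = "draft"
    SUBMITTED = "submitted"
    APPROVED = "approved"


# Allowed transitions; anything not listed here is rejected.
ALLOWED = {
    FlowState.DRAFT: {FlowState.SUBMITTED},
    FlowState.SUBMITTED: {FlowState.APPROVED, FlowState.DRAFT},
    FlowState.APPROVED: set(),
}


class Flow:
    """Hypothetical shared-state record; the log doubles as an audit trail."""

    def __init__(self) -> None:
        self.state = FlowState.DRAFT
        self.log = []  # entries are (agent_id, from_state, to_state)

    def propose(self, agent_id: str, target: FlowState) -> bool:
        """Apply a proposed transition only if it is allowed from the current state."""
        if target not in ALLOWED[self.state]:
            return False  # rejected: shared state is left unchanged
        self.log.append((agent_id, self.state, target))
        self.state = target
        return True


flow = Flow()
claimed_in_chat = "I approved the document."  # what the transcript says
# DRAFT -> APPROVED is not an allowed transition, so the proposal is rejected.
accepted = flow.propose("intern-agent", FlowState.APPROVED)
work_happened = flow.state is FlowState.APPROVED
```

Here `accepted` and `work_happened` are both `False`: the transcript claims approval, but the state says otherwise, and the state wins.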

**Learn about shared state, audit trails, and verifiable workflows.**

The Access Card (Security & Permissions)

You give the intern a badge that expires in 3 months and only opens specific doors.

In software terms, this is the death of the static API key. You cannot paste a root-access key into an LLM and hope it doesn't get confused (or hacked).

We need Capability Tokens (in Qi, we implement these as UCANs). Instead of "Here is the key to the bank," the agent gets a token that says: "You are authorised to spend up to $50 on Server Costs, valid only for this specific session."
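A simplified sketch of what such a token checks is below. This is not the UCAN format: real UCANs are cryptographically signed and carry a delegation chain, none of which is modelled here. The class, field names and helper functions are assumptions made for illustration, and the check is per request; tracking cumulative spend would need shared state.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CapabilityToken:
    """Simplified, hypothetical capability token (not the UCAN wire format)."""
    resource: str         # what the token applies to, e.g. "server-costs"
    max_spend_usd: float  # spending ceiling granted by this token
    session_id: str       # the only session in which the token is valid
    expires_at: float     # Unix timestamp after which the token is void


def authorises(token: CapabilityToken, resource: str, amount_usd: float,
               session_id: str, now: Optional[float] = None) -> bool:
    """Check a single request against the token's scope, limit, session and expiry."""
    now = time.time() if now is None else now
    return (
        token.resource == resource
        and amount_usd <= token.max_spend_usd
        and token.session_id == session_id
        and now < token.expires_at
    )


def attenuate(parent: CapabilityToken, max_spend_usd: float,
              expires_at: float) -> CapabilityToken:
    """Derive a narrower token for a sub-agent; it can never exceed the parent."""
    return CapabilityToken(
        resource=parent.resource,
        max_spend_usd=min(max_spend_usd, parent.max_spend_usd),
        session_id=parent.session_id,
        expires_at=min(expires_at, parent.expires_at),
    )


# Illustrative values only; timestamps are small numbers to keep the example readable.
token = CapabilityToken("server-costs", 50.0, "session-123", expires_at=1_000.0)
sub_token = attenuate(token, max_spend_usd=100.0, expires_at=5_000.0)
```

Note that `attenuate` clamps the request: asking for $100 from a $50 parent still yields a $50 token. That is how a "General Manager" agent could hire a "Specialist" without handing over more than it holds.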

This also means installing Firebreaks. There are moments in every workflow—before money moves or code ships—where the system must pause and demand a signature. It could be a human manager, or a trusted independent oracle. But the intern doesn't get to push to production on Day 1.
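A Firebreak can be sketched as a gate in front of the risky actions. The action names, the exception and the `approved_by` parameter below are hypothetical; a real system would verify an actual signature rather than trust a string.

```python
from typing import Callable, Optional

# Hypothetical set of actions that must never run without a sign-off.
FIREBREAK_ACTIONS = {"move_money", "deploy_to_production"}


class ApprovalRequired(Exception):
    """Raised when a gated action is attempted without an approver."""


def run_action(action: str, execute: Callable[[], str],
               approved_by: Optional[str] = None) -> str:
    """Run the action, pausing at the firebreak unless someone has signed off."""
    if action in FIREBREAK_ACTIONS and approved_by is None:
        raise ApprovalRequired(f"'{action}' needs a signature before it can run")
    return execute()


# Low-risk work runs straight through.
summary = run_action("summarise_report", lambda: "summary written")

# Gated work stops until a manager (or oracle) signs off.
try:
    run_action("deploy_to_production", lambda: "shipped")
    blocked = False
except ApprovalRequired:
    blocked = True

shipped = run_action("deploy_to_production", lambda: "shipped", approved_by="manager")
```

The design choice is that the gate lives in the system, outside the agent: the intern cannot talk its way past it.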

The Performance Review (Outcome Evals)

How do you grade the intern? Do you grade them on how eloquent they were in the interview? Or do you grade them on whether the spreadsheet balanced?

We are currently grading Agents on their "eloquence": evaluating their reasoning and tone. Those are vanity metrics.

We need Outcome-Grounded Evals.

Did the file move?
Did the verifiable credential issue?
Did the payment settle?

Every Skill needs a definition of success that isn't text. It needs an evaluation rubric that must be passed before the system issues a signed digital receipt, the Universal Decision and Impact Determination (UDID), proving the outcome.
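As a rough illustration, the sketch below runs outcome checks against state and only then issues a signed receipt. Everything here is an assumption: the receipt layout is not the UDID format, the check names are invented, and an HMAC with a shared demo key stands in for whatever signature scheme a real system would use.

```python
import hashlib
import hmac
import json
from typing import Callable, Dict, Optional

# Demo-only shared secret; a real system would use a proper signing key.
SIGNING_KEY = b"demo-key-not-for-production"


def issue_receipt(skill: str, state: dict,
                  checks: Dict[str, Callable[[dict], bool]]) -> Optional[dict]:
    """Return a signed receipt only if every outcome check passes against state."""
    results = {name: bool(check(state)) for name, check in checks.items()}
    if not checks or not all(results.values()):
        return None  # rubric not passed (or empty): no receipt is issued
    payload = json.dumps({"skill": skill, "results": results}, sort_keys=True)
    signature = hmac.new(SIGNING_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return {"payload": payload, "signature": signature}


def verify_receipt(receipt: dict) -> bool:
    """Recompute the signature and compare it in constant time."""
    expected = hmac.new(SIGNING_KEY, receipt["payload"].encode(),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, receipt["signature"])


# Hypothetical rubric: each check inspects state, never the agent's transcript.
checks = {
    "file_moved": lambda s: s.get("file_location") == "archive/",
    "payment_settled": lambda s: s.get("payment_status") == "settled",
}

good = issue_receipt("close-invoice", {"file_location": "archive/",
                                       "payment_status": "settled"}, checks)
bad = issue_receipt("close-invoice", {"file_location": "inbox/",
                                      "payment_status": "settled"}, checks)
```

In this example `good` is a receipt that verifies, while `bad` is `None` because the file never moved. No amount of eloquence in the transcript changes that result.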

The Shift from "Can" to "Verified"

The first phase of this AI wave was about abilities: "Look, the AI can use a calculator! It can write code!"

The next phase—the one that actually matters for enterprise—is about Guarantees.

Guarantees that the agent stayed within budget.
Guarantees that the data wasn't leaked.
Guarantees that the outcome is real.

We aren't just building a library of tools. We are building the systems that allow you to "hire" these digital workers as synthetic labour, without risking the project or company.

Are you still building toys for the intern to play with? Or are you building the office where the real work gets done?