The Concept: Your Workstation as a Delegated Operator

Part 1 of 9 — what we are actually building, and why the framing matters more than the technology.

The shift in framing

Most people meet a coding agent as a conversational tool: you ask it a question, it answers; you ask it to write code, it writes code. Useful, but it keeps the agent on the far side of a glass wall. Nothing it says touches the real world unless you copy, paste, and click.

This guide is about removing the glass wall in a controlled way. The reframing is simple:

The agent is not a chatbot you consult. It is an operator you delegate to — a junior colleague who sits at your workstation, signed in as you, and does real work on real systems by running the same kinds of commands you would.

That word — operator — is doing a lot of work. An operator in the industrial sense is someone trusted to run a machine: they have a console, a set of levers, a manual, and standing orders about what they may and may not do without checking with a supervisor. We are going to build exactly that, where the levers are command-line tools and the manual is a file the agent reads before it starts.

The four ingredients

Every operator setup in this guide is made of four things. The rest of the series is just these four, elaborated:

Ingredient	What it is	Who owns it
External system(s)	The thing with the work in it — a marketplace, a CRM, a ticketing system, an ERP. It exposes an API.	A third party (or your own org).
Tool belt	A set of small, single-purpose CLI scripts wrapping that API. Each is documented and runs as you.	You / your team.
The agent	Claude Code, Codex, Pi, or similar — the reasoning layer that decides which tools to run, in what order, and reads their output.	You (running on your workstation).
`soul.md` + skills	Persistent instructions that give the agent its role, judgment, guardrails, and reusable procedures.	You / your team, shared and versioned.

Notice where the intelligence lives. The tools are dumb and deterministic: each does one thing and reports what happened. The agent is smart but forgetful and non-deterministic. The `soul.md` file holds the durable judgment that survives between sessions and improves as you revise it. Keeping these three concerns (deterministic tools, a reasoning agent, and written judgment) separate is the single most important design decision in the whole pattern, and we return to it constantly.

The four ingredients: you delegate to the agent; the agent reads its judgment from soul.md and acts only through the tool belt, which talks to the external system as you.
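To make "small, single-purpose, dumb and deterministic" concrete, here is a minimal sketch of what one tool on the belt might look like. The tool name and the `--status` flag are borrowed from the transcript later in this part; everything else is an assumption for illustration. In particular, `FAKE_ORDERS` is a hard-coded stand-in for the real API call, and the one-JSON-object-per-line output format is just one reasonable choice, not something this guide prescribes.

```python
import argparse
import json

# Hypothetical stand-in for the external system. A real tool would call the
# system's API here, authenticated as you; no real API is used in this sketch.
FAKE_ORDERS = [
    {"id": "A-1001", "status": "awaiting-shipment", "total": 24.50},
    {"id": "A-1002", "status": "shipped", "total": 61.00},
    {"id": "A-1003", "status": "awaiting-shipment", "total": 12.99},
]


def list_orders(status=None):
    """Return orders with the given status, or every order if status is None."""
    return [o for o in FAKE_ORDERS if status is None or o["status"] == status]


def main(argv):
    """Parse arguments, print one JSON object per matching order, return 0."""
    parser = argparse.ArgumentParser(
        prog="ebay-list-orders",
        description="List orders, one JSON object per line.",
    )
    parser.add_argument(
        "--status",
        choices=["awaiting-shipment", "shipped"],
        default=None,
        help="only list orders with this status",
    )
    args = parser.parse_args(argv)
    for order in list_orders(args.status):
        # Machine-readable output keeps the tool easy for the agent to read.
        print(json.dumps(order, sort_keys=True))
    return 0


# Demo call; an installed tool would pass sys.argv[1:] instead.
main(["--status", "awaiting-shipment"])
```

The tool makes no decisions: it filters, prints, and reports success. Deciding what to do with the orders is left to the agent, and the limits on those decisions are left to `soul.md`.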

A concrete picture

Imagine you run a small eBay store. Every morning there is a predictable pile of work: new orders to acknowledge, buyer questions to answer, a few "where is my item?" messages, the occasional return request, and listings whose prices should track the competition. None of it is hard. All of it is tedious, and all of it is done through eBay's seller tools — which have an API.

With the operator pattern, your morning looks like this instead:

$ claude

> Good morning. Run the morning routine for the store.

[agent reads soul.md and the "morning-routine" skill, then:]
  • ran  ebay-list-orders --status awaiting-shipment
  • ran  ebay-list-messages --unread
  • drafted 3 buyer replies, showed them to me, sent after I approved
  • flagged 1 return request as "needs your decision" (over the
    auto-approve threshold in soul.md)
  • ran  ebay-reprice --strategy match-lowest --dry-run
        proposed 4 price changes; I approved 3, skipped 1

Done. One item needs you: return request #4471 (buyer claims
"not as described", $180 — above the $100 auto-approve line).

The agent did not "scrape eBay" or do anything mysterious. It ran a handful of tools you built, each of which called eBay's API with your credentials, and it made small judgment calls within limits you set in writing. You stayed in the loop exactly where it mattered and nowhere else.
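The "limits you set in writing" can be read as a very small rule. The sketch below shows only the arithmetic of the return-request example, assuming the $100 line from the transcript and assuming that a request exactly at the limit is still auto-approved. In this guide the limit lives in `soul.md` and the agent applies it; whether a tool should also enforce it is a design choice this part does not settle. The `ReturnRequest` type, the `triage` function, and the second request are hypothetical.

```python
from dataclasses import dataclass

# Assumed to mirror the "$100 auto-approve line" mentioned in the transcript.
AUTO_APPROVE_LIMIT = 100.00


@dataclass(frozen=True)
class ReturnRequest:
    request_id: str
    reason: str
    amount: float


def triage(request, limit=AUTO_APPROVE_LIMIT):
    """Return 'auto-approve' at or below the limit, otherwise 'needs-decision'."""
    return "auto-approve" if request.amount <= limit else "needs-decision"


requests = [
    ReturnRequest("4471", "not as described", 180.00),  # from the transcript
    ReturnRequest("4472", "changed mind", 35.00),       # hypothetical
]
for r in requests:
    print(f"return #{r.request_id}: {triage(r)}")
```

Under these assumptions, request #4471 lands in the "needs your decision" pile, which matches the single item the agent handed back in the transcript.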

Why this is worth doing

It compounds. The first tool you build is just a script. But tools combine, and the agent is good at combining them. Ten small tools yield far more than ten workflows.
It is inspectable. Every action is a command with arguments and printed output. You can read the transcript and know exactly what happened — unlike a black-box automation that "just does things".
It is governable. Because judgment lives in a file you write, you can tighten or loosen the leash deliberately, and review it like any other document.
It is shareable. A tool and a skill that work for you work for your teammate. Over time a team grows a shared library of operator capabilities — the subject of Part 6.

Is this plausible — and is it new?

Plausible: yes, unequivocally. Every piece already exists and is in daily use. Agents call command-line tools well; APIs are everywhere; OAuth lets software act on a user's behalf with scoped, revocable access instead of a shared password. Nothing here requires a research breakthrough.

New: the assembly is what's worth writing down. The industry has spent two years teaching agents to write code. Far less attention has gone to the more mundane and more valuable case: using that same competence to operate the systems where ordinary work actually happens. The contribution of this pattern is not a technology; it is a discipline — small tools, explicit identity, written judgment, shared skills — that makes delegation to an agent safe enough to be useful.

What this is not. It is not "let the AI loose on my accounts." Every design choice in this guide exists to keep the agent inside a fence you built: it acts only through tools you wrote, only with permissions you granted, and only within limits you put in writing. Autonomy is a dial, and you hold it.
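One way to picture the fence is as an allowlist check on every command the agent proposes. The sketch below is an assumption for illustration, not how any particular agent implements permissions; real agents generally ship their own permission mechanisms, which should be preferred. The tool names come from the transcript earlier in this part, and `parse_permitted` is a hypothetical helper.

```python
import shlex

# Hypothetical allowlist: the tools you wrote, named as in the transcript.
ALLOWED_TOOLS = frozenset(
    {"ebay-list-orders", "ebay-list-messages", "ebay-reprice"}
)


def parse_permitted(command_line, allowed=ALLOWED_TOOLS):
    """Split a proposed command into an argument list.

    Raises PermissionError if the command cannot be parsed or if its first
    token is not a tool on the allowlist.
    """
    try:
        argv = shlex.split(command_line)
    except ValueError as exc:  # for example, unbalanced quotes
        raise PermissionError(f"unparseable command: {exc}") from exc
    if not argv or argv[0] not in allowed:
        raise PermissionError(f"not on the allowlist: {command_line!r}")
    return argv


for proposed in [
    "ebay-reprice --strategy match-lowest --dry-run",
    "curl https://example.com",
]:
    try:
        print("allow:", parse_permitted(proposed))
    except PermissionError as err:
        print("deny: ", err)
```

The returned list is meant to be executed without a shell, for example with `subprocess.run(argv)`, so that shell operators such as `&&` reach the tool as plain arguments and are never interpreted as a second command.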

In Part 2 we make the first real engineering decision: why the agent should talk to the system through small CLI tools rather than calling the API directly.