Agents vs Sub-Agents: Should You Care?

TL;DR

Claude Chat and Claude Code use the same model but serve completely different purposes — one keeps your entire conversation history, the other spins up with just a repo and leaves when done. The real power of sub-agents is context isolation: a research specialist can read ten papers without polluting a writer's draft with GPU specs and citation noise. At scale, this matters — a single agent that has recently debugged Python, planned a trip, and written a LinkedIn post will subtly contaminate every future task with leftover context.

You don’t need to understand plumbing to use a faucet. But when the water turns brown, you are glad someone does.

That is the difference between using AI and building with it. Most people can stop at the faucet. The ones who build systems need to know where the pipes go.

Claude Chat and Claude Code: The Same Brain, Different Jobs

Anthropic built two products with the same underlying model. They tell you everything you need to know about agents and sub-agents.

Claude Chat is the generalist. It chats, reasons, writes, answers questions. It knows your conversation history, your preferences, your tone. It is always there.

Claude Code is the specialist. It spins up inside a terminal, reads your codebase, runs tests, writes files, and submits pull requests. It does not need to know what you talked about yesterday. It needs the repo. It does the work, hands back the result, and leaves.

That is the architecture. One coordinator. Many specialists. Same intelligence, different context.

What Changes When the Specialist Spins Up

Here is a real scenario from my own setup.

I asked my AI to research the state of local LLM inference in 2026, write a summary, and draft a LinkedIn post about it.

Without sub-agents, my main agent did everything. It searched for benchmarks, read papers, kept all that context, then tried to write. The result was competent but crowded. The draft felt like it had one foot in research mode and one foot in writing mode. The tone shifted mid-paragraph. It cited three different GPU architectures in a post meant to be readable.

With sub-agents, the process splits. A research specialist does the deep dive in its own workspace. It reads ten sources, rates each for credibility, notes contradictions, and returns a structured brief. A writer specialist takes that brief and writes the post. The writer never sees the raw research. It only sees the distilled points. The result is cleaner, faster, and more focused.

The difference is context isolation. The researcher’s memory of every paper it skimmed does not leak into the writer’s draft.

A Night in the Kitchen

Here is what a busy Friday night looks like at a high-end restaurant.

4 PM: The executive chef walks the line. Checks prep levels, tastes sauces, adjusts the specials board. No specialists needed. This is the chef’s daily routine.

5 PM: A guest announces a severe nut allergy after ordering the tasting menu. The chef calls in the pastry chef, who knows every ingredient in every dessert. The pastry chef reviews the entire menu, identifies cross-contamination risks, and builds a safe alternative tasting sequence. Hands it back to the chef.

6 PM: A VIP table requests a wine pairing the sommelier never planned for. The sommelier spins up — tastes the dishes blind, pulls three bottles from the cellar, and writes a pairing note. The chef reads the note and approves.

7 PM: The oven breaks mid-service. The chef calls maintenance. The technician arrives with a diagnostic toolkit, fixes the thermostat, and leaves. The chef never learns how the thermostat works. The oven works. Service continues.

8 PM: The restaurant’s Instagram needs a post about tonight’s special. The chef hands a photo and three notes to the social media manager, who writes the caption, schedules it, and moves on.

10 PM: The chef locks up. Alone again. Every specialist has gone home. The kitchen is quiet.

The pattern: the chef is always present. The specialists arrive when needed, do one thing exceptionally well, and leave. The chef does not need to be a pastry expert, a sommelier, an electrician, and a copywriter. The chef needs to know when to call them.

When It Actually Matters

Context isolation is the real win, but it only shows up at scale.

If you ask for a weather forecast, spawning a research sub-agent is absurd. The main agent can do it in one web search. Adding a specialist adds latency, cost, and complexity for no gain.

But when tasks pile up, the alternative gets ugly. Imagine a single agent that has recently researched GPU kernels, written a LinkedIn post, debugged a Python script, and planned a trip. Its memory is a cocktail of technical specs, social media tone, error logs, and hotel reviews. The next task it handles will be subtly contaminated by all of them. It might suggest a hotel with good GPU ventilation.

Sub-agents prevent this by design. Each specialist’s workspace gets wiped after the job. The researcher’s memory of CUDA benchmarks does not leak into the travel planner. The writer’s stop-slop rules do not constrain the coder’s variable naming.

The Framework and the Model

OpenClaw, AutoGen, CrewAI — these are frameworks. They provide the harness. The language model provides the intelligence. Neither works alone.

The framework handles workspace isolation, tool access, and lifecycle. It creates the specialist’s temporary office, gives them the tools they need, and cleans up when they leave.

The model handles the reasoning. It decides when to delegate, how to frame the request, and how to synthesize the result. A weak model with a good framework still makes bad delegation decisions. A strong model with no framework still gets its memory polluted.

A framework without a capable model is a car without an engine. A model without a framework is an engine without a car. You need both to get anywhere.

This also means your setup is model-portable. Swap the model and the quality of delegation changes. Swap the framework and the mechanics of isolation change. The architecture stays the same.

Should You Care?

If you use AI for simple tasks — asking questions, drafting emails, checking weather — you do not need to care. Claude works fine on its own.

If you build with AI — chaining tasks, running daily workflows, managing multiple domains — you should care. Context isolation is not a luxury at that scale. It is architecture.

How to Try This Yourself

You do not need to install OpenClaw or learn a framework to feel the difference. You can experiment with what you already have.

Start with a chatbot. Ask it to plan a weekend trip, research restaurants, and write a packing list. It will do all three in one long response. The restaurant recommendations will be influenced by the weather forecast. The packing list will include items from the restaurant research that have nothing to do with luggage. This is not the model being bad. This is context pollution in action.

Now try the same thing with separation. Use one chat session for research: “Find me the best restaurants in Portland for a rainy weekend.” Copy the result. Open a new chat session for planning: “Plan a Saturday itinerary using these restaurants.” Paste your research. Open a third session for the packing list: “What should I pack for a rainy weekend in Portland with this itinerary?” Paste the itinerary.

The second approach takes slightly more effort. The results will be noticeably better. Each task gets a clean context. The packing list will not include umbrella recommendations shaped by your dinner reservation.

That manual separation is what sub-agents do automatically. You are the coordinator. The new chat sessions are your specialists. The only difference is scale.

What Changes at Scale

The weekend trip is three tasks. Now imagine thirty. Daily email summaries, article research, code reviews, meeting prep, social media scheduling, expense tracking. Done by one agent, each task contaminates the next. Done with separation, each task stays clean.

The practical gains are composability. Own the application layer. Rent the model layer. Build a system where specialists do what they do best, and the coordinator keeps the whole thing coherent.

That is the stack. That is the rule.

The AI 5-Layer Cake

Saving with Model Tiering and Token Optimization

Why I Stopped Typing to My AI Agents

Avoiding the road not taken - my OpenClaw setup journey

How was this?