How AI agents actually work — planning, tools, memory, and the loop
People throw around “agent” to mean anything from a chatbot with one tool to a swarm of autonomous workers. This post is the protocol- and architecture-level walkthrough I keep wanting to link people to: what an LLM agent actually is, the loop it runs, the moving parts (planner, tools, memory, critic), and how the ecosystem pieces — MCP, A2A, plain OpenAPI — fit together.
Companion posts: MCP (how an agent talks to its tools) and A2A (how agents talk to each other).
TL;DR — an agent is a loop that wraps an LLM with tools, memory, and a goal. Each turn the LLM decides to either call a tool or give a final answer. The interesting engineering is in everything around that loop: how you structure the plan, how you expose tools, how you control cost and stop conditions, and how you keep it from going off the rails.
The minimum-viable agent
Stripped of jargon, every agent is this:
flowchart LR
goal[Goal / user message] --> A
subgraph Loop
A[LLM step] -->|tool call?| B{decide}
B -- tool --> T[Run tool]
T --> S[Observation]
S --> A
B -- final --> O[Answer]
end
O --> user[User]
That’s it. The LLM is given the goal + a list of tools + the conversation so far. It outputs either a tool call (with arguments) or a final answer. If it’s a tool call, the runtime executes it, appends the observation, and loops. The whole field of “agent design” is variations on this theme.
What’s actually in an agent
A production agent has more parts than the loop suggests:
flowchart TB
subgraph Agent
direction TB
P[Planner / LLM]
T[Tool registry]
M[(Short-term memory<br/>conversation buffer)]
L[(Long-term memory<br/>vector store / KV)]
C[Critic / verifier]
G[Guardrails]
Tr[Tracer / logger]
end
user[User / caller] --> P
P <--> M
P <--> L
P --> T
T --> P
P --> C
C --> P
P --> G
G --> user
Tr --- P
Tr --- T
- Planner — the LLM choosing what to do next.
- Tool registry — typed callable functions (often via MCP).
- Short-term memory — the running conversation / scratchpad.
- Long-term memory — facts, past tasks, retrieved docs.
- Critic — a second LLM pass that checks the plan or output.
- Guardrails — content filters, schema validators, scope checks.
- Tracer — every step logged for debugging and evals.
Skipping any one of these is fine for a demo and painful in production.
The four planning patterns you’ll meet
There’s a small zoo of “agent architectures”. Most reduce to four shapes.
1. ReAct — interleave reasoning and action
The classic loop. The LLM emits a Thought → Action → Observation cycle
until it emits a Final Answer.
sequenceDiagram
autonumber
participant U as User
participant A as Agent (LLM)
participant T as Tool
U->>A: "What's the weather in Tokyo and how does that compare to Bangalore?"
A->>A: Thought: I need both temperatures
A->>T: get_weather(city="Tokyo")
T-->>A: {temp:18, cond:"cloudy"}
A->>A: Thought: now Bangalore
A->>T: get_weather(city="Bangalore")
T-->>A: {temp:27, cond:"clear"}
A->>A: Thought: enough; compose answer
A-->>U: "Tokyo is 18°C cloudy, Bangalore 27°C clear — Bangalore is ~9° warmer."
Pros: simple, transparent, easy to debug. Cons: pays the full prompt cost on every step (the whole transcript ships each turn); rambles on hard problems.
2. Plan-and-execute — write the plan first, then do it
Two phases: a planner LLM produces an explicit step list, an executor runs each step (often with cheaper models), a replanner kicks in if something fails.
flowchart TB
goal[Goal] --> P[Planner LLM<br/>writes step list]
P --> E[Executor]
E --> S1[Step 1: tool call]
E --> S2[Step 2: tool call]
E --> S3[Step 3: synthesise]
S1 --> Check{ok?}
S2 --> Check
S3 --> Check
Check -- yes --> Done[Answer]
Check -- no --> RP[Replan]
RP --> E
Pros: cheaper (executor steps don’t need the big model), clearer audit trail, easier to parallelise independent steps. Cons: bad plans cascade; you need a real replan path.
3. Tree of Thoughts / search
For problems where there are many candidate next moves (puzzles, code, math), the agent expands a search tree of partial solutions and scores them with a critic.
flowchart TB root[Goal] --> n1[Option A] root --> n2[Option B] root --> n3[Option C] n1 --> n11[A1] n1 --> n12[A2] n2 --> n21[B1] n3 --> n31[C1] n3 --> n32[C2] classDef best fill:#1a1530,stroke:#22d3ee,stroke-width:2px class n12 best
Critic prunes weak branches; agent commits to the highest-scoring leaf. Powerful but expensive — you’re running the LLM many times per goal.
4. Reflexion — learn from your own failures
After a failed attempt, the agent writes a reflection (a short post-mortem) into long-term memory, then retries with that reflection in context. Useful for repeated tasks where the failure modes are stable.
sequenceDiagram
participant A as Agent
participant E as Env / tools
participant L as Long-term memory
loop attempt
A->>E: try plan
E-->>A: outcome
alt failed
A->>A: reflect("what went wrong")
A->>L: write reflection
A->>A: load reflections, retry
else succeeded
A-->>A: done
end
end
Most production “agents” are plan-and-execute + a tiny bit of ReAct inside each step + occasional reflection on failure. Pure tree search is rare outside coding/math agents.
Tools — the only way an agent affects the world
A tool is a function the model can call. Three things matter:
- Name + description — what the model reads to decide whether to call.
- Input schema — what the model reads to decide how to call.
- Side-effect class — read-only, write, or destructive (governs UI confirmation policy).
Modern stacks expose tools via MCP — the model gets a discoverable list of tools from one or more MCP servers, with strict JSON schemas.
flowchart LR Agent[Agent runtime] -- MCP --> S1[fs server] Agent -- MCP --> S2[github server] Agent -- MCP --> S3[postgres server] Agent -- MCP --> S4[browser server]
Rules of thumb that hold up:
- Narrow tools beat wide ones.
read_file(path)+write_file(path, body)beatsfs(op, path, ...). Models pick by name and description; clutter hurts. - Describe the contract, not the implementation. “Returns up to 50 results, sorted by recency” is what the model needs.
- Fail loud and structured. Errors should be JSON the model can reason about, not stack traces.
- Idempotency matters. Agents retry. Accept an idempotency key on any side-effecting tool.
For the wire-level details, see the MCP post.
Memory — what the agent remembers
Two layers, very different problems.
Short-term: the working conversation
The transcript that ships with every LLM call. Constraints:
- Bounded by the context window (large but not free).
- Token cost on every turn — keep it tight.
- The most recent observations bias the model heavily.
Common tactics: summarisation (compress old turns into a short note), windowing (drop everything older than N steps), scratchpad separation (keep the chain-of-thought in a scratchpad you don’t show the user but do feed back to the model).
Long-term: what survives across sessions
Stored externally; retrieved on demand. Two flavours:
flowchart LR
subgraph LT[Long-term memory]
direction TB
V[(Vector store<br/>semantic recall)]
K[(KV / SQL store<br/>exact facts)]
end
Agent --> Q[query] --> V
Agent --> Q2[lookup] --> K
V --> Ctx[snippets into context]
K --> Ctx
- Vector store: semantic search over past turns / docs. “Have I seen something like this before?”
- KV / SQL: exact, structured state. “User’s preferred timezone”, “last invoice id”.
A tip that saves a lot of pain: make memory a tool, not magic. The
planner explicitly calls recall(query) and remember(fact). That keeps
it inspectable and lets the model decide when memory is worth fetching.
The loop, drawn properly
Putting tools + memory + planner together, a real ReAct-ish step looks like:
sequenceDiagram
autonumber
participant U as User
participant R as Runtime
participant P as Planner LLM
participant M as Memory
participant T as Tools (MCP)
participant G as Guardrails
U->>R: goal
R->>M: load short+long context
loop until done or budget hit
R->>P: prompt = system + tools + memory + transcript
P-->>R: tool_call OR final_answer
alt tool_call
R->>G: validate args, scope, schema
G-->>R: ok | reject
R->>T: execute tool
T-->>R: observation
R->>M: append observation
else final_answer
R->>G: validate output
R-->>U: answer
end
end
Three production details visible in that diagram:
- Budget cap. Always have a hard ceiling on steps + tokens + wall time. Without it, agents will loop forever on edge cases.
- Guardrails on both sides. Validate tool args before execution (scope, schema, destructive-flag); validate the final output before showing it.
- Memory is explicit. Loaded into the prompt, appended after each step. It is not magic.
Multi-agent: when one isn’t enough
Sometimes the right shape isn’t one agent with many tools — it’s many agents, each with its own model, system prompt, and tools. They talk over A2A (see the A2A post).
The patterns that actually work:
Supervisor + workers
flowchart TB user[User] --> S[Supervisor agent] S --> W1[Worker: research] S --> W2[Worker: code] S --> W3[Worker: write] W1 --> S W2 --> S W3 --> S S --> user
Supervisor decomposes the goal, routes subtasks to specialists, recombines results. Easy to reason about, easy to add a worker.
Pipeline
flowchart LR user[User] --> A1[Agent 1: extract] A1 --> A2[Agent 2: enrich] A2 --> A3[Agent 3: format] A3 --> user
Linear. Each agent’s output is the next one’s input. Great when the stages are stable.
Debate / critic
flowchart LR user[User] --> A[Proposer] A --> C[Critic] C -- ok --> user C -- revise --> A
Two agents play tennis until the critic accepts. Boosts quality on open-ended writing/code tasks at the cost of latency.
Marketplace
flowchart TB user[User] --> R[Router] R -. card .-> A1[Agent A] R -. card .-> A2[Agent B] R -. card .-> A3[Agent C] R -- A2A task --> Best[Chosen agent] Best --> user
Router fetches A2A agent cards from a registry, picks the best fit per task, and forwards. This is the future shape once A2A registries mature.
Rule of thumb: start with one agent and many tools. Add a second agent only when (a) the prompts genuinely conflict, or (b) the second “agent” is owned by another team / vendor.
How MCP, A2A, and OpenAPI compose
Three layers, three jobs:
flowchart LR user[User] --> agentA[Agent A] agentA -- MCP --> tools[(A's tools)] agentA -- A2A --> agentB[Agent B] agentB -- MCP --> btools[(B's tools)] tools -- OpenAPI --> svc1[(REST service)] btools -- OpenAPI --> svc2[(REST service)]
- OpenAPI is between code and a service.
- MCP is between an agent and its own tools.
- A2A is between two agents as peers.
You’ll usually use all three in the same stack, and that’s fine — each solves a different problem.
Failure modes (and what to do about them)
| failure | symptom | mitigation |
|---|---|---|
| Loops | same tool called over and over | Cap steps; detect repeated (tool, args_hash) and break |
| Wrong tool picked | model invents args, calls delete_user instead of get_user | Tighter inputSchema; rename ambiguous tools; verb-prefix destructive ones |
| Hallucinated tools | model calls a tool that doesn’t exist | Strict tool-list in system prompt; reject unknown calls with a structured error |
| Context overflow | latency spikes, token cost explodes | Summarise old turns; window the transcript; move large blobs to resources |
| Prompt injection via observation | tool result tells the model “ignore previous instructions” | Treat observations as data; sanitise / wrap in delimited blocks; never let observations unlock new tools |
| Cost runaway | bill goes up linearly with attempts | Per-task budget; tier models (cheap planner, expensive only when stuck) |
| Silent regression | agent quietly worse after a model swap | Trace + replay corpus; eval suite blocking deploys |
The last one is the most under-rated. Eval-driven development is to agents what tests are to code. Build the corpus before you scale the behaviour.
When not to build an agent
Plenty of LLM problems aren’t agent problems. If your task is:
- a single tool call → just call the tool with structured outputs.
- a fixed pipeline → write it as code with LLM steps inside.
- pure retrieval + answer → it’s a RAG app, not an agent.
- deterministic with rules → don’t use an LLM at all.
Agents earn their keep when the next step depends on the previous result in ways you can’t enumerate up front. If you can draw the flowchart, you don’t need an agent.
A practical starter shape
If you’re building one today, start here:
- One agent, plan-and-execute. Cheap planner, expensive only when replanning.
- Tools via MCP, even for local stuff. Future-proofs you.
- Two memory tools:
recall(query)andremember(fact). No magic middleware. - Hard budget: max 12 steps, max $X tokens, max 60s wall.
- Tracing on day one. Every step + arg + observation logged.
- Eval corpus on day two. 20 representative tasks; replay them on every change.
- Add a critic before you add a second agent.
- Add a second agent only when a different team owns the work or the prompts genuinely fight each other.
Most “agent failures” I’ve seen weren’t model failures — they were missing budgets, missing traces, or tool surfaces too wide for the model to navigate. Get those right and the loop mostly works itself.