How AI agents actually work — planning, tools, memory, and the loop

People throw around “agent” to mean anything from a chatbot with one tool to a swarm of autonomous workers. This post is the protocol- and architecture-level walkthrough I keep wanting to link people to: what an LLM agent actually is, the loop it runs, the moving parts (planner, tools, memory, critic), and how the ecosystem pieces — MCP, A2A, plain OpenAPI — fit together.

Companion posts: MCP (how an agent talks to its tools) and A2A (how agents talk to each other).

TL;DR — an agent is a loop that wraps an LLM with tools, memory, and a goal. Each turn the LLM decides to either call a tool or give a final answer. The interesting engineering is in everything around that loop: how you structure the plan, how you expose tools, how you control cost and stop conditions, and how you keep it from going off the rails.

The minimum-viable agent

Stripped of jargon, every agent is this:

flowchart LR
  goal[Goal / user message] --> A
  subgraph Loop
    A[LLM step] -->|tool call?| B{decide}
    B -- tool --> T[Run tool]
    T --> S[Observation]
    S --> A
    B -- final --> O[Answer]
  end
  O --> user[User]

That’s it. The LLM is given the goal + a list of tools + the conversation so far. It outputs either a tool call (with arguments) or a final answer. If it’s a tool call, the runtime executes it, appends the observation, and loops. The whole field of “agent design” is variations on this theme.

What’s actually in an agent

A production agent has more parts than the loop suggests:

flowchart TB
  subgraph Agent
    direction TB
    P[Planner / LLM]
    T[Tool registry]
    M[(Short-term memory<br/>conversation buffer)]
    L[(Long-term memory<br/>vector store / KV)]
    C[Critic / verifier]
    G[Guardrails]
    Tr[Tracer / logger]
  end
  user[User / caller] --> P
  P <--> M
  P <--> L
  P --> T
  T --> P
  P --> C
  C --> P
  P --> G
  G --> user
  Tr --- P
  Tr --- T

Planner — the LLM choosing what to do next.
Tool registry — typed callable functions (often via MCP).
Short-term memory — the running conversation / scratchpad.
Long-term memory — facts, past tasks, retrieved docs.
Critic — a second LLM pass that checks the plan or output.
Guardrails — content filters, schema validators, scope checks.
Tracer — every step logged for debugging and evals.

Skipping any one of these is fine for a demo and painful in production.

The four planning patterns you’ll meet

There’s a small zoo of “agent architectures”. Most reduce to four shapes.

1. ReAct — interleave reasoning and action

The classic loop. The LLM emits a Thought → Action → Observation cycle until it emits a Final Answer.

sequenceDiagram
  autonumber
  participant U as User
  participant A as Agent (LLM)
  participant T as Tool
  U->>A: "What's the weather in Tokyo and how does that compare to Bangalore?"
  A->>A: Thought: I need both temperatures
  A->>T: get_weather(city="Tokyo")
  T-->>A: {temp:18, cond:"cloudy"}
  A->>A: Thought: now Bangalore
  A->>T: get_weather(city="Bangalore")
  T-->>A: {temp:27, cond:"clear"}
  A->>A: Thought: enough; compose answer
  A-->>U: "Tokyo is 18°C cloudy, Bangalore 27°C clear — Bangalore is ~9° warmer."

Pros: simple, transparent, easy to debug. Cons: pays the full prompt cost on every step (the whole transcript ships each turn); rambles on hard problems.

2. Plan-and-execute — write the plan first, then do it

Two phases: a planner LLM produces an explicit step list, an executor runs each step (often with cheaper models), a replanner kicks in if something fails.

flowchart TB
  goal[Goal] --> P[Planner LLM<br/>writes step list]
  P --> E[Executor]
  E --> S1[Step 1: tool call]
  E --> S2[Step 2: tool call]
  E --> S3[Step 3: synthesise]
  S1 --> Check{ok?}
  S2 --> Check
  S3 --> Check
  Check -- yes --> Done[Answer]
  Check -- no --> RP[Replan]
  RP --> E

Pros: cheaper (executor steps don’t need the big model), clearer audit trail, easier to parallelise independent steps. Cons: bad plans cascade; you need a real replan path.

3. Tree of Thoughts / search

For problems where there are many candidate next moves (puzzles, code, math), the agent expands a search tree of partial solutions and scores them with a critic.

flowchart TB
  root[Goal] --> n1[Option A]
  root --> n2[Option B]
  root --> n3[Option C]
  n1 --> n11[A1]
  n1 --> n12[A2]
  n2 --> n21[B1]
  n3 --> n31[C1]
  n3 --> n32[C2]
  classDef best fill:#1a1530,stroke:#22d3ee,stroke-width:2px
  class n12 best

Critic prunes weak branches; agent commits to the highest-scoring leaf. Powerful but expensive — you’re running the LLM many times per goal.

4. Reflexion — learn from your own failures

After a failed attempt, the agent writes a reflection (a short post-mortem) into long-term memory, then retries with that reflection in context. Useful for repeated tasks where the failure modes are stable.

sequenceDiagram
  participant A as Agent
  participant E as Env / tools
  participant L as Long-term memory
  loop attempt
    A->>E: try plan
    E-->>A: outcome
    alt failed
      A->>A: reflect("what went wrong")
      A->>L: write reflection
      A->>A: load reflections, retry
    else succeeded
      A-->>A: done
    end
  end

Most production “agents” are plan-and-execute + a tiny bit of ReAct inside each step + occasional reflection on failure. Pure tree search is rare outside coding/math agents.

Tools — the only way an agent affects the world

A tool is a function the model can call. Three things matter:

Name + description — what the model reads to decide whether to call.
Input schema — what the model reads to decide how to call.
Side-effect class — read-only, write, or destructive (governs UI confirmation policy).

Modern stacks expose tools via MCP — the model gets a discoverable list of tools from one or more MCP servers, with strict JSON schemas.

flowchart LR
  Agent[Agent runtime] -- MCP --> S1[fs server]
  Agent -- MCP --> S2[github server]
  Agent -- MCP --> S3[postgres server]
  Agent -- MCP --> S4[browser server]

Rules of thumb that hold up:

Narrow tools beat wide ones. read_file(path) + write_file(path, body) beats fs(op, path, ...). Models pick by name and description; clutter hurts.
Describe the contract, not the implementation. “Returns up to 50 results, sorted by recency” is what the model needs.
Fail loud and structured. Errors should be JSON the model can reason about, not stack traces.
Idempotency matters. Agents retry. Accept an idempotency key on any side-effecting tool.

For the wire-level details, see the MCP post.

Memory — what the agent remembers

Two layers, very different problems.

Short-term: the working conversation

The transcript that ships with every LLM call. Constraints:

Bounded by the context window (large but not free).
Token cost on every turn — keep it tight.
The most recent observations bias the model heavily.

Common tactics: summarisation (compress old turns into a short note), windowing (drop everything older than N steps), scratchpad separation (keep the chain-of-thought in a scratchpad you don’t show the user but do feed back to the model).

Long-term: what survives across sessions

Stored externally; retrieved on demand. Two flavours:

flowchart LR
  subgraph LT[Long-term memory]
    direction TB
    V[(Vector store<br/>semantic recall)]
    K[(KV / SQL store<br/>exact facts)]
  end
  Agent --> Q[query] --> V
  Agent --> Q2[lookup] --> K
  V --> Ctx[snippets into context]
  K --> Ctx

Vector store: semantic search over past turns / docs. “Have I seen something like this before?”
KV / SQL: exact, structured state. “User’s preferred timezone”, “last invoice id”.

A tip that saves a lot of pain: make memory a tool, not magic. The planner explicitly calls recall(query) and remember(fact). That keeps it inspectable and lets the model decide when memory is worth fetching.

The loop, drawn properly

Putting tools + memory + planner together, a real ReAct-ish step looks like:

sequenceDiagram
  autonumber
  participant U as User
  participant R as Runtime
  participant P as Planner LLM
  participant M as Memory
  participant T as Tools (MCP)
  participant G as Guardrails
  U->>R: goal
  R->>M: load short+long context
  loop until done or budget hit
    R->>P: prompt = system + tools + memory + transcript
    P-->>R: tool_call OR final_answer
    alt tool_call
      R->>G: validate args, scope, schema
      G-->>R: ok | reject
      R->>T: execute tool
      T-->>R: observation
      R->>M: append observation
    else final_answer
      R->>G: validate output
      R-->>U: answer
    end
  end

Three production details visible in that diagram:

Budget cap. Always have a hard ceiling on steps + tokens + wall time. Without it, agents will loop forever on edge cases.
Guardrails on both sides. Validate tool args before execution (scope, schema, destructive-flag); validate the final output before showing it.
Memory is explicit. Loaded into the prompt, appended after each step. It is not magic.

Multi-agent: when one isn’t enough

Sometimes the right shape isn’t one agent with many tools — it’s many agents, each with its own model, system prompt, and tools. They talk over A2A (see the A2A post).

The patterns that actually work:

Supervisor + workers

flowchart TB
  user[User] --> S[Supervisor agent]
  S --> W1[Worker: research]
  S --> W2[Worker: code]
  S --> W3[Worker: write]
  W1 --> S
  W2 --> S
  W3 --> S
  S --> user

Supervisor decomposes the goal, routes subtasks to specialists, recombines results. Easy to reason about, easy to add a worker.

Pipeline

flowchart LR
  user[User] --> A1[Agent 1: extract]
  A1 --> A2[Agent 2: enrich]
  A2 --> A3[Agent 3: format]
  A3 --> user

Linear. Each agent’s output is the next one’s input. Great when the stages are stable.

Debate / critic

flowchart LR
  user[User] --> A[Proposer]
  A --> C[Critic]
  C -- ok --> user
  C -- revise --> A

Two agents play tennis until the critic accepts. Boosts quality on open-ended writing/code tasks at the cost of latency.

Marketplace

flowchart TB
  user[User] --> R[Router]
  R -. card .-> A1[Agent A]
  R -. card .-> A2[Agent B]
  R -. card .-> A3[Agent C]
  R -- A2A task --> Best[Chosen agent]
  Best --> user

Router fetches A2A agent cards from a registry, picks the best fit per task, and forwards. This is the future shape once A2A registries mature.

Rule of thumb: start with one agent and many tools. Add a second agent only when (a) the prompts genuinely conflict, or (b) the second “agent” is owned by another team / vendor.

How MCP, A2A, and OpenAPI compose

Three layers, three jobs:

flowchart LR
  user[User] --> agentA[Agent A]
  agentA -- MCP --> tools[(A's tools)]
  agentA -- A2A --> agentB[Agent B]
  agentB -- MCP --> btools[(B's tools)]
  tools -- OpenAPI --> svc1[(REST service)]
  btools -- OpenAPI --> svc2[(REST service)]

OpenAPI is between code and a service.
MCP is between an agent and its own tools.
A2A is between two agents as peers.

You’ll usually use all three in the same stack, and that’s fine — each solves a different problem.

Failure modes (and what to do about them)

failure	symptom	mitigation
Loops	same tool called over and over	Cap steps; detect repeated `(tool, args_hash)` and break
Wrong tool picked	model invents args, calls `delete_user` instead of `get_user`	Tighter `inputSchema`; rename ambiguous tools; verb-prefix destructive ones
Hallucinated tools	model calls a tool that doesn’t exist	Strict tool-list in system prompt; reject unknown calls with a structured error
Context overflow	latency spikes, token cost explodes	Summarise old turns; window the transcript; move large blobs to resources
Prompt injection via observation	tool result tells the model “ignore previous instructions”	Treat observations as data; sanitise / wrap in delimited blocks; never let observations unlock new tools
Cost runaway	bill goes up linearly with attempts	Per-task budget; tier models (cheap planner, expensive only when stuck)
Silent regression	agent quietly worse after a model swap	Trace + replay corpus; eval suite blocking deploys

The last one is the most under-rated. Eval-driven development is to agents what tests are to code. Build the corpus before you scale the behaviour.

When not to build an agent

Plenty of LLM problems aren’t agent problems. If your task is:

a single tool call → just call the tool with structured outputs.
a fixed pipeline → write it as code with LLM steps inside.
pure retrieval + answer → it’s a RAG app, not an agent.
deterministic with rules → don’t use an LLM at all.

Agents earn their keep when the next step depends on the previous result in ways you can’t enumerate up front. If you can draw the flowchart, you don’t need an agent.

A practical starter shape

If you’re building one today, start here:

One agent, plan-and-execute. Cheap planner, expensive only when replanning.
Tools via MCP, even for local stuff. Future-proofs you.
Two memory tools: recall(query) and remember(fact). No magic middleware.
Hard budget: max 12 steps, max $X tokens, max 60s wall.
Tracing on day one. Every step + arg + observation logged.
Eval corpus on day two. 20 representative tasks; replay them on every change.
Add a critic before you add a second agent.
Add a second agent only when a different team owns the work or the prompts genuinely fight each other.

Most “agent failures” I’ve seen weren’t model failures — they were missing budgets, missing traces, or tool surfaces too wide for the model to navigate. Get those right and the loop mostly works itself.