The Six Pillars of Coding Agents (and How to Evaluate Any New Product)

Agent loop, tools, context, memory, multi-agent, and harness—one framework I use instead of chasing every new launch.

Published June 3, 2026

Every coding agent looks different in the UI, but most products wrestle with the same six problems. I use this map when comparing Cursor, Claude Code, Copilot agents, or internal tools—instead of feature checklists that age in a month.

One request, six systems

Say you ask an agent to refactor formatDate in src/utils.ts to use dayjs instead of moment. Between your enter key and “done,” roughly this happens:

Plan the next step — read files, check dependencies, edit, test. That is the agent loop (think → act → observe). Without it, you only have a chatbot.
Touch the repo — read, write, shell. That is the tool system (schemas, permissions, concurrency).
Stay within context limits — remember what changed in nine files without stuffing 200k tokens of noise. That is context engineering.
Remember team conventions across sessions — “we use pnpm.” That is memory (session vs long-term).
Fork exploration — a sub-agent scans the repo and returns a short summary so the parent thread stays clean. That is multi-agent (usually for context isolation, not role-play theater).
Stay safe and operable — confirm rm -rf, retry APIs, handle Ctrl+C, detect infinite loops. That is harness engineering.

Six pillars of agents

Pillar	One line	Analogy
Agent loop	Repeat think → act → observe	Heartbeat
Tool system	Files, shell, APIs	Hands
Context engineering	What enters the window this turn	Blood supply
Memory	Facts across sessions	Long-term recall
Multi-agent	Split work / isolate context	Team lanes
Harness	Policy, retries, hooks, lifecycle	Skeleton

New launches are easier to read through these lenses: what did they change in the loop, context, or harness?

Depth most demos skip

Agent loop — Production loops add truncation recovery, layered retries, seven-ish exit reasons (user abort, max turns, hook veto, context overflow), and streaming tool execution. See who owns the loop.

Tool system — Accuracy often drops as tool count grows (deferred loading, sandbox scripts, “mask don’t remove” for cache stability).

Context engineering — Fifty tool calls × 2k tokens each fills a window fast. “Lost in the middle” means curating beats stuffing. Common tactics: offload to disk, compress/summarize, retrieve (RAG), isolate (sub-agents), cache (prompt/KV).

Memory — From a plain MEMORY.md file to SQLite + hybrid search—pick by audience size and debuggability, not hype.

Multi-agent — Sub-agents mainly compress exploration into a small parent message; worktrees add filesystem isolation.

Harness — The difference between a demo and something you trust on a client repo.

How this connects to my other writing

Loop mechanics → Chatbot vs Agent
Build a minimal loop → Mini Claude Code
Long-term recall → Memory as context budget
Project-level AI workflow → Rules, Skills, SDD

Products change weekly; these problems do not. That is what I optimize for when shipping agent features for clients.