The five patterns.
Anthropic's December 2024 paper names five composable patterns that cover most production agent designs. Each essay quotes the original definition, explains when the pattern is appropriate, names public projects that use it, and links public benchmarks where they exist.
The patterns are composable: a routing pattern often feeds into a prompt chain; an orchestrator-worker often wraps an evaluator-optimizer at the worker layer; a parallelization pattern can sit inside any of the others. The patterns are also cumulative: each one introduces failure modes the previous one did not have.
The naming and the five-way split are Anthropic's. The cited public examples and benchmark links were chosen independently of Anthropic.
Prompt chaining →
A linear sequence of LLM calls where each step's output feeds the next. Reduces error by giving each call fewer degrees of freedom.
Use it for tasks that decompose cleanly into stages: outline, then draft, then revise; or parse, then validate, then transform.
It is the cheapest of the five patterns, provided chain depth is capped.
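A minimal sketch of a capped chain, assuming the model is any function that maps a prompt string to a completion string. The names `run_chain`, `fake_llm`, and `max_depth` are illustrative and not taken from Anthropic's article; `fake_llm` stands in for a real model call so the data flow is visible.

```python
from typing import Callable, Sequence

# Hypothetical type: any function mapping a prompt string to a completion string.
LLM = Callable[[str], str]


def run_chain(llm: LLM, templates: Sequence[str], text: str, max_depth: int = 5) -> str:
    """Run the templates in order, feeding each step's output into the next."""
    if len(templates) > max_depth:
        # Capping depth keeps the cost of the chain predictable.
        raise ValueError(f"chain has {len(templates)} steps, cap is {max_depth}")
    for template in templates:
        text = llm(template.format(input=text))
    return text


def fake_llm(prompt: str) -> str:
    # Stand-in for a real model call (assumption): upper-cases the prompt.
    return prompt.upper()


stages = ["outline: {input}", "draft: {input}", "revise: {input}"]
result = run_chain(fake_llm, stages, "tea")
print(result)  # REVISE: DRAFT: OUTLINE: TEA
```

Each stage sees only the previous stage's output, which is what narrows the job of any single call.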
Routing →
A classifier picks one of N specialised handlers. The classifier may itself be an LLM or a deterministic rule.
Use it when input classes have meaningfully different cost or quality requirements: a cheap model for FAQs, a reasoning model for complex queries, human escalation for boundary cases.
Routing adds a small classification call per input and saves cost when most inputs route to a cheaper handler.
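A routing step can be sketched as a classifier plus a table of handlers. Everything here is an assumption made for illustration: the labels, the toy `rule_classifier` heuristic, and the lambda handlers that stand in for real model calls. Any label without a registered handler falls through to human escalation.

```python
from typing import Callable, Dict

Handler = Callable[[str], str]


def route(classify: Callable[[str], str], handlers: Dict[str, Handler],
          fallback: Handler, query: str) -> str:
    """Classify the query, then dispatch to the matching handler or the fallback."""
    label = classify(query)
    return handlers.get(label, fallback)(query)


def rule_classifier(query: str) -> str:
    # Deterministic toy rule (assumption); an LLM call could sit here instead.
    if "refund" in query.lower():
        return "escalate"
    return "faq" if len(query.split()) <= 8 else "complex"


# Hypothetical handlers: each would wrap a different model in a real system.
handlers: Dict[str, Handler] = {
    "faq": lambda q: f"[cheap model] {q}",
    "complex": lambda q: f"[reasoning model] {q}",
}


def escalate(query: str) -> str:
    return f"[human] {query}"


print(route(rule_classifier, handlers, escalate, "What are your hours?"))
# [cheap model] What are your hours?
print(route(rule_classifier, handlers, escalate, "I want a refund"))
# [human] I want a refund
```

Keeping the fallback explicit means an unexpected label never reaches a handler that was not designed for it.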
Parallelization →
Fan out to N independent calls and aggregate. Two flavours: sectioning (sub-tasks) and voting (same task, multiple attempts).
Use it when the task has independent sub-parts, or when multiple votes on the same prompt raise confidence.
Cost is linear in N. Latency is bounded by the slowest call, not the sum.
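Both flavours can be sketched with a thread pool from Python's standard library. This assumes the model call is a blocking function that is safe to call from several threads; `section`, `vote`, and `ScriptedLLM` are illustrative names, and `ScriptedLLM` is a stub that replays canned answers so the example is deterministic.

```python
import threading
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence

LLM = Callable[[str], str]


def section(llm: LLM, prompts: Sequence[str]) -> List[str]:
    """Sectioning: run independent sub-prompts concurrently, keep input order."""
    with ThreadPoolExecutor(max_workers=max(1, len(prompts))) as pool:
        return list(pool.map(llm, prompts))


def vote(llm: LLM, prompt: str, n: int) -> str:
    """Voting: ask the same prompt n times and return the most common answer."""
    if n < 1:
        raise ValueError("n must be at least 1")
    with ThreadPoolExecutor(max_workers=n) as pool:
        answers = list(pool.map(llm, [prompt] * n))
    # Ties resolve to whichever tied answer appears first in the list.
    return Counter(answers).most_common(1)[0][0]


class ScriptedLLM:
    """Stub model (assumption): hands out canned replies, one per call."""

    def __init__(self, replies: Sequence[str]) -> None:
        self._replies = list(replies)
        self._lock = threading.Lock()

    def __call__(self, prompt: str) -> str:
        with self._lock:
            return self._replies.pop(0)


print(section(str.upper, ["intro", "body"]))  # ['INTRO', 'BODY']
print(vote(ScriptedLLM(["yes", "no", "yes"]), "Is 7 prime?", 3))  # yes
```

Because the calls run concurrently, wall-clock time tracks the slowest call, while the bill still grows with N.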
Orchestrator-worker →
A central LLM plans, dispatches subtasks to worker LLMs, then merges the results. Powerful but expensive when the planner over-decomposes.
Use it for complex tasks where the subtasks are not known until the input arrives.
It is the most expensive of the five patterns, so a worker cap and a per-task budget cap are worth setting.
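One way to apply both caps is to limit how many planned subtasks are ever dispatched. This is a sketch under stated assumptions: `plan`, `work`, and `merge` are hypothetical callables standing in for LLM calls, the budget is counted in calls (one for the planner, one for the merge, one per worker), and subtasks beyond the cap are simply dropped. A real system might re-plan or raise an error instead.

```python
from typing import Callable, List


def orchestrate(plan: Callable[[str], List[str]],
                work: Callable[[str], str],
                merge: Callable[[str, List[str]], str],
                task: str,
                max_workers: int = 4,
                max_calls: int = 8) -> str:
    """Plan subtasks at run time, run a capped number of workers, merge results."""
    # Reserve two calls from the budget: one for the planner, one for the merge.
    allowed = max(0, min(max_workers, max_calls - 2))
    subtasks = plan(task)[:allowed]  # subtasks beyond the cap are dropped
    results = [work(sub) for sub in subtasks]
    return merge(task, results)


# Toy stand-ins for model calls (assumptions): this planner over-decomposes.
def toy_plan(task: str) -> List[str]:
    return [f"{task} part {i}" for i in range(1, 7)]


def toy_work(subtask: str) -> str:
    return subtask.upper()


def toy_merge(task: str, results: List[str]) -> str:
    return " | ".join(results)


print(orchestrate(toy_plan, toy_work, toy_merge, "report", max_workers=3))
# REPORT PART 1 | REPORT PART 2 | REPORT PART 3
```

The planner asked for six workers, but the cap held the run to three, which bounds the damage from over-decomposition.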
Evaluator-optimizer →
A generator proposes a candidate, an evaluator critiques it, and the loop repeats until the evaluator accepts or a budget is hit.
Use it when quality matters more than latency, the evaluator can articulate clear acceptance criteria, and the task is amenable to iterative refinement.
Cost is variable and dominated by the iteration count, so cap iterations.
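The loop with an iteration cap can be sketched as follows. The `generate` and `evaluate` callables are hypothetical stand-ins for two model calls, and the acceptance rule (the text must end with a period) is a toy criterion chosen only to make the example deterministic.

```python
from typing import Callable, Optional, Tuple


def refine(generate: Callable[[str, Optional[str]], str],
           evaluate: Callable[[str], Tuple[bool, str]],
           task: str,
           max_iters: int = 3) -> Tuple[str, int, bool]:
    """Loop generate -> evaluate until accepted or the iteration cap is hit.

    Returns (last candidate, iterations used, whether it was accepted).
    """
    if max_iters < 1:
        raise ValueError("max_iters must be at least 1")
    feedback: Optional[str] = None
    candidate = ""
    for iteration in range(1, max_iters + 1):
        candidate = generate(task, feedback)
        accepted, feedback = evaluate(candidate)
        if accepted:
            return candidate, iteration, True
    return candidate, max_iters, False


# Toy stand-ins for model calls (assumptions).
def toy_generate(task: str, feedback: Optional[str]) -> str:
    # First attempt ignores punctuation; later attempts act on the critique.
    return task if feedback is None else task + "."


def toy_evaluate(candidate: str) -> Tuple[bool, str]:
    if candidate.endswith("."):
        return True, ""
    return False, "end with a period"


print(refine(toy_generate, toy_evaluate, "Hello"))  # ('Hello.', 2, True)
```

Returning the accepted flag alongside the candidate lets the caller tell a genuine pass from a run that merely exhausted its budget.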