Definition
“In the orchestrator-workers workflow, a central LLM dynamically breaks down tasks, delegates them to worker LLMs, and synthesizes their results. Unlike parallelization, its key difference is its flexibility: subtasks aren't pre-defined, but determined by the orchestrator based on the specific input.”
From Anthropic, “Building Effective Agents”, December 2024.
What it does
The orchestrator receives the input and produces a plan: a list of subtasks with associated handlers (often other LLM calls, sometimes deterministic tools). The orchestrator dispatches the subtasks, collects their outputs, and synthesises the final result. The plan can be sequential, parallel, or a directed graph.
The pattern's strength is dynamism: the orchestrator can adapt the plan to the input rather than committing to a single decomposition up front. The trade-off is cost and unpredictability: the orchestrator's plan is itself a non-deterministic LLM output, so the same input may produce different worker counts on different runs.
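The plan, dispatch, and synthesise steps described above can be sketched as follows. This is a minimal illustration, not a reference implementation: `Subtask`, `run_orchestrator`, and the stub planner, worker, and synthesiser are hypothetical names, and plain Python functions stand in for the LLM calls. It assumes a purely parallel plan; a sequential or graph-shaped plan would need a different dispatcher.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable


@dataclass
class Subtask:
    handler: str  # key into the worker registry (hypothetical field)
    payload: str  # instruction or data handed to that worker


def run_orchestrator(
    task: str,
    plan: Callable[[str], list[Subtask]],
    workers: dict[str, Callable[[str], str]],
    synthesise: Callable[[str, list[str]], str],
) -> str:
    # 1. Plan: the orchestrator decides the subtasks from the input.
    subtasks = plan(task)
    # 2. Dispatch: run each subtask on its named handler; map() keeps plan order.
    with ThreadPoolExecutor(max_workers=4) as pool:
        outputs = list(pool.map(lambda s: workers[s.handler](s.payload), subtasks))
    # 3. Synthesise: combine the worker outputs into one result.
    return synthesise(task, outputs)


# Stubs standing in for LLM calls (assumptions, for illustration only).
def stub_plan(task: str) -> list[Subtask]:
    # One subtask per word, so the worker count depends on the input.
    return [Subtask(handler="upper", payload=word) for word in task.split()]


STUB_WORKERS: dict[str, Callable[[str], str]] = {"upper": str.upper}


def stub_synthesise(task: str, outputs: list[str]) -> str:
    return " ".join(outputs)
```

Calling `run_orchestrator("rename foo bar", stub_plan, STUB_WORKERS, stub_synthesise)` dispatches three workers, while a five-word input dispatches five. The decomposition is decided per input rather than fixed in advance.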
When it is appropriate
- The set of subtasks depends on the input. Code refactoring is the canonical example: which files need to change, and how, is not knowable until the input is parsed.
- Subtasks differ from each other enough that a single shared handler would be a bad fit. Worker specialisation pays off when workers can carry tailored prompts, tools, or model choices.
- The downstream cost of a wrong plan is bounded. If the orchestrator produces a poor plan, the recovery is a re-plan, not an irreversible action.
Public examples
- The Anthropic cookbook hosts a reference implementation.
- LangGraph's supervisor pattern is the orchestrator-worker shape with a typed message contract between supervisor and workers.
- CrewAI's hierarchical process appoints a manager agent to plan and delegate to workers.
- AutoGen Teams provide a similar orchestrator-and-worker abstraction.
Cost considerations
The orchestrator-worker pattern is generally the most expensive of the five workflow patterns, because the orchestrator is itself an LLM call (often a large one) that runs in addition to the N worker calls. The cost has three terms: the orchestrator's planning call, the N worker calls, and a synthesis call. Because the orchestrator chooses N based on the input, cost varies per input.
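The three-term cost can be written down directly. The sketch below is illustrative only: `run_cost` is a hypothetical helper, and the dollar figures are made-up placeholders, not vendor prices.

```python
def run_cost(planning: float, workers: list[float], synthesis: float) -> float:
    """Cost of one run: planning call + N worker calls + synthesis call."""
    return planning + sum(workers) + synthesis


# Same fixed costs, different plans: only N changes between the two runs.
small_plan = run_cost(planning=0.5, workers=[0.25] * 3, synthesis=0.25)
large_plan = run_cost(planning=0.5, workers=[0.25] * 12, synthesis=0.25)
```

With these placeholder figures, the 12-worker plan costs 3.75 against 1.5 for the 3-worker plan. The planning and synthesis terms are unchanged; the difference comes entirely from the N the orchestrator chose.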
Two cost-discipline patterns appear in production reference code. The first is a hard cap on worker count: the orchestrator prompt instructs the model not to exceed a maximum, and the dispatcher enforces the same cap programmatically, since a prompt instruction alone is not a guarantee. The second is per-task budget accounting: a budget passed into the orchestrator that the synthesis call must respect. Vendor pricing pages (Anthropic, OpenAI) make the calculation straightforward once N is bounded.
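Both patterns are small amounts of code on the dispatcher side. The sketch below is one possible shape, under stated assumptions: `enforce_worker_cap`, `Budget`, and `BudgetExceeded` are hypothetical names, the budget is counted in tokens, and an over-long plan is truncated rather than rejected (rejecting and re-planning is an equally reasonable choice).

```python
from dataclasses import dataclass


class BudgetExceeded(RuntimeError):
    """Raised when a charge would push spending past the limit."""


@dataclass
class Budget:
    limit: int      # total tokens allowed for this task (assumed unit)
    spent: int = 0  # tokens charged so far

    @property
    def remaining(self) -> int:
        return self.limit - self.spent

    def charge(self, tokens: int) -> None:
        # Refuse the charge before spending, so the limit is never exceeded.
        if tokens > self.remaining:
            raise BudgetExceeded(f"need {tokens}, only {self.remaining} left")
        self.spent += tokens


def enforce_worker_cap(subtasks: list[str], max_workers: int) -> list[str]:
    """Dispatcher-side cap: keep at most max_workers subtasks, in plan order."""
    if max_workers < 0:
        raise ValueError("max_workers must be non-negative")
    return subtasks[:max_workers]
```

A dispatcher built this way would call `enforce_worker_cap` on the plan before running anything, call `budget.charge(...)` after each worker returns, and pass `budget.remaining` to the synthesis call as its ceiling.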
Failure modes
The dominant failure mode is a cost spike: a permissive planning prompt produces a plan with many workers when three would do. The plan looks plausible, the workers all run, and the cost comes out at many times the expected per-task cost. Worker caps prevent the overrun; post-run budget alerts only detect it after the fact.
A secondary failure mode is synthesis disagreement: the workers return contradictory outputs and the synthesiser cannot reconcile them. Strict per-worker output schemas make this tractable.
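One way to make worker outputs strict is to require every worker to return JSON with a fixed set of fields, and to reject anything else before it reaches the synthesiser. The sketch below assumes a hypothetical three-field schema (`subtask_id`, `status`, `summary`) and uses only the standard library. A real system would choose its own fields and might prefer a validation library.

```python
import json
from dataclasses import dataclass

ALLOWED_STATUS = {"ok", "failed"}
EXPECTED_FIELDS = {"subtask_id", "status", "summary"}


@dataclass(frozen=True)
class WorkerResult:
    subtask_id: str
    status: str   # one of ALLOWED_STATUS
    summary: str


def parse_worker_output(raw: str) -> WorkerResult:
    """Parse a worker's raw text; raise ValueError if it breaks the schema."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("worker output must be a JSON object")
    if set(data) != EXPECTED_FIELDS:
        raise ValueError(f"expected fields {sorted(EXPECTED_FIELDS)}, got {sorted(data)}")
    if not all(isinstance(data[key], str) for key in EXPECTED_FIELDS):
        raise ValueError("all fields must be strings")
    if data["status"] not in ALLOWED_STATUS:
        raise ValueError(f"status must be one of {sorted(ALLOWED_STATUS)}")
    return WorkerResult(**data)
```

With every output in the same shape, the synthesiser can compare results field by field, and a disagreement between workers shows up as conflicting values for the same subtask rather than as free text that has to be interpreted.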
Glossary
See orchestrator-worker, planner, worker.
Foundational definitions on the sibling reference site: centralised orchestration, planner-executor pattern.