Definition
“Routing classifies an input and directs it to a specialized followup task. This workflow allows for separation of concerns, and building more specialized prompts. Without this workflow, optimizing for one kind of input can hurt performance on other inputs.”
From Anthropic, “Building Effective Agents”, December 2024.
What it does
The pattern is a two-step pipeline: a classifier reads the input and emits a label, then a dispatcher picks the handler associated with that label. The classifier might be a small LLM call, an embedding-based nearest-neighbour lookup, or a deterministic rule over input metadata. The handler set is open: a cheap model for the easy class, a reasoning model for the hard class, a human escalation path for the boundary case.
The strength of routing is cost separation: the cheap path stays cheap, the expensive path is reserved for inputs that genuinely need it. The trade-off is that mis-classification routes inputs to the wrong handler, which is a different (and harder) failure mode than a single oversized prompt would have.
When it is appropriate
Routing is the right pattern when:
- Input classes have different cost or quality requirements. Customer support is the textbook example: refund queries route differently from technical questions.
- The classification is reliable. If the classifier mis-labels more than a small fraction of inputs, the cost saving evaporates under retry overhead.
- There is a sensible default handler for unclassified inputs, including a path to human escalation.
Public examples
- OpenAI's structured outputs documentation shows the classifier-then-handler shape with JSON schema.
- Anthropic's tool-use documentation describes routing-by-tool-choice as a first-class pattern.
- LangGraph's conditional edges model routing as a graph edge whose target is computed at runtime from the node's output.
- CrewAI Flows provides
routeras a first-class decorator.
Cost considerations
The classifier adds a fixed per-input call. If the classifier is a small model and the dispatched-to handlers include a much larger model, the routing pattern saves cost on average. Vendor pricing pages (Anthropic, OpenAI) typically separate by an order of magnitude between the cheapest and most capable models, which is the cost lever the pattern exploits.
A second consideration is mis-routing recovery. If a downstream handler can detect that the input was mis-classified (for example, the “cheap” handler returns low confidence), a fall-through to a more capable handler raises cost on that input but preserves quality.
Failure mode
The dominant failure mode is silent mis-routing: the classifier confidently picks the wrong branch and the wrong handler returns plausible-but-wrong output. Mitigation typically uses a confidence threshold on the classifier output (a “confidence gate”) and a fall-through path to a more capable handler when confidence is low.
Glossary
See routing, classifier, confidence gate.
Foundational definitions on the sibling reference site: tool routing.