$building.effective.agents
Last verified: April 2026

Agent frameworks: a category overview.

The landscape of agent frameworks groups into five categories with overlapping but distinct trade-offs. Each category has a few credible entrants; this page names the categories and points to each framework's own documentation. There are no ranked verdicts on this page: vendor-published trade-offs and public benchmarks are the references.

The choice of framework is downstream of the choice of pattern. Once the agent's shape is decided (one of the five patterns, or a composition of them), the relevant question is which framework offers the primitives that pattern needs without paying for primitives it does not. The categories below sort frameworks by the primitives they emphasise, not by quality.

The general advice in vendor docs is consistent: start without a framework, write the loop by hand, add a framework when the application's structure makes the framework's primitives cheaper than rewriting them. See Anthropic's “Building Effective Agents” and the OpenAI cookbook for the equivalent advice from each vendor.
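Writing the loop by hand is smaller than it sounds. The sketch below shows the shape: a message list, a model call, tool dispatch, and a turn limit. The `call_model` function and the tool registry are illustrative stand-ins (here stubbed so the loop runs offline), not any vendor's real API.

```python
def call_model(messages):
    # Stand-in for a vendor chat call. This stub asks for the `lookup`
    # tool once, then answers; a real implementation calls the API.
    if any(m["role"] == "tool" for m in messages):
        return {"type": "answer", "text": "done"}
    return {"type": "tool_call", "name": "lookup", "args": {"q": "x"}}

TOOLS = {"lookup": lambda q: f"result for {q}"}

def run_agent(user_input, max_turns=5):
    messages = [{"role": "user", "content": user_input}]
    for _ in range(max_turns):
        reply = call_model(messages)
        if reply["type"] == "answer":
            return reply["text"]
        # Execute the requested tool and feed the result back as context.
        result = TOOLS[reply["name"]](**reply["args"])
        messages.append({"role": "tool", "content": result})
    raise RuntimeError("agent did not terminate within the turn limit")
```

A framework earns its place when the pieces around this loop (retries, persistence, tracing) cost more to rewrite than to adopt.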

C01

Orchestration graphs

Frameworks that model an agent as a directed graph of nodes (LLM calls, tools, conditionals) with explicit state and edges. A strong fit for production: durable execution, checkpointing, and observability are first-class.
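The core primitive of this category can be sketched in a few lines: nodes are functions over shared state, and each node returns the name of the next edge to follow. Node names and the state shape below are invented for illustration; real frameworks layer checkpointing and observability on top of this shape.

```python
def plan(state):
    state["plan"] = ["step1", "step2"]
    return "execute"                    # next node to visit

def execute(state):
    state["done"] = state["plan"].pop(0)
    return "execute" if state["plan"] else "finish"

def finish(state):
    state["result"] = "ok"
    return None                         # terminal node

GRAPH = {"plan": plan, "execute": execute, "finish": finish}

def run_graph(entry, state):
    node = entry
    while node is not None:
        node = GRAPH[node](state)       # follow the edge the node returns
    return state
```

Because state is explicit and every transition is a named edge, a runtime can persist the state dict between nodes and resume after a crash, which is what "durable execution" buys.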

C02

Multi-agent / role-based

Frameworks that compose multiple specialised agents, often with explicit role prompts (planner, researcher, executor) and a coordinator. Closer to the orchestrator-worker pattern by default.
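The orchestrator-worker default of this category reduces to a coordinator that routes subtasks to role functions and merges the results. The role names and dispatch rule below are illustrative, not any framework's API.

```python
def planner(task):
    # Decompose the task into role-addressed subtasks.
    return [f"research: {task}", f"draft: {task}"]

def researcher(subtask):
    return f"notes({subtask})"

def writer(subtask, notes):
    return f"text({subtask}, {notes})"

def coordinate(task):
    # The coordinator owns the control flow; workers own the role prompts.
    subtasks = planner(task)
    notes = researcher(subtasks[0])
    return writer(subtasks[1], notes)
```

In a real system each role function wraps an LLM call with a role-specific prompt; the structural point is that coordination logic lives in one place rather than inside any single agent.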

C03

Single-agent SDKs

Library-level wrappers around vendor APIs that handle the loop, tool calls, and structured outputs without a graph or orchestration layer. Suited to a one-agent-with-tools shape.

C04

Programming-style abstractions

Frameworks that compile prompts and pipelines from declarative descriptions, treating agents as programs to optimise rather than scripts to run.
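"Compiling from a declarative description" means the pipeline is plain data that a compiler turns into a callable, so an optimiser can rewrite the data without touching code. The spec format and step operations below are invented for illustration.

```python
# Declarative description: a list of named steps, each with an operation.
SPEC = [
    {"step": "normalise", "op": str.lower},
    {"step": "tokenise",  "op": str.split},
]

def compile_pipeline(spec):
    ops = [entry["op"] for entry in spec]
    def pipeline(text):
        value = text
        for op in ops:          # apply each declared step in order
            value = op(value)
        return value
    return pipeline
```

Because the spec is data, a framework in this category can search over alternative prompts or step orderings and recompile, which is what "programs to optimise rather than scripts to run" refers to.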

C05

Minimal / from-scratch

Minimal scaffolds, often a single file in size. Suitable for educational use, prototypes, or when the production path is to write the loop by hand and add only what the application needs.

A note on benchmarks

Framework rankings tend to age poorly because the underlying models change faster than the frameworks. Public agent benchmarks (SWE-Bench, AgentBench, GAIA) measure full-agent performance, not framework performance. See evaluating an agent for how to read these.
