What it actually does
LangGraph models an agent system as an explicit graph: nodes are functions or model calls, edges are transitions, state is a typed object that flows through the graph. Where LangChain hides orchestration behind chains, LangGraph makes it visible.
We use LangGraph in production across orchestrator-worker, routing, and evaluator-optimiser patterns. The reason is that the orchestration shape is explicit in the code, so debugging is the same activity as reading it.
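To make the graph model concrete, here is a minimal plain-Python sketch of a routing pattern: nodes are functions, edges pick the next node, and a typed state dict flows through. It deliberately avoids the library so it runs standalone; it is not LangGraph's API (LangGraph's own builder is `StateGraph`), and every name in it (`classify`, `math_worker`, `chat_worker`, `run`) is a hypothetical stand-in.

```python
from __future__ import annotations

from typing import Callable, TypedDict


class State(TypedDict):
    question: str
    route: str
    answer: str


Node = Callable[[State], dict]
END = "__end__"


def classify(state: State) -> dict:
    # Stand-in for a model call that decides where the request goes.
    route = "math" if any(ch.isdigit() for ch in state["question"]) else "chat"
    return {"route": route}


def math_worker(state: State) -> dict:
    return {"answer": f"[math] {state['question']}"}


def chat_worker(state: State) -> dict:
    return {"answer": f"[chat] {state['question']}"}


# Nodes: name -> function that returns a partial state update.
NODES: dict[str, Node] = {
    "classify": classify,
    "math": math_worker,
    "chat": chat_worker,
}

# Edges: name -> function that reads the state and names the next node.
EDGES: dict[str, Callable[[State], str]] = {
    "classify": lambda s: s["route"],  # conditional edge (routing)
    "math": lambda s: END,
    "chat": lambda s: END,
}


def run(state: State, entry: str = "classify") -> tuple[State, list[str]]:
    """Walk the graph from entry to END, merging each node's update into state."""
    trace: list[str] = []
    current = entry
    while current != END:
        trace.append(current)
        state = {**state, **NODES[current](state)}
        current = EDGES[current](state)
    return state, trace


final, trace = run({"question": "What is 2 + 2?", "route": "", "answer": ""})
print(trace)            # ['classify', 'math']
print(final["answer"])  # [math] What is 2 + 2?
```

The trace is the point: the path a request took is a list of node names you can compare against the `EDGES` table, which is what "debugging is reading the code" looks like in practice.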
What is good
- Linear scaling. Adding concurrent branches adds latency, not coordination overhead, up to the LLM provider's rate limits.
- State-checkpointing works. We checkpoint mid-pattern and resume on failure; that ability is structural, not bolted on.
- Type-driven. The state object is typed; the graph is verifiable; debugging is faster.
- Plays well with LangChain integrations while not requiring you to live inside the LangChain abstraction layer.
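The checkpoint-and-resume behaviour in the list above can be sketched in plain Python. This is an illustration of the idea under stated assumptions, not LangGraph's checkpointer API: the JSON file format, `run_with_checkpoints`, and the three steps are all hypothetical, and the failure is simulated.

```python
from __future__ import annotations

import json
import tempfile
from pathlib import Path
from typing import Callable

Step = tuple[str, Callable[[dict], dict]]


def run_with_checkpoints(steps: list[Step], state: dict, path: Path) -> dict:
    """Run steps in order, saving state after each; resume from a saved checkpoint."""
    start = 0
    if path.exists():
        saved = json.loads(path.read_text())
        state, start = saved["state"], saved["next_step"]
    for i in range(start, len(steps)):
        _name, fn = steps[i]
        state = {**state, **fn(state)}
        # Persist only after the step succeeded, so a crash replays this step.
        path.write_text(json.dumps({"state": state, "next_step": i + 1}))
    return state


calls: list[str] = []          # records every step invocation, for inspection
attempts = {"summarise": 0}    # lets the demo fail exactly once


def fetch(state: dict) -> dict:
    calls.append("fetch")
    return {"doc": "raw text"}


def summarise(state: dict) -> dict:
    calls.append("summarise")
    attempts["summarise"] += 1
    if attempts["summarise"] == 1:
        raise RuntimeError("simulated provider timeout")
    return {"summary": state["doc"].upper()}


def publish(state: dict) -> dict:
    calls.append("publish")
    return {"published": True}


STEPS: list[Step] = [("fetch", fetch), ("summarise", summarise), ("publish", publish)]


def demo() -> dict:
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "checkpoint.json"
        try:
            run_with_checkpoints(STEPS, {}, path)
        except RuntimeError:
            pass  # first run dies mid-pattern; the checkpoint file survives
        return run_with_checkpoints(STEPS, {}, path)


result = demo()
print(calls)  # ['fetch', 'summarise', 'summarise', 'publish']
```

Note that `fetch` runs once even though the pipeline was started twice: the second run picks up at the failed step with the saved state instead of starting over.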
What is broken or surprising
- Cognitive load is higher than CrewAI for first-time users. The graph model is the right model; it is also a model you must learn before you ship.
- Documentation lags the code in places. Read the source for the edge cases.
- Some integrations still depend on LangChain, so you inherit LangChain's churn risk for those.
When you would choose it
Pick LangGraph if you expect orchestration complexity and you expect to scale. Skip LangGraph if you are prototyping a simple role-based multi-agent system and want to ship in a week; use CrewAI. The honest comparison rule lives at langgraph-vs-crewai.
Cost at scale
Open source. Cost is model passthrough. Per-task cost across our pipeline is dominated by the patterns being orchestrated, not by framework overhead. The framework adds roughly 1-3% to the per-task cost in latency-equivalent terms, which is rounding error.
Oliver runs Digital Signet, a research and product studio that operates ~500 production sites with AI agents as the engineering layer. The Digital Signet portfolio is built using a continuous AI-agent build pipeline, one of the largest agent-operated publishing operations on the open web. The handbook draws directly from those deployments: real cost data, real failure modes, real recovery patterns.