dagraph separates the backend (how to reach a provider) from the model (which model that provider should run). A backend is the transport layer: it knows how to authenticate, send a request, and parse the response for a specific provider. You set a default backend once with the --backend CLI flag; individual nodes can override it by prefixing the model name with a provider identifier.

Backend summary

| Backend | How it runs | Billing | Required setup |
| --- | --- | --- | --- |
| claude_code | Spawns the claude CLI as a subprocess | Claude Code plan (Max/Pro) | claude CLI installed and authenticated |
| api | Anthropic Messages API | Per-token via API key | ANTHROPIC_API_KEY environment variable |
| openai | OpenAI Chat Completions API | Per-token via API key | OPENAI_API_KEY environment variable |
| gemini | Google GenAI API | Per-token via API key | GEMINI_API_KEY environment variable |
| bedrock | AWS Bedrock | Per-token via AWS account | pip install dagraph[bedrock] + AWS credentials |
| ollama | Local Ollama daemon | Free (local) | Ollama running on localhost:11434 |
| codex | OpenAI Codex CLI subprocess | OpenAI plan | codex CLI installed and authenticated |

Backend details

The claude_code backend spawns claude -p as a subprocess for each node call. It runs against your Claude Code plan (Max or Pro) rather than the API, so nodes do not accrue per-token API charges. Token usage is still recorded in traces and counted against any budget you set (using equivalent API cost), but the charges are your plan subscription, not pay-per-token.
agentgraph run workflow.yaml --input topic="AI agents"
# --backend claude_code is the default, so this is equivalent to:
agentgraph run workflow.yaml --backend claude_code --input topic="AI agents"
The claude_code backend starts a new, isolated claude session per node call with no persistent tools. Streaming (stream: true on a node) is silently ignored for this backend.
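As a minimal sketch (the node id and prompt are illustrative), a node can set stream: true; the api backend described below honours the flag, while claude_code drops it:
- id: live_draft
  type: agent
  model: claude-sonnet-4-6
  stream: true    # streamed over the api backend; silently ignored under claude_code
  prompt: Draft an outline for {{ topic }}.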
The api backend calls the Anthropic Messages API directly. It supports all agent node features including tools, mcp_servers, output_schema, and stream.
Setup: Set ANTHROPIC_API_KEY in your environment or in a .env file at your project root.
export ANTHROPIC_API_KEY=sk-ant-...
agentgraph run workflow.yaml --backend api --input topic="AI agents"
# Pin specific nodes to the api backend using the anthropic/ prefix
- id: structured_output
  type: agent
  model: anthropic/claude-sonnet-4-6
  output_schema:
    type: object
    properties:
      summary: { type: string }
      score: { type: number }
The openai backend calls the OpenAI Chat Completions API.
Setup: Set OPENAI_API_KEY in your environment or .env.
export OPENAI_API_KEY=sk-...
agentgraph run workflow.yaml --backend openai --input topic="AI agents"
Use the openai/ prefix to route a specific node to OpenAI regardless of the default backend:
- id: critic
  type: agent
  model: openai/gpt-4o
  prompt: "Critique this draft: {{ draft }}"
The gemini backend calls the Google GenAI API.
Setup: Set GEMINI_API_KEY in your environment or .env.
export GEMINI_API_KEY=AIza...
agentgraph run workflow.yaml --backend gemini
- id: fast_summary
  type: agent
  model: gemini/gemini-2.0-flash
  prompt: "Summarise in two sentences: {{ document }}"
The bedrock backend calls AWS Bedrock. It requires the [bedrock] extra and AWS credentials configured in your environment (via AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY, an IAM role, or an AWS profile).
Setup:
pip install "dagraph[bedrock]"
export AWS_DEFAULT_REGION=us-east-1
# Use IAM role, instance profile, or:
export AWS_ACCESS_KEY_ID=...
export AWS_SECRET_ACCESS_KEY=...
- id: analysis
  type: agent
  model: bedrock/anthropic.claude-sonnet-4-6-v1
  prompt: "Analyse the following data: {{ data }}"
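Assuming bedrock follows the same --backend convention as the other backends, it can also be set as the run-wide default:
agentgraph run workflow.yaml --backend bedrock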
The ollama backend sends requests to a locally running Ollama daemon on localhost:11434. No API key is required. Use this as a free last resort in a fallback_chain, or as the default backend for development.
Setup: Install and start Ollama, then pull the models you need:
ollama pull llama3.2
agentgraph run workflow.yaml --backend ollama
- id: draft
  type: agent
  model: ollama/llama3.2
  prompt: Write a short summary of {{ topic }}.
The codex backend spawns the OpenAI Codex CLI as a subprocess, similar to how claude_code spawns the claude CLI. It runs against your OpenAI plan.
Setup: Install and authenticate the codex CLI.
- id: code_review
  type: agent
  model: codex/o4-mini
  prompt: "Review this code and suggest improvements: {{ code }}"
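Assuming the --backend flag accepts codex like the other backends listed above, it can also be selected as the run-wide default:
agentgraph run workflow.yaml --backend codex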

Model-prefix routing

Any node can be pinned to a specific backend by prefixing the model value with provider/. dagraph splits on the first /, resolves the backend, and passes the remainder to that backend’s SDK.
provider/model-name
Examples:
| Model value | Backend used | Model passed to SDK |
| --- | --- | --- |
| claude-sonnet-4-6 | default backend (from --backend) | claude-sonnet-4-6 |
| anthropic/claude-sonnet-4-6 | api | claude-sonnet-4-6 |
| openai/gpt-4o | openai | gpt-4o |
| gemini/gemini-2.0-flash | gemini | gemini-2.0-flash |
| bedrock/anthropic.claude-sonnet-4-6-v1 | bedrock | anthropic.claude-sonnet-4-6-v1 |
| ollama/llama3.2 | ollama | llama3.2 |
Supported prefixes: anthropic, openai, gemini, bedrock, ollama, and codex. An unknown prefix raises a validation error at run time.
Mix models from different providers in a single DAG. Route cheap fast nodes to Haiku or ollama/llama3.2, send quality-sensitive nodes to claude-sonnet-4-6, and pin any node that needs structured output to anthropic/claude-sonnet-4-6 (the only backend that supports output_schema).
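A sketch of that routing, using only fields that appear elsewhere on this page (node ids and prompts are illustrative):
nodes:
  - id: triage
    type: agent
    model: ollama/llama3.2              # cheap, local, free
    prompt: List the main themes in {{ document }}.

  - id: report
    type: agent
    model: claude-sonnet-4-6            # default backend, quality-sensitive
    depends_on: [triage]
    prompt: Write a detailed report covering {{ triage }}.

  - id: scores
    type: agent
    model: anthropic/claude-sonnet-4-6  # pinned to the api backend for output_schema
    depends_on: [report]
    output_schema:
      type: object
      properties:
        clarity: { type: number }
        completeness: { type: number }
    prompt: Rate the report on clarity and completeness {{ report }}.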

Setting the default backend

Use the --backend flag with agentgraph run. Every node that does not have a provider prefix in its model field uses this default.
agentgraph run workflow.yaml --backend api
agentgraph run workflow.yaml --backend openai
agentgraph run workflow.yaml --backend ollama
Nodes with a model prefix always override the default:
budget:
  max_usd: 1.00

nodes:
  # Uses default backend (whatever --backend is)
  - id: draft
    type: agent
    model: claude-haiku-4-5-20251001
    prompt: Draft a summary of {{ topic }}.

  # Always uses OpenAI regardless of --backend
  - id: critique
    type: agent
    model: openai/gpt-4o
    depends_on: [draft]
    prompt: "Critique this draft: {{ draft }}"

Fallback chains

Every agent node (and the generator/evaluator roles inside composite nodes) accepts a fallback_chain: an ordered list of model strings to try when the primary model returns a retriable error.
- id: research
  type: agent
  model: anthropic/claude-sonnet-4-6
  fallback_chain:
    - openai/gpt-4o
    - gemini/gemini-2.0-flash
    - ollama/llama3.2          # last resort: local, free
  prompt: |
    Research the topic: {{ topic }}.
    Return 3 key findings as bullet points.
Retriable errors (walk the chain): HTTP 429, 5xx, network errors, timeouts. Non-retriable errors (bypass the chain immediately): HTTP 401/403 (auth errors) and HTTP 400/422 (bad request). A different provider cannot fix bad credentials or invalid inputs. The scheduler tries each entry in order. The first model to return a successful response wins. Cost is charged only for the successful attempt. Every attempt — including failed ones — is recorded in the trace so you can see which provider actually served the response.

Multi-provider example

The following example (from examples/multi_provider_fallback.yaml) shows two nodes each using a different primary provider with a fallback chain:
name: multi_provider_fallback

inputs:
  topic:
    type: string

nodes:
  - id: research
    type: agent
    model: anthropic/claude-sonnet-4-6
    fallback_chain:
      - openai/gpt-4o
      - gemini/gemini-2.0-flash
      - ollama/llama3.2
    prompt: |
      Research the topic: {{ topic }}.
      Return 3 key findings as bullet points.

  - id: critique
    type: agent
    model: openai/gpt-4o
    fallback_chain:
      - anthropic/claude-sonnet-4-6
    depends_on: [research]
    prompt: |
      Critique this research:

      {{ research }}

      What's missing? What's overstated?
Run it with:
agentgraph run multi_provider_fallback.yaml \
  --backend api \
  --input topic="renewable energy storage"