A dagraph workflow is a single YAML file. At the top level you declare metadata, optional inputs with types and defaults, a global budget cap, the list of nodes, and optional hooks and output mappings. Every field documented here maps directly to a validated Pydantic model; if anything is wrong, dagraph reports a clear error during agentgraph validate, before a single LLM call is made.

Top-level fields

name
string
required
A short identifier for the DAG. This name appears in run logs, the agentgraph list output, and trace spans. Use lowercase snake_case (for example, research or draft_with_review).
description
string
A human-readable description of what the workflow does. Optional but recommended — it shows in agentgraph inspect output and helps teammates understand the DAG at a glance.
inputs
object
A map of input name to InputSpec. Declaring inputs makes the CLI self-documenting (agentgraph validate lists all expected inputs), coerces types before the first LLM call, and rejects missing required values early. Omit this field for free-form input passing.
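The shape of an InputSpec can be sketched as follows. The type, required, and description keys appear in the complete example at the end of this page; the default key is an assumption based on the mention of defaults above:

```yaml
inputs:
  topic:
    type: string
    required: true
    description: The subject to research
  depth:
    type: integer      # coerced before the first LLM call
    required: false
    default: 3         # `default` is assumed — this page mentions defaults but does not show the key
```

With a spec like this, a run that omits topic is rejected early, before any node executes.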
budget
object
A global cap applied to the entire run. When either limit is exceeded, dagraph raises BudgetExceededError and stops the run immediately — no partial charges accumulate after the cap is hit. Both fields are optional; omit the field entirely for an uncapped run.
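Both limits can be set together; the field names below match the complete example at the end of this page:

```yaml
budget:
  max_tokens: 50000   # hard cap on total tokens across the run
  max_usd: 2.00       # hard cap on total USD spend across the run
```

Whichever limit is hit first triggers BudgetExceededError.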
nodes
array
required
The list of node specs that make up the DAG. Every node must have a unique id. Nodes run as soon as all nodes listed in their depends_on have completed. Nodes with no dependencies start immediately and run in parallel. See Node types for the full list of node type values and their fields.
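A minimal dependency chain might look like this (the agent node fields are the same ones used in the complete example below):

```yaml
nodes:
  - id: draft           # no depends_on — starts immediately
    type: agent
    model: claude-haiku-4-5-20251001
    prompt: "Draft a short summary of {{ topic }}."

  - id: review          # waits for draft to complete
    type: agent
    model: claude-sonnet-4-6
    depends_on: [draft]
    prompt: "Critique this draft: {{ draft }}"
```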
outputs
object
A map of file path to node ID. After a successful run, dagraph writes each referenced node’s output text to the corresponding file. Relative paths resolve against the directory where you ran agentgraph run. Parent directories are created automatically. Nothing is written for paused or failed runs.
outputs:
  report.md: synthesizer
  summary.txt: executive_summary
hooks
object
A map of lifecycle event name to a list of hook specs. Hooks fire at key points in the run without blocking execution. Hook failures are logged as warnings and never fail the DAG. See Lifecycle hooks for the full event list and hook configuration. Supported events: on_dag_start, on_dag_complete, on_dag_paused, on_dag_failed, on_node_start, on_node_complete, on_node_failed.
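As a sketch, a webhook fired on failure might look like this (mirroring the webhook hook in the complete example below; the alert URL is a placeholder):

```yaml
hooks:
  on_dag_failed:
    - type: webhook
      url: https://example.com/alerts   # placeholder endpoint
      headers:
        Content-Type: application/json
```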

Per-node policy fields

Every node type (agent, bash, evaluator_loop, etc.) shares three optional policy fields that control whether the node runs, how it retries, and how much it may spend. These fields are described here once rather than repeated for each node type.
when
string
A Jinja expression evaluated against the current inputs and the outputs of upstream nodes. If the expression is truthy, the node runs normally. If it is falsy, the node is skipped — its status is set to skipped and an empty string is passed to any downstream nodes that depend on it.
- id: deep_analysis
  type: agent
  model: claude-sonnet-4-6
  when: "{{ topic | length > 10 }}"
  prompt: "Analyze {{ topic }} in depth."
retry
object
Per-node retry configuration. Without this field the node runs exactly once. BudgetExceededError and ApprovalPending are never retried regardless of this configuration.
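The retry schema is not spelled out on this page, so the field names below (max_attempts, backoff_seconds) are illustrative guesses only — consult the retry reference for the real keys:

```yaml
- id: flaky_step
  type: agent
  model: claude-haiku-4-5-20251001
  retry:
    max_attempts: 3      # hypothetical field name
    backoff_seconds: 2   # hypothetical field name
  prompt: "Summarize {{ topic }}."
```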
budget
object
A per-node spending cap that overrides (does not add to) the DAG-level budget for this node. The cap applies to the sum of all LLM calls a node makes — for an evaluator_loop with three iterations, the cap covers all six calls (generator + evaluator × 3). Exec-type nodes like bash and python_exec report zero token usage, so a budget there is a no-op. Same fields as the top-level budget: max_tokens and max_usd.
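A per-node cap uses the same keys as the top-level budget. For example, to cap a single expensive node regardless of the DAG-level limit:

```yaml
- id: synthesizer
  type: agent
  model: claude-sonnet-4-6
  budget:
    max_tokens: 10000   # replaces the DAG-level cap for this node only
    max_usd: 0.50
  prompt: "Synthesize the research into a unified report."
```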

Complete example

The following DAG illustrates most top-level fields together. The three researcher nodes start in parallel; the synthesizer starts only after all three complete.
name: research
description: Parallel three-angle research on a topic, synthesized by Opus.

inputs:
  topic:
    type: string
    required: true
    description: The subject to research

budget:
  max_tokens: 50000
  max_usd: 2.00

nodes:
  - id: research_a
    type: agent
    model: claude-haiku-4-5-20251001
    max_output_tokens: 1500
    prompt: |
      Research "{{ topic }}" from a TECHNICAL perspective.
      Cover: mechanisms, architecture, implementation details.
      Return 5-8 concise bullet points.

  - id: research_b
    type: agent
    model: claude-haiku-4-5-20251001
    max_output_tokens: 1500
    prompt: |
      Research "{{ topic }}" from an ECONOMIC perspective.
      Cover: market forces, cost structures, winners, losers.
      Return 5-8 concise bullet points.

  - id: research_c
    type: agent
    model: claude-haiku-4-5-20251001
    max_output_tokens: 1500
    prompt: |
      Research "{{ topic }}" from a HUMAN/SOCIAL perspective.
      Cover: behavior, adoption, second-order effects on people.
      Return 5-8 concise bullet points.

  - id: synthesizer
    type: agent
    model: claude-sonnet-4-6
    max_output_tokens: 3000
    depends_on: [research_a, research_b, research_c]
    prompt: |
      Synthesize the three research angles into a unified report.
      Technical: {{ research_a }}
      Economic: {{ research_b }}
      Human/Social: {{ research_c }}

outputs:
  report.md: synthesizer

hooks:
  on_dag_complete:
    - type: webhook
      url: https://hooks.slack.com/services/${SLACK_TOKEN}
      headers:
        Content-Type: application/json
Run agentgraph validate research.yaml before your first run. Validation catches unknown fields, duplicate node IDs, missing depends_on targets, and DAG cycles without making any LLM calls.