Budget caps
DAG-level budget
Set a budget: field at the top of your YAML to apply a global cap to the entire run. If the cap is exceeded at any point, whether mid-node or mid-iteration, dagraph raises BudgetExceededError and stops immediately. No additional charges accumulate after the cap is hit.
max_tokens: Maximum billable tokens for the run. Billable tokens are input tokens plus output tokens. Cache read and cache write tokens are excluded. This matters on the claude_code backend, which loads a large system-prompt cache for each subprocess; that overhead does not count against your token cap.
max_usd: Maximum spend in US dollars. The engine calculates cost using published per-million-token prices for each model.
You can set max_tokens, max_usd, or both; each is checked independently.
Per-node budget
Override the budget for a single node by adding a budget: field directly on the node. This is useful when one expensive node (such as a synthesizer using claude-opus-4-7) should be capped independently while lighter nodes share the DAG-level budget.
A per-node cap covers the total spend of everything that node does — for an evaluator_loop with three iterations, the cap applies across all six LLM calls (three generator + three evaluator calls combined).
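The accounting rules above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not dagraph's implementation: the Usage and Budget classes, the cost_usd helper, and the per-million-token prices are all hypothetical, and BudgetExceededError is redefined locally so the sketch runs on its own.

```python
from dataclasses import dataclass
from typing import Optional


class BudgetExceededError(Exception):
    """Local stand-in for the error dagraph raises (assumption for this sketch)."""


@dataclass
class Usage:
    input_tokens: int = 0
    output_tokens: int = 0
    cache_read_tokens: int = 0
    cache_write_tokens: int = 0

    @property
    def billable_tokens(self) -> int:
        # Only input and output tokens count; cache reads and writes are excluded.
        return self.input_tokens + self.output_tokens


def cost_usd(usage: Usage, in_price: float, out_price: float) -> float:
    # Prices are per million tokens. The values passed below are placeholders,
    # not real model prices, and cache pricing is ignored in this sketch.
    return (usage.input_tokens * in_price + usage.output_tokens * out_price) / 1_000_000


@dataclass
class Budget:
    max_tokens: Optional[int] = None
    max_usd: Optional[float] = None
    spent_tokens: int = 0
    spent_usd: float = 0.0

    def charge(self, usage: Usage, usd: float) -> None:
        self.spent_tokens += usage.billable_tokens
        self.spent_usd += usd
        # Each limit is checked independently; either one can trip the cap.
        if self.max_tokens is not None and self.spent_tokens > self.max_tokens:
            raise BudgetExceededError(f"token cap exceeded: {self.spent_tokens}")
        if self.max_usd is not None and self.spent_usd > self.max_usd:
            raise BudgetExceededError(f"USD cap exceeded: {self.spent_usd:.4f}")


# An evaluator_loop with three iterations makes six LLM calls
# (three generator + three evaluator), all charged to the same node budget.
node_budget = Budget(max_tokens=10_000)
call = Usage(input_tokens=1_000, output_tokens=500, cache_read_tokens=20_000)

for _ in range(6):
    node_budget.charge(call, cost_usd(call, in_price=1.0, out_price=2.0))

# A seventh call would push the node past its cap.
tripped = False
try:
    node_budget.charge(call, cost_usd(call, in_price=1.0, out_price=2.0))
except BudgetExceededError:
    tripped = True
```

In this sketch the six calls total 9,000 billable tokens, so they fit under the 10,000-token cap even though 120,000 cache-read tokens were also consumed. The seventh call brings the total to 10,500 and trips the cap.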
If you use claude_code as your backend, the USD cost shown is an approximation based on the API pricing table. The claude_code backend draws from your Claude Code subscription plan, not your API balance. Use max_tokens for a more reliable cap on that backend.
Retry policies
Configuring retries
Add a retry: field to any node to automatically re-run it on failure. Without retry:, every node runs exactly once.
max_attempts: Total number of attempts including the first. Minimum 1, maximum 20. For example, max_attempts: 3 means one initial attempt and up to two retries.
Seconds to wait between attempts. Use a non-zero value when retrying after rate-limit errors so you don’t immediately hit the same limit again.
retry_on: A list of exception class names to retry on. "*" matches any exception. Names are matched against the full class hierarchy, so ["OSError"] also catches FileNotFoundError. Narrow this list to avoid retrying on errors that indicate bad inputs or invalid prompts.
Exceptions that are never retried
Two exceptions bypass retry_on entirely and are never retried, regardless of your configuration:
BudgetExceededError: the budget cap is a hard limit; retrying would immediately exceed it again.
ApprovalPending: this is a control-flow signal from an approval_gate node, not a failure.
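The retry rules in this section can be illustrated with a minimal Python sketch. It is not dagraph's code: run_with_retry, matches, and the delay_seconds parameter name are hypothetical, and the two never-retried exceptions are redefined locally so the example is self-contained.

```python
import time


class BudgetExceededError(Exception):
    """Local stand-in (assumption); the real class lives in dagraph."""


class ApprovalPending(Exception):
    """Local stand-in (assumption); the real class lives in dagraph."""


NEVER_RETRIED = (BudgetExceededError, ApprovalPending)


def matches(exc: BaseException, retry_on: list) -> bool:
    # "*" matches any exception.
    if "*" in retry_on:
        return True
    # Match names against the full class hierarchy, so "OSError"
    # also covers subclasses such as FileNotFoundError.
    names = {cls.__name__ for cls in type(exc).__mro__}
    return any(name in names for name in retry_on)


def run_with_retry(fn, max_attempts: int, delay_seconds: float, retry_on: list):
    if not 1 <= max_attempts <= 20:
        raise ValueError("max_attempts must be between 1 and 20")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except NEVER_RETRIED:
            # These bypass retry_on entirely.
            raise
        except Exception as exc:
            # Give up on the last attempt or on a non-matching exception.
            if attempt == max_attempts or not matches(exc, retry_on):
                raise
            time.sleep(delay_seconds)


# A node that fails twice with FileNotFoundError, then succeeds.
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise FileNotFoundError("not ready yet")
    return "ok"


result = run_with_retry(flaky, max_attempts=3, delay_seconds=0.0, retry_on=["OSError"])
```

Here max_attempts=3 allows the initial attempt plus two retries, which is just enough for flaky to succeed on its third call. A ValueError raised under the same configuration would propagate after a single attempt, because it is not part of the OSError hierarchy.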
Conditional execution
Use the when: field on any node to skip it entirely based on inputs or upstream outputs. The expression is a Jinja template evaluated against all inputs and the outputs of completed upstream nodes. If the expression is truthy, the node runs normally. If falsy, the node is skipped: its status is set to skipped and an empty string is passed downstream.