How the caps fit together¶
The caps exist because Tilth runs unattended. An interactive agent doesn't need a hard ceiling — a human watching the scrollback notices a runaway and stops it. Tilth has no such human for the length of a run, so the caps are that human: a budget the harness enforces on its behalf. They're set deliberately loose (a stuck task should be the exception), and a hit is always loud and resumable rather than silent.
At any moment during a run, seven things can stop it:
- All
prd.jsontasks done — happy path; outer loop exits cleanly. MAX_WALL_CLOCK_MINUTESexceeded — outer loop checks at the top of each task; the current task finishes first.MAX_TOKENSexceeded — same enforcement granularity as wall-clock; inter-task only.MAX_ITERATIONS_PER_TASKexceeded inside a single task — that task is markedfailed, the run halts (does not continue to the next task — failures are halting events, not skip events). The nexttilth resumeflips the failed task back topendingand the agent retries it with a fresh iteration budget; partial work survives via a soft-reset of the FAILED placeholder commit.MAX_EVALUATOR_CALLS_PER_TASKexceeded inside a single task (optional;0= off, the default) — same shape as cap 4: task markedfailedwith reasonevaluator_cap, run halts,tilth resumeretries with a fresh budget. Targets the worker↔evaluator ping-pong case where the iteration budget alone would let the worker keep retrying right up untiliter_cap.- 3 consecutive empty model responses inside a single task — a turn with no tool calls, no content, no reasoning (a provider hiccup; observed: an endpoint that 200s with zero usage tokens). Same shape as cap 4: task marked
failedwith reasonempty_responses, run halts,tilth resumeretries. Empty calls cost no tokens, so the token cap can't catch a stuck endpoint — this backstop does. Fixed at 3 retries (with backoff); no env knob. - 3 consecutive worker turns without a case inside a single task — the worker went quiet but never called
submit_case. After 3 nudges the task is markedfailedwith reasonno_case, run halts,tilth resumeretries. Fixed at 3; no env knob.
Caps 2 and 3 are session-level. Caps 4–7 are task-level. There is no per-call cap and no per-task token cap. Caps 6 and 7 aren't env-tunable — they're fixed circuit-breakers against a dead endpoint or a worker that never presents its case. That's the full safety story.
What hitting a cap looks like¶
A cap-4 (iter_cap) hit, with the post-run summary the harness prints on the way out:
Two things to read here. Top: the cap fires and the run halts mid-task list (T-003 of five). Bottom: the run summary surfaces every cap as a percentage, so it's obvious which one bit — duration and tokens are both well under, only iterations were tight. (This capture predates the default bump — the cap was set to 8 for this run, below today's default of 32 — which is why it halts this early.)
The failed=1 pending=2 line in the summary is what tilth resume reads to plan its retry — see Resuming a session for what picks up from this exact point (same session id).
Diagram suggestion — a layered cap diagram: an outer ring labelled "Session-level caps (MAX_WALL_CLOCK_MINUTES, MAX_TOKENS)" containing an inner ring labelled "Task-level stops (MAX_ITERATIONS_PER_TASK, MAX_EVALUATOR_CALLS_PER_TASK, plus the fixed empty_responses / no_case backstops)." A central node represents the worker model call. Annotate each ring with what happens on hit (run stops vs. task fails + run halts).
Cross-references¶
- See The two loops for the loop structure that the caps gate, including the inner-loop flow chart that shows where
iter_cap,evaluator_cap,empty_responses, andno_caseare emitted. - See Token recording for where
MAX_TOKENSis read, where the running counter lives, and the trade-off behind between-task enforcement. - See Resume mechanics for what
tilth resumedoes to a cap-stopped run (and why a token-cap stop will re-trip immediately unless you bump the cap).
![Terminal capture: task T-003 reaches iter 8, the harness logs "task T-003 hit iteration cap [TILTH_MAX_ITERATIONS_PER_TASK=8]" and then "× T-003 failed (iter_cap); halting run". A run summary block follows: session 20260523-082151-45f0a5, branch session/20260523-082151-45f0a5, duration 2m27s (2.0% of TILTH_MAX_WALL_CLOCK_MINUTES=120), tokens 75,387 (3.8% of TILTH_MAX_TOKENS=2,000,000), tasks done=2 failed=1 pending=2.](../../assets/iter-cap-and-summary.png)