Pipeline Steps

This is the per-step reference. For the overview and rationale, see Pipeline. For the fix loop, see Auto-Fix Loop.

intent → rebase → review → test → document → lint → push → pr → ci

Each step can produce findings, request approval, trigger auto-fix, or apply safe fixes during its own pass. Steps that encounter fatal errors stop the pipeline. Steps can also be pre-skipped when starting a run, skipped by the user, or skipped automatically by the pipeline. In the TUI, yolo mode is an explicit override that auto-resolves paused steps: auto-fix and ask-user findings are fixed once with every finding selected, fix-review gates are approved, and gates with only no-op findings are approved as-is. Every pipeline agent invocation is prompt-steered to keep intentional writes inside the run worktree and avoid mutating system state outside it. This is a soft boundary, not OS-level sandbox enforcement. The steering still allows requested test evidence under the managed temporary no-mistakes-evidence directory or the configured in-repo evidence directory, plus incidental temp or cache writes from normal development tools. Configured shell commands and one-shot agent subprocesses are scoped to their step: when the invocation exits, fails, or is cancelled, no-mistakes terminates remaining child processes it spawned so background workers do not outlive the run. When configured Test or Lint command output exceeds 64 KiB, the complete output remains in the authoritative step log while findings, IPC responses, and repair prompts receive a valid-UTF-8 head-and-tail projection capped at 64 KiB. The truncation marker reports the exact original and omitted byte counts and points to no-mistakes axi logs --step <step> --full for the complete output. Commits created by the shared Review, Test, Document, and Lint fix path use the configurable commit.fix_message template.

Intent

Uses agent-supplied intent when a run provides it, otherwise infers the author’s intent from recent local Claude Code, Codex, OpenCode, Rovo Dev, Pi, or GitHub Copilot CLI transcripts. This is best-effort context, and when available it is included in rebase fixes, review checks and fixes, test detection, evidence validation, and fixes, documentation checks and fixes, lint detection and fixes, CI auto-fixes, and PR drafting.

Behavior:

Uses run-supplied intent verbatim and skips transcript-based inference, even when intent.enabled is false
Runs transcript-based inference only when intent.enabled is true
Matches local agent transcripts against non-deleted changed files when present, falling back to all changed files for all-deletion diffs, may use the configured pipeline agent to disambiguate plausible matches, and summarizes the likely author intent with that agent
Stores the derived summary, source, session ID, and match score on the run
Logs accepted candidate diagnostics, including source, session, CWD, score, confidence, overlap, decision, and acceptance reason
Logs the matched source, score, and sanitized inferred intent when a transcript matches
Skips instead of failing when disabled, no matching transcript is found, the diff is empty, extraction errors, or persistence fails

This step does not block the pipeline for missing transcripts, summarization that exceeds the five-minute extraction cap, or other extraction failures, which are reported as skipped outcomes. It can fail the run only if cleanup fails after the disambiguation agent leaves worktree side effects.

Rebase

Fetches the latest authoritative remote state, fetches the configured pushed-branch target, and rebases your branch onto those refs.

Behavior:

Fetches origin/<default_branch> from the remote into the worktree, and also fetches the pushed branch for non-default branches unless the push rewrote branch history
Without fork routing, the pushed-branch target is origin/<branch>
With GitHub fork routing, the pushed-branch target is the fork branch fetched into refs/remotes/no-mistakes-push/<branch>
If the branch is not the default branch, tries rebasing onto the pushed-branch target first, then origin/<default_branch>
If the push rewrote branch history, skips the pushed-branch rebase target so prior remote autofix commits do not get reintroduced
If the push rewrote the default branch and origin/<default_branch> advanced after that rewrite, pauses for manual approval before updating the branch
If the branch carries commits from the contributor’s local default branch that are not on origin/<default_branch>, pauses with an ask-user finding instead of silently bundling that local work into the PR
The local-default check is best-effort and only fires when the local default tip is ahead of origin/<default_branch> and is an ancestor of the branch HEAD
Skips targets that don’t exist or are already ancestors
If a fast-forward is possible, does a hard-reset instead of a rebase
If the diff against the default branch is empty after rebase, completes rebase and skips all remaining pipeline steps
On conflict: records conflicting files, aborts the rebase, and reports findings

Auto-fix: when enabled, the agent resolves conflict markers, stages files, and runs git rebase --continue in a non-interactive Git environment so Git accepts the existing commit message instead of opening an editor. The prompt includes user intent when available. Manual fix rounds also include any per-conflict user notes, any selected user-authored findings from the TUI or AXI interface, and sanitized prior-round history in the prompt. The Rebase step does not synthesize a fix commit subject; git rebase --continue preserves the rebased commits’ subjects.

Default auto-fix limit: 3.

Review

AI code review of your diff.

Behavior:

Diffs the base commit against head
Filters out files matching ignore_patterns from the repo config
Sends the filtered diff to the agent with structured review instructions and a structured output schema
Includes user intent when the run has supplied intent or transcript matching found a relevant local agent session; the detailed provenance semantics are documented in Intent extraction
Treats authoritative intent as enforceable for source-verifiable acceptance criteria, but does not report the absence of a remote branch, push, pull request, or CI state that this run’s later Push, PR, or CI step owns
Removes any returned finding whose sole claim is that one of those same-run delivery outcomes is not present yet, while keeping findings about pre-existing or external pull requests, third-party artifacts, and lifecycle state that the current run does not own
Keeps the later Push, PR, and CI steps responsible for strictly validating their own outcomes after review completes
For changes that claim a durable bug fix, reconstructs the concrete failing sequence and required invariant, inspects relevant sibling paths and shared state transitions, and reports an inadequate fix only when source evidence proves the same authorized failure remains reachable; the recommendation targets the earliest supported shared boundary
Does not treat code shape or duplication alone as evidence of a systemic defect, demand speculative redesign, block explicitly authorized short-term containment merely because a later durable fix is possible, expand the user’s scope, or promote optional improvements into blockers
Agent returns findings with severity (error, warning, info), file location, description, and an action (no-op, auto-fix, ask-user)
Also returns a risk_level (low, medium, high) and risk_rationale
With the default session_reuse: true, Claude and Codex reuse one reviewer session across the initial review and every full rereview, and a separate fixer session across review-fix turns
A resume failure retries the same turn in a fresh session for that role, never skips the full rereview, and unsupported agents run cold
Atomically records the exact commit examined when a full review completes successfully; a parked review retains its candidate only for recovery, while failed, skipped, superseded, and legacy reviews grant no inferred approval authority

Approval: required if any finding has severity error or warning. Findings with action: ask-user pause for approval instead of entering the normal auto-fix loop. This is for findings that challenge the author’s intent, not routine correctness, reliability, or security fixes that may need to re-add a small amount of deleted logic. With the default auto_fix.review: 0, blocking review findings park for approval even when their action is auto-fix; setting repo or global auto_fix.review above 0 re-enables the automatic review fix loop for eligible auto-fix findings. Findings with action: no-op are informational only. The shared finding-action model owns the behavior for a missing action.

Auto-fix: the agent receives the selected previous findings plus any per-finding user notes, any selected user-authored findings from the TUI or AXI interface, and a sanitized history of prior rounds for that step, including earlier fix summaries and which findings the user left unselected. The fixer applies all selected fixes before running one focused verification limited to the changed area, and it is instructed not to run the complete repository test or lint suite during the fix round. The dedicated Test and Lint steps after review remain the authoritative gates, although their coverage may be focused when commands are unconfigured. Follow-up review passes use the history to avoid re-reporting user-ignored findings unless the code now has a materially different problem.

Default auto-fix limit: 0.

Post-review HEAD continuity

At entry to every remaining step in the fixed pipeline order - Test, Document, Lint, Push, PR, and CI - no-mistakes compares the live worktree HEAD with the pipeline-recorded head. An equal head or a pipeline-descendant commit continues. A backward reset, divergent sibling, or unverifiable relationship fails the run before that step performs work, including for steps that would not create a commit.

Test

Runs targeted local validation of the change and requested intent, then gathers evidence for that intent. Local Test is never a repository-wide regression-suite substitute; broad regression is owned by remote CI and remains mandatory before a PR is ready. commands.test owns the configuration contract for any explicit baseline command.

Behavior:

If commands.test is set in repo config: runs it first as a baseline via the platform shell (sh -c on POSIX, cmd.exe /c on Windows) and captures output. Non-zero exit produces error findings. Configure a targeted command here (see repo-config); do not treat this field as CI-parity complete-suite configuration.
If commands.test is empty, or user intent is available after the baseline command passes: the agent validates the change with the smallest relevant evidence-oriented tests or manual checks, returning structured findings with severity, description, and action (no-op, auto-fix, ask-user). Both the normal evidence agent and the Test-repair agent are instructed not to run the complete repository test suite; a generic driver instruction asking for broad or full-suite confirmation does not override that product boundary. For UI, HTML, CSS, browser, visual layout, or copy-placement changes, the agent attempts reviewer-visible visual evidence and explains in testing_summary when screenshots, images, videos, GIFs, or rendered HTML artifacts are not captured.
“Do not run everything” is not “run nothing”: when no targeted check can establish the intent, the agent must write or improve a focused test, perform manual verification with evidence, or report a warning finding that sufficient targeted evidence is not possible.
The step records the exact tests and checks it exercised in a tested array, may include a short natural-language testing_summary, and includes an artifacts array for reviewer-visible evidence; path artifacts may be repository-relative paths or absolute paths under the temporary no-mistakes-evidence/<runID> directory, url artifacts must be externally visible, and content artifacts should be short logs or command output shown directly in the PR.
By default, evidence is stored under the temporary no-mistakes-evidence/<runID> directory. With test.evidence.store_in_repo: true, evidence is stored under <test.evidence.dir>/<branch-slug> inside the worktree, staged during push, and published with the branch. Unsafe, symlinked, or Git-ignored evidence directories fall back to temporary storage for that run.
Before finishing, test agents are instructed to remove transient working-tree artifacts they created, such as downloaded models, caches, build outputs, large binaries, or generated data directories, while preserving intentional source or test-file changes and evidence files under the dedicated evidence directory.
Missing evidence for user intent can be reported as a warning with action: ask-user.
If the agent creates new test files (detected via git status --porcelain), they are recorded as informational no-op findings and do not require approval when tests pass.

Approval: test findings with action: ask-user pause for approval, including missing-evidence warnings for user intent. action: auto-fix findings stay eligible for the fix loop. action: no-op findings are informational only.

Auto-fix: the agent receives the previous test findings plus any per-finding user notes, any selected user-authored findings from the TUI or AXI interface, and a sanitized history of prior rounds for that step, including earlier fix summaries and any findings the user left unselected in prior approval cycles. Repair mode reproduces the specific failure, applies a root-cause fix, and re-runs only focused verification - not a complete-suite confirmation - then the step’s configured baseline (if any) and evidence path run again.

Default auto-fix limit: 3.

Document

Updates matching documentation for code changes and reports only unresolved gaps.

Behavior:

Diffs the base commit against head and skips the step if there are no non-ignored changed files to document
Asks the agent to find every documentation gap, update docs or doc comments for all gaps it can resolve, verify its edits, and commit any documentation changes under the placement policy
The placement policy gives each fact one authoritative owner, prefers removing stale duplicates or replacing them with pointers, avoids new documentation surfaces for perceived gaps, and keeps durable incident lessons near their owner instead of in AGENTS.md
document.instructions can add trusted default-branch ownership rules for the repository
When commands.lint is empty, performs documentation and agent-driven lint in one combined housekeeping invocation, categorizing findings for the document or lint gate; if that pass is skipped, its structured output is unusable, or a daemon restart loses the in-memory result, lint runs its own agent pass instead
Includes user intent when available
Returns findings only for unresolved documentation gaps or human judgment calls
Requires approval whenever any unresolved documentation finding is returned, including info findings

Auto-fix: documentation fixes happen during the initial document pass. Unresolved findings pause for approval instead of starting another automatic document/fix loop. If you manually trigger a fix from the TUI or AXI interface, the agent receives the selected previous findings plus any per-finding user notes, any selected user-authored findings, and sanitized prior-round history.

Default auto-fix limit: not used for automatic document follow-up loops.

Lint

Runs linters and static analysis.

Behavior:

If commands.lint is set: runs it via the platform shell (sh -c on POSIX, cmd.exe /c on Windows). Non-zero exit produces warning findings.
If commands.lint is empty: consumes lint-category findings from the document step’s combined housekeeping pass, avoiding a second cold agent invocation. If no usable combined result exists, the lint step detects appropriate linters/formatters, applies safe fixes, reruns the relevant checks, commits any agent changes, and returns structured findings only for unresolved issues.

Approval: lint findings with action: ask-user pause for approval. action: auto-fix findings stay eligible for the fix loop when commands.lint is configured. action: no-op findings are informational only. Combined-pass lint findings use the same gate: error and warning findings pause for a decision, while info findings do not.

Auto-fix: when commands.lint is configured, the lint step follows the same pattern as test - the agent fixes action: auto-fix issues using the previous findings plus any per-finding user notes, any selected user-authored findings from the TUI or AXI interface, and a sanitized history of prior rounds for that step, including earlier fix summaries and any findings the user left unselected in prior approval cycles, then lint re-runs. When commands.lint is empty, unresolved findings from the combined pass pause for approval instead of starting another automatic lint/fix loop, because the agent already attempted safe fixes during housekeeping.

Default auto-fix limit: 3.

Push

Pushes the validated branch to the configured push target.

Behavior:

If commands.format is set, runs it first
Stages in-repo test evidence artifacts when test.evidence.store_in_repo is enabled and the evidence directory is not ignored by Git
Commits any uncommitted agent changes with message no-mistakes: apply agent fixes
Without fork routing, successful run-start validation selects the upstream URL from the working clone; when it matches the gate worktree’s origin, the worktree URL is used so embedded credentials retained outside the database can authenticate. If validation fails, the run continues with its prior routing.
With GitHub fork routing, the push target is repos.fork_url
Immediately before remote mutation, reloads the durable review-approved commit and refuses to push when that binding is missing, malformed, or unreachable
Requires the commit proposed for push to equal or descend from the review-approved commit, allowing commits made by later pipeline steps without authorizing unrelated history
Re-reads the push target via git ls-remote before pushing
For existing branches, refuses to force-push when the live remote carries commits the pipeline has not incorporated by patch-id
Fails closed when the remote safety check cannot verify whether the push would discard existing remote work
Uses --force-with-lease=<ref>:<sha> with an explicit SHA anchor for allowed existing-branch rewrites
Pushes the exact verified commit SHA instead of mutable worktree HEAD
Treats the branch as already pushed when the remote already points at that verified commit
Uses regular push for new branches
Updates the run’s head SHA in the database to the exact commit delivered

A remote branch can move without being rejected when all remote commits are already represented in the validated head, or when a run is intentionally rewriting history it already knew about. Any other out-of-band commit stops the push instead of being overwritten. Pre-skipping or later skipping Review leaves no approval binding, so Push fails closed unless Push is also skipped.

This step never requires approval - it runs automatically after review, test, document, and lint pass.

PR

Creates or updates a pull request.

Skipped when:

The branch is the default branch
The upstream host is not GitHub, GitLab, Bitbucket Cloud (bitbucket.org), or Azure DevOps (dev.azure.com / *.visualstudio.com)
The provider CLI (gh or glab) is not installed for GitHub or GitLab
The provider CLI is not authenticated for GitHub or GitLab
Bitbucket Cloud credentials are missing (NO_MISTAKES_BITBUCKET_EMAIL or NO_MISTAKES_BITBUCKET_API_TOKEN)
The az CLI with the azure-devops extension is not installed or not authenticated for Azure DevOps
A legacy or manually edited GitLab, Bitbucket, or Azure DevOps repo record has fork_url set, because fork MR/PR routing is currently GitHub-only

Behavior:

Checks for an existing PR on the branch
If one exists, updates it. If not, creates a new one.
Uses the provider CLI for GitHub/GitLab, the az CLI for Azure DevOps, and the Bitbucket API for Bitbucket Cloud
For GitHub fork routing, keeps gh --repo pointed at the parent repository from origin, checks existing PRs with the bare branch name, filters matching PRs by head owner, and creates PRs with --head <fork-owner>:<branch>
PR title: agent-generated with user intent when available, in conventional commit format (type(scope): description or type: description); user-facing product impact should use feat or fix so release automation can pick it up; when a scope is used, it should be the primary affected real module/package from the changed paths and kept broad rather than file-level
PR body includes a ## Intent section when user intent is available, an agent-authored ## What Changed, and regenerated ## Risk Assessment, ## Testing, and ## Pipeline sections from recorded step results and rounds; auto-fix results in ## Pipeline render as an issue -> fix -> verification narrative using captured fix summaries, re-check success text, and any still-open findings
Generated PR bodies are capped at 63,488 bytes, leaving a 2 KB safety buffer below GitHub’s 65,536-character body limit.
When a body would exceed that cap, the PR step first omits older ## Pipeline update rounds at clean update boundaries, keeps the newest rounds when possible, and points reviewers to the run log for the full pipeline history.
Intent, ## What Changed, risk, and testing sections are kept ahead of pipeline history; if those sections or the newest pipeline update are still too large, the PR step truncates at line or section boundaries and adds an explicit marker.
The regenerated ## Testing section prefers the recorded testing_summary as prose, uses a compact recorded-check count when no summary is available, includes produced evidence artifacts from path, url, or content fields when available, and only adds an outcome with run count and total duration when it is failed or needed as a fallback
Evidence artifacts render compactly in PR bodies: repository-relative path artifacts and url artifacts become Evidence links, content artifacts appear in collapsible details blocks, GitHub PRs convert repository-relative paths to blob URLs, readable UTF-8 text files from the temporary evidence directory are embedded inline with truncation for large files, and binary, visual, or over-budget local artifacts render as non-link local file references
For Azure DevOps, the PR description is capped at 4000 characters (UTF-16 code units, matching .NET’s measurement): the agent is told about the cap and asked to keep the ## What Changed section compact; if the assembled body still overruns, the ## Testing section is dropped first (it embeds artifact and log content and is effectively unbounded) so the Intent, What Changed, Risk Assessment, and Pipeline sections are preserved; a final connector-level clamp truncates with a visible marker as a last-resort backstop

Stores the PR URL in the database and streams it to the TUI.

CI

Monitors PR health after creation and auto-fixes CI failures. Mergeability polling and merge-conflict handling now apply to GitHub, GitLab, and Azure DevOps.

Active for GitHub, GitLab, Bitbucket Cloud (bitbucket.org), and Azure DevOps (dev.azure.com / *.visualstudio.com).

GitHub requires gh CLI, installed and authenticated.
GitLab requires glab CLI, installed and authenticated.
Bitbucket Cloud requires NO_MISTAKES_BITBUCKET_EMAIL and NO_MISTAKES_BITBUCKET_API_TOKEN.
Azure DevOps requires the az CLI with the azure-devops extension, authenticated with a PAT.

Behavior:

Polls provider CI status at increasing intervals: every 30s for the first 5 minutes, every 60s for 5-15 minutes, every 120s after that
Continues its normal monitoring loop until the PR is merged, closed, declined, or the configured ci_timeout idle window elapses, then parks at an approval gate instead of ending the run
The ci_timeout reference owns idle re-arming, unlimited monitoring, and fail-closed reconciliation while that gate is parked
On GitHub, GitLab, and Azure DevOps, polls provider mergeability alongside CI checks while the PR remains open
While the PR stays open, the TUI and terminal title show Checks passed once checks are green and known mergeability is clear, and no-mistakes axi returns outcome: checks-passed with successful-output reporting instructions so agents can summarize the run, ask the user to review and merge, and list any pipeline fixes instead of waiting
If the default branch moves after checks-passed, keeps watching the same PR; a clean behind PR needs no action, while an actual GitHub, GitLab, or Azure DevOps merge conflict is auto-fixed by rebasing onto the base and re-pushing through the force-push safety guard
The ready signal clears if checks start running again, new failures appear, provider state becomes uncertain, or the PR is merged, closed, or declined
Waits a 60s grace period before trusting empty results (CI checks may not have registered yet)
If CI failures or, on GitHub, GitLab, or Azure DevOps, a merge conflict are already known while other checks are still pending: waits for all checks to finish before attempting an auto-fix
On CI failure: fetches failed job logs (GitHub via gh run view --log-failed, GitLab via glab ci trace, Bitbucket Cloud via failed pipeline step logs; Azure DevOps has no first-class build-log command, so the agent fixes from the failing-check list without logs), sends them to the agent with user intent when available, and, if the agent produces changes, commits them and uses the same force-push safety guard as the push step
On GitHub, GitLab, or Azure DevOps merge conflict: asks the agent to rebase onto the latest default-branch tip and make the smallest correct root-cause fix for the conflicts, using user intent when available
If both CI failures and a GitHub, GitLab, or Azure DevOps merge conflict are present: fixes both in the same attempt
If a fix attempt produces no changes: automatic mode leaves the failure undeduplicated so it can retry until the auto-fix limit, while manual fix mode returns immediately for manual intervention
Deduplicates fix attempts only after a fix is actually committed and pushed
Exits cleanly when the PR is merged, closed, or declined
If the idle timeout is reached while the PR is still open: pauses for user approval, even when CI checks are currently healthy
If the idle timeout is reached while CI failures or, on GitHub, GitLab, or Azure DevOps, a merge conflict are still known: pauses for user approval with findings for the remaining issues
If the idle timeout is reached while GitHub, GitLab, or Azure DevOps PR mergeability is still unresolved: pauses for user approval with a finding describing the unresolved mergeability state
If CI failures or a GitHub, GitLab, or Azure DevOps merge conflict persist after the auto-fix limit: pauses for user approval with findings listing each failing check and/or the merge conflict

Default auto-fix limit: 3 total CI auto-fix attempts.

Step statuses

Each step progresses through these statuses:

Status	Meaning
`pending`	Not yet started
`running`	Currently executing
`fixing`	Agent is auto-fixing issues
`awaiting_approval`	Paused, waiting for user action
`fix_review`	Paused after a fix cycle, showing results for review
`completed`	Finished successfully
`skipped`	Pre-skipped for the run, skipped by the user, or skipped automatically by the pipeline
`failed`	Step failed; the step log includes the returned error message so command stderr and provider errors are visible in the per-step log, not only in the daemon log

When a non-terminal run has a step in awaiting_approval or fix_review, AXI run objects also expose awaiting_agent: parked <duration> as a run-level observability signal. The signal clears as soon as the approval wait ends, including axi respond and cancellation, and does not change how gates resolve. When a step is running or fixing, AXI run objects expose an active_steps table with active duration, latest activity, native subprocess PID when present, and the current round such as round 1, auto-fix 1/3, or fix 2. If the latest activity is older than step_quiet_warning, AXI prefixes it with quiet to make possible wedges visible without changing the run state. Step logs also record native subprocess start, exit, and retry lifecycle lines plus explicit auto-fix and user-fix round markers.