@@ -175,3 +175,30 @@ Implementation targets: |
| 175 | 175 | - a first-class permission rule editor |
| 176 | 176 | - AST-aware, LSP-aware, or symbol-aware editing |
| 177 | 177 | - multi-agent or team orchestration |
| 178 | + |
| 179 | +## Audit |
| 180 | + |
| 181 | +### Landed |
| 182 | + |
| 183 | +- clarify now has explicit pressure-pass discipline instead of only slot-follow-up behavior: `src/loader/runtime/clarify_strategy.py`, `src/loader/runtime/workflow_policy.py`, and `src/loader/runtime/workflow_lanes.py` track readiness gates such as `non_goals`, `decision_boundaries`, and `pressure_pass`, and can drive later clarify rounds toward examples, tradeoffs, and challenged assumptions |
| 184 | +- brownfield clarify is now grounded in discovered workspace evidence instead of relying only on user answers and task text: `src/loader/runtime/clarify_grounding.py` feeds repo paths, repo facts, slot-aware evidence, pressure-aware evidence, and grounded brief hints into clarify prompts, fallback questions, and persisted brief synthesis |
| 185 | +- invalidation and recovery now use richer structured evidence than file drift alone: `src/loader/runtime/artifact_invalidation.py`, `src/loader/runtime/workflow_policy.py`, and `src/loader/runtime/workflow_recovery.py` distinguish confirmed touchpoints, inferred touchpoints, acceptance anchors, contradicted assumptions, verification contradictions, and task-boundary drift, and surface that evidence through workflow inspection |
| 186 | +- workflow/operator surfaces now explain clarify pressure and recovery evidence more directly: `src/loader/runtime/inspection.py` and `src/loader/cli/main.py` surface pressure metadata, recovery evidence, and the newer workflow history context instead of only route labels |
| 187 | +- the runtime shell is now controller-based instead of monolithic: `src/loader/runtime/workflow_recovery.py`, `src/loader/runtime/turn_preparation.py`, `src/loader/runtime/turn_completion.py`, `src/loader/runtime/turn_iteration.py`, `src/loader/runtime/turn_preamble.py`, `src/loader/runtime/workflow_state.py`, and `src/loader/runtime/turn_loop.py` each own a distinct orchestration seam, and `src/loader/runtime/conversation.py` is reduced to a compact coordinator |
| 188 | + |
| 189 | +### Verification |
| 190 | + |
| 191 | +- `uv run pytest -q` is green: `231 passed` |
| 192 | +- `tests/test_clarify_strategy.py` covers pressure-pass reviews, readiness gates, and later-round clarify pressure selection |
| 193 | +- `tests/test_clarify_grounding.py` covers workspace evidence extraction, slot-aware evidence selection, pressure-aware grounding, and grounded brief hints |
| 194 | +- `tests/test_artifact_invalidation.py`, `tests/test_workflow_policy.py`, `tests/test_workflow_runtime.py`, and `tests/test_inspection.py` cover structured drift evidence, contradiction-driven recovery, workflow pressure metadata, and operator-facing recovery summaries |
| 195 | +- `tests/test_turn_preparation.py`, `tests/test_turn_completion.py`, `tests/test_turn_iteration.py`, `tests/test_turn_preamble.py`, `tests/test_workflow_state.py`, and `tests/test_turn_loop.py` give direct coverage to the new controller seams instead of relying only on large end-to-end runtime tests |
| 196 | +- targeted `ruff` checks stayed green on the touched runtime/controller modules and their new tests throughout the extraction work, and the full suite remained green after each slice |
| 197 | + |
| 198 | +### Residual debt |
| 199 | + |
| 200 | +- clarify is now pressure-aware and grounded, but it remains bounded and shallower than OMX's interview style; Loader still does not adapt interview depth by task class or run richer challenge/consensus passes |
| 201 | +- the new invalidation evidence is a much better contract than text overlap alone, but it is still runtime-authored and heuristic; Loader still does not apply deeper semantic reasoning over artifacts or symbols, and it does not use model-assisted contradiction analysis |
| 202 | +- `src/loader/runtime/conversation.py` is now a real coordinator, but `src/loader/runtime/turn_iteration.py` remains the heaviest seam and still carries repair, completion, and tool-routing policy that claw-code spreads across narrower runtime modules |
| 203 | +- workflow/operator surfaces explain more than they did at Sprint 11, but they still stop short of artifact diffs, prompt/history comparison, and richer timeline drill-down |
| 204 | +- Loader is much closer to a controller-based runtime than it was at the start of Sprint 12, but it still does not match claw-code or OMX on deeper planning rigor, semantic artifact discipline, or broader operator ergonomics |