@@ -0,0 +1,150 @@ |
| 1 | +# Sprint 15: Bootstrap Ownership, Service Burn-Down, and Explore Independence |
| 2 | + |
| 3 | +## Prerequisites |
| 4 | + |
| 5 | +Sprint 14 |
| 6 | + |
| 7 | +## Goals |
| 8 | + |
| 9 | +Finish the last high-value runtime cleanup that Sprint 14 exposed: move bootstrap ownership and the remaining agent-owned runtime services onto explicit runtime seams, so Loader's hot path is not only context-driven after initialization, but also context-driven at initialization. |
| 10 | + |
| 11 | +Sprint 14 was a real architectural win. `RuntimeContext` is now the primary seam across workflow state, turn phases, response repair, response routing, turn looping, workflow recovery, and finalization. The older `RuntimeLegacyServices` shim is gone, raw-text tool recovery no longer depends on hidden agent extractors, and the main runtime path is much less accidental than it was when the audit line first branched. |
| 12 | + |
| 13 | +That said, the residual debt is now very specific:
| 14 | + |
| 15 | +- `conversation.py` and `explore.py` still bootstrap from `agent._build_runtime_context()` |
| 16 | +- `agent/loop.py` still owns too much prompt/session/bootstrap coordination for a runtime that is otherwise context-owned |
| 17 | +- `agent/reasoning.py` and `agent/safeguards.py` still own meaningful runtime behavior behind typed protocols |
| 18 | +- the audit's core warning still matters in a narrower form: |
| 19 | + Loader should keep deleting wrapper-only ownership, not just wrapping it in nicer files |
| 20 | + |
| 21 | +Sprint 15 is about finishing that next contraction honestly: |
| 22 | + |
| 23 | +- runtime bootstrapping becomes an explicit runtime contract instead of an agent-only helper |
| 24 | +- explore mode stops being a special runtime that still depends on agent construction shape |
| 25 | +- reasoning and safeguard ownership become more inventoryable and less agent-bound |
| 26 | +- `agent/loop.py` shrinks further toward entrypoint/session orchestration instead of runtime-service ownership |
| 27 | + |
| 28 | +This sprint should feel like closing the structural loop opened by Sprint 14, not starting a new product branch. |
| 29 | + |
| 30 | +The references for this sprint are: |
| 31 | + |
| 32 | +- `refs/claw-code/rust/crates/runtime/src/conversation.rs` |
| 33 | +- `refs/claw-code/rust/crates/runtime/src/policy_engine.rs` |
| 34 | +- `refs/claw-code/rust/crates/runtime/src/prompt.rs` |
| 35 | +- `refs/claw-code/rust/crates/runtime/src/runtime_context.rs` |
| 36 | +- `refs/claw-code/PARITY.md` |
| 37 | +- `.docs/audit.txt` |
| 38 | +- `.docs/audit_sprints/trunk_sitrep.md` |
| 39 | +- `.docs/audit_sprints/sprint13_closure.md` |
| 40 | +- `refs/oh-my-codex/src/ralplan/runtime.ts` |
| 41 | +- `refs/oh-my-codex/src/verification/verifier.ts` |
| 42 | + |
| 43 | +## Deliverables |
| 44 | + |
| 45 | +### 1. Runtime bootstrap becomes a first-class runtime seam |
| 46 | + |
| 47 | +Sprint 14 made `RuntimeContext` load-bearing after construction. Sprint 15 should make construction itself less agent-special. |
| 48 | + |
| 49 | +Implementation targets: |
| 50 | + |
| 51 | +- introduce an explicit runtime bootstrap/factory seam under `src/loader/runtime/`, likely covering:
| 52 | + - building `RuntimeContext` |
| 53 | + - initializing project/session/prompt/capability state needed by runtimes |
| 54 | + - synchronizing prompt/capability metadata when the backend or prompt contract changes |
| 55 | +- reduce direct runtime dependence on `agent._build_runtime_context()` so `ConversationRuntime` and `ExploreRuntime` do not depend on a hidden agent helper as their primary construction mechanism |
| 56 | +- keep the contract pragmatic: |
| 57 | + - it is acceptable for `Agent` to call the factory |
| 58 | + - it is not acceptable for runtime correctness to depend on ad hoc agent-only bootstrap behavior |
| 59 | + |
| 60 | +The goal is to make runtime ownership explicit from the first line of construction, not only once the turn is already running. |
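A minimal sketch of what such a factory seam could look like, assuming a much simpler `RuntimeContext` than the real one. Every field, `BootstrapConfig`, `RuntimeMode`, `WRITE_CAPABILITIES`, `build_runtime_context`, and `sync_prompt_contract` are hypothetical names chosen for illustration, not existing Loader APIs. The point is only that any caller can build a context from plain inputs, and that the main and explore lanes could go through the same function.

```python
from __future__ import annotations

from dataclasses import dataclass, replace
from enum import Enum


class RuntimeMode(Enum):
    MAIN = "main"
    EXPLORE = "explore"


@dataclass(frozen=True)
class RuntimeContext:
    """Stand-in for the real RuntimeContext; every field here is an assumption."""
    project_root: str
    session_id: str
    prompt_version: str
    capabilities: frozenset[str]
    read_only: bool


@dataclass(frozen=True)
class BootstrapConfig:
    """Hypothetical inputs that any caller (Agent, CLI, test) can hand to the factory."""
    project_root: str
    session_id: str
    prompt_version: str
    capabilities: frozenset[str]


# Hypothetical capability names that mutate state; explore must never receive them.
WRITE_CAPABILITIES = frozenset({"write_file", "run_shell"})


def build_runtime_context(
    config: BootstrapConfig, mode: RuntimeMode = RuntimeMode.MAIN
) -> RuntimeContext:
    """Build a context from plain inputs, without reaching into Agent internals."""
    read_only = mode is RuntimeMode.EXPLORE
    capabilities = config.capabilities
    if read_only:
        capabilities = capabilities - WRITE_CAPABILITIES
    return RuntimeContext(
        project_root=config.project_root,
        session_id=config.session_id,
        prompt_version=config.prompt_version,
        capabilities=capabilities,
        read_only=read_only,
    )


def sync_prompt_contract(
    ctx: RuntimeContext, prompt_version: str, capabilities: frozenset[str]
) -> RuntimeContext:
    """Return a refreshed context after the backend or prompt contract changes."""
    if ctx.read_only:
        capabilities = capabilities - WRITE_CAPABILITIES
    return replace(ctx, prompt_version=prompt_version, capabilities=capabilities)


config = BootstrapConfig("/repo", "s-1", "v1", frozenset({"read_file", "write_file"}))
main_ctx = build_runtime_context(config)
explore_ctx = build_runtime_context(config, RuntimeMode.EXPLORE)
```

In this shape `Agent` may still call `build_runtime_context`, but nothing about the result depends on `Agent` having been constructed first, which is the property the targets above ask for.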
| 61 | + |
| 62 | +### 2. Burn down more agent-owned runtime services |
| 63 | + |
| 64 | +The remaining runtime service ownership now lives mostly in `agent/reasoning.py` and `agent/safeguards.py`. |
| 65 | + |
| 66 | +Implementation targets: |
| 67 | + |
| 68 | +- inventory the still-runtime-relevant behavior in: |
| 69 | + - `src/loader/agent/reasoning.py` |
| 70 | + - `src/loader/agent/safeguards.py` |
| 71 | +- move or re-home the behavior that is still genuinely runtime-owned, especially around: |
| 72 | + - confidence / verification service boundaries |
| 73 | + - stream filtering / steering / duplicate detection |
| 74 | + - action validation hooks |
| 75 | +- prefer one real implementation plus compatibility exports over keeping a runtime wrapper around an agent-owned implementation indefinitely |
| 76 | +- explicitly delete or retire dead wrapper layers where the runtime already has a better home |
| 77 | + |
| 78 | +This is the sprint where we should be suspicious of “adapter forever” solutions. If a behavior is still part of the runtime contract, it should increasingly live under `runtime/`. |
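One way to read "one real implementation plus compatibility exports" is sketched below. It assumes the real logic has moved under `runtime/` and that the old agent-side name survives only as a thin alias that warns on use. `is_duplicate_action`, `deprecated_alias`, and the dotted path in the warning text are hypothetical, and the duplicate check itself is a deliberately trivial placeholder.

```python
import warnings
from functools import wraps


def is_duplicate_action(history: list[str], action: str) -> bool:
    """Runtime-owned implementation (placeholder logic for illustration only)."""
    return action in history


def deprecated_alias(func, old_path: str):
    """Wrap func so the old import path still works but announces its exit story."""

    @wraps(func)
    def wrapper(*args, **kwargs):
        warnings.warn(
            f"{old_path} is a compatibility export; call the runtime-owned "
            f"{func.__name__} instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return func(*args, **kwargs)

    return wrapper


# What a module such as agent/safeguards.py could keep exporting during the
# migration: an alias with no behavior of its own.
legacy_is_duplicate_action = deprecated_alias(
    is_duplicate_action, "agent.safeguards.is_duplicate_action"
)
```

The alias holds no logic, so deleting it later is a one-line change, and the warning makes remaining callers easy to find in test output.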
| 79 | + |
| 80 | +### 3. Explore runtime should share the same bootstrap discipline |
| 81 | + |
| 82 | +Explore is intentionally narrower than the main runtime, but it should not be structurally special in the wrong way. |
| 83 | + |
| 84 | +Implementation targets: |
| 85 | + |
| 86 | +- remove or narrow the `ExploreRuntime(agent)` construction shape so explore can be built from the same runtime bootstrap contract as the main runtime |
| 87 | +- keep the read-only registry, read-only permission mode, and capability refresh behavior intact |
| 88 | +- add direct tests for explore bootstrap and state ownership so explore remains a maintained runtime lane rather than a side path |
| 89 | + |
| 90 | +The goal is not to make explore bigger. The goal is to make it less magical and more aligned with the primary runtime contract. |
| 91 | + |
| 92 | +### 4. Shrink `agent/loop.py` toward entrypoint orchestration |
| 93 | + |
| 94 | +Sprint 14 made the runtime path smaller and more explicit. Sprint 15 should let `agent/loop.py` benefit from that work. |
| 95 | + |
| 96 | +Implementation targets: |
| 97 | + |
| 98 | +- move more bootstrap/session/prompt/runtime wiring out of `agent/loop.py` where it has become runtime ownership in practice |
| 99 | +- keep `agent/loop.py` focused on: |
| 100 | + - public entrypoints |
| 101 | + - session-facing orchestration |
| 102 | + - UI/event integration |
| 103 | + - compatibility wrappers that still truly need to exist |
| 104 | +- avoid letting new runtime helpers bounce back into `agent/loop.py` just to preserve old ownership lines |
| 105 | + |
| 106 | +This is the step that turns Sprint 14's seam cleanup into a visibly smaller agent shell. |
| 107 | + |
| 108 | +### 5. Keep the audit line active as a regression check, not a second roadmap |
| 109 | + |
| 110 | +`audit.txt` is old on specifics but still sharp on the pattern to avoid: additive cleanup that never deletes ownership. |
| 111 | + |
| 112 | +Implementation targets: |
| 113 | + |
| 114 | +- use the audit's core complaint as a check against Sprint 15 implementation: |
| 115 | +  - do not add a new wrapper if we can adopt an existing implementation or delete the layer instead
| 116 | + - do not leave bootstrap ownership ambiguous |
| 117 | + - do not grow a “temporary” compatibility seam without direct tests and an exit story |
| 118 | +- update `PARITY.md` and the sprint audit only after the bootstrap/service changes are actually covered by tests
| 119 | + |
| 120 | +## Testing strategy |
| 121 | + |
| 122 | +- unit coverage for: |
| 123 | + - runtime bootstrap/factory behavior |
| 124 | + - explore bootstrap behavior |
| 125 | + - runtime-owned reasoning/safeguard services after migration |
| 126 | + - prompt/capability synchronization at the new bootstrap seam |
| 127 | +- runtime coverage for: |
| 128 | + - main turn execution through the new bootstrap path |
| 129 | + - explore mode through the shared bootstrap/runtime contract |
| 130 | + - Sprint 00-14 parity scenarios staying green after the bootstrap/service migration |
| 131 | +- regression coverage for: |
| 132 | + - no reintroduction of hidden raw-text extractors |
| 133 | + - no reintroduction of legacy callback shims equivalent to `RuntimeLegacyServices` |
| 134 | + - no ownership drift where runtime modules silently depend on agent-only helpers again |
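The ownership-drift check in the last bullet could be approximated with a static scan of runtime module sources. The sketch below assumes the agent package is importable as `loader.agent` (inferred from the `src/loader/agent/` paths, not confirmed). `forbidden_imports` and `private_helper_uses` are hypothetical helper names. Relative imports are deliberately ignored, so this is a coarse tripwire rather than a complete dependency analysis.

```python
import ast

AGENT_PACKAGE = "loader.agent"  # assumption: package name behind src/loader/agent/


def forbidden_imports(source: str, prefix: str = AGENT_PACKAGE) -> list[str]:
    """Return absolute imports in source that point at the agent package."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            names = [node.module]
        else:
            continue
        found.extend(
            name for name in names if name == prefix or name.startswith(prefix + ".")
        )
    return found


def private_helper_uses(source: str, helper: str = "_build_runtime_context") -> list[int]:
    """Return sorted line numbers where source accesses the agent-only helper."""
    return sorted(
        node.lineno
        for node in ast.walk(ast.parse(source))
        if isinstance(node, ast.Attribute) and node.attr == helper
    )
```

A regression test could read each file under `src/loader/runtime/` and assert that both functions return empty lists, with an explicit allowlist for any compatibility seam that still has to exist.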
| 135 | + |
| 136 | +## Definition of done |
| 137 | + |
| 138 | +- runtime bootstrapping is a first-class runtime seam, not primarily an agent helper |
| 139 | +- explore mode shares the same bootstrap discipline as the main runtime |
| 140 | +- more runtime-relevant behavior is moved or retired out of `agent/reasoning.py` and `agent/safeguards.py` |
| 141 | +- `agent/loop.py` shrinks further toward entrypoint/session orchestration |
| 142 | +- the parity baseline remains green after the bootstrap/service migration |
| 143 | + |
| 144 | +## Explicitly out of scope |
| 145 | + |
| 146 | +- full claw-code policy-engine parity |
| 147 | +- AST-aware or LSP-aware semantic artifact diffs |
| 148 | +- a richer permission rule editor |
| 149 | +- visual workflow tooling |
| 150 | +- multi-agent or team orchestration |