tenseleyflow/sway / 043f736

Browse files

README + CHANGELOG: multi-turn coherence section + Sprint 30 block

Authored by mfwolffe <wolffemf@dukes.jmu.edu>
SHA
043f736d39af01fa02579fc1a5eeef24c96087c3
Parents
332e32f
Tree
b193e13

2 changed files

StatusFile+-
M CHANGELOG.md 69 0
M README.md 36 3
CHANGELOG.mdmodified
@@ -2,6 +2,75 @@
2
 
2
 
3
 ## Unreleased
3
 ## Unreleased
4
 
4
 
5
+### Sprint 30 — `multi_turn_coherence_decay` probe
6
+
7
+Closes the P2 "multi_turn_coherence_decay probe" backlog item. Sway
8
+had zero coverage of multi-turn behavior before this — every
9
+shipped adherence probe was single-turn. Adapters that pass
10
+`delta_kl` cleanly frequently degrade by turn 2 or 3 of real
11
+dialogue, where the model's own previous responses enter the
12
+context window and create compounding drift. The new probe is the
13
+first that catches that failure mode.
14
+
15
+**New probe (`kind: multi_turn_coherence_decay`, category: adherence).**
16
+
17
+For each prompt the probe greedy-generates ft's turn-1 response,
18
+then rolls a multi-turn synthetic dialogue with cycled generic
19
+follow-ups (`Continue.`, `Tell me more.`, …). At each turn 2..N
20
+both views see the same ft-grounded chat history; the probe
21
+computes `KL(base || ft)` at the next-token position. The
22
+per-turn KL series is fit to `kl = a · exp(-b · turn)` and
23
+the probe reports `half_life_turns = ln(2) / b`.
24
+
25
+Verdict ladder:
26
+- `ok` — clean exponential decay; PASS when `half_life_turns ≥
27
+  assert_half_life_turns` (default 2.0).
28
+- `stable` — KL stayed within 0.1% relative spread across turns;
29
+  half-life is formally infinite; clipped to `max_turns × 10`
30
+  and rendered with a "held coherence" message; always PASS.
31
+- `non_monotonic` — KL grew turn-over-turn (atypical but
32
+  possible); WARN with the curve in evidence.
33
+- `degenerate` — KL ≈ 0 at every turn; FAIL with "probable no-op
34
+  adapter" diagnosis.
35
+
36
+**No null calibration.** Mirrors `prompt_collapse`'s rationale: a
37
+null adapter has no signal to decay, so the null distribution of
38
+half-lives is meaningless. Fixed-threshold verdicts are the
39
+published path.
40
+
41
+**Chat-template requirement.** Multi-turn dialogue requires the
42
+base's tokenizer to carry a `chat_template`. Bases without one
43
+SKIP gracefully with a clear message. The probe consults
44
+`ctx.backend._tokenizer.chat_template` — same backdoor
45
+`prompt_collapse` uses for the same reason (avoids broadening the
46
+public scoring contract for one probe's needs).
47
+
48
+Reports surface a tiny unicode sparkline of per-turn KL in the
49
+verdict message so terminal-only readers see curve shape without
50
+opening the JSON.
51
+
52
+**Implementation:**
53
+- `probes/multi_turn_coherence.py` — spec, probe, curve fit,
54
+  verdict mapping, sparkline.
55
+- `probes/__init__.py` — registers the new probe.
56
+
57
+**Test surface:**
58
+- `tests/unit/test_probe_multi_turn_coherence.py` — 22 unit tests
59
+  covering skip paths, end-to-end with planted-distribution
60
+  sequences (decreasing / flat / growing curves), fit math (clean
61
+  exp / stable / growing / zero / partial-zero), verdict mapping,
62
+  sparkline rendering, and chat-template detection.
63
+- `tests/integration/test_probe_multi_turn_coherence.py` —
64
+  slow+online HF smoke on a tiny SmolLM2-135M LoRA across 3
65
+  dialogue turns. Exercises the full chat-template + turn-loop +
66
+  curve-fit path on a real backend.
67
+
68
+**README** gains a "Multi-turn coherence" section between
69
+`Tool-use fidelity` and `Reproducing a sway run` with a worked
70
+YAML example. The probe table at "Why it exists" picks up the
71
+new entry under Adherence (plus `tool_use_fidelity` under
72
+Attribution — the table missed S27's addition).
73
+
5
 ### Sprint 28 — `tenseleyflow/sway-action` GitHub Action
74
 ### Sprint 28 — `tenseleyflow/sway-action` GitHub Action
6
 
75
 
7
 Closes the P2 "GitHub Action" discoverability item from Audit 03
76
 Closes the P2 "GitHub Action" discoverability item from Audit 03
README.mdmodified
@@ -169,13 +169,13 @@ user-authored document. The right question is *"did the adapter actually
169
 move the model toward what I wrote?"* — and existing tools answer this
169
 move the model toward what I wrote?"* — and existing tools answer this
170
 poorly.
170
 poorly.
171
 
171
 
172
-`sway` answers it directly via thirteen primitives across four
172
+`sway` answers it directly via fourteen primitives across four
173
 categories, plus a baseline-calibration primitive:
173
 categories, plus a baseline-calibration primitive:
174
 
174
 
175
 | Category      | Primitives                                            |
175
 | Category      | Primitives                                            |
176
 |---------------|-------------------------------------------------------|
176
 |---------------|-------------------------------------------------------|
177
-| Adherence     | `delta_kl`, `adapter_revert`, `prompt_collapse`, `cluster_kl` |
177
+| Adherence     | `delta_kl`, `adapter_revert`, `prompt_collapse`, `cluster_kl`, `multi_turn_coherence_decay` |
178
-| Attribution   | `section_internalization`, `paraphrase_invariance`, `preference_flip` |
178
+| Attribution   | `section_internalization`, `paraphrase_invariance`, `preference_flip`, `tool_use_fidelity` |
179
 | Calibration   | `style_fingerprint`, `calibration_drift`, `leakage`, `external_perplexity` |
179
 | Calibration   | `style_fingerprint`, `calibration_drift`, `leakage`, `external_perplexity` |
180
 | Ablation      | `adapter_ablation` ← the signature primitive          |
180
 | Ablation      | `adapter_ablation` ← the signature primitive          |
181
 | Baseline      | `null_adapter` (powers every z-score in the report)   |
181
 | Baseline      | `null_adapter` (powers every z-score in the report)   |
@@ -423,6 +423,39 @@ the largest user surface.
423
 when `null_adapter` is in the suite — the principled "the adapter
423
 when `null_adapter` is in the suite — the principled "the adapter
424
 preserved the base's tool-call structure beyond noise" signal.
424
 preserved the base's tool-call structure beyond noise" signal.
425
 
425
 
426
+## Multi-turn coherence
427
+
428
+Every other adherence probe is single-turn: one user message, one
429
+ft response, one score. Adapters that pass `delta_kl` cleanly
430
+frequently *forget their training* by turn 2 or 3 of a real
431
+dialogue — the model's own previous responses fill the context
432
+window and create compounding drift. The
433
+`multi_turn_coherence_decay` probe rolls a multi-turn synthetic
434
+dialogue per prompt and fits an exponential-decay curve to the
435
+per-turn KL:
436
+
437
+```yaml
438
+suite:
439
+  - name: holds_a_conversation
440
+    kind: multi_turn_coherence_decay
441
+    prompts:
442
+      - "Explain how a neural network learns."
443
+      - "What's the difference between TCP and UDP?"
444
+    max_turns: 4
445
+    assert_half_life_turns: 2.0
446
+```
447
+
448
+The probe reports `half_life_turns` (the turn at which adapter
449
+influence is halved), a per-turn KL list, and a tiny ASCII
450
+sparkline in the report message so you can see the curve shape
451
+without opening the JSON. Bases without a `chat_template` SKIP
452
+gracefully — multi-turn requires one to format dialogue history.
453
+
454
+The probe deliberately **doesn't** z-score against the null-adapter
455
+baseline (a null adapter has no coherence to decay; the null
456
+distribution is meaningless). Fixed-threshold verdicts are the
457
+published path. Mirrors `prompt_collapse`.
458
+
426
 ## Reproducing a sway run
459
 ## Reproducing a sway run
427
 
460
 
428
 Sometimes you want a coworker (or a future-you, or a bug report) to
461
 Sometimes you want a coworker (or a future-you, or a bug report) to