
# dlm-sway

Differential testing for fine-tuned causal language models.

**One question:** *did LoRA/QLoRA training actually change model behavior in a meaningful way, or is the model just defaulting to the pretrained base?*

`dlm-sway` gives you a trustworthy, reproducible answer with ten purpose-built primitives, each z-scored against a null-adapter baseline. No LLM judges. No external APIs. Deterministic on CPU where possible.
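To make "z-scored" concrete, here is a minimal sketch of the null-baseline idea; the numbers and variable names are illustrative stand-ins, not dlm-sway's API:

```python
import numpy as np

# Illustrative only: a primitive's raw score is compared against scores
# produced by null adapters (e.g. untrained LoRA weights), and reported
# as standard deviations above that noise floor.
rng = np.random.default_rng(0)
null_scores = rng.normal(loc=0.10, scale=0.02, size=32)  # stand-in null runs
ft_score = 0.19                                          # stand-in fine-tune score

z = (ft_score - null_scores.mean()) / null_scores.std(ddof=1)
print(f"adapter is {z:.1f} sigma above the null baseline")
```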

## Install

```bash
pip install "dlm-sway[hf]"                # HuggingFace + PEFT backend
pip install "dlm-sway[hf,style,semsim]"   # full primitive battery
pip install "dlm-sway[all]"               # everything including optional viz
pip install "dlm-sway[dlm]"               # auto-generate tests from a .dlm file
```

## 90-second smoke test

```bash
dlm-sway check path/to/adapter --base HuggingFaceTB/SmolLM2-135M-Instruct
```

Outputs a verdict in under a minute on CPU for small models: *your adapter is 4.2σ above noise* ✅ or *indistinguishable from a null adapter* ❌.

## Full suite

```yaml
# sway.yaml
version: 1
models:
  base: {kind: hf, base: "HuggingFaceTB/SmolLM2-135M-Instruct"}
  ft:   {kind: hf, base: "HuggingFaceTB/SmolLM2-135M-Instruct",
         adapter: "./runs/adapter/v0003"}
suite:
  - {name: knows_concept, kind: dir,
     prompt: "The Dunning-Kruger effect describes",
     target: " a cognitive bias where",
     distractor: " a programming language"}
  - {name: no_reversion, kind: adapter_revert, paraphrases: 4}
  - {name: section_attribution, kind: section_internalization}
```

```bash
dlm-sway run sway.yaml              # full report to terminal + JSON
dlm-sway gate sway.yaml --junit     # CI-friendly; non-zero on fail
```
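To see what a `dir` entry like `knows_concept` measures, here is a hedged sketch, under the assumption that it compares target-versus-distractor log-likelihood margins between the base and fine-tuned models; the helper below is illustrative, not dlm-sway's internals:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def continuation_logprob(model, tok, prompt: str, continuation: str) -> float:
    """Sum of token log-probs assigned to `continuation` given `prompt`."""
    n_prompt = tok(prompt, return_tensors="pt").input_ids.shape[1]
    full = tok(prompt + continuation, return_tensors="pt").input_ids
    with torch.no_grad():
        logprobs = torch.log_softmax(model(full).logits[0, :-1], dim=-1)
    positions = torch.arange(n_prompt - 1, full.shape[1] - 1)
    return logprobs[positions, full[0, n_prompt:]].sum().item()

tok = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM2-135M-Instruct")
base = AutoModelForCausalLM.from_pretrained("HuggingFaceTB/SmolLM2-135M-Instruct")

prompt = "The Dunning-Kruger effect describes"
margin = (continuation_logprob(base, tok, prompt, " a cognitive bias where")
          - continuation_logprob(base, tok, prompt, " a programming language"))
print(f"base target-vs-distractor margin: {margin:.2f} nats")
# Repeat with the adapter loaded (e.g. via peft.PeftModel.from_pretrained);
# a fine-tune that internalized the concept should widen this margin.
```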

## Why it exists

Standard benchmarks (MMLU, HellaSwag) ask *"how good is this model?"* That's the wrong question after a targeted LoRA fine-tune on a small user-authored document. The right question is *"did the adapter actually move the model toward what I wrote?"* — and existing tools answer this poorly.

`dlm-sway` answers it directly via ten primitives across four categories:

| Category    | Primitives |
|-------------|------------|
| Adherence   | `delta_kl`, `adapter_revert`, `prompt_collapse` |
| Attribution | `section_internalization`, `paraphrase_invariance`, `preference_flip` |
| Calibration | `style_fingerprint`, `calibration_drift`, `leakage` |
| Ablation    | `adapter_ablation` ← the signature primitive |

**The signature primitive.** `adapter_ablation` scales the LoRA additive term by λ ∈ {0, 0.25, 0.5, 0.75, 1.0, 1.25} and measures the divergence curve. A healthy fine-tune shows a smooth, monotonic, non-saturated response. A degenerate one shows a step function or an overshoot-then-crash. Nobody else does this because nobody else gets this close to the adapter math.
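A minimal sketch of the idea on a toy weight matrix (this shows the concept, not dlm-sway's implementation; the dimensions, factors, and probe are made up):

```python
import torch
import torch.nn.functional as F

# A LoRA layer adds a low-rank delta to a frozen base weight:
#   W_eff(lam) = W_base + lam * (alpha / r) * (B @ A)
# Sweeping lam and tracking KL(base || scaled) traces the divergence curve.
torch.manual_seed(0)
d, r, alpha = 64, 8, 16                      # hidden size, LoRA rank/alpha (toy)
W_base = torch.randn(d, d) / d**0.5          # frozen base weight
A = torch.randn(r, d) * 0.02                 # stand-in "trained" LoRA factors
B = torch.randn(d, r) * 0.02
x = torch.randn(1, d)                        # a probe activation

def log_probs(lam: float) -> torch.Tensor:
    W_eff = W_base + lam * (alpha / r) * (B @ A)
    return F.log_softmax(x @ W_eff.T, dim=-1)

p_base = log_probs(0.0)
for lam in (0.0, 0.25, 0.5, 0.75, 1.0, 1.25):
    kl = F.kl_div(log_probs(lam), p_base, log_target=True,
                  reduction="batchmean").item()
    print(f"lam={lam:.2f}  KL(base || scaled)={kl:.5f}")
# Healthy: KL rises smoothly and monotonically in lam without saturating.
# Degenerate: a step function, or an overshoot followed by collapse.
```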

## The `.dlm` integration

If you trained your adapter via the [DocumentLanguageModel project](https://github.com/tenseleyFlow/DocumentLanguageModel), sway can auto-generate a test suite from your document's sections:

```bash
pip install "dlm-sway[hf,dlm]"
dlm-sway autogen path/to/doc.dlm -o sway.yaml
dlm-sway run sway.yaml
```

Per-section attribution tells you *which* parts of your document actually moved the model — a kind of signal no other tool provides.

## Status

Pre-alpha. API will break. Version `0.1.0` is the first tag.

## License

MIT
