# dlm-sway
Differential testing for fine-tuned causal language models.
**One question:** *did LoRA/QLoRA training actually change model behavior in a meaningful way, or is the model just defaulting to the pretrained base?*
`dlm-sway` gives you a trustworthy, reproducible answer with eleven
purpose-built primitives, each z-scored against a null-adapter baseline.
No LLM judges. No external APIs. Deterministic on CPU where possible.
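The null-adapter baseline idea can be sketched in a few lines: run the same metric with adapters whose weights are noise, then ask how many standard deviations the real adapter sits above that distribution. This is a toy illustration, not `dlm-sway`'s internal code; the function name and the sample values are made up.

```python
import statistics

def z_score(observed: float, null_samples: list[float]) -> float:
    """z-score an observed metric against the same metric measured
    with null (randomly initialized) adapters: how many standard
    deviations above noise is the real adapter?"""
    mu = statistics.mean(null_samples)
    sigma = statistics.stdev(null_samples)
    return (observed - mu) / sigma

# Toy divergence values measured with five null adapters:
null = [0.9, 1.1, 1.0, 0.8, 1.2]
# A real adapter whose divergence clearly exceeds the noise floor:
print(round(z_score(1.664, null), 1))
```

A verdict like "4.2σ above noise" is just this number reported against the tool's threshold.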
## Install

```bash
pip install "dlm-sway[hf]"              # HuggingFace + PEFT backend
pip install "dlm-sway[hf,style,semsim]" # full primitive battery
pip install "dlm-sway[all]"             # everything including optional viz
pip install "dlm-sway[dlm]"             # auto-generate tests from a .dlm file
```
## 90-second smoke test

```bash
dlm-sway check path/to/adapter --base HuggingFaceTB/SmolLM2-135M-Instruct
```
Outputs a verdict in under a minute on CPU for small models: *your adapter is 4.2σ above noise* ✅ or *indistinguishable from a null adapter* ❌.
## Full suite

```yaml
# sway.yaml
version: 1
models:
  base: {kind: hf, base: "HuggingFaceTB/SmolLM2-135M-Instruct"}
  ft: {kind: hf, base: "HuggingFaceTB/SmolLM2-135M-Instruct",
       adapter: "./runs/adapter/v0003"}
suite:
  - {name: knows_concept, kind: dir,
     prompt: "The Dunning-Kruger effect describes",
     target: " a cognitive bias where",
     distractor: " a programming language"}
  - {name: no_reversion, kind: adapter_revert, paraphrases: 4}
  - {name: section_attribution, kind: section_internalization}
```

```bash
dlm-sway run sway.yaml          # full report to terminal + JSON
dlm-sway gate sway.yaml --junit # CI-friendly; non-zero on fail
```
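For intuition on what a `dir` test measures, one plausible formulation (the README doesn't spell out the exact scoring, so this is a hedged sketch with made-up log-probabilities) is the gain in target-vs-distractor log-odds from base to fine-tuned model:

```python
def dir_score(lp_target_ft: float, lp_distractor_ft: float,
              lp_target_base: float, lp_distractor_base: float) -> float:
    """Directional score: how much did fine-tuning improve the
    target-vs-distractor log-prob margin relative to the base model?
    Positive => the adapter pushed the model toward the target."""
    margin_ft = lp_target_ft - lp_distractor_ft
    margin_base = lp_target_base - lp_distractor_base
    return margin_ft - margin_base

# Toy continuation log-probs for prompt + target / prompt + distractor:
print(dir_score(-2.0, -5.0, -4.0, -4.5))  # → 2.5
```

A score near zero would mean the adapter did not shift the model's preference between the two continuations at all.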
## Why it exists

Standard benchmarks (MMLU, HellaSwag) ask *"how good is this model?"* That's the wrong question after a targeted LoRA fine-tune on a small user-authored document. The right question is *"did the adapter actually move the model toward what I wrote?"* — and existing tools answer this poorly.
dlm-sway answers it directly via eleven primitives across four
categories:
| Category | Primitives |
|---|---|
| Adherence | `delta_kl`, `adapter_revert`, `prompt_collapse` |
| Attribution | `section_internalization`, `paraphrase_invariance`, `preference_flip` |
| Calibration | `style_fingerprint`, `calibration_drift`, `leakage` |
| Ablation | `adapter_ablation` ← the signature primitive |
**The signature primitive.** `adapter_ablation` scales the LoRA additive term by λ ∈ {0, 0.25, 0.5, 0.75, 1.0, 1.25} and measures the divergence curve. A healthy fine-tune shows a smooth, monotonic, non-saturated response. A degenerate one shows a step function or an overshoot-then-crash. Nobody else does this because nobody else gets this close to the adapter math.
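The mechanics can be illustrated on a single toy linear layer: LoRA adds a low-rank term `B @ A` to the frozen weight, so scaling that term by λ and measuring the KL divergence of the output distribution against λ = 0 traces the ablation curve. This is a minimal one-layer sketch with random matrices, not the tool's actual implementation (which would apply the scaling across a real transformer's adapted layers):

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 8, 2
W = rng.normal(size=(d, d))        # frozen base weight
A = rng.normal(size=(r, d)) * 0.1  # LoRA down-projection
B = rng.normal(size=(d, r)) * 0.1  # LoRA up-projection
x = rng.normal(size=(1, d))        # a probe activation

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl(p, q):
    """KL divergence between two discrete distributions."""
    return float(np.sum(p * np.log(p / q)))

p_base = softmax(x @ W.T)
curve = []
for lam in [0.0, 0.25, 0.5, 0.75, 1.0, 1.25]:
    W_eff = W + lam * (B @ A)      # additive LoRA term, scaled by λ
    curve.append(kl(p_base, softmax(x @ W_eff.T)))

print([round(v, 4) for v in curve])
```

For a well-behaved adapter the curve starts at exactly 0 (λ = 0 recovers the base model) and rises smoothly; a step or an overshoot in the real multi-layer version is what the primitive flags.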
## The `.dlm` integration

If you trained your adapter via the [DocumentLanguageModel project](https://github.com/tenseleyFlow/DocumentLanguageModel), sway can auto-generate a test suite from your document's sections:
```bash
pip install "dlm-sway[hf,dlm]"
dlm-sway autogen path/to/doc.dlm -o sway.yaml
dlm-sway run sway.yaml
```
Per-section attribution tells you which parts of your document actually moved the model — a kind of signal no other tool provides.
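One simple way such per-section attribution could work (a hedged sketch — the metric and numbers here are assumptions, not the package's documented method) is ranking sections by how much the adapter reduced their per-token negative log-likelihood:

```python
def section_attribution(nll_base: dict, nll_ft: dict) -> list[tuple]:
    """Rank document sections by NLL drop from base to fine-tuned
    model; a bigger drop suggests the section was internalized."""
    drops = {s: nll_base[s] - nll_ft[s] for s in nll_base}
    return sorted(drops.items(), key=lambda kv: kv[1], reverse=True)

# Toy per-section mean NLLs (nats/token) under base vs. fine-tuned:
base = {"intro": 3.1, "methods": 3.4, "appendix": 3.0}
ft   = {"intro": 2.2, "methods": 3.3, "appendix": 3.0}
print(section_attribution(base, ft))
```

Here "intro" would rank first (largest drop) and "appendix", unchanged, last — the shape of report you'd read this attribution from.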
## Status

Pre-alpha. API will break. Version `0.1.0` is the first tag.
## License

MIT