# Multi-adapter composition
Train a single `.dlm` with more than one named adapter — keep knowledge
and tone orthogonal, mix them at export time, or prompt against one at
a time. Reach for this when you want separate "what the model knows"
and "how it says things" training signals without spinning up two
documents.
## When to use it
- A handbook where the **knowledge** is stable but the **tone** evolves
  (you might rewrite just the style examples next month).
- A single base model that needs to serve two personas — one
  customer-facing, one internal-engineering — where the instruction
  sets diverge.
- Experiments where you want to A/B two training recipes against the
  same prose corpus without forking the document.
If the answer is "one adapter is fine," skip this. Multi-adapter trades simplicity for composition flexibility — pay that cost when you need it.
## Document shape
```dlm
---
dlm_id: 01KPM618S7NXSPAY10BHKVECYX
base_model: qwen2.5-1.5b
training:
  sequence_len: 2048
  num_epochs: 2
  adapters:
    knowledge:
      adapter: lora
      lora_r: 8
    tone:
      adapter: lora
      lora_r: 4
      target_modules: [q_proj, v_proj]
      learning_rate: 1e-4
export:
  default_quant: Q4_K_M
---

# Domain prose

This prose trains BOTH adapters by default — prose without a `#name`
suffix fans out to every declared adapter. Most documents keep prose
shared so both adapters pick up the same domain vocabulary.

::instruction#knowledge::
### Q
What is the capital of France?
### A
Paris.

::instruction#tone::
### Q
How should I phrase things?
### A
Crisply. One sentence.
```
### Routing rules
| Section | Fence | Trains |
|---|---|---|
| Prose (no suffix) | `# heading` / plain prose | all adapters |
| Prose (pinned) | `::prose#knowledge::` | only `knowledge` |
| Instruction (no suffix) | `::instruction::` | first-declared adapter |
| Instruction (pinned) | `::instruction#tone::` | only `tone` |
| Preference | same — `::preference#name::` | only `name` |
The first-declared adapter acts as the implicit "default" for untagged non-prose sections. Declaration order is the order you write them in the YAML block.
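A minimal sketch of pinned prose plus an untagged instruction section,
assuming the two-adapter YAML above (illustrative content, not part of
the example document):

```dlm
::prose#knowledge::
Background facts that only the knowledge adapter should absorb.

::instruction::
### Q
Which adapter trains on this untagged section?
### A
The first-declared one: knowledge.
```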
## Training
One `dlm train` invocation trains all declared adapters:
```sh
$ uv run dlm train mydoc.dlm
```
Each adapter gets its own version history under
`~/.dlm/store/<dlm_id>/adapter/<name>/versions/vNNNN/` with an
independent `current.txt` pointer. The manifest grows one
`TrainingRunSummary` per adapter per invocation — running `dlm train`
again commits fresh `v0002` directories for each, never mixing lanes.
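After two runs on a two-adapter document, the store therefore looks
roughly like this (an illustrative layout derived from the path
pattern above, not verbatim tool output):

```
~/.dlm/store/<dlm_id>/adapter/
├── knowledge/
│   ├── current.txt        # points at v0002
│   └── versions/
│       ├── v0001/
│       └── v0002/
└── tone/
    ├── current.txt        # points at v0002
    └── versions/
        ├── v0001/
        └── v0002/
```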
Each adapter is trained as a fresh LoRA from the base on its routed
rows; the base model loads once per adapter. Shared hyperparameters
(`sequence_len`, `num_epochs`, `seed`, optimizer, scheduler, warmup)
live at the `training` top level — per-adapter overrides are
intentionally limited to the LoRA-specific knobs.
## Prompting a specific adapter
```sh
$ uv run dlm prompt mydoc.dlm "Explain the runbook" --adapter knowledge
$ uv run dlm prompt mydoc.dlm "Explain the runbook" --adapter tone
```
`--adapter` is required on multi-adapter documents and rejected on
single-adapter ones. Unknown names get a clear error listing the
declared adapters.
## Exporting a specific adapter
```sh
$ uv run dlm export mydoc.dlm --adapter knowledge
```
One adapter → one Ollama model. The GGUF bundle + Modelfile embeds
that adapter only; `manifest.exports[-1].adapter_name` records which
one.
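The Modelfile wiring is standard Ollama: `FROM` points at the
quantized base GGUF and `ADAPTER` at the adapter GGUF. A hypothetical
sketch with made-up filenames (dlm's real output paths are not shown
in this doc):

```
FROM ./qwen2.5-1.5b-Q4_K_M.gguf
ADAPTER ./knowledge.gguf
```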
## Weighted composition at export
To ship a single Ollama model that combines both adapters:
```sh
$ uv run dlm export mydoc.dlm --adapter-mix knowledge:1.0,tone:0.5
```
This uses PEFT's `add_weighted_adapter` with linear combination to
produce a composite adapter, which is then converted to GGUF and
registered with Ollama as one unit.
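Under the hood this corresponds to roughly the following PEFT calls —
a minimal sketch, assuming hypothetical adapter paths; the real export
pipeline adds the GGUF conversion and Ollama registration steps:

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Load the base once, then attach both trained adapters by name.
base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-1.5B")
model = PeftModel.from_pretrained(base, "adapters/knowledge", adapter_name="knowledge")
model.load_adapter("adapters/tone", adapter_name="tone")

# Linear combination: composite ≈ 1.0 * knowledge + 0.5 * tone.
model.add_weighted_adapter(
    adapters=["knowledge", "tone"],
    weights=[1.0, 0.5],
    adapter_name="mix",
    combination_type="linear",  # requires matching lora_r across adapters
)
model.set_adapter("mix")
model.save_pretrained("out/")  # the "mix" adapter is then converted to GGUF
```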
Caveats (from the PEFT reference):

- **LoRA-only.** `add_weighted_adapter` doesn't support prefix / prompt
  tuning, and it can't merge across different LoRA ranks robustly. Keep
  all adapters in the mix on the same `adapter: lora` shape.
- **QLoRA requires dequantize.** Combining 4-bit quantized adapters
  into a composite is precision-unsafe; `dlm` refuses unless you pass
  `--dequantize` and `--merged` explicitly.
- **Mix is frozen in the export.** Once the Ollama model is built, the
  weights are baked. To change the mix, re-run `dlm export` with a new
  `--adapter-mix`. Ollama doesn't support hot-swapping adapter weights
  at runtime — keep the separate per-adapter exports around if you
  need dynamic composition at inference time.
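For example (same flags as above, just recombined; the 0.8 weight is an
arbitrary illustration):

```sh
# Rebuild the composite with a new blend; the old export is untouched:
$ uv run dlm export mydoc.dlm --adapter-mix knowledge:1.0,tone:0.8

# Keep per-adapter exports around for swapping at the Ollama level:
$ uv run dlm export mydoc.dlm --adapter knowledge
$ uv run dlm export mydoc.dlm --adapter tone
```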
## Hardware notes
`dlm doctor` refuses multi-adapter + QLoRA plans whose estimated VRAM
exceeds the device's 85% headroom (roughly: `base_4bit + 1 GB/adapter
+ 25% activations`). The failure points at two fixes: drop to
`adapter: lora` across the board, or reduce the adapter count. LoRA
multi-adapter plans are always accepted — each adapter's extra state
is negligible next to the base weights.
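Read literally, the rule of thumb works out as below. This is my own
arithmetic sketch of the documented formula, not `dlm doctor`'s actual
estimator; whether the 25% applies on top of the per-adapter gigabyte
is an assumption here:

```python
def qlora_plan_fits(device_vram_gb: float, base_4bit_gb: float, n_adapters: int) -> bool:
    """Back-of-envelope check mirroring the documented rule of thumb."""
    estimate = base_4bit_gb + 1.0 * n_adapters  # base + 1 GB per adapter
    estimate *= 1.25                            # + 25% for activations
    return estimate <= 0.85 * device_vram_gb    # 85% headroom cap

# e.g. a 1.5B base (~1 GB at 4-bit) with two adapters on an 8 GB card:
print(qlora_plan_fits(8.0, 1.0, 2))  # True under this reading
```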
## When to fold back to a single adapter
Multi-adapter adds cognitive load and per-adapter training cost. Fold back when:
- The adapters converge on similar behavior despite separate routing — the extra structure isn't doing work.
- One adapter's training set is so small (<10 rows) that it's adding noise instead of signal.
- Your export pipeline is always `--adapter-mix name:1.0,other:1.0` —
  a single adapter trained on the union is equivalent and cheaper.
## See also
- [Preference tuning (DPO vs ORPO)](preference-dpo-vs-orpo.md) —
  applies per-adapter on multi-adapter docs via `::preference#name::`
  routing.
- [Domain knowledge base](domain-kb.md) — the single-adapter story.