
# DocumentLanguageModel

> A `.dlm` file becomes a local, reproducible, trainable LLM.
> Edit the document, retrain, share.

DocumentLanguageModel (DLM) is a local-first training, inference, and export toolchain built around authored documents instead of hosted dashboards.

A `.dlm` can be:

- a hand-written training document with prose, instruction, and preference data
- a directive-driven entrypoint into a codebase or notes tree
- a multi-adapter project with learned routing
- a multimodal or audio-language document

DLM trains LoRA / QLoRA / DoRA adapters on real pretrained bases, keeps a replay history so retrains do not silently forget, and exports to Ollama, `llama-server`, `vllm`, and `mlx-serve`.

## Install

```sh
pip install document-language-model
```

That gives you the `dlm` command. Verify:

```sh
dlm --version
dlm doctor
```

### Extras

```sh
# CUDA QLoRA support (NVIDIA SM >= 8.0):
pip install 'document-language-model[cuda]'

# Apple Silicon MLX inference:
pip install 'document-language-model[mlx]'

# OpenAI teacher for synthetic data generation:
pip install 'document-language-model[openai]'

# Anthropic teacher:
pip install 'document-language-model[anthropic]'

# Observability (TensorBoard + W&B):
pip install 'document-language-model[observability]'
```

### From source

```sh
git clone https://github.com/tenseleyFlow/DocumentLanguageModel.git
cd DocumentLanguageModel
uv sync --all-extras --dev
uv run dlm --help

# Build GGUF export tooling:
scripts/bump-llama-cpp.sh build

# Optional: llama-server HTTP target:
scripts/bump-llama-cpp.sh build --with-server
```

## 30-Second Start

```sh
dlm init tutor.dlm --base smollm2-135m
# Edit tutor.dlm — add your Q&A pairs
dlm train tutor.dlm
dlm prompt tutor.dlm "What is a Python decorator?"
dlm export tutor.dlm --target ollama --name my-tutor
```

## What a `.dlm` Looks Like

A minimal document:

```yaml
---
dlm_id: 01KPM5CXB51GRX86Q25AKERN6E
dlm_version: 15
base_model: smollm2-135m
---

# My tutor

Some background prose. This trains via continued pretraining.

::instruction::
### Q
What is a decorator?

### A
A function that takes a function and returns a wrapped function.
```
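The document starts with a familiar `---`-delimited frontmatter header followed by the trainable body. As a rough illustration of that split — not DLM's actual parser, which handles full YAML — a tool could separate the two like this:

```python
def split_frontmatter(text: str) -> tuple[dict, str]:
    """Split a ---delimited frontmatter header from the document body.

    Illustrative sketch only: handles flat `key: value` lines, not full YAML.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text          # no frontmatter at all
    for end, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            break                # found the closing delimiter
    else:
        return {}, text          # unterminated header: treat as plain body
    meta = {}
    for line in lines[1:end]:
        if ":" in line:
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
    return meta, "\n".join(lines[end + 1:])
```

For the minimal document above, `meta` would carry `dlm_id`, `dlm_version`, and `base_model`, and the body (prose plus `::instruction::` sections) is what training consumes.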

A more representative one with directives, named adapters, and export config:

```yaml
---
dlm_id: 01KTESTEXAMPLE000000000000
dlm_version: 15
base_model: qwen3-1.7b
system_prompt: |
  You are a concise engineering assistant.
training:
  adapter: lora
  sequence_len: 4096
  sources:
    - path: ./src
      include: ["**/*.py", "**/*.md"]
      exclude: ["tests/**"]
  adapters:
    knowledge:
      adapter: lora
      lora_r: 8
    tone:
      adapter: lora
      lora_r: 4
  gate:
    enabled: true
export:
  default_quant: Q4_K_M
---

# Project notes

Shared prose trains all declared adapters by default.

::instruction#knowledge::
### Q
What does the cache layer do?

### A
It avoids re-tokenizing unchanged directive-sourced files.

::preference#tone::
### Prompt
Explain a failure mode.

### Chosen
Explain it directly, then give the fix.

### Rejected
Over-explain the background before naming the problem.
```

## Common Workflows

### Train a hand-authored document

```sh
dlm init tutor.dlm --base smollm2-135m
dlm train tutor.dlm
dlm prompt tutor.dlm "Explain decorators"
```

### Train across a codebase

```sh
dlm train ./my-repo --base qwen3-1.7b
```

Auto-scaffolds a `.dlm` under `./my-repo/.dlm/` and trains on the repo's source files.
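The `sources` block shown earlier controls which repo files are swept up, via `include` and `exclude` globs. A plausible sketch of that matching — DLM's actual rules may differ, so check the frontmatter reference:

```python
from fnmatch import fnmatch
from pathlib import Path

def collect_sources(root: str, include: list[str], exclude: list[str]) -> list[str]:
    """Gather files under root matching any include glob and no exclude glob.

    Sketch of plausible `sources` semantics, not DLM's implementation.
    """
    rootp = Path(root)
    found = set()
    for pattern in include:
        for p in rootp.glob(pattern):
            if p.is_file():
                found.add(p.relative_to(rootp).as_posix())
    return sorted(
        f for f in found
        if not any(fnmatch(f, pat) for pat in exclude)
    )
```

With `include: ["**/*.py", "**/*.md"]` and `exclude: ["tests/**"]`, this picks up source and docs while skipping the test tree.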

### Multi-adapter composition

```sh
dlm prompt mydoc.dlm "Explain the runbook" --adapter knowledge
dlm export mydoc.dlm --adapter-mix knowledge:1.0,tone:0.5
```
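`--adapter-mix` takes a comma-separated list of `name:weight` pairs. A hedged sketch of how such a flag might be parsed (not DLM's actual implementation):

```python
def parse_adapter_mix(spec: str) -> dict[str, float]:
    """Parse 'knowledge:1.0,tone:0.5' into {'knowledge': 1.0, 'tone': 0.5}."""
    mix = {}
    for part in spec.split(","):
        name, sep, weight = part.strip().partition(":")
        if not sep:
            raise ValueError(f"expected name:weight, got {part!r}")
        mix[name] = float(weight)
    return mix
```

The weights scale each adapter's contribution when the export merges or composes them; a weight of `1.0` keeps an adapter at full strength.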

### Export to local runtimes

```sh
dlm export mydoc.dlm --target ollama --name mydoc
dlm export mydoc.dlm --target llama-server
dlm export mydoc.dlm --target vllm
dlm export mydoc.dlm --target mlx-serve

# Also emit a ready-to-run sway.yaml next to the GGUF for downstream
# evaluation via `sway run` (requires the [sway] extra).
dlm export mydoc.dlm --target ollama --emit-sway-json
sway run <export-dir>/sway.yaml
```

### Mine preference pairs and retrain

```sh
dlm preference mine mydoc.dlm --samples 4 --max-pairs 8
dlm preference apply mydoc.dlm
dlm train mydoc.dlm --phase preference
```

### Generate synthetic training data

```sh
dlm synth instructions mydoc.dlm --teacher self --apply
dlm synth instructions mydoc.dlm --teacher openai:gpt-4o-mini --apply
```

### Multimodal and audio documents

```sh
dlm init diagrams.dlm --multimodal --base qwen2-vl-2b-instruct
dlm train diagrams.dlm
dlm prompt diagrams.dlm --image figures/arch.png "What is this?"

dlm init calls.dlm --audio
dlm train calls.dlm
dlm prompt calls.dlm --audio clips/call.wav "Summarize the clip"
```

### Pull eval failures back into training

```sh
dlm harvest mydoc.dlm --sway-json sway-report.json --apply
```

### Pack and share

```sh
dlm pack mydoc.dlm --include-exports
dlm verify mydoc.dlm.pack
dlm push mydoc.dlm --to hf:org/name
```

### Inspect state

```sh
dlm doctor
dlm show mydoc.dlm --json
dlm metrics mydoc.dlm
```

## Supported Platforms

| Tier | Training | Inference / Export |
|---|---|---|
| NVIDIA CUDA (SM >= 8.0) | bf16 + QLoRA 4-bit + FlashAttention | Ollama, GGUF, llama-server, vLLM |
| NVIDIA CUDA (SM < 8.0) | fp16 LoRA | Ollama, GGUF, llama-server, vLLM |
| Apple Silicon (MPS) | fp16 LoRA | Ollama, GGUF, MLX inference, mlx-serve |
| CPU | inference only (training refused above small bases) | GGUF, Ollama, llama-server |
| AMD ROCm | experimental | ROCm llama.cpp |

## Base Model Registry

DLM ships with ~27 pinned base models across text, vision-language, and audio-language families:

- **Text:** Qwen 2.5 (0.5B–3B), Qwen 3 (1.7B–8B), Llama 3.2/3.3, SmolLM 2/3, Phi-3.5/4, Gemma 2, OLMo 2, Mixtral 8x7B
- **Vision-language:** Qwen2-VL, InternVL2/3, PaliGemma, Mistral-Small-3.1
- **Audio-language:** Qwen2-Audio

Any HuggingFace model via `--base hf:org/name` with compatibility probes.
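So a base reference is either a registry alias (`smollm2-135m`) or an `hf:`-prefixed HuggingFace ID. A hypothetical splitter for that spec, ignoring the compatibility probing the real resolver performs:

```python
def parse_base(spec: str) -> tuple[str, str]:
    """Return ('hf', 'org/name') for hf: specs, ('registry', alias) otherwise.

    Illustrative only; DLM's real resolution also runs compatibility probes.
    """
    if spec.startswith("hf:"):
        ref = spec[len("hf:"):]
        if "/" not in ref:
            raise ValueError("hf: bases must look like hf:org/name")
        return "hf", ref
    return "registry", spec
```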

## Command Surface

| Area | Commands |
|---|---|
| Author | `init`, `templates`, `show`, `migrate`, `cache` |
| Train | `train`, `doctor`, `metrics`, `harvest` |
| Align | `preference mine/apply/revert/list` |
| Synth | `synth instructions/preferences/revert/list` |
| Infer | `prompt`, `repl` |
| Ship | `export`, `pack`, `unpack`, `verify`, `push`, `pull`, `serve` |

See the [CLI reference](./docs/cli/reference.md) for the full flag surface.

## Editor support

### VSCode

Install **DLM — Document Language Model** from the
[VSCode Marketplace](https://marketplace.visualstudio.com/items?itemName=tenseleyFlow.dlm-vsc).
The extension provides syntax highlighting, completions, diagnostics, and a
side panel for `.dlm` authoring. Source:
[dlm-vsc](https://github.com/tenseleyFlow/dlm-vsc).

It uses the [dlm-lsp](https://github.com/tenseleyFlow/dlm-lsp) language
server, which you also need to install:

```sh
pip install dlm-lsp
```

### Other editors

The language server is editor-agnostic — Zed, Helix, and Neovim get
diagnostics, hover, and completions through their LSP clients. See:

- [Zed setup](./docs/cookbook/lsp-zed.md)
- [Helix setup](./docs/cookbook/lsp-helix.md)
- [Neovim setup](./docs/cookbook/lsp-neovim.md)

## Documentation

- [Getting started](./docs/getting-started/install.md)
- [Frontmatter reference](./docs/format/frontmatter.md)
- [Section grammar](./docs/format/sections.md)
- [CLI reference](./docs/cli/reference.md)
- [Training across codebases](./docs/cookbook/training-across-codebases.md)
- [Multi-adapter composition](./docs/cookbook/multi-adapter.md)
- [Multi-target export](./docs/cookbook/multi-target-export.md)
- [Self-improving loop](./docs/cookbook/self-improving-loop.md)
- [Synthesize training data](./docs/cookbook/synthesize-training-data.md)
- [Multimodal training](./docs/cookbook/multimodal-training.md)
- [Audio training](./docs/cookbook/audio-training.md)
- [Architecture](./docs/architecture.md)
- [Determinism](./docs/determinism.md)

## Principles

1. **The document is the interface.** Frontmatter, typed sections, directives, and store contracts — not a dashboard.
2. **Training is real.** LoRA / QLoRA / DoRA on pretrained bases.
3. **Retraining should not silently forget.** Replay-backed accumulation.
4. **Local-first is load-bearing.** Your data stays on your machine.
5. **Determinism is a contract.** Locks, pinned versions, golden checks.
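The determinism principle usually comes down to hashing the pinned inputs so any drift is detectable. A minimal sketch of the idea, assuming nothing about DLM's actual lock format (the key names below are hypothetical):

```python
import hashlib
import json

def lock_digest(pins: dict[str, str]) -> str:
    """Stable digest over pinned versions; any drift changes the hash.

    Canonical JSON (sorted keys, fixed separators) makes the digest
    independent of dict insertion order.
    """
    canonical = json.dumps(pins, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()
```

A retrain would recompute the digest from the current environment and compare it against the stored lock; a mismatch means a pin moved.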

## Tech Stack

Python 3.11+ · PyTorch · HuggingFace transformers / peft / trl / accelerate · vendored llama.cpp for GGUF · Ollama · Typer · Pydantic · uv

## Contributing

See [CONTRIBUTING.md](./CONTRIBUTING.md).

## License

MIT. Base-model licenses are separate and enforced at `dlm init`, `dlm train`, `dlm export`, and `dlm pack`.
