# DocumentLanguageModel

> A `.dlm` file becomes a local, reproducible, trainable LLM.
> Edit the document, retrain, share.
DocumentLanguageModel (DLM) is a local-first training, inference, and export
toolchain built around authored documents instead of hosted dashboards.

A `.dlm` can be:

- a hand-written training document with prose, instruction, and preference data
- a directive-driven entrypoint into a codebase or notes tree
- a multi-adapter project with learned routing
- a multimodal or audio-language document

DLM trains LoRA / QLoRA / DoRA adapters on real pretrained bases, keeps a replay
history so retrains do not silently forget, and exports to Ollama,
`llama-server`, `vllm`, and `mlx-serve`.
## Install

```sh
pip install document-language-model
```

That gives you the `dlm` command. Verify:

```sh
dlm --version
dlm doctor
```
### Extras

```sh
# CUDA QLoRA support (NVIDIA SM >= 8.0):
pip install 'document-language-model[cuda]'

# Apple Silicon MLX inference:
pip install 'document-language-model[mlx]'

# OpenAI teacher for synthetic data generation:
pip install 'document-language-model[openai]'

# Anthropic teacher:
pip install 'document-language-model[anthropic]'

# Observability (TensorBoard + W&B):
pip install 'document-language-model[observability]'
```
### From source

```sh
git clone https://github.com/tenseleyFlow/DocumentLanguageModel.git
cd DocumentLanguageModel
uv sync --all-extras --dev
uv run dlm --help

# Build GGUF export tooling:
scripts/bump-llama-cpp.sh build

# Optional: llama-server HTTP target:
scripts/bump-llama-cpp.sh build --with-server
```
## 30-Second Start

```sh
dlm init tutor.dlm --base smollm2-135m
# Edit tutor.dlm — add your Q&A pairs
dlm train tutor.dlm
dlm prompt tutor.dlm "What is a Python decorator?"
dlm export tutor.dlm --target ollama --name my-tutor
```
## What a `.dlm` Looks Like

A minimal document:

```yaml
---
dlm_id: 01KPM5CXB51GRX86Q25AKERN6E
dlm_version: 15
base_model: smollm2-135m
---

# My tutor

Some background prose. This trains via continued pretraining.

::instruction::
### Q
What is a decorator?

### A
A function that takes a function and returns a wrapped function.
```
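Structurally, a `.dlm` is YAML frontmatter between `---` fences followed by a
markdown body. The sketch below shows one way to split the two; it is not
DLM's actual parser, and the `split_dlm` helper and its flat `key: value`
handling (no nested YAML) are assumptions for illustration.

```python
import re

def split_dlm(text: str) -> tuple[dict, str]:
    """Split a .dlm document into (frontmatter dict, markdown body).

    Minimal sketch: real frontmatter is full YAML; this only handles
    flat `key: value` pairs so the example stays dependency-free.
    """
    match = re.match(r"^---\n(.*?)\n---\n(.*)$", text, re.S)
    if not match:
        raise ValueError("missing frontmatter block")
    meta = {}
    for line in match.group(1).splitlines():
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return meta, match.group(2)

doc = """---
dlm_id: 01KPM5CXB51GRX86Q25AKERN6E
dlm_version: 15
base_model: smollm2-135m
---

# My tutor
"""
meta, body = split_dlm(doc)
print(meta["base_model"])  # smollm2-135m
```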
A more representative one with directives, named adapters, and export config:

```yaml
---
dlm_id: 01KTESTEXAMPLE000000000000
dlm_version: 15
base_model: qwen3-1.7b
system_prompt: |
  You are a concise engineering assistant.
training:
  adapter: lora
  sequence_len: 4096
sources:
  - path: ./src
    include: ["**/*.py", "**/*.md"]
    exclude: ["tests/**"]
adapters:
  knowledge:
    adapter: lora
    lora_r: 8
  tone:
    adapter: lora
    lora_r: 4
gate:
  enabled: true
export:
  default_quant: Q4_K_M
---

# Project notes

Shared prose trains all declared adapters by default.

::instruction#knowledge::
### Q
What does the cache layer do?

### A
It avoids re-tokenizing unchanged directive-sourced files.

::preference#tone::
### Prompt
Explain a failure mode.

### Chosen
Explain it directly, then give the fix.

### Rejected
Over-explain the background before naming the problem.
```
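The `#adapter` suffix on a section marker routes that section's data to a
named adapter. A rough sketch of that routing, assuming a simplified
`::type#adapter::` grammar (the `sections_by_adapter` helper and the `"*"`
catch-all for untagged sections are illustrative, not DLM internals):

```python
import re

# Matches a section marker like `::instruction#knowledge::` on its own line;
# the adapter part is optional (`::instruction::` has no `#adapter`).
SECTION = re.compile(r"^::(\w+)(?:#(\w+))?::$", re.M)

def sections_by_adapter(body: str) -> dict:
    """Group (section_type, section_text) pairs by adapter name."""
    routed = {}
    markers = list(SECTION.finditer(body))
    for i, m in enumerate(markers):
        start = m.end()
        end = markers[i + 1].start() if i + 1 < len(markers) else len(body)
        kind, adapter = m.group(1), m.group(2) or "*"  # "*" = untagged
        routed.setdefault(adapter, []).append((kind, body[start:end].strip()))
    return routed

body = """::instruction#knowledge::
### Q
What does the cache layer do?

::preference#tone::
### Prompt
Explain a failure mode.
"""
routed = sections_by_adapter(body)
print(sorted(routed))  # ['knowledge', 'tone']
```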
## Common Workflows

### Train a hand-authored document

```sh
dlm init tutor.dlm --base smollm2-135m
dlm train tutor.dlm
dlm prompt tutor.dlm "Explain decorators"
```
### Train across a codebase

```sh
dlm train ./my-repo --base qwen3-1.7b
```

Auto-scaffolds a `.dlm` under `./my-repo/.dlm/` and trains on the repo's
source files.
### Multi-adapter composition

```sh
dlm prompt mydoc.dlm "Explain the runbook" --adapter knowledge
dlm export mydoc.dlm --adapter-mix knowledge:1.0,tone:0.5
```
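Conceptually, `--adapter-mix knowledge:1.0,tone:0.5` corresponds to merging a
weighted sum of low-rank LoRA deltas into the base weights, W' = W + Σᵢ αᵢ(BᵢAᵢ).
A dependency-free sketch of that arithmetic (the `merge_adapters` helper and
the toy rank-1 matrices are illustrative, not DLM's actual merge code):

```python
def matmul(a, b):
    """Tiny dense matmul so the sketch needs no dependencies."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def merge_adapters(base, adapters, mix):
    """Merged weight: W' = W + sum_i alpha_i * (B_i @ A_i)."""
    merged = [row[:] for row in base]  # copy; base stays untouched
    for name, alpha in mix.items():
        b_mat, a_mat = adapters[name]  # low-rank factors B, A
        delta = matmul(b_mat, a_mat)
        for i in range(len(merged)):
            for j in range(len(merged[0])):
                merged[i][j] += alpha * delta[i][j]
    return merged

base = [[1.0, 0.0], [0.0, 1.0]]
adapters = {
    "knowledge": ([[1.0], [0.0]], [[0.5, 0.0]]),  # rank-1: B is 2x1, A is 1x2
    "tone":      ([[0.0], [1.0]], [[0.0, 2.0]]),
}
merged = merge_adapters(base, adapters, {"knowledge": 1.0, "tone": 0.5})
print(merged)  # [[1.5, 0.0], [0.0, 2.0]]
```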
### Export to local runtimes

```sh
dlm export mydoc.dlm --target ollama --name mydoc
dlm export mydoc.dlm --target llama-server
dlm export mydoc.dlm --target vllm
dlm export mydoc.dlm --target mlx-serve

# Also emit a ready-to-run sway.yaml next to the GGUF for downstream
# evaluation via `sway run` (requires the [sway] extra).
dlm export mydoc.dlm --target ollama --emit-sway-json
sway run <export-dir>/sway.yaml
```
### Mine preference pairs and retrain

```sh
dlm preference mine mydoc.dlm --samples 4 --max-pairs 8
dlm preference apply mydoc.dlm
dlm train mydoc.dlm --phase preference
```
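Each mined pair lands in the document as a `::preference::` section with
`### Prompt`, `### Chosen`, and `### Rejected` parts, which maps naturally
onto the prompt/chosen/rejected rows that DPO-style trainers (such as trl's
`DPOTrainer`) consume. A parsing sketch under that assumption (the
`parse_preference` helper is hypothetical, not DLM's API):

```python
def parse_preference(section: str) -> dict:
    """Collect the text under each `### Heading` into a lowercase-keyed dict."""
    fields, current = {}, None
    for line in section.splitlines():
        if line.startswith("### "):
            current = line[4:].strip().lower()
            fields[current] = []
        elif current is not None:
            fields[current].append(line)
    return {k: "\n".join(v).strip() for k, v in fields.items()}

pair = parse_preference("""### Prompt
Explain a failure mode.

### Chosen
Explain it directly, then give the fix.

### Rejected
Over-explain the background before naming the problem.
""")
print(pair["chosen"])  # Explain it directly, then give the fix.
```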
### Generate synthetic training data

```sh
dlm synth instructions mydoc.dlm --teacher self --apply
dlm synth instructions mydoc.dlm --teacher openai:gpt-4o-mini --apply
```
### Multimodal and audio documents

```sh
dlm init diagrams.dlm --multimodal --base qwen2-vl-2b-instruct
dlm train diagrams.dlm
dlm prompt diagrams.dlm --image figures/arch.png "What is this?"

dlm init calls.dlm --audio
dlm train calls.dlm
dlm prompt calls.dlm --audio clips/call.wav "Summarize the clip"
```
### Pull eval failures back into training

```sh
dlm harvest mydoc.dlm --sway-json sway-report.json --apply
```
### Pack and share

```sh
dlm pack mydoc.dlm --include-exports
dlm verify mydoc.dlm.pack
dlm push mydoc.dlm --to hf:org/name
```
### Inspect state

```sh
dlm doctor
dlm show mydoc.dlm --json
dlm metrics mydoc.dlm
```
## Supported Platforms
| Tier | Training | Inference / Export |
|---|---|---|
| NVIDIA CUDA (SM >= 8.0) | bf16 + QLoRA 4-bit + FlashAttention | Ollama, GGUF, llama-server, vLLM |
| NVIDIA CUDA (SM < 8.0) | fp16 LoRA | Ollama, GGUF, llama-server, vLLM |
| Apple Silicon (MPS) | fp16 LoRA | Ollama, GGUF, MLX inference, mlx-serve |
| CPU | inference only (training refused above small bases) | GGUF, Ollama, llama-server |
| AMD ROCm | experimental | ROCm llama.cpp |
## Base Model Registry

DLM ships with ~27 pinned base models across text, vision-language, and
audio-language families:

- **Text:** Qwen 2.5 (0.5B–3B), Qwen 3 (1.7B–8B), Llama 3.2/3.3,
  SmolLM 2/3, Phi-3.5/4, Gemma 2, OLMo 2, Mixtral 8x7B
- **Vision-language:** Qwen2-VL, InternVL2/3, PaliGemma, Mistral-Small-3.1
- **Audio-language:** Qwen2-Audio

Any HuggingFace model via `--base hf:org/name` with compatibility probes.
## Command Surface

| Area | Commands |
|---|---|
| Author | `init`, `templates`, `show`, `migrate`, `cache` |
| Train | `train`, `doctor`, `metrics`, `harvest` |
| Align | `preference mine/apply/revert/list` |
| Synth | `synth instructions/preferences/revert/list` |
| Infer | `prompt`, `repl` |
| Ship | `export`, `pack`, `unpack`, `verify`, `push`, `pull`, `serve` |

See the [CLI reference](./docs/cli/reference.md) for the full flag surface.
## Editor support

### VSCode

Install **DLM — Document Language Model** from the
[VSCode Marketplace](https://marketplace.visualstudio.com/items?itemName=tenseleyFlow.dlm-vsc).
The extension provides syntax highlighting, completions, diagnostics, and a
side panel for `.dlm` authoring. Source:
[dlm-vsc](https://github.com/tenseleyFlow/dlm-vsc).

It uses the [dlm-lsp](https://github.com/tenseleyFlow/dlm-lsp) language
server, which you also need to install:

```sh
pip install dlm-lsp
```

### Other editors

The language server is editor-agnostic — Zed, Helix, and Neovim get
diagnostics, hover, and completions through their LSP clients. See:

- [Zed setup](./docs/cookbook/lsp-zed.md)
- [Helix setup](./docs/cookbook/lsp-helix.md)
- [Neovim setup](./docs/cookbook/lsp-neovim.md)
## Documentation

- [Getting started](./docs/getting-started/install.md)
- [Frontmatter reference](./docs/format/frontmatter.md)
- [Section grammar](./docs/format/sections.md)
- [CLI reference](./docs/cli/reference.md)
- [Training across codebases](./docs/cookbook/training-across-codebases.md)
- [Multi-adapter composition](./docs/cookbook/multi-adapter.md)
- [Multi-target export](./docs/cookbook/multi-target-export.md)
- [Self-improving loop](./docs/cookbook/self-improving-loop.md)
- [Synthesize training data](./docs/cookbook/synthesize-training-data.md)
- [Multimodal training](./docs/cookbook/multimodal-training.md)
- [Audio training](./docs/cookbook/audio-training.md)
- [Architecture](./docs/architecture.md)
- [Determinism](./docs/determinism.md)
## Principles

1. **The document is the interface.** Frontmatter, typed sections, directives,
   and store contracts — not a dashboard.
2. **Training is real.** LoRA / QLoRA / DoRA on pretrained bases.
3. **Retraining should not silently forget.** Replay-backed accumulation.
4. **Local-first is load-bearing.** Your data stays on your machine.
5. **Determinism is a contract.** Locks, pinned versions, golden checks.
## Tech Stack
Python 3.11+ · PyTorch · HuggingFace transformers / peft / trl / accelerate · vendored llama.cpp for GGUF · Ollama · Typer · Pydantic · uv
## Contributing

See [CONTRIBUTING.md](./CONTRIBUTING.md).
## License

MIT. Base-model licenses are separate and enforced at `dlm init`, `dlm train`,
`dlm export`, and `dlm pack`.