# DocumentLanguageModel

> `.dlm` is a trainable local AI document format: typed sections, directives,
> replay-backed retraining, and export.

DocumentLanguageModel (DLM) is a local-first training and inference toolkit
built around authored documents instead of hosted dashboards.

A `.dlm` can be a hand-authored training doc, a directive-driven entrypoint
into a codebase, a multi-adapter project with learned routing, or a
multimodal / audio-language document. DLM trains LoRA / QLoRA / DoRA adapters
on real pretrained bases, keeps replay history, and exports to local runtimes
such as Ollama, `llama-server`, `vllm`, and `mlx-serve`.
## What DLM Ships Today

- **Structured `.dlm` authoring** with frontmatter plus typed body sections
  like prose, `::instruction::`, `::preference::`, `::image::`, and
  `::audio::`
- **Directive-driven corpus building** via `training.sources`, plus nested
  `.dlm/training.yaml` / `.dlm/ignore` for repo-local curation
- **Modern base-model registry** across text, reasoning, sparse-MoE,
  vision-language, and audio-language rows
- **Replay-backed retraining** so edits accumulate instead of silently wiping
  prior state
- **Synthetic data loops** through `dlm synth instructions` and
  `dlm synth preferences`
- **Multi-adapter docs + learned gating** for separating knowledge, tone, or
  persona lanes inside one project
- **Local iteration UX** with `prompt`, `repl`, `train --watch`, `metrics`,
  and `doctor`
- **Runtime export** to `ollama`, `llama-server`, `vllm`, and `mlx-serve`
- **Probe-driven improvement** through `sway`-style harvest flows
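As a rough sketch of the authoring shape described above: the `::instruction::` and `::preference::` markers and the `training.sources` key come from this README, but the frontmatter layout and the internal fields of each section body (the Q/A lines, the `chosen`/`rejected` entries) are illustrative assumptions here, not the real grammar — see the Frontmatter and Section grammar docs for the authoritative format.

```
---
base: smollm2-135m
training:
  sources:
    - src/**/*.py
---

This adapter tutors beginners in Python.

::instruction::
Q: What does a decorator do?
A: It wraps a function to extend its behavior without editing its body.

::preference::
prompt: Explain list comprehensions.
chosen: A short, example-first explanation.
rejected: A wall of jargon with no example.
```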
## 30-Second Demo

```sh
$ uv run dlm init tutor.dlm --base smollm2-135m
$ $EDITOR tutor.dlm
$ uv run dlm train tutor.dlm
$ uv run dlm prompt tutor.dlm "Explain Python decorators"
$ uv run dlm export tutor.dlm --target ollama --name my-tutor
```
## Where To Start

| If you want to… | Start here |
|---|---|
| Install DLM and run the first cycle | [Getting started → Install](getting-started/install.md) |
| Understand the `.dlm` file format | [Frontmatter](format/frontmatter.md) and [Section grammar](format/sections.md) |
| Train across a real repo | [Training across codebases](cookbook/training-across-codebases.md) |
| Use named adapters and routing | [Multi-adapter](cookbook/multi-adapter.md) and [Learned adapter gate](cookbook/learned-adapter-gate.md) |
| Work with images or audio | [Multimodal training](cookbook/multimodal-training.md) and [Audio training](cookbook/audio-training.md) |
| Turn prose into instruction data | [Synthesize training data](cookbook/synthesize-training-data.md) and [Bootstrap self-improving](cookbook/bootstrap-self-improving.md) |
| Mine preference pairs from a live adapter | [Self-improving loop](cookbook/self-improving-loop.md) and [Reward-model integration](cookbook/reward-model-integration.md) |
| Export or ship a model | [Multi-target export](cookbook/multi-target-export.md), [CLI reference](cli/reference.md), and [Determinism](determinism.md) |
| Pull eval failures back into training | [Probe-driven training](cookbook/probe-driven-training.md) |
## Status

DLM is pre-v1.0 but substantially broader than the original MVP framing.
Core author/train/prompt/export/pack/share flows are in place, and current
runtime-target work is extending export beyond the original Ollama-only path.