# DocumentLanguageModel
A text file becomes your personal, locally-trained LLM.
Edit a .dlm file, train a LoRA on it, export to Ollama — all on your machine.
No telemetry, no uploads, no cloud. Built on PyTorch + HuggingFace with a
hardware-aware planner that picks precision, attention, and batching for your
box.
**Status:** pre-release. The full v1.0 command surface is wired (`init`, `train`, `prompt`, `export`, `pack`, `unpack`, `doctor`, `show`, `migrate`). The reproducibility lock and docs polish are the remaining Phase 3 work before a tagged release.
## What it does
- **Edit a document, get a model.** A `.dlm` is plain UTF-8 text with YAML frontmatter and section fences (`::instruction::`, `::preference::`, default prose); see the sketch after this list. Prose trains via continued pretraining; instruction blocks train via SFT; preference blocks via DPO/ORPO (coming).
- **LoRA / QLoRA on a real base.** Curated registry of small pretrained bases (Qwen 2.5 0.5B–3B, Llama-3.2 1B/3B, SmolLM2 135M–1.7B, Phi-3.5-mini). Any HuggingFace model via an `hf:org/name` escape hatch.
- **Retrain, don't forget.** Prior document versions are stored in a zstd-compressed replay corpus and sampled back into each training run; retrains are additive, not destructive.
- **Deterministic by default.** Same document + same hardware tier + pinned versions → bit-identical adapter.
- **Export to Ollama.** `dlm export` produces a base GGUF + adapter GGUF + Modelfile with an explicit Go `text/template` (no fuzzy matching), then registers it locally with `ollama create`.
- **Hardware-aware.** `dlm doctor` picks precision (bf16 on Ampere+, fp16 on MPS), attention (FlashAttention when available, SDPA otherwise), batching, and gradient checkpointing.
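As a rough illustration, a `.dlm` might look like the following. The fence marker comes from the list above; the frontmatter keys, how a fence is closed, and the `### Q` / `### A` layout (borrowed from the Quickstart) are assumptions, not a spec:

```text
---
base: qwen2.5-0.5b   # hypothetical key; the real frontmatter schema may differ
---
Plain prose like this trains via continued pretraining.

::instruction::
### Q
What does this repository do?
### A
It trains a LoRA adapter on this document and exports it to Ollama.
```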
## Supported platforms
| Tier | Training | Inference |
|---|---|---|
| NVIDIA CUDA (SM ≥ 8.0) | bf16 + QLoRA 4-bit + FlashAttention | Ollama (GGUF CUDA) |
| NVIDIA CUDA (SM < 8.0) | fp16 LoRA | Ollama (GGUF CUDA) |
| Apple Silicon (MPS) | fp16 LoRA | Ollama (GGUF Metal) |
| CPU | inference-only by default (training refused above 200M params) | Ollama (GGUF CPU) |
| AMD ROCm | experimental (later) | llama.cpp ROCm |
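To make the table concrete, here is a minimal Python sketch of the kind of plan `dlm doctor` resolves. It is illustrative only: the function name and returned keys are hypothetical, and the real planner also probes FlashAttention availability and memory before choosing batching and gradient checkpointing.

```python
import torch

def resolve_plan() -> dict:
    """Illustrative only: mirrors the tier table above, not dlm's real planner."""
    if torch.cuda.is_available():
        major, _minor = torch.cuda.get_device_capability()
        if major >= 8:  # Ampere or newer (SM >= 8.0)
            return {"precision": "bf16", "quant": "qlora-4bit",
                    "attention": "flash-attention if available, else sdpa"}
        return {"precision": "fp16", "quant": None, "attention": "sdpa"}
    if torch.backends.mps.is_available():  # Apple Silicon
        return {"precision": "fp16", "quant": None, "attention": "sdpa"}
    # CPU tier: inference-only by default; training refused above 200M params
    return {"precision": "fp32", "quant": None, "attention": "sdpa"}
```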
## Installation
```sh
# Requires Python 3.11+ and uv (https://github.com/astral-sh/uv)
git clone https://github.com/tenseleyFlow/DocumentLanguageModel.git
cd DocumentLanguageModel
uv sync
uv run dlm --help
```
For export, install Ollama separately (the minimum version is pinned in the CLI; `dlm doctor` reports it).
## Quickstart
```sh
uv run dlm init mydoc.dlm                 # scaffold a new .dlm
# edit mydoc.dlm — write prose, add ### Q / ### A pairs, etc.
uv run dlm train mydoc.dlm                # train a LoRA
uv run dlm prompt mydoc.dlm "question?"   # query the trained adapter
uv run dlm export mydoc.dlm --name mydoc  # register with Ollama
ollama run mydoc                          # use it
```
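The export step produces a base GGUF, an adapter GGUF, and a Modelfile, then runs `ollama create`. As a sketch (file paths and the template body are illustrative, not what dlm actually emits), the Modelfile uses standard Ollama directives with an explicit Go `text/template`:

```
FROM ./mydoc-base.gguf
ADAPTER ./mydoc-adapter.gguf
TEMPLATE """{{ if .System }}{{ .System }}
{{ end }}{{ .Prompt }}"""
```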
`dlm pack mydoc.dlm` produces a portable `.dlm.pack` bundle you can hand off to another machine; `dlm unpack` installs it on the other end.
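A minimal hand-off, assuming the bundle name follows the `.dlm.pack` convention above (exact output naming and flags may differ):

```sh
uv run dlm pack mydoc.dlm         # produces mydoc.dlm.pack
# copy mydoc.dlm.pack to the other machine, then:
uv run dlm unpack mydoc.dlm.pack
```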
`dlm show mydoc.dlm` prints training history, exports, and adapter state; `dlm doctor` reports the resolved hardware plan.
## Principles
- **The document is the interface.** Not a config file. Not a framework. Plain text with a special extension.
- **Training is real.** LoRA/QLoRA on a pretrained base, not a toy from-scratch transformer.
- **Retrain is additive.** Replay prior versions; never forget silently.
- **Local-first, always.** Training, inference, and the store all live on your disk. No network calls outside of model download.
- **Deterministic by default.** Reproducibility is a contract, not a wish.
## Tech stack
Python 3.11+ · PyTorch · HuggingFace transformers/peft/trl/accelerate ·
bitsandbytes (CUDA-gated) · llama.cpp (vendored, for GGUF export) · Typer ·
Pydantic · uv.
## Contributing
See `CONTRIBUTING.md`. Testing conventions live in `docs-internal/README-testing.md`.
## License
MIT. Base-model licenses are separate and enforced at `dlm init` / `dlm pack` time; Llama-family bases require explicit acceptance.