DocumentLanguageModel

A text file becomes your personal, locally-trained LLM.

Edit a .dlm file, train a LoRA on it, export to Ollama — all on your machine. No telemetry, no uploads, no cloud. Built on PyTorch + HuggingFace with a hardware-aware planner that picks precision, attention, and batching for your box.

Status: pre-release. The full v1.0 command surface is wired — init, train, prompt, export, pack, unpack, doctor, show, migrate. Reproducibility-lock and docs polish are the remaining Phase 3 work before a tagged release.

What it does

  • Edit a document, get a model. A .dlm is plain UTF-8 text with YAML frontmatter and section fences (::instruction::, ::preference::; unfenced text is ordinary prose). Prose trains via continued pretraining; instruction blocks train via SFT; preference blocks via DPO/ORPO (coming). A sketch of the format follows this list.
  • LoRA / QLoRA on a real base. Curated registry of small pretrained bases (Qwen 2.5 0.5B–3B, Llama-3.2 1B/3B, SmolLM2 135M–1.7B, Phi-3.5-mini). Any HuggingFace model via an hf:org/name escape hatch.
  • Retrain, don't forget. Prior document versions are stored in a zstd-compressed replay corpus and sampled back into each training run; retrains are additive, not destructive.
  • Deterministic by default. Same document + same hardware tier + pinned versions → bit-identical adapter.
  • Export to Ollama. dlm export produces a base GGUF + adapter GGUF + Modelfile with an explicit Go text/template (no fuzzy matching), then registers it locally with ollama create.
  • Hardware-aware. dlm doctor picks precision (bf16 on Ampere+, fp16 on MPS), attention (FlashAttention when available, SDPA otherwise), batching, and gradient checkpointing.
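
To make the format concrete, here is a minimal sketch of a .dlm. The fence markers, the YAML frontmatter, and the ### Q / ### A layout come from the descriptions in this README; the exact frontmatter keys and how a fenced block is closed are assumptions, so treat the scaffold from dlm init as the source of truth.

---
# hypothetical frontmatter keys; dlm init scaffolds the real ones
name: mydoc
base: qwen2.5-0.5b
---
Ordinary prose outside any fence trains via continued pretraining.

::instruction::
### Q
What does mydoc cover?
### A
Notes on my home lab, taken from the prose above.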

Supported platforms

| Tier | Training | Inference |
| --- | --- | --- |
| NVIDIA CUDA (SM ≥ 8.0) | bf16 + QLoRA 4-bit + FlashAttention | Ollama (GGUF CUDA) |
| NVIDIA CUDA (SM < 8.0) | fp16 LoRA | Ollama (GGUF CUDA) |
| Apple Silicon (MPS) | fp16 LoRA | Ollama (GGUF Metal) |
| CPU | inference-only by default (training refused above 200M params) | Ollama (GGUF CPU) |
| AMD ROCm | experimental (later) | llama.cpp ROCm |

Installation

# Requires Python 3.11+ and uv (https://github.com/astral-sh/uv)
git clone https://github.com/tenseleyFlow/DocumentLanguageModel.git
cd DocumentLanguageModel
uv sync
uv run dlm --help

For export: install Ollama separately (minimum version is pinned in the CLI; dlm doctor reports it).
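
A quick sanity check before exporting, assuming Ollama is already installed (the doctor output format is not shown here):

ollama --version      # confirm the Ollama CLI is on PATH
uv run dlm doctor     # reports the resolved hardware plan and the pinned Ollama minimum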

Quickstart

uv run dlm init mydoc.dlm                 # scaffold a new .dlm
# edit mydoc.dlm — write prose, add ### Q / ### A pairs, etc.
uv run dlm train mydoc.dlm                # train a LoRA
uv run dlm prompt mydoc.dlm "question?"   # query the trained adapter
uv run dlm export mydoc.dlm --name mydoc  # register with Ollama
ollama run mydoc                          # use it

dlm pack mydoc.dlm produces a portable .dlm.pack bundle you can hand off to another machine; dlm unpack installs it on the other end. dlm show mydoc.dlm prints training history, exports, and adapter state; dlm doctor reports the resolved hardware plan.
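
As a sketch of the hand-off flow (the .dlm.pack filename and the unpack argument shown here are assumptions about the exact CLI shape):

# on the source machine
uv run dlm pack mydoc.dlm
scp mydoc.dlm.pack you@other-box:      # bundle filename assumed

# on the target machine
uv run dlm unpack mydoc.dlm.pack       # argument form assumed
uv run dlm show mydoc.dlm              # confirm history, exports, and adapter state arrived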

Principles

  1. The document is the interface. Not a config file. Not a framework. Plain text with a special extension.
  2. Training is real. LoRA/QLoRA on a pretrained base, not a toy from-scratch transformer.
  3. Retrain is additive. Replay prior versions; never forget silently.
  4. Local-first, always. Training, inference, and store all live on your disk. No network calls outside of model download.
  5. Deterministic by default. Reproducibility is a contract, not a wish.

Tech stack

Python 3.11+ · PyTorch · HuggingFace transformers/peft/trl/accelerate · bitsandbytes (CUDA-gated) · llama.cpp (vendored, for GGUF export) · Typer · Pydantic · uv.

Contributing

See CONTRIBUTING.md. Testing conventions live at docs-internal/README-testing.md.

License

MIT. Base-model licenses are separate and enforced at dlm init / dlm pack time; Llama family bases require explicit acceptance.