@@ -1,7 +1,7 @@ |
| 1 | # DocumentLanguageModel | 1 | # DocumentLanguageModel |
| 2 | | 2 | |
| 3 | -> `.dlm` is a trainable local AI document format: typed sections, directives, | 3 | +> A `.dlm` file becomes a local, reproducible, trainable LLM. |
| 4 | -> replay-backed retraining, and export. | 4 | +> Edit the document, retrain, share. |
| 5 | | 5 | |
| 6 | DocumentLanguageModel (DLM) is a local-first training, inference, and export | 6 | DocumentLanguageModel (DLM) is a local-first training, inference, and export |
| 7 | toolchain built around authored documents instead of hosted dashboards. | 7 | toolchain built around authored documents instead of hosted dashboards. |
@@ -11,116 +11,42 @@ A `.dlm` can be: |
| 11 | - a hand-written training document with prose, instruction, and preference data | 11 | - a hand-written training document with prose, instruction, and preference data |
| 12 | - a directive-driven entrypoint into a codebase or notes tree | 12 | - a directive-driven entrypoint into a codebase or notes tree |
| 13 | - a multi-adapter project with learned routing | 13 | - a multi-adapter project with learned routing |
| 14 | -- a selected multimodal or audio-language document | 14 | +- a multimodal or audio-language document |
| 15 | | 15 | |
| 16 | DLM trains LoRA / QLoRA / DoRA adapters on real pretrained bases, keeps a replay | 16 | DLM trains LoRA / QLoRA / DoRA adapters on real pretrained bases, keeps a replay |
| 17 | -history so retrains do not silently forget, and exports local runtimes such as | 17 | +history so retrains do not silently forget, and exports to Ollama, |
| 18 | -Ollama, `llama-server`, `vllm`, and `mlx-serve`. | | |
| 19 | - | | |
| 20 | -**Status:** pre-v1.0, but far beyond the original MVP framing. The core | | |
| 21 | -author/train/prompt/export/pack/share loop is real, and newer runtime-target | | |
| 22 | -work is landing incrementally. Current export targets are `ollama`, | | |
| 23 | `llama-server`, `vllm`, and `mlx-serve`. | 18 | `llama-server`, `vllm`, and `mlx-serve`. |
| 24 | | 19 | |
| 25 | -## What A `.dlm` Actually Is | 20 | +## Install |
| 26 | - | | |
| 27 | -A `.dlm` is not just “a text file with a special extension.” | | |
| 28 | - | | |
| 29 | -It is a trainable project surface with: | | |
| 30 | - | | |
| 31 | -- **frontmatter** for base-model choice, training config, export defaults, | | |
| 32 | - sources, cache policy, and multi-adapter gate settings | | |
| 33 | -- **typed body sections** such as prose, `::instruction::`, | | |
| 34 | - `::preference::`, `::image::`, and `::audio::` | | |
| 35 | -- **adapter routing** via fences like `::instruction#knowledge::` | | |
| 36 | -- **directive-driven ingestion** from files and directories through | | |
| 37 | - `training.sources` | | |
| 38 | -- **repo-local subtree control** through `.dlm/training.yaml` and `.dlm/ignore` | | |
| 39 | -- a stable **`dlm_id`** that binds the document to a local store under | | |
| 40 | - `~/.dlm/store/<dlm_id>/` | | |
| 41 | - | | |
| 42 | -That combination is what makes DLM more like a local AI authoring format than a | | |
| 43 | -single prompt file. | | |
| 44 | - | | |
| 45 | -## Why DLM | | |
| 46 | - | | |
| 47 | -Most “personal AI” tooling still pushes you toward one of two bad choices: | | |
| 48 | - | | |
| 49 | -- upload your data to someone else’s cloud | | |
| 50 | -- run an oversized model with weak authoring and retraining ergonomics | | |
| 51 | - | | |
| 52 | -DLM sits in the gap: | | |
| 53 | - | | |
| 54 | -- **The document is the interface.** You author the thing you care about instead | | |
| 55 | - of wiring together a hidden dataset pipeline. | | |
| 56 | -- **Training is real.** LoRA / QLoRA / DoRA on pretrained bases, not a toy | | |
| 57 | - from-scratch transformer. | | |
| 58 | -- **Retraining is additive.** Previous document versions flow into a replay | | |
| 59 | - corpus so the model does not forget last week’s state by default. | | |
| 60 | -- **Everything stays local.** Training, inference, store state, exports, and | | |
| 61 | - packs all live on your machine unless you explicitly push them somewhere. | | |
| 62 | -- **Determinism is a contract.** Locks, pinned versions, and golden checks are | | |
| 63 | - first-class design constraints, not “best effort.” | | |
| 64 | - | | |
| 65 | -## Core Capabilities | | |
| 66 | - | | |
| 67 | -- **Author structured training data in one place.** Mix prose, SFT examples, | | |
| 68 | - preferences, image sections, and audio sections in one document. | | |
| 69 | -- **Ingest whole trees, not just one file.** `training.sources` can walk a | | |
| 70 | - repo, and subtree-local `.dlm/training.yaml` / `.dlm/ignore` let the corpus | | |
| 71 | - carry its own curation rules. | | |
| 72 | -- **Train on modern base families.** Text, reasoning-tuned, sparse-MoE, | | |
| 73 | - vision-language, and audio-language registry rows ship today, plus `hf:org/name` | | |
| 74 | - escape hatches. | | |
| 75 | -- **Compose multiple adapters in one document.** Named adapters, weighted export | | |
| 76 | - mixes, and learned adapter gates let one `.dlm` separate knowledge, tone, or | | |
| 77 | - persona lanes. | | |
| 78 | -- **Mine preference pairs from a live adapter.** `dlm preference mine` can use | | |
| 79 | - `sway`, HF reward models, or external CLI judges to write auto-mined | | |
| 80 | - `::preference::` sections back into the document. | | |
| 81 | -- **Stay in a local iteration loop.** `dlm prompt`, `dlm repl`, | | |
| 82 | - `dlm train --watch`, `dlm metrics`, and `dlm doctor` are all part of the | | |
| 83 | - normal workflow now. | | |
| 84 | -- **Export beyond the original Ollama-only story.** DLM still does explicit | | |
| 85 | - Ollama exports with pinned templates, and now also emits `llama-server`, | | |
| 86 | - `vllm`, and `mlx-serve` launch artifacts for local runtime targets. | | |
| 87 | -- **Close the eval loop.** `dlm harvest` can pull failing `sway`-style probe | | |
| 88 | - reports back into the document as new training examples. | | |
| 89 | -- **Pack and share reproducibly.** `.dlm.pack`, verification, push/pull, and | | |
| 90 | - local serve flows are all built around the same store contracts. | | |
| 91 | - | | |
| 92 | -## Supported Platforms | | |
| 93 | | 21 | |
| 94 | -| Tier | Training | Inference / export | | 22 | +```sh |
| 95 | -|---|---|---| | 23 | +pip install document-language-model |
| 96 | -| NVIDIA CUDA (SM ≥ 8.0) | bf16 + QLoRA 4-bit + FlashAttention | Ollama, GGUF export, `llama-server`, `vllm` | | 24 | +``` |
| 97 | -| NVIDIA CUDA (SM < 8.0) | fp16 LoRA | Ollama, GGUF export, `llama-server`, `vllm` | | | |
| 98 | -| Apple Silicon (MPS) | fp16 or fp32 LoRA depending on doctor plan | Ollama, selected MLX inference paths, GGUF export, `vllm` (conservative Metal defaults), `mlx-serve` | | | |
| 99 | -| CPU | inference-first; training refused above small bases unless forced | GGUF export, Ollama, `llama-server` | | | |
| 100 | -| AMD ROCm | experimental | ROCm-oriented llama.cpp flows | | | |
| 101 | | 25 | |
| 102 | -See [docs/hardware](./docs/hardware/memory-estimates.md) and | 26 | +That gives you the `dlm` command. Verify: |
| 103 | -[docs/hardware/vl-memory.md](./docs/hardware/vl-memory.md) for the real support | | |
| 104 | -matrix and current caveats. | | |
| 105 | | 27 | |
| 106 | -## Install | 28 | +```sh |
| | 29 | +dlm --version |
| | 30 | +dlm doctor |
| | 31 | +``` |
| 107 | | 32 | |
| 108 | -### From the Homebrew tap | 33 | +### Extras |
| 109 | | 34 | |
| 110 | ```sh | 35 | ```sh |
| 111 | -brew tap tenseleyFlow/tap | 36 | +# CUDA QLoRA support (NVIDIA SM >= 8.0): |
| 112 | -brew install dlm | 37 | +pip install 'document-language-model[cuda]' |
| 113 | | 38 | |
| 114 | -# Optional, only if you want `--target ollama` registration/smoke: | 39 | +# Apple Silicon MLX inference: |
| 115 | -brew install ollama | 40 | +pip install 'document-language-model[mlx]' |
| 116 | -``` | | |
| 117 | | 41 | |
| 118 | -`brew install dlm` pulls in the Python environment and the vendored | 42 | +# OpenAI teacher for synthetic data generation: |
| 119 | -`llama.cpp` source tree DLM uses for GGUF conversion. CUDA users unlock QLoRA | 43 | +pip install 'document-language-model[openai]' |
| 120 | -after install: | | |
| 121 | | 44 | |
| 122 | -```sh | 45 | +# Anthropic teacher: |
| 123 | -$(brew --prefix dlm)/libexec/venv/bin/pip install 'dlm[cuda]' | 46 | +pip install 'document-language-model[anthropic]' |
| | 47 | + |
| | 48 | +# Observability (TensorBoard + W&B): |
| | 49 | +pip install 'document-language-model[observability]' |
| 124 | ``` | 50 | ``` |
| 125 | | 51 | |
| 126 | ### From source | 52 | ### From source |
@@ -128,49 +54,40 @@ $(brew --prefix dlm)/libexec/venv/bin/pip install 'dlm[cuda]' |
| 128 | ```sh | 54 | ```sh |
| 129 | git clone https://github.com/tenseleyFlow/DocumentLanguageModel.git | 55 | git clone https://github.com/tenseleyFlow/DocumentLanguageModel.git |
| 130 | cd DocumentLanguageModel | 56 | cd DocumentLanguageModel |
| 131 | -uv sync | 57 | +uv sync --all-extras --dev |
| | 58 | +uv run dlm --help |
| 132 | | 59 | |
| 133 | -# Build GGUF tooling: | 60 | +# Build GGUF export tooling: |
| 134 | scripts/bump-llama-cpp.sh build | 61 | scripts/bump-llama-cpp.sh build |
| 135 | | 62 | |
| 136 | -# If you want the llama.cpp HTTP target too: | 63 | +# Optional: llama-server HTTP target: |
| 137 | scripts/bump-llama-cpp.sh build --with-server | 64 | scripts/bump-llama-cpp.sh build --with-server |
| 138 | - | | |
| 139 | -# If you want the Apple Silicon MLX HTTP target: | | |
| 140 | -uv sync --extra mlx | | |
| 141 | - | | |
| 142 | -# If you want the vLLM HTTP target: | | |
| 143 | -# install a compatible vllm runtime separately; DLM writes launch artifacts | | |
| 144 | -# but does not bundle the server runtime itself. | | |
| 145 | - | | |
| 146 | -uv run dlm --help | | |
| 147 | ``` | 65 | ``` |
| 148 | | 66 | |
| 149 | -We deliberately do not publish to PyPI yet. See | | |
| 150 | -[CONTRIBUTING.md](./CONTRIBUTING.md) for the release flow. | | |
| 151 | - | | |
| 152 | ## 30-Second Start | 67 | ## 30-Second Start |
| 153 | | 68 | |
| 154 | ```sh | 69 | ```sh |
| 155 | -uv run dlm init tutor.dlm --base smollm2-135m | 70 | +dlm init tutor.dlm --base smollm2-135m |
| 156 | -$EDITOR tutor.dlm | 71 | +# Edit tutor.dlm — add your Q&A pairs |
| 157 | -uv run dlm train tutor.dlm | 72 | +dlm train tutor.dlm |
| 158 | -uv run dlm prompt tutor.dlm "What is a Python decorator?" | 73 | +dlm prompt tutor.dlm "What is a Python decorator?" |
| 159 | -uv run dlm export tutor.dlm --target ollama --name my-tutor | 74 | +dlm export tutor.dlm --target ollama --name my-tutor |
| 160 | ``` | 75 | ``` |
| 161 | | 76 | |
| 162 | -A minimal `.dlm` still works: | 77 | +## What a `.dlm` Looks Like |
| | 78 | + |
| | 79 | +A minimal document: |
| 163 | | 80 | |
| 164 | -```dlm | 81 | +```yaml |
| 165 | --- | 82 | --- |
| 166 | dlm_id: 01KPM5CXB51GRX86Q25AKERN6E | 83 | dlm_id: 01KPM5CXB51GRX86Q25AKERN6E |
| 167 | -dlm_version: 1 | 84 | +dlm_version: 15 |
| 168 | base_model: smollm2-135m | 85 | base_model: smollm2-135m |
| 169 | --- | 86 | --- |
| 170 | | 87 | |
| 171 | -# Your document title | 88 | +# My tutor |
| 172 | | 89 | |
| 173 | -Write prose here. | 90 | +Some background prose. This trains via continued pretraining. |
| 174 | | 91 | |
| 175 | ::instruction:: | 92 | ::instruction:: |
| 176 | ### Q | 93 | ### Q |
@@ -180,28 +97,22 @@ What is a decorator? |
| 180 | A function that takes a function and returns a wrapped function. | 97 | A function that takes a function and returns a wrapped function. |
| 181 | ``` | 98 | ``` |
| 182 | | 99 | |
| 183 | -That path is still important. It is just no longer the whole story. | 100 | +A more representative one with directives, named adapters, and export config: |
| 184 | - | | |
| 185 | -## Authoring Beyond The Toy Example | | |
| 186 | | 101 | |
| 187 | -A more representative `.dlm` can mix directives, named adapters, and export | 102 | +```yaml |
| 188 | -defaults in one place: | | |
| 189 | - | | |
| 190 | -```dlm | | |
| 191 | --- | 103 | --- |
| 192 | dlm_id: 01KTESTEXAMPLE000000000000 | 104 | dlm_id: 01KTESTEXAMPLE000000000000 |
| 193 | -dlm_version: 1 | 105 | +dlm_version: 15 |
| 194 | base_model: qwen3-1.7b | 106 | base_model: qwen3-1.7b |
| 195 | system_prompt: | | 107 | system_prompt: | |
| 196 | You are a concise engineering assistant. | 108 | You are a concise engineering assistant. |
| 197 | training: | 109 | training: |
| 198 | adapter: lora | 110 | adapter: lora |
| 199 | sequence_len: 4096 | 111 | sequence_len: 4096 |
| 200 | - sources_policy: strict | | |
| 201 | sources: | 112 | sources: |
| 202 | - path: ./src | 113 | - path: ./src |
| 203 | include: ["**/*.py", "**/*.md"] | 114 | include: ["**/*.py", "**/*.md"] |
| 204 | - exclude: ["tests/**", "**/__pycache__/**"] | 115 | + exclude: ["tests/**"] |
| 205 | adapters: | 116 | adapters: |
| 206 | knowledge: | 117 | knowledge: |
| 207 | adapter: lora | 118 | adapter: lora |
@@ -237,183 +148,171 @@ Explain it directly, then give the fix. |
| 237 | Over-explain the background before naming the problem. | 148 | Over-explain the background before naming the problem. |
| 238 | ``` | 149 | ``` |
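| | | + |
| | | +Sections route to a named adapter with a `#` suffix on the fence. A minimal |
| | | +sketch (the `knowledge` adapter is the one defined in the frontmatter above; |
| | | +the Q&A text is illustrative): |
| | | + |
| | | +``` |
| | | +::instruction#knowledge:: |
| | | +### Q |
| | | +Where does trained state live? |
| | | +### A |
| | | +In the local store under `~/.dlm/store/<dlm_id>/`, keyed by the document's |
| | | +`dlm_id`. |
| | | +``` |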
| 239 | | 150 | |
| 240 | -Two important upgrades over the older README story: | | |
| 241 | - | | |
| 242 | -- `training.sources` can turn a repo or notes tree into synthetic training | | |
| 243 | - sections. | | |
| 244 | -- `training.adapters` + `training.gate` let one document route prompts across | | |
| 245 | - multiple adapters instead of pretending one flat adapter is the only mode. | | |
| 246 | - | | |
| 247 | -If you need deeper subtree-specific curation, drop `.dlm/training.yaml` and | | |
| 248 | -`.dlm/ignore` into nested directories and let the corpus carry its own rules. | | |
| 249 | - | | |
| 250 | ## Common Workflows | 151 | ## Common Workflows |
| 251 | | 152 | |
| 252 | -### 1. Hand-authored document | 153 | +### Train a hand-authored document |
| 253 | | 154 | |
| 254 | ```sh | 155 | ```sh |
| 255 | -uv run dlm init tutor.dlm --base smollm2-135m | 156 | +dlm init tutor.dlm --base smollm2-135m |
| 256 | -uv run dlm train tutor.dlm | 157 | +dlm train tutor.dlm |
| 257 | -uv run dlm prompt tutor.dlm "Explain decorators" | 158 | +dlm prompt tutor.dlm "Explain decorators" |
| 258 | ``` | 159 | ``` |
| 259 | | 160 | |
| 260 | -### 2. Train across a codebase | 161 | +### Train across a codebase |
| 261 | | 162 | |
| 262 | ```sh | 163 | ```sh |
| 263 | -uv run dlm train ./my-repo --base qwen3-1.7b --include '**/*.py' --name corpus | 164 | +dlm train ./my-repo --base qwen3-1.7b |
| 264 | ``` | 165 | ``` |
| 265 | | 166 | |
| 266 | -That auto-scaffolds a `.dlm` under `./my-repo/.dlm/` and lets the repo become | 167 | +This auto-scaffolds a `.dlm` under `./my-repo/.dlm/` and trains on the repo's |
| 267 | -its own training surface. | 168 | +source files. |
| 268 | | 169 | |
| 269 | -### 3. Multi-adapter composition | 170 | +### Multi-adapter composition |
| 270 | | 171 | |
| 271 | ```sh | 172 | ```sh |
| 272 | -uv run dlm prompt mydoc.dlm "Explain the runbook" --adapter knowledge | 173 | +dlm prompt mydoc.dlm "Explain the runbook" --adapter knowledge |
| 273 | -uv run dlm export mydoc.dlm --adapter-mix knowledge:1.0,tone:0.5 | 174 | +dlm export mydoc.dlm --adapter-mix knowledge:1.0,tone:0.5 |
| 274 | ``` | 175 | ``` |
| 275 | | 176 | |
| 276 | -### 4. Local iteration loop | 177 | +### Export to local runtimes |
| 277 | | 178 | |
| 278 | ```sh | 179 | ```sh |
| 279 | -uv run dlm train mydoc.dlm --watch | 180 | +dlm export mydoc.dlm --target ollama --name mydoc |
| 280 | -uv run dlm repl mydoc.dlm | 181 | +dlm export mydoc.dlm --target llama-server |
| 281 | -uv run dlm metrics mydoc.dlm | 182 | +dlm export mydoc.dlm --target vllm |
| | 183 | +dlm export mydoc.dlm --target mlx-serve |
| 282 | ``` | 184 | ``` |
| 283 | | 185 | |
| 284 | -### 5. Export and ship | 186 | +### Mine preference pairs and retrain |
| 285 | | 187 | |
| 286 | ```sh | 188 | ```sh |
| 287 | -uv run dlm export mydoc.dlm --target ollama --name mydoc | 189 | +dlm preference mine mydoc.dlm --samples 4 --max-pairs 8 |
| 288 | -uv run dlm export mydoc.dlm --target llama-server | 190 | +dlm preference apply mydoc.dlm |
| 289 | -uv run dlm export mydoc.dlm --target vllm | 191 | +dlm train mydoc.dlm --phase preference |
| 290 | -uv run dlm export mydoc.dlm --target mlx-serve | | |
| 291 | -uv run dlm pack mydoc.dlm --include-exports | | |
| 292 | -uv run dlm verify mydoc.dlm.pack | | |
| 293 | ``` | 192 | ``` |
| 294 | | 193 | |
| 295 | -On Apple Silicon, `--target vllm` now emits conservative `vllm-metal` | 194 | +### Generate synthetic training data |
| 296 | -defaults in the launch script: it pins the server to the MLX KV path | | |
| 297 | -(`VLLM_METAL_USE_PAGED_ATTENTION=0`, `VLLM_METAL_MEMORY_FRACTION=auto`) | | |
| 298 | -and caps `--max-model-len` to the document's `training.sequence_len` | | |
| 299 | -instead of blindly asking `vllm` for the base model's full context. | | |
| 300 | - | | |
| 301 | -### 6. Mine preference pairs and retrain | | |
| 302 | | 195 | |
| 303 | ```sh | 196 | ```sh |
| 304 | -uv run dlm preference mine mydoc.dlm --samples 4 --max-pairs 8 | 197 | +dlm synth instructions mydoc.dlm --teacher self --apply |
| 305 | -uv run dlm preference list mydoc.dlm | 198 | +dlm synth instructions mydoc.dlm --teacher openai:gpt-4o-mini --apply |
| 306 | -uv run dlm preference apply mydoc.dlm | | |
| 307 | -uv run dlm train mydoc.dlm --phase preference | | |
| 308 | - | | |
| 309 | -# A/B check against hand-authored pairs only: | | |
| 310 | -uv run dlm train mydoc.dlm --phase preference --no-mined | | |
| 311 | - | | |
| 312 | -# Use a different judge when bootstrap self-judging is not enough: | | |
| 313 | -uv run dlm preference mine mydoc.dlm --judge hf:YourOrg/reward-model --apply | | |
| 314 | ``` | 199 | ``` |
| 315 | | 200 | |
| 316 | -### 7. Scaffold multimodal or audio docs | 201 | +### Multimodal and audio documents |
| 317 | | 202 | |
| 318 | ```sh | 203 | ```sh |
| 319 | -uv run dlm init diagrams.dlm --multimodal --base qwen2-vl-2b-instruct | 204 | +dlm init diagrams.dlm --multimodal --base qwen2-vl-2b-instruct |
| 320 | -uv run dlm train diagrams.dlm | 205 | +dlm train diagrams.dlm |
| 321 | -uv run dlm prompt diagrams.dlm --image figures/system.png "What is happening here?" | 206 | +dlm prompt diagrams.dlm --image figures/arch.png "What is this?" |
| 322 | | 207 | |
| 323 | -uv run dlm init calls.dlm --audio | 208 | +dlm init calls.dlm --audio |
| 324 | -uv run dlm train calls.dlm | 209 | +dlm train calls.dlm |
| 325 | -uv run dlm prompt calls.dlm --audio clips/example.wav "Summarize the clip" | 210 | +dlm prompt calls.dlm --audio clips/call.wav "Summarize the clip" |
| 326 | ``` | 211 | ``` |
| 327 | | 212 | |
| 328 | -### 8. Pull eval failures back into training | 213 | +### Pull eval failures back into training |
| 329 | | 214 | |
| 330 | ```sh | 215 | ```sh |
| 331 | -uv run dlm harvest mydoc.dlm --sway-json sway-report.json --apply | 216 | +dlm harvest mydoc.dlm --sway-json sway-report.json --apply |
| 332 | ``` | 217 | ``` |
| 333 | | 218 | |
| 334 | -That is the probe-driven loop: evaluation finds a miss, DLM turns it into | 219 | +### Pack and share |
| 335 | -document-level training data, and the next train closes the gap. | | |
| 336 | - | | |
| 337 | -### 9. Inspect store state and reproducibility | | |
| 338 | | 220 | |
| 339 | ```sh | 221 | ```sh |
| 340 | -uv run dlm doctor | 222 | +dlm pack mydoc.dlm --include-exports |
| 341 | -uv run dlm show mydoc.dlm --json | 223 | +dlm verify mydoc.dlm.pack |
| 342 | -uv run dlm metrics mydoc.dlm --run-id 7 --json | 224 | +dlm push mydoc.dlm --to hf:org/name |
| 343 | -uv run dlm pack mydoc.dlm --include-exports | | |
| 344 | -uv run dlm verify mydoc.dlm.pack | | |
| 345 | ``` | 225 | ``` |
| 346 | | 226 | |
| 347 | -## Command Surface | 227 | +### Inspect state |
| 348 | | 228 | |
| 349 | -The CLI is broader than the original MVP now. A useful mental map: | 229 | +```sh |
| | 230 | +dlm doctor |
| | 231 | +dlm show mydoc.dlm --json |
| | 232 | +dlm metrics mydoc.dlm |
| | 233 | +``` |
| | 234 | + |
| | 235 | +## Supported Platforms |
| 350 | | 236 | |
| 351 | -| Area | Commands | What they cover | | 237 | +| Tier | Training | Inference / Export | |
| 352 | |---|---|---| | 238 | |---|---|---| |
| 353 | -| Author | `init`, `templates`, `show`, `migrate`, `cache` | Create docs, inspect them, migrate schema, manage cache state | | 239 | +| NVIDIA CUDA (SM >= 8.0) | bf16 + QLoRA 4-bit + FlashAttention | Ollama, GGUF, llama-server, vLLM | |
| 354 | -| Train | `train`, `doctor`, `metrics`, `harvest` | Run training, inspect plans, observe runs, pull eval misses back in | | 240 | +| NVIDIA CUDA (SM < 8.0) | fp16 LoRA | Ollama, GGUF, llama-server, vLLM | |
| 355 | -| Align | `preference` | Mine, stage, apply, revert, and inspect auto-mined preference sections | | 241 | +| Apple Silicon (MPS) | fp16 LoRA | Ollama, GGUF, MLX inference, mlx-serve | |
| 356 | -| Infer | `prompt`, `repl` | Local interactive and one-shot inference | | 242 | +| CPU | inference-first (training refused above small bases) | GGUF, Ollama, llama-server | |
| 357 | -| Ship | `export`, `pack`, `unpack`, `verify`, `push`, `pull`, `serve` | Export to runtimes, bundle, verify, and move artifacts | | 243 | +| AMD ROCm | experimental | ROCm llama.cpp | |
| | 244 | + |
| | 245 | +## Base Model Registry |
| | 246 | + |
| | 247 | +DLM ships with ~27 pinned base models across text, vision-language, and |
| | 248 | +audio-language families: |
| | 249 | + |
| | 250 | +- **Text:** Qwen 2.5 (0.5B–3B), Qwen 3 (1.7B–8B), Llama 3.2/3.3, |
| | 251 | + SmolLM 2/3, Phi-3.5/4, Gemma 2, OLMo 2, Mixtral 8x7B |
| | 252 | +- **Vision-language:** Qwen2-VL, InternVL2/3, PaliGemma, Mistral-Small-3.1 |
| | 253 | +- **Audio-language:** Qwen2-Audio |
| | 254 | + |
| | 255 | +Any HuggingFace model can be used via `--base hf:org/name`, guarded by compatibility probes. |
| | 256 | + |
| | 257 | +## Command Surface |
| | 258 | + |
| | 259 | +| Area | Commands | |
| | 260 | +|---|---| |
| | 261 | +| Author | `init`, `templates`, `show`, `migrate`, `cache` | |
| | 262 | +| Train | `train`, `doctor`, `metrics`, `harvest` | |
| | 263 | +| Align | `preference mine/apply/revert/list` | |
| | 264 | +| Synth | `synth instructions/preferences/revert/list` | |
| | 265 | +| Infer | `prompt`, `repl` | |
| | 266 | +| Ship | `export`, `pack`, `unpack`, `verify`, `push`, `pull`, `serve` | |
| 358 | | 267 | |
| 359 | See the [CLI reference](./docs/cli/reference.md) for the full flag surface. | 268 | See the [CLI reference](./docs/cli/reference.md) for the full flag surface. |
| 360 | | 269 | |
| | 270 | +## VSCode Extension |
| | 271 | + |
| | 272 | +The [dlm-vsc](https://github.com/tenseleyFlow/dlm-vsc) extension provides |
| | 273 | +syntax highlighting, completions, diagnostics, and a side panel for `.dlm` |
| | 274 | +authoring. Requires the |
| | 275 | +[dlm-lsp](https://github.com/tenseleyFlow/dlm-lsp) language server: |
| | 276 | + |
| | 277 | +```sh |
| | 278 | +pip install dlm-lsp |
| | 279 | +``` |
| | 280 | + |
| 361 | ## Documentation | 281 | ## Documentation |
| 362 | | 282 | |
| 363 | - [Getting started](./docs/getting-started/install.md) | 283 | - [Getting started](./docs/getting-started/install.md) |
| 364 | - [Frontmatter reference](./docs/format/frontmatter.md) | 284 | - [Frontmatter reference](./docs/format/frontmatter.md) |
| 365 | - [Section grammar](./docs/format/sections.md) | 285 | - [Section grammar](./docs/format/sections.md) |
| 366 | -- [Preference section reference](./docs/format/preference-section.md) | 286 | +- [CLI reference](./docs/cli/reference.md) |
| 367 | - [Training across codebases](./docs/cookbook/training-across-codebases.md) | 287 | - [Training across codebases](./docs/cookbook/training-across-codebases.md) |
| 368 | -- [Train from a folder](./docs/cookbook/train-from-folder.md) | | |
| 369 | -- [Multi-source training](./docs/cookbook/multi-source-training.md) | | |
| 370 | -- [Tokenized-section cache](./docs/cookbook/directive-cache.md) | | |
| 371 | - [Multi-adapter composition](./docs/cookbook/multi-adapter.md) | 288 | - [Multi-adapter composition](./docs/cookbook/multi-adapter.md) |
| 372 | -- [Learned adapter gate](./docs/cookbook/learned-adapter-gate.md) | 289 | +- [Multi-target export](./docs/cookbook/multi-target-export.md) |
| 373 | -- [Self-improving loop / preference mining](./docs/cookbook/self-improving-loop.md) | 290 | +- [Self-improving loop](./docs/cookbook/self-improving-loop.md) |
| 374 | -- [Reward-model integration](./docs/cookbook/reward-model-integration.md) | 291 | +- [Synthesize training data](./docs/cookbook/synthesize-training-data.md) |
| 375 | - [Multimodal training](./docs/cookbook/multimodal-training.md) | 292 | - [Multimodal training](./docs/cookbook/multimodal-training.md) |
| 376 | - [Audio training](./docs/cookbook/audio-training.md) | 293 | - [Audio training](./docs/cookbook/audio-training.md) |
| 377 | -- [Probe-driven training / sway harvest](./docs/cookbook/probe-driven-training.md) | | |
| 378 | -- [Multi-target export](./docs/cookbook/multi-target-export.md) | | |
| 379 | -- [Sharing adapters and packs](./docs/cookbook/sharing.md) | | |
| 380 | -- [CLI reference](./docs/cli/reference.md) | | |
| 381 | - [Architecture](./docs/architecture.md) | 294 | - [Architecture](./docs/architecture.md) |
| 382 | - [Determinism](./docs/determinism.md) | 295 | - [Determinism](./docs/determinism.md) |
| 383 | | 296 | |
| 384 | ## Principles | 297 | ## Principles |
| 385 | | 298 | |
| 386 | -1. **The document is the interface.** | 299 | +1. **The document is the interface.** Frontmatter, typed sections, directives, |
| 387 | - But the document is structured: frontmatter, typed sections, directives, and | 300 | + and store contracts — not a dashboard. |
| 388 | - store contracts all matter. | 301 | +2. **Training is real.** LoRA / QLoRA / DoRA on pretrained bases. |
| 389 | -2. **Training is real.** | 302 | +3. **Retraining should not silently forget.** Replay-backed accumulation. |
| 390 | - LoRA / QLoRA / DoRA on pretrained bases, not a toy transformer. | 303 | +4. **Local-first is load-bearing.** Your data stays on your machine. |
| 391 | -3. **Retraining should not silently forget.** | 304 | +5. **Determinism is a contract.** Locks, pinned versions, golden checks. |
| 392 | - Replay-backed accumulation is part of the product. | | |
| 393 | -4. **Local-first is load-bearing.** | | |
| 394 | - Your training data, adapters, exports, and packs stay on your machine unless | | |
| 395 | - you explicitly move them. | | |
| 396 | -5. **Determinism is a contract.** | | |
| 397 | - If a change breaks the reproducibility story, that is a product regression. | | |
| 398 | | 305 | |
| 399 | ## Tech Stack | 306 | ## Tech Stack |
| 400 | | 307 | |
| 401 | -Python 3.11+ · PyTorch · HuggingFace `transformers` / `peft` / `trl` / | 308 | +Python 3.11+ · PyTorch · HuggingFace transformers / peft / trl / accelerate · |
| 402 | -`accelerate` / `datasets` · `watchfiles` · `prompt-toolkit` · `safetensors` · | 309 | +vendored llama.cpp for GGUF · Ollama · Typer · Pydantic · uv |
| 403 | -vendored `llama.cpp` for GGUF export · Ollama (optional runtime target) · | | |
| 404 | -Typer · Pydantic · `uv` | | |
| 405 | | 310 | |
| 406 | ## Contributing | 311 | ## Contributing |
| 407 | | 312 | |
| 408 | -See [CONTRIBUTING.md](./CONTRIBUTING.md). Testing conventions live in | 313 | +See [CONTRIBUTING.md](./CONTRIBUTING.md). |
| 409 | -[docs-internal/README-testing.md](./docs-internal/README-testing.md). | | |
| 410 | - | | |
| 411 | -```sh | | |
| 412 | -uv run pre-commit install | | |
| 413 | -``` | | |
| 414 | | 314 | |
| 415 | ## License | 315 | ## License |
| 416 | | 316 | |
| 417 | -MIT. Base-model licenses are separate and enforced where DLM needs them: | 317 | +MIT. Base-model licenses are separate and enforced at `dlm init`, `dlm train`, |
| 418 | -`dlm init`, `dlm train`, `dlm export`, and `dlm pack` all keep the gated-base | 318 | +`dlm export`, and `dlm pack`. |
| 419 | -acceptance path explicit. | | |