
# CLI reference

Generated from the running `dlm --help` output. Auto-regeneration via
`typer-cli` is planned for a follow-up sprint; until then this file is
hand-maintained and gated by the test suite.

## Global options

Applied to every subcommand:

| Option | Env var | Default | Description |
|---|---|---|---|
| `--home PATH` | `DLM_HOME` | `~/.dlm` | Override the store root. |
| `-v, --verbose` | — | off | Emit plan / resolver diagnostics on stderr. |
| `-q, --quiet` | — | off | Suppress informational output. |
| `--version` | — | — | Print version and exit. |
| `--install-completion` | — | — | Install shell completion. |
| `--show-completion` | — | — | Print shell completion script. |
| `-h, --help` | — | — | Show command help. |

## Commands

### `dlm init`

Bootstrap a new `.dlm` file with a fresh ULID, create the per-store
directory, and persist the license-acceptance record (audit-05 B2).

```
dlm init <path> [--base <key>] [--template <name>]
                [--multimodal | --audio]
                [--skip-export-probes]
                [--i-accept-license] [--force]
```

| Option | Default | Notes |
|---|---|---|
| `--base <key>` | `qwen2.5-1.5b` | Registry key or `hf:org/name`. Ignored when `--template` is used (the template's `recommended_base` wins). With `--multimodal`, defaults to `paligemma-3b-mix-224`. |
| `--template <name>` | None | Bootstrap from a named gallery template. See `dlm templates list`. Mutually exclusive with `--multimodal`. |
| `--skip-export-probes` | false | Skip the llama.cpp / GGUF compatibility probes so a brand-new architecture can still be used for training + HF inference. Forfeits `dlm export` to Ollama until the vendored exporter catches up. |
| `--multimodal` | false | Scaffold a vision-language `.dlm` with an `::image::` section (schema v10). Flips `--base` to `paligemma-3b-mix-224` unless explicitly overridden; a non-VL `--base` is refused. See [multimodal-training cookbook](../cookbook/multimodal-training.md). |
| `--audio` | false | Scaffold an audio-language `.dlm` with an `::audio::` section. Flips `--base` to `qwen2-audio-7b-instruct`, skips export probes, and refuses text / vision-language bases. See [audio-training cookbook](../cookbook/audio-training.md). |
| `--i-accept-license` | false | Required for gated bases (Llama-3.2, PaliGemma). |
| `--force` | false | Overwrite an existing `.dlm` at path. |

Writes `<path>` with minimal frontmatter, provisions
`~/.dlm/store/<dlm_id>/` with an initial `manifest.json`, and (for
gated bases) stores the `LicenseAcceptance` record so `dlm train` /
`dlm export` don't re-prompt. Refuses if the `.dlm` file already
exists and `--force` wasn't passed.
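
A few illustrative invocations (the file names are placeholders; flags and base names are as documented above):

```bash
# Text adapter on the default base
dlm init notes.dlm

# Vision-language scaffold; --base flips to paligemma-3b-mix-224,
# and the gated license must be accepted up front
dlm init captions.dlm --multimodal --i-accept-license

# Bootstrap from a gallery template, overwriting a stale file
dlm init tutor.dlm --template coding-tutor --force
```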

### `dlm train`

Train / retrain the adapter.

```
dlm train <path> [--resume|--fresh] [--seed N] [--max-steps N]
                 [--phase {sft,preference,all}]
                 [--i-accept-license]
                 [--strict-lock|--update-lock|--ignore-lock]
                 [--strict-metrics] [--gpus SPEC] [--no-cache]
                 [--watch] [--watch-max-steps N] [--watch-debounce-ms N]
                 [--base <key>] [--include GLOB]... [--exclude GLOB]...
                 [--recursive|--no-recursive] [--name NAME]
                 [--policy {strict,permissive}] [--rescaffold]
                 [--skip-export-probes]
```
| Option | Default | Notes |
|---|---|---|
| `--resume` | false | Continue from `training_state.pt`. Mutex with `--fresh`. |
| `--fresh` | false | Discard prior optimizer state; train from scratch. Mutex with `--resume`. Default when neither flag is set. |
| `--seed N` | frontmatter.training.seed | Override training seed. |
| `--max-steps N` | unlimited | Cap step count. |
| `--phase {sft,preference,all}` | `all` | Choose which training phases run: SFT only, preference only, or both in sequence. Preference-only requires a prior SFT adapter. |
| `--i-accept-license` | false | Required for gated bases (usually captured once at `dlm init` and persisted). |
| `--strict-lock` | false | Fail on any `dlm.lock` drift (even WARN). |
| `--update-lock` | false | Bypass validation; always write a fresh `dlm.lock`. |
| `--ignore-lock` | false | Bypass validation; don't write `dlm.lock`. |
| `--strict-metrics` | false | Promote metrics SQLite write failures to hard errors instead of best-effort degradation. Run-start / run-end are always hard-fail anchors; this flag extends that policy to step, eval, tokenization, and export streams. |
| `--gpus SPEC` | single-process | Multi-GPU training via Accelerate. `all` uses every visible CUDA device; `N` uses the first N; `0,1` selects exact device ids. Dispatches to `accelerate launch` when >1 device is selected. Refused on MPS/CPU/ROCm; heterogeneous CUDA SMs refused. |
| `--watch` | false | Save-to-train mode (Sprint 25). After the initial train, block on filesystem events and re-run bounded-step retrains on each settled save. |
| `--watch-max-steps N` | 100 | Per-cycle step cap for `--watch`. Keeps cycles responsive. |
| `--watch-debounce-ms N` | 400 | Quiet interval before a burst of saves triggers a retrain. |
| `--repl` | false | With `--watch`: also run `dlm repl` in the same process. **Scaffolded only** — threading integration is a follow-up; today the flag refuses with exit 2. |
| `--base <key>` | required on first auto-scaffold | Base model for `dlm train <dir>` auto-scaffold. Ignored when `<path>` already points at a `.dlm`. |
| `--include GLOB` | repeatable | Auto-scaffold include glob. Defaults to `**/*` with `--recursive`, `*` with `--no-recursive`. |
| `--exclude GLOB` | repeatable | Auto-scaffold exclude glob. Directory-descent defaults still apply on top. |
| `--recursive` / `--no-recursive` | recursive | Whether auto-scaffold default include globs descend into subdirectories. |
| `--name NAME` | `corpus` | Auto-scaffold target file name under `<dir>/.dlm/<name>.dlm`. Lets one tree host multiple adapters. |
| `--policy {strict,permissive}` | `strict` | Auto-scaffold `training.sources_policy`. `strict` confines training sources to the target directory; `permissive` allows absolute paths anywhere. |
| `--rescaffold` | false | Rewrite an existing scaffolded `.dlm` in place with new auto-scaffold flags while preserving its `dlm_id`. |
| `--no-cache` | false | Bypass the tokenized-section cache for this run. Default is cache-on when the `.dlm` declares `training.sources`. Use when debugging tokenization or cross-checking cached-vs-uncached determinism. Entries from prior runs stay on disk; the next run without the flag picks them back up. See [directive-cache](../cookbook/directive-cache.md). |
| `--skip-export-probes` | false | Skip the llama.cpp / GGUF compatibility probes so a brand-new architecture can still be trained for HF inference. Mirrors `dlm init --skip-export-probes`. |

The three lock flags are mutually exclusive. See [Determinism](../determinism.md)
for the mismatch severity table.

`--gpus` multiplies the effective batch size by `world_size`; the
resulting lock records `world_size` and warns on drift between runs.
Multi-GPU + QLoRA on CUDA is permitted (bitsandbytes supports DDP);
multi-GPU + ROCm is out of scope for Sprint 23.
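
A few representative runs, sketched with an illustrative document path:

```bash
# Deterministic retrain that fails on any dlm.lock drift
dlm train notes.dlm --fresh --seed 42 --strict-lock

# Resume on the first two CUDA devices with a step cap
dlm train notes.dlm --resume --gpus 2 --max-steps 500

# Save-to-train loop with a tighter per-cycle budget
dlm train notes.dlm --watch --watch-max-steps 50
```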

### `dlm prompt`

Run inference against the current adapter.

```
dlm prompt <path> [query] [--max-tokens N] [--temp F] [--top-p F]
                  [--adapter NAME] [--gate {auto,off}]
                  [--image PATH]... [--audio PATH]...
                  [--backend {auto,pytorch,mlx}] [--verbose]
```

| Option | Default | Notes |
|---|---|---|
| `--max-tokens N` | 256 | Max new tokens to generate. |
| `--temp F` | 0.7 | Temperature. `0.0` = greedy decoding (deterministic). |
| `--top-p F` | None | Top-p sampling. |
| `--adapter NAME` | None | Select a named adapter from `training.adapters`. Required on multi-adapter documents; rejected on single-adapter ones. |
| `--gate {auto,off}` | `auto` | Learned adapter gate (Sprint 34). `auto` uses the trained gate when one exists in the store; `off` forces uniform weights across declared adapters. Silently ignored when `--adapter` pins a single adapter. See `docs/cookbook/learned-adapter-gate.md`. |
| `--image PATH` | none | Attach an image to the prompt. Repeat for multiple images; each expands to one `<image>` placeholder the processor slots pixels into. Required on vision-language bases; rejected on text bases. See [multimodal-training cookbook](../cookbook/multimodal-training.md). |
| `--audio PATH` | none | Attach an audio clip to the prompt. Repeat for multiple clips. Required on audio-language bases; rejected on text and vision-language bases. See [audio-training cookbook](../cookbook/audio-training.md). |
| `--backend {auto,pytorch,mlx}` | `auto` | Inference backend. `auto` picks MLX on Apple Silicon (when `uv sync --extra mlx` is installed), else PyTorch. Ignored on VL bases (the VL path always uses PyTorch + `AutoModelForImageTextToText`). |
| `--verbose` | false | Print the resolved `InferencePlan` on stderr. |

The query is the positional argument; omit it to read from stdin.
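
Both invocation styles, sketched (the paths and adapter name are illustrative):

```bash
# Positional query, greedy decoding
dlm prompt notes.dlm "Summarize the changelog" --temp 0.0

# Query on stdin; pin one adapter on a multi-adapter document
echo "Explain the lock file" | dlm prompt notes.dlm --adapter knowledge

# Vision-language base: one --image per <image> placeholder
dlm prompt captions.dlm "Compare these" --image a.png --image b.png
```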

### `dlm repl`

Interactive prompt-and-respond REPL against the trained adapter
(Sprint 24).

```
dlm repl <path> [--adapter NAME] [--backend {auto,pytorch,mlx}]
```

| Option | Default | Notes |
|---|---|---|
| `--adapter NAME` | None | Named adapter; required on multi-adapter docs. |
| `--backend {auto,pytorch,mlx}` | `auto` | Same contract as `dlm prompt --backend`. |

Slash commands inside the REPL: `/help`, `/exit`, `/clear`, `/save`,
`/adapter`, `/params`, `/model`, `/history`. Ctrl-D exits; Ctrl-C
cancels generation or input. Session history persists at
`~/.dlm/history`. See the [interactive-session cookbook](../cookbook/interactive-session.md).

### `dlm metrics`

Query the per-store SQLite metrics DB (Sprint 26).

```
dlm metrics <path> [--json|--csv] [--run-id N] [--phase PHASE] [--since WINDOW] [--limit N]
dlm metrics <path> watch [--poll-seconds N]
```

| Option | Default | Notes |
|---|---|---|
| `--json` | false | Emit JSON object (`{runs: [...], steps: [...], evals: [...]}` when combined with `--run-id`). |
| `--csv` | false | Emit CSV of runs or (with `--run-id`) steps + evals. |
| `--run-id N` | None | Drill into one run; prints its step/eval counts. |
| `--phase` | None | Filter runs by phase (`sft`/`dpo`/`orpo`/`cpt`). |
| `--since` | None | Time window (`24h`, `7d`, `30m`, `10s`). |
| `--limit N` | 20 | Cap the number of runs returned. |

`dlm metrics <path> watch` polls the DB and tails new step/eval rows as
they arrive. See the [metrics cookbook](../cookbook/metrics.md) for
the full flow + optional TensorBoard / W&B sinks (`uv sync --extra
observability`).
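
A sketch of a typical drill-down (the document path and run id are illustrative; the `.steps` key follows the `--json` shape above, and the last filter assumes `jq` is installed):

```bash
# Last week's SFT runs, then the step count for one run
dlm metrics notes.dlm --phase sft --since 7d --json
dlm metrics notes.dlm --run-id 3 --json | jq '.steps | length'

# Tail new step/eval rows as a training run writes them
dlm metrics notes.dlm watch --poll-seconds 2
```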

### `dlm templates`

Browse the starter template gallery (Sprint 27).

```
dlm templates list [--json] [--refresh] [--accept-unsigned]
```

| Option | Default | Notes |
|---|---|---|
| `--json` | false | Emit the full `TemplateMeta` for each entry as JSON. |
| `--refresh` | false | Refresh from the upstream gallery. **Currently a no-op** — upstream repo and signing key are pending (Sprint 27 deferred polish); the command warns and falls back to the bundled gallery. |
| `--accept-unsigned` | false | Reserved. Will bypass signed-tag verification once the live fetcher is wired. |

Pair with `dlm init --template <name>` to create a new `.dlm`:

```bash
dlm init mydoc.dlm --template coding-tutor
```

See the [template-gallery cookbook](../cookbook/template-gallery.md)
for the full walkthrough and the `TemplateMeta` schema.

### `dlm export`

Produce GGUF files + Modelfile + register with Ollama.

```
dlm export <path> [--quant Q] [--merged [--dequantize]]
                  [--name N] [--no-template] [--skip-ollama]
                  [--no-smoke] [--no-imatrix] [--verbose]
                  [--draft TAG | --no-draft]
                  [--adapter NAME | --adapter-mix SPEC]
                  [--adapter-mix-method {linear,svd}]
```

| Option | Default | Notes |
|---|---|---|
| `--quant Q` | frontmatter.export.default_quant | `Q4_K_M` / `Q5_K_M` / `Q6_K` / `Q8_0` / `F16`. |
| `--merged` | false | Merge LoRA into base before quantizing. |
| `--dequantize` | false | Required with `--merged` on a QLoRA adapter (pitfall #3). |
| `--name N` | derived | Ollama model name. |
| `--no-template` | false | Skip writing `TEMPLATE` into the Modelfile (power users only — Ollama will fuzzy-match, which Sprint 12 deliberately works around). |
| `--skip-ollama` | false | Emit GGUFs but don't register. |
| `--no-smoke` | false | Register but skip the smoke prompt. |
| `--no-imatrix` | false | Opt out of imatrix-calibrated quantization. |
| `--verbose` | false | Surface preflight + conversion diagnostics. |
| `--draft TAG` | auto | Override the speculative-decoding draft model. |
| `--no-draft` | false | Disable speculative decoding. Mutex with `--draft`. |
| `--adapter NAME` | None | Export a single named adapter from `training.adapters`. Rejected on single-adapter documents. Mutex with `--adapter-mix`. |
| `--adapter-mix SPEC` | None | Weighted composition like `knowledge:1.0,tone:0.5`. Produces one Ollama model by merging the named adapters at export time. LoRA-only; QLoRA sources require `--dequantize --merged`. Mutex with `--adapter`. |
| `--adapter-mix-method` | `linear` | PEFT merge strategy: `linear` (default; fast weighted sum) or `svd` (higher fidelity, heavier compute). Only meaningful with `--adapter-mix`. |
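
Illustrative exports (the document, model, and adapter names are placeholders):

```bash
# Default quant from frontmatter, full pipeline through Ollama
dlm export notes.dlm

# Merged export of a QLoRA adapter (pitfall #3)
dlm export notes.dlm --merged --dequantize

# Weighted two-adapter composition under one Ollama name
dlm export notes.dlm --adapter-mix knowledge:1.0,tone:0.5 --name notes-mixed
```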

### `dlm pack`

Produce a portable `.dlm.pack` bundle.

```
dlm pack <path> [--out PATH] [--include-exports] [--include-base]
                [--include-logs] [--i-am-the-licensee URL]
```

| Option | Default | Notes |
|---|---|---|
| `--out PATH` | `<name>.dlm.pack` | Pack output. |
| `--include-exports` | false | Bundle all GGUF exports. |
| `--include-base` | false | Bundle the base model weights. Requires license acknowledgement for gated bases. |
| `--include-logs` | false | Bundle per-run JSONL logs. |
| `--i-am-the-licensee URL` | none | URL acknowledging separate base-license acceptance. |

### `dlm unpack`

Install a `.dlm.pack` into the local store.

```
dlm unpack <pack> [--force] [--out DIR]
```

| Option | Default | Notes |
|---|---|---|
| `--force` | false | Overwrite an existing store with the same `dlm_id`. |
| `--out DIR` | pack parent | Where to place the restored `.dlm`. |
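
A minimal pack/unpack roundtrip, with illustrative paths:

```bash
# Bundle the document plus its GGUF exports
dlm pack notes.dlm --include-exports --out notes.dlm.pack

# Restore on another machine, replacing any store with the same dlm_id
dlm unpack notes.dlm.pack --out ~/restored --force
```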

### `dlm verify`

Verify a `.dlm.pack` provenance chain before trusting or installing it.

```
dlm verify <pack> [--trust-on-first-use]
```

| Option | Default | Notes |
|---|---|---|
| `--trust-on-first-use` | false | Record an unknown signer's public key into `~/.dlm/trusted-keys/` on first verify. Without it, unknown signers are refused with exit code 2. |

Exit codes: `0` verified, `1` broken chain or missing provenance,
`2` untrusted signer, `3` signature rejected.
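
For scripting, the exit codes above can gate an install; a minimal sketch with an illustrative pack path:

```bash
dlm verify notes.dlm.pack --trust-on-first-use
case $? in
  0) dlm unpack notes.dlm.pack ;;
  1) echo "broken chain or missing provenance" >&2 ;;
  2) echo "untrusted signer" >&2 ;;
  3) echo "signature rejected" >&2 ;;
esac
```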

### `dlm push`

Upload a `.dlm` (auto-packs) or `.dlm.pack` to a sharing destination
(Sprint 28).

```
dlm push <path> --to <destination> [--sign] [pack flags]
```

| Option | Default | Notes |
|---|---|---|
| `--to <destination>` | required | `hf:<org>/<repo>`, `https://...` URL endpoint, or a local path. |
| `--sign` | false | Sign the pack with `minisign` before upload (requires `minisign` on PATH + key at `~/.dlm/minisign.key`). |
| `--include-exports` | false | Forwarded to `dlm pack` when auto-packing a `.dlm`. |
| `--include-base` | false | Same. |
| `--include-logs` | false | Same. |
| `--i-am-the-licensee URL` | none | Required with `--include-base` on a non-redistributable base. |

**Destinations:**

- `hf:<org>/<repo>` — HuggingFace Hub. Uses `$HF_TOKEN` if set. Autogenerates a `README.md` with `library_name: dlm` tag. Creates the repo if missing (your personal namespace needs no approval).
- `https://…` — any HTTPS endpoint that accepts a POST with an `application/octet-stream` body. Sets `Authorization:` from `$DLM_SHARE_AUTH` when present (e.g. `Bearer <token>`).
- `<local/path>` — copy the pack to a filesystem path.
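
One sketch per destination type (org, repo, endpoint, and token are placeholders):

```bash
# HuggingFace Hub, signed; picks up $HF_TOKEN when set
dlm push notes.dlm --to hf:myorg/notes-adapter --sign

# HTTPS endpoint; $DLM_SHARE_AUTH becomes the Authorization header
export DLM_SHARE_AUTH="Bearer <token>"
dlm push notes.dlm.pack --to https://models.example.com/upload

# Plain filesystem copy
dlm push notes.dlm --to /mnt/share/packs/
```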

### `dlm pull`

Download + verify + unpack a `.dlm.pack` from a remote source.

```
dlm pull <source> [--out DIR] [--force]
```

| Option | Default | Notes |
|---|---|---|
| `<source>` | required | `hf:<org>/<repo>`, `https://…`, `peer://host:port/<id>?token=…`, or a local path. |
| `--out DIR` | CWD | Directory for the restored `.dlm`. |
| `--force` | false | Overwrite an existing store with the same `dlm_id`. |

Pulls always verify sha256 checksums during unpack. If a `.minisig`
sidecar is served alongside the pack, `dlm pull` tries every key in
`~/.dlm/trusted-keys/*.pub` — match → `verified`, no match →
`unverified` warning (still installs, checksums are fine). No sidecar
→ `unsigned` (still installs).
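
Matching pulls, with placeholder sources (the `peer://` URL is pasted verbatim from `dlm serve`):

```bash
# From HuggingFace Hub into ./work
dlm pull hf:myorg/notes-adapter --out ./work

# From a LAN peer, replacing any store with the same dlm_id
dlm pull "peer://192.168.1.20:7337/<id>?token=<token>" --force
```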

### `dlm serve`

Serve a `.dlm`'s pack over LAN for peers to pull.

```
dlm serve <path> [--port N] [--public --i-know-this-is-public]
                 [--max-concurrency N] [--rate-limit N]
                 [--token-ttl-minutes N]
```

| Option | Default | Notes |
|---|---|---|
| `--port N` | 7337 | Bind port. |
| `--public` | false | Bind `0.0.0.0` **only when paired with** `--i-know-this-is-public`. Without the confirmation flag, `--public` logs a refusal and binds `127.0.0.1`. |
| `--i-know-this-is-public` | false | Acknowledges the public bind. Meaningless without `--public`. |
| `--max-concurrency N` | 4 | Max concurrent connections per token. Excess returns HTTP 429. |
| `--rate-limit N` | 30 | Max requests per minute per token. |
| `--token-ttl-minutes N` | 15 | Issued token lifetime. Ctrl-C invalidates every outstanding token instantly — the session secret lives only in the serving process. |

On start, prints the `peer://` URL (with embedded token) that the
other side pastes into `dlm pull`. Ctrl-C cleanly stops the server
and deletes the temp pack.
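
A LAN handoff sketched end to end (host and TTL are illustrative):

```bash
# Serving side: LAN-only bind with a short token lifetime
dlm serve notes.dlm --port 7337 --token-ttl-minutes 5

# Pulling side: paste the printed URL within the TTL window
dlm pull "peer://<host>:7337/<id>?token=<token>"
```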

### `dlm doctor`

Inspect hardware + print the resolved training plan.

```
dlm doctor [--json]
```

No network access: probes torch + psutil only and refuses to go online.

### `dlm show`

Show training history, exports, and adapter state for a document.

```
dlm show <path> [--json]
```

Pretty-prints manifest + lock state. `--json` emits machine-readable
output.

### `dlm migrate`

Migrate a `.dlm`'s frontmatter to the current schema version.

```
dlm migrate <path> [--dry-run] [--no-backup]
```

| Option | Default | Notes |
|---|---|---|
| `--dry-run` | false | Print the migrated frontmatter without writing. |
| `--no-backup` | false | Skip the `.dlm.bak` backup. |
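
A cautious upgrade, previewing before writing (the path is illustrative):

```bash
dlm migrate notes.dlm --dry-run   # inspect the migrated frontmatter first
dlm migrate notes.dlm             # writes a .dlm.bak backup, then migrates
```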

### `dlm cache`

Inspect and manage the per-store tokenized-section cache (Sprint 31).
The cache speeds up re-training on directive-sourced codebases by
keying tokenized output on `(section_id, tokenizer_sha, sequence_len)`.

```
dlm cache show <path> [--json]
dlm cache prune <path> [--older-than DURATION]
dlm cache clear <path> [--force]
```

| Subcommand | Notes |
|---|---|
| `show` | Print entry count, size on disk, last-run hit rate. `--json` for machine-readable output. |
| `prune` | Delete entries not accessed within `--older-than` (e.g. `30d`, `12h`, `45m`). Default `90d`. |
| `clear` | Wipe the entire cache. Prompts for confirmation unless `--force` is passed. |

See `docs/cookbook/directive-cache.md` for tuning, invalidation
triggers, and maintenance patterns.
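
Typical maintenance, with an illustrative document path:

```bash
dlm cache show notes.dlm --json            # entry count, size, hit rate
dlm cache prune notes.dlm --older-than 30d # drop entries idle for a month
dlm cache clear notes.dlm --force          # wipe without the confirmation prompt
```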

### `dlm harvest`

Pull failing-probe results from a sway-style eval report back into the
document as `!probe`-tagged `::instruction::` sections for the next
retrain. See `docs/cookbook/probe-driven-training.md`.

```
dlm harvest <path> --sway-json <report> [--apply] [--dry-run]
                   [--tag NAME] [--min-confidence F]
                   [--strict | --lax]
dlm harvest <path> --revert
```

| Option | Default | Notes |
|---|---|---|
| `--sway-json PATH` | required | Path to the sway probe report JSON. |
| `--apply` | false | Write changes to disk. Without it, dry-run. |
| `--dry-run` | true | Print the diff; no writes. |
| `--revert` | — | Strip all `auto_harvest=True` sections; mutually exclusive with `--sway-json`. |
| `--tag NAME` | `auto-harvest` | Provenance tag written into `harvest_source`. |
| `--min-confidence F` | `0.0` | Skip candidates below this confidence. |
| `--strict` / `--lax` | lax | Strict: fail if any failing probe lacks a reference. Lax: skip + log. |

Exit codes: `0` success, `1` validation error (malformed JSON, strict
miss, mutual-exclusion violation), `2` no candidates to harvest.
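
A conservative loop, with illustrative paths and threshold: preview first, then apply only confident candidates.

```bash
# Dry-run is the default; inspect the diff
dlm harvest notes.dlm --sway-json report.json

# Apply confident candidates, failing on unreferenced probes
dlm harvest notes.dlm --sway-json report.json --apply --min-confidence 0.8 --strict

# Later, strip every auto-harvested section
dlm harvest notes.dlm --revert
```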

### `dlm train --listen-rpc`

During `--watch`, open a JSON-RPC endpoint that accepts `inject_probe`
pushes from external eval harnesses. Requires `DLM_PROBE_TOKEN` in the
environment. See `docs/cookbook/probe-driven-training.md` for the wire
protocol and security notes.

| Option | Default | Notes |
|---|---|---|
| `--listen-rpc HOST:PORT` | off | Bind the probe-RPC endpoint. Requires `--watch` or `--max-cycles`. |
| `--max-cycles N` | `0` | Bounded-loop alternative to `--watch` for convergence runs (scaffolded — currently refuses execution without `--watch`). |
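
Opening the endpoint during a watch loop, sketched with a placeholder secret, host, and port (the `inject_probe` wire format lives in the cookbook):

```bash
export DLM_PROBE_TOKEN="<shared-secret>"
dlm train notes.dlm --watch --listen-rpc 127.0.0.1:8765
```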

## Exit codes

| Code | Meaning |
|---|---|
| 0 | Success. |
| 1 | Runtime failure (license refused, disk full, OOM, template drift, lock validation). |
| 2 | CLI misuse (mutex violation, missing argument). |

Domain errors are formatted consistently via bare `console.print`
calls in each subcommand (prefix convention: `<subject>: <message>`,
e.g. `lock: base_model_revision changed`). Uncaught exceptions escape
into `dlm.cli.reporter`, which picks a matching prefix from the
module the exception came from and renders a tier-3 generic message.
