# CLI reference

Generated from the running `dlm --help` output. Auto-regeneration via
`typer-cli` is planned for a follow-up sprint; until then this file is
hand-maintained and gated by the test suite.

## Global options

Applied to every subcommand:

| Option | Env var | Default | Description |
|---|---|---|---|
| `--home PATH` | `DLM_HOME` | `~/.dlm` | Override the store root. |
| `-v, --verbose` | — | off | Emit plan / resolver diagnostics on stderr. |
| `-q, --quiet` | — | off | Suppress informational output. |
| `--version` | — | — | Print version and exit. |
| `--install-completion` | — | — | Install shell completion. |
| `--show-completion` | — | — | Print shell completion script. |
| `-h, --help` | — | — | Show command help. |

## Commands

### `dlm init`

Bootstrap a new `.dlm` file with a fresh ULID, create the per-store
directory, and persist the license-acceptance record (audit-05 B2).

```
dlm init <path> [--base <key>] [--template <name>]
                [--multimodal | --audio]
                [--skip-export-probes]
                [--i-accept-license] [--force]
```

| Option | Default | Notes |
|---|---|---|
| `--base <key>` | `qwen2.5-1.5b` | Registry key or `hf:org/name`. Ignored when `--template` is used (the template's `recommended_base` wins). With `--multimodal`, defaults to `paligemma-3b-mix-224`. |
| `--template <name>` | None | Bootstrap from a named gallery template. See `dlm templates list`. Mutually exclusive with `--multimodal`. |
| `--skip-export-probes` | false | Skip the llama.cpp / GGUF compatibility probes so a brand-new architecture can still be used for training + HF inference. Forfeits `dlm export` to Ollama until the vendored exporter catches up. |
| `--multimodal` | false | Scaffold a vision-language `.dlm` with an `::image::` section (schema v10). Flips `--base` to `paligemma-3b-mix-224` unless explicitly overridden; a non-VL `--base` is refused. See [multimodal-training cookbook](../cookbook/multimodal-training.md). |
| `--audio` | false | Scaffold an audio-language `.dlm` with an `::audio::` section. Flips `--base` to `qwen2-audio-7b-instruct`, skips export probes, and refuses text / vision-language bases. See [audio-training cookbook](../cookbook/audio-training.md). |
| `--i-accept-license` | false | Required for gated bases (Llama-3.2, PaliGemma). |
| `--force` | false | Overwrite an existing `.dlm` at path. |

Writes `<path>` with minimal frontmatter, provisions
`~/.dlm/store/<dlm_id>/` with an initial `manifest.json`, and (for
gated bases) stores the `LicenseAcceptance` record so `dlm train` /
`dlm export` don't re-prompt. Refuses if the `.dlm` file already
exists and `--force` wasn't passed.
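
A few illustrative invocations (the file names are placeholders; the
flags are the ones documented above):

```bash
# Plain text document on the default base
dlm init notes.dlm

# Vision-language document; --base flips to paligemma-3b-mix-224 (gated)
dlm init captions.dlm --multimodal --i-accept-license

# Bootstrap from a gallery template, overwriting a prior file
dlm init tutor.dlm --template coding-tutor --force
```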

### `dlm train`

Train / retrain the adapter.

```
dlm train <path> [--resume|--fresh] [--seed N] [--max-steps N]
                 [--phase {sft,preference,all}]
                 [--i-accept-license]
                 [--strict-lock|--update-lock|--ignore-lock]
                 [--strict-metrics] [--gpus SPEC]
                 [--watch] [--watch-max-steps N] [--watch-debounce-ms N]
                 [--repl] [--no-cache]
                 [--base <key>] [--include GLOB]... [--exclude GLOB]...
                 [--recursive|--no-recursive] [--name NAME]
                 [--policy {strict,permissive}] [--rescaffold]
                 [--skip-export-probes]
```

| Option | Default | Notes |
|---|---|---|
| `--resume` | false | Continue from `training_state.pt`. Mutex with `--fresh`. |
| `--fresh` | false | Discard prior optimizer state; train from scratch. Mutex with `--resume`. Default when neither flag is set. |
| `--seed N` | frontmatter.training.seed | Override training seed. |
| `--max-steps N` | unlimited | Cap step count. |
| `--phase {sft,preference,all}` | `all` | Choose which training phases run: SFT only, preference only, or both in sequence. Preference-only requires a prior SFT adapter. |
| `--i-accept-license` | false | Required for gated bases (usually captured once at `dlm init` and persisted). |
| `--strict-lock` | false | Fail on any `dlm.lock` drift (even WARN). |
| `--update-lock` | false | Bypass validation; always write a fresh `dlm.lock`. |
| `--ignore-lock` | false | Bypass validation; don't write `dlm.lock`. |
| `--strict-metrics` | false | Promote metrics SQLite write failures to hard errors instead of best-effort degradation. Run-start / run-end are always hard-fail anchors; this flag extends that policy to step, eval, tokenization, and export streams. |
| `--gpus SPEC` | single-process | Multi-GPU training via Accelerate. `all` uses every visible CUDA device; `N` uses the first N; `0,1` selects exact device ids. Dispatches to `accelerate launch` when >1 device is selected. Refused on MPS/CPU/ROCm; heterogeneous CUDA SMs refused. |
| `--watch` | false | Save-to-train mode (Sprint 25). After the initial train, block on filesystem events and re-run bounded-step retrains on each settled save. |
| `--watch-max-steps N` | 100 | Per-cycle step cap for `--watch`. Keeps cycles responsive. |
| `--watch-debounce-ms N` | 400 | Quiet interval before a burst of saves triggers a retrain. |
| `--repl` | false | With `--watch`: also run `dlm repl` in the same process. **Scaffolded only** — threading integration is a follow-up; today the flag refuses with exit 2. |
| `--base <key>` | required on first auto-scaffold | Base model for `dlm train <dir>` auto-scaffold. Ignored when `<path>` already points at a `.dlm`. |
| `--include GLOB` | repeatable | Auto-scaffold include glob. Defaults to `**/*` with `--recursive`, `*` with `--no-recursive`. |
| `--exclude GLOB` | repeatable | Auto-scaffold exclude glob. Directory-descent defaults still apply on top. |
| `--recursive` / `--no-recursive` | recursive | Whether the default auto-scaffold include globs descend into subdirectories. |
| `--name NAME` | `corpus` | Auto-scaffold target file name under `<dir>/.dlm/<name>.dlm`. Lets one tree host multiple adapters. |
| `--policy {strict,permissive}` | `strict` | Auto-scaffold `training.sources_policy`. `strict` confines training sources to the target directory; `permissive` allows absolute paths anywhere. |
| `--rescaffold` | false | Rewrite an existing scaffolded `.dlm` in place with new auto-scaffold flags while preserving its `dlm_id`. |
| `--no-cache` | false | Bypass the tokenized-section cache for this run. Default is cache-on when the `.dlm` declares `training.sources`. Use when debugging tokenization or cross-checking cached-vs-uncached determinism. Entries from prior runs stay on disk; the next run without the flag picks them back up. See [directive-cache](../cookbook/directive-cache.md). |
| `--skip-export-probes` | false | Skip the llama.cpp / GGUF compatibility probes so a brand-new architecture can still be trained for HF inference. Mirrors `dlm init --skip-export-probes`. |

The three lock flags are mutually exclusive. See [Determinism](../determinism.md)
for the mismatch severity table.

`--gpus` multiplies the effective batch size by `world_size`; the
resulting lock records `world_size` and warns on drift between runs.
Multi-GPU + QLoRA on CUDA is permitted (bitsandbytes supports DDP);
multi-GPU + ROCm is out of scope for Sprint 23.
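
Some representative runs, with placeholder paths (the flag
combinations follow the table above):

```bash
# First train (equivalent to --fresh when neither flag is given)
dlm train notes.dlm

# Resume with a hard step cap and strict lock checking
dlm train notes.dlm --resume --max-steps 500 --strict-lock

# Auto-scaffold a directory into <dir>/.dlm/corpus.dlm and train it
dlm train ./myrepo --base qwen2.5-1.5b --include "**/*.md"

# Save-to-train loop with a tighter per-cycle cap
dlm train notes.dlm --watch --watch-max-steps 50

# Two specific CUDA devices via Accelerate
dlm train notes.dlm --gpus 0,1
```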

### `dlm prompt`

Run inference against the current adapter.

```
dlm prompt <path> [query] [--max-tokens N] [--temp F] [--top-p F]
                  [--adapter NAME] [--gate {auto,off}]
                  [--backend {auto,pytorch,mlx}]
                  [--image PATH]... [--audio PATH]...
                  [--verbose]
```

| Option | Default | Notes |
|---|---|---|
| `--max-tokens N` | 256 | Max new tokens to generate. |
| `--temp F` | 0.7 | Temperature. `0.0` = greedy decoding (deterministic). |
| `--top-p F` | None | Top-p sampling. |
| `--adapter NAME` | None | Select a named adapter from `training.adapters`. Required on multi-adapter documents; rejected on single-adapter ones. |
| `--gate {auto,off}` | `auto` | Learned adapter gate (Sprint 34). `auto` uses the trained gate when one exists in the store; `off` forces uniform weights across declared adapters. Silently ignored when `--adapter` pins a single adapter. See `docs/cookbook/learned-adapter-gate.md`. |
| `--image PATH` | none | Attach an image to the prompt. Repeat for multiple images; each expands to one `<image>` placeholder the processor slots pixels into. Required on vision-language bases; rejected on text bases. See [multimodal-training cookbook](../cookbook/multimodal-training.md). |
| `--audio PATH` | none | Attach an audio clip to the prompt. Repeat for multiple clips. Required on audio-language bases; rejected on text and vision-language bases. See [audio-training cookbook](../cookbook/audio-training.md). |
| `--backend {auto,pytorch,mlx}` | `auto` | Inference backend. `auto` picks MLX on Apple Silicon (when the MLX extra from `uv sync --extra mlx` is installed), else PyTorch. Ignored on VL bases (the VL path always uses PyTorch + `AutoModelForImageTextToText`). |
| `--verbose` | false | Print the resolved `InferencePlan` on stderr. |

Query is the CLI positional argument. Omit to read from stdin.
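
Illustrative calls (document and media paths are placeholders):

```bash
# One-shot query
dlm prompt notes.dlm "Summarize the testing strategy."

# Read the query from stdin with greedy decoding
echo "List the lock severities." | dlm prompt notes.dlm --temp 0.0

# Vision-language document: one <image> placeholder per --image flag
dlm prompt captions.dlm "What is in these photos?" --image a.jpg --image b.jpg
```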

### `dlm repl`

Interactive prompt-and-respond REPL against the trained adapter
(Sprint 24).

```
dlm repl <path> [--adapter NAME] [--backend {auto,pytorch,mlx}]
```

| Option | Default | Notes |
|---|---|---|
| `--adapter NAME` | None | Named adapter; required on multi-adapter docs. |
| `--backend {auto,pytorch,mlx}` | `auto` | Same contract as `dlm prompt --backend`. |

Slash commands inside the REPL: `/help`, `/exit`, `/clear`, `/save`,
`/adapter`, `/params`, `/model`, `/history`. Ctrl-D exits; Ctrl-C
cancels generation or input. Session history persists at
`~/.dlm/history`. See the [interactive-session cookbook](../cookbook/interactive-session.md).

### `dlm metrics`

Query the per-store SQLite metrics DB (Sprint 26).

```
dlm metrics <path> [--json|--csv] [--run-id N] [--phase PHASE] [--since WINDOW] [--limit N]
dlm metrics <path> watch [--poll-seconds N]
```

| Option | Default | Notes |
|---|---|---|
| `--json` | false | Emit JSON object (`{runs: [...], steps: [...], evals: [...]}` when combined with `--run-id`). |
| `--csv` | false | Emit CSV of runs or (with `--run-id`) steps + evals. |
| `--run-id N` | None | Drill into one run; prints its step/eval counts. |
| `--phase` | None | Filter runs by phase (`sft`/`dpo`/`orpo`/`cpt`). |
| `--since` | None | Time window (`24h`, `7d`, `30m`, `10s`). |
| `--limit N` | 20 | Cap the number of runs returned. |

`dlm metrics <path> watch` polls the DB and tails new step/eval rows as
they arrive. See the [metrics cookbook](../cookbook/metrics.md) for
the full flow + optional TensorBoard / W&B sinks (`uv sync --extra observability`).
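
Typical queries, assuming a document named `notes.dlm` (the name is a
placeholder):

```bash
# Last 20 runs, human-readable
dlm metrics notes.dlm

# SFT runs from the past week, as JSON
dlm metrics notes.dlm --phase sft --since 7d --json

# Drill into run 3 and export its steps + evals as CSV
dlm metrics notes.dlm --run-id 3 --csv

# Tail new rows as a training run writes them
dlm metrics notes.dlm watch
```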

### `dlm preference`

Mine, stage, apply, revert, and inspect auto-mined preference
sections (Sprint 42).

```
dlm preference mine <path> [--samples N] [--judge J] [--threshold F]
                           [--max-pairs N] [--temp F] [--top-p F]
                           [--backend {auto,pytorch,mlx}] [--adapter NAME]
                           [--apply]
dlm preference apply <path>
dlm preference revert <path>
dlm preference list <path>
```

| Option | Default | Notes |
|---|---|---|
| `--samples N` | `4` | Candidate responses to sample per instruction prompt. Minimum `2`. |
| `--judge J` | `sway` | Judge selector: `sway`, `hf:<model>`, or `cli:<cmd>`. |
| `--threshold F` | judge default | Minimum chosen-vs-rejected score margin. Defaults to the selected judge's native threshold (`0.1` for sway, `1.0` for HF reward models). |
| `--max-pairs N` | unlimited | Cap the number of mined pairs kept from one run. |
| `--temp F` | `0.7` | Sampling temperature for candidate generation. |
| `--top-p F` | None | Optional top-p cutoff for candidate generation. |
| `--backend {auto,pytorch,mlx}` | `auto` | Generation backend. Follows the same selection contract as `dlm prompt`. |
| `--adapter NAME` | None | Required on multi-adapter documents so mining knows which adapter to sample. `--judge sway` is currently refused there; use `hf:` or `cli:` judges instead. |
| `--apply` | false | Write mined sections directly to the `.dlm`. Without it, `mine` stages the plan under the store root for `dlm preference apply`. |

`dlm preference mine` requires a prior training run because the mined
sections carry `mined_run_id` provenance. By default it stages the
auto-mined `::preference::` sections under the store and prints the
plan; `dlm preference apply` writes the staged plan into the `.dlm`,
`dlm preference revert` strips every `auto_mined: true` preference
section, and `dlm preference list` shows both applied and staged
sections.
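
The staged flow end to end, with a placeholder document name:

```bash
# Stage a mining plan (requires a prior training run)
dlm preference mine notes.dlm --samples 4 --max-pairs 20

# Inspect staged vs applied sections, then commit the plan
dlm preference list notes.dlm
dlm preference apply notes.dlm

# Later: strip every section marked auto_mined: true
dlm preference revert notes.dlm
```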

### `dlm synth`

Synthesize instruction or preference training data (Sprint 43).

```
dlm synth instructions <path> [--teacher T] [--per-section N]
                              [--strategy {extraction,expansion,both}]
                              [--filter {sway,none,dedup-only}]
                              [--threshold F] [--max-pairs N]
                              [--max-new-tokens N] [--temp F] [--top-p F]
                              [--seed N] [--apply | --dry-run]
dlm synth preferences <path> [--samples N] [--judge J] [--threshold F]
                             [--max-pairs N] [--temp F] [--top-p F]
                             [--backend {auto,pytorch,mlx}] [--adapter NAME]
                             [--apply]
dlm synth revert <path>
dlm synth list <path>
```

| Option | Default | Notes |
|---|---|---|
| `--teacher T` | `self` | Teacher selector: `self`, `hf:<model>`, `openai:<model>`, `anthropic:<model>`, or `vllm-server:<url>`. |
| `--per-section N` | `3` | Accepted instruction pairs to request per prose section before filtering. |
| `--strategy {extraction,expansion,both}` | `extraction` | `extraction` asks for questions answered directly by the prose, `expansion` extrapolates beyond it, and `both` splits the per-section budget across both prompts. |
| `--filter {sway,none,dedup-only}` | `sway` | Filter pipeline after generation. `sway` reuses Sprint 42's judge, `dedup-only` keeps near-duplicate suppression but skips judging, `none` accepts every deduped pair. |
| `--threshold F` | judge default | Minimum sway-judge margin. Only valid with `--filter sway`. |
| `--max-pairs N` | unlimited | Cap the number of accepted synth pairs from one invocation. |
| `--max-new-tokens N` | `512` | Teacher-side completion cap per prompt. |
| `--temp F` | `0.0` | Teacher sampling temperature. |
| `--top-p F` | None | Optional top-p cutoff for teacher sampling. |
| `--seed N` | None | Optional teacher sampling seed. |
| `--apply` | false | Write accepted auto-synth `::instruction::` sections directly to the `.dlm`. |
| `--dry-run` | false | Preview the synth plan without staging or writing anything. Default behavior stages the accepted plan under the store for inspection via `dlm synth list`. |

`dlm synth instructions` prints the raw synth plan, then the filter
summary (`generated`, `dedup`, `judge passed`, `threshold/accepted`).
Without `--apply` or `--dry-run`, the accepted auto-synth
`::instruction::` sections are staged under the store root so
`dlm synth list` can show them before a later rerun. `dlm synth revert`
strips every `auto_synth: true` instruction section from the document.

`dlm synth preferences` is an alias over `dlm preference mine` for the
same Sprint 42 preference-mining loop. Use it when you want the
umbrella synth surface but the output should be `::preference::`
sections instead of `::instruction::` sections.
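
Two sketches of the instruction loop; the teacher model id is
illustrative, not a recommendation:

```bash
# Preview what the self-teacher would generate, writing nothing
dlm synth instructions notes.dlm --dry-run

# API teacher, both strategies, accepted pairs written directly
dlm synth instructions notes.dlm --teacher openai:gpt-4o \
    --strategy both --per-section 5 --apply
```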

### `dlm templates`

Browse the starter template gallery (Sprint 27).

```
dlm templates list [--json] [--refresh] [--accept-unsigned]
```

| Option | Default | Notes |
|---|---|---|
| `--json` | false | Emit the full `TemplateMeta` for each entry as JSON. |
| `--refresh` | false | Refresh from the upstream gallery. **Currently a no-op** — upstream repo and signing key are pending (Sprint 27 deferred polish); the command warns and falls back to the bundled gallery. |
| `--accept-unsigned` | false | Reserved. Will bypass signed-tag verification once the live fetcher is wired. |

Pair with `dlm init --template <name>` to create a new `.dlm`:

```bash
dlm init mydoc.dlm --template coding-tutor
```

See the [template-gallery cookbook](../cookbook/template-gallery.md)
for the full walkthrough and the `TemplateMeta` schema.

### `dlm export`

Produce GGUF files + runtime-target metadata.

```
dlm export <path> [--target NAME] [--quant Q] [--merged [--dequantize]]
                  [--name N] [--no-template] [--skip-ollama]
                  [--no-smoke] [--no-imatrix] [--verbose]
                  [--draft TAG | --no-draft]
                  [--adapter NAME | --adapter-mix SPEC]
                  [--adapter-mix-method {linear,svd}]
```

| Option | Default | Notes |
|---|---|---|
| `--target NAME` | `ollama` | Export destination. Sprint 41 currently supports `ollama`, `llama-server`, `vllm`, and `mlx-serve`. The `llama-server` path writes launch artifacts against the existing GGUF export and uses the shared OpenAI-compatible HTTP smoke harness. The `vllm` path writes `vllm_launch.sh` + `vllm_config.json` against the local adapter layout and ignores GGUF-only flags. On Apple Silicon, the generated `vllm` launch path forces the documented low-risk `vllm-metal` settings (`VLLM_METAL_USE_PAGED_ATTENTION=0`, `VLLM_METAL_MEMORY_FRACTION=auto`) and caps `--max-model-len` to the document's `training.sequence_len`. The `mlx-serve` path is Apple Silicon only, writes `mlx_serve_launch.sh` plus a staged MLX adapter directory, and currently supports text bases only. |
| `--quant Q` | frontmatter.export.default_quant | `Q4_K_M` / `Q5_K_M` / `Q6_K` / `Q8_0` / `F16`. |
| `--merged` | false | Merge LoRA into base before quantizing. |
| `--dequantize` | false | Required with `--merged` on a QLoRA adapter (pitfall #3). |
| `--name N` | derived | Ollama model name. |
| `--no-template` | false | Skip writing `TEMPLATE` into the Modelfile (power users only — Ollama will fuzzy-match, which Sprint 12 deliberately works around). |
| `--skip-ollama` | false | Emit GGUFs but don't register. |
| `--no-smoke` | false | Register but skip the smoke prompt. |
| `--no-imatrix` | false | Opt out of imatrix-calibrated quantization. |
| `--verbose` | false | Surface preflight + conversion diagnostics. |
| `--draft TAG` | auto | Override the speculative-decoding draft model. |
| `--no-draft` | false | Disable speculative decoding. Mutex with `--draft`. |
| `--adapter NAME` | None | Export a single named adapter from `training.adapters`. Rejected on single-adapter documents. Mutex with `--adapter-mix`. |
| `--adapter-mix SPEC` | None | Weighted composition like `knowledge:1.0,tone:0.5`. Produces one Ollama model by merging the named adapters at export time. LoRA-only; QLoRA sources require `--dequantize --merged`. Mutex with `--adapter`. |
| `--adapter-mix-method` | `linear` | PEFT merge strategy: `linear` (default; fast weighted sum) or `svd` (higher fidelity, heavier compute). Only meaningful with `--adapter-mix`. |
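
A few example exports against a placeholder document (flag
combinations per the table above):

```bash
# Default: quantize per frontmatter and register with Ollama
dlm export notes.dlm

# Merge a QLoRA adapter into the base before quantizing (pitfall #3)
dlm export notes.dlm --merged --dequantize --quant Q5_K_M

# Compose two named adapters into one Ollama model
dlm export notes.dlm --adapter-mix knowledge:1.0,tone:0.5 --adapter-mix-method svd

# Emit vLLM launch artifacts instead of an Ollama registration
dlm export notes.dlm --target vllm
```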

### `dlm pack`

Produce a portable `.dlm.pack` bundle.

```
dlm pack <path> [--out PATH] [--include-exports] [--include-base]
                [--include-logs] [--i-am-the-licensee URL]
```

| Option | Default | Notes |
|---|---|---|
| `--out PATH` | `<name>.dlm.pack` | Pack output. |
| `--include-exports` | false | Bundle all GGUF exports. |
| `--include-base` | false | Bundle the base model weights. Requires license acknowledgement for gated bases. |
| `--include-logs` | false | Bundle per-run JSONL logs. |
| `--i-am-the-licensee URL` | none | URL acknowledging separate base-license acceptance. |

### `dlm unpack`

Install a `.dlm.pack` into the local store.

```
dlm unpack <pack> [--force] [--out DIR]
```

| Option | Default | Notes |
|---|---|---|
| `--force` | false | Overwrite an existing store with the same `dlm_id`. |
| `--out DIR` | pack parent | Where to place the restored `.dlm`. |

### `dlm verify`

Verify a `.dlm.pack` provenance chain before trusting or installing it.

```
dlm verify <pack> [--trust-on-first-use]
```

| Option | Default | Notes |
|---|---|---|
| `--trust-on-first-use` | false | Record an unknown signer's public key into `~/.dlm/trusted-keys/` on first verify. Without it, unknown signers are refused with exit code 2. |

Exit codes: `0` verified, `1` broken chain or missing provenance,
`2` untrusted signer, `3` signature rejected.

### `dlm push`

Upload a `.dlm` (auto-packs) or `.dlm.pack` to a sharing destination
(Sprint 28).

```
dlm push <path> --to <destination> [--sign] [pack flags]
```

| Option | Default | Notes |
|---|---|---|
| `--to <destination>` | required | `hf:<org>/<repo>`, `https://...` URL endpoint, or a local path. |
| `--sign` | false | Sign the pack with `minisign` before upload (requires `minisign` on PATH + key at `~/.dlm/minisign.key`). |
| `--include-exports` | false | Forwarded to `dlm pack` when auto-packing a `.dlm`. |
| `--include-base` | false | Same. |
| `--include-logs` | false | Same. |
| `--i-am-the-licensee URL` | none | Required with `--include-base` on a non-redistributable base. |

**Destinations:**

- `hf:<org>/<repo>` — HuggingFace Hub. Uses `$HF_TOKEN` if set. Autogenerates a `README.md` with a `library_name: dlm` tag. Creates the repo if missing (your personal namespace needs no approval).
- `https://…` — any HTTPS endpoint that accepts a POST with an `application/octet-stream` body. Sets `Authorization:` from `$DLM_SHARE_AUTH` when present (e.g. `Bearer <token>`).
- `<local/path>` — copy the pack to a filesystem path.
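
Example pushes; the repo name, endpoint URL, and token are
placeholders:

```bash
# Push to a HuggingFace repo, signing the pack first
dlm push notes.dlm --to hf:myorg/notes-dlm --sign

# POST the pack to an HTTPS endpoint with a bearer token
DLM_SHARE_AUTH="Bearer <token>" dlm push notes.dlm --to https://share.example.com/upload
```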

### `dlm pull`

Download + verify + unpack a `.dlm.pack` from a remote source.

```
dlm pull <source> [--out DIR] [--force]
```

| Option | Default | Notes |
|---|---|---|
| `<source>` | required | `hf:<org>/<repo>`, `https://…`, `peer://host:port/<id>?token=…`, or a local path. |
| `--out DIR` | CWD | Directory for the restored `.dlm`. |
| `--force` | false | Overwrite an existing store with the same `dlm_id`. |

Pulls always verify sha256 checksums during unpack. If a `.minisig`
sidecar is served alongside the pack, `dlm pull` tries every key in
`~/.dlm/trusted-keys/*.pub` — match → `verified`, no match →
`unverified` warning (still installs, checksums are fine). No sidecar
→ `unsigned` (still installs).
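
Example pulls; the repo and peer URL are placeholders (the peer URL
comes verbatim from `dlm serve` output):

```bash
# Pull from HuggingFace into the current directory
dlm pull hf:myorg/notes-dlm

# Pull from a peer session printed by dlm serve
dlm pull "peer://192.168.1.20:7337/<id>?token=…" --out ./pulled
```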

### `dlm serve`

Serve a `.dlm`'s pack over LAN for peers to pull.

```
dlm serve <path> [--port N] [--public --i-know-this-is-public]
                 [--max-concurrency N] [--rate-limit N]
                 [--token-ttl-minutes N]
```

| Option | Default | Notes |
|---|---|---|
| `--port N` | 7337 | Bind port. |
| `--public` | false | Bind `0.0.0.0` **only when paired with** `--i-know-this-is-public`. Without the confirmation flag, `--public` logs a refusal and binds `127.0.0.1`. |
| `--i-know-this-is-public` | false | Acknowledges the public bind. Meaningless without `--public`. |
| `--max-concurrency N` | 4 | Max concurrent connections per token. Excess returns HTTP 429. |
| `--rate-limit N` | 30 | Max requests per minute per token. |
| `--token-ttl-minutes N` | 15 | Issued token lifetime. Ctrl-C invalidates every outstanding token instantly — the session secret lives only in the serving process. |

On start, prints the `peer://` URL (with embedded token) that the
other side pastes into `dlm pull`. Ctrl-C cleanly stops the server
and deletes the temp pack.
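
For example (placeholder document name):

```bash
# Serve on the default port; prints a peer:// URL with an embedded token
dlm serve notes.dlm

# LAN-wide bind with explicit acknowledgement and tighter limits
dlm serve notes.dlm --public --i-know-this-is-public \
    --rate-limit 10 --token-ttl-minutes 5
```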

### `dlm doctor`

Inspect hardware + print the resolved training plan.

```
dlm doctor [--json]
```

No-network. Probes `torch` + `psutil` only; refuses to go online.

### `dlm show`

Show training history, exports, and adapter state for a document.

```
dlm show <path> [--json]
```

Pretty-prints manifest + lock state. `--json` emits machine-readable
output.

### `dlm migrate`

Migrate a `.dlm` frontmatter to the current schema version.

```
dlm migrate <path> [--dry-run] [--no-backup]
```

| Option | Default | Notes |
|---|---|---|
| `--dry-run` | false | Print the migrated frontmatter without writing. |
| `--no-backup` | false | Skip the `.dlm.bak` backup. |

### `dlm cache`

Inspect and manage the per-store tokenized-section cache (Sprint 31).
The cache speeds up re-training on directive-sourced codebases by
keying tokenized output on `(section_id, tokenizer_sha, sequence_len)`.

```
dlm cache show <path> [--json]
dlm cache prune <path> [--older-than DURATION]
dlm cache clear <path> [--force]
```

| Subcommand | Notes |
|---|---|
| `show` | Print entry count, size on disk, last-run hit rate. `--json` for machine-readable output. |
| `prune` | Delete entries not accessed within `--older-than` (e.g. `30d`, `12h`, `45m`). Default `90d`. |
| `clear` | Wipe the entire cache. Prompts for confirmation unless `--force` is passed. |

See `docs/cookbook/directive-cache.md` for tuning, invalidation
triggers, and maintenance patterns.
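
A typical maintenance pass (document name is a placeholder):

```bash
# Check hit rate and size on disk
dlm cache show notes.dlm --json

# Drop entries untouched for two weeks
dlm cache prune notes.dlm --older-than 14d

# Start over without the confirmation prompt
dlm cache clear notes.dlm --force
```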

### `dlm harvest`

Pull failing-probe results from a sway-style eval report back into the
document as `!probe`-tagged `::instruction::` sections for the next
retrain. See `docs/cookbook/probe-driven-training.md`.

```
dlm harvest <path> --sway-json <report> [--apply] [--dry-run]
                   [--tag NAME] [--min-confidence F]
                   [--strict | --lax]
dlm harvest <path> --revert
```

| Option | Default | Notes |
|---|---|---|
| `--sway-json PATH` | required | Path to the sway probe report JSON. |
| `--apply` | false | Write changes to disk. Without it, dry-run. |
| `--dry-run` | true | Print the diff; no writes. |
| `--revert` | — | Strip all `auto_harvest=True` sections; mutually exclusive with `--sway-json`. |
| `--tag NAME` | `auto-harvest` | Provenance tag written into `harvest_source`. |
| `--min-confidence F` | `0.0` | Skip candidates below this confidence. |
| `--strict` / `--lax` | lax | Strict: fail if any failing probe lacks a reference. Lax: skip + log. |

Exit codes: `0` success, `1` validation error (malformed JSON, strict
miss, mutual-exclusion violation), `2` no candidates to harvest.
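
A sketch of the harvest loop against a hypothetical report file:

```bash
# Preview which failing probes would become ::instruction:: sections
dlm harvest notes.dlm --sway-json report.json --min-confidence 0.5

# Commit them with a custom provenance tag
dlm harvest notes.dlm --sway-json report.json --apply --tag nightly-eval

# Undo every harvested section
dlm harvest notes.dlm --revert
```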

### `dlm train --listen-rpc`

During `--watch`, open a JSON-RPC endpoint that accepts `inject_probe`
pushes from external eval harnesses. Requires `DLM_PROBE_TOKEN` in the
environment. See `docs/cookbook/probe-driven-training.md` for the wire
protocol and security notes.

| Option | Default | Notes |
|---|---|---|
| `--listen-rpc HOST:PORT` | off | Bind the probe-RPC endpoint. Requires `--watch` or `--max-cycles`. |
| `--max-cycles N` | `0` | Bounded-loop alternative to `--watch` for convergence runs (scaffolded — currently refuses execution without `--watch`). |

## Exit codes

| Code | Meaning |
|---|---|
| 0 | Success. |
| 1 | Runtime failure (license refused, disk full, OOM, template drift, lock validation). |
| 2 | CLI misuse (mutex violation, missing argument). |

Domain errors are formatted consistently via bare `console.print`
calls in each subcommand (prefix convention: `<subject>: <message>`,
e.g. `lock: base_model_revision changed`). Uncaught exceptions escape
into `dlm.cli.reporter`, which picks a matching prefix from the
module the exception came from and renders a tier-3 generic message.
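
For scripting, the three codes can be branched on directly; a minimal
sketch (placeholder document name):

```bash
dlm train notes.dlm --strict-lock
case $? in
  0) echo "trained" ;;
  1) echo "runtime failure (e.g. lock drift under --strict-lock)" ;;
  2) echo "CLI misuse (bad flags / mutex violation)" ;;
esac
```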