# Interactive sessions

`dlm repl <path>` gives you a conversational prompt against a trained
`.dlm` document without reloading the model between turns. It's the
human-facing counterpart to `dlm prompt`: the same backend plumbing,
plus multi-turn context, readline-style editing, and history that
persists across sessions.

## When to use it

- You're iterating on how the adapter responds to a series of
  related prompts.
- You want to tune generation knobs (`temperature`, `top_p`) on the
  fly without restarting.
- You're demoing the trained document to someone who expects a
  chat-style interface.

Single-shot `dlm prompt <path> "…"` is still the right call for
scripts, one-liners, or piped input.
## Basic usage

```bash
$ dlm repl mydoc.dlm
dlm repl — /help for commands, /exit to quit (history: ~/.dlm/history)
> Hello.
Hi! How can I help?
[1] > What does the document cover?
…
```

The prompt `[N] > ` shows how many prompt/response pairs have
completed so far. The full chat template is re-applied every turn,
so the model sees the whole conversation in context.
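A minimal sketch of that loop's bookkeeping (the `history` structure, function names, and message format here are illustrative assumptions, not the actual `dlm` internals):

```python
def render_prompt(history):
    """Return the REPL prompt: '> ' on the first turn,
    '[N] > ' once N prompt/response pairs exist."""
    n = len(history)
    return "> " if n == 0 else f"[{n}] > "

def build_context(history, user_msg):
    """Re-apply the full chat template every turn so the model
    sees the whole conversation, not just the latest message."""
    msgs = []
    for user, assistant in history:
        msgs.append({"role": "user", "content": user})
        msgs.append({"role": "assistant", "content": assistant})
    msgs.append({"role": "user", "content": user_msg})
    return msgs
```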

## Slash commands

| Command | Effect |
|---|---|
| `/help` | Print the command list. |
| `/exit` or `/quit` | End the session (Ctrl-D does the same). |
| `/clear` | Reset conversation history; model stays loaded. |
| `/save <path>` | Write history as JSON for later review or replay. |
| `/history` | Print the current conversation. |
| `/adapter <name>` | Switch active adapter (multi-adapter docs only). |
| `/params key=value` | Update a generation knob in place. |
| `/params` | Print current generation knobs. |
| `/model` | Print the active backend + adapter. |
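The general shape of a dispatcher for such commands can be sketched as follows; this covers only a few of the commands above, and the `state` dict is a hypothetical stand-in for the real session object:

```python
def dispatch(line, state):
    """Route a '/command arg…' line. Returns True if the line was
    handled as a slash command, False if it is a normal prompt."""
    if not line.startswith("/"):
        return False
    cmd, _, arg = line[1:].partition(" ")
    if cmd in ("exit", "quit"):
        state["running"] = False
    elif cmd == "clear":
        state["history"].clear()  # forget the conversation, keep the model
    elif cmd == "history":
        for i, (user, assistant) in enumerate(state["history"], 1):
            print(f"[{i}] > {user}\n{assistant}")
    else:
        print(f"unknown command: /{cmd}")
    return True
```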

### `/params`

Accepts these keys: `temperature`, `top_p`, `top_k`, `max_new_tokens`,
`repetition_penalty`. Multiple updates in one line are allowed:

```
[2] > /params temperature=0.3 top_p=0.9
temperature=0.3 top_p=0.9 top_k=None max_new_tokens=256 repetition_penalty=None
```

Invalid values are rejected without any partial update, so your
previous knobs stay intact.
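The all-or-nothing behavior amounts to staging every parsed value before committing any of them. A sketch under that assumption (the key list matches the docs; the function itself is hypothetical):

```python
ALLOWED = {"temperature": float, "top_p": float, "top_k": int,
           "max_new_tokens": int, "repetition_penalty": float}

def update_params(params, pairs):
    """Parse 'key=value' tokens and apply them atomically: a bad
    key or value raises before anything is written to params."""
    staged = {}
    for pair in pairs:
        key, _, raw = pair.partition("=")
        if key not in ALLOWED:
            raise ValueError(f"unknown key: {key}")
        try:
            staged[key] = ALLOWED[key](raw)
        except ValueError:
            raise ValueError(f"bad value for {key}: {raw}")
    params.update(staged)  # commit only after everything parsed
    return params
```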

## Ctrl-C semantics

- **During input**: cancels the line you're editing. The REPL
  redraws the prompt; the session keeps running. (Use `/exit` or
  Ctrl-D to leave.)
- **During generation**: stops the model mid-stream. Tokens
  emitted so far stay on screen, and the partial response is
  appended to history with a `[cancelled]` marker so the model
  sees it in future turns.
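The generation-side behavior can be modeled as catching `KeyboardInterrupt` around the token stream; the function and marker handling below are a sketch of that idea, not `dlm`'s actual code:

```python
def generate_with_cancel(stream, history, prompt):
    """Stream tokens to the screen; on Ctrl-C, keep what was
    emitted and record it in history with a [cancelled] marker."""
    chunks = []
    try:
        for token in stream:
            chunks.append(token)
            print(token, end="", flush=True)
    except KeyboardInterrupt:
        chunks.append(" [cancelled]")
    response = "".join(chunks)
    history.append((prompt, response))  # future turns see the partial text
    return response
```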

## History persistence

Readline history (your past prompts) is stored at
`~/.dlm/history`. Arrow-up / Ctrl-R work across sessions. This is
*not* the conversation transcript — use `/save <path>` to export
that.
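With Python's standard `readline` module, persistence of that kind typically looks like the sketch below (the setup function is illustrative; only the `~/.dlm/history` path comes from the docs):

```python
import atexit
import os
import readline

def setup_history(path="~/.dlm/history"):
    """Load past prompts at startup and write them back at exit,
    so arrow-up / Ctrl-R recall works across sessions."""
    path = os.path.expanduser(path)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    try:
        readline.read_history_file(path)
    except FileNotFoundError:
        pass  # first run: no history yet
    atexit.register(readline.write_history_file, path)
```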

## Non-interactive output

`dlm repl mydoc.dlm > transcript.txt` detects that stdout is not a
TTY and disables streaming; each response is printed once, in full.
Useful for capturing a scripted session (feed prompts on stdin).
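The detection itself is a one-liner in Python; a sketch of the check (the function name is hypothetical):

```python
import sys

def output_mode():
    """Stream token-by-token on a terminal; print complete
    responses when stdout is redirected to a file or pipe."""
    return "stream" if sys.stdout.isatty() else "batch"
```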

## Backend selection

`--backend auto` (the default) picks MLX on Apple Silicon when the
`mlx` extra is installed, and PyTorch otherwise. Force either with
`--backend pytorch|mlx`. MLX currently ignores `top_p`, `top_k`, and
`repetition_penalty` — see the [CLI reference](../cli/reference.md)
for the full matrix.
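The auto-selection rule described above can be sketched as a platform check plus an importability probe; this is an assumption about the logic, not the actual resolver:

```python
import importlib.util
import platform

def pick_backend(requested="auto"):
    """Resolve '--backend auto': prefer MLX on Apple Silicon when
    the mlx package is importable, else fall back to PyTorch."""
    if requested != "auto":
        return requested  # explicit --backend pytorch|mlx wins
    on_apple_silicon = (platform.system() == "Darwin"
                        and platform.machine() == "arm64")
    if on_apple_silicon and importlib.util.find_spec("mlx") is not None:
        return "mlx"
    return "pytorch"
```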