tenseleyflow/documentlanguagemodel / 3409fa9


docs(repl): interactive-session cookbook + dlm repl CLI reference

Authored by espadonne
SHA: 3409fa9b7f62c93a18115430d090eeb8d8399e93
Parents: 39598b7
Tree: cee441d

3 changed files

| Status | File | + | - |
|---|---|---|---|
| M | docs/cli/reference.md | 19 | 0 |
| A | docs/cookbook/interactive-session.md | 92 | 0 |
| M | mkdocs.yml | 1 | 0 |
docs/cli/reference.md (modified)

@@ -91,6 +91,25 @@ dlm prompt <path> [query] [--max-tokens N] [--temp F] [--top-p F]
 
 Query is the CLI positional argument. Omit to read from stdin.
 
+### `dlm repl`
+
+Interactive prompt-and-respond REPL against the trained adapter
+(Sprint 24).
+
+```
+dlm repl <path> [--adapter NAME] [--backend {auto,pytorch,mlx}]
+```
+
+| Option | Default | Notes |
+|---|---|---|
+| `--adapter NAME` | None | Named adapter; required on multi-adapter docs. |
+| `--backend {auto,pytorch,mlx}` | `auto` | Same contract as `dlm prompt --backend`. |
+
+Slash commands inside the REPL: `/help`, `/exit`, `/clear`, `/save`,
+`/adapter`, `/params`, `/model`, `/history`. Ctrl-D exits; Ctrl-C
+cancels generation or input. Session history persists at
+`~/.dlm/history`. See the [interactive-session cookbook](../cookbook/interactive-session.md).
+
 ### `dlm export`
 
 Produce GGUF files + Modelfile + register with Ollama.
docs/cookbook/interactive-session.md (added)

@@ -0,0 +1,92 @@

# Interactive sessions

`dlm repl <path>` gives you a conversational prompt against a trained
`.dlm` without reloading the model between turns. It's the
human-facing counterpart to `dlm prompt`: same backend plumbing,
multi-turn context, readline-style editing, and history that persists
across sessions.

## When to use it

- You're iterating on how the adapter responds to a series of
  related prompts.
- You want to tune generation knobs (`temperature`, `top_p`) on the
  fly without restarting.
- You're demoing the trained document to someone who expects a
  chat-style interface.

Single-shot `dlm prompt <path> "…"` is still the right call for
scripts, one-liners, or piped input.

## Basic usage

```bash
$ dlm repl mydoc.dlm
dlm repl — /help for commands, /exit to quit (history: ~/.dlm/history)
> Hello.
Hi! How can I help?
[1] > What does the document cover?
…
```

The prompt `[N] > ` reports how many turn pairs have already
happened. The full chat template is applied every turn, so the model
sees the whole conversation in context.
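That per-turn assembly can be sketched as follows. This is an illustration of the behavior described above, not `dlm`'s actual code; the helper names and the role/content dict shape are assumptions:

```python
def build_prompt_messages(history, user_msg):
    """Assemble the full message list handed to the chat template each turn.

    `history` holds the prior turns as {"role": ..., "content": ...} dicts;
    the whole list is re-templated every turn, so the model always sees
    the complete conversation.
    """
    return history + [{"role": "user", "content": user_msg}]


def turn_counter(history):
    """The N shown in the `[N] > ` prompt: completed user/assistant pairs."""
    return sum(1 for msg in history if msg["role"] == "assistant")
```

So after one exchange, `turn_counter` reports 1 and the next user line is templated on top of both earlier messages.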

## Slash commands

| Command | Effect |
|---|---|
| `/help` | Print the command list. |
| `/exit` or `/quit` | End the session (Ctrl-D does the same). |
| `/clear` | Reset conversation history; model stays loaded. |
| `/save <path>` | Write history as JSON for later review or replay. |
| `/history` | Print the current conversation. |
| `/adapter <name>` | Switch active adapter (multi-adapter docs only). |
| `/params key=value` | Update a generation knob in place. |
| `/params` | Print current generation knobs. |
| `/model` | Print the active backend + adapter. |

### `/params`

Accepts these keys: `temperature`, `top_p`, `top_k`, `max_new_tokens`,
`repetition_penalty`. Multiple updates in one line are allowed:

```
[2] > /params temperature=0.3 top_p=0.9
temperature=0.3 top_p=0.9 top_k=None max_new_tokens=256 repetition_penalty=None
```

Bad values are rejected without any partial update — your previous knobs
stay intact.
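One way to get that all-or-nothing behavior is to validate every `key=value` pair before applying any of them. A sketch of the idea — the helper name and parsing details are hypothetical, not the actual implementation:

```python
# Knobs /params accepts, with the type each value is parsed as.
VALID_KEYS = {"temperature": float, "top_p": float, "top_k": int,
              "max_new_tokens": int, "repetition_penalty": float}


def apply_params(current, line):
    """Parse `key=value` pairs; reject the whole line if any pair is bad."""
    staged = {}
    for pair in line.split():
        key, _, raw = pair.partition("=")
        if key not in VALID_KEYS:
            raise ValueError(f"unknown key: {key}")
        staged[key] = VALID_KEYS[key](raw)  # raises ValueError on a bad value
    updated = dict(current)
    updated.update(staged)  # only reached once every pair has parsed
    return updated
```

Because the update is staged on a copy, a `ValueError` mid-line leaves the current knobs untouched.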

## Ctrl-C semantics

- **During input**: cancels the line you're editing. The REPL
  redraws the prompt; the session keeps running. (Use `/exit` or
  Ctrl-D to leave.)
- **During generation**: stops the model mid-stream. Tokens
  emitted so far stay on screen, and the partial response is
  appended to history with a `[cancelled]` marker so the model
  sees it in future turns.
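The during-generation case boils down to catching `KeyboardInterrupt` around the token stream. A sketch of that shape — the helper is hypothetical and the exact marker format is assumed:

```python
def generate_with_cancel(stream, history):
    """Collect streamed text; on Ctrl-C keep what was emitted so far.

    `stream` yields text pieces as they are printed. On KeyboardInterrupt
    the partial response is recorded in history with a `[cancelled]`
    marker, so later turns see the response was cut off.
    """
    pieces = []
    try:
        for piece in stream:
            pieces.append(piece)  # already printed to the screen
    except KeyboardInterrupt:
        history.append({"role": "assistant",
                        "content": "".join(pieces) + " [cancelled]"})
        return history
    history.append({"role": "assistant", "content": "".join(pieces)})
    return history
```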

## History persistence

Readline history (your past prompts) is stored at
`~/.dlm/history`. Arrow-up / Ctrl-R work across sessions. This is
*not* the conversation transcript — use `/save <path>` to export
that.
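The cross-session recall is standard `readline` wiring. A sketch of how such persistence is typically set up (not necessarily `dlm`'s actual code; the function name is hypothetical):

```python
import atexit
import os
import readline


def init_prompt_history(path=os.path.expanduser("~/.dlm/history")):
    """Load past prompts at startup and persist new ones on exit."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    try:
        readline.read_history_file(path)  # enables arrow-up / Ctrl-R recall
    except FileNotFoundError:
        pass  # first session: no history file yet
    atexit.register(readline.write_history_file, path)
```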

## Non-interactive output

`dlm repl mydoc.dlm > transcript.txt` detects the non-TTY stdout
and disables streaming; each response is printed once, complete.
Useful for capturing a scripted session (feed prompts on stdin).
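The switch comes down to a TTY check on stdout; roughly (a sketch of the rule described above, not the actual source):

```python
import sys


def streaming_enabled():
    # Stream tokens as they arrive only when stdout is an interactive
    # terminal; under redirection (e.g. `> transcript.txt`) each
    # response is printed once, complete.
    return sys.stdout.isatty()
```

A scripted capture would then look something like `printf 'first prompt\nsecond prompt\n' | dlm repl mydoc.dlm > transcript.txt`, assuming stdin prompts are read line by line.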

## Backend selection

`--backend auto` (the default) picks MLX on Apple Silicon when the
`mlx` extra is installed, PyTorch otherwise. Force either with
`--backend pytorch|mlx`. MLX drops `top_p`/`top_k`/
`repetition_penalty` today — see the [CLI reference](../cli/reference.md)
for the full matrix.
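The auto rule above can be pictured as a small resolver. This is a sketch under the stated rule; `resolve_backend` is a hypothetical name, not `dlm`'s API:

```python
import importlib.util
import platform


def resolve_backend(requested="auto"):
    """Pick a backend per the --backend auto rule described above."""
    if requested != "auto":
        return requested  # explicit --backend pytorch|mlx wins
    apple_silicon = (platform.system() == "Darwin"
                     and platform.machine() == "arm64")
    mlx_installed = importlib.util.find_spec("mlx") is not None
    return "mlx" if apple_silicon and mlx_installed else "pytorch"
```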
mkdocs.yml (modified)

@@ -68,6 +68,7 @@ nav:
      - Quantization tradeoffs: cookbook/quantization-tradeoffs.md
      - Preference (DPO vs ORPO): cookbook/preference-dpo-vs-orpo.md
      - Multi-adapter composition: cookbook/multi-adapter.md
+      - Interactive sessions: cookbook/interactive-session.md
  - Architecture: architecture.md
  - Determinism: determinism.md
  - Hardware: