# Interactive sessions

`dlm repl <path>` gives you a conversational prompt against a trained
`.dlm` without reloading the model between turns. It's the
human-facing counterpart to `dlm prompt`: same backend plumbing,
multi-turn context, readline-style editing, history that persists
across sessions.

## When to use it

- You're iterating on how the adapter responds to a series of
  related prompts.
- You want to tune generation knobs (`temperature`, `top_p`) on the
  fly without restarting.
- You're demoing the trained document to someone who expects a
  chat-style interface.

Single-shot `dlm prompt <path> "…"` is still the right call for
scripts, one-liners, or piped input.
## Basic usage

```bash
$ dlm repl mydoc.dlm
dlm repl — /help for commands, /exit to quit (history: ~/.dlm/history)
> Hello.
Hi! How can I help?
[1] > What does the document cover?
…
```

The prompt `[N] > ` reports how many turn pairs have already
happened. The full chat template is applied every turn, so the model
sees the whole conversation in context.
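The turn-counting and full-context behavior can be sketched roughly as follows. This is a toy illustration, not dlm's actual internals: `apply_chat_template`, `repl_turn`, and the template format are hypothetical stand-ins.

```python
# Toy sketch: a REPL loop that replays the whole conversation each turn.
def apply_chat_template(history):
    # Flatten (role, text) pairs into one prompt string (stand-in for
    # whatever template the real backend applies).
    return "\n".join(f"{role}: {text}" for role, text in history)

def repl_turn(history, user_input, generate):
    history.append(("user", user_input))
    prompt = apply_chat_template(history)   # full conversation, every turn
    reply = generate(prompt)
    history.append(("assistant", reply))
    return reply

history = []
echo = lambda prompt: "ok"                  # dummy "model" for illustration
repl_turn(history, "Hello.", echo)
```

After one turn, `len(history) // 2` is the `[N]` shown in the prompt.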

## Slash commands

| Command | Effect |
|---|---|
| `/help` | Print the command list. |
| `/exit` or `/quit` | End the session (Ctrl-D does the same). |
| `/clear` | Reset conversation history; model stays loaded. |
| `/save <path>` | Write history as JSON for later review or replay. |
| `/history` | Print the current conversation. |
| `/adapter <name>` | Switch active adapter (multi-adapter docs only). |
| `/params key=value` | Update a generation knob in place. |
| `/params` | Print current generation knobs. |
| `/model` | Print the active backend + adapter. |

### `/params`

Accepts these keys: `temperature`, `top_p`, `top_k`, `max_new_tokens`,
`repetition_penalty`. Multiple updates in one line are allowed:

```
[2] > /params temperature=0.3 top_p=0.9
temperature=0.3 top_p=0.9 top_k=None max_new_tokens=256 repetition_penalty=None
```

Bad values are rejected without any partial update — your previous knobs
stay intact.
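The all-or-nothing behavior amounts to validating every assignment before applying any. A minimal sketch of that pattern (illustrative only, not dlm's code; the function takes just the `key=value` part of the line):

```python
# Validate-then-apply pattern for atomic multi-key updates.
# The key list mirrors the /params docs; the parsing is illustrative.
ALLOWED = {
    "temperature": float, "top_p": float, "top_k": int,
    "max_new_tokens": int, "repetition_penalty": float,
}

def update_params(params, line):
    staged = {}
    for pair in line.split():
        key, _, raw = pair.partition("=")
        if key not in ALLOWED:
            raise ValueError(f"unknown key: {key}")
        try:
            staged[key] = ALLOWED[key](raw)
        except ValueError:
            raise ValueError(f"bad value for {key}: {raw}")
    params.update(staged)   # only reached once every pair validated
    return params
```

If any pair fails, `staged` is discarded and `params` is never touched, which is exactly the "no partial updates" guarantee.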

## Ctrl-C semantics

- **During input**: cancels the line you're editing. The REPL
  redraws the prompt; session keeps running. (Use `/exit` or
  Ctrl-D to leave.)
- **During generation**: stops the model mid-stream. Tokens
  emitted so far stay on screen, and the partial response is
  appended to history with a `[cancelled]` marker so the model
  sees it in future turns.

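The generation-time behavior (keep the partial output, record it with a marker) can be sketched like so. The token stream is hypothetical; this is not dlm's internals.

```python
# Sketch: stream tokens, and on Ctrl-C keep what was emitted and tag
# it so later turns see the truncation.
def stream_reply(token_stream, history):
    parts = []
    try:
        for token in token_stream:
            parts.append(token)        # the real REPL also prints here
    except KeyboardInterrupt:
        parts.append(" [cancelled]")   # partial response stays in context
    reply = "".join(parts)
    history.append(("assistant", reply))
    return reply
```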
## History persistence

Readline history (your past prompts) is stored at
`~/.dlm/history`. Arrow-up / Ctrl-R work across sessions. This is
*not* the conversation transcript — use `/save <path>` to export
that.
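Persisting prompt history across sessions is standard `readline` usage; a sketch of the pattern (the path matches the docs, the hook placement is illustrative):

```python
import atexit
import os
import readline

HISTORY_FILE = os.path.expanduser("~/.dlm/history")

def load_history(path=HISTORY_FILE):
    # Loading the file is what makes arrow-up / Ctrl-R recall past sessions.
    os.makedirs(os.path.dirname(path), exist_ok=True)
    try:
        readline.read_history_file(path)
    except FileNotFoundError:
        pass                        # first session: nothing to load yet
    atexit.register(readline.write_history_file, path)
```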

## Non-interactive output

`dlm repl mydoc.dlm > transcript.txt` detects the non-TTY stdout
and disables streaming; each response is printed once, complete.
Useful for capturing a scripted session (feed prompts on stdin).
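Detecting a redirected stdout is a one-liner; a sketch of the decision (illustrative, not dlm's code):

```python
import sys

def use_streaming(stream=sys.stdout):
    # Stream token-by-token only when talking to a real terminal;
    # when stdout is a pipe or file, print each response whole.
    return hasattr(stream, "isatty") and stream.isatty()
```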

## Backend selection

`--backend auto` (default) picks MLX on Apple Silicon when the
`mlx` extra is installed, and PyTorch otherwise. Force either with
`--backend pytorch|mlx`. MLX currently drops
`top_p`/`top_k`/`repetition_penalty` — see the
[CLI reference](../cli/reference.md) for the full matrix.
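The auto rule can be read as: prefer MLX only when both the hardware and the `mlx` extra are present. A sketch of that decision (pure illustration; dlm's real check may differ):

```python
import importlib.util
import platform

def choose_backend(requested="auto"):
    if requested != "auto":
        return requested                          # --backend pytorch|mlx
    apple_silicon = (platform.system() == "Darwin"
                     and platform.machine() == "arm64")
    mlx_installed = importlib.util.find_spec("mlx") is not None
    return "mlx" if apple_silicon and mlx_installed else "pytorch"
```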