# Interactive sessions

`dlm repl <path>` gives you a conversational prompt against a trained
`.dlm` without reloading the model between turns. It's the
human-facing counterpart to `dlm prompt`: same backend plumbing,
multi-turn context, readline-style editing, history that persists
across sessions.
## When to use it

- You're iterating on how the adapter responds to a series of
  related prompts.
- You want to tune generation knobs (`temperature`, `top_p`) on the
  fly without restarting.
- You're demoing the trained document to someone who expects a
  chat-style interface.

Single-shot `dlm prompt <path> "…"` is still the right call for
scripts, one-liners, or piped input.
## Basic usage

```bash
$ dlm repl mydoc.dlm
dlm repl — /help for commands, /exit to quit (history: ~/.dlm/history)
> Hello.
Hi! How can I help?
[1] > What does the document cover?
…
```

The prompt `[N] > ` reports how many turn pairs have already
happened. The full chat template is applied every turn, so the model
sees the whole conversation in context.
## Slash commands

| Command | Effect |
|---|---|
| `/help` | Print the command list. |
| `/exit` or `/quit` | End the session (Ctrl-D does the same). |
| `/clear` | Reset conversation history; model stays loaded. |
| `/save <path>` | Write history as JSON for later review or replay. |
| `/history` | Print the current conversation. |
| `/adapter <name>` | Switch active adapter (multi-adapter docs only). |
| `/params key=value` | Update a generation knob in place. |
| `/params` | Print current generation knobs. |
| `/model` | Print the active backend + adapter. |
### `/params`

Accepts these keys: `temperature`, `top_p`, `top_k`, `max_new_tokens`,
`repetition_penalty`. Multiple updates in one line are allowed:

```
[2] > /params temperature=0.3 top_p=0.9
temperature=0.3 top_p=0.9 top_k=None max_new_tokens=256 repetition_penalty=None
```

Invalid values are rejected without any partial update; your
previous knobs stay intact.
## Ctrl-C semantics

- **During input**: cancels the line you're editing. The REPL
  redraws the prompt; the session keeps running. (Use `/exit` or
  Ctrl-D to leave.)
- **During generation**: stops the model mid-stream. Tokens
  emitted so far stay on screen, and the partial response is
  appended to history with a `[cancelled]` marker so the model
  sees it in future turns.
## History persistence

Readline history (your past prompts) is stored at
`~/.dlm/history`. Arrow-up / Ctrl-R work across sessions. This is
*not* the conversation transcript — use `/save <path>` to export
that.
## Non-interactive output

`dlm repl mydoc.dlm > transcript.txt` detects the non-TTY stdout
and disables streaming; each response is printed once, complete.
Useful for capturing a scripted session (feed prompts on stdin).
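As a sketch, a scripted session can be driven from a prompt file on stdin. The filenames here are illustrative, and the snippet only uses slash commands documented above; the `dlm` call is guarded so the sketch is copy-paste safe on machines without `dlm` installed.

```shell
# Write the prompts up front; /exit ends the session cleanly.
cat > prompts.txt <<'EOF'
What does the document cover?
/params temperature=0.3
Summarize it in one sentence.
/exit
EOF

# stdout is redirected, so dlm detects the non-TTY and prints each
# response once, complete, into the transcript.
if command -v dlm >/dev/null 2>&1; then
  dlm repl mydoc.dlm < prompts.txt > transcript.txt
fi
```

Because the full chat template is reapplied each turn, the second question still sees the first answer in context, exactly as it would interactively.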
## Backend selection

`--backend auto` (default) picks MLX on Apple Silicon when the
`mlx` extra is installed, PyTorch otherwise. Force either with
`--backend pytorch|mlx`. MLX drops `top_p`/`top_k`/
`repetition_penalty` today — see the [CLI reference](../cli/reference.md)
for the full matrix.
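A minimal sketch of forcing a backend for a scripted run (filenames are illustrative; the `dlm` call is guarded and reads stdin from a file so it cannot hang where `dlm` is absent):

```shell
# Two-line scripted session; /exit ends it cleanly.
printf '%s\n' 'Hello.' '/exit' > backend_prompts.txt

# Force the PyTorch backend regardless of what --backend auto would pick.
if command -v dlm >/dev/null 2>&1; then
  dlm repl mydoc.dlm --backend pytorch < backend_prompts.txt > out.txt
fi
```

Swapping in `--backend mlx` works the same way, but remember that `top_p`, `top_k`, and `repetition_penalty` are ignored on MLX today.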