
# Retrain without forgetting (and when to forget)

DLM's retention story: edits to a `.dlm` add new training signal while previous versions of the document stay in a replay corpus. This recipe walks through the default "additive" retrain, when to deliberately forget, and how to audit what the model has seen.

## The default: additive retrain

```sh
# v0001 — initial training
$ uv run dlm train tutor.dlm
trained: v0001 (20 steps, seed=42, determinism=best_effort)
adapter: ~/.dlm/store/01KPM…/adapter/versions/v0001
log:     ~/.dlm/store/01KPM…/logs/train-000001-…jsonl

# Edit tutor.dlm — add more Q&A, fix a typo
$ $EDITOR tutor.dlm

# v0002 — retrain
$ uv run dlm train tutor.dlm
trained: v0002 (20 steps, seed=42, determinism=best_effort)
adapter: ~/.dlm/store/01KPM…/adapter/versions/v0002
log:     ~/.dlm/store/01KPM…/logs/train-000002-…jsonl
```

What `dlm train` does on retrain:

1. Loads the previous `manifest.json`; diffs this run's section hashes against the recorded `content_hashes`.
2. Appends the new and changed sections to the replay corpus under `replay/corpus.zst` (Sprint 08).
3. Builds the training set as current sections plus a recency-weighted sample from replay. Prior sections don't disappear — they keep showing up in proportion to how recent they are.
4. Commits the new adapter as `v0002`; `adapter/current.txt` flips atomically (Sprint 09's two-phase commit).
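The diffing in step 1 can be sketched in plain Python. This is an illustrative reconstruction, not dlm's actual code; it assumes section hashing is a SHA-256 over the section body, keyed by section name:

```python
import hashlib

def section_hash(text: str) -> str:
    # Hash one section body; dlm's real hashing scheme may differ.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def diff_sections(sections: dict[str, str], recorded: dict[str, str]) -> list[str]:
    """Return names of sections that are new or changed since the last run."""
    return [
        name for name, body in sections.items()
        if recorded.get(name) != section_hash(body)
    ]

# What v0001's manifest recorded (illustrative sections):
recorded = {
    "intro": section_hash("What is X?\nX is ..."),
    "faq": section_hash("Q: ...\nA: ..."),
}

# The edited document: "faq" changed, "glossary" is new.
current = {
    "intro": "What is X?\nX is ...",
    "faq": "Q: ...\nA: a better answer",
    "glossary": "term: ...",
}

print(diff_sections(current, recorded))  # → ['faq', 'glossary']
```

Only the sections this returns need to be appended to replay; unchanged sections are already there.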

The adapter from `v0002` still remembers v0001's Q&A pairs because the training data for `v0002` included them (via replay).
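The recency weighting in step 3 can be made concrete with a seeded sketch. The exponential decay below is an assumption about the weighting curve, not dlm's actual function:

```python
import random

def recency_weighted_sample(replay: list[str], k: int, seed: int = 42,
                            decay: float = 0.5) -> list[str]:
    """Sample k replay entries, favoring newer ones.

    `replay` is ordered oldest-first; the decay rate is illustrative.
    """
    n = len(replay)
    weights = [decay ** (n - 1 - i) for i in range(n)]  # newest gets weight 1.0
    rng = random.Random(seed)  # fixed seed, mirroring dlm's seed=42 default
    return rng.choices(replay, weights=weights, k=k)

replay = ["v0001:intro", "v0001:faq", "v0002:faq", "v0002:glossary"]
batch = recency_weighted_sample(replay, k=3)
# Old sections can still appear in `batch`, just less often than recent ones.
```

Because the sample never zeroes out old weights, prior sections keep contributing signal across retrains.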

## Inspecting what the model has seen

```sh
$ uv run dlm show tutor.dlm
tutor.dlm
  dlm_id:         01KPM618S78XK668EX0TFEWAJY
  base_model:     qwen2.5-coder-1.5b (revision abcdef1)
  store:          ~/.dlm/store/01KPM…  (12.4 KB)
  adapter:        v0002
  training runs:  2 — last 2026-04-19T18:30:14
  exports:        0
```

The default output is a pretty-printed summary; use `--json` for the full record, including `content_hashes`, `replay_size_bytes`, per-run `TrainingRunSummary` entries, and `pinned_versions`:

```sh
$ uv run dlm show tutor.dlm --json | jq '.training_runs'
2
$ uv run dlm show tutor.dlm --json | jq 'keys'
[
  "adapter_version", "base_model", "base_model_revision",
  "content_hashes", "dlm_id", "exports", "has_adapter_current",
  "last_trained_at", "orphaned", "path", "pinned_versions",
  "replay_size_bytes", "source_path", "total_size_bytes",
  "training_runs"
]
```
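The same queries work from Python if you'd rather script against the record. The dict below is hand-built for illustration, using a subset of the keys listed above with made-up values:

```python
import json

# Illustrative record using keys `dlm show --json` reports (values invented).
record = json.loads("""{
  "adapter_version": "v0002",
  "training_runs": 2,
  "replay_size_bytes": 4096,
  "orphaned": false,
  "has_adapter_current": true
}""")

print(record["training_runs"])  # equivalent of jq '.training_runs'
print(sorted(record))           # equivalent of jq 'keys'
```

In a real script you would feed `uv run dlm show tutor.dlm --json` output into `json.loads` instead of a literal.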

## Intentional forgetting: `--fresh`

Sometimes you want the model to unlearn. Maybe the v0001 Q&A was wrong, and replay-weighted retraining keeps reinforcing it.

```sh
$ uv run dlm train tutor.dlm --fresh
```

`--fresh` wipes the replay corpus **and** the optimizer state; the next run trains only on the current document's sections. The adapter version still increments (`v0003`), and the manifest records that this run was a fresh start.

Use sparingly — fresh training loses every prior training signal.
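The additive-versus-fresh split reduces to a few lines. This is a model of the behavior described above, not dlm's training code:

```python
def build_training_set(current_sections: list[str],
                       replay_entries: list[str],
                       fresh: bool = False) -> list[str]:
    """Fresh runs ignore replay entirely; additive runs mix it back in."""
    if fresh:
        replay_entries = []  # corpus wiped; optimizer state is reset too
    return list(current_sections) + list(replay_entries)

additive = build_training_set(["faq-v2"], ["faq-v1", "intro-v1"])
fresh = build_training_set(["faq-v2"], ["faq-v1", "intro-v1"], fresh=True)
# `additive` keeps the old signal; `fresh` trains only on the current document.
```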

## Intentional pruning: edit the replay corpus

The replay corpus is a zstd-framed file on disk at `<store>/replay/corpus.zst` plus `<store>/replay/index.json`. There is no shipped CLI for surgical pruning yet; drop into Python via `dlm.replay.store.ReplayStore.at(corpus_path, index_path)` to read or rewrite it. A first-class `dlm replay prune` command is on the Phase 4 roadmap.
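Until that command ships, pruning means rewriting the replay files yourself. The sketch below shows only the index side, and it assumes a simple schema (a JSON list of entries with a `section` field) that may not match dlm's real `index.json`; treat it as a workflow outline, not a supported API:

```python
import json
import pathlib
import tempfile

def prune_index(index_path, drop_section: str) -> int:
    """Drop all entries for one section from a replay index (assumed schema).

    Returns the number of entries removed.
    """
    path = pathlib.Path(index_path)
    entries = json.loads(path.read_text())
    kept = [e for e in entries if e.get("section") != drop_section]
    path.write_text(json.dumps(kept))
    return len(entries) - len(kept)

# Demo against a throwaway index file:
with tempfile.TemporaryDirectory() as d:
    idx = pathlib.Path(d) / "index.json"
    idx.write_text(json.dumps([{"section": "faq"}, {"section": "intro"}]))
    dropped = prune_index(idx, "faq")
```

Note that this leaves `corpus.zst` untouched; a real prune must also rewrite the zstd frames, which is what `ReplayStore` is for.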

## What `dlm.lock` records

After each successful run, `dlm.lock` records:

- `last_run_id` — incremented each run
- `dlm_sha256` — hash of the source `.dlm` at this run
- `pinned_versions` — the tuple that produced this adapter
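Computing the source hash yourself is handy for auditing whether a `.dlm` has drifted since its last run. This assumes `dlm_sha256` is a plain SHA-256 over the file's raw bytes, which is unconfirmed:

```python
import hashlib
import pathlib
import tempfile

def dlm_sha256(path) -> str:
    """Hash a .dlm source file (assumed: SHA-256 over raw bytes)."""
    return hashlib.sha256(pathlib.Path(path).read_bytes()).hexdigest()

# Demo against a throwaway file:
with tempfile.TemporaryDirectory() as d:
    p = pathlib.Path(d) / "tutor.dlm"
    p.write_bytes(b"# tutor\n")
    digest = dlm_sha256(p)
# Compare `digest` against the dlm.lock entry to detect drift between runs.
```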

See [Determinism](../determinism.md) for the full field list and mismatch policy.

## Caveats

- **Replay is lossy on eviction.** The corpus has a soft size cap (Sprint 08) and evicts the oldest, lowest-weight entries when it grows. Everything still in the corpus trains; evictions are lost.
- **`--fresh` doesn't forget the base model.** The base is whatever `base_model:` says in the frontmatter. Its pretraining data lives in its weights, not in your replay corpus.
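The eviction caveat can be made concrete. The ordering key below (lowest weight first, oldest first among ties) is an assumption about Sprint 08's policy, not its actual code:

```python
def evict(entries: list[tuple[int, float, int]],
          soft_cap_bytes: int) -> list[tuple[int, float, int]]:
    """Keep the highest-priority entries that fit under the soft cap.

    Each entry is (run_id, weight, size_bytes). Priority is assumed to be
    highest weight first, newest first among ties; the rest are evicted.
    """
    ranked = sorted(entries, key=lambda e: (e[1], e[0]), reverse=True)
    total, kept = 0, []
    for e in ranked:
        if total + e[2] <= soft_cap_bytes:
            kept.append(e)
            total += e[2]
    return sorted(kept)  # restore chronological order

corpus = [(1, 0.2, 400), (2, 0.5, 400), (3, 0.9, 400)]
survivors = evict(corpus, soft_cap_bytes=900)
# Run 1 (oldest, lowest weight) is evicted and can never train again.
```

Once an entry is evicted, no later additive retrain can recover it, which is why the caveat matters.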