tenseleyflow/documentlanguagemodel / 3ab067d

Document preference section format

Authored by espadonne
SHA: 3ab067daadd721f1fded2717d9dd6e2887426541
Parents: 74d85c2
Tree: a9822b1

3 changed files

Status  File                               +   -
A       docs/format/preference-section.md  97  0
M       docs/format/sections.md            1   0
M       mkdocs.yml                         1   0

docs/format/preference-section.md (added)
@@ -0,0 +1,97 @@
+# Preference section reference
+
+`::preference::` sections are the pairwise alignment format DLM feeds
+into the preference-training path (`dpo` / `orpo`). They are valid in
+hand-authored `.dlm` files and in auto-mined output written by
+`dlm preference mine --apply`.
+
+## Basic shape
+
+Each record contains three labeled blocks:
+
+```dlm
+::preference::
+### Prompt
+Explain recursion to a beginner.
+
+### Chosen
+Recursion is when a function calls itself on a smaller version of the
+same problem.
+
+### Rejected
+Recursion is a self-referential computational strategy implemented with
+stack-managed frame expansion.
+```
+
+One `::preference::` section can hold one or more Prompt/Chosen/Rejected
+triples. DLM splits them into preference rows at parse time.
+
+## Semantics
+
+- `Prompt` is the input shown to the model.
+- `Chosen` is the preferred response.
+- `Rejected` is the lower-quality alternative.
+
+Preference training does not try to predict the `Rejected` text.
+Instead, it learns to increase the model's relative preference for the
+Chosen response over the Rejected one.
+
+## Auto-mined sections
+
+When `dlm preference mine` writes sections back into a document, it
+marks them with an HTML comment immediately after the section fence:
+
+```dlm
+::preference::
+<!-- dlm-auto-mined: judge_name="sway" judge_score_chosen="0.82" judge_score_rejected="0.31" mined_at="2026-04-23T18:42:11Z" mined_run_id="7" -->
+### Prompt
+What is 2 + 2?
+### Chosen
+4.
+### Rejected
+The sum of two and two is four.
+```
+
+That marker corresponds to these parsed fields on the section:
+
+- `auto_mined: true`
+- `judge_name`
+- `judge_score_chosen`
+- `judge_score_rejected`
+- `mined_at`
+- `mined_run_id`
+
+These metadata fields are required together for auto-mined preference
+sections. Hand-authored sections omit the marker and keep
+`auto_mined=false`.
+
+## Validation rules
+
+- The auto-mined marker is only valid on `::preference::` sections.
+- Auto-mined sections must provide all metadata fields together.
+- The parser rejects malformed score/timestamp/run-id values rather than
+  silently guessing.
+- Section identity ignores the auto-mined metadata, so the same logical
+  preference pair keeps the same content identity whether it was written
+  by hand or mined automatically.
+
+## Interaction with training
+
+- `dlm train` includes auto-mined preference sections by default.
+- `dlm train --no-mined` excludes only `auto_mined=true` sections and
+  still uses hand-authored preference pairs.
+- Replay snapshots also preserve the `auto_mined` bit so future
+  preference runs can opt in or out consistently.
+
+## Related commands
+
+- `dlm preference mine <path>`
+- `dlm preference apply <path>`
+- `dlm preference revert <path>`
+- `dlm train <path> --no-mined`
+
+## See also
+
+- [Section grammar](sections.md)
+- [CLI reference](../cli/reference.md)
+- [Self-improving loop cookbook](../cookbook/self-improving-loop.md)
docs/format/sections.md (modified)
@@ -159,6 +159,7 @@ being picked up as new?", the ID in `dlm show --json` is the answer.
 
 ## See also
 
+- [Preference section reference](preference-section.md)
 - [First train walkthrough](../getting-started/first-train.md)
 - [Cookbook: coding tutor](../cookbook/coding-tutor.md) — full
   example of instruction-heavy authoring
mkdocs.yml (modified)
@@ -58,6 +58,7 @@ nav:
   - The .dlm format:
       - Frontmatter: format/frontmatter.md
       - Sections: format/sections.md
+      - Preference sections: format/preference-section.md
       - Export manifest: format/export-manifest.md
       - .dlm/training.yaml: format/dlm-training-yaml.md
       - .dlm/ignore: format/dlm-ignore.md