# Preference section reference

`::preference::` sections are the pairwise alignment format DLM feeds
into the preference-training path (`dpo` / `orpo`). They are valid in
hand-authored `.dlm` files and in auto-mined output written by
`dlm preference mine --apply`.
## Basic shape

Each record contains three labeled blocks:

```dlm
::preference::
### Prompt
Explain recursion to a beginner.

### Chosen
Recursion is when a function calls itself on a smaller version of the
same problem.

### Rejected
Recursion is a self-referential computational strategy implemented with
stack-managed frame expansion.
```

One `::preference::` section can hold one or more Prompt/Chosen/Rejected
triples. DLM splits them into preference rows at parse time.
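The split into rows can be pictured with a minimal sketch. It assumes each block starts with a `### Prompt`, `### Chosen`, or `### Rejected` heading on its own line and that blocks appear in that order. The names `PreferenceRow` and `split_preference_section` are hypothetical and are not DLM's actual parser API.

```python
from dataclasses import dataclass

LABELS = ("Prompt", "Chosen", "Rejected")


@dataclass(frozen=True)
class PreferenceRow:
    prompt: str
    chosen: str
    rejected: str


def split_preference_section(body: str) -> list[PreferenceRow]:
    """Split the text after the ::preference:: fence into preference rows.

    Hypothetical helper: the strict Prompt/Chosen/Rejected ordering is an
    assumption made for this sketch.
    """
    blocks: list[tuple[str, list[str]]] = []
    for line in body.splitlines():
        stripped = line.strip()
        if stripped.startswith("### ") and stripped[4:] in LABELS:
            # A labeled heading opens a new block.
            blocks.append((stripped[4:], []))
        elif blocks:
            blocks[-1][1].append(line)
        # Lines before the first heading (e.g. a marker comment) are skipped.

    if len(blocks) % 3 != 0:
        raise ValueError("incomplete Prompt/Chosen/Rejected triple")

    rows = []
    for i in range(0, len(blocks), 3):
        triple = blocks[i:i + 3]
        if tuple(label for label, _ in triple) != LABELS:
            raise ValueError("blocks must appear in Prompt, Chosen, Rejected order")
        prompt, chosen, rejected = ("\n".join(lines).strip() for _, lines in triple)
        rows.append(PreferenceRow(prompt, chosen, rejected))
    return rows
```

A section holding two triples yields two rows, and a section that ends after `### Chosen` raises `ValueError` instead of producing a partial row.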
## Semantics

- `Prompt` is the input shown to the model.
- `Chosen` is the preferred response.
- `Rejected` is the lower-quality alternative.

Preference training does not try to predict the `Rejected` text.
Instead, it learns to increase the model's relative preference for the
`Chosen` response over the `Rejected` one.
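To make "relative preference" concrete, here is a sketch of the standard DPO loss for a single pair. It shows the textbook objective only. How DLM's trainer computes log-probabilities, the `beta` value, and the function name `dpo_loss` are assumptions for illustration, and ORPO uses a different objective.

```python
import math


def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for one pair, from summed sequence log-probabilities.

    Illustrative only: beta=0.1 is an assumed default, not a DLM setting.
    """
    # How much more the policy favors each response than the frozen reference does.
    chosen_gain = logp_chosen - ref_logp_chosen
    rejected_gain = logp_rejected - ref_logp_rejected
    margin = beta * (chosen_gain - rejected_gain)
    # -log(sigmoid(margin)) == softplus(-margin), computed in a numerically stable form.
    x = -margin
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))
```

When the policy matches the reference on both responses, the margin is zero and the loss is `log(2)`. The loss falls as the policy raises the Chosen response relative to the Rejected one, and rises in the opposite case. The Rejected text is never a prediction target; it only enters through the margin.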
## Auto-mined sections

When `dlm preference mine` writes sections back into a document, it
marks them with an HTML comment immediately after the section fence:

```dlm
::preference::
<!-- dlm-auto-mined: judge_name="sway" judge_score_chosen="0.82" judge_score_rejected="0.31" mined_at="2026-04-23T18:42:11Z" mined_run_id="7" -->
### Prompt
What is 2 + 2?
### Chosen
4.
### Rejected
The sum of two and two is four.
```

That marker corresponds to these parsed fields on the section:

- `auto_mined: true`
- `judge_name`
- `judge_score_chosen`
- `judge_score_rejected`
- `mined_at`
- `mined_run_id`

These metadata fields are required together for auto-mined preference
sections. Hand-authored sections omit the marker and keep
`auto_mined=false`.
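As a rough illustration of the marker-to-fields mapping, the sketch below reads the comment with a regular expression and insists that all five fields are present. The function name `parse_auto_mined_marker`, the regexes, and the dictionary return shape are assumptions for this example, not DLM's real parser.

```python
import re

# Hypothetical patterns for the marker comment shown above.
MARKER_RE = re.compile(r"^<!--\s*dlm-auto-mined:\s*(?P<attrs>.*?)\s*-->$")
ATTR_RE = re.compile(r'(\w+)="([^"]*)"')
REQUIRED_FIELDS = (
    "judge_name",
    "judge_score_chosen",
    "judge_score_rejected",
    "mined_at",
    "mined_run_id",
)


def parse_auto_mined_marker(line: str) -> dict:
    """Return the marker's fields as raw strings, or auto_mined=False if absent."""
    match = MARKER_RE.match(line.strip())
    if match is None:
        return {"auto_mined": False}  # hand-authored section: no marker
    fields = dict(ATTR_RE.findall(match.group("attrs")))
    missing = [name for name in REQUIRED_FIELDS if name not in fields]
    if missing:
        # The fields are required together, so a partial marker is an error.
        raise ValueError(f"auto-mined marker is missing: {', '.join(missing)}")
    return {"auto_mined": True, **fields}
```

Values stay as strings here; converting and checking them is a separate step.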
## Validation rules

- The auto-mined marker is only valid on `::preference::` sections.
- Auto-mined sections must provide all metadata fields together.
- The parser rejects malformed score/timestamp/run-id values rather than
  silently guessing.
- Section identity ignores the auto-mined metadata, so the same logical
  preference pair keeps the same content identity whether it was written
  by hand or mined automatically.
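The last two rules can be sketched as follows. This is an illustration under stated assumptions: scores are finite floats, `mined_at` is an ISO 8601 timestamp, `mined_run_id` is an integer, and identity is a SHA-256 hash of the three text blocks. The helper names and the hash choice are hypothetical and may differ from what DLM does.

```python
import hashlib
import math
from datetime import datetime


def validate_metadata(fields: dict) -> dict:
    """Convert raw marker strings to typed values; raise ValueError on bad input."""
    typed = {"judge_name": fields["judge_name"]}
    for key in ("judge_score_chosen", "judge_score_rejected"):
        score = float(fields[key])  # ValueError on non-numeric text
        if not math.isfinite(score):
            raise ValueError(f"{key} must be a finite number")
        typed[key] = score
    stamp = fields["mined_at"]
    if stamp.endswith("Z"):
        # Older Python versions do not accept a trailing "Z" in fromisoformat.
        stamp = stamp[:-1] + "+00:00"
    typed["mined_at"] = datetime.fromisoformat(stamp)  # ValueError if malformed
    typed["mined_run_id"] = int(fields["mined_run_id"])  # ValueError if malformed
    return typed


def content_identity(prompt: str, chosen: str, rejected: str) -> str:
    """Hash only the pair's text, so auto-mined metadata never affects identity."""
    payload = "\x1f".join((prompt, chosen, rejected))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Because `content_identity` never sees the metadata, a hand-authored pair and a mined pair with the same three blocks hash to the same value.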
Interaction with training
dlm trainincludes auto-mined preference sections by default.dlm train --no-minedexcludes onlyauto_mined=truesections and still uses hand-authored preference pairs.- Replay snapshots also preserve the
auto_minedbit so future preference runs can opt in or out consistently.
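The effect of `--no-mined` amounts to a filter on the `auto_mined` bit. The sketch below assumes sections are plain dictionaries carrying that bit; `select_preference_sections` and `include_mined` are hypothetical names, not part of DLM's API.

```python
def select_preference_sections(sections, include_mined=True):
    """Return the preference sections a training run would use.

    include_mined=False models `dlm train --no-mined` in this sketch.
    """
    if include_mined:
        return list(sections)
    # Sections without the bit are treated as hand-authored (auto_mined=false).
    return [s for s in sections if not s.get("auto_mined", False)]
```

Hand-authored pairs pass through in both modes, and only sections with `auto_mined` set to true are dropped.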
## Related commands

- `dlm preference mine <path>`
- `dlm preference apply <path>`
- `dlm preference revert <path>`
- `dlm train <path> --no-mined`
## See also
- [Section grammar](sections.md)
- [CLI reference](../cli/reference.md)
- [Self-improving loop cookbook](../cookbook/self-improving-loop.md)