# Instruction section reference

`::instruction::` sections are the supervised fine-tuning format DLM
uses for prompt/answer training data.

They are valid in hand-authored `.dlm` files and in synthetic output
written by `dlm synth instructions --apply`.

## Basic shape

Each instruction section contains one or more `Q` / `A` pairs:

```dlm
::instruction::
### Q
What is a decorator?

### A
A function that takes a function and returns a wrapped function.

### Q
When should I use `functools.wraps`?

### A
Whenever a decorator returns another callable and you want to preserve
the wrapped function's metadata.
```

DLM splits those into individual supervised rows at parse time.
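
To make the row-splitting step concrete, here is a minimal Python sketch of how a section body could be turned into one row per pair. It is not DLM's actual parser: `Row` and `split_pairs` are hypothetical names, and the sketch assumes every `### Q` / `### A` header sits alone on its own line.

```python
import re
from typing import NamedTuple


class Row(NamedTuple):
    """One supervised row: a prompt and its target answer (hypothetical type)."""
    question: str
    answer: str


# Matches a "### Q" or "### A" header that stands alone on its line.
_HEADER = re.compile(r"^### (Q|A)[ \t]*$", re.MULTILINE)


def split_pairs(body: str) -> list[Row]:
    """Split the text after the ::instruction:: fence into Q/A rows."""
    # re.split with one capture group yields:
    # [preamble, "Q", text, "A", text, ...]
    parts = _HEADER.split(body)
    rows = []
    pending_question = None
    for kind, text in zip(parts[1::2], parts[2::2]):
        text = text.strip()
        if kind == "Q":
            pending_question = text
        elif pending_question is not None:
            # An answer closes the most recent open question.
            rows.append(Row(pending_question, text))
            pending_question = None
    return rows
```

An answer with no preceding question is dropped in this sketch; a real parser would more likely report it as an error.
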

## Semantics

- `Q` is the prompt shown to the model.
- `A` is the target response.

At train time, DLM uses the question as context and the answer as the
supervised target. This is the section type that most directly shapes
assistant behavior.
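
One common way to use the question as context and the answer as the target is to mask the prompt tokens out of the loss. The Python sketch below illustrates that idea only; it assumes already-tokenized inputs, and `build_example` is a hypothetical helper, not a DLM function. Whether DLM masks labels exactly this way is not stated in this reference.

```python
# -100 is the default ignore_index of PyTorch's CrossEntropyLoss, so
# positions labeled with it contribute nothing to the loss.
IGNORE_INDEX = -100


def build_example(prompt_ids: list[int], answer_ids: list[int]) -> dict:
    """Concatenate prompt and answer; supervise only the answer tokens."""
    input_ids = list(prompt_ids) + list(answer_ids)
    # Prompt positions are context only; answer positions are the target.
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(answer_ids)
    return {"input_ids": input_ids, "labels": labels}
```
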

## Auto-synth instruction sections

When `dlm synth instructions` writes sections back into a document, it
adds an HTML marker immediately after the section fence:

```dlm
::instruction::
<!-- dlm-auto-synth: synth_teacher="self" synth_strategy="extraction" synth_at="2026-04-24T10:18:42Z" source_section_id="b6b7d8a2f4b3f9c0" -->
### Q
What does DGEMM do?

### A
It multiplies dense matrices and can optionally accumulate the result.
```

That marker corresponds to these parsed fields on the section:

- `auto_synth: true`
- `synth_teacher`
- `synth_strategy`
- `synth_at`
- `source_section_id`

Hand-authored instruction sections omit the marker and keep
`auto_synth=false`.
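
As an illustration of how the marker maps onto those fields, the Python sketch below reads the `key="value"` attributes out of the comment with regular expressions. `parse_marker` is a hypothetical helper, and the regexes assume the attribute syntax shown in the example above; DLM's own parser may be stricter.

```python
import re

# The whole comment, capturing everything between the prefix and "-->".
_MARKER = re.compile(r"<!--\s*dlm-auto-synth:\s*(?P<attrs>.*?)\s*-->")
# One key="value" attribute inside the marker.
_ATTR = re.compile(r'(\w+)="([^"]*)"')


def parse_marker(line: str) -> dict:
    """Return the parsed synth fields, or auto_synth=False if no marker."""
    match = _MARKER.search(line)
    if match is None:
        return {"auto_synth": False}
    fields = dict(_ATTR.findall(match.group("attrs")))
    return {"auto_synth": True, **fields}
```
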

## Validation rules

- The auto-synth marker is only valid on `::instruction::` sections.
- Auto-synth sections must provide all metadata fields together.
- `synth_teacher` and `synth_strategy` must be non-empty strings.
- `source_section_id` must be a valid referenced section ID.
- Section identity ignores the synth metadata, so the same logical
  question/answer pair keeps the same content identity whether it was
  written by hand or synthesized automatically.
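
The rules above can be sketched as a small checker plus an identity function. This is a Python illustration under assumptions, not DLM's implementation: `validate_synth_metadata` and `content_identity` are hypothetical names, "valid referenced section ID" is read here as "present in a set of known section IDs", and SHA-256 is an arbitrary stand-in for whatever hash DLM actually uses.

```python
import hashlib

SYNTH_FIELDS = ("synth_teacher", "synth_strategy", "synth_at", "source_section_id")


def validate_synth_metadata(section_type: str, meta: dict, known_section_ids: set) -> list:
    """Return a list of problems; an empty list means these checks pass."""
    if not any(name in meta for name in SYNTH_FIELDS):
        return []  # hand-authored section: nothing to validate
    errors = []
    if section_type != "instruction":
        errors.append("auto-synth marker is only valid on ::instruction:: sections")
    missing = [name for name in SYNTH_FIELDS if name not in meta]
    if missing:
        errors.append("missing metadata fields: " + ", ".join(missing))
    for name in ("synth_teacher", "synth_strategy"):
        value = meta.get(name)
        if name in meta and not (isinstance(value, str) and value.strip()):
            errors.append(f"{name} must be a non-empty string")
    source_id = meta.get("source_section_id")
    if source_id is not None and source_id not in known_section_ids:
        errors.append("source_section_id does not reference a known section")
    return errors


def content_identity(pairs) -> str:
    """Hash only the Q/A text, so synth metadata never affects identity."""
    digest = hashlib.sha256()
    for question, answer in pairs:
        # Separator bytes keep ("ab", "c") distinct from ("a", "bc").
        digest.update(question.encode("utf-8") + b"\x00")
        digest.update(answer.encode("utf-8") + b"\x01")
    return digest.hexdigest()
```

Because `content_identity` never sees the metadata, a hand-written pair and a synthesized pair with the same text hash to the same value, which is the property the last rule describes.
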

## Interaction with training

- `dlm train` includes synthesized instruction sections by default.
- There is currently no separate "ignore auto-synth instructions" train
  flag; they flow through the normal SFT path once they are present in
  the document.
- `dlm synth revert` strips every `auto_synth: true` instruction section
  from the file without touching hand-authored rows.
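
The effect of a revert can be pictured with a short Python sketch. It is not the `dlm synth revert` implementation: `strip_auto_synth` is a hypothetical helper, and it assumes that a section runs from its `::name::` fence line up to the next fence line and that the marker is the line directly after the fence.

```python
import re

# Any "::name::" fence line (an assumption about the section grammar).
_FENCE = re.compile(r"^::[\w-]+::\s*$")
_MARKER_PREFIX = "<!-- dlm-auto-synth:"


def strip_auto_synth(text: str) -> str:
    """Drop auto-synth instruction sections; keep everything else verbatim."""
    # Group lines into chunks, each starting at a fence (or the preamble).
    chunks = [[]]
    for line in text.splitlines(keepends=True):
        if _FENCE.match(line):
            chunks.append([])
        chunks[-1].append(line)

    kept = []
    for chunk in chunks:
        is_instruction = bool(chunk) and chunk[0].strip() == "::instruction::"
        is_auto = (
            is_instruction
            and len(chunk) > 1
            and chunk[1].lstrip().startswith(_MARKER_PREFIX)
        )
        if not is_auto:
            kept.extend(chunk)
    return "".join(kept)
```
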

## Interaction with `dlm synth`

Relevant commands:

- `dlm synth instructions <path>`
- `dlm synth list <path>`
- `dlm synth revert <path>`

The current `instructions` command can:

- stage accepted synth sections for inspection
- write accepted synth sections directly with `--apply`
- preview only with `--dry-run`

## Choosing a good instruction section

Hand-authored or synthesized, good instruction sections tend to have:

- a clear prompt with one task
- an answer that matches the tone you want the adapter to learn
- enough domain specificity that the pair teaches something real

Weak instruction sections tend to be:

- generic
- repetitive
- too broad to answer well
- stylistically inconsistent with the rest of the document

## See also

- [Section grammar](sections.md)
- [Synthesize training data](../cookbook/synthesize-training-data.md)
- [Bootstrap self-improving](../cookbook/bootstrap-self-improving.md)
- [CLI reference](../cli/reference.md)