# Instruction section reference

`::instruction::` sections are the supervised fine-tuning (SFT) format DLM
uses for prompt/answer training data.

They are valid in hand-authored `.dlm` files and in synthetic output
written by `dlm synth instructions --apply`.

## Basic shape

Each instruction section contains one or more `Q` / `A` pairs:

```dlm
::instruction::
### Q
What is a decorator?

### A
A function that takes a function and returns a wrapped function.

### Q
When should I use `functools.wraps`?

### A
Whenever a decorator returns another callable and you want to preserve
the wrapped function's metadata.
```

At parse time, DLM splits each `Q` / `A` pair into its own supervised row.
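The splitting step can be sketched in a few lines of Python. This is a minimal illustration, not DLM's parser: `Row`, `HEADER`, and `split_pairs` are hypothetical names, and the sketch assumes a header line is exactly `### Q` or `### A` and that pairs strictly alternate.

```python
import re
from typing import NamedTuple


class Row(NamedTuple):
    """One supervised row: a prompt and its target answer (hypothetical type)."""
    prompt: str
    answer: str


# Assumption: a header line is exactly "### Q" or "### A".
HEADER = re.compile(r"^### (Q|A)\s*$")


def split_pairs(body: str) -> list[Row]:
    """Split an instruction section body into one Row per Q/A pair."""
    blocks: list[tuple[str, list[str]]] = []
    for line in body.splitlines():
        match = HEADER.match(line)
        if match:
            blocks.append((match.group(1), []))
        elif blocks:
            blocks[-1][1].append(line)
        # Lines before the first header (the fence, any marker) are skipped.

    kinds = [kind for kind, _ in blocks]
    if not blocks or kinds != ["Q", "A"] * (len(blocks) // 2):
        raise ValueError("expected one or more alternating ### Q / ### A blocks")

    return [
        Row("\n".join(q_lines).strip(), "\n".join(a_lines).strip())
        for (_, q_lines), (_, a_lines) in zip(blocks[0::2], blocks[1::2])
    ]
```

Run against the section above, this sketch would yield two rows, one per question.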

## Semantics

- `Q` is the prompt shown to the model.
- `A` is the target response.

At train time, DLM uses the question as context and the answer as the
supervised target. This is the section type that most directly shapes
assistant behavior.
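One common way SFT pipelines express "question as context, answer as target" is to mask the prompt positions out of the loss. The sketch below illustrates that general convention only; whether DLM masks prompts this way is an assumption, and `build_labels` and the token IDs are made up for the example.

```python
# -100 is the default ignore_index of PyTorch's cross-entropy loss;
# positions carrying this label contribute nothing to the loss.
IGNORE_INDEX = -100


def build_labels(prompt_ids: list[int], answer_ids: list[int]) -> tuple[list[int], list[int]]:
    """Concatenate prompt and answer; supervise only the answer tokens."""
    input_ids = prompt_ids + answer_ids
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(answer_ids)
    return input_ids, labels


# Hypothetical token IDs: three prompt tokens, two answer tokens.
input_ids, labels = build_labels([11, 12, 13], [21, 22])
print(labels)  # [-100, -100, -100, 21, 22]
```

The model still reads the whole sequence, but only the answer tokens produce gradient.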

## Auto-synth instruction sections

When `dlm synth instructions` writes sections back into a document, it
adds an HTML comment marker on the line immediately after the section fence:

```dlm
::instruction::
<!-- dlm-auto-synth: synth_teacher="self" synth_strategy="extraction" synth_at="2026-04-24T10:18:42Z" source_section_id="b6b7d8a2f4b3f9c0" -->
### Q
What does DGEMM do?

### A
It multiplies dense matrices and can optionally accumulate the result.
```

That marker corresponds to these parsed fields on the section:

- `auto_synth: true`
- `synth_teacher`
- `synth_strategy`
- `synth_at`
- `source_section_id`

Hand-authored instruction sections omit the marker and keep
`auto_synth: false`.
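To make the marker-to-field mapping concrete, here is a minimal Python sketch that reads the comment line into a dictionary. It is an illustration under assumptions, not DLM's parser: `parse_marker` is a hypothetical helper, and the regexes assume the `key="value"` attribute layout shown in the example above.

```python
import re

# Assumption: the marker is a single-line HTML comment with key="value" pairs.
MARKER = re.compile(r"^<!--\s*dlm-auto-synth:\s*(?P<attrs>.*?)\s*-->$")
ATTR = re.compile(r'(\w+)="([^"]*)"')


def parse_marker(line: str) -> dict:
    """Return the synth fields for a marker line, or auto_synth=False otherwise."""
    match = MARKER.match(line.strip())
    if match is None:
        return {"auto_synth": False}
    fields = dict(ATTR.findall(match.group("attrs")))
    return {"auto_synth": True, **fields}


marker = (
    '<!-- dlm-auto-synth: synth_teacher="self" synth_strategy="extraction" '
    'synth_at="2026-04-24T10:18:42Z" source_section_id="b6b7d8a2f4b3f9c0" -->'
)
print(parse_marker(marker)["synth_strategy"])  # extraction
```

Any other line, such as `### Q`, comes back as `{"auto_synth": False}`, which mirrors the hand-authored case.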

## Validation rules

- The auto-synth marker is only valid on `::instruction::` sections.
- Auto-synth sections must provide all metadata fields together.
- `synth_teacher` and `synth_strategy` must be non-empty strings.
- `source_section_id` must be a valid referenced section ID.
- Section identity ignores the synth metadata, so the same logical
  question/answer pair keeps the same content identity whether it was
  written by hand or synthesized automatically.
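The rules above can be sketched as two small Python functions. Everything here is illustrative: `validate_synth` and `content_id` are hypothetical names, "valid referenced section ID" is interpreted as membership in a set of known IDs, and the SHA-256 scheme truncated to 16 hex characters is an assumption rather than DLM's actual identity function.

```python
import hashlib

REQUIRED = ("synth_teacher", "synth_strategy", "synth_at", "source_section_id")


def validate_synth(section_type: str, meta: dict, known_ids: set) -> list:
    """Return a list of rule violations; an empty list means the section passes."""
    if not any(key in meta for key in REQUIRED):
        return []  # no marker fields at all: treated as hand-authored
    errors = []
    if section_type != "instruction":
        errors.append("auto-synth marker is only valid on ::instruction:: sections")
    missing = [key for key in REQUIRED if key not in meta]
    if missing:
        errors.append("missing metadata fields: " + ", ".join(missing))
    for key in ("synth_teacher", "synth_strategy"):
        if key in meta and (not isinstance(meta[key], str) or not meta[key]):
            errors.append(key + " must be a non-empty string")
    source = meta.get("source_section_id")
    if source is not None and source not in known_ids:
        errors.append("source_section_id does not match a known section")
    return errors


def content_id(pairs: list) -> str:
    """Hash only the question/answer text, so synth metadata cannot change it."""
    digest = hashlib.sha256()
    for prompt, answer in pairs:
        digest.update(prompt.encode("utf-8") + b"\0" + answer.encode("utf-8") + b"\0")
    return digest.hexdigest()[:16]  # assumption: 16 hex characters, as in the example ID
```

Because `content_id` never sees the marker fields, a hand-authored pair and a synthesized pair with the same text hash to the same value.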

## Interaction with training

- `dlm train` includes synthesized instruction sections by default.
- There is currently no separate "ignore auto-synth instructions" train
  flag; they flow through the normal SFT path once they are present in
  the document.
- `dlm synth revert` strips every `auto_synth: true` instruction section
  from the file without touching hand-authored rows.
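The revert behavior described above can be approximated in Python as a filter over sections. This is a sketch of the idea, not the `dlm synth revert` implementation: it assumes every section starts at a line of the form `::name::` and runs until the next such line, and `strip_auto_synth` and `is_auto_synth` are hypothetical helpers.

```python
import re

# Assumption: a section fence is a line consisting only of ::name::.
FENCE = re.compile(r"^::[\w-]+::\s*$")
MARKER_PREFIX = "<!-- dlm-auto-synth:"


def is_auto_synth(section: list) -> bool:
    """True for an instruction section whose first body line is the marker."""
    return (
        len(section) >= 2
        and section[0].strip() == "::instruction::"
        and section[1].lstrip().startswith(MARKER_PREFIX)
    )


def strip_auto_synth(text: str) -> str:
    """Drop auto-synth instruction sections and keep everything else verbatim."""
    sections = [[]]
    for line in text.splitlines(keepends=True):
        # Start a new section at each fence, except at the very top of the file.
        if FENCE.match(line) and sections[-1]:
            sections.append([])
        sections[-1].append(line)
    return "".join(
        line for section in sections if not is_auto_synth(section) for line in section
    )
```

Hand-authored sections pass through byte for byte, since the filter only decides which whole sections to keep.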

## Interaction with `dlm synth`

Relevant commands:

- `dlm synth instructions <path>`
- `dlm synth list <path>`
- `dlm synth revert <path>`

The current `instructions` command can:

- stage accepted synth sections for inspection
- write accepted synth sections directly with `--apply`
- preview only with `--dry-run`

## Choosing a good instruction section

Hand-authored or synthesized, good instruction sections tend to have:

- a clear prompt with one task
- an answer that matches the tone you want the adapter to learn
- enough domain specificity that the pair teaches something real

Weak instruction sections tend to be:

- generic
- repetitive
- too broad to answer well
- stylistically inconsistent with the rest of the document

## See also

- [Section grammar](sections.md)
- [Synthesize training data](../cookbook/synthesize-training-data.md)
- [Bootstrap self-improving](../cookbook/bootstrap-self-improving.md)
- [CLI reference](../cli/reference.md)