# Synthesize training data

`dlm synth instructions` turns prose-heavy `.dlm` files into usable `::instruction::` sections. This is the shortest path from "I have notes" to "I have supervised training pairs" when the document already contains domain prose but not enough authored Q/A.

## What it does

The synth loop:

1. Finds non-empty prose sections in the document.
2. Prompts a teacher model to generate question/answer pairs about that prose.
3. Deduplicates the generated pairs.
4. Optionally filters them through the `sway` judge.
5. Either stages the accepted `auto_synth` sections for inspection or writes them straight back into the `.dlm`.

The generated sections are still normal `::instruction::` sections. They just carry provenance metadata so DLM can tell synthesized pairs from hand-authored ones.

## Choose a teacher

The teacher decides who writes the candidate Q/A pairs:

- `self`: use the current local adapter for this document
- `hf:`: use a HuggingFace text model
- `openai:`: use the OpenAI API
- `anthropic:`: use the Anthropic API
- `vllm-server:`: use an OpenAI-compatible local server

The current default is `self`, but that only makes sense once the document already has a trained adapter. For a cold start, either:

- train once first, then synth with `self`, or
- use `hf:` / `openai:` / `anthropic:` / `vllm-server:` as the teacher

## Minimal example

Start with a prose-heavy document:

```dlm
---
dlm_id: 01K...
dlm_version: 15
base_model: smollm2-135m
---

DGEMM multiplies two dense matrices and can optionally accumulate the result into an existing output matrix.
```

Generate one extraction-style pair per prose section with an HF teacher:

```sh
uv run dlm synth instructions notes.dlm \
  --teacher hf:Qwen/Qwen2.5-1.5B-Instruct \
  --per-section 1 \
  --strategy extraction
```

That prints two summaries:

- the raw synth plan
- the filter report (`generated`, `dedup`, `judge passed`, `threshold`)

By default, accepted sections are staged under the store so you can inspect them:

```sh
uv run dlm synth list notes.dlm
```

If you want the accepted pairs written straight back into the document, use `--apply`:

```sh
uv run dlm synth instructions notes.dlm \
  --teacher hf:Qwen/Qwen2.5-1.5B-Instruct \
  --per-section 1 \
  --strategy extraction \
  --apply
```

## Strategy choices

The `--strategy` flag controls what kind of questions the teacher is asked to produce:

- `extraction`: questions answered directly by the prose
- `expansion`: questions a curious reader might ask beyond the exact wording of the prose
- `both`: split the per-section budget across both prompt styles

Start with `extraction` when you care about faithfulness. Reach for `expansion` once the document already has a stable domain voice and you want broader instructional coverage.

## Filter choices

The `--filter` flag controls post-generation cleanup:

- `sway`: dedup plus judge filtering against an empty baseline
- `dedup-only`: keep only near-duplicate suppression
- `none`: accept everything that parses as a valid pair

`sway` is the safest default and is what most users should keep. It is especially helpful when using creative teachers or `--strategy both`.

If you are debugging prompt quality, use `--filter none` once and look at the raw plan before deciding whether the issue is generation or filtering.
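For example, a one-off debugging pass with the same teacher and strategy as the minimal example might look like the following. Without `--apply`, nothing is written back into the document, so the raw plan is safe to inspect:

```sh
uv run dlm synth instructions notes.dlm \
  --teacher hf:Qwen/Qwen2.5-1.5B-Instruct \
  --per-section 1 \
  --strategy extraction \
  --filter none
```

If the raw plan already looks weak, the problem is generation (teacher or strategy); if it looks fine but little survives the default `sway` filter, the problem is filtering.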
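Whichever filter you choose, each accepted pair ends up as an ordinary `::instruction::` section carrying provenance. The sketch below is purely illustrative: only the `auto_synth: true` marker is documented; the field names and layout are assumptions, not DLM's actual on-disk format.

```dlm
::instruction::
# illustrative sketch only; the exact metadata layout is an assumption
auto_synth: true
Q: What does DGEMM do with its output matrix?
A: It can optionally accumulate the result into an existing output matrix.
```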
## Useful knobs

```sh
uv run dlm synth instructions notes.dlm \
  --teacher hf:Qwen/Qwen2.5-1.5B-Instruct \
  --per-section 3 \
  --strategy both \
  --filter sway \
  --threshold 0.2 \
  --max-pairs 8 \
  --max-new-tokens 512 \
  --temp 0.2 \
  --top-p 0.95 \
  --seed 7
```

The most useful flags in practice are:

- `--per-section`: generate more than one candidate pair per prose block
- `--max-pairs`: cap document churn on large files
- `--threshold`: tighten or loosen `sway` acceptance
- `--temp` and `--top-p`: increase diversity when the teacher is too repetitive

## Training after synth

Once the document has accepted `auto_synth` instruction sections, the next normal train run consumes them like any other instruction pair:

```sh
uv run dlm train notes.dlm
```

No special train flag is needed. Synthesized instruction sections flow through the same SFT path as hand-authored sections.

## Revert and inspection

List applied auto-synth sections:

```sh
uv run dlm synth list notes.dlm
```

Strip every synthesized instruction section from the document:

```sh
uv run dlm synth revert notes.dlm
```

This only removes `auto_synth: true` instruction sections. Hand-authored instruction blocks stay untouched.

## Common failure modes

### The self teacher is weak

If `--teacher self` produces junk, the adapter probably is not ready yet. Train once more first, or use a stronger external teacher for the first synth pass.

### Everything gets filtered out

That usually means one of three things:

- the teacher produced near-duplicates
- the generated answers were worse than the empty-baseline comparison in `sway`
- the threshold is too strict

Lower `--threshold`, or temporarily switch to `--filter dedup-only` to see whether the judge is the main bottleneck.

### The document churns too much

Use `--max-pairs` aggressively at first. A small accepted batch is much easier to reason about than dumping dozens of synthetic sections into a single file.

## See also

- [Instruction section reference](../format/instruction-section.md)
- [Bootstrap self-improving](bootstrap-self-improving.md)
- [Self-improving loop](self-improving-loop.md)
- [CLI reference](../cli/reference.md)