tenseleyflow/documentlanguagemodel / a403b34

chore(gitignore): untrack .claude/ + AGENTS.md (editor artifacts)

Authored by espadonne
SHA: a403b347faf0113544978ff23cf0562478029099
Parents: 728abe4
Tree: b62771a

3 changed files

Status  File                           +    -
D       .claude/scheduled_tasks.lock   0    1
M       .gitignore                     2    0
D       AGENTS.md                      0    302
.claude/scheduled_tasks.lock deleted
@@ -1,1 +0,0 @@
-{"sessionId":"8821b509-f066-4739-bb70-9869816c68de","pid":2932,"acquiredAt":1776617500007}
.gitignore modified
@@ -33,3 +33,5 @@ dist/
 *.swp
 .DS_Store
 site/
+.claude/
+AGENTS.md
AGENTS.md deleted
@@ -1,302 +0,0 @@
-# DocumentLanguageModel — Codex Session Boot Context
-
-> This file is read on every session start. Keep it dense, authoritative, and
-> aligned with the living docs. When anything here conflicts with `.docs/`,
-> `.docs/` wins and this file must be updated.
-
-## One-line
-
-A text file with a `.dlm` extension becomes a local, reproducible, trainable
-LLM. Edit the document, retrain, share. Not a toy — LoRA/QLoRA on a real
-pretrained base, exportable to Ollama.
-
-## Current stage
-
-- ✅ Stage 1 — Planning & reference exploration (see `.docs/findings.md`)
-- ✅ Stage 2 — Revised overview + 29 sprint files across 7 phases
-- ✅ Stage 3 — this file
-- ✅ Stage 4 — Audit 01 (YELLOW → patched). Blockers F01–F04 and majors
-  F05–F22 triaged into Sprint 12b + inline sprint amendments. See
-  `.docs/audits/01-initial-plan-audit.md` and the end of this file for
-  the triage summary.
-- ⏳ Stage 5 — Implementation (begin at Sprint 01)
-
-## Where things live
-
-```
-.docs/overview.md             Canonical project description (read this first)
-.docs/findings.md             Stage 1 digest from 8 parallel ref explorations
-.docs/sprints/00-index.md     Master index of the 29 sprints
-.docs/sprints/phase-*/        Sprint files; each has DoD and risks
-.docs/audits/                 Stage 4+ audit outputs
-.refs/                        Cloned reference repos (gitignored)
-AGENTS.md                     You are here. Gitignored.
-```
-
-`.docs/` and `AGENTS.md` are in `.gitignore` by user choice — planning
-artifacts stay local.
-
-## Crystallized architecture
-
-**Training paradigm**: LoRA / QLoRA on a user-selected pretrained base. No
-from-scratch transformers. The base registry ships with Qwen 2.5
-(0.5B–3B + Coder-1.5B), Llama-3.2 (1B, 3B), SmolLM2 (135M–1.7B), and
-Phi-3.5-mini. Any HF model via `hf:org/name` with compatibility probes.
-
-**Document shape**: `mydoc.dlm` is a single UTF-8 text file — YAML
-frontmatter + markdown body with section fences (`::instruction::`,
-`::preference::`, default-prose). A stable `dlm_id` in the frontmatter
-binds the document to a content-addressed store at `~/.dlm/store/<dlm_id>/`.
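
A hypothetical minimal `mydoc.dlm` under this shape (fence grammar and field
names are illustrative, not the final spec):

```
---
dlm_id: 3f9c2d1e        # illustrative value; binds to ~/.dlm/store/3f9c2d1e/
base: qwen2.5-1.5b      # key from the base registry
---

::instruction::
Answer in terse, direct engineering prose.

::preference::
Prefer tables over long paragraphs.

Anything outside a fence trains as default-prose.
```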
-
-**Retention**: a single rolling adapter trained on the current document + a
-recency-weighted sample from a zstd-compressed replay corpus that accumulates
-every prior document version. Rejected alternative: versioned adapters with
-weighted merge (LoRA-only, SVD cost, harder determinism).
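
A minimal sketch of the recency weighting, assuming a simple exponential
decay over stored versions (the decay constant and function shape are
illustrative, not the final design):

```
import random

def sample_replay(versions: list[bytes], k: int, decay: float = 0.7) -> list[bytes]:
    """versions is ordered oldest -> newest; newer versions weigh more."""
    n = len(versions)
    weights = [decay ** (n - 1 - i) for i in range(n)]  # newest gets weight 1.0
    return random.choices(versions, weights=weights, k=k)
```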
-
-**Export**: separate `base.gguf` + `adapter.gguf` + a generated Modelfile
-with an `ADAPTER` directive. The `--merged` opt-in produces a single file
-(QLoRA requires an explicit `--dequantize`).
-
-**Hardware tiers**:
-- NVIDIA CUDA (SM ≥ 8.0): first-class, bf16 + QLoRA 4-bit + FlashAttention
-- NVIDIA CUDA (SM < 8.0): second-class, fp16 LoRA
-- Apple Silicon MPS: first-class training (fp16 LoRA), optional MLX inference in Phase 5
-- CPU: inference-only by default, training refused except with `--force` on ≤200M bases
-- AMD ROCm: experimental; Phase 5 promotes to Tier 2
-
-## Stack
-
-**In**: Python 3.11+, PyTorch ≥ 2.4, HuggingFace `transformers`/`peft`/`trl`/
-`accelerate`/`datasets`, `bitsandbytes` (CUDA-gated), `safetensors`,
-`zstandard`, llama.cpp (vendored git submodule) for GGUF export,
-Ollama (user-installed), `typer`, `rich`, `uv`, `pytest`, `mypy --strict`,
-`ruff`.
-
-**Out**:
-- Unsloth (monkeypatch fragility, transformers-version pinning hell, CUDA-only, Apple Silicon excluded)
-- MLX for training (adapter `.npz` format is not PEFT-compatible)
-- From-scratch transformers
-- DeepSpeed / ZeRO through v1.0
-- First-class Windows support (best-effort only; Linux + macOS are the supported tiers)
-
-## Pitfalls to always remember
-
-1. **Ollama uses Go `text/template`, not Jinja2.** The GGUF's Jinja chat
-   template is fuzzy-matched by Ollama and fails silently when unmatched.
-   We always emit an explicit `TEMPLATE "..."` in the Modelfile from our
-   per-base-model Go template registry. Round-trip tests assert
-   token-identity with the HF Jinja reference.
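
A sketch of such a Modelfile, assuming a ChatML-style base such as Qwen (the
actual template comes from the per-base registry; paths are illustrative):

```
FROM ./base.gguf
ADAPTER ./adapter.gguf
TEMPLATE """{{ if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""
```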
-
-2. **`peft.save_pretrained` does NOT save optimizer / scheduler / RNG.** We
-   write a separate `training_state.pt` sidecar with optimizer state,
-   scheduler state, AMP scaler, torch/cuda/numpy/python RNGs, step, epoch,
-   pinned versions. Without this, resume is not deterministic.
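
A minimal sketch of such a sidecar, assuming standard PyTorch objects (the
exact schema, including pinned versions, is owned by Sprint 09):

```
import random

import numpy as np
import torch

def save_training_state(path, optimizer, scheduler, scaler, step, epoch):
    """Persist everything peft.save_pretrained omits, so resume can be bit-exact."""
    torch.save({
        "optimizer": optimizer.state_dict(),
        "scheduler": scheduler.state_dict(),
        "scaler": scaler.state_dict() if scaler is not None else None,
        "rng": {
            "python": random.getstate(),
            "numpy": np.random.get_state(),
            "torch": torch.get_rng_state(),
            "cuda": torch.cuda.get_rng_state_all() if torch.cuda.is_available() else None,
        },
        "step": step,
        "epoch": epoch,
    }, path)
```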
-
-3. **`merge_and_unload` on 4-bit QLoRA base is precision-unsafe.** Refuse
-   the merged export path on QLoRA unless `--dequantize` is explicit; then
-   dequantize to fp16 before merge.
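
A sketch of the QLoRA-safe merge, assuming the standard PEFT/transformers
APIs (function name and flag plumbing are illustrative):

```
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM

def merged_export(base_id: str, adapter_dir: str, dequantize: bool):
    if not dequantize:
        raise SystemExit("refusing to merge onto a 4-bit base; pass --dequantize")
    # Reload the base in fp16 rather than merging into quantized weights.
    base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16)
    model = PeftModel.from_pretrained(base, adapter_dir)
    return model.merge_and_unload()  # plain fp16 model, safe to convert to GGUF
```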
-
-4. **Pad token must NOT default to EOS.** With pad == EOS, collators mask
-   real EOS tokens as padding, corrupting labels wherever EOS appears
-   mid-sequence. Fallback: unk_token → else add `<|pad|>` (which forces
-   `modules_to_save=["embed_tokens","lm_head"]`, inflating adapter size;
-   warn loudly).
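
A sketch of that fallback chain, assuming HF tokenizer/model APIs (the helper
name is illustrative):

```
def ensure_pad_token(tokenizer, model) -> bool:
    """Returns True when a new token was added and embeddings were resized."""
    if tokenizer.pad_token is not None and tokenizer.pad_token != tokenizer.eos_token:
        return False  # already safe
    if tokenizer.unk_token is not None:
        tokenizer.pad_token = tokenizer.unk_token
        return False
    tokenizer.add_special_tokens({"pad_token": "<|pad|>"})
    model.resize_token_embeddings(len(tokenizer))
    return True  # caller must force modules_to_save=["embed_tokens", "lm_head"] and warn
```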
-
-5. **Pre-tokenizer hash table in llama.cpp** is a silent-failure surface.
-   Sprint 06 probes at registry-build time + on `dlm init --base hf:...`;
-   Sprint 11 re-verifies at `dlm export` preflight. Bumping
-   `vendor/llama.cpp` re-runs the registry probe suite via
-   `scripts/bump-llama-cpp.sh`.
-
-6. **Sample packing without FlashAttention** causes `position_ids` drift on
-   MPS. Doctor disables packing when FlashAttention is unavailable or
-   packing is otherwise unsafe.
-
-7. **`target_modules="all-linear"` on small models** causes memory blowup
-   and instability. Use the per-architecture registry from Sprint 06 as
-   the default.
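
An illustrative shape for that registry (the module names are the standard
attention projections for these architectures; the real defaults live in
Sprint 06):

```
# Per-architecture LoRA targets instead of "all-linear" (illustrative subset).
DEFAULT_TARGET_MODULES: dict[str, list[str]] = {
    "qwen2": ["q_proj", "k_proj", "v_proj", "o_proj"],
    "llama": ["q_proj", "k_proj", "v_proj", "o_proj"],
    "phi3": ["qkv_proj", "o_proj"],  # Phi-3 fuses q/k/v into one projection
}
```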
-
-8. **Determinism is a contract**: fixed seed, `use_deterministic_algorithms`,
-   `CUBLAS_WORKSPACE_CONFIG=:4096:8`, pinned versions recorded in
-   `dlm.lock`. Any code change that breaks the golden determinism test is
-   a breaking change.
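
A minimal sketch of that setup, assuming CUDA (the env var must be set before
the first cuBLAS call; the helper name is illustrative):

```
import os
import random

import numpy as np
import torch

def pin_determinism(seed: int) -> None:
    os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"  # deterministic cuBLAS workspace
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    torch.use_deterministic_algorithms(True)
```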
-
-Full inventory in `.docs/findings.md#9`.
-
-## Contract boundaries (audit F25)
-
-Four load-bearing files; keep them distinct when editing.
-
-- **`manifest.json`** (per-store): running narrative of training runs,
-  exports, content hashes, adapter version. Mutable on every run. Owned
-  by Sprint 04; extended by Sprints 09, 11, 12, 12b. (A hypothetical
-  shape is sketched after this section.)
-- **`dlm.lock`** (per-store): version pins + hardware tier + determinism
-  flags + license acceptance fingerprint. Written once per run; stable.
-  Owned by Sprint 15; extended by Sprint 12b (license) and Sprint 23
-  (world_size + accelerate).
-- **`training_state.pt`** (per-store, per-adapter-version): optimizer,
-  scheduler, scaler, all RNGs, step/epoch. Required for bit-exact resume.
-  Owned by Sprint 09. Two-phase commit with adapter directory.
-- **`exports/<quant>/export_manifest.json`** (per-export): checksums,
-  quant level, pinned llama.cpp tag, smoke output. Owned by Sprint 11;
-  appended via Sprint 12.
-
-And one repo-level file:
-
-- **`dlm.lock`** at the repo root: records which `(torch, transformers,
-  peft, trl, bnb, platform)` tuples have a checked-in determinism golden.
-  Different from the per-store `dlm.lock`. Owned by Sprint 15.
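
A loudly hypothetical sketch of the per-store `manifest.json`; every field
name below is guessed from the description above, not a committed schema:

```
{
  "dlm_id": "3f9c2d1e",
  "adapter_version": 7,
  "runs": [
    {"step": 1200, "seed": 42, "doc_sha256": "<hash>", "finished_at": "2026-04-19T19:40:00Z"}
  ],
  "exports": [
    {"quant": "q4_k_m", "path": "exports/q4_k_m/", "sha256": "<hash>"}
  ]
}
```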
-
-## Development guidelines
-
-- **Commit often, commit small.** Avoid monolithic commits; maximize commits
-  per feature so the history shows a narrative. One commit per distinct
-  change (a file, a config, a fix), not per day's work.
-- **Commit message style**: imperative, terse, one line unless a technical
-  choice requires elaboration. **No coauthorship** on any commit.
-- **Avoid `git add -A`.** Stage specific files by name; that makes it
-  harder to leak secrets or commit unrelated changes.
-- **No shortcuts when a robust approach exists.** If you find yourself
-  writing "the simplest approach is…", stop and ask whether this produces
-  a trainable LLM. If not, reapproach.
-- **Senior AI-engineering discipline.** Write efficient, well-engineered
-  code. Respect the pitfall inventory.
-- **Strict validation, fail fast.** Axolotl's permissive warnings are the
-  anti-pattern. Our Pydantic schemas reject unknown keys, wrong types, and
-  inconsistent combinations at parse time (sketch after this list).
-- **Determinism is a contract.** See above.
-- **Tests before implementation** for anything touching training dynamics,
-  tokenization, or GGUF export. The tiny-model fixture (Sprint 02) makes
-  end-to-end CI feasible; use it.
-- **`mypy --strict` from day one.** Never loosen; fix the type at source.
-- **Per-sprint definition of Done is binary.** A sprint is not Done until
-  every DoD checkbox passes and the sprint file is marked Done.
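
The sketch referenced under "Strict validation": a Pydantic v2 model that
fails fast on unknown keys (the field names are illustrative):

```
from pydantic import BaseModel, ConfigDict

class TrainConfig(BaseModel):
    model_config = ConfigDict(extra="forbid")  # unknown keys are parse errors

    seed: int
    max_steps: int

TrainConfig.model_validate({"seed": 42, "max_steps": 10, "typo_key": 1})  # raises ValidationError
```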
-
-## Workflow inside a sprint
-
-1. Read `.docs/sprints/phase-N/NN-*.md` in full.
-2. Cross-check against `.docs/findings.md` where the sprint references
-   pitfalls or patterns (the sprints do cite sections).
-3. Implement incrementally. Commit per file / per logical unit.
-4. Write tests alongside (or before) the code.
-5. Check every DoD item manually before flipping Status to Done.
-6. Update `.docs/sprints/00-index.md` status column if we maintain one.
-
-## CLI surface by release
-
-**v1.0** (Phase 3 end):
-```
-dlm init <path> [--base <key>] [--template <name>] [--i-accept-license]
-dlm train <path> [--resume|--fresh] [--seed N] [--max-steps N] [--gpus ...]
-                 [--strict-lock|--update-lock|--ignore-lock]
-dlm prompt <path> [query] [--max-tokens N] [--temp F] [--adapter <name,...>]
-dlm export <path> [--quant Q] [--merged [--dequantize]] [--name N] [--no-smoke]
-                  [--adapter-mix name:w,...]
-dlm pack <path> [--out X] [--include-exports] [--include-base
-                [--i-am-the-licensee <url>]]
-dlm unpack <path> [--home DIR] [--force]
-dlm migrate <path> [--dry-run] [--no-backup]
-dlm doctor [--json]
-dlm show <path> [--json]
-```
-
-**v2** (Phases 4–6):
-```
-dlm repl <path>
-dlm train <path> --watch [--repl]
-dlm metrics <path> [--json|--csv]
-dlm metrics watch <path>
-dlm templates list [--refresh]
-```
-
-**v2+** (Phase 7):
-```
-dlm push <path> [--to hf:org/name | --to <url>] [--sign]
-dlm pull <source>
-dlm serve <path> [--public [--i-know-this-is-public]]
-```
-
-## Stage gates
-
-- Stage 4 — **Patched (YELLOW → triaged)**. New Sprint 12b owns F01–F04.
-  17 majors amended inline into existing sprints. 9 minors deferred to
-  first touch of their owning sprints. A re-audit pass is recommended
-  before declaring GREEN and entering Stage 5.
-- Stage 5 — begin Sprint 01 (scaffolding) once Stage 4 is GREEN.
-
-## Context for future sessions
-
-- Always load `.docs/overview.md`, `.docs/findings.md`, and
-  `.docs/sprints/00-index.md` before working on a sprint. Skim the
-  relevant sprint file in full.
-- The user prefers concise, direct engineering discussion. Surface
-  tradeoffs; make recommendations with reasoning.
-- When in doubt about an implementation choice, check findings §10
-  (adoption matrix per reference repo) — it's the opinionated source of
-  truth for "why are we doing it this way, not that way."
-
-
-<claude-mem-context>
-# Memory Context
-
-# [DocumentLanguageModel] recent context, 2026-04-19 7:40pm EDT
-
-Legend: 🎯session 🔴bugfix 🟣feature 🔄refactor ✅change 🔵discovery ⚖️decision
-Format: ID TIME TYPE TITLE
-Fetch details: get_observations([IDs]) | Search: mem-search skill
-
-Stats: 50 obs (20,314t read) | 1,271,868t work | 98% savings
-
-### Apr 18, 2026
-92 5:26p 🔵 armfortas/fortsh Build Produces Widespread Ambiguous USE Import Warnings
-94 5:27p 🔵 fortsh Full Build Succeeds via armfortas — Complete Object Link Map Confirmed
-98 5:29p 🔵 fortsh Smoke Tests Pass — Parameter Expansion and Pipeline Basics Verified
-99 " 🔵 fortsh Test Suite Results — read 94%, var-ops 80% with Identified Failure Clusters
-100 " 🔴 Null Pointer Dereference in afs_compare_char When Empty String Variable Used in Parameter Expansion
-101 5:32p 🔵 Empty-String Parameter Expansion Bug Isolated to Assignment Side-Effect, Not Expansion Engine
-105 5:34p 🔵 V="" Assignment Alone Crashes via execute_ast_node — Bug Is in Assignment Executor, Not Compound Commands
-111 5:36p 🔵 armfortas IR Builder Architecture — FuncBuilder API Surface Mapped
-113 5:38p 🔵 SIGSEGV Confirmed — Dynamic Substring Index on Zero-Length Allocatable Char Crashes
-117 5:39p 🔵 fortsh Crash Site Confirmed in ast_executor.f90 — Dynamic Substring on Zero-Length Allocatable
-120 5:40p 🔴 lower_substring_full — Dynamic Substring Out-of-Bounds GEP Fixed with Safe Clamp
-121 5:42p 🔴 substring fix validated — 8/8 substring tests pass, repro RC=0, fortsh build proceeding without errors
-137 5:55p 🔵 armfortas allocate(scalar_derived) Skips Field Default Initializers
-138 " 🔵 fortsh IFS / read Builtin Architecture Confirmed
-139 5:56p 🔴 armfortas: allocate(scalar_derived) Now Applies Field Default Initializers
-142 5:59p 🔵 fortsh Build Completes with Ambiguous USE Import Warnings in readline Module
-143 6:01p 🔵 fortsh Build Produces Ambiguous USE Import Warnings from Duplicate Module Exports
-145 " 🔵 armfortas trim/adjustl Branch Produces Correct Value but print '(a)' Adds Leading Space
-146 6:03p 🔵 armfortas print '(a)' Emits Carriage-Control Space — Confirmed by od Byte Dump
-147 " 🟣 Regression Test Added: allocatable_shell_default_ifs_follows_trim_branch
-148 " 🔵 fortsh Builtin Test Results: read 100%, arithmetic 100%, variable_ops 85%, arrays 0% on literal init
-149 " 🔵 fortsh Array Literal Init Bug — Bounds Check Failure: index 1 outside [1, 0]
-153 6:06p 🔵 Array Section Argument Descriptor Bug — values(1:count) Passed as Assumed-Shape Gets upper=0
-155 6:14p 🔵 armfortas Emits Duplicate Ambiguous-USE Warnings Per Translation Unit
-156 " 🔵 fortsh Makefile Has Full Native armfortas Profile
-157 " 🟣 armfortas Rust Test Suite — Array Section Bounds Test Passing
-158 " 🔵 fortsh Binary Previously Built by armfortas — Incremental Rebuild in Progress
-159 " 🔵 fortsh test_variables_simple Uses Pooled String API — Not Standard Fortran Variables
-160 6:15p ✅ armfortas Array Section Fix Staged for Commit — lower.rs and cli_driver.rs
-161 " 🔴 armfortas Commit 4ec3e9a — Lower Array Section Descriptor Actuals
-162 6:16p 🔵 fortsh Incremental Rebuild with armfortas Completed Successfully
-163 6:17p 🔵 armfortas Working Tree — Active afs-as/afs-ld Changes Plus Repro Test Artifacts
-177 6:24p 🔵 fortsh Build — Mass Ambiguous USE Import Warnings from armfortas
-179 6:27p 🔵 armfortas Peak RSS ~99 MB Compiling fortsh lexer.f90
-182 6:30p 🟣 fortsh Binary Successfully Built with armfortas — /tmp/fortsh_armf_arrayfix/bin/fortsh
-183 6:31p 🔵 fortsh 1.7.0 Binary Verified Functional — Basic Array and Pipeline Semantics Correct
-184 " 🔵 Array Test Suite Baseline — 17/31 Pass (54%), 14 Failures Cataloged
-192 6:34p 🔵 Test Harness Uses Bash 3.2 as Reference — Assoc Array Failures Are Baseline Artifacts
-193 " 🔵 Three armfortas-Specific Array Regressions Confirmed Against flang-ref Baseline
-220 6:50p 🔵 armfortas Unset Module Variable — Parity with flang-new Confirmed
-222 " 🔵 fortsh Array Unset Bugs — Two Distinct Failures in armfortas Build vs Correct flang Reference
-223 6:53p 🔵 fortsh Null-Assignment Hole (`arr[1]=`) Produces Correct Sparse Indices in armfortas Build
-224 " 🔵 AST Executor Does Not Dispatch `unset` as Builtin — "command not found" at Runtime
-228 " 🔵 armfortas `builtin_unset` Direct Call Crashes — "Bounds check failed: index 1026 outside [1, 1025]"
-231 6:54p 🔵 armfortas `unset foo[N]` on Non-Existent Variable Crashes — Bug Not Array-Existence-Dependent
-232 " 🔵 fortsh `execute_simple_command` Builtin Dispatch — Routes Through `execute_pipeline`, Not Direct Call
-234 6:56p 🔵 armfortas `unset` Bug Scope Narrowed — Scalar `unset` Works, Array-Index Form Always Crashes
-240 7:39p 🔵 armfortas expand_out Token Output Shows Null-Byte Corruption
-241 " 🔵 flang-new Cannot Compile expand_out Repro — `fill` Not Found in Module `m`
-242 7:40p 🔵 armfortas .amod Exports `fill` With Fixed-Length Allocatable Character(len=32) Intent(out)
-
-Access 1272k tokens of past work via get_observations([IDs]) or mem-search skill.
-</claude-mem-context>