
afs-ld

Bespoke ARM64 Mach-O linker for Apple Silicon, written in Rust, stdlib only.

Why

armfortas already owns the compiler (armfortas) and the assembler (afs-as). Every binary the toolchain produces today is still shaped by Apple's ld — which puts the same class of opaque, untouchable bugs back in our path that motivated abandoning LLVM in the first place. afs-ld closes the loop. We own every byte from .f90 source to the final Mach-O executable on disk.

This is not a toy or educational linker. The target is production parity with Apple ld for:

  • Everything armfortas produces today: arm64 PIE executables statically linking libarmfortas_rt.a and dynamically linking libSystem.
  • The fortsh milestone: ~57 KLoC Fortran 2018, 55 modules, iso_c_binding, allocatable strings, derived types.
  • Deterministic output: -no_uuid parity, reproducible byte layout across invocations.
  • All ARM64 Mach-O relocation types, static archives (.a), binary dylibs, and TAPI TBD v4 text stubs (libSystem ships as .tbd on modern SDKs).
  • Both classic LC_DYLD_INFO opcodes and modern LC_DYLD_CHAINED_FIXUPS.
  • Dylib output (-dylib) as a first-class feature, not an afterthought.
  • Ad-hoc code signing — macOS 11+ will not execute an unsigned arm64 binary, even if every other byte is perfect.

Non-goals (current)

  • ELF, COFF, PE, or any non-Mach-O format.
  • x86_64, arm64_32, armv7, or any architecture other than arm64.
  • Bitcode/LTO. armfortas emits assembly, not bitcode; LTO is not part of the armfortas pipeline.
  • ObjC / Swift metadata merging. No armfortas code emits it. Hooks exist for later.
  • Cross-compilation. We target the host Mac; no -sdk_version time-travel.

These are non-goals now; afs-ld is built to grow into them without architectural retrofit.

What afs-as hands us

afs-as emits MH_OBJECT only (hand-rolled, no external Mach-O crate). Its output defines the contract afs-ld reads:

  • Load commands: LC_SEGMENT_64, LC_BUILD_VERSION (PLATFORM_MACOS), optional LC_LINKER_OPTIMIZATION_HINT, LC_SYMTAB, LC_DYSYMTAB.
  • Section kinds: __TEXT,__text, __TEXT,__cstring, __TEXT,__literal16, __TEXT,__const, __TEXT,__compact_unwind, __TEXT,__eh_frame, __DATA,__data, __DATA,__bss, __DATA,__thread_data, __DATA,__thread_bss, __DATA,__thread_vars.
  • Symbol flags: N_UNDF/N_SECT/N_ABS with N_EXT, N_PEXT, N_WEAK_REF, N_WEAK_DEF, N_NO_DEAD_STRIP. Common symbols live in N_UNDF with n_desc-encoded alignment.
  • Relocation types: ARM64_RELOC_UNSIGNED, SUBTRACTOR, BRANCH26, PAGE21, PAGEOFF12, GOT_LOAD_PAGE21, GOT_LOAD_PAGEOFF12, POINTER_TO_GOT, TLVP_LOAD_PAGE21, TLVP_LOAD_PAGEOFF12, ADDEND (paired prefix).
  • Flags: MH_SUBSECTIONS_VIA_SYMBOLS always set — atomization model is in play.
  • LOH hints: AdrpAdd, AdrpLdr, AdrpLdrGot, AdrpLdrGotLdr — afs-ld preserves them from Sprint 0 and relaxes them in Sprint 25.

afs-as exposes no Mach-O reader. afs-ld ships its own.
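
As an illustration of what that reader has to decode, here is a minimal sketch of parsing one 8-byte `relocation_info` entry from an `MH_OBJECT` section. The bit layout and the `ARM64_RELOC_*` numbering follow `<mach-o/reloc.h>` and `<mach-o/arm64/reloc.h>`; the names `Arm64Reloc`, `RelocInfo`, and `parse_reloc` are hypothetical and not taken from the afs-ld source.

```rust
/// ARM64 relocation kinds, numbered as in <mach-o/arm64/reloc.h>.
/// (Type name is illustrative, not afs-ld's actual API.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Arm64Reloc {
    Unsigned = 0,
    Subtractor = 1,
    Branch26 = 2,
    Page21 = 3,
    Pageoff12 = 4,
    GotLoadPage21 = 5,
    GotLoadPageoff12 = 6,
    PointerToGot = 7,
    TlvpLoadPage21 = 8,
    TlvpLoadPageoff12 = 9,
    Addend = 10,
}

impl Arm64Reloc {
    /// Map the raw 4-bit r_type field to a known kind. The catch-all here
    /// is on a raw integer, not on the enum, so unknown values are rejected.
    pub fn from_raw(raw: u8) -> Option<Self> {
        Some(match raw {
            0 => Self::Unsigned,
            1 => Self::Subtractor,
            2 => Self::Branch26,
            3 => Self::Page21,
            4 => Self::Pageoff12,
            5 => Self::GotLoadPage21,
            6 => Self::GotLoadPageoff12,
            7 => Self::PointerToGot,
            8 => Self::TlvpLoadPage21,
            9 => Self::TlvpLoadPageoff12,
            10 => Self::Addend,
            _ => return None,
        })
    }
}

/// One decoded (non-scattered) relocation entry.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct RelocInfo {
    pub address: u32,    // r_address: offset within the section
    pub symbolnum: u32,  // r_symbolnum: low 24 bits of the packed word
    pub pcrel: bool,     // r_pcrel: bit 24
    pub length: u8,      // r_length: bits 25-26, log2 of the patched width
    pub is_extern: bool, // r_extern: bit 27
    pub kind: Arm64Reloc, // r_type: bits 28-31
}

/// Decode a little-endian relocation_info; None if r_type is unknown.
pub fn parse_reloc(bytes: [u8; 8]) -> Option<RelocInfo> {
    let address = u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]);
    let packed = u32::from_le_bytes([bytes[4], bytes[5], bytes[6], bytes[7]]);
    Some(RelocInfo {
        address,
        symbolnum: packed & 0x00ff_ffff,
        pcrel: (packed >> 24) & 1 == 1,
        length: ((packed >> 25) & 0b11) as u8,
        is_extern: (packed >> 27) & 1 == 1,
        kind: Arm64Reloc::from_raw((packed >> 28) as u8)?,
    })
}
```

Returning `Option` keeps unknown relocation types out of the rest of the pipeline, which is one way to make later `match` statements on the relocation kind exhaustive without a wildcard arm.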

Current driver contract

armfortas/src/driver/mod.rs:497-565 shells out to ld:

ld <obj1> <obj2> ... <libarmfortas_rt.a> \
   -lSystem -no_uuid -syslibroot <SDK> -e _main -o <output>

Inputs are .o from afs-as plus libarmfortas_rt.a from the runtime/ crate. Output is an arm64 PIE executable with entry _main (a synthetic wrapper at src/driver/mod.rs:371-392 calling _afs_program_init → user PROGRAM → _afs_program_finalize). afs-ld drops into this contract unchanged; the driver swap (Sprint 20) is initially gated behind AFS_LD=1.
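
A gated swap of this kind could look roughly like the sketch below. It is not the code in `armfortas/src/driver/mod.rs`; the function names are hypothetical, and treating only the exact value `1` as opt-in is an assumption.

```rust
use std::process::Command;

/// Pick the linker executable from the value of the AFS_LD environment
/// variable. Assumption: only the exact value "1" opts in to afs-ld.
pub fn linker_program(afs_ld_env: Option<&str>) -> &'static str {
    match afs_ld_env {
        Some("1") => "afs-ld",
        Some(_) | None => "ld",
    }
}

/// Build the argument list in the same order as the current driver
/// contract: objects, runtime archive, then the fixed flags.
pub fn link_args(
    objects: &[&str],
    runtime_archive: &str,
    sdk: &str,
    output: &str,
) -> Vec<String> {
    let mut args: Vec<String> = objects.iter().map(|o| o.to_string()).collect();
    args.push(runtime_archive.to_string());
    for a in ["-lSystem", "-no_uuid", "-syslibroot", sdk, "-e", "_main", "-o", output] {
        args.push(a.to_string());
    }
    args
}

/// Assemble the process invocation; the caller decides when to spawn it.
pub fn link_command(
    objects: &[&str],
    runtime_archive: &str,
    sdk: &str,
    output: &str,
) -> Command {
    let env = std::env::var("AFS_LD").ok();
    let mut cmd = Command::new(linker_program(env.as_deref()));
    cmd.args(link_args(objects, runtime_archive, sdk, output));
    cmd
}
```

Because both linkers receive an identical argument vector, the same inputs can be linked twice and compared, which is what the differential harness relies on.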

Reference material

Already in parent .refs/:

  • .refs/llvm/lld/MachO/ (~21 KLoC C++) — primary architectural reference. Most relevant files: Driver.cpp (pipeline), InputFiles.cpp (object/archive/dylib parsing), SymbolTable.cpp (resolution), SyntheticSections.cpp (GOT/stubs/binding), Arch/ARM64.cpp (reloc math), Writer.cpp (layout).
  • .refs/llvm/lld/docs/MachO/index.rst — design notes comparing lld and ld64.

Cloned in Sprint 0:

  • .refs/ld64/ — Apple's open-source ld64 (GitHub mirror of last publicly released tarball). Authoritative for byte-level parity edge cases when our diff harness disagrees with Apple's ld.
  • .refs/mold/ — Rui Ueyama's mold including its Darwin port. Leaner second opinion and a source of performance ideas.

Spec-level:

  • Apple <mach-o/loader.h>, <mach-o/nlist.h>, <mach-o/reloc.h>, <mach-o/arm64/reloc.h> — mirrored numerically in our macho/constants.rs.
  • dyld open source — bind/rebase/lazy-bind opcode semantics and chained-fixups format.
  • ARM Architecture Reference Manual (ARMv8-A) — encoding of relocated instructions (ADRP, ADD, LDR, B/BL).

Repo layout

afs-ld is a Cargo workspace member of armfortas and a Git submodule at armfortas/afs-ld pointing at git@github.com:FortranGoingOnForty/afs-ld.git, mirroring how afs-as is organized.

afs-ld/
├── Cargo.toml                 # no deps outside std
├── CLAUDE.md                  # mirrors afs-as/CLAUDE.md, tailored to linker
├── README.md
├── .docs/
│   ├── overview.md            # this file
│   └── sprints/               # 32 sprint files
├── src/
│   ├── lib.rs                 # re-export Linker, LinkOptions, OutputKind
│   ├── main.rs                # afs-ld binary
│   ├── args.rs                # CLI parsing (hand-rolled, no clap)
│   ├── macho/
│   │   ├── mod.rs
│   │   ├── constants.rs       # LC_*, MH_*, S_*, N_*, ARM64_RELOC_*
│   │   ├── reader.rs          # parse MH_OBJECT
│   │   ├── writer.rs          # emit MH_EXECUTE or MH_DYLIB
│   │   ├── dylib.rs           # parse MH_DYLIB
│   │   └── tbd.rs             # parse TAPI TBD v4
│   ├── archive.rs             # ar/ranlib archive reader
│   ├── input.rs               # InputFile enum, lazy member fetch
│   ├── symbol.rs              # Symbol kinds, SymbolTable
│   ├── resolve.rs             # name resolution pass
│   ├── atom.rs                # subsections-via-symbols atom model
│   ├── section.rs             # InputSection, OutputSection, OutputSegment
│   ├── layout.rs              # VM addr + file offset assignment
│   ├── reloc/
│   │   ├── mod.rs
│   │   ├── arm64.rs           # ARM64_RELOC_* application
│   │   └── loh.rs             # LOH preservation / relaxation
│   ├── synth/                 # synthetic sections
│   │   ├── mod.rs
│   │   ├── got.rs
│   │   ├── stubs.rs
│   │   ├── tlv.rs
│   │   ├── symtab.rs
│   │   ├── dyld_info.rs       # classic rebase/bind/lazy/weak + export trie
│   │   ├── chained.rs         # LC_DYLD_CHAINED_FIXUPS
│   │   ├── unwind.rs
│   │   ├── eh_frame.rs
│   │   ├── func_starts.rs
│   │   ├── data_in_code.rs
│   │   └── code_sig.rs        # ad-hoc SHA-256 code signature
│   ├── map.rs                 # -map text link map
│   ├── why_live.rs            # -why_live dead-strip reasons
│   ├── gc.rs                  # -dead_strip
│   ├── icf.rs                 # -icf=safe
│   ├── driver.rs              # orchestrator
│   └── diag.rs                # diagnostics, path/line/col parity with afs-as
└── tests/
    ├── common/harness.rs      # spawn afs-ld, diff output vs system ld
    ├── reader_*.rs            # round-trip object reads
    ├── reloc_*.rs             # golden-file reloc application
    ├── resolve_*.rs           # symbol resolution matrices
    ├── hello_world.rs         # first end-to-end executable link
    ├── hello_library.rs       # first end-to-end dylib link
    ├── armfortas_integration.rs
    └── corpus/                # hand-curated .o / .a / .dylib / .tbd fixtures

Architecture pipeline

args → inputs → resolve → atomize → layout → apply relocs → synth sections → write → sign
  1. args: hand-rolled parser for the ld-compatible CLI surface. No clap.
  2. inputs: demultiplex .o, .a, .dylib, .tbd; lazy archive-member fetching.
  3. resolve: symbol-table fixed-point loop; weak/common/alias coalescing; archive-driven pulls.
  4. atomize: split input sections at symbol boundaries per MH_SUBSECTIONS_VIA_SYMBOLS.
  5. layout: assign VM addrs (__PAGEZERO/__TEXT/__DATA_CONST/__DATA/__LINKEDIT for executables; no __PAGEZERO for dylibs) and file offsets.
  6. apply relocs: ARM64_RELOC_* patching; GOT/stubs/lazy-pointer emission; LOH honoring.
  7. synth sections: __LINKEDIT payload — symbol table, string table, LC_DYLD_INFO and/or chained fixups, function starts, data-in-code, compact unwind, eh_frame passthrough.
  8. write: Mach-O header + load commands + segment data; -no_uuid deterministic.
  9. sign: ad-hoc SHA-256 page hashes in LC_CODE_SIGNATURE so the binary runs on bare arm64.
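
To make step 6 concrete, the following sketch patches the two most common instruction relocations: `ARM64_RELOC_BRANCH26` (B/BL) and `ARM64_RELOC_PAGE21` (ADRP). The bit positions come from the A64 instruction encodings; the function names and the choice to return `None` on out-of-range targets are illustrative assumptions, not afs-ld's actual interface. A real linker would route out-of-range branches through thunks instead of failing.

```rust
/// Patch a B or BL instruction so it branches from `pc` to `target`.
/// The 26-bit immediate holds a signed word offset, giving +/-128 MiB.
pub fn patch_branch26(insn: u32, pc: u64, target: u64) -> Option<u32> {
    let delta = (target as i64).wrapping_sub(pc as i64);
    // Branch targets must be 4-byte aligned and within range.
    if delta % 4 != 0 || !(-(1 << 27)..(1 << 27)).contains(&delta) {
        return None;
    }
    let imm26 = ((delta >> 2) as u32) & 0x03ff_ffff;
    // Keep the opcode bits (31:26), replace the immediate (25:0).
    Some((insn & 0xfc00_0000) | imm26)
}

/// Patch an ADRP instruction so it materializes the 4 KiB page of `target`.
/// The 21-bit signed page delta is split into immlo (bits 30:29) and
/// immhi (bits 23:5), giving +/-4 GiB of reach.
pub fn patch_page21(insn: u32, pc: u64, target: u64) -> Option<u32> {
    let page_delta =
        ((target & !0xfff) as i64).wrapping_sub((pc & !0xfff) as i64) >> 12;
    if !(-(1 << 20)..(1 << 20)).contains(&page_delta) {
        return None;
    }
    let imm = (page_delta as u32) & 0x1f_ffff;
    let immlo = imm & 0b11;
    let immhi = imm >> 2;
    // Keep op (31), the fixed opcode bits (28:24), and Rd (4:0).
    Some((insn & 0x9f00_001f) | (immlo << 29) | (immhi << 5))
}
```

For example, patching `bl` (`0x94000000`) at `0x1000` to reach `0x1010` yields `0x94000004`, and patching `adrp x0` (`0x90000000`) across three pages yields `0xf0000000`.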

Coding conventions

  • Rust std only. No clap, no serde, no byteorder, no object, no goblin. Hand-roll parsers, serializers, and the tiny YAML subset we need for TBD.
  • unsafe only where genuinely required. Keep blocks small and commented.
  • Exhaustive pattern matching on Section, Symbol, Relocation, InputFile, Fixup — no catch-all _ arms outside tests.
  • Diagnostics: path, offset, caret under source — mirror afs-as/src/diag*.rs.
  • Determinism: no timestamps in output, sorted iteration, stable hashing.
  • Commit discipline: terse imperative, no co-authors, per-file/per-chunk commits, never monoliths. Matches armfortas and afs-as house rules.
  • No borrowed constants across crates. afs-ld duplicates MH_*, LC_*, S_*, N_*, ARM64_RELOC_* in macho/constants.rs rather than depending on afs-as at a type level. Each submodule stays independent.
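
The "hand-roll everything" rule is less onerous than it sounds for the CLI. The sketch below parses a small subset of the `ld`-style flags that appear in the driver contract using only `std`. `LinkOptions` and `OutputKind` are names from the repo layout, but their fields and variants here are guesses, and `parse_args` is a hypothetical function.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OutputKind {
    Executable,
    Dylib,
}

/// Illustrative option set; the real struct may differ.
#[derive(Debug, PartialEq, Eq)]
pub struct LinkOptions {
    pub kind: OutputKind,
    pub output: String,
    pub entry: String,
    pub no_uuid: bool,
    pub syslibroot: Option<String>,
    pub libs: Vec<String>,
    pub inputs: Vec<String>,
}

/// Parse ld-style arguments (without argv[0]) into LinkOptions.
pub fn parse_args<I: IntoIterator<Item = String>>(argv: I) -> Result<LinkOptions, String> {
    let mut opts = LinkOptions {
        kind: OutputKind::Executable,
        output: "a.out".to_string(),
        entry: "_main".to_string(),
        no_uuid: false,
        syslibroot: None,
        libs: Vec::new(),
        inputs: Vec::new(),
    };
    let mut it = argv.into_iter();
    while let Some(arg) = it.next() {
        match arg.as_str() {
            "-dylib" => opts.kind = OutputKind::Dylib,
            "-no_uuid" => opts.no_uuid = true,
            "-o" => opts.output = it.next().ok_or("-o requires a path")?,
            "-e" => opts.entry = it.next().ok_or("-e requires a symbol name")?,
            "-syslibroot" => {
                opts.syslibroot = Some(it.next().ok_or("-syslibroot requires a path")?)
            }
            // -lfoo: record the library name without the prefix.
            lib if lib.starts_with("-l") => opts.libs.push(lib[2..].to_string()),
            // Reject anything else that looks like a flag instead of guessing.
            flag if flag.starts_with('-') => {
                return Err(format!("unknown option: {}", flag))
            }
            // Everything else is an input file (.o, .a, .dylib, .tbd).
            path => opts.inputs.push(path.to_string()),
        }
    }
    Ok(opts)
}
```

Rejecting unknown flags with an error, and not silently ignoring them, keeps the driver swap honest: any option the driver passes that the linker does not yet understand surfaces immediately.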

Testing strategy

  • Unit: every parser and encoder has a round-trip test — parse a fixture, re-emit, compare bytes.
  • Corpus: tests/corpus/ collects .o, .a, .dylib, .tbd fixtures. Every new relocation type or section kind lands a corpus entry in the same sprint that implements it.
  • Differential (from Sprint 1): tests/common/harness.rs links the same inputs through ld and afs-ld, then diffs load commands, symbol tables, bind/rebase or chained-fixup streams, and disassembly. A tolerated-diff allowlist covers UUIDs, timestamps, and hash-backed temp names. The harness is a CI gate from Sprint 1 onward.
  • End-to-end (from Sprint 18): hello-world executable must run; (Sprint 18.5) hello-library dylib must dlopen. (Sprint 21) the full armfortas integration suite must pass.
  • fortsh link (Sprint 29): explicit milestone. A fortsh binary linked by afs-ld must behave identically to one linked by system ld.
  • Audits: post-Sprint 18 (hello), 18.5 (dylib), 22 (first signed & running on bare arm64), 27 (parity gate), 29 (fortsh), 31 (final). Brutal honesty rules from armfortas/CLAUDE.md apply.
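
The unit-level round-trip rule can be sketched on the smallest interesting structure, a 16-byte `nlist_64` symbol-table entry. The field layout follows `<mach-o/nlist.h>`, and `comm_align_log2` mirrors its `GET_COMM_ALIGN` macro for the `n_desc`-encoded common-symbol alignment mentioned earlier. The type and method names are hypothetical.

```rust
/// One little-endian nlist_64 entry as stored in an object's symbol table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Nlist64 {
    pub n_strx: u32,  // offset into the string table
    pub n_type: u8,   // N_UNDF / N_SECT / N_ABS plus N_EXT, N_PEXT bits
    pub n_sect: u8,   // 1-based section ordinal, 0 for none
    pub n_desc: u16,  // N_WEAK_REF, N_WEAK_DEF, N_NO_DEAD_STRIP, alignment
    pub n_value: u64, // address, or size for a common symbol
}

impl Nlist64 {
    /// Decode an entry from its on-disk bytes.
    pub fn parse(b: &[u8; 16]) -> Self {
        let mut value = [0u8; 8];
        value.copy_from_slice(&b[8..16]);
        Self {
            n_strx: u32::from_le_bytes([b[0], b[1], b[2], b[3]]),
            n_type: b[4],
            n_sect: b[5],
            n_desc: u16::from_le_bytes([b[6], b[7]]),
            n_value: u64::from_le_bytes(value),
        }
    }

    /// Re-encode the entry; parse followed by emit must be the identity.
    pub fn emit(&self) -> [u8; 16] {
        let mut out = [0u8; 16];
        out[0..4].copy_from_slice(&self.n_strx.to_le_bytes());
        out[4] = self.n_type;
        out[5] = self.n_sect;
        out[6..8].copy_from_slice(&self.n_desc.to_le_bytes());
        out[8..16].copy_from_slice(&self.n_value.to_le_bytes());
        out
    }

    /// log2 alignment of a common symbol, stored in bits 8-11 of n_desc.
    pub fn comm_align_log2(&self) -> u8 {
        ((self.n_desc >> 8) & 0x0f) as u8
    }
}
```

A test in this style takes fixture bytes, asserts `Nlist64::parse(&bytes).emit() == bytes`, and then checks individual fields. Every bit of the structure has to be accounted for on both the read and write paths for that to pass.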

Sprint roadmap (summary)

See .docs/sprints/index.md for the full list. Eleven phases (0–10) and 32 numbered sprints (0–31), plus half-sprints 15.5 and 18.5:

  • Phase 0 — Scaffolding: Sprint 0.
  • Phase 1 — Mach-O reading: Sprints 1–3 (header/load commands, sections/symbols, relocations).
  • Phase 2 — Archives & dylibs: Sprints 4–6 (ar, binary dylib, TBD).
  • Phase 3 — Symbol resolution: Sprints 7–9 (model, resolution pass, atomization).
  • Phase 4 — Output construction: Sprints 10–14 (layout, reloc application, GOT/stubs, TLV, symtab/strtab). MH_EXECUTE and MH_DYLIB both first-class.
  • Phase 5 — Dyld metadata: Sprints 15 (classic LC_DYLD_INFO), 15.5 (chained fixups), 16 (function starts/data-in-code), 17 (unwind info).
  • Phase 6 — End-to-end: Sprint 18 (hello-world executable), 18.5 (hello-library dylib).
  • Phase 7 — CLI & driver: Sprints 19 (CLI + -map/-why_live diagnostics), 20 (driver swap).
  • Phase 8 — Runtime compatibility: Sprints 21 (runtime archive + integration tests), 22 (ad-hoc code signature).
  • Phase 9 — Advanced: Sprints 23 (-dead_strip), 24 (-icf=safe), 25 (LOH relaxation), 26 (thunks).
  • Phase 10 — Hardening: Sprints 27 (differential gate), 28 (performance), 29 (fortsh link audit), 30 (polish), 31 (final audit).

Scope decisions (confirmed)

  • Dylib output is in scope from Phase 4; the writer is dylib-aware from Sprint 10. Dylib milestone at Sprint 18.5.
  • Both classic LC_DYLD_INFO (Sprint 15) and chained fixups (Sprint 15.5) are in scope. Chained fixups become the default on macOS 12+ once the Sprint 27 parity gate passes.
  • .refs/ gains ld64 and mold alongside lld. lld is the architectural reference, ld64 is authoritative for Apple-parity edge cases, mold informs performance.
  • -map and -why_live land in Sprint 19 with the core CLI. They are the debugging surface during driver adoption, not a polish item.