# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Repository Context

`afs-ld` is a **git submodule** of [ARMFORTAS](https://github.com/FortranGoingOnForty/armfortas), a bespoke ARM64 Fortran compiler. It is the standalone ARM64 Mach-O linker: it reads Mach-O relocatable objects (MH_OBJECT) produced by `afs-as`, static archives, binary dylibs, and TAPI TBD text stubs, and emits linked Mach-O executables (MH_EXECUTE) and shared libraries (MH_DYLIB). It knows nothing about Fortran; the boundary with the compiler is the CLI (an `ld`-compatible flag surface).

The parent `armfortas/CLAUDE.md` describes the broader compiler philosophy (bespoke, no LLVM, no parser generators, no compiler-infrastructure crates) and applies here too. **Rust standard library only**: no `clap`, no `serde`, no `byteorder`, no `object`, no `goblin`, no `memmap2`, no YAML crate. Hand-roll parsers, serializers, and the tiny subset of YAML we need for TBD files.
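As a rough illustration of what "hand-roll parsers" means without `byteorder`, the sketch below reads little-endian fields from a 64-bit Mach-O header using only `u32::from_le_bytes`. `ByteReader` and `read_filetype` are hypothetical names, not taken from the afs-ld source; the only facts assumed are the `MH_MAGIC_64` value and the field order of `mach_header_64` from `<mach-o/loader.h>`.

```rust
/// Hypothetical helper (not from afs-ld): a bounds-checked little-endian reader.
pub struct ByteReader<'a> {
    data: &'a [u8],
    pos: usize,
}

impl<'a> ByteReader<'a> {
    pub fn new(data: &'a [u8]) -> Self {
        ByteReader { data, pos: 0 }
    }

    /// Reads one little-endian u32. On failure, returns the byte offset
    /// at which the input ran out, so the caller can cite it in a diagnostic.
    pub fn read_u32_le(&mut self) -> Result<u32, usize> {
        let end = self.pos.checked_add(4).ok_or(self.pos)?;
        let bytes = self.data.get(self.pos..end).ok_or(self.pos)?;
        self.pos = end;
        Ok(u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]))
    }
}

/// Magic number of a 64-bit Mach-O file, as defined in <mach-o/loader.h>.
pub const MH_MAGIC_64: u32 = 0xfeed_facf;

/// Returns the `filetype` field (the fourth u32) of a `mach_header_64`.
pub fn read_filetype(data: &[u8]) -> Result<u32, String> {
    let truncated = |off: usize| format!("truncated header at offset {off}");
    let mut r = ByteReader::new(data);

    let magic = r.read_u32_le().map_err(truncated)?;
    if magic != MH_MAGIC_64 {
        return Err(format!("bad magic {magic:#x} at offset 0"));
    }
    let _cputype = r.read_u32_le().map_err(truncated)?;
    let _cpusubtype = r.read_u32_le().map_err(truncated)?;
    r.read_u32_le().map_err(truncated)
}
```

Returning the failing offset from the reader keeps the error path compatible with the "input + offset" diagnostic rule described later in this file.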
## Build, Test, Lint

```bash
cargo build -p afs-ld                # build linker crate
cargo test -p afs-ld                 # run all afs-ld tests
cargo clippy -p afs-ld --all-targets -- -D warnings

cargo test --lib -p afs-ld           # unit tests only (in src/)
cargo test --test <name> -p afs-ld   # one integration test file
cargo test --test parity_matrix      # vs Apple `ld` across the corpus
cargo test --test hello_world        # executable end-to-end
cargo test --test hello_library      # dylib end-to-end
cargo test -p afs-ld -- <substring>  # filter by test name
```
Integration tests shell out to Apple `ld`, `otool`, `nm`, `codesign`, and `xcrun`. They require **macOS on Apple Silicon** and a working Xcode command-line toolchain. Do not stub these out: the differential against the system linker is the entire point of the parity matrix.
## Architecture

Pipeline, end to end:

```
args    → inputs   → resolve    → atomize → layout    → apply relocs   → synth sections → write  → sign
│         │          │            │         │           │                │                │        │
args.rs   input.rs   resolve.rs   atom.rs   layout.rs   reloc/arm64.rs   synth/*.rs       macho/   synth/
                     symbol.rs                                                            writer   code_sig
```
### Module responsibilities

- **`src/args.rs`**: CLI parser. Hand-rolled, no `clap`. Recognizes the `ld`-compatible flag surface (Sprint 19 ships the full set).
- **`src/macho/`**: Mach-O 64 read/write. `constants.rs` holds the numeric literals (`LC_*`, `MH_*`, `S_*`, `N_*`, `ARM64_RELOC_*`), duplicated from afs-as rather than cross-crate coupled, keeping each submodule independent. `reader.rs` parses MH_OBJECT; `writer.rs` emits MH_EXECUTE and MH_DYLIB; `dylib.rs` parses binary MH_DYLIB; `tbd.rs` parses TAPI TBD v4 text stubs (minimal YAML subset, not a general parser).
- **`src/archive.rs`**: BSD + SysV + GNU-thin static archives. Lazy member fetch.
- **`src/input.rs`**: `InputFile` enum unifying objects, archives, dylibs, TBDs.
- **`src/symbol.rs`** / **`src/resolve.rs`**: `Symbol` sum type and the name resolution pass. Archive-driven fixed-point loop; weak/common/alias coalescing; diagnostics with did-you-mean.
- **`src/atom.rs`**: subsections-via-symbols atomization. Atoms are the unit of dead-stripping, ICF, and output layout.
- **`src/section.rs`** / **`src/layout.rs`**: output segment plan and VM/file-offset assignment. `MH_EXECUTE` and `MH_DYLIB` are both first-class.
- **`src/reloc/`**: ARM64 reloc application (`arm64.rs`) and LOH relaxation (`loh.rs`). Handles BRANCH26, PAGE21/PAGEOFF12, GOT_LOAD_*, POINTER_TO_GOT, TLVP_LOAD_*, UNSIGNED, SUBTRACTOR, ADDEND.
- **`src/synth/`**: synthetic sections: `got`, `stubs`, `tlv`, `symtab`, `dyld_info` (classic), `chained` (LC_DYLD_CHAINED_FIXUPS), `unwind`, `eh_frame`, `func_starts`, `data_in_code`, `code_sig` (ad-hoc SHA-256).
- **`src/gc.rs`** / **`src/icf.rs`**: `-dead_strip` and `-icf=safe` passes.
- **`src/map.rs`** / **`src/why_live.rs`**: `-map` link map and `-why_live` dead-strip reasoning.
- **`src/driver.rs`**: orchestrator: args → inputs → resolve → atomize → layout → apply relocs → synth → write → sign.
- **`src/diag.rs`**: diagnostics. Path + byte offset + caret, matching `afs-as/src/diag*.rs` style. Deterministic output: no wall clock, no pid, no thread-id.
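To make the `src/reloc/` entry above concrete, here is a minimal sketch of how two of the listed relocation kinds patch an instruction word, based on the A64 encodings of `B`/`BL` (imm26) and `ADRP` (immlo/immhi). The function names and error strings are hypothetical and not taken from afs-ld; addends, branch-range thunks, and diagnostics routing are deliberately left out.

```rust
/// BRANCH26: patch the 26-bit word offset of a B/BL instruction.
/// `place` is the address of the instruction, `target` the resolved symbol address.
pub fn apply_branch26(insn: u32, place: u64, target: u64) -> Result<u32, String> {
    const LIMIT: i64 = 1 << 27; // signed 26-bit immediate, scaled by 4: +/-128 MiB
    let delta = target.wrapping_sub(place) as i64;
    if delta & 3 != 0 {
        return Err(format!("branch target {target:#x} is not 4-byte aligned"));
    }
    if delta < -LIMIT || delta >= LIMIT {
        return Err(format!("branch out of range: delta {delta:#x}"));
    }
    let imm26 = ((delta >> 2) as u32) & 0x03ff_ffff;
    // Keep the opcode bits [31:26], replace the immediate.
    Ok((insn & 0xfc00_0000) | imm26)
}

/// PAGE21: patch the 21-bit page delta of an ADRP instruction.
pub fn apply_page21(insn: u32, place: u64, target: u64) -> Result<u32, String> {
    const PAGE_LIMIT: i64 = 1 << 20; // signed 21-bit count of 4 KiB pages
    let pages = ((target & !0xfff) as i64).wrapping_sub((place & !0xfff) as i64) >> 12;
    if pages < -PAGE_LIMIT || pages >= PAGE_LIMIT {
        return Err(format!("ADRP out of range: {pages} pages"));
    }
    let imm = (pages as u32) & 0x001f_ffff;
    let immlo = imm & 0x3; // goes to bits [30:29]
    let immhi = imm >> 2; // goes to bits [23:5]
    // Keep op (bit 31), the fixed bits [28:24], and Rd (bits [4:0]).
    Ok((insn & 0x9f00_001f) | (immlo << 29) | (immhi << 5))
}
```

Returning an error instead of silently truncating an out-of-range immediate matches the "no stubs pass silently" rule: a wrapped branch offset would link cleanly and jump to the wrong place.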
## Coding Conventions

- **Rust std only.** Any external dependency needs an explicit debate and a CLAUDE.md update.
- **`unsafe` only where genuinely required.** Keep blocks small and commented. The one known case is `libc::mmap` for large input files (Sprint 28).
- **Exhaustive pattern matching** on `Section`, `Symbol`, `Relocation`, `InputFile`, `Fixup`, `LoadCommand`: no catch-all `_` arms outside tests.
- **Determinism**: no timestamps in output, sorted iteration order, stable hashing, parallelism preserves byte-identical output.
- **Commit discipline**: terse imperative messages, no co-authors, per-file / per-chunk commits, never monoliths. No sprint-number references in commit messages.
- **No "stubs pass silently"**: placeholder code that returns wrong answers is worse than code that panics. Tests must catch the stub, not paper over it.
- **Diagnostics always cite input + offset + caret.** `src/diag.rs` is the one place that constructs these; every error path goes through it.
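Two of the conventions above, exhaustive matching and sorted iteration order, can be sketched together. `SymbolKind`, `label`, and `emit_order` are hypothetical stand-ins, not the real `Symbol` type or any afs-ld function; the point is only the shape of the code.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a symbol classification; variants are illustrative.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum SymbolKind {
    Defined,
    Weak,
    Common,
    Undefined,
}

/// Exhaustive match with no `_` arm: adding a variant to `SymbolKind`
/// makes this function fail to compile until the new case is handled.
pub fn label(kind: SymbolKind) -> &'static str {
    match kind {
        SymbolKind::Defined => "defined",
        SymbolKind::Weak => "weak",
        SymbolKind::Common => "common",
        SymbolKind::Undefined => "undefined",
    }
}

/// `HashMap` iteration order is unspecified and can differ between runs,
/// so anything that reaches the output is sorted by name first.
pub fn emit_order(table: &HashMap<String, SymbolKind>) -> Vec<(&str, SymbolKind)> {
    let mut out: Vec<(&str, SymbolKind)> =
        table.iter().map(|(name, kind)| (name.as_str(), *kind)).collect();
    out.sort_by(|a, b| a.0.cmp(b.0));
    out
}
```

A `BTreeMap` would give the same ordering guarantee without the explicit sort; which one fits better depends on how often the table is iterated versus looked up.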
## Test Architecture

Tests are **layered**, not a single golden path. Each layer catches a different class of regression:

| Test file | What it proves |
|---|---|
| `src/**/#[cfg(test)]` | Parser / encoder / resolution unit tests (`cargo test --lib`) |
| `tests/common/harness.rs` | Differential harness: spawn afs-ld + system ld on the same inputs, diff outputs |
| `tests/reader_*.rs` | Round-trip Mach-O object reads across the afs-as corpus |
| `tests/reloc_*.rs` | Golden-file relocation application |
| `tests/resolve_*.rs` | Symbol resolution matrices (strong vs weak vs common vs dylib vs archive) |
| `tests/hello_world.rs` | End-to-end: afs-as → afs-ld → runnable PIE executable |
| `tests/hello_library.rs` | End-to-end: afs-as → afs-ld → `dlopen`able dylib |
| `tests/parity_matrix.rs` | Full corpus byte-level differential vs Apple `ld` (CI gate) |
| `tests/armfortas_integration.rs` | Parent's integration suite run under `AFS_LD=1` |

Corpus fixtures live in `tests/corpus/`. Every new relocation kind, section kind, or CLI flag lands a corpus entry in the same sprint that implements it.
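The differential idea behind the harness and the parity matrix can be sketched in a few lines: run both linkers, then report the first byte at which their outputs diverge. This is an assumption-laden illustration, not the contents of `tests/common/harness.rs`; `run_linker` and `first_mismatch` are hypothetical names, and the linker arguments are left to the caller rather than guessed here.

```rust
use std::process::Command;

/// Runs one linker invocation and reports whether it exited successfully.
/// `program` and `args` are supplied by the caller; nothing here assumes
/// a particular flag set.
pub fn run_linker(program: &str, args: &[&str]) -> std::io::Result<bool> {
    Ok(Command::new(program).args(args).status()?.success())
}

/// Returns the offset of the first differing byte, or `None` if the two
/// outputs are byte-identical. A length mismatch counts as a difference
/// at the end of the shorter input.
pub fn first_mismatch(ours: &[u8], theirs: &[u8]) -> Option<usize> {
    let common = ours.len().min(theirs.len());
    match ours.iter().zip(theirs).position(|(a, b)| a != b) {
        Some(offset) => Some(offset),
        None if ours.len() != theirs.len() => Some(common),
        None => None,
    }
}
```

Reporting an offset rather than a bare pass/fail lets a failing parity test point at the load command or section that diverged, for example by feeding the offset to `otool` output.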
## Audit discipline
After each sprint, a brutally honest audit:
- Assume nothing works until proven otherwise. Test every claim.
- "Placeholder" and "stub" are synonyms for "broken." Wrong answers silently produced are worse than crashes.
- Check against the Mach-O ABI spec, not just "does it link." Wrong output corrupts the loader's bind/rebase state.
- Don't soften findings. "Major" means "produces wrong binaries." "Critical" means "silent corruption that dyld will accept."
- No deferred items unless they genuinely require a later sprint. Fix it now if it can be fixed now.
- The audit is not a formality. It's the last line of defense before bad linker output gets merged and ships bad binaries downstream.
## Key references

- `.refs/llvm/lld/MachO/`: primary architectural reference.
- `.refs/ld64/src/`: Apple authoritative. `src/ld/` and `src/mach_o/` cover the whole linker.
- `.refs/mold/src/`: performance techniques (parallel parsing, string merging, allocator tricks). Mostly ELF-oriented but the performance patterns apply.
- Apple `<mach-o/loader.h>`, `<mach-o/nlist.h>`, `<mach-o/reloc.h>`, `<mach-o/arm64/reloc.h>`: mirrored numerically in `src/macho/constants.rs`.
- `dyld` open source: bind/rebase/lazy-bind opcode semantics and chained-fixups format.
- ARM Architecture Reference Manual (ARMv8-A): encoding of relocated instructions.
## Sprint roadmap

`.docs/sprints/index.md`: 32 sprints across 10 phases. Each sprint has an individual markdown file with prerequisites, deliverables, testing strategy, and definition of done.