add overview and sprint plan
- SHA: b13dcdc9e353cbcd1acb4f9f55a5632927465000
- Parents: 9c5dcc1
- Tree: 418f4a8

.docs/overview.md added @@ -0,0 +1,191 @@
| 1 | +# afs-ld | |
| 2 | + | |
| 3 | +**Bespoke ARM64 Mach-O linker for Apple Silicon, written in Rust, stdlib only.** | |
| 4 | + | |
| 5 | +## Why | |
| 6 | + | |
| 7 | +armfortas already owns the compiler (`armfortas`) and the assembler (`afs-as`). Every binary the toolchain produces today is still shaped by Apple's `ld` — which puts the same class of opaque, untouchable bugs back in our path that motivated abandoning LLVM in the first place. `afs-ld` closes the loop. We own every byte from `.f90` source to the final Mach-O executable on disk. | |
| 8 | + | |
| 9 | +This is **not** a toy or educational linker. The target is production parity with Apple `ld` for: | |
| 10 | + | |
| 11 | +- Everything armfortas produces today: arm64 PIE executables statically linking `libarmfortas_rt.a` and dynamically linking `libSystem`. | |
| 12 | +- The fortsh milestone: ~57 KLoC Fortran 2018, 55 modules, `iso_c_binding`, allocatable strings, derived types. | |
| 13 | +- Deterministic output: `-no_uuid` parity, reproducible byte layout across invocations. | |
| 14 | +- All ARM64 Mach-O relocation types, static archives (`.a`), binary dylibs, and TAPI TBD v4 text stubs (libSystem ships as `.tbd` on modern SDKs). | |
| 15 | +- Both classic `LC_DYLD_INFO` opcodes **and** modern `LC_DYLD_CHAINED_FIXUPS`. | |
| 16 | +- Dylib output (`-dylib`) as a first-class feature, not an afterthought. | |
| 17 | +- Ad-hoc code signing — macOS 11+ will not execute an unsigned arm64 binary, even if every other byte is perfect. | |
| 18 | + | |
| 19 | +## Non-goals (current) | |
| 20 | + | |
| 21 | +- ELF, COFF, PE, or any non-Mach-O format. | |
| 22 | +- x86_64, arm64_32, armv7, or any architecture other than arm64. | |
| 23 | +- Bitcode/LTO. armfortas emits assembly, not bitcode; LTO is not part of the armfortas pipeline. | |
| 24 | +- ObjC / Swift metadata merging. No armfortas code emits it. Hooks exist for later. | |
| 25 | +- Cross-compilation. We target the host Mac; no `-sdk_version` time-travel. | |
| 26 | + | |
| 27 | +These are non-goals **now**; afs-ld is built to grow into them without architectural retrofit. | |
| 28 | + | |
| 29 | +## What afs-as hands us | |
| 30 | + | |
| 31 | +afs-as emits `MH_OBJECT` only (hand-rolled, no external Mach-O crate). Its output defines the contract afs-ld reads: | |
| 32 | + | |
| 33 | +- Load commands: `LC_SEGMENT_64`, `LC_BUILD_VERSION` (PLATFORM_MACOS), optional `LC_LINKER_OPTIMIZATION_HINT`, `LC_SYMTAB`, `LC_DYSYMTAB`. | |
| 34 | +- Section kinds: `__TEXT,__text`, `__TEXT,__cstring`, `__TEXT,__literal16`, `__TEXT,__const`, `__TEXT,__compact_unwind`, `__TEXT,__eh_frame`, `__DATA,__data`, `__DATA,__bss`, `__DATA,__thread_data`, `__DATA,__thread_bss`, `__DATA,__thread_vars`. | |
| 35 | +- Symbol flags: `N_UNDF`/`N_SECT`/`N_ABS` with `N_EXT`, `N_PEXT`, `N_WEAK_REF`, `N_WEAK_DEF`, `N_NO_DEAD_STRIP`. Common symbols live in `N_UNDF` with `n_desc`-encoded alignment. | |
| 36 | +- Relocation types: `ARM64_RELOC_UNSIGNED`, `SUBTRACTOR`, `BRANCH26`, `PAGE21`, `PAGEOFF12`, `GOT_LOAD_PAGE21`, `GOT_LOAD_PAGEOFF12`, `POINTER_TO_GOT`, `TLVP_LOAD_PAGE21`, `TLVP_LOAD_PAGEOFF12`, `ADDEND` (paired prefix). | |
| 37 | +- Flags: `MH_SUBSECTIONS_VIA_SYMBOLS` always set — atomization model is in play. | |
| 38 | +- LOH hints: `AdrpAdd`, `AdrpLdr`, `AdrpLdrGot`, `AdrpLdrGotLdr` — afs-ld preserves them from Sprint 0 and relaxes them in Sprint 25. | |
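The symbol-flag contract in the list above can be illustrated with a small classifier. This is a minimal sketch, not afs-ld's actual model: `SymClass` and `classify` are hypothetical names, the constants follow `<mach-o/nlist.h>`, and the common-symbol alignment is read from bits 8 to 11 of `n_desc` as the `GET_COMM_ALIGN` macro does.

```rust
// Hypothetical helper (not part of afs-ld): classify an nlist_64 entry
// from its n_type, n_desc, and n_value fields.
const N_STAB: u8 = 0xe0; // any of these bits set: debugging (stab) entry
const N_TYPE: u8 = 0x0e; // mask for the type bits
const N_EXT: u8 = 0x01;
const N_UNDF: u8 = 0x0;
const N_ABS: u8 = 0x2;
const N_SECT: u8 = 0xe;

#[derive(Debug, PartialEq)]
pub enum SymClass {
    Debug,
    Undefined,
    /// External N_UNDF symbol with a non-zero n_value: a common symbol.
    Common { size: u64, align_log2: u8 },
    Absolute,
    Defined,
    Other,
}

pub fn classify(n_type: u8, n_desc: u16, n_value: u64) -> SymClass {
    if (n_type & N_STAB) != 0 {
        return SymClass::Debug;
    }
    match n_type & N_TYPE {
        N_UNDF if (n_type & N_EXT) != 0 && n_value != 0 => SymClass::Common {
            size: n_value,
            // Same bits as GET_COMM_ALIGN: (n_desc >> 8) & 0x0f.
            align_log2: ((n_desc >> 8) & 0x0f) as u8,
        },
        N_UNDF => SymClass::Undefined,
        N_ABS => SymClass::Absolute,
        N_SECT => SymClass::Defined,
        _ => SymClass::Other,
    }
}
```

A 16-byte common symbol aligned to 8 bytes would arrive as `n_type = N_UNDF | N_EXT`, `n_value = 16`, and `n_desc = 0x0300`.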
| 39 | + | |
| 40 | +afs-as exposes no Mach-O reader. afs-ld ships its own. | |
| 41 | + | |
| 42 | +## Current driver contract | |
| 43 | + | |
| 44 | +`armfortas/src/driver/mod.rs:497-565` shells out to `ld`: | |
| 45 | + | |
| 46 | +``` | |
| 47 | +ld <obj1> <obj2> ... <libarmfortas_rt.a> \ | |
| 48 | + -lSystem -no_uuid -syslibroot <SDK> -e _main -o <output> | |
| 49 | +``` | |
| 50 | + | |
| 51 | +Inputs are `.o` from afs-as plus `libarmfortas_rt.a` from the `runtime/` crate. Output is an arm64 PIE executable with entry `_main` (a synthetic wrapper at `src/driver/mod.rs:371-392` calling `_afs_program_init` → user PROGRAM → `_afs_program_finalize`). afs-ld drops into this contract unchanged; the driver swap (Sprint 20) is initially gated behind `AFS_LD=1`. | |
| 52 | + | |
| 53 | +## Reference material | |
| 54 | + | |
| 55 | +Already in parent `.refs/`: | |
| 56 | + | |
| 57 | +- `.refs/llvm/lld/MachO/` (~21 KLoC C++) — primary architectural reference. Most relevant files: `Driver.cpp` (pipeline), `InputFiles.cpp` (object/archive/dylib parsing), `SymbolTable.cpp` (resolution), `SyntheticSections.cpp` (GOT/stubs/binding), `Arch/ARM64.cpp` (reloc math), `Writer.cpp` (layout). | |
| 58 | +- `.refs/llvm/lld/docs/MachO/index.rst` — design notes comparing lld and ld64. | |
| 59 | + | |
| 60 | +Cloned in Sprint 0: | |
| 61 | + | |
| 62 | +- `.refs/ld64/` — Apple's open-source `ld64` (GitHub mirror of last publicly released tarball). Authoritative for byte-level parity edge cases when our diff harness disagrees with Apple's `ld`. | |
| 63 | +- `.refs/mold/` — Rui Ueyama's `mold` including its Darwin port. Leaner second opinion and a source of performance ideas. | |
| 64 | + | |
| 65 | +Spec-level: | |
| 66 | + | |
| 67 | +- Apple `<mach-o/loader.h>`, `<mach-o/nlist.h>`, `<mach-o/reloc.h>`, `<mach-o/arm64/reloc.h>` — mirrored numerically in our `macho/constants.rs`. | |
| 68 | +- `dyld` open source — bind/rebase/lazy-bind opcode semantics and chained-fixups format. | |
| 69 | +- ARM Architecture Reference Manual (ARMv8-A) — encoding of relocated instructions (ADRP, ADD, LDR, B/BL). | |
| 70 | + | |
| 71 | +## Repo layout | |
| 72 | + | |
| 73 | +afs-ld is a Cargo workspace member of `armfortas` and a Git submodule at `armfortas/afs-ld` pointing at `git@github.com:FortranGoingOnForty/afs-ld.git`, mirroring how `afs-as` is organized. | |
| 74 | + | |
| 75 | +``` | |
| 76 | +afs-ld/ | |
| 77 | +├── Cargo.toml # no deps outside std | |
| 78 | +├── CLAUDE.md # mirrors afs-as/CLAUDE.md, tailored to linker | |
| 79 | +├── README.md | |
| 80 | +├── .docs/ | |
| 81 | +│ ├── overview.md # this file | |
| 82 | +│ └── sprints/ # index.md + 34 sprint files | |
| 83 | +├── src/ | |
| 84 | +│ ├── lib.rs # re-export Linker, LinkOptions, OutputKind | |
| 85 | +│ ├── main.rs # afs-ld binary | |
| 86 | +│ ├── args.rs # CLI parsing (hand-rolled, no clap) | |
| 87 | +│ ├── macho/ | |
| 88 | +│ │ ├── mod.rs | |
| 89 | +│ │ ├── constants.rs # LC_*, MH_*, S_*, N_*, ARM64_RELOC_* | |
| 90 | +│ │ ├── reader.rs # parse MH_OBJECT | |
| 91 | +│ │ ├── writer.rs # emit MH_EXECUTE or MH_DYLIB | |
| 92 | +│ │ ├── dylib.rs # parse MH_DYLIB | |
| 93 | +│ │ └── tbd.rs # parse TAPI TBD v4 | |
| 94 | +│ ├── archive.rs # ar/ranlib archive reader | |
| 95 | +│ ├── input.rs # InputFile enum, lazy member fetch | |
| 96 | +│ ├── symbol.rs # Symbol kinds, SymbolTable | |
| 97 | +│ ├── resolve.rs # name resolution pass | |
| 98 | +│ ├── atom.rs # subsections-via-symbols atom model | |
| 99 | +│ ├── section.rs # InputSection, OutputSection, OutputSegment | |
| 100 | +│ ├── layout.rs # VM addr + file offset assignment | |
| 101 | +│ ├── reloc/ | |
| 102 | +│ │ ├── mod.rs | |
| 103 | +│ │ ├── arm64.rs # ARM64_RELOC_* application | |
| 104 | +│ │ └── loh.rs # LOH preservation / relaxation | |
| 105 | +│ ├── synth/ # synthetic sections | |
| 106 | +│ │ ├── mod.rs | |
| 107 | +│ │ ├── got.rs | |
| 108 | +│ │ ├── stubs.rs | |
| 109 | +│ │ ├── tlv.rs | |
| 110 | +│ │ ├── symtab.rs | |
| 111 | +│ │ ├── dyld_info.rs # classic rebase/bind/lazy/weak + export trie | |
| 112 | +│ │ ├── chained.rs # LC_DYLD_CHAINED_FIXUPS | |
| 113 | +│ │ ├── unwind.rs | |
| 114 | +│ │ ├── eh_frame.rs | |
| 115 | +│ │ ├── func_starts.rs | |
| 116 | +│ │ ├── data_in_code.rs | |
| 117 | +│ │ └── code_sig.rs # ad-hoc SHA-256 code signature | |
| 118 | +│ ├── map.rs # -map text link map | |
| 119 | +│ ├── why_live.rs # -why_live dead-strip reasons | |
| 120 | +│ ├── gc.rs # -dead_strip | |
| 121 | +│ ├── icf.rs # -icf=safe | |
| 122 | +│ ├── driver.rs # orchestrator | |
| 123 | +│ └── diag.rs # diagnostics, path/line/col parity with afs-as | |
| 124 | +└── tests/ | |
| 125 | + ├── common/harness.rs # spawn afs-ld, diff output vs system ld | |
| 126 | + ├── reader_*.rs # round-trip object reads | |
| 127 | + ├── reloc_*.rs # golden-file reloc application | |
| 128 | + ├── resolve_*.rs # symbol resolution matrices | |
| 129 | + ├── hello_world.rs # first end-to-end executable link | |
| 130 | + ├── hello_library.rs # first end-to-end dylib link | |
| 131 | + ├── armfortas_integration.rs | |
| 132 | + └── corpus/ # hand-curated .o / .a / .dylib / .tbd fixtures | |
| 133 | +``` | |
| 134 | + | |
| 135 | +## Architecture pipeline | |
| 136 | + | |
| 137 | +``` | |
| 138 | +args → inputs → resolve → atomize → layout → apply relocs → synth sections → write → sign | |
| 139 | +``` | |
| 140 | + | |
| 141 | +1. **args**: hand-rolled parser for the `ld`-compatible CLI surface. No clap. | |
| 142 | +2. **inputs**: demultiplex `.o`, `.a`, `.dylib`, `.tbd`; lazy archive-member fetching. | |
| 143 | +3. **resolve**: symbol-table fixed-point loop; weak/common/alias coalescing; archive-driven pulls. | |
| 144 | +4. **atomize**: split input sections at symbol boundaries per `MH_SUBSECTIONS_VIA_SYMBOLS`. | |
| 145 | +5. **layout**: assign VM addrs (`__PAGEZERO`/`__TEXT`/`__DATA_CONST`/`__DATA`/`__LINKEDIT` for executables; no `__PAGEZERO` for dylibs) and file offsets. | |
| 146 | +6. **apply relocs**: ARM64_RELOC_* patching; GOT/stubs/lazy-pointer emission; LOH honoring. | |
| 147 | +7. **synth sections**: `__LINKEDIT` payload — symbol table, string table, `LC_DYLD_INFO` and/or chained fixups, function starts, data-in-code, compact unwind, eh_frame passthrough. | |
| 148 | +8. **write**: Mach-O header + load commands + segment data; `-no_uuid` deterministic. | |
| 149 | +9. **sign**: ad-hoc SHA-256 page hashes in `LC_CODE_SIGNATURE` so the binary runs on bare arm64. | |
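To make the "apply relocs" stage more concrete, here is a sketch of the arithmetic behind two of the listed relocation types, `ARM64_RELOC_BRANCH26` and `ARM64_RELOC_PAGE21`. The function names are hypothetical and the real `reloc/arm64.rs` may be structured quite differently; only the instruction encodings (imm26 for B/BL, immlo/immhi for ADRP) are taken from the ARM architecture manual.

```rust
/// Patch a B/BL instruction at address `pc` so it branches to `target`.
/// Returns None when the target is misaligned or outside the +/-128 MiB range
/// (the case a later thunk pass would have to handle).
pub fn patch_branch26(insn: u32, pc: u64, target: u64) -> Option<u32> {
    let delta = (target as i64).wrapping_sub(pc as i64);
    if (delta & 3) != 0 || delta < -(1 << 27) || delta >= (1 << 27) {
        return None;
    }
    // imm26 holds the signed word offset in the low 26 bits.
    let imm26 = ((delta >> 2) as u32) & 0x03ff_ffff;
    Some((insn & 0xfc00_0000) | imm26)
}

/// Patch an ADRP instruction at `pc` so it materializes the 4 KiB page of `target`.
/// Returns None when the page delta does not fit in 21 signed bits (+/-4 GiB).
pub fn patch_page21(insn: u32, pc: u64, target: u64) -> Option<u32> {
    let pages = ((target & !0xfff) as i64).wrapping_sub((pc & !0xfff) as i64) >> 12;
    if pages < -(1 << 20) || pages >= (1 << 20) {
        return None;
    }
    let imm = pages as u32; // truncation keeps the two's-complement low bits
    let immlo = imm & 0x3; // bits 30:29 of the instruction
    let immhi = (imm >> 2) & 0x7_ffff; // bits 23:5 of the instruction
    Some((insn & 0x9f00_001f) | (immlo << 29) | (immhi << 5))
}
```

Returning `Option` rather than panicking keeps the out-of-range case visible to the caller, which is where a diagnostic or a branch thunk would be produced.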
| 150 | + | |
| 151 | +## Coding conventions | |
| 152 | + | |
| 153 | +- **Rust std only.** No `clap`, no `serde`, no `byteorder`, no `object`, no `goblin`. Hand-roll parsers, serializers, and the tiny YAML subset we need for TBD. | |
| 154 | +- **`unsafe` only where genuinely required.** Keep blocks small and commented. | |
| 155 | +- **Exhaustive pattern matching** on `Section`, `Symbol`, `Relocation`, `InputFile`, `Fixup` — no catch-all `_` arms outside tests. | |
| 156 | +- **Diagnostics**: path, offset, caret under source — mirror `afs-as/src/diag*.rs`. | |
| 157 | +- **Determinism**: no timestamps in output, sorted iteration, stable hashing. | |
| 158 | +- **Commit discipline**: terse imperative, no co-authors, per-file/per-chunk commits, never monoliths. Matches armfortas and afs-as house rules. | |
| 159 | +- **No borrowed constants across crates.** afs-ld duplicates `MH_*`, `LC_*`, `S_*`, `N_*`, `ARM64_RELOC_*` in `macho/constants.rs` rather than depending on afs-as at a type level. Each submodule stays independent. | |
| 160 | + | |
| 161 | +## Testing strategy | |
| 162 | + | |
| 163 | +- **Unit**: every parser and encoder has a round-trip test — parse a fixture, re-emit, compare bytes. | |
| 164 | +- **Corpus**: `tests/corpus/` collects `.o`, `.a`, `.dylib`, `.tbd` fixtures. Every new relocation type or section kind lands a corpus entry in the same sprint that implements it. | |
| 165 | +- **Differential** (from Sprint 1): `tests/common/harness.rs` links the same inputs through `ld` and `afs-ld`, diffs load commands, symbol tables, bind/rebase or chained-fixup streams, and disassembly. Tolerated-diff allowlist covers UUID, timestamp, hash-backed temp names. CI gate from sprint one. | |
| 166 | +- **End-to-end** (from Sprint 18): hello-world executable must run; (Sprint 18.5) hello-library dylib must `dlopen`. (Sprint 21) the full armfortas integration suite must pass. | |
| 167 | +- **fortsh link** (Sprint 29): explicit milestone. A fortsh binary linked by afs-ld must behave identically to one linked by system `ld`. | |
| 168 | +- **Audits**: post-Sprint 18 (hello), 18.5 (dylib), 22 (first signed & running on bare arm64), 27 (parity gate), 29 (fortsh), 31 (final). Brutal honesty rules from armfortas/CLAUDE.md apply. | |
| 169 | + | |
| 170 | +## Sprint roadmap (summary) | |
| 171 | + | |
| 172 | +See `.docs/sprints/index.md` for the full list. Eleven phases (0–10) and 34 sprints (0–31 plus the half-sprints 15.5 and 18.5): | |
| 173 | + | |
| 174 | +- **Phase 0 — Scaffolding**: Sprint 0. | |
| 175 | +- **Phase 1 — Mach-O reading**: Sprints 1–3 (header/load commands, sections/symbols, relocations). | |
| 176 | +- **Phase 2 — Archives & dylibs**: Sprints 4–6 (`ar`, binary dylib, TBD). | |
| 177 | +- **Phase 3 — Symbol resolution**: Sprints 7–9 (model, resolution pass, atomization). | |
| 178 | +- **Phase 4 — Output construction**: Sprints 10–14 (layout, reloc application, GOT/stubs, TLV, symtab/strtab). MH_EXECUTE and MH_DYLIB both first-class. | |
| 179 | +- **Phase 5 — Dyld metadata**: Sprints 15 (classic `LC_DYLD_INFO`), 15.5 (chained fixups), 16 (function starts/data-in-code), 17 (unwind info). | |
| 180 | +- **Phase 6 — End-to-end**: Sprint 18 (hello-world executable), 18.5 (hello-library dylib). | |
| 181 | +- **Phase 7 — CLI & driver**: Sprints 19 (CLI + `-map`/`-why_live` diagnostics), 20 (driver swap). | |
| 182 | +- **Phase 8 — Runtime compatibility**: Sprints 21 (runtime archive + integration tests), 22 (ad-hoc code signature). | |
| 183 | +- **Phase 9 — Advanced**: Sprints 23 (`-dead_strip`), 24 (`-icf=safe`), 25 (LOH relaxation), 26 (thunks). | |
| 184 | +- **Phase 10 — Hardening**: Sprints 27 (differential gate), 28 (performance), 29 (fortsh link audit), 30 (polish), 31 (final audit). | |
| 185 | + | |
| 186 | +## Scope decisions (confirmed) | |
| 187 | + | |
| 188 | +- Dylib output is in scope from Phase 4; the writer is dylib-aware from Sprint 10. Dylib milestone at Sprint 18.5. | |
| 189 | +- Both classic `LC_DYLD_INFO` (Sprint 15) and chained fixups (Sprint 15.5) are in scope. Chained becomes default on macOS 12+ after Sprint 27 parity gate. | |
| 190 | +- `.refs/` gains ld64 and mold alongside lld. lld is the architectural reference, ld64 is authoritative for Apple-parity edge cases, mold informs performance. | |
| 191 | +- `-map` and `-why_live` land in Sprint 19 with the core CLI. They are the debugging surface during driver adoption, not a polish item. | |
.docs/sprints/index.md added @@ -0,0 +1,59 @@
| 1 | +# afs-ld Sprint Index | |
| 2 | + | |
| 3 | +34 sprints (0–31 plus the half-sprints 15.5 and 18.5) across 11 phases (0–10). Small bites, clear milestones, testable deliverables at every stage. Each sprint is independently reviewable and mergeable; every sprint that lands new surface area also lands corpus fixtures and differential coverage. | |
| 4 | + | |
| 5 | +## Phase 0 — Scaffolding | |
| 6 | +- [Sprint 0](sprint00.md) — Scaffolding, References, Harness | |
| 7 | + | |
| 8 | +## Phase 1 — Mach-O Reading | |
| 9 | +- [Sprint 1](sprint01.md) — MH_OBJECT Header & Load Commands | |
| 10 | +- [Sprint 2](sprint02.md) — Sections, Symbols, String Tables | |
| 11 | +- [Sprint 3](sprint03.md) — Relocations (Read-Side) | |
| 12 | + | |
| 13 | +## Phase 2 — Archives & Dylibs | |
| 14 | +- [Sprint 4](sprint04.md) — Static Archives (ar) | |
| 15 | +- [Sprint 5](sprint05.md) — Dylibs (MH_DYLIB binary) | |
| 16 | +- [Sprint 6](sprint06.md) — TAPI TBD Text Stubs | |
| 17 | + | |
| 18 | +## Phase 3 — Symbol Resolution | |
| 19 | +- [Sprint 7](sprint07.md) — Symbol Model & Table | |
| 20 | +- [Sprint 8](sprint08.md) — Name Resolution Pass | |
| 21 | +- [Sprint 9](sprint09.md) — Subsections-via-Symbols Atomization | |
| 22 | + | |
| 23 | +## Phase 4 — Output Construction (MH_EXECUTE and MH_DYLIB both first-class) | |
| 24 | +- [Sprint 10](sprint10.md) — Output Segment & Section Layout (dylib-aware) | |
| 25 | +- [Sprint 11](sprint11.md) — Core Relocation Application (ARM64) | |
| 26 | +- [Sprint 12](sprint12.md) — GOT, Stubs, Lazy Pointers | |
| 27 | +- [Sprint 13](sprint13.md) — TLV Relocations | |
| 28 | +- [Sprint 14](sprint14.md) — LC_SYMTAB / LC_DYSYMTAB / String Table | |
| 29 | + | |
| 30 | +## Phase 5 — Dynamic Linker Metadata | |
| 31 | +- [Sprint 15](sprint15.md) — Classic LC_DYLD_INFO Opcodes | |
| 32 | +- [Sprint 15.5](sprint15_5.md) — Chained Fixups (LC_DYLD_CHAINED_FIXUPS) | |
| 33 | +- [Sprint 16](sprint16.md) — LC_FUNCTION_STARTS & LC_DATA_IN_CODE | |
| 34 | +- [Sprint 17](sprint17.md) — Unwind Info | |
| 35 | + | |
| 36 | +## Phase 6 — First End-to-End | |
| 37 | +- [Sprint 18](sprint18.md) — HELLO WORLD MILESTONE (Executable) | |
| 38 | +- [Sprint 18.5](sprint18_5.md) — HELLO LIBRARY MILESTONE (Dylib) | |
| 39 | + | |
| 40 | +## Phase 7 — CLI & Driver Integration | |
| 41 | +- [Sprint 19](sprint19.md) — CLI Surface + Diagnostics (-map, -why_live) | |
| 42 | +- [Sprint 20](sprint20.md) — Driver Swap | |
| 43 | + | |
| 44 | +## Phase 8 — Runtime Compatibility | |
| 45 | +- [Sprint 21](sprint21.md) — Runtime Archive Linking | |
| 46 | +- [Sprint 22](sprint22.md) — Code Signature (Ad-Hoc) | |
| 47 | + | |
| 48 | +## Phase 9 — Advanced Features | |
| 49 | +- [Sprint 23](sprint23.md) — Dead Strip (`-dead_strip`) | |
| 50 | +- [Sprint 24](sprint24.md) — ICF (`-icf=safe`) | |
| 51 | +- [Sprint 25](sprint25.md) — LOH Relaxation | |
| 52 | +- [Sprint 26](sprint26.md) — Thunks for Out-of-Range Branches | |
| 53 | + | |
| 54 | +## Phase 10 — Production Hardening | |
| 55 | +- [Sprint 27](sprint27.md) — Differential Harness vs Apple ld | |
| 56 | +- [Sprint 28](sprint28.md) — Performance & Parallelism | |
| 57 | +- [Sprint 29](sprint29.md) — fortsh Link Audit | |
| 58 | +- [Sprint 30](sprint30.md) — Diagnostics & Polish | |
| 59 | +- [Sprint 31](sprint31.md) — Final Audit | |
.docs/sprints/sprint00.md added @@ -0,0 +1,119 @@
| 1 | +# Sprint 0: Scaffolding, References, Harness | |
| 2 | + | |
| 3 | +## Prerequisites | |
| 4 | +None — this is where afs-ld begins. Assumes armfortas and afs-as already exist and compile. | |
| 5 | + | |
| 6 | +## Current state to remediate | |
| 7 | + | |
| 8 | +The `afs-ld/` directory currently sits inside armfortas's working tree as a plain subdirectory. `afs-ld/.gitignore` was accidentally committed to armfortas history (commit `85d5ba8 "init"`), tracking a file that will shortly belong to a separate repo. Before anything else in this sprint, afs-ld must be extracted into its own repo and wired back in as a Git submodule. The tracked `.gitignore` must be removed from armfortas's index (history can stay; removing a single file at HEAD is clean) so that afs-ld's contents live in the submodule and nowhere else. | |
| 9 | + | |
| 10 | +## Goals | |
| 11 | +Extract afs-ld into its own repo, wire it back as a submodule, stand up the crate (CLAUDE.md, README, Cargo.toml, skeleton source), clone reference linkers, build the differential harness. End state: `cargo test -p afs-ld` runs from the parent workspace, at least one test passes meaningfully, and `git submodule status` lists afs-ld alongside afs-as. | |
| 12 | + | |
| 13 | +## Deliverables | |
| 14 | + | |
| 15 | +### 1. Submodule remediation (do first) | |
| 16 | + | |
| 17 | +Goal: move from "afs-ld is a tracked subdirectory of armfortas" to "afs-ld is a submodule pointing at `git@github.com:FortranGoingOnForty/afs-ld.git`". Exact sequence: | |
| 18 | + | |
| 19 | +1. **Preserve the current afs-ld contents**: copy `armfortas/afs-ld/` to a temp location (the `.docs/overview.md` and sprint files produced in planning are the primary content to preserve; `.fackr/` is scratch and can be dropped). | |
| 20 | +2. **Untrack from armfortas**: `git rm --cached afs-ld/.gitignore && git commit -m "remove accidentally-tracked afs-ld/.gitignore"`. Confirm `git ls-files afs-ld` returns empty. | |
| 21 | +3. **Delete the directory from armfortas's working tree** (submodule-add will recreate it): `rm -rf afs-ld/`. | |
| 22 | +4. **Create the external repo** `FortranGoingOnForty/afs-ld` on GitHub (empty, no README; the push in step 5 seeds it). | |
| 23 | +5. **Initialize locally and push**: in a scratch directory, | |
| 24 | + ``` | |
| 25 | +   git init -b trunk afs-ld | |
| 26 | + cd afs-ld | |
| 27 | + <copy preserved contents back> | |
| 28 | + git add -A | |
| 29 | + git commit -m "init" | |
| 30 | + git remote add origin git@github.com:FortranGoingOnForty/afs-ld.git | |
| 31 | + git push -u origin trunk | |
| 32 | + ``` | |
| 33 | +6. **Add as submodule in armfortas**: | |
| 34 | + ``` | |
| 35 | + cd <armfortas root> | |
| 36 | + git submodule add git@github.com:FortranGoingOnForty/afs-ld.git afs-ld | |
| 37 | + git commit -m "add afs-ld submodule" | |
| 38 | + ``` | |
| 39 | + Confirm `.gitmodules` gained the stanza: | |
| 40 | + ``` | |
| 41 | + [submodule "afs-ld"] | |
| 42 | + path = afs-ld | |
| 43 | + url = git@github.com:FortranGoingOnForty/afs-ld.git | |
| 44 | + ``` | |
| 45 | +7. **Verify** with `git submodule status` — afs-as and afs-ld both listed, each pinned to a commit hash. | |
| 46 | + | |
| 47 | +Note: the original commit that tracked `afs-ld/.gitignore` (`85d5ba8`) stays in armfortas history, alongside the `git rm --cached` commit that removes it. Rewriting history to erase it is more destructive than it's worth for a single `.gitignore` file. The file at HEAD is gone; that is sufficient. | |
| 48 | + | |
| 49 | +### 2. Crate wiring | |
| 50 | + | |
| 51 | +- Root `Cargo.toml` adds `"afs-ld"` to `[workspace] members` alongside `afs-as`. | |
| 52 | +- `afs-ld/Cargo.toml`: binary + library, zero external dependencies, `edition = "2021"`. Mirror `afs-as/Cargo.toml` — same `keywords`, `categories`, `license = "GPL-3.0-only"`, adjusted `description` and `repository`. | |
| 53 | +- `afs-ld/src/lib.rs`: public re-exports of `Linker`, `LinkOptions`, `OutputKind` (types are stubs this sprint). | |
| 54 | +- `afs-ld/src/main.rs`: CLI that prints usage and exits 0 when run with no args; forwards real args to `Linker::run` (which errors with "not yet implemented" this sprint). | |
| 55 | + | |
| 56 | +### 3. CLAUDE.md and README.md | |
| 57 | + | |
| 58 | +- `afs-ld/CLAUDE.md`: mirror `afs-as/CLAUDE.md`, replacing assembler-isms with linker-isms. Non-negotiable rules: Rust std only, exhaustive matching, caret diagnostics, per-chunk commits, no co-authors, no sprint-number references in commits. | |
| 59 | +- `afs-ld/README.md`: one-page intro, supported CLI subset at current state (nothing yet — say so), build/test commands. | |
| 60 | +- `afs-ld/.gitignore`: ignores `target/` and `.fackr/`, but not `.docs/`, since those files are tracked in the repo. This one is intentionally tracked because it lives inside afs-ld's own repo now, not armfortas's. | |
| 61 | + | |
| 62 | +### 4. Reference clones | |
| 63 | + | |
| 64 | +Add to parent `.refs/` (gitignored): | |
| 65 | + | |
| 66 | +- `.refs/ld64/` — `git clone --depth 1 https://github.com/apple-oss-distributions/ld64.git`. Apple's last publicly released ld64. Authoritative for Apple-parity edge cases. | |
| 67 | +- `.refs/mold/` — `git clone --depth 1 https://github.com/rui314/mold.git`. Performance reference and a leaner second angle on Mach-O (mold is written in C++). | |
| 68 | + | |
| 69 | +`.refs/llvm/lld/MachO/` already exists from armfortas Sprint 0 — primary architectural reference. | |
| 70 | + | |
| 71 | +### 5. Differential harness | |
| 72 | + | |
| 73 | +`afs-ld/tests/common/harness.rs`: | |
| 74 | + | |
| 75 | +```rust | |
| 76 | +pub struct LinkCase { | |
| 77 | + pub name: &'static str, | |
| 78 | + pub inputs: Vec<PathBuf>, // .o / .a / .tbd | |
| 79 | + pub args: Vec<String>, // -o, -e, -syslibroot, -l, ... | |
| 80 | +} | |
| 81 | + | |
| 82 | +pub struct LinkOutputs { | |
| 83 | + pub ours: Vec<u8>, // afs-ld output | |
| 84 | + pub theirs: Vec<u8>, // system ld output | |
| 85 | +} | |
| 86 | + | |
| 87 | +pub fn link_both(case: &LinkCase) -> LinkOutputs; | |
| 88 | +pub fn diff_macho(ours: &[u8], theirs: &[u8]) -> DiffReport; | |
| 89 | +``` | |
| 90 | + | |
| 91 | +`DiffReport` categorizes byte differences as `Tolerated` (UUID, timestamp, temp-path hashes) or `Critical` (anything else). Critical diffs fail the test. `link_both` shells out to `ld` via `xcrun -f ld` so it picks up the active toolchain. | |
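The tolerated-versus-critical split described above could be sketched as follows. This is an assumption-laden illustration, not the harness itself: `Severity`, `ByteDiff`, `classify_diffs`, and the idea of passing tolerated regions as file-offset ranges are all hypothetical. A real `diff_macho` would presumably derive those ranges from the load commands (for example, the `LC_UUID` payload).

```rust
use std::ops::Range;

#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Severity {
    Tolerated,
    Critical,
}

#[derive(Debug, PartialEq)]
pub struct ByteDiff {
    pub offset: usize,
    pub severity: Severity,
}

/// Compare two images byte by byte. `tolerated` lists the (hypothetical)
/// file-offset ranges where differences are expected, e.g. a UUID payload.
pub fn classify_diffs(ours: &[u8], theirs: &[u8], tolerated: &[Range<usize>]) -> Vec<ByteDiff> {
    let common = ours.len().min(theirs.len());
    let mut out = Vec::new();
    for i in 0..common {
        if ours[i] != theirs[i] {
            let severity = if tolerated.iter().any(|r| r.contains(&i)) {
                Severity::Tolerated
            } else {
                Severity::Critical
            };
            out.push(ByteDiff { offset: i, severity });
        }
    }
    // A length mismatch is always critical; report it at the first missing byte.
    if ours.len() != theirs.len() {
        out.push(ByteDiff { offset: common, severity: Severity::Critical });
    }
    out
}

/// A test fails as soon as one critical difference exists.
pub fn has_critical(diffs: &[ByteDiff]) -> bool {
    diffs.iter().any(|d| d.severity == Severity::Critical)
}
```

Keeping the allowlist as explicit ranges makes every tolerated byte auditable, which matters if the allowlist is meant to stay small.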
| 92 | + | |
| 93 | +### 6. Skeleton CLI and first tests | |
| 94 | + | |
| 95 | +- `afs-ld/src/args.rs`: hand-rolled argv parser stub that recognizes `-o`, `-e`, `-arch`, and positional inputs. Unknown flags error loudly with a hint. | |
| 96 | +- `afs-ld/tests/reader_empty.rs`: attempts to link `0 inputs → empty output`, expects the diagnostic `"afs-ld: error: no input files"`. Passes today by producing that exact string. | |
| 97 | +- `afs-ld/tests/diff_harness_sanity.rs`: runs the harness against a known-identical pair (two copies of the same pre-linked binary produced by `xcrun ld`) and expects zero diffs. Passes. | |
| 98 | +- `afs-ld/tests/diff_harness_finds_critical.rs`: feeds the harness two binaries that differ in a non-tolerated byte range (e.g. different text bytes) and asserts the harness reports `Critical`. Passes. | |
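A hand-rolled parser for the three flags named above might look roughly like this. It is only a sketch: the `Args` struct, `parse_args`, and the wording of the unknown-option and missing-argument messages are assumptions; only the `"afs-ld: error: no input files"` string comes from the plan.

```rust
#[derive(Debug, Default, PartialEq)]
pub struct Args {
    pub output: Option<String>,
    pub entry: Option<String>,
    pub arch: Option<String>,
    pub inputs: Vec<String>,
}

/// Parse everything after argv[0]. Unknown flags are rejected rather than ignored.
pub fn parse_args<I: IntoIterator<Item = String>>(argv: I) -> Result<Args, String> {
    let mut args = Args::default();
    let mut it = argv.into_iter();
    while let Some(a) = it.next() {
        match a.as_str() {
            "-o" => args.output = Some(value(&mut it, "-o")?),
            "-e" => args.entry = Some(value(&mut it, "-e")?),
            "-arch" => args.arch = Some(value(&mut it, "-arch")?),
            // Hypothetical message format for unrecognized flags.
            flag if flag.starts_with('-') => {
                return Err(format!("afs-ld: error: unknown option '{flag}'"));
            }
            input => args.inputs.push(input.to_string()),
        }
    }
    if args.inputs.is_empty() {
        return Err("afs-ld: error: no input files".to_string());
    }
    Ok(args)
}

/// Fetch the value that must follow a flag such as `-o`.
fn value(it: &mut impl Iterator<Item = String>, flag: &str) -> Result<String, String> {
    it.next()
        .ok_or_else(|| format!("afs-ld: error: missing argument to '{flag}'"))
}
```

Calling it with `std::env::args().skip(1)` would be the natural entry point from `main.rs`.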
| 99 | + | |
| 100 | +## Testing Strategy | |
| 101 | + | |
| 102 | +- `cargo build -p afs-ld` compiles from a fresh clone of the parent with `git submodule update --init --recursive`. | |
| 103 | +- `cargo test -p afs-ld` runs harness-sanity, critical-detection, and empty-input tests. | |
| 104 | +- `cargo clippy -p afs-ld -- -D warnings` clean. | |
| 105 | +- Manual verification of submodule state: | |
| 106 | +  - `git ls-files | grep afs-ld` in armfortas prints only the `afs-ld` gitlink entry itself (nothing under `afs-ld/`); the submodule URL lives in `.gitmodules`. | |
| 107 | + - `git submodule status` shows both afs-as and afs-ld with valid commit hashes. | |
| 108 | + - `git submodule update --init --recursive` on a fresh armfortas clone populates afs-ld correctly. | |
| 109 | + | |
| 110 | +## Definition of Done | |
| 111 | + | |
| 112 | +- The accidentally-tracked `afs-ld/.gitignore` is removed from armfortas's index at HEAD. | |
| 113 | +- afs-ld exists as a standalone GitHub repo under `FortranGoingOnForty`. | |
| 114 | +- afs-ld is wired into armfortas as a Git submodule, visible in `.gitmodules` and `git submodule status`. | |
| 115 | +- `armfortas/Cargo.toml` lists `afs-ld` in `[workspace] members`. | |
| 116 | +- `afs-ld/CLAUDE.md`, `README.md`, `Cargo.toml`, `src/lib.rs`, `src/main.rs`, `src/args.rs` all committed in the new repo. | |
| 117 | +- `.refs/ld64/` and `.refs/mold/` cloned. | |
| 118 | +- Differential harness runs, correctly reports zero diffs on identical binaries, correctly reports critical diffs on intentionally-different binaries. | |
| 119 | +- `cargo test --workspace` green. | |
.docs/sprints/sprint01.md added @@ -0,0 +1,92 @@
| 1 | +# Sprint 1: MH_OBJECT Header & Load Commands | |
| 2 | + | |
| 3 | +## Prerequisites | |
| 4 | +Sprint 0 — crate, harness, references in place. | |
| 5 | + | |
| 6 | +## Goals | |
| 7 | +Read a Mach-O relocatable object file: parse the header and every load command afs-as emits. End state: given any `.o` in `afs-as/tests/corpus/`, afs-ld can pretty-print its structure and round-trip-compare it to a golden. | |
| 8 | + | |
| 9 | +## Deliverables | |
| 10 | + | |
| 11 | +### 1. Mach-O constants | |
| 12 | +`afs-ld/src/macho/constants.rs`: duplicate the constants afs-as uses. Numeric literals only, no imports from afs-as. | |
| 13 | + | |
| 14 | +```rust | |
| 15 | +pub const MH_MAGIC_64: u32 = 0xFEEDFACF; | |
| 16 | +pub const CPU_TYPE_ARM64: u32 = 0x0100000C; | |
| 17 | +pub const MH_OBJECT: u32 = 1; | |
| 18 | +pub const MH_EXECUTE: u32 = 2; | |
| 19 | +pub const MH_DYLIB: u32 = 6; | |
| 20 | +pub const MH_SUBSECTIONS_VIA_SYMBOLS: u32 = 0x2000; | |
| 21 | + | |
| 22 | +pub const LC_SEGMENT_64: u32 = 0x19; | |
| 23 | +pub const LC_SYMTAB: u32 = 0x02; | |
| 24 | +pub const LC_DYSYMTAB: u32 = 0x0B; | |
| 25 | +pub const LC_BUILD_VERSION: u32 = 0x32; | |
| 26 | +pub const LC_LINKER_OPTIMIZATION_HINT: u32 = 0x2E; | |
| 27 | +// ... plus LC_MAIN, LC_DYLD_INFO_ONLY, LC_DYLD_CHAINED_FIXUPS, | |
| 28 | +// LC_FUNCTION_STARTS, LC_DATA_IN_CODE, LC_CODE_SIGNATURE, | |
| 29 | +// LC_ID_DYLIB, LC_LOAD_DYLIB, LC_LOAD_WEAK_DYLIB, | |
| 30 | +// LC_REEXPORT_DYLIB, LC_RPATH, LC_UUID, LC_SOURCE_VERSION. | |
| 31 | +``` | |
| 32 | + | |
| 33 | +### 2. Header parser | |
| 34 | +`afs-ld/src/macho/reader.rs`: | |
| 35 | + | |
| 36 | +```rust | |
| 37 | +pub struct MachHeader64 { | |
| 38 | + pub magic: u32, pub cputype: u32, pub cpusubtype: u32, | |
| 39 | + pub filetype: u32, pub ncmds: u32, pub sizeofcmds: u32, | |
| 40 | + pub flags: u32, pub reserved: u32, | |
| 41 | +} | |
| 42 | + | |
| 43 | +pub fn parse_header(bytes: &[u8]) -> Result<MachHeader64, ReadError>; | |
| 44 | +``` | |
| 45 | + | |
| 46 | +Validate: magic matches MH_MAGIC_64, cputype matches CPU_TYPE_ARM64, `ncmds * 8 <= sizeofcmds`, `32 + sizeofcmds <= bytes.len()`. Clear, sourced diagnostics via `src/diag.rs`. | |
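The validation rules above can be sketched with overflow-safe arithmetic. `check_header`, `HeaderSummary`, and `HeaderError` are hypothetical stand-ins for `parse_header` and `ReadError`, and a little-endian file is assumed (true for arm64 objects). Note that `ncmds * 8` can overflow `u32` on hostile input, so the comparison is widened to `u64`.

```rust
pub const MH_MAGIC_64: u32 = 0xFEED_FACF;
pub const CPU_TYPE_ARM64: u32 = 0x0100_000C;
const HEADER_SIZE: usize = 32; // sizeof(struct mach_header_64)

#[derive(Debug, PartialEq)]
pub enum HeaderError {
    Truncated,
    BadMagic(u32),
    BadCpuType(u32),
    BadCommandSizes,
}

#[derive(Debug, PartialEq)]
pub struct HeaderSummary {
    pub filetype: u32,
    pub ncmds: u32,
    pub sizeofcmds: u32,
    pub flags: u32,
}

/// Read a little-endian u32; the caller guarantees `off + 4 <= bytes.len()`.
fn le32(bytes: &[u8], off: usize) -> u32 {
    u32::from_le_bytes([bytes[off], bytes[off + 1], bytes[off + 2], bytes[off + 3]])
}

pub fn check_header(bytes: &[u8]) -> Result<HeaderSummary, HeaderError> {
    if bytes.len() < HEADER_SIZE {
        return Err(HeaderError::Truncated);
    }
    let magic = le32(bytes, 0);
    if magic != MH_MAGIC_64 {
        return Err(HeaderError::BadMagic(magic));
    }
    let cputype = le32(bytes, 4);
    if cputype != CPU_TYPE_ARM64 {
        return Err(HeaderError::BadCpuType(cputype));
    }
    let (filetype, ncmds, sizeofcmds, flags) =
        (le32(bytes, 12), le32(bytes, 16), le32(bytes, 20), le32(bytes, 24));
    // Every load command is at least 8 bytes; widen to u64 so ncmds * 8 cannot wrap.
    if (ncmds as u64) * 8 > sizeofcmds as u64 {
        return Err(HeaderError::BadCommandSizes);
    }
    if HEADER_SIZE as u64 + sizeofcmds as u64 > bytes.len() as u64 {
        return Err(HeaderError::Truncated);
    }
    Ok(HeaderSummary { filetype, ncmds, sizeofcmds, flags })
}
```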
| 47 | + | |
| 48 | +### 3. Load-command dispatcher | |
| 49 | +`LoadCommand` enum with variants for each command afs-as emits: | |
| 50 | + | |
| 51 | +```rust | |
| 52 | +pub enum LoadCommand { | |
| 53 | + Segment64(Segment64), | |
| 54 | + Symtab(SymtabCmd), | |
| 55 | + Dysymtab(DysymtabCmd), | |
| 56 | + BuildVersion(BuildVersionCmd), | |
| 57 | + LinkerOptimizationHint(LohCmd), | |
| 58 | + // placeholders for later sprints: | |
| 59 | + DyldInfoOnly, DyldChainedFixups, Main, FunctionStarts, | |
| 60 | + DataInCode, CodeSignature, IdDylib, LoadDylib, LoadWeakDylib, | |
| 61 | + ReexportDylib, Rpath, Uuid, SourceVersion, | |
| 62 | + Unknown { cmd: u32, cmdsize: u32, data: Vec<u8> }, | |
| 63 | +} | |
| 64 | + | |
| 65 | +pub fn parse_commands(header: &MachHeader64, bytes: &[u8]) -> Result<Vec<LoadCommand>, ReadError>; | |
| 66 | +``` | |
| 67 | + | |
| 68 | +Exhaustive matching. Unknown commands preserved (not erased) so round-trips survive. | |
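Preserving unknown commands depends on walking the command area by `cmdsize` rather than by known layouts. The sketch below shows one way to do that walk; `RawCommand`, `WalkError`, and `walk_commands` are hypothetical names, and the 8-byte alignment check reflects the rule for 64-bit Mach-O load commands.

```rust
#[derive(Debug, PartialEq)]
pub struct RawCommand<'a> {
    pub cmd: u32,
    /// Bytes after the 8-byte (cmd, cmdsize) prefix, kept verbatim.
    pub body: &'a [u8],
}

#[derive(Debug, PartialEq)]
pub enum WalkError {
    Truncated { index: u32 },
    BadSize { index: u32, cmdsize: u32 },
}

/// Walk `ncmds` load commands stored back to back in `cmds`
/// (the bytes that follow the 32-byte mach_header_64).
pub fn walk_commands(cmds: &[u8], ncmds: u32) -> Result<Vec<RawCommand<'_>>, WalkError> {
    let mut out = Vec::new();
    let mut off = 0usize;
    for index in 0..ncmds {
        let rest = &cmds[off..]; // off never exceeds cmds.len(), see the check below
        if rest.len() < 8 {
            return Err(WalkError::Truncated { index });
        }
        let cmd = u32::from_le_bytes([rest[0], rest[1], rest[2], rest[3]]);
        let cmdsize = u32::from_le_bytes([rest[4], rest[5], rest[6], rest[7]]);
        let size = cmdsize as usize;
        if size < 8 || size % 8 != 0 {
            return Err(WalkError::BadSize { index, cmdsize });
        }
        if size > rest.len() {
            return Err(WalkError::Truncated { index });
        }
        out.push(RawCommand { cmd, body: &rest[8..size] });
        off += size;
    }
    Ok(out)
}
```

A dispatcher can then match on `cmd` and fall back to an `Unknown`-style variant that stores `body` untouched, so re-serialization reproduces the input bytes.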
| 69 | + | |
| 70 | +### 4. Segment + section header parsing (metadata only — contents in Sprint 2) | |
| 71 | +Decode `segment_command_64` (72 bytes) + N `section_64` structs (80 bytes each). Store: | |
| 72 | +- segname (fixed 16 bytes, null-padded) | |
| 73 | +- sectname (fixed 16 bytes, null-padded) | |
| 74 | +- addr, size, offset, align (as log2), reloff, nreloc, flags, reserved1, reserved2, reserved3 | |
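Two details of the fields above are easy to get wrong: the 16-byte names are not NUL-terminated when they use all 16 bytes, and `align` is a power-of-two exponent. A minimal sketch of both, with hypothetical helper names `fixed_name` and `align_bytes`:

```rust
/// Decode a fixed 16-byte segname/sectname. The name ends at the first NUL,
/// or spans all 16 bytes when no NUL is present.
pub fn fixed_name(raw: &[u8; 16]) -> Result<&str, std::str::Utf8Error> {
    let len = raw.iter().position(|&b| b == 0).unwrap_or(raw.len());
    std::str::from_utf8(&raw[..len])
}

/// Convert the log2 `align` field to a byte alignment.
/// Returns None for exponents that do not fit in 64 bits (malformed input).
pub fn align_bytes(align_log2: u32) -> Option<u64> {
    1u64.checked_shl(align_log2)
}
```

`__compact_unwind` is a useful test fixture here because it is exactly 16 characters long and so exercises the no-terminator path.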
| 75 | + | |
| 76 | +### 5. LC_BUILD_VERSION + LC_LINKER_OPTIMIZATION_HINT | |
| 77 | +Decode platform (PLATFORM_MACOS = 1), minos, sdk, ntools, tool records. Decode the LOH blob as raw bytes (interpretation in Sprint 25). | |
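The `minos` and `sdk` fields pack a version as `X.Y.Z` into one `u32`, with 16 bits for the major number and 8 bits each for minor and patch. A small sketch of the decoding, using hypothetical helper names:

```rust
/// Split a packed Mach-O version (xxxx.yy.zz nibble layout) into its parts.
pub fn decode_version(v: u32) -> (u16, u8, u8) {
    ((v >> 16) as u16, (v >> 8) as u8, v as u8) // `as u8` keeps the low 8 bits
}

/// Render a packed version the way a dump tool would print it.
pub fn format_version(v: u32) -> String {
    let (major, minor, patch) = decode_version(v);
    format!("{major}.{minor}.{patch}")
}
```

For example, a `minos` of `0x000B0000` corresponds to macOS 11.0.0.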
| 78 | + | |
| 79 | +### 6. Pretty-printer | |
| 80 | +`afs-ld/src/bin/dump.rs` (optional subcommand `afs-ld --dump <path>`): otool-like output. Used by the round-trip harness. | |
| 81 | + | |
| 82 | +## Testing Strategy | |
| 83 | +- Round-trip test: for every `.o` in `afs-as/tests/corpus/`, parse, serialize back into the same byte layout (no reshuffling in this sprint — just read+echo), compare. | |
| 84 | +- Malformed-input tests: truncated header, wrong magic, wrong cputype, `ncmds` lying about `sizeofcmds`, unaligned commands. Each must produce a specific diagnostic, never a panic. | |
| 85 | +- Differential: `otool -lV` against our dumper for the full corpus. Diff must be zero after whitespace normalization. | |
| 86 | + | |
| 87 | +## Definition of Done | |
| 88 | +- All afs-as corpus `.o` files parse cleanly. | |
| 89 | +- Every load command afs-as emits is represented in `LoadCommand`. | |
| 90 | +- Malformed-input fuzz finds no panics. | |
| 91 | +- Round-trip byte-level equality on the full corpus. | |
| 92 | +- `otool -lV` and our dumper agree after whitespace normalization. | |
.docs/sprints/sprint02.md added @@ -0,0 +1,104 @@
| 1 | +# Sprint 2: Sections, Symbols, String Tables | |
| 2 | + | |
| 3 | +## Prerequisites | |
| 4 | +Sprint 1 — header + load commands parsed. | |
| 5 | + | |
| 6 | +## Goals | |
| 7 | +Decode section payloads, the symbol table (nlist_64), and the string table. Expose the full section/symbol/string model that later sprints build on. | |
| 8 | + | |
| 9 | +## Deliverables | |
| 10 | + | |
| 11 | +### 1. Section attributes and kinds | |
| 12 | +`afs-ld/src/section.rs`: `SectionKind` mirrors afs-as's enum but is richer on the reader side (inputs arrive with flags already set): | |
| 13 | + | |
| 14 | +```rust | |
| 15 | +pub enum SectionKind { | |
| 16 | + Text, CStringLiterals, Literal4, Literal8, Literal16, | |
| 17 | + ConstData, Data, ZeroFill, GbZeroFill, | |
| 18 | + ThreadLocalRegular, ThreadLocalZerofill, | |
| 19 | + ThreadLocalVariables, ThreadLocalInitPointers, | |
| 20 | + CompactUnwind, EhFrame, Coalesced, | |
| 21 | + Regular, Unknown, | |
| 22 | +} | |
| 23 | + | |
| 24 | +pub fn kind_from_flags(flags: u32) -> SectionKind; // S_* attribute bits | |
| 25 | +``` | |
| 26 | + | |
| 27 | +Respect all `S_ATTR_*` flags and the section-type byte (`flags & 0xff`, the `SECTION_TYPE` mask). | |
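One possible shape for `kind_from_flags`, offered as a sketch rather than the planned implementation: the low byte of `flags` selects the section type, and `S_ATTR_PURE_INSTRUCTIONS` separates code from other regular sections. The enum here is a trimmed copy of the one above; variants such as `ConstData`, `Data`, `CompactUnwind`, and `EhFrame` are assumed to need the segment and section names as well, so they are left out.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SectionKind {
    Text, CStringLiterals, Literal4, Literal8, Literal16,
    ZeroFill, GbZeroFill,
    ThreadLocalRegular, ThreadLocalZerofill,
    ThreadLocalVariables, ThreadLocalInitPointers,
    Coalesced, Regular, Unknown,
}

// Values from <mach-o/loader.h>.
const SECTION_TYPE: u32 = 0x0000_00ff;
const S_ATTR_PURE_INSTRUCTIONS: u32 = 0x8000_0000;

/// Classify a section from its `flags` word alone.
pub fn kind_from_flags(flags: u32) -> SectionKind {
    match flags & SECTION_TYPE {
        // S_REGULAR: code if the pure-instructions attribute is set.
        0x00 if flags & S_ATTR_PURE_INSTRUCTIONS != 0 => SectionKind::Text,
        0x00 => SectionKind::Regular,
        0x01 => SectionKind::ZeroFill,                // S_ZEROFILL
        0x02 => SectionKind::CStringLiterals,         // S_CSTRING_LITERALS
        0x03 => SectionKind::Literal4,                // S_4BYTE_LITERALS
        0x04 => SectionKind::Literal8,                // S_8BYTE_LITERALS
        0x0b => SectionKind::Coalesced,               // S_COALESCED
        0x0c => SectionKind::GbZeroFill,              // S_GB_ZEROFILL
        0x0e => SectionKind::Literal16,               // S_16BYTE_LITERALS
        0x11 => SectionKind::ThreadLocalRegular,      // S_THREAD_LOCAL_REGULAR
        0x12 => SectionKind::ThreadLocalZerofill,     // S_THREAD_LOCAL_ZEROFILL
        0x13 => SectionKind::ThreadLocalVariables,    // S_THREAD_LOCAL_VARIABLES
        0x15 => SectionKind::ThreadLocalInitPointers, // S_THREAD_LOCAL_INIT_FUNCTION_POINTERS
        _ => SectionKind::Unknown,
    }
}
```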
| 28 | + | |
| 29 | +### 2. Section content slicing | |
| 30 | +`InputSection` struct: segment, name, kind, addr, size, align (log2), flags, raw `data: &[u8]` borrowed from the mmap'd input, plus the raw relocation entries as `&[u8]` (decoded in Sprint 3). For `S_ZEROFILL` / `S_THREAD_LOCAL_ZEROFILL`, `data` is empty; size is virtual. | |
| 31 | + | |
| 32 | +### 3. nlist_64 and symbol flags | |
| 33 | +`afs-ld/src/symbol.rs`: | |
| 34 | + | |
| 35 | +```rust | |
| 36 | +pub const N_STAB: u8 = 0xe0; | |
| 37 | +pub const N_PEXT: u8 = 0x10; | |
| 38 | +pub const N_TYPE: u8 = 0x0e; // mask | |
| 39 | +pub const N_EXT: u8 = 0x01; | |
| 40 | + | |
| 41 | +pub const N_UNDF: u8 = 0x0; | |
| 42 | +pub const N_ABS: u8 = 0x2; | |
| 43 | +pub const N_SECT: u8 = 0xe; | |
| 44 | +pub const N_INDR: u8 = 0xa; | |
| 45 | + | |
| 46 | +pub const N_NO_DEAD_STRIP: u16 = 0x0020; | |
| 47 | +pub const N_WEAK_REF: u16 = 0x0040; | |
| 48 | +pub const N_WEAK_DEF: u16 = 0x0080; | |
| 49 | +pub const N_ARM_THUMB_DEF: u16 = 0x0008; | |
| 50 | +pub const N_SYMBOL_RESOLVER: u16 = 0x0100; | |
| 51 | + | |
| 52 | +pub struct RawNlist { | |
| 53 | + pub strx: u32, | |
| 54 | + pub n_type: u8, pub n_sect: u8, pub n_desc: u16, | |
| 55 | + pub n_value: u64, | |
| 56 | +} | |
| 57 | + | |
| 58 | +pub struct InputSymbol<'a> { | |
| 59 | + pub name: &'a str, | |
| 60 | + pub kind: SymKind, // Undef, Abs, SectLocal, SectExt, PExt, Indirect | |
| 61 | + pub weak_ref: bool, pub weak_def: bool, | |
| 62 | + pub no_dead_strip: bool, pub private_extern: bool, | |
| 63 | + pub sect_idx: u8, pub value: u64, | |
| 64 | + pub common_align_pow2: Option<u8>, // from n_desc bits 8..11 (GET_COMM_ALIGN) when UNDF + value != 0 | |
| 65 | +} | |
| 66 | +``` | |
| 67 | + | |
| 68 | +Common symbols are detected the way afs-as emits them: `N_UNDF | N_EXT` with nonzero `n_value` encoding the size and `(n_desc >> 8) & 0x0f` (the `GET_COMM_ALIGN` macro) encoding the log2 alignment. | |
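The common-symbol test can be sketched as follows, assuming little-endian `nlist_64` records of 16 bytes and the `GET_COMM_ALIGN` definition from `<mach-o/nlist.h>` (`(n_desc >> 8) & 0x0f`). `parse_nlist_64` and `common_size_align` are hypothetical helper names.

```rust
pub const N_STAB: u8 = 0xe0;
pub const N_TYPE: u8 = 0x0e;
pub const N_EXT: u8 = 0x01;
pub const N_UNDF: u8 = 0x0;

#[derive(Debug, PartialEq)]
pub struct RawNlist {
    pub strx: u32,
    pub n_type: u8,
    pub n_sect: u8,
    pub n_desc: u16,
    pub n_value: u64,
}

/// Decode one 16-byte little-endian `nlist_64` record.
pub fn parse_nlist_64(b: &[u8; 16]) -> RawNlist {
    RawNlist {
        strx: u32::from_le_bytes([b[0], b[1], b[2], b[3]]),
        n_type: b[4],
        n_sect: b[5],
        n_desc: u16::from_le_bytes([b[6], b[7]]),
        n_value: u64::from_le_bytes([
            b[8], b[9], b[10], b[11], b[12], b[13], b[14], b[15],
        ]),
    }
}

/// Returns `(size, log2 alignment)` when the entry is a tentative (common)
/// definition: not a stab, type N_UNDF, external, and a nonzero `n_value`.
pub fn common_size_align(n: &RawNlist) -> Option<(u64, u8)> {
    let is_stab = (n.n_type & N_STAB) != 0;
    let undf_ext = (n.n_type & N_TYPE) == N_UNDF && (n.n_type & N_EXT) != 0;
    if !is_stab && undf_ext && n.n_value != 0 {
        Some((n.n_value, ((n.n_desc >> 8) & 0x0f) as u8))
    } else {
        None
    }
}
```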
| 69 | + | |
| 70 | +### 4. Indirect (N_INDR) pass-through | |
| 71 | +Alias symbols: record the aliased name from the string table via `n_value` used as a strx into the string table. Resolution lives in Sprint 7; this sprint just surfaces the data. | |
| 72 | + | |
| 73 | +### 5. String table reader | |
| 74 | +`StringTable` wraps the raw bytes of `__LINKEDIT` string table, exposes `name_at(strx: u32) -> &str`, validates null termination, gracefully handles the suffix-dedup trick afs-as uses (`"_foo\0"` can overlap with a later `"_bar_foo\0"` by pointing mid-string). | |
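A small sketch of how `name_at` could behave, assuming a `Result`-returning signature (the plan above shows a bare `&str`) and a hypothetical `StrError` type. Because a lookup only scans forward to the next NUL, suffix-shared names need no special handling: an index pointing mid-string simply yields the shorter name.

```rust
#[derive(Debug, PartialEq)]
pub enum StrError {
    OutOfBounds,
    Unterminated,
    NotUtf8,
}

pub struct StringTable<'a> {
    bytes: &'a [u8],
}

impl<'a> StringTable<'a> {
    pub fn new(bytes: &'a [u8]) -> Self {
        Self { bytes }
    }

    /// Name starting at `strx`, running to (not including) the next NUL.
    pub fn name_at(&self, strx: u32) -> Result<&'a str, StrError> {
        let tail = self
            .bytes
            .get(strx as usize..)
            .ok_or(StrError::OutOfBounds)?;
        let len = tail
            .iter()
            .position(|&b| b == 0)
            .ok_or(StrError::Unterminated)?;
        std::str::from_utf8(&tail[..len]).map_err(|_| StrError::NotUtf8)
    }
}
```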
| 75 | + | |
| 76 | +### 6. DYSYMTAB partitioning | |
| 77 | +Decode the partition `(ilocalsym, nlocalsym)`, `(iextdefsym, nextdefsym)`, `(iundefsym, nundefsym)`. Record `toc`, `modtab`, `extrefsym`, `indirectsymoff/nindirectsyms`, `extreloff`, `locreloff` offsets for later phases (most are for dylibs). | |
| 78 | + | |
| 79 | +### 7. Input file model | |
| 80 | +`afs-ld/src/input.rs`: | |
| 81 | + | |
| 82 | +```rust | |
| 83 | +pub struct ObjectFile { | |
| 84 | + pub path: PathBuf, | |
| 85 | + pub header: MachHeader64, | |
| 86 | + pub commands: Vec<LoadCommand>, | |
| 87 | + pub sections: Vec<InputSection>, | |
| 88 | + pub symbols: Vec<InputSymbol>, | |
| 89 | + pub strings: StringTable, | |
| 90 | + pub dysymtab: DysymtabView, | |
| 91 | +} | |
| 92 | +``` | |
| 93 | + | |
| 94 | +## Testing Strategy | |
| 95 | +- Round-trip: parse every section/symbol/string from the afs-as corpus; re-emit; match bytes. | |
| 96 | +- Diffing against `nm -a` and `otool -r` for symbols and relocation offsets (relocation bodies come in Sprint 3). | |
| 97 | +- Edge cases: empty `__bss`, tentative common with 16-byte alignment, weak-def with `N_NO_DEAD_STRIP`, indirect symbol chains. | |
| 98 | +- Fuzz: malformed nlist entries (strx out of bounds, n_sect out of range, invalid n_type bits) produce sourced diagnostics, never panics. | |
| 99 | + | |
| 100 | +## Definition of Done | |
| 101 | +- Every symbol attribute afs-as can emit is recognized and round-trips. | |
| 102 | +- Common symbols surface with correct size and alignment. | |
| 103 | +- String table reader handles suffix-dedup overlaps correctly. | |
| 104 | +- Corpus-wide symbol and section parity against `nm -a` / `otool -v`. | |
.docs/sprints/sprint03.md added @@ -0,0 +1,91 @@ | ||
| 1 | +# Sprint 3: Relocations (Read-Side) | |
| 2 | + | |
| 3 | +## Prerequisites | |
| 4 | +Sprints 1–2 — header/load commands and section/symbol parsing in place. | |
| 5 | + | |
| 6 | +## Goals | |
| 7 | +Decode every ARM64 relocation type afs-as emits. Normalize paired relocations (ADDEND + primary, SUBTRACTOR + UNSIGNED) into a linker-friendly form. End state: the linker's reloc model captures every arithmetic and semantic constraint needed by Sprint 11. | |
| 8 | + | |
| 9 | +## Deliverables | |
| 10 | + | |
| 11 | +### 1. Relocation constants and raw form | |
| 12 | +`afs-ld/src/macho/constants.rs` additions: | |
| 13 | + | |
| 14 | +```rust | |
| 15 | +pub const ARM64_RELOC_UNSIGNED: u8 = 0; | |
| 16 | +pub const ARM64_RELOC_SUBTRACTOR: u8 = 1; | |
| 17 | +pub const ARM64_RELOC_BRANCH26: u8 = 2; | |
| 18 | +pub const ARM64_RELOC_PAGE21: u8 = 3; | |
| 19 | +pub const ARM64_RELOC_PAGEOFF12: u8 = 4; | |
| 20 | +pub const ARM64_RELOC_GOT_LOAD_PAGE21: u8 = 5; | |
| 21 | +pub const ARM64_RELOC_GOT_LOAD_PAGEOFF12: u8 = 6; | |
| 22 | +pub const ARM64_RELOC_POINTER_TO_GOT: u8 = 7; | |
| 23 | +pub const ARM64_RELOC_TLVP_LOAD_PAGE21: u8 = 8; | |
| 24 | +pub const ARM64_RELOC_TLVP_LOAD_PAGEOFF12: u8 = 9; | |
| 25 | +pub const ARM64_RELOC_ADDEND: u8 = 10; | |
| 26 | +``` | |
| 27 | + | |
| 28 | +Raw `relocation_info`: 8 bytes. `r_address: i32`, `r_info: u32` packed, least-significant bit first, as `[r_symbolnum:24][r_pcrel:1][r_length:2][r_extern:1][r_type:4]`. | |
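To make the bit packing concrete, here is a sketch of unpacking one little-endian `relocation_info` entry, reading the fields from the least-significant bit upward. `RawReloc` and `decode_relocation_info` are hypothetical names.

```rust
#[derive(Debug, PartialEq)]
pub struct RawReloc {
    pub r_address: i32,
    pub r_symbolnum: u32, // 24 bits
    pub r_pcrel: bool,
    pub r_length: u8, // log2 of the byte width
    pub r_extern: bool,
    pub r_type: u8, // 4 bits
}

pub fn decode_relocation_info(bytes: &[u8; 8]) -> RawReloc {
    let r_address = i32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]);
    let info = u32::from_le_bytes([bytes[4], bytes[5], bytes[6], bytes[7]]);
    RawReloc {
        r_address,
        r_symbolnum: info & 0x00ff_ffff,
        r_pcrel: (info >> 24) & 1 != 0,
        r_length: ((info >> 25) & 0x3) as u8,
        r_extern: (info >> 27) & 1 != 0,
        r_type: (info >> 28) as u8,
    }
}
```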
| 29 | + | |
| 30 | +### 2. Parsed relocation form | |
| 31 | +`afs-ld/src/reloc/mod.rs`: | |
| 32 | + | |
| 33 | +```rust | |
| 34 | +pub struct Reloc { | |
| 35 | + pub offset: u32, // byte offset into input section | |
| 36 | + pub kind: RelocKind, | |
| 37 | + pub length: RelocLength, // Byte=0, Half=1, Word=2, Quad=3 | |
| 38 | + pub pcrel: bool, | |
| 39 | + pub referent: Referent, | |
| 40 | + pub addend: i64, // folded from ARM64_RELOC_ADDEND prefix or inline | |
| 41 | +} | |
| 42 | + | |
| 43 | +pub enum RelocKind { | |
| 44 | + Unsigned, Branch26, | |
| 45 | + Page21, PageOff12, | |
| 46 | + GotLoadPage21, GotLoadPageOff12, PointerToGot, | |
| 47 | + TlvpLoadPage21, TlvpLoadPageOff12, | |
| 48 | + Subtractor, // fused pair: minuend (from the UNSIGNED entry) in `referent`; subtrahend comes from the preceding raw SUBTRACTOR | |
| 49 | +} | |
| 50 | + | |
| 51 | +pub enum Referent { | |
| 52 | + Symbol(SymRef), // r_extern = 1 | |
| 53 | + Section(SectRef), // r_extern = 0 | |
| 54 | +} | |
| 55 | +``` | |
| 56 | + | |
| 57 | +### 3. Paired reloc fusion | |
| 58 | +Two pairings afs-as emits: | |
| 59 | + | |
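| 60 | +1. **ARM64_RELOC_ADDEND**: a prefix reloc whose `r_symbolnum` field is actually a 24-bit signed addend. The next reloc in the list is the primary: per `<mach-o/arm64/reloc.h>` that is PAGE21 or PAGEOFF12, and LLVM-style assemblers also pair it with BRANCH26. UNSIGNED carries its addend inline in the section bytes, and GOT/TLVP relocs take no addend. Parser fuses: `addend_reloc.symnum → primary.addend`, primary kept. | |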
| 61 | + | |
| 62 | +2. **ARM64_RELOC_SUBTRACTOR + ARM64_RELOC_UNSIGNED**: difference expression. Emitted as a pair where SUBTRACTOR names the subtrahend symbol and UNSIGNED names the minuend. Parser fuses into a single `RelocKind::Subtractor { minuend, subtrahend }` record on the minuend-carrying entry. | |
| 63 | + | |
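| 64 | +After fusion, no standalone ADDEND or raw SUBTRACTOR entries should leak out of the reader; a difference expression surfaces only as the fused `RelocKind::Subtractor` record. | |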
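The two fusions can be sketched over a simplified raw entry that keeps only `r_type` and `r_symbolnum`. This is an illustration under that simplification, not the planned `Reloc` model; `Raw`, `Fused`, `FuseError`, and `fuse` are hypothetical names.

```rust
pub const UNSIGNED: u8 = 0;
pub const SUBTRACTOR: u8 = 1;
pub const ADDEND: u8 = 10;

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Raw {
    pub r_type: u8,
    pub symbolnum: u32,
}

#[derive(Debug, PartialEq)]
pub enum Fused {
    Plain { r_type: u8, sym: u32, addend: i64 },
    Difference { minuend: u32, subtrahend: u32 },
}

#[derive(Debug, PartialEq)]
pub enum FuseError {
    UnpairedAddend(usize),
    UnpairedSubtractor(usize),
}

/// Sign-extend the 24-bit addend stored in `r_symbolnum`.
pub fn sign_extend_24(v: u32) -> i64 {
    i64::from(((v << 8) as i32) >> 8)
}

pub fn fuse(raw: &[Raw]) -> Result<Vec<Fused>, FuseError> {
    let mut out = Vec::new();
    let mut i = 0;
    while i < raw.len() {
        let r = raw[i];
        match r.r_type {
            ADDEND => {
                // The prefix must be followed by a primary, never another prefix.
                let next = raw
                    .get(i + 1)
                    .filter(|n| n.r_type != ADDEND && n.r_type != SUBTRACTOR)
                    .ok_or(FuseError::UnpairedAddend(i))?;
                out.push(Fused::Plain {
                    r_type: next.r_type,
                    sym: next.symbolnum,
                    addend: sign_extend_24(r.symbolnum),
                });
                i += 2;
            }
            SUBTRACTOR => {
                // SUBTRACTOR names the subtrahend; the UNSIGNED after it names the minuend.
                let next = raw
                    .get(i + 1)
                    .filter(|n| n.r_type == UNSIGNED)
                    .ok_or(FuseError::UnpairedSubtractor(i))?;
                out.push(Fused::Difference {
                    minuend: next.symbolnum,
                    subtrahend: r.symbolnum,
                });
                i += 2;
            }
            _ => {
                out.push(Fused::Plain { r_type: r.r_type, sym: r.symbolnum, addend: 0 });
                i += 1;
            }
        }
    }
    Ok(out)
}
```

The error variants carry the index of the offending entry so that a caller could turn them into a diagnostic citing the input path and offset.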
| 65 | + | |
| 66 | +### 4. Integrity checks | |
| 67 | +- `r_address + (length_bytes)` within section bounds. | |
| 68 | +- `r_extern = 1` → `r_symbolnum < nsyms`. | |
| 69 | +- `r_extern = 0` → `r_symbolnum` is a 1-based section index in range. | |
| 70 | +- `r_pcrel` matches the reloc kind (PC-relative: BRANCH26, PAGE21, GOT_LOAD_PAGE21, TLVP_LOAD_PAGE21, and the 32-bit form of POINTER_TO_GOT; not PC-relative: UNSIGNED, SUBTRACTOR, ADDEND, and the PAGEOFF12 / GOT_LOAD_PAGEOFF12 / TLVP_LOAD_PAGEOFF12 variants). | |
| 71 | +- `r_length` matches kind (all ARM64 reloc kinds are length = 2 except UNSIGNED which is length = 2 or 3 and SUBTRACTOR which matches UNSIGNED). | |
| 72 | + | |
| 73 | +### 5. Round-trip serializer (for golden tests) | |
| 74 | +`afs-ld/src/reloc/mod.rs::write_relocs(sect: &InputSection, relocs: &[Reloc]) -> Vec<u8>` reassembles into Mach-O wire form, including the ADDEND prefix when necessary. Used to prove the reader lost nothing. | |
| 75 | + | |
| 76 | +## Testing Strategy | |
| 77 | +- Round-trip every reloc in the afs-as corpus. Byte equality after ADDEND/SUBTRACTOR fusion + re-emission. | |
| 78 | +- Synthetic fixtures for each reloc kind (smallest possible `.s` input through afs-as): | |
| 79 | + - `bl _extern` → BRANCH26 external. | |
| 80 | + - `adrp x0, _g@PAGE; add x0, x0, _g@PAGEOFF` → PAGE21 + PAGEOFF12. | |
| 81 | + - `adrp x0, _g@GOTPAGE; ldr x0, [x0, _g@GOTPAGEOFF]` → GOT_LOAD_PAGE21 + GOT_LOAD_PAGEOFF12. | |
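| 82 | + - `.quad _g + 0x1000` → UNSIGNED with the addend stored inline in the section bytes; a page reference to `_g` plus a nonzero offset → ADDEND + PAGE21 pair. | |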
| 83 | + - `.quad _a - _b` → SUBTRACTOR + UNSIGNED pair. | |
| 84 | + - `adrp x0, _tlv@TLVPPAGE; ldr x0, [x0, _tlv@TLVPPAGEOFF]` → TLVP_LOAD_* pair. | |
| 85 | +- Malformed-input: reloc pointing past section end, unpaired SUBTRACTOR, unpaired ADDEND. Each produces a specific diagnostic citing input path and offset. | |
| 86 | + | |
| 87 | +## Definition of Done | |
| 88 | +- Every ARM64 reloc afs-as emits is represented in `RelocKind` post-fusion. | |
| 89 | +- Paired relocs never leak as separate entries into downstream code. | |
| 90 | +- Corpus-wide round-trip byte equality. | |
| 91 | +- Integrity checks trigger diagnostics on malformed fixtures. | |
.docs/sprints/sprint04.md added @@ -0,0 +1,92 @@ | ||
| 1 | +# Sprint 4: Static Archives (`ar`) | |
| 2 | + | |
| 3 | +## Prerequisites | |
| 4 | +Sprints 1–3 — Mach-O reading complete. | |
| 5 | + | |
| 6 | +## Goals | |
| 7 | +Read static archives (`.a`) including the BSD, System V, and GNU-thin variants. Support lazy member fetching: a member is only parsed when an undefined symbol names it. This is the mechanism by which `libarmfortas_rt.a` gets pulled in. | |
| 8 | + | |
| 9 | +## Deliverables | |
| 10 | + | |
| 11 | +### 1. Archive format recognizer | |
| 12 | +`afs-ld/src/archive.rs`: | |
| 13 | + | |
| 14 | +```rust | |
| 15 | +pub struct Archive<'a> { | |
| 16 | + pub path: PathBuf, | |
| 17 | + pub flavor: Flavor, // Bsd, Sysv, GnuThin | |
| 18 | + pub symdef: SymbolIndex, // names → member offsets | |
| 19 | + pub members: Vec<Member<'a>>, | |
| 20 | +} | |
| 21 | + | |
| 22 | +pub enum Flavor { Bsd, Sysv, GnuThin } | |
| 23 | +``` | |
| 24 | + | |
| 25 | +Detection by magic: `!<arch>\n` for the BSD and SysV flavors; thin archives use `!<thin>\n`. BSD vs SysV distinguished by the first entry: `#1/<N>` BSD extended filenames vs `//` SysV long-name string table. | |
| 26 | + | |
| 27 | +### 2. Header parsing | |
| 28 | +Each member preceded by a 60-byte `ar_hdr`: | |
| 29 | +``` | |
| 30 | +char name[16]; // "#1/<N>" on BSD, or "/<offset>" into the "//" string table on SysV, or "foo.o/" on SysV short | |
| 31 | +char date[12]; | |
| 32 | +char uid[6]; | |
| 33 | +char gid[6]; | |
| 34 | +char mode[8]; | |
| 35 | +char size[10]; | |
| 36 | +char fmag[2]; // "`\n" | |
| 37 | +``` | |
| 38 | + | |
| 39 | +Parse field-by-field with tight bounds checks. Size is a decimal ASCII integer, not a C literal. | |
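A bounds-checked sketch of the header parse, assuming the 60-byte layout above (size at bytes 48..58, `fmag` at 58..60). Only the name and size are extracted; `ArHeader`, `ArError`, and `parse_ar_hdr` are hypothetical names.

```rust
#[derive(Debug, PartialEq)]
pub enum ArError {
    Truncated,
    BadMagic,
    BadSize,
}

#[derive(Debug, PartialEq)]
pub struct ArHeader {
    pub name: String, // raw name field with trailing padding removed
    pub size: u64,
}

pub fn parse_ar_hdr(h: &[u8]) -> Result<ArHeader, ArError> {
    if h.len() < 60 {
        return Err(ArError::Truncated);
    }
    // Every header ends with the two-byte terminator: backtick, newline.
    if h[58..60] != *b"`\n" {
        return Err(ArError::BadMagic);
    }
    let name = String::from_utf8_lossy(&h[0..16]).trim_end().to_string();
    // Size is space-padded decimal ASCII; reject anything else, including "0x..".
    let digits = std::str::from_utf8(&h[48..58])
        .map_err(|_| ArError::BadSize)?
        .trim_end_matches(' ');
    if digits.is_empty() || !digits.bytes().all(|b| b.is_ascii_digit()) {
        return Err(ArError::BadSize);
    }
    let size = digits.parse::<u64>().map_err(|_| ArError::BadSize)?;
    Ok(ArHeader { name, size })
}
```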
| 40 | + | |
| 41 | +### 3. Name decoding | |
| 42 | +- BSD: name field `#1/<N>`, real name is the first N bytes of the member body (body shrinks accordingly). | |
| 43 | +- SysV: name field `/<offset>` holds a decimal byte offset into the `//` string table. | |
| 44 | +- SysV short: `foo.o/ ` — slash-terminated, space-padded. | |
| 45 | +- GNU-thin: member body is zero bytes; the name encodes a path relative to the archive. afs-ld `mmap`s the external file. | |
| 46 | + | |
| 47 | +Names stored canonical (null-stripped, slash-stripped). | |
| 48 | + | |
| 49 | +### 4. Symbol index | |
| 50 | +SysV `/` member or BSD `__.SYMDEF` / `__.SYMDEF SORTED` member. BSD layout: | |
| 51 | +``` | |
| 52 | +uint32 ranlib_bytes // byte size of the ranlib array; entry count = ranlib_bytes / 8 | |
| 53 | +ranlib[ranlib_bytes / 8] { uint32 strx; uint32 offset; } | |
| 54 | +uint32 stringsize | |
| 55 | +char strings[stringsize] | |
| 56 | +``` | |
| 57 | + | |
| 58 | +SysV: big-endian `nsyms: u32`, then `nsyms` big-endian `u32` offsets, then packed null-terminated strings. | |
| 59 | + | |
| 60 | +`SymbolIndex` exposes `fn members_defining(&self, name: &str) -> impl Iterator<Item = MemberRef>`. | |
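A sketch of reading the BSD `__.SYMDEF` body into `(name, member offset)` pairs. It assumes little-endian fields and that, as in Apple's archives, the leading word is the ranlib array's size in bytes (so the entry count is that value divided by 8). `parse_bsd_symdef` is a hypothetical name.

```rust
use std::convert::TryInto;

/// Bounds-checked little-endian u32 read.
fn u32_le(b: &[u8], off: usize) -> Option<u32> {
    let end = off.checked_add(4)?;
    Some(u32::from_le_bytes(b.get(off..end)?.try_into().ok()?))
}

/// Returns `None` on any truncation or out-of-range index.
pub fn parse_bsd_symdef(body: &[u8]) -> Option<Vec<(String, u32)>> {
    let ranlib_bytes = u32_le(body, 0)? as usize;
    if ranlib_bytes % 8 != 0 {
        return None;
    }
    let str_size_off = 4usize.checked_add(ranlib_bytes)?;
    let str_size = u32_le(body, str_size_off)? as usize;
    let str_start = str_size_off.checked_add(4)?;
    let strings = body.get(str_start..str_start.checked_add(str_size)?)?;

    let mut out = Vec::new();
    for i in 0..ranlib_bytes / 8 {
        let strx = u32_le(body, 4 + i * 8)? as usize;
        let member_off = u32_le(body, 8 + i * 8)?;
        let tail = strings.get(strx..)?;
        let len = tail.iter().position(|&b| b == 0)?;
        let name = std::str::from_utf8(&tail[..len]).ok()?.to_string();
        out.push((name, member_off));
    }
    Some(out)
}
```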
| 61 | + | |
| 62 | +### 5. Lazy fetch API | |
| 63 | +```rust | |
| 64 | +impl<'a> Archive<'a> { | |
| 65 | + pub fn fetch(&mut self, name: &str) -> Option<ObjectFile>; | |
| 66 | +} | |
| 67 | +``` | |
| 68 | + | |
| 69 | +Returns `None` if the archive does not define `name`. Fetching an archive member memoizes: a second lookup for the same member returns a cached handle. The resolution pass (Sprint 8) is the only caller. | |
| 70 | + | |
| 71 | +### 6. `-force_load` / `-all_load` support (semantics, not CLI yet) | |
| 72 | +Archive has a `force_all(&mut self)` method that pre-fetches every member. Sprint 19 wires the CLI. | |
| 73 | + | |
| 74 | +### 7. Archive-of-archives | |
| 75 | +Rare but legal: member can be another `.a`. Recurse one level. If a sub-archive defines `name`, the outer `fetch` returns the sub-member's object file and records a provenance chain for diagnostics. | |
| 76 | + | |
| 77 | +## Testing Strategy | |
| 78 | +- Fixtures in `tests/corpus/archives/`: | |
| 79 | + - `libbsd.a` made by Apple `ar` (BSD flavor, extended filenames). | |
| 80 | + - `libsysv.a` made by GNU `ar` on Linux (for cross-check). | |
| 81 | + - `libthin.a` made by `ar --thin` (GNU-thin). | |
| 82 | + - `libmulti.a` containing several members each defining one or more symbols. | |
| 83 | +- `cargo test -p afs-ld test_archive_bsd` verifies BSD index → correct member for each name. | |
| 84 | +- A symbol defined by two members: the archive picks the member that comes first (ld's traditional rule). | |
| 85 | +- Missing-symbol lookup returns `None`, does not error. | |
| 86 | +- Thin-archive member file missing on disk produces a path-qualified diagnostic. | |
| 87 | + | |
| 88 | +## Definition of Done | |
| 89 | +- All three archive flavors read. | |
| 90 | +- `libarmfortas_rt.a` (built by parent workspace) parses and every runtime symbol is findable by name. | |
| 91 | +- Archive-of-archives works one level deep. | |
| 92 | +- Differential: `ar -t libarmfortas_rt.a` output matches our `--dump-archive` output. | |
.docs/sprints/sprint05.md added @@ -0,0 +1,78 @@ | ||
| 1 | +# Sprint 5: Dylibs (MH_DYLIB Binary) | |
| 2 | + | |
| 3 | +## Prerequisites | |
| 4 | +Sprints 1–3 — Mach-O reading complete. | |
| 5 | + | |
| 6 | +## Goals | |
| 7 | +Parse binary dylibs (`MH_DYLIB`). Extract exported symbols via the export trie (located through `LC_DYLD_INFO_ONLY` or `LC_DYLD_EXPORTS_TRIE`), resolve re-exports through umbrella frameworks, and expose a linkable `DylibFile` surface. | |
| 8 | + | |
| 9 | +## Deliverables | |
| 10 | + | |
| 11 | +### 1. DylibFile model | |
| 12 | +`afs-ld/src/macho/dylib.rs`: | |
| 13 | + | |
| 14 | +```rust | |
| 15 | +pub struct DylibFile { | |
| 16 | + pub path: PathBuf, | |
| 17 | + pub install_name: String, | |
| 18 | + pub current_version: u32, // X.Y.Z packed | |
| 19 | + pub compat_version: u32, | |
| 20 | + pub is_umbrella: bool, | |
| 21 | + pub load_kind: DylibLoadKind, // Normal, Weak, Reexport, Upward | |
| 22 | + pub ordinal: u16, // two-level namespace ordinal | |
| 23 | + pub reexports: Vec<PathBuf>, // LC_REEXPORT_DYLIB paths | |
| 24 | + pub exports: ExportTrie, // resolved during loading | |
| 25 | +} | |
| 26 | + | |
| 27 | +pub enum DylibLoadKind { Normal, Weak, Reexport, Upward } | |
| 28 | +``` | |
| 29 | + | |
| 30 | +### 2. Load command decoding | |
| 31 | +- `LC_ID_DYLIB` (for the dylib itself): install_name, timestamp, current_version, compat_version. | |
| 32 | +- `LC_LOAD_DYLIB`: normal dependency. | |
| 33 | +- `LC_LOAD_WEAK_DYLIB`: weak dep (imports allowed to be null at runtime). | |
| 34 | +- `LC_REEXPORT_DYLIB`: dependency whose exports we rebroadcast (umbrella-framework case). | |
| 35 | +- `LC_LOAD_UPWARD_DYLIB`: cyclic dependency escape hatch. | |
| 36 | + | |
| 37 | +### 3. Export trie decoder | |
| 38 | +Export trie lives in `__LINKEDIT`, pointed at by either `LC_DYLD_INFO_ONLY.export_off/export_size` (classic) or the `dataoff/datasize` of `LC_DYLD_EXPORTS_TRIE`, which accompanies `LC_DYLD_CHAINED_FIXUPS` in modern binaries. Trie format: | |
| 39 | + | |
| 40 | +- Each node: ULEB128 terminal-size, optional terminal payload (flags ULEB + address ULEB, plus re-export or resolver data), then a one-byte child count, then `(null-terminated edge_string, child_offset_ULEB)` pairs. | |
| 41 | +- Terminal flags: `EXPORT_SYMBOL_FLAGS_KIND_REGULAR`/`_THREAD_LOCAL`/`_ABSOLUTE`, `EXPORT_SYMBOL_FLAGS_WEAK_DEFINITION`, `EXPORT_SYMBOL_FLAGS_REEXPORT`, `EXPORT_SYMBOL_FLAGS_STUB_AND_RESOLVER`. | |
| 42 | + | |
| 43 | +```rust | |
| 44 | +pub struct ExportTrie { /* walk-only view */ } | |
| 45 | +impl ExportTrie { | |
| 46 | + pub fn lookup(&self, name: &str) -> Option<ExportEntry>; | |
| 47 | + pub fn iter(&self) -> impl Iterator<Item = (String, ExportEntry)>; | |
| 48 | +} | |
| 49 | + | |
| 50 | +pub struct ExportEntry { | |
| 51 | + pub flags: u32, | |
| 52 | + pub address: u64, | |
| 53 | + pub reexport: Option<(u16 /*ordinal*/, String /*imported_name*/)>, | |
| 54 | + pub resolver: Option<u64>, | |
| 55 | +} | |
| 56 | +``` | |
| 57 | + | |
| 58 | +Walking is recursive; we guard against malformed trees with a depth cap and visited-offset set. | |
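A lookup-only sketch of the trie walk, written iteratively and assuming the node layout described above. It handles regular exports only (re-exports return `None`) and skips empty edges, so every step consumes at least one byte of the name and the loop terminates without a visited set. The function names and the `(flags, address)` return shape are assumptions.

```rust
/// Decode one ULEB128 value, advancing `pos`. `None` on overrun or overflow.
pub fn uleb(buf: &[u8], pos: &mut usize) -> Option<u64> {
    let mut result = 0u64;
    let mut shift = 0u32;
    loop {
        let byte = *buf.get(*pos)?;
        *pos += 1;
        if shift >= 64 {
            return None;
        }
        result |= u64::from(byte & 0x7f) << shift;
        if byte & 0x80 == 0 {
            return Some(result);
        }
        shift += 7;
    }
}

const EXPORT_SYMBOL_FLAGS_REEXPORT: u64 = 0x08;

/// Returns `(flags, address)` for a regular export named `name`.
pub fn lookup(trie: &[u8], name: &str) -> Option<(u64, u64)> {
    let mut node = 0usize;
    let mut rest = name.as_bytes();
    loop {
        let mut pos = node;
        let term_size = uleb(trie, &mut pos)? as usize;
        if rest.is_empty() {
            if term_size == 0 {
                return None; // interior node, not an export
            }
            let flags = uleb(trie, &mut pos)?;
            if flags & EXPORT_SYMBOL_FLAGS_REEXPORT != 0 {
                return None; // payload is ordinal + name; not handled here
            }
            let address = uleb(trie, &mut pos)?;
            return Some((flags, address));
        }
        pos = pos.checked_add(term_size)?;
        let nchildren = *trie.get(pos)?;
        pos += 1;
        let mut next = None;
        for _ in 0..nchildren {
            let edge_len = trie.get(pos..)?.iter().position(|&b| b == 0)?;
            let edge = &trie[pos..pos + edge_len];
            pos += edge_len + 1;
            let child = uleb(trie, &mut pos)? as usize;
            if next.is_none() && !edge.is_empty() && rest.starts_with(edge) {
                next = Some((child, edge_len));
            }
        }
        let (child, consumed) = next?;
        node = child;
        rest = &rest[consumed..];
    }
}
```

A full `iter()` over all exports has no name to consume, which is where the depth cap and visited-offset set become necessary.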
| 59 | + | |
| 60 | +### 4. Two-level namespace ordinals | |
| 61 | +Each dylib loaded by path gets an ordinal (1..=N) assigned in load-command order; `BIND_SPECIAL_DYLIB_SELF=0`, `BIND_SPECIAL_DYLIB_MAIN_EXECUTABLE=-1`, `BIND_SPECIAL_DYLIB_FLAT_LOOKUP=-2`, `BIND_SPECIAL_DYLIB_WEAK_LOOKUP=-3`. When an imported symbol is bound in Sprint 15, we use this ordinal. | |
| 62 | + | |
| 63 | +### 5. Re-export resolution | |
| 64 | +Loading a dylib recursively loads its `LC_REEXPORT_DYLIB` chain. Names looked up in the umbrella are delegated down the chain. This targets CoreFoundation / Foundation style umbrella frameworks; it is not strictly required for armfortas today but lands now to avoid a retrofit. | |
| 65 | + | |
| 66 | +### 6. SDK path resolution | |
| 67 | +`-syslibroot <SDK>` + `-l<name>` needs to locate `${SDK}/usr/lib/lib<name>.{dylib,tbd}`. This sprint establishes the search order; the rest lands in Sprint 19's CLI work. | |
| 68 | + | |
| 69 | +## Testing Strategy | |
| 70 | +- Fixtures: tiny hand-built `.dylib` via the system toolchain (one exported symbol, one re-export). Parsed and exports match `nm -g`. | |
| 71 | +- Differential: load `CoreFoundation.tbd` in Sprint 6, not here; this sprint uses real binary `.dylib`s from `/usr/lib/` (where present on older macOS) or synthetic ones. | |
| 72 | +- Malformed trie: cycle, out-of-bounds child offset, ULEB128 overrun — diagnostics, no panics. | |
| 73 | + | |
| 74 | +## Definition of Done | |
| 75 | +- Export trie walker handles real `.dylib` files correctly. | |
| 76 | +- `DylibFile` constructed with correct install_name, versions, ordinal. | |
| 77 | +- Re-exports chained through umbrella fixtures. | |
| 78 | +- `dyld_info -exports <dylib>` output matches our export dumper. | |
.docs/sprints/sprint06.md added @@ -0,0 +1,80 @@ | ||
| 1 | +# Sprint 6: TAPI TBD Text Stubs | |
| 2 | + | |
| 3 | +## Prerequisites | |
| 4 | +Sprint 5 — binary dylib reader works. | |
| 5 | + | |
| 6 | +## Goals | |
| 7 | +Read `.tbd` files (TAPI text dylib stubs). On modern SDKs `libSystem`, `libc++`, and `CoreFoundation` ship only as `.tbd` — linking without this sprint means no system libraries, full stop. | |
| 8 | + | |
| 9 | +## Deliverables | |
| 10 | + | |
| 11 | +### 1. Minimal YAML subset | |
| 12 | +TBD is YAML with a well-defined schema. We implement only the subset TAPI emits, not a general YAML parser: | |
| 13 | + | |
| 14 | +- Flow scalars (plain, single-quoted, double-quoted). | |
| 15 | +- Flow sequences: `[ a, b, c ]`. | |
| 16 | +- Block sequences: `- item`. | |
| 17 | +- Block mappings: `key: value`. | |
| 18 | +- Multi-document files with `---` / `...`. | |
| 19 | +- Tags: `!tapi-tbd`. | |
| 20 | +- Version directives: `%YAML 1.2`. | |
| 21 | + | |
| 22 | +No anchors, no aliases, no complex types, no folded scalars. If a real `.tbd` in the wild uses features outside the subset, the parser fails loudly with line/column. | |
| 23 | + | |
| 24 | +### 2. TBD schema | |
| 25 | +`afs-ld/src/macho/tbd.rs`: | |
| 26 | + | |
| 27 | +```rust | |
| 28 | +pub struct Tbd { | |
| 29 | + pub tbd_version: u32, // 3 or 4 | |
| 30 | + pub targets: Vec<Target>, // arch + platform | |
| 31 | + pub install_name: String, | |
| 32 | + pub current_version: Option<String>, | |
| 33 | + pub compatibility_version: Option<String>, | |
| 34 | + pub parent_umbrella: Vec<Scoped<String>>, | |
| 35 | + pub allowable_clients: Vec<Scoped<String>>, | |
| 36 | + pub reexported_libraries: Vec<Scoped<String>>, | |
| 37 | + pub exports: Vec<Scoped<Exports>>, | |
| 38 | + pub reexports: Vec<Scoped<Exports>>, | |
| 39 | +} | |
| 40 | + | |
| 41 | +pub struct Target { pub arch: Arch, pub platform: Platform } | |
| 42 | +pub struct Scoped<T> { pub targets: Vec<Target>, pub value: T } | |
| 43 | +pub struct Exports { | |
| 44 | + pub symbols: Vec<String>, | |
| 45 | + pub weak_symbols: Vec<String>, | |
| 46 | + pub thread_local_symbols: Vec<String>, | |
| 47 | + pub objc_classes: Vec<String>, | |
| 48 | + pub objc_eh_types: Vec<String>, | |
| 49 | + pub objc_ivars: Vec<String>, | |
| 50 | +} | |
| 51 | +``` | |
| 52 | + | |
| 53 | +v3 and v4 both supported; v4 is what modern Xcode ships. | |
| 54 | + | |
| 55 | +### 3. Materialize into DylibFile | |
| 56 | +`Tbd::into_dylib_file(tbd: Tbd, for_target: Target) -> DylibFile`. Filters scoped entries to only those matching `arm64 / macos`. Produces the same `DylibFile` surface Sprint 5 produces, so downstream code doesn't care about source format. | |
| 57 | + | |
| 58 | +### 4. SDK search implementation | |
| 59 | +Integrate with `-syslibroot`. Search order for `-l<name>`: | |
| 60 | +1. `${SDK}/usr/lib/lib<name>.tbd` | |
| 61 | +2. `${SDK}/usr/lib/lib<name>.dylib` | |
| 62 | +3. `${SDK}/usr/local/lib/lib<name>.tbd` | |
| 63 | +4. `${SDK}/usr/local/lib/lib<name>.dylib` | |
| 64 | +5. `-L<dir>` entries in order, same two suffixes (`.tbd` before `.dylib`). | |
| 65 | + | |
| 66 | +For frameworks (`-framework Foo`): `${SDK}/System/Library/Frameworks/Foo.framework/Foo.tbd`, then the extensionless binary `${SDK}/System/Library/Frameworks/Foo.framework/Foo`. | |
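The `-l<name>` search order above can be sketched as a function that only produces candidate paths, leaving the existence checks to the caller. `library_candidates` is a hypothetical name, and the sketch assumes `-L` directories are searched after the two SDK directories, as the list states.

```rust
use std::path::{Path, PathBuf};

/// Candidate paths for `-l<name>`, in the order they should be probed.
pub fn library_candidates(sdk: &Path, lib_dirs: &[PathBuf], name: &str) -> Vec<PathBuf> {
    // SDK directories first, then -L entries in command-line order.
    let mut dirs = vec![sdk.join("usr/lib"), sdk.join("usr/local/lib")];
    dirs.extend(lib_dirs.iter().cloned());

    let mut out = Vec::new();
    for dir in dirs {
        // Text stub before binary dylib within each directory.
        for ext in ["tbd", "dylib"] {
            out.push(dir.join(format!("lib{}.{}", name, ext)));
        }
    }
    out
}
```

The caller would take the first candidate that exists on disk.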
| 67 | + | |
| 68 | +### 5. Platform/arch filtering | |
| 69 | +`Target { arch: Arm64, platform: MacOS }` is what armfortas cares about. If the TBD has no matching target, produce a clear diagnostic: "`<path>` does not export for arm64-macos". | |
| 70 | + | |
| 71 | +## Testing Strategy | |
| 72 | +- Fixtures: copies of `${SDK}/usr/lib/libSystem.tbd`, `libc++.tbd`, `libobjc.tbd` checked into `tests/corpus/tbd/` (small, just headers of exported symbols — confirm they're not under a license that forbids redistribution; if so, generate equivalent fixtures). | |
| 73 | +- Parse `libSystem.tbd`, assert that `dyld_stub_binder` (which has no leading underscore), `_malloc`, `_free`, `_printf` are all in exports. | |
| 74 | +- Verify the `DylibFile` produced is field-for-field equal (in the fields we populate) to one produced by loading an actual `libSystem.dylib` from an older SDK. | |
| 75 | +- Malformed YAML: missing `install_name`, tabs in indentation, unterminated quoted scalar — each with a precise diagnostic. | |
| 76 | + | |
| 77 | +## Definition of Done | |
| 78 | +- Can read a modern Xcode `libSystem.tbd` and enumerate its exports. | |
| 79 | +- SDK + `-l` + `-framework` resolution picks the right file on a real toolchain. | |
| 80 | +- Differential test: hello-world link with `libSystem.tbd` produces the same bind entries as with a binary dylib (on older SDKs where both exist). | |
.docs/sprints/sprint07.md added @@ -0,0 +1,85 @@ | ||
| 1 | +# Sprint 7: Symbol Model & Table | |
| 2 | + | |
| 3 | +## Prerequisites | |
| 4 | +Sprints 2, 4, 5, 6 — object, archive, dylib, TBD readers in place. | |
| 5 | + | |
| 6 | +## Goals | |
| 7 | +A uniform symbol table that fuses definitions from every input kind. Establishes the invariants Sprint 8's resolution pass will preserve. | |
| 8 | + | |
| 9 | +## Deliverables | |
| 10 | + | |
| 11 | +### 1. `Symbol` sum type | |
| 12 | +`afs-ld/src/symbol.rs`: | |
| 13 | + | |
| 14 | +```rust | |
| 15 | +pub enum Symbol { | |
| 16 | + Undefined { name: Istr, origin: InputId, weak_ref: bool }, | |
| 17 | + Defined { name: Istr, origin: InputId, atom: AtomId, value: u64, | |
| 18 | + weak: bool, private_extern: bool, no_dead_strip: bool }, | |
| 19 | + Common { name: Istr, origin: InputId, size: u64, align_pow2: u8 }, | |
| 20 | + DylibImport { name: Istr, dylib: DylibId, ordinal: u16, weak_import: bool }, | |
| 21 | + LazyArchive { name: Istr, archive: ArchiveId, member: MemberId }, | |
| 22 | + LazyObject { name: Istr, origin: InputId }, // --start-lib / --end-lib | |
| 23 | + Alias { name: Istr, aliased: Istr }, // N_INDR | |
| 24 | +} | |
| 25 | +``` | |
| 26 | + | |
| 27 | +`Istr` = interned string handle. Interning happens once when a name first enters the table; all comparisons are handle-equality. | |
| 28 | + | |
| 29 | +### 2. `SymbolTable` | |
| 30 | +```rust | |
| 31 | +pub struct SymbolTable { | |
| 32 | + names: StringInterner, | |
| 33 | + by_name: HashMap<Istr, SymbolId>, | |
| 34 | + symbols: Vec<Symbol>, | |
| 35 | + // replacement log for diagnostics + -why_live | |
| 36 | + transitions: Vec<Transition>, | |
| 37 | +} | |
| 38 | + | |
| 39 | +pub struct Transition { pub at: SymbolId, pub from: SymbolKindTag, pub to: SymbolKindTag, pub cause: Cause } | |
| 40 | +``` | |
| 41 | + | |
| 42 | +`HashMap` is fine for Sprint 7; Sprint 28 may swap in a custom open-addressing table. | |
| 43 | + | |
| 44 | +### 3. Insertion semantics | |
| 45 | +`SymbolTable::insert(sym: Symbol)` runs the resolution rules inline: | |
| 46 | + | |
| 47 | +| Existing \ New | Undefined | Defined | Common | DylibImport | LazyArchive | LazyObject | | |
| 48 | +|----------------------|-----------|---------|--------|-------------|-------------|------------| | |
| 49 | +| *vacant* | insert | insert | insert | insert | insert | insert | | |
| 50 | +| Undefined | keep | replace | replace| replace | replace | replace | | |
| 51 | +| Defined (strong) | keep | **error if both strong and same kind** | keep | keep | keep | keep | | |
| 52 | +| Defined (weak) | keep | replace if new strong | keep | keep | keep | keep | | |
| 53 | +| Common | keep | replace (common → Defined) | pick larger size / stricter align | keep | keep | keep | | |
| 54 | +| DylibImport | keep | replace (definition shadows import) | keep | keep | keep | keep | | |
| 55 | +| LazyArchive | **fetch** | replace | replace | replace | keep first | replace | | |
| 56 | +| LazyObject | **fetch** | replace | replace | replace | replace | keep | | |
| 57 | + | |
| 58 | +"Fetch" means: load the member/object, enqueue its symbols, mark this entry's transition. | |
| 59 | + | |
| 60 | +### 4. Weak coalescing rules | |
| 61 | +- `weak_def` + `weak_def` → first wins. | |
| 62 | +- `weak_def` + strong → strong wins. | |
| 63 | +- Strong + strong → hard error, diagnostic cites both input paths. | |
| 64 | +- `weak_ref` without a definition is not an error; the reference resolves to address 0 (handled in relocation pass). | |
| 65 | + | |
| 66 | +### 5. Aliases (N_INDR) | |
| 67 | +Flattened on insertion: an `Alias(name → aliased)` is resolved by looking up `aliased`. If `aliased` is itself an alias, walk until a non-alias is found; cycle detection with a depth cap. | |
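Alias flattening with a depth cap can be sketched as a bounded walk over an alias map. The plain `HashMap<String, String>` stands in for the interned-name table, and `MAX_ALIAS_DEPTH`, `AliasError`, and `flatten` are hypothetical names; the cap value is arbitrary.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
pub enum AliasError {
    /// The chain did not end within the cap: a cycle or an absurdly long chain.
    TooDeep(String),
}

pub const MAX_ALIAS_DEPTH: usize = 64;

/// Follow `name` through the alias map until a non-alias name is reached.
pub fn flatten<'a>(
    aliases: &'a HashMap<String, String>,
    name: &'a str,
) -> Result<&'a str, AliasError> {
    let mut cur = name;
    for _ in 0..MAX_ALIAS_DEPTH {
        match aliases.get(cur) {
            Some(next) => cur = next.as_str(),
            None => return Ok(cur),
        }
    }
    Err(AliasError::TooDeep(name.to_string()))
}
```

Because the walk is a loop rather than recursion, a cycle ends in a diagnostic instead of a stack overflow.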
| 68 | + | |
| 69 | +### 6. Transition log | |
| 70 | +Every `insert` records the old/new kind + input path + (for lazy fetches) the reason the fetch happened. The `-why_live` diagnostic introduced in Sprint 19 reads this log. | |
| 71 | + | |
| 72 | +### 7. Tombstoned symbols | |
| 73 | +Common → Defined promotion preserves the common size and alignment (so the BSS slot is large enough). Dead-stripping (Sprint 23) can tombstone a Defined without removing it from the table. | |
| 74 | + | |
| 75 | +## Testing Strategy | |
| 76 | +- Unit tests for every cell in the resolution matrix. Each combination has a named test. | |
| 77 | +- Synthetic inputs: two `.o`s both defining `_foo` strong → error; one strong + one weak → strong wins; two weak → first wins; common + strong → common replaced. | |
| 78 | +- Alias-chain cycles detected with a diagnostic, not a stack overflow. | |
| 79 | +- Interner stress test: 100K unique names, membership queries are O(1) average. | |
| 80 | + | |
| 81 | +## Definition of Done | |
| 82 | +- Every matrix cell has a passing test. | |
| 83 | +- Weak coalescing matches `ld` on a corpus of 20+ scenarios (differential test: both linkers produce the same `nm` output). | |
| 84 | +- Alias flattening correct and cycle-safe. | |
| 85 | +- Transition log surfaces replacement causes for `-why_live`. | |
.docs/sprints/sprint08.md added @@ -0,0 +1,92 @@ | ||
| 1 | +# Sprint 8: Name Resolution Pass | |
| 2 | + | |
| 3 | +## Prerequisites | |
| 4 | +Sprint 7 — `SymbolTable` with insertion semantics. | |
| 5 | + | |
| 6 | +## Goals | |
| 7 | +Drive the symbol table to a fixed point: every undefined reference either resolves to a Defined (from an object), Common (promoted in BSS), DylibImport (from a dylib/TBD), or raises a clear, actionable diagnostic. `-force_load` / `-all_load` / `-undefined <treatment>` all handled. | |
| 8 | + | |
| 9 | +## Deliverables | |
| 10 | + | |
| 11 | +### 1. Resolution algorithm | |
| 12 | +`afs-ld/src/resolve.rs`: | |
| 13 | + | |
| 14 | +```rust | |
| 15 | +pub fn resolve(inputs: &mut Inputs, table: &mut SymbolTable, opts: &LinkOptions) | |
| 16 | + -> Result<(), Vec<ResolveError>> | |
| 17 | +{ | |
| 18 | + seed_table_with_objects_and_dylib_imports(inputs, table, opts); | |
| 19 | + if opts.all_load { force_load_everything(inputs, table); } | |
| 20 | + for forced in &opts.force_load { force_load_one(inputs, table, forced); } | |
| 21 | + fixed_point_pull_from_archives(inputs, table); | |
| 22 | + classify_unresolved(table, opts) // tail expression: yields the collected errors, if any | |
| 23 | +} | |
| 24 | +``` | |
| 25 | + | |
| 26 | +### 2. Seed phase | |
| 27 | +Walk every explicit `.o` first, then every `.dylib` / `.tbd`: add Defined / Common from objects, DylibImport from dylibs. Archives are added as LazyArchive entries only — their members are not parsed until pulled. | |
| 28 | + | |
| 29 | +### 3. Fixed-point pull | |
| 30 | +``` | |
| 31 | +while let Some(name) = table.undefined_pending.pop() { | |
| 32 | + for archive in &mut inputs.archives_in_command_line_order { // fetch takes &mut self | |
| 33 | + if let Some(member) = archive.fetch(name) { | |
| 34 | + ingest_member(member, table); | |
| 35 | + break; | |
| 36 | + } | |
| 37 | + } | |
| 38 | +} | |
| 39 | +``` | |
| 40 | + | |
| 41 | +Order matters: armfortas's driver currently passes `<objs> <runtime.a> -lSystem`, and resolution must match `ld`'s left-to-right behavior. Ingesting a member can add new undefined names to the pending queue; the loop terminates when the queue is empty, at which point no further member can be fetched. | |
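The fixed-point behaviour can be illustrated with a toy model in which archives are lists of members that declare what they define and what they still need. Everything here (`Member`, `Archive`, `pull`) is a simplified stand-in for the real `Inputs` and `SymbolTable`, intended only to show the queue-driven loop and the left-to-right archive order.

```rust
use std::collections::{HashSet, VecDeque};

pub struct Member {
    pub name: &'static str,
    pub defines: Vec<&'static str>,
    pub undefined: Vec<&'static str>,
}

pub struct Archive {
    pub members: Vec<Member>,
}

/// Returns (members loaded in order, names left unresolved).
pub fn pull(
    archives: &[Archive],
    seed_defined: &[&'static str],
    seed_undefined: &[&'static str],
) -> (Vec<&'static str>, Vec<&'static str>) {
    let mut defined: HashSet<&'static str> = seed_defined.iter().copied().collect();
    let mut pending: VecDeque<&'static str> = seed_undefined.iter().copied().collect();
    let mut loaded = Vec::new();
    let mut unresolved = Vec::new();

    while let Some(name) = pending.pop_front() {
        if defined.contains(name) {
            continue; // satisfied by an object or an earlier pull
        }
        // First archive on the command line wins, then first member within it.
        let hit = archives
            .iter()
            .flat_map(|a| a.members.iter())
            .find(|m| m.defines.contains(&name));
        match hit {
            Some(m) => {
                loaded.push(m.name);
                defined.extend(m.defines.iter().copied());
                // A pulled member may introduce new undefined names.
                pending.extend(m.undefined.iter().copied());
            }
            None => {
                if !unresolved.contains(&name) {
                    unresolved.push(name);
                }
            }
        }
    }
    (loaded, unresolved)
}
```

Names left in the second list are what the `-undefined` treatment would classify afterwards.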
| 42 | + | |
| 43 | +### 4. `-force_load` and `-all_load` | |
| 44 | +- `-force_load <archive>`: pull every member of that archive before fixed-point. | |
| 45 | +- `-all_load`: pull every member of every archive. | |
| 46 | +- Both happen before the fixed-point loop so their transitively-pulled symbols feed into the same fixed point. | |
| 47 | + | |
| 48 | +### 5. `-undefined <treatment>` | |
| 49 | +After the fixed point, any still-Undefined entry is classified by the `-undefined` setting: | |
| 50 | +- `error` (default): hard error, cite every input that references the name (collected via the transition log). | |
| 51 | +- `warning`: warn but emit, writing the symbol as address 0 (bind to nothing). | |
| 52 | +- `suppress`: silent, address 0. | |
| 53 | +- `dynamic_lookup`: flat-namespace DylibImport with ordinal `BIND_SPECIAL_DYLIB_FLAT_LOOKUP=-2`. | |
| 54 | + | |
| 55 | +### 6. Weak references | |
| 56 | +`weak_ref` to a missing symbol is always valid regardless of `-undefined`; it resolves to address 0 at bind time and the runtime tests for null. | |
| 57 | + | |
| 58 | +### 7. Diagnostics | |
| 59 | +Undefined errors must cite every referrer input, not just one. Output format: | |
| 60 | + | |
| 61 | +``` | |
| 62 | +afs-ld: error: undefined symbol: _afs_print | |
| 63 | + referenced by program.o(text section + 0x34) | |
| 64 | + referenced by runtime.o(text section + 0x120) | |
| 65 | + (also via 2 relocations in libarmfortas_rt.a(io.o)) | |
| 66 | +Hint: did you mean _afs_printf? (Levenshtein distance 1) | |
| 67 | +``` | |
| 68 | + | |
| 69 | +Did-you-mean runs a basic Levenshtein search over defined symbols and suggests the closest name only when it is at most 3 edits away. | |
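The bounded search could look roughly like the sketch below. It assumes symbol names can be compared byte-wise and that a maximum distance of 3 is passed in by the caller; `levenshtein` and `did_you_mean` are illustrative names, not an existing afs-ld API.

```rust
/// Classic two-row Levenshtein distance over bytes.
fn levenshtein(a: &str, b: &str) -> usize {
    let (a, b) = (a.as_bytes(), b.as_bytes());
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    let mut cur = vec![0usize; b.len() + 1];
    for i in 1..=a.len() {
        cur[0] = i;
        for j in 1..=b.len() {
            // Substitution costs 1 only when the bytes differ.
            let subst = prev[j - 1] + usize::from(a[i - 1] != b[j - 1]);
            cur[j] = subst.min(prev[j] + 1).min(cur[j - 1] + 1);
        }
        std::mem::swap(&mut prev, &mut cur);
    }
    prev[b.len()]
}

/// Closest defined symbol within `max_dist` edits; ties break on name for determinism.
fn did_you_mean<'a>(
    missing: &str,
    defined: &[&'a str],
    max_dist: usize,
) -> Option<(&'a str, usize)> {
    defined
        .iter()
        .map(|&cand| (cand, levenshtein(missing, cand)))
        .filter(|&(_, d)| d <= max_dist)
        .min_by(|x, y| x.1.cmp(&y.1).then(x.0.cmp(y.0)))
}
```

With a bound of 3, a candidate that differs only by a five-character suffix is rejected, which is the "stays silent" case in the testing strategy.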
| 70 | + | |
| 71 | +### 8. Diagnostics for duplicate strong | |
| 72 | +``` | |
| 73 | +afs-ld: error: duplicate symbol _foo | |
| 74 | + defined in: a.o (text + 0x0) | |
| 75 | + also in: b.o (text + 0x0) | |
| 76 | +``` | |
| 77 | + | |
| 78 | +No suggestion — two strong defs is a real ambiguity. | |
| 79 | + | |
| 80 | +## Testing Strategy | |
| 81 | +- Resolution matrix revisited from Sprint 7, but with real archives and dylibs. | |
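| 82 | +- Order sensitivity: `a.o b.a` vs `b.a a.o`, where `a.o` references a symbol in `b.a`. Both orders must resolve, because explicit objects are seeded before any archive is searched and `ld` keeps searching static libraries while linking. When two archives define the same symbol, the one listed first on the command line wins. | |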
| 83 | +- `-force_load` pulls in a member whose symbols would otherwise go unreferenced. | |
| 84 | +- `-all_load` across a multi-member archive. | |
| 85 | +- Weak-import from a dylib that at runtime will be missing. | |
| 86 | +- Did-you-mean fires on a close misspell, stays silent when the closest match is > 3 edits away. | |
| 87 | + | |
| 88 | +## Definition of Done | |
| 89 | +- Fixed-point loop terminates on all corpus inputs. | |
| 90 | +- Diagnostics match the format above, include every referrer, include did-you-mean suggestions. | |
| 91 | +- Differential test against `ld` for order-dependent resolution on 10+ scenarios. | |
| 92 | +- `-force_load` / `-all_load` / `-undefined <treatment>` (all four treatments) pass dedicated tests. | |
.docs/sprints/sprint09.mdadded@@ -0,0 +1,74 @@ | ||
| 1 | +# Sprint 9: Subsections-via-Symbols Atomization | |
| 2 | + | |
| 3 | +## Prerequisites | |
| 4 | +Sprints 2, 7, 8 — sections, symbols, resolved table. | |
| 5 | + | |
| 6 | +## Goals | |
| 7 | +Split each input section into **atoms** at symbol boundaries when `MH_SUBSECTIONS_VIA_SYMBOLS` is set (afs-as always sets this). Atoms are the unit of dead-stripping (Sprint 23), ICF (Sprint 24), and output layout (Sprint 10). Every Defined symbol belongs to exactly one atom, either as its owner or as one of its `.alt_entry` symbols. | |
| 8 | + | |
| 9 | +## Deliverables | |
| 10 | + | |
| 11 | +### 1. Atom model | |
| 12 | +`afs-ld/src/atom.rs`: | |
| 13 | + | |
| 14 | +```rust | |
| 15 | +pub struct Atom { | |
| 16 | + pub id: AtomId, | |
| 17 | + pub owner: SymbolId, // the primary symbol defining this atom | |
| 18 | + pub alt_entries: Vec<SymbolId>, // .alt_entry chains | |
| 19 | + pub section: OutputSectionKey, // which output section it will land in | |
| 20 | + pub input_origin: InputId, | |
| 21 | + pub input_section: SectIdx, | |
| 22 | + pub offset: u32, // offset within input section | |
| 23 | + pub size: u32, | |
| 24 | + pub align_pow2: u8, | |
| 25 | + pub data: DataRef, // borrowed from input mmap, or ZeroFill | |
| 26 | + pub relocs: Vec<RelocIdx>, // relocs originating inside this atom | |
| 27 | + pub flags: AtomFlags, // NoDeadStrip, WeakDef, ThreadLocal, ... | |
| 28 | +} | |
| 29 | +``` | |
| 30 | + | |
| 31 | +### 2. Atomization algorithm | |
| 32 | +For each input section: | |
| 33 | +1. Collect every Defined symbol whose section is this section, sorted by value. | |
| 34 | +2. If `MH_SUBSECTIONS_VIA_SYMBOLS` is set: split the section at each symbol's offset. Each slice becomes an atom owned by the symbol at its head. | |
| 35 | +3. If a symbol is `.alt_entry`, fold it into the previous atom's `alt_entries`, don't split. | |
| 36 | +4. If the flag is not set: one atom per section (Apple-style consolidated section). | |
| 37 | + | |
| 38 | +Atoms for text preserve instruction alignment; atoms for zerofill carry size only. | |
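The splitting rule for the `MH_SUBSECTIONS_VIA_SYMBOLS` case can be sketched as follows. This is a simplified model, not the real `Atom` type: symbols are `(name, offset, is_alt_entry)` tuples, `AtomSpan` and `atomize` are hypothetical names, and the first non-alt symbol is assumed to sit at offset 0 of the section.

```rust
/// Hypothetical, trimmed-down atom: just the span and the symbols attached to it.
#[derive(Debug, PartialEq)]
struct AtomSpan {
    owner: String,
    alt_entries: Vec<String>,
    offset: u32,
    size: u32,
}

/// Split one input section of `section_size` bytes at symbol boundaries.
fn atomize(section_size: u32, symbols: &[(&str, u32, bool)]) -> Vec<AtomSpan> {
    let mut syms = symbols.to_vec();
    // Sort by offset; the name is a tiebreaker so the result is deterministic.
    syms.sort_by_key(|&(name, off, _)| (off, name));
    let mut atoms: Vec<AtomSpan> = Vec::new();
    for (name, off, alt) in syms {
        if alt {
            // .alt_entry: fold into the previous atom instead of splitting.
            if let Some(prev) = atoms.last_mut() {
                prev.alt_entries.push(name.to_string());
                continue;
            }
        }
        // Close the previous atom at this symbol's offset, then open a new one.
        if let Some(prev) = atoms.last_mut() {
            prev.size = off - prev.offset;
        }
        atoms.push(AtomSpan {
            owner: name.to_string(),
            alt_entries: Vec::new(),
            offset: off,
            size: 0,
        });
    }
    // The last atom runs to the end of the section.
    if let Some(last) = atoms.last_mut() {
        last.size = section_size - last.offset;
    }
    atoms
}
```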
| 39 | + | |
| 40 | +### 3. Literal atoms (C strings, 16-byte literals) | |
| 41 | +`__TEXT,__cstring` and `__TEXT,__literal16` are special. Every null-terminated string / every 16-byte block is an atom candidate for de-duplication (Sprint 24 ICF). For now, store each literal as its own atom with a content-hash annotation. | |
| 42 | + | |
| 43 | +### 4. Unwind + compact-unwind atoms | |
| 44 | +`__LD,__compact_unwind` (the section name compact unwind uses in object files) contains 32-byte records, each referring (via a reloc) to a function atom. One unwind atom per function; tracked as `parent_of: AtomId` so unwind atoms get stripped alongside dead functions. | |
| 45 | + | |
| 46 | +### 5. Reloc → atom remapping | |
| 47 | +Every reloc has an input offset into its source section. After atomization, recompute as `(atom, offset_within_atom)`. When a reloc crosses atom boundaries it can only point at a whole symbol (subsections-via-symbols invariant); confirm this and diagnose if not. | |
| 48 | + | |
| 49 | +### 6. Reloc references to atoms | |
| 50 | +`Reloc::referent` gains: | |
| 51 | +```rust | |
| 52 | +pub enum Referent { | |
| 53 | + SymbolExternal(SymbolId), // undefined or dylib import | |
| 54 | + SymbolLocal(AtomId, i64), // same-tu reference, addend in bytes | |
| 55 | + AbsoluteSection(AtomId, i64), // rare, section-relative | |
| 56 | +} | |
| 57 | +``` | |
| 58 | + | |
| 59 | +The "local" case is what the atomization unlocks: a reloc from function `_a` to function `_b` in the same `.o` becomes a reference to `_b`'s atom, not to an offset within a monolithic text section. | |
| 60 | + | |
| 61 | +### 7. `.no_dead_strip` propagation | |
| 62 | +Symbol flag propagates to its atom. Unwind atoms inherit `NoDeadStrip` from their parent function. Entry point symbol is marked `NoDeadStrip`. | |
| 63 | + | |
| 64 | +## Testing Strategy | |
| 65 | +- Fixture: a `.s` with several functions where one branches to another. After atomization, reloc's referent must be the callee atom, not a byte-offset. | |
| 66 | +- `.alt_entry` folding: `_foo` and `.alt_entry _bar` in the same input produce one atom whose `alt_entries = [_bar]`. | |
| 67 | +- Boundary-crossing reloc (synthesized maliciously): parser diagnoses. | |
| 68 | +- Differential: `ld -dead_strip` behavior on a corpus of ~20 atomization fixtures compared to what Sprint 23 will produce. | |
| 69 | + | |
| 70 | +## Definition of Done | |
| 71 | +- Every `.o` in the afs-as corpus atomizes without diagnostics. | |
| 72 | +- `.alt_entry` correctly folded. | |
| 73 | +- Relocs re-targeted to atoms; no raw section-relative references leak into Sprint 10. | |
| 74 | +- Unwind atoms track their parent function atom. | |
.docs/sprints/sprint10.mdadded@@ -0,0 +1,98 @@ | ||
| 1 | +# Sprint 10: Output Segment & Section Layout (dylib-aware) | |
| 2 | + | |
| 3 | +## Prerequisites | |
| 4 | +Sprints 7–9 — resolved table and atomized inputs. | |
| 5 | + | |
| 6 | +## Goals | |
| 7 | +One layout engine, two modes: `MH_EXECUTE` and `MH_DYLIB`. Assign VM addresses, file offsets, and segment membership to every atom. End state: the writer can emit a valid-but-empty Mach-O for both modes that `otool -lV` accepts. | |
| 8 | + | |
| 9 | +## Deliverables | |
| 10 | + | |
| 11 | +### 1. Output segment & section model | |
| 12 | +`afs-ld/src/section.rs`: | |
| 13 | + | |
| 14 | +```rust | |
| 15 | +pub struct OutputSegment { | |
| 16 | + pub name: String, // "__TEXT", "__DATA_CONST", "__DATA", "__LINKEDIT", "__PAGEZERO" | |
| 17 | + pub sections: Vec<OutputSectionId>, | |
| 18 | + pub vm_addr: u64, pub vm_size: u64, | |
| 19 | + pub file_off: u64, pub file_size: u64, | |
| 20 | + pub init_prot: Prot, pub max_prot: Prot, | |
| 21 | +} | |
| 22 | + | |
| 23 | +pub struct OutputSection { | |
| 24 | + pub segment: String, pub name: String, // e.g. ("__TEXT", "__text") | |
| 25 | + pub kind: SectionKind, | |
| 26 | + pub align_pow2: u8, pub flags: u32, | |
| 27 | + pub atoms: Vec<AtomId>, | |
| 28 | + pub addr: u64, pub size: u64, pub file_off: u64, | |
| 29 | +} | |
| 30 | +``` | |
| 31 | + | |
| 32 | +### 2. Segment plan | |
| 33 | +Two plans, keyed by `OutputKind::Executable | Dylib`: | |
| 34 | + | |
| 35 | +**Executable**: | |
| 36 | +- `__PAGEZERO`: VM `[0, 0x1_0000_0000)`, prot `---`. No file backing. | |
| 37 | +- `__TEXT`: prot `r-x`. Contains `__text`, `__stubs`, `__stub_helper`, `__cstring`, `__const`, `__literal16`, `__unwind_info`, `__eh_frame`. | |
| 38 | +- `__DATA_CONST`: prot `rw-` in the load command, flagged `SG_READ_ONLY` so dyld remaps it to `r--` after fixups. Contains `__got`, `__const` data. | |
| 39 | +- `__DATA`: prot `rw-`. Contains `__data`, `__bss`, `__la_symbol_ptr`, `__thread_ptrs`, `__thread_vars`, `__thread_data`, `__thread_bss`. | |
| 40 | +- `__LINKEDIT`: prot `r--`. Symbol table, string table, dyld-info opcodes (or chained fixups), function starts, data-in-code, code signature. | |
| 41 | + | |
| 42 | +**Dylib**: | |
| 43 | +- No `__PAGEZERO`. `__TEXT` starts at VM `0`. | |
| 44 | +- Everything else the same. | |
| 45 | + | |
| 46 | +### 3. Section placement order within segments | |
| 47 | +Matches ld's defaults so differential testing converges: | |
| 48 | +- `__TEXT`: `__text`, `__stubs`, `__stub_helper`, `__cstring`, `__const`, `__literal16`, `__unwind_info`, `__eh_frame`. | |
| 49 | +- `__DATA_CONST`: `__got`, `__const`. | |
| 50 | +- `__DATA`: `__la_symbol_ptr`, `__data`, `__thread_vars`, `__thread_ptrs`, `__thread_data`, `__thread_bss`, `__bss`. | |
| 51 | +- `__LINKEDIT`: fixup stream, function starts, data-in-code, symbol table, indirect symbol table, string table, code signature (in that order, matching ld's observed layout). | |
| 52 | + | |
| 53 | +Missing sections are simply absent; empty sections are dropped entirely. | |
| 54 | + | |
| 55 | +### 4. Atom placement | |
| 56 | +Each atom maps to one output section by its `OutputSectionKey`. Within a section, atoms ordered by: | |
| 57 | +1. Input-file command-line order. | |
| 58 | +2. Within an input, atom original offset. | |
| 59 | +3. Tiebreaker: symbol name (for determinism). | |
| 60 | + | |
| 61 | +ICF (Sprint 24) and `-order_file` (later polish) will override this later; for now, deterministic default. | |
| 62 | + | |
| 63 | +### 5. Address assignment | |
| 64 | +Pass 1: accumulate sizes per section, respecting atom alignment. Pass 2: assign section `addr` by accumulating `vm_addr + padding-to-section-alignment`. Pass 3: file offsets. `__TEXT` starts at file offset 0 (the header lives there); every other segment starts at the next 16 KiB page boundary. Zerofill sections have `size > 0` but contribute 0 to file size. | |
| 65 | + | |
| 66 | +Page alignment is 16 KiB on arm64 (Apple Silicon always). Section alignment comes from atoms. | |
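The alignment arithmetic behind these passes is small enough to sketch directly. The helper names (`align_up`, `place_atoms`, `next_segment_start`) and the `(size, align_pow2)` tuple input are assumptions made for illustration only.

```rust
/// 16 KiB pages on arm64 macOS, expressed as a power of two.
const PAGE_POW2: u8 = 14;

/// Round `x` up to a multiple of `1 << pow2`.
fn align_up(x: u64, pow2: u8) -> u64 {
    let mask = (1u64 << pow2) - 1;
    (x + mask) & !mask
}

/// Pass 1 for one section: given `(size, align_pow2)` per atom in placement order,
/// return each atom's offset within the section and the section's total size.
fn place_atoms(atoms: &[(u64, u8)]) -> (Vec<u64>, u64) {
    let mut cursor = 0u64;
    let mut offsets = Vec::with_capacity(atoms.len());
    for &(size, align) in atoms {
        cursor = align_up(cursor, align);
        offsets.push(cursor);
        cursor += size;
    }
    (offsets, cursor)
}

/// Start of the segment that follows one at `vm_addr` holding `content_size` bytes.
fn next_segment_start(vm_addr: u64, content_size: u64) -> u64 {
    align_up(vm_addr + content_size, PAGE_POW2)
}
```

Because every step is pure integer arithmetic over an already-sorted atom list, the result is identical across invocations, which is what the determinism check in the Definition of Done relies on.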
| 67 | + | |
| 68 | +### 6. MH_EXECUTE vs MH_DYLIB writer dispatch | |
| 69 | +`afs-ld/src/macho/writer.rs`: | |
| 70 | + | |
| 71 | +```rust | |
| 72 | +pub enum OutputKind { Executable, Dylib } | |
| 73 | + | |
| 74 | +pub fn write(layout: &Layout, kind: OutputKind, opts: &LinkOptions, out: &mut Vec<u8>) | |
| 75 | + -> Result<(), WriteError>; | |
| 76 | +``` | |
| 77 | + | |
| 78 | +Dispatches to the right load-command set: | |
| 79 | +- Executable: `LC_MAIN` with entry offset, optional `LC_UUID`, `LC_SOURCE_VERSION`. | |
| 80 | +- Dylib: `LC_ID_DYLIB` with install-name, current-version, compat-version; no `LC_MAIN`. | |
| 81 | +- Both: `LC_SEGMENT_64` per segment, `LC_BUILD_VERSION`, `LC_SYMTAB`, `LC_DYSYMTAB`, `LC_DYLD_INFO_ONLY` or `LC_DYLD_CHAINED_FIXUPS`, `LC_FUNCTION_STARTS`, `LC_DATA_IN_CODE`, one `LC_LOAD_DYLIB` per dylib dependency, `LC_RPATH` entries, `LC_CODE_SIGNATURE`. | |
| 82 | + | |
| 83 | +### 7. Minimum-viable empty output | |
| 84 | +End-of-sprint gate: both `OutputKind::Executable` (empty `_main`) and `OutputKind::Dylib` (no exports) emit a file that: | |
| 85 | +- `otool -lV` accepts without complaint. | |
| 86 | +- `file` identifies as `Mach-O 64-bit executable arm64` / `Mach-O 64-bit dynamically linked shared library arm64`. | |
| 87 | +- Does not yet need to run or load — just parse. | |
| 88 | + | |
| 89 | +## Testing Strategy | |
| 90 | +- Snapshot tests: produce the minimal empty executable and dylib; compare load-command layout against a golden captured from `ld`. | |
| 91 | +- Differential: for empty inputs, our load-command order and segment protections must match `ld`'s. | |
| 92 | +- Golden section-ordering tests for the standard ld section order. | |
| 93 | + | |
| 94 | +## Definition of Done | |
| 95 | +- Empty executable output passes `otool -lV`. | |
| 96 | +- Empty dylib output passes `otool -lV`. | |
| 97 | +- Section placement order matches `ld` on a corpus of staged fixtures. | |
| 98 | +- Address assignment deterministic across 100 invocations of the same input. | |
.docs/sprints/sprint11.mdadded@@ -0,0 +1,71 @@ | ||
| 1 | +# Sprint 11: Core Relocation Application (ARM64) | |
| 2 | + | |
| 3 | +## Prerequisites | |
| 4 | +Sprints 3, 9, 10 — relocs parsed, atoms sized, addresses assigned. | |
| 5 | + | |
| 6 | +## Goals | |
| 7 | +Patch atom bytes according to every basic ARM64 reloc kind. This sprint covers `BRANCH26`, `PAGE21`, `PAGEOFF12`, `UNSIGNED`, `SUBTRACTOR`, and the folded `ADDEND`. GOT/stubs land in Sprint 12, TLV in Sprint 13. | |
| 8 | + | |
| 9 | +## Deliverables | |
| 10 | + | |
| 11 | +### 1. Reloc application pass | |
| 12 | +`afs-ld/src/reloc/arm64.rs`: | |
| 13 | + | |
| 14 | +```rust | |
| 15 | +pub fn apply(layout: &Layout, atom: &Atom, bytes: &mut [u8]) -> Result<(), RelocError>; | |
| 16 | +``` | |
| 17 | + | |
| 18 | +For each reloc in the atom: | |
| 19 | +1. Resolve `Referent` to a final address (the referent atom's `addr` + addend; a dylib import resolves to 0 for now and is handled fully in Sprint 12). | |
| 20 | +2. Compute the reloc value per kind. | |
| 21 | +3. Patch `bytes` at `reloc.offset`. | |
| 22 | + | |
| 23 | +### 2. Reloc math (reference) | |
| 24 | + | |
| 25 | +| Kind | Formula | Encoding | | |
| 26 | +|---|---|---| | |
| 27 | +| `Unsigned` (length=2) | `S + A` | little-endian u32 write | | |
| 28 | +| `Unsigned` (length=3) | `S + A` | little-endian u64 write | | |
| 29 | +| `Subtractor` | `S_min - S_sub + A` | u32 or u64 depending on length | | |
| 30 | +| `Branch26` | `(S + A - P) >> 2` | 26-bit sign-check, OR into bottom 26 bits of the instruction | | |
| 31 | +| `Page21` | `(page(S+A) - page(P)) >> 12` | ADRP immhi:immlo encoding, 21-bit sign-check | | |
| 32 | +| `PageOff12` | `(S + A) & 0xFFF` | ADD imm12 or LDR imm12 (scaled per LDR size!) | | |
| 33 | + | |
| 34 | +Where `S` = symbol/section address, `A` = addend, `P` = address of the relocated instruction, `page(x) = x & ~0xFFF`. | |
| 35 | + | |
| 36 | +### 3. PAGEOFF12 scaling detail | |
| 37 | +For `LDR` immediate-offset forms the 12-bit immediate is scaled by the load size (1 for `LDRB`, 2 for `LDRH`, 4 for `LDR W`, 8 for `LDR X`). afs-as sets the instruction bits correctly; our job is to right-shift the 12-bit offset by the load size's log2 before OR'ing into the instruction. The two-bit `size` field of the LDR encoding (bits `[31:30]`) gives the shift. | |
| 38 | + | |
| 39 | +For `ADD` immediate-offset the 12-bit immediate is unscaled — write as-is. | |
| 40 | + | |
| 41 | +Disambiguate by disassembling the instruction: opcode bits `[31:24]` distinguish ADD vs LDR (B/H/W/X). | |
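A minimal sketch of the decode-and-patch step, assuming only the two instruction classes named above: ADD/SUB (immediate) and load/store with unsigned immediate. The bit masks follow the A64 encoding classes; `patch_pageoff12` is a hypothetical helper name, and the 128-bit SIMD load/store case (shift of 4) is included as an extra assumption beyond the sizes listed in the text.

```rust
/// Patch the imm12 field (bits [21:10]) of `insn` with the page offset of `target`.
fn patch_pageoff12(insn: u32, target: u64) -> Result<u32, String> {
    let off = (target & 0xFFF) as u32;
    let shift = if (insn & 0x3B00_0000) == 0x3900_0000 {
        // Load/store, unsigned immediate: scale by the access size in bits [31:30].
        let size = insn >> 30;
        // size == 0 with V (bit 26) and opc<1> (bit 23) set is a 128-bit SIMD access.
        if size == 0 && (insn & 0x0480_0000) == 0x0480_0000 { 4 } else { size }
    } else if (insn & 0x1F00_0000) == 0x1100_0000 {
        // ADD/SUB immediate: the offset is written unscaled.
        0
    } else {
        return Err(format!("PAGEOFF12 on unexpected instruction {:#010x}", insn));
    };
    if off & ((1u32 << shift) - 1) != 0 {
        return Err(format!("offset {:#x} is not aligned to {} bytes", off, 1u32 << shift));
    }
    Ok((insn & !(0xFFFu32 << 10)) | ((off >> shift) << 10))
}
```

Rejecting a misaligned offset is a design choice in this sketch: silently shifting would drop low bits and produce a load from the wrong address.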
| 42 | + | |
| 43 | +### 4. Range checks | |
| 44 | +- `Branch26`: `(S + A - P)` must fit in signed 28 bits (26 bits × 4-byte scale). If not, emit a hard error citing the caller atom and the out-of-range target. Thunks land in Sprint 26. | |
| 45 | +- `Page21`: `(page(S+A) - page(P))` must fit in signed 33 bits (21 bits × 4 KiB scale). In practice, always satisfiable on macOS. | |
| 46 | +- `PageOff12`: always fits by construction. | |
| 47 | +- `Unsigned`: wraps silently. | |
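The `Branch26` range check and patch can be sketched together. `fits_signed` and `patch_branch26` are illustrative names, and the error type is a plain `String` here rather than the real `RelocError`.

```rust
/// True when `v` is representable as a signed `bits`-bit integer.
fn fits_signed(v: i64, bits: u32) -> bool {
    let lim = 1i64 << (bits - 1);
    v >= -lim && v < lim
}

/// Patch a B/BL at address `p` to reach `target` (which already includes the addend).
fn patch_branch26(insn: u32, target: u64, p: u64) -> Result<u32, String> {
    let delta = target.wrapping_sub(p) as i64;
    if delta & 3 != 0 {
        return Err(format!("branch target {:#x} is not 4-byte aligned relative to {:#x}", target, p));
    }
    // 26-bit immediate scaled by 4 bytes gives a signed 28-bit byte range (+/-128 MiB).
    if !fits_signed(delta, 28) {
        return Err(format!("branch from {:#x} to {:#x} is out of BRANCH26 range", p, target));
    }
    let imm26 = ((delta >> 2) as u32) & 0x03FF_FFFF;
    Ok((insn & 0xFC00_0000) | imm26)
}
```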
| 48 | + | |
| 49 | +### 5. Subtractor + Unsigned pair | |
| 50 | +A `RelocKind::Subtractor` entry carries both the minuend and subtrahend. Formula: `target = minuend.addr + minuend_addend - (subtrahend.addr + subtrahend_addend)`. Write as u32 or u64 depending on length. afs-as uses this for `.quad _a - _b` and for CIE offset diff in `__eh_frame`. | |
| 51 | + | |
| 52 | +### 6. PC vs atom address | |
| 53 | +`P` is the address of the relocated 4-byte instruction, `atom.addr + reloc.offset`. `P` for an `Unsigned` reloc still evaluates even when `pcrel=false`; the formula just doesn't use it. A wrong `P` is the most common reloc bug — unit test every kind against a hand-computed value. | |
| 54 | + | |
| 55 | +### 7. Error reporting | |
| 56 | +Every failed reloc cites the originating input + atom + offset + kind + referent. No panics. | |
| 57 | + | |
| 58 | +### 8. Defer: GOT and TLVP | |
| 59 | +`GOT_LOAD_PAGE21`, `GOT_LOAD_PAGEOFF12`, `POINTER_TO_GOT`, `TLVP_LOAD_*` — Sprint 12 and Sprint 13 allocate the synthetic sections and wire them. This sprint emits a clear "not yet implemented" error if encountered. | |
| 60 | + | |
| 61 | +## Testing Strategy | |
| 62 | +- Unit test each kind with hand-computed encodings cross-checked against ARM ARM and `otool -tv` disassembly. | |
| 63 | +- Differential: identical inputs through `ld` and afs-ld produce the same patched bytes (within allowed-diff categories from Sprint 0's harness). | |
| 64 | +- Corner cases: max-negative branch, wrap-around UNSIGNED addend, SUBTRACTOR across sections, SUBTRACTOR within same section. | |
| 65 | +- Regression fixture: a tiny `.o` that exercises every kind covered in this sprint. | |
| 66 | + | |
| 67 | +## Definition of Done | |
| 68 | +- All covered reloc kinds apply correctly against a corpus of fixtures. | |
| 69 | +- Out-of-range BRANCH26 emits an actionable error. | |
| 70 | +- Differential pass: 10+ fixtures link to byte-identical `__text` under afs-ld and `ld`. | |
| 71 | +- GOT/TLVP kinds emit "not yet implemented" errors with a pointer to Sprints 12/13. | |
.docs/sprints/sprint12.mdadded@@ -0,0 +1,96 @@ | ||
| 1 | +# Sprint 12: GOT, Stubs, Lazy Pointers | |
| 2 | + | |
| 3 | +## Prerequisites | |
| 4 | +Sprints 5, 10, 11 — dylibs loaded, layout pass, core reloc application. | |
| 5 | + | |
| 6 | +## Goals | |
| 7 | +Synthesize `__got`, `__stubs`, `__stub_helper`, `__la_symbol_ptr`. Wire `GOT_LOAD_*` / `POINTER_TO_GOT` relocations to GOT slots. Rewire `BRANCH26` to a dylib import through a stub. Classic lazy-binding model; chained fixups land in Sprint 15.5. | |
| 8 | + | |
| 9 | +## Deliverables | |
| 10 | + | |
| 11 | +### 1. GOT synthetic section | |
| 12 | +`afs-ld/src/synth/got.rs`: | |
| 13 | + | |
| 14 | +```rust | |
| 15 | +pub struct GotSection { | |
| 16 | + entries: Vec<GotEntry>, | |
| 17 | + index: HashMap<SymbolId, usize>, | |
| 18 | +} | |
| 19 | + | |
| 20 | +pub struct GotEntry { pub symbol: SymbolId, pub weak_import: bool } | |
| 21 | +``` | |
| 22 | + | |
| 23 | +- Lives in `__DATA_CONST,__got`, section flags `S_NON_LAZY_SYMBOL_POINTERS` (type = 6). `reserved1` in the section header = the starting indirect-symbol-table index. | |
| 24 | +- 8 bytes per entry, aligned 8. | |
| 25 | +- GOT entry for a Defined symbol holds that symbol's address directly (no dyld bind). | |
| 26 | +- GOT entry for a DylibImport is zeroed in the file; dyld binds it at load time via the non-lazy bind stream (Sprint 15). | |
| 27 | + | |
| 28 | +### 2. Stubs synthetic section | |
| 29 | +`afs-ld/src/synth/stubs.rs`: | |
| 30 | + | |
| 31 | +ARM64 stub is 12 bytes: | |
| 32 | +``` | |
| 33 | +ADRP x16, la_symbol_ptr@PAGE | |
| 34 | +LDR x16, [x16, la_symbol_ptr@PAGEOFF] | |
| 35 | +BR x16 | |
| 36 | +``` | |
| 37 | + | |
| 38 | +- Lives in `__TEXT,__stubs`, section flags `S_SYMBOL_STUBS | S_ATTR_PURE_INSTRUCTIONS | S_ATTR_SOME_INSTRUCTIONS` (type = 8). `reserved1` = starting indirect-sym index, `reserved2` = 12 (stub size). | |
| 39 | +- One stub per dylib-imported function whose address is branched to (`BRANCH26` target). | |
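For reference, the three stub instructions could be encoded as below. This is a sketch under stated assumptions: the lazy pointer is 8-byte aligned and within ADRP range of the stub, and `encode_stub` is a hypothetical helper, not the planned `stubs.rs` API.

```rust
/// Encode `ADRP x16, ptr@PAGE; LDR x16, [x16, ptr@PAGEOFF]; BR x16` as 12 little-endian bytes.
fn encode_stub(stub_addr: u64, lazy_ptr_addr: u64) -> [u8; 12] {
    // ADRP immediate: signed page delta, split into immlo (bits 30:29) and immhi (bits 23:5).
    let page_delta = ((lazy_ptr_addr & !0xFFF).wrapping_sub(stub_addr & !0xFFF) as i64) >> 12;
    let imm21 = (page_delta as u32) & 0x1F_FFFF;
    let adrp = 0x9000_0010u32 | ((imm21 & 0x3) << 29) | ((imm21 >> 2) << 5);
    // LDR X (unsigned offset): imm12 is the page offset divided by 8.
    let ldr = 0xF940_0210u32 | ((((lazy_ptr_addr & 0xFFF) >> 3) as u32) << 10);
    // BR x16
    let br = 0xD61F_0200u32;

    let mut out = [0u8; 12];
    out[0..4].copy_from_slice(&adrp.to_le_bytes());
    out[4..8].copy_from_slice(&ldr.to_le_bytes());
    out[8..12].copy_from_slice(&br.to_le_bytes());
    out
}
```

The `ADRP` page delta is computed from the stub's own address, so each stub must be encoded after layout has fixed both `__stubs` and `__la_symbol_ptr`.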
| 40 | + | |
| 41 | +### 3. Lazy symbol pointers | |
| 42 | +`__DATA,__la_symbol_ptr`, section flags `S_LAZY_SYMBOL_POINTERS` (type = 7). Each 8-byte entry is initialized to point at the corresponding `__stub_helper` entry; at first call, the stub-helper resolves the symbol and patches the lazy pointer. | |
| 43 | + | |
| 44 | +### 4. Stub helper | |
| 45 | +`__TEXT,__stub_helper`: | |
| 46 | + | |
| 47 | +Header (24 bytes on arm64): | |
| 48 | +``` | |
| 49 | +ADRP x17, __dyld_private@PAGE | |
| 50 | +ADD x17, x17, __dyld_private@PAGEOFF | |
| 51 | +STP x16, x17, [sp, #-16]! | |
| 52 | +ADRP x16, dyld_stub_binder@GOTPAGE | |
| 53 | +LDR x16, [x16, dyld_stub_binder@GOTPAGEOFF] | |
| 54 | +BR x16 | |
| 55 | +``` | |
| 56 | + | |
| 57 | +Per-symbol entry (12 bytes): | |
| 58 | +``` | |
| 59 | +LDR w16, L0 ; load the 32-bit literal that follows | |
| 60 | +B <header_addr> | |
| 60 | +L0: .long <lazy_bind_offset> ; third word of the 12-byte entry | |
| 61 | +``` | |
| 62 | + | |
| 63 | +Where `<lazy_bind_offset>` is the offset of this symbol's opcode sequence within the `__LINKEDIT` lazy-bind stream (Sprint 15 wires this). | |
| 64 | + | |
| 65 | +Needs `__dyld_private` (local anchor) and `_dyld_stub_binder` (dylib import from `libSystem`). | |
| 66 | + | |
| 67 | +### 5. Binding strategy | |
| 68 | +- `_dyld_stub_binder` is imported from `libSystem`. Gets a GOT entry; no stub (we take its address directly). | |
| 69 | +- `__dyld_private` is a 0-filled 8-byte slot in `__DATA,__data`. Not exported. Dyld uses it as scratch during binding. | |
| 70 | + | |
| 71 | +### 6. Reloc rewiring | |
| 72 | +Relocation application pass (Sprint 11 + this): | |
| 73 | + | |
| 74 | +- `GOT_LOAD_PAGE21` / `GOT_LOAD_PAGEOFF12` → target = GOT slot address. | |
| 75 | +- `POINTER_TO_GOT` → target = GOT slot address (used for 32-bit pointer-to-GOT references). | |
| 76 | +- `BRANCH26` to a dylib import → target = stub address. | |
| 77 | +- `BRANCH26` to a Defined → target unchanged (direct call). | |
| 78 | +- `BRANCH26` to an Undefined resolved via `-undefined dynamic_lookup` → target = stub address. | |
| 79 | + | |
| 80 | +### 7. Indirect symbol table | |
| 81 | +`__LINKEDIT` indirect-symbol table = list of u32 symbol-table indices, used by dyld to map each stub / lazy pointer / GOT slot entry back to its symbol. Populated here, pointed at by `LC_DYSYMTAB.indirectsymoff`. | |
| 82 | + | |
| 83 | +### 8. Weak-import dylib functions | |
| 84 | +`weak_import` symbols get stubs whose lazy binding opcode sequence includes the `BIND_SYMBOL_FLAGS_WEAK_IMPORT` flag. At runtime, if the symbol is missing, dyld patches the lazy pointer to 0 instead of erroring. The call site must test for null before branching — that's user code's responsibility. | |
| 85 | + | |
| 86 | +## Testing Strategy | |
| 87 | +- Hello-world staging fixture: a `.o` that calls `_printf` + references `_errno`. Produces `__stubs`, `__la_symbol_ptr`, `__stub_helper`, `__got` in the expected order and sizes. | |
| 88 | +- Differential: stub/lazy-pointer/GOT layout byte-identical to `ld` on the staging fixture. | |
| 89 | +- Reloc-rewire test: `BRANCH26` to a dylib-imported function lands in the stub, not in the dylib directly. | |
| 90 | +- Disassembly test: `otool -v -s __TEXT __stubs` (plain `-t` only covers `__TEXT,__text`) matches the expected three-instruction sequence for every entry. | |
| 91 | + | |
| 92 | +## Definition of Done | |
| 93 | +- GOT, stubs, lazy pointers, stub helper all emitted with correct flags and `reserved*` fields. | |
| 94 | +- Indirect symbol table populated correctly (Sprint 14 consumes it). | |
| 95 | +- BRANCH26-to-dylib correctly rewired to stubs. | |
| 96 | +- Differential pass on the staging hello-world fixture. | |
.docs/sprints/sprint13.mdadded@@ -0,0 +1,64 @@ | ||
| 1 | +# Sprint 13: TLV Relocations | |
| 2 | + | |
| 3 | +## Prerequisites | |
| 4 | +Sprint 12 — GOT-like synthesis patterns established. | |
| 5 | + | |
| 6 | +## Goals | |
| 7 | +Support thread-local variables: the full chain from afs-as's `__thread_vars` / `__thread_data` / `__thread_bss` through `__DATA,__thread_ptrs` into the ARM64 TLV runtime call. `TLVP_LOAD_PAGE21` and `TLVP_LOAD_PAGEOFF12` relocations applied correctly. | |
| 8 | + | |
| 9 | +## Deliverables | |
| 10 | + | |
| 11 | +### 1. TLV descriptor layout | |
| 12 | +Apple's TLV model: each TLV gets a 3-word descriptor in `__DATA,__thread_vars` (section type `S_THREAD_LOCAL_VARIABLES`, 0x13): | |
| 13 | + | |
| 14 | +``` | |
| 15 | +u64 thunk_addr; // pointer to tlv_get_addr (libSystem) — rebased/bound at load | |
| 16 | +u64 key; // pthread_key_t, set to 0 initially | |
| 17 | +u64 offset; // offset of the variable's initial data within __thread_data | |
| 18 | +``` | |
| 19 | + | |
| 20 | +afs-as emits the descriptor template (thunk_addr = 0, key = 0, offset = section-relative to `__thread_data` or `__thread_bss`). afs-ld: | |
| 21 | + | |
| 22 | +- Patches `thunk_addr` to reference `_tlv_bootstrap` (from libSystem) via `__DATA,__thread_ptrs`. | |
| 23 | +- Leaves `key = 0` (runtime initializes on first access). | |
| 24 | +- Adjusts `offset` to be the final VM offset into the laid-out `__thread_data` / `__thread_bss` section. | |
| 25 | + | |
| 26 | +### 2. `__DATA,__thread_ptrs` synth | |
| 27 | +Section type `S_THREAD_LOCAL_VARIABLE_POINTERS` (0x14). Contains non-lazy pointers to the TLV thunk function (`_tlv_bootstrap` from libSystem). One 8-byte entry per imported TLV thunk. Equivalent to the GOT for TLVs. | |
| 28 | + | |
| 29 | +### 3. TLVP reloc application | |
| 30 | +`TLVP_LOAD_PAGE21` and `TLVP_LOAD_PAGEOFF12` resolve to a `__thread_ptrs` entry (not a `__thread_vars` descriptor directly). The thread-local access sequence afs-as emits: | |
| 31 | + | |
| 32 | +``` | |
| 33 | +ADRP x0, _tlv@TLVPPAGE | |
| 34 | +LDR x0, [x0, _tlv@TLVPPAGEOFF] ; x0 = thread_ptrs[_tlv] = &tlv_descriptor | |
| 35 | +LDR x1, [x0] ; x1 = descriptor thunk pointer at [desc+0] | |
| 36 | +BLR x1 ; returns address of TLV body in x0 | |
| 37 | +``` | |
| 38 | + | |
| 39 | +Per Apple's TLV ABI, the `__thread_ptrs` entry is a pointer to the TLV descriptor (the 3-word record in `__thread_vars`). The sequence loads the thunk from `[desc+0]` and calls it with the descriptor address in `x0`. The thunk reads the key at `[desc+8]`, calls `pthread_getspecific` if needed, and returns the body address. This still needs to be verified against the reference in `.refs/ld64/` before coding. | |
| 40 | + | |
| 41 | +### 4. Coordinate with afs-as section layout | |
| 42 | +afs-as emits: | |
| 43 | +- `__DATA,__thread_data` (S_THREAD_LOCAL_REGULAR, 0x11): TLV initializers. | |
| 44 | +- `__DATA,__thread_bss` (S_THREAD_LOCAL_ZEROFILL, 0x12): zero-initialized TLVs. | |
| 45 | +- `__DATA,__thread_vars` (S_THREAD_LOCAL_VARIABLES, 0x13): descriptors. | |
| 46 | + | |
| 47 | +afs-ld preserves these three sections and adds `__DATA,__thread_ptrs` (S_THREAD_LOCAL_VARIABLE_POINTERS, 0x14). | |
| 48 | + | |
| 49 | +### 5. `_tlv_bootstrap` import | |
| 50 | +Auto-injected as an undefined symbol (if any TLV descriptor needs it), resolves from `libSystem`. Its GOT-equivalent entry lives in `__DATA,__thread_ptrs`, not `__DATA_CONST,__got` (TLV has its own indirection). | |
| 51 | + | |
| 52 | +### 6. Zero TLVs early-out | |
| 53 | +If no input section has `S_THREAD_LOCAL_*` contents and no reloc has a TLVP kind, emit no TLV sections at all. | |
| 54 | + | |
| 55 | +## Testing Strategy | |
| 56 | +- Fixture: a `.f90` with `THREADPRIVATE` → `.o` with `__thread_vars`, `__thread_data`, `__thread_bss` and TLVP relocs. Link with afs-ld and with `ld`. Diff the resulting TLV descriptors, `__thread_ptrs`, and reloc-patched bytes. | |
| 57 | +- Runtime test: link a tiny C program that reads a TLV via the Apple TLV ABI sequence, run it, check output. | |
| 58 | +- Zero-TLV fixture: no TLV sections leak into the output. | |
| 59 | + | |
| 60 | +## Definition of Done | |
| 61 | +- `TLVP_LOAD_*` relocs apply correctly. | |
| 62 | +- `__thread_ptrs` emitted with correct type flag and entries. | |
| 63 | +- `_tlv_bootstrap` imported only when needed. | |
| 64 | +- Runtime test loads and reads a TLV correctly under afs-ld. | |
.docs/sprints/sprint14.mdadded@@ -0,0 +1,77 @@ | ||
| 1 | +# Sprint 14: LC_SYMTAB / LC_DYSYMTAB / String Table | |
| 2 | + | |
| 3 | +## Prerequisites | |
| 4 | +Sprints 10–13 — layout, relocs, GOT/stubs/TLV all emitted. | |
| 5 | + | |
| 6 | +## Goals | |
| 7 | +Build the final symbol table, string table, and `LC_DYSYMTAB` partitioning expected by dyld. Byte-level matches with `ld`'s layout on simple inputs. | |
| 8 | + | |
| 9 | +## Deliverables | |
| 10 | + | |
| 11 | +### 1. Symbol table partitioning | |
| 12 | +dyld requires symbols in this order inside `LC_SYMTAB`: | |
| 13 | + | |
| 14 | +1. **Locals** (ilocalsym..ilocalsym+nlocalsym): private Defined symbols, `N_PEXT` private-external symbols, and debug stabs (`N_STAB` entries; afs-as emits none today). | |
| 15 | +2. **External defined** (iextdefsym..iextdefsym+nextdefsym): `N_EXT` Defined symbols sorted by name for dyld lookups. | |
| 16 | +3. **Undefined** (iundefsym..iundefsym+nundefsym): dylib imports, sorted by name. | |
| 17 | + | |
| 18 | +`LC_DYSYMTAB` records each partition's start and count. | |
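The partitioning can be sketched as a stable sort followed by a count per partition. The `Partition` enum and `partition_symbols` function are hypothetical, and keeping locals in input order is an assumption of this sketch (the text only requires name order for the two external partitions).

```rust
use std::cmp::Ordering;

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Partition {
    Local,
    ExtDef,
    Undef,
}

/// Returns the symbols in output order plus `(start, count)` for locals, extdefs, undefs.
fn partition_symbols(
    mut syms: Vec<(Partition, &'static str)>,
) -> (Vec<(Partition, &'static str)>, [(u32, u32); 3]) {
    // Stable sort: partition first, then name for the two external partitions.
    syms.sort_by(|a, b| {
        a.0.cmp(&b.0).then_with(|| match a.0 {
            Partition::Local => Ordering::Equal, // keep input order
            _ => a.1.cmp(&b.1),
        })
    });
    let mut ranges = [(0u32, 0u32); 3];
    let mut start = 0u32;
    for (i, part) in [Partition::Local, Partition::ExtDef, Partition::Undef].iter().enumerate() {
        let count = syms.iter().filter(|s| s.0 == *part).count() as u32;
        ranges[i] = (start, count);
        start += count;
    }
    (syms, ranges)
}
```

The three `(start, count)` pairs map directly onto `ilocalsym/nlocalsym`, `iextdefsym/nextdefsym`, and `iundefsym/nundefsym`.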
| 19 | + | |
| 20 | +### 2. Symbol entry construction | |
| 21 | +Per output symbol, emit an `nlist_64`: | |
| 22 | + | |
| 23 | +``` | |
| 24 | +strx: offset into the string table | |
| 25 | +n_type: N_SECT | N_EXT for external Defined; | |
| 26 | + N_SECT | N_PEXT for private Defined; | |
| 27 | + N_UNDF | N_EXT for undefined / dylib import; | |
| 28 | + N_ABS for absolute | |
| 29 | +n_sect: 1-based index into the section-table-in-header order; 0 for UNDF/ABS | |
| 30 | +n_desc: for UNDF: high 8 bits = library ordinal (1-based) or a special ordinal (0, 0xfe, 0xff) | |
| 31 | + N_WEAK_REF | N_WEAK_DEF | N_NO_DEAD_STRIP as appropriate | |
| 32 | +n_value: Defined's VM address; UNDF = 0 | |
| 33 | +``` | |
| 34 | + | |
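| 35 | +Two-level namespace: every dylib-imported symbol gets the ordinal of its DylibFile in the high 8 bits of `n_desc` (the field is only 16 bits wide). Flat lookup uses `DYNAMIC_LOOKUP_ORDINAL` (0xfe); the other special ordinals are `SELF_LIBRARY_ORDINAL` (0) and `EXECUTABLE_ORDINAL` (0xff), per `<mach-o/nlist.h>`. | |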
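A small sketch of the 16-byte `nlist_64` record and the ordinal packing, mirroring what `GET_LIBRARY_ORDINAL` / `SET_LIBRARY_ORDINAL` in `<mach-o/nlist.h>` do with the top byte of the 16-bit `n_desc`. The Rust function names are illustrative; the constant values are the ones from the Mach-O headers.

```rust
const N_UNDF: u8 = 0x00;
const N_EXT: u8 = 0x01;
const N_WEAK_REF: u16 = 0x0040;
const DYNAMIC_LOOKUP_ORDINAL: u8 = 0xfe;

/// Serialize one `nlist_64` entry (16 bytes, little-endian).
fn nlist64(n_strx: u32, n_type: u8, n_sect: u8, n_desc: u16, n_value: u64) -> [u8; 16] {
    let mut out = [0u8; 16];
    out[0..4].copy_from_slice(&n_strx.to_le_bytes());
    out[4] = n_type;
    out[5] = n_sect;
    out[6..8].copy_from_slice(&n_desc.to_le_bytes());
    out[8..16].copy_from_slice(&n_value.to_le_bytes());
    out
}

/// Store a library ordinal in the high 8 bits of `n_desc`, keeping the low flag bits.
fn set_library_ordinal(n_desc: u16, ordinal: u8) -> u16 {
    (n_desc & 0x00FF) | ((ordinal as u16) << 8)
}

/// Read the library ordinal back out of `n_desc`.
fn library_ordinal(n_desc: u16) -> u8 {
    (n_desc >> 8) as u8
}
```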
| 36 | + | |
| 37 | +### 3. String table | |
| 38 | +- Starts with a null byte at offset 0 (dyld-required). | |
| 39 | +- All symbol names follow, each null-terminated. | |
| 40 | +- Suffix-dedup like afs-as: sort names by reverse-lexicographic suffix order and reuse trailing bytes where possible. Cheap space win and preserves the style contract with afs-as. | |
| 41 | +- Padded at the end to an 8-byte boundary. | |
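One way the suffix dedup could work is sketched below. It is not necessarily how afs-as does it: names are sorted by their reversed bytes in descending order, so any name that is a suffix of another lands directly after a string that contains it and can reuse that string's tail. `build_strtab` is a hypothetical name.

```rust
use std::collections::HashMap;

/// Build a string table with a leading null byte, suffix sharing, and 8-byte end padding.
/// Returns the table bytes and each name's `strx` offset.
fn build_strtab(names: &[&str]) -> (Vec<u8>, HashMap<String, u32>) {
    let mut sorted: Vec<&str> = names.to_vec();
    // Reverse-suffix order, descending: a longer string sorts before its own suffixes.
    sorted.sort_by(|a, b| b.bytes().rev().cmp(a.bytes().rev()));
    sorted.dedup();

    let mut table = vec![0u8]; // offset 0 is the empty string
    let mut offsets = HashMap::new();
    let mut prev: Option<(&str, u32)> = None;
    for name in sorted {
        let off = match prev {
            // Reuse the tail of the previous string when this name is its suffix.
            Some((p, p_off)) if p.ends_with(name) => p_off + (p.len() - name.len()) as u32,
            _ => {
                let off = table.len() as u32;
                table.extend_from_slice(name.as_bytes());
                table.push(0);
                off
            }
        };
        offsets.insert(name.to_string(), off);
        prev = Some((name, off));
    }
    while table.len() % 8 != 0 {
        table.push(0);
    }
    (table, offsets)
}
```

With `_afs_array_sum` and `_array_sum` in the input, only the longer name is written; the shorter one points four bytes into it and shares its terminator.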
| 42 | + | |
| 43 | +### 4. Indirect symbol table | |
| 44 | +Already populated by Sprint 12 via GOT/stubs/lazy-pointers. Lives in `__LINKEDIT`, pointed at by `LC_DYSYMTAB.indirectsymoff / nindirectsyms`. Each entry is a u32: | |
| 45 | +- Symbol-table index for symbols in `__stubs`, `__la_symbol_ptr`, `__got`. | |
| 46 | +- Special sentinel `INDIRECT_SYMBOL_LOCAL=0x80000000` for entries pointing at local symbols (not exported). | |
| 47 | +- Special sentinel `INDIRECT_SYMBOL_ABS=0x40000000` for absolute symbols. | |
| 48 | + | |
| 49 | +### 5. Local-symbol stripping (`-x`) | |
| 50 | +`ld` supports `-x` to strip local symbols from the output. We record the flag (Sprint 19 wires CLI) and at emission time drop locals from the symbol table. Relocs (if any were external-only) and debug info are unaffected. If `-x` is not set, emit all locals. | |
| 51 | + | |
| 52 | +### 6. Relocations in the output | |
| 53 | +For `MH_EXECUTE` and `MH_DYLIB`, dyld-era outputs don't emit per-section relocations — `LC_DYLD_INFO` (or chained fixups) does that job. afs-ld writes zero `nreloc`/`reloff` on output sections. (For `MH_OBJECT`, which we're not emitting, it would matter.) | |
| 54 | + | |
| 55 | +### 7. File-offset sequencing in __LINKEDIT | |
| 56 | +`__LINKEDIT` data layout order ld uses (we match for differential ease): | |
| 57 | +1. Chained fixups blob (if present) or dyld-info opcode streams. | |
| 58 | +2. Function starts blob. | |
| 59 | +3. Data-in-code blob. | |
| 60 | +4. Symbol table (`nsyms * 16` bytes). | |
| 61 | +5. Indirect symbol table. | |
| 62 | +6. String table. | |
| 63 | +7. Code signature. | |
| 64 | + | |
| 65 | +Each block aligned to 8 bytes; `__LINKEDIT` itself page-aligned. `LC_SYMTAB`, `LC_DYSYMTAB`, `LC_FUNCTION_STARTS`, `LC_DATA_IN_CODE`, `LC_DYLD_INFO_ONLY`, `LC_CODE_SIGNATURE` all point into this region with file offsets. | |
| 66 | + | |
| 67 | +## Testing Strategy | |
| 68 | +- Build a fixture with one local, one external Defined, one undefined dylib import. Verify `LC_SYMTAB` / `LC_DYSYMTAB` partitions match `ld`'s output exactly. | |
| 69 | +- String-table dedup: two symbols `_afs_array_sum` and `_array_sum` share suffix bytes. | |
| 70 | +- Two-level namespace ordinals assigned in load-command order; mismatches produce hard errors when the dylib isn't listed. | |
| 71 | +- Differential: symbol-table byte-level match for every staging fixture. | |
| 72 | + | |
| 73 | +## Definition of Done | |
| 74 | +- `nm -v` output identical (modulo address offsets allowed by differential harness) between afs-ld and `ld` on all staging fixtures. | |
| 75 | +- `LC_DYSYMTAB` partition boundaries exact. | |
| 76 | +- Indirect symbol table entries point to the correct nlist indices for stubs, lazy pointers, and GOT. | |
| 77 | +- String-table byte length within 5% of `ld`'s (suffix-dedup parity). | |
.docs/sprints/sprint15.md added @@ -0,0 +1,101 @@ | ||
| 1 | +# Sprint 15: Classic LC_DYLD_INFO Opcodes | |
| 2 | + | |
| 3 | +## Prerequisites | |
| 4 | +Sprints 12, 14 — GOT/stubs/lazy-pointers in place, symbol table shaped. | |
| 5 | + | |
| 6 | +## Goals | |
| 7 | +Generate the four ULEB128 opcode streams and the export trie that dyld reads via `LC_DYLD_INFO_ONLY`. This is the classic format (macOS 11–13 default) and the `-no_fixup_chains` path on newer macOS. Chained fixups land in Sprint 15.5. | |
| 8 | + | |
| 9 | +## Deliverables | |
| 10 | + | |
| 11 | +### 1. The five streams | |
| 12 | + | |
| 13 | +`LC_DYLD_INFO_ONLY` load command points at five blobs in `__LINKEDIT`: | |
| 14 | +- **rebase_off / rebase_size**: rebase opcodes — fix up absolute pointers for ASLR slide. | |
| 15 | +- **bind_off / bind_size**: bind opcodes — non-lazy imports from dylibs. | |
| 16 | +- **weak_bind_off / weak_bind_size**: weak-bind opcodes — C++-style weak symbol coalescing at runtime. | |
| 17 | +- **lazy_bind_off / lazy_bind_size**: lazy-bind opcodes — one block per stub_helper entry. | |
| 18 | +- **export_off / export_size**: export trie — what this image exports to other images. | |
| 19 | + | |
| 20 | +### 2. Opcode encoder | |
| 21 | +`afs-ld/src/synth/dyld_info.rs`: | |
| 22 | + | |
| 23 | +```rust | |
| 24 | +pub struct OpcodeStream { buf: Vec<u8> } | |
| 25 | + | |
| 26 | +impl OpcodeStream { | |
| 27 | + pub fn uleb(&mut self, v: u64); | |
| 28 | + pub fn sleb(&mut self, v: i64); | |
| 29 | + pub fn string(&mut self, s: &str); // null-terminated | |
| 30 | + pub fn byte(&mut self, op_and_imm: u8); | |
| 31 | + pub fn done(&mut self); // terminating REBASE_OPCODE_DONE / BIND_OPCODE_DONE | |
| 32 | +} | |
| 33 | +``` | |
| 34 | + | |
| 35 | +Opcode byte = (opcode_nibble << 4) | imm_nibble. | |
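A minimal sketch of the LEB128 writers and the opcode/immediate packing that `OpcodeStream` would wrap. The free-function names are hypothetical; the opcode constants in `<mach-o/loader.h>` are already stored in the high nibble (for example `REBASE_OPCODE_SET_TYPE_IMM = 0x10`), so packing is a plain OR.

```rust
/// Append `v` as unsigned LEB128: 7 bits per byte, low bits first,
/// high bit set on every byte except the last.
pub fn write_uleb128(buf: &mut Vec<u8>, mut v: u64) {
    loop {
        let byte = (v & 0x7f) as u8;
        v >>= 7;
        if v == 0 {
            buf.push(byte);
            break;
        }
        buf.push(byte | 0x80);
    }
}

/// Append `v` as signed LEB128. Emission stops once the remaining value is
/// pure sign extension of the last byte written.
pub fn write_sleb128(buf: &mut Vec<u8>, mut v: i64) {
    loop {
        let byte = (v & 0x7f) as u8;
        v >>= 7; // arithmetic shift keeps the sign
        let sign_bit_clear = byte & 0x40 == 0;
        if (v == 0 && sign_bit_clear) || (v == -1 && !sign_bit_clear) {
            buf.push(byte);
            break;
        }
        buf.push(byte | 0x80);
    }
}

/// Combine a pre-shifted opcode constant with a 4-bit immediate.
pub fn op_with_imm(opcode: u8, imm: u8) -> u8 {
    (opcode & 0xf0) | (imm & 0x0f)
}
```

For example, `op_with_imm(0x10, 1)` yields `0x11`, which is `REBASE_OPCODE_SET_TYPE_IMM` carrying `REBASE_TYPE_POINTER`.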
| 36 | + | |
| 37 | +### 3. Rebase stream | |
| 38 | +For every absolute pointer in output `__DATA` / `__DATA_CONST` (an `Unsigned` reloc or a GOT entry resolved to a local address), emit rebase opcodes: | |
| 39 | + | |
| 40 | +``` | |
| 41 | +REBASE_OPCODE_SET_TYPE_IMM(REBASE_TYPE_POINTER) | |
| 42 | +REBASE_OPCODE_SET_SEGMENT_AND_OFFSET_ULEB(seg_idx, offset_within_seg) | |
| 43 | +REBASE_OPCODE_DO_REBASE_ULEB_TIMES(count) or REBASE_OPCODE_DO_REBASE_IMM_TIMES(count) | |
| 44 | +``` | |
| 45 | + | |
| 46 | +Batching: consecutive rebases collapse into single `_ULEB_TIMES`; strided rebases use `_ULEB_TIMES_SKIPPING_ULEB`. Matching ld's batching is what keeps the differential harness happy. | |
| 47 | + | |
| 48 | +### 4. Non-lazy bind stream | |
| 49 | +For every GOT entry pointing at a dylib import: | |
| 50 | + | |
| 51 | +``` | |
| 52 | +BIND_OPCODE_SET_DYLIB_ORDINAL_IMM(ordinal) or _ULEB(ordinal) | |
| 53 | +BIND_OPCODE_SET_SYMBOL_TRAILING_FLAGS_IMM(flags) + <name>\0 | |
| 54 | +BIND_OPCODE_SET_TYPE_IMM(BIND_TYPE_POINTER) | |
| 55 | +BIND_OPCODE_SET_SEGMENT_AND_OFFSET_ULEB(seg_idx, offset) | |
| 56 | +BIND_OPCODE_DO_BIND | |
| 57 | +``` | |
| 58 | + | |
| 59 | +Flags: `BIND_SYMBOL_FLAGS_WEAK_IMPORT`, `BIND_SYMBOL_FLAGS_NON_WEAK_DEFINITION`. | |
| 60 | + | |
| 61 | +### 5. Weak bind stream | |
| 62 | +For symbols that participate in weak coalescing across the program (weak defs that can be overridden by other images). For armfortas today this is empty; fortsh may or may not need it. Emit a terminator-only stream by default. | |
| 63 | + | |
| 64 | +### 6. Lazy bind stream | |
| 65 | +One block per stub_helper entry (one dylib-imported callable per stub). Each block: | |
| 66 | +``` | |
| 67 | +BIND_OPCODE_SET_SEGMENT_AND_OFFSET_ULEB(seg_idx_of_la_symbol_ptr, offset_of_this_slot) | |
| 68 | +BIND_OPCODE_SET_DYLIB_ORDINAL_IMM/ULEB(ordinal) | |
| 69 | +BIND_OPCODE_SET_SYMBOL_TRAILING_FLAGS_IMM(flags) + <name>\0 | |
| 70 | +BIND_OPCODE_DO_BIND | |
| 71 | +BIND_OPCODE_DONE | |
| 72 | +``` | |
| 73 | + | |
| 74 | +The stub_helper entry pushes the byte offset of its block; `dyld_stub_binder` reads from that offset, interprets the block, patches the lazy pointer. | |
| 75 | + | |
| 76 | +### 7. Export trie | |
| 77 | +Rooted at `__LINKEDIT[export_off]`. Built from the output's external Defined symbols (including re-exports from dylibs we re-export). Tree construction: | |
| 78 | + | |
| 79 | +- Collect `(name, ExportEntry)` pairs. | |
| 80 | +- Build a prefix trie. | |
| 81 | +- Emit depth-first: each node = ULEB terminal-size, optional terminal payload (flags + address ULEB), child-count, (edge_string, child_offset) pairs. | |
| 82 | +- Child offsets are fixed up in a second pass once sizes are known. | |
| 83 | + | |
| 84 | +Terminal payload formats: | |
| 85 | +- Regular: `flags ULEB | image_offset ULEB` (the symbol's address expressed as an offset from the Mach-O header's load address, not a file offset). | |
| 86 | +- Re-export: `flags ULEB | dylib_ordinal ULEB | imported_name\0`. | |
| 87 | +- Stub-and-resolver: `flags ULEB | stub_addr ULEB | resolver_addr ULEB`. | |
| 88 | + | |
| 89 | +### 8. Stream-size determinism | |
| 90 | +Every stream must be deterministic across invocations given identical inputs. Sort keys everywhere, no hashmap iteration order. | |
| 91 | + | |
| 92 | +## Testing Strategy | |
| 93 | +- Differential: for every staging fixture, afs-ld and `ld` produce byte-identical opcode streams after normalizing any tolerated differences. | |
| 94 | +- Unit tests for ULEB128 encoding at boundary values (0, 127, 128, 16383, 16384, big). | |
| 95 | +- Export-trie walker (Sprint 5's `DylibFile::exports` reader) round-trips our emitted tries: emit a trie, parse it back, every name resolves. | |
| 96 | + | |
| 97 | +## Definition of Done | |
| 98 | +- All five streams emitted correctly. | |
| 99 | +- Export trie round-trips through our own reader. | |
| 100 | +- Differential byte-level parity with `ld` on 10+ staging fixtures. | |
| 101 | +- Opcode emission is deterministic. | |
.docs/sprints/sprint15_5.md added @@ -0,0 +1,100 @@ | ||
| 1 | +# Sprint 15.5: Chained Fixups (LC_DYLD_CHAINED_FIXUPS) | |
| 2 | + | |
| 3 | +## Prerequisites | |
| 4 | +Sprint 15 — classic dyld-info format working. | |
| 5 | + | |
| 6 | +## Goals | |
| 7 | +Emit the modern `LC_DYLD_CHAINED_FIXUPS` format, introduced in macOS 12 and mandatory on arm64e. `LC_DYLD_EXPORTS_TRIE` pairs with it for the export side. Coexists with Sprint 15 under `-fixup_chains` / `-no_fixup_chains`; chained becomes default once Sprint 27's parity gate clears. | |
| 8 | + | |
| 9 | +## Deliverables | |
| 10 | + | |
| 11 | +### 1. Chained-fixups header | |
| 12 | +`__LINKEDIT` blob pointed at by `LC_DYLD_CHAINED_FIXUPS`: | |
| 13 | + | |
| 14 | +``` | |
| 15 | +struct dyld_chained_fixups_header { | |
| 16 | + uint32 fixups_version; // 0 | |
| 17 | + uint32 starts_offset; // offset of dyld_chained_starts_in_image | |
| 18 | + uint32 imports_offset; // offset of imports table | |
| 19 | + uint32 symbols_offset; // offset of symbol strings | |
| 20 | + uint32 imports_count; | |
| 21 | + uint32 imports_format; // DYLD_CHAINED_IMPORT (1), _ADDEND (2), or _ADDEND64 (3) | |
| 22 | + uint32 symbols_format; // 0 = uncompressed | |
| 23 | +} | |
| 24 | +``` | |
| 25 | + | |
| 26 | +### 2. Per-segment fixup starts | |
| 27 | +``` | |
| 28 | +struct dyld_chained_starts_in_image { | |
| 29 | + uint32 seg_count; | |
| 30 | + uint32 seg_info_offset[seg_count]; // 0 = no fixups in this segment | |
| 31 | +} | |
| 32 | + | |
| 33 | +struct dyld_chained_starts_in_segment { | |
| 34 | + uint32 size; | |
| 35 | + uint16 page_size; // 0x4000 on arm64 Apple Silicon | |
| 36 | + uint16 pointer_format; // DYLD_CHAINED_PTR_64 (2) or _64_OFFSET (6) | |
| 37 | + uint64 segment_offset; | |
| 38 | + uint32 max_valid_pointer; // 0 for 64-bit | |
| 39 | + uint16 page_count; | |
| 40 | + uint16 page_start[page_count]; // offset of first chain within each page (0xFFFF = no chain) | |
| 41 | +} | |
| 42 | +``` | |
| 43 | + | |
| 44 | +### 3. Pointer formats | |
| 45 | +arm64 uses `DYLD_CHAINED_PTR_64` (plain 64-bit) or `DYLD_CHAINED_PTR_64_OFFSET` (offsets from image base). arm64e uses `DYLD_CHAINED_PTR_ARM64E` (with auth bits); skip arm64e for now. Each chained pointer is a 64-bit word with these fields, listed from bit 0 upward (per `<mach-o/fixup-chains.h>`): | |
| 46 | + | |
| 47 | +``` | |
| 48 | +bind: 24-bit ordinal | 8-bit addend | 19-bit reserved | 12-bit next | 1-bit bind=1 | |
| 49 | +rebase: 36-bit target | 8-bit high8 | 7-bit reserved | 12-bit next | 1-bit bind=0 | |
| 50 | +``` | |
| 51 | + | |
| 52 | +The `next` field is the distance in 4-byte units from this fixup to the next one within the page (0 = end of chain). Rebuilding chains after layout is the bulk of this sprint. | |
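A sketch of packing `DYLD_CHAINED_PTR_64` words, assuming the bit layout of `dyld_chained_ptr_64_rebase` and `dyld_chained_ptr_64_bind` in `<mach-o/fixup-chains.h>` (12-bit `next` at bits 51..62, `bind` flag at bit 63). The function names are hypothetical.

```rust
/// Pack a rebase: target (36 bits) | high8 (8) | reserved (7) | next (12) | bind=0.
pub fn pack_rebase(target: u64, high8: u8, next: u16) -> Option<u64> {
    if target >= (1u64 << 36) || next >= (1 << 12) {
        return None; // field does not fit
    }
    Some(target | ((high8 as u64) << 36) | ((next as u64) << 51))
}

/// Pack a bind: ordinal (24 bits) | addend (8) | reserved (19) | next (12) | bind=1.
pub fn pack_bind(ordinal: u32, addend: u8, next: u16) -> Option<u64> {
    if ordinal >= (1 << 24) || next >= (1 << 12) {
        return None; // field does not fit
    }
    Some((ordinal as u64) | ((addend as u64) << 24) | ((next as u64) << 51) | (1u64 << 63))
}

/// Extract the `next` field (distance to the following fixup in 4-byte units).
pub fn next_of(word: u64) -> u16 {
    ((word >> 51) & 0xfff) as u16
}

/// True when the word is a bind rather than a rebase.
pub fn is_bind(word: u64) -> bool {
    word >> 63 == 1
}
```

Returning `Option` instead of panicking lets the caller turn an out-of-range field into a proper link error.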
| 53 | + | |
| 54 | +### 4. Imports table | |
| 55 | +One entry per imported symbol. `DYLD_CHAINED_IMPORT` format: | |
| 56 | +``` | |
| 57 | +uint32 lib_ordinal : 8; // dylib ordinal | |
| 58 | +uint32 weak_import : 1; | |
| 59 | +uint32 name_offset : 23; // into the symbol strings blob | |
| 60 | +``` | |
| 61 | + | |
| 62 | +`_ADDEND` (32-bit) and `_ADDEND64` (64-bit) formats add an explicit addend — we pick the smallest that fits our inputs (ADDEND64 only if any addend exceeds i32 range). | |
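A sketch of the `DYLD_CHAINED_IMPORT` packing and of the format choice described above. `pack_import` and `choose_format` are hypothetical names, and the selection rule here looks only at addends; a real implementation would presumably also fall back to a wider format when a name offset or ordinal does not fit.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum ImportsFormat {
    Import = 1,   // DYLD_CHAINED_IMPORT
    Addend = 2,   // DYLD_CHAINED_IMPORT_ADDEND
    Addend64 = 3, // DYLD_CHAINED_IMPORT_ADDEND64
}

/// Pack one DYLD_CHAINED_IMPORT entry: lib_ordinal in bits 0..7, weak_import
/// in bit 8, name_offset in bits 9..31. Returns None if the offset needs
/// more than 23 bits.
pub fn pack_import(lib_ordinal: u8, weak_import: bool, name_offset: u32) -> Option<u32> {
    if name_offset >= (1 << 23) {
        return None;
    }
    Some((lib_ordinal as u32) | ((weak_import as u32) << 8) | (name_offset << 9))
}

/// Pick the smallest imports format able to hold every addend.
pub fn choose_format(addends: &[i64]) -> ImportsFormat {
    if addends.iter().all(|&a| a == 0) {
        ImportsFormat::Import
    } else if addends.iter().all(|&a| i32::try_from(a).is_ok()) {
        ImportsFormat::Addend
    } else {
        ImportsFormat::Addend64
    }
}
```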
| 63 | + | |
| 64 | +### 5. Chain construction | |
| 65 | +Walk every output fixup (rebase or bind), grouped by segment and page. Within a page, chain them in ascending file offset; the `next` field of each points at the next. Pages with no fixups set `page_start = 0xFFFF`. Validate that no chain ever crosses a page boundary. | |
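The per-page walk can be sketched as follows, assuming fixup offsets within one segment are already sorted, unique, and 4-byte aligned. `Chains` and `build_chains` are hypothetical names; the real writer would also patch the `next` values into the pointer words themselves.

```rust
pub const PAGE_START_NONE: u16 = 0xFFFF;

pub struct Chains {
    /// Offset of the first fixup within each page, or PAGE_START_NONE.
    pub page_start: Vec<u16>,
    /// For each input fixup, distance to the next one in 4-byte units (0 = end of chain).
    pub next: Vec<u16>,
}

/// Build chain metadata for one segment from ascending fixup offsets.
pub fn build_chains(offsets: &[u64], seg_size: u64, page_size: u64) -> Chains {
    let page_count = ((seg_size + page_size - 1) / page_size) as usize;
    let mut page_start = vec![PAGE_START_NONE; page_count];
    let mut next = vec![0u16; offsets.len()];
    for (i, &off) in offsets.iter().enumerate() {
        let page = (off / page_size) as usize;
        assert!(page < page_count, "fixup lies outside the segment");
        if page_start[page] == PAGE_START_NONE {
            page_start[page] = (off % page_size) as u16;
        }
        // Link to the following fixup only when it sits in the same page,
        // so no chain ever crosses a page boundary.
        if let Some(&n) = offsets.get(i + 1) {
            assert!(n > off, "offsets must be strictly ascending");
            if n / page_size == off / page_size {
                let delta = n - off;
                assert!(delta % 4 == 0, "fixups must be 4-byte aligned");
                next[i] = (delta / 4) as u16;
            }
        }
    }
    Chains { page_start, next }
}
```

With 16 KiB pages the largest in-page delta is below 4096 stride units, so every `next` value fits the 12-bit field.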
| 66 | + | |
| 67 | +### 6. Exports trie → LC_DYLD_EXPORTS_TRIE | |
| 68 | +The export trie is unchanged from Sprint 15's format. In chained-fixups mode the trie lives under `LC_DYLD_EXPORTS_TRIE` instead of inside `LC_DYLD_INFO_ONLY`. | |
| 69 | + | |
| 70 | +### 7. CLI flag wiring | |
| 71 | +`-fixup_chains` forces chained, `-no_fixup_chains` forces classic. Default policy: | |
| 72 | +- If `-platform_version macos` minimum ≥ 12.0: chained. | |
| 73 | +- Otherwise: classic. | |
| 74 | + | |
| 75 | +Sprint 19's CLI sprint consumes these flags; this sprint just implements both paths. | |
| 76 | + | |
| 77 | +### 8. Removing `__stub_helper` under chained | |
| 78 | +Chained fixups don't use lazy binding — `__stub_helper` is unnecessary, `__la_symbol_ptr` becomes an ordinary bind slot in `__DATA` or `__AUTH_DATA`. Under chained mode the writer skips emitting `__stub_helper` and wires `BRANCH26` through a different stub that loads directly from the bind slot. | |
| 79 | + | |
| 80 | +Modified ARM64 stub for chained: | |
| 81 | +``` | |
| 82 | +ADRP x16, _symbol@PAGE | |
| 83 | +LDR x16, [x16, _symbol@PAGEOFF] | |
| 84 | +BR x16 | |
| 85 | +``` | |
| 86 | + | |
| 87 | +Where `_symbol@PAGE/PAGEOFF` resolves to the bind slot in `__DATA`. Dyld has already bound it by the time the stub runs. | |
| 88 | + | |
| 89 | +## Testing Strategy | |
| 90 | +- Parity vs `ld -fixup_chains` on staging fixtures: byte-identical chain layout and imports table. | |
| 91 | +- Parity vs `ld -no_fixup_chains`: retains Sprint 15 byte-identical output. | |
| 92 | +- Page-boundary test: a fixup in the last 8 bytes of a page followed by one at byte 0 of the next page; each sits in its own chain and both are reachable. | |
| 93 | +- Default-policy test: `-platform_version macos 11.0` → classic, `-platform_version macos 12.0` → chained. | |
| 94 | +- Runtime test: binaries linked both ways load and execute correctly on an M-series Mac. | |
| 95 | + | |
| 96 | +## Definition of Done | |
| 97 | +- Both `-fixup_chains` and `-no_fixup_chains` produce runnable binaries. | |
| 98 | +- Chain layout byte-identical to `ld` on 10+ staging fixtures. | |
| 99 | +- Default format switches on `-platform_version`. | |
| 100 | +- `__stub_helper` correctly omitted in chained mode. | |
.docs/sprints/sprint16.md added @@ -0,0 +1,51 @@ | ||
| 1 | +# Sprint 16: LC_FUNCTION_STARTS & LC_DATA_IN_CODE | |
| 2 | + | |
| 3 | +## Prerequisites | |
| 4 | +Sprint 14 — `__LINKEDIT` layout sequencing; Sprint 11 — atoms placed. | |
| 5 | + | |
| 6 | +## Goals | |
| 7 | +Emit the two small `__LINKEDIT` blobs used by debuggers, disassemblers, and the dynamic loader: `LC_FUNCTION_STARTS` (delta-encoded entry points) and `LC_DATA_IN_CODE` (markers for data embedded in `__text`). | |
| 8 | + | |
| 9 | +## Deliverables | |
| 10 | + | |
| 11 | +### 1. LC_FUNCTION_STARTS | |
| 12 | + | |
| 13 | +Format: a single stream of ULEB128 deltas. First ULEB = offset from the Mach-O header to the first function entry. Each subsequent ULEB = delta from the previous entry. A terminating `0` ends the stream. 8-byte aligned. | |
| 14 | + | |
| 15 | +Source: every atom in `__TEXT,__text` plus `.alt_entry` chain members. Exclude atoms from `__stubs` and `__stub_helper` — ld doesn't list those. | |
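A minimal sketch of the delta encoding, assuming the caller passes function entry offsets (relative to the Mach-O header) already sorted and de-duplicated. `encode_function_starts` is a hypothetical name.

```rust
/// Append `v` as unsigned LEB128.
fn write_uleb128(buf: &mut Vec<u8>, mut v: u64) {
    loop {
        let byte = (v & 0x7f) as u8;
        v >>= 7;
        if v == 0 {
            buf.push(byte);
            break;
        }
        buf.push(byte | 0x80);
    }
}

/// Encode strictly ascending, non-zero function offsets as ULEB128 deltas,
/// followed by a zero terminator and padding to an 8-byte boundary.
pub fn encode_function_starts(offsets: &[u64]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut prev = 0u64;
    for &off in offsets {
        // A zero delta would read as the terminator, so reject it up front.
        assert!(off > prev, "offsets must be strictly ascending and non-zero");
        write_uleb128(&mut out, off - prev);
        prev = off;
    }
    out.push(0); // terminator
    while out.len() % 8 != 0 {
        out.push(0);
    }
    out
}
```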
| 16 | + | |
| 17 | +### 2. LC_DATA_IN_CODE | |
| 18 | + | |
| 19 | +Format: a packed array of: | |
| 20 | +``` | |
| 21 | +struct data_in_code_entry { | |
| 22 | + uint32 offset; // from Mach-O header | |
| 23 | + uint16 length; // bytes | |
| 24 | + uint16 kind; // DICE_KIND_DATA=1, _JUMP_TABLE8=2, _JUMP_TABLE16=3, | |
| 25 | + // _JUMP_TABLE32=4, _ABS_JUMP_TABLE32=5 | |
| 26 | +} | |
| 27 | +``` | |
| 28 | + | |
| 29 | +Source: per-input `LC_DATA_IN_CODE` blocks. Remap each entry's offset from its input-section base to the corresponding offset in the output image (measured from the Mach-O header, matching the `offset` field above). Entries sorted by offset. | |
| 30 | + | |
| 31 | +afs-as doesn't emit jump tables today, but we preserve whatever the input has so future C/Objective-C objects with jump tables survive linking. | |
| 32 | + | |
| 33 | +### 3. Sorting determinism | |
| 34 | + | |
| 35 | +Function starts: strictly ascending by VM address. Data-in-code: strictly ascending by output offset. Ties resolved by input command-line order. | |
| 36 | + | |
| 37 | +### 4. Integration with `__LINKEDIT` layout (Sprint 14) | |
| 38 | + | |
| 39 | +Both blobs get file offsets assigned after chained fixups / classic dyld-info but before the symbol table. Pointed at by their respective load commands with `dataoff / datasize`. | |
| 40 | + | |
| 41 | +## Testing Strategy | |
| 42 | + | |
| 43 | +- Differential: function starts list byte-identical between afs-ld and `ld` on every staging fixture. | |
| 44 | +- Data-in-code: fixture with a jump table input; entries survive linking with correct remapped offsets. | |
| 45 | +- Empty output: fixtures with no functions produce an empty `LC_FUNCTION_STARTS` blob (open question: verify whether ld still emits a terminator byte in this case) and no `LC_DATA_IN_CODE` when no input had data-in-code. | |
| 46 | + | |
| 47 | +## Definition of Done | |
| 48 | + | |
| 49 | +- LC_FUNCTION_STARTS parity with `ld` on every staging fixture. | |
| 50 | +- LC_DATA_IN_CODE entries remapped correctly across linking. | |
| 51 | +- Both blobs placed in the right `__LINKEDIT` slot per Sprint 14. | |
.docs/sprints/sprint17.md added @@ -0,0 +1,90 @@ | ||
| 1 | +# Sprint 17: Unwind Info | |
| 2 | + | |
| 3 | +## Prerequisites | |
| 4 | +Sprints 9, 10, 11 — atoms, output layout, reloc application. | |
| 5 | + | |
| 6 | +## Goals | |
| 7 | +Synthesize `__TEXT,__unwind_info` from per-function `__compact_unwind` records that afs-as already emits. Pass `__TEXT,__eh_frame` through as the DWARF fallback path. Without this sprint, `_Unwind_Backtrace`, C++ exceptions, and some system panics produce garbage or abort. | |
| 8 | + | |
| 9 | +## Deliverables | |
| 10 | + | |
| 11 | +### 1. Input: afs-as `__compact_unwind` | |
| 12 | + | |
| 13 | +afs-as emits one 32-byte record per function: | |
| 14 | +``` | |
| 15 | +uint64 function_start; // reloc to function atom | |
| 16 | +uint32 code_len; | |
| 17 | +uint32 encoding; // ARM64 compact-unwind encoding (UNWIND_ARM64_MODE_*) | |
| 18 | +uint64 personality; // reloc to personality function or 0 | |
| 19 | +uint64 lsda; // reloc to LSDA or 0 | |
| 20 | +``` | |
| 21 | + | |
| 22 | +ARM64 encoding nibbles (`UNWIND_ARM64_MODE_MASK = 0x0F000000`): | |
| 23 | +- `UNWIND_ARM64_MODE_FRAMELESS = 0x02000000` (+ stack size in 16-byte units) | |
| 24 | +- `UNWIND_ARM64_MODE_DWARF = 0x03000000` (falls back to __eh_frame) | |
| 25 | +- `UNWIND_ARM64_MODE_FRAME = 0x04000000` (+ saved-register bitfield for x19-x28, d8-d15) | |
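A sketch of decoding the mode nibble, using the mask values from `<mach-o/compact_unwind_encoding.h>`. `UnwindMode` and `decode_mode` are hypothetical names, and the register-pair flags are left as a raw bitfield rather than decoded per register.

```rust
pub const UNWIND_ARM64_MODE_MASK: u32 = 0x0F00_0000;
pub const UNWIND_ARM64_MODE_FRAMELESS: u32 = 0x0200_0000;
pub const UNWIND_ARM64_MODE_DWARF: u32 = 0x0300_0000;
pub const UNWIND_ARM64_MODE_FRAME: u32 = 0x0400_0000;
const FRAMELESS_STACK_SIZE_MASK: u32 = 0x00FF_F000;
const DWARF_SECTION_OFFSET_MASK: u32 = 0x00FF_FFFF;
const FRAME_PAIR_MASK: u32 = 0x0000_0FFF;

#[derive(Debug, PartialEq)]
pub enum UnwindMode {
    /// Stack size in bytes (the field stores it in 16-byte units).
    Frameless { stack_size: u32 },
    /// Offset of the FDE inside `__eh_frame`.
    Dwarf { fde_offset: u32 },
    /// Raw saved-register-pair flags (x19/x20 .. x27/x28, d8/d9 .. d14/d15).
    Frame { saved_pairs: u32 },
    /// Any mode this sketch does not model; carries the raw encoding.
    Unknown(u32),
}

pub fn decode_mode(encoding: u32) -> UnwindMode {
    match encoding & UNWIND_ARM64_MODE_MASK {
        UNWIND_ARM64_MODE_FRAMELESS => UnwindMode::Frameless {
            stack_size: ((encoding & FRAMELESS_STACK_SIZE_MASK) >> 12) * 16,
        },
        UNWIND_ARM64_MODE_DWARF => UnwindMode::Dwarf {
            fde_offset: encoding & DWARF_SECTION_OFFSET_MASK,
        },
        UNWIND_ARM64_MODE_FRAME => UnwindMode::Frame {
            saved_pairs: encoding & FRAME_PAIR_MASK,
        },
        _ => UnwindMode::Unknown(encoding),
    }
}
```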
| 26 | + | |
| 27 | +### 2. `__TEXT,__unwind_info` layout | |
| 28 | + | |
| 29 | +Complex, but structured. Header: | |
| 30 | +``` | |
| 31 | +uint32 version; // UNWIND_SECTION_VERSION = 1 | |
| 32 | +uint32 common_encodings_offset; | |
| 33 | +uint32 common_encodings_count; | |
| 34 | +uint32 personalities_offset; | |
| 35 | +uint32 personalities_count; | |
| 36 | +uint32 indices_offset; // first-level index | |
| 37 | +uint32 indices_count; | |
| 38 | +``` | |
| 39 | + | |
| 40 | +Then three variable-length arrays: | |
| 41 | + | |
| 42 | +1. **Common encodings**: up to 127 most-frequent 32-bit encodings. Lookups in per-page tables reference them by index instead of repeating the 32-bit value. | |
| 43 | +2. **Personalities**: array of 32-bit offsets from the mach header to the pointer slot (typically a GOT entry) that holds each personality function's address (usually `___gxx_personality_v0` or `___objc_personality_v0`). | |
| 44 | +3. **First-level indices**: `(function_offset, second_level_page_offset, lsda_index_offset)` triples, one per page worth of functions. Last entry is a sentinel with function_offset = text section end. | |
| 45 | + | |
| 46 | +Then **second-level pages** — one per first-level index — each starting with a kind tag: | |
| 47 | +- `UNWIND_SECOND_LEVEL_REGULAR = 2`: array of `(function_offset, encoding)` pairs. Larger, uncompressed. | |
| 48 | +- `UNWIND_SECOND_LEVEL_COMPRESSED = 3`: 32-bit entries packing a 24-bit function delta with an 8-bit encoding_index; an index below `common_encodings_count` selects a common encoding, and a larger index selects entry `encoding_index - common_encodings_count` of a page-local encodings array. | |
| 49 | + | |
| 50 | +Plus an **LSDA table**: sorted `(function_offset, lsda_offset)` pairs for functions that have LSDAs. | |
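A sketch of the compressed second-level entry and its encoding lookup, assuming the usual split of a 24-bit function delta in the low bits and an 8-bit encoding index in the high byte. Both function names are hypothetical.

```rust
/// Pack one compressed entry: function delta in bits 0..23, encoding index in
/// bits 24..31. Returns None when the delta does not fit, which is the signal
/// to start a new page (or fall back to a regular page).
pub fn pack_compressed_entry(func_delta: u32, encoding_index: u8) -> Option<u32> {
    if func_delta >= (1 << 24) {
        return None;
    }
    Some(((encoding_index as u32) << 24) | func_delta)
}

/// Resolve an encoding index: indices below `common.len()` select a common
/// encoding, larger ones select from the page-local array.
pub fn resolve_encoding(index: u8, common: &[u32], page_local: &[u32]) -> Option<u32> {
    let i = index as usize;
    if i < common.len() {
        Some(common[i])
    } else {
        page_local.get(i - common.len()).copied()
    }
}
```

A walker for the validation pass could call `resolve_encoding` on every entry and treat `None` as a malformed page.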
| 51 | + | |
| 52 | +### 3. Construction algorithm | |
| 53 | + | |
| 54 | +1. Gather input `__compact_unwind` records; remap function_start to output VM. | |
| 55 | +2. Sort by function_start. | |
| 56 | +3. Tally encoding frequencies; pick top 127 as common encodings. | |
| 57 | +4. Walk the sorted list, packing up to `pageSize/4 - header` records per compressed page (ld uses 4 KB pages here, ~1020 entries max). | |
| 58 | +5. Records with DWARF encoding: defer to `__eh_frame` — we still emit them but dyld's unwinder will follow the encoding to DWARF. | |
| 59 | +6. Write the three top arrays, the per-page second-level tables, and the LSDA index. | |
| 60 | + | |
| 61 | +### 4. `__eh_frame` pass-through | |
| 62 | + | |
| 63 | +afs-as emits DWARF CIEs and FDEs in `__TEXT,__eh_frame`. We don't re-encode: we concatenate per-input `__eh_frame` contents, adjust personality function references (`ARM64_RELOC_SUBTRACTOR` pairs), and emit. CIE deduplication is a nice-to-have (Sprint 30); for this sprint we pass through without deduping. | |
| 64 | + | |
| 65 | +### 5. Coordination with dead-strip | |
| 66 | + | |
| 67 | +If Sprint 23 removes a function, its compact-unwind and eh_frame records must go too. Compact-unwind atoms are already `parent_of` linked to function atoms from Sprint 9; Sprint 23 walks that link. Eh_frame FDEs similarly reference their function via a SUBTRACTOR pair — when the function atom dies, strip the FDE. | |
| 68 | + | |
| 69 | +### 6. Correctness validation | |
| 70 | + | |
| 71 | +After writing, we can re-read our own `__unwind_info` (write a tiny walker) and verify: | |
| 72 | +- Every function in `__text` is represented (either in compact form or with DWARF encoding). | |
| 73 | +- Every personality/LSDA reference resolves to a valid VM address. | |
| 74 | +- First-level index is strictly ascending. | |
| 75 | +- Second-level compressed encoding_index < common_count + page-local encoding count (and always fits in 8 bits). | |
| 76 | + | |
| 77 | +## Testing Strategy | |
| 78 | + | |
| 79 | +- Fixture from afs-as emitting a function with prologue (`stp x29, x30, [sp, #-16]!`) → compact-unwind FRAME encoding. Parity byte-level with `ld`. | |
| 80 | +- Function with no prologue (leaf) → FRAMELESS encoding with size 0. | |
| 81 | +- Function that falls back to DWARF → DWARF encoding, associated FDE survives in `__eh_frame`. | |
| 82 | +- C++ fixture compiled by clang (C interop via iso_c_binding is in-scope for armfortas) — personality + LSDA survive; `try/throw/catch` still works when executed. | |
| 83 | +- Backtrace test: program calls `backtrace()` from execinfo.h; output lists the right function names. | |
| 84 | + | |
| 85 | +## Definition of Done | |
| 86 | + | |
| 87 | +- `__unwind_info` byte-identical to `ld` on staging fixtures with prologues, leaves, and DWARF fallbacks. | |
| 88 | +- `__eh_frame` passthrough preserves all FDEs with correct personality/LSDA references. | |
| 89 | +- Backtraces produce real symbolic names on a binary linked by afs-ld. | |
| 90 | +- C++ exceptions (via clang input) unwind correctly when linked by afs-ld. | |
.docs/sprints/sprint18.md added @@ -0,0 +1,71 @@ | ||
| 1 | +# Sprint 18: HELLO WORLD MILESTONE (Executable) | |
| 2 | + | |
| 3 | +## Prerequisites | |
| 4 | +Sprints 0–17 — full read, resolve, atomize, layout, reloc-apply, dyld-info, function-starts, unwind pipeline. | |
| 5 | + | |
| 6 | +## Goals | |
| 7 | +Produce a runnable arm64 PIE executable. afs-ld takes `hello.o + libarmfortas_rt.a + libSystem.tbd + -e _main` and emits a binary that, when executed on an M-series Mac, prints "Hello, World!". Not a demo — an exit criterion for everything that came before. | |
| 8 | + | |
| 9 | +## Deliverables | |
| 10 | + | |
| 11 | +### 1. Staging fixture | |
| 12 | +`tests/corpus/hello/` contains: | |
| 13 | +- `hello.f90`: trivial Fortran program with `print *, "Hello, World!"`. | |
| 14 | +- `hello.o`: assembled by afs-as from armfortas's `hello.s` output. | |
| 15 | +- Expected output on run: `Hello, World!\n` (Fortran print adds a leading blank on some paths — match what armfortas currently produces). | |
| 16 | + | |
| 17 | +### 2. End-to-end link | |
| 18 | +Invocation: | |
| 19 | +``` | |
| 20 | +afs-ld hello.o libarmfortas_rt.a \ | |
| 21 | + -lSystem -syslibroot "$(xcrun --show-sdk-path)" \ | |
| 22 | + -e _main -no_uuid -platform_version macos 11.0 14.0 \ | |
| 23 | + -o hello | |
| 24 | +``` | |
| 25 | + | |
| 26 | +Expected: | |
| 27 | +- Output file passes `file hello` → `Mach-O 64-bit executable arm64`. | |
| 28 | +- `otool -lV hello` accepts without errors and shows `LC_MAIN`, expected segments, dylibs. | |
| 29 | +- `codesign -dv hello` reports no signature (Sprint 22 adds ad-hoc signing). Apple Silicon refuses to execute an unsigned arm64 binary, so `./hello` is killed until the file is signed, for example with `codesign -s - hello`. Document this caveat. | |
| 30 | +- `./hello` (after the Sprint 22 signature) prints the expected string and exits 0. | |
| 31 | + | |
| 32 | +### 3. Differential gate | |
| 33 | +`tests/hello_world.rs`: | |
| 34 | +```rust | |
| 35 | +let ours = link_with_afs_ld(&inputs, &args); | |
| 36 | +let theirs = link_with_system_ld(&inputs, &args); | |
| 37 | +let diff = diff_macho(&ours, &theirs); | |
| 38 | +assert!(diff.critical.is_empty(), "critical diffs: {:#?}", diff.critical); | |
| 39 | +``` | |
| 40 | + | |
| 41 | +Allowed tolerated diffs: | |
| 42 | +- UUID bytes (we emit zero with `-no_uuid`, ld may or may not — both should honor `-no_uuid`). | |
| 43 | +- String table ordering within a partition as long as every symbol resolves to the same address. | |
| 44 | +- LC_DYLD_INFO vs LC_DYLD_CHAINED_FIXUPS if defaults disagree — gate with the same `-fixup_chains` / `-no_fixup_chains` flag. | |
| 45 | + | |
| 46 | +### 4. Load-command `otool -lV` parity | |
| 47 | +Run `otool -lV` on both outputs; diff should be empty after normalizing absolute file offsets (ld and afs-ld may interleave `__LINKEDIT` regions slightly differently — document and justify any remaining diffs). | |
| 48 | + | |
| 49 | +### 5. Execution gate (requires Sprint 22's ad-hoc signing, but staged here) | |
| 50 | +Two cases: | |
| 51 | +- Unsigned path: `./hello` fails with "killed: 9" (correct Apple Silicon behavior); `codesign -s - hello && ./hello` works. | |
| 52 | +- Once Sprint 22 lands, afs-ld's own output is signed ad-hoc and `./hello` works directly. | |
| 53 | + | |
| 54 | +This sprint declares success as soon as `codesign -s - hello && ./hello` prints the expected string. | |
| 55 | + | |
| 56 | +### 6. Audit | |
| 57 | +Brutal audit after Sprint 18. Same rules as armfortas audits: | |
| 58 | +- No "placeholder" or "stub" explanations that hide wrong output. | |
| 59 | +- Test every claim. Wrong output from a linker is critical. | |
| 60 | +- If the binary runs but produces extra or missing newlines, investigate — don't rationalize. | |
| 61 | + | |
| 62 | +## Testing Strategy | |
| 63 | +- `tests/hello_world.rs` runs the differential gate. | |
| 64 | +- `tests/hello_world_run.rs` executes the binary (gated to run only on a local or CI macOS arm64 host) and asserts stdout. | |
| 65 | +- Regression fixtures: any hello-world variant that once broke gets its own test. | |
| 66 | + | |
| 67 | +## Definition of Done | |
| 68 | +- `tests/hello_world.rs` passes — zero critical diffs vs `ld`. | |
| 69 | +- Binary runs and produces correct output (after `codesign -s - hello` pre-Sprint-22). | |
| 70 | +- `otool -lV` diff empty after documented normalizations. | |
| 71 | +- Audit passes. | |
.docs/sprints/sprint18_5.md added @@ -0,0 +1,55 @@ | ||
| 1 | +# Sprint 18.5: HELLO LIBRARY MILESTONE (Dylib) | |
| 2 | + | |
| 3 | +## Prerequisites | |
| 4 | +Sprint 18 — executable path works end-to-end. | |
| 5 | + | |
| 6 | +## Goals | |
| 7 | +Validate `MH_DYLIB` output end-to-end. afs-ld emits a dylib that `dlopen`/`dlsym` can load and a minimal C or Fortran harness can call into. Proves that every dylib-specific decision in Sprints 10–17 is actually correct. | |
| 8 | + | |
| 9 | +## Deliverables | |
| 10 | + | |
| 11 | +### 1. Staging fixture | |
| 12 | +`tests/corpus/hello_library/`: | |
| 13 | +- `foo.f90`: module exporting a single `foo_add(a, b) -> c` interoperable procedure. | |
| 14 | +- `foo.o`: assembled object. | |
| 15 | +- `caller.c`: `int main() { void *h = dlopen("./libfoo.dylib", RTLD_NOW); int (*f)(int,int) = dlsym(h, "foo_add"); printf("%d\n", f(2, 3)); }`. | |
| 16 | +- Expected runtime output: `5`. | |
| 17 | + | |
| 18 | +### 2. Dylib link invocation | |
| 19 | +``` | |
| 20 | +afs-ld -dylib foo.o libarmfortas_rt.a \ | |
| 21 | + -lSystem -syslibroot "$(xcrun --show-sdk-path)" \ | |
| 22 | + -install_name @rpath/libfoo.dylib \ | |
| 23 | + -compatibility_version 1.0 -current_version 1.0.0 \ | |
| 24 | + -no_uuid -platform_version macos 11.0 14.0 \ | |
| 25 | + -o libfoo.dylib | |
| 26 | +``` | |
| 27 | + | |
| 28 | +### 3. Validation checklist | |
| 29 | +- `file libfoo.dylib` → `Mach-O 64-bit dynamically linked shared library arm64`. | |
| 30 | +- `otool -lV libfoo.dylib` shows `LC_ID_DYLIB` with install-name `@rpath/libfoo.dylib`, `current_version = 1.0.0`, `compat_version = 1.0.0`. No `__PAGEZERO`, no `LC_MAIN`. | |
| 31 | +- Export trie contains `_foo_add` (Fortran name-mangled per armfortas convention, or `bind(C)` if used). | |
| 32 | +- `dlopen("./libfoo.dylib", RTLD_NOW)` returns non-null. | |
| 33 | +- `dlsym(h, "foo_add")` returns the function address. | |
| 34 | +- Calling `foo_add(2, 3)` returns `5`. | |
| 35 | + | |
| 36 | +### 4. Differential | |
| 37 | +Link the same inputs with `ld -dylib` and afs-ld. Compare load commands, export trie contents, indirect symbol table. Tolerated diffs same as Sprint 18. | |
| 38 | + | |
| 39 | +### 5. `-rpath` interaction | |
| 40 | +`caller` is linked against `libfoo.dylib` with `@rpath` indirection. Sprint 19 will wire the full `-rpath` CLI; this sprint validates that an install-name of `@rpath/...` is correctly emitted and that the binary's `DYLD_PRINT_LIBRARIES=1` output shows dyld resolving `@rpath` via the `LC_RPATH` entries of the caller. | |
| 41 | + | |
| 42 | +### 6. `dladdr`/`backtrace` in the dylib | |
| 43 | +When `foo_add` calls into `libarmfortas_rt`, `backtrace_symbols()` should return readable names — proves the symbol table partitioning for a dylib is correct and the Sprint 17 unwind info is wired into dyld's unwinder. | |
| 44 | + | |
| 45 | +## Testing Strategy | |
| 46 | +- `tests/hello_library.rs`: builds and `dlopen`s the dylib, calls `foo_add`, asserts the return. | |
| 47 | +- `tests/hello_library_nm.rs`: runs `nm -gU` (external, defined symbols only) on the dylib, asserts `_foo_add` appears as external. | |
| 48 | +- Differential harness with `ld -dylib` on the same inputs. | |
| 49 | + | |
| 50 | +## Definition of Done | |
| 51 | +- `libfoo.dylib` loads via `dlopen` and exports `_foo_add`. | |
| 52 | +- Calling the exported function returns the expected value. | |
| 53 | +- Differential parity with `ld` on the staging fixture. | |
| 54 | +- `otool -lV` shows correct dylib-specific load commands with no `__PAGEZERO` or `LC_MAIN`. | |
| 55 | +- Post-sprint audit passes. | |
.docs/sprints/sprint19.md added @@ -0,0 +1,146 @@ | ||
| 1 | +# Sprint 19: CLI Surface + Diagnostics (`-map`, `-why_live`) | |
| 2 | + | |
| 3 | +## Prerequisites | |
| 4 | +Sprints 18–18.5 — executable and dylib milestones reached. | |
| 5 | + | |
| 6 | +## Goals | |
| 7 | +Full `ld`-compatible CLI surface for the flags armfortas already uses and those fortsh is likely to invoke. Includes the two diagnostics surfaces we declared launch-blocking: `-map` (text link map) and `-why_live` (dead-strip reason chain). No polish-tier deferral. | |
| 8 | + | |
| 9 | +## Deliverables | |
| 10 | + | |
| 11 | +### 1. Full flag list | |
| 12 | +Recognized: | |
| 13 | + | |
| 14 | +**Inputs/outputs**: | |
| 15 | +- `-o <path>` | |
| 16 | +- positional `<input>` | |
| 17 | +- `-l<name>` / `-l <name>` | |
| 18 | +- `-L <dir>` | |
| 19 | +- `-framework <name>` | |
| 20 | +- `-weak_framework <name>` | |
| 21 | +- `-force_load <archive>` | |
| 22 | +- `-all_load` | |
| 23 | +- `-ObjC` (skippable no-op unless inputs have ObjC — they won't from armfortas today) | |
| 24 | + | |
| 25 | +**Target & platform**: | |
| 26 | +- `-arch arm64` | |
| 27 | +- `-syslibroot <path>` | |
| 28 | +- `-platform_version macos <min> <sdk>` | |
| 29 | + | |
| 30 | +**Output kind**: | |
| 31 | +- (default) executable | |
| 32 | +- `-dylib` | |
| 33 | +- `-r` (relocatable — deferred; errors for now) | |
| 34 | +- `-bundle` (deferred; errors for now) | |
| 35 | + | |
| 36 | +**Entry & startup**: | |
| 37 | +- `-e <symbol>` (default `_main` for executables) | |
| 38 | + | |
| 39 | +**Runtime search paths**: | |
| 40 | +- `-rpath <path>` | |
| 41 | +- `-install_name <path>` (dylib only) | |
| 42 | +- `-compatibility_version <v>` (dylib only) | |
| 43 | +- `-current_version <v>` (dylib only) | |
| 44 | + | |
| 45 | +**Symbol handling**: | |
| 46 | +- `-undefined <error|warning|suppress|dynamic_lookup>` (default: error) | |
| 47 | +- `-exported_symbols_list <file>` | |
| 48 | +- `-unexported_symbols_list <file>` | |
| 49 | +- `-exported_symbol <sym>` | |
| 50 | +- `-unexported_symbol <sym>` | |
| 51 | +- `-x` (strip locals) | |
| 52 | +- `-S` (strip debug) | |
| 53 | + | |
| 54 | +**Layout & output metadata**: | |
| 55 | +- `-no_uuid` | |
| 56 | +- `-dead_strip` (gates Sprint 23 pass) | |
| 57 | +- `-icf=safe` / `-icf=none` (gates Sprint 24 pass) | |
| 58 | +- `-fixup_chains` / `-no_fixup_chains` | |
| 59 | + | |
| 60 | +**Diagnostics**: | |
| 61 | +- `-map <path>`: emit text link map | |
| 62 | +- `-why_live <symbol>`: print dead-strip reason chain | |
| 63 | +- `-t` / `-trace`: print input file paths as they are loaded | |
| 64 | +- `-v` / `--version` | |
| 65 | +- `-h` / `--help` | |
| 66 | + | |
| 67 | +**Passthrough / compat**: | |
| 68 | +- `-Wl,<comma-separated>`: normalize into separate flags. | |
| 69 | +- Unknown flags: error with a did-you-mean suggestion (the nearest flag in the list above, within Levenshtein distance 3). | |
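The did-you-mean lookup can be sketched as a plain edit-distance scan over the known flags. This is a minimal illustration, not the actual `args.rs` code: the names `levenshtein` and `suggest_flag` are hypothetical, and the tie-break (first listed flag wins) is an assumption chosen to keep output deterministic.

```rust
/// Classic two-row Levenshtein edit distance over bytes.
fn levenshtein(a: &str, b: &str) -> usize {
    let a = a.as_bytes();
    let b = b.as_bytes();
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    let mut cur = vec![0usize; b.len() + 1];
    for i in 1..=a.len() {
        cur[0] = i;
        for j in 1..=b.len() {
            let cost = if a[i - 1] == b[j - 1] { 0 } else { 1 };
            cur[j] = (prev[j] + 1).min(cur[j - 1] + 1).min(prev[j - 1] + cost);
        }
        // The row just computed becomes the previous row for the next byte of `a`.
        std::mem::swap(&mut prev, &mut cur);
    }
    prev[b.len()]
}

/// Returns the closest known flag within `max_dist` edits, if any.
/// Ties go to the flag listed first (hypothetical policy).
fn suggest_flag<'a>(unknown: &str, known: &[&'a str], max_dist: usize) -> Option<&'a str> {
    let mut best: Option<(usize, &'a str)> = None;
    for &flag in known {
        let d = levenshtein(unknown, flag);
        if d <= max_dist && best.map_or(true, |(bd, _)| d < bd) {
            best = Some((d, flag));
        }
    }
    best.map(|(_, f)| f)
}
```

With `max_dist = 3`, a typo such as `-dead_stirp` resolves to `-dead_strip`, while an unrelated string yields no suggestion.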
| 70 | + | |
| 71 | +### 2. `-map <path>` output format | |
| 72 | +Text file mirroring ld's link map: | |
| 73 | +``` | |
| 74 | +# Path: <output path> | |
| 75 | +# Arch: arm64 | |
| 76 | +# Object files: | |
| 77 | +[ 0] linker synthesized | |
| 78 | +[ 1] hello.o | |
| 79 | +[ 2] libarmfortas_rt.a(runtime.o) | |
| 80 | +... | |
| 81 | + | |
| 82 | +# Sections: | |
| 83 | +# Address Size Segment Section | |
| 84 | +0x100003f9c 0x00000018 __TEXT __text | |
| 85 | +0x100003fb4 0x00000024 __TEXT __stubs | |
| 86 | +... | |
| 87 | + | |
| 88 | +# Symbols: | |
| 89 | +# Address Size File Name | |
| 90 | +0x100003f9c 0x00000014 [ 1] _main | |
| 91 | +0x100003fb0 0x00000004 [ 1] .alt_entry_of_main | |
| 92 | +0x100003fb4 0x0000000c linker _printf (stub) | |
| 93 | +... | |
| 94 | + | |
| 95 | +# Dead stripped: | |
| 96 | +<file> <symbol> | |
| 97 | +[ 2] _unused_helper | |
| 98 | +``` | |
| 99 | + | |
| 100 | +### 3. `-why_live <symbol>` output | |
| 101 | +Walks the live-edge graph from Sprint 23 backward from the named symbol to a root: | |
| 102 | +``` | |
| 103 | +_main is live because: | |
| 104 | + _main is in -e _main (GC root) | |
| 105 | + | |
| 106 | +_afs_write_char is live because: | |
| 107 | + _afs_write_char is reachable from _afs_print | |
| 108 | + _afs_print is reachable from _main | |
| 109 | + _main is in -e _main (GC root) | |
| 110 | +``` | |
| 111 | + | |
| 112 | +When `-why_live` is passed without `-dead_strip`, no liveness data exists, and the diagnostic explains that `-dead_strip` was not requested. Multiple `-why_live` symbols may be given. | |
| 113 | + | |
| 114 | +### 4. Exported / unexported symbols files | |
| 115 | +Each line of the file is a symbol name. Wildcards: `*` matches any chars, `?` matches one. Used to adjust the final export trie and to mark symbols `N_PEXT` when `-unexported_symbol` is set. Consumed by Sprint 14's symbol-table construction (which this sprint amends). | |
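The wildcard rule above (`*` matches any run of characters, `?` matches exactly one) can be sketched with a small backtracking matcher. The function name `wildcard_match` is hypothetical, and matching on bytes rather than Unicode scalars is an assumption that suits ASCII symbol names.

```rust
/// Matches `name` against `pattern`, where `*` matches any run of bytes
/// (including none) and `?` matches exactly one byte.
fn wildcard_match(pattern: &str, name: &str) -> bool {
    let (p, n) = (pattern.as_bytes(), name.as_bytes());
    let (mut pi, mut ni) = (0usize, 0usize);
    // Position of the most recent `*` and the name index it currently covers up to.
    let mut star: Option<(usize, usize)> = None;
    while ni < n.len() {
        if pi < p.len() && p[pi] == b'*' {
            star = Some((pi, ni));
            pi += 1;
        } else if pi < p.len() && (p[pi] == b'?' || p[pi] == n[ni]) {
            pi += 1;
            ni += 1;
        } else if let Some((sp, sn)) = star {
            // Backtrack: let the last `*` absorb one more byte of the name.
            pi = sp + 1;
            ni = sn + 1;
            star = Some((sp, sn + 1));
        } else {
            return false;
        }
    }
    // Any trailing `*` can match the empty string.
    while pi < p.len() && p[pi] == b'*' {
        pi += 1;
    }
    pi == p.len()
}
```

A line such as `_afs_*` in an exports file would then select every runtime symbol with that prefix.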
| 116 | + | |
| 117 | +### 5. CLI parser | |
| 118 | +`afs-ld/src/args.rs`: | |
| 119 | +- Hand-rolled, no clap. | |
| 120 | +- Streaming argv scan. | |
| 121 | +- Error messages cite the flag, the invalid value, and the expected format. | |
| 122 | +- `-Wl,-map,foo.txt` normalized to `-map foo.txt` before dispatch. | |
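The `-Wl,` normalization step can be sketched as a pre-pass over argv that runs before flag dispatch. The function name `expand_wl` is hypothetical, and dropping empty segments (as in `-Wl,,-map`) is an assumption rather than documented behavior.

```rust
/// Expands every `-Wl,a,b,c` argument into separate `a`, `b`, `c` arguments
/// and passes all other arguments through unchanged.
fn expand_wl(args: &[String]) -> Vec<String> {
    let mut out = Vec::with_capacity(args.len());
    for arg in args {
        match arg.strip_prefix("-Wl,") {
            // Split on commas; empty segments are dropped (assumed policy).
            Some(rest) => out.extend(
                rest.split(',')
                    .filter(|s| !s.is_empty())
                    .map(String::from),
            ),
            None => out.push(arg.clone()),
        }
    }
    out
}
```

Running the pre-pass first means the rest of the parser only ever sees the plain `-map foo.txt` form.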
| 123 | + | |
| 124 | +### 6. `-t` trace output | |
| 125 | +As each input file is loaded: | |
| 126 | +``` | |
| 127 | +afs-ld: loading hello.o | |
| 128 | +afs-ld: loading libarmfortas_rt.a | |
| 129 | +afs-ld: loading libarmfortas_rt.a(io.o) | |
| 130 | +afs-ld: loading /usr/lib/libSystem.tbd | |
| 131 | +``` | |
| 132 | + | |
| 133 | +## Testing Strategy | |
| 134 | +- One test per flag: parse the flag, assert `LinkOptions` field set correctly. | |
| 135 | +- Error-message snapshot tests for every invalid-flag case. | |
| 136 | +- `-map` differential: produce a map, compare its shape (not exact bytes) to `ld`'s map on hello-world. | |
| 137 | +- `-why_live _main` produces a root-only explanation. | |
| 138 | +- `-why_live <transitively-reachable-sym>` produces a chain. | |
| 139 | +- `-Wl,-map,foo.txt` parsed identically to `-map foo.txt`. | |
| 140 | + | |
| 141 | +## Definition of Done | |
| 142 | +- Every flag listed above parses and wires correctly. | |
| 143 | +- `-map` produces human-readable output covering object files, sections, symbols, dead-stripped entries. | |
| 144 | +- `-why_live` produces a coherent chain on fixtures with dead-strip enabled. | |
| 145 | +- Unknown-flag errors include a did-you-mean suggestion. | |
| 146 | +- CLI surface passes a snapshot test against the `--help` output. | |
.docs/sprints/sprint20.md added @@ -0,0 +1,72 @@ | ||
| 1 | +# Sprint 20: Driver Swap | |
| 2 | + | |
| 3 | +## Prerequisites | |
| 4 | +Sprints 18–19 — hello-world works, CLI complete. | |
| 5 | + | |
| 6 | +## Goals | |
| 7 | +Wire afs-ld into the armfortas driver. Initially gated behind `AFS_LD=1`. After the Sprint 27 parity gate, flip the default. Keep a fallback to system `ld` for at least one sprint after default-on. | |
| 8 | + | |
| 9 | +## Deliverables | |
| 10 | + | |
| 11 | +### 1. Driver change site | |
| 12 | +`armfortas/src/driver/mod.rs`: | |
| 13 | + | |
| 14 | +Two call sites ship today: | |
| 15 | +- Single-file link path at lines 497–530. | |
| 16 | +- Multi-file link path at lines 533–565. | |
| 17 | + | |
| 18 | +Both build a `Command::new("ld")`. Refactor to: | |
| 19 | + | |
| 20 | +```rust | |
| 21 | +fn linker_command() -> (Command, &'static str) { | |
| 22 | + match env::var("AFS_LD").as_deref() { | |
| 23 | + Ok("1") | Ok("true") => (Command::new(find_afs_ld()), "afs-ld"), | |
| 24 | + _ => (Command::new("ld"), "system ld"), | |
| 25 | + } | |
| 26 | +} | |
| 27 | +``` | |
| 28 | + | |
| 29 | +`find_afs_ld()`: | |
| 30 | +1. `AFS_LD_PATH` env var (full path to the binary). | |
| 31 | +2. `<workspace>/target/debug/afs-ld`. | |
| 32 | +3. `<workspace>/target/release/afs-ld`. | |
| 33 | +4. `PATH` lookup. | |
| 34 | + | |
| 35 | +Failure produces a clear diagnostic pointing to the env var and build commands. | |
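The lookup order above can be sketched as a pure function over candidate paths, which keeps it unit-testable without touching the filesystem. Everything here is illustrative: `pick_afs_ld` and its parameters are hypothetical names, and the existence check is injected as a closure.

```rust
use std::path::{Path, PathBuf};

/// Returns the first candidate that `exists` accepts, in the documented order:
/// explicit env path, debug build, release build, then each PATH directory.
fn pick_afs_ld(
    env_path: Option<&Path>,
    workspace: &Path,
    path_dirs: &[PathBuf],
    exists: impl Fn(&Path) -> bool,
) -> Option<PathBuf> {
    let mut candidates: Vec<PathBuf> = Vec::new();
    if let Some(p) = env_path {
        candidates.push(p.to_path_buf());
    }
    candidates.push(workspace.join("target/debug/afs-ld"));
    candidates.push(workspace.join("target/release/afs-ld"));
    candidates.extend(path_dirs.iter().map(|d| d.join("afs-ld")));
    candidates.into_iter().find(|c| exists(c.as_path()))
}
```

In the driver, the inputs would plausibly come from `std::env::var_os("AFS_LD_PATH")`, `std::env::split_paths` over `PATH`, and `Path::is_file` as the predicate; a `None` result is where the diagnostic described above would be raised.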
| 36 | + | |
| 37 | +### 2. Testing harness update | |
| 38 | +`armfortas/src/testing.rs:871-908` (used by integration tests) — same refactor. Integration tests respect `AFS_LD=1`. | |
| 39 | + | |
| 40 | +### 3. Flag pass-through parity | |
| 41 | +The driver today builds a fixed command. After this sprint it still builds the same command — afs-ld accepts the same flags. Differential in practice: | |
| 42 | + | |
| 43 | +``` | |
| 44 | +# System ld | |
| 45 | +ld hello.o libarmfortas_rt.a -lSystem -no_uuid -syslibroot <SDK> -e _main -o hello | |
| 46 | + | |
| 47 | +# afs-ld (same args) | |
| 48 | +afs-ld hello.o libarmfortas_rt.a -lSystem -no_uuid -syslibroot <SDK> -e _main -o hello | |
| 49 | +``` | |
| 50 | + | |
| 51 | +### 4. Fallback semantics | |
| 52 | +If afs-ld errors, produce a driver-level diagnostic that cites afs-ld's exit status and stderr, plus a hint to retry with `AFS_LD=0`. Do **not** automatically retry with system `ld` — silently falling back masks real bugs. | |
| 53 | + | |
| 54 | +### 5. Test coverage on both paths | |
| 55 | +`cargo test --workspace` runs all integration tests twice: once with `AFS_LD=0` (baseline), once with `AFS_LD=1`. Divergence is a test failure. This is the CI gate for afs-ld adoption. | |
| 56 | + | |
| 57 | +### 6. Preserving `-no_uuid` determinism | |
| 58 | +Driver passes `-no_uuid` today. Verify afs-ld honors it byte-identically: same inputs under same seed produce the same output (no process-id, no timestamp, no random padding). | |
| 59 | + | |
| 60 | +### 7. Docs | |
| 61 | +Update `armfortas/CLAUDE.md` with a note about `AFS_LD=1` and how to enable/disable. `armfortas/README.md` if/when it mentions linking. | |
| 62 | + | |
| 63 | +## Testing Strategy | |
| 64 | +- `tests/linker_swap.rs`: runs hello-world both ways, asserts the binaries differ only in tolerated regions. | |
| 65 | +- Integration suite under `AFS_LD=1`: every existing integration test must pass. This is the gate. | |
| 66 | +- Failure-path test: a deliberately-broken link (missing symbol); both paths produce an error, not a segfault. | |
| 67 | + | |
| 68 | +## Definition of Done | |
| 69 | +- `AFS_LD=1 cargo test --workspace` passes every test green. | |
| 70 | +- Driver refactor lands on a branch that can be rolled back cleanly by flipping the env-var default. | |
| 71 | +- Diagnostic quality on afs-ld failures matches or exceeds system ld's. | |
| 72 | +- No silent fallback — afs-ld failures surface loudly. | |
.docs/sprints/sprint21.md added @@ -0,0 +1,70 @@ | ||
| 1 | +# Sprint 21: Runtime Archive Linking | |
| 2 | + | |
| 3 | +## Prerequisites | |
| 4 | +Sprints 4, 8, 20 — archives, resolution, driver swap. | |
| 5 | + | |
| 6 | +## Goals | |
| 7 | +Link `libarmfortas_rt.a` end-to-end into every armfortas-produced binary. The full parent integration suite runs green under `AFS_LD=1`. This is the sprint that proves afs-ld can do real work on real armfortas output, not just staging fixtures. | |
| 8 | + | |
| 9 | +## Deliverables | |
| 10 | + | |
| 11 | +### 1. Runtime inventory | |
| 12 | +Walk `libarmfortas_rt.a` and catalog every exported symbol. Groups: | |
| 13 | + | |
| 14 | +- Lifecycle: `_afs_program_init`, `_afs_program_finalize`. | |
| 15 | +- Array: `_afs_allocate_array`, `_afs_deallocate_array`, `_afs_check_bounds`, `_afs_fill_*`, `_afs_array_add_*`, `_afs_array_mul_*`, `_afs_transpose_*`, `_afs_matmul_*`. | |
| 16 | +- I/O: `_afs_write_*`, `_afs_read_*`, `_afs_open_file`, `_afs_close_file`, `_afs_flush`, formatted/unformatted/list-directed helpers. | |
| 17 | +- String: `_afs_string_*` (allocatable, deferred-length variants). | |
| 18 | +- Math intrinsics: `_afs_i128_*`, `_afs_cmplx_*`, etc. | |
| 19 | +- System: `_afs_stop`, `_afs_command_argument_*`, `_afs_get_environment`. | |
| 20 | + | |
| 21 | +This inventory gets persisted as `tests/runtime_symbols.txt` so the test suite can assert no symbol silently disappears between runtime rebuilds. | |
| 22 | + | |
| 23 | +### 2. Archive fetch verification | |
| 24 | +Verify Sprint 4's archive reader pulls members correctly: | |
| 25 | + | |
| 26 | +- Parse `libarmfortas_rt.a`, walk its BSD symbol index. | |
| 27 | +- For each inventory symbol, look up the defining member. | |
| 28 | +- Cross-check: `nm` on each member file agrees. | |
| 29 | + | |
| 30 | +### 3. End-to-end integration tests | |
| 31 | +Run the parent `armfortas/tests/` suite under `AFS_LD=1`: | |
| 32 | + | |
| 33 | +- `tests/run_programs.rs`: full program tests (array, I/O, derived types, modules). | |
| 34 | +- `tests/multifile.rs`: multi-object link; module globals resolved correctly. | |
| 35 | +- `tests/i128_cross_object.rs`: 128-bit integer interop across C/Fortran boundary. | |
| 36 | +- `tests/fortsh_module_graph.rs`: complex USE chains. | |
| 37 | +- `tests/incremental.rs`: incremental module dependency tracking. | |
| 38 | + | |
| 39 | +Every failure here is a real linker bug — triage and fix. | |
| 40 | + | |
| 41 | +### 4. Known gotchas to verify | |
| 42 | +Based on afs-as + runtime history, pay particular attention to: | |
| 43 | + | |
| 44 | +- **`_afs_program_init` lifecycle wrapping**: the driver-synthesized `_main` at `src/driver/mod.rs:371-392` calls `_afs_program_init` → user prog → `_afs_program_finalize`. All three must resolve. | |
| 45 | +- **I/O state machine in `libarmfortas_rt`**: references `_errno`, `_malloc`, `_free` from libSystem; `_afs_io_state` as a BSS symbol. Verify `__DATA,__bss` placement matches ld. | |
| 46 | +- **Common symbols**: some module globals come through as common; verify promotion to BSS (Sprint 7 matrix). | |
| 47 | +- **Weak refs** to optional runtime hooks (if any). Check that unresolved weak refs evaluate to 0 and the call-site null-check dispatches correctly. | |
| 48 | + | |
| 49 | +### 5. Archive-ordering edge cases | |
| 50 | +Some programs pull symbols that create new undefined references in the middle of resolution. Fixed-point loop from Sprint 8 handles this; verify it holds for the runtime archive with its ~40 members. | |
| 51 | + | |
| 52 | +### 6. Diagnostic polish | |
| 53 | +Every runtime symbol that fails to resolve must produce a diagnostic that: | |
| 54 | +- Names the missing symbol. | |
| 55 | +- Cites at least one referrer in user code. | |
| 56 | +- Hints at the rebuild path (`cargo build -p armfortas-rt`). | |
| 57 | + | |
| 58 | +### 7. Regression corpus | |
| 59 | +Any test that once broke becomes a permanent corpus entry. The afs-ld `tests/runtime_*.rs` pattern mirrors armfortas's. | |
| 60 | + | |
| 61 | +## Testing Strategy | |
| 62 | +- Full parent integration suite under `AFS_LD=1` (this is the primary deliverable). | |
| 63 | +- `tests/runtime_inventory.rs`: assert every symbol in `tests/runtime_symbols.txt` is still defined by the current `libarmfortas_rt.a`. | |
| 64 | +- Archive fetch coverage: every inventoried symbol pulls its member exactly once. | |
| 65 | + | |
| 66 | +## Definition of Done | |
| 67 | +- `AFS_LD=1 cargo test -p armfortas` green. | |
| 68 | +- Runtime inventory stable and asserted. | |
| 69 | +- No silent skips; every test that was passing under `AFS_LD=0` passes under `AFS_LD=1`. | |
| 70 | +- Diagnostics on missing runtime symbols are actionable. | |
.docs/sprints/sprint22.md added @@ -0,0 +1,107 @@ | ||
| 1 | +# Sprint 22: Code Signature (Ad-Hoc) | |
| 2 | + | |
| 3 | +## Prerequisites | |
| 4 | +Sprints 10, 14 — segment layout and `__LINKEDIT` finalized. | |
| 5 | + | |
| 6 | +## Goals | |
| 7 | +Emit a valid ad-hoc `LC_CODE_SIGNATURE`. On macOS 11+, arm64 binaries without a signature are killed by the kernel at exec time; without this sprint every afs-ld output requires manual `codesign -s -` to run. This is existence-blocking, not optional. | |
| 8 | + | |
| 9 | +## Deliverables | |
| 10 | + | |
| 11 | +### 1. SuperBlob structure | |
| 12 | +`LC_CODE_SIGNATURE` points to a code-signing blob in `__LINKEDIT`: | |
| 13 | + | |
| 14 | +``` | |
| 15 | +struct CS_SuperBlob { | |
| 16 | + u32 magic; // CSMAGIC_EMBEDDED_SIGNATURE = 0xfade0cc0 | |
| 17 | + u32 length; // total blob size including this header | |
| 18 | + u32 count; // number of index entries | |
| 19 | + // then count × CS_BlobIndex { u32 type; u32 offset; } | |
| 20 | + // then each blob inlined at its offset | |
| 21 | +} | |
| 22 | +``` | |
| 23 | + | |
| 24 | +For ad-hoc: two inner blobs — the CodeDirectory and an empty Requirements set. Entitlements absent. | |
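A minimal sketch of SuperBlob assembly follows. Code-signing blobs are stored big-endian regardless of the host, so every field goes through `to_be_bytes`. The function name `build_super_blob` is hypothetical, and the sketch assumes the inner blobs are already serialized and need no alignment padding between them.

```rust
const CSMAGIC_EMBEDDED_SIGNATURE: u32 = 0xfade_0cc0;

/// Wraps already-serialized inner blobs in a SuperBlob.
/// `blobs` pairs each index `type` with that blob's bytes.
fn build_super_blob(blobs: &[(u32, Vec<u8>)]) -> Vec<u8> {
    // Header (magic, length, count) plus one 8-byte index entry per blob.
    let header_len = 12 + 8 * blobs.len();
    let total: usize = header_len + blobs.iter().map(|(_, b)| b.len()).sum::<usize>();

    let mut out = Vec::with_capacity(total);
    out.extend_from_slice(&CSMAGIC_EMBEDDED_SIGNATURE.to_be_bytes());
    out.extend_from_slice(&(total as u32).to_be_bytes());
    out.extend_from_slice(&(blobs.len() as u32).to_be_bytes());

    // Index entries: each offset is measured from the start of the SuperBlob.
    let mut offset = header_len;
    for (ty, blob) in blobs {
        out.extend_from_slice(&ty.to_be_bytes());
        out.extend_from_slice(&(offset as u32).to_be_bytes());
        offset += blob.len();
    }
    // Blob bodies follow the index, in the same order.
    for (_, blob) in blobs {
        out.extend_from_slice(blob);
    }
    out
}
```

For the ad-hoc case the caller would pass the CodeDirectory (slot type 0) and the empty Requirements blob (slot type 2).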
| 25 | + | |
| 26 | +### 2. CodeDirectory | |
| 27 | +``` | |
| 28 | +struct CS_CodeDirectory { | |
| 29 | + u32 magic; // CSMAGIC_CODEDIRECTORY = 0xfade0c02 | |
| 30 | + u32 length; | |
| 31 | + u32 version; // 0x20400 (modern) | |
| 32 | + u32 flags; // CS_ADHOC = 0x2 | |
| 33 | + u32 hashOffset; // from this struct's start to the main hash array | |
| 34 | + u32 identOffset; // to the null-terminated identifier | |
| 35 | + u32 nSpecialSlots; // 2 (info plist + requirements); 0 when absent | |
| 36 | + u32 nCodeSlots; // pages × 1 | |
| 37 | + u32 codeLimit; // file offset of end-of-signed-data | |
| 38 | + u8 hashSize; // 32 for SHA-256 | |
| 39 | + u8 hashType; // CS_HASHTYPE_SHA256 = 2 | |
| 40 | + u8 platform; // 0 for no platform binary | |
| 41 | + u8 pageSize; // log2(page) = 12 for 4 KiB | |
| 42 | + u32 spare2; | |
| 43 | + u32 scatterOffset; // 0 | |
| 44 | + u32 teamOffset; // 0 | |
| 45 | + u32 spare3; | |
| 46 | + u64 codeLimit64; // 0 unless codeLimit > 4 GiB | |
| 47 | + u64 execSegBase; | |
| 48 | + u64 execSegLimit; | |
| 49 | + u64 execSegFlags; // CS_EXECSEG_MAIN_BINARY = 0x1 for executables | |
| 50 | +} | |
| 51 | +``` | |
| 52 | + | |
| 53 | +After the struct: | |
| 54 | +- `identifier\0` — we use the install-name for dylibs, the output binary basename for executables. | |
| 55 | +- Special slots (filled with zeroes for ad-hoc): `nSpecialSlots` × 32 bytes of zero before the main slots. | |
| 56 | +- Main slots: one SHA-256 per 4 KiB page of signed data, stored in file order. | |
| 57 | + | |
| 58 | +### 3. Signed data range | |
| 59 | +Signing covers file bytes `[0, codeLimit)`. `codeLimit` is set to `LC_CODE_SIGNATURE.dataoff` (the start of the signature blob itself). The signature never signs itself. | |
| 60 | + | |
| 61 | +### 4. Page hashing | |
| 62 | +- Page size 4 KiB (not 16 KiB — code-signing pageSize is independent of VM page size). | |
| 63 | +- SHA-256 over each 4 KiB chunk; the final chunk is hashed over whatever bytes remain (not padded). | |
| 64 | +- Hashes concatenated at `hashOffset`. | |
| 65 | + | |
| 66 | +### 5. Requirements blob | |
| 67 | +``` | |
| 68 | +struct CS_RequirementsBlob { | |
| 69 | + u32 magic; // CSMAGIC_REQUIREMENTS = 0xfade0c01 | |
| 70 | + u32 length; // 12 | |
| 71 | + u32 count; // 0 | |
| 72 | +} | |
| 73 | +``` | |
| 74 | + | |
| 75 | +Minimum legal empty requirements. | |
| 76 | + | |
| 77 | +### 6. SHA-256 implementation | |
| 78 | +Hand-rolled. Standard 64-round SHA-256 from FIPS 180-4. ~200 LoC in Rust. Unit-tested against known vectors (empty string, "abc", "a"×1M, NIST test vectors). | |
| 79 | + | |
| 80 | +### 7. Layout recomputation | |
| 81 | +The signature blob size depends on `codeLimit`, which depends on its own file offset. Two-pass approach: | |
| 82 | + | |
| 83 | +1. Compute layout excluding signature; know exactly the signature's start offset. | |
| 84 | +2. Compute signature size = SuperBlob header + indices + CodeDirectory header + ident + special slots + (ceil(codeLimit / 4096) × 32 bytes hash) + Requirements blob. | |
| 85 | +3. Reserve that many bytes at the signature offset. | |
| 86 | +4. Write all other data. | |
| 87 | +5. Hash pages and write signature in place. | |
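Step 2's size formula can be sketched as straight arithmetic. The 88-byte figure is the sum of the fixed-size fields in the `CS_CodeDirectory` layout shown earlier; the function name and the absence of alignment padding between blobs are assumptions of this sketch.

```rust
const CS_PAGE_SIZE: u64 = 4096;
const HASH_SIZE: u64 = 32; // SHA-256 digest length
const CODE_DIRECTORY_HEADER: u64 = 88; // fixed fields of the version 0x20400 struct
const SUPER_BLOB_HEADER: u64 = 12 + 2 * 8; // magic/length/count + two index entries
const REQUIREMENTS_BLOB: u64 = 12; // empty requirements set

/// Total bytes to reserve for the ad-hoc signature, given the signed range
/// `[0, code_limit)`, the identifier string, and the special-slot count.
fn signature_size(code_limit: u64, ident: &str, n_special_slots: u64) -> u64 {
    // One hash per 4 KiB page; a partial final page still gets a slot.
    let n_code_slots = (code_limit + CS_PAGE_SIZE - 1) / CS_PAGE_SIZE;
    let code_directory = CODE_DIRECTORY_HEADER
        + ident.len() as u64
        + 1 // NUL terminator after the identifier
        + (n_special_slots + n_code_slots) * HASH_SIZE;
    SUPER_BLOB_HEADER + code_directory + REQUIREMENTS_BLOB
}
```

Because the result depends only on `code_limit` and the identifier length, it can be computed before any page is hashed, which is what makes reserving the space in step 3 possible.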
| 88 | + | |
| 89 | +### 8. Platform binary opt-out | |
| 90 | +Ad-hoc signatures from third-party tools are not platform binaries; `platform = 0`, `flags = CS_ADHOC`. | |
| 91 | + | |
| 92 | +### 9. Validation | |
| 93 | +After writing, validate with `codesign -v <binary>`. Expected: zero output, exit 0. | |
| 94 | + | |
| 95 | +## Testing Strategy | |
| 96 | +- Sign hello-world, then `./hello` (no manual `codesign` step). Expect "Hello, World!". | |
| 97 | +- `codesign -dv <binary>` reports `Signature=adhoc`. | |
| 98 | +- SHA-256 unit tests against NIST vectors. | |
| 99 | +- Mutate a single byte in the binary post-sign, re-run, expect kernel kill ("Killed: 9") — proves the signature is real and the kernel is checking it. | |
| 100 | +- Dylib ad-hoc sign: `dlopen` of `libfoo.dylib` from Sprint 18.5 still works. | |
| 101 | + | |
| 102 | +## Definition of Done | |
| 103 | +- `./hello` runs directly (no `codesign -s -` needed). | |
| 104 | +- `codesign -v` clean on every afs-ld output. | |
| 105 | +- Dylib loading via `dlopen` works on Sprint 18.5 fixtures. | |
| 106 | +- SHA-256 passes NIST test vectors. | |
| 107 | +- Tampering detected by the kernel (confidence check). | |
.docs/sprints/sprint23.md added @@ -0,0 +1,74 @@ | ||
| 1 | +# Sprint 23: Dead Strip (`-dead_strip`) | |
| 2 | + | |
| 3 | +## Prerequisites | |
| 4 | +Sprint 9 — atomization; Sprint 19 — `-dead_strip` CLI flag. | |
| 5 | + | |
| 6 | +## Goals | |
| 7 | +Implement `-dead_strip`: remove atoms that are unreachable from the GC roots. Populates the side table that Sprint 19's `-why_live` diagnostic reads. | |
| 8 | + | |
| 9 | +## Deliverables | |
| 10 | + | |
| 11 | +### 1. GC roots | |
| 12 | +Live set seeded from: | |
| 13 | +- The entry-point symbol's atom (executable only). | |
| 14 | +- Every exported symbol's atom (governed by `-exported_symbols_list` / defaults). | |
| 15 | +- Every atom with `NoDeadStrip` flag (from `N_NO_DEAD_STRIP` or `.no_dead_strip`). | |
| 16 | +- Every atom referenced by a `LC_RPATH` / `LC_LOAD_DYLIB` side-channel (usually none, but keep the hook). | |
| 17 | +- `_dyld_stub_binder` — always referenced by `__stub_helper`. | |
| 18 | +- Compact-unwind and eh_frame atoms are **not** roots; they are transitively live via their `parent_of` link to a function atom (Sprint 9). | |
| 19 | +- Personality functions referenced by any live unwind FDE. | |
| 20 | + | |
| 21 | +### 2. Mark-live traversal | |
| 22 | +Worklist algorithm over the atom reference graph: | |
| 23 | + | |
| 24 | +```rust | |
| 25 | +pub fn mark_live(layout: &mut Layout, roots: &[AtomId]) { | |
| 26 | + let mut worklist: Vec<(AtomId, LiveCause)> = roots.iter().map(|&r| (r, LiveCause::Root)).collect(); // LiveCause variant names are illustrative | |
| 27 | + while let Some((atom_id, cause)) = worklist.pop() { | |
| 28 | + if layout.atoms[atom_id].live { continue; } | |
| 29 | + layout.atoms[atom_id].live = true; | |
| 30 | + record_why_live(atom_id, cause); // for -why_live diagnostic; first cause wins | |
| 31 | + for referent in atom_references(&layout.atoms[atom_id]) { | |
| 32 | + worklist.push((referent, LiveCause::ReachableFrom(atom_id))); | |
| 33 | + } | |
| 34 | + } | |
| 35 | +} | |
| 36 | +``` | |
| 37 | + | |
| 38 | +`atom_references` pulls from the reloc list of the atom, yielding the atom-id each reloc points to. | |
| 39 | + | |
| 40 | +### 3. Transitive rules | |
| 41 | +- A live function makes its `__compact_unwind` record live via the `parent_of` link from Sprint 9. | |
| 42 | +- A live function makes its `__eh_frame` FDE live (FDE references function via SUBTRACTOR pair — the FDE's life is parasitic on the function's). | |
| 43 | +- A live personality function is reached via the unwind records that reference it. | |
| 44 | +- LSDA blobs are live iff their owning function is. | |
| 45 | + | |
| 46 | +### 4. Dead atoms purged | |
| 47 | +After mark-live, walk the atom list; atoms with `live = false` are removed from the output. Output sections shrink accordingly; Sprint 10's layout pass re-runs to compact addresses. | |
| 48 | + | |
| 49 | +### 5. `-why_live` side table | |
| 50 | +Every time an atom is marked live, record its cause: "GC root (entry point)", "reachable from <other atom>", "reachable via unwind parent", etc. Stored as a `HashMap<AtomId, LiveCause>`. Sprint 19's `-why_live` walks back from a named symbol through this map to a root. | |
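The backward walk through the side table can be sketched as follows. The `LiveCause` variants and the `why_live_chain` name are hypothetical; the real table may carry richer causes (unwind parent, export root, and so on). The cycle guard is defensive, since causes recorded at first marking should form a tree.

```rust
use std::collections::HashMap;

type AtomId = usize;

#[derive(Debug, Clone, PartialEq)]
enum LiveCause {
    /// The atom is itself a GC root.
    Root,
    /// The atom was first reached from this other atom.
    ReachableFrom(AtomId),
}

/// Follows recorded causes from `start` back to a root.
/// Returns the chain `[start, parent, ..., root]`, or `None` if `start`
/// was never marked live (or the table is malformed).
fn why_live_chain(start: AtomId, causes: &HashMap<AtomId, LiveCause>) -> Option<Vec<AtomId>> {
    let mut chain = vec![start];
    let mut cur = start;
    loop {
        match causes.get(&cur)? {
            LiveCause::Root => return Some(chain),
            LiveCause::ReachableFrom(parent) => {
                // Defensive: a repeated atom would mean a cyclic table.
                if chain.contains(parent) {
                    return None;
                }
                chain.push(*parent);
                cur = *parent;
            }
        }
    }
}
```

A `None` result maps naturally onto the "not live (dead-stripped)" message for symbols that never entered the table.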
| 51 | + | |
| 52 | +### 6. Interactions with ICF (Sprint 24) | |
| 53 | +Dead-strip runs before ICF. ICF folds only live atoms. Dead atoms never get folded. | |
| 54 | + | |
| 55 | +### 7. Stripped-symbol enumeration | |
| 56 | +The `-map` output (Sprint 19) lists dead-stripped symbols. Populate this list from the purged atoms. | |
| 57 | + | |
| 58 | +### 8. Default behavior | |
| 59 | +`-dead_strip` is opt-in. Without it, no GC runs, no `-why_live` data is produced (the diagnostic explains this). | |
| 60 | + | |
| 61 | +## Testing Strategy | |
| 62 | +- Fixture: two functions, one called by `_main`, one unreferenced. With `-dead_strip`: output's `__text` shrinks, and `nm -n` lists `_main` and the called function but not the unreferenced one. | |
| 63 | +- `-why_live _main` on the fixture: "root". | |
| 64 | +- `-why_live _called_fn`: "reachable from _main; _main is root". | |
| 65 | +- `-why_live _unreferenced_fn`: "not live (dead-stripped)". | |
| 66 | +- `N_NO_DEAD_STRIP` fixture: that symbol survives even without a reference. | |
| 67 | +- Weak def reference: weak-coalesced winner survives, losers dead-stripped. | |
| 68 | +- Differential: output's `__text` length matches `ld -dead_strip` on 10+ fixtures. | |
| 69 | + | |
| 70 | +## Definition of Done | |
| 71 | +- Live set matches `ld -dead_strip` on a corpus of 15+ fixtures. | |
| 72 | +- `-why_live` produces coherent chains. | |
| 73 | +- Compact-unwind and eh_frame entries correctly follow their parent functions. | |
| 74 | +- fortsh builds under `-dead_strip` (Sprint 29 retest). | |
.docs/sprints/sprint24.md added @@ -0,0 +1,65 @@ | ||
| 1 | +# Sprint 24: ICF (`-icf=safe`) | |
| 2 | + | |
| 3 | +## Prerequisites | |
| 4 | +Sprints 9, 23 — atoms, dead-strip done. | |
| 5 | + | |
| 6 | +## Goals | |
| 7 | +Identical Code Folding: merge atoms whose content, relocations, and attributes are identical, so one copy survives. Safe mode only — respects address-taken symbols so function-pointer equality is preserved. | |
| 8 | + | |
| 9 | +## Deliverables | |
| 10 | + | |
| 11 | +### 1. Safety rules | |
| 12 | + | |
| 13 | +Atom is **foldable** iff: | |
| 14 | +- It lives in `__TEXT,__text`, `__TEXT,__cstring`, `__TEXT,__literal16`, `__TEXT,__const`, or `__DATA_CONST,__const`. | |
| 15 | +- It has no `NoDeadStrip` flag. | |
| 16 | +- It has no `AddressTaken` annotation (see below). | |
| 17 | +- Its primary symbol is not exported via `-exported_symbols_list`. | |
| 18 | + | |
| 19 | +### 2. Address-taken detection | |
| 20 | + | |
| 21 | +During reloc application (Sprint 11 + this sprint's prep pass), mark any atom referenced via `Unsigned` or `PointerToGot` or any `GOT_LOAD_*` as `AddressTaken`. Function pointers, vtable slots, RTTI entries, anything comparable via `==` — all take the address, so they must not be folded. | |
| 22 | + | |
| 23 | +Only `Branch26` references are considered "fold-safe" on their own; direct calls don't create address equality. | |
| 24 | + | |
| 25 | +### 3. Segregation algorithm | |
| 26 | + | |
| 27 | +Two-phase refinement (lld/MachO style): | |
| 28 | + | |
| 29 | +1. **Initial hash**: for each foldable atom, compute a 64-bit hash over (size, flags, content bytes, reloc list normalized to (kind, addend, target class)). | |
| 30 | +2. **Bucket**: group atoms by hash. Single-element buckets are unique and not foldable. | |
| 31 | +3. **Refine**: within each bucket, check pairwise equality of content + relocs. Use equivalence classes — two atoms equivalent iff every referenced atom is equivalent (fixed-point refinement, à la Hopcroft's DFA minimization). | |
| 32 | +4. **Repeat step 3 until no class splits**. Converges in log(N) iterations in practice. | |
| 33 | +5. **Fold**: within each final equivalence class, pick a winner by input command-line order (stable, deterministic); redirect every other atom's `owner_symbol` to the winner; erase the loser atoms. | |
| 34 | + | |
| 35 | +### 4. Reloc patching | |
| 36 | + | |
| 37 | +After folding, some relocations now point at folded-away atoms. Rewrite every such reloc to target the winner. Layout recomputed; sizes shrink. | |
| 38 | + | |
| 39 | +### 5. `-icf=none` path | |
| 40 | + | |
| 41 | +Default off. `-icf=safe` enables. `-icf=all` (unsafe, doesn't respect AddressTaken) is not implemented — emits a diagnostic. | |
| 42 | + | |
| 43 | +### 6. Interaction with `-map` and `-why_live` | |
| 44 | + | |
| 45 | +- `-map` reports folded symbols with their winner: `_foo folded to _bar`. | |
| 46 | +- `-why_live` of a folded symbol reports the winner's live-chain. | |
| 47 | + | |
| 48 | +### 7. Determinism | |
| 49 | + | |
| 50 | +Winner selection by command-line order is stable across invocations. Hash function is seeded with a fixed seed — same inputs, same winners, same output bytes. | |
| 51 | + | |
| 52 | +## Testing Strategy | |
| 53 | + | |
| 54 | +- Fixture: two functions with byte-identical code and relocs. `-icf=safe` folds them so only one copy of the code remains; `nm` shows both symbol aliases at the same address. | |
| 55 | +- Address-taken fixture: two identical functions, one has its address stored in a table. Fold does not happen for the address-taken one; both survive. | |
| 56 | +- String dedup: `__cstring` atoms with identical content folded. | |
| 57 | +- Large-scale: 50+ near-identical functions, fold correctness verified by running each alias. | |
| 58 | +- Differential: folded output's `__text` size matches `ld -icf=safe` within 1%. | |
| 59 | + | |
| 60 | +## Definition of Done | |
| 61 | + | |
| 62 | +- `-icf=safe` reduces text size on a constructed fixture. | |
| 63 | +- Address-taken functions never folded. | |
| 64 | +- All aliases point to the folded winner at runtime. | |
| 65 | +- fortsh builds with `-icf=safe` and behaves identically to its `-icf=none` counterpart. | |
.docs/sprints/sprint25.md added @@ -0,0 +1,74 @@ | ||
| 1 | +# Sprint 25: LOH Relaxation | |
| 2 | + | |
| 3 | +## Prerequisites | |
| 4 | +Sprints 1, 11 — LOH hints preserved, reloc application in place. | |
| 5 | + | |
| 6 | +## Goals | |
| 7 | +Apply Linker Optimization Hints that afs-as emits. LOHs describe safe peephole opportunities — replace ADRP+ADD with a single ADR when the target is in ±1 MB, or nop-out an unnecessary LDR. Until this sprint, LOHs are preserved as-is and no relaxation happens. | |
| 8 | + | |
| 9 | +## Deliverables | |
| 10 | + | |
| 11 | +### 1. LOH kinds afs-as emits | |
| 12 | + | |
| 13 | +From the existing `.loh` directives: | |
| 14 | +- `AdrpAdd`: ADRP + ADD (compute address). If target is in ±1 MB, replace with ADR + nop. | |
| 15 | +- `AdrpLdr`: ADRP + LDR (load through page/pageoff). If target is in ±1 MB and 4-byte aligned, can become nop + LDR (literal), since the literal form is PC-relative and needs no base register. | |
| 16 | +- `AdrpLdrGot`: ADRP + LDR from GOT. If the GOT entry's content is a local symbol with known address, can skip the GOT load entirely (ADR + nop). | |
| 17 | +- `AdrpLdrGotLdr`: ADRP + LDR (GOT) + LDR (final). Similar combo: if GOT can be skipped, fold into a direct LDR. | |
| 18 | + | |
| 19 | +### 2. LOH data format | |
| 20 | + | |
| 21 | +`LC_LINKER_OPTIMIZATION_HINT` points to a ULEB128 stream: | |
| 22 | +``` | |
| 23 | +uleb128 kind | |
| 24 | +uleb128 argcount | |
| 25 | +uleb128 arg1 // file offset | |
| 26 | +uleb128 arg2 | |
| 27 | +... | |
| 28 | +``` | |
| 29 | + | |
| 30 | +Kind constants: `LOH_ARM64_ADRP_ADRP=1`, `LOH_ARM64_ADRP_LDR=2`, `LOH_ARM64_ADRP_ADD_LDR=3`, `LOH_ARM64_ADRP_LDR_GOT_LDR=4`, `LOH_ARM64_ADRP_ADD_STR=5`, `LOH_ARM64_ADRP_LDR_GOT_STR=6`, `LOH_ARM64_ADRP_ADD=7`, `LOH_ARM64_ADRP_LDR_GOT=8`. | |
| 31 | + | |
| 32 | +afs-as emits kinds 3, 7, 8 (and 4 for load-from-pointer-in-GOT). | |
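Decoding the hint stream can be sketched with a small ULEB128 reader. The function names are hypothetical, and treating a zero `kind` as trailing padding that ends the stream is an assumption of this sketch.

```rust
/// Reads one ULEB128 value from `data` starting at `*pos`, advancing `*pos`.
/// Returns `None` on truncated input or a value wider than 64 bits.
fn read_uleb128(data: &[u8], pos: &mut usize) -> Option<u64> {
    let mut value = 0u64;
    let mut shift = 0u32;
    loop {
        let byte = *data.get(*pos)?;
        *pos += 1;
        if shift >= 64 {
            return None;
        }
        value |= u64::from(byte & 0x7f) << shift;
        if byte & 0x80 == 0 {
            return Some(value);
        }
        shift += 7;
    }
}

/// Parses the whole stream into `(kind, args)` pairs.
fn parse_loh(data: &[u8]) -> Option<Vec<(u64, Vec<u64>)>> {
    let mut pos = 0;
    let mut hints = Vec::new();
    while pos < data.len() {
        let kind = read_uleb128(data, &mut pos)?;
        if kind == 0 {
            break; // assumed: zero bytes are alignment padding at the end
        }
        let argc = read_uleb128(data, &mut pos)?;
        let mut args = Vec::new();
        for _ in 0..argc {
            args.push(read_uleb128(data, &mut pos)?);
        }
        hints.push((kind, args));
    }
    Some(hints)
}
```

Each returned pair carries the numeric kind from the table above and the instruction offsets the hint refers to.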
| 33 | + | |
| 34 | +### 3. Relaxation pass | |
| 35 | + | |
| 36 | +Runs **after** reloc application (Sprint 11) and **before** LOH re-serialization. For each LOH: | |
| 37 | + | |
| 38 | +1. Parse the referenced instructions. | |
| 39 | +2. Compute if the symbolic target fits the tighter encoding. | |
| 40 | +3. If yes: rewrite the instruction bytes; mark the LOH as "applied" so it can be either dropped or left in the output (ld's convention varies; we match). | |
| 41 | +4. If no: leave untouched. | |
| 42 | + | |
| 43 | +Safety: every relaxation is reversible (the original instructions still achieve the goal), and no relaxation narrows a correctly-wider encoding to an incorrect one. Extensive testing required. | |
| 44 | + | |
| 45 | +### 4. Safe conservatism | |
| 46 | + | |
| 47 | +A LOH is only applied when the target fits **strictly** within the narrower range. Off-by-one guard: recompute both original and relaxed forms, assert the relaxed form computes the same address. | |
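The off-by-one guard can be sketched for the ADR case: encode the relaxed instruction only when the signed 21-bit byte offset fits, then decode it again and compare against the intended target. Function names are hypothetical; the bit layout used is the A64 ADR encoding (`immlo` in bits 30:29, `immhi` in bits 23:5, `Rd` in bits 4:0).

```rust
/// Encodes `ADR Xd, target` for an instruction at `pc`, or returns `None`
/// when the offset is outside the signed 21-bit range [-2^20, 2^20 - 1].
fn encode_adr(rd: u32, pc: u64, target: u64) -> Option<u32> {
    let delta = target.wrapping_sub(pc) as i64;
    if !(-(1i64 << 20)..(1i64 << 20)).contains(&delta) {
        return None;
    }
    let imm = (delta as u32) & 0x1f_ffff; // 21-bit two's complement
    let immlo = imm & 0x3;
    let immhi = imm >> 2;
    Some(0x1000_0000 | (immlo << 29) | (immhi << 5) | (rd & 0x1f))
}

/// Recomputes the address an encoded ADR at `pc` would produce.
fn adr_target(insn: u32, pc: u64) -> u64 {
    let immlo = (insn >> 29) & 0x3;
    let immhi = (insn >> 5) & 0x7_ffff;
    let imm21 = (immhi << 2) | immlo;
    // Sign-extend from 21 bits by shifting through an i32.
    let delta = ((imm21 << 11) as i32) >> 11;
    pc.wrapping_add(delta as i64 as u64)
}
```

A relaxation pass would apply the rewrite only when `encode_adr` returns `Some(insn)` and `adr_target(insn, pc)` equals the address the original ADRP + ADD pair computed.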
| 48 | + | |
| 49 | +### 5. Cross-LOH interaction | |
| 50 | + | |
| 51 | +A single instruction can participate in multiple LOHs (one as a member of an ADRP+ADD, another as a member of an ADRP+ADD+LDR). Apply LOHs in a deterministic order — longest first — and skip any LOH whose instructions have already been rewritten. | |
| 52 | + | |
| 53 | +### 6. Output LOH preservation | |
| 54 | + | |
| 55 | +ld emits `LC_LINKER_OPTIMIZATION_HINT` in the output even for executables (for the benefit of post-processing tools). We match: emit a new LOH blob with the final state (applied LOHs marked or omitted). | |
| 56 | + | |
| 57 | +### 7. `-no_loh` flag | |
| 58 | + | |
| 59 | +For debugging: `-no_loh` skips relaxation. Helpful when comparing output against a known-bad state. | |
| 60 | + | |
| 61 | +## Testing Strategy | |
| 62 | + | |
| 63 | +- Synthetic fixture with a function whose `__data` target is just under 1 MiB away (inside ADR's ±1 MiB reach) → AdrpAdd LOH applies; disassembly shows ADR + nop. | |
| 64 | +- Fixture with a target 10 MB away → LOH does not apply; ADRP + ADD preserved. | |
| 65 | +- Differential: afs-ld output byte-matches `ld` output for both fixtures. | |
| 66 | +- Runtime test: the relaxed code still dereferences the right address. | |
| 67 | +- Random-fuzz: 100 fixtures with various target distances; every relaxation verified against recomputed ground truth. | |
| 68 | + | |
| 69 | +## Definition of Done | |
| 70 | + | |
| 71 | +- LOH relaxation applied correctly on fixtures that fit. | |
| 72 | +- LOH skipped correctly on fixtures that don't. | |
| 73 | +- Byte-parity with `ld` on a representative corpus. | |
| 74 | +- `-no_loh` flag produces a cleanly un-relaxed output. | |
.docs/sprints/sprint26.mdadded@@ -0,0 +1,87 @@ | ||
| 1 | +# Sprint 26: Thunks for Out-of-Range Branches | |
| 2 | + | |
| 3 | +## Prerequisites | |
| 4 | +Sprint 11 — BRANCH26 reloc application; Sprint 10 — layout pass. | |
| 5 | + | |
| 6 | +## Goals | |
| 7 | +When a `BRANCH26` target is more than ±128 MiB from the caller, insert a branch island that can reach any 4-byte-aligned address within ±4 GiB via an ADRP + ADD + BR sequence. Required for very large executables; fortsh is not that large today, but a statically-linked Fortran program with full-intrinsic binding could be. | |
| 8 | + | |
| 9 | +## Deliverables | |
| 10 | + | |
| 11 | +### 1. Detection pass | |
| 12 | + | |
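| 13 | +After layout (Sprint 10) assigns addresses, walk every BRANCH26 reloc. Compute `distance = (target - P) >> 2` as a signed word count. BRANCH26 encodes a signed 26-bit immediate, so the reachable range is `-0x0200_0000 ≤ distance ≤ 0x01FF_FFFF` (2^25 = 33,554,432 words × 4 bytes = 128 MiB backward, 128 MiB minus 4 bytes forward). Flag any reloc outside that range as needing a thunk. | |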
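A minimal sketch of the detection predicate, assuming the standard signed 26-bit branch immediate. The function names are illustrative and not taken from the afs-ld source.

```rust
/// Signed word displacement a BRANCH26 at address `p` needs to reach `target`.
pub fn branch26_words(p: u64, target: u64) -> i64 {
    (target.wrapping_sub(p) as i64) >> 2
}

/// True when the displacement cannot be encoded in the signed 26-bit
/// immediate of B/BL and the reloc must be routed through a thunk.
pub fn needs_thunk(p: u64, target: u64) -> bool {
    const MIN: i64 = -(1 << 25); // 128 MiB backward
    const MAX: i64 = (1 << 25) - 1; // 128 MiB - 4 bytes forward
    !(MIN..=MAX).contains(&branch26_words(p, target))
}
```

The asymmetry matters at the boundary: a target exactly 128 MiB behind the call site is reachable, one exactly 128 MiB ahead is not.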
| 14 | + | |
| 15 | +### 2. Thunk synthesis | |
| 16 | + | |
| 17 | +One thunk atom per (output segment, distant target). A thunk is 12 bytes: | |
| 18 | + | |
| 19 | +``` | |
| 20 | +ADRP x16, <target>@PAGE | |
| 21 | +ADD x16, x16, <target>@PAGEOFF | |
| 22 | +BR x16 | |
| 23 | +``` | |
| 24 | + | |
| 25 | +Or, if the target is a Defined with a known value at link time: | |
| 26 | + | |
| 27 | +``` | |
| 28 | +ADRP x16, #<computed_page> | |
| 29 | +ADD x16, x16, #<pageoff> | |
| 30 | +BR x16 | |
| 31 | +``` | |
| 32 | + | |
| 33 | +The ADRP+ADD form reaches any address within ±4 GiB of the thunk, which is far more than any realistic image needs. | |
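For the link-time-known case, the three instruction words can be computed directly. The sketch below assumes the standard A64 encodings of `ADRP`, `ADD (immediate, 64-bit)` and `BR`; `encode_thunk` is a hypothetical helper name.

```rust
const X16: u32 = 16; // intra-procedure-call scratch register used by the thunk

/// Encode the 12-byte thunk placed at `thunk_addr` that jumps to `target`.
/// Returns `None` if the 4 KiB page delta does not fit ADRP's signed
/// 21-bit immediate (roughly ±4 GiB).
pub fn encode_thunk(thunk_addr: u64, target: u64) -> Option<[u32; 3]> {
    let page_delta =
        ((target & !0xFFF) as i64).wrapping_sub((thunk_addr & !0xFFF) as i64) >> 12;
    if !(-(1i64 << 20)..(1i64 << 20)).contains(&page_delta) {
        return None;
    }
    let imm = (page_delta as u32) & 0x1F_FFFF; // 21-bit two's complement
    // ADRP x16, <page>: immlo in bits 30:29, immhi in bits 23:5.
    let adrp = 0x9000_0000 | ((imm & 0x3) << 29) | ((imm >> 2) << 5) | X16;
    // ADD x16, x16, #<pageoff>: imm12 in bits 21:10.
    let pageoff = (target & 0xFFF) as u32;
    let add = 0x9100_0000 | (pageoff << 10) | (X16 << 5) | X16;
    // BR x16.
    let br = 0xD61F_0000 | (X16 << 5);
    Some([adrp, add, br])
}
```

The ADRP is the first word of the thunk, so the thunk's own start address is the PC used for the page delta.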
| 34 | + | |
| 35 | +### 3. Placement | |
| 36 | + | |
| 37 | +Thunks land in `__TEXT,__thunks`, a new synthetic section placed between `__text` and `__stubs`. Placement must be within ±128 MiB of every call site that uses it — for very large binaries, multiple thunk islands may be needed. | |
| 38 | + | |
| 39 | +Algorithm: | |
| 40 | +1. Run layout once. | |
| 41 | +2. Detect overflow sites. | |
| 42 | +3. Insert thunks near the caller cluster. | |
| 43 | +4. Re-run layout (sizes changed). | |
| 44 | +5. Re-check overflow — repeat until stable. | |
| 45 | + | |
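| 46 | +Termination: each thunk adds about 12 bytes (plus any alignment padding), so one pass shifts an address by at most that amount times the number of thunks added. Only call sites already within that margin of the ±128 MiB limit can newly overflow, so the loop converges in a few iterations. | |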
| 47 | + | |
| 48 | +### 4. Thunk sharing | |
| 49 | + | |
| 50 | +Multiple callers to the same out-of-range target share one thunk. Keyed by `(output_section, target_atom_id)`. | |
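One way to realize the sharing key is a map from `(output_section, target_atom_id)` to a thunk index, with a side vector that preserves creation order so emission stays deterministic. The types below are stand-ins (plain `u32` ids), not afs-ld's real atom or section handles.

```rust
use std::collections::HashMap;

/// Hypothetical thunk registry; ids are plain integers for illustration.
#[derive(Default)]
pub struct ThunkTable {
    by_key: HashMap<(u32, u32), usize>,
    /// Thunks in creation order, so output order never depends on hashing.
    pub thunks: Vec<(u32, u32)>,
}

impl ThunkTable {
    /// Return the index of the thunk for this (section, target) pair,
    /// creating it on first use.
    pub fn get_or_create(&mut self, output_section: u32, target_atom_id: u32) -> usize {
        let next = self.thunks.len();
        let thunks = &mut self.thunks;
        *self
            .by_key
            .entry((output_section, target_atom_id))
            .or_insert_with(|| {
                thunks.push((output_section, target_atom_id));
                next
            })
    }
}
```

Iterating `thunks` rather than the map when emitting keeps the output independent of `HashMap` iteration order.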
| 51 | + | |
| 52 | +### 5. Reloc rewrite | |
| 53 | + | |
| 54 | +Each thunked BRANCH26 reloc gets rewritten to point at the thunk atom instead of the original target. Thunk atom's BR then reaches the real target via ADRP+ADD. | |
| 55 | + | |
| 56 | +### 6. Interaction with `-dead_strip` and ICF | |
| 57 | + | |
| 58 | +- Thunks are dead-stripped if no live caller remains. | |
| 59 | +- Thunks are never ICF candidates (they have unique target addresses). | |
| 60 | +- A dead-stripped target invalidates its thunk(s); easy since we generate thunks after dead-strip. | |
| 61 | + | |
| 62 | +### 7. `-thunks <none|safe|all>` | |
| 63 | + | |
| 64 | +- `-thunks=none`: overflow is a hard error (useful on small programs, where an overflow indicates a bug). | |
| 65 | +- `-thunks=safe` (default): thunks inserted only when needed. | |
| 66 | +- `-thunks=all`: thunks inserted for every BRANCH26 (for testing). | |
| 67 | + | |
| 68 | +Sprint 19 CLI wires these; this sprint implements the behavior. | |
| 69 | + | |
| 70 | +### 8. Regression: small programs don't grow | |
| 71 | + | |
| 72 | +Default is `-thunks=safe`. Programs that don't need thunks emit no `__thunks` section and are byte-identical to the pre-sprint output. | |
| 73 | + | |
| 74 | +## Testing Strategy | |
| 75 | + | |
| 76 | +- Synthetic: compile a source that produces >128 MiB of code (requires artificially padding `.o` files, or using a large constant array in `__text`). Verify thunks inserted. | |
| 77 | +- Every thunk target reachable from its caller cluster. | |
| 78 | +- Runtime: the large binary's entry point actually runs and calls through thunks without crashing. | |
| 79 | +- `-thunks=none` + overflow: produces a clear error citing the caller and target. | |
| 80 | +- Small-program regression: fortsh output size unchanged vs pre-Sprint-26 (no thunks inserted). | |
| 81 | + | |
| 82 | +## Definition of Done | |
| 83 | + | |
| 84 | +- Thunks correctly inserted for out-of-range BRANCH26 on large fixtures. | |
| 85 | +- Layout fixed-point converges rapidly. | |
| 86 | +- Small programs unchanged. | |
| 87 | +- `-thunks` CLI matrix all wired. | |
.docs/sprints/sprint27.mdadded@@ -0,0 +1,113 @@ | ||
| 1 | +# Sprint 27: Differential Harness vs Apple ld | |
| 2 | + | |
| 3 | +## Prerequisites | |
| 4 | +All prior sprints, especially 18, 18.5, 21 — end-to-end milestones. | |
| 5 | + | |
| 6 | +## Goals | |
| 7 | +Industrial-strength parity harness. Automated byte-level comparison of afs-ld output against `ld` across a curated corpus. Explicit tolerance lists, regression-gated CI. This is the Sprint 20 default-swap gate: afs-ld becomes the armfortas default only after this sprint's corpus is green. | |
| 8 | + | |
| 9 | +## Deliverables | |
| 10 | + | |
| 11 | +### 1. Corpus | |
| 12 | + | |
| 13 | +`tests/parity_corpus/` contains 50+ link scenarios, each a small test directory with: | |
| 14 | +- `inputs/` (the `.o`, `.a`, `.tbd` files). | |
| 15 | +- `args.txt` (the afs-ld / `ld` command-line). | |
| 16 | +- `notes.md` (what this exercises). | |
| 17 | + | |
| 18 | +Scenarios cover: | |
| 19 | +- Hello-world variants (classic vs chained, with/without `-dead_strip`, with/without `-icf`). | |
| 20 | +- Every relocation type in isolation. | |
| 21 | +- GOT and stub exercises. | |
| 22 | +- TLV exercises. | |
| 23 | +- Weak-def coalescing. | |
| 24 | +- Common-symbol promotion. | |
| 25 | +- Multi-archive resolution with order dependence. | |
| 26 | +- Dylib-with-reexport chain. | |
| 27 | +- libSystem links against the real system SDK. | |
| 28 | +- `libarmfortas_rt.a` + a 3-function Fortran program. | |
| 29 | + | |
| 30 | +### 2. Diff dimensions | |
| 31 | + | |
| 32 | +For each scenario, compare: | |
| 33 | +- Load commands: count, order, contents (with tolerated-diff for UUID/timestamp). | |
| 34 | +- Segment sizes and file offsets. | |
| 35 | +- Section bytes (byte-level equality after reloc application). | |
| 36 | +- Symbol table: same nlist entries in the same partition order. | |
| 37 | +- String table: same content (byte-level is ideal, length within 5% is tolerated for suffix-dedup variation). | |
| 38 | +- `LC_DYLD_INFO_ONLY` opcode streams (classic) or `LC_DYLD_CHAINED_FIXUPS` chains (chained). | |
| 39 | +- Export trie walk equivalence (may differ in byte layout but must export the same names with the same flags and addresses). | |
| 40 | +- `__unwind_info` byte-level. | |
| 41 | +- Code signature: ignored in diff (ld signs with sha256 hashes over its output's bytes; we sign over ours; different bytes, different hashes — expected). | |
| 42 | + | |
| 43 | +### 3. Tolerated-diff rules | |
| 44 | + | |
| 45 | +```rust | |
| 46 | +pub enum ToleratedDiff { | |
| 47 | + UuidBytes, | |
| 48 | + Timestamp, | |
| 49 | + PathHashInString(&'static str), // e.g. temp path in stabs | |
| 50 | + StringTableSuffixDedupVariance, | |
| 51 | + CodeSignatureHashes, | |
| 52 | +} | |
| 53 | +``` | |
| 54 | + | |
| 55 | +Each tolerance has a precise predicate — no loose "any byte in __LINKEDIT". Unknown diffs fail. | |
| 56 | + | |
| 57 | +### 4. Harness structure | |
| 58 | + | |
| 59 | +`afs-ld/tests/parity_matrix.rs` walks `tests/parity_corpus/` and runs each scenario: | |
| 60 | + | |
| 61 | +```rust | |
| 62 | +#[test] | |
| 63 | +fn parity_corpus() { | |
| 64 | + for case in load_corpus("tests/parity_corpus/") { | |
| 65 | + let ours = link_with_afs_ld(&case).unwrap(); | |
| 66 | + let theirs = link_with_system_ld(&case).unwrap(); | |
| 67 | + let diffs = diff_macho(&ours, &theirs); | |
| 68 | + let critical: Vec<_> = diffs.into_iter().filter(|d| !is_tolerated(d)).collect(); | |
| 69 | + assert!(critical.is_empty(), | |
| 70 | + "{}: {} critical diff(s):\n{:#?}", case.name, critical.len(), critical); | |
| 71 | + } | |
| 72 | +} | |
| 73 | +``` | |
| 74 | + | |
| 75 | +### 5. CI gating | |
| 76 | + | |
| 77 | +GitHub Actions job runs on every PR: | |
| 78 | +- `cargo test --test parity_matrix` green. | |
| 79 | +- Artifact uploaded: per-scenario HTML diff viewer for debugging. | |
| 80 | +- A failing scenario blocks merge. | |
| 81 | + | |
| 82 | +### 6. Per-scenario allowed-diff annotation | |
| 83 | + | |
| 84 | +Some scenarios might have legitimate small differences we don't want to suppress globally. Each scenario's `notes.md` can declare case-specific tolerances: | |
| 85 | + | |
| 86 | +```yaml | |
| 87 | +tolerated: | |
| 88 | +  - { region: "__LINKEDIT bytes 0x1000-0x1010", reason: "ld emits padding here" } | |
| 89 | +``` | |
| 90 | + | |
| 91 | +Use sparingly; each tolerance must be justified and date-stamped. | |
| 92 | + | |
| 93 | +### 7. Runtime parity | |
| 94 | + | |
| 95 | +Beyond byte-level: each scenario that produces a runnable executable is also executed; stdout, stderr, and exit code must match between the two linked binaries. | |
| 96 | + | |
| 97 | +### 8. Parity budget | |
| 98 | + | |
| 99 | +The goal is **zero** critical diffs across the corpus. Sprint 27 is not done until the harness is fully green; if a diff can't be resolved within the sprint, it must be filed as a bug blocking default-swap in Sprint 20. | |
| 100 | + | |
| 101 | +## Testing Strategy | |
| 102 | + | |
| 103 | +- `cargo test --test parity_matrix` green. | |
| 104 | +- Intentional-regression: mutate one byte in afs-ld's writer, confirm the harness catches it. | |
| 105 | +- Scale test: full corpus runs in <2 minutes on a reasonable machine (gates Sprint 28 perf work). | |
| 106 | + | |
| 107 | +## Definition of Done | |
| 108 | + | |
| 109 | +- 50+ corpus scenarios all pass with zero critical diffs. | |
| 110 | +- CI-enforced. | |
| 111 | +- Every tolerated-diff category has a justification and a test that proves it triggers. | |
| 112 | +- Intentional-regression canary detects any change outside the allowlist. | |
| 113 | +- Sprint 20's default-swap is unblocked. | |
.docs/sprints/sprint28.mdadded@@ -0,0 +1,86 @@ | ||
| 1 | +# Sprint 28: Performance & Parallelism | |
| 2 | + | |
| 3 | +## Prerequisites | |
| 4 | +Sprint 27 — correctness gate in place; can freely refactor for speed. | |
| 5 | + | |
| 6 | +## Goals | |
| 7 | +Make afs-ld fast enough to feel like a production tool. Target: within 2× of Apple `ld`'s wall time on the fortsh link. Mold demonstrates linkers can be very fast; we don't need mold's speed, but we need to not be painful. | |
| 8 | + | |
| 9 | +## Deliverables | |
| 10 | + | |
| 11 | +### 1. Baseline profile | |
| 12 | + | |
| 13 | +Profile the fortsh link (Sprint 29 produces the fixture). Categorize wall time: | |
| 14 | + | |
| 15 | +- Input parsing (Mach-O headers, sections, symbols, relocations). | |
| 16 | +- Symbol resolution (hash-map probes, archive lookups). | |
| 17 | +- Atomization. | |
| 18 | +- Layout. | |
| 19 | +- Reloc application. | |
| 20 | +- Synth sections (`__unwind_info` is often a hotspot). | |
| 21 | +- Writing output. | |
| 22 | +- Code signature hashing. | |
| 23 | + | |
| 24 | +Identify the biggest bucket; optimize there first. | |
| 25 | + | |
| 26 | +### 2. Parallel input parsing | |
| 27 | + | |
| 28 | +Parse each `.o` in a separate worker thread; results collected into the symbol table after all parsing completes. Archive member parsing also parallel. Uses std's `thread::scope` — no external crates. Parallelism bounded by `std::thread::available_parallelism()`. | |
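A minimal sketch of the scoped-thread fan-out, under the assumption that inputs are split into contiguous chunks and results are joined in spawn order so the merged list matches command-line order. `parse_object` is a placeholder for the real parser.

```rust
use std::thread;

/// Placeholder for the real object parser: here it just reports the length.
fn parse_object(bytes: &[u8]) -> usize {
    bytes.len()
}

/// Parse all inputs on up to `available_parallelism()` workers and return
/// the results in the same order as `inputs`.
pub fn parse_all(inputs: &[Vec<u8>]) -> Vec<usize> {
    let workers = thread::available_parallelism().map(|n| n.get()).unwrap_or(1);
    let chunk = ((inputs.len() + workers - 1) / workers).max(1);
    thread::scope(|s| {
        // Spawn one worker per contiguous chunk of inputs.
        let handles: Vec<_> = inputs
            .chunks(chunk)
            .map(|c| s.spawn(move || c.iter().map(|b| parse_object(b)).collect::<Vec<_>>()))
            .collect();
        // Join in spawn order: output order is independent of scheduling.
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("parser thread panicked"))
            .collect()
    })
}
```

Because each worker owns a contiguous slice and joins happen in spawn order, a run with one worker and a run with many produce the same vector.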
| 29 | + | |
| 30 | +### 3. Parallel reloc application | |
| 31 | + | |
| 32 | +Each atom's relocs are independent. Process per-atom in parallel; the output buffer is preallocated and each atom writes to a disjoint slice. | |
| 33 | + | |
| 34 | +### 4. Parallel SHA-256 for code signing | |
| 35 | + | |
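| 36 | +Hash 4 KiB pages in parallel: split the page list into contiguous ranges, one per worker thread (bounded by `available_parallelism()`), rather than spawning a thread per page. SHA-256 is inherently sequential within a page but trivially parallel across pages. Drop-in speedup for large binaries. | |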
| 37 | + | |
| 38 | +### 5. Bump allocator for ephemeral data | |
| 39 | + | |
| 40 | +Parser produces many small allocations (strings, reloc lists, atom descriptors). A per-input arena avoids fragmentation and makes bulk drop free. Implement as `src/arena.rs` — a std-only `Vec<Box<[u8]>>` chunker. | |
| 41 | + | |
| 42 | +### 6. mmap for large inputs | |
| 43 | + | |
| 44 | +`std::fs::File` + `memmap2`? No: `memmap2` is an external crate, and so is `libc`, so neither fits the stdlib-only rule. Declare `mmap` and `munmap` in an `extern "C"` block inside an unsafe `src/mmap.rs` wrapper; both symbols resolve against libSystem. Input files are always read-only; mmap saves a read syscall and lets us share parse state across threads cheaply. Fall back to `fs::read` for GNU-thin archive members whose external path doesn't mmap cleanly (rare). | |
| 45 | + | |
| 46 | +### 7. Symbol-table hash map | |
| 47 | + | |
| 48 | +Keep std `HashMap` if the baseline profile (§1) shows it is adequate at our scale. If it is not, replace it with an open-addressing table keyed by `Istr` (handle-equality), linear probing, power-of-2 capacity. ~100 LoC. | |
| 49 | + | |
| 50 | +### 8. String interner | |
| 51 | + | |
| 52 | +Single global `StringInterner` shared across inputs. Interning cost: one hash lookup per name. Optimize by batching per-input: each input parses its strings into a local table, then merges into the global interner in one pass. | |
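The batched merge can be sketched as below. `Istr` is modeled as a `u32` handle, and `merge` is a hypothetical method that interns a whole per-input table in one pass and returns the local-to-global handle remap; the real interner may store strings differently.

```rust
use std::collections::HashMap;

/// Interned-string handle; equality is handle equality.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
pub struct Istr(pub u32);

#[derive(Default)]
pub struct StringInterner {
    map: HashMap<String, Istr>,
    names: Vec<String>,
}

impl StringInterner {
    /// Intern one name: a single hash lookup when it is already present.
    pub fn intern(&mut self, s: &str) -> Istr {
        if let Some(&id) = self.map.get(s) {
            return id;
        }
        let id = Istr(self.names.len() as u32);
        self.names.push(s.to_owned());
        self.map.insert(s.to_owned(), id);
        id
    }

    pub fn resolve(&self, id: Istr) -> &str {
        &self.names[id.0 as usize]
    }

    /// Merge a per-input local string table in one pass.
    /// Entry `i` of the result is the global handle for `local[i]`.
    pub fn merge(&mut self, local: &[String]) -> Vec<Istr> {
        local.iter().map(|s| self.intern(s)).collect()
    }
}
```

Merging inputs in a fixed order (for example command-line order) keeps handle assignment deterministic even when parsing itself ran in parallel.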
| 53 | + | |
| 54 | +### 9. No-alloc hot paths | |
| 55 | + | |
| 56 | +Reloc application and chain construction should not allocate per-reloc. Preallocated scratch buffers, reused across the relocation pass. | |
| 57 | + | |
| 58 | +### 10. Benchmarks | |
| 59 | + | |
| 60 | +`afs-ld/bench/` (or a `#[bench]` behind `cargo +nightly bench`) with: | |
| 61 | +- `bench_hello_world`: small, measures startup overhead. | |
| 62 | +- `bench_runtime_link`: mid, measures symbol-table & reloc-apply. | |
| 63 | +- `bench_fortsh_link`: large, measures end-to-end throughput. | |
| 64 | + | |
| 65 | +Budget targets: | |
| 66 | +- hello-world: ≤ 20 ms. | |
| 67 | +- runtime link: ≤ 150 ms. | |
| 68 | +- fortsh link: ≤ 2× Apple `ld`'s wall time on the same machine. | |
| 69 | + | |
| 70 | +### 11. Determinism preserved | |
| 71 | + | |
| 72 | +Parallelism must not reorder output. Each worker produces a deterministic result; join order is fixed; sorts are stable. A parallel and sequential run must produce byte-identical outputs. | |
| 73 | + | |
| 74 | +## Testing Strategy | |
| 75 | + | |
| 76 | +- Benchmarks land as regression gates: nightly CI records throughput; > 10% regression fails. | |
| 77 | +- Determinism: 100 parallel runs of the same input, assert byte-identical output every time. | |
| 78 | +- Sprint 27 parity must remain green — no correctness regression. | |
| 79 | +- Single-threaded fallback (`-j 1`) for debugging. | |
| 80 | + | |
| 81 | +## Definition of Done | |
| 82 | + | |
| 83 | +- fortsh link wall time within 2× of `ld`'s. | |
| 84 | +- All Sprint 27 scenarios still byte-identical. | |
| 85 | +- Determinism bulletproof across parallelism. | |
| 86 | +- No external dependencies added. | |
.docs/sprints/sprint29.mdadded@@ -0,0 +1,91 @@ | ||
| 1 | +# Sprint 29: fortsh Link Audit | |
| 2 | + | |
| 3 | +## Prerequisites | |
| 4 | +Sprints 18–28 — every functional sprint, parity gate, performance tuning. | |
| 5 | + | |
| 6 | +## Goals | |
| 7 | +End-to-end link of fortsh under afs-ld. fortsh is ~57 KLoC Fortran 2018, 55 modules, heavy `iso_c_binding`, allocatable strings, derived types. Linking it is the first real-world stress test of everything before this sprint. Fix what breaks. No excuses. | |
| 8 | + | |
| 9 | +## Deliverables | |
| 10 | + | |
| 11 | +### 1. fortsh build pipeline under afs-ld | |
| 12 | + | |
| 13 | +``` | |
| 14 | +cd fortsh | |
| 15 | +AFS_LD=1 armfortas --std=f2018 -O2 <all sources> -o fortsh | |
| 16 | +``` | |
| 17 | + | |
| 18 | +Expected: | |
| 19 | +- Build succeeds. | |
| 20 | +- `./fortsh --version` prints the expected version string. | |
| 21 | +- Interactive mode starts and reads a line. | |
| 22 | +- `./fortsh -c "echo hello"` prints "hello". | |
| 23 | + | |
| 24 | +### 2. Failure taxonomy | |
| 25 | + | |
| 26 | +Anticipated categories (adjust during sprint based on what actually breaks): | |
| 27 | + | |
| 28 | +- **Symbol resolution**: missing runtime symbols, weak-coalesce wrong winner, common-size mismatches. | |
| 29 | +- **Relocation math**: PAGE21/PAGEOFF12 miscomputation on specific offsets, SUBTRACTOR pair issues in eh_frame. | |
| 30 | +- **TLV**: thread-local I/O state failing at runtime. | |
| 31 | +- **Unwind**: backtrace on crash produces garbage. | |
| 32 | +- **Dead-strip**: functions stripped that were live (or live that should have been stripped). | |
| 33 | +- **Chained fixups**: a chain crossing a page boundary or containing a bad `next` offset. | |
| 34 | +- **Code signature**: kernel-kill on exec. | |
| 35 | + | |
| 36 | +Each class has a known file/function starting point for triage from earlier sprints. | |
| 37 | + | |
| 38 | +### 3. Audit process | |
| 39 | + | |
| 40 | +Same rules as armfortas audits (`armfortas/CLAUDE.md`): | |
| 41 | + | |
| 42 | +- **Assume nothing works until proven otherwise.** Every subsystem gets exercised by some fortsh code path. | |
| 43 | +- **Stubs and placeholders are synonyms for broken.** If fortsh passes a case only because of a hand-patched workaround, the sprint isn't done. | |
| 44 | +- **Wrong output is worse than crashes.** A fortsh that "runs" but produces wrong answers is a critical failure. | |
| 45 | +- **Don't soften findings.** "Major" = wrong answers. "Critical" = silent corruption. | |
| 46 | +- **Fix now unless it genuinely requires a later sprint.** | |
| 47 | + | |
| 48 | +Every bug becomes a regression test in `tests/parity_corpus/fortsh_*/`. | |
| 49 | + | |
| 50 | +### 4. Runtime behavior matrix | |
| 51 | + | |
| 52 | +Curated list of fortsh scenarios run under the afs-ld-linked binary: | |
| 53 | + | |
| 54 | +- Interactive `echo`, `cat`, `ls` (builtins). | |
| 55 | +- Pipe: `echo hello | cat`. | |
| 56 | +- Redirect: `echo hello > /tmp/f`. | |
| 57 | +- Variables: `x=1; echo $x`. | |
| 58 | +- Scripts: `./fortsh script.fsh`. | |
| 59 | +- Error paths: `nonexistent_command` returns non-zero. | |
| 60 | +- `iso_c_binding` calls into libc from Fortran. | |
| 61 | +- Allocatable string assignment: `s = s // "more"`. | |
| 62 | +- Derived-type shell_state_t access. | |
| 63 | + | |
| 64 | +Every item green, or the sprint isn't done. | |
| 65 | + | |
| 66 | +### 5. Differential vs system-ld-linked fortsh | |
| 67 | + | |
| 68 | +Same fortsh source, linked by both. Runtime behavior **must** match for every scenario in §4. Binary size within 5%. Load-command shape equivalent. Output matches byte-for-byte for the parts our Sprint 27 rules cover. | |
| 69 | + | |
| 70 | +### 6. Perf check | |
| 71 | + | |
| 72 | +Link time for fortsh under afs-ld is within Sprint 28's 2× budget. | |
| 73 | + | |
| 74 | +### 7. Audit report | |
| 75 | + | |
| 76 | +`.docs/audits/sprint29_fortsh.md` (or wherever the project convention puts audit reports, parallel to armfortas's audit structure): a brutally honest writeup of what worked, what broke, what was fixed, what remains. No soft-pedaling. | |
| 77 | + | |
| 78 | +## Testing Strategy | |
| 79 | + | |
| 80 | +- Full fortsh test suite executed under both linker paths. | |
| 81 | +- Each scenario in §4 scripted as a parity test. | |
| 82 | +- Perf budget asserted. | |
| 83 | +- Memory usage at link time within reason (< 1 GiB on fortsh). | |
| 84 | + | |
| 85 | +## Definition of Done | |
| 86 | + | |
| 87 | +- fortsh links under afs-ld. | |
| 88 | +- Every scenario in §4 passes. | |
| 89 | +- All fortsh integration tests pass. | |
| 90 | +- Differential with ld-linked fortsh matches on every runtime scenario. | |
| 91 | +- Audit report filed; no open critical items. | |
.docs/sprints/sprint30.mdadded@@ -0,0 +1,89 @@ | ||
| 1 | +# Sprint 30: Diagnostics & Polish | |
| 2 | + | |
| 3 | +## Prerequisites | |
| 4 | +Sprints 19 (`-map`, `-why_live`), 29 (fortsh audit informs diagnostic quality). | |
| 5 | + | |
| 6 | +## Goals | |
| 7 | +Raise every diagnostic surface to afs-as's caret-under-line standard. Polish `--help`, `--version`, `-t`, error recovery in binary parsers. Ship-quality UX for linker errors. | |
| 8 | + | |
| 9 | +## Deliverables | |
| 10 | + | |
| 11 | +### 1. Binary-input diagnostics | |
| 12 | + | |
| 13 | +Every parser error cites the input file, the byte offset, and a caret pointing at the offending region in a hex dump: | |
| 14 | + | |
| 15 | +``` | |
| 16 | +afs-ld: error: in input.o at byte 0x1a4: LC_SEGMENT_64 claims nsects=3 but cmdsize fits only 2 section headers | |
| 17 | + | |
| 18 | +  0x1a0: 00 01 00 00 e8 00 00 00 03 00 00 00 00 00 00 00 | |
| 19 | +                     ^^ | |
| 20 | +  cmdsize=0xe8 (232) holds the 72-byte segment_command_64 plus 2 × 80-byte section_64 entries; nsects=3 needs 72+240=312 bytes. | |
| 21 | +``` | |
| 22 | + | |
| 23 | +Implemented in `src/diag.rs` with helpers for "byte offset → nearest load command, section, atom, symbol" so every error can contextualize itself. | |
| 24 | + | |
| 25 | +### 2. Source-level backmapping | |
| 26 | + | |
| 27 | +When diagnosing a reloc error, point at the originating `.s` line if the object's symbol table includes debug info (afs-as emits no debug info today; this is a forward-compatible hook). Otherwise, map to the offending atom's symbol name and input file. | |
| 28 | + | |
| 29 | +### 3. Did-you-mean everywhere | |
| 30 | + | |
| 31 | +- Undefined symbol (Sprint 8). | |
| 32 | +- Unknown flag (Sprint 19). | |
| 33 | +- Missing library (`-lFoo` → did you mean `-lfoo`?). | |
| 34 | +- Mistyped architecture (`-arch arm86` → did you mean `arm64`?). | |
| 35 | + | |
| 36 | +Suggestions are candidates within Levenshtein distance 3 of the input, capped at the 10 closest matches. | |
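A minimal sketch of the suggestion filter, assuming the threshold means an edit distance of at most 3 and that ties are broken by name so output stays deterministic. Function names are illustrative.

```rust
/// Classic two-row Levenshtein distance over Unicode scalar values.
pub fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.iter().enumerate() {
        let mut cur = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let substitute = prev[j] + usize::from(ca != cb);
            let delete = prev[j + 1] + 1;
            let insert = cur[j] + 1;
            let best = substitute.min(delete).min(insert);
            cur.push(best);
        }
        prev = cur;
    }
    prev[b.len()]
}

/// Return up to 10 candidates within distance 3, closest first.
pub fn suggest<'a>(typo: &str, candidates: &[&'a str]) -> Vec<&'a str> {
    let mut scored: Vec<(usize, &'a str)> = candidates
        .iter()
        .map(|c| (levenshtein(typo, c), *c))
        .filter(|(d, _)| *d <= 3)
        .collect();
    scored.sort(); // by distance, then by name: stable across runs
    scored.into_iter().take(10).map(|(_, c)| c).collect()
}
```

For short identifiers a fixed threshold of 3 can admit unrelated names, so scaling the threshold with input length may be worth considering.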
| 37 | + | |
| 38 | +### 4. Colorized output | |
| 39 | + | |
| 40 | +ANSI color codes on TTY stderr. Flagged off under `NO_COLOR` env var and under `--color=never`. Matches afs-as's approach in `afs-as/src/diag*.rs`. | |
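The color decision can be kept as a pure function so it is testable without a pty. In this sketch the `Auto` and `Always` variants are assumptions (the plan only names `--color=never`), as is the choice that an explicit `Always` overrides `NO_COLOR`.

```rust
/// Hypothetical parsed value of a `--color=<mode>` flag.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub enum ColorFlag {
    Auto,
    Always,
    Never,
}

/// Decide whether to emit ANSI escapes on stderr.
/// `no_color_env` is the value of the `NO_COLOR` variable, if set.
pub fn use_color(flag: ColorFlag, no_color_env: Option<&str>, stderr_is_tty: bool) -> bool {
    match flag {
        ColorFlag::Never => false,
        ColorFlag::Always => true,
        ColorFlag::Auto => {
            // NO_COLOR convention: present and non-empty disables color.
            let no_color = no_color_env.map_or(false, |v| !v.is_empty());
            stderr_is_tty && !no_color
        }
    }
}
```

A caller could feed it `std::env::var("NO_COLOR").ok().as_deref()` and `std::io::IsTerminal::is_terminal(&std::io::stderr())`, both of which are stdlib-only.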
| 41 | + | |
| 42 | +### 5. Verbose and trace modes | |
| 43 | + | |
| 44 | +- `-v`: version + target + active flag summary. | |
| 45 | +- `-t` / `--trace`: every input file logged as it's loaded. | |
| 46 | +- `-verbose_deprecation`: warnings for deprecated flags (we accept them for `ld` compatibility but note they're deprecated). | |
| 47 | + | |
| 48 | +### 6. `--help` format | |
| 49 | + | |
| 50 | +Mirrors `ld`'s. Sections: inputs, outputs, symbols, diagnostics, platform. Each flag has a one-line description and (where applicable) a default value. Fits on 80 columns, readable. | |
| 51 | + | |
| 52 | +### 7. `--version` format | |
| 53 | + | |
| 54 | +``` | |
| 55 | +afs-ld <version> | |
| 56 | +Bespoke ARM64 Mach-O linker for armfortas | |
| 57 | +Target: arm64-apple-macos | |
| 58 | +Commit: <git hash> | |
| 59 | +``` | |
| 60 | + | |
| 61 | +### 8. Deterministic stderr | |
| 62 | + | |
| 63 | +Error output is the same for the same input across runs. No wall clock, no pid, no thread-id. Supports scripted diffing in CI. | |
| 64 | + | |
| 65 | +### 9. Error-code conventions | |
| 66 | + | |
| 67 | +Exit codes: | |
| 68 | +- 0: success. | |
| 69 | +- 1: link failure (undefined symbol, ambiguous resolution, etc.). | |
| 70 | +- 2: CLI misuse (bad flag, missing required arg). | |
| 71 | +- 64–78: BSD `<sysexits.h>` codes where they fit (EX_USAGE=64, EX_DATAERR=65, EX_NOINPUT=66, EX_UNAVAILABLE=69, EX_SOFTWARE=70). | |
| 72 | + | |
| 73 | +### 10. Regression: no diagnostic regresses below afs-as quality | |
| 74 | + | |
| 75 | +Every diagnostic in afs-ld should be at least as useful as the closest afs-as analog. Cross-checked in an audit. | |
| 76 | + | |
| 77 | +## Testing Strategy | |
| 78 | + | |
| 79 | +- Snapshot tests for every major error category: undefined symbol, duplicate symbol, missing library, bad flag, malformed input. Compare against a stored expected-output; diff the text (modulo terminal width and colors). | |
| 80 | +- `--help` and `--version` snapshot tests. | |
| 81 | +- TTY detection test via a pty harness (or a manual verification step). | |
| 82 | + | |
| 83 | +## Definition of Done | |
| 84 | + | |
| 85 | +- Every error category has a snapshot test that matches a stored golden. | |
| 86 | +- Did-you-mean fires on the four categories listed. | |
| 87 | +- `--help` fits 80 cols and is scannable. | |
| 88 | +- Color on TTY, off under `NO_COLOR` / `--color=never`. | |
| 89 | +- Exit codes follow the convention. | |
.docs/sprints/sprint31.mdadded@@ -0,0 +1,101 @@ | ||
| 1 | +# Sprint 31: Final Audit | |
| 2 | + | |
| 3 | +## Prerequisites | |
| 4 | +Every prior sprint. | |
| 5 | + | |
| 6 | +## Goals | |
| 7 | +The last line of defense before afs-ld is declared the armfortas default linker permanently (i.e., Sprint 20's env-var fallback is removed). Brutally honest audit of every subsystem. Regressions caught, gaps documented, decisions defended. | |
| 8 | + | |
| 9 | +## Deliverables | |
| 10 | + | |
| 11 | +### 1. Parity corpus green | |
| 12 | + | |
| 13 | +Sprint 27's `tests/parity_corpus/` fully green, plus every fortsh-derived scenario added in Sprint 29. No tolerated-diff entries added since Sprint 27 without audit-committee (i.e., user) sign-off. | |
| 14 | + | |
| 15 | +### 2. Determinism sweep | |
| 16 | + | |
| 17 | +Link every corpus scenario 10 times under parallelism. All 10 outputs byte-identical. Record the hash. | |
| 18 | + | |
| 19 | +### 3. Spec conformance survey | |
| 20 | + | |
| 21 | +Walk the Mach-O, Apple Mach-O ABI, and arm64 AAPCS64 specs section by section. For each feature used by armfortas or fortsh, confirm afs-ld implements it correctly. Checklist: | |
| 22 | + | |
| 23 | +- Header & magic. | |
| 24 | +- Load command set. | |
| 25 | +- Segment/section flags. | |
| 26 | +- Every relocation type in `<mach-o/arm64/reloc.h>`. | |
| 27 | +- Symbol types in `<mach-o/nlist.h>`. | |
| 28 | +- `LC_DYLD_INFO_ONLY` opcode set. | |
| 29 | +- `LC_DYLD_CHAINED_FIXUPS` format. | |
| 30 | +- Export trie terminal formats. | |
| 31 | +- `__unwind_info` layout. | |
| 32 | +- Compact unwind encoding. | |
| 33 | +- Code signature SuperBlob. | |
| 34 | + | |
| 35 | +For each, cite the afs-ld file/function that implements it. Gaps documented in `.docs/audits/sprint31_final.md`. | |
| 36 | + | |
| 37 | +### 4. CLI parity survey | |
| 38 | + | |
| 39 | +Every `ld` flag that armfortas or fortsh passes must be supported. Cross-check against: | |
| 40 | +- `armfortas/src/driver/mod.rs` linker-invocation call sites. | |
| 41 | +- `fortsh` CMake / build-system linker flags (consult the project). | |
| 42 | +- The set listed in Sprint 19. | |
| 43 | + | |
| 44 | +Any flag in the "passes but no-op" category audited for silent misbehavior. | |
| 45 | + | |
| 46 | +### 5. Binary size audit | |
| 47 | + | |
| 48 | +Compare total output size (afs-ld vs `ld`) on: | |
| 49 | +- hello-world. | |
| 50 | +- libarmfortas_rt-linked Fortran program. | |
| 51 | +- fortsh. | |
| 52 | + | |
| 53 | +Within 5% of `ld` on each. Larger than 5% triggers an investigation into where the bloat lives. | |
| 54 | + | |
| 55 | +### 6. Performance audit | |
| 56 | + | |
| 57 | +Sprint 28's benchmarks run one more time. fortsh link within 2× of `ld`. No regression since Sprint 28. | |
| 58 | + | |
| 59 | +### 7. Diagnostic quality audit | |
| 60 | + | |
| 61 | +Manual pass over every error and warning message. Each evaluated on: | |
| 62 | +- Does it name the input? | |
| 63 | +- Does it cite a location (file, offset, symbol)? | |
| 64 | +- Does it tell the user how to fix it? | |
| 65 | + | |
| 66 | +Low-quality diagnostics fixed on the spot. | |
| 67 | + | |
| 68 | +### 8. Dead code and `unwrap`/`panic` sweep | |
| 69 | + | |
| 70 | +Cargo-geiger-style (but hand-rolled, since we forbid external deps): | |
| 71 | +- Every `.unwrap()` / `.expect()` reviewed. Panics only in truly-impossible cases. | |
| 72 | +- Every `todo!()` or `unimplemented!()` either implemented or explicitly deferred with a pointer to a future sprint. | |
| 73 | +- Dead code removed. | |
| 74 | + | |
| 75 | +### 9. CLAUDE.md, README, overview.md refresh | |
| 76 | + | |
| 77 | +Sync documentation with the final state of the crate. Note any scope changes from the original plan. If any sprint was rescoped or split, update the sprint index. | |
| 78 | + | |
| 79 | +### 10. Submodule pin | |
| 80 | + | |
| 81 | +Parent armfortas pinned to a specific afs-ld commit. Tag the afs-ld repo `v0.1.0`. | |
| 82 | + | |
| 83 | +### 11. Default-swap removal | |
| 84 | + | |
| 85 | +After the audit passes, Sprint 20's `AFS_LD=1` default flip becomes permanent. The env-var fallback stays for one more sprint as a safety net (configurable via `AFS_LD=0` to fall back to system `ld`), then removed entirely. | |
| 86 | + | |
| 87 | +## Testing Strategy | |
| 88 | + | |
| 89 | +- Every prior test suite run; all green. | |
| 90 | +- Determinism sweep (§2). | |
| 91 | +- Perf sweep (§6). | |
| 92 | +- Manual binary-size diff (§5). | |
| 93 | +- Manual CLI parity checklist (§4). | |
| 94 | + | |
| 95 | +## Definition of Done | |
| 96 | + | |
| 97 | +- Audit report `.docs/audits/sprint31_final.md` written. | |
| 98 | +- All tests green. | |
| 99 | +- No open critical items. | |
| 100 | +- afs-ld is the armfortas default linker. | |
| 101 | +- Tagged `v0.1.0`. | |