fortrangoingonforty/afs-ld / d983997

init crate scaffold

Authored by espadonne
SHA: d983997c3014841e0834f469adba93aa1b1d4fa0
Tree: b4417ef

4 changed files

Status  File        +    -
A       .gitignore  3    0
A       CLAUDE.md   105  0
A       Cargo.toml  18   0
A       README.md   38   0
.gitignore added
@@ -0,0 +1,3 @@
+target/
+.fackr/
+.claude/
CLAUDE.md added
@@ -0,0 +1,105 @@
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+## Repository Context
+
+`afs-ld` is a **git submodule** of [ARMFORTAS](https://github.com/FortranGoingOnForty/armfortas), a bespoke ARM64 Fortran compiler. It is the standalone ARM64 Mach-O linker: it reads Mach-O relocatable objects (MH_OBJECT) produced by `afs-as`, static archives, binary dylibs, and TAPI TBD text stubs, and emits linked Mach-O executables (MH_EXECUTE) and shared libraries (MH_DYLIB). It knows nothing about Fortran — the boundary with the compiler is the CLI (an `ld`-compatible flag surface).
+
+The parent `armfortas/CLAUDE.md` describes the broader compiler philosophy (bespoke, no LLVM, no parser generators, no compiler-infrastructure crates) and applies here too. **Rust standard library only** — no `clap`, no `serde`, no `byteorder`, no `object`, no `goblin`, no `memmap2`, no YAML crate. Hand-roll parsers, serializers, and the tiny subset of YAML we need for TBD files.
+
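To make the "tiny subset of YAML" idea concrete, here is a minimal sketch of a `std`-only scan that pulls top-level `key: value` scalars out of TBD-style text. The function name `top_level_scalars` and the sample document are assumptions for illustration, not the crate's actual `tbd.rs`. Nested mappings, flow sequences, and multi-document streams are deliberately left out.

```rust
/// Hypothetical helper (not from the repository): collect top-level
/// `key: value` scalars from TBD-like text. Indented lines, list items,
/// document markers (`---`, `...`), and comments are skipped.
fn top_level_scalars(text: &str) -> Vec<(String, String)> {
    let mut out = Vec::new();
    for line in text.lines() {
        let nested_or_marker =
            line.starts_with(|c: char| c.is_whitespace() || c == '-' || c == '#' || c == '.');
        if line.is_empty() || nested_or_marker {
            continue;
        }
        if let Some((key, value)) = line.split_once(':') {
            // Strip surrounding whitespace and one layer of quotes.
            let value = value.trim().trim_matches(|c| c == '\'' || c == '"');
            // A key with no inline value introduces a nested block; skip it.
            if !value.is_empty() {
                out.push((key.trim().to_string(), value.to_string()));
            }
        }
    }
    out
}

fn main() {
    // Sample text modelled on a TBD v4 stub; the library name is made up.
    let tbd = "--- !tapi-tbd\n\
               tbd-version: 4\n\
               targets: [ arm64-macos ]\n\
               install-name: '/usr/lib/libexample.dylib'\n\
               exports:\n\
               \x20 - targets: [ arm64-macos ]\n\
               \x20   symbols: [ _example ]\n\
               ...\n";
    for (key, value) in top_level_scalars(tbd) {
        println!("{key} = {value}");
    }
}
```

A scanner of this shape handles only the fields a linker needs early (version, targets, install name); anything structural would need a purpose-built second pass.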
+## Build, Test, Lint
+
+```bash
+cargo build -p afs-ld                          # build linker crate
+cargo test  -p afs-ld                          # run all afs-ld tests
+cargo clippy -p afs-ld --all-targets -- -D warnings
+
+cargo test --lib -p afs-ld                     # unit tests only (in src/)
+cargo test --test <name> -p afs-ld             # one integration test file
+cargo test --test parity_matrix                # vs Apple `ld` across the corpus
+cargo test --test hello_world                  # executable end-to-end
+cargo test --test hello_library                # dylib end-to-end
+cargo test -p afs-ld -- <substring>            # filter by test name
+```
+
+Integration tests shell out to Apple `ld`, `otool`, `nm`, `codesign`, and `xcrun`. They require **macOS on Apple Silicon** and a working Xcode command-line toolchain. Do not stub these out — the differential against the system linker is the entire point of the parity matrix.
+
+## Architecture
+
+Pipeline, end to end:
+
+```
+args → inputs → resolve → atomize → layout → apply relocs → synth sections → write → sign
+ │        │        │         │        │            │              │            │       │
+args.rs  input.rs resolve.rs atom.rs layout.rs  reloc/arm64.rs  synth/*.rs  macho/   synth/
+                 symbol.rs                                                   writer    code_sig
+```
+
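The pipeline above is strictly sequential, so one way to picture the driver is as an ordered list of stages that stops at the first failure. The sketch below assumes a hypothetical `Stage` enum and `run` function; only the stage names come from the diagram, and the real `driver.rs` may be organised quite differently.

```rust
/// Stage names mirror the pipeline diagram; the enum itself is hypothetical.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Stage {
    Args,
    Inputs,
    Resolve,
    Atomize,
    Layout,
    ApplyRelocs,
    Synth,
    Write,
    Sign,
}

const PIPELINE: [Stage; 9] = [
    Stage::Args,
    Stage::Inputs,
    Stage::Resolve,
    Stage::Atomize,
    Stage::Layout,
    Stage::ApplyRelocs,
    Stage::Synth,
    Stage::Write,
    Stage::Sign,
];

/// Run every stage in order; the first error aborts the link and is
/// prefixed with the name of the stage that produced it.
fn run(mut step: impl FnMut(Stage) -> Result<(), String>) -> Result<(), String> {
    for stage in PIPELINE {
        step(stage).map_err(|e| format!("{stage:?}: {e}"))?;
    }
    Ok(())
}

fn main() {
    let mut seen = Vec::new();
    let result = run(|stage| {
        seen.push(stage);
        if stage == Stage::Layout {
            Err("no segments".to_string())
        } else {
            Ok(())
        }
    });
    println!("{result:?} after {} stages", seen.len());
}
```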
+### Module responsibilities
+
+- **`src/args.rs`** — CLI parser. Hand-rolled, no `clap`. Recognizes the `ld`-compatible flag surface (Sprint 19 ships the full set).
+- **`src/macho/`** — Mach-O 64 read/write. `constants.rs` holds the numeric literals (`LC_*`, `MH_*`, `S_*`, `N_*`, `ARM64_RELOC_*`) — duplicated from afs-as rather than cross-crate coupled, keeping each submodule independent. `reader.rs` parses MH_OBJECT; `writer.rs` emits MH_EXECUTE and MH_DYLIB; `dylib.rs` parses binary MH_DYLIB; `tbd.rs` parses TAPI TBD v4 text stubs (minimal YAML subset, not a general parser).
+- **`src/archive.rs`** — BSD + SysV + GNU-thin static archives. Lazy member fetch.
+- **`src/input.rs`** — `InputFile` enum unifying objects, archives, dylibs, TBDs.
+- **`src/symbol.rs`** / **`src/resolve.rs`** — `Symbol` sum type and the name resolution pass. Archive-driven fixed-point loop; weak/common/alias coalescing; diagnostics with did-you-mean.
+- **`src/atom.rs`** — subsections-via-symbols atomization. Atoms are the unit of dead-stripping, ICF, and output layout.
+- **`src/section.rs`** / **`src/layout.rs`** — output segment plan and VM/file-offset assignment. `MH_EXECUTE` and `MH_DYLIB` are both first-class.
+- **`src/reloc/`** — ARM64 reloc application (`arm64.rs`) and LOH relaxation (`loh.rs`). Handles BRANCH26, PAGE21/PAGEOFF12, GOT_LOAD_*, POINTER_TO_GOT, TLVP_LOAD_*, UNSIGNED, SUBTRACTOR, ADDEND.
+- **`src/synth/`** — synthetic sections: `got`, `stubs`, `tlv`, `symtab`, `dyld_info` (classic), `chained` (LC_DYLD_CHAINED_FIXUPS), `unwind`, `eh_frame`, `func_starts`, `data_in_code`, `code_sig` (ad-hoc SHA-256).
+- **`src/gc.rs`** / **`src/icf.rs`** — `-dead_strip` and `-icf=safe` passes.
+- **`src/map.rs`** / **`src/why_live.rs`** — `-map` link map and `-why_live` dead-strip reasoning.
+- **`src/driver.rs`** — orchestrator: args → inputs → resolve → atomize → layout → apply relocs → synth → write → sign.
+- **`src/diag.rs`** — diagnostics. Path + byte offset + caret, matching `afs-as/src/diag*.rs` style. Deterministic output: no wall clock, no pid, no thread-id.
+
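As a rough illustration of the relocation work listed for `src/reloc/`, the sketch below patches the immediate fields for two of the named kinds, BRANCH26 and PAGE21. The function names `apply_branch26` and `apply_page21` are hypothetical. The field positions follow the ARMv8-A encodings of `B`/`BL` (imm26 in bits 25:0) and `ADRP` (immlo in bits 30:29, immhi in bits 23:5). Addends, stubs for out-of-range branches, and LOH relaxation are not modelled here.

```rust
/// Patch the imm26 field of a B/BL instruction (ARM64_RELOC_BRANCH26).
/// `place` is the instruction's address, `target` the resolved destination.
fn apply_branch26(insn: u32, place: u64, target: u64) -> Result<u32, String> {
    let delta = target.wrapping_sub(place) as i64;
    if delta % 4 != 0 {
        return Err(format!("misaligned branch target (delta {delta})"));
    }
    // imm26 is a signed word offset, so the reach is +/-128 MiB.
    const LIMIT: i64 = 1 << 27;
    if !(-LIMIT..LIMIT).contains(&delta) {
        return Err(format!("branch out of range (delta {delta})"));
    }
    let imm26 = ((delta >> 2) as u32) & 0x03ff_ffff;
    Ok((insn & 0xfc00_0000) | imm26)
}

/// Patch the 21-bit page delta of an ADRP instruction (ARM64_RELOC_PAGE21).
fn apply_page21(insn: u32, place: u64, target: u64) -> Result<u32, String> {
    // ADRP works on 4 KiB pages: compare page bases, then count pages.
    let pages = ((target & !0xfff).wrapping_sub(place & !0xfff) as i64) >> 12;
    const LIMIT: i64 = 1 << 20;
    if !(-LIMIT..LIMIT).contains(&pages) {
        return Err(format!("page delta out of range ({pages} pages)"));
    }
    let imm = pages as u32;
    let immlo = imm & 0x3; // low two bits go to bits 30:29
    let immhi = (imm >> 2) & 0x7_ffff; // remaining 19 bits go to bits 23:5
    Ok((insn & 0x9f00_001f) | (immlo << 29) | (immhi << 5))
}

fn main() {
    // 0x94000000 is `bl` with a zero offset; 0x90000000 is `adrp x0` with a zero offset.
    let bl = apply_branch26(0x9400_0000, 0x1_0000_0000, 0x1_0000_0010);
    let adrp = apply_page21(0x9000_0000, 0x1_0000_0ff0, 0x1_0000_3008);
    println!("bl   -> {:#010x?}", bl);
    println!("adrp -> {:#010x?}", adrp);
}
```

Returning `Result` rather than truncating keeps an out-of-range fixup from producing a silently wrong instruction, in line with the "no stubs pass silently" rule.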
+## Coding Conventions
+
+- **Rust std only.** Any external dependency needs an explicit debate and a CLAUDE.md update.
+- **`unsafe` only where genuinely required.** Keep blocks small and commented. The one known case is `libc::mmap` for large input files (Sprint 28).
+- **Exhaustive pattern matching** on `Section`, `Symbol`, `Relocation`, `InputFile`, `Fixup`, `LoadCommand` — no catch-all `_` arms outside tests.
+- **Determinism**: no timestamps in output, sorted iteration order, stable hashing, parallelism preserves byte-identical output.
+- **Commit discipline**: terse imperative messages, no co-authors, per-file / per-chunk commits, never monoliths. No sprint-number references in commit messages.
+- **No "stubs pass silently"**: placeholder code that returns wrong answers is worse than code that panics. Tests must catch the stub, not paper over it.
+- **Diagnostics always cite input + offset + caret.** `src/diag.rs` is the one place that constructs these; every error path goes through it.
+
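Two of the conventions above, exhaustive matching and sorted iteration, are easy to show together. The sketch below uses a simplified stand-in for `InputFile` with unit variants; the real enum presumably carries parsed data, and the helper names `kind` and `count_by_kind` are assumptions.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified stand-in for the crate's `InputFile` enum.
enum InputFile {
    Object,
    Archive,
    Dylib,
    Tbd,
}

/// Exhaustive match with no `_` arm: adding a fifth variant becomes a
/// compile error here instead of a silent fallthrough.
fn kind(input: &InputFile) -> &'static str {
    match input {
        InputFile::Object => "object",
        InputFile::Archive => "archive",
        InputFile::Dylib => "dylib",
        InputFile::Tbd => "tbd",
    }
}

/// `BTreeMap` iterates in key order, so anything derived from this map is
/// independent of hash seeds and of the order inputs were discovered in.
fn count_by_kind(inputs: &[InputFile]) -> BTreeMap<&'static str, usize> {
    let mut counts = BTreeMap::new();
    for input in inputs {
        *counts.entry(kind(input)).or_insert(0) += 1;
    }
    counts
}

fn main() {
    let inputs = [
        InputFile::Object,
        InputFile::Tbd,
        InputFile::Dylib,
        InputFile::Object,
        InputFile::Archive,
    ];
    for (name, n) in count_by_kind(&inputs) {
        println!("{name}: {n}");
    }
}
```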
+## Test Architecture
+
+Tests are **layered**, not a single golden path. Each layer catches a different class of regression:
+
+| Test file | What it proves |
+|---|---|
+| `src/**/#[cfg(test)]` | Parser / encoder / resolution unit tests (`cargo test --lib`) |
+| `tests/common/harness.rs` | Differential harness: spawn afs-ld + system ld on the same inputs, diff outputs |
+| `tests/reader_*.rs` | Round-trip Mach-O object reads across the afs-as corpus |
+| `tests/reloc_*.rs` | Golden-file relocation application |
+| `tests/resolve_*.rs` | Symbol resolution matrices (strong vs weak vs common vs dylib vs archive) |
+| `tests/hello_world.rs` | End-to-end: afs-as → afs-ld → runnable PIE executable |
+| `tests/hello_library.rs` | End-to-end: afs-as → afs-ld → `dlopen`able dylib |
+| `tests/parity_matrix.rs` | Full corpus byte-level differential vs Apple `ld` (CI gate) |
+| `tests/armfortas_integration.rs` | Parent's integration suite run under `AFS_LD=1` |
+
+Corpus fixtures live in `tests/corpus/`. Every new relocation kind, section kind, or CLI flag lands a corpus entry in the same sprint that implements it.
+
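The comparison half of a byte-level differential can be sketched without touching either linker. The example below assumes a hypothetical `Diff` type and `first_difference` function that report where two output images first diverge. A real harness would obtain the two byte buffers by running afs-ld and the system `ld` (for instance through `std::process::Command`) and reading their outputs; that part is omitted.

```rust
/// Result of comparing two linker outputs byte by byte (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
enum Diff {
    Identical,
    /// The shared prefix matches but the files differ in size.
    LengthOnly { ours: usize, theirs: usize },
    /// First mismatching byte, with both values for the report.
    At { offset: usize, ours: u8, theirs: u8 },
}

fn first_difference(ours: &[u8], theirs: &[u8]) -> Diff {
    // `zip` stops at the shorter input, so this scans only the shared prefix.
    if let Some(offset) = ours.iter().zip(theirs).position(|(a, b)| a != b) {
        return Diff::At {
            offset,
            ours: ours[offset],
            theirs: theirs[offset],
        };
    }
    if ours.len() != theirs.len() {
        return Diff::LengthOnly {
            ours: ours.len(),
            theirs: theirs.len(),
        };
    }
    Diff::Identical
}

fn main() {
    let ours = [0xcf, 0xfa, 0xed, 0xfe, 0x0c, 0x00];
    let theirs = [0xcf, 0xfa, 0xed, 0xfe, 0x07, 0x00];
    println!("{:?}", first_difference(&ours, &theirs));
}
```

Reporting the first differing offset rather than a bare pass/fail fits the diagnostics convention of always citing an offset.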
+## Audit discipline
+
+After each sprint, a brutally honest audit:
+
+- Assume nothing works until proven otherwise. Test every claim.
+- "Placeholder" and "stub" are synonyms for "broken." Wrong answers silently produced are worse than crashes.
+- Check against the Mach-O ABI spec, not just "does it link." Wrong output corrupts the loader's bind/rebase state.
+- Don't soften findings. "Major" means "produces wrong binaries." "Critical" means "silent corruption that dyld will accept."
+- No deferred items unless they genuinely require a later sprint. Fix it now if it can be fixed now.
+- The audit is not a formality. It's the last line of defense before bad linker output gets merged and ships bad binaries downstream.
+
+## Key references
+
+- `.refs/llvm/lld/MachO/` — primary architectural reference.
+- `.refs/ld64/src/` — Apple authoritative. `src/ld/` and `src/mach_o/` cover the whole linker.
+- `.refs/mold/src/` — performance techniques (parallel parsing, string merging, allocator tricks). Mostly ELF-oriented but the performance patterns apply.
+- Apple `<mach-o/loader.h>`, `<mach-o/nlist.h>`, `<mach-o/reloc.h>`, `<mach-o/arm64/reloc.h>` — mirrored numerically in `src/macho/constants.rs`.
+- `dyld` open source — bind/rebase/lazy-bind opcode semantics and chained-fixups format.
+- ARM Architecture Reference Manual (ARMv8-A) — encoding of relocated instructions.
+
+## Sprint roadmap
+
+`.docs/sprints/index.md` — 32 sprints across 10 phases. Each sprint has an individual markdown file with prerequisites, deliverables, testing strategy, and definition of done.
Cargo.toml added
@@ -0,0 +1,18 @@
+[package]
+name = "afs-ld"
+version = "0.1.0"
+edition = "2021"
+description = "Standalone ARM64 Mach-O linker for macOS"
+license = "GPL-3.0-only"
+repository = "https://github.com/FortranGoingOnForty/afs-ld"
+readme = "README.md"
+keywords = ["arm64", "aarch64", "linker", "macho", "macos"]
+categories = ["development-tools"]
+
+[[bin]]
+name = "afs-ld"
+path = "src/main.rs"
+
+[lib]
+name = "afs_ld"
+path = "src/lib.rs"
README.md added
@@ -0,0 +1,38 @@
+# afs-ld
+
+**Bespoke ARM64 Mach-O linker for Apple Silicon. Rust stdlib only.**
+
+Sister project to [afs-as](https://github.com/FortranGoingOnForty/afs-as) (the assembler) and [armfortas](https://github.com/FortranGoingOnForty/armfortas) (the compiler). Together they form a complete Fortran-to-executable toolchain with zero dependencies on LLVM or external compiler infrastructure.
+
+## Status
+
+Sprint 0 — scaffolding only. Does not yet produce usable output.
+
+## Build
+
+```bash
+cargo build -p afs-ld
+cargo test  -p afs-ld
+cargo clippy -p afs-ld --all-targets -- -D warnings
+```
+
+Tests require macOS on Apple Silicon and a working Xcode command-line toolchain (`xcrun`).
+
+## Design
+
+- Reads `.o` (MH_OBJECT) Mach-O produced by `afs-as`, static archives (`.a`), binary dylibs, and TAPI TBD text stubs.
+- Emits `MH_EXECUTE` or `MH_DYLIB` PIE Mach-O files.
+- Ad-hoc code signing so binaries run directly on macOS 11+ arm64 hardware.
+- Supports both classic `LC_DYLD_INFO` opcodes and modern `LC_DYLD_CHAINED_FIXUPS`.
+
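The file types named in the design list are distinguished by the `filetype` field of the 64-bit Mach-O header. As a minimal sketch, the code below classifies a byte buffer using only `std`. The constant values are those defined in Apple's `<mach-o/loader.h>` and `<mach/machine.h>`; the `FileType` enum and the `read_u32_le` and `classify` functions are hypothetical names, not the crate's reader.

```rust
// Values from <mach-o/loader.h> and <mach/machine.h>.
const MH_MAGIC_64: u32 = 0xfeed_facf;
const CPU_TYPE_ARM64: u32 = 0x0100_000c;
const MH_OBJECT: u32 = 0x1;
const MH_EXECUTE: u32 = 0x2;
const MH_DYLIB: u32 = 0x6;

#[derive(Debug, PartialEq, Eq)]
enum FileType {
    Object,
    Execute,
    Dylib,
}

/// Read a little-endian u32 at `offset`, or `None` if the buffer is too short.
fn read_u32_le(bytes: &[u8], offset: usize) -> Option<u32> {
    let raw = bytes.get(offset..offset.checked_add(4)?)?;
    let mut buf = [0u8; 4];
    buf.copy_from_slice(raw);
    Some(u32::from_le_bytes(buf))
}

/// Inspect the leading fields of `mach_header_64`:
/// magic (offset 0), cputype (offset 4), filetype (offset 12).
fn classify(bytes: &[u8]) -> Result<FileType, String> {
    let magic = read_u32_le(bytes, 0).ok_or("truncated header")?;
    if magic != MH_MAGIC_64 {
        return Err(format!("not a 64-bit little-endian Mach-O (magic {magic:#x})"));
    }
    let cputype = read_u32_le(bytes, 4).ok_or("truncated header")?;
    if cputype != CPU_TYPE_ARM64 {
        return Err(format!("unsupported cputype {cputype:#x}"));
    }
    match read_u32_le(bytes, 12).ok_or("truncated header")? {
        MH_OBJECT => Ok(FileType::Object),
        MH_EXECUTE => Ok(FileType::Execute),
        MH_DYLIB => Ok(FileType::Dylib),
        other => Err(format!("unsupported filetype {other:#x}")),
    }
}

fn main() {
    // Synthetic 32-byte header: magic, cputype, cpusubtype, filetype, then zeros.
    let mut header = Vec::new();
    for word in [MH_MAGIC_64, CPU_TYPE_ARM64, 0, MH_OBJECT, 0, 0, 0, 0] {
        header.extend_from_slice(&word.to_le_bytes());
    }
    println!("{:?}", classify(&header));
}
```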
+See `.docs/overview.md` for full architecture and `.docs/sprints/index.md` for the development roadmap.
+
+## Non-goals
+
+- ELF / COFF / PE — Mach-O only.
+- Architectures other than arm64.
+- LTO / bitcode.
+
+## License
+
+GPL-3.0-only.