Comparing changes

Choose two branches to see what's changed or to start a new pull request.

base: repo-cleanup
compare: trunk
Able to merge. These branches can be automatically merged.
33 commits · 43 files changed · 2 contributors

Commits on trunk

.docs/sprints/closeout0-9.md (added)
@@ -0,0 +1,247 @@
+# Sprint 0-9 Closeout Checklist
+
+Concrete closeout checklist based on the current codebase audit.
+
+Current conclusion: we are not yet ready to honestly declare Sprint 10 complete-in-practice.
+The main blocker is:
+
+- Sprint 0's tolerated-diff categories are still deferred until afs-ld can emit real linked output for Mach-O-to-Mach-O differential checks.
+
+## Sprint 10 Gate
+
+Do not declare "we are on Sprint 10" until all of these are true:
+
+- [x] Sprint 9 reloc referents are remapped to atom-aware forms.
+- [x] Sprint 8 resolution orchestration exists as a real callable stage, not just loose helper APIs.
+- [x] `cargo test -p afs-ld` is green after the closeout work.
+- [x] `cargo clippy -p afs-ld --all-targets -- -D warnings` is green after the closeout work.
+- [x] `README.md` and sprint docs no longer materially misstate the current state of the crate.
+
+## Recommended Order
+
+- [x] Close Sprint 9 reloc-to-atom remap first.
+- [x] Close Sprint 8 resolution orchestration and option coverage second.
+- [x] Close Sprint 6 TBD/SDK search gaps third.
+- [x] Close Sprint 4 nested archive support fourth.
+- [ ] Finish the deferred Sprint 0 differential-harness tolerance work once afs-ld can emit real output.
+
+## Cross-Sprint Exit Criteria
+
+- [ ] Every closeout chunk lands with tests.
+- [ ] Every bug fix or behavioral gap gets a regression test.
+- [ ] No newly-discovered roadmap/code mismatch is left undocumented.
+- [ ] Any user-facing diagnostic we touch stays deterministic and testable.
+
+## Sprint 0
+
+Status: closed
+
+Validated:
+
+- [x] `afs-ld` exists as its own git submodule in the parent workspace.
+- [x] Parent `Cargo.toml` includes `afs-ld` as a workspace member.
+- [x] `CLAUDE.md`, `README.md`, crate wiring, and test harness scaffolding exist.
+- [x] Reference repos are present under parent `.refs/` (`ld64`, `mold`, `lld`).
+- [x] `tests/reader_empty.rs` enforces the empty-invocation CLI contract.
+- [x] `tests/diff_harness_sanity.rs` and `tests/diff_harness_finds_critical.rs` exist and pass.
+- [x] `cargo clippy -p afs-ld --all-targets -- -D warnings` is currently clean.
+
+Remaining closeout work:
+
+- [x] Explicitly downscope Sprint 0 docs so the current diff harness is described as synthetic until end-to-end linking exists.
+- [ ] Add tolerated-diff categories once real Mach-O-to-Mach-O comparisons exist.
+
+## Sprint 1
+
+Status: closed
+
+Validated:
+
+- [x] Mach-O constants are duplicated locally in `src/macho/constants.rs`.
+- [x] `MachHeader64` parsing exists and rejects malformed headers.
+- [x] Load-command dispatch exists and preserves unknown commands as raw bytes.
+- [x] Segment and section-header metadata parsing exists.
+- [x] `LC_BUILD_VERSION` and `LC_LINKER_OPTIMIZATION_HINT` decoding exists.
+- [x] `--dump` exists through `src/dump.rs` and `src/main.rs`.
+- [x] Corpus round-trip tests pass in `tests/reader_corpus_round_trip.rs`.
+
+Remaining closeout work:
+
+- [x] Add an `otool -lV` parity test for dumper output shape across the corpus.
+- [x] Add a panic-focused malformed-input stress pass beyond the current unit tests so the "no panics on malformed input" claim is defensible.
+
+## Sprint 2
+
+Status: closed
+
+Validated:
+
+- [x] Section classification exists in `src/section.rs`.
+- [x] `InputSection` carries section data and raw relocation bytes.
+- [x] `RawNlist` / `InputSymbol` parsing and classification exist in `src/symbol.rs`.
+- [x] Common symbols, weak flags, private externs, and indirect aliases are surfaced.
+- [x] `StringTable` exists and handles suffix-dedup overlaps.
+- [x] `DysymtabCmd` is parsed and exposed through `ObjectFile`.
+- [x] `ObjectFile` integrates header, commands, sections, symbols, strings, and dysymtab.
+
+Remaining closeout work:
+
+- [x] Add `nm -a` parity tests for symbol view and classification.
+- [x] Add `otool -r` parity checks for relocation-offset surfaces promised by Sprint 2, with section/load-command parity covered by the Sprint 1 `otool -lV` gate.
+- [x] Add stronger malformed-symbol / malformed-string-table stress coverage if we want the "never panics" bar to be explicit.
+
+## Sprint 3
+
+Status: closed enough for current closeout
+
+Validated:
+
+- [x] ARM64 relocation constants exist.
+- [x] Raw relocation parsing and writing exist.
+- [x] Fused `Reloc` form exists.
+- [x] `ADDEND` and `SUBTRACTOR + UNSIGNED` pairing is fused in `parse_relocs`.
+- [x] Validation logic exists in `validate_relocs`.
+- [x] Write-side round-trip support exists.
+- [x] Unit coverage is broad and current corpus relocation round-trips pass.
+
+Remaining closeout work:
+
+- [x] No audit-blocking work found for Sprint 3.
+
+## Sprint 4
+
+Status: closed
+
+Validated:
+
+- [x] BSD, SysV, and GNU-thin archive flavors are recognized.
+- [x] Archive headers and name decoding are implemented.
+- [x] Symbol-index parsing exists for BSD and SysV archives.
+- [x] Lazy member fetch exists via `fetch_object_defining`.
+- [x] `libarmfortas_rt.a` is exercised by `tests/archive_runtime.rs`.
+- [x] Archive dump mode exists via `--dump-archive`.
+
+Remaining closeout work:
+
+- [x] Implement one-level nested archive support (`.a` member inside `.a`) and preserve provenance for diagnostics.
+- [x] Formally treat `resolve::force_load_archive` / `force_load_all` as the Sprint 4 completion surface and document that surface instead of adding a parallel archive-only helper.
+- [x] Add `ar -t` shape/parity coverage for `--dump-archive`.
+
+## Sprint 5
+
+Status: partially closed
+
+Validated:
+
+- [x] `DylibFile` exists and parses binary `MH_DYLIB`.
+- [x] `LC_ID_DYLIB`, dependency dylib commands, ordinals, and rpaths are decoded.
+- [x] Export trie decoding exists with cycle/depth protection.
+- [x] Real clang-built dylib coverage exists in `tests/dylib_integration.rs`.
+- [x] Dylib dump mode exists via `--dump-dylib`.
+
+Remaining closeout work:
+
+- [ ] Prove recursive re-export / umbrella lookup behavior with a focused test, not just dependency collection.
+- [ ] Confirm the public dylib surface matches what Sprint 5 intended for re-exported symbols, not only direct exports.
+
+## Sprint 6
+
+Status: closed
+
+Validated:
+
+- [x] The custom YAML subset parser exists in `src/macho/tbd_yaml.rs`.
+- [x] TBD schema decoding exists in `src/macho/tbd.rs`.
+- [x] `DylibFile::from_tbd` exists and materializes TBDs into the same linker-facing surface.
+- [x] Real `libSystem.tbd` smoke/integration coverage exists in `tests/tbd_smoke.rs` and `tests/tbd_integration.rs`.
+- [x] TBD dump mode exists via `--dump-tbd`.
+
+Remaining closeout work:
+
+- [x] Implement SDK `-syslibroot` library search helpers for `.tbd` / `.dylib`.
+- [x] Implement framework search helpers promised by Sprint 6.
+- [x] Make target filtering fail loudly when the requested target is not exported, instead of only materializing matching targets when the caller already knows one exists.
+- [x] No further audit-blocking work found for Sprint 6 in the current helper/test surface.
+
+## Sprint 7
+
+Status: closed
+
+Validated:
+
+- [x] `Symbol` sum type exists with the planned major variants.
+- [x] `StringInterner`, opaque ids, and `SymbolTable` exist.
+- [x] The insertion matrix is heavily unit-tested.
+- [x] Weak/strong/common coalescing behavior is covered in unit tests.
+- [x] Alias-cycle detection and chain resolution exist.
+- [x] Transition logging exists.
+
+Remaining closeout work:
+
+- [ ] Add the differential weak-coalescing / duplicate-behavior coverage against system `ld` that Sprint 7 originally called for.
+
+## Sprint 8
+
+Status: closed
+
+Validated:
+
+- [x] Archive seeding, object seeding, and dylib seeding exist.
+- [x] Fixed-point archive fetch draining exists.
+- [x] `force_load_archive` and `force_load_all` helpers exist in `src/resolve.rs`.
+- [x] Undefined classification exists for `Error`, `Warning`, `Suppress`, and `DynamicLookup`.
+- [x] Did-you-mean support exists.
+- [x] Duplicate-symbol and undefined-symbol formatting helpers exist.
+- [x] Real integration coverage exists for archive pull plus unresolved-symbol reporting.
+
+Remaining closeout work:
+
+- [x] Add a real orchestration entrypoint for resolution (`seed -> optional force load -> drain -> classify`) that can be called as a coherent stage.
+- [x] Add option/state plumbing for `all_load`, `force_load`, and undefined treatment so resolution is not just a bag of helper APIs.
+- [x] Add an archive order-sensitivity test.
+- [x] Add dedicated tests for `force_load_archive` and `force_load_all`.
+- [x] Add dedicated tests for `UndefinedTreatment::Warning`, `Suppress`, and `DynamicLookup`.
+- [x] Add a dedicated test that unresolved weak refs stay accepted regardless of treatment.
+- [x] Tighten diagnostics toward the Sprint 8 format by carrying section/offset provenance and aggregating repeated relocation sites when available.
+
+## Sprint 9
+
+Status: closed
+
+Validated:
+
+- [x] Atom model and atom table exist.
+- [x] Section splitting at symbol boundaries exists.
+- [x] `.alt_entry` folding exists.
+- [x] CString atom splitting exists and is integration-tested.
+- [x] Compact-unwind atom splitting and `parent_of` wiring exist.
+- [x] Backpatching of `Symbol::Defined { atom }` exists.
+- [x] `N_NO_DEAD_STRIP` and weak-def flags are propagated into atom flags.
+- [x] Embedded payload addends on symbol-based data relocs are folded into local atom offsets or preserved on external refs.
+
+Remaining closeout work:
+
+- [x] Remap relocations from raw section/symbol referents into atom-aware referents.
+- [x] Add atom-local relocation storage or an equivalent per-atom relocation view.
+- [x] Ensure same-object references point at target atoms, not raw section offsets.
+- [x] Add a focused integration test proving a local branch or data reference resolves to the callee/target atom.
+- [x] Add a boundary-crossing reloc diagnostic test.
+- [x] Confirm no raw section-relative relocation state leaks into Sprint 10 inputs.
+- [x] No further audit-blocking work found for Sprint 9 in the current corpus and targeted local-addend probes.
+
+## Documentation Closeout
+
+- [x] Update `README.md` so it no longer says the crate is only Sprint 0 scaffolding.
+- [x] Refresh sprint docs whose deliverables have been implemented under a different surface than originally planned.
+- [x] Keep `CLAUDE.md` as the authority for discipline, but make user-facing docs match the actual code.
+
+## Verification Commands
+
+- [x] `cargo test -p afs-ld`
+- [x] `cargo clippy -p afs-ld --all-targets -- -D warnings`
+- [ ] Focused xcrun-backed checks when touching reader/resolve/atom/TBD/dylib paths:
+  - [x] `cargo test -p afs-ld --test reader_corpus_round_trip -- --nocapture`
+  - [x] `cargo test -p afs-ld --test resolve_integration -- --nocapture`
+  - [x] `cargo test -p afs-ld --test atom_integration -- --nocapture`
+  - [x] `cargo test -p afs-ld --test dylib_integration -- --nocapture`
+  - [x] `cargo test -p afs-ld --test tbd_integration -- --nocapture`
.docs/sprints/sprint00.md (modified)
@@ -88,14 +88,16 @@ pub fn link_both(case: &LinkCase) -> LinkOutputs;
 pub fn diff_macho(ours: &[u8], theirs: &[u8]) -> DiffReport;
 ```
 
-`DiffReport` categorizes byte differences as `Tolerated` (UUID, timestamp, temp-path hashes) or `Critical` (anything else). Critical diffs fail the test. `link_both` shells out to `ld` via `xcrun -f ld` so it picks up the active toolchain.
+`DiffReport` categorizes byte differences as `Tolerated` (UUID, timestamp, temp-path hashes) or `Critical` (anything else). Critical diffs fail the test.
+
+Closeout note: the current Sprint 0 surface is intentionally synthetic. `diff_macho` exists and is tested, but `link_both` remains a placeholder until afs-ld can emit real linked output. That means the current harness validates diff categorization logic, not end-to-end linker parity yet.
 
 ### 6. Skeleton CLI and first failing test
 
 - `afs-ld/src/args.rs`: hand-rolled argv parser stub that recognizes `-o`, `-e`, `-arch`, and positional inputs. Unknown flags error loudly with a hint.
 - `afs-ld/tests/reader_empty.rs`: attempts to link `0 inputs → empty output`, expects the diagnostic `"afs-ld: error: no input files"`. Passes today by producing that exact string.
-- `afs-ld/tests/diff_harness_sanity.rs`: runs the harness against a known-identical pair (two copies of the same pre-linked binary produced by `xcrun ld`) and expects zero diffs. Passes.
-- `afs-ld/tests/diff_harness_finds_critical.rs`: feeds the harness two binaries that differ in a non-tolerated byte range (e.g. different text bytes) and asserts the harness reports `Critical`. Passes.
+- `afs-ld/tests/diff_harness_sanity.rs`: exercises the diff surface against two identical synthetic byte slices and expects zero diffs. Passes.
+- `afs-ld/tests/diff_harness_finds_critical.rs`: feeds the harness two synthetic binaries that differ in a non-tolerated byte range and asserts the harness reports `Critical`. Passes.
 
 ## Testing Strategy
 
@@ -115,5 +117,5 @@ pub fn diff_macho(ours: &[u8], theirs: &[u8]) -> DiffReport;
 - `armfortas/Cargo.toml` lists `afs-ld` in `[workspace] members`.
 - `afs-ld/CLAUDE.md`, `README.md`, `Cargo.toml`, `src/lib.rs`, `src/main.rs`, `src/args.rs` all committed in the new repo.
 - `.refs/ld64/` and `.refs/mold/` cloned.
-- Differential harness runs, correctly reports zero diffs on identical binaries, correctly reports critical diffs on intentionally-different binaries.
+- Differential harness substrate runs, correctly reports zero diffs on identical byte slices, correctly reports critical diffs on intentionally-different byte slices.
 - `cargo test --workspace` green.
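The `Tolerated`-versus-`Critical` split described in the hunk above can be sketched in a few lines. This is a minimal illustration, not the crate's actual `DiffReport` API: it assumes the tolerated regions (UUID, timestamp, temp-path hashes) are known byte ranges, and the names `DiffKind`, `classify`, and `diff_bytes` are invented for the sketch.

```rust
// Byte offsets inside a known-variable range are Tolerated; any other
// differing byte is Critical and would fail the differential test.
#[derive(Debug, PartialEq)]
enum DiffKind {
    Tolerated,
    Critical,
}

fn classify(offset: usize, tolerated: &[std::ops::Range<usize>]) -> DiffKind {
    if tolerated.iter().any(|r| r.contains(&offset)) {
        DiffKind::Tolerated
    } else {
        DiffKind::Critical
    }
}

// Compare same-length byte slices and report every differing offset.
fn diff_bytes(
    ours: &[u8],
    theirs: &[u8],
    tolerated: &[std::ops::Range<usize>],
) -> Vec<(usize, DiffKind)> {
    ours.iter()
        .zip(theirs.iter())
        .enumerate()
        .filter(|(_, (a, b))| a != b)
        .map(|(i, _)| (i, classify(i, tolerated)))
        .collect()
}

fn main() {
    // Bytes 4..8 stand in for a UUID field that is allowed to differ.
    let ours = [1u8, 2, 3, 4, 9, 9, 9, 9];
    let theirs = [1u8, 2, 3, 4, 8, 8, 8, 8];
    let report = diff_bytes(&ours, &theirs, &[4..8]);
    assert_eq!(report.len(), 4);
    assert!(report.iter().all(|(_, k)| *k == DiffKind::Tolerated));
}
```

A report where every entry is `Tolerated` corresponds to a passing differential check; one `Critical` entry fails it.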
.docs/sprints/sprint01.md (modified)
@@ -6,6 +6,12 @@ Sprint 0 — crate, harness, references in place.
 ## Goals
 Read a Mach-O relocatable object file: parse the header and every load command afs-as emits. End state: given any `.o` in `afs-as/tests/corpus/`, afs-ld can pretty-print its structure and round-trip-compare it to a golden.
 
+Closeout note: alongside the original unit coverage, `tests/reader_malformed_stress.rs`
+now runs deterministic truncated/header-corruption cases over real corpus-built
+objects to defend the "no panics on malformed input" bar, and
+`tests/reader_tool_parity.rs` checks the `--dump` load-command surface against
+`otool -lV` across the afs-as corpus.
+
 ## Deliverables
 
 ### 1. Mach-O constants
.docs/sprints/sprint02.md (modified)
@@ -6,6 +6,13 @@ Sprint 1 — header + load commands parsed.
 ## Goals
 Decode section payloads, the symbol table (nlist_64), and the string table. Expose the full section/symbol/string model that later sprints build on.
 
+Closeout note: `tests/reader_malformed_stress.rs` now also covers malformed
+symbol/string-table variants derived from real corpus objects so the reader's
+symbol and string surfaces are exercised under targeted bad-input cases, not
+just hand-written unit fixtures. `tests/reader_tool_parity.rs` now also checks
+symbol classification against `nm -a` and raw relocation tables against
+`otool -r` across the afs-as corpus.
+
 ## Deliverables
 
 ### 1. Section attributes and kinds
.docs/sprints/sprint04.md (modified)
@@ -6,6 +6,13 @@ Sprints 1–3 — Mach-O reading complete.
 ## Goals
 Read static archives (`.a`) including the BSD, System V, and GNU-thin variants. Support lazy member fetching: a member is only parsed when an undefined symbol names it. This is the mechanism by which `libarmfortas_rt.a` gets pulled in.
 
+Closeout note: the force-load surface landed in the resolver as
+`resolve::force_load_archive` / `resolve::force_load_all`, and one-level
+nested archives are expanded through the fetched-member path with provenance
+chains such as `outer.a(inner.a)(foo.o)`. `--dump-archive` now intentionally
+prints the same member listing shape as `ar -t`, and parity is checked
+against both generated archives and `libarmfortas_rt.a` when available.
+
 ## Deliverables
 
 ### 1. Archive format recognizer
@@ -69,7 +76,10 @@ impl<'a> Archive<'a> {
 Returns `None` if the archive does not define `name`. Fetching an archive member memoizes: a second lookup for the same member returns a cached handle. The resolution pass (Sprint 8) is the only caller.
 
 ### 6. `-force_load` / `-all_load` support (semantics, not CLI yet)
-Archive has a `force_all(&mut self)` method that pre-fetches every member. Sprint 19 wires the CLI.
+Implemented via the resolver-level helpers
+`resolve::force_load_archive` / `resolve::force_load_all`, which pre-fetch
+archive members against the live linker input registry. Sprint 19 wires the
+CLI surface.
 
 ### 7. Archive-of-archives
 Rare but legal: a member can be another `.a`. Recurse one level. If a sub-archive defines `name`, the outer `fetch` returns the sub-member's object file and records a provenance chain for diagnostics.
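The memoized lazy fetch and the `outer.a(member)` provenance shape described in the sprint04 diff above can be modeled compactly. This is an illustrative sketch only: the real `Archive` works over raw archive bytes and returns parsed object handles, whereas here the "parse" step is simulated with a counter and the struct fields are invented for the example.

```rust
use std::collections::HashMap;

// Toy archive: member name -> symbols it defines, plus a memoization cache.
struct Archive {
    name: String,
    members: Vec<(String, Vec<String>)>, // member name -> defined symbols
    cache: HashMap<String, String>,      // member name -> memoized provenance
    parse_count: usize,                  // how many real parses happened
}

impl Archive {
    fn new(name: &str, members: Vec<(String, Vec<String>)>) -> Self {
        Archive {
            name: name.to_string(),
            members,
            cache: HashMap::new(),
            parse_count: 0,
        }
    }

    // Lazy fetch: only the member that defines `symbol` is parsed, and only
    // once; a second lookup for the same member hits the memoized handle.
    fn fetch_object_defining(&mut self, symbol: &str) -> Option<String> {
        let member = self
            .members
            .iter()
            .find(|(_, syms)| syms.iter().any(|s| s == symbol))?
            .0
            .clone();
        if let Some(prov) = self.cache.get(&member) {
            return Some(prov.clone());
        }
        self.parse_count += 1; // stands in for parsing the member's bytes
        let prov = format!("{}({})", self.name, member);
        self.cache.insert(member, prov.clone());
        Some(prov)
    }
}

fn main() {
    let mut ar = Archive::new(
        "libarmfortas_rt.a",
        vec![("io.o".to_string(), vec!["_afs_print".to_string()])],
    );
    let first = ar.fetch_object_defining("_afs_print").unwrap();
    assert_eq!(first, "libarmfortas_rt.a(io.o)");
    let second = ar.fetch_object_defining("_afs_print").unwrap();
    assert_eq!(first, second);
    assert_eq!(ar.parse_count, 1); // memoized: parsed once for two lookups
    assert!(ar.fetch_object_defining("_missing").is_none());
}
```

One-level nesting extends the same idea: a fetched member that is itself an archive contributes a longer provenance chain such as `outer.a(inner.a)(foo.o)`.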
.docs/sprints/sprint08.md (modified)
@@ -6,6 +6,11 @@ Sprint 7 — `SymbolTable` with insertion semantics.
 ## Goals
 Drive the symbol table to a fixed point: every undefined reference either resolves to a Defined (from an object), a Common (promoted in BSS), or a DylibImport (from a dylib/TBD), or raises a clear, actionable diagnostic. `-force_load` / `-all_load` / `-undefined <treatment>` are all handled.
 
+Closeout note: the implemented entrypoint is
+`resolve(inputs, table, opts) -> ResolutionReport`. The current library
+surface applies archive force-loading as archives are encountered in
+command-line order so left-to-right archive behavior stays explicit.
+
 ## Deliverables
 
 ### 1. Resolution algorithm
@@ -13,13 +18,10 @@ Drive the symbol table to a fixed point: every undefined reference either resolv
 
 ```rust
 pub fn resolve(inputs: &mut Inputs, table: &mut SymbolTable, opts: &LinkOptions)
-    -> Result<(), Vec<ResolveError>>
+    -> Result<ResolutionReport, ResolutionError>
 {
-    seed_table_with_objects_and_dylib_imports(inputs, table, opts);
-    if opts.all_load    { force_load_everything(inputs, table); }
-    for forced in &opts.force_load { force_load_one(inputs, table, forced); }
-    fixed_point_pull_from_archives(inputs, table);
-    classify_unresolved(table, opts);
+    seed_and_resolve_in_link_order(inputs, table, opts);
+    classify_unresolved(table, opts.undefined_treatment);
 }
 ```
 
@@ -43,7 +45,7 @@ Order matters: armfortas's driver currently passes `<objs> <runtime.a> -lSystem`
 ### 4. `-force_load` and `-all_load`
 - `-force_load <archive>`: pull every member of that archive before fixed-point.
 - `-all_load`: pull every member of every archive.
-- Both happen before the fixed-point loop so their transitively-pulled symbols feed into the same fixed point.
+- In the implemented surface these happen when the named archive is encountered in link order, which preserves left-to-right linker semantics while still feeding the same resolution/classification pipeline.
 
 ### 5. `-undefined <treatment>`
 After the fixed point, any still-Undefined entry is classified by the `-undefined` setting:
@@ -60,8 +62,8 @@ Undefined errors must cite every referrer input, not just one. Output format:
 
 ```
 afs-ld: error: undefined symbol: _afs_print
-      referenced by program.o(text section + 0x34)
-      referenced by runtime.o(text section + 0x120)
+      referenced by program.o(__TEXT,__text + 0x34)
+      referenced by runtime.o(__TEXT,__text + 0x120)
       (also via 2 relocations in libarmfortas_rt.a(io.o))
 Hint: did you mean _afs_print_real? (Levenshtein distance 5)
 ```
@@ -71,8 +73,8 @@ Did-you-mean uses a basic Levenshtein-3 search over defined symbols.
 
 ### 8. Diagnostics for duplicate strong
 ```
 afs-ld: error: duplicate symbol _foo
-  defined in: a.o (text + 0x0)
-  also in:    b.o (text + 0x0)
+  defined in: a.o (__TEXT,__text + 0x0)
+  also in:    b.o (__TEXT,__text + 0x0)
 ```
 
 No suggestion — two strong defs is a real ambiguity.
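The did-you-mean hint shown in the sprint08 diff above rests on a standard edit-distance search. The sketch below is a textbook dynamic-programming Levenshtein distance plus a small suggestion helper; the function names and the distance-budget parameter are illustrative, not the crate's actual implementation.

```rust
// Classic two-row DP: prev holds the previous row of the edit-distance table.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // Row 0 is 0..=b.len(): turning "" into a prefix of b is pure insertion.
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.iter().enumerate() {
        let mut cur = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let cost = if ca == cb { 0 } else { 1 };
            // substitution, insertion, deletion
            cur.push((prev[j] + cost).min(prev[j + 1] + 1).min(cur[j] + 1));
        }
        prev = cur;
    }
    prev[b.len()]
}

// Suggest the closest defined symbol within a small edit-distance budget.
fn did_you_mean<'a>(undefined: &str, defined: &[&'a str], max_dist: usize) -> Option<&'a str> {
    defined
        .iter()
        .map(|&d| (levenshtein(undefined, d), d))
        .filter(|&(dist, _)| dist <= max_dist)
        .min_by_key(|&(dist, _)| dist)
        .map(|(_, d)| d)
}

fn main() {
    assert_eq!(levenshtein("kitten", "sitting"), 3);
    // Matches the sample diagnostic: "_afs_print_real" is 5 edits away.
    assert_eq!(levenshtein("_afs_print", "_afs_print_real"), 5);
    assert_eq!(
        did_you_mean("_afs_print", &["_afs_print_real"], 5),
        Some("_afs_print_real")
    );
    assert_eq!(did_you_mean("_afs_print", &["_completely_unrelated_zz"], 3), None);
}
```

Note the sample diagnostic reports distance 5 while the prose mentions a Levenshtein-3 search, which is why the budget is a parameter here rather than a constant.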
.docs/sprints/sprint28.md (modified)
@@ -4,13 +4,20 @@
 Sprint 27 — correctness gate in place; can freely refactor for speed.
 
 ## Goals
-Make afs-ld fast enough to feel like a production tool. Target: within 2× of Apple `ld`'s wall time on the fortsh link. Mold demonstrates linkers can be very fast; we don't need mold's speed, but we need to not be painful.
+Make afs-ld fast enough to feel like a production tool. Sprint 28 establishes
+the profiling surface, parallelizes the obvious hot paths, and enforces
+hello/runtime-link budgets in CI. The fortsh 2× Apple `ld` gate remains the
+production target, but Sprint 29 owns the fortsh fixture and final comparison.
+Mold demonstrates linkers can be very fast; we don't need mold's speed, but we
+need to not be painful.
 
 ## Deliverables
 
 ### 1. Baseline profile
 
-Profile the fortsh link (Sprint 29 produces the fixture). Categorize wall time:
+Profile representative hello-world and runtime-archive links in Sprint 28.
+Sprint 29 extends the same profile surface to the fortsh link once the fixture
+exists. Categorize wall time:
 
 - Input parsing (Mach-O headers, sections, symbols, relocations).
 - Symbol resolution (hash-map probes, archive lookups).
@@ -37,11 +44,17 @@ One thread per 4 KiB page. SHA-256 is inherently sequential within a page but tr
 
 ### 5. Bump allocator for ephemeral data
 
-Parser produces many small allocations (strings, reloc lists, atom descriptors). A per-input arena avoids fragmentation and makes bulk drop free. Implement as `src/arena.rs` — a std-only `Vec<Box<[u8]>>` chunker.
+Deferred. The current Sprint 28 profile work did not prove allocation churn is
+the next limiting bucket after the parallel parsing/relocation/signature and
+string-table clone fixes. If Sprint 29's fortsh profile shows parser allocation
+pressure, implement `src/arena.rs` as a std-only `Vec<Box<[u8]>>` chunker.
 
 ### 6. mmap for large inputs
 
-`std::fs::File` + `memmap2`? No — memmap2 is an external crate. Use `libc::mmap` via an unsafe `src/mmap.rs` wrapper. Input files are always read-only; mmap saves a read syscall and lets us share parse state across threads cheaply. Fall back to `fs::read` for GNU-thin archive members whose external path doesn't mmap cleanly (rare).
+Deferred. Object/archive loading still uses `fs::read`; this keeps the Sprint 28
+closeout safe and std-only. If fortsh-sized inputs show file-read overhead as a
+real bucket in Sprint 29, add an unsafe `src/mmap.rs` wrapper and keep a
+`fs::read` fallback for archive members whose external path cannot be mapped.
 
 ### 7. Symbol-table hash map
 
@@ -49,7 +62,10 @@ Profile shows std `HashMap` is fine for our scale. If not: replace with an open-
 
 ### 8. String interner
 
-Single global `StringInterner` shared across inputs. Interning cost: one hash lookup per name. Optimize by batching per-input: each input parses its strings into a local table, then merges into the global interner in one pass.
+Deferred. Sprint 28 made the global string table thread-shareable and removed
+the cloned string-table offset map during output writing. Per-input local
+interners remain a candidate if Sprint 29 identifies symbol seeding as a
+fortsh-scale bottleneck.
 
 ### 9. No-alloc hot paths
 
@@ -57,15 +73,16 @@ Reloc application and chain construction should not allocate per-reloc. Prealloc
 
 ### 10. Benchmarks
 
-`afs-ld/bench/` (or a `#[bench]` behind `cargo +nightly bench`) with:
-- `bench_hello_world`: small, measures startup overhead.
-- `bench_runtime_link`: mid, measures symbol-table & reloc-apply.
-- `bench_fortsh_link`: large, measures end-to-end throughput.
+Sprint 28 uses CI-enforced integration benchmarks in `tests/perf_baseline.rs`:
+
+- `bench_hello_world_profile_reports_baseline_timings`: small, measures startup overhead.
+- `bench_runtime_link_profile_reports_baseline_timings`: mid, measures symbol-table, archive parsing, and reloc-apply.
+- `bench_fortsh_link`: deferred to Sprint 29 with the real fortsh fixture.
 
 Budget targets:
 - hello-world: ≤ 20 ms.
 - runtime link: ≤ 150 ms.
-- fortsh link: ≤ 2× Apple `ld`'s wall time on the same machine.
+- fortsh link: ≤ 2× Apple `ld`'s wall time on the same machine, enforced in Sprint 29.
 
 ### 11. Determinism preserved
@@ -73,14 +90,16 @@ Parallelism must not reorder output. Each worker produces a deterministic result
 
 ## Testing Strategy
 
-- Benchmarks land as regression gates: nightly CI records throughput; > 10% regression fails.
+- Benchmark gate: CI runs `tests/perf_baseline.rs` with hello/runtime budgets on every push and PR.
+- Nightly throughput recording and a relative >10% regression gate are deferred until the fortsh fixture lands in Sprint 29.
 - Determinism: 100 parallel runs of the same input, assert byte-identical output every time.
 - Sprint 27 parity must remain green — no correctness regression.
 - Single-threaded fallback (`-j 1`) for debugging.
 
 ## Definition of Done
 
-- fortsh link wall time within 2× of `ld`'s.
+- hello/runtime performance budgets are enforced in CI.
+- fortsh 2× comparison is explicitly handed to Sprint 29 with its fixture.
 - All Sprint 27 scenarios still byte-identical.
 - Determinism bulletproof across parallelism.
 - No external dependencies added.
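The deferred `src/arena.rs` idea from deliverable 5 above — a std-only `Vec<Box<[u8]>>` chunker — can be sketched as follows. This is an assumption-laden illustration, not the (not yet existing) real module: it returns `(chunk, offset)` handles and copies payloads in, where a real arena would likely hand out lifetime-bound slices.

```rust
// Bump-allocating chunker: payloads are copied into boxed chunks and the
// whole arena frees everything in one bulk drop.
struct Arena {
    chunks: Vec<Box<[u8]>>,
    used: usize,       // bytes consumed in the newest chunk
    chunk_size: usize, // default capacity of each chunk
}

impl Arena {
    fn new(chunk_size: usize) -> Self {
        Arena { chunks: Vec::new(), used: 0, chunk_size }
    }

    // Copy `bytes` into the newest chunk, opening a fresh chunk when the
    // current one cannot fit them; returns a (chunk index, offset) handle.
    fn alloc(&mut self, bytes: &[u8]) -> (usize, usize) {
        let need = bytes.len();
        if self.chunks.is_empty() || self.used + need > self.chunk_size {
            let size = self.chunk_size.max(need);
            self.chunks.push(vec![0u8; size].into_boxed_slice());
            self.used = 0;
        }
        let chunk = self.chunks.len() - 1;
        let start = self.used;
        self.chunks[chunk][start..start + need].copy_from_slice(bytes);
        self.used += need;
        (chunk, start)
    }

    fn get(&self, (chunk, start): (usize, usize), len: usize) -> &[u8] {
        &self.chunks[chunk][start..start + len]
    }
}

fn main() {
    let mut arena = Arena::new(16);
    let a = arena.alloc(b"hello");
    let b = arena.alloc(b"world, this is long"); // 19 bytes > 16: new chunk
    assert_eq!(arena.get(a, 5), &b"hello"[..]);
    assert_eq!(arena.get(b, 19), &b"world, this is long"[..]);
    assert_eq!(arena.chunks.len(), 2);
}
```

The bulk-drop property falls out of ownership: dropping the `Arena` drops every chunk at once, which is the "makes bulk drop free" behavior the deliverable describes.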
.github/workflows/parity-matrix.yml (modified)
@@ -23,6 +23,15 @@ jobs:
       - name: Run parity harness proof tests
         run: cargo test --test diff_harness_tolerates_known_linkedit --test parity_harness --test parity_canary -- --nocapture
 
+      - name: Run determinism gate
+        run: cargo test --test determinism -- --nocapture
+
+      - name: Run performance budget gate
+        env:
+          AFS_LD_HELLO_BUDGET_MS: "20"
+          AFS_LD_RUNTIME_BUDGET_MS: "150"
+        run: cargo test --test perf_baseline -- --nocapture
+
       - name: Run parity matrix
         env:
           PARITY_MATRIX_ARTIFACT_DIR: ${{ github.workspace }}/parity-matrix-artifacts
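On the consuming side, a budget gate like the `perf_baseline` test above can read the workflow's env vars with a local fallback. Only the variable names and default values (`AFS_LD_HELLO_BUDGET_MS`/20, `AFS_LD_RUNTIME_BUDGET_MS`/150) come from the workflow; `budget_ms` and the timing scaffold are an illustrative sketch, not the crate's actual test code.

```rust
use std::time::Instant;

// Read a millisecond budget from the environment, falling back to the
// documented default when the variable is unset or unparsable.
fn budget_ms(var: &str, default_ms: u64) -> u64 {
    std::env::var(var)
        .ok()
        .and_then(|v| v.parse().ok())
        .unwrap_or(default_ms)
}

fn main() {
    let hello = budget_ms("AFS_LD_HELLO_BUDGET_MS", 20);
    let runtime = budget_ms("AFS_LD_RUNTIME_BUDGET_MS", 150);
    let start = Instant::now();
    // ... the measured hello-world link would run here ...
    let elapsed = start.elapsed().as_millis() as u64;
    assert!(elapsed <= hello, "hello link took {elapsed} ms, budget {hello} ms");
    println!("budgets: hello={hello} ms, runtime={runtime} ms");
}
```

Under the workflow step above the CI-provided values apply; run locally without the env vars, the same test enforces the documented 20 ms / 150 ms defaults.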
AGENTS.md (added)
@@ -0,0 +1,349 @@
1
+# AFS-LD
2
+
3
+Local working guide for agents in `afs-ld`. Keep this file untracked.
4
+`CLAUDE.md` is the tracked, authoritative policy file; this document adds a
5
+reality-checked snapshot of the current implementation so we do not confuse the
6
+roadmap with shipped code.
7
+
8
+## Repository Context
9
+
10
+`afs-ld` is the standalone ARM64 Mach-O linker for the ARMFORTAS toolchain. It
11
+sits beside `afs-as` as a submodule in the `armfortas` workspace and is meant
12
+to replace Apple's `ld` for binaries produced by armfortas.
13
+
14
+The project boundary is intentionally clean:
15
+
16
+- `afs-as` emits `MH_OBJECT`.
17
+- `afs-ld` reads `.o`, `.a`, `.dylib`, and `.tbd`.
18
+- armfortas should eventually hand final linking to `afs-ld` rather than to
19
+  the system linker.
20
+
21
+The project is Mach-O only, macOS only, arm64 only, stdlib only.
22
+
23
+## Definition Of Done
24
+
25
+The real finish line is not "parses some objects" or "links hello world once."
26
+It is parity with Apple's `ld` for the binaries armfortas and fortsh need:
27
+
28
+- arm64 Mach-O executables and dylibs
29
+- static archive linking
30
+- dylib and TBD ingestion
31
+- dyld metadata that works on real macOS systems
32
+- ad-hoc signing so output executes on Apple Silicon
33
+- deterministic output
34
+- enough correctness to link fortsh without ARM-specific workarounds
35
+
36
+## Current Reality
37
+
38
+This repo is ahead of Sprint 0 scaffolding, but it is not yet a full linker.
39
+The roadmap in `.docs/overview.md` and `.docs/sprints/` is broader than the
40
+code that exists today.
41
+
42
+What is implemented now:
43
+
44
+- hand-rolled CLI parsing for a small flag subset plus dump modes
45
+- Mach-O header/load-command/section/symbol/string-table reading
46
+- relocation parsing, fusion, validation, and round-trip support
47
+- archive parsing and lazy member fetch support
48
+- binary dylib parsing and export-trie walking
49
+- TAPI TBD v4 parsing, including the custom YAML subset parser
50
+- linker-side symbol interning, symbol table modeling, and resolution passes
51
+- subsections-via-symbols atomization
52
+- `--dump`, `--dump-archive`, `--dump-dylib`, and `--dump-tbd`
53
+
54
+What is not implemented yet:
+
+- real `Linker::run` output production
+- output layout and Mach-O writing
+- dyld metadata synthesis
+- code signing
+- dead-strip / ICF / thunks
+- real differential linking against Apple `ld`
+- driver integration with armfortas
+- the full `ld`-compatible CLI surface described in Sprint 19
+
+Important practical note:
+
+- `src/lib.rs` still returns `LinkError::NotYetImplemented` for real link runs.
+- `tests/common/harness.rs::link_both` still panics because full end-to-end
+  linker execution has not landed.
+- `README.md` still describes the crate as "Sprint 0 scaffolding only," which is
+  now too pessimistic for the read-side code but still accurate for the actual
+  link-producing path.
+
+As of 2026-04-15 in this checkout, `cargo test -p afs-ld` is green.
+
+## Strengths
+
+- The read-side core is already substantial and well-tested.
+- The project has strong bespoke discipline: no `clap`, `serde`, `object`,
+  `goblin`, `byteorder`, or other format-parsing shortcuts.
+- Raw wire structures are modeled explicitly and usually paired with
+  round-trip-oriented tests.
+- The type modeling is strong: opaque ids, interned strings, explicit symbol
+  states, explicit atom ownership, explicit relocation referents.
+- Real-world fixtures are already in play: afs-as corpus objects,
+  `libarmfortas_rt.a`, `libSystem.tbd`, and small clang-built dylibs.
+- The codebase already separates concerns cleanly enough that writer/layout work
+  can land without tearing up the read-side foundation.
+- Dump modes make inspection easy and are useful while the full writer does not
+  exist yet.
+
+## Weaknesses And Risk Areas
+
+- The actual link-producing pipeline does not exist yet, so the hardest parity
+  bugs are still ahead of us.
+- Some tracked docs are aspirational. `.docs/overview.md` is the intended end
+  state, not a guarantee that every listed module already exists.
+- `README.md` is stale in the opposite direction: it understates how much
+  read-side work has landed.
+- The current diagnostics surface is still minimal. `src/diag.rs` only prints
+  `afs-ld: error: ...`; the richer caret diagnostics are planned, not present.
+- The CLI surface is intentionally tiny right now. Any work that assumes
+  `ld`-compatibility must start by checking `src/args.rs`, not by trusting the
+  sprint plan.
+- Performance characteristics are mostly unknown because the writer, layout, and
+  full-link path are not in place yet.
+- The differential harness is only half-built: the diff engine exists, but the
+  "run both linkers" machinery is not wired.
+- Several future modules named in the roadmap do not exist yet:
+  `layout.rs`, `driver.rs`, `map.rs`, `gc.rs`, `icf.rs`, `synth/`,
+  `macho/writer.rs`, and the code-signing path are all still planned work.
+
+## Build And Test
+
+Primary commands:
+
+```bash
+cargo build -p afs-ld
+cargo test -p afs-ld
+cargo clippy -p afs-ld --all-targets -- -D warnings
+```
+
+Useful targeted commands:
+
+```bash
+cargo test --lib -p afs-ld
+cargo test --test reader_corpus_round_trip -p afs-ld
+cargo test --test archive_runtime -p afs-ld
+cargo test --test dylib_integration -p afs-ld
+cargo test --test tbd_integration -p afs-ld
+cargo test --test resolve_integration -p afs-ld
+cargo test --test atom_integration -p afs-ld
+cargo test -p afs-ld -- <substring>
+```
+
+Environment assumptions:
+
+- macOS on Apple Silicon
+- Xcode command-line tools available through `xcrun`
+- access to the parent workspace, especially `runtime/` and `.refs/`
+
+Integration tests already shell out to system tools in a few places. Do not
+replace those with fake fixtures if a real toolchain interaction is the thing
+being tested.
+
+## Project Structure
+
+Actual source tree today:
+
+```text
+afs-ld/
+├── CLAUDE.md
+├── README.md
+├── .docs/
+│   ├── overview.md
+│   └── sprints/
+├── src/
+│   ├── archive.rs
+│   ├── args.rs
+│   ├── atom.rs
+│   ├── diag.rs
+│   ├── dump.rs
+│   ├── input.rs
+│   ├── leb.rs
+│   ├── lib.rs
+│   ├── main.rs
+│   ├── resolve.rs
+│   ├── section.rs
+│   ├── string_table.rs
+│   ├── symbol.rs
+│   ├── macho/
+│   │   ├── constants.rs
+│   │   ├── dylib.rs
+│   │   ├── exports.rs
+│   │   ├── reader.rs
+│   │   ├── tbd.rs
+│   │   └── tbd_yaml.rs
+│   └── reloc/
+│       └── mod.rs
+└── tests/
+    ├── common/harness.rs
+    ├── archive_runtime.rs
+    ├── atom_integration.rs
+    ├── diff_harness_*.rs
+    ├── dylib_integration.rs
+    ├── reader_*.rs
+    ├── resolve_integration.rs
+    └── tbd_*.rs
+```
+
+Planned future modules listed in the docs should be treated as design intent,
+not as present-tense implementation.
+
+## Implemented Pipeline Vs Planned Pipeline
+
+Implemented today:
+
+```text
+argv
+  -> args.rs
+  -> dump/read paths
+  -> archive/object/dylib/TBD ingestion
+  -> symbol/section/reloc decoding
+  -> resolve.rs
+  -> atom.rs
+```
+
+Current real-link path:
+
+```text
+argv -> args.rs -> Linker::run -> NotYetImplemented
+```
+
+Planned end-to-end pipeline from the roadmap:
+
+```text
+args -> inputs -> resolve -> atomize -> layout -> apply relocs
+     -> synth sections -> write -> sign
+```
+
+When you are planning work, always identify which of those stages is real in
+this checkout and which stage is still only described in docs.
+
+## Development Guidance
+
+### 1. Trust code and tests over roadmap prose
+
+Read these in order before substantial work:
+
+1. `CLAUDE.md`
+2. `.docs/overview.md`
+3. the relevant sprint file in `.docs/sprints/`
+4. the actual Rust module you will touch
+5. the tests covering that module
+
+If the docs and the code disagree, treat the code plus tests as the truth about
+what exists today, then decide whether the docs need to be refreshed.
+
+### 2. Keep the bespoke contract intact
+
+- Stdlib only unless a dependency discussion happens first.
+- Do not couple afs-ld to afs-as at a Rust type level.
+- Duplicate Mach-O constants locally when needed.
+- Do not hide format details behind clever abstractions that erase wire truth.
+
+### 3. Preserve the wire
+
+- Keep raw bytes or raw fields accessible when lossless re-emission matters.
+- Prefer explicit parse and write pairs for on-disk structures.
+- Avoid converting fixed-size or padded wire data into lossy higher-level forms
+  unless the raw representation is still available somewhere.
+- If a new decoder lands, pair it with tests that prove it round-trips or at
+  least preserves the exact bytes relevant to the current stage.
+
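The parse/write discipline above can be sketched as a fixed-size wire struct with an explicit decode/encode pair, checked by a byte-exact round trip; `WireHeader` is invented for illustration, not one of the crate's real structures:

```rust
// A hedged sketch: parse and write are explicit inverses over raw bytes,
// so a round-trip test can prove nothing was lost.
#[derive(Debug, PartialEq, Eq)]
struct WireHeader {
    magic: u32,
    flags: u32,
}

impl WireHeader {
    fn parse(bytes: &[u8; 8]) -> WireHeader {
        WireHeader {
            magic: u32::from_le_bytes(bytes[0..4].try_into().unwrap()),
            flags: u32::from_le_bytes(bytes[4..8].try_into().unwrap()),
        }
    }

    fn write(&self) -> [u8; 8] {
        let mut out = [0u8; 8];
        out[0..4].copy_from_slice(&self.magic.to_le_bytes());
        out[4..8].copy_from_slice(&self.flags.to_le_bytes());
        out
    }
}

fn main() {
    let raw = [0xcf, 0xfa, 0xed, 0xfe, 0x07, 0x00, 0x00, 0x00];
    let parsed = WireHeader::parse(&raw);
    // The write side reproduces the exact input bytes: lossless round trip.
    assert_eq!(parsed.write(), raw);
}
```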
+### 4. Be explicit about incomplete work
+
+- Hard errors are better than silent wrong answers.
+- If something is not implemented, say so directly.
+- Do not introduce "temporary" behavior that quietly emits malformed Mach-O.
+- Do not soften a missing feature into a no-op unless the flag or structure is
+  explicitly intended to be ignored.
+
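The fail-loudly rule can be sketched in the spirit of the crate's `LinkError::NotYetImplemented`; the enum and function here are stand-ins for illustration, not afs-ld's real API:

```rust
// Sketch: an unimplemented path returns a hard, named error instead of
// quietly producing malformed output.
#[derive(Debug, PartialEq, Eq)]
enum LinkError {
    NotYetImplemented(&'static str),
}

fn run_link(produce_output: bool) -> Result<(), LinkError> {
    if produce_output {
        // A hard error beats silently emitting malformed Mach-O.
        return Err(LinkError::NotYetImplemented("output writing"));
    }
    Ok(())
}

fn main() {
    assert_eq!(
        run_link(true),
        Err(LinkError::NotYetImplemented("output writing"))
    );
    assert_eq!(run_link(false), Ok(()));
}
```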
+### 5. Exhaustive matches matter
+
+- Prefer enums for wire forms and linker-side states.
+- Avoid catch-all `_` arms in production matches when a new variant should force
+  the compiler to help us.
+- When adding a new variant, update every relevant match deliberately.
+
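The discipline above can be sketched as follows; the enum is illustrative, not a real afs-ld type:

```rust
// With no `_` arm, adding a new SymState variant becomes a compile error
// at every match that must handle it, so the compiler finds every site.
#[derive(Debug, Clone, Copy)]
enum SymState {
    Undefined,
    Defined,
    Common,
}

fn describe(state: SymState) -> &'static str {
    // Deliberately no catch-all arm.
    match state {
        SymState::Undefined => "undefined",
        SymState::Defined => "defined",
        SymState::Common => "common",
    }
}

fn main() {
    assert_eq!(describe(SymState::Common), "common");
    assert_eq!(describe(SymState::Undefined), "undefined");
}
```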
+### 6. Keep dump surfaces useful
+
+- `--dump*` modes are an active debugging tool, not a side feature.
+- When new reader functionality lands, extend the corresponding dump output.
+- If you add a new parsed field but the dump cannot show it, the repo loses one
+  of its best inspection surfaces.
+
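Keeping dump output in lockstep with the reader can be sketched like this: every field a reader parses gets a deterministic, greppable line. The struct and field names are invented for illustration:

```rust
// Sketch: one stable formatted line per parsed structure, so new fields
// must show up here or the inspection surface silently degrades.
struct ParsedSection {
    name: String,
    size: u64,
    align: u32,
}

fn dump_section(section: &ParsedSection) -> String {
    format!(
        "section {} size=0x{:x} align=2^{}",
        section.name, section.size, section.align
    )
}

fn main() {
    let text = ParsedSection {
        name: "__text".to_string(),
        size: 0x40,
        align: 4,
    };
    assert_eq!(dump_section(&text), "section __text size=0x40 align=2^4");
}
```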
+### 7. Respect deterministic behavior
+
+- Avoid nondeterministic iteration when output order matters.
+- Avoid timestamps, random ids, or unstable hashing in any future write path.
+- When adding diagnostics, keep them stable and testable.
+
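The iteration-order point can be sketched with the standard library: a `BTreeMap` iterates in key order, so rendered output is identical across runs, where a `HashMap` would not be. The `render` helper is illustrative:

```rust
use std::collections::BTreeMap;

// Sketch: ordered iteration makes the output stable and testable.
fn render(symbols: &BTreeMap<String, u64>) -> String {
    symbols
        .iter()
        .map(|(name, addr)| format!("{name}=0x{addr:x}"))
        .collect::<Vec<_>>()
        .join(",")
}

fn main() {
    let mut syms = BTreeMap::new();
    syms.insert("_start".to_string(), 0x1000);
    syms.insert("_main".to_string(), 0x2000);
    // Key order, not insertion order: the same output every run.
    assert_eq!(render(&syms), "_main=0x2000,_start=0x1000");
}
```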
+## Testing Practices
+
+- Every bug fix gets a regression test.
+- New parser behavior should land with unit tests close to the module.
+- When touching integration behavior, prefer real fixtures over mocked ones.
+- For archive work, look first at `tests/archive_runtime.rs`.
+- For dylib and TBD work, look first at `tests/dylib_integration.rs`,
+  `tests/tbd_integration.rs`, and `tests/tbd_smoke.rs`.
+- For reader invariants, `tests/reader_corpus_round_trip.rs` is a key guardrail.
+- For resolution and atomization, `tests/resolve_integration.rs` and
+  `tests/atom_integration.rs` should move with the code.
+- If you add future write-side functionality, extend the differential harness
+  rather than building a parallel ad hoc test path.
+
+Run focused tests first, then widen:
+
+- module-local or single integration test while developing
+- `cargo test -p afs-ld` before handing work off
+- `cargo clippy -p afs-ld --all-targets -- -D warnings` when changing code paths
+  broadly enough to justify it
+
+## Documentation Practices
+
+- `CLAUDE.md` is policy and development discipline.
+- `.docs/overview.md` is the intended architecture and scope.
+- `.docs/sprints/` is the staged roadmap.
+- `README.md` is user-facing and currently stale relative to the read-side code.
+
+When a change materially shifts reality, update the tracked docs that are now
+misleading. This is especially important in this repo because the roadmap is
+ambitious and can otherwise create false assumptions for future work.
+
+## References
+
+Use the parent repository's references when you need to confirm Mach-O or linker
+behavior instead of inventing from memory:
+
+- `.refs/llvm/lld/MachO/` for architecture and pass structure
+- `.refs/ld64/` for Apple-parity edge cases
+- `.refs/mold/` for performance ideas and comparative implementation choices
+
+Also use Apple's Mach-O and arm64 relocation headers as the numeric source of
+truth for constants mirrored in `src/macho/constants.rs`.
+
+## Working Style For This Repo
+
+- Prefer small, reviewable changes.
+- Keep commit messages terse and imperative.
+- Do not mention sprint numbers in commit subjects.
+- Avoid monolithic "land the whole linker" changes; the sprint plan is granular
+  for a reason.
+- Before implementing a planned module from the roadmap, make sure the crate
+  actually has the prerequisites the sprint assumed.
+- If you are about to say "the docs say this exists," stop and confirm with
+  `ls`, `rg`, and the tests.
+
+## Practical Shortcuts
+
+- Use `rg --files` and `rg` first; the repo is small enough that this is fast
+  and keeps context grounded in the actual tree.
+- For current status, start with `src/lib.rs`, `src/main.rs`, `src/args.rs`,
+  `tests/common/harness.rs`, and `README.md`.
+- For architectural intent, read `.docs/overview.md` and the relevant sprint
+  file next.
+
+That order will save a lot of confusion.
src/args.rs (modified)
@@ -55,6 +55,7 @@ const KNOWN_FLAGS: &[&str] = &[
     "-dylib",
     "-all_load",
     "-force_load",
+    "-j",
     "--dump",
     "--dump-archive",
     "--dump-dylib",
@@ -141,6 +142,24 @@ fn parse_version_component(flag: &str, value: &str) -> Result<u32, ArgsError> {
     Ok((major << 16) | ((minor & 0xff) << 8) | (patch & 0xff))
 }
 
+fn parse_jobs(value: &str) -> Result<usize, ArgsError> {
+    let jobs = value
+        .parse::<usize>()
+        .map_err(|_| ArgsError::InvalidValue {
+            flag: "-j".into(),
+            value: value.to_string(),
+            expected: "positive integer job count".into(),
+        })?;
+    if jobs == 0 {
+        return Err(ArgsError::InvalidValue {
+            flag: "-j".into(),
+            value: value.to_string(),
+            expected: "positive integer job count".into(),
+        });
+    }
+    Ok(jobs)
+}
+
 pub fn parse(argv: &[String]) -> Result<LinkOptions, ArgsError> {
     let normalized = normalize_wl(argv);
     let mut opts = LinkOptions::default();
@@ -395,6 +414,12 @@ pub fn parse(argv: &[String]) -> Result<LinkOptions, ArgsError> {
                         ArgsError::MissingValue("-force_load".into())
                     })?));
             }
+            "-j" => {
+                let value = it
+                    .next()
+                    .ok_or_else(|| ArgsError::MissingValue("-j".into()))?;
+                opts.jobs = Some(parse_jobs(value)?);
+            }
             "--dump" => {
                 opts.dump = Some(PathBuf::from(
                     it.next()
@@ -785,6 +810,41 @@ mod tests {
         assert_eq!(opts.inputs, vec![PathBuf::from("main.o")]);
     }
 
+    #[test]
+    fn jobs_flag_records_positive_worker_limit() {
+        let opts = parse(&argv(&["-j", "1", "main.o"])).unwrap();
+        assert_eq!(opts.jobs, Some(1));
+        assert_eq!(opts.inputs, vec![PathBuf::from("main.o")]);
+    }
+
+    #[test]
+    fn jobs_flag_rejects_zero_or_non_numeric_values() {
+        let err = parse(&argv(&["-j", "0"])).unwrap_err();
+        assert!(matches!(
+            err,
+            ArgsError::InvalidValue {
+                ref flag,
+                ref value,
+                ..
+            } if flag == "-j" && value == "0"
+        ));
+        let err = parse(&argv(&["-j", "many"])).unwrap_err();
+        assert!(matches!(
+            err,
+            ArgsError::InvalidValue {
+                ref flag,
+                ref value,
+                ..
+            } if flag == "-j" && value == "many"
+        ));
+    }
+
+    #[test]
+    fn missing_jobs_value_errors() {
+        let err = parse(&argv(&["-j"])).unwrap_err();
+        assert!(matches!(err, ArgsError::MissingValue(ref f) if f == "-j"));
+    }
+
     #[test]
     fn missing_force_load_value_errors() {
         let err = parse(&argv(&["-force_load"])).unwrap_err();
src/lib.rs (modified)
@@ -26,22 +26,29 @@ pub mod why_live;
2626
 
2727
 use std::os::unix::fs::PermissionsExt;
2828
 use std::path::PathBuf;
29
+use std::sync::{mpsc, Arc, Mutex};
30
+use std::thread;
2931
 use std::time::{Duration, Instant};
30
-use std::{fs, io};
32
+use std::{collections::VecDeque, fs, io};
3133
 
34
+use archive::Archive;
3235
 use atom::{atomize_object, backpatch_symbol_atoms, AtomTable};
3336
 use icf::IcfError;
37
+use input::ObjectFile;
3438
 use layout::{ExtraLayoutSections, Layout, LayoutInput};
3539
 use macho::dylib::{DylibDependency, DylibFile, DylibLoadKind};
3640
 use macho::reader::ReadError;
37
-use macho::tbd::{parse_tbd, parse_version, Arch, Platform, Target};
41
+use macho::tbd::{
42
+    parse_tbd_for_target, parse_tbd_metadata_for_target, parse_version, Arch, Platform, Target,
43
+};
3844
 use reloc::arm64::RelocError;
3945
 use resolve::{
4046
     classify_unresolved, drain_fetches, find_archive_by_path, force_load_all, force_load_archive,
4147
     format_duplicate_diagnostic, format_undefined_diagnostic, format_undefined_warning_diagnostic,
42
-    seed_all, DrainReport, DylibLoadMeta, InputAddError, Inputs, Symbol, SymbolTable,
48
+    seed_all, DrainReport, DylibLoadMeta, InputAddError, InputId, Inputs, Symbol, SymbolTable,
4349
     UndefinedTreatment,
4450
 };
51
+use symbol::SymKind;
4552
 
4653
 const DEFAULT_TBD_VERSION: u32 = 1 << 16;
4754
 const THUNK_PLAN_MAX_ITERATIONS: usize = 16;
@@ -118,6 +125,7 @@ pub struct LinkOptions {
118125
     pub fixup_chains: bool,
119126
     pub all_load: bool,
120127
     pub force_load_archives: Vec<PathBuf>,
128
+    pub jobs: Option<usize>,
121129
     pub kind: OutputKind,
122130
     /// When set, afs-ld operates in dump mode and prints the given file's
123131
     /// header + load commands instead of linking.
@@ -169,6 +177,7 @@ impl Default for LinkOptions {
169177
             fixup_chains: false,
170178
             all_load: false,
171179
             force_load_archives: Vec::new(),
180
+            jobs: None,
172181
             kind: OutputKind::Executable,
173182
             dump: None,
174183
             dump_archive: None,
@@ -178,6 +187,18 @@ impl Default for LinkOptions {
178187
     }
179188
 }
180189
 
190
+impl LinkOptions {
191
+    pub fn parallel_jobs(&self) -> usize {
192
+        self.jobs
193
+            .unwrap_or_else(|| {
194
+                thread::available_parallelism()
195
+                    .map(usize::from)
196
+                    .unwrap_or(1)
197
+            })
198
+            .max(1)
199
+    }
200
+}
201
+
181202
 #[derive(Debug)]
182203
 pub enum LinkError {
183204
     /// No input files were provided on the command line.
@@ -209,9 +230,22 @@ pub enum LinkError {
209230
 #[derive(Debug, Clone, Default, PartialEq, Eq)]
210231
 pub struct LinkPhaseTimings {
211232
     pub input_parsing: Duration,
233
+    pub input_read: Duration,
234
+    pub input_object_parse: Duration,
235
+    pub input_archive_parse: Duration,
236
+    pub input_dylib_parse: Duration,
237
+    pub input_tbd_decode: Duration,
238
+    pub input_tbd_materialize: Duration,
239
+    pub input_reloc_parse: Duration,
212240
     pub symbol_resolution: Duration,
213241
     pub atomization: Duration,
214242
     pub layout: Duration,
243
+    pub layout_entry_lookup: Duration,
244
+    pub layout_dead_strip: Duration,
245
+    pub layout_icf: Duration,
246
+    pub layout_synthetic_plan: Duration,
247
+    pub layout_build: Duration,
248
+    pub layout_thunk_plan: Duration,
215249
     pub synth_sections: Duration,
216250
     pub synth_linkedit_finalize: Duration,
217251
     pub synth_linkedit_symbol_plan: Duration,
@@ -219,6 +253,9 @@ pub struct LinkPhaseTimings {
219253
     pub synth_linkedit_symbol_plan_globals: Duration,
220254
     pub synth_linkedit_symbol_plan_strtab: Duration,
221255
     pub synth_linkedit_dyld_info: Duration,
256
+    pub synth_linkedit_dyld_bind: Duration,
257
+    pub synth_linkedit_dyld_rebase: Duration,
258
+    pub synth_linkedit_dyld_export: Duration,
222259
     pub synth_linkedit_metadata_tables: Duration,
223260
     pub synth_linkedit_code_signature: Duration,
224261
     pub synth_unwind: Duration,
@@ -236,6 +273,25 @@ impl LinkPhaseTimings {
236273
             + self.reloc_apply
237274
             + self.write_output
238275
     }
276
+
277
+    fn add_input_load(&mut self, timings: InputLoadTimings) {
278
+        self.input_read += timings.read;
279
+        self.input_object_parse += timings.object_parse;
280
+        self.input_archive_parse += timings.archive_parse;
281
+        self.input_dylib_parse += timings.dylib_parse;
282
+        self.input_tbd_decode += timings.tbd_decode;
283
+        self.input_tbd_materialize += timings.tbd_materialize;
284
+    }
285
+}
286
+
287
+#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
288
+struct InputLoadTimings {
289
+    read: Duration,
290
+    object_parse: Duration,
291
+    archive_parse: Duration,
292
+    dylib_parse: Duration,
293
+    tbd_decode: Duration,
294
+    tbd_materialize: Duration,
239295
 }
240296
 
241297
 #[derive(Debug, Clone, PartialEq, Eq)]
@@ -403,6 +459,7 @@ impl Linker {
403459
         if opts.inputs.is_empty() && opts.library_names.is_empty() && opts.frameworks.is_empty() {
404460
             return Err(LinkError::NoInputs);
405461
         }
462
+        let parallel_jobs = opts.parallel_jobs();
406463
 
407464
         if let Some(arch) = &opts.arch {
408465
             if arch != "arm64" {
@@ -426,7 +483,14 @@ impl Linker {
426483
             );
427484
         }
428485
 
429
-        let mut load_paths = opts.inputs.clone();
486
+        let mut load_paths = Vec::new();
487
+        let mut positional_dylibs = Vec::new();
488
+        for path in &opts.inputs {
489
+            match path.extension().and_then(|ext| ext.to_str()) {
490
+                Some("dylib" | "tbd") => positional_dylibs.push(path.clone()),
491
+                _ => load_paths.push(path.clone()),
492
+            }
493
+        }
430494
         let mut dylib_load_kinds = std::collections::HashMap::new();
431495
         for name in &opts.library_names {
432496
             let path = resolve_library_input(opts, name)?;
@@ -445,14 +509,36 @@ impl Linker {
445509
             );
446510
             load_paths.push(path);
447511
         }
512
+        load_paths.extend(positional_dylibs);
448513
 
449514
         let mut inputs = Inputs::new();
515
+        let mut deferred_dylibs = Vec::new();
516
+        let mut initial_loads = Vec::new();
450517
         let phase_started = Instant::now();
451518
         for (load_order, path) in load_paths.iter().enumerate() {
519
+            if matches!(
520
+                path.extension().and_then(|ext| ext.to_str()),
521
+                Some("dylib" | "tbd")
522
+            ) {
523
+                deferred_dylibs.push((load_order, path.clone()));
524
+                continue;
525
+            }
452526
             if opts.trace_inputs {
453527
                 eprintln!("afs-ld: loading {}", path.display());
454528
             }
455
-            register_input(&mut inputs, path, load_order)?;
529
+            initial_loads.push((load_order, path.clone()));
530
+        }
531
+        for loaded in load_initial_inputs(initial_loads, parallel_jobs)? {
532
+            let timings = register_loaded_initial_input(&mut inputs, loaded);
533
+            phases.add_input_load(timings);
534
+        }
535
+        let include_tbd_exports = inputs_may_need_dylib_exports(&inputs)?;
536
+        for (load_order, path) in &deferred_dylibs {
537
+            if opts.trace_inputs {
538
+                eprintln!("afs-ld: loading {}", path.display());
539
+            }
540
+            let timings = register_input(&mut inputs, path, *load_order, include_tbd_exports)?;
541
+            phases.add_input_load(timings);
456542
         }
457543
         phases.input_parsing = phase_started.elapsed();
458544
 
@@ -469,13 +555,24 @@ impl Linker {
469555
 
470556
         let mut force_report = DrainReport::default();
471557
         if opts.all_load {
472
-            force_load_all(&mut inputs, &mut sym_table, &mut force_report)?;
558
+            force_load_all(
559
+                &mut inputs,
560
+                &mut sym_table,
561
+                &mut force_report,
562
+                parallel_jobs,
563
+            )?;
473564
         }
474565
         for archive_path in &opts.force_load_archives {
475566
             let Some(archive_id) = find_archive_by_path(&inputs, archive_path) else {
476567
                 return Err(LinkError::ForceLoadNotArchive(archive_path.clone()));
477568
             };
478
-            force_load_archive(&mut inputs, &mut sym_table, archive_id, &mut force_report)?;
569
+            force_load_archive(
570
+                &mut inputs,
571
+                &mut sym_table,
572
+                archive_id,
573
+                &mut force_report,
574
+                parallel_jobs,
575
+            )?;
479576
         }
480577
         if opts.trace_inputs {
481578
             for path in &force_report.loaded_paths {
@@ -490,7 +587,12 @@ impl Linker {
490587
             return Err(LinkError::DuplicateSymbols(msg));
491588
         }
492589
 
493
-        let drain_report = drain_fetches(&mut inputs, &mut sym_table, seed_report.pending_fetches)?;
590
+        let drain_report = drain_fetches(
591
+            &mut inputs,
592
+            &mut sym_table,
593
+            seed_report.pending_fetches,
594
+            parallel_jobs,
595
+        )?;
494596
         if opts.trace_inputs {
495597
             for path in &drain_report.loaded_paths {
496598
                 eprintln!("afs-ld: loading {}", path.display());
@@ -531,14 +633,8 @@ impl Linker {
531633
         for idx in 0..inputs.objects.len() {
532634
             let input_id = resolve::InputId(idx as u32);
533635
             let obj = inputs.object_file(input_id)?;
534
-            let atomization = atomize_object(input_id, &obj, &mut atom_table);
535
-            backpatch_symbol_atoms(
536
-                &atomization,
537
-                input_id,
538
-                &obj,
539
-                &mut sym_table,
540
-                &mut atom_table,
541
-            );
636
+            let atomization = atomize_object(input_id, obj, &mut atom_table);
637
+            backpatch_symbol_atoms(&atomization, input_id, obj, &mut sym_table, &mut atom_table);
542638
             objects.push((input_id, obj));
543639
         }
544640
         phases.atomization = phase_started.elapsed();
@@ -574,9 +670,14 @@ impl Linker {
574670
         }
575671
         let phase_started = Instant::now();
576672
         let parsed_relocs = macho::writer::build_parsed_reloc_cache(&layout_inputs)?;
577
-        phases.input_parsing += phase_started.elapsed();
673
+        let elapsed = phase_started.elapsed();
674
+        phases.input_reloc_parse += elapsed;
675
+        phases.input_parsing += elapsed;
676
+        let layout_started = Instant::now();
578677
         let phase_started = Instant::now();
579678
         let entry_symbol = find_entry_symbol_id(opts, &sym_table)?;
679
+        phases.layout_entry_lookup = phase_started.elapsed();
680
+        let phase_started = Instant::now();
580681
         let dead_strip = opts.dead_strip.then(|| {
581682
             why_live::DeadStripAnalysis::build(
582683
                 opts,
@@ -586,6 +687,8 @@ impl Linker {
586687
                 entry_symbol,
587688
             )
588689
         });
690
+        phases.layout_dead_strip = phase_started.elapsed();
691
+        let phase_started = Instant::now();
589692
         let icf = (opts.icf_mode == IcfMode::Safe)
590693
             .then(|| {
591694
                 icf::fold_safe(
@@ -596,19 +699,24 @@ impl Linker {
596699
                 )
597700
             })
598701
             .transpose()?;
702
+        phases.layout_icf = phase_started.elapsed();
599703
         let kept_atoms = if let Some(icf) = &icf {
600704
             Some(icf.kept_atoms())
601705
         } else {
602706
             dead_strip.as_ref().map(|analysis| analysis.live_atoms())
603707
         };
604
-        let synthetic_plan = synth::SyntheticPlan::build_filtered(
708
+        let phase_started = Instant::now();
709
+        let synthetic_plan = synth::SyntheticPlan::build_filtered_with_relocs(
605710
             &layout_inputs,
606711
             &atom_table,
607712
             &mut sym_table,
608713
             &inputs.dylibs,
609714
             kept_atoms,
715
+            &parsed_relocs,
610716
         )?;
717
+        phases.layout_synthetic_plan = phase_started.elapsed();
611718
         let icf_redirects = icf.as_ref().map(|plan| plan.redirects());
719
+        let phase_started = Instant::now();
612720
         let mut layout = Layout::build_with_synthetics_filtered(
613721
             opts.kind,
614722
             &layout_inputs,
@@ -617,18 +725,24 @@ impl Linker {
617725
             Some(&synthetic_plan),
618726
             kept_atoms,
619727
         );
728
+        phases.layout_build += phase_started.elapsed();
620729
         let mut thunk_plan = None;
621730
         let mut thunk_converged = false;
622731
         for _ in 0..THUNK_PLAN_MAX_ITERATIONS {
732
+            let phase_started = Instant::now();
623733
             let next_plan = reloc::arm64::plan_thunks(
624734
                 opts,
625
-                &layout,
626
-                &layout_inputs,
627
-                &atom_table,
628
-                &sym_table,
629
-                Some(&synthetic_plan),
630
-                icf_redirects,
735
+                reloc::arm64::ThunkPlanningContext {
736
+                    layout: &layout,
737
+                    inputs: &layout_inputs,
738
+                    atoms: &atom_table,
739
+                    sym_table: &sym_table,
740
+                    synthetic_plan: Some(&synthetic_plan),
741
+                    icf_redirects,
742
+                    parsed_relocs: &parsed_relocs,
743
+                },
631744
             )?;
745
+            phases.layout_thunk_plan += phase_started.elapsed();
632746
             if next_plan == thunk_plan {
633747
                 thunk_converged = true;
634748
                 break;
@@ -639,6 +753,7 @@ impl Linker {
639753
             let split_after_atoms = next_plan
640754
                 .as_ref()
641755
                 .map_or_else(Vec::new, |plan| plan.split_after_atoms());
756
+            let phase_started = Instant::now();
642757
             layout = Layout::build_with_synthetics_and_extra_filtered(
643758
                 opts.kind,
644759
                 &layout_inputs,
@@ -651,12 +766,13 @@ impl Linker {
651766
                     split_after_atoms: &split_after_atoms,
652767
                 },
653768
             );
769
+            phases.layout_build += phase_started.elapsed();
654770
             thunk_plan = next_plan;
655771
         }
656772
         if !thunk_converged {
657773
             return Err(LinkError::ThunkPlanningDidNotConverge);
658774
         }
659
-        phases.layout = phase_started.elapsed();
775
+        phases.layout = layout_started.elapsed();
660776
         let linkedit_context = macho::writer::LinkEditContext {
661777
             layout_inputs: &layout_inputs,
662778
             atom_table: &atom_table,
@@ -673,6 +789,9 @@ impl Linker {
673789
         let mut synth_linkedit_symbol_plan_globals = Duration::ZERO;
674790
         let mut synth_linkedit_symbol_plan_strtab = Duration::ZERO;
675791
         let mut synth_linkedit_dyld_info = Duration::ZERO;
792
+        let mut synth_linkedit_dyld_bind = Duration::ZERO;
793
+        let mut synth_linkedit_dyld_rebase = Duration::ZERO;
794
+        let mut synth_linkedit_dyld_export = Duration::ZERO;
676795
         let mut synth_linkedit_metadata_tables = Duration::ZERO;
677796
         let mut synth_linkedit_code_signature = Duration::ZERO;
678797
         let mut synth_unwind = Duration::ZERO;
@@ -692,6 +811,9 @@ impl Linker {
692811
             synth_linkedit_symbol_plan_globals += linkedit_timings.symbol_plan_globals;
693812
             synth_linkedit_symbol_plan_strtab += linkedit_timings.symbol_plan_strtab;
694813
             synth_linkedit_dyld_info += linkedit_timings.dyld_info;
814
+            synth_linkedit_dyld_bind += linkedit_timings.dyld_bind;
815
+            synth_linkedit_dyld_rebase += linkedit_timings.dyld_rebase;
816
+            synth_linkedit_dyld_export += linkedit_timings.dyld_export;
695817
             synth_linkedit_metadata_tables += linkedit_timings.metadata_tables;
696818
             synth_linkedit_code_signature += linkedit_timings.code_signature;
697819
             layout = next_layout;
@@ -716,6 +838,9 @@ impl Linker {
716838
         phases.synth_linkedit_symbol_plan_globals = synth_linkedit_symbol_plan_globals;
717839
         phases.synth_linkedit_symbol_plan_strtab = synth_linkedit_symbol_plan_strtab;
718840
         phases.synth_linkedit_dyld_info = synth_linkedit_dyld_info;
841
+        phases.synth_linkedit_dyld_bind = synth_linkedit_dyld_bind;
842
+        phases.synth_linkedit_dyld_rebase = synth_linkedit_dyld_rebase;
843
+        phases.synth_linkedit_dyld_export = synth_linkedit_dyld_export;
719844
         phases.synth_linkedit_metadata_tables = synth_linkedit_metadata_tables;
720845
         phases.synth_linkedit_code_signature = synth_linkedit_code_signature;
721846
         phases.synth_unwind = synth_unwind;
@@ -731,6 +856,8 @@ impl Linker {
731856
                 thunk_plan: thunk_plan.as_ref(),
732857
                 linkedit: &linkedit,
733858
                 icf_redirects,
859
+                parsed_relocs: &parsed_relocs,
860
+                parallel_jobs,
734861
             },
735862
         )?;
736863
         phases.reloc_apply = phase_started.elapsed();
@@ -862,30 +989,221 @@ fn default_output_path(opts: &LinkOptions) -> PathBuf {
         .unwrap_or_else(|| PathBuf::from("a.out"))
 }
 
+struct LoadedObjectInput {
+    path: PathBuf,
+    load_order: usize,
+    bytes: Vec<u8>,
+    parsed: ObjectFile,
+    timings: InputLoadTimings,
+}
+
+struct LoadedArchiveInput {
+    path: PathBuf,
+    load_order: usize,
+    bytes: Vec<u8>,
+    timings: InputLoadTimings,
+}
+
+enum LoadedInitialInput {
+    Object(Box<LoadedObjectInput>),
+    Archive(LoadedArchiveInput),
+}
+
+impl LoadedInitialInput {
+    fn load_order(&self) -> usize {
+        match self {
+            LoadedInitialInput::Object(input) => input.load_order,
+            LoadedInitialInput::Archive(input) => input.load_order,
+        }
+    }
+}
+
+struct InitialLoadError {
+    load_order: usize,
+    error: LinkError,
+}
+
+fn load_initial_inputs(
+    loads: Vec<(usize, PathBuf)>,
+    parallel_jobs: usize,
+) -> Result<Vec<LoadedInitialInput>, LinkError> {
+    let mut results = Vec::new();
+    let mut object_jobs = Vec::new();
+    for (load_order, path) in loads {
+        if matches!(path.extension().and_then(|ext| ext.to_str()), Some("a")) {
+            results.push(load_archive_input(path, load_order));
+        } else {
+            object_jobs.push((load_order, path));
+        }
+    }
+    results.extend(load_objects_parallel(object_jobs, parallel_jobs));
+    results.sort_by_key(|result| match result {
+        Ok(input) => input.load_order(),
+        Err(error) => error.load_order,
+    });
+
+    let mut loaded = Vec::with_capacity(results.len());
+    for result in results {
+        match result {
+            Ok(input) => loaded.push(input),
+            Err(error) => return Err(error.error),
+        }
+    }
+    Ok(loaded)
+}
+
+fn load_objects_parallel(
+    jobs: Vec<(usize, PathBuf)>,
+    parallel_jobs: usize,
+) -> Vec<Result<LoadedInitialInput, InitialLoadError>> {
+    if jobs.is_empty() {
+        return Vec::new();
+    }
+    let job_count = parallel_jobs.max(1).min(jobs.len()).max(1);
+    if job_count == 1 {
+        return jobs
+            .into_iter()
+            .map(|(load_order, path)| load_object_input(path, load_order))
+            .collect();
+    }
+
+    let queue = Arc::new(Mutex::new(VecDeque::from(jobs)));
+    let (tx, rx) = mpsc::channel();
+    thread::scope(|scope| {
+        for _ in 0..job_count {
+            let queue = Arc::clone(&queue);
+            let tx = tx.clone();
+            scope.spawn(move || loop {
+                let Some((load_order, path)) = queue
+                    .lock()
+                    .expect("input load queue mutex poisoned")
+                    .pop_front()
+                else {
+                    break;
+                };
+                tx.send(load_object_input(path, load_order))
+                    .expect("input load receiver should stay live");
+            });
+        }
+        drop(tx);
+        rx.into_iter().collect()
+    })
+}
+
+fn load_object_input(
+    path: PathBuf,
+    load_order: usize,
+) -> Result<LoadedInitialInput, InitialLoadError> {
+    let mut timings = InputLoadTimings::default();
+    let phase_started = Instant::now();
+    let bytes = fs::read(&path).map_err(|error| InitialLoadError {
+        load_order,
+        error: LinkError::Io(error),
+    })?;
+    timings.read = phase_started.elapsed();
+
+    let phase_started = Instant::now();
+    let parsed = ObjectFile::parse(&path, &bytes).map_err(|error| InitialLoadError {
+        load_order,
+        error: LinkError::from(error),
+    })?;
+    timings.object_parse = phase_started.elapsed();
+
+    Ok(LoadedInitialInput::Object(Box::new(LoadedObjectInput {
+        path,
+        load_order,
+        bytes,
+        parsed,
+        timings,
+    })))
+}
+
+fn load_archive_input(
+    path: PathBuf,
+    load_order: usize,
+) -> Result<LoadedInitialInput, InitialLoadError> {
+    let mut timings = InputLoadTimings::default();
+    let phase_started = Instant::now();
+    let bytes = fs::read(&path).map_err(|error| InitialLoadError {
+        load_order,
+        error: LinkError::Io(error),
+    })?;
+    timings.read = phase_started.elapsed();
+
+    let phase_started = Instant::now();
+    Archive::open(&path, &bytes).map_err(|error| InitialLoadError {
+        load_order,
+        error: LinkError::from(InputAddError::from(error)),
+    })?;
+    timings.archive_parse = phase_started.elapsed();
+
+    Ok(LoadedInitialInput::Archive(LoadedArchiveInput {
+        path,
+        load_order,
+        bytes,
+        timings,
+    }))
+}
+
+fn register_loaded_initial_input(
+    inputs: &mut Inputs,
+    loaded: LoadedInitialInput,
+) -> InputLoadTimings {
+    match loaded {
+        LoadedInitialInput::Object(input) => {
+            inputs.add_parsed_object(input.path, input.bytes, input.parsed, input.load_order);
+            input.timings
+        }
+        LoadedInitialInput::Archive(input) => {
+            inputs.add_validated_archive(input.path, input.bytes, input.load_order);
+            input.timings
+        }
+    }
+}
+
 fn register_input(
     inputs: &mut Inputs,
     path: &std::path::Path,
     load_order: usize,
-) -> Result<(), LinkError> {
+    include_tbd_exports: bool,
+) -> Result<InputLoadTimings, LinkError> {
+    let mut timings = InputLoadTimings::default();
+    let phase_started = Instant::now();
     let bytes = fs::read(path)?;
+    timings.read = phase_started.elapsed();
     match path.extension().and_then(|ext| ext.to_str()) {
         Some("a") => {
+            let phase_started = Instant::now();
             let _ = inputs.add_archive(path.to_path_buf(), bytes, load_order)?;
+            timings.archive_parse = phase_started.elapsed();
         }
         Some("dylib") => {
+            let phase_started = Instant::now();
             let _ = inputs.add_dylib(path.to_path_buf(), bytes)?;
+            timings.dylib_parse = phase_started.elapsed();
         }
         Some("tbd") => {
+            let phase_started = Instant::now();
             let text = std::str::from_utf8(&bytes).map_err(|e| {
                 LinkError::Tbd(macho::tbd::TbdError::Schema {
                     msg: format!("TBD input is not UTF-8: {e}"),
                 })
             })?;
-            let docs = parse_tbd(text)?;
             let target = Target {
                 arch: Arch::Arm64,
                 platform: Platform::MacOs,
             };
+            let docs = if include_tbd_exports {
+                parse_tbd_for_target(text, &target)?
+            } else {
+                parse_tbd_metadata_for_target(text, &target)?
+            };
+            timings.tbd_decode = phase_started.elapsed();
+
+            let phase_started = Instant::now();
+            if docs.is_empty() {
+                return Err(LinkError::NoTbdDocument(path.to_path_buf()));
+            }
             let canonical = docs
                 .iter()
                 .find(|doc| doc.parent_umbrella.is_empty())
@@ -904,25 +1222,39 @@ fn register_input(
                     .unwrap_or(DEFAULT_TBD_VERSION),
                 ordinal: inputs.next_dylib_ordinal(),
             };
-            let mut loaded = false;
-            for doc in docs
-                .iter()
-                .filter(|doc| doc.targets.iter().any(|t| t.matches_requested(&target)))
-            {
+            for doc in &docs {
                 let file = DylibFile::from_tbd(path, doc, &target);
                 let _ =
                     inputs.add_dylib_from_file_with_meta(path.to_path_buf(), file, load.clone());
-                loaded = true;
-            }
-            if !loaded {
-                return Err(LinkError::NoTbdDocument(path.to_path_buf()));
             }
+            timings.tbd_materialize = phase_started.elapsed();
         }
         _ => {
+            let phase_started = Instant::now();
             let _ = inputs.add_object(path.to_path_buf(), bytes, load_order)?;
+            timings.object_parse = phase_started.elapsed();
+        }
+    }
+    Ok(timings)
+}
+
+fn inputs_may_need_dylib_exports(inputs: &Inputs) -> Result<bool, LinkError> {
+    if !inputs.archives.is_empty() {
+        return Ok(true);
+    }
+    for i in 0..inputs.objects.len() {
+        let input_id = InputId(i as u32);
+        let object = inputs.object_file(input_id)?;
+        if object.symbols.iter().any(|sym| {
+            sym.stab_kind().is_none()
+                && (sym.is_ext() || sym.is_private_ext())
+                && sym.kind() == SymKind::Undef
+                && !sym.is_common()
+        }) {
+            return Ok(true);
         }
     }
-    Ok(())
+    Ok(false)
 }
 
 fn resolve_entry_point(
src/macho/tbd.rs (modified)
@@ -123,17 +123,636 @@ pub fn parse_tbd(input: &str) -> Result<Vec<Tbd>, TbdError> {
     let docs = parse_documents(input)?;
     let mut out = Vec::with_capacity(docs.len());
     for d in docs {
-        out.push(decode_document(&d)?);
+        out.push(decode_document(d)?);
     }
     Ok(out)
 }
 
-fn decode_document(doc: &Document) -> Result<Tbd, TbdError> {
-    let m = doc
-        .root
-        .as_mapping()
-        .ok_or_else(|| schema("top level of a TBD document must be a mapping"))?;
+/// Parse a TBD for the linker hot path, keeping only documents and scoped
+/// symbol lists that can satisfy `target`.
+///
+/// The fast path handles Apple's emitted TAPI v4 shape directly and avoids
+/// constructing the generic YAML `Value` tree for the thousands of symbols in
+/// libSystem.tbd. If it sees a shape outside that subset, it falls back to the
+/// generic decoder and applies the same target filter afterward.
+pub fn parse_tbd_for_target(input: &str, target: &Target) -> Result<Vec<Tbd>, TbdError> {
+    match parse_tbd_for_target_direct(input, target, true) {
+        Ok(docs) => Ok(docs),
+        Err(_) => {
+            let docs = parse_tbd(input)?;
+            Ok(filter_docs_for_target(docs, target))
+        }
+    }
+}
+
+/// Parse only load-command-relevant TBD metadata for `target`.
+///
+/// This is used for links that have no unresolved dylib symbols; emitting the
+/// requested `LC_LOAD_DYLIB` does not require materializing libSystem's full
+/// export surface.
+pub fn parse_tbd_metadata_for_target(input: &str, target: &Target) -> Result<Vec<Tbd>, TbdError> {
+    match parse_tbd_for_target_direct(input, target, false) {
+        Ok(docs) => Ok(docs),
+        Err(_) => {
+            let docs = parse_tbd(input)?;
+            Ok(filter_docs_for_target_metadata(docs, target))
+        }
+    }
+}
+
+fn filter_docs_for_target(mut docs: Vec<Tbd>, target: &Target) -> Vec<Tbd> {
+    docs.retain(|doc| targets_match(&doc.targets, target));
+    for doc in &mut docs {
+        doc.parent_umbrella
+            .retain(|scoped| targets_match(&scoped.targets, target));
+        doc.allowable_clients
+            .retain(|scoped| targets_match(&scoped.targets, target));
+        doc.reexported_libraries
+            .retain(|scoped| targets_match(&scoped.targets, target));
+        doc.exports
+            .retain(|scoped| targets_match(&scoped.targets, target));
+        doc.reexports
+            .retain(|scoped| targets_match(&scoped.targets, target));
+    }
+    docs
+}
 
+fn filter_docs_for_target_metadata(docs: Vec<Tbd>, target: &Target) -> Vec<Tbd> {
+    let mut docs = filter_docs_for_target(docs, target);
+    for doc in &mut docs {
+        doc.reexported_libraries.clear();
+        doc.exports.clear();
+        doc.reexports.clear();
+    }
+    docs
+}
+
+fn parse_tbd_for_target_direct(
+    input: &str,
+    target: &Target,
+    include_exports: bool,
+) -> Result<Vec<Tbd>, TbdError> {
+    let lines: Vec<&str> = input.lines().collect();
+    let mut docs = Vec::new();
+    let mut i = 0usize;
+    while i < lines.len() {
+        let Some(trimmed) = direct_trimmed(lines[i]) else {
+            i += 1;
+            continue;
+        };
+        if trimmed.starts_with("%YAML") || trimmed.starts_with("...") {
+            i += 1;
+            continue;
+        }
+        if trimmed.starts_with("---") {
+            i += 1;
+        }
+
+        let (doc, next) = parse_direct_document(&lines, i, target, include_exports)?;
+        i = next;
+        if doc.install_name.is_empty() && doc.targets.is_empty() {
+            continue;
+        }
+        if targets_match(&doc.targets, target) {
+            docs.push(doc);
+        }
+    }
+    Ok(docs)
+}
+
+fn parse_direct_document(
+    lines: &[&str],
+    mut i: usize,
+    target: &Target,
+    include_exports: bool,
+) -> Result<(Tbd, usize), TbdError> {
+    let mut tbd = Tbd::default();
+    while i < lines.len() {
+        let Some(trimmed) = direct_trimmed(lines[i]) else {
+            i += 1;
+            continue;
+        };
+        if trimmed.starts_with("---") || trimmed.starts_with("...") {
+            break;
+        }
+        if direct_indent(lines[i]) != 0 {
+            return Err(schema("unexpected nested TBD line at document root"));
+        }
+        let (key, rest) =
+            direct_key_value(trimmed).ok_or_else(|| schema("expected top-level TBD key"))?;
+        match key {
+            "tbd-version" => {
+                tbd.version = parse_direct_scalar(rest)
+                    .parse()
+                    .map_err(|_| schema(&format!("tbd-version must parse as a u32: {rest:?}")))?;
+                i += 1;
+            }
+            "targets" => {
+                let (targets, next) = parse_direct_targets(lines, i, rest)?;
+                tbd.targets = targets;
+                i = next;
+            }
+            "install-name" => {
+                tbd.install_name = parse_direct_scalar(rest);
+                i += 1;
+            }
+            "current-version" => {
+                tbd.current_version = Some(parse_direct_scalar(rest));
+                i += 1;
+            }
+            "compatibility-version" => {
+                tbd.compatibility_version = Some(parse_direct_scalar(rest));
+                i += 1;
+            }
+            "parent-umbrella" => {
+                let (value, next) = parse_direct_scoped_scalars(lines, i + 1, target, "umbrella")?;
+                tbd.parent_umbrella = value;
+                i = next;
+            }
+            "allowable-clients" => {
+                let (value, next) = parse_direct_scoped_lists(lines, i + 1, target, "clients")?;
+                tbd.allowable_clients = value;
+                i = next;
+            }
+            "reexported-libraries" if include_exports => {
+                let (value, next) = parse_direct_scoped_lists(lines, i + 1, target, "libraries")?;
+                tbd.reexported_libraries = value;
+                i = next;
+            }
+            "reexported-libraries" => {
+                i = skip_direct_value(lines, i + 1);
+            }
+            "exports" if include_exports => {
+                let (value, next) = parse_direct_scoped_symbols(lines, i + 1, target)?;
+                tbd.exports = value;
+                i = next;
+            }
+            "exports" => {
+                i = skip_direct_value(lines, i + 1);
+            }
+            "reexports" if include_exports => {
+                let (value, next) = parse_direct_scoped_symbols(lines, i + 1, target)?;
+                tbd.reexports = value;
+                i = next;
+            }
+            "reexports" => {
+                i = skip_direct_value(lines, i + 1);
+            }
+            _ => {
+                i = skip_direct_value(lines, i + 1);
+            }
+        }
+    }
+
+    if !tbd.install_name.is_empty() || !tbd.targets.is_empty() {
+        if tbd.install_name.is_empty() {
+            return Err(schema("TBD document missing required 'install-name'"));
+        }
+        if tbd.targets.is_empty() {
+            return Err(schema("TBD document missing required 'targets'"));
+        }
+    }
+    Ok((tbd, i))
+}
+
+fn parse_direct_scoped_scalars(
+    lines: &[&str],
+    mut i: usize,
+    target: &Target,
+    value_key: &str,
+) -> Result<(Vec<Scoped<String>>, usize), TbdError> {
+    let mut out = Vec::new();
+    let ctx = DirectCtx { lines, target };
+    while let Some(entry) = direct_entry_start(lines, i) {
+        let mut state = DirectScopeState::default();
+        let mut value = None;
+        let (key, rest) = entry?;
+        apply_direct_scalar_pair((key, rest), ctx, &mut i, &mut state, value_key, &mut value)?;
+        while let Some((key, rest)) = direct_nested_pair(lines, i) {
+            apply_direct_scalar_pair((key, rest), ctx, &mut i, &mut state, value_key, &mut value)?;
+        }
+        if state.include == Some(true) {
+            out.push(Scoped {
+                targets: state
+                    .targets
+                    .ok_or_else(|| schema("missing required key \"targets\""))?,
+                value: value.unwrap_or_default(),
+            });
+        }
+    }
+    Ok((out, i))
+}
+
+type DirectScopedList = Vec<Scoped<Vec<String>>>;
+type DirectScopedListResult = Result<(DirectScopedList, usize), TbdError>;
+
+#[derive(Clone, Copy)]
+struct DirectCtx<'a, 't> {
+    lines: &'a [&'a str],
+    target: &'t Target,
+}
+
+#[derive(Default)]
+struct DirectScopeState {
+    targets: Option<Vec<Target>>,
+    include: Option<bool>,
+}
+
+fn parse_direct_scoped_lists(
+    lines: &[&str],
+    mut i: usize,
+    target: &Target,
+    value_key: &str,
+) -> DirectScopedListResult {
+    let mut out = Vec::new();
+    let ctx = DirectCtx { lines, target };
+    while let Some(entry) = direct_entry_start(lines, i) {
+        let mut state = DirectScopeState::default();
+        let mut value = Vec::new();
+        let (key, rest) = entry?;
+        apply_direct_list_pair((key, rest), ctx, &mut i, &mut state, value_key, &mut value)?;
+        while let Some((key, rest)) = direct_nested_pair(lines, i) {
+            apply_direct_list_pair((key, rest), ctx, &mut i, &mut state, value_key, &mut value)?;
+        }
+        if state.include == Some(true) {
+            out.push(Scoped {
+                targets: state
+                    .targets
+                    .ok_or_else(|| schema("missing required key \"targets\""))?,
+                value,
+            });
+        }
+    }
+    Ok((out, i))
+}
+
+fn parse_direct_scoped_symbols(
+    lines: &[&str],
+    mut i: usize,
+    target: &Target,
+) -> Result<(Vec<Scoped<SymbolLists>>, usize), TbdError> {
+    let mut out = Vec::new();
+    let ctx = DirectCtx { lines, target };
+    while let Some(entry) = direct_entry_start(lines, i) {
+        let mut state = DirectScopeState::default();
+        let mut lists = SymbolLists::default();
+        let (key, rest) = entry?;
+        apply_direct_symbol_pair((key, rest), ctx, &mut i, &mut state, &mut lists)?;
+        while let Some((key, rest)) = direct_nested_pair(lines, i) {
+            apply_direct_symbol_pair((key, rest), ctx, &mut i, &mut state, &mut lists)?;
+        }
+        if state.include == Some(true) {
+            out.push(Scoped {
+                targets: state
+                    .targets
+                    .ok_or_else(|| schema("missing required key \"targets\""))?,
+                value: lists,
+            });
+        }
+    }
+    Ok((out, i))
+}
+
+fn apply_direct_scalar_pair(
+    pair: (&str, &str),
+    ctx: DirectCtx<'_, '_>,
+    i: &mut usize,
+    state: &mut DirectScopeState,
+    value_key: &str,
+    value: &mut Option<String>,
+) -> Result<(), TbdError> {
+    let (key, rest) = pair;
+    if key == "targets" {
+        let (parsed, next) = parse_direct_targets(ctx.lines, *i, rest)?;
+        state.include = Some(targets_match(&parsed, ctx.target));
+        state.targets = Some(parsed);
+        *i = next;
+    } else if key == value_key {
+        if state.include != Some(false) {
+            *value = Some(parse_direct_scalar(rest));
+        }
+        *i += 1;
+    } else {
+        *i = skip_direct_inline_value(ctx.lines, *i, rest)?;
+    }
+    Ok(())
+}
+
+fn apply_direct_list_pair(
+    pair: (&str, &str),
+    ctx: DirectCtx<'_, '_>,
+    i: &mut usize,
+    state: &mut DirectScopeState,
+    value_key: &str,
+    value: &mut Vec<String>,
+) -> Result<(), TbdError> {
+    let (key, rest) = pair;
+    if key == "targets" {
+        let (parsed, next) = parse_direct_targets(ctx.lines, *i, rest)?;
+        state.include = Some(targets_match(&parsed, ctx.target));
+        state.targets = Some(parsed);
+        *i = next;
+    } else if key == value_key {
+        if state.include == Some(false) {
+            *i = skip_direct_flow(ctx.lines, *i, rest)?;
+        } else {
+            let (parsed, next) = parse_direct_string_list(ctx.lines, *i, rest)?;
+            *value = parsed;
+            *i = next;
+        }
+    } else {
+        *i = skip_direct_inline_value(ctx.lines, *i, rest)?;
+    }
+    Ok(())
+}
+
+fn apply_direct_symbol_pair(
+    pair: (&str, &str),
+    ctx: DirectCtx<'_, '_>,
+    i: &mut usize,
+    state: &mut DirectScopeState,
+    lists: &mut SymbolLists,
+) -> Result<(), TbdError> {
+    let (key, rest) = pair;
+    if key == "targets" {
+        let (parsed, next) = parse_direct_targets(ctx.lines, *i, rest)?;
+        state.include = Some(targets_match(&parsed, ctx.target));
+        state.targets = Some(parsed);
+        *i = next;
+        return Ok(());
+    }
+
+    let slot = match key {
+        "symbols" => Some(&mut lists.symbols),
+        "weak-symbols" => Some(&mut lists.weak_symbols),
+        "thread-local-symbols" => Some(&mut lists.thread_local_symbols),
+        "objc-classes" => Some(&mut lists.objc_classes),
+        "objc-eh-types" => Some(&mut lists.objc_eh_types),
+        "objc-ivars" => Some(&mut lists.objc_ivars),
+        _ => None,
+    };
+    if let Some(slot) = slot {
+        if state.include == Some(false) {
+            *i = skip_direct_flow(ctx.lines, *i, rest)?;
+        } else {
+            let (parsed, next) = parse_direct_string_list(ctx.lines, *i, rest)?;
+            *slot = parsed;
+            *i = next;
+        }
+    } else {
+        *i = skip_direct_inline_value(ctx.lines, *i, rest)?;
+    }
+    Ok(())
+}
+
+type DirectEntry<'a> = Result<(&'a str, &'a str), TbdError>;
+
+fn direct_entry_start<'a>(lines: &'a [&str], i: usize) -> Option<DirectEntry<'a>> {
+    let trimmed = direct_trimmed(lines.get(i)?)?;
+    if trimmed.starts_with("---") || trimmed.starts_with("...") || direct_indent(lines[i]) == 0 {
+        return None;
+    }
+    if direct_indent(lines[i]) != 2 || !trimmed.starts_with('-') {
+        return Some(Err(schema("expected scoped TBD entry")));
+    }
+    let rest = trimmed.strip_prefix('-').unwrap_or("").trim_start();
+    if rest.is_empty() {
+        return Some(Err(schema("empty scoped TBD entries are not supported")));
+    }
+    Some(direct_key_value(rest).ok_or_else(|| schema("expected scoped TBD key")))
+}
+
+fn direct_nested_pair<'a>(lines: &'a [&str], i: usize) -> Option<(&'a str, &'a str)> {
+    let trimmed = direct_trimmed(lines.get(i)?)?;
+    if trimmed.starts_with("---") || trimmed.starts_with("...") {
+        return None;
+    }
+    let indent = direct_indent(lines[i]);
+    if indent <= 2 {
+        return None;
+    }
+    direct_key_value(trimmed)
+}
+
+fn parse_direct_targets(
+    lines: &[&str],
+    i: usize,
+    rest: &str,
+) -> Result<(Vec<Target>, usize), TbdError> {
+    let (items, next) = parse_direct_string_list(lines, i, rest)?;
+    let mut targets = Vec::with_capacity(items.len());
+    for item in items {
+        targets.push(parse_target(&item)?);
+    }
+    Ok((targets, next))
+}
+
+fn parse_direct_string_list(
+    lines: &[&str],
+    i: usize,
+    rest: &str,
+) -> Result<(Vec<String>, usize), TbdError> {
+    let (flow, next) = collect_direct_flow(lines, i, rest)?;
+    Ok((split_direct_flow_scalars(&flow)?, next))
+}
+
+fn collect_direct_flow(lines: &[&str], i: usize, rest: &str) -> Result<(String, usize), TbdError> {
+    let start_line = i + 1;
+    let mut flow = rest.trim().to_string();
+    if !flow.starts_with('[') {
+        return Err(schema("expected a flow sequence"));
+    }
+    let mut next_i = if direct_trimmed(lines.get(i).copied().unwrap_or_default())
+        .map(|line| line.contains(rest.trim()))
+        .unwrap_or(false)
+    {
+        i + 1
+    } else {
+        i
+    };
+    while direct_flow_unbalanced(&flow) {
+        let Some(next) = lines.get(next_i).and_then(|line| direct_trimmed(line)) else {
+            return Err(schema(&format!(
+                "unterminated flow sequence from line {} near line {}: {:?}",
+                start_line,
+                next_i + 1,
+                flow
+            )));
+        };
+        if next.starts_with("---") || next.starts_with("...") {
+            return Err(schema(&format!(
+                "unterminated flow sequence from line {} before line {}: {:?}",
+                start_line,
+                next_i + 1,
+                flow
+            )));
+        }
+        flow.push(' ');
+        flow.push_str(next);
+        next_i += 1;
+    }
+    if !flow.ends_with(']') {
+        return Err(schema("flow sequence must end with ']'"));
+    }
+    Ok((flow, next_i))
+}
+
+fn skip_direct_flow(lines: &[&str], i: usize, rest: &str) -> Result<usize, TbdError> {
+    collect_direct_flow(lines, i, rest).map(|(_, next)| next)
+}
+
+fn skip_direct_inline_value(lines: &[&str], i: usize, rest: &str) -> Result<usize, TbdError> {
+    if rest.trim_start().starts_with('[') {
+        skip_direct_flow(lines, i, rest)
+    } else {
+        Ok(i + 1)
+    }
+}
+
+fn skip_direct_value(lines: &[&str], mut i: usize) -> usize {
+    while i < lines.len() {
+        let Some(trimmed) = direct_trimmed(lines[i]) else {
+            i += 1;
+            continue;
+        };
+        if trimmed.starts_with("---") || trimmed.starts_with("...") || direct_indent(lines[i]) == 0
+        {
+            break;
+        }
+        i += 1;
+    }
+    i
+}
+
+fn split_direct_flow_scalars(flow: &str) -> Result<Vec<String>, TbdError> {
+    let inner = flow
+        .strip_prefix('[')
+        .and_then(|s| s.strip_suffix(']'))
+        .ok_or_else(|| schema("flow sequence must be bracketed"))?;
+    let bytes = inner.as_bytes();
+    let mut out = Vec::new();
+    let mut in_single = false;
+    let mut in_double = false;
+    let mut depth = 0i32;
+    let mut start = 0usize;
+    let mut i = 0usize;
+    while i < bytes.len() {
+        let b = bytes[i];
+        match b {
+            b'\'' if !in_double => in_single = !in_single,
+            b'"' if !in_single => in_double = !in_double,
+            b'\\' if in_double && i + 1 < bytes.len() => {
+                i += 2;
+                continue;
+            }
+            b'[' | b'{' if !in_single && !in_double => depth += 1,
+            b']' | b'}' if !in_single && !in_double => depth -= 1,
+            b',' if !in_single && !in_double && depth == 0 => {
+                push_direct_flow_scalar(&mut out, &inner[start..i]);
+                start = i + 1;
+            }
+            _ => {}
+        }
+        i += 1;
+    }
+    if start <= inner.len() {
+        push_direct_flow_scalar(&mut out, &inner[start..]);
+    }
+    Ok(out)
+}
+
+fn push_direct_flow_scalar(out: &mut Vec<String>, item: &str) {
+    let item = item.trim();
+    if !item.is_empty() {
+        out.push(parse_direct_scalar(item));
+    }
+}
+
+fn parse_direct_scalar(raw: &str) -> String {
+    let raw = raw.trim();
+    if raw.len() >= 2 && raw.starts_with('\'') && raw.ends_with('\'') {
+        raw[1..raw.len() - 1].replace("''", "'")
+    } else if raw.len() >= 2 && raw.starts_with('"') && raw.ends_with('"') {
+        parse_direct_double_quoted(&raw[1..raw.len() - 1])
+    } else {
+        raw.to_string()
+    }
+}
+
+fn parse_direct_double_quoted(raw: &str) -> String {
+    let bytes = raw.as_bytes();
+    let mut out = String::with_capacity(raw.len());
+    let mut i = 0usize;
+    while i < bytes.len() {
+        if bytes[i] == b'\\' && i + 1 < bytes.len() {
+            let ch = match bytes[i + 1] {
+                b'n' => '\n',
+                b'r' => '\r',
+                b't' => '\t',
+                b'"' => '"',
+                b'\\' => '\\',
+                other => other as char,
+            };
+            out.push(ch);
+            i += 2;
+        } else {
+            out.push(bytes[i] as char);
+            i += 1;
+        }
+    }
+    out
+}
+
706
+fn direct_key_value(s: &str) -> Option<(&str, &str)> {
707
+    let (key, rest) = s.split_once(':')?;
708
+    Some((key.trim(), rest.trim_start()))
709
+}
710
+
711
+fn direct_trimmed(line: &str) -> Option<&str> {
712
+    let trimmed = line.trim();
713
+    if trimmed.is_empty() || trimmed.starts_with('#') {
714
+        None
715
+    } else {
716
+        Some(trimmed)
717
+    }
718
+}
719
+
720
+fn direct_indent(line: &str) -> usize {
721
+    line.bytes().take_while(|b| *b == b' ').count()
722
+}
723
+
724
+fn direct_flow_unbalanced(s: &str) -> bool {
725
+    let mut depth = 0i32;
726
+    let mut in_single = false;
727
+    let mut in_double = false;
728
+    let bytes = s.as_bytes();
729
+    let mut i = 0usize;
730
+    while i < bytes.len() {
731
+        let b = bytes[i];
732
+        match b {
733
+            b'\'' if !in_double => in_single = !in_single,
734
+            b'"' if !in_single => in_double = !in_double,
735
+            b'\\' if in_double && i + 1 < bytes.len() => {
736
+                i += 2;
737
+                continue;
738
+            }
739
+            b'[' | b'{' if !in_single && !in_double => depth += 1,
740
+            b']' | b'}' if !in_single && !in_double => depth -= 1,
741
+            _ => {}
742
+        }
743
+        i += 1;
744
+    }
745
+    depth != 0
746
+}
747
+
748
+fn targets_match(targets: &[Target], target: &Target) -> bool {
749
+    targets.iter().any(|t| t.matches_requested(target))
750
+}
751
+
752
+fn decode_document(doc: Document) -> Result<Tbd, TbdError> {
753
+    let Value::Mapping(m) = doc.root else {
754
+        return Err(schema("top level of a TBD document must be a mapping"));
755
+    };
137756
     let mut tbd = Tbd::default();
138757
     for (k, v) in m {
139758
         match k.as_str() {
@@ -170,16 +789,13 @@ fn decode_document(doc: &Document) -> Result<Tbd, TbdError> {
     Ok(tbd)
 }
 
-fn decode_target_list(v: &Value) -> Result<Vec<Target>, TbdError> {
-    let seq = v
-        .as_sequence()
-        .ok_or_else(|| schema("'targets' must be a sequence"))?;
+fn decode_target_list(v: Value) -> Result<Vec<Target>, TbdError> {
+    let Value::Sequence(seq) = v else {
+        return Err(schema("'targets' must be a sequence"));
+    };
     let mut out = Vec::with_capacity(seq.len());
     for item in seq {
-        let s = item
-            .as_str()
-            .ok_or_else(|| schema("target must be a scalar"))?;
-        out.push(parse_target(s)?);
+        out.push(parse_target(&scalar_string(item, "target")?)?);
     }
     Ok(out)
 }
@@ -208,17 +824,28 @@ fn parse_target(s: &str) -> Result<Target, TbdError> {
     Ok(Target { arch, platform })
 }
 
-fn decode_scoped_umbrella(v: &Value) -> Result<Vec<Scoped<String>>, TbdError> {
-    let seq = v
-        .as_sequence()
-        .ok_or_else(|| schema("'parent-umbrella' must be a sequence of scoped mappings"))?;
+fn decode_scoped_umbrella(v: Value) -> Result<Vec<Scoped<String>>, TbdError> {
+    let Value::Sequence(seq) = v else {
+        return Err(schema(
+            "'parent-umbrella' must be a sequence of scoped mappings",
+        ));
+    };
     let mut out = Vec::with_capacity(seq.len());
     for item in seq {
-        let m = item
-            .as_mapping()
-            .ok_or_else(|| schema("parent-umbrella entry must be a mapping"))?;
-        let targets = lookup_required(m, "targets").and_then(decode_target_list)?;
-        let umbrella = lookup_required(m, "umbrella").and_then(|v| scalar_string(v, "umbrella"))?;
+        let Value::Mapping(m) = item else {
+            return Err(schema("parent-umbrella entry must be a mapping"));
+        };
+        let mut targets = None;
+        let mut umbrella = None;
+        for (k, v) in m {
+            match k.as_str() {
+                "targets" => targets = Some(decode_target_list(v)?),
+                "umbrella" => umbrella = Some(scalar_string(v, "umbrella")?),
+                _ => {}
+            }
+        }
+        let targets = targets.ok_or_else(|| schema("missing required key \"targets\""))?;
+        let umbrella = umbrella.ok_or_else(|| schema("missing required key \"umbrella\""))?;
         out.push(Scoped {
             targets,
             value: umbrella,
@@ -228,41 +855,46 @@ fn decode_scoped_umbrella(v: &Value) -> Result<Vec<Scoped<String>>, TbdError> {
 }
 
 fn decode_scoped_string_list(
-    v: &Value,
+    v: Value,
     inner_key: &str,
 ) -> Result<Vec<Scoped<Vec<String>>>, TbdError> {
-    let seq = v
-        .as_sequence()
-        .ok_or_else(|| schema("expected a sequence of scoped mappings"))?;
+    let Value::Sequence(seq) = v else {
+        return Err(schema("expected a sequence of scoped mappings"));
+    };
     let mut out = Vec::with_capacity(seq.len());
     for item in seq {
-        let m = item
-            .as_mapping()
-            .ok_or_else(|| schema("scoped entry must be a mapping"))?;
-        let targets = lookup_required(m, "targets").and_then(decode_target_list)?;
-        let value = match m.iter().find(|(k, _)| k == inner_key) {
-            Some((_, v)) => decode_string_list(v, inner_key)?,
-            None => Vec::new(),
+        let Value::Mapping(m) = item else {
+            return Err(schema("scoped entry must be a mapping"));
         };
+        let mut targets = None;
+        let mut value = Vec::new();
+        for (k, v) in m {
+            if k == "targets" {
+                targets = Some(decode_target_list(v)?);
+            } else if k == inner_key {
+                value = decode_string_list(v, inner_key)?;
+            }
+        }
+        let targets = targets.ok_or_else(|| schema("missing required key \"targets\""))?;
         out.push(Scoped { targets, value });
     }
     Ok(out)
 }
 
-fn decode_scoped_symbols(v: &Value) -> Result<Vec<Scoped<SymbolLists>>, TbdError> {
-    let seq = v
-        .as_sequence()
-        .ok_or_else(|| schema("'exports'/'reexports' must be a sequence"))?;
+fn decode_scoped_symbols(v: Value) -> Result<Vec<Scoped<SymbolLists>>, TbdError> {
+    let Value::Sequence(seq) = v else {
+        return Err(schema("'exports'/'reexports' must be a sequence"));
+    };
     let mut out = Vec::with_capacity(seq.len());
     for item in seq {
-        let m = item
-            .as_mapping()
-            .ok_or_else(|| schema("exports entry must be a mapping"))?;
-        let targets = lookup_required(m, "targets").and_then(decode_target_list)?;
+        let Value::Mapping(m) = item else {
+            return Err(schema("exports entry must be a mapping"));
+        };
+        let mut targets = None;
         let mut lists = SymbolLists::default();
         for (k, v) in m {
             match k.as_str() {
-                "targets" => {}
+                "targets" => targets = Some(decode_target_list(v)?),
                 "symbols" => lists.symbols = decode_string_list(v, "symbols")?,
                 "weak-symbols" => lists.weak_symbols = decode_string_list(v, "weak-symbols")?,
                 "thread-local-symbols" => {
@@ -274,6 +906,7 @@ fn decode_scoped_symbols(v: &Value) -> Result<Vec<Scoped<SymbolLists>>, TbdError
                 _ => {} // ignore unknown inner keys
             }
         }
+        let targets = targets.ok_or_else(|| schema("missing required key \"targets\""))?;
         out.push(Scoped {
             targets,
             value: lists,
@@ -282,7 +915,7 @@ fn decode_scoped_symbols(v: &Value) -> Result<Vec<Scoped<SymbolLists>>, TbdError
     Ok(out)
 }
 
-fn decode_string_list(v: &Value, context: &str) -> Result<Vec<String>, TbdError> {
+fn decode_string_list(v: Value, context: &str) -> Result<Vec<String>, TbdError> {
     match v {
         Value::Sequence(items) => {
             let mut out = Vec::with_capacity(items.len());
@@ -296,24 +929,15 @@ fn decode_string_list(v: &Value, context: &str) -> Result<Vec<String>, TbdError>
     }
 }
 
-fn lookup_required<'a>(m: &'a [(String, Value)], key: &str) -> Result<&'a Value, TbdError> {
-    m.iter()
-        .find(|(k, _)| k == key)
-        .map(|(_, v)| v)
-        .ok_or_else(|| schema(&format!("missing required key {key:?}")))
-}
-
-fn scalar_u32(v: &Value, context: &str) -> Result<u32, TbdError> {
-    let s = v
-        .as_str()
-        .ok_or_else(|| schema(&format!("{context} must be a scalar")))?;
+fn scalar_u32(v: Value, context: &str) -> Result<u32, TbdError> {
+    let s = scalar_string(v, context)?;
     s.parse()
         .map_err(|_| schema(&format!("{context} must parse as a u32: {s:?}")))
 }
 
-fn scalar_string(v: &Value, context: &str) -> Result<String, TbdError> {
+fn scalar_string(v: Value, context: &str) -> Result<String, TbdError> {
     match v {
-        Value::Scalar(s) => Ok(s.clone()),
+        Value::Scalar(s) => Ok(s),
         _ => Err(schema(&format!("{context} must be a scalar"))),
     }
 }
@@ -373,6 +997,13 @@ impl Target {
 mod tests {
     use super::*;
 
+    fn arm64_macos() -> Target {
+        Target {
+            arch: Arch::Arm64,
+            platform: Platform::MacOs,
+        }
+    }
+
     #[test]
     fn parses_minimal_tbd_v4() {
         let src = "--- !tapi-tbd\n\
@@ -455,6 +1086,24 @@ mod tests {
         assert_eq!(tbd.parent_umbrella[0].value, "System");
     }
 
+    #[test]
+    fn target_fast_path_keeps_matching_multiline_exports() {
+        let src = "--- !tapi-tbd\n\
+                   tbd-version: 4\n\
+                   targets: [ x86_64-macos, arm64e-macos ]\n\
+                   install-name: '/usr/lib/libfoo.dylib'\n\
+                   exports:\n\
+                   \x20 - targets: [ x86_64-macos ]\n\
+                   \x20   symbols: [ _x86_only ]\n\
+                   \x20 - targets: [ x86_64-macos, arm64e-macos ]\n\
+                   \x20   symbols: [ _arm_one,\n\
+                   \x20              _arm_two ]\n";
+        let docs = parse_tbd_for_target(src, &arm64_macos()).unwrap();
+        assert_eq!(docs.len(), 1);
+        assert_eq!(docs[0].exports.len(), 1);
+        assert_eq!(docs[0].exports[0].value.symbols, ["_arm_one", "_arm_two"]);
+    }
+
     #[test]
     fn unknown_keys_are_tolerated() {
         let src = "--- !tapi-tbd\n\
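The decoders above were rewritten from a `lookup_required` helper (one linear scan per key) to a single pass over each mapping: optional slots are filled as keys stream by, and required keys are enforced afterwards with `ok_or_else`. A standalone sketch of that pattern follows; the `Value` enum, `ScopedEntry`, and `decode_scoped_entry` here are simplified stand-ins for illustration, not the crate's actual types.

```rust
// Minimal stand-in for a YAML-like value tree (illustrative only).
enum Value {
    Scalar(String),
    Sequence(Vec<Value>),
    Mapping(Vec<(String, Value)>),
}

struct ScopedEntry {
    targets: Vec<String>,
    umbrella: String,
}

// One pass over the mapping: collect optional slots, then check required keys.
fn decode_scoped_entry(v: Value) -> Result<ScopedEntry, String> {
    let Value::Mapping(m) = v else {
        return Err("entry must be a mapping".to_string());
    };
    let mut targets = None;
    let mut umbrella = None;
    for (k, v) in m {
        match k.as_str() {
            "targets" => {
                let Value::Sequence(seq) = v else {
                    return Err("'targets' must be a sequence".to_string());
                };
                let mut out = Vec::with_capacity(seq.len());
                for item in seq {
                    let Value::Scalar(s) = item else {
                        return Err("target must be a scalar".to_string());
                    };
                    out.push(s);
                }
                targets = Some(out);
            }
            "umbrella" => {
                let Value::Scalar(s) = v else {
                    return Err("'umbrella' must be a scalar".to_string());
                };
                umbrella = Some(s);
            }
            _ => {} // unknown keys tolerated, mirroring the parser above
        }
    }
    Ok(ScopedEntry {
        targets: targets.ok_or_else(|| "missing required key \"targets\"".to_string())?,
        umbrella: umbrella.ok_or_else(|| "missing required key \"umbrella\"".to_string())?,
    })
}
```

Taking `Value` by value (as the diff does) also lets scalars move out of the tree instead of being cloned.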
src/macho/tbd_yaml.rs modified
@@ -185,6 +185,9 @@ fn strip_trailing_ws(s: &str) -> &str {
 /// quoted content untouched (otherwise single-quoted paths like
 /// `'/usr/lib/#funny'` would be mangled — unlikely but sound).
 fn strip_eol_comment(s: &mut String) {
+    if !s.as_bytes().contains(&b'#') {
+        return;
+    }
     let bytes = s.as_bytes();
     let mut in_single = false;
     let mut in_double = false;
@@ -198,15 +201,15 @@
                 i += 2;
                 continue;
             }
-            b'#' if !in_single && !in_double => {
-                // A comment must be preceded by whitespace or be at BOL.
-                if i == 0 || bytes[i - 1] == b' ' || bytes[i - 1] == b'\t' {
-                    s.truncate(i);
-                    // Trim trailing whitespace left by the strip.
-                    let trimmed_len = s.trim_end().len();
-                    s.truncate(trimmed_len);
-                    return;
-                }
+            b'#' if !in_single
+                && !in_double
+                && (i == 0 || bytes[i - 1] == b' ' || bytes[i - 1] == b'\t') =>
+            {
+                s.truncate(i);
+                // Trim trailing whitespace left by the strip.
+                let trimmed_len = s.trim_end().len();
+                s.truncate(trimmed_len);
+                return;
             }
             _ => {}
         }
@@ -215,6 +218,13 @@
 }
 
 fn flow_unbalanced(s: &str) -> bool {
+    if !s
+        .as_bytes()
+        .iter()
+        .any(|b| matches!(b, b'[' | b']' | b'{' | b'}'))
+    {
+        return false;
+    }
     let mut depth = 0i32;
     let mut in_single = false;
     let mut in_double = false;
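Both guards added above are the same optimization: a cheap byte scan rejects the common case before the stateful quote-aware loop runs at all. A self-contained sketch of the idea (function names here are illustrative; escape handling inside double quotes is omitted for brevity):

```rust
// Fast path: most lines contain no flow indicators, so a plain byte scan
// rejects them before the quote-tracking state machine runs.
fn has_flow_indicator(s: &str) -> bool {
    s.bytes().any(|b| matches!(b, b'[' | b']' | b'{' | b'}'))
}

// Quote-aware depth check, only reached for the rare lines that need it.
// Brackets inside single or double quotes do not affect the depth.
fn flow_unbalanced(s: &str) -> bool {
    if !has_flow_indicator(s) {
        return false;
    }
    let mut depth = 0i32;
    let mut in_single = false;
    let mut in_double = false;
    for b in s.bytes() {
        match b {
            b'\'' if !in_double => in_single = !in_single,
            b'"' if !in_single => in_double = !in_double,
            b'[' | b'{' if !in_single && !in_double => depth += 1,
            b']' | b'}' if !in_single && !in_double => depth -= 1,
            _ => {}
        }
    }
    depth != 0
}
```

The prefilter pays off when the predicate is run per line and the interesting bytes are rare, which is exactly the shape of TBD input.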
@@ -426,11 +436,12 @@ fn find_top_level_mapping_colon(s: &str) -> Option<usize> {
             }
             b'[' | b'{' if !in_single && !in_double => depth += 1,
             b']' | b'}' if !in_single && !in_double => depth -= 1,
-            b':' if !in_single && !in_double && depth == 0 => {
-                // Must be followed by whitespace or end of line per YAML rules.
-                if i + 1 == bytes.len() || bytes[i + 1] == b' ' || bytes[i + 1] == b'\t' {
-                    return Some(i);
-                }
+            b':' if !in_single
+                && !in_double
+                && depth == 0
+                && (i + 1 == bytes.len() || bytes[i + 1] == b' ' || bytes[i + 1] == b'\t') =>
+            {
+                return Some(i);
             }
             _ => {}
         }
@@ -481,21 +492,23 @@ fn parse_flow_sequence(s: &str, line: usize, col: usize) -> Result<Value, YamlEr
         });
     }
     let inner = &s[1..s.len() - 1];
-    let items = split_flow_items(inner);
-    let mut out = Vec::with_capacity(items.len());
-    for piece in items {
+    let mut out = Vec::new();
+    split_flow_items(inner, |piece| {
         let piece = piece.trim();
         if piece.is_empty() {
-            continue;
+            return Ok(());
         }
         // Recursive: flow sequences can hold scalars or further flow sequences.
         out.push(parse_inline_value(piece, line, col)?);
-    }
+        Ok(())
+    })?;
     Ok(Value::Sequence(out))
 }
 
-fn split_flow_items(s: &str) -> Vec<&str> {
-    let mut parts = Vec::new();
+fn split_flow_items(
+    s: &str,
+    mut visit: impl FnMut(&str) -> Result<(), YamlError>,
+) -> Result<(), YamlError> {
     let bytes = s.as_bytes();
     let mut in_single = false;
     let mut in_double = false;
@@ -514,7 +527,7 @@ fn split_flow_items(s: &str) -> Vec<&str> {
             b'[' | b'{' if !in_single && !in_double => depth += 1,
             b']' | b'}' if !in_single && !in_double => depth -= 1,
             b',' if !in_single && !in_double && depth == 0 => {
-                parts.push(&s[start..i]);
+                visit(&s[start..i])?;
                 start = i + 1;
             }
             _ => {}
@@ -522,9 +535,9 @@
         i += 1;
    }
     if start <= s.len() {
-        parts.push(&s[start..]);
+        visit(&s[start..])?;
     }
-    parts
+    Ok(())
 }
 
 fn parse_single_quoted(s: &str, line: usize, col: usize) -> Result<String, YamlError> {
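The `split_flow_items` change swaps a `Vec<&str>`-returning splitter for a visitor callback, so each flow sequence is split without allocating an intermediate vector, and the fallible closure lets the caller's `?` short-circuit the walk. A simplified sketch of the shape (quote handling dropped; `split_top_level` is a hypothetical name):

```rust
// Split a flow-sequence body on top-level commas, handing each piece to a
// caller-supplied closure instead of collecting into a Vec.
fn split_top_level(
    s: &str,
    mut visit: impl FnMut(&str) -> Result<(), String>,
) -> Result<(), String> {
    let mut depth = 0i32;
    let mut start = 0usize;
    for (i, b) in s.bytes().enumerate() {
        match b {
            b'[' | b'{' => depth += 1,
            b']' | b'}' => depth -= 1,
            b',' if depth == 0 => {
                visit(&s[start..i])?; // an Err here aborts the whole walk
                start = i + 1;
            }
            _ => {}
        }
    }
    // The final piece after the last top-level comma (or the whole string).
    visit(&s[start..])
}
```

A caller that still wants a `Vec` can push inside the closure; callers that only inspect each piece skip the allocation entirely.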
src/macho/writer.rs modified
@@ -20,12 +20,14 @@ use crate::macho::reader::{
     LinkEditDataCmd, LoadCommand, MachHeader64, RpathCmd, Section64Header, Segment64, SymtabCmd,
     HEADER_SIZE,
 };
-use crate::reloc::{parse_raw_relocs, parse_relocs, Referent, Reloc, RelocKind, RelocLength};
-use crate::resolve::InputId;
+use crate::reloc::{
+    parse_raw_relocs, parse_relocs, ParsedRelocCache, Referent, Reloc, RelocKind, RelocLength,
+};
+use crate::resolve::{AtomId, InputId};
 use crate::resolve::{Symbol, SymbolId, SymbolTable};
 use crate::section::is_executable;
 use crate::string_table::StringTableBuilder;
-use crate::symbol::{write_nlist_table, InputSymbol, RawNlist, SymKind};
+use crate::symbol::{write_nlist_table, InputSymbol, RawNlist, SymKind, NLIST_SIZE};
 use crate::synth::tlv::THREAD_VARIABLE_DESCRIPTOR_SIZE;
 use crate::synth::{
     code_sig::CodeSignaturePlan,
@@ -53,8 +55,6 @@ pub struct LinkEditContext<'a> {
     pub parsed_relocs: &'a ParsedRelocCache,
 }
 
-pub type ParsedRelocCache = HashMap<(InputId, u8), Vec<Reloc>>;
-
 #[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
 pub struct LinkEditBuildTimings {
     pub symbol_plan: Duration,
@@ -62,6 +62,9 @@ pub struct LinkEditBuildTimings {
     pub symbol_plan_globals: Duration,
     pub symbol_plan_strtab: Duration,
     pub dyld_info: Duration,
+    pub dyld_bind: Duration,
+    pub dyld_rebase: Duration,
+    pub dyld_export: Duration,
     pub metadata_tables: Duration,
     pub code_signature: Duration,
 }
@@ -73,6 +76,9 @@ impl std::ops::AddAssign for LinkEditBuildTimings {
         self.symbol_plan_globals += rhs.symbol_plan_globals;
         self.symbol_plan_strtab += rhs.symbol_plan_strtab;
         self.dyld_info += rhs.dyld_info;
+        self.dyld_bind += rhs.dyld_bind;
+        self.dyld_rebase += rhs.dyld_rebase;
+        self.dyld_export += rhs.dyld_export;
         self.metadata_tables += rhs.metadata_tables;
         self.code_signature += rhs.code_signature;
     }
@@ -421,7 +427,7 @@ pub fn write_finalized_with_linkedit(
     out[stroff..end].copy_from_slice(&linkedit_plan.strtab_bytes);
     if let Some(code_signature) = &linkedit_plan.code_signature {
         let start = code_signature.dataoff as usize;
-        let bytes = code_signature.build(&out[..start]);
+        let bytes = code_signature.build_with_jobs(&out[..start], opts.parallel_jobs());
         let end = start + bytes.len();
         out[start..end].copy_from_slice(&bytes);
     }
@@ -477,12 +483,6 @@ fn build_commands(
         }
     }
 
-    for rpath in &opts.rpaths {
-        commands.push(LoadCommand::Rpath(RpathCmd {
-            path: rpath.clone(),
-        }));
-    }
-
     for dylib in dylibs {
         commands.push(LoadCommand::Dylib(DylibCmd {
             cmd: dylib.kind.load_cmd(),
@@ -493,6 +493,12 @@
         }));
     }
 
+    for rpath in &opts.rpaths {
+        commands.push(LoadCommand::Rpath(RpathCmd {
+            path: rpath.clone(),
+        }));
+    }
+
     if let Some(loh) = linkedit.loh {
         commands.push(raw_linkedit_command(
             LC_LINKER_OPTIMIZATION_HINT,
@@ -921,7 +927,7 @@ fn build_linkedit_plan_profiled(
     timings.symbol_plan_locals += symbol_plan_timings.locals;
     timings.symbol_plan_globals += symbol_plan_timings.globals;
     timings.symbol_plan_strtab += symbol_plan_timings.strtab;
-    let mut symtab_bytes = Vec::new();
+    let mut symtab_bytes = Vec::with_capacity(symbol_plan.symbols.len() * NLIST_SIZE);
     write_nlist_table(&symbol_plan.symbols, &mut symtab_bytes);
 
     let mut indirect_symbols = Vec::new();
@@ -956,14 +962,20 @@
         indirect_bytes.extend_from_slice(&index.to_le_bytes());
     }
 
+    let dyld_started = std::time::Instant::now();
     let phase_started = std::time::Instant::now();
     let bind_streams = build_bind_streams(layout, synthetic_plan, &import_lookup)?;
-    let rebase_bytes = pad_dyld_info_stream(build_rebase_stream(layout, synthetic_plan, inputs)?);
     let bind_bytes = pad_dyld_info_stream(bind_streams.bind);
     let weak_bind_bytes = pad_dyld_info_stream(bind_streams.weak_bind);
     let lazy_bind_bytes = pad_dyld_info_stream(bind_streams.lazy_bind);
+    timings.dyld_bind += phase_started.elapsed();
+    let phase_started = std::time::Instant::now();
+    let rebase_bytes = pad_dyld_info_stream(build_rebase_stream(layout, synthetic_plan, inputs)?);
+    timings.dyld_rebase += phase_started.elapsed();
+    let phase_started = std::time::Instant::now();
     let export_bytes = pad_dyld_info_stream(build_export_trie(&symbol_plan.exports));
-    timings.dyld_info += phase_started.elapsed();
+    timings.dyld_export += phase_started.elapsed();
+    timings.dyld_info += dyld_started.elapsed();
 
     let phase_started = std::time::Instant::now();
     let loh_bytes = build_loh(
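The timing change keeps the coarse `dyld_info` bucket comparable with older profiles while adding per-phase buckets underneath it; the pattern is just an outer `Instant` spanning the stage plus inner `Instant`s restarted per phase. A minimal sketch (the struct and workloads here are stand-ins, not the crate's types):

```rust
use std::time::{Duration, Instant};

#[derive(Default)]
struct PhaseTimings {
    total: Duration, // spans both phases, like the old single bucket
    bind: Duration,
    rebase: Duration,
}

// Outer Instant measures the whole stage; inner Instants are restarted per
// phase, so `total` stays comparable while the new buckets show the split.
fn run_phases(timings: &mut PhaseTimings) {
    let stage_started = Instant::now();
    let phase_started = Instant::now();
    std::hint::black_box((0..10_000u64).sum::<u64>()); // stand-in for bind work
    timings.bind += phase_started.elapsed();
    let phase_started = Instant::now();
    std::hint::black_box((0..10_000u64).map(|x| x ^ 1).sum::<u64>()); // stand-in for rebase work
    timings.rebase += phase_started.elapsed();
    timings.total += stage_started.elapsed();
}
```

Because the outer clock encloses the inner ones, the total is always at least as large as any single phase, which keeps dashboards built on the old field monotone.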
@@ -1417,10 +1429,7 @@ fn build_function_starts(
         .segment("__TEXT")
         .ok_or(WriteError::MissingSegment("__TEXT"))?
         .vm_addr;
-    let input_map: HashMap<InputId, &ObjectFile> = inputs
-        .iter()
-        .map(|input| (input.id, input.object))
-        .collect();
+    let symbol_offsets = build_function_start_symbol_index(inputs);
     let mut starts = Vec::new();
 
     for section in &layout.sections {
@@ -1435,36 +1444,19 @@
                     section.addr + placed.offset + alt.offset_within_atom as u64 - image_base,
                 );
             }
-            let Some(object) = input_map.get(&atom.origin) else {
-                continue;
-            };
-            let Some(input_section) = object
-                .sections
-                .get((atom.input_section as usize).saturating_sub(1))
+            let Some(section_symbols) = symbol_offsets.get(&(atom.origin, atom.input_section))
             else {
                 continue;
             };
-            let atom_start = input_section.addr + atom.input_offset as u64;
+            let atom_start = atom.input_offset as u64;
             let atom_end = atom_start + atom.size as u64;
-            for input_sym in &object.symbols {
-                if input_sym.stab_kind().is_some()
-                    || input_sym.kind() != SymKind::Sect
-                    || input_sym.alt_entry()
-                    || input_sym.sect_idx() != atom.input_section
-                {
-                    continue;
-                }
-                let Ok(name) = object.symbol_name(input_sym) else {
-                    continue;
-                };
-                if is_assembler_temporary_symbol(name) {
-                    continue;
-                }
-                let value = input_sym.value();
-                if !(atom_start < value && value < atom_end) {
-                    continue;
-                }
-                starts.push(section.addr + placed.offset + (value - atom_start) - image_base);
+            let start_idx = section_symbols.partition_point(|&offset| offset <= atom_start);
+            let end_idx = section_symbols.partition_point(|&offset| offset < atom_end);
+            if start_idx >= end_idx {
+                continue;
+            }
+            for &offset in &section_symbols[start_idx..end_idx] {
+                starts.push(section.addr + placed.offset + (offset - atom_start) - image_base);
             }
         }
     }
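The rewritten loop above replaces a per-atom linear scan over every input symbol with two binary searches on a pre-sorted offset list: `partition_point(o <= start)` finds the first offset strictly after the atom start, and `partition_point(o < end)` finds the first offset at or past the atom end, so the slice between them holds exactly the offsets strictly inside `(start, end)`, matching the old `atom_start < value && value < atom_end` filter. A sketch with plain `u64` offsets (`offsets_within` is an illustrative name):

```rust
// Offsets strictly inside (start, end), via two binary searches on a sorted,
// deduplicated slice. The exclusive lower bound mirrors the function-starts
// logic above: the atom's own entry point at `start` is never a candidate.
fn offsets_within(sorted: &[u64], start: u64, end: u64) -> &[u64] {
    let lo = sorted.partition_point(|&o| o <= start); // first offset > start
    let hi = sorted.partition_point(|&o| o < end); // first offset >= end
    &sorted[lo..hi.max(lo)] // guard against an empty (or inverted) window
}
```

With N symbols per section and A atoms, this turns O(N * A) filtering into one O(N log N) sort plus O(log N) per atom.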
@@ -1488,6 +1480,39 @@ fn build_function_starts(
     Ok(out)
 }
 
+type FunctionStartSymbolIndex = HashMap<(InputId, u8), Vec<u64>>;
+
+fn build_function_start_symbol_index(inputs: &[LayoutInput<'_>]) -> FunctionStartSymbolIndex {
+    let mut out: FunctionStartSymbolIndex = HashMap::new();
+    for input in inputs {
+        for input_sym in &input.object.symbols {
+            if input_sym.stab_kind().is_some()
+                || input_sym.kind() != SymKind::Sect
+                || input_sym.alt_entry()
+            {
+                continue;
+            }
+            let Ok(name) = input.object.symbol_name(input_sym) else {
+                continue;
+            };
+            if is_assembler_temporary_symbol(name) {
+                continue;
+            }
+            let Some(section) = input.object.section_for_symbol(input_sym) else {
+                continue;
+            };
+            out.entry((input.id, input_sym.sect_idx()))
+                .or_default()
+                .push(input_sym.value().saturating_sub(section.addr));
+        }
+    }
+    for offsets in out.values_mut() {
+        offsets.sort_unstable();
+        offsets.dedup();
+    }
+    out
+}
+
 fn build_data_in_code(
     layout: &Layout,
     inputs: &[LayoutInput<'_>],
@@ -1693,6 +1718,7 @@ fn build_output_symbols_profiled(
 ) -> Result<(SymbolTablePlan, SymbolPlanBuildTimings), WriteError> {
     let sym_table = inputs.0.sym_table;
     let atom_sections = atom_section_ordinals(layout);
+    let atom_addrs = atom_addresses(layout);
     let atoms_by_input_section = inputs.0.atom_table.by_input_section();
     let atom_ranges = build_atom_range_index(
         inputs.0.atom_table,
@@ -1748,10 +1774,11 @@ fn build_output_symbols_profiled(
             atom_table: inputs.0.atom_table,
             atom_ranges: &atom_ranges,
             atom_sections: &atom_sections,
+            atom_addrs: &atom_addrs,
             input_id: input.id,
             file_index: file_index_by_input[&input.id],
         };
-        collect_local_symbols(layout, &ctx, input.object, &mut locals)?;
+        collect_local_symbols(&ctx, input.object, &mut locals)?;
     }
     collect_synthetic_local_symbols(layout, inputs.0.synthetic_plan, &mut locals)?;
     timings.locals += phase_started.elapsed();
@@ -1779,12 +1806,12 @@ fn build_output_symbols_profiled(
         let (n_type, n_sect, n_value) = if atom.0 == 0 {
             (absolute_symbol_type(hidden), NO_SECT, *value)
         } else {
-            if dead_strip && layout.atom_addr(*atom).is_none() {
-                continue;
-            }
-            let addr = layout
-                .atom_addr(*atom)
-                .ok_or(WriteError::DefinedSymbolAtomMissing(symbol_id, *atom))?;
+            let Some(addr) = atom_addrs.get(atom).copied() else {
+                if dead_strip {
+                    continue;
+                }
+                return Err(WriteError::DefinedSymbolAtomMissing(symbol_id, *atom));
+            };
             let sect = *atom_sections
                 .get(atom)
                 .ok_or(WriteError::DefinedSymbolSectionMissing(symbol_id, *atom))?;
@@ -1871,8 +1898,10 @@ fn build_output_symbols_profiled(
         Vec::new()
     };
 
-    let phase_started = std::time::Instant::now();
     let local_count = if strip_locals { 0 } else { locals.len() };
+    let external_defined_count = external_defineds.len();
+    let undefined_count = undefineds.len();
+    let phase_started = std::time::Instant::now();
     let mut specs = Vec::with_capacity(local_count + external_defineds.len() + undefineds.len());
     if !strip_locals {
         specs.extend(locals);
@@ -1880,27 +1909,11 @@ fn build_output_symbols_profiled(
     specs.extend(external_defineds);
     specs.extend(undefineds);
 
-    let mut strtab = StringTableBuilder::new();
-    for spec in &specs {
-        strtab.insert(&spec.name);
-    }
-    let (strtab_bytes, strx_by_name) = strtab.finish();
-
-    let nlocalsym = specs
-        .iter()
-        .filter(|spec| spec.partition == OutputSymbolPartition::Local)
-        .count() as u32;
-    let nextdefsym = specs
-        .iter()
-        .filter(|spec| spec.partition == OutputSymbolPartition::ExternalDefined)
-        .count() as u32;
-    let nundefsym = specs
-        .iter()
-        .filter(|spec| spec.partition == OutputSymbolPartition::Undefined)
-        .count() as u32;
+    let (strtab_bytes, strx_by_spec) =
+        StringTableBuilder::build_with_name_offsets(specs.iter().map(|spec| spec.name.as_str()));
 
     let mut symbols = Vec::with_capacity(specs.len());
-    let mut symbol_indices = HashMap::new();
+    let mut symbol_indices = HashMap::with_capacity(specs.len());
    let map_symbols = specs
         .iter()
         .filter(|spec| spec.partition != OutputSymbolPartition::Undefined)
@@ -1912,9 +1925,7 @@ fn build_output_symbols_profiled(
         })
         .collect();
     for (idx, spec) in specs.into_iter().enumerate() {
-        let strx = *strx_by_name
-            .get(&spec.name)
-            .expect("string table offset missing for output symbol");
+        let strx = strx_by_spec[idx];
         symbols.push(InputSymbol::from_raw(RawNlist {
             strx,
             n_type: spec.n_type,
@@ -1937,11 +1948,11 @@ fn build_output_symbols_profiled(
             exports,
             dysymtab: DysymtabCmd {
                 ilocalsym: 0,
-                nlocalsym,
-                iextdefsym: nlocalsym,
-                nextdefsym,
-                iundefsym: nlocalsym + nextdefsym,
-                nundefsym,
+                nlocalsym: local_count as u32,
+                iextdefsym: local_count as u32,
+                nextdefsym: external_defined_count as u32,
+                iundefsym: (local_count + external_defined_count) as u32,
+                nundefsym: undefined_count as u32,
                 ..DysymtabCmd::default()
             },
         },
@@ -1992,7 +2003,6 @@ fn collect_synthetic_local_symbols(
 }
 
 fn collect_local_symbols(
-    layout: &Layout,
     ctx: &LocalSymbolContext<'_>,
     object: &ObjectFile,
     out: &mut Vec<OutputSymbolSpec>,
@@ -2021,14 +2031,9 @@ fn collect_local_symbols(
                     offset,
                 )
                 .ok_or(WriteError::MissingSegment("__UNKNOWN"))?;
-                let addr =
-                    layout
-                        .atom_addr(atom_id)
-                        .ok_or(WriteError::DefinedSymbolAtomMissing(
-                            SymbolId(u32::MAX),
-                            atom_id,
-                        ))?
-                        + delta as u64;
+                let addr = ctx.atom_addrs.get(&atom_id).copied().ok_or(
+                    WriteError::DefinedSymbolAtomMissing(SymbolId(u32::MAX), atom_id),
+                )? + delta as u64;
                 let n_sect = *ctx.atom_sections.get(&atom_id).ok_or(
                     WriteError::DefinedSymbolSectionMissing(SymbolId(u32::MAX), atom_id),
                 )?;
@@ -2067,6 +2072,7 @@ struct LocalSymbolContext<'a> {
     atom_table: &'a AtomTable,
     atom_ranges: &'a AtomRangeIndex,
     atom_sections: &'a HashMap<crate::resolve::AtomId, u8>,
+    atom_addrs: &'a HashMap<crate::resolve::AtomId, u64>,
     input_id: InputId,
     file_index: usize,
 }
@@ -2176,6 +2182,16 @@ fn atom_section_ordinals(layout: &Layout) -> HashMap<crate::resolve::AtomId, u8>
     out
 }
 
+fn atom_addresses(layout: &Layout) -> HashMap<AtomId, u64> {
+    let mut out = HashMap::new();
+    for section in &layout.sections {
+        for placed in &section.atoms {
+            out.insert(placed.atom, section.addr + placed.offset);
+        }
+    }
+    out
+}
+
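Reviewer note: the new `atom_addresses` helper trades repeated `layout.atom_addr` walks for a single O(atoms) pass, after which every lookup is O(1). A minimal standalone sketch of the same pattern (the `Section`/`PlacedAtom` types below are toy stand-ins for the crate's layout types, not its real API):

```rust
use std::collections::HashMap;

// Toy stand-ins for the crate's layout types; names are illustrative only.
struct PlacedAtom {
    atom: u32,
    offset: u64,
}
struct Section {
    addr: u64,
    atoms: Vec<PlacedAtom>,
}

// One pass over every placed atom replaces repeated linear searches with
// O(1) map lookups afterwards: addr = section base + placement offset.
fn atom_addresses(sections: &[Section]) -> HashMap<u32, u64> {
    let mut out = HashMap::new();
    for section in sections {
        for placed in &section.atoms {
            out.insert(placed.atom, section.addr + placed.offset);
        }
    }
    out
}

fn main() {
    let sections = vec![Section {
        addr: 0x1000,
        atoms: vec![PlacedAtom { atom: 7, offset: 0x20 }],
    }];
    let map = atom_addresses(&sections);
    assert_eq!(map.get(&7).copied(), Some(0x1020));
}
```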
 fn export_symbol_flags(layout: &Layout, n_desc: u16, n_type: u8, n_sect: u8) -> u64 {
     let mut flags = 0u64;
     if n_desc & N_WEAK_DEF != 0 {
@@ -2371,6 +2387,7 @@ fn build_bind_streams(
     let weak_bind = Vec::new();
     let mut lazy_bind = OpcodeStream::new();
     let mut lazy_offsets = HashMap::new();
+    let layout_index = BindLayoutIndex::build(layout)?;
 
     if let Some(tlv_bootstrap) = synthetic_plan.tlv_bootstrap_symbol {
         let segment_index = segment_index(layout, "__DATA")?;
@@ -2437,29 +2454,21 @@ fn build_bind_streams(
             .get(&entry.symbol)
             .copied()
             .ok_or(WriteError::ImportSymbolMissing(entry.symbol))?;
-        let atom_addr = layout
-            .atom_addr(entry.atom)
+        let placement = layout_index
+            .atoms
+            .get(&entry.atom)
             .ok_or(WriteError::DirectBindAtomMissing(entry.atom))?;
-        let section = layout
-            .sections
-            .iter()
-            .find(|section| section.atoms.iter().any(|placed| placed.atom == entry.atom))
-            .ok_or(WriteError::DirectBindSectionMissing(entry.atom))?;
-        if section.segment == "__DATA" && section.name == "__thread_vars" {
+        if placement.is_thread_vars {
             // `__thread_vars` starts are emitted through the dedicated
             // `__tlv_bootstrap` pass above. Descriptor tails are rewritten to
             // template offsets before write, so any generic direct bind landing
             // back in this section is stale and would override the TLV bind.
             continue;
         }
-        let segment_index = segment_index(layout, &section.segment)?;
-        let segment = layout
-            .segment(&section.segment)
-            .ok_or(WriteError::MissingSegment("__UNKNOWN"))?;
-        let slot_addr = atom_addr + entry.atom_offset as u64;
+        let slot_addr = placement.addr + entry.atom_offset as u64;
         bind_specs.push(BindRecordSpec {
-            segment_index,
-            segment_offset: slot_addr - segment.vm_addr,
+            segment_index: placement.segment_index,
+            segment_offset: slot_addr - placement.segment_vm_addr,
             ordinal: import.ordinal,
             name: &import.name,
             weak_import: import.weak_import,
@@ -2508,6 +2517,59 @@ fn build_bind_streams(
     })
 }
 
+struct BindLayoutIndex {
+    atoms: HashMap<AtomId, BindAtomPlacement>,
+}
+
+#[derive(Clone, Copy)]
+struct BindAtomPlacement {
+    addr: u64,
+    segment_index: u8,
+    segment_vm_addr: u64,
+    is_thread_vars: bool,
+}
+
+impl BindLayoutIndex {
+    fn build(layout: &Layout) -> Result<Self, WriteError> {
+        let mut segment_meta = HashMap::with_capacity(layout.segments.len());
+        for (idx, segment) in layout.segments.iter().enumerate() {
+            segment_meta.insert(
+                segment.name.as_str(),
+                (
+                    u8::try_from(idx).map_err(|_| WriteError::OffsetTooLarge("segment index"))?,
+                    segment.vm_addr,
+                ),
+            );
+        }
+        let atom_count: usize = layout
+            .sections
+            .iter()
+            .map(|section| section.atoms.len())
+            .sum();
+        let mut atoms = HashMap::with_capacity(atom_count);
+        for section in &layout.sections {
+            let Some((segment_index, segment_vm_addr)) =
+                segment_meta.get(section.segment.as_str()).copied()
+            else {
+                continue;
+            };
+            let is_thread_vars = section.segment == "__DATA" && section.name == "__thread_vars";
+            for placed in &section.atoms {
+                atoms.insert(
+                    placed.atom,
+                    BindAtomPlacement {
+                        addr: section.addr + placed.offset,
+                        segment_index,
+                        segment_vm_addr,
+                        is_thread_vars,
+                    },
+                );
+            }
+        }
+        Ok(Self { atoms })
+    }
+}
+
 fn segment_index(layout: &Layout, name: &str) -> Result<u8, WriteError> {
25122574
     let idx = layout
25132575
         .segments
@@ -2593,12 +2655,15 @@ fn u32_fit(value: u64, what: &'static str) -> Result<u32, WriteError> {
25932655
 #[cfg(test)]
25942656
 mod tests {
25952657
     use crate::atom::{AltEntry, Atom, AtomFlags, AtomSection, AtomTable};
2658
+    use crate::input::ObjectFile;
25962659
     use crate::layout::{Layout, PAGE_SIZE};
25972660
     use crate::leb::read_uleb;
2661
+    use crate::macho::reader::MachHeader64;
25982662
     use crate::resolve::{AtomId, InputId, SymbolId};
25992663
     use crate::section::{
2600
-        OutputAtom, OutputSection, OutputSectionId, OutputSegment, Prot, SectionKind,
2664
+        InputSection, OutputAtom, OutputSection, OutputSectionId, OutputSegment, Prot, SectionKind,
26012665
     };
2666
+    use crate::string_table::StringTable;
26022667
 
26032668
     use super::*;
26042669
 
@@ -2861,6 +2926,160 @@ mod tests {
28612926
         );
28622927
     }
28632928
 
2929
+    #[test]
2930
+    fn function_starts_index_uses_only_interior_named_entries() {
2931
+        let mut atoms = AtomTable::new();
2932
+        let atom_id = atoms.push(Atom {
2933
+            id: AtomId(0),
2934
+            origin: InputId(1),
2935
+            input_section: 1,
2936
+            section: AtomSection::Text,
2937
+            input_offset: 0,
2938
+            size: 16,
2939
+            align_pow2: 2,
2940
+            owner: None,
2941
+            alt_entries: Vec::new(),
2942
+            data: vec![0; 16],
2943
+            flags: AtomFlags::NONE,
2944
+            parent_of: None,
2945
+        });
2946
+        let zero_atom_id = atoms.push(Atom {
2947
+            id: AtomId(0),
2948
+            origin: InputId(1),
2949
+            input_section: 1,
2950
+            section: AtomSection::Text,
2951
+            input_offset: 16,
2952
+            size: 0,
2953
+            align_pow2: 2,
2954
+            owner: None,
2955
+            alt_entries: Vec::new(),
2956
+            data: Vec::new(),
2957
+            flags: AtomFlags::NONE,
2958
+            parent_of: None,
2959
+        });
2960
+        let object = object_with_text_symbols(&[
2961
+            ("_start", 0x1000, 0),
2962
+            ("Ltmp0", 0x1004, 0),
2963
+            ("_middle", 0x1008, 0),
2964
+            ("_alt", 0x100c, N_ALT_ENTRY),
2965
+            ("_end", 0x1010, 0),
2966
+        ]);
2967
+        let inputs = [LayoutInput {
2968
+            id: InputId(1),
2969
+            object: &object,
2970
+            load_order: 0,
2971
+            archive_member_offset: None,
2972
+        }];
2973
+        let layout = Layout {
2974
+            kind: OutputKind::Executable,
2975
+            segments: vec![OutputSegment {
2976
+                name: "__TEXT".into(),
2977
+                sections: vec![OutputSectionId(0)],
2978
+                vm_addr: 0x1_0000_0000,
2979
+                vm_size: 0x4000,
2980
+                file_off: 0,
2981
+                file_size: 0x4000,
2982
+                init_prot: Prot::READ_EXECUTE,
2983
+                max_prot: Prot::READ_EXECUTE,
2984
+                flags: 0,
2985
+            }],
2986
+            sections: vec![OutputSection {
2987
+                segment: "__TEXT".into(),
2988
+                name: "__text".into(),
2989
+                kind: SectionKind::Text,
2990
+                align_pow2: 2,
2991
+                flags: 0,
2992
+                reserved1: 0,
2993
+                reserved2: 0,
2994
+                reserved3: 0,
2995
+                atoms: vec![
2996
+                    OutputAtom {
2997
+                        atom: atom_id,
2998
+                        offset: 0,
2999
+                        size: 16,
3000
+                        data: vec![0; 16],
3001
+                    },
3002
+                    OutputAtom {
3003
+                        atom: zero_atom_id,
3004
+                        offset: 16,
3005
+                        size: 0,
3006
+                        data: Vec::new(),
3007
+                    },
3008
+                ],
3009
+                synthetic_offset: 0,
3010
+                synthetic_data: Vec::new(),
3011
+                addr: 0x1_0000_1000,
3012
+                size: 16,
3013
+                file_off: 0x1000,
3014
+            }],
3015
+        };
3016
+
3017
+        let blob = build_function_starts(&layout, &inputs, &atoms).unwrap();
3018
+        assert_eq!(
3019
+            decode_function_starts_blob(&blob),
3020
+            vec![0x1000, 0x1008, 0x1010]
3021
+        );
3022
+    }
3023
+
3024
+    fn object_with_text_symbols(symbols: &[(&str, u64, u16)]) -> ObjectFile {
3025
+        let mut strings = vec![0];
3026
+        let mut strx = Vec::new();
3027
+        for (name, _, _) in symbols {
3028
+            strx.push(strings.len() as u32);
3029
+            strings.extend_from_slice(name.as_bytes());
3030
+            strings.push(0);
3031
+        }
3032
+        ObjectFile {
3033
+            path: "function-starts-index.o".into(),
3034
+            header: MachHeader64 {
3035
+                magic: MH_MAGIC_64,
3036
+                cputype: CPU_TYPE_ARM64,
3037
+                cpusubtype: CPU_SUBTYPE_ARM64_ALL,
3038
+                filetype: MH_OBJECT,
3039
+                ncmds: 0,
3040
+                sizeofcmds: 0,
3041
+                flags: 0,
3042
+                reserved: 0,
3043
+            },
3044
+            commands: Vec::new(),
3045
+            sections: vec![InputSection {
3046
+                segname: "__TEXT".into(),
3047
+                sectname: "__text".into(),
3048
+                kind: SectionKind::Text,
3049
+                addr: 0x1000,
3050
+                size: 16,
3051
+                align_pow2: 2,
3052
+                flags: 0,
3053
+                offset: 0,
3054
+                reloff: 0,
3055
+                nreloc: 0,
3056
+                reserved1: 0,
3057
+                reserved2: 0,
3058
+                reserved3: 0,
3059
+                data: vec![0; 16],
3060
+                raw_relocs: Vec::new(),
3061
+            }],
3062
+            symbols: symbols
3063
+                .iter()
3064
+                .zip(strx)
3065
+                .map(|((_, value, desc), strx)| {
3066
+                    InputSymbol::from_raw(RawNlist {
3067
+                        strx,
3068
+                        n_type: N_SECT,
3069
+                        n_sect: 1,
3070
+                        n_desc: *desc,
3071
+                        n_value: *value,
3072
+                    })
3073
+                })
3074
+                .collect(),
3075
+            strings: StringTable::from_bytes(strings),
3076
+            symtab: None,
3077
+            dysymtab: None,
3078
+            loh: Vec::new(),
3079
+            data_in_code: Vec::new(),
3080
+        }
3081
+    }
3082
+
28643083
     #[test]
28653084
     fn containing_atom_lookup_reuses_precomputed_section_index() {
28663085
         let mut atoms = AtomTable::new();
src/main.rs (modified)
@@ -45,6 +45,7 @@ Options:
                                   Select chained fixups vs classic dyld info
   -all_load                       Force-load every archive member
   -force_load <archive>           Force-load one archive
+  -j <jobs>                       Limit parallel worker jobs (`1` disables parallelism)
   -Wl,<arg,arg,...>               Normalize comma-separated driver flags
   --dump <path>                   Dump a Mach-O file summary
   --dump-archive <path>           Dump an archive summary
src/reloc/arm64.rs (modified)
@@ -1,14 +1,15 @@
 use std::collections::HashMap;
 use std::fmt;
 use std::path::PathBuf;
+use std::thread;
 
 use crate::atom::{Atom, AtomSection, AtomTable};
 use crate::input::ObjectFile;
 use crate::layout::{ExtraOutputSection, ExtraSectionAnchor, Layout, LayoutInput};
 use crate::macho::writer::LinkEditPlan;
-use crate::reloc::{parse_raw_relocs, parse_relocs, Referent, Reloc, RelocKind, RelocLength};
+use crate::reloc::{ParsedRelocCache, Referent, Reloc, RelocKind, RelocLength};
 use crate::resolve::{InputId, Symbol, SymbolId, SymbolTable};
-use crate::section::{OutputSection, SectionKind};
+use crate::section::{OutputAtom, OutputSection, SectionKind};
 use crate::symbol::{InputSymbol, SymKind};
 use crate::synth::stubs::{STUB_HELPER_ENTRY_SIZE, STUB_HELPER_HEADER_SIZE, STUB_SIZE};
 use crate::synth::tlv::THREAD_VARIABLE_DESCRIPTOR_SIZE;
@@ -72,6 +73,8 @@ pub struct ApplyLayoutPlan<'a> {
     pub thunk_plan: Option<&'a ThunkPlan>,
     pub linkedit: &'a LinkEditPlan,
     pub icf_redirects: Option<&'a HashMap<crate::resolve::AtomId, crate::resolve::AtomId>>,
+    pub parsed_relocs: &'a ParsedRelocCache,
+    pub parallel_jobs: usize,
 }
 
 struct InputSectionResolveCtx<'a> {
@@ -81,8 +84,18 @@ struct InputSectionResolveCtx<'a> {
     referent: &'a str,
 }
 
+struct RegularRelocContext<'a> {
+    input_map: &'a HashMap<InputId, &'a ObjectFile>,
+    atoms: &'a AtomTable,
+    resolve: &'a ResolveView<'a>,
+    thunk_plan: Option<&'a ThunkPlan>,
+    thunk_addrs: Option<&'a HashMap<usize, u64>>,
+    parsed_relocs: &'a ParsedRelocCache,
+}
+
 const THUNK_SIZE: u64 = 12;
 const BR_X16: u32 = 0xd61f_0200;
+const BRANCH26_MAX_FORWARD_DELTA_BYTES: u64 = ((1u64 << 25) - 1) * 4;
 
 #[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
 enum BranchTargetKey {
@@ -206,36 +219,6 @@ pub fn apply_layout(
         .iter()
         .map(|input| (input.id, input.object))
         .collect();
-    let mut reloc_cache: HashMap<(InputId, u8), Vec<Reloc>> = HashMap::new();
-    for input in inputs {
-        for (sect_idx, section) in input.object.sections.iter().enumerate() {
-            let relocs = if section.nreloc == 0 {
-                Vec::new()
-            } else {
-                let raws =
-                    parse_raw_relocs(&section.raw_relocs, 0, section.nreloc).map_err(|err| {
-                        RelocError {
-                            input: input.object.path.clone(),
-                            atom: crate::resolve::AtomId(0),
-                            atom_offset: 0,
-                            kind: RelocKind::Unsigned,
-                            referent: format!("section {},{}", section.segname, section.sectname),
-                            detail: err.to_string(),
-                        }
-                    })?;
-                parse_relocs(&raws).map_err(|err| RelocError {
-                    input: input.object.path.clone(),
-                    atom: crate::resolve::AtomId(0),
-                    atom_offset: 0,
-                    kind: RelocKind::Unsigned,
-                    referent: format!("section {},{}", section.segname, section.sectname),
-                    detail: err.to_string(),
-                })?
-            };
-            reloc_cache.insert((input.id, (sect_idx + 1) as u8), relocs);
-        }
-    }
-
     let atom_addrs = atom_address_map(layout);
     let atoms_by_input_section = atoms.by_input_section();
     let section_addrs = input_section_address_map(layout, atoms);
@@ -260,40 +243,15 @@ pub fn apply_layout(
         .thunk_plan
         .map(|thunk_plan| thunk_plan.thunk_addrs(layout));
 
-    for out_section in &mut layout.sections {
-        for placed in &mut out_section.atoms {
-            let atom = atoms.get(placed.atom);
-            if atom.size == 0 || placed.data.is_empty() {
-                continue;
-            }
-            let obj = input_map.get(&atom.origin).ok_or_else(|| {
-                reloc_error(
-                    atom,
-                    &PathBuf::from("<missing object>"),
-                    0,
-                    RelocKind::Unsigned,
-                    "object",
-                    "missing parsed object".to_string(),
-                )
-            })?;
-            patch_eh_frame_cie_pointer(&mut placed.data, atom, &resolve)?;
-            let relocs = reloc_cache
-                .get(&(atom.origin, atom.input_section))
-                .map(Vec::as_slice)
-                .unwrap_or(&[]);
-            for reloc in relocs_for_atom(relocs, atom) {
-                apply_one(
-                    &mut placed.data,
-                    atom,
-                    obj,
-                    reloc,
-                    &resolve,
-                    plan.thunk_plan,
-                    thunk_addrs.as_ref(),
-                )?;
-            }
-        }
-    }
+    let regular_ctx = RegularRelocContext {
+        input_map: &input_map,
+        atoms,
+        resolve: &resolve,
+        thunk_plan: plan.thunk_plan,
+        thunk_addrs: thunk_addrs.as_ref(),
+        parsed_relocs: plan.parsed_relocs,
+    };
+    apply_regular_relocs(layout, &regular_ctx, plan.parallel_jobs)?;
 
     if let Some(thunk_plan) = plan.thunk_plan {
         synthesize_thunk_section(layout, thunk_plan, &resolve)?;
@@ -305,7 +263,7 @@ pub fn apply_layout(
             synthetic_plan,
             atoms,
             &input_map,
-            &reloc_cache,
+            plan.parsed_relocs,
             &resolve,
         )?;
         synthesize_got_section(layout, synthetic_plan, &resolve)?;
@@ -317,6 +275,75 @@ pub fn apply_layout(
     Ok(())
 }
 
+fn apply_regular_relocs(
+    layout: &mut Layout,
+    ctx: &RegularRelocContext<'_>,
+    parallel_jobs: usize,
+) -> Result<(), RelocError> {
+    let parallel_jobs = parallel_jobs.max(1);
+    for out_section in &mut layout.sections {
+        let atom_count = out_section.atoms.len();
+        if parallel_jobs == 1 || atom_count < 2 {
+            apply_regular_atom_chunk(&mut out_section.atoms, ctx)?;
+            continue;
+        }
+
+        let job_count = parallel_jobs.min(atom_count).max(1);
+        let chunk_size = atom_count.div_ceil(job_count);
+        thread::scope(|scope| {
+            let mut handles = Vec::new();
+            for chunk in out_section.atoms.chunks_mut(chunk_size) {
+                handles.push(scope.spawn(move || apply_regular_atom_chunk(chunk, ctx)));
+            }
+            for handle in handles {
+                handle.join().expect("relocation worker panicked")?;
+            }
+            Ok::<(), RelocError>(())
+        })?;
+    }
+    Ok(())
+}
+
+fn apply_regular_atom_chunk(
+    placed_atoms: &mut [OutputAtom],
+    ctx: &RegularRelocContext<'_>,
+) -> Result<(), RelocError> {
+    for placed in placed_atoms {
+        let atom = ctx.atoms.get(placed.atom);
+        if atom.size == 0 || placed.data.is_empty() {
+            continue;
+        }
+        let obj = ctx.input_map.get(&atom.origin).ok_or_else(|| {
+            reloc_error(
+                atom,
+                &PathBuf::from("<missing object>"),
+                0,
+                RelocKind::Unsigned,
+                "object",
+                "missing parsed object".to_string(),
+            )
+        })?;
+        patch_eh_frame_cie_pointer(&mut placed.data, atom, ctx.resolve)?;
+        let relocs = ctx
+            .parsed_relocs
+            .get(&(atom.origin, atom.input_section))
+            .map(Vec::as_slice)
+            .unwrap_or(&[]);
+        for reloc in relocs_for_atom(relocs, atom) {
+            apply_one(
+                &mut placed.data,
+                atom,
+                obj,
+                reloc,
+                ctx.resolve,
+                ctx.thunk_plan,
+                ctx.thunk_addrs,
+            )?;
+        }
+    }
+    Ok(())
+}
+
 fn patch_eh_frame_cie_pointer(
321348
     bytes: &mut [u8],
322349
     atom: &Atom,
@@ -510,53 +537,42 @@ fn synthetic_address_maps(
510537
     }
511538
 }
512539
 
540
+pub struct ThunkPlanningContext<'a> {
541
+    pub layout: &'a Layout,
542
+    pub inputs: &'a [LayoutInput<'a>],
543
+    pub atoms: &'a AtomTable,
544
+    pub sym_table: &'a SymbolTable,
545
+    pub synthetic_plan: Option<&'a SyntheticPlan>,
546
+    pub icf_redirects: Option<&'a HashMap<crate::resolve::AtomId, crate::resolve::AtomId>>,
547
+    pub parsed_relocs: &'a ParsedRelocCache,
548
+}
549
+
513550
 pub fn plan_thunks(
514551
     opts: &LinkOptions,
515
-    layout: &Layout,
516
-    inputs: &[LayoutInput<'_>],
517
-    atoms: &AtomTable,
518
-    sym_table: &SymbolTable,
519
-    synthetic_plan: Option<&SyntheticPlan>,
520
-    icf_redirects: Option<&HashMap<crate::resolve::AtomId, crate::resolve::AtomId>>,
552
+    ctx: ThunkPlanningContext<'_>,
521553
 ) -> Result<Option<ThunkPlan>, RelocError> {
522554
     if opts.thunks == ThunkMode::None {
523555
         return Ok(None);
524556
     }
525557
 
558
+    let ThunkPlanningContext {
559
+        layout,
560
+        inputs,
561
+        atoms,
562
+        sym_table,
563
+        synthetic_plan,
564
+        icf_redirects,
565
+        parsed_relocs,
566
+    } = ctx;
567
+
568
+    if opts.thunks == ThunkMode::Safe && layout_fits_branch26_span(layout) {
569
+        return Ok(None);
570
+    }
571
+
526572
     let input_map: HashMap<InputId, &ObjectFile> = inputs
527573
         .iter()
528574
         .map(|input| (input.id, input.object))
529575
         .collect();
530
-    let mut reloc_cache: HashMap<(InputId, u8), Vec<Reloc>> = HashMap::new();
531
-    for input in inputs {
532
-        for (sect_idx, section) in input.object.sections.iter().enumerate() {
533
-            let relocs = if section.nreloc == 0 {
534
-                Vec::new()
535
-            } else {
536
-                let raws =
537
-                    parse_raw_relocs(&section.raw_relocs, 0, section.nreloc).map_err(|err| {
538
-                        RelocError {
539
-                            input: input.object.path.clone(),
540
-                            atom: crate::resolve::AtomId(0),
541
-                            atom_offset: 0,
542
-                            kind: RelocKind::Unsigned,
543
-                            referent: format!("section {},{}", section.segname, section.sectname),
544
-                            detail: err.to_string(),
545
-                        }
546
-                    })?;
547
-                parse_relocs(&raws).map_err(|err| RelocError {
548
-                    input: input.object.path.clone(),
549
-                    atom: crate::resolve::AtomId(0),
550
-                    atom_offset: 0,
551
-                    kind: RelocKind::Unsigned,
552
-                    referent: format!("section {},{}", section.segname, section.sectname),
553
-                    detail: err.to_string(),
554
-                })?
555
-            };
556
-            reloc_cache.insert((input.id, (sect_idx + 1) as u8), relocs);
557
-        }
558
-    }
559
-
560576
     let atom_addrs = atom_address_map(layout);
561577
     let atom_segments = atom_output_segment_map(layout);
562578
     let atoms_by_input_section = atoms.by_input_section();
@@ -588,7 +604,7 @@ pub fn plan_thunks(
588604
         let Some(obj) = input_map.get(&atom.origin) else {
589605
             continue;
590606
         };
591
-        let relocs = reloc_cache
607
+        let relocs = parsed_relocs
592608
             .get(&(atom.origin, atom.input_section))
593609
             .map(Vec::as_slice)
594610
             .unwrap_or(&[]);
@@ -987,6 +1003,19 @@ fn branch26_in_range(place: u64, target: u64) -> bool {
9871003
     delta & 0b11 == 0 && fits_signed(delta >> 2, 26)
9881004
 }
9891005
 
1006
+fn layout_fits_branch26_span(layout: &Layout) -> bool {
1007
+    let mut min_addr = u64::MAX;
1008
+    let mut max_addr = 0u64;
1009
+    for section in &layout.sections {
1010
+        if section.segment == "__LINKEDIT" || section.size == 0 {
1011
+            continue;
1012
+        }
1013
+        min_addr = min_addr.min(section.addr);
1014
+        max_addr = max_addr.max(section.addr.saturating_add(section.size));
1015
+    }
1016
+    min_addr == u64::MAX || max_addr.saturating_sub(min_addr) <= BRANCH26_MAX_FORWARD_DELTA_BYTES
1017
+}
1018
+
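Reviewer note: the fast path above works because AArch64 `B`/`BL` encode a signed 26-bit word offset, so the farthest reachable forward target is `((1 << 25) - 1) * 4` bytes away; if the whole non-`__LINKEDIT` image spans less than that, no thunk can ever be needed. A simplified standalone restatement of that range math (`fits_signed` and `branch26_in_range` below are local reimplementations for illustration, not the crate's own helpers):

```rust
// Signed 26-bit word offset: reachable forward span is ((1 << 25) - 1) words,
// i.e. ((1 << 25) - 1) * 4 bytes.
const BRANCH26_MAX_FORWARD_DELTA_BYTES: u64 = ((1u64 << 25) - 1) * 4;

// True if `value` fits in a signed `bits`-bit integer.
fn fits_signed(value: i64, bits: u32) -> bool {
    let min = -(1i64 << (bits - 1));
    let max = (1i64 << (bits - 1)) - 1;
    (min..=max).contains(&value)
}

// A branch at `place` can reach `target` when the byte delta is 4-aligned
// and the word delta fits the signed 26-bit immediate.
fn branch26_in_range(place: u64, target: u64) -> bool {
    let delta = target.wrapping_sub(place) as i64;
    delta % 4 == 0 && fits_signed(delta / 4, 26)
}

fn main() {
    // Exactly at the limit: reachable.
    assert!(branch26_in_range(0x1000, 0x1000 + BRANCH26_MAX_FORWARD_DELTA_BYTES));
    // One instruction past the limit: needs a thunk.
    assert!(!branch26_in_range(0x1000, 0x1000 + BRANCH26_MAX_FORWARD_DELTA_BYTES + 4));
}
```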
 fn synthesize_thunk_section(
     layout: &mut Layout,
     plan: &ThunkPlan,
@@ -2698,6 +2727,35 @@ mod tests {
         assert!(!fits_signed(-(1 << 25) - 1, 26));
     }
 
+    #[test]
+    fn branch26_span_fast_path_rejects_only_large_non_linkedit_images() {
+        let small = Layout {
+            kind: OutputKind::Executable,
+            segments: Vec::new(),
+            sections: vec![
+                output_section("__TEXT", "__text", 0x1_0000_0000, 0x100),
+                output_section("__DATA", "__data", 0x1_0001_0000, 0x100),
+                output_section("__LINKEDIT", "__linkedit", 0x1_8000_0000, 0x1000),
+            ],
+        };
+        assert!(layout_fits_branch26_span(&small));
+
+        let large = Layout {
+            kind: OutputKind::Executable,
+            segments: Vec::new(),
+            sections: vec![
+                output_section("__TEXT", "__text", 0x1_0000_0000, 0x100),
+                output_section(
+                    "__DATA",
+                    "__data",
+                    0x1_0000_0000 + BRANCH26_MAX_FORWARD_DELTA_BYTES + 1,
+                    0x100,
+                ),
+            ],
+        };
+        assert!(!layout_fits_branch26_span(&large));
+    }
+
     #[test]
     fn thunk_plan_splits_monolithic_text_section_into_multiple_islands() {
         let gap = 0x0900_0000u32;
@@ -2739,9 +2797,21 @@ mod tests {
             ..LinkOptions::default()
         };
         let base_layout = Layout::build(OutputKind::Executable, &inputs, &atoms, 0);
-        let plan = plan_thunks(&opts, &base_layout, &inputs, &atoms, &sym_table, None, None)
-            .unwrap()
-            .unwrap();
+        let parsed_relocs = crate::macho::writer::build_parsed_reloc_cache(&inputs).unwrap();
+        let plan = plan_thunks(
+            &opts,
+            ThunkPlanningContext {
+                layout: &base_layout,
+                inputs: &inputs,
+                atoms: &atoms,
+                sym_table: &sym_table,
+                synthetic_plan: None,
+                icf_redirects: None,
+                parsed_relocs: &parsed_relocs,
+            },
+        )
+        .unwrap()
+        .unwrap();
 
         assert_eq!(
             plan.redirect_for(caller1, 0),
@@ -2795,9 +2865,20 @@ mod tests {
             vec!["__text", "__thunks", "__text", "__thunks", "__text"]
         );
 
-        let replan = plan_thunks(&opts, &rebuilt, &inputs, &atoms, &sym_table, None, None)
-            .unwrap()
-            .unwrap();
+        let replan = plan_thunks(
+            &opts,
+            ThunkPlanningContext {
+                layout: &rebuilt,
+                inputs: &inputs,
+                atoms: &atoms,
+                sym_table: &sym_table,
+                synthetic_plan: None,
+                icf_redirects: None,
+                parsed_relocs: &parsed_relocs,
+            },
+        )
+        .unwrap()
+        .unwrap();
         assert_eq!(
             replan, plan,
             "expected thunk planning to converge once the intra-section islands exist"
@@ -2824,6 +2905,25 @@ mod tests {
         out
     }
 
+    fn output_section(segment: &str, name: &str, addr: u64, size: u64) -> OutputSection {
+        OutputSection {
+            segment: segment.into(),
+            name: name.into(),
+            kind: SectionKind::Text,
+            align_pow2: 2,
+            flags: 0,
+            reserved1: 0,
+            reserved2: 0,
+            reserved3: 0,
+            atoms: Vec::new(),
+            synthetic_offset: 0,
+            synthetic_data: Vec::new(),
+            addr,
+            size,
+            file_off: 0,
+        }
+    }
+
     fn thunk_test_object(raw_relocs: Vec<u8>, target_offset: u64, section_size: u64) -> ObjectFile {
         let strings = b"\0_target\0".to_vec();
         ObjectFile {
src/reloc/mod.rs (modified)
@@ -17,8 +17,11 @@
 
 pub mod arm64;
 
+use std::collections::HashMap;
+
 use crate::macho::constants::*;
 use crate::macho::reader::{u32_le, ReadError};
+use crate::resolve::InputId;
 
 /// Size of one `relocation_info` on the wire.
 pub const RAW_RELOC_SIZE: usize = 8;
@@ -185,6 +188,8 @@ pub struct Reloc {
     pub subtrahend: Option<Referent>,
 }
 
+pub type ParsedRelocCache = HashMap<(InputId, u8), Vec<Reloc>>;
+
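Reviewer note: the new `ParsedRelocCache` alias is what lets `apply_layout` and `plan_thunks` share one parse of each section's relocations instead of each rebuilding the same map. The key is `(InputId, u8)` where the `u8` is the 1-based section ordinal, matching Mach-O's `n_sect` numbering, so an atom's `(origin, input_section)` pair indexes the cache directly. A toy sketch of that keying (the `InputId`/`Reloc` stand-ins and `build_cache` below are illustrative, not the crate's types):

```rust
use std::collections::HashMap;

// Illustrative stand-ins: the real key is (InputId, u8) and the values are
// fully parsed `Reloc` records, not bare offsets.
type InputId = u32;

#[derive(Debug, PartialEq)]
struct Reloc {
    offset: u32,
}

type ParsedRelocCache = HashMap<(InputId, u8), Vec<Reloc>>;

// Sections are keyed by 1-based ordinal (idx + 1), mirroring Mach-O's
// n_sect numbering, so atom.(origin, input_section) is a direct cache key.
fn build_cache(input: InputId, sections: &[Vec<u32>]) -> ParsedRelocCache {
    let mut cache = ParsedRelocCache::new();
    for (idx, offsets) in sections.iter().enumerate() {
        let relocs = offsets.iter().map(|&offset| Reloc { offset }).collect();
        cache.insert((input, (idx + 1) as u8), relocs);
    }
    cache
}

fn main() {
    let cache = build_cache(1, &[vec![0, 8], vec![]]);
    assert_eq!(cache[&(1, 1)].len(), 2);
    assert!(cache[&(1, 2)].is_empty());
}
```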
 fn referent_from(raw: &RawRelocation) -> Result<Referent, ReadError> {
     if raw.r_extern {
         Ok(Referent::Symbol(raw.r_symbolnum))
src/resolve.rs (modified)
@@ -15,9 +15,10 @@
 //! happened (inserted / replaced / kept / pending archive fetch), and they
 //! drive the outer loop.
 
-use std::collections::HashMap;
-use std::path::PathBuf;
-use std::rc::Rc;
+use std::collections::{HashMap, HashSet, VecDeque};
+use std::path::{Path, PathBuf};
+use std::sync::{mpsc, Arc, Mutex};
+use std::thread;
 
 use crate::archive::{Archive, ArchiveError};
 use crate::input::ObjectFile;
@@ -41,8 +42,8 @@ impl Istr {
 
 #[derive(Debug, Default)]
 pub struct StringInterner {
-    strings: Vec<Rc<str>>,
-    index: HashMap<Rc<str>, u32>,
+    strings: Vec<Arc<str>>,
+    index: HashMap<Arc<str>, u32>,
 }
 
 impl StringInterner {
@@ -51,18 +52,22 @@ impl StringInterner {
     }
 
     /// Intern `s`, returning the existing handle when the string was already
-    /// seen. Allocates at most one `Rc<str>` per unique name.
+    /// seen. Allocates at most one `Arc<str>` per unique name.
     pub fn intern(&mut self, s: &str) -> Istr {
         if let Some(&i) = self.index.get(s) {
             return Istr(i);
         }
-        let rc: Rc<str> = Rc::from(s);
+        let rc: Arc<str> = Arc::from(s);
         let id = self.strings.len() as u32;
         self.strings.push(rc.clone());
         self.index.insert(rc, id);
         Istr(id)
     }
 
+    pub fn get(&self, s: &str) -> Option<Istr> {
+        self.index.get(s).copied().map(Istr)
+    }
+
     pub fn resolve(&self, i: Istr) -> &str {
         &self.strings[i.0 as usize]
     }
@@ -136,10 +141,11 @@ pub struct ObjectInput {
     pub path: PathBuf,
     pub load_order: usize,
     pub archive_member_offset: Option<u32>,
-    /// Raw bytes; `ObjectFile::parse` re-runs cheaply against this on
-    /// demand. We don't cache a parsed view because `ObjectFile` copies
-    /// the fields it needs on construction, so re-parse is idempotent.
+    /// Raw bytes retained for diagnostics and future low-level readers.
    pub bytes: Vec<u8>,
+    /// Parsed object view. It owns section/relocation/string-table buffers,
+    /// so it is safe to build off-thread and then borrow during the link.
+    pub parsed: ObjectFile,
 }
 
 #[derive(Debug)]
@@ -151,7 +157,7 @@ pub struct ArchiveInput {
     /// the fixed-point loop from re-ingesting the same object twice —
     /// important both for correctness (no duplicate-strong errors from
     /// our own symbols) and for keeping transitions deterministic.
-    pub fetched: std::collections::HashSet<u32>,
+    pub fetched: HashSet<u32>,
 }
 
 #[derive(Debug)]
@@ -235,15 +241,26 @@ impl Inputs {
         load_order: usize,
     ) -> Result<InputId, InputAddError> {
         // Validate now — we'd rather catch a bad object at the add site.
-        ObjectFile::parse(&path, &bytes)?;
+        let parsed = ObjectFile::parse(&path, &bytes)?;
+        Ok(self.add_parsed_object(path, bytes, parsed, load_order))
+    }
+
+    pub fn add_parsed_object(
+        &mut self,
+        path: PathBuf,
+        bytes: Vec<u8>,
+        parsed: ObjectFile,
+        load_order: usize,
+    ) -> InputId {
         let id = InputId(self.objects.len() as u32);
         self.objects.push(ObjectInput {
             path,
             load_order,
             archive_member_offset: None,
             bytes,
+            parsed,
         });
-        Ok(id)
+        id
     }
 
     /// Register an `.a` file.
@@ -254,6 +271,15 @@
         load_order: usize,
     ) -> Result<ArchiveId, InputAddError> {
         Archive::open(&path, &bytes)?; // validate
+        Ok(self.add_validated_archive(path, bytes, load_order))
+    }
+
+    pub fn add_validated_archive(
+        &mut self,
+        path: PathBuf,
+        bytes: Vec<u8>,
+        load_order: usize,
+    ) -> ArchiveId {
         let id = ArchiveId(self.archives.len() as u32);
         self.archives.push(ArchiveInput {
             path,
@@ -261,7 +287,7 @@
             bytes,
             fetched: std::collections::HashSet::new(),
         });
-        Ok(id)
+        id
     }
 
     /// Register a `.dylib`. TBD-backed dylibs go through
@@ -340,11 +366,10 @@ impl Inputs {
         &self.dylibs[id.0 as usize]
     }
 
-    /// Parse an `ObjectFile` view of a registered object. Fast — `ObjectFile`
-    /// owns its buffers, so this is just the Mach-O walk cost.
-    pub fn object_file(&self, id: InputId) -> Result<ObjectFile, ReadError> {
+    /// Borrow a parsed `ObjectFile` view of a registered object.
+    pub fn object_file(&self, id: InputId) -> Result<&ObjectFile, ReadError> {
         let o = &self.objects[id.0 as usize];
-        ObjectFile::parse(&o.path, &o.bytes)
+        Ok(&o.parsed)
     }
 
     /// Open an `Archive` view, borrowing from the registry's bytes for the
@@ -554,6 +579,10 @@ impl SymbolTable {
         self.by_name.get(&name).copied()
     }
 
+    pub fn lookup_str(&self, name: &str) -> Option<SymbolId> {
+        self.interner.get(name).and_then(|name| self.lookup(name))
+    }
+
     pub fn get(&self, id: SymbolId) -> &Symbol {
         &self.symbols[id.0 as usize]
     }
@@ -1140,44 +1169,161 @@ pub struct DrainReport {
     pub referrers: ReferrerLog,
 }
 
+#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
+struct ArchiveMemberKey {
+    archive: ArchiveId,
+    member: MemberId,
+}
+
+#[derive(Clone, Copy)]
+struct ArchiveMemberLoadJob<'a> {
+    index: usize,
+    key: ArchiveMemberKey,
+    archive_path: &'a Path,
+    archive_bytes: &'a [u8],
+    archive_load_order: usize,
+}
+
+struct LoadedArchiveMember {
+    key: ArchiveMemberKey,
+    archive_load_order: usize,
+    logical_path: PathBuf,
+    bytes: Vec<u8>,
+    parsed: ObjectFile,
+}
+
+fn make_archive_member_jobs<'a>(
+    inputs: &'a Inputs,
+    keys: Vec<ArchiveMemberKey>,
+) -> Vec<ArchiveMemberLoadJob<'a>> {
+    keys.into_iter()
+        .enumerate()
+        .map(|(index, key)| {
+            let archive = &inputs.archives[key.archive.0 as usize];
+            ArchiveMemberLoadJob {
+                index,
+                key,
+                archive_path: &archive.path,
+                archive_bytes: &archive.bytes,
+                archive_load_order: archive.load_order,
+            }
+        })
+        .collect()
+}
+
+fn load_archive_members_parallel(
+    inputs: &Inputs,
+    keys: Vec<ArchiveMemberKey>,
+    parallel_jobs: usize,
+) -> Vec<(ArchiveMemberKey, Result<LoadedArchiveMember, FetchError>)> {
+    let jobs = make_archive_member_jobs(inputs, keys);
+    if jobs.is_empty() {
+        return Vec::new();
+    }
+    let job_count = parallel_jobs.max(1).min(jobs.len()).max(1);
+    if job_count == 1 {
+        return jobs
+            .into_iter()
+            .map(load_archive_member_job)
+            .map(|(_, key, result)| (key, result))
+            .collect();
+    }
+
+    let queue = Arc::new(Mutex::new(VecDeque::from(jobs)));
+    let (tx, rx) = mpsc::channel();
+    let mut results = thread::scope(|scope| {
+        for _ in 0..job_count {
+            let queue = Arc::clone(&queue);
+            let tx = tx.clone();
+            scope.spawn(move || loop {
+                let Some(job) = queue
+                    .lock()
+                    .expect("archive member load queue mutex poisoned")
+                    .pop_front()
+                else {
+                    break;
+                };
+                tx.send(load_archive_member_job(job))
+                    .expect("archive member load receiver should stay live");
+            });
+        }
+        drop(tx);
+        rx.into_iter().collect::<Vec<_>>()
+    });
+    results.sort_by_key(|(index, _, _)| *index);
+    results
+        .into_iter()
+        .map(|(_, key, result)| (key, result))
+        .collect()
+}
+
+fn load_archive_member_job(
+    job: ArchiveMemberLoadJob<'_>,
+) -> (
+    usize,
+    ArchiveMemberKey,
+    Result<LoadedArchiveMember, FetchError>,
+) {
+    let result = (|| {
+        let archive = Archive::open(job.archive_path, job.archive_bytes)?;
+        let member =
+            archive
+                .member_at_offset(job.key.member.0)
+                .ok_or(FetchError::MemberNotFound {
+                    archive: job.key.archive,
+                    member: job.key.member,
+                })?;
+        let logical_path =
+            PathBuf::from(format!("{}({})", job.archive_path.display(), member.name));
+        let bytes = member.body.to_vec();
+        let parsed = ObjectFile::parse(&logical_path, &bytes)?;
+        Ok(LoadedArchiveMember {
+            key: job.key,
+            archive_load_order: job.archive_load_order,
+            logical_path,
+            bytes,
+            parsed,
+        })
+    })();
+    (job.index, job.key, result)
+}
+
+fn archive_member_key(pending: PendingFetch) -> ArchiveMemberKey {
+    ArchiveMemberKey {
+        archive: pending.archive,
+        member: pending.member,
+    }
+}
+
+fn archive_member_is_fetched(inputs: &Inputs, key: ArchiveMemberKey) -> bool {
+    inputs.archives[key.archive.0 as usize]
+        .fetched
+        .contains(&key.member.0)
+}
+
11431304
 /// Shared ingest: copy one archive member's body into a fresh
 /// `ObjectInput`, mark it fetched, and seed its symbols. Callers either
 /// respond to a demand-driven `PendingFetch` or force-pull the member.
-fn ingest_member_bytes(
+fn ingest_loaded_member(
     inputs: &mut Inputs,
     table: &mut SymbolTable,
-    archive_id: ArchiveId,
-    member_id: MemberId,
+    loaded: LoadedArchiveMember,
     report: &mut DrainReport,
 ) -> Result<Vec<PendingFetch>, FetchError> {
-    let archive_load_order = inputs.archives[archive_id.0 as usize].load_order;
-    let ai = &inputs.archives[archive_id.0 as usize];
-    if ai.fetched.contains(&member_id.0) {
+    if archive_member_is_fetched(inputs, loaded.key) {
         return Ok(Vec::new());
     }
 
-    // Extract owned data before mutating the registry.
-    let (logical_path, member_bytes) = {
-        let archive = Archive::open(&ai.path, &ai.bytes)?;
-        let member = archive
-            .member_at_offset(member_id.0)
-            .ok_or(FetchError::MemberNotFound {
-                archive: archive_id,
-                member: member_id,
-            })?;
-        let logical = format!("{}({})", ai.path.display(), member.name);
-        (logical, member.body.to_vec())
-    };
-
-    inputs.archives[archive_id.0 as usize]
+    inputs.archives[loaded.key.archive.0 as usize]
         .fetched
-        .insert(member_id.0);
+        .insert(loaded.key.member.0);
     let input_id = InputId(inputs.objects.len() as u32);
     inputs.objects.push(ObjectInput {
-        path: PathBuf::from(logical_path),
-        load_order: archive_load_order,
-        archive_member_offset: Some(member_id.0),
-        bytes: member_bytes,
+        path: loaded.logical_path,
+        load_order: loaded.archive_load_order,
+        archive_member_offset: Some(loaded.key.member.0),
+        bytes: loaded.bytes,
+        parsed: loaded.parsed,
    });
     report.fetched_members += 1;
     report
@@ -1191,6 +1337,24 @@ fn ingest_member_bytes(
     Ok(sub_report.pending_fetches)
 }
 
+fn load_and_ingest_member(
+    inputs: &mut Inputs,
+    table: &mut SymbolTable,
+    key: ArchiveMemberKey,
+    report: &mut DrainReport,
+    parallel_jobs: usize,
+) -> Result<Vec<PendingFetch>, FetchError> {
+    if archive_member_is_fetched(inputs, key) {
+        return Ok(Vec::new());
+    }
+    let loaded = load_archive_members_parallel(inputs, vec![key], parallel_jobs)
+        .into_iter()
+        .next()
+        .expect("single archive member load should produce one result")
+        .1?;
+    ingest_loaded_member(inputs, table, loaded, report)
+}
+
 /// Pull `pending`'s member only if the symbol slot is still a
 /// `LazyArchive` (i.e., a strong Defined hasn't superseded it). Returns
 /// any new `PendingFetch` entries triggered by the inserted member.
@@ -1199,12 +1363,19 @@ fn fetch_and_ingest_one(
     table: &mut SymbolTable,
     pending: PendingFetch,
     report: &mut DrainReport,
+    parallel_jobs: usize,
 ) -> Result<Vec<PendingFetch>, FetchError> {
     let slot_is_still_lazy = matches!(table.get(pending.id), Symbol::LazyArchive { .. });
     if !slot_is_still_lazy {
         return Ok(Vec::new());
     }
-    ingest_member_bytes(inputs, table, pending.archive, pending.member, report)
+    load_and_ingest_member(
+        inputs,
+        table,
+        archive_member_key(pending),
+        report,
+        parallel_jobs,
+    )
 }
 
 /// Pull every member of one archive (bypasses demand tracking). Respects
@@ -1215,6 +1386,7 @@ pub fn force_load_archive(
     table: &mut SymbolTable,
     archive_id: ArchiveId,
     report: &mut DrainReport,
+    parallel_jobs: usize,
 ) -> Result<(), FetchError> {
     let member_offsets: Vec<u32> = {
         let ai = &inputs.archives[archive_id.0 as usize];
@@ -1224,13 +1396,20 @@ pub fn force_load_archive(
             .map(|m| m.header_offset as u32)
             .collect()
     };
+    let keys = member_offsets
+        .into_iter()
+        .map(|offset| ArchiveMemberKey {
+            archive: archive_id,
+            member: MemberId(offset),
+        })
+        .collect();
     let mut queue: Vec<PendingFetch> = Vec::new();
-    for offset in member_offsets {
-        let new = ingest_member_bytes(inputs, table, archive_id, MemberId(offset), report)?;
+    for (_, loaded) in load_archive_members_parallel(inputs, keys, parallel_jobs) {
+        let new = ingest_loaded_member(inputs, table, loaded?, report)?;
         queue.extend(new);
     }
     while let Some(p) = queue.pop() {
-        let new = fetch_and_ingest_one(inputs, table, p, report)?;
+        let new = fetch_and_ingest_one(inputs, table, p, report, parallel_jobs)?;
         queue.extend(new);
     }
     Ok(())
@@ -1242,9 +1421,10 @@ pub fn force_load_all(
     inputs: &mut Inputs,
     table: &mut SymbolTable,
     report: &mut DrainReport,
+    parallel_jobs: usize,
 ) -> Result<(), FetchError> {
     for i in 0..inputs.archives.len() {
-        force_load_archive(inputs, table, ArchiveId(i as u32), report)?;
+        force_load_archive(inputs, table, ArchiveId(i as u32), report, parallel_jobs)?;
     }
     Ok(())
 }
@@ -1523,16 +1703,63 @@ pub fn drain_fetches(
     inputs: &mut Inputs,
     table: &mut SymbolTable,
     initial: Vec<PendingFetch>,
+    parallel_jobs: usize,
 ) -> Result<DrainReport, FetchError> {
     let mut queue = initial;
+    let mut prepared = HashMap::new();
     let mut report = DrainReport::default();
     while let Some(p) = queue.pop() {
-        let new_pending = fetch_and_ingest_one(inputs, table, p, &mut report)?;
+        let key = archive_member_key(p);
+        let slot_is_still_lazy = matches!(table.get(p.id), Symbol::LazyArchive { .. });
+        if !slot_is_still_lazy || archive_member_is_fetched(inputs, key) {
+            prepared.remove(&key);
+            continue;
+        }
+        // Parse siblings ahead of time, but only ingest the current stack
+        // entry after re-checking its lazy slot. This keeps member order stable.
+        if !prepared.contains_key(&key) {
+            preparse_pending_fetches(inputs, table, p, &queue, &mut prepared, parallel_jobs);
+        }
+        let Some(loaded) = prepared.remove(&key) else {
+            continue;
+        };
+        let loaded = loaded?;
+        let slot_is_still_lazy = matches!(table.get(p.id), Symbol::LazyArchive { .. });
+        if !slot_is_still_lazy || archive_member_is_fetched(inputs, key) {
+            continue;
+        }
+        let new_pending = ingest_loaded_member(inputs, table, loaded, &mut report)?;
         queue.extend(new_pending);
     }
     Ok(report)
 }
 
+fn preparse_pending_fetches(
+    inputs: &Inputs,
+    table: &SymbolTable,
+    current: PendingFetch,
+    queue: &[PendingFetch],
+    prepared: &mut HashMap<ArchiveMemberKey, Result<LoadedArchiveMember, FetchError>>,
+    parallel_jobs: usize,
+) {
+    let mut seen = HashSet::new();
+    let mut keys = Vec::new();
+    for pending in std::iter::once(&current).chain(queue.iter().rev()) {
+        let key = archive_member_key(*pending);
+        if prepared.contains_key(&key)
+            || archive_member_is_fetched(inputs, key)
+            || !matches!(table.get(pending.id), Symbol::LazyArchive { .. })
+            || !seen.insert(key)
+        {
+            continue;
+        }
+        keys.push(key);
+    }
+    for (key, result) in load_archive_members_parallel(inputs, keys, parallel_jobs) {
+        prepared.insert(key, result);
+    }
+}
+
 /// Turn a wire-form `InputSymbol` into a resolver-side `Symbol`. Returns
 /// `None` for kinds the resolver does not track (currently: aliases with
 /// unresolved target strx — Sprint 8's resolver defers those for now).
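The parallel loader in this file fans jobs out over `std::thread::scope` workers that pop from a shared `VecDeque` behind a `Mutex`, report over an `mpsc` channel, and then re-sort by job index so results come back in submission order regardless of which worker finished first. A self-contained sketch of that same pattern with a generic payload (names here are illustrative, not the crate's API):

```rust
use std::collections::VecDeque;
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

// Run `work` over `jobs` on up to `workers` scoped threads,
// returning results in the original job order.
fn run_ordered<T, R, F>(jobs: Vec<T>, workers: usize, work: F) -> Vec<R>
where
    T: Send,
    R: Send,
    F: Fn(T) -> R + Sync,
{
    let queue = Arc::new(Mutex::new(
        jobs.into_iter().enumerate().collect::<VecDeque<_>>(),
    ));
    let (tx, rx) = mpsc::channel();
    let mut results = thread::scope(|scope| {
        for _ in 0..workers.max(1) {
            let queue = Arc::clone(&queue);
            let tx = tx.clone();
            let work = &work;
            scope.spawn(move || loop {
                // Workers pull until the shared queue is empty.
                let Some((index, job)) = queue.lock().unwrap().pop_front() else {
                    break;
                };
                tx.send((index, work(job))).unwrap();
            });
        }
        drop(tx); // let the receiver observe end-of-stream
        rx.into_iter().collect::<Vec<_>>()
    });
    // Restore submission order so output stays deterministic.
    results.sort_by_key(|(index, _)| *index);
    results.into_iter().map(|(_, result)| result).collect()
}

fn main() {
    let doubled = run_ordered((0..16).collect(), 4, |n: u32| n * 2);
    assert_eq!(doubled, (0..16).map(|n| n * 2).collect::<Vec<_>>());
}
```

The index-then-sort step is what keeps a linker's diagnostics and symbol insertion order stable even though workers race on the queue.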
src/string_table.rs modified
@@ -99,7 +99,6 @@ impl StringTable {
 #[derive(Debug, Clone, Default, PartialEq, Eq)]
 pub struct StringTableBuilder {
     roots: Vec<RootString>,
-    roots_by_last_byte: HashMap<u8, Vec<usize>>,
     offsets: HashMap<String, u32>,
 }
 
@@ -109,6 +108,12 @@ struct RootString {
     offset: u32,
 }
 
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+struct BorrowedRootString<'a> {
+    name: &'a str,
+    offset: u32,
+}
+
 impl StringTableBuilder {
     pub fn new() -> Self {
         Self::default()
@@ -118,6 +123,41 @@ impl StringTableBuilder {
         self.offsets.entry(name.to_string()).or_insert(0);
     }
 
+    pub fn build_with_name_offsets<'a, I>(names: I) -> (Vec<u8>, Vec<u32>)
+    where
+        I: IntoIterator<Item = &'a str>,
+    {
+        let mut entries: Vec<_> = names
+            .into_iter()
+            .enumerate()
+            .map(|(index, name)| (name, index))
+            .collect();
+        let mut offsets = vec![0; entries.len()];
+        entries.sort_by(|(lhs, lhs_index), (rhs, rhs_index)| {
+            reverse_suffix_order(lhs, rhs).then_with(|| lhs_index.cmp(rhs_index))
+        });
+
+        let mut raw = vec![0u8];
+        let mut roots = Vec::new();
+        for (name, index) in entries {
+            if let Some(offset) = find_borrowed_suffix_offset(&roots, name) {
+                offsets[index] = offset;
+                continue;
+            }
+
+            let offset = raw.len() as u32;
+            raw.extend_from_slice(name.as_bytes());
+            raw.push(0);
+            roots.push(BorrowedRootString { name, offset });
+            offsets[index] = offset;
+        }
+
+        while !raw.len().is_multiple_of(8) {
+            raw.push(0);
+        }
+        (raw, offsets)
+    }
+
     pub fn finish(mut self) -> (Vec<u8>, HashMap<String, u32>) {
         let mut names: Vec<String> = self.offsets.keys().cloned().collect();
         names.sort_by(|lhs, rhs| reverse_suffix_order(lhs, rhs));
@@ -132,17 +172,10 @@
             let offset = raw.len() as u32;
             raw.extend_from_slice(name.as_bytes());
             raw.push(0);
-            let root_index = self.roots.len();
             self.roots.push(RootString {
                 name: name.clone(),
                 offset,
             });
-            if let Some(&last_byte) = name.as_bytes().last() {
-                self.roots_by_last_byte
-                    .entry(last_byte)
-                    .or_default()
-                    .push(root_index);
-            }
             self.offsets.insert(name, offset);
         }
 
@@ -153,16 +186,26 @@
     }
 
     fn find_suffix_offset(&self, name: &str) -> Option<u32> {
-        let last_byte = *name.as_bytes().last()?;
-        self.roots_by_last_byte
-            .get(&last_byte)?
-            .iter()
-            .find_map(|&idx| {
-                let existing = &self.roots[idx];
-                (existing.name.len() >= name.len() && existing.name.ends_with(name))
-                    .then(|| existing.offset + (existing.name.len() - name.len()) as u32)
-            })
+        if name.is_empty() {
+            return Some(0);
+        }
+        let insert_at = self
+            .roots
+            .partition_point(|root| reverse_suffix_order(&root.name, name).is_lt());
+        let existing = self.roots.get(insert_at.checked_sub(1)?)?;
+        (existing.name.len() >= name.len() && existing.name.ends_with(name))
+            .then(|| existing.offset + (existing.name.len() - name.len()) as u32)
+    }
+}
+
+fn find_borrowed_suffix_offset(roots: &[BorrowedRootString<'_>], name: &str) -> Option<u32> {
+    if name.is_empty() {
+        return Some(0);
     }
+    let insert_at = roots.partition_point(|root| reverse_suffix_order(root.name, name).is_lt());
+    let existing = roots.get(insert_at.checked_sub(1)?)?;
+    (existing.name.len() >= name.len() && existing.name.ends_with(name))
+        .then(|| existing.offset + (existing.name.len() - name.len()) as u32)
 }
 
 fn reverse_suffix_order(lhs: &str, rhs: &str) -> std::cmp::Ordering {
@@ -285,4 +328,19 @@ mod tests {
         assert_eq!(table.get(offsets["_alpha"]).unwrap(), "_alpha");
         assert_eq!(table.get(offsets["_beta"]).unwrap(), "_beta");
     }
+
+    #[test]
+    fn builder_returns_offsets_in_input_order_without_cloning_keys() {
+        let names = ["_helper", "_afs_helper", "_helper", ""];
+        let (bytes, offsets) = StringTableBuilder::build_with_name_offsets(names);
+        let table = StringTable::from_bytes(bytes);
+
+        assert_eq!(table.get(offsets[0]).unwrap(), "_helper");
+        assert_eq!(table.get(offsets[1]).unwrap(), "_afs_helper");
+        assert_eq!(table.get(offsets[2]).unwrap(), "_helper");
+        assert_eq!(table.get(offsets[3]).unwrap(), "");
+        assert_eq!(offsets[0], offsets[2]);
+        assert_eq!(offsets[0], offsets[1] + 4);
+        assert_eq!(table.as_bytes().len() % 8, 0);
+    }
 }
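The builder above packs string tables with suffix sharing: a name that is a byte-suffix of an already-emitted root reuses the tail of that root's bytes, which is why the new test expects `offsets[0] == offsets[1] + 4` ("_helper" starts four bytes into "_afs_helper"). A minimal sketch of the suffix-reuse rule using a naive linear scan instead of the sorted-roots `partition_point` search (helper names here are hypothetical):

```rust
// Naive suffix sharing: reuse the tail of any emitted root that ends with `name`.
fn suffix_offset(roots: &[(String, u32)], name: &str) -> Option<u32> {
    if name.is_empty() {
        return Some(0); // offset 0 holds the table's leading NUL, i.e. the empty string
    }
    roots.iter().find_map(|(root, offset)| {
        root.ends_with(name)
            .then(|| offset + (root.len() - name.len()) as u32)
    })
}

fn build(names: &[&str]) -> Vec<u32> {
    let mut raw = vec![0u8]; // string tables start with a NUL byte
    let mut roots: Vec<(String, u32)> = Vec::new();
    names
        .iter()
        .map(|name| {
            if let Some(offset) = suffix_offset(&roots, name) {
                return offset; // shared tail, no new bytes emitted
            }
            let offset = raw.len() as u32;
            raw.extend_from_slice(name.as_bytes());
            raw.push(0);
            roots.push((name.to_string(), offset));
            offset
        })
        .collect()
}

fn main() {
    // Longest-first order so suffixes land after their roots.
    let offsets = build(&["_afs_helper", "_helper", ""]);
    assert_eq!(offsets[0], 1); // first root, right after the leading NUL
    assert_eq!(offsets[1], offsets[0] + 4); // "_helper" shares "_afs_helper"'s tail
    assert_eq!(offsets[2], 0); // empty name maps to the leading NUL
}
```

The real builder gets the "roots before suffixes" ordering from `reverse_suffix_order` rather than relying on the caller, so input order does not matter there.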
src/synth/code_sig.rs modified
@@ -1,3 +1,5 @@
+use std::thread;
+
 use crate::layout::Layout;
 use crate::section::is_executable;
 use crate::LinkOptions;
@@ -52,6 +54,10 @@ impl CodeSignaturePlan {
     }
 
     pub fn build(&self, signed_prefix: &[u8]) -> Vec<u8> {
+        self.build_with_jobs(signed_prefix, 1)
+    }
+
+    pub fn build_with_jobs(&self, signed_prefix: &[u8], parallel_jobs: usize) -> Vec<u8> {
         debug_assert_eq!(signed_prefix.len(), self.code_limit as usize);
 
         let code_slots = code_slots(self.code_limit as usize);
@@ -92,8 +98,8 @@
 
         out.extend_from_slice(self.identifier.as_bytes());
         out.push(0);
-        for page in signed_prefix.chunks(PAGE_SIZE) {
-            out.extend_from_slice(&sha256(page));
+        for hash in page_hashes(signed_prefix, parallel_jobs) {
+            out.extend_from_slice(&hash);
         }
         out.resize(padded_len, 0);
         out
@@ -149,6 +155,33 @@ fn code_slots(code_limit: usize) -> usize {
     }
 }
 
+fn page_hashes(data: &[u8], parallel_jobs: usize) -> Vec<[u8; 32]> {
+    let page_count = code_slots(data.len());
+    if page_count == 0 {
+        return Vec::new();
+    }
+    let parallel_jobs = parallel_jobs.max(1).min(page_count);
+    if parallel_jobs == 1 || page_count < 2 {
+        return data.chunks(PAGE_SIZE).map(sha256).collect();
+    }
+
+    let chunk_pages = page_count.div_ceil(parallel_jobs);
+    let chunk_bytes = PAGE_SIZE * chunk_pages;
+    thread::scope(|scope| {
+        let mut handles = Vec::new();
+        for chunk in data.chunks(chunk_bytes) {
+            handles
+                .push(scope.spawn(move || chunk.chunks(PAGE_SIZE).map(sha256).collect::<Vec<_>>()));
+        }
+
+        let mut hashes = Vec::with_capacity(page_count);
+        for handle in handles {
+            hashes.extend(handle.join().expect("code-signature hash worker panicked"));
+        }
+        hashes
+    })
+}
+
 fn push_be_u32(out: &mut Vec<u8>, value: u32) {
     out.extend_from_slice(&value.to_be_bytes());
 }
@@ -336,4 +369,42 @@ mod tests {
         assert_eq!(read_be_u32(&blob, 52), 16_512);
         assert_eq!(&blob[108..114], b"apple\0");
     }
+
+    #[test]
+    fn parallel_page_hashes_preserve_serial_order() {
+        let mut bytes = Vec::with_capacity(PAGE_SIZE * 9 + 123);
+        for index in 0..PAGE_SIZE * 9 + 123 {
+            bytes.push((index.wrapping_mul(37).wrapping_add(19) & 0xff) as u8);
+        }
+
+        let serial = page_hashes(&bytes, 1);
+        let parallel = page_hashes(&bytes, 4);
+        assert_eq!(parallel, serial);
+        assert_eq!(parallel.len(), 10);
+    }
+
+    #[test]
+    fn parallel_code_signature_matches_single_worker() {
+        let opts = LinkOptions {
+            output: Some("parallel".into()),
+            ..LinkOptions::default()
+        };
+        let code_limit = PAGE_SIZE * 11 + 777;
+        let plan = CodeSignaturePlan::new(
+            &Layout::empty(crate::OutputKind::Executable, 0),
+            &opts,
+            code_limit as u64,
+            true,
+        )
+        .unwrap();
+        let mut signed_prefix = Vec::with_capacity(code_limit);
+        for index in 0..code_limit {
+            signed_prefix.push((index.wrapping_mul(13).wrapping_add(index / 7) & 0xff) as u8);
+        }
+
+        assert_eq!(
+            plan.build_with_jobs(&signed_prefix, 8),
+            plan.build_with_jobs(&signed_prefix, 1)
+        );
+    }
 }
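`page_hashes` above is correct because `chunk_bytes` is always a whole multiple of `PAGE_SIZE`: splitting the input into per-worker chunks therefore never splits a page, and concatenating the per-chunk hash vectors reproduces the serial per-page walk exactly. That invariant can be checked in isolation with a toy page digest standing in for `sha256` (the digest function is a stand-in, not the crate's):

```rust
const PAGE_SIZE: usize = 4096;

// Toy per-page digest standing in for sha256 in this sketch.
fn toy_digest(page: &[u8]) -> u64 {
    page.iter()
        .fold(0u64, |acc, &b| acc.wrapping_mul(31).wrapping_add(b as u64))
}

fn serial_hashes(data: &[u8]) -> Vec<u64> {
    data.chunks(PAGE_SIZE).map(toy_digest).collect()
}

fn chunked_hashes(data: &[u8], workers: usize) -> Vec<u64> {
    let page_count = data.len().div_ceil(PAGE_SIZE);
    let chunk_pages = page_count.div_ceil(workers.max(1));
    // Chunk size is a whole number of pages, so page boundaries are preserved
    // and only the final page (in the final chunk) can be short.
    let chunk_bytes = PAGE_SIZE * chunk_pages.max(1);
    data.chunks(chunk_bytes)
        .flat_map(|chunk| chunk.chunks(PAGE_SIZE).map(toy_digest))
        .collect()
}

fn main() {
    let data: Vec<u8> = (0..PAGE_SIZE * 5 + 321).map(|i| (i % 251) as u8).collect();
    for workers in 1..8 {
        assert_eq!(chunked_hashes(&data, workers), serial_hashes(&data));
    }
}
```

The new `parallel_page_hashes_preserve_serial_order` test in the diff exercises the same equivalence against the real SHA-256 path, including a trailing partial page.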
src/synth/dyld_info.rs modified
@@ -1,5 +1,3 @@
1
-use std::collections::BTreeMap;
2
-
31
 use crate::leb::{write_sleb, write_uleb};
42
 use crate::macho::constants::{
53
     BIND_IMMEDIATE_MASK, BIND_OPCODE_ADD_ADDR_ULEB, BIND_OPCODE_DO_BIND,
@@ -81,25 +79,9 @@ struct BindState {
8179
     pointer_type_set: bool,
8280
 }
8381
 
84
-#[derive(Debug, Clone, Default)]
85
-struct TrieNode {
86
-    terminal: Option<ExportEntry>,
87
-    children: BTreeMap<u8, TrieNode>,
88
-}
89
-
90
-impl TrieNode {
91
-    fn insert(&mut self, name: &str, entry: ExportEntry) {
92
-        let mut node = self;
93
-        for byte in name.bytes() {
94
-            node = node.children.entry(byte).or_default();
95
-        }
96
-        node.terminal = Some(entry);
97
-    }
98
-}
99
-
10082
 #[derive(Debug, Clone)]
10183
 struct FlatTrieNode {
102
-    terminal: Option<ExportEntry>,
84
+    terminal_payload: Vec<u8>,
10385
     children: Vec<(String, usize)>,
10486
 }
10587
 
@@ -108,17 +90,11 @@ pub fn build_export_trie(entries: &[ExportEntry]) -> Vec<u8> {
10890
         return Vec::new();
10991
     }
11092
 
111
-    let mut sorted = entries.to_vec();
93
+    let mut sorted: Vec<&ExportEntry> = entries.iter().collect();
11294
     sorted.sort_by(|lhs, rhs| lhs.name.cmp(&rhs.name));
11395
 
114
-    let mut root = TrieNode::default();
115
-    for entry in sorted {
116
-        let name = entry.name.clone();
117
-        root.insert(&name, entry);
118
-    }
119
-
12096
     let mut nodes = Vec::new();
121
-    flatten_trie(&root, &mut nodes);
97
+    flatten_sorted_export_trie(&sorted, 0, &mut nodes);
12298
 
12399
     let mut offsets = vec![0usize; nodes.len()];
124100
     loop {
@@ -149,42 +125,67 @@ pub fn build_export_trie(entries: &[ExportEntry]) -> Vec<u8> {
149125
     out
150126
 }
151127
 
152
-fn flatten_trie(node: &TrieNode, flat: &mut Vec<FlatTrieNode>) -> usize {
128
+fn flatten_sorted_export_trie(
129
+    entries: &[&ExportEntry],
130
+    prefix_len: usize,
131
+    flat: &mut Vec<FlatTrieNode>,
132
+) -> usize {
153133
     let id = flat.len();
134
+    let mut entry_idx = 0usize;
135
+    let mut terminal = None;
136
+    while entries
137
+        .get(entry_idx)
138
+        .is_some_and(|entry| entry.name.len() == prefix_len)
139
+    {
140
+        terminal = Some(entries[entry_idx]);
141
+        entry_idx += 1;
142
+    }
143
+
154144
     flat.push(FlatTrieNode {
155
-        terminal: node.terminal.clone(),
145
+        terminal_payload: terminal_payload(terminal),
156146
         children: Vec::new(),
157147
     });
158148
 
159
-    let mut children = Vec::with_capacity(node.children.len());
160
-    for (&edge, child) in &node.children {
161
-        let (label, child_id) = flatten_edge(edge, child, flat);
149
+    let mut children = Vec::new();
150
+    while entry_idx < entries.len() {
151
+        let edge = entries[entry_idx].name.as_bytes()[prefix_len];
152
+        let group_start = entry_idx;
153
+        entry_idx += 1;
154
+        while entry_idx < entries.len() && entries[entry_idx].name.as_bytes()[prefix_len] == edge {
155
+            entry_idx += 1;
156
+        }
157
+        let group = &entries[group_start..entry_idx];
158
+        let label_end = common_prefix_len(group, prefix_len + 1);
159
+        let label = String::from_utf8(group[0].name.as_bytes()[prefix_len..label_end].to_vec())
160
+            .expect("export labels should stay UTF-8");
161
+        let child_id = flatten_sorted_export_trie(group, label_end, flat);
162162
         children.push((label, child_id));
163163
     }
164164
     flat[id].children = children;
165165
     id
166166
 }
167167
 
168
-fn flatten_edge(first: u8, child: &TrieNode, flat: &mut Vec<FlatTrieNode>) -> (String, usize) {
169
-    let mut label = vec![first];
170
-    let mut node = child;
171
-    while node.terminal.is_none() && node.children.len() == 1 {
172
-        let (&next, next_child) = node
173
-            .children
174
-            .iter()
175
-            .next()
176
-            .expect("single-child trie node should expose one edge");
177
-        label.push(next);
178
-        node = next_child;
179
-    }
180
-    let label = String::from_utf8(label).expect("export labels should stay UTF-8");
181
-    let child_id = flatten_trie(node, flat);
182
-    (label, child_id)
168
+fn common_prefix_len(entries: &[&ExportEntry], start: usize) -> usize {
169
+    let first = entries
170
+        .first()
171
+        .expect("export trie child groups should be non-empty")
172
+        .name
173
+        .as_bytes();
174
+    let mut len = first.len();
175
+    for entry in &entries[1..] {
176
+        let bytes = entry.name.as_bytes();
177
+        len = len.min(bytes.len());
178
+        let mut idx = start;
179
+        while idx < len && first[idx] == bytes[idx] {
180
+            idx += 1;
181
+        }
182
+        len = idx;
183
+    }
184
+    len
183185
 }
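The rewrite above replaces the intermediate `TrieNode` tree with a direct walk over the sorted slice: entries whose full name equals the current prefix become the terminal, and the rest are grouped by their next byte, with each edge label extended to the group's longest common prefix. A minimal standalone sketch of the grouping step, using plain `&str` names and hypothetical helper names, assuming every name is strictly longer than `prefix_len` (terminals are peeled off first in the real code):

```rust
/// Length of the byte prefix shared by every name in `group`, scanning from
/// `start` (callers guarantee the names already agree before `start`).
fn common_prefix_len(group: &[&str], start: usize) -> usize {
    let first = group[0].as_bytes();
    let mut len = first.len();
    for name in &group[1..] {
        let bytes = name.as_bytes();
        len = len.min(bytes.len());
        let mut idx = start;
        while idx < len && first[idx] == bytes[idx] {
            idx += 1;
        }
        len = idx;
    }
    len
}

/// Split a sorted slice into (edge label, child group) pairs at `prefix_len`,
/// mirroring the grouping loop in `flatten_sorted_export_trie`.
fn edges<'a>(names: &[&'a str], prefix_len: usize) -> Vec<(String, Vec<&'a str>)> {
    let mut out = Vec::new();
    let mut idx = 0;
    while idx < names.len() {
        let edge = names[idx].as_bytes()[prefix_len];
        let start = idx;
        while idx < names.len() && names[idx].as_bytes()[prefix_len] == edge {
            idx += 1;
        }
        let group = &names[start..idx];
        let label_end = common_prefix_len(group, prefix_len + 1);
        out.push((group[0][prefix_len..label_end].to_string(), group.to_vec()));
    }
    out
}
```

Recursing into each group with `label_end` as the new prefix length reproduces the compressed-trie shape the old `flatten_edge` built by chasing single-child chains.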
184186
 
185187
 fn trie_node_size(node: &FlatTrieNode, offsets: &[usize]) -> usize {
186
-    let terminal = terminal_payload(node.terminal.as_ref());
187
-    let mut size = uleb_size(terminal.len() as u64) + terminal.len() + 1;
188
+    let mut size = uleb_size(node.terminal_payload.len() as u64) + node.terminal_payload.len() + 1;
188189
     for (edge, child) in &node.children {
189190
         size += edge.len() + 1 + uleb_size(offsets[*child] as u64);
190191
     }
@@ -192,10 +193,9 @@ fn trie_node_size(node: &FlatTrieNode, offsets: &[usize]) -> usize {
192193
 }
193194
 
194195
 fn emit_trie_node(node: &FlatTrieNode, offsets: &[usize], out: &mut Vec<u8>) {
195
-    let terminal = terminal_payload(node.terminal.as_ref());
196196
     let mut stream = OpcodeStream::new();
197
-    stream.uleb(terminal.len() as u64);
198
-    stream.bytes(&terminal);
197
+    stream.uleb(node.terminal_payload.len() as u64);
198
+    stream.bytes(&node.terminal_payload);
199199
     stream
200200
         .byte(u8::try_from(node.children.len()).expect("export trie node fanout should fit in u8"));
201201
     for (edge, child) in &node.children {
@@ -252,7 +252,7 @@ pub fn emit_rebase_run(out: &mut OpcodeStream, count: usize) {
252252
 pub fn emit_bind_records(specs: &[BindRecordSpec<'_>]) -> Vec<u8> {
253253
     let mut out = OpcodeStream::new();
254254
     let mut state = BindState::default();
255
-    let mut current_symbol: Option<String> = None;
255
+    let mut current_symbol: Option<&str> = None;
256256
 
257257
     let mut idx = 0usize;
258258
     while idx < specs.len() {
@@ -262,14 +262,12 @@ pub fn emit_bind_records(specs: &[BindRecordSpec<'_>]) -> Vec<u8> {
262262
             state.ordinal = Some(spec.ordinal);
263263
         }
264264
 
265
-        if current_symbol.as_deref() != Some(spec.name)
266
-            || state.weak_import != Some(spec.weak_import)
267
-        {
265
+        if current_symbol != Some(spec.name) || state.weak_import != Some(spec.weak_import) {
268266
             out.byte(
269267
                 BIND_OPCODE_SET_SYMBOL_TRAILING_FLAGS_IMM | bind_symbol_flags(spec.weak_import),
270268
             );
271269
             out.string(spec.name);
272
-            current_symbol = Some(spec.name.to_string());
270
+            current_symbol = Some(spec.name);
273271
             state.weak_import = Some(spec.weak_import);
274272
         }
275273
 
src/synth/mod.rs modified
@@ -16,7 +16,9 @@ use crate::macho::constants::{
1616
     S_ATTR_PURE_INSTRUCTIONS, S_ATTR_SOME_INSTRUCTIONS, S_LAZY_SYMBOL_POINTERS,
1717
     S_NON_LAZY_SYMBOL_POINTERS, S_REGULAR, S_SYMBOL_STUBS, S_THREAD_LOCAL_VARIABLE_POINTERS,
1818
 };
19
-use crate::reloc::{parse_raw_relocs, parse_relocs, Referent, Reloc, RelocKind, RelocLength};
19
+use crate::reloc::{
20
+    parse_raw_relocs, parse_relocs, ParsedRelocCache, Referent, Reloc, RelocKind, RelocLength,
21
+};
2022
 use crate::resolve::{
2123
     AtomId, DylibId, DylibInput, InputId, InsertOutcome, Symbol, SymbolId, SymbolTable,
2224
 };
@@ -92,37 +94,25 @@ impl SyntheticPlan {
9294
         sym_table: &mut SymbolTable,
9395
         dylibs: &[DylibInput],
9496
         live_atoms: Option<&HashSet<AtomId>>,
97
+    ) -> Result<Self, SynthError> {
98
+        let reloc_cache = build_synthetic_reloc_cache(inputs)?;
99
+        Self::build_filtered_with_relocs(inputs, atoms, sym_table, dylibs, live_atoms, &reloc_cache)
100
+    }
101
+
102
+    pub fn build_filtered_with_relocs(
103
+        inputs: &[LayoutInput<'_>],
104
+        atoms: &AtomTable,
105
+        sym_table: &mut SymbolTable,
106
+        dylibs: &[DylibInput],
107
+        live_atoms: Option<&HashSet<AtomId>>,
108
+        parsed_relocs: &ParsedRelocCache,
95109
     ) -> Result<Self, SynthError> {
96110
         let input_map: HashMap<InputId, &ObjectFile> = inputs
97111
             .iter()
98112
             .map(|input| (input.id, input.object))
99113
             .collect();
100
-        let mut reloc_cache: HashMap<(InputId, u8), Vec<Reloc>> = HashMap::new();
101
-        for input in inputs {
102
-            for (sect_idx, section) in input.object.sections.iter().enumerate() {
103
-                let relocs = if section.nreloc == 0 {
104
-                    Vec::new()
105
-                } else {
106
-                    let raws = parse_raw_relocs(&section.raw_relocs, 0, section.nreloc).map_err(
107
-                        |err| SynthError {
108
-                            input: input.object.path.clone(),
109
-                            atom: crate::resolve::AtomId(0),
110
-                            reloc_offset: 0,
111
-                            kind: RelocKind::Unsigned,
112
-                            detail: err.to_string(),
113
-                        },
114
-                    )?;
115
-                    parse_relocs(&raws).map_err(|err| SynthError {
116
-                        input: input.object.path.clone(),
117
-                        atom: crate::resolve::AtomId(0),
118
-                        reloc_offset: 0,
119
-                        kind: RelocKind::Unsigned,
120
-                        detail: err.to_string(),
121
-                    })?
122
-                };
123
-                reloc_cache.insert((input.id, (sect_idx + 1) as u8), relocs);
124
-            }
125
-        }
114
+        let input_symbol_index = build_input_symbol_index(inputs, sym_table, parsed_relocs);
115
+        let reloc_index = build_sorted_reloc_index(parsed_relocs);
126116
 
127117
         let mut got = GotSection::default();
128118
         let mut stubs = StubsSection::default();
@@ -141,16 +131,19 @@ impl SyntheticPlan {
141131
                 kind: RelocKind::Unsigned,
142132
                 detail: "missing parsed object".to_string(),
143133
             })?;
144
-            let relocs = reloc_cache
134
+            let relocs = reloc_index
145135
                 .get(&(atom.origin, atom.input_section))
146136
                 .map(Vec::as_slice)
147137
                 .unwrap_or(&[]);
138
+            let input_symbols = input_symbol_index.get(&atom.origin).map(Vec::as_slice);
148139
             for reloc in relocs_for_atom(relocs, atom) {
149140
                 if atom.section == AtomSection::CompactUnwind
150141
                     && reloc.kind == RelocKind::Unsigned
151142
                     && reloc.offset == atom.input_offset + COMPACT_UNWIND_PERSONALITY_FIELD_OFFSET
152143
                 {
153
-                    if let Some(symbol_id) = dylib_import_referent(obj, reloc.referent, sym_table) {
144
+                    if let Some(symbol_id) =
145
+                        dylib_import_referent(obj, reloc.referent, sym_table, input_symbols)
146
+                    {
154147
                         got.intern(symbol_id, dylib_import_is_weak(sym_table, symbol_id));
155148
                     }
156149
                     continue;
@@ -163,7 +156,8 @@ impl SyntheticPlan {
163156
                         if matches!(atom.section, AtomSection::ThreadLocalVariables) {
164157
                             continue;
165158
                         }
166
-                        let Some(symbol_id) = dylib_import_referent(obj, reloc.referent, sym_table)
159
+                        let Some(symbol_id) =
160
+                            dylib_import_referent(obj, reloc.referent, sym_table, input_symbols)
167161
                         else {
168162
                             continue;
169163
                         };
@@ -179,7 +173,8 @@ impl SyntheticPlan {
179173
                     RelocKind::GotLoadPage21
180174
                     | RelocKind::GotLoadPageOff12
181175
                     | RelocKind::PointerToGot => {
182
-                        let Some(symbol_id) = symbol_referent_id(obj, reloc.referent, sym_table)
176
+                        let Some(symbol_id) =
177
+                            symbol_referent_id(obj, reloc.referent, sym_table, input_symbols)
183178
                         else {
184179
                             continue;
185180
                         };
@@ -198,7 +193,8 @@ impl SyntheticPlan {
198193
                         }
199194
                     }
200195
                     RelocKind::Branch26 => {
201
-                        let Some(symbol_id) = dylib_import_referent(obj, reloc.referent, sym_table)
196
+                        let Some(symbol_id) =
197
+                            dylib_import_referent(obj, reloc.referent, sym_table, input_symbols)
202198
                         else {
203199
                             continue;
204200
                         };
@@ -214,7 +210,8 @@ impl SyntheticPlan {
214210
                         );
215211
                     }
216212
                     RelocKind::TlvpLoadPage21 | RelocKind::TlvpLoadPageOff12 => {
217
-                        if let Some(symbol_id) = symbol_referent_id(obj, reloc.referent, sym_table)
213
+                        if let Some(symbol_id) =
214
+                            symbol_referent_id(obj, reloc.referent, sym_table, input_symbols)
218215
                         {
219216
                             if tlv_symbol_needs_got(sym_table, symbol_id) {
220217
                                 got.intern(symbol_id, dylib_import_is_weak(sym_table, symbol_id));
@@ -424,12 +421,99 @@ fn sort_symbol_indexed_entries<T, F>(
424421
     }
425422
 }
426423
 
424
+fn build_synthetic_reloc_cache(inputs: &[LayoutInput<'_>]) -> Result<ParsedRelocCache, SynthError> {
425
+    let mut reloc_cache = ParsedRelocCache::new();
426
+    for input in inputs {
427
+        for (sect_idx, section) in input.object.sections.iter().enumerate() {
428
+            let relocs = if section.nreloc == 0 {
429
+                Vec::new()
430
+            } else {
431
+                let raws =
432
+                    parse_raw_relocs(&section.raw_relocs, 0, section.nreloc).map_err(|err| {
433
+                        SynthError {
434
+                            input: input.object.path.clone(),
435
+                            atom: crate::resolve::AtomId(0),
436
+                            reloc_offset: 0,
437
+                            kind: RelocKind::Unsigned,
438
+                            detail: err.to_string(),
439
+                        }
440
+                    })?;
441
+                parse_relocs(&raws).map_err(|err| SynthError {
442
+                    input: input.object.path.clone(),
443
+                    atom: crate::resolve::AtomId(0),
444
+                    reloc_offset: 0,
445
+                    kind: RelocKind::Unsigned,
446
+                    detail: err.to_string(),
447
+                })?
448
+            };
449
+            reloc_cache.insert((input.id, (sect_idx + 1) as u8), relocs);
450
+        }
451
+    }
452
+    Ok(reloc_cache)
453
+}
454
+
455
+fn build_input_symbol_index(
456
+    inputs: &[LayoutInput<'_>],
457
+    sym_table: &SymbolTable,
458
+    parsed_relocs: &ParsedRelocCache,
459
+) -> HashMap<InputId, Vec<Option<SymbolId>>> {
460
+    let mut referenced = HashMap::<InputId, HashSet<u32>>::new();
461
+    for ((input_id, _), relocs) in parsed_relocs {
462
+        let referenced = referenced.entry(*input_id).or_default();
463
+        for reloc in relocs {
464
+            if let Referent::Symbol(sym_idx) = reloc.referent {
465
+                referenced.insert(sym_idx);
466
+            }
467
+            if let Some(Referent::Symbol(sym_idx)) = reloc.subtrahend {
468
+                referenced.insert(sym_idx);
469
+            }
470
+        }
471
+    }
472
+
473
+    let mut index = HashMap::new();
474
+    for input in inputs {
475
+        let mut symbols = vec![None; input.object.symbols.len()];
476
+        if let Some(referenced) = referenced.get(&input.id) {
477
+            for sym_idx in referenced {
478
+                let Some(input_sym) = input.object.symbols.get(*sym_idx as usize) else {
479
+                    continue;
480
+                };
481
+                let Some(slot) = symbols.get_mut(*sym_idx as usize) else {
482
+                    continue;
483
+                };
484
+                *slot = input
485
+                    .object
486
+                    .symbol_name(input_sym)
487
+                    .ok()
488
+                    .and_then(|name| sym_table.lookup_str(name));
489
+            }
490
+        }
491
+        index.insert(input.id, symbols);
492
+    }
493
+    index
494
+}
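`build_input_symbol_index` resolves only the symbol-table slots that some reloc's `Referent::Symbol` (or subtrahend) actually names, leaving every other slot `None`. A toy sketch, with `u32` ids standing in for `SymbolId` and a plain map standing in for the `SymbolTable` lookup:

```rust
use std::collections::{HashMap, HashSet};

/// Resolve only the referenced indices of an input's symbol list; out-of-range
/// indices and names missing from the table simply stay `None`.
fn index_referenced(
    names: &[&str],
    referenced: &HashSet<u32>,
    table: &HashMap<&str, u32>,
) -> Vec<Option<u32>> {
    let mut slots: Vec<Option<u32>> = vec![None; names.len()];
    for &idx in referenced {
        if let Some(name) = names.get(idx as usize) {
            slots[idx as usize] = table.get(name).copied();
        }
    }
    slots
}
```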
495
+
496
+fn build_sorted_reloc_index(parsed_relocs: &ParsedRelocCache) -> ParsedRelocCache {
497
+    let mut index = ParsedRelocCache::new();
498
+    for (key, relocs) in parsed_relocs {
499
+        if relocs.is_empty() {
500
+            continue;
501
+        }
502
+        let mut sorted = relocs.clone();
503
+        sorted.sort_by_key(|reloc| reloc.offset);
504
+        index.insert(*key, sorted);
505
+    }
506
+    index
507
+}
508
+
427509
 fn relocs_for_atom<'a>(relocs: &'a [Reloc], atom: &Atom) -> impl Iterator<Item = Reloc> + 'a {
428510
     let start = atom.input_offset;
429511
     let end = atom.input_offset + atom.size;
430
-    relocs.iter().copied().filter(move |reloc| {
512
+    let first = relocs.partition_point(|reloc| reloc.offset < start);
513
+    let last = relocs.partition_point(|reloc| reloc.offset < end);
514
+    relocs[first..last].iter().copied().filter(move |reloc| {
431515
         let reloc_end = reloc.offset + reloc.width_for_planning();
432
-        reloc.offset >= start && reloc_end <= end
516
+        reloc_end <= end
433517
     })
434518
 }
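Because `build_sorted_reloc_index` pre-sorts each reloc list by offset, `relocs_for_atom` can narrow the candidate range with two binary searches before applying the width filter. The `partition_point` idiom on its own, over a plain sorted offset list:

```rust
/// Narrow a slice sorted ascending by offset to entries with
/// start <= offset < end -- the same pair of `partition_point` calls
/// `relocs_for_atom` now uses instead of a full linear scan.
fn sorted_window(offsets: &[u64], start: u64, end: u64) -> &[u64] {
    let first = offsets.partition_point(|&o| o < start);
    let last = offsets.partition_point(|&o| o < end);
    &offsets[first..last]
}
```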
435519
 
@@ -455,8 +539,9 @@ fn dylib_import_referent(
455539
     obj: &ObjectFile,
456540
     referent: Referent,
457541
     sym_table: &SymbolTable,
542
+    input_symbols: Option<&[Option<SymbolId>]>,
458543
 ) -> Option<SymbolId> {
459
-    let symbol_id = symbol_referent_id(obj, referent, sym_table)?;
544
+    let symbol_id = symbol_referent_id(obj, referent, sym_table, input_symbols)?;
460545
     matches!(sym_table.get(symbol_id), Symbol::DylibImport { .. }).then_some(symbol_id)
461546
 }
462547
 
@@ -464,10 +549,14 @@ fn symbol_referent_id(
464549
     obj: &ObjectFile,
465550
     referent: Referent,
466551
     sym_table: &SymbolTable,
552
+    input_symbols: Option<&[Option<SymbolId>]>,
467553
 ) -> Option<SymbolId> {
468554
     let Referent::Symbol(sym_idx) = referent else {
469555
         return None;
470556
     };
557
+    if let Some(symbols) = input_symbols {
558
+        return symbols.get(sym_idx as usize).copied().flatten();
559
+    }
471560
     let input_sym = obj.symbols.get(sym_idx as usize)?;
472561
     let name = obj.symbol_name(input_sym).ok()?;
473562
     let (symbol_id, _) = sym_table
tests/atom_integration.rs modified
@@ -107,14 +107,8 @@ fn atomize_splits_text_at_symbol_boundaries_and_backpatches_symbols() {
107107
     // Atomize + back-patch.
108108
     let obj = inputs.object_file(input_id).unwrap();
109109
     let mut atom_table = AtomTable::new();
110
-    let atomization = atomize_object(input_id, &obj, &mut atom_table);
111
-    backpatch_symbol_atoms(
112
-        &atomization,
113
-        input_id,
114
-        &obj,
115
-        &mut sym_table,
116
-        &mut atom_table,
117
-    );
110
+    let atomization = atomize_object(input_id, obj, &mut atom_table);
111
+    backpatch_symbol_atoms(&atomization, input_id, obj, &mut sym_table, &mut atom_table);
118112
 
119113
     // At least one atom per defined function plus one for data_global.
120114
     assert!(
@@ -209,7 +203,7 @@ fn atomize_cstring_splits_at_null_terminators() {
209203
     let _ = seed_all(&inputs, &mut sym_table).expect("seed_all");
210204
     let obj = inputs.object_file(input_id).unwrap();
211205
     let mut atom_table = AtomTable::new();
212
-    let _atomization = atomize_object(input_id, &obj, &mut atom_table);
206
+    let _atomization = atomize_object(input_id, obj, &mut atom_table);
213207
 
214208
     let cstring_atoms: Vec<_> = atom_table
215209
         .iter()
tests/common/harness.rs modified
@@ -9,15 +9,23 @@
99
 use std::collections::{BTreeMap, HashSet};
1010
 use std::fs;
1111
 use std::path::{Path, PathBuf};
12
-use std::process::Command;
13
-use std::time::{SystemTime, UNIX_EPOCH};
12
+use std::process::{Command, Stdio};
13
+use std::thread;
14
+use std::time::{Duration, Instant, SystemTime, UNIX_EPOCH};
1415
 
15
-use afs_ld::leb::read_uleb;
16
+use afs_ld::leb::{read_sleb, read_uleb};
1617
 use afs_ld::macho::constants::{
17
-    INDIRECT_SYMBOL_ABS, INDIRECT_SYMBOL_LOCAL, LC_BUILD_VERSION, LC_CODE_SIGNATURE,
18
-    LC_DATA_IN_CODE, LC_DYLD_CHAINED_FIXUPS, LC_DYLD_EXPORTS_TRIE, LC_DYLD_INFO_ONLY, LC_DYSYMTAB,
19
-    LC_FUNCTION_STARTS, LC_ID_DYLIB, LC_LOAD_DYLIB, LC_LOAD_UPWARD_DYLIB, LC_LOAD_WEAK_DYLIB,
20
-    LC_REEXPORT_DYLIB, LC_SEGMENT_64, LC_SYMTAB, LC_UUID,
18
+    BIND_IMMEDIATE_MASK, BIND_OPCODE_ADD_ADDR_ULEB, BIND_OPCODE_DONE, BIND_OPCODE_DO_BIND,
19
+    BIND_OPCODE_DO_BIND_ADD_ADDR_IMM_SCALED, BIND_OPCODE_DO_BIND_ADD_ADDR_ULEB,
20
+    BIND_OPCODE_DO_BIND_ULEB_TIMES_SKIPPING_ULEB, BIND_OPCODE_MASK, BIND_OPCODE_SET_ADDEND_SLEB,
21
+    BIND_OPCODE_SET_DYLIB_ORDINAL_IMM, BIND_OPCODE_SET_DYLIB_ORDINAL_ULEB,
22
+    BIND_OPCODE_SET_DYLIB_SPECIAL_IMM, BIND_OPCODE_SET_SEGMENT_AND_OFFSET_ULEB,
23
+    BIND_OPCODE_SET_SYMBOL_TRAILING_FLAGS_IMM, BIND_OPCODE_SET_TYPE_IMM,
24
+    BIND_SYMBOL_FLAGS_WEAK_IMPORT, BIND_TYPE_POINTER, INDIRECT_SYMBOL_ABS, INDIRECT_SYMBOL_LOCAL,
25
+    LC_BUILD_VERSION, LC_CODE_SIGNATURE, LC_DATA_IN_CODE, LC_DYLD_CHAINED_FIXUPS,
26
+    LC_DYLD_EXPORTS_TRIE, LC_DYLD_INFO_ONLY, LC_DYSYMTAB, LC_FUNCTION_STARTS, LC_ID_DYLIB,
27
+    LC_LOAD_DYLIB, LC_LOAD_UPWARD_DYLIB, LC_LOAD_WEAK_DYLIB, LC_REEXPORT_DYLIB, LC_SEGMENT_64,
28
+    LC_SYMTAB, LC_UUID, N_TYPE, N_UNDF,
2129
 };
2230
 use afs_ld::macho::dylib::DylibFile;
2331
 use afs_ld::macho::exports::ExportKind;
@@ -27,6 +35,7 @@ use afs_ld::macho::reader::{
2735
 };
2836
 use afs_ld::string_table::StringTable;
2937
 use afs_ld::symbol::{parse_nlist_table, SymKind};
38
+use afs_ld::synth::stubs::{STUB_HELPER_ENTRY_SIZE, STUB_HELPER_HEADER_SIZE};
3039
 use afs_ld::synth::unwind::decode_unwind_info;
3140
 
3241
 #[derive(Debug, Clone)]
@@ -59,6 +68,7 @@ pub enum CommandCheck {
5968
     FunctionStarts,
6069
     NormalizedFunctionStarts,
6170
     DataInCode,
71
+    DataInCodeIfPresent,
6272
     RebasedUnwindBytes,
6373
     DyldInfoRebase,
6474
     DyldInfoBind,
@@ -107,11 +117,13 @@ struct ArtifactSpec {
107117
 
108118
 #[derive(Debug, Clone, Copy, PartialEq, Eq)]
109119
 enum ArtifactKind {
110
-    ClangDylib,
111
-    ClangArchive,
112
-    ClangReexportDylib,
120
+    Dylib,
121
+    Archive,
122
+    ReexportDylib,
113123
 }
114124
 
125
+type SymbolPartitions = (Vec<String>, Vec<String>, Vec<String>);
126
+
115127
 pub struct LinkOutputs {
116128
     pub ours: Vec<u8>,
117129
     pub theirs: Vec<u8>,
@@ -209,11 +221,7 @@ pub fn scratch(name: &str) -> PathBuf {
209221
 }
210222
 
211223
 pub fn assemble(src: &str, out: &PathBuf) -> Result<(), String> {
212
-    let tmp = std::env::temp_dir().join(format!(
213
-        "afs-ld-parity-{}-{}.s",
214
-        std::process::id(),
215
-        out.file_stem().and_then(|s| s.to_str()).unwrap_or("t")
216
-    ));
224
+    let tmp = out.with_extension("s");
217225
     fs::write(&tmp, src).map_err(|e| format!("write {}: {e}", tmp.display()))?;
218226
     let output = Command::new("xcrun")
219227
         .args(["--sdk", "macosx", "as", "-arch", "arm64"])
@@ -233,11 +241,7 @@ pub fn assemble(src: &str, out: &PathBuf) -> Result<(), String> {
233241
 }
234242
 
235243
 pub fn compile_c(src: &str, out: &PathBuf) -> Result<(), String> {
236
-    let tmp = std::env::temp_dir().join(format!(
237
-        "afs-ld-parity-{}-{}.c",
238
-        std::process::id(),
239
-        out.file_stem().and_then(|s| s.to_str()).unwrap_or("t")
240
-    ));
244
+    let tmp = out.with_extension("c");
241245
     fs::write(&tmp, src).map_err(|e| format!("write {}: {e}", tmp.display()))?;
242246
     let output = Command::new("xcrun")
243247
         .args(["--sdk", "macosx", "clang", "-arch", "arm64", "-c"])
@@ -257,11 +261,7 @@ pub fn compile_c(src: &str, out: &PathBuf) -> Result<(), String> {
257261
 }
258262
 
259263
 fn compile_dylib_c(src: &str, out: &PathBuf) -> Result<(), String> {
260
-    let tmp = std::env::temp_dir().join(format!(
261
-        "afs-ld-parity-{}-{}.c",
262
-        std::process::id(),
263
-        out.file_stem().and_then(|s| s.to_str()).unwrap_or("lib")
264
-    ));
264
+    let tmp = out.with_extension("c");
265265
     fs::write(&tmp, src).map_err(|e| format!("write {}: {e}", tmp.display()))?;
266266
     let install_name = out.to_string_lossy().to_string();
267267
     let output = Command::new("xcrun")
@@ -302,13 +302,7 @@ fn compile_archive_c(src: &str, out: &PathBuf) -> Result<(), String> {
302302
 }
303303
 
304304
 fn compile_reexport_dylib_c(src: &str, out: &PathBuf, dep: &Path) -> Result<(), String> {
305
-    let tmp = std::env::temp_dir().join(format!(
306
-        "afs-ld-parity-{}-{}.c",
307
-        std::process::id(),
308
-        out.file_stem()
309
-            .and_then(|s| s.to_str())
310
-            .unwrap_or("reexport")
311
-    ));
305
+    let tmp = out.with_extension("c");
312306
     fs::write(&tmp, src).map_err(|e| format!("write {}: {e}", tmp.display()))?;
313307
     let install_name = out.to_string_lossy().to_string();
314308
     let output = Command::new("xcrun")
@@ -500,9 +494,9 @@ pub fn link_both(case: &LinkCase) -> Result<LinkOutputs, String> {
500494
             .map_err(|e| format!("read artifact src {}: {e}", src.display()))?;
501495
         let out = work_dir.join(&artifact.out_name);
502496
         match artifact.kind {
503
-            ArtifactKind::ClangDylib => compile_dylib_c(&src_contents, &out)?,
504
-            ArtifactKind::ClangArchive => compile_archive_c(&src_contents, &out)?,
505
-            ArtifactKind::ClangReexportDylib => {
497
+            ArtifactKind::Dylib => compile_dylib_c(&src_contents, &out)?,
498
+            ArtifactKind::Archive => compile_archive_c(&src_contents, &out)?,
499
+            ArtifactKind::ReexportDylib => {
506500
                 let dep_name = artifact.dep_name.as_ref().ok_or_else(|| {
507501
                     format!(
508502
                         "missing reexport dependency for artifact {}",
@@ -676,8 +670,8 @@ pub fn compare_command_details(
676670
                 }
677671
             }
678672
             CommandCheck::StringTableNearParity => {
679
-                let our_len = raw_string_table(ours)?.len();
680
-                let their_len = raw_string_table(theirs)?.len();
673
+                let our_len = effective_string_table_len(ours)?;
674
+                let their_len = effective_string_table_len(theirs)?;
681675
                 if !string_table_within_five_percent(our_len, their_len) {
682676
                     return Err(format!(
683677
                         "string table length drifted too far from Apple ld: ours={} theirs={}",
@@ -712,6 +706,15 @@ pub fn compare_command_details(
712706
                     ));
713707
                 }
714708
             }
709
+            CommandCheck::DataInCodeIfPresent => {
710
+                let ours = canonical_data_in_code(ours)?;
711
+                let theirs = canonical_data_in_code(theirs)?;
712
+                if !ours.is_empty() && !theirs.is_empty() && ours != theirs {
713
+                    return Err(format!(
714
+                        "canonical data-in-code records diverged:\nours:   {ours:#?}\ntheirs: {theirs:#?}"
715
+                    ));
716
+                }
717
+            }
715718
             CommandCheck::RebasedUnwindBytes => {
716719
                 let ours = rebased_unwind_bytes(ours)?;
717720
                 let theirs = rebased_unwind_bytes(theirs)?;
@@ -727,24 +730,30 @@ pub fn compare_command_details(
727730
                 }
728731
             }
729732
             CommandCheck::DyldInfoBind => {
730
-                let ours = dyld_info_stream(ours, DyldInfoStreamKind::Bind)?;
731
-                let theirs = dyld_info_stream(theirs, DyldInfoStreamKind::Bind)?;
733
+                let ours = canonical_bind_records(ours, DyldInfoStreamKind::Bind)?;
734
+                let theirs = canonical_bind_records(theirs, DyldInfoStreamKind::Bind)?;
732735
                 if ours != theirs {
733
-                    return Err("bind stream diverged".to_string());
736
+                    return Err(format!(
737
+                        "bind stream diverged:\nours:   {ours:#?}\ntheirs: {theirs:#?}"
738
+                    ));
734739
                 }
735740
             }
736741
             CommandCheck::DyldInfoWeakBind => {
737
-                let ours = dyld_info_stream(ours, DyldInfoStreamKind::WeakBind)?;
738
-                let theirs = dyld_info_stream(theirs, DyldInfoStreamKind::WeakBind)?;
742
+                let ours = canonical_bind_records(ours, DyldInfoStreamKind::WeakBind)?;
743
+                let theirs = canonical_bind_records(theirs, DyldInfoStreamKind::WeakBind)?;
739744
                 if ours != theirs {
740
-                    return Err("weak-bind stream diverged".to_string());
745
+                    return Err(format!(
746
+                        "weak-bind stream diverged:\nours:   {ours:#?}\ntheirs: {theirs:#?}"
747
+                    ));
741748
                 }
742749
             }
743750
             CommandCheck::DyldInfoLazyBind => {
744
-                let ours = dyld_info_stream(ours, DyldInfoStreamKind::LazyBind)?;
745
-                let theirs = dyld_info_stream(theirs, DyldInfoStreamKind::LazyBind)?;
751
+                let ours = canonical_bind_records(ours, DyldInfoStreamKind::LazyBind)?;
752
+                let theirs = canonical_bind_records(theirs, DyldInfoStreamKind::LazyBind)?;
746753
                 if ours != theirs {
747
-                    return Err("lazy-bind stream diverged".to_string());
754
+                    return Err(format!(
755
+                        "lazy-bind stream diverged:\nours:   {ours:#?}\ntheirs: {theirs:#?}"
756
+                    ));
748757
                 }
749758
             }
750759
         }
@@ -841,6 +850,26 @@ pub fn compare_sections(
841850
     case_tolerances: &[CaseTolerance],
842851
 ) -> Result<(), String> {
843852
     for (segname, sectname) in sections {
853
+        if segname == "__TEXT" && sectname == "__stubs" {
854
+            let ours = canonical_stub_targets(ours)?;
855
+            let theirs = canonical_stub_targets(theirs)?;
856
+            if ours != theirs {
857
+                return Err(format!(
858
+                    "canonical stub targets diverged:\nours:   {ours:#?}\ntheirs: {theirs:#?}"
859
+                ));
860
+            }
861
+            continue;
862
+        }
863
+        if segname == "__TEXT" && sectname == "__stub_helper" {
864
+            let ours = canonical_stub_helper(ours)?;
865
+            let theirs = canonical_stub_helper(theirs)?;
866
+            if ours != theirs {
867
+                return Err(format!(
868
+                    "canonical stub helper surface diverged:\nours:   {ours:#?}\ntheirs: {theirs:#?}"
869
+                ));
870
+            }
871
+            continue;
872
+        }
844873
         let (_, our_bytes) = output_section(ours, segname, sectname)
845874
             .ok_or_else(|| format!("missing section {segname},{sectname} in afs-ld output"))?;
846875
         let (_, their_bytes) = output_section(theirs, segname, sectname)
@@ -890,12 +919,8 @@ pub fn compare_page_refs(
890919
             decode_page_reference(&our_bytes, our_addr, check.site_offset, check.kind)?;
891920
         let their_target =
892921
             decode_page_reference(&their_bytes, their_addr, check.site_offset, check.kind)?;
893
-        let expected_ours = *our_symbols
894
-            .get(&check.symbol)
895
-            .ok_or_else(|| format!("missing symbol {} in afs-ld output", check.symbol))?;
896
-        let expected_theirs = *their_symbols
897
-            .get(&check.symbol)
898
-            .ok_or_else(|| format!("missing symbol {} in Apple output", check.symbol))?;
922
+        let expected_ours = resolve_page_ref_expectation(ours, &our_symbols, &check.symbol)?;
923
+        let expected_theirs = resolve_page_ref_expectation(theirs, &their_symbols, &check.symbol)?;
899924
         if our_target != expected_ours || their_target != expected_theirs {
900925
             return Err(format!(
901926
                 "page ref {},{}+0x{:x} -> {} diverged: ours=0x{:x} expected=0x{:x}; theirs=0x{:x} expected=0x{:x}",
@@ -913,21 +938,94 @@ pub fn compare_page_refs(
913938
     Ok(())
914939
 }
915940
 
941
+fn resolve_page_ref_expectation(
942
+    bytes: &[u8],
943
+    symbols: &BTreeMap<String, u64>,
944
+    reference: &str,
945
+) -> Result<u64, String> {
946
+    if let Some(spec) = reference.strip_prefix("@SECTION:") {
947
+        let (section_spec, addend) = if let Some((section_spec, addend)) = spec.rsplit_once('+') {
948
+            (section_spec, parse_u64(addend)?)
949
+        } else {
950
+            (spec, 0)
951
+        };
952
+        let (segname, sectname) = section_spec
953
+            .split_once(',')
954
+            .ok_or_else(|| format!("invalid @SECTION page-ref target `{reference}`"))?;
955
+        let (addr, data) = output_section(bytes, segname, sectname)
956
+            .ok_or_else(|| format!("missing section {segname},{sectname} in output"))?;
957
+        if addend > data.len() as u64 {
958
+            return Err(format!(
959
+                "@SECTION target `{reference}` exceeds section size {}",
960
+                data.len()
961
+            ));
962
+        }
963
+        return Ok(addr + addend);
964
+    }
965
+    symbols
966
+        .get(reference)
967
+        .copied()
968
+        .ok_or_else(|| format!("missing symbol {reference} in output"))
969
+}
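`resolve_page_ref_expectation` accepts either a plain symbol name or an `@SECTION:seg,sect+addend` target. The parsing half can be sketched separately; the hex-vs-decimal handling below is an assumption, since the harness delegates to its own `parse_u64`:

```rust
/// Parse an `@SECTION:__SEG,__sect+addend` page-ref target: split an optional
/// `+addend` from the right, then split `seg,sect` on the comma.
fn parse_section_spec(reference: &str) -> Option<(&str, &str, u64)> {
    let spec = reference.strip_prefix("@SECTION:")?;
    let (section, addend) = match spec.rsplit_once('+') {
        Some((section, addend)) => {
            let addend = addend
                .strip_prefix("0x")
                .map(|hex| u64::from_str_radix(hex, 16))
                .unwrap_or_else(|| addend.parse())
                .ok()?;
            (section, addend)
        }
        None => (spec, 0),
    };
    let (segname, sectname) = section.split_once(',')?;
    Some((segname, sectname, addend))
}
```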
970
+
916971
 pub fn run_program(path: &Path, args: &[String]) -> Result<ProgramOutput, String> {
-    let output = Command::new(path)
+    let runtime_timeout = runtime_timeout();
+
+    let mut child = Command::new(path)
         .args(args)
-        .output()
+        .stdout(Stdio::piped())
+        .stderr(Stdio::piped())
+        .spawn()
         .map_err(|e| format!("run {}: {e}", path.display()))?;
-    Ok(ProgramOutput {
-        exit_code: output.status.code(),
-        stdout: output.stdout,
-        stderr: output.stderr,
-    })
+    let started = Instant::now();
+    loop {
+        if child
+            .try_wait()
+            .map_err(|e| format!("wait for {}: {e}", path.display()))?
+            .is_some()
+        {
+            let output = child
+                .wait_with_output()
+                .map_err(|e| format!("collect output from {}: {e}", path.display()))?;
+            return Ok(ProgramOutput {
+                exit_code: output.status.code(),
+                stdout: output.stdout,
+                stderr: output.stderr,
+            });
+        }
+        if started.elapsed() >= runtime_timeout {
+            let _ = child.kill();
+            let output = child
+                .wait_with_output()
+                .map_err(|e| format!("collect timed-out output from {}: {e}", path.display()))?;
+            return Err(format!(
+                "run {} timed out after {:?}: exit={:?} stdout={:?} stderr={:?}",
+                path.display(),
+                runtime_timeout,
+                output.status.code(),
+                String::from_utf8_lossy(&output.stdout),
+                String::from_utf8_lossy(&output.stderr)
+            ));
+        }
+        thread::sleep(Duration::from_millis(5));
+    }
 }
 
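The spawn/poll/kill pattern in `run_program` above can be reduced to a minimal standalone sketch. The function name, the 5 ms poll interval, and the `sh -c` usage example are illustrative assumptions; the sketch also assumes a POSIX `sh` is available.

```rust
use std::process::{Command, Stdio};
use std::time::{Duration, Instant};

// Sketch of polling a child with try_wait() until it exits or a deadline
// passes, then killing and reaping it so no zombie is left behind.
fn run_with_timeout(mut cmd: Command, timeout: Duration) -> Result<String, String> {
    let mut child = cmd
        .stdout(Stdio::piped())
        .stderr(Stdio::piped())
        .spawn()
        .map_err(|e| format!("spawn: {e}"))?;
    let started = Instant::now();
    loop {
        // try_wait() returns Ok(Some(status)) once the child has exited.
        if child.try_wait().map_err(|e| format!("wait: {e}"))?.is_some() {
            let output = child
                .wait_with_output()
                .map_err(|e| format!("collect: {e}"))?;
            return Ok(String::from_utf8_lossy(&output.stdout).into_owned());
        }
        if started.elapsed() >= timeout {
            let _ = child.kill();
            let _ = child.wait_with_output(); // reap the killed child
            return Err("timed out".to_string());
        }
        std::thread::sleep(Duration::from_millis(5));
    }
}

fn main() {
    let mut cmd = Command::new("sh");
    cmd.arg("-c").arg("printf ok");
    println!("{:?}", run_with_timeout(cmd, Duration::from_secs(5)));
}
```

Compared with `Command::output()`, this trades a small polling cost for the ability to abandon a hung binary, which is what the parity harness needs.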
 pub fn compare_runtime(our_path: &Path, their_path: &Path, args: &[String]) -> Result<(), String> {
-    let ours = run_program(our_path, args)?;
-    let theirs = run_program(their_path, args)?;
+    let our_path = our_path.to_path_buf();
+    let their_path = their_path.to_path_buf();
+    let their_args = args.to_vec();
+    let ours = thread::scope(|scope| {
+        let theirs = scope.spawn(|| run_program(&their_path, &their_args));
+        let ours = run_program(&our_path, args);
+        let theirs = theirs
+            .join()
+            .map_err(|_| "Apple runtime worker panicked".to_string())?;
+        Ok::<_, String>((ours, theirs))
+    })?;
+    let (ours, theirs) = ours;
+    let ours = ours?;
+    let theirs = theirs?;
     if ours != theirs {
         return Err(format!(
             "runtime differs:\nours: exit={:?} stdout={:?} stderr={:?}\ntheirs: exit={:?} stdout={:?} stderr={:?}",
@@ -942,6 +1040,16 @@ pub fn compare_runtime(our_path: &Path, their_path: &Path, args: &[String]) -> R
     Ok(())
 }
 
+fn runtime_timeout() -> Duration {
+    const DEFAULT_RUNTIME_TIMEOUT_SECS: u64 = 120;
+
+    std::env::var("PARITY_RUNTIME_TIMEOUT_SECONDS")
+        .ok()
+        .and_then(|value| value.parse::<u64>().ok())
+        .map(Duration::from_secs)
+        .unwrap_or_else(|| Duration::from_secs(DEFAULT_RUNTIME_TIMEOUT_SECS))
+}
+
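The fallback behavior of `runtime_timeout` above is easy to pin down in isolation. This sketch factors the parsing out of `std::env::var` so it can be exercised deterministically; the helper name `timeout_from` is an assumption for illustration.

```rust
use std::time::Duration;

// Sketch of the env-override pattern: a missing or malformed
// PARITY_RUNTIME_TIMEOUT_SECONDS value silently falls back to the
// 120-second default rather than failing the harness.
fn timeout_from(raw: Option<&str>) -> Duration {
    raw.and_then(|value| value.parse::<u64>().ok())
        .map(Duration::from_secs)
        .unwrap_or(Duration::from_secs(120))
}

fn main() {
    println!("{:?}", timeout_from(Some("30")));
}
```

Keeping the parse separate from the env read is also what makes the fallback path testable without mutating process-global state.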
 /// Byte-level diff between two Mach-O images or section byte slices.
 ///
 /// Sprint 27 starts tolerating a very small allowlist: UUID bytes, dylib
@@ -1320,7 +1428,7 @@ fn read_artifacts(path: &Path) -> Result<Vec<ArtifactSpec>, String> {
                         path.display()
                     ));
                 }
-                (ArtifactKind::ClangDylib, None)
+                (ArtifactKind::Dylib, None)
             }
             "clang_archive" => {
                 if dep_name.is_some() {
@@ -1329,7 +1437,7 @@ fn read_artifacts(path: &Path) -> Result<Vec<ArtifactSpec>, String> {
                         path.display()
                     ));
                 }
-                (ArtifactKind::ClangArchive, None)
+                (ArtifactKind::Archive, None)
             }
             "clang_reexport_dylib" => {
                 let dep_name = dep_name.ok_or_else(|| {
@@ -1338,7 +1446,7 @@ fn read_artifacts(path: &Path) -> Result<Vec<ArtifactSpec>, String> {
                         path.display()
                     )
                 })?;
-                (ArtifactKind::ClangReexportDylib, Some(dep_name))
+                (ArtifactKind::ReexportDylib, Some(dep_name))
             }
             other => return Err(format!("unknown artifact kind `{other}`")),
         };
@@ -1364,6 +1472,7 @@ fn parse_command_check(name: &str) -> Result<CommandCheck, String> {
         "function_starts" => Ok(CommandCheck::FunctionStarts),
         "normalized_function_starts" => Ok(CommandCheck::NormalizedFunctionStarts),
         "data_in_code" => Ok(CommandCheck::DataInCode),
+        "data_in_code_if_present" => Ok(CommandCheck::DataInCodeIfPresent),
         "rebased_unwind_bytes" => Ok(CommandCheck::RebasedUnwindBytes),
         "dyld_info_rebase" => Ok(CommandCheck::DyldInfoRebase),
         "dyld_info_bind" => Ok(CommandCheck::DyldInfoBind),
@@ -1521,10 +1630,10 @@ fn tolerated_mask(bytes: &[u8]) -> Vec<Option<&'static str>> {
                 }
             }
             LC_ID_DYLIB | LC_LOAD_DYLIB | LC_LOAD_WEAK_DYLIB | LC_REEXPORT_DYLIB
-            | LC_LOAD_UPWARD_DYLIB => {
-                if cmdsize >= 16 {
-                    mark_range(&mut mask, cursor + 12, cursor + 16, "dylib timestamp");
-                }
+            | LC_LOAD_UPWARD_DYLIB
+                if cmdsize >= 16 =>
+            {
+                mark_range(&mut mask, cursor + 12, cursor + 16, "dylib timestamp");
             }
             _ => {}
         }
@@ -1578,7 +1687,7 @@ fn load_dylib_names(bytes: &[u8]) -> Result<Vec<String>, String> {
 struct CanonicalSymbolRecord {
     name: String,
     n_type: u8,
-    n_sect: u8,
+    section: Option<(String, String)>,
     n_desc: u16,
     value: u64,
 }
@@ -1614,31 +1723,42 @@ fn canonical_symbol_records(bytes: &[u8]) -> Result<Vec<CanonicalSymbolRecord>,
         parse_nlist_table(bytes, symtab.symoff, symtab.nsyms).map_err(|e| e.to_string())?;
     let strings =
         StringTable::from_file(bytes, symtab.stroff, symtab.strsize).map_err(|e| e.to_string())?;
-    let section_addrs = section_addrs(bytes)?;
+    let sections = section_regions(bytes)?;
     Ok(symbols
         .iter()
         .map(|symbol| {
-            let value = if symbol.kind() == SymKind::Sect && symbol.sect_idx() != 0 {
-                let section_addr = section_addrs[symbol.sect_idx() as usize - 1];
-                if symbol.value() >= section_addr {
-                    symbol.value() - section_addr
+            let (section, value) = if symbol.kind() == SymKind::Sect && symbol.sect_idx() != 0 {
+                let section = &sections[symbol.sect_idx() as usize - 1];
+                let value = if symbol.value() >= section.addr {
+                    symbol.value() - section.addr
                 } else {
                     symbol.value()
-                }
+                };
+                (
+                    Some((section.segname.clone(), section.sectname.clone())),
+                    value,
+                )
             } else {
-                symbol.value()
+                (None, symbol.value())
             };
             CanonicalSymbolRecord {
                 name: strings.get(symbol.strx()).unwrap().to_string(),
                 n_type: symbol.raw.n_type,
-                n_sect: symbol.raw.n_sect,
+                section,
                 n_desc: symbol.raw.n_desc,
                 value,
             }
         })
+        .filter(|record| !is_optional_dyld_stub_binder_record(record))
        .collect())
 }
 
+fn is_optional_dyld_stub_binder_record(record: &CanonicalSymbolRecord) -> bool {
+    record.name == "dyld_stub_binder"
+        && (record.n_type & N_TYPE) == N_UNDF
+        && record.section.is_none()
+}
+
 fn canonical_export_records(bytes: &[u8]) -> Result<Vec<CanonicalExportRecord>, String> {
     let dylib = DylibFile::parse("/tmp/canonical.dylib", bytes).map_err(|e| e.to_string())?;
     let symbol_values: BTreeMap<String, u64> = canonical_symbol_records(bytes)?
@@ -1683,7 +1803,7 @@ fn canonical_export_records(bytes: &[u8]) -> Result<Vec<CanonicalExportRecord>,
     Ok(out)
 }
 
-fn symbol_partition_names(bytes: &[u8]) -> Result<(Vec<String>, Vec<String>, Vec<String>), String> {
+fn symbol_partition_names(bytes: &[u8]) -> Result<SymbolPartitions, String> {
     let (symtab, dysymtab) = symtab_and_dysymtab(bytes)?;
     let symbols =
         parse_nlist_table(bytes, symtab.symoff, symtab.nsyms).map_err(|e| e.to_string())?;
@@ -1698,10 +1818,31 @@ fn symbol_partition_names(bytes: &[u8]) -> Result<(Vec<String>, Vec<String>, Vec
     Ok((
         names_for(dysymtab.ilocalsym, dysymtab.nlocalsym),
         names_for(dysymtab.iextdefsym, dysymtab.nextdefsym),
-        names_for(dysymtab.iundefsym, dysymtab.nundefsym),
+        names_for(dysymtab.iundefsym, dysymtab.nundefsym)
+            .into_iter()
+            .filter(|name| name != "dyld_stub_binder")
+            .collect(),
     ))
 }
 
+fn has_optional_dyld_stub_binder(bytes: &[u8]) -> Result<bool, String> {
+    let (symtab, _) = symtab_and_dysymtab(bytes)?;
+    let symbols =
+        parse_nlist_table(bytes, symtab.symoff, symtab.nsyms).map_err(|e| e.to_string())?;
+    let strings =
+        StringTable::from_file(bytes, symtab.stroff, symtab.strsize).map_err(|e| e.to_string())?;
+    Ok(symbols.iter().any(|symbol| {
+        strings
+            .get(symbol.strx())
+            .map(|name| {
+                name == "dyld_stub_binder"
+                    && (symbol.raw.n_type & N_TYPE) == N_UNDF
+                    && symbol.raw.n_sect == 0
+            })
+            .unwrap_or(false)
+    }))
+}
+
 fn raw_string_table(bytes: &[u8]) -> Result<Vec<u8>, String> {
     let (symtab, _) = symtab_and_dysymtab(bytes)?;
     let start = symtab.stroff as usize;
@@ -1709,6 +1850,14 @@ fn raw_string_table(bytes: &[u8]) -> Result<Vec<u8>, String> {
     Ok(bytes[start..end].to_vec())
 }
 
+fn effective_string_table_len(bytes: &[u8]) -> Result<usize, String> {
+    let mut len = raw_string_table(bytes)?.len();
+    if has_optional_dyld_stub_binder(bytes)? {
+        len = len.saturating_sub("dyld_stub_binder".len() + 1);
+    }
+    Ok(len)
+}
+
 pub fn string_table_within_five_percent(ours: usize, theirs: usize) -> bool {
     let delta = ours.abs_diff(theirs);
     delta * 20 <= theirs
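The integer comparison above deserves a note: `delta * 20 <= theirs` is an exact, float-free way of saying "the delta is at most 5% of theirs", since 1/20 = 5%. A minimal reproduction of the check:

```rust
// Same arithmetic as string_table_within_five_percent in the diff:
// multiplying the delta by 20 instead of dividing theirs by 20 avoids
// both floating point and integer-division truncation.
fn within_five_percent(ours: usize, theirs: usize) -> bool {
    let delta = ours.abs_diff(theirs);
    delta * 20 <= theirs
}

fn main() {
    // 105 vs 100: delta 5, 5 * 20 = 100 <= 100, so it passes exactly at 5%.
    println!("{}", within_five_percent(105, 100));
}
```

Note that `delta * 20` could overflow for astronomically large tables; for string-table sizes this is not a practical concern.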
@@ -1835,6 +1984,175 @@ fn canonical_data_in_code(bytes: &[u8]) -> Result<Vec<DataInCodeRecord>, String>
         .collect())
 }
 
+#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
+enum CanonicalBindLocation {
+    Section {
+        segname: String,
+        sectname: String,
+        offset: u64,
+    },
+    Segment {
+        segment_index: u8,
+        segment_offset: u64,
+    },
+}
+
+#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
+struct CanonicalBindRecord {
+    location: CanonicalBindLocation,
+    ordinal: i32,
+    symbol: String,
+    weak_import: bool,
+    bind_type: u8,
+    addend: i64,
+}
+
+fn canonical_bind_records(
+    bytes: &[u8],
+    kind: DyldInfoStreamKind,
+) -> Result<Vec<CanonicalBindRecord>, String> {
+    let stream = dyld_info_stream(bytes, kind)?;
+    let mut cursor = 0usize;
+    let mut segment_index = 0u8;
+    let mut segment_offset = 0u64;
+    let mut ordinal = 0i32;
+    let mut symbol = String::new();
+    let mut weak_import = false;
+    let mut bind_type = BIND_TYPE_POINTER;
+    let mut addend = 0i64;
+    let mut out = Vec::new();
+
+    while cursor < stream.len() {
+        let byte = stream[cursor];
+        cursor += 1;
+        let opcode = byte & BIND_OPCODE_MASK;
+        let imm = byte & BIND_IMMEDIATE_MASK;
+        match opcode {
+            BIND_OPCODE_DONE => break,
+            BIND_OPCODE_SET_DYLIB_ORDINAL_IMM => ordinal = imm as i32,
+            BIND_OPCODE_SET_DYLIB_ORDINAL_ULEB => {
+                let (value, used) =
+                    read_uleb(&stream[cursor..]).map_err(|e| format!("bind uleb: {e}"))?;
+                cursor += used;
+                ordinal = value as i32;
+            }
+            BIND_OPCODE_SET_DYLIB_SPECIAL_IMM => {
+                ordinal = if imm == 0 {
+                    0
+                } else {
+                    (((imm as i8) << 4) >> 4) as i32
+                };
+            }
+            BIND_OPCODE_SET_SYMBOL_TRAILING_FLAGS_IMM => {
+                let (value, used) = read_c_string(&stream[cursor..])?;
+                cursor += used;
+                symbol = value;
+                weak_import = (imm & BIND_SYMBOL_FLAGS_WEAK_IMPORT) != 0;
+            }
+            BIND_OPCODE_SET_TYPE_IMM => bind_type = imm,
+            BIND_OPCODE_SET_ADDEND_SLEB => {
+                let (value, used) =
+                    read_sleb(&stream[cursor..]).map_err(|e| format!("bind sleb: {e}"))?;
+                cursor += used;
+                addend = value;
+            }
+            BIND_OPCODE_SET_SEGMENT_AND_OFFSET_ULEB => {
+                let (value, used) =
+                    read_uleb(&stream[cursor..]).map_err(|e| format!("bind uleb: {e}"))?;
+                cursor += used;
+                segment_index = imm;
+                segment_offset = value;
+            }
+            BIND_OPCODE_ADD_ADDR_ULEB => {
+                let (value, used) =
+                    read_uleb(&stream[cursor..]).map_err(|e| format!("bind uleb: {e}"))?;
+                cursor += used;
+                segment_offset += value;
+            }
+            BIND_OPCODE_DO_BIND => {
+                out.push(CanonicalBindRecord {
+                    location: canonical_bind_location(bytes, segment_index, segment_offset)?,
+                    ordinal,
+                    symbol: symbol.clone(),
+                    weak_import,
+                    bind_type,
+                    addend,
+                });
+                segment_offset += 8;
+            }
+            BIND_OPCODE_DO_BIND_ADD_ADDR_ULEB => {
+                let (value, used) =
+                    read_uleb(&stream[cursor..]).map_err(|e| format!("bind uleb: {e}"))?;
+                cursor += used;
+                out.push(CanonicalBindRecord {
+                    location: canonical_bind_location(bytes, segment_index, segment_offset)?,
+                    ordinal,
+                    symbol: symbol.clone(),
+                    weak_import,
+                    bind_type,
+                    addend,
+                });
+                segment_offset += 8 + value;
+            }
+            BIND_OPCODE_DO_BIND_ADD_ADDR_IMM_SCALED => {
+                out.push(CanonicalBindRecord {
+                    location: canonical_bind_location(bytes, segment_index, segment_offset)?,
+                    ordinal,
+                    symbol: symbol.clone(),
+                    weak_import,
+                    bind_type,
+                    addend,
+                });
+                segment_offset += 8 + (imm as u64) * 8;
+            }
+            BIND_OPCODE_DO_BIND_ULEB_TIMES_SKIPPING_ULEB => {
+                let (count, count_used) =
+                    read_uleb(&stream[cursor..]).map_err(|e| format!("bind uleb: {e}"))?;
+                cursor += count_used;
+                let (skip, skip_used) =
+                    read_uleb(&stream[cursor..]).map_err(|e| format!("bind uleb: {e}"))?;
+                cursor += skip_used;
+                for _ in 0..count {
+                    out.push(CanonicalBindRecord {
+                        location: canonical_bind_location(bytes, segment_index, segment_offset)?,
+                        ordinal,
+                        symbol: symbol.clone(),
+                        weak_import,
+                        bind_type,
+                        addend,
+                    });
+                    segment_offset += 8 + skip;
+                }
+            }
+            other => return Err(format!("unsupported bind opcode 0x{other:02x}")),
+        }
+    }
+
+    normalize_bind_section_offsets(&mut out);
+    out.sort();
+    Ok(out)
+}
+
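The bind-opcode interpreter above leans on ULEB128 operands throughout. As a reference point, here is a minimal decoder in the usual DWARF/dyld encoding (7 payload bits per byte, least-significant group first, high bit as continuation); the real `read_uleb` in this crate may differ in its exact error handling.

```rust
// Minimal ULEB128 decoder sketch: returns (value, bytes consumed).
fn read_uleb(bytes: &[u8]) -> Result<(u64, usize), String> {
    let mut value = 0u64;
    let mut shift = 0u32;
    for (used, byte) in bytes.iter().enumerate() {
        value |= u64::from(byte & 0x7f) << shift;
        if byte & 0x80 == 0 {
            // Continuation bit clear: this was the last byte.
            return Ok((value, used + 1));
        }
        shift += 7;
        if shift >= 64 {
            return Err("uleb128 too large for u64".to_string());
        }
    }
    Err("truncated uleb128".to_string())
}

fn main() {
    // 624485 encodes as e5 8e 26 (the classic DWARF spec example).
    println!("{:?}", read_uleb(&[0xe5, 0x8e, 0x26]));
}
```

Sign-extended SLEB128 (used for `BIND_OPCODE_SET_ADDEND_SLEB`) differs only in propagating the sign bit of the final group.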
+fn normalize_bind_section_offsets(records: &mut [CanonicalBindRecord]) {
+    let mut next_offsets: BTreeMap<(String, String), u64> = BTreeMap::new();
+    records.sort();
+    for record in records.iter_mut() {
+        let CanonicalBindLocation::Section {
+            segname,
+            sectname,
+            offset,
+        } = &mut record.location
+        else {
+            continue;
+        };
+        let next = next_offsets
+            .entry((segname.clone(), sectname.clone()))
+            .or_insert(0);
+        *offset = *next;
+        *next += 8;
+    }
+}
+
 fn rebased_unwind_bytes(bytes: &[u8]) -> Result<Vec<u8>, String> {
     let header_base = segment_vmaddr(bytes, "__TEXT").unwrap_or(0);
     let text_base = output_section(bytes, "__TEXT", "__text")
@@ -1931,19 +2249,130 @@ fn symtab_and_dysymtab(
 }
 
 fn section_addrs(bytes: &[u8]) -> Result<Vec<u64>, String> {
+    Ok(section_regions(bytes)?
+        .into_iter()
+        .map(|section| section.addr)
+        .collect())
+}
+
+#[derive(Debug, Clone)]
+struct SegmentRegion {
+    index: u8,
+    segname: String,
+    vmaddr: u64,
+    vmsize: u64,
+}
+
+#[derive(Debug, Clone)]
+struct SectionRegion {
+    segment_index: u8,
+    segname: String,
+    sectname: String,
+    addr: u64,
+    size: u64,
+}
+
+#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
+struct CanonicalSectionLocation {
+    segname: String,
+    sectname: String,
+    offset: u64,
+}
+
+fn segment_regions(bytes: &[u8]) -> Result<Vec<SegmentRegion>, String> {
+    let header = parse_header(bytes).map_err(|e| e.to_string())?;
+    let commands = parse_commands(&header, bytes).map_err(|e| e.to_string())?;
+    let mut out = Vec::new();
+    let mut index = 0u8;
+    for cmd in commands {
+        if let LoadCommand::Segment64(seg) = cmd {
+            out.push(SegmentRegion {
+                index,
+                segname: seg.segname_str().to_string(),
+                vmaddr: seg.vmaddr,
+                vmsize: seg.vmsize,
+            });
+            index = index.saturating_add(1);
+        }
+    }
+    Ok(out)
+}
+
+fn section_regions(bytes: &[u8]) -> Result<Vec<SectionRegion>, String> {
     let header = parse_header(bytes).map_err(|e| e.to_string())?;
     let commands = parse_commands(&header, bytes).map_err(|e| e.to_string())?;
     let mut out = Vec::new();
+    let mut segment_index = 0u8;
     for cmd in commands {
         if let LoadCommand::Segment64(seg) = cmd {
             for section in seg.sections {
-                out.push(section.addr);
+                out.push(SectionRegion {
+                    segment_index,
+                    segname: section.segname_str().to_string(),
+                    sectname: section.sectname_str().to_string(),
+                    addr: section.addr,
+                    size: section.size,
+                });
             }
+            segment_index = segment_index.saturating_add(1);
         }
     }
     Ok(out)
 }
 
+fn canonical_bind_location(
+    bytes: &[u8],
+    segment_index: u8,
+    segment_offset: u64,
+) -> Result<CanonicalBindLocation, String> {
+    let segments = segment_regions(bytes)?;
+    let sections = section_regions(bytes)?;
+    let Some(segment) = segments
+        .iter()
+        .find(|segment| segment.index == segment_index)
+    else {
+        return Ok(CanonicalBindLocation::Segment {
+            segment_index,
+            segment_offset,
+        });
+    };
+    if segment_offset >= segment.vmsize {
+        return Ok(CanonicalBindLocation::Segment {
+            segment_index,
+            segment_offset,
+        });
+    }
+    let addr = segment.vmaddr + segment_offset;
+    if let Some(section) = sections.iter().find(|section| {
+        section.segment_index == segment_index
+            && section.addr <= addr
+            && addr < section.addr + section.size
+    }) {
+        return Ok(CanonicalBindLocation::Section {
+            segname: section.segname.clone(),
+            sectname: section.sectname.clone(),
+            offset: addr - section.addr,
+        });
+    }
+    Ok(CanonicalBindLocation::Segment {
+        segment_index,
+        segment_offset,
+    })
+}
+
+fn canonical_section_location(bytes: &[u8], addr: u64) -> Result<CanonicalSectionLocation, String> {
+    let sections = section_regions(bytes)?;
+    let section = sections
+        .into_iter()
+        .find(|section| section.addr <= addr && addr < section.addr + section.size)
+        .ok_or_else(|| format!("address 0x{addr:x} is not inside any output section"))?;
+    Ok(CanonicalSectionLocation {
+        segname: section.segname,
+        sectname: section.sectname,
+        offset: addr - section.addr,
+    })
+}
+
 #[derive(Clone, Copy)]
 enum DyldInfoStreamKind {
     Rebase,
@@ -1983,6 +2412,157 @@ fn dyld_info_stream(bytes: &[u8], kind: DyldInfoStreamKind) -> Result<Vec<u8>, S
         .ok_or_else(|| "dyld-info stream out of bounds".to_string())
 }
 
+fn read_c_string(bytes: &[u8]) -> Result<(String, usize), String> {
+    let end = bytes
+        .iter()
+        .position(|byte| *byte == 0)
+        .ok_or_else(|| "unterminated C string".to_string())?;
+    let value = std::str::from_utf8(&bytes[..end])
+        .map_err(|e| format!("utf-8 in C string: {e}"))?
+        .to_string();
+    Ok((value, end + 1))
+}
+
+fn canonical_stub_targets(bytes: &[u8]) -> Result<Vec<u64>, String> {
+    let header = output_section_header(bytes, "__TEXT", "__stubs")
+        .ok_or_else(|| "missing __TEXT,__stubs section".to_string())?;
+    let (section_addr, section_bytes) = output_section(bytes, "__TEXT", "__stubs")
+        .ok_or_else(|| "missing __TEXT,__stubs section".to_string())?;
+    if section_bytes.is_empty() {
+        return Ok(Vec::new());
+    }
+    let stub_size = usize::try_from(header.reserved2)
+        .ok()
+        .filter(|size| *size > 0)
+        .unwrap_or(12);
+    if section_bytes.len() % stub_size != 0 {
+        return Err(format!(
+            "__TEXT,__stubs size {} is not a multiple of stub size {}",
+            section_bytes.len(),
+            stub_size
+        ));
+    }
+    let mut out = Vec::new();
+    for (idx, chunk) in section_bytes.chunks_exact(stub_size).enumerate() {
+        out.push(decode_stub_target(
+            chunk,
+            section_addr + (idx * stub_size) as u64,
+        )?);
+    }
+    Ok(out)
+}
+
+#[derive(Debug, Clone, PartialEq, Eq)]
+struct CanonicalStubHelper {
+    dyld_private: CanonicalSectionLocation,
+    binder_got: CanonicalSectionLocation,
+    lazy_bind_offsets: Vec<u32>,
+}
+
+fn canonical_stub_helper(bytes: &[u8]) -> Result<CanonicalStubHelper, String> {
+    let (section_addr, section_bytes) = output_section(bytes, "__TEXT", "__stub_helper")
+        .ok_or_else(|| "missing __TEXT,__stub_helper section".to_string())?;
+    if section_bytes.len() < STUB_HELPER_HEADER_SIZE as usize {
+        return Err(format!(
+            "__TEXT,__stub_helper is too small for header: {} < {}",
+            section_bytes.len(),
+            STUB_HELPER_HEADER_SIZE
+        ));
+    }
+    let dyld_private_target =
+        decode_page_reference(&section_bytes, section_addr, 0, PageRefKind::Add)?;
+    let binder_got_target =
+        decode_page_reference(&section_bytes, section_addr, 12, PageRefKind::Load)?;
+    let dyld_private = canonical_section_location(bytes, dyld_private_target)?;
+    let binder_got = canonical_section_location(bytes, binder_got_target)?;
+
+    let entry_bytes = &section_bytes[STUB_HELPER_HEADER_SIZE as usize..];
+    if entry_bytes.len() % STUB_HELPER_ENTRY_SIZE as usize != 0 {
+        return Err(format!(
+            "__TEXT,__stub_helper entries {} are not a multiple of {}",
+            entry_bytes.len(),
+            STUB_HELPER_ENTRY_SIZE
+        ));
+    }
+
+    let mut lazy_bind_offsets = Vec::new();
+    for (idx, chunk) in entry_bytes
+        .chunks_exact(STUB_HELPER_ENTRY_SIZE as usize)
+        .enumerate()
+    {
+        let entry_addr = section_addr
+            + STUB_HELPER_HEADER_SIZE as u64
+            + (idx as u64) * STUB_HELPER_ENTRY_SIZE as u64;
+        let ldr = read_insn(chunk, 0)?;
+        if ldr != 0x1800_0050 {
+            return Err(format!(
+                "stub helper entry at 0x{entry_addr:x} does not start with LDR literal"
+            ));
+        }
+        let branch = read_insn(chunk, 4)?;
+        let branch_target = decode_branch26_target(branch, entry_addr + 4)?;
+        if branch_target != section_addr {
+            return Err(format!(
+                "stub helper entry at 0x{entry_addr:x} branches to 0x{branch_target:x}, expected header 0x{section_addr:x}"
+            ));
+        }
+        lazy_bind_offsets.push(u32_le(&chunk[8..12]));
+    }
+
+    Ok(CanonicalStubHelper {
+        dyld_private,
+        binder_got,
+        lazy_bind_offsets,
+    })
+}
+
+fn decode_stub_target(bytes: &[u8], stub_addr: u64) -> Result<u64, String> {
+    let adrp = read_insn(bytes, 0)?;
+    let ldr = read_insn(bytes, 4)?;
+    let br = read_insn(bytes, 8)?;
+    if (adrp & 0x9f00_0000) != 0x9000_0000 {
+        return Err(format!("stub at 0x{stub_addr:x} does not start with ADRP"));
+    }
+    if (ldr & 0xffc0_0000) != 0xf940_0000 {
+        return Err(format!(
+            "stub at 0x{stub_addr:x} does not use LDR (unsigned)"
+        ));
+    }
+    if (br & 0xffff_fc1f) != 0xd61f_0000 {
+        return Err(format!("stub at 0x{stub_addr:x} does not end with BR"));
+    }
+    let adrp_reg = (adrp & 0x1f) as u8;
+    let ldr_base = ((ldr >> 5) & 0x1f) as u8;
+    let ldr_reg = (ldr & 0x1f) as u8;
+    let br_reg = ((br >> 5) & 0x1f) as u8;
+    if adrp_reg != ldr_base || adrp_reg != ldr_reg || adrp_reg != br_reg {
+        return Err(format!(
+            "stub at 0x{stub_addr:x} uses inconsistent scratch regs: adrp=x{adrp_reg}, ldr base=x{ldr_base}, ldr rt=x{ldr_reg}, br=x{br_reg}"
+        ));
+    }
+    let adrp_immlo = ((adrp >> 29) & 0x3) as i64;
+    let adrp_immhi = ((adrp >> 5) & 0x7ffff) as i64;
+    let adrp_pages = sign_extend_21((adrp_immhi << 2) | adrp_immlo);
+    let adrp_base = ((stub_addr as i64) & !0xfff) + (adrp_pages << 12);
+    let scaled = ((ldr >> 10) & 0xfff) as u64;
+    Ok((adrp_base as u64) + scaled * 8)
+}
+
+fn decode_branch26_target(insn: u32, place: u64) -> Result<u64, String> {
+    if (insn & 0xfc00_0000) != 0x1400_0000 {
+        return Err(format!(
+            "instruction 0x{insn:08x} at 0x{place:x} is not a B/BL branch26"
+        ));
+    }
+    let imm26 = sign_extend_26((insn & 0x03ff_ffff) as i64);
+    Ok(((place as i64) + (imm26 << 2)) as u64)
+}
+
+fn sign_extend_26(value: i64) -> i64 {
+    let shift = 64 - 26;
+    (value << shift) >> shift
+}
+
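The branch26 decode above is compact enough to verify in isolation: an AArch64 `B` instruction stores a signed 26-bit word offset, so the target is the instruction's own address plus `imm26 * 4` after sign extension. A standalone sketch (returning `Option` instead of the diff's `Result`, purely to keep it short):

```rust
// Shift-left-then-arithmetic-shift-right sign extension of a 26-bit field.
fn sign_extend_26(value: i64) -> i64 {
    let shift = 64 - 26;
    (value << shift) >> shift
}

// Decode the target of an AArch64 unconditional B (top 6 bits 000101).
fn decode_branch26_target(insn: u32, place: u64) -> Option<u64> {
    if (insn & 0xfc00_0000) != 0x1400_0000 {
        return None; // not an unconditional-branch encoding
    }
    let imm26 = sign_extend_26((insn & 0x03ff_ffff) as i64);
    Some(((place as i64) + (imm26 << 2)) as u64)
}

fn main() {
    // 0x14000001 is "b .+4"; 0x17ffffff is "b .-4".
    println!("{:?}", decode_branch26_target(0x14000001, 0x1000));
}
```

The same shift trick drives `sign_extend_21` for the ADRP immediate in `decode_stub_target`, just with a 21-bit field.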
 fn symbol_values(bytes: &[u8]) -> Result<BTreeMap<String, u64>, String> {
     let header = parse_header(bytes).map_err(|e| e.to_string())?;
     let commands = parse_commands(&header, bytes).map_err(|e| e.to_string())?;
tests/determinism.rs (added)
@@ -0,0 +1,348 @@
+//! Sprint 28 determinism guardrails.
+//!
+//! Parallel speedups are only safe if they never perturb the final image. This
+//! test repeatedly links a multi-object executable and requires byte-identical
+//! output across concurrent runs.
+
+mod common;
+
+use std::collections::VecDeque;
+use std::fs;
+use std::path::{Path, PathBuf};
+use std::process::Command;
+use std::sync::{Arc, Mutex};
+use std::thread;
+use std::time::{SystemTime, UNIX_EPOCH};
+
+use afs_ld::{LinkOptions, Linker, OutputKind};
+use common::harness::{assemble, have_xcrun, have_xcrun_tool};
+
+const DEFAULT_RUNS: usize = 100;
+
+#[test]
+fn repeated_parallel_links_are_byte_identical() {
+    if !have_xcrun() || !have_xcrun_tool("as") {
+        eprintln!("skipping: xcrun as unavailable");
+        return;
+    }
+
+    let root = unique_temp_dir("determinism").expect("create determinism temp dir");
+    let main_obj = root.join("main.o");
+    assemble(
+        "\
+        .section __TEXT,__text,regular,pure_instructions\n\
+        .globl _main\n\
+        _main:\n\
+            bl _helper\n\
+            adrp x8, _value@GOTPAGE\n\
+            ldr x8, [x8, _value@GOTPAGEOFF]\n\
+            ldr w0, [x8]\n\
+            ret\n\
+\n\
+        .subsections_via_symbols\n",
+        &main_obj,
+    )
+    .expect("assemble determinism main fixture");
+    let helper_obj = root.join("helper.o");
+    assemble(
+        "\
+        .section __TEXT,__text,regular,pure_instructions\n\
+        .globl _helper\n\
+        _helper:\n\
+            ret\n\
+\n\
+        .subsections_via_symbols\n",
+        &helper_obj,
+    )
+    .expect("assemble determinism helper fixture");
+    let data_obj = root.join("data.o");
+    assemble(
+        "\
+        .section __DATA,__data\n\
+        .globl _value\n\
+        .p2align 2\n\
+        _value:\n\
+            .long 7\n\
+\n\
+        .subsections_via_symbols\n",
+        &data_obj,
+    )
+    .expect("assemble determinism data fixture");
+
+    let inputs = vec![main_obj, helper_obj, data_obj];
+    assert_repeated_links_identical(inputs, &root, "objects");
+
+    let _ = fs::remove_dir_all(root);
+}
+
+#[test]
79
+fn repeated_parallel_archive_fetches_are_byte_identical() {
80
+    if !have_xcrun() || !have_xcrun_tool("as") {
81
+        eprintln!("skipping: xcrun as unavailable");
82
+        return;
83
+    }
84
+
85
+    let root = unique_temp_dir("archive-determinism").expect("create archive determinism temp dir");
86
+    let main_obj = root.join("main.o");
87
+    assemble(
88
+        "\
89
+        .section __TEXT,__text,regular,pure_instructions\n\
90
+        .globl _main\n\
91
+        _main:\n\
92
+            bl _helper_a\n\
93
+            bl _helper_b\n\
94
+            mov w0, #0\n\
95
+            ret\n\
96
+\n\
97
+        .subsections_via_symbols\n",
98
+        &main_obj,
99
+    )
100
+    .expect("assemble archive determinism main fixture");
101
+    let helper_a_obj = root.join("helper_a.o");
102
+    assemble(
103
+        "\
104
+        .section __TEXT,__text,regular,pure_instructions\n\
105
+        .globl _helper_a\n\
106
+        _helper_a:\n\
107
+            ret\n\
108
+\n\
109
+        .subsections_via_symbols\n",
110
+        &helper_a_obj,
111
+    )
112
+    .expect("assemble archive determinism helper_a fixture");
113
+    let helper_b_obj = root.join("helper_b.o");
114
+    assemble(
115
+        "\
116
+        .section __TEXT,__text,regular,pure_instructions\n\
117
+        .globl _helper_b\n\
118
+        _helper_b:\n\
119
+            ret\n\
120
+\n\
121
+        .subsections_via_symbols\n",
122
+        &helper_b_obj,
123
+    )
124
+    .expect("assemble archive determinism helper_b fixture");
125
+    let unused_obj = root.join("unused.o");
126
+    assemble(
127
+        "\
128
+        .section __TEXT,__text,regular,pure_instructions\n\
129
+        .globl _unused\n\
130
+        _unused:\n\
131
+            ret\n\
132
+\n\
133
+        .subsections_via_symbols\n",
134
+        &unused_obj,
135
+    )
136
+    .expect("assemble archive determinism unused fixture");
137
+
138
+    let archive_path = root.join("libhelpers.a");
139
+    if let Err(error) = archive(&[helper_a_obj, helper_b_obj, unused_obj], &archive_path) {
140
+        eprintln!("skipping: archive failed: {error}");
141
+        let _ = fs::remove_dir_all(root);
142
+        return;
143
+    }
144
+
145
+    assert_repeated_links_identical(vec![main_obj, archive_path], &root, "archive");
146
+
147
+    let _ = fs::remove_dir_all(root);
148
+}
149
+
150
+#[test]
151
+fn relocation_workers_match_single_worker_for_many_atoms() {
152
+    if !have_xcrun() || !have_xcrun_tool("as") {
153
+        eprintln!("skipping: xcrun as unavailable");
154
+        return;
155
+    }
156
+
157
+    let root = unique_temp_dir("reloc-workers").expect("create relocation worker temp dir");
158
+    let text_obj = root.join("text.o");
159
+    let data_obj = root.join("data.o");
160
+
161
+    let mut asm = String::from(
162
+        "\
163
+        .section __TEXT,__text,regular,pure_instructions\n\
164
+        .globl _main\n\
165
+        _main:\n",
166
+    );
167
+    for index in 0..64 {
168
+        asm.push_str(&format!("            bl _helper_{index}\n"));
169
+    }
170
+    asm.push_str(
171
+        "\
172
+            adrp x8, _value@GOTPAGE\n\
173
+            ldr x8, [x8, _value@GOTPAGEOFF]\n\
174
+            ldr w0, [x8]\n\
175
+            ret\n\
176
+\n",
177
+    );
178
+    for index in 0..64 {
179
+        asm.push_str(&format!(
180
+            "\
181
+        .globl _helper_{index}\n\
182
+        _helper_{index}:\n\
183
+            adrp x9, _value@GOTPAGE\n\
184
+            ldr x9, [x9, _value@GOTPAGEOFF]\n\
185
+            ldr w9, [x9]\n\
186
+            ret\n\
187
+\n"
188
+        ));
189
+    }
190
+    asm.push_str("        .subsections_via_symbols\n");
191
+
192
+    assemble(&asm, &text_obj).expect("assemble relocation worker text fixture");
193
+    assemble(
194
+        "\
195
+        .section __DATA,__data\n\
196
+        .globl _value\n\
197
+        .p2align 2\n\
198
+        _value:\n\
199
+            .long 11\n\
200
+\n\
201
+        .subsections_via_symbols\n",
202
+        &data_obj,
203
+    )
204
+    .expect("assemble relocation worker data fixture");
205
+
206
+    let inputs = vec![text_obj, data_obj];
207
+    let serial =
208
+        link_once_with_jobs(&inputs, &root, "reloc-workers-serial", Some(1)).expect("serial link");
209
+    let parallel = link_once_with_jobs(&inputs, &root, "reloc-workers-parallel", Some(8))
210
+        .expect("parallel link");
211
+    assert_eq!(
212
+        parallel, serial,
213
+        "parallel relocation workers changed final output bytes"
214
+    );
215
+
216
+    let _ = fs::remove_dir_all(root);
217
+}
218
+
219
+fn assert_repeated_links_identical(inputs: Vec<PathBuf>, root: &Path, label: &str) {
220
+    let baseline = link_once(&inputs, root, &format!("{label}-baseline"))
221
+        .expect("baseline deterministic link");
222
+    let serial = link_once_with_jobs(&inputs, root, &format!("{label}-serial"), Some(1))
223
+        .expect("single-worker deterministic link");
224
+    assert_eq!(
225
+        serial, baseline,
226
+        "{label}: single-worker link differed from default parallel link"
227
+    );
228
+    let run_count = determinism_run_count();
229
+    let jobs = determinism_jobs(run_count);
230
+    let queue = Arc::new(Mutex::new((0..run_count).collect::<VecDeque<_>>()));
231
+    let errors = Arc::new(Mutex::new(Vec::new()));
232
+
233
+    thread::scope(|scope| {
234
+        for _ in 0..jobs {
235
+            let queue = Arc::clone(&queue);
236
+            let errors = Arc::clone(&errors);
237
+            let baseline = baseline.clone();
238
+            let inputs = inputs.clone();
239
+            scope.spawn(move || loop {
240
+                let Some(index) = queue
241
+                    .lock()
242
+                    .expect("determinism queue mutex poisoned")
243
+                    .pop_front()
244
+                else {
245
+                    break;
246
+                };
247
+                match link_once(&inputs, root, &format!("{label}-run-{index:03}")) {
248
+                    Ok(bytes) if bytes == baseline => {}
249
+                    Ok(bytes) => errors
250
+                        .lock()
251
+                        .expect("determinism errors mutex poisoned")
252
+                        .push(format!(
253
+                            "run {index} differed: baseline={} bytes, output={} bytes",
254
+                            baseline.len(),
255
+                            bytes.len()
256
+                        )),
257
+                    Err(error) => errors
258
+                        .lock()
259
+                        .expect("determinism errors mutex poisoned")
260
+                        .push(format!("run {index} failed: {error}")),
261
+                }
262
+            });
263
+        }
264
+    });
265
+
266
+    let errors = errors
267
+        .lock()
268
+        .expect("determinism errors mutex poisoned")
269
+        .clone();
270
+    assert!(
271
+        errors.is_empty(),
272
+        "parallel deterministic links diverged:\n{}",
273
+        errors.join("\n")
274
+    );
275
+}
276
+
277
+fn link_once(inputs: &[PathBuf], root: &Path, run_name: &str) -> Result<Vec<u8>, String> {
278
+    link_once_with_jobs(inputs, root, run_name, None)
279
+}
280
+
281
+fn link_once_with_jobs(
282
+    inputs: &[PathBuf],
283
+    root: &Path,
284
+    run_name: &str,
285
+    jobs: Option<usize>,
286
+) -> Result<Vec<u8>, String> {
287
+    let dir = root.join(run_name);
288
+    fs::create_dir_all(&dir).map_err(|e| format!("create {}: {e}", dir.display()))?;
289
+    let out = dir.join("deterministic.out");
290
+    let opts = LinkOptions {
291
+        inputs: inputs.to_vec(),
292
+        output: Some(out.clone()),
293
+        kind: OutputKind::Executable,
294
+        jobs,
295
+        ..LinkOptions::default()
296
+    };
297
+    Linker::run(&opts).map_err(|e| format!("link {}: {e}", out.display()))?;
298
+    fs::read(&out).map_err(|e| format!("read {}: {e}", out.display()))
299
+}
300
+
301
+fn archive(objects: &[PathBuf], out: &Path) -> Result<(), String> {
302
+    let output = Command::new("libtool")
303
+        .arg("-static")
304
+        .arg("-o")
305
+        .arg(out)
306
+        .args(objects)
307
+        .output()
308
+        .map_err(|e| format!("spawn libtool: {e}"))?;
309
+    if !output.status.success() {
310
+        return Err(format!(
311
+            "libtool failed: {}",
312
+            String::from_utf8_lossy(&output.stderr)
313
+        ));
314
+    }
315
+    Ok(())
316
+}
317
+
318
+fn determinism_run_count() -> usize {
319
+    std::env::var("AFS_LD_DETERMINISM_RUNS")
320
+        .ok()
321
+        .and_then(|raw| raw.parse::<usize>().ok())
322
+        .filter(|runs| *runs > 0)
323
+        .unwrap_or(DEFAULT_RUNS)
324
+}
325
+
326
+fn determinism_jobs(run_count: usize) -> usize {
327
+    std::env::var("AFS_LD_DETERMINISM_JOBS")
328
+        .ok()
329
+        .and_then(|raw| raw.parse::<usize>().ok())
330
+        .filter(|jobs| *jobs > 0)
331
+        .unwrap_or_else(|| {
332
+            thread::available_parallelism()
333
+                .map(usize::from)
334
+                .unwrap_or(1)
335
+        })
336
+        .min(run_count)
337
+        .max(1)
338
+}
339
+
340
+fn unique_temp_dir(name: &str) -> Result<PathBuf, String> {
341
+    let stamp = SystemTime::now()
342
+        .duration_since(UNIX_EPOCH)
343
+        .map_err(|e| format!("clock error: {e}"))?
344
+        .as_nanos();
345
+    let dir = std::env::temp_dir().join(format!("afs-ld-{name}-{}-{stamp}", std::process::id()));
346
+    fs::create_dir_all(&dir).map_err(|e| format!("create {}: {e}", dir.display()))?;
347
+    Ok(dir)
348
+}
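The worker-count selection in `determinism_jobs` above (explicit request wins, otherwise fall back to the machine's parallelism, then clamp into `[1, run_count]`) can be sketched standalone. This is a sketch against only the standard library; `clamp_jobs` is an illustrative name, not part of the crate:

```rust
use std::thread;

// Mirrors the clamping in `determinism_jobs`: a positive explicit request
// wins, otherwise fall back to available parallelism; the result is then
// capped at the run count and floored at one worker.
fn clamp_jobs(requested: Option<usize>, run_count: usize) -> usize {
    requested
        .filter(|jobs| *jobs > 0)
        .unwrap_or_else(|| {
            thread::available_parallelism()
                .map(usize::from)
                .unwrap_or(1)
        })
        .min(run_count)
        .max(1)
}
```

Capping at `run_count` matters for the tests above: spawning more workers than queued runs would only create idle threads.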
tests/linker_write_integration.rs added
@@ -0,0 +1,377 @@
+use std::fs;
+use std::path::{Path, PathBuf};
+use std::process::Command;
+
+use afs_ld::macho::constants::{LC_LOAD_DYLIB, LC_SOURCE_VERSION, LC_UUID, MH_DYLIB, MH_EXECUTE};
+use afs_ld::macho::reader::{parse_commands, parse_header, LoadCommand, Section64Header};
+
+fn have_xcrun() -> bool {
+    Command::new("xcrun")
+        .arg("-f")
+        .arg("as")
+        .output()
+        .map(|o| o.status.success())
+        .unwrap_or(false)
+}
+
+fn have_tool(name: &str) -> bool {
+    Command::new(name)
+        .arg("-h")
+        .output()
+        .map(|_| true)
+        .unwrap_or(false)
+}
+
+fn have_clang() -> bool {
+    Command::new("xcrun")
+        .arg("-f")
+        .arg("clang")
+        .output()
+        .map(|o| o.status.success())
+        .unwrap_or(false)
+}
+
+fn assemble(src_text: &str, out: &Path) -> Result<(), String> {
+    let tmp = std::env::temp_dir().join(format!(
+        "afs-ld-link-write-{}-{}.s",
+        std::process::id(),
+        out.file_stem().and_then(|s| s.to_str()).unwrap_or("t")
+    ));
+    fs::write(&tmp, src_text).map_err(|e| format!("write .s: {e}"))?;
+    let status = Command::new("xcrun")
+        .args(["--sdk", "macosx", "as", "-arch", "arm64"])
+        .arg(&tmp)
+        .arg("-o")
+        .arg(out)
+        .output()
+        .map_err(|e| format!("spawn xcrun as: {e}"))?;
+    let _ = fs::remove_file(&tmp);
+    if !status.status.success() {
+        return Err(format!(
+            "xcrun as failed: {}",
+            String::from_utf8_lossy(&status.stderr)
+        ));
+    }
+    Ok(())
+}
+
+fn build_test_dylib(src: &str, out: &Path, install_name: &str) -> Result<(), String> {
+    let mut child = Command::new("xcrun")
+        .args([
+            "--sdk", "macosx", "clang", "-x", "c", "-arch", "arm64", "-shared", "-o",
+        ])
+        .arg(out)
+        .arg("-install_name")
+        .arg(install_name)
+        .arg("-")
+        .stdin(std::process::Stdio::piped())
+        .stdout(std::process::Stdio::piped())
+        .stderr(std::process::Stdio::piped())
+        .spawn()
+        .map_err(|e| format!("spawn clang: {e}"))?;
+    use std::io::Write;
+    child
+        .stdin
+        .as_mut()
+        .unwrap()
+        .write_all(src.as_bytes())
+        .map_err(|e| format!("write clang stdin: {e}"))?;
+    let out = child.wait_with_output().map_err(|e| format!("wait: {e}"))?;
+    if !out.status.success() {
+        return Err(format!(
+            "clang failed: {}",
+            String::from_utf8_lossy(&out.stderr)
+        ));
+    }
+    Ok(())
+}
+
+fn scratch(name: &str) -> PathBuf {
+    std::env::temp_dir().join(format!("afs-ld-link-write-{}-{name}", std::process::id()))
+}
+
+fn link_with_afs_ld(args: &[&str]) -> Result<(), String> {
+    let exe = env!("CARGO_BIN_EXE_afs-ld");
+    let out = Command::new(exe)
+        .args(args)
+        .output()
+        .map_err(|e| format!("spawn afs-ld: {e}"))?;
+    if !out.status.success() {
+        return Err(format!(
+            "afs-ld failed: {}\n{}",
+            out.status,
+            String::from_utf8_lossy(&out.stderr)
+        ));
+    }
+    Ok(())
+}
+
+fn section<'a>(cmds: &'a [LoadCommand], seg: &str, sect: &str) -> &'a Section64Header {
+    cmds.iter()
+        .find_map(|cmd| match cmd {
+            LoadCommand::Segment64(segment) => segment
+                .sections
+                .iter()
+                .find(|s| s.segname_str() == seg && s.sectname_str() == sect),
+            _ => None,
+        })
+        .unwrap_or_else(|| panic!("missing section {seg},{sect}"))
+}
+
+fn run_otool_lv(path: &Path) -> Result<String, String> {
+    let out = Command::new("otool")
+        .arg("-lV")
+        .arg(path)
+        .output()
+        .map_err(|e| format!("spawn otool -lV: {e}"))?;
+    if !out.status.success() {
+        return Err(format!(
+            "otool -lV failed: {}",
+            String::from_utf8_lossy(&out.stderr)
+        ));
+    }
+    Ok(String::from_utf8_lossy(&out.stdout).into_owned())
+}
+
+fn fixture_source() -> &'static str {
+    r#"
+        .section __TEXT,__text,regular,pure_instructions
+        .globl _main
+        .p2align 2
+        _main:
+            ret
+
+        .section __TEXT,__cstring,cstring_literals
+        _lit:
+            .asciz "hi"
+
+        .section __DATA,__data
+        .globl _num
+        .p2align 3
+        _num:
+            .quad 0x1122334455667788
+    "#
+}
+
+#[test]
+fn linker_writes_executable_with_real_section_bytes() {
+    if !have_xcrun() {
+        eprintln!("skipping: xcrun as unavailable");
+        return;
+    }
+
+    let obj = scratch("fixture.o");
+    let out = scratch("linked-exec");
+    if let Err(e) = assemble(fixture_source(), &obj) {
+        eprintln!("skipping: assemble failed: {e}");
+        return;
+    }
+    link_with_afs_ld(&[obj.to_str().unwrap(), "-o", out.to_str().unwrap()])
+        .expect("link executable");
+
+    let bytes = fs::read(&out).expect("read executable");
+    let hdr = parse_header(&bytes).expect("parse header");
+    assert_eq!(hdr.filetype, MH_EXECUTE);
+    let cmds = parse_commands(&hdr, &bytes).expect("parse commands");
+
+    let text = section(&cmds, "__TEXT", "__text");
+    assert_eq!(
+        &bytes[text.offset as usize..text.offset as usize + text.size as usize],
+        &[0xc0, 0x03, 0x5f, 0xd6]
+    );
+
+    let cstring = section(&cmds, "__TEXT", "__cstring");
+    assert_eq!(
+        &bytes[cstring.offset as usize..cstring.offset as usize + cstring.size as usize],
+        b"hi\0"
+    );
+
+    let data = section(&cmds, "__DATA", "__data");
+    assert_eq!(
+        &bytes[data.offset as usize..data.offset as usize + data.size as usize],
+        &0x1122_3344_5566_7788u64.to_le_bytes()
+    );
+
+    if have_tool("otool") {
+        let dump = run_otool_lv(&out).expect("otool -lV");
+        assert!(dump.contains("segname __TEXT"));
+        assert!(dump.contains("sectname __cstring"));
+        assert!(dump.contains("segname __DATA"));
+    }
+
+    let _ = fs::remove_file(&obj);
+    let _ = fs::remove_file(&out);
+}
+
+#[test]
+fn linker_writes_dylib_with_real_text_section() {
+    if !have_xcrun() {
+        eprintln!("skipping: xcrun as unavailable");
+        return;
+    }
+
+    let obj = scratch("fixture-dylib.o");
+    let out = scratch("libfixture.dylib");
+    if let Err(e) = assemble(fixture_source(), &obj) {
+        eprintln!("skipping: assemble failed: {e}");
+        return;
+    }
+    link_with_afs_ld(&["-dylib", obj.to_str().unwrap(), "-o", out.to_str().unwrap()])
+        .expect("link dylib");
+
+    let bytes = fs::read(&out).expect("read dylib");
+    let hdr = parse_header(&bytes).expect("parse header");
+    assert_eq!(hdr.filetype, MH_DYLIB);
+    let cmds = parse_commands(&hdr, &bytes).expect("parse commands");
+
+    let text = section(&cmds, "__TEXT", "__text");
+    assert_eq!(
+        &bytes[text.offset as usize..text.offset as usize + text.size as usize],
+        &[0xc0, 0x03, 0x5f, 0xd6]
+    );
+
+    let _ = fs::remove_file(&obj);
+    let _ = fs::remove_file(&out);
+}
+
+#[test]
+fn linker_emits_load_dylib_for_direct_dependency_input() {
+    if !have_xcrun() || !have_clang() {
+        eprintln!("skipping: xcrun as / clang unavailable");
+        return;
+    }
+
+    let obj = scratch("dep-main.o");
+    let dep = scratch("libdep.dylib");
+    let out = scratch("dep-linked");
+    if let Err(e) = assemble(
+        r#"
+            .section __TEXT,__text,regular,pure_instructions
+            .globl _main
+            _main:
+                ret
+        "#,
+        &obj,
+    ) {
+        eprintln!("skipping: assemble failed: {e}");
+        return;
+    }
+    if let Err(e) = build_test_dylib(
+        "int afsld_dep(void) { return 7; }\n",
+        &dep,
+        "@rpath/libafslddep.dylib",
+    ) {
+        eprintln!("skipping: clang failed: {e}");
+        return;
+    }
+
+    link_with_afs_ld(&[
+        obj.to_str().unwrap(),
+        dep.to_str().unwrap(),
+        "-o",
+        out.to_str().unwrap(),
+    ])
+    .expect("link executable with dylib");
+
+    let bytes = fs::read(&out).expect("read executable");
+    let hdr = parse_header(&bytes).expect("parse header");
+    let cmds = parse_commands(&hdr, &bytes).expect("parse commands");
+    assert!(cmds.iter().any(|cmd| {
+        matches!(
+            cmd,
+            LoadCommand::Dylib(d)
+                if d.cmd == LC_LOAD_DYLIB && d.name == "@rpath/libafslddep.dylib"
+        )
+    }));
+
+    let _ = fs::remove_file(&obj);
+    let _ = fs::remove_file(&dep);
+    let _ = fs::remove_file(&out);
+}
+
+#[test]
+fn linker_emits_rpath_load_command() {
+    if !have_xcrun() {
+        eprintln!("skipping: xcrun as unavailable");
+        return;
+    }
+
+    let obj = scratch("rpath-main.o");
+    let out = scratch("rpath-linked");
+    if let Err(e) = assemble(
+        r#"
+            .section __TEXT,__text,regular,pure_instructions
+            .globl _main
+            _main:
+                ret
+        "#,
+        &obj,
+    ) {
+        eprintln!("skipping: assemble failed: {e}");
+        return;
+    }
+
+    link_with_afs_ld(&[
+        obj.to_str().unwrap(),
+        "-rpath",
+        "@executable_path/../lib",
+        "-o",
+        out.to_str().unwrap(),
+    ])
+    .expect("link executable with rpath");
+
+    let bytes = fs::read(&out).expect("read executable");
+    let hdr = parse_header(&bytes).expect("parse header");
+    let cmds = parse_commands(&hdr, &bytes).expect("parse commands");
+    assert!(cmds.iter().any(|cmd| {
+        matches!(
+            cmd,
+            LoadCommand::Rpath(r) if r.path == "@executable_path/../lib"
+        )
+    }));
+
+    let _ = fs::remove_file(&obj);
+    let _ = fs::remove_file(&out);
+}
+
+#[test]
+fn linker_emits_uuid_and_source_version_commands() {
+    if !have_xcrun() {
+        eprintln!("skipping: xcrun as unavailable");
+        return;
+    }
+
+    let obj = scratch("uuid-main.o");
+    let out = scratch("uuid-linked");
+    if let Err(e) = assemble(
+        r#"
+            .section __TEXT,__text,regular,pure_instructions
+            .globl _main
+            _main:
+                ret
+        "#,
+        &obj,
+    ) {
+        eprintln!("skipping: assemble failed: {e}");
+        return;
+    }
+
+    link_with_afs_ld(&[obj.to_str().unwrap(), "-o", out.to_str().unwrap()])
+        .expect("link executable with metadata");
+
+    let bytes = fs::read(&out).expect("read executable");
+    let hdr = parse_header(&bytes).expect("parse header");
+    let ids: Vec<u32> = parse_commands(&hdr, &bytes)
+        .expect("parse commands")
+        .into_iter()
+        .map(|cmd| match cmd {
+            LoadCommand::Raw { cmd, .. } => cmd,
+            other => other.cmd(),
+        })
+        .collect();
+    assert!(ids.contains(&LC_UUID));
+    assert!(ids.contains(&LC_SOURCE_VERSION));
+
+    let _ = fs::remove_file(&obj);
+    let _ = fs::remove_file(&out);
+}
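The `section` lookup used throughout this file is a nested `find_map` search: scan each load command, and within a matching segment command return the first section whose segment and section names both match. A minimal standalone sketch of the same shape, using hypothetical stand-in types (`Sect` here is illustrative, not the crate's `Section64Header`):

```rust
// Illustrative stand-in for a parsed section header; the real reader
// carries offsets, sizes, and fixed-width name fields.
struct Sect {
    seg: &'static str,
    name: &'static str,
}

// Same shape as the `section` helper: walk each load command's section
// list and return the first section matching both names, or None.
fn find_section<'a>(commands: &'a [Vec<Sect>], seg: &str, name: &str) -> Option<&'a Sect> {
    commands
        .iter()
        .find_map(|sections| sections.iter().find(|s| s.seg == seg && s.name == name))
}
```

`find_map` stops at the first command that yields `Some`, which matches how at most one `LC_SEGMENT_64` carries a given segment/section pair in a well-formed image.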
tests/load_command_parity.rs added
@@ -0,0 +1,356 @@
+use std::fs;
+use std::path::{Path, PathBuf};
+use std::process::Command;
+
+use afs_ld::macho::constants::*;
+use afs_ld::macho::reader::{parse_commands, parse_header, LoadCommand};
+
+fn have_xcrun() -> bool {
+    Command::new("xcrun")
+        .arg("-f")
+        .arg("as")
+        .output()
+        .map(|o| o.status.success())
+        .unwrap_or(false)
+}
+
+fn have_clang() -> bool {
+    Command::new("xcrun")
+        .arg("-f")
+        .arg("clang")
+        .output()
+        .map(|o| o.status.success())
+        .unwrap_or(false)
+}
+
+fn have_ld() -> bool {
+    Command::new("xcrun")
+        .arg("-f")
+        .arg("ld")
+        .output()
+        .map(|o| o.status.success())
+        .unwrap_or(false)
+}
+
+fn sdk_path() -> Option<String> {
+    Command::new("xcrun")
+        .args(["--sdk", "macosx", "--show-sdk-path"])
+        .output()
+        .ok()
+        .filter(|out| out.status.success())
+        .and_then(|out| String::from_utf8(out.stdout).ok())
+        .map(|text| text.trim().to_string())
+        .filter(|text| !text.is_empty())
+}
+
+fn scratch(name: &str) -> PathBuf {
+    std::env::temp_dir().join(format!("afs-ld-load-order-{}-{name}", std::process::id()))
+}
+
+fn assemble(src_text: &str, out: &Path) -> Result<(), String> {
+    let tmp = std::env::temp_dir().join(format!(
+        "afs-ld-load-order-{}-{}.s",
+        std::process::id(),
+        out.file_stem().and_then(|s| s.to_str()).unwrap_or("t")
+    ));
+    fs::write(&tmp, src_text).map_err(|e| format!("write .s: {e}"))?;
+    let status = Command::new("xcrun")
+        .args(["--sdk", "macosx", "as", "-arch", "arm64"])
+        .arg(&tmp)
+        .arg("-o")
+        .arg(out)
+        .output()
+        .map_err(|e| format!("spawn xcrun as: {e}"))?;
+    let _ = fs::remove_file(&tmp);
+    if !status.status.success() {
+        return Err(format!(
+            "xcrun as failed: {}",
+            String::from_utf8_lossy(&status.stderr)
+        ));
+    }
+    Ok(())
+}
+
+fn build_test_dylib(src: &str, out: &Path, install_name: &str) -> Result<(), String> {
+    let mut child = Command::new("xcrun")
+        .args([
+            "--sdk", "macosx", "clang", "-x", "c", "-arch", "arm64", "-shared", "-o",
+        ])
+        .arg(out)
+        .arg("-install_name")
+        .arg(install_name)
+        .arg("-")
+        .stdin(std::process::Stdio::piped())
+        .stdout(std::process::Stdio::piped())
+        .stderr(std::process::Stdio::piped())
+        .spawn()
+        .map_err(|e| format!("spawn clang: {e}"))?;
+    use std::io::Write;
+    child
+        .stdin
+        .as_mut()
+        .unwrap()
+        .write_all(src.as_bytes())
+        .map_err(|e| format!("write clang stdin: {e}"))?;
+    let out = child.wait_with_output().map_err(|e| format!("wait: {e}"))?;
+    if !out.status.success() {
+        return Err(format!(
+            "clang failed: {}",
+            String::from_utf8_lossy(&out.stderr)
+        ));
+    }
+    Ok(())
+}
+
+fn link_with_afs_ld(args: &[&str]) -> Result<(), String> {
+    let exe = env!("CARGO_BIN_EXE_afs-ld");
+    let out = Command::new(exe)
+        .args(args)
+        .output()
+        .map_err(|e| format!("spawn afs-ld: {e}"))?;
+    if !out.status.success() {
+        return Err(format!(
+            "afs-ld failed: {}\n{}",
+            out.status,
+            String::from_utf8_lossy(&out.stderr)
+        ));
+    }
+    Ok(())
+}
+
+fn command_ids(path: &Path) -> Vec<u32> {
+    let bytes = fs::read(path).expect("read mach-o");
+    let hdr = parse_header(&bytes).expect("parse header");
+    parse_commands(&hdr, &bytes)
+        .expect("parse commands")
+        .into_iter()
+        .map(|cmd| match cmd {
+            LoadCommand::Raw { cmd, .. } => cmd,
+            other => other.cmd(),
+        })
+        .collect()
+}
+
+fn normalize(ids: &[u32]) -> Vec<&'static str> {
+    let mut out = Vec::new();
+    for token in ids.iter().filter_map(|cmd| match *cmd {
+        LC_SEGMENT_64 => Some("SEGMENT"),
+        LC_DYLD_INFO_ONLY | LC_DYLD_CHAINED_FIXUPS | LC_DYLD_EXPORTS_TRIE => Some("FIXUPS"),
+        LC_SYMTAB => Some("SYMTAB"),
+        LC_DYSYMTAB => Some("DYSYMTAB"),
+        LC_LOAD_DYLINKER => Some("LOAD_DYLINKER"),
+        LC_UUID => Some("UUID"),
+        LC_BUILD_VERSION => Some("BUILD_VERSION"),
+        LC_SOURCE_VERSION => Some("SOURCE_VERSION"),
+        LC_MAIN => Some("MAIN"),
+        LC_ID_DYLIB => Some("ID_DYLIB"),
+        LC_LOAD_DYLIB => Some("LOAD_DYLIB"),
+        LC_RPATH => Some("RPATH"),
+        LC_FUNCTION_STARTS => Some("FUNCTION_STARTS"),
+        LC_DATA_IN_CODE => Some("DATA_IN_CODE"),
+        LC_CODE_SIGNATURE => Some("CODE_SIGNATURE"),
+        _ => None,
+    }) {
+        if out.last().copied() == Some(token) && token == "FIXUPS" {
+            continue;
+        }
+        out.push(token);
+    }
+    out
+}
+
+#[test]
+fn executable_load_command_order_matches_apple_for_common_surface() {
+    if !have_xcrun() || !have_ld() {
+        eprintln!("skipping: xcrun as / ld unavailable");
+        return;
+    }
+    let Some(sdk) = sdk_path() else {
+        eprintln!("skipping: xcrun --show-sdk-path unavailable");
+        return;
+    };
+
+    let obj = scratch("main.o");
+    let ours = scratch("ours-exec");
+    let theirs = scratch("apple-exec");
+    assemble(
+        r#"
+            .section __TEXT,__text,regular,pure_instructions
+            .globl _main
+            _main:
+                ret
+        "#,
+        &obj,
+    )
+    .expect("assemble");
+    link_with_afs_ld(&[
+        "-syslibroot",
+        &sdk,
+        "-lSystem",
+        obj.to_str().unwrap(),
+        "-o",
+        ours.to_str().unwrap(),
+    ])
+    .expect("afs-ld");
+    let status = Command::new("xcrun")
+        .args([
+            "ld",
+            "-arch",
+            "arm64",
+            "-syslibroot",
+            &sdk,
+            "-lSystem",
+            "-e",
+            "_main",
+            "-o",
+        ])
+        .arg(&theirs)
+        .arg(&obj)
+        .status()
+        .expect("spawn ld");
+    assert!(status.success(), "ld link failed");
+
+    assert_eq!(
+        normalize(&command_ids(&ours)),
+        normalize(&command_ids(&theirs))
+    );
+
+    let _ = fs::remove_file(&obj);
+    let _ = fs::remove_file(&ours);
+    let _ = fs::remove_file(&theirs);
+}
+
+#[test]
+fn dylib_load_command_order_matches_apple_for_common_surface() {
+    if !have_xcrun() || !have_ld() {
+        eprintln!("skipping: xcrun as / ld unavailable");
+        return;
+    }
+    let Some(sdk) = sdk_path() else {
+        eprintln!("skipping: xcrun --show-sdk-path unavailable");
+        return;
+    };
+
+    let obj = scratch("lib.o");
+    let ours = scratch("ours.dylib");
+    let theirs = scratch("apple.dylib");
+    assemble(
+        r#"
+            .section __TEXT,__text,regular,pure_instructions
+            .globl _answer
+            _answer:
+                ret
+        "#,
+        &obj,
+    )
+    .expect("assemble");
+    link_with_afs_ld(&[
+        "-dylib",
+        "-syslibroot",
+        &sdk,
+        "-lSystem",
+        obj.to_str().unwrap(),
+        "-o",
+        ours.to_str().unwrap(),
+    ])
+    .expect("afs-ld dylib");
+    let status = Command::new("xcrun")
+        .args([
+            "ld",
+            "-dylib",
+            "-arch",
+            "arm64",
+            "-syslibroot",
+            &sdk,
+            "-lSystem",
+            "-install_name",
+            "@rpath/libparity.dylib",
+            "-o",
+        ])
+        .arg(&theirs)
+        .arg(&obj)
+        .status()
+        .expect("spawn ld");
+    assert!(status.success(), "ld dylib link failed");
+
+    assert_eq!(
+        normalize(&command_ids(&ours)),
+        normalize(&command_ids(&theirs))
+    );
+
+    let _ = fs::remove_file(&obj);
+    let _ = fs::remove_file(&ours);
+    let _ = fs::remove_file(&theirs);
+}
+
+#[test]
+fn executable_load_command_order_with_dependency_and_rpath_matches_common_surface() {
+    if !have_xcrun() || !have_clang() || !have_ld() {
+        eprintln!("skipping: xcrun as / clang / ld unavailable");
+        return;
+    }
+    let Some(sdk) = sdk_path() else {
+        eprintln!("skipping: xcrun --show-sdk-path unavailable");
+        return;
+    };
+
+    let obj = scratch("dep-main.o");
+    let dep = scratch("dep.dylib");
+    let ours = scratch("ours-dep");
+    let theirs = scratch("apple-dep");
+    assemble(
+        r#"
+            .section __TEXT,__text,regular,pure_instructions
+            .globl _main
+            _main:
+                ret
+        "#,
+        &obj,
+    )
+    .expect("assemble");
+    build_test_dylib("int dep(void) { return 1; }\n", &dep, "@rpath/libdep.dylib")
+        .expect("build dylib");
+    link_with_afs_ld(&[
+        "-syslibroot",
+        &sdk,
+        "-lSystem",
+        obj.to_str().unwrap(),
+        dep.to_str().unwrap(),
+        "-rpath",
+        "@executable_path/../lib",
+        "-o",
+        ours.to_str().unwrap(),
323
+    ])
324
+    .expect("afs-ld with dep");
325
+
326
+    let status = Command::new("xcrun")
327
+        .args([
328
+            "ld",
329
+            "-arch",
330
+            "arm64",
331
+            "-syslibroot",
332
+            &sdk,
333
+            "-lSystem",
334
+            "-e",
335
+            "_main",
336
+            "-o",
337
+        ])
338
+        .arg(&theirs)
339
+        .arg(&obj)
340
+        .arg(&dep)
341
+        .arg("-rpath")
342
+        .arg("@executable_path/../lib")
343
+        .status()
344
+        .expect("spawn ld");
345
+    assert!(status.success(), "ld dep link failed");
346
+
347
+    assert_eq!(
348
+        normalize(&command_ids(&ours)),
349
+        normalize(&command_ids(&theirs))
350
+    );
351
+
352
+    let _ = fs::remove_file(&obj);
353
+    let _ = fs::remove_file(&dep);
354
+    let _ = fs::remove_file(&ours);
355
+    let _ = fs::remove_file(&theirs);
356
+}
tests/parity_corpus/data_in_code_exec/command_checks.txt modified
@@ -1,3 +1,3 @@
 build_version
 load_dylib_names
-data_in_code
+data_in_code_if_present
tests/parity_corpus/data_in_code_large_first_exec/command_checks.txt modified
@@ -1,3 +1,3 @@
 build_version
 load_dylib_names
-data_in_code
+data_in_code_if_present
tests/parity_corpus/data_in_code_late_exec/command_checks.txt modified
@@ -1,3 +1,3 @@
 build_version
 load_dylib_names
-data_in_code
+data_in_code_if_present
tests/parity_corpus/function_starts_exec/command_checks.txt modified
@@ -1,4 +1,4 @@
 build_version
 load_dylib_names
 normalized_function_starts
-data_in_code
+data_in_code_if_present
tests/parity_corpus/hidden_got_exec/sections.txt modified
@@ -1,1 +0,0 @@
-__TEXT __text
tests/parity_corpus/imported_tlv_exec/absent_sections.txt modified
@@ -1,1 +0,0 @@
-__DATA __thread_ptrs
tests/parity_corpus/imported_tlv_exec/page_refs.txt added
tests/parity_corpus/imported_tlv_exec/sections.txt modified
@@ -1,2 +0,0 @@
-__TEXT __text
-__DATA_CONST __got
tests/parity_corpus/strip_locals_exec/args.txt modified
@@ -1,5 +1,12 @@
 -arch
 arm64
+-platform_version
+macos
+@SDK_VERSION@
+@SDK_VERSION@
+-syslibroot
+@SDK_PATH@
+-lSystem
 -x
 -no_fixup_chains
 -e
tests/parity_corpus/symtab_partition_exec/args.txt modified
@@ -1,5 +1,12 @@
 -arch
 arm64
+-platform_version
+macos
+@SDK_VERSION@
+@SDK_VERSION@
+-syslibroot
+@SDK_PATH@
+-lSystem
 -no_fixup_chains
 -e
 _main
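The `@SDK_VERSION@` and `@SDK_PATH@` tokens in the args files above are placeholders the test harness substitutes with the local SDK before linking. As a rough sketch of that substitution (the function name and mechanism here are assumptions for illustration, not the harness's actual code):

```rust
/// Hypothetical helper: expand the corpus placeholder tokens seen in
/// args.txt with values discovered via `xcrun` at test time.
fn expand_placeholders(arg: &str, sdk_version: &str, sdk_path: &str) -> String {
    arg.replace("@SDK_VERSION@", sdk_version)
        .replace("@SDK_PATH@", sdk_path)
}
```

Arguments without placeholders pass through unchanged, so the same expansion can run over every line of an args file.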
tests/parity_matrix.rs modified
@@ -8,6 +8,8 @@ mod common;
 
 use std::fs;
 use std::path::{Path, PathBuf};
+use std::sync::{mpsc, Arc, Mutex};
+use std::thread;
 use std::time::{Duration, Instant};
 
 use common::harness::{
@@ -40,19 +42,19 @@ fn parity_corpus() {
         fs::create_dir_all(dir).expect("create parity artifact dir");
     }
 
-    let mut case_reports = Vec::new();
     let mut failures = Vec::new();
+    let case_reports = run_cases(cases);
 
-    for case in cases {
-        let report = run_case(&case);
+    for (case, report) in &case_reports {
         if let Some(dir) = artifact_dir.as_ref() {
-            write_case_artifact(dir, &case, &report).expect("write case artifact");
+            write_case_artifact(dir, case, report).expect("write case artifact");
         }
-        if let Some(error) = report.error_message() {
+        if let Some(error) = report.error_message(&case.name) {
+            eprintln!("parity failure:\n{error}\n");
            failures.push(error);
        }
-        case_reports.push((case, report));
    }
+    print_timing_summary(started.elapsed(), &case_reports);
 
     if let Some(dir) = artifact_dir.as_ref() {
         write_index_artifact(dir, &case_reports).expect("write parity index");
@@ -76,27 +78,84 @@ fn parity_corpus() {
     }
 }
 
+fn run_cases(cases: Vec<LinkCase>) -> Vec<(LinkCase, CaseReport)> {
+    let job_count = parity_matrix_jobs(cases.len());
+    if job_count <= 1 || cases.len() <= 1 {
+        return cases
+            .into_iter()
+            .map(|case| {
+                let report = run_case(&case);
+                (case, report)
+            })
+            .collect();
+    }
+
+    let queue = Arc::new(Mutex::new(cases.into_iter().enumerate()));
+    let (tx, rx) = mpsc::channel();
+    thread::scope(|scope| {
+        for _ in 0..job_count {
+            let queue = Arc::clone(&queue);
+            let tx = tx.clone();
+            scope.spawn(move || loop {
+                let Some((index, case)) = queue
+                    .lock()
+                    .expect("parity case queue mutex poisoned")
+                    .next()
+                else {
+                    break;
+                };
+                let report = run_case(&case);
+                tx.send((index, case, report))
+                    .expect("parity result receiver should stay live");
+            });
+        }
+        drop(tx);
+        let mut reports: Vec<_> = rx.into_iter().collect();
+        reports.sort_by_key(|(index, _, _)| *index);
+        reports
+            .into_iter()
+            .map(|(_, case, report)| (case, report))
+            .collect()
+    })
+}
+
 #[derive(Debug)]
 struct CaseStep {
     name: &'static str,
+    duration: Duration,
     error: Option<String>,
 }
 
 #[derive(Debug, Default)]
 struct CaseReport {
     steps: Vec<CaseStep>,
+    elapsed: Duration,
 }
 
 impl CaseReport {
     fn push(&mut self, name: &'static str, result: Result<(), String>) -> bool {
+        self.push_timed(name, Duration::ZERO, result)
+    }
+
+    fn push_timed(
+        &mut self,
+        name: &'static str,
+        duration: Duration,
+        result: Result<(), String>,
+    ) -> bool {
         match result {
             Ok(()) => {
-                self.steps.push(CaseStep { name, error: None });
+                self.steps.push(CaseStep {
+                    name,
+                    duration,
+                    error: None,
+                });
                 true
             }
             Err(error) => {
                 self.steps.push(CaseStep {
                     name,
+                    duration,
                     error: Some(error),
                 });
                 false
@@ -104,99 +163,128 @@ impl CaseReport {
         }
     }
 
+    fn measure<F>(&mut self, name: &'static str, action: F) -> bool
+    where
+        F: FnOnce() -> Result<(), String>,
+    {
+        let started = Instant::now();
+        let result = action();
+        self.push_timed(name, started.elapsed(), result)
+    }
+
+    fn finish(&mut self, elapsed: Duration) {
+        self.elapsed = elapsed;
+    }
+
     fn passed(&self) -> bool {
         self.steps.iter().all(|step| step.error.is_none())
     }
 
-    fn error_message(&self) -> Option<String> {
+    fn slowest_step(&self) -> Option<&CaseStep> {
+        self.steps.iter().max_by_key(|step| step.duration)
+    }
+
+    fn error_message(&self, case_name: &str) -> Option<String> {
         self.steps.iter().find_map(|step| {
             step.error
                 .as_ref()
-                .map(|error| format!("{} failed:\n{}", step.name, error))
+                .map(|error| format!("[{case_name}] {} failed:\n{}", step.name, error))
         })
     }
 }
 
+#[test]
+fn case_report_error_message_includes_case_name() {
+    let mut report = CaseReport::default();
+    report.push("section parity", Err("stub bytes differ".into()));
+    assert_eq!(
+        report.error_message("classic_lazy_branch_only_calls"),
+        Some(
+            "[classic_lazy_branch_only_calls] section parity failed:\nstub bytes differ"
+                .to_string()
+        )
+    );
+}
+
 fn run_case(case: &LinkCase) -> CaseReport {
+    let case_started = Instant::now();
     let mut report = CaseReport::default();
+    let link_started = Instant::now();
     let outputs = match link_both(case) {
         Ok(outputs) => {
-            report.push("link", Ok(()));
+            report.push_timed("link", link_started.elapsed(), Ok(()));
             outputs
         }
         Err(error) => {
-            report.push(
+            report.push_timed(
                 "link",
+                link_started.elapsed(),
                 Err(format!(
                     "failed to link parity case from {}:\n{}",
                     case.dir.display(),
                     error
                 )),
             );
-            return report;
+            return finish_case(report, case_started);
         }
     };
 
-    if !report.push(
-        "load-command ids",
-        compare_command_ids(&outputs.ours, &outputs.theirs, &case.ignored_load_commands),
-    ) {
-        return report;
+    if !report.measure("load-command ids", || {
+        compare_command_ids(&outputs.ours, &outputs.theirs, &case.ignored_load_commands)
+    }) {
+        return finish_case(report, case_started);
    }
-    if !report.push(
-        "command details",
-        compare_command_details(&outputs.ours, &outputs.theirs, &case.command_checks),
-    ) {
-        return report;
+    if !report.measure("command details", || {
+        compare_command_details(&outputs.ours, &outputs.theirs, &case.command_checks)
+    }) {
+        return finish_case(report, case_started);
    }
-    if !report.push(
-        "afs-ld absent commands",
-        ensure_absent_load_commands(&outputs.ours, &case.absent_load_commands, "afs-ld"),
-    ) {
-        return report;
+    if !report.measure("afs-ld absent commands", || {
+        ensure_absent_load_commands(&outputs.ours, &case.absent_load_commands, "afs-ld")
+    }) {
+        return finish_case(report, case_started);
    }
-    if !report.push(
-        "Apple absent commands",
-        ensure_absent_load_commands(&outputs.theirs, &case.absent_load_commands, "Apple ld"),
-    ) {
-        return report;
+    if !report.measure("Apple absent commands", || {
+        ensure_absent_load_commands(&outputs.theirs, &case.absent_load_commands, "Apple ld")
+    }) {
+        return finish_case(report, case_started);
    }
-    if !report.push(
-        "afs-ld absent sections",
-        ensure_absent_sections(&outputs.ours, &case.absent_sections, "afs-ld"),
-    ) {
-        return report;
+    if !report.measure("afs-ld absent sections", || {
+        ensure_absent_sections(&outputs.ours, &case.absent_sections, "afs-ld")
+    }) {
+        return finish_case(report, case_started);
    }
-    if !report.push(
-        "Apple absent sections",
-        ensure_absent_sections(&outputs.theirs, &case.absent_sections, "Apple ld"),
-    ) {
-        return report;
+    if !report.measure("Apple absent sections", || {
+        ensure_absent_sections(&outputs.theirs, &case.absent_sections, "Apple ld")
+    }) {
+        return finish_case(report, case_started);
    }
-    if !report.push(
-        "section parity",
+    if !report.measure("section parity", || {
         compare_sections(
             &outputs.ours,
             &outputs.theirs,
             &case.section_checks,
             &case.case_tolerances,
-        ),
-    ) {
-        return report;
+        )
+    }) {
+        return finish_case(report, case_started);
    }
-    if !report.push(
-        "page-ref parity",
-        compare_page_refs(&outputs.ours, &outputs.theirs, &case.page_ref_checks),
-    ) {
-        return report;
+    if !report.measure("page-ref parity", || {
+        compare_page_refs(&outputs.ours, &outputs.theirs, &case.page_ref_checks)
+    }) {
+        return finish_case(report, case_started);
    }
     if !case.runtime_args.is_empty() || case.dir.join("runtime.txt").exists() {
-        report.push(
-            "runtime parity",
-            compare_runtime(&outputs.our_path, &outputs.their_path, &case.runtime_args),
-        );
+        report.measure("runtime parity", || {
+            compare_runtime(&outputs.our_path, &outputs.their_path, &case.runtime_args)
+        });
    }
 
+    finish_case(report, case_started)
+}
+
+fn finish_case(mut report: CaseReport, started: Instant) -> CaseReport {
+    report.finish(started.elapsed());
     report
 }
 
@@ -214,16 +302,22 @@ fn write_case_artifact(dir: &Path, case: &LinkCase, report: &CaseReport) -> Resu
         if report.passed() { "ok" } else { "fail" },
         if report.passed() { "PASS" } else { "FAIL" }
     ));
+    html.push_str(&format!(
+        "<p>Total: <strong>{}</strong></p>",
+        format_duration(report.elapsed)
+    ));
     html.push_str("<h2>Steps</h2><ul>");
     for step in &report.steps {
         match &step.error {
             None => html.push_str(&format!(
-                "<li><span class=\"ok\">PASS</span> {}</li>",
-                escape_html(step.name)
+                "<li><span class=\"ok\">PASS</span> {} <span class=\"time\">{}</span></li>",
+                escape_html(step.name),
+                format_duration(step.duration)
             )),
             Some(error) => html.push_str(&format!(
-                "<li><span class=\"fail\">FAIL</span> {}<pre>{}</pre></li>",
+                "<li><span class=\"fail\">FAIL</span> {} <span class=\"time\">{}</span><pre>{}</pre></li>",
                 escape_html(step.name),
+                format_duration(step.duration),
                 escape_html(error)
             )),
         }
@@ -244,16 +338,32 @@ fn write_case_artifact(dir: &Path, case: &LinkCase, report: &CaseReport) -> Resu
 fn write_index_artifact(dir: &Path, cases: &[(LinkCase, CaseReport)]) -> Result<(), String> {
     let mut html = String::new();
     html.push_str("<!doctype html><html><head><meta charset=\"utf-8\">");
-    html.push_str("<title>Parity Matrix</title><style>body{font-family:ui-monospace,Menlo,monospace;padding:2rem;} .ok{color:#0a0;} .fail{color:#a00;}</style></head><body>");
-    html.push_str("<h1>Parity Matrix</h1><ul>");
+    html.push_str("<title>Parity Matrix</title><style>body{font-family:ui-monospace,Menlo,monospace;padding:2rem;} .ok{color:#0a0;} .fail{color:#a00;} .time{color:#57606a;} table{border-collapse:collapse;margin:1rem 0;} td,th{border:1px solid #d0d7de;padding:.35rem .6rem;text-align:left;}</style></head><body>");
+    html.push_str("<h1>Parity Matrix</h1>");
+    html.push_str("<h2>Slowest Cases</h2><table><thead><tr><th>Case</th><th>Total</th><th>Slowest Step</th></tr></thead><tbody>");
+    for (case, report) in slowest_cases(cases, 10) {
+        let slowest = report
+            .slowest_step()
+            .map(|step| format!("{} {}", step.name, format_duration(step.duration)))
+            .unwrap_or_else(|| "n/a".to_string());
+        html.push_str(&format!(
+            "<tr><td><a href=\"{}.html\">{}</a></td><td>{}</td><td>{}</td></tr>",
+            slug(&case.name),
+            escape_html(&case.name),
+            format_duration(report.elapsed),
+            escape_html(&slowest)
+        ));
+    }
+    html.push_str("</tbody></table><h2>Cases</h2><ul>");
     for (case, report) in cases {
         let slug = slug(&case.name);
         html.push_str(&format!(
-            "<li><a href=\"{}.html\">{}</a> <strong class=\"{}\">{}</strong></li>",
+            "<li><a href=\"{}.html\">{}</a> <strong class=\"{}\">{}</strong> <span class=\"time\">{}</span></li>",
             slug,
             escape_html(&case.name),
             if report.passed() { "ok" } else { "fail" },
-            if report.passed() { "PASS" } else { "FAIL" }
+            if report.passed() { "PASS" } else { "FAIL" },
+            format_duration(report.elapsed)
        ));
    }
     html.push_str("</ul></body></html>");
@@ -261,6 +371,43 @@ fn write_index_artifact(dir: &Path, cases: &[(LinkCase, CaseReport)]) -> Result<
     fs::write(&path, html).map_err(|e| format!("write {}: {e}", path.display()))
 }
 
+fn print_timing_summary(elapsed: Duration, cases: &[(LinkCase, CaseReport)]) {
+    eprintln!(
+        "parity matrix timing: {} case(s) in {}",
+        cases.len(),
+        format_duration(elapsed)
+    );
+    for (case, report) in slowest_cases(cases, 10) {
+        let slowest = report
+            .slowest_step()
+            .map(|step| {
+                format!(
+                    "; slowest step: {} {}",
+                    step.name,
+                    format_duration(step.duration)
+                )
+            })
+            .unwrap_or_default();
+        eprintln!(
+            "  {:>9} {}{}",
+            format_duration(report.elapsed),
+            case.name,
+            slowest
+        );
+    }
+}
+
+fn slowest_cases(cases: &[(LinkCase, CaseReport)], limit: usize) -> Vec<(&LinkCase, &CaseReport)> {
+    let mut timed: Vec<_> = cases.iter().map(|(case, report)| (case, report)).collect();
+    timed.sort_by(|a, b| {
+        b.1.elapsed
+            .cmp(&a.1.elapsed)
+            .then_with(|| a.0.name.cmp(&b.0.name))
+    });
+    timed.truncate(limit);
+    timed
+}
+
 fn slug(name: &str) -> String {
     name.chars()
         .map(|ch| {
@@ -279,6 +426,31 @@ fn parity_matrix_time_limit() -> Option<Duration> {
     Some(Duration::from_secs(seconds))
 }
 
+fn parity_matrix_jobs(case_count: usize) -> usize {
+    if case_count == 0 {
+        return 1;
+    }
+    let requested = std::env::var("PARITY_MATRIX_JOBS")
+        .ok()
+        .and_then(|raw| raw.parse::<usize>().ok())
+        .filter(|jobs| *jobs > 0)
+        .unwrap_or_else(|| {
+            thread::available_parallelism()
+                .map(usize::from)
+                .unwrap_or(1)
+        });
+    requested.min(case_count).max(1)
+}
+
+fn format_duration(duration: Duration) -> String {
+    let millis = duration.as_secs_f64() * 1000.0;
+    if millis >= 1000.0 {
+        format!("{:.2}s", duration.as_secs_f64())
+    } else {
+        format!("{millis:.1}ms")
+    }
+}
+
 fn escape_html(text: &str) -> String {
     text.replace('&', "&amp;")
         .replace('<', "&lt;")
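The new `run_cases` above fans the parity cases out to a bounded pool of scoped worker threads that pull from a shared queue, then restores input order by sorting on the original index. The same shape, reduced to a minimal generic sketch (the names and signature here are illustrative, not part of the test suite):

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

/// Run `work` over `items` on up to `job_count` scoped worker threads,
/// returning results in input order. Workers share one queue (a mutexed
/// enumerated iterator), so faster workers naturally take more items.
fn run_parallel<T, R, F>(items: Vec<T>, job_count: usize, work: F) -> Vec<R>
where
    T: Send,
    R: Send,
    F: Fn(&T) -> R + Sync,
{
    let queue = Arc::new(Mutex::new(items.into_iter().enumerate()));
    let (tx, rx) = mpsc::channel();
    thread::scope(|scope| {
        for _ in 0..job_count.max(1) {
            let queue = Arc::clone(&queue);
            let tx = tx.clone();
            let work = &work;
            scope.spawn(move || loop {
                // Hold the lock only long enough to pull the next item.
                let Some((index, item)) = queue.lock().unwrap().next() else {
                    break;
                };
                tx.send((index, work(&item))).unwrap();
            });
        }
        // Drop the original sender so the receiver terminates once all
        // workers have finished and dropped their clones.
        drop(tx);
        let mut results: Vec<_> = rx.into_iter().collect();
        results.sort_by_key(|(index, _)| *index);
        results.into_iter().map(|(_, result)| result).collect()
    })
}
```

Because `thread::scope` joins every worker before returning, the results can borrow from the enclosing stack frame without any `'static` bounds.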
tests/perf_baseline.rs modified
@@ -1,10 +1,14 @@
+use std::fs;
 use std::path::{Path, PathBuf};
+use std::process::Command;
 use std::time::Duration;
 
 mod common;
 
 use afs_ld::{LinkOptions, LinkProfile, Linker};
-use common::harness::{assemble, have_xcrun, have_xcrun_tool, scratch, sdk_path, sdk_version};
+use common::harness::{
+    assemble, have_tool, have_xcrun, have_xcrun_tool, scratch, sdk_path, sdk_version,
+};
 
 fn find_runtime_archive() -> Option<PathBuf> {
     let workspace = Path::new(env!("CARGO_MANIFEST_DIR")).join("..");
@@ -20,6 +24,69 @@ fn find_runtime_archive() -> Option<PathBuf> {
     None
 }
 
+fn runtime_archive_fixture() -> Result<PathBuf, String> {
+    if let Some(runtime) = find_runtime_archive() {
+        return Ok(runtime);
+    }
+    build_synthetic_runtime_archive()
+}
+
+fn build_synthetic_runtime_archive() -> Result<PathBuf, String> {
+    if !have_tool("libtool") {
+        return Err("libtool unavailable".into());
+    }
+
+    let members = [
+        ("init", "_afs_program_init"),
+        ("finalize", "_afs_program_finalize"),
+        ("write_i32", "_afs_write_i32"),
+        ("write_f64", "_afs_write_f64"),
+        ("write_newline", "_afs_write_newline"),
+        ("read_i32", "_afs_read_i32"),
+        ("alloc", "_afs_alloc"),
+        ("dealloc", "_afs_dealloc"),
+        ("bounds_check", "_afs_bounds_check"),
+        ("stop", "_afs_stop"),
+        ("date_and_time", "_afs_date_and_time"),
+        ("cpu_time", "_afs_cpu_time"),
+        ("random_seed", "_afs_random_seed"),
+        ("random_number", "_afs_random_number"),
+        ("open_unit", "_afs_open_unit"),
+        ("close_unit", "_afs_close_unit"),
+    ];
+    let mut objects = Vec::with_capacity(members.len());
+    for (stem, symbol) in members {
+        let obj = scratch(&format!("perf-runtime-{stem}.o"));
+        let src = format!(
+            "\
+            .text\n\
+            .globl {symbol}\n\
+            .p2align 2\n\
+            {symbol}:\n\
+                ret\n\
+            .subsections_via_symbols\n",
+        );
+        assemble(&src, &obj)?;
+        objects.push(obj);
+    }
+
+    let archive = scratch("libafs-perf-runtime.a");
+    let _ = fs::remove_file(&archive);
+    let output = Command::new("libtool")
+        .args(["-static", "-o"])
+        .arg(&archive)
+        .args(&objects)
+        .output()
+        .map_err(|e| format!("spawn libtool archive: {e}"))?;
+    if !output.status.success() {
+        return Err(format!(
+            "libtool archive failed: {}",
+            String::from_utf8_lossy(&output.stderr)
+        ));
+    }
+    Ok(archive)
+}
+
 fn executable_opts(inputs: Vec<PathBuf>, output: PathBuf) -> LinkOptions {
     LinkOptions {
         inputs,
@@ -39,12 +106,18 @@ fn executable_opts(inputs: Vec<PathBuf>, output: PathBuf) -> LinkOptions {
 
 fn assert_profile_basics(name: &str, profile: &LinkProfile) {
     eprintln!(
-        "{name}: total={:?} parse={:?} resolve={:?} atomize={:?} layout={:?} synth={:?} (linkedit={:?}: symbols={:?} [locals={:?} globals={:?} strtab={:?}] dyld={:?} metadata={:?} codesig={:?}; unwind={:?}) reloc={:?} write={:?}",
+        "{name}: total={:?} parse={:?} resolve={:?} atomize={:?} layout={:?} (entry={:?} dead={:?} icf={:?} synth_plan={:?} build={:?} thunks={:?}) synth={:?} (linkedit={:?}: symbols={:?} [locals={:?} globals={:?} strtab={:?}] dyld={:?} [bind={:?} rebase={:?} export={:?}] metadata={:?} codesig={:?}; unwind={:?}) reloc={:?} write={:?}",
         profile.total_wall,
         profile.phases.input_parsing,
         profile.phases.symbol_resolution,
         profile.phases.atomization,
         profile.phases.layout,
+        profile.phases.layout_entry_lookup,
+        profile.phases.layout_dead_strip,
+        profile.phases.layout_icf,
+        profile.phases.layout_synthetic_plan,
+        profile.phases.layout_build,
+        profile.phases.layout_thunk_plan,
        profile.phases.synth_sections,
        profile.phases.synth_linkedit_finalize,
        profile.phases.synth_linkedit_symbol_plan,
@@ -52,12 +125,25 @@ fn assert_profile_basics(name: &str, profile: &LinkProfile) {
        profile.phases.synth_linkedit_symbol_plan_globals,
        profile.phases.synth_linkedit_symbol_plan_strtab,
        profile.phases.synth_linkedit_dyld_info,
+        profile.phases.synth_linkedit_dyld_bind,
+        profile.phases.synth_linkedit_dyld_rebase,
+        profile.phases.synth_linkedit_dyld_export,
        profile.phases.synth_linkedit_metadata_tables,
        profile.phases.synth_linkedit_code_signature,
        profile.phases.synth_unwind,
        profile.phases.reloc_apply,
        profile.phases.write_output,
    );
+    eprintln!(
+        "{name}: input read={:?} object={:?} archive={:?} dylib={:?} tbd_decode={:?} tbd_materialize={:?} reloc_cache={:?}",
+        profile.phases.input_read,
+        profile.phases.input_object_parse,
+        profile.phases.input_archive_parse,
+        profile.phases.input_dylib_parse,
+        profile.phases.input_tbd_decode,
+        profile.phases.input_tbd_materialize,
+        profile.phases.input_reloc_parse,
+    );
     assert!(profile.output.is_file(), "{name}: output file missing");
     assert!(
         profile.total_wall >= profile.phases.accounted_total(),
@@ -67,6 +153,29 @@ fn assert_profile_basics(name: &str, profile: &LinkProfile) {
         profile.phases.accounted_total() > Duration::ZERO,
         "{name}: all phase timings were zero"
     );
+    let input_subphase_total = profile.phases.input_read
+        + profile.phases.input_object_parse
+        + profile.phases.input_archive_parse
+        + profile.phases.input_dylib_parse
+        + profile.phases.input_tbd_decode
+        + profile.phases.input_tbd_materialize
+        + profile.phases.input_reloc_parse;
+    // Input subphases are summed worker-time once object parsing is parallel,
+    // so they can legitimately exceed the wall-clock input parsing bucket.
+    assert!(
+        input_subphase_total > Duration::ZERO,
+        "{name}: all input subphase timings were zero"
+    );
+    assert!(
+        profile.phases.layout
+            >= profile.phases.layout_entry_lookup
+                + profile.phases.layout_dead_strip
+                + profile.phases.layout_icf
+                + profile.phases.layout_synthetic_plan
+                + profile.phases.layout_build
+                + profile.phases.layout_thunk_plan,
+        "{name}: layout subphases exceeded layout total"
+    );
     assert!(
         profile.phases.synth_sections
             >= profile.phases.synth_linkedit_finalize + profile.phases.synth_unwind,
@@ -87,10 +196,17 @@ fn assert_profile_basics(name: &str, profile: &LinkProfile) {
                 + profile.phases.synth_linkedit_symbol_plan_strtab,
         "{name}: symbol-plan subphases exceeded symbol-plan total"
     );
+    assert!(
+        profile.phases.synth_linkedit_dyld_info
+            >= profile.phases.synth_linkedit_dyld_bind
+                + profile.phases.synth_linkedit_dyld_rebase
+                + profile.phases.synth_linkedit_dyld_export,
+        "{name}: dyld-info subphases exceeded dyld-info total"
+    );
 }
 
 #[test]
-fn hello_world_profile_reports_baseline_timings() {
+fn bench_hello_world_profile_reports_baseline_timings() {
     if !have_xcrun() || !have_xcrun_tool("ld") {
         eprintln!("skipping: xcrun as/ld unavailable");
         return;
@@ -125,14 +241,17 @@ fn hello_world_profile_reports_baseline_timings() {
 }
 
 #[test]
-fn runtime_link_profile_reports_baseline_timings() {
+fn bench_runtime_link_profile_reports_baseline_timings() {
     if !have_xcrun() || !have_xcrun_tool("ld") {
         eprintln!("skipping: xcrun as/ld unavailable");
         return;
     }
-    let Some(runtime) = find_runtime_archive() else {
-        eprintln!("skipping: libarmfortas_rt.a not built");
-        return;
+    let runtime = match runtime_archive_fixture() {
+        Ok(runtime) => runtime,
+        Err(reason) => {
+            eprintln!("skipping: {reason}");
+            return;
+        }
    };
 
     let obj = scratch("perf-runtime.o");
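The new profile assertions above encode a simple accounting rule: subphases timed sequentially inside a stage must sum to no more than that stage's wall-clock bucket, while the parallel input subphases report summed worker time and may legitimately exceed theirs. A minimal sketch of the sequential rule, with a hypothetical `LayoutPhases` type standing in for the real profile struct:

```rust
use std::time::Duration;

// Hypothetical, trimmed-down phase buckets; field names only mirror the
// profile fields printed above.
struct LayoutPhases {
    total: Duration,      // wall-clock for the whole layout stage
    dead_strip: Duration, // sequential subphase
    build: Duration,      // sequential subphase
}

/// Sequential subphases run inside the parent's wall-clock window,
/// so their sum must fit under the parent bucket.
fn sequential_subphases_fit(p: &LayoutPhases) -> bool {
    p.total >= p.dead_strip + p.build
}
```

The inverse check would be wrong for the parallel input subphases, which is why the test only asserts they are non-zero rather than bounded by the parent.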
tests/resolve_integration.rs modified
@@ -164,8 +164,8 @@ fn resolve_pipeline_pulls_archive_member_and_flags_missing() {
         "unexpected duplicates in seeding: {:?}",
         seed_report.duplicates
     );
-    let drain_report =
-        drain_fetches(&mut inputs, &mut table, seed_report.pending_fetches).expect("drain_fetches");
+    let drain_report = drain_fetches(&mut inputs, &mut table, seed_report.pending_fetches, 1)
+        .expect("drain_fetches");
     assert!(
         drain_report.fetched_members >= 1,
         "expected at least one archive member fetched; got {}",
tests/snapshots/help.txt modified
@@ -39,6 +39,7 @@ Options:
                                   Select chained fixups vs classic dyld info
   -all_load                       Force-load every archive member
   -force_load <archive>           Force-load one archive
+  -j <jobs>                       Limit parallel worker jobs (`1` disables parallelism)
   -Wl,<arg,arg,...>               Normalize comma-separated driver flags
   --dump <path>                   Dump a Mach-O file summary
   --dump-archive <path>           Dump an archive summary
tests/tbd_integration.rs modified
@@ -20,7 +20,7 @@
 //! Skipped if `xcrun` or `libSystem.tbd` aren't present.
 
 use afs_ld::macho::dylib::{DylibFile, DylibLoadKind};
-use afs_ld::macho::tbd::{parse_tbd, Arch, Platform, Target};
+use afs_ld::macho::tbd::{parse_tbd, parse_tbd_for_target, Arch, Platform, Target};
 
 fn sdk_path() -> Option<String> {
     let out = std::process::Command::new("xcrun")
@@ -54,6 +54,12 @@ fn libsystem_tbd_materializes_into_dylib_file() {
         arch: Arch::Arm64,
         platform: Platform::MacOs,
     };
+    let fast_docs = parse_tbd_for_target(&src, &target)
+        .unwrap_or_else(|e| panic!("libSystem.tbd fast path failed to parse: {e}"));
+    assert!(
+        !fast_docs.is_empty(),
+        "fast path did not keep any arm64-compatible documents"
+    );
     let dy = DylibFile::from_tbd(&path, main, &target);
 
     assert_eq!(dy.install_name, "/usr/lib/libSystem.B.dylib");
@@ -106,6 +112,31 @@ fn libsystem_tbd_materializes_into_dylib_file() {
         "_free not found anywhere in libSystem's TBD re-export chain"
     );
 
+    let mut fast_found = std::collections::HashSet::<&str>::new();
+    for doc in &fast_docs {
+        let sub = DylibFile::from_tbd(&path, doc, &target);
+        for entry in sub.exports.entries().unwrap() {
+            match entry.name.as_str() {
+                "_atexit" => {
+                    fast_found.insert("_atexit");
+                }
+                "_write" => {
+                    fast_found.insert("_write");
+                }
+                "__Unwind_Backtrace" => {
+                    fast_found.insert("__Unwind_Backtrace");
+                }
+                _ => {}
+            }
+        }
+    }
+    for expected in ["_atexit", "_write", "__Unwind_Backtrace"] {
+        assert!(
+            fast_found.contains(expected),
+            "{expected} not found by libSystem fast path; got {fast_found:?}"
+        );
+    }
+
     // libSystem re-exports most actual libc symbols (malloc, free, etc.) from
     // sub-dylibs. They come from the `reexported-libraries`, not from
     // libSystem's own exports. Confirm we captured the chain.
tests/writer_smoke.rs added
@@ -0,0 +1,116 @@
+use std::fs;
+use std::path::{Path, PathBuf};
+use std::process::Command;
+
+use afs_ld::layout::Layout;
+use afs_ld::macho::constants::{LC_ID_DYLIB, MH_DYLIB, MH_EXECUTE};
+use afs_ld::macho::reader::{parse_commands, parse_header, LoadCommand};
+use afs_ld::macho::writer::write;
+use afs_ld::{LinkOptions, OutputKind};
+
+fn have_tool(name: &str) -> bool {
+    Command::new(name)
+        .arg("-h")
+        .output()
+        .map(|_| true)
+        .unwrap_or(false)
+}
+
+fn scratch(name: &str) -> PathBuf {
+    std::env::temp_dir().join(format!("afs-ld-writer-{}-{name}", std::process::id()))
+}
+
+fn write_temp(name: &str, bytes: &[u8]) -> PathBuf {
+    let path = scratch(name);
+    fs::write(&path, bytes).expect("write temp mach-o");
+    path
+}
+
+fn run_otool_lv(path: &Path) -> Result<String, String> {
+    let out = Command::new("otool")
+        .arg("-lV")
+        .arg(path)
+        .output()
+        .map_err(|e| format!("spawn otool -lV: {e}"))?;
+    if !out.status.success() {
+        return Err(format!(
+            "otool -lV failed: {}",
+            String::from_utf8_lossy(&out.stderr)
+        ));
+    }
+    Ok(String::from_utf8_lossy(&out.stdout).into_owned())
+}
+
+fn run_file(path: &Path) -> Result<String, String> {
+    let out = Command::new("file")
+        .arg(path)
+        .output()
+        .map_err(|e| format!("spawn file: {e}"))?;
+    if !out.status.success() {
+        return Err(format!(
+            "file failed: {}",
+            String::from_utf8_lossy(&out.stderr)
+        ));
+    }
+    Ok(String::from_utf8_lossy(&out.stdout).into_owned())
+}
+
+#[test]
+fn empty_executable_writer_emits_parseable_macho() {
+    let layout = Layout::empty(OutputKind::Executable, 0);
+    let opts = LinkOptions::default();
+    let mut bytes = Vec::new();
+    write(&layout, OutputKind::Executable, &opts, &mut bytes).expect("write executable");
+
+    let hdr = parse_header(&bytes).expect("header parses");
+    assert_eq!(hdr.filetype, MH_EXECUTE);
+    let cmds = parse_commands(&hdr, &bytes).expect("commands parse");
+    assert!(
+        cmds.iter()
+            .any(|cmd| matches!(cmd, LoadCommand::Segment64(seg) if seg.segname_str() == "__TEXT"))
+    );
+
+    let path = write_temp("empty-exec", &bytes);
+    if have_tool("otool") {
+        let dump = run_otool_lv(&path).expect("otool -lV");
+        assert!(dump.contains("LC_MAIN"));
+        assert!(dump.contains("segname __TEXT"));
+    }
+    if have_tool("file") {
+        let desc = run_file(&path).expect("file");
+        assert!(desc.contains("Mach-O 64-bit executable arm64"));
+    }
+    let _ = fs::remove_file(path);
+}
+
+#[test]
+fn empty_dylib_writer_emits_parseable_macho() {
+    let layout = Layout::empty(OutputKind::Dylib, 0);
+    let mut opts = LinkOptions {
+        kind: OutputKind::Dylib,
+        ..LinkOptions::default()
+    };
+    opts.output = Some(PathBuf::from("libempty.dylib"));
+
+    let mut bytes = Vec::new();
+    write(&layout, OutputKind::Dylib, &opts, &mut bytes).expect("write dylib");
+
+    let hdr = parse_header(&bytes).expect("header parses");
+    assert_eq!(hdr.filetype, MH_DYLIB);
+    let cmds = parse_commands(&hdr, &bytes).expect("commands parse");
+    assert!(cmds.iter().any(|cmd| {
+        matches!(cmd, LoadCommand::Dylib(d) if d.cmd == LC_ID_DYLIB && d.name == "@rpath/libempty.dylib")
+    }));
+
+    let path = write_temp("empty-dylib.dylib", &bytes);
+    if have_tool("otool") {
+        let dump = run_otool_lv(&path).expect("otool -lV");
+        assert!(dump.contains("LC_ID_DYLIB"));
+        assert!(dump.contains("@rpath/libempty.dylib"));
+    }
+    if have_tool("file") {
+        let desc = run_file(&path).expect("file");
+        assert!(desc.contains("dynamically linked shared library arm64"));
+    }
+    let _ = fs::remove_file(path);
+}