Sprint 8: Name Resolution Pass

Prerequisites

Sprint 7 — SymbolTable with insertion semantics.

Goals

Drive the symbol table to a fixed point: every undefined reference either resolves to a Defined (from an object), a Common (promoted into BSS), or a DylibImport (from a dylib/TBD), or it raises a clear, actionable diagnostic. The -force_load, -all_load, and -undefined <treatment> options are all handled.

Closeout note: the implemented entrypoint is resolve(inputs, table, opts) -> ResolutionReport. The current library surface applies archive force-loading as archives are encountered in command-line order so left-to-right archive behavior stays explicit.

Deliverables

1. Resolution algorithm

afs-ld/src/resolve.rs:

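pub fn resolve(inputs: &mut Inputs, table: &mut SymbolTable, opts: &LinkOptions)
    -> Result<ResolutionReport, ResolutionError>
{
    // Seed objects and dylibs, then pull archive members in command-line order.
    // The helper is assumed to return a Result so failures propagate with `?`.
    seed_and_resolve_in_link_order(inputs, table, opts)?;
    // Apply the -undefined treatment to whatever is still unresolved; the
    // helper's result (assumed to be the final Result) is the return value.
    classify_unresolved(table, opts.undefined_treatment)
}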

2. Seed phase

Walk every explicit .o first, then every .dylib / .tbd: add Defined / Common from objects, DylibImport from dylibs. Archives are added as LazyArchive entries only — their members are not parsed until pulled.
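The seed order described above can be sketched with toy types. Everything here is hypothetical and heavily simplified: the `Symbol` and `Input` enums stand in for the real Sprint 7 `SymbolTable` and parsed inputs, Common symbols and duplicate-strong detection are omitted, and an object definition is assumed to take precedence over a dylib export of the same name.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified symbol states (not the real Sprint 7 types).
#[derive(Debug, Clone, PartialEq)]
pub enum Symbol {
    Undefined,
    Defined { input: String },
    DylibImport { dylib: String },
    LazyArchive { archive: String },
}

/// Hypothetical inputs, already reduced to the names they define or reference.
pub enum Input {
    Object { path: String, defines: Vec<String>, references: Vec<String> },
    Dylib { path: String, exports: Vec<String> },
    Archive { path: String, index: Vec<String> },
}

pub fn seed(inputs: &[Input]) -> HashMap<String, Symbol> {
    let mut table: HashMap<String, Symbol> = HashMap::new();

    // Pass 1: every explicit object. Definitions overwrite Undefined entries;
    // references only create an Undefined entry when the name is new.
    for input in inputs {
        if let Input::Object { path, defines, references } = input {
            for name in defines {
                table.insert(name.clone(), Symbol::Defined { input: path.clone() });
            }
            for name in references {
                table.entry(name.clone()).or_insert(Symbol::Undefined);
            }
        }
    }

    // Pass 2: every dylib / TBD. Exports become DylibImport unless an object
    // already defines the name (assumed precedence rule).
    for input in inputs {
        if let Input::Dylib { path, exports } = input {
            for name in exports {
                let defined = matches!(table.get(name), Some(Symbol::Defined { .. }));
                if !defined {
                    table.insert(name.clone(), Symbol::DylibImport { dylib: path.clone() });
                }
            }
        }
    }

    // Pass 3: archives are recorded lazily from their index; no member is parsed.
    // Names that are already present (including Undefined ones) are left alone
    // so the later pull phase can decide whether to fetch a member.
    for input in inputs {
        if let Input::Archive { path, index } = input {
            for name in index {
                table
                    .entry(name.clone())
                    .or_insert(Symbol::LazyArchive { archive: path.clone() });
            }
        }
    }
    table
}
```

In this sketch a name that an object references and an archive could supply stays `Undefined` after seeding; turning it into a definition is the job of the pull phase.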

3. Fixed-point pull

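// Worklist of names that are referenced but not yet defined.
while let Some(name) = table.undefined_pending.pop() {
    // Search archives left to right; the first archive that supplies the name wins.
    for archive in &inputs.archives_in_command_line_order {
        // Borrow the name so it can be reused for the next archive.
        if let Some(member) = archive.fetch(&name) {
            // Ingesting may push new names onto `undefined_pending`.
            ingest_member(member, table);
            break;
        }
    }
}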

Order matters: armfortas's driver currently passes <objs> <runtime.a> -lSystem, and resolution must match ld's left-to-right behavior. Ingesting a member can add new names to the pending set. The loop terminates once the pending set is empty; any name that no archive could supply stays Undefined and is handled by the -undefined classification step.
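A self-contained sketch of the worklist idea follows. The `Member` and `Archive` types and the `pull_to_fixed_point` function are hypothetical stand-ins, not the real afs-ld types; the sketch assumes each member is pulled at most once, which is what guarantees termination here.

```rust
use std::collections::HashSet;

/// Hypothetical archive member: the names it defines and the names it references.
pub struct Member {
    pub defines: Vec<String>,
    pub references: Vec<String>,
}

/// Hypothetical archive with a per-member "already pulled" flag.
pub struct Archive {
    pub members: Vec<Member>,
    loaded: Vec<bool>,
}

impl Archive {
    pub fn new(members: Vec<Member>) -> Self {
        let loaded = vec![false; members.len()];
        Archive { members, loaded }
    }

    /// Find a not-yet-pulled member that defines `name` and mark it pulled.
    fn fetch(&mut self, name: &str) -> Option<usize> {
        let idx = (0..self.members.len()).find(|&i| {
            !self.loaded[i] && self.members[i].defines.iter().any(|d| d.as_str() == name)
        })?;
        self.loaded[idx] = true;
        Some(idx)
    }
}

/// Pull members until the worklist is empty; return the names nothing could supply.
pub fn pull_to_fixed_point(
    defined: &mut HashSet<String>,
    mut pending: Vec<String>,
    archives: &mut [Archive],
) -> Vec<String> {
    let mut unresolved: Vec<String> = Vec::new();
    while let Some(name) = pending.pop() {
        if defined.contains(&name) {
            continue; // satisfied by an earlier pull
        }
        let mut satisfied = false;
        // Archives are searched in command-line order; the first hit wins.
        for archive in archives.iter_mut() {
            if let Some(idx) = archive.fetch(&name) {
                let member = &archive.members[idx];
                defined.extend(member.defines.iter().cloned());
                // The member's own references may become new pending names.
                pending.extend(
                    member.references.iter().filter(|r| !defined.contains(*r)).cloned(),
                );
                satisfied = true;
                break;
            }
        }
        if !satisfied && !unresolved.contains(&name) {
            unresolved.push(name);
        }
    }
    unresolved
}
```

Because a member can only be pulled once and each pull adds a finite number of names, the worklist eventually drains. The returned names are the ones a later classification step would have to deal with.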

4. -force_load and -all_load

  • -force_load <archive>: pull every member of that archive before fixed-point.
  • -all_load: pull every member of every archive.
  • In the implemented surface these happen when the named archive is encountered in link order, which preserves left-to-right linker semantics while still feeding the same resolution/classification pipeline.
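One way to picture the "decide when the archive is encountered" behavior is a per-archive load mode computed while walking the link order. The names `ArchiveLoad`, `ArchiveOptions`, `archive_load_mode`, and `plan_archives` are hypothetical, and matching a `-force_load` argument by exact path string is an assumption of this sketch.

```rust
/// How an archive's members should be loaded when it is encountered.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ArchiveLoad {
    /// Members are pulled only to satisfy undefined references.
    Lazy,
    /// Every member is pulled (-force_load on this archive, or -all_load).
    AllMembers,
}

/// Hypothetical subset of the link options relevant to archive loading.
#[derive(Debug, Default)]
pub struct ArchiveOptions {
    pub all_load: bool,
    pub force_load: Vec<String>,
}

pub fn archive_load_mode(path: &str, opts: &ArchiveOptions) -> ArchiveLoad {
    // Assumption: -force_load arguments are compared by exact path string.
    let forced = opts.all_load || opts.force_load.iter().any(|p| p.as_str() == path);
    if forced {
        ArchiveLoad::AllMembers
    } else {
        ArchiveLoad::Lazy
    }
}

/// Walk archives in command-line order and record the mode for each one,
/// so the left-to-right order is preserved in the resulting plan.
pub fn plan_archives(link_order: &[&str], opts: &ArchiveOptions) -> Vec<(String, ArchiveLoad)> {
    link_order
        .iter()
        .map(|&p| (p.to_string(), archive_load_mode(p, opts)))
        .collect()
}
```

Computing the mode per archive, rather than in a separate pre-pass, keeps forced members flowing through the same ingestion path as lazily pulled ones.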

5. -undefined <treatment>

After the fixed point, any still-Undefined entry is classified by the -undefined setting:

  • error (default): hard error, cite every input that references the name (collected via the transition log).
  • warning: warn but emit, writing the symbol as address 0 (bind to nothing).
  • suppress: silent, address 0.
  • dynamic_lookup: flat-namespace DylibImport with ordinal BIND_SPECIAL_DYLIB_FLAT_LOOKUP=-2.
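The four treatments map naturally onto a small enum and a classification function. This is a minimal sketch: `UndefinedTreatment`, `Outcome`, `parse_treatment`, and `classify` are illustrative names, not the actual afs-ld API, and the referrer list is passed in directly rather than read from the transition log.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum UndefinedTreatment {
    Error,
    Warning,
    Suppress,
    DynamicLookup,
}

/// Special bind ordinal for flat-namespace lookup, as given in the list above.
pub const BIND_SPECIAL_DYLIB_FLAT_LOOKUP: i32 = -2;

/// Hypothetical result of classifying one still-undefined symbol.
#[derive(Debug, PartialEq)]
pub enum Outcome {
    HardError { referrers: Vec<String> },
    AddressZero { warn: bool },
    FlatNamespaceImport { ordinal: i32 },
}

/// Parse the argument of `-undefined`; unknown values yield `None`.
pub fn parse_treatment(arg: &str) -> Option<UndefinedTreatment> {
    match arg {
        "error" => Some(UndefinedTreatment::Error),
        "warning" => Some(UndefinedTreatment::Warning),
        "suppress" => Some(UndefinedTreatment::Suppress),
        "dynamic_lookup" => Some(UndefinedTreatment::DynamicLookup),
        _ => None,
    }
}

pub fn classify(treatment: UndefinedTreatment, referrers: &[String]) -> Outcome {
    match treatment {
        // Hard error: keep every referrer so the diagnostic can cite them all.
        UndefinedTreatment::Error => Outcome::HardError { referrers: referrers.to_vec() },
        UndefinedTreatment::Warning => Outcome::AddressZero { warn: true },
        UndefinedTreatment::Suppress => Outcome::AddressZero { warn: false },
        UndefinedTreatment::DynamicLookup => Outcome::FlatNamespaceImport {
            ordinal: BIND_SPECIAL_DYLIB_FLAT_LOOKUP,
        },
    }
}
```

Keeping `warning` and `suppress` as the same outcome with a flag reflects that they differ only in whether a message is printed.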

6. Weak references

A weak_ref to a missing symbol is always valid, regardless of the -undefined setting: it resolves to address 0 at bind time, and the runtime tests for null.
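A small sketch of how the weak-reference rule could gate the -undefined classification. The names are hypothetical, and the rule that a symbol counts as weak only when every reference to it is weak is an assumption of this sketch, not something the sprint text states.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum RefKind {
    Strong,
    Weak,
}

#[derive(Debug, PartialEq)]
pub enum MissingSymbolAction {
    /// Valid link: the symbol binds to address 0.
    BindToZero,
    /// Fall through to the -undefined treatment.
    ApplyUndefinedTreatment,
}

/// Decide what to do with a symbol that is still missing after resolution.
pub fn action_for_missing(refs: &[RefKind]) -> MissingSymbolAction {
    // Assumption: a single strong reference makes the whole symbol strong.
    let weak_only = !refs.is_empty() && refs.iter().all(|k| *k == RefKind::Weak);
    if weak_only {
        MissingSymbolAction::BindToZero
    } else {
        MissingSymbolAction::ApplyUndefinedTreatment
    }
}
```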

7. Diagnostics

Undefined errors must cite every referrer input, not just one. Output format:

afs-ld: error: undefined symbol: _afs_print
      referenced by program.o(__TEXT,__text + 0x34)
      referenced by runtime.o(__TEXT,__text + 0x120)
      (also via 2 relocations in libarmfortas_rt.a(io.o))
Hint: did you mean _afs_printf? (Levenshtein distance 1)

Did-you-mean uses a basic Levenshtein search over defined symbols and only suggests a match within edit distance 3.
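A minimal sketch of such a search, using the standard two-row dynamic-programming Levenshtein distance. The function names and the choice to return only the single closest candidate are assumptions; the threshold is passed in so the limit of 3 stays visible at the call site.

```rust
/// Classic Levenshtein edit distance over Unicode scalar values.
pub fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // prev[j] = distance between the first i chars of `a` and first j chars of `b`.
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.iter().enumerate() {
        let mut cur = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let cost = if ca == cb { 0 } else { 1 };
            let best = (prev[j] + cost) // substitute (or match)
                .min(prev[j + 1] + 1)   // delete from `a`
                .min(cur[j] + 1);       // insert into `a`
            cur.push(best);
        }
        prev = cur;
    }
    prev[b.len()]
}

/// Return the closest defined symbol within `max_distance` edits, if any.
pub fn did_you_mean<'a>(
    missing: &str,
    defined: &[&'a str],
    max_distance: usize,
) -> Option<&'a str> {
    defined
        .iter()
        .copied()
        .map(|candidate| (levenshtein(missing, candidate), candidate))
        .filter(|(distance, _)| *distance <= max_distance)
        .min_by_key(|(distance, _)| *distance)
        .map(|(_, candidate)| candidate)
}
```

A call such as `did_you_mean("_afs_prnt", &["_afs_print", "_afs_open"], 3)` would return `Some("_afs_print")`, while a candidate more than three edits away produces `None` and therefore no hint line.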

8. Diagnostics for duplicate strong

afs-ld: error: duplicate symbol _foo
  defined in: a.o (__TEXT,__text + 0x0)
  also in:    b.o (__TEXT,__text + 0x0)

No suggestion is offered: two strong definitions are a genuine ambiguity.
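Detection and rendering of this error can be sketched as follows. `Definition`, `DuplicateSymbol`, and `insert_strong` are hypothetical names, and a plain `HashMap` stands in for the real symbol table; only strong definitions are modeled.

```rust
use std::collections::HashMap;

/// Hypothetical record of where a strong definition lives.
#[derive(Debug, Clone, PartialEq)]
pub struct Definition {
    pub input: String,
    pub section: String,
    pub offset: u64,
}

#[derive(Debug, PartialEq)]
pub struct DuplicateSymbol {
    pub name: String,
    pub first: Definition,
    pub second: Definition,
}

impl DuplicateSymbol {
    /// Render the diagnostic in the format shown above.
    pub fn render(&self) -> String {
        format!(
            "afs-ld: error: duplicate symbol {}\n  defined in: {} ({} + {:#x})\n  also in:    {} ({} + {:#x})",
            self.name,
            self.first.input,
            self.first.section,
            self.first.offset,
            self.second.input,
            self.second.section,
            self.second.offset,
        )
    }
}

/// Insert a strong definition, reporting a conflict instead of overwriting.
pub fn insert_strong(
    table: &mut HashMap<String, Definition>,
    name: &str,
    def: Definition,
) -> Result<(), DuplicateSymbol> {
    if let Some(first) = table.get(name) {
        return Err(DuplicateSymbol {
            name: name.to_string(),
            first: first.clone(),
            second: def,
        });
    }
    table.insert(name.to_string(), def);
    Ok(())
}
```

Returning the first definition alongside the second is what lets the message name both inputs without a second table lookup.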

Testing Strategy

  • Resolution matrix revisited from Sprint 7, but with real archives and dylibs.
  • Order sensitivity: a.o b.a vs b.a a.o. The first resolves when a.o references a symbol in b.a; the second does not (matches ld's classic behavior).
  • -force_load pulls in a member whose symbols would otherwise go unreferenced.
  • -all_load across a multi-member archive.
  • Weak-import from a dylib that at runtime will be missing.
  • Did-you-mean fires on a close misspell, stays silent when the closest match is > 3 edits away.

Definition of Done

  • Fixed-point loop terminates on all corpus inputs.
  • Diagnostics match the format above, include every referrer, include did-you-mean suggestions.
  • Differential test against ld for order-dependent resolution on 10+ scenarios.
  • -force_load / -all_load / -undefined=* all pass dedicated tests.