Sprint 23: Dead Strip (-dead_strip)
Prerequisites
Sprint 9 — atomization; Sprint 19 — -dead_strip CLI flag.
Goals
Implement -dead_strip: remove atoms that are unreachable from the GC roots. Populates the side table that Sprint 19's -why_live diagnostic reads.
Deliverables
1. GC roots
Live set seeded from:
- The entry-point symbol's atom (executable only).
- Every exported symbol's atom (governed by
-exported_symbols_list/ defaults). - Every atom with
NoDeadStripflag (fromN_NO_DEAD_STRIPor.no_dead_strip). - Every atom referenced by a
LC_RPATH/LC_LOAD_DYLIBside-channel (usually none, but keep the hook). _dyld_stub_binder— always referenced by__stub_helper.- Compact-unwind and eh_frame atoms are not roots; they are transitively live via their
parent_oflink to a function atom (Sprint 9). - Personality functions referenced by any live unwind FDE.
2. Mark-live traversal
Worklist algorithm over the atom reference graph:
pub fn mark_live(layout: &mut Layout, roots: &[AtomId]) {
let mut worklist: Vec<AtomId> = roots.to_vec();
while let Some(atom_id) = worklist.pop() {
if layout.atoms[atom_id].live { continue; }
layout.atoms[atom_id].live = true;
record_why_live(atom_id, cause); // for -why_live diagnostic
for referent in atom_references(&layout.atoms[atom_id]) {
worklist.push(referent);
}
}
}
atom_references pulls from the reloc list of the atom, yielding the atom-id each reloc points to.
3. Transitive rules
- A live function makes its
__compact_unwindrecord live via theparent_oflink from Sprint 9. - A live function makes its
__eh_frameFDE live (FDE references function via SUBTRACTOR pair — the FDE's life is parasitic on the function's). - A live personality function is reached via the unwind records that reference it.
- LSDA blobs are live iff their owning function is.
4. Dead atoms purged
After mark-live, walk the atom list; atoms with live = false are removed from the output. Output sections shrink accordingly; Sprint 10's layout pass re-runs to compact addresses.
5. -why_live side table
Every time an atom is marked live, record its cause: "GC root (entry point)", "reachable from ", "reachable via unwind parent", etc. Stored as a HashMap<AtomId, LiveCause>. Sprint 19's -why_live walks back from a named symbol through this map to a root.
6. Interactions with ICF (Sprint 24)
Dead-strip runs before ICF. ICF folds only live atoms. Dead atoms never get folded.
7. Stripped-symbol enumeration
The -map output (Sprint 19) lists dead-stripped symbols. Populate this list from the purged atoms.
8. Default behavior
-dead_strip is opt-in. Without it, no GC runs, no -why_live data is produced (the diagnostic explains this).
Testing Strategy
- Fixture: two functions, one called by
_main, one unreferenced. With-dead_strip: output's__textshrinks,nm -nlists only the called function. -why_live _mainon the fixture: "root".-why_live _called_fn: "reachable from _main; _main is root".-why_live _unreferenced_fn: "not live (dead-stripped)".N_NO_DEAD_STRIPfixture: that symbol survives even without a reference.- Weak def reference: weak-coalesced winner survives, losers dead-stripped.
- Differential: output's
__textlength matchesld -dead_stripon 10+ fixtures.
Definition of Done
- Live set matches
ld -dead_stripon a corpus of 15+ fixtures. -why_liveproduces coherent chains.- Compact-unwind and eh_frame entries correctly follow their parent functions.
- fortsh builds under
-dead_strip(Sprint 29 retest).
View source
| 1 | # Sprint 23: Dead Strip (`-dead_strip`) |
| 2 | |
| 3 | ## Prerequisites |
| 4 | Sprint 9 — atomization; Sprint 19 — `-dead_strip` CLI flag. |
| 5 | |
| 6 | ## Goals |
| 7 | Implement `-dead_strip`: remove atoms that are unreachable from the GC roots. Populates the side table that Sprint 19's `-why_live` diagnostic reads. |
| 8 | |
| 9 | ## Deliverables |
| 10 | |
| 11 | ### 1. GC roots |
| 12 | Live set seeded from: |
| 13 | - The entry-point symbol's atom (executable only). |
| 14 | - Every exported symbol's atom (governed by `-exported_symbols_list` / defaults). |
| 15 | - Every atom with `NoDeadStrip` flag (from `N_NO_DEAD_STRIP` or `.no_dead_strip`). |
| 16 | - Every atom referenced by a `LC_RPATH` / `LC_LOAD_DYLIB` side-channel (usually none, but keep the hook). |
| 17 | - `_dyld_stub_binder` — always referenced by `__stub_helper`. |
| 18 | - Compact-unwind and eh_frame atoms are **not** roots; they are transitively live via their `parent_of` link to a function atom (Sprint 9). |
| 19 | - Personality functions referenced by any live unwind FDE. |
| 20 | |
| 21 | ### 2. Mark-live traversal |
| 22 | Worklist algorithm over the atom reference graph: |
| 23 | |
| 24 | ```rust |
| 25 | pub fn mark_live(layout: &mut Layout, roots: &[AtomId]) { |
| 26 | let mut worklist: Vec<AtomId> = roots.to_vec(); |
| 27 | while let Some(atom_id) = worklist.pop() { |
| 28 | if layout.atoms[atom_id].live { continue; } |
| 29 | layout.atoms[atom_id].live = true; |
| 30 | record_why_live(atom_id, cause); // for -why_live diagnostic |
| 31 | for referent in atom_references(&layout.atoms[atom_id]) { |
| 32 | worklist.push(referent); |
| 33 | } |
| 34 | } |
| 35 | } |
| 36 | ``` |
| 37 | |
| 38 | `atom_references` pulls from the reloc list of the atom, yielding the atom-id each reloc points to. |
| 39 | |
| 40 | ### 3. Transitive rules |
| 41 | - A live function makes its `__compact_unwind` record live via the `parent_of` link from Sprint 9. |
| 42 | - A live function makes its `__eh_frame` FDE live (FDE references function via SUBTRACTOR pair — the FDE's life is parasitic on the function's). |
| 43 | - A live personality function is reached via the unwind records that reference it. |
| 44 | - LSDA blobs are live iff their owning function is. |
| 45 | |
| 46 | ### 4. Dead atoms purged |
| 47 | After mark-live, walk the atom list; atoms with `live = false` are removed from the output. Output sections shrink accordingly; Sprint 10's layout pass re-runs to compact addresses. |
| 48 | |
| 49 | ### 5. `-why_live` side table |
| 50 | Every time an atom is marked live, record its cause: "GC root (entry point)", "reachable from <other atom>", "reachable via unwind parent", etc. Stored as a `HashMap<AtomId, LiveCause>`. Sprint 19's `-why_live` walks back from a named symbol through this map to a root. |
| 51 | |
| 52 | ### 6. Interactions with ICF (Sprint 24) |
| 53 | Dead-strip runs before ICF. ICF folds only live atoms. Dead atoms never get folded. |
| 54 | |
| 55 | ### 7. Stripped-symbol enumeration |
| 56 | The `-map` output (Sprint 19) lists dead-stripped symbols. Populate this list from the purged atoms. |
| 57 | |
| 58 | ### 8. Default behavior |
| 59 | `-dead_strip` is opt-in. Without it, no GC runs, no `-why_live` data is produced (the diagnostic explains this). |
| 60 | |
| 61 | ## Testing Strategy |
| 62 | - Fixture: two functions, one called by `_main`, one unreferenced. With `-dead_strip`: output's `__text` shrinks, `nm -n` lists only the called function. |
| 63 | - `-why_live _main` on the fixture: "root". |
| 64 | - `-why_live _called_fn`: "reachable from _main; _main is root". |
| 65 | - `-why_live _unreferenced_fn`: "not live (dead-stripped)". |
| 66 | - `N_NO_DEAD_STRIP` fixture: that symbol survives even without a reference. |
| 67 | - Weak def reference: weak-coalesced winner survives, losers dead-stripped. |
| 68 | - Differential: output's `__text` length matches `ld -dead_strip` on 10+ fixtures. |
| 69 | |
| 70 | ## Definition of Done |
| 71 | - Live set matches `ld -dead_strip` on a corpus of 15+ fixtures. |
| 72 | - `-why_live` produces coherent chains. |
| 73 | - Compact-unwind and eh_frame entries correctly follow their parent functions. |
| 74 | - fortsh builds under `-dead_strip` (Sprint 29 retest). |