Sprint 24: ICF (-icf=safe)
Prerequisites
Sprints 9, 23 — atoms, dead-strip done.
Goals
Identical Code Folding: merge atoms whose content, relocations, and attributes are identical, so one copy survives. Safe mode only — respects address-taken symbols so function-pointer equality is preserved.
Deliverables
1. Safety rules
Atom is foldable iff:
- It lives in
__TEXT,__text,__TEXT,__cstring,__TEXT,__literal16,__TEXT,__const, or__DATA_CONST,__const. - It has no
NoDeadStripflag. - It has no
AddressTakenannotation (see below). - It is not the primary symbol of its atom under
-exported_symbols_list.
2. Address-taken detection
During reloc application (Sprint 11 + this sprint's prep pass), mark any atom referenced via Unsigned or PointerToGot or any GOT_LOAD_* as AddressTaken. Function pointers, vtable slots, RTTI entries, anything comparable via == — all take the address, so they must not be folded.
Only Branch26 references are considered "fold-safe" on their own; direct calls don't create address equality.
3. Segregation algorithm
Two-phase refinement (lld/MachO style):
- Initial hash: for each foldable atom, compute a 64-bit hash over (size, flags, content bytes, reloc list normalized to (kind, addend, target class)).
- Bucket: group atoms by hash. Single-element buckets are unique and not foldable.
- Refine: within each bucket, check pairwise equality of content + relocs. Use equivalence classes — two atoms equivalent iff every referenced atom is equivalent (fixed-point refinement, à la Hopcroft's DFA minimization).
- Repeat step 3 until no class splits. Converges in log(N) iterations in practice.
- Fold: within each final equivalence class, pick a winner by input command-line order (stable, deterministic); rewrite every other atom's
owner_symbolredirect to the winner; erase the loser atoms.
4. Reloc patching
After folding, some relocations now point at folded-away atoms. Rewrite every such reloc to target the winner. Layout recomputed; sizes shrink.
5. -icf=none path
Default off. -icf=safe enables. -icf=all (unsafe, doesn't respect AddressTaken) is not implemented — emits a diagnostic.
6. Interaction with -map and -why_live
-mapreports folded symbols with their winner:_foo folded to _bar.-why_liveof a folded symbol reports the winner's live-chain.
7. Determinism
Winner selection by command-line order is stable across invocations. Hash function is seeded with a fixed seed — same inputs, same winners, same output bytes.
Testing Strategy
- Fixture: two functions with byte-identical code and relocs.
-icf=safefolds them into one;nmshows only one entry, both symbol aliases point at the same address. - Address-taken fixture: two identical functions, one has its address stored in a table. Fold does not happen for the address-taken one; both survive.
- String dedup:
__cstringatoms with identical content folded. - Large-scale: 50+ near-identical functions, fold correctness verified by running each alias.
- Differential: folded output's
__textsize matchesld -icf=safewithin 1%.
Definition of Done
-icf=safereduces text size on a constructed fixture.- Address-taken functions never folded.
- All aliases point to the folded winner at runtime.
- fortsh builds with
-icf=safeand behaves identically to its-icf=nonecounterpart.
View source
| 1 | # Sprint 24: ICF (`-icf=safe`) |
| 2 | |
| 3 | ## Prerequisites |
| 4 | Sprints 9, 23 — atoms, dead-strip done. |
| 5 | |
| 6 | ## Goals |
| 7 | Identical Code Folding: merge atoms whose content, relocations, and attributes are identical, so one copy survives. Safe mode only — respects address-taken symbols so function-pointer equality is preserved. |
| 8 | |
| 9 | ## Deliverables |
| 10 | |
| 11 | ### 1. Safety rules |
| 12 | |
| 13 | Atom is **foldable** iff: |
| 14 | - It lives in `__TEXT,__text`, `__TEXT,__cstring`, `__TEXT,__literal16`, `__TEXT,__const`, or `__DATA_CONST,__const`. |
| 15 | - It has no `NoDeadStrip` flag. |
| 16 | - It has no `AddressTaken` annotation (see below). |
| 17 | - It is not the primary symbol of its atom under `-exported_symbols_list`. |
| 18 | |
| 19 | ### 2. Address-taken detection |
| 20 | |
| 21 | During reloc application (Sprint 11 + this sprint's prep pass), mark any atom referenced via `Unsigned` or `PointerToGot` or any `GOT_LOAD_*` as `AddressTaken`. Function pointers, vtable slots, RTTI entries, anything comparable via `==` — all take the address, so they must not be folded. |
| 22 | |
| 23 | Only `Branch26` references are considered "fold-safe" on their own; direct calls don't create address equality. |
| 24 | |
| 25 | ### 3. Segregation algorithm |
| 26 | |
| 27 | Two-phase refinement (lld/MachO style): |
| 28 | |
| 29 | 1. **Initial hash**: for each foldable atom, compute a 64-bit hash over (size, flags, content bytes, reloc list normalized to (kind, addend, target class)). |
| 30 | 2. **Bucket**: group atoms by hash. Single-element buckets are unique and not foldable. |
| 31 | 3. **Refine**: within each bucket, check pairwise equality of content + relocs. Use equivalence classes — two atoms equivalent iff every referenced atom is equivalent (fixed-point refinement, à la Hopcroft's DFA minimization). |
| 32 | 4. **Repeat step 3 until no class splits**. Converges in log(N) iterations in practice. |
| 33 | 5. **Fold**: within each final equivalence class, pick a winner by input command-line order (stable, deterministic); rewrite every other atom's `owner_symbol` redirect to the winner; erase the loser atoms. |
| 34 | |
| 35 | ### 4. Reloc patching |
| 36 | |
| 37 | After folding, some relocations now point at folded-away atoms. Rewrite every such reloc to target the winner. Layout recomputed; sizes shrink. |
| 38 | |
| 39 | ### 5. `-icf=none` path |
| 40 | |
| 41 | Default off. `-icf=safe` enables. `-icf=all` (unsafe, doesn't respect AddressTaken) is not implemented — emits a diagnostic. |
| 42 | |
| 43 | ### 6. Interaction with `-map` and `-why_live` |
| 44 | |
| 45 | - `-map` reports folded symbols with their winner: `_foo folded to _bar`. |
| 46 | - `-why_live` of a folded symbol reports the winner's live-chain. |
| 47 | |
| 48 | ### 7. Determinism |
| 49 | |
| 50 | Winner selection by command-line order is stable across invocations. Hash function is seeded with a fixed seed — same inputs, same winners, same output bytes. |
| 51 | |
| 52 | ## Testing Strategy |
| 53 | |
| 54 | - Fixture: two functions with byte-identical code and relocs. `-icf=safe` folds them into one; `nm` shows only one entry, both symbol aliases point at the same address. |
| 55 | - Address-taken fixture: two identical functions, one has its address stored in a table. Fold does not happen for the address-taken one; both survive. |
| 56 | - String dedup: `__cstring` atoms with identical content folded. |
| 57 | - Large-scale: 50+ near-identical functions, fold correctness verified by running each alias. |
| 58 | - Differential: folded output's `__text` size matches `ld -icf=safe` within 1%. |
| 59 | |
| 60 | ## Definition of Done |
| 61 | |
| 62 | - `-icf=safe` reduces text size on a constructed fixture. |
| 63 | - Address-taken functions never folded. |
| 64 | - All aliases point to the folded winner at runtime. |
| 65 | - fortsh builds with `-icf=safe` and behaves identically to its `-icf=none` counterpart. |