tenseleyflow/shithub / 5dd17bb


S36: docs — caching invariants + active/planned cache table

Authored by espadonne
SHA
5dd17bb2bc91ca4a4963ebb9ea686beb64d3c084
Parents
ad13327
Tree
9ac1a42

1 changed file

Status | File | + | -
A | docs/internal/caching.md | 87 | 0

docs/internal/caching.md (added)
@@ -0,0 +1,87 @@
1
+# Caching
2
+
3
+The S36 perf-pass standardises on an in-process LRU
4
+(`internal/cache/lru`) with optional TTL and a single-flight
5
+wrapper for hot-key dogpile prevention. This doc tracks every
6
+cross-request cache and its invalidation contract.
7
+
8
+The invariant is: **every cached value has a documented
9
+invalidation trigger**. If you can't name the trigger, the cache
10
+is a bug factory.
11
+
12
+## Active caches
13
+
14
+| Cache | Key | Value | Invalidator | Bound |
15
+|---|---|---|---|---|
16
+| `repos/git.AheadBehindCached` | (repo_id, base_oid, head_oid) | (ahead, behind) | OID change ⇒ different key; LRU eviction | 4096 entries |
17
+
18
+Concrete uses:
19
+- `branchesList` (S20 deferral H4) — replaces N `git rev-list`
20
+  invocations per page load with one cached lookup per branch.
21
+  Single-flight collapses concurrent misses on hot branches.
22
+
23
+## Planned caches (next iterations)
24
+
25
+These are listed in the S36 spec; each lands once the surface it
26
+backs grows large enough for benchmarks to justify the cache.
27
+
28
+| Cache | Key | Value | Invalidator |
29
+|---|---|---|---|
30
+| Tree at root | (repo_id, ref_oid) | rendered ls-tree result | push:process bumps default-OID |
31
+| Ref list | (repo_id) | branches + tags | push:process |
32
+| File list (finder) | (repo_id, ref_oid) | flat path slice | push:process |
33
+| Default-branch OID | (repo_id) | OID string | push:process + default-branch swap |
34
+| Markdown render | (markdown_pipeline_version, body_hash) | rendered HTML | bump pipeline version on goldmark/policy change |
35
+| Effective-team-set | (actor, org) | team-id slice | team-membership change |
36
+
37
+## Single-flight: when to wrap
38
+
39
+Wrap with `lru.Group` whenever:
40
+
41
+1. The upstream is non-trivial (subprocess, FS walk, multi-row DB read), AND
42
+2. The key is hot (one popular repo, one busy user), AND
43
+3. Concurrent misses are realistic (HTTP burst, worker fanout).
44
+
45
+Without single-flight, a cache miss under load triggers a stampede
46
+that defeats the cache's purpose. The `lru.Group` wrapper collapses
47
+N concurrent misses into one upstream call.
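
The collapse described above can be sketched as follows. This is a minimal illustration, not the `lru.Group` implementation: the `Group` type here is a hypothetical stand-in backed by a plain map (no LRU eviction, no TTL), and the `Do(key, load)` signature is an assumption chosen for the example.

```go
package main

import "sync"

// call tracks one in-flight upstream load that other goroutines can wait on.
type call struct {
	wg  sync.WaitGroup
	val int
	err error
}

// Group is a hypothetical stand-in for lru.Group: a mutex-guarded map of
// cached values plus a table of loads that are currently running.
type Group struct {
	mu       sync.Mutex
	values   map[string]int
	inflight map[string]*call
}

func NewGroup() *Group {
	return &Group{values: map[string]int{}, inflight: map[string]*call{}}
}

// Do returns the cached value for key. On a miss it runs load once, and any
// goroutine that misses on the same key while load is running waits for that
// result instead of starting its own upstream call.
func (g *Group) Do(key string, load func() (int, error)) (int, error) {
	g.mu.Lock()
	if v, ok := g.values[key]; ok {
		g.mu.Unlock()
		return v, nil
	}
	if c, ok := g.inflight[key]; ok {
		g.mu.Unlock()
		c.wg.Wait() // another goroutine is already loading this key
		return c.val, c.err
	}
	c := &call{}
	c.wg.Add(1)
	g.inflight[key] = c
	g.mu.Unlock()

	c.val, c.err = load() // the single upstream call, made without the lock

	g.mu.Lock()
	if c.err == nil {
		g.values[key] = c.val // only successful results are stored
	}
	delete(g.inflight, key)
	g.mu.Unlock()
	c.wg.Done()
	return c.val, c.err
}
```

The upstream call runs outside the mutex so that lookups for other keys are not blocked behind a slow subprocess or DB read.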
48
+
49
+## Errors are NOT cached
50
+
51
+`lru.Group.Do` deliberately returns errors without caching them. A
52
+transient upstream failure (DB blip, git fork EAGAIN) shouldn't
53
+poison subsequent reads. Negative caching (caching the absence of
54
+a key) is a separate concern; callers add their own sentinel value
55
+when they want it.
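
One way a caller might layer negative caching on top, sketched under assumptions: `ErrNotFound`, `ErrTransient`, the `lookup` type, and `NegativeCache` are all hypothetical names invented for this example, and the map-backed cache is not goroutine-safe. The point is only that "known absent" is stored as an ordinary value with a sentinel flag, while transient errors pass through unstored.

```go
package main

import "errors"

// Hypothetical upstream errors, defined here only for illustration.
var (
	ErrNotFound  = errors.New("not found")
	ErrTransient = errors.New("transient upstream failure")
)

// lookup is the cached value. Found == false is the sentinel meaning
// "the upstream confirmed this key does not exist".
type lookup struct {
	OID   string
	Found bool
}

// NegativeCache is a hypothetical caller-side cache (not goroutine-safe).
type NegativeCache struct {
	entries map[string]lookup
}

func NewNegativeCache() *NegativeCache {
	return &NegativeCache{entries: map[string]lookup{}}
}

// Get caches both hits and confirmed absences, but never transient errors.
func (c *NegativeCache) Get(key string, load func() (string, error)) (lookup, error) {
	if e, ok := c.entries[key]; ok {
		return e, nil
	}
	oid, err := load()
	switch {
	case errors.Is(err, ErrNotFound):
		e := lookup{Found: false} // absence is a value, so it is cached
		c.entries[key] = e
		return e, nil
	case err != nil:
		return lookup{}, err // transient failure: nothing is stored
	}
	e := lookup{OID: oid, Found: true}
	c.entries[key] = e
	return e, nil
}
```

A cached absence needs an invalidation trigger like any other entry, for example the push that creates the missing ref.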
56
+
57
+## TTL: when to use one
58
+
59
+Default to no-TTL with explicit invalidation. Use TTL only when:
60
+
61
+- The data is fully public and anonymous-cacheable (rendered HTML for
62
+  a public repo's README).
63
+- Staleness has a measured, low impact (≤ 60s for hot reads).
64
+- An explicit invalidator is wired in addition (TTL is the safety
65
+  net, not the primary correctness mechanism).
66
+
67
+Avoid TTL on personalized content. The "you see your friend's old
68
+comment for 30s" UX surprise is not worth the cache hit-rate.
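
A minimal sketch of TTL as a safety net alongside explicit invalidation. Everything here is assumed for illustration: `TTLCache`, `FakeClock`, and the `ReadmeTTL` constant are hypothetical names, the clock is injected so expiry can be exercised without sleeping, and the cache is not goroutine-safe.

```go
package main

import "time"

// ReadmeTTL is a hypothetical safety-net TTL matching the 60s bound above.
const ReadmeTTL = 60 * time.Second

// FakeClock is a manually advanced clock, used so expiry is deterministic.
type FakeClock struct{ t time.Time }

func (f *FakeClock) Now() time.Time          { return f.t }
func (f *FakeClock) Advance(d time.Duration) { f.t = f.t.Add(d) }

type entry struct {
	val       string
	expiresAt time.Time // zero value means "no TTL"
}

// TTLCache is a hypothetical map-backed cache with optional per-entry TTL.
type TTLCache struct {
	now     func() time.Time
	entries map[string]entry
}

func NewTTLCache(now func() time.Time) *TTLCache {
	return &TTLCache{now: now, entries: map[string]entry{}}
}

// Set stores val; a ttl of zero or less means the entry never expires.
func (c *TTLCache) Set(key, val string, ttl time.Duration) {
	e := entry{val: val}
	if ttl > 0 {
		e.expiresAt = c.now().Add(ttl)
	}
	c.entries[key] = e
}

// Get treats an entry as a miss once the clock reaches its expiry time.
func (c *TTLCache) Get(key string) (string, bool) {
	e, ok := c.entries[key]
	if !ok {
		return "", false
	}
	if !e.expiresAt.IsZero() && !c.now().Before(e.expiresAt) {
		delete(c.entries, key)
		return "", false
	}
	return e.val, true
}

// Invalidate is the primary correctness mechanism; TTL only bounds the
// damage if an invalidation is missed.
func (c *TTLCache) Invalidate(key string) { delete(c.entries, key) }
```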
69
+
70
+## Reading hit-rates
71
+
72
+Every cache exposes `Stats() lru.Stats{Hits, Misses, Evictions}`.
73
+The `/metrics` surface (S37 deploy) will scrape these. The CI baseline
74
+asserts a hit-rate above a per-cache target on the bench run.
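
For reference, the hit-rate check might be computed like this. The `Stats` struct below mirrors the three field names given above, but the `uint64` field types and the `HitRate` and `MeetsTarget` helpers are assumptions made for this sketch.

```go
package main

// Stats mirrors the field names of lru.Stats; the types are assumed.
type Stats struct {
	Hits      uint64
	Misses    uint64
	Evictions uint64
}

// HitRate returns hits / (hits + misses), or 0 when the cache has seen
// no lookups yet (avoids dividing by zero on a cold cache).
func HitRate(s Stats) float64 {
	total := s.Hits + s.Misses
	if total == 0 {
		return 0
	}
	return float64(s.Hits) / float64(total)
}

// MeetsTarget reports whether the observed hit-rate reaches the target,
// which is expressed as a fraction between 0 and 1.
func MeetsTarget(s Stats, target float64) bool {
	return HitRate(s) >= target
}
```

Evictions do not enter the ratio, but a high eviction count next to a low hit-rate suggests the bound is too small for the working set.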
75
+
76
+## Invalidation patterns
77
+
78
+The push:process worker is the canonical invalidation source for
79
+git-shaped caches. After updating refs:
80
+
81
+```go
82
+git.InvalidateAheadBehind(git.AheadBehindKey{...})
83
+// + future: tree, refs, default-branch caches
84
+```
85
+
86
+The (repo_id, ...) key shape lets us scope invalidation to one
87
+repo's slice without scanning the whole cache.
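
One way a repo-leading key can make per-repo invalidation cheap is a two-level map, sketched below. The field names on `AheadBehindKey`, the `RepoScopedCache` type, and `InvalidateRepo` are hypothetical; the real `git.AheadBehindKey` and its storage layout may differ.

```go
package main

// AheadBehindKey is a hypothetical key shaped like (repo_id, base_oid, head_oid).
type AheadBehindKey struct {
	RepoID  int64
	BaseOID string
	HeadOID string
}

// AheadBehind is the cached (ahead, behind) pair.
type AheadBehind struct {
	Ahead  int
	Behind int
}

// RepoScopedCache groups entries by repo so that one repo's entries can be
// dropped without visiting any other repo's entries. Not goroutine-safe.
type RepoScopedCache struct {
	byRepo map[int64]map[AheadBehindKey]AheadBehind
}

func NewRepoScopedCache() *RepoScopedCache {
	return &RepoScopedCache{byRepo: map[int64]map[AheadBehindKey]AheadBehind{}}
}

func (c *RepoScopedCache) Put(k AheadBehindKey, v AheadBehind) {
	inner, ok := c.byRepo[k.RepoID]
	if !ok {
		inner = map[AheadBehindKey]AheadBehind{}
		c.byRepo[k.RepoID] = inner
	}
	inner[k] = v
}

func (c *RepoScopedCache) Get(k AheadBehindKey) (AheadBehind, bool) {
	v, ok := c.byRepo[k.RepoID][k] // reading a nil inner map is safe in Go
	return v, ok
}

// InvalidateRepo drops every entry for repoID and returns how many it removed.
func (c *RepoScopedCache) InvalidateRepo(repoID int64) int {
	n := len(c.byRepo[repoID])
	delete(c.byRepo, repoID)
	return n
}
```

Because the OIDs are part of the key, a push that moves a branch produces a different key on the next lookup, so per-repo invalidation mainly reclaims space held by entries that can no longer be hit.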