@@ -0,0 +1,110 @@ |
| 1 | +# DB indexes — catalog |
| 2 | + |
| 3 | +Per-domain catalog of every non-PK index, the query it backs, and the |
| 4 | +rationale. Each new index needs an entry here AND a sentence in its |
| 5 | +migration file justifying the write-throughput cost. Reviewed on |
| 6 | +every perf-pass sprint (S36 baseline; revisited on S39 beta hardening). |
| 7 | + |
| 8 | +The columns are: **Index**, **Covers query**, **Selectivity**, **Cost |
| 9 | +notes**. Selectivity is rough — "high" means typically returns ≤ 100 |
| 10 | +rows for a 1M-row table; "medium" 1–10k; "low" most-of-the-table. |
| 11 | + |
| 12 | +## Identity (`users`, `user_emails`, `user_*`) |
| 13 | + |
| 14 | +| Index | Covers query | Selectivity | Cost notes | |
| 15 | +|---|---|---|---| |
| 16 | +| `users.username_uq` (unique) | login + profile lookup by username | high | implied by UNIQUE; PK-class | |
| 17 | +| `users_deleted_at_idx` (partial WHERE deleted_at IS NOT NULL) | restore listing + admin "deleted" filter | high | partial keeps it tiny | |
| 18 | +| `users_suspended_at_idx` (partial WHERE suspended_at IS NOT NULL) | admin "suspended" filter | high | partial | |
| 19 | +| `users_site_admin_idx` (partial WHERE is_site_admin = true) | admin elevation lookups | high | partial; few rows | |
| 20 | +| `user_emails (user_id)` | enumerate a user's emails | high | per-user | |
| 21 | +| `user_emails_address_uq` (unique) | reverse lookup on signup-collision check | high | UNIQUE | |
| 22 | + |
| 23 | +## Repos |
| 24 | + |
| 25 | +| Index | Covers query | Selectivity | Cost notes | |
| 26 | +|---|---|---|---| |
| 27 | +| `repos (owner_user_id, name)` PK-class | per-owner repo lookup | high | composite | |
| 28 | +| `repos_owner_org_name_uq` | org-owner repo lookup | high | UNIQUE | |
| 29 | +| `repo_collaborators_user_id_idx` | "all my repos" enumeration | medium | parallel index to PK | |
| 30 | +| `repo_collaborators_repo_id_idx` | settings access tab | high | per-repo | |
| 31 | +| `repo_topics_topic_idx` | "repos with topic X" | medium | post-MVP topic filter | |
| 32 | + |
| 33 | +## Issues / pulls |
| 34 | + |
| 35 | +| Index | Covers query | Selectivity | Cost notes | |
| 36 | +|---|---|---|---| |
| 37 | +| `issues (repo_id, number)` PK | issue-by-number | high | PK | |
| 38 | +| `issues_repo_state_updated_idx (repo_id, state, updated_at DESC)` | issues list with state filter (S21) | medium | the dominant list-page index | |
| 39 | +| `issues_repo_kind_state_idx (repo_id, kind, state)` | PR-vs-issue split | medium | composite | |
| 40 | +| `issues_author_idx (author_user_id)` | "issues authored by me" | medium | per-user-author | |
| 41 | +| `issues_search_*` | FTS lookup | medium | GIN tsv index | |
| 42 | + |
| 43 | +## Jobs queue |
| 44 | + |
| 45 | +| Index | Covers query | Selectivity | Cost notes | |
| 46 | +|---|---|---|---| |
| 47 | +| `jobs_dispatch_idx (kind, run_at) WHERE completed_at IS NULL AND failed_at IS NULL` | worker claim — hot path | high | partial; only dispatchable rows | |
| 48 | +| `jobs_locked_idx (locked_by, locked_at)` | stuck-claim sweeper | high | rare scan | |
| 49 | + |
| 50 | +## Notifications + domain events |
| 51 | + |
| 52 | +| Index | Covers query | Selectivity | Cost notes | |
| 53 | +|---|---|---|---| |
| 54 | +| `notifications_recipient_recent_idx (recipient_user_id, last_event_at DESC)` | inbox first page | high | per-user, recency-sorted | |
| 55 | +| `notifications_recipient_unread_idx` | unread badge | high | partial | |
| 56 | +| `notifications_thread_coalesce_idx` (UNIQUE) | dedup thread rows on fanout | high | UNIQUE | |
| 57 | +| `domain_events_created_at_idx (created_at)` | fanout cursor scan | low | covers polling | |
| 58 | +| `domain_events_repo_created_idx (repo_id, created_at DESC) WHERE repo_id IS NOT NULL` | per-repo activity feed | medium | partial | |
| 59 | +| `domain_events_actor_created_idx (actor_user_id, created_at DESC) WHERE actor_user_id IS NOT NULL` | per-user activity feed | medium | partial | |
| 60 | +| `domain_events_public_created_idx (created_at DESC) WHERE public = true` | public feed | medium | partial | |
| 61 | + |
| 62 | +## Check runs |
| 63 | + |
| 64 | +| Index | Covers query | Selectivity | Cost notes | |
| 65 | +|---|---|---|---| |
| 66 | +| `check_runs_repo_head_idx (repo_id, head_sha)` | required-check eval per PR | high | composite | |
| 67 | +| `check_runs_required_lookup_idx (repo_id, head_sha, name)` | named-check exact lookup | high | composite | |
| 68 | +| `check_runs_external_id_idx` (UNIQUE) | upsert-by-(name+external_id) | high | UNIQUE | |
| 69 | + |
| 70 | +## Auth audit log |
| 71 | + |
| 72 | +| Index | Covers query | Selectivity | Cost notes | |
| 73 | +|---|---|---|---| |
| 74 | +| `auth_audit_log_actor_id_idx` | "what did actor X do" | medium | per-actor | |
| 75 | +| `auth_audit_log_target_idx (target_type, target_id)` | "what happened to target Y" | medium | composite | |
| 76 | +| `auth_audit_log_action_idx` | filter by action prefix | medium | per-action (admin viewer) | |
| 77 | +| `auth_audit_log_created_at_idx (created_at DESC)` | recency scan | low | dominant index for the admin viewer | |
| 78 | + |
| 79 | +## Webhooks (S33) |
| 80 | + |
| 81 | +| Index | Covers query | Selectivity | Cost notes | |
| 82 | +|---|---|---|---| |
| 83 | +| `webhooks_owner_idx (owner_kind, owner_id)` | settings list | high | composite | |
| 84 | +| `webhooks_active_idx (owner_kind, owner_id) WHERE active = true AND disabled_at IS NULL` | fanout subscriber lookup | high | partial; the hot path | |
| 85 | +| `webhook_deliveries_pending_due_idx (next_retry_at) WHERE status IN ('pending','failed_retry')` | deliverer claim | high | partial | |
| 86 | +| `webhook_deliveries_webhook_started_idx (webhook_id, started_at DESC)` | per-webhook delivery view | medium | composite | |
| 87 | + |
| 88 | +## Rate-limit (S35) |
| 89 | + |
| 90 | +| Index | Covers query | Selectivity | Cost notes | |
| 91 | +|---|---|---|---| |
| 92 | +| `rate_limits` PK `(scope, key)` | per-scope-key bump | high | UPSERT hot path | |
| 93 | +| `rate_limits_window_started_idx (window_started_at)` | periodic prune | low | scan-friendly | |
| 94 | +| `signup_ip_throttle` PK `(cidr)` | per-/24 lookup | high | UPSERT | |
| 95 | +| `signup_ip_throttle_window_started_idx (window_started_at)` | periodic prune | low | scan-friendly | |
| 96 | + |
| 97 | +## Future considerations (deferred) |
| 98 | + |
| 99 | +- **`pg_stat_statements` extension.** S37's deploy doc owns the |
| 100 | + install. Once the suite runs against a populated dataset we'll |
| 101 | + capture top-N slow queries here and decide on additional |
| 102 | + indexes. |
| 103 | +- **`gin_trgm_ops` indexes** for partial-match search on |
| 104 | + usernames / repo names. The existing FTS covers most needs; |
| 105 | + trigram lookup pays off only if profiling shows an LIKE-prefix |
| 106 | + hot path. |
| 107 | +- **Covering indexes for the issue list.** `INCLUDE` columns on |
| 108 | + `issues_repo_state_updated_idx` could turn the page into an |
| 109 | + index-only scan. Defer until the scan proves expensive on the |
| 110 | + 100k-issue fixture. |