tenseleyflow/shithub / 4f709dd

Browse files

S14 followup: document deferred hook DB role split (lands in S37)

Authored by mfwolffe <wolffemf@dukes.jmu.edu>
SHA
4f709dd65eac00418e441086239a5fc326458e81
Parents
11f0fda
Tree
e49c9ff

2 changed files

StatusFile+-
A docs/internal/db-roles.md 109 0
M docs/internal/hooks.md 9 0
docs/internal/db-roles.mdadded
@@ -0,0 +1,109 @@
1
+# Postgres roles
2
+
3
+shithub uses a single Postgres role today (`shithub`) for everything —
4
+the web server, the SSH dispatcher, the worker, and the git hooks all
5
+connect with the same credentials. That's deliberate for development
6
+ergonomics; production hardening (S37) splits roles along blast-radius
7
+lines so an exploit in one surface can't read or mutate the rest of
8
+the database.
9
+
10
+This doc captures the **target end state** so S37 doesn't have to
11
+re-derive it. None of these roles exist yet in dev or CI.
12
+
13
+## Role plan
14
+
15
+| Role             | Used by                              | Grants                                                                                                  |
16
+| ---------------- | ------------------------------------ | ------------------------------------------------------------------------------------------------------- |
17
+| `shithub`        | web server, worker, admin tooling    | Full owner of all tables (current behavior)                                                             |
18
+| `shithub_hook`   | `shithubd hook pre-receive` / `post-receive` (S14) | `SELECT` on `users`, `repos`. `INSERT` on `push_events`, `jobs`, `webhook_events_pending`. Nothing else. |
19
+| `shithub_akc`    | `shithubd ssh-authkeys` (S07/S13)    | `SELECT` on `user_ssh_keys`, `users`. `UPDATE` on `user_ssh_keys.last_used_at` (and that column only).  |
20
+| `shithub_ro`     | future read-only replica or analytics | `SELECT` on every table, no write access at all                                                         |
21
+
22
+The hook split is the highest-value because hooks run inside the push
23
+process — fewer guardrails between a compromised git push and DB writes
24
+than the web layer (which goes through CSRF + middleware + handlers).
25
+The AKC split mirrors the contract from S07: that path only needs to
26
+authenticate keys; if it can't write to repos or push_events the blast
27
+radius is "log spam at worst."
28
+
29
+## SQL recipe (S37 will turn this into a migration or Ansible task)
30
+
31
+```sql
32
+-- ─── shithub_hook ───────────────────────────────────────────────
33
+CREATE ROLE shithub_hook LOGIN PASSWORD :'shithub_hook_password';
34
+
35
+-- Reads (re-checked from DB so env staleness can't authorize a push):
36
+GRANT SELECT (id, suspended_at, deleted_at)         ON users  TO shithub_hook;
37
+GRANT SELECT (id, is_archived, deleted_at, default_branch) ON repos TO shithub_hook;
38
+
39
+-- Writes (post-receive's exact INSERT surface):
40
+GRANT INSERT                                  ON push_events           TO shithub_hook;
41
+GRANT INSERT, SELECT, USAGE                   ON jobs                  TO shithub_hook;
42
+GRANT INSERT                                  ON webhook_events_pending TO shithub_hook;
43
+GRANT USAGE, SELECT                           ON SEQUENCE push_events_id_seq           TO shithub_hook;
44
+GRANT USAGE, SELECT                           ON SEQUENCE jobs_id_seq                  TO shithub_hook;
45
+GRANT USAGE, SELECT                           ON SEQUENCE webhook_events_pending_id_seq TO shithub_hook;
46
+
47
+-- Connect & schema:
48
+GRANT CONNECT ON DATABASE shithub TO shithub_hook;
49
+GRANT USAGE   ON SCHEMA   public  TO shithub_hook;
50
+
51
+-- Notify channel: pg_notify needs no extra grant (it's a function call).
52
+
53
+-- Explicitly REVOKE the public defaults if any were inherited:
54
+REVOKE ALL ON ALL TABLES    IN SCHEMA public FROM shithub_hook;
55
+REVOKE ALL ON ALL SEQUENCES IN SCHEMA public FROM shithub_hook;
56
+-- Then re-apply the precise grants above. (Order: REVOKE first when
57
+-- migrating an existing role; on a fresh CREATE ROLE the explicit
58
+-- grants are already authoritative.)
59
+
60
+
61
+-- ─── shithub_akc ────────────────────────────────────────────────
62
+CREATE ROLE shithub_akc LOGIN PASSWORD :'shithub_akc_password';
63
+
64
+GRANT SELECT (id, fingerprint, public_key, user_id, expires_at, revoked_at)
65
+                      ON user_ssh_keys TO shithub_akc;
66
+GRANT UPDATE (last_used_at, last_used_ip)
67
+                      ON user_ssh_keys TO shithub_akc;
68
+GRANT SELECT (id, username, suspended_at, deleted_at)
69
+                      ON users         TO shithub_akc;
70
+
71
+GRANT CONNECT ON DATABASE shithub TO shithub_akc;
72
+GRANT USAGE   ON SCHEMA   public  TO shithub_akc;
73
+```
74
+
75
+## Plumbing changes when S37 lands
76
+
77
+* **Config**: add `SHITHUB_HOOK_DATABASE_URL` and `SHITHUB_AKC_DATABASE_URL`
78
+  to `internal/infra/config/config.go`. Fall back to `SHITHUB_DATABASE_URL`
79
+  when unset (dev keeps single-role).
80
+* **Hook subcommands**: `cmd/shithubd/hook.go::loadHookCtx` checks
81
+  `SHITHUB_HOOK_DATABASE_URL` first, falls back to the main URL.
82
+* **AKC subcommand**: `cmd/shithubd/ssh.go::sshAuthkeysCmd` checks
83
+  `SHITHUB_AKC_DATABASE_URL` first.
84
+* **Ansible role**: `deploy/ansible/roles/postgres/tasks/main.yml` runs
85
+  the SQL recipe above against a fresh DB; passwords come from sops or
86
+  1Password.
87
+* **Migration policy**: don't add the GRANT statements to the goose
88
+  migrations directory — those run as the `shithub` super-role and
89
+  changing them would re-grant on every dev re-up. Roles + grants are
90
+  Ansible territory, not migration territory.
91
+
92
+## Why we deferred
93
+
94
+S14's deliverables include "Hook DB connection: small pool, distinct
95
+credentials with the minimum-needed grants." The dev path (single role)
96
+is what S14 actually shipped. Splitting it now adds friction to every
97
+new schema migration (each new write target needs a grant tweak) while
98
+the schema is still iterating fast — by S37 the schema has settled
99
+enough that the grant surface is stable.
100
+
101
+## Tracking
102
+
103
+* This doc is the design.
104
+* `.docs/sprints/S37-deployment-automation.md` references this doc
105
+  under "Postgres" → "Hook DB role split" so the deferred work is on
106
+  the sprint's deliverable list, not floating.
107
+* The S14 sprint spec (`.docs/sprints/S14-push-processing-pipeline.md`)
108
+  remains the authoritative description of what S14 *should* have done
109
+  in an ideal world; this doc explains what got cut and where it landed.
docs/internal/hooks.mdmodified
@@ -103,3 +103,12 @@ to the pusher's terminal. Latency budget: <100ms p99.
103103
   and abort the push. Operators recover with `hooks reinstall`.
104104
 * **Stale shim from previous shithubd version**: `Install` is
105105
   idempotent and overwrites; re-running on a deploy is the right move.
106
+
107
+## Deferred: hook DB role split
108
+
109
+S14's spec calls for the hook to connect with a least-privilege Postgres
110
+role distinct from the one the web server uses. S14 ships the dev path
111
+(single role) and defers the split to S37 deploy automation. The full
112
+GRANT recipe and the planned config plumbing live in
113
+[`db-roles.md`](./db-roles.md), and the bullet is on the S37
114
+deliverables list so it doesn't fall off.