@@ -0,0 +1,126 @@ |
| 1 | +# Database conventions |
| 2 | + |
| 3 | +This document is the authoritative reference for shithub's PostgreSQL schema |
| 4 | +conventions. Every domain sprint (S05 onwards) follows these. |
| 5 | + |
| 6 | +## Engine and tooling |
| 7 | + |
| 8 | +- **PostgreSQL 16**. Production runs self-hosted on a dedicated block volume |
| 9 | + (S37). Local dev runs in `docker-compose` (`make dev-db`). |
| 10 | +- **Driver:** `pgx/v5` (`github.com/jackc/pgx/v5`). Native `*pgxpool.Pool` for |
| 11 | + app code; the `stdlib` adapter is reserved for libraries that demand |
| 12 | + `*sql.DB`. |
| 13 | +- **Migrations:** `goose` (`github.com/pressly/goose/v3`), used as a library |
| 14 | + via `shithubd migrate`. Plain SQL up/down blocks; one file per migration. |
| 15 | +- **Code generation:** `sqlc`. Queries live under |
| 16 | + `internal/<domain>/queries/*.sql`; generated Go lives under |
| 17 | + `internal/<domain>/sqlc/`. |
| 18 | + |
| 19 | +## Naming |
| 20 | + |
| 21 | +| Concern | Convention | |
| 22 | +|---|---| |
| 23 | +| Tables | snake_case, plural (`users`, `repos`, `pull_requests`) | |
| 24 | +| Columns | snake_case (`created_at`, `owner_user_id`) | |
| 25 | +| Primary keys | `id` (column) | |
| 26 | +| Indexes | `<table>_<columns>_idx` | |
| 27 | +| Unique indexes / constraints | `<table>_<columns>_key` | |
| 28 | +| Foreign keys | `<table>_<col>_fkey` (Postgres default) | |
| 29 | +| Enums | `snake_case_enum` (e.g. `repo_visibility`) | |
| 30 | +| Trigger functions | `tg_<purpose>` (e.g. `tg_set_updated_at`) | |
| 31 | + |
| 32 | +## ID strategy |
| 33 | + |
| 34 | +- **`bigserial id PRIMARY KEY`** for internal IDs. Compact, ordered, cheap. |
| 35 | +- **UUID v7** for any ID exposed publicly via URL. Gives sortability without |
| 36 | + leaking tenant size. Stored as the canonical Postgres `uuid` type. |
| 37 | +- **Per-repo issue/PR numbers** are NEITHER `bigserial` NOR UUID — they are |
| 38 | + per-`(repo_id)` monotonic counters maintained by a small counter table |
| 39 | + (S21). |
| 40 | + |
| 41 | +## Timestamps |
| 42 | + |
| 43 | +- Always **`timestamptz`**. Never `timestamp` (without timezone) — that's a |
| 44 | + foot-gun. |
| 45 | +- `created_at timestamptz NOT NULL DEFAULT now()` on every table. |
| 46 | +- `updated_at timestamptz NOT NULL DEFAULT now()` on every table that |
| 47 | + receives updates, with a `BEFORE UPDATE` trigger that re-stamps it: |
| 48 | + |
| 49 | + ```sql |
| 50 | + CREATE TRIGGER set_updated_at BEFORE UPDATE ON <table> |
| 51 | + FOR EACH ROW EXECUTE FUNCTION tg_set_updated_at(); |
| 52 | + ``` |
| 53 | + |
| 54 | + The `tg_set_updated_at()` function is created in `0001_meta.sql` and |
| 55 | + reused across every table. |
| 56 | + |
| 57 | +## Foreign keys |
| 58 | + |
| 59 | +- **Always specify `ON DELETE` explicitly.** `RESTRICT` (the default) is the |
| 60 | + safe choice; only use `CASCADE` when the lifecycle of the child IS the |
| 61 | + lifecycle of the parent. |
| 62 | +- Index every FK column unless the child is naturally indexed by a covering |
| 63 | + composite index that starts with the FK. |
| 64 | + |
| 65 | +## Soft deletes |
| 66 | + |
| 67 | +- **No soft deletes by default.** Hard delete + audit-log row when audit |
| 68 | + matters. |
| 69 | +- Exceptions exist where grace windows are required (user delete, repo |
| 70 | + delete, org delete) — these are explicit and documented per sprint. |
| 71 | + |
| 72 | +## Migrations |
| 73 | + |
| 74 | +- **Forward-only after production deploy.** `Down` blocks exist for dev |
| 75 | + convenience but are immutable post-deploy. Corrections come as new |
| 76 | + migrations. |
| 77 | +- One change per migration. Filename: `NNNN_short_purpose.sql` where `NNNN` |
| 78 | + is monotonically increasing. |
| 79 | +- Goose markers: |
| 80 | + |
| 81 | + ``` |
| 82 | + -- +goose Up |
| 83 | + ... |
| 84 | + -- +goose Down |
| 85 | + ... |
| 86 | + ``` |
| 87 | + |
| 88 | + Multi-statement DDL that needs PL/pgSQL must be wrapped in |
| 89 | + `-- +goose StatementBegin` / `-- +goose StatementEnd`. |
| 90 | + |
| 91 | +## citext |
| 92 | + |
| 93 | +Use `citext` (case-insensitive text) for identifiers that should match |
| 94 | +case-insensitively but display case-preserved (`users.username`, |
| 95 | +`user_emails.email`, `orgs.slug`, etc.). The extension is enabled in |
| 96 | +`0001_meta.sql` (post-S05) — for now, S01 enables only the function we need. |
| 97 | + |
| 98 | +## JSONB |
| 99 | + |
| 100 | +- Acceptable for genuinely polymorphic data (event metadata, audit log |
| 101 | + details, notification summaries). |
| 102 | +- Not a substitute for proper schema. If the shape is known and stable, |
| 103 | + model it as columns or a child table. |
| 104 | + |
| 105 | +## Connection pooling |
| 106 | + |
| 107 | +- App: `pgxpool.Pool` per process. Default sizing 25 in prod, 10 in dev. |
| 108 | +- Hooks (S07/S14): tiny pools (max 4) since they're short-lived and on the |
| 109 | + critical path of git operations. |
| 110 | +- Tests: per-test pool with max 2 connections. |
| 111 | + |
| 112 | +## Test harness |
| 113 | + |
| 114 | +- `internal/testing/dbtest`: `dbtest.NewTestDB(t)` creates a fresh database |
| 115 | + cloned from a template (which has all migrations applied) per test, then |
| 116 | + drops it on `t.Cleanup`. Parallel-safe. |
| 117 | +- Tests requiring DB access set `SHITHUB_TEST_DATABASE_URL` pointing at a |
| 118 | + Postgres server with permission to `CREATE DATABASE`. |
| 119 | + |
| 120 | +## Operational |
| 121 | + |
| 122 | +- `pg_stat_statements` extension is loaded by default in dev compose and |
| 123 | + prod (S37). Used by S36's perf pass. |
| 124 | +- `archive_mode=on` + WAL shipping to Spaces (cross-region) in prod (S37). |
| 125 | +- Daily logical backups via `pg_dump --format=custom`, restored weekly to |
| 126 | + validate the backup chain. |