@@ -0,0 +1,188 @@ |
| | 1 | +# Architecture |
| | 2 | + |
| | 3 | +A bird's-eye view of how shithub fits together. For depth on any |
| | 4 | +one subsystem, see the linked per-area doc. |
| | 5 | + |
| | 6 | +## Process model |
| | 7 | + |
| | 8 | +One Go binary, three roles: |
| | 9 | + |
| | 10 | +- `shithubd web` — the HTTP server. Renders the UI, serves |
| | 11 | + `/api/v1/`, handles git over HTTPS. |
| | 12 | +- `shithubd worker` — pulls from the Postgres-backed job queue. |
| | 13 | + Owns webhook delivery, notification fan-out, async fan-out, |
| | 14 | + search reindex. |
| | 15 | +- `shithubd cron` — periodic chores (lifecycle sweeps, queue |
| | 16 | + janitor, webhook purge). Triggered by a systemd timer; one-shot. |
| | 17 | + |
| | 18 | +All three read the same config and the same DB. They differ only |
| | 19 | +in which `cmd/shithubd/<role>.go` they boot into. |
| | 20 | + |
| | 21 | +Plus a small admin/operator CLI surface (`shithubd admin …`, |
| | 22 | +`shithubd migrate …`, `shithubd config …`, `shithubd hook …`) |
| | 23 | +that runs as one-shot subprocesses. |
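
The one-binary, three-role arrangement above can be illustrated with a small dispatcher. This is a minimal sketch, not the project's actual entry point: the `roles` map, `dispatch`, and the stub entry points are hypothetical names standing in for whatever `cmd/shithubd/<role>.go` really exports.

```go
package main

import (
	"errors"
	"fmt"
)

// errUnknownRole is returned when the first argument names no known role.
var errUnknownRole = errors.New("unknown role")

// roles maps a subcommand to its entry point. These stubs are placeholders
// (assumption); the real entry points live in cmd/shithubd/<role>.go.
var roles = map[string]func() error{
	"web":    func() error { fmt.Println("starting HTTP server"); return nil },
	"worker": func() error { fmt.Println("polling job queue"); return nil },
	"cron":   func() error { fmt.Println("running periodic chores once"); return nil },
}

// dispatch selects the role named by the first argument after the program
// name and runs it. Every role shares the same process setup up to here.
func dispatch(args []string) error {
	if len(args) < 2 {
		return fmt.Errorf("%w: no role given", errUnknownRole)
	}
	run, ok := roles[args[1]]
	if !ok {
		return fmt.Errorf("%w: %q", errUnknownRole, args[1])
	}
	return run()
}

func main() {
	// Demo invocation; a real binary would pass os.Args instead.
	if err := dispatch([]string{"shithubd", "cron"}); err != nil {
		fmt.Println("error:", err)
	}
}
```

Keeping config and DB setup in front of the dispatch point is what lets all three roles share them; only the function chosen from the map differs.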
| | 24 | + |
| | 25 | +## Layered packages |
| | 26 | + |
| | 27 | +``` |
| | 28 | + +------------------------+ |
| | 29 | +HTTP request --> | internal/web/handlers | Per-area handlers (auth, repo, …) |
| | 30 | + +------------------------+ |
| | 31 | + | |
| | 32 | + +------------------------+ |
| | 33 | + | internal/web/render | html/template + helpers |
| | 34 | + | internal/web/middleware| auth, csrf, ratelimit, body cap |
| | 35 | + +------------------------+ |
| | 36 | + | |
| | 37 | + +------------------------+ |
| | 38 | + | internal/<domain> | repos, issues, pulls, orgs, … |
| | 39 | + +------------------------+ |
| | 40 | + | |
| | 41 | + +------------------------+ |
| | 42 | + | internal/auth/policy | the only place "can X do Y" lives |
| | 43 | + +------------------------+ |
| | 44 | + | |
| | 45 | + +------------------------+ |
| | 46 | + | internal/<domain>/sqlc | sqlc-generated DB code |
| | 47 | + +------------------------+ |
| | 48 | + | |
| | 49 | + Postgres |
| | 50 | +``` |
| | 51 | + |
| | 52 | +Cross-cutting: |
| | 53 | + |
| | 54 | +- `internal/markdown` — sole renderer for user-authored markdown. |
| | 55 | + Lint-enforced: nothing outside this package may import goldmark |
| | 56 | + or bluemonday. |
| | 57 | +- `internal/security/{ssrf,openredirect}` — outbound HTTP defense |
| | 58 | + + redirect validation. |
| | 59 | +- `internal/cache/lru` — generic LRU + singleflight. |
| | 60 | +- `internal/pagination/keyset` — cursor pagination with |
| | 61 | + HMAC-signed cursors. |
| | 62 | +- `internal/webhook` — outbound webhook delivery + signing. |
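
To make the HMAC-signed cursor idea from the `internal/pagination/keyset` bullet concrete, here is a minimal sketch using only the standard library. The wire format (`base64(payload).base64(tag)`), the function names, and the position string are assumptions for illustration; the real package may encode cursors differently.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"errors"
	"fmt"
	"strings"
)

var errBadCursor = errors.New("invalid cursor")

// sign returns the HMAC-SHA256 tag of payload under key.
func sign(key, payload []byte) []byte {
	mac := hmac.New(sha256.New, key)
	mac.Write(payload)
	return mac.Sum(nil)
}

// encodeCursor packs a keyset position (an opaque string such as
// "created_at,id") with its tag as base64(payload) "." base64(tag).
func encodeCursor(key []byte, position string) string {
	enc := base64.RawURLEncoding
	payload := []byte(position)
	return enc.EncodeToString(payload) + "." + enc.EncodeToString(sign(key, payload))
}

// decodeCursor verifies the tag before trusting the payload, so a client
// cannot forge or edit the position it sends back.
func decodeCursor(key []byte, cursor string) (string, error) {
	enc := base64.RawURLEncoding
	p, t, ok := strings.Cut(cursor, ".")
	if !ok {
		return "", errBadCursor
	}
	payload, err := enc.DecodeString(p)
	if err != nil {
		return "", errBadCursor
	}
	tag, err := enc.DecodeString(t)
	if err != nil {
		return "", errBadCursor
	}
	// hmac.Equal compares in constant time.
	if !hmac.Equal(tag, sign(key, payload)) {
		return "", errBadCursor
	}
	return string(payload), nil
}

func main() {
	key := []byte("demo-key-not-for-production")
	c := encodeCursor(key, "2024-01-02T03:04:05Z,1234")
	pos, err := decodeCursor(key, c)
	fmt.Println(c)
	fmt.Println(pos, err)
}
```

Signing keeps the cursor stateless: the server needs no cursor table, only the key, and any tampering with the encoded position fails verification.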
| | 63 | + |
| | 64 | +## Request lifecycle (web) |
| | 65 | + |
| | 66 | +1. Caddy terminates TLS, forwards to `127.0.0.1:8080`. |
| | 67 | +2. The chi router matches the route. Middleware stack on every |
| | 68 | + request: |
| | 69 | + - Request ID + access log |
| | 70 | + - `OptionalUser` (resolves session cookie or anon) |
| | 71 | + - CSRF (nosurf-derived) — enabled by default; `/api/v1/` and |
| | 72 | + git transports are explicit exemptions |
| | 73 | + - Rate limit (per-IP for anon, per-user for logged-in on hot |
| | 74 | + paths) |
| | 75 | +3. Auth-gated routes have `RequireUser` or `RequireSiteAdmin` |
| | 76 | + middleware in their group. |
| | 77 | +4. The handler resolves the domain object, calls `policy.Can(...)` |
| | 78 | + to authorize the action, executes, renders. |
| | 79 | +5. Templates render through `internal/web/render`. Markdown bodies |
| | 80 | + pass through `internal/markdown` for sanitization. |
| | 81 | + |
| | 82 | +## Request lifecycle (API) |
| | 83 | + |
| | 84 | +`/api/v1/*` is under a CSRF-exempt group with PAT auth: |
| | 85 | + |
| | 86 | +1. `MaxBodySize` cap (256 KiB). |
| | 87 | +2. `PATAuthMiddleware` — resolves token to user; 401 on missing, |
| | 88 | + 403 on insufficient scope. |
| | 89 | +3. Per-route `RequireScope` checks the specific scope. |
| | 90 | +4. Handler runs. |
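
The chain above can be sketched with plain `net/http` middleware, a shape chi also accepts. Everything named here is hypothetical: the in-memory `tokens` map, the `Bearer` header scheme, the scope strings, and the `/api/v1/demo` path are assumptions, and this sketch places the 403 in the per-route scope check rather than in the auth middleware.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

type ctxKey struct{}

// token is a hypothetical in-memory stand-in for a PAT record.
type token struct {
	user   string
	scopes map[string]bool
}

var tokens = map[string]token{
	"pat_demo": {user: "alice", scopes: map[string]bool{"repo:read": true}},
}

// maxBody caps the request body; reads past the limit fail.
func maxBody(limit int64) func(http.Handler) http.Handler {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			r.Body = http.MaxBytesReader(w, r.Body, limit)
			next.ServeHTTP(w, r)
		})
	}
}

// patAuth resolves a bearer token to a token record, or answers 401.
func patAuth(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		raw := strings.TrimPrefix(r.Header.Get("Authorization"), "Bearer ")
		t, ok := tokens[raw]
		if !ok {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		ctx := context.WithValue(r.Context(), ctxKey{}, t)
		next.ServeHTTP(w, r.WithContext(ctx))
	})
}

// requireScope answers 403 unless the authenticated token carries scope.
func requireScope(scope string) func(http.Handler) http.Handler {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			t, _ := r.Context().Value(ctxKey{}).(token)
			if !t.scopes[scope] {
				http.Error(w, "forbidden", http.StatusForbidden)
				return
			}
			next.ServeHTTP(w, r)
		})
	}
}

// route assembles the chain in the order listed above: body cap, PAT auth,
// scope check, handler.
func route(scope string) http.Handler {
	h := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	})
	return maxBody(256 << 10)(patAuth(requireScope(scope)(h)))
}

// status runs one request through the chain and returns the response code.
func status(h http.Handler, bearer string) int {
	req := httptest.NewRequest(http.MethodGet, "/api/v1/demo", nil)
	if bearer != "" {
		req.Header.Set("Authorization", "Bearer "+bearer)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(status(route("repo:read"), ""))          // no token
	fmt.Println(status(route("repo:read"), "pat_demo"))  // token with scope
	fmt.Println(status(route("repo:write"), "pat_demo")) // token lacking scope
}
```

Ordering matters: the body cap runs first so an unauthenticated client cannot make the server buffer a large payload, and the scope check runs last because it depends on the token that auth placed in the request context.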
| | 91 | + |
| | 92 | +## Data flow: a push |
| | 93 | + |
| | 94 | +``` |
| | 95 | + git client |
| | 96 | + | |
| | 97 | + v git-receive-pack POST |
| | 98 | + web service ──> shithubd hook pre-receive ──> Postgres (auth check) |
| | 99 | + | | |
| | 100 | + | <── decision (accept/reject) |
| | 101 | + v | |
| | 102 | + accept push, v |
| | 103 | + write to /data/repos INSERT push_events (shithub_hook role) |
| | 104 | + | |
| | 105 | + v |
| | 106 | + INSERT jobs (search reindex, notify, ...) |
| | 107 | + | |
| | 108 | + v |
| | 109 | + worker picks up job (FOR UPDATE SKIP LOCKED) |
| | 110 | + | |
| | 111 | + v |
| | 112 | + fan-out (webhooks, notifications, search) |
| | 113 | +``` |
| | 114 | + |
| | 115 | +The `shithub_hook` role holds only the grants it needs
| | 116 | +(see `docs/internal/db-roles.md`). The web role can do everything
| | 117 | +the runtime needs but is intentionally separate from the migration |
| | 118 | +role used at deploy time. |
| | 119 | + |
| | 120 | +## Deployment topology |
| | 121 | + |
| | 122 | +``` |
| | 123 | + +----------------------------+ |
| | 124 | + public ---> | Caddy (TLS, rate limits) | :443 |
| | 125 | + +----------------------------+ |
| | 126 | + | 127.0.0.1:8080 |
| | 127 | + +----------------------------+ |
| | 128 | + | shithubd web (systemd) | |
| | 129 | + | shithubd worker (systemd) | |
| | 130 | + | shithubd cron (timer) | |
| | 131 | + +----------------------------+ |
| | 132 | + | |
| | 133 | + +-----------+ +-----------------+ |
| | 134 | + | Postgres | | Spaces (S3) | |
| | 135 | + +-----------+ | - WAL archive | |
| | 136 | + | - daily dumps | |
| | 137 | + | - LFS / blobs | |
| | 138 | + +-----------------+ |
| | 139 | + \ / |
| | 140 | + \ WireGuard mesh (10.50.0.0/24) |
| | 141 | + \________ ________/ |
| | 142 | + | |
| | 143 | + +----------------+ |
| | 144 | + | Monitoring | |
| | 145 | + | Prom/Loki/AM | |
| | 146 | + | Grafana | |
| | 147 | + +----------------+ |
| | 148 | +``` |
| | 149 | + |
| | 150 | +Monitoring is on the WireGuard mesh; metrics ports never face the |
| | 151 | +public internet. See [deploy.md](./deploy.md) for the full |
| | 152 | +operator guide. |
| | 153 | + |
| | 154 | +## Observability |
| | 155 | + |
| | 156 | +Four independent channels:
| | 157 | + |
| | 158 | +- **Structured logs** (`internal/infra/log`) → stdout → journald |
| | 159 | + → promtail → Loki. |
| | 160 | +- **Metrics** (Prometheus) at `/metrics`, basic-auth gated in |
| | 161 | + prod. |
| | 162 | +- **Tracing** (OTel HTTP) optional, sample-rate controlled. |
| | 163 | +- **Error reporting** (Sentry-protocol DSN, GlitchTip-compatible). |
| | 164 | + |
| | 165 | +See [observability.md](./observability.md). |
| | 166 | + |
| | 167 | +## What's deliberately not here |
| | 168 | + |
| | 169 | +- **No microservices.** One binary serving three roles; anything
| | 170 | + finer-grained is premature.
| | 171 | +- **No background ETL pipeline.** No work exists yet that fails to
| | 172 | + fit a job-row shape; when new background work appears, add it as
| | 173 | + a new job kind rather than a new daemon.
| | 174 | +- **No message bus.** Postgres LISTEN/NOTIFY suffices for the
| | 175 | + inter-process signaling we need.
| | 176 | +- **No Redis.** The LRU + singleflight cache is in-process; if a |
| | 177 | + shared cache becomes necessary, the seam is `internal/cache` |
| | 178 | + and we'd swap in a Redis-backed implementation behind the same |
| | 179 | + interface. |
| | 180 | + |
| | 181 | +## Where to read next |
| | 182 | + |
| | 183 | +- New to the code: [code-tab.md](./code-tab.md) walks how a single |
| | 184 | + request flows end-to-end on the most-traveled page. |
| | 185 | +- New to ops: [deploy.md](./deploy.md), then |
| | 186 | + [runbooks/](./runbooks/). |
| | 187 | +- New to security: [threat-model.md](./threat-model.md) + |
| | 188 | + [security-checklist.md](./security-checklist.md). |