Architecture

A bird's-eye view of how shithub fits together. For depth on any one subsystem, see the linked per-area doc.

Process model

One Go binary, three roles:

  • shithubd web — the HTTP server. Renders the UI, serves /api/v1/, handles git over HTTPS.
  • shithubd worker — pulls from the Postgres-backed job queue. Owns webhook delivery, notification fan-out, async fan-out, search reindex.
  • shithubd cron — periodic chores (lifecycle sweeps, queue janitor, webhook purge). Triggered by a systemd timer; one-shot.

All three read the same config and the same DB. They differ only in which cmd/shithubd/<role>.go they boot into.

Plus a small admin/operator CLI surface (shithubd admin …, shithubd migrate …, shithubd config …, shithubd hook …) that runs as one-shot subprocesses.
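
As a rough illustration of "one binary, three roles", the sketch below dispatches on the first command-line argument. The roles map, the dispatch function, and the placeholder bodies are hypothetical; the real entry points live in cmd/shithubd/<role>.go and are not reproduced here.

```go
package main

import (
	"fmt"
	"os"
)

// roles maps a subcommand name to its entry point. The three names come from
// the process model above; the bodies are placeholders (assumption).
var roles = map[string]func() error{
	"web":    func() error { return nil }, // would start the HTTP server
	"worker": func() error { return nil }, // would poll the job queue
	"cron":   func() error { return nil }, // would run periodic chores once, then exit
}

// dispatch selects and runs the role named by the first argument.
func dispatch(args []string) error {
	if len(args) == 0 {
		return fmt.Errorf("usage: shithubd <web|worker|cron>")
	}
	run, ok := roles[args[0]]
	if !ok {
		return fmt.Errorf("unknown role %q", args[0])
	}
	return run()
}

func main() {
	args := os.Args[1:]
	if len(args) == 0 {
		args = []string{"web"} // demo default so the sketch runs without arguments
	}
	if err := dispatch(args); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Because every role shares the same config and DB, the only per-role decision in a layout like this is which function the dispatcher hands control to.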

Layered packages

                 +------------------------+
HTTP request --> | internal/web/handlers  | Per-area handlers (auth, repo, …)
                 +------------------------+
                            |
                 +------------------------+
                 | internal/web/render    | html/template + helpers
                 | internal/web/middleware| auth, csrf, ratelimit, body cap
                 +------------------------+
                            |
                 +------------------------+
                 | internal/<domain>      | repos, issues, pulls, orgs, …
                 +------------------------+
                            |
                 +------------------------+
                 | internal/auth/policy   | the only place "can X do Y" lives
                 +------------------------+
                            |
                 +------------------------+
                 | internal/<domain>/sqlc | sqlc-generated DB code
                 +------------------------+
                            |
                          Postgres

Cross-cutting:

  • internal/markdown — sole renderer for user-authored markdown. Lint-enforced: nothing outside this package may import goldmark or bluemonday.
  • internal/security/{ssrf,openredirect} — outbound HTTP defense + redirect validation.
  • internal/cache/lru — generic LRU + singleflight.
  • internal/pagination/keyset — cursor pagination with HMAC-signed cursors.
  • internal/webhook — outbound webhook delivery + signing.
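
To make "HMAC-signed cursors" concrete, here is a minimal sketch of signing and verifying an opaque cursor with the standard library. The sign and verify functions, the "payload.mac" wire format, and the payload contents are assumptions for illustration; internal/pagination/keyset may encode cursors differently.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"errors"
	"fmt"
	"strings"
)

var errBadCursor = errors.New("malformed or tampered cursor")

// sign returns "<payload>.<mac>", both base64url-encoded without padding.
func sign(key, payload []byte) string {
	mac := hmac.New(sha256.New, key)
	mac.Write(payload)
	enc := base64.RawURLEncoding
	return enc.EncodeToString(payload) + "." + enc.EncodeToString(mac.Sum(nil))
}

// verify recomputes the MAC over the payload and compares it in constant time.
func verify(key []byte, cursor string) ([]byte, error) {
	p, m, found := strings.Cut(cursor, ".")
	if !found {
		return nil, errBadCursor
	}
	enc := base64.RawURLEncoding
	payload, err := enc.DecodeString(p)
	if err != nil {
		return nil, errBadCursor
	}
	got, err := enc.DecodeString(m)
	if err != nil {
		return nil, errBadCursor
	}
	mac := hmac.New(sha256.New, key)
	mac.Write(payload)
	if !hmac.Equal(got, mac.Sum(nil)) {
		return nil, errBadCursor
	}
	return payload, nil
}

func main() {
	key := []byte("demo-key") // hypothetical; a real key would come from config
	cursor := sign(key, []byte("id=42"))
	payload, err := verify(key, cursor)
	fmt.Println(string(payload), err)
}
```

Signing means a client cannot hand-craft a cursor to probe arbitrary keyset positions; the server only accepts cursors it issued.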

Request lifecycle (web)

  1. Caddy terminates TLS, forwards to 127.0.0.1:8080.
  2. The chi router matches the route. Middleware stack on every request:
    • Request ID + access log
    • OptionalUser (resolves session cookie or anon)
    • CSRF (nosurf-derived) — enabled by default; /api/v1/ and git transports are explicit exemptions
    • Rate limit (per-IP for anon, per-user for logged-in on hot paths)
  3. Auth-gated routes have RequireUser or RequireSiteAdmin middleware in their group.
  4. The handler resolves the domain object, calls policy.Can(...) to authorize the action, executes, renders.
  5. Templates render through internal/web/render. Markdown bodies pass through internal/markdown for sanitization.

Request lifecycle (API)

/api/v1/* is under a CSRF-exempt group with PAT auth:

  1. MaxBodySize cap (256 KiB).
  2. PATAuthMiddleware — resolves token to user; 401 on missing, 403 on insufficient scope.
  3. Per-route RequireScope checks the specific scope.
  4. Handler runs.

Data flow: a push

   git client
       |
       v   git-receive-pack POST
   web service ──> shithubd hook pre-receive ──> Postgres (auth check)
       |                          |
       |              <── decision (accept/reject)
       v                          |
   accept push,                   v
   write to /data/repos    INSERT push_events (shithub_hook role)
                                  |
                                  v
                          INSERT jobs (search reindex, notify, ...)
                                  |
                                  v
                          worker picks up job (FOR UPDATE SKIP LOCKED)
                                  |
                                  v
                          fan-out (webhooks, notifications, search)

The shithub_hook role has minimum-needed grants (docs/internal/db-roles.md). The web role can do everything the runtime needs but is intentionally separate from the migration role used at deploy time.

Deployment topology

                +----------------------------+
   public --->  |  Caddy (TLS, rate limits)  |  :443
                +----------------------------+
                       |  127.0.0.1:8080
                +----------------------------+
                |  shithubd web (systemd)    |
                |  shithubd worker (systemd) |
                |  shithubd cron  (timer)    |
                +----------------------------+
                       |
                +-----------+         +-----------------+
                | Postgres  |         | Spaces (S3)     |
                +-----------+         | - WAL archive   |
                                      | - daily dumps   |
                                      | - LFS / blobs   |
                                      +-----------------+
                       \                    /
                        \      WireGuard mesh (10.50.0.0/24)
                         \________ ________/
                                  |
                          +----------------+
                          |  Monitoring    |
                          |  Prom/Loki/AM  |
                          |  Grafana       |
                          +----------------+

Monitoring is on the WireGuard mesh; metrics ports never face the public internet. See deploy.md for the full operator guide.

Observability

Four independent channels:

  • Structured logs (internal/infra/log) → stdout → journald → promtail → Loki.
  • Metrics (Prometheus) at /metrics, basic-auth gated in prod.
  • Tracing (OTel HTTP) optional, sample-rate controlled.
  • Error reporting (Sentry-protocol DSN, GlitchTip-compatible).

See observability.md.

What's deliberately not here

  • No microservices. One binary, three roles; everything else is premature.
  • No background ETL pipeline. Work that doesn't fit a job-row shape doesn't exist yet; if it ever appears, add it as a new job kind rather than a new daemon.
  • No message bus. Postgres LISTEN/NOTIFY suffices for the cross-process signaling we need.
  • No Redis. The LRU + singleflight cache is in-process; if a shared cache becomes necessary, the seam is internal/cache and we'd swap in a Redis-backed implementation behind the same interface.
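
To show what a cache seam of this kind could look like, here is a minimal sketch of an interface with an in-process LRU behind it. The Cache interface, its method set, and the lru type are assumptions, not the actual contents of internal/cache; singleflight and locking are left out for brevity.

```go
package main

import (
	"container/list"
	"fmt"
)

// Cache is a hypothetical seam: callers depend on this interface, so a
// shared backend could replace the in-process one without touching them.
type Cache interface {
	Get(key string) (string, bool)
	Set(key, value string)
}

type entry struct{ key, value string }

// lru is an in-process implementation. It is not safe for concurrent use;
// a real one would guard its state with a mutex.
type lru struct {
	capacity int
	order    *list.List // front = most recently used
	items    map[string]*list.Element
}

func newLRU(capacity int) *lru {
	return &lru{capacity: capacity, order: list.New(), items: make(map[string]*list.Element)}
}

// Get returns the value and marks the key as recently used.
func (c *lru) Get(key string) (string, bool) {
	el, ok := c.items[key]
	if !ok {
		return "", false
	}
	c.order.MoveToFront(el)
	return el.Value.(entry).value, true
}

// Set stores the value and evicts the least recently used key when full.
func (c *lru) Set(key, value string) {
	if el, ok := c.items[key]; ok {
		el.Value = entry{key, value}
		c.order.MoveToFront(el)
		return
	}
	c.items[key] = c.order.PushFront(entry{key, value})
	if c.order.Len() > c.capacity {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(entry).key)
	}
}

func main() {
	var c Cache = newLRU(2)
	c.Set("a", "1")
	c.Set("b", "2")
	c.Get("a")      // touch "a" so "b" becomes the eviction candidate
	c.Set("c", "3") // exceeds capacity
	_, ok := c.Get("b")
	fmt.Println("b still cached:", ok)
}
```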

Where to read next

  • New to the code: code-tab.md walks how a single request flows end-to-end on the most-traveled page.
  • New to ops: deploy.md, then runbooks/.
  • New to security: threat-model.md + security-checklist.md.