tenseleyflow/shithub / 3c50f66


Add docs/internal/auth.md and extend config reference with auth.* keys

Authored by mfwolffe <wolffemf@dukes.jmu.edu>
SHA: 3c50f66634e48682be9a2ec6cebd867583de9743
Parents: 7f1eb93
Tree: 76151e8

2 changed files

Status  File                     +    -
A       docs/internal/auth.md    214  0
M       docs/internal/config.md  12   0
docs/internal/auth.md added
@@ -0,0 +1,214 @@
1
+# Authentication
2
+
3
+S05 brings the first real domain surface to shithub: email/password signup, email verification, login, logout, password reset, and rate-limiting. This doc covers what's wired, where the code lives, and the security choices behind the design.
4
+
5
+## What this sprint ships
6
+
7
+- Real users in Postgres: `users`, `user_emails`, `password_resets`, `email_verifications`, `auth_throttle`, `username_redirects`.
8
+- argon2id password hashing with PHC string encoding.
9
+- Strict username whitelist + reserved-name list.
10
+- Common-password rejection from an embedded SecLists 10k corpus.
11
+- Email Sender abstraction with stdout (dev), SMTP (MailHog/local), and Postmark (prod) backends.
12
+- Counter-based rate-limiting backed by Postgres (no Redis dep).
13
+- Auth handlers: signup, login, logout, password reset (request + confirm), email verification, verification resend.
14
+- Constant-time login: missing usernames still trigger an argon2 hash against a pre-computed dummy.
15
+- Generic password-reset response: no user-existence enumeration via that flow.
16
+- 4 KiB request-body cap on auth POSTs (anti-DoS for the hashing path).
17
+- Honeypot field on the signup form.
18
+- `shithubd admin reset-password <username>` operator escape hatch.
19
+
20
+## Out of scope (future sprints)
21
+
22
+- 2FA / TOTP (S06)
23
+- SSH keys (S07)
24
+- Personal access tokens (S08)
25
+- OAuth providers (post-MVP)
26
+- Profile / settings UI (S09 / S10)
27
+
28
+## Tables (migrations 0002–0008)
29
+
30
+| Table | Purpose | Notes |
31
+|---|---|---|
32
+| `users` | identity + password material | `username citext UNIQUE`. PHC argon2id hash. Soft-deletable (`deleted_at`) and suspendable. |
33
+| `user_emails` | one or more emails per user | `email citext UNIQUE` across all users. Partial unique index enforces one primary per user. |
34
+| `password_resets` | reset tokens | `token_hash bytea UNIQUE`. 1-hour TTL. Single-use via `used_at`. |
35
+| `email_verifications` | verification tokens | `token_hash bytea UNIQUE`. 24-hour TTL. Single-use via `used_at`. |
36
+| `auth_throttle` | rate-limit counters | `(scope, identifier)` UNIQUE. Fixed-window. |
37
+| `username_redirects` | username-change history | Old name → new user. Acts as a 30-day reservation (S10 wires the cooldown). |
38
+
39
+`citext` is enabled in migration 0002 for case-insensitive uniqueness without functional indexes. Display casing for users and emails lives in the same column — citext preserves it on read.
40
+
41
+## Password hashing
42
+
43
+`internal/auth/password` implements argon2id using `golang.org/x/crypto/argon2`. Defaults:
44
+
45
+- Memory: 64 MiB (`m=65536`)
46
+- Time: 3 iterations (`t=3`)
47
+- Threads: 2 (`p=2`)
48
+- Salt: 16 bytes
49
+- Key: 32 bytes
50
+
51
+These are tuned for ~100–300 ms per Hash on dev hardware. Operators can override via `auth.argon2.*` config. The output is a canonical PHC string:
52
+
53
+```
54
+$argon2id$v=19$m=65536,t=3,p=2$<saltB64>$<hashB64>
55
+```
56
+
57
+Stored in `users.password_hash`; `users.password_algo` is `argon2id-v1` so we can roll forward later.
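To illustrate the PHC layout above, here is a minimal sketch that splits such a string back into its parameters using only the standard library. `parsePHC`, `phcParams`, and `examplePHC` are hypothetical names, not the `internal/auth/password` API, and the sketch assumes the salt and hash use unpadded standard base64, as is conventional for PHC strings.

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
	"strings"
)

// phcParams holds the fields of an argon2id PHC string (hypothetical type).
type phcParams struct {
	Version int
	Memory  uint32 // KiB
	Time    uint32
	Threads uint8
	Salt    []byte
	Hash    []byte
}

// parsePHC splits "$argon2id$v=19$m=...,t=...,p=...$salt$hash".
func parsePHC(s string) (phcParams, error) {
	var p phcParams
	parts := strings.Split(s, "$")
	// The leading "$" yields an empty first element, so expect 6 parts.
	if len(parts) != 6 || parts[0] != "" || parts[1] != "argon2id" {
		return p, errors.New("not an argon2id PHC string")
	}
	if _, err := fmt.Sscanf(parts[2], "v=%d", &p.Version); err != nil {
		return p, fmt.Errorf("version: %w", err)
	}
	if _, err := fmt.Sscanf(parts[3], "m=%d,t=%d,p=%d", &p.Memory, &p.Time, &p.Threads); err != nil {
		return p, fmt.Errorf("params: %w", err)
	}
	var err error
	if p.Salt, err = base64.RawStdEncoding.DecodeString(parts[4]); err != nil {
		return p, fmt.Errorf("salt: %w", err)
	}
	if p.Hash, err = base64.RawStdEncoding.DecodeString(parts[5]); err != nil {
		return p, fmt.Errorf("hash: %w", err)
	}
	return p, nil
}

// examplePHC builds a placeholder string with an all-zero 16-byte salt and
// 32-byte key; it is not a real password hash.
func examplePHC() string {
	salt := base64.RawStdEncoding.EncodeToString(make([]byte, 16))
	hash := base64.RawStdEncoding.EncodeToString(make([]byte, 32))
	return "$argon2id$v=19$m=65536,t=3,p=2$" + salt + "$" + hash
}

func main() {
	p, err := parsePHC(examplePHC())
	fmt.Println(p.Memory, p.Time, p.Threads, len(p.Salt), len(p.Hash), err)
}
```

Because the cost parameters travel with each hash, a verifier built this way could keep checking old hashes after `auth.argon2.*` defaults change.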
58
+
59
+A package-level `dummyEncoded` is computed on first use. Login handlers call `password.VerifyAgainstDummy` when the username doesn't exist so the response time matches a real failed-password verification (defense against username enumeration via timing).
60
+
61
+The minimum password length is 10 characters. There is **no** maximum — argon2id can hash arbitrary input — but the auth-POST middleware caps `r.Body` at 4 KiB so a misbehaving client can't ship a 10 MB password to weaponize the hashing path.
62
+
63
+## Tokens
64
+
65
+`internal/auth/token` mints 32-byte cryptographically random tokens, base64url-encodes them for inclusion in URLs/emails, and stores only their sha256 hash in the DB. The raw token never touches disk. Lookups take the token from the URL path (e.g. `/verify-email/{token}`), hash it, and query by hash, which is a single seek on the UNIQUE index. Comparisons use `crypto/subtle.ConstantTimeCompare`.
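The mint/hash/compare cycle can be sketched with the standard library alone. The function names below are hypothetical rather than the `internal/auth/token` API, and hashing the encoded string (instead of the raw 32 bytes) is an assumption made for illustration.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"crypto/subtle"
	"encoding/base64"
	"fmt"
)

// newToken returns a URL-safe token for the email link and the sha256
// digest that would be stored in the token_hash column.
func newToken() (raw string, hash []byte, err error) {
	buf := make([]byte, 32)
	if _, err = rand.Read(buf); err != nil {
		return "", nil, err
	}
	raw = base64.RawURLEncoding.EncodeToString(buf)
	return raw, hashToken(raw), nil
}

// hashToken derives the lookup key from the token as presented in the URL.
// Assumption: the digest covers the encoded form, not the decoded bytes.
func hashToken(raw string) []byte {
	sum := sha256.Sum256([]byte(raw))
	return sum[:]
}

// matches compares a presented token against a stored hash in constant time.
func matches(presented string, stored []byte) bool {
	return subtle.ConstantTimeCompare(hashToken(presented), stored) == 1
}

func main() {
	raw, hash, err := newToken()
	if err != nil {
		panic(err)
	}
	fmt.Println(len(raw), len(hash), matches(raw, hash))
}
```

Storing only the digest means a leaked `password_resets` or `email_verifications` table does not hand out usable links.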
66
+
67
+## Reserved-name list
68
+
69
+`internal/auth/reserved.go` maintains a set of usernames that may not be claimed: every static top-level route shithub registers, GitHub-known reservations we mirror for parity, and a buffer of likely-future routes. Signup checks this list (case-insensitive — `LOGIN` is rejected the same as `login`).
70
+
71
+When a future sprint adds a new top-level route, the leading path segment **must** be added here. The intended audit mechanism: `internal/web/handlers/handlers_test.go` should walk the registered chi routes and check every static segment against `auth.ReservedNames()`. (Stub-level today; full route-walking lands when the route surface stabilizes in S11+.)
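A rough sketch of the case-insensitive check and of the route audit described above. The three reserved entries are an illustrative subset, and `isReserved`, `leadingSegment`, and `unreserved` are hypothetical helpers rather than the contents of `internal/auth/reserved.go`; the audit takes plain route strings instead of walking a chi router.

```go
package main

import (
	"fmt"
	"strings"
)

// reserved is an illustrative subset, not the real list.
var reserved = map[string]struct{}{
	"login":  {},
	"logout": {},
	"signup": {},
}

// isReserved reports whether a username collides with a reserved name,
// ignoring case.
func isReserved(username string) bool {
	_, ok := reserved[strings.ToLower(username)]
	return ok
}

// leadingSegment returns the first path segment of a route pattern.
func leadingSegment(route string) string {
	seg, _, _ := strings.Cut(strings.TrimPrefix(route, "/"), "/")
	return seg
}

// unreserved lists static leading segments that are missing from the
// reserved set. Parameter segments such as "{user}" are skipped.
func unreserved(routes []string) []string {
	var missing []string
	for _, r := range routes {
		seg := leadingSegment(r)
		if seg == "" || strings.HasPrefix(seg, "{") {
			continue
		}
		if !isReserved(seg) {
			missing = append(missing, seg)
		}
	}
	return missing
}

func main() {
	fmt.Println(isReserved("LOGIN"), isReserved("alice"))
	fmt.Println(unreserved([]string{"/login", "/password/reset", "/{user}"}))
}
```

A test built on this idea would fail whenever `unreserved` returns a non-empty slice, which is what forces the reserved list to be updated in the same PR as the route.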
72
+
73
+## Common-password list
74
+
75
+`internal/passwords` embeds the SecLists `10k-most-common.txt` corpus (~73 KB on disk). `IsCommon` is case-insensitive. Used at signup and password-reset.
76
+
77
+To refresh: replace `internal/passwords/common_passwords.txt` with a newer SecLists snapshot and re-run the test suite.
78
+
79
+## Email service
80
+
81
+`internal/auth/email` defines the `Sender` interface. Three implementations:
82
+
83
+- `StdoutSender` — writes a human-readable dump to a writer. Default in dev when no SMTP is configured. Convenient for tests.
84
+- `SMTPSender` — plain SMTP for MailHog locally. Authenticated and TLS-upgrade variants supported.
85
+- `PostmarkSender` — Postmark transactional API. Production default.
86
+
87
+`messages.go` hosts the `VerifyMessage` and `ResetMessage` builders. Both produce HTML + plaintext bodies — every transactional email shithub sends works in plain-text-only clients. Templates are inlined (short, rarely change); when they grow, promote to `templates/email/*.{html,txt}`.
88
+
89
+### Wiring
90
+
91
+`auth.email_backend` chooses the implementation: `stdout | smtp | postmark`. The `smtp` backend additionally requires `auth.smtp.addr`; `postmark` requires `auth.postmark.server_token`. Validation enforces these.
92
+
93
+```sh
94
+# Dev: capture mail in MailHog
95
+make dev-email
96
+SHITHUB_AUTH__EMAIL_BACKEND=smtp ./bin/shithubd web
97
+# Open http://127.0.0.1:8025 to read captured messages
98
+```
99
+
100
+## Rate limiting
101
+
102
+`internal/auth/throttle` implements fixed-window counters via the `auth_throttle` table. The `BumpAuthThrottle` query atomically increments the counter for `(scope, identifier)`, or starts a new window if the existing window began before `now - Window`.
103
+
104
+Limits enforced by the auth handlers:
105
+
106
+| Scope | Identifier | Max | Window | Where |
107
+|---|---|---|---|---|
108
+| `signup` | `ip:<client-ip>` | 5 | 1h | `signupSubmit` |
109
+| `login` | `ip:<client-ip>\|<username>` | 6 | 15m | `loginSubmit` (reset on success) |
110
+| `reset` | `email:<addr>` | 3 | 1h | `resetRequestSubmit` |
111
+
112
+429 responses include a `Retry-After` header.
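The fixed-window semantics can be illustrated with an in-memory stand-in. This is only a sketch of the counting rule: the real limiter lives in Postgres behind `BumpAuthThrottle`, and the `throttle` type, `bump` method, and `demoStart` value here are hypothetical. The login limits (6 per 15 minutes) are taken from the table above.

```go
package main

import (
	"fmt"
	"time"
)

const (
	loginMax    = 6
	loginWindow = 15 * time.Minute
)

// demoStart is an arbitrary fixed instant used by the example.
var demoStart = time.Unix(1_700_000_000, 0)

type window struct {
	start time.Time
	count int
}

// throttle is an in-memory stand-in for the auth_throttle table.
type throttle struct {
	max      int
	window   time.Duration
	counters map[string]*window
}

func newThrottle(max int, w time.Duration) *throttle {
	return &throttle{max: max, window: w, counters: map[string]*window{}}
}

// bump records one attempt at time now. It reports whether the attempt is
// allowed and, if not, how long until the window ends (for Retry-After).
func (t *throttle) bump(scope, id string, now time.Time) (bool, time.Duration) {
	key := scope + "\x00" + id
	w, ok := t.counters[key]
	if !ok || now.Sub(w.start) > t.window {
		// No counter yet, or the old window has expired: start fresh.
		w = &window{start: now}
		t.counters[key] = w
	}
	w.count++
	if w.count > t.max {
		return false, w.start.Add(t.window).Sub(now)
	}
	return true, 0
}

func main() {
	tr := newThrottle(loginMax, loginWindow)
	id := "ip:203.0.113.7|alice"
	for i := 1; i <= loginMax+1; i++ {
		ok, wait := tr.bump("login", id, demoStart)
		fmt.Println(i, ok, wait)
	}
}
```

A fixed window allows a burst straddling a window boundary (up to twice the limit in a short span); that trade-off comes with the simpler single-row counter.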
113
+
114
+We deliberately use Postgres rather than introducing Redis. At launch scale this is well within Postgres's comfort zone, and avoiding a new dep is worth the marginal latency. Migrate if S36 proves it necessary.
115
+
116
+## Sessions
117
+
118
+S02 ships an AEAD-encrypted cookie store; S05 extends it to carry `user_id`. `Session.IsAnonymous()` reports whether no user is bound. The cookie store re-encrypts on every Save, producing a fresh ciphertext; that's our defense against session fixation.
119
+
120
+Login flow:
121
+
122
+1. Verify password.
123
+2. Mutate the loaded session to set `UserID` and `IssuedAt`.
124
+3. Call `SessionStore.Save` — re-encrypts the cookie under the new state.
125
+
126
+Logout flow:
127
+
128
+1. `SessionStore.Clear` — deletes the cookie.
129
+
130
+## Auth middleware
131
+
132
+`internal/web/middleware/auth.go` adds:
133
+
134
+- `OptionalUser(lookup)` — populates `CurrentUser{ID, Username}` into context when the session has a `user_id`. Anonymous requests still pass through.
135
+- `RequireUser` — redirects to `/login?next=<requested-path>` for anonymous requests.
136
+- `MaxBodySize(n)` — wraps `r.Body` in `http.MaxBytesReader(w, r.Body, n)`. Used on auth POSTs (4 KiB).
137
+
138
+`CurrentUserFromContext(ctx)` returns the bound `CurrentUser` (zero value when anonymous).
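A minimal sketch of the body cap, assuming a standard `func(http.Handler) http.Handler` middleware shape. `maxBodySize`, `readAll`, and `statusFor` are illustrative names; the stand-in handler simply reads the body so the limit has something to trip on.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// maxBodySize caps r.Body at n bytes for every request it wraps.
func maxBodySize(n int64) func(http.Handler) http.Handler {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			r.Body = http.MaxBytesReader(w, r.Body, n)
			next.ServeHTTP(w, r)
		})
	}
}

// readAll is a stand-in handler: reading past the cap returns an error,
// which it maps to 413.
func readAll(w http.ResponseWriter, r *http.Request) {
	if _, err := io.ReadAll(r.Body); err != nil {
		http.Error(w, "request body too large", http.StatusRequestEntityTooLarge)
		return
	}
	w.WriteHeader(http.StatusNoContent)
}

// statusFor posts a body of the given size through the 4 KiB cap and
// returns the response status.
func statusFor(size int) int {
	h := maxBodySize(4 << 10)(http.HandlerFunc(readAll))
	body := strings.NewReader(strings.Repeat("a", size))
	req := httptest.NewRequest(http.MethodPost, "/login", body)
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusFor(1024), statusFor(8192))
}
```

The cap is enforced lazily, when the handler reads the body, so the oversized request is rejected before any form value reaches the hashing path.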
139
+
140
+## CSRF
141
+
142
+S02's `nosurf` wrapper guards every state-changing route. S05 fixes a wart: nosurf's default `isTLS` returns true unconditionally, which makes its same-origin Referer check require an `https` scheme even on plain-HTTP requests. We set `isTLS` to a function that consults `r.TLS` and `X-Forwarded-Proto: https` so dev (HTTP) and prod (TLS-terminated) both work correctly.
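The scheme check described above might look like the following sketch. `isTLS` here is a free function written for illustration, and `demoRequest` is a hypothetical test helper; how the function is handed to the nosurf wrapper is not shown.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// isTLS reports whether the client-facing connection used HTTPS, either
// directly (r.TLS is set) or through a TLS-terminating proxy that sets
// X-Forwarded-Proto.
func isTLS(r *http.Request) bool {
	return r.TLS != nil || r.Header.Get("X-Forwarded-Proto") == "https"
}

// demoRequest builds a request for the example. httptest.NewRequest sets
// r.TLS when the target starts with "https://".
func demoRequest(target, forwardedProto string) *http.Request {
	r := httptest.NewRequest(http.MethodPost, target, nil)
	if forwardedProto != "" {
		r.Header.Set("X-Forwarded-Proto", forwardedProto)
	}
	return r
}

func main() {
	fmt.Println(isTLS(demoRequest("http://127.0.0.1:8080/login", "")))
	fmt.Println(isTLS(demoRequest("http://127.0.0.1:8080/login", "https")))
	fmt.Println(isTLS(demoRequest("https://example.test/login", "")))
}
```

Trusting `X-Forwarded-Proto` is only sound when the proxy in front of the app sets or overwrites that header; a client talking to the app directly can send any value.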
143
+
144
+## Signup → verify → login → logout
145
+
146
+End-to-end flow exercised in `internal/web/handlers/auth/auth_test.go::TestSignup_Verify_Login_Logout`:
147
+
148
+1. `GET /signup` — render form. CSRF cookie set by nosurf.
149
+2. `POST /signup` — validate username (whitelist + reserved-name list), email shape, password length, common-password check. Hash password. In one transaction: create user, create user_email (unverified), set `users.primary_email_id`, create email_verifications row. Send verification email (best-effort — SMTP failure does not break signup). Redirect to `/login?notice=signup-pending`.
150
+3. `GET /verify-email/{token}` — hash the token, look up the row, validate (not used, not expired). In one transaction: mark the email verified, flip `users.email_verified` if it's the primary, mark the verification row `used_at`. Redirect to `/login?notice=verified`.
151
+4. `POST /login` — load user (constant-time fallback if missing), Verify password, check suspended/verified flags, reset throttle counter, touch `last_login_at`, mutate session with `user_id`, Save (re-encrypts cookie), redirect to `/` (or `?next=` target if it's a relative path).
152
+5. `POST /logout` — Clear session. Redirect to `/login?notice=logged-out`.
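Step 4 only follows `?next=` when it is a relative path. One way to make that check concrete is sketched below; `safeNext` is a hypothetical helper and its exact rules (reject scheme-relative `//host` forms and backslashes) are assumptions, not a description of the shipped handler.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// safeNext returns next if it is a same-site relative path, otherwise "/".
func safeNext(next string) string {
	// Must start with a single "/" and contain no backslashes, which some
	// browsers treat like forward slashes.
	if !strings.HasPrefix(next, "/") || strings.HasPrefix(next, "//") || strings.Contains(next, `\`) {
		return "/"
	}
	u, err := url.Parse(next)
	if err != nil || u.IsAbs() || u.Host != "" {
		return "/"
	}
	return next
}

func main() {
	for _, n := range []string{"/settings?tab=emails", "https://evil.example/", "//evil.example", ""} {
		fmt.Printf("%q -> %q\n", n, safeNext(n))
	}
}
```

Falling back to `/` rather than returning an error keeps the login redirect working even when the parameter has been tampered with.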
153
+
154
+## Password reset
155
+
156
+`POST /password/reset` always responds with the same generic notice — "If an account is registered to that address, we've sent a password-reset link." — whether or not the email exists. No email is sent for unknown addresses. This is the canonical defense against enumeration via the reset flow.
157
+
158
+`POST /password/reset/{token}` validates the token (lookup by sha256, not used, not expired), enforces the password policy (length + common-password), hashes via argon2id, updates `users.password_hash`/`password_algo`/`password_updated_at` and marks the reset row consumed — atomically in one transaction. Redirects to `/login?notice=password-reset`.
159
+
160
+## Honeypot
161
+
162
+The signup form has a hidden `company` field positioned off-screen with `tabindex="-1"`. Bots tend to fill every field they see; humans don't. Non-empty submissions are silently treated as success (303 redirect to the same `/login?notice=signup-pending` page) so bots can't detect the trap.
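The honeypot branch can be sketched as follows. Only the `company` field name and the redirect target come from the text above; `signupSubmit` here is a stripped-down stand-in for the real handler, and the `created` counter is a placeholder for validation and user creation.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/url"
	"strings"
)

const pendingURL = "/login?notice=signup-pending"

// signupSubmit sketches only the honeypot branch. created counts how many
// submissions would have reached real signup processing.
func signupSubmit(created *int) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if err := r.ParseForm(); err != nil {
			http.Error(w, "bad request", http.StatusBadRequest)
			return
		}
		if r.PostForm.Get("company") == "" {
			*created++ // stand-in for validation + user creation
		}
		// Same 303 either way, so a bot cannot tell it was trapped.
		http.Redirect(w, r, pendingURL, http.StatusSeeOther)
	}
}

// submit posts a signup form with the given honeypot value and returns the
// status, Location header, and number of accounts "created".
func submit(company string) (status int, location string, created int) {
	form := url.Values{"username": {"alice"}, "company": {company}}
	req := httptest.NewRequest(http.MethodPost, "/signup", strings.NewReader(form.Encode()))
	req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
	rec := httptest.NewRecorder()
	signupSubmit(&created).ServeHTTP(rec, req)
	return rec.Code, rec.Header().Get("Location"), created
}

func main() {
	fmt.Println(submit(""))
	fmt.Println(submit("Acme Corp"))
}
```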
163
+
164
+## Admin escape hatch
165
+
166
+```sh
167
+shithubd admin reset-password <username>
168
+```
169
+
170
+Generates a fresh password-reset token (1-hour TTL), persists it, and emails the link to the user's primary email via the configured backend. Useful when a locked-out user can't drive the public reset flow themselves.
171
+
172
+## Configuration
173
+
174
+All auth settings flow through `internal/infra/config` (see `docs/internal/config.md`):
175
+
176
+| Key | Type | Default | Notes |
177
+|---|---|---|---|
178
+| `auth.require_email_verification` | bool | `true` | When true, login is rejected until the primary email is verified. |
179
+| `auth.base_url` | string | `http://127.0.0.1:8080` | Used for absolute links in emails. |
180
+| `auth.site_name` | string | `shithub` | Branding token for email subjects/bodies. |
181
+| `auth.email_from` | string | `shithub <noreply@shithub.local>` | Envelope From for outgoing email. |
182
+| `auth.email_backend` | string | `stdout` | `stdout \| smtp \| postmark`. |
183
+| `auth.smtp.addr` | string | `127.0.0.1:1025` | Required when `email_backend=smtp`. |
184
+| `auth.smtp.username` | string | `""` | Optional. |
185
+| `auth.smtp.password` | string | `""` | Optional. Redacted by `config print`. |
186
+| `auth.postmark.server_token` | string | `""` | Required when `email_backend=postmark`. Redacted. |
187
+| `auth.argon2.memory_kib` | uint32 | `65536` | argon2id memory cost (KiB). |
188
+| `auth.argon2.time` | uint32 | `3` | argon2id iterations. |
189
+| `auth.argon2.threads` | uint8 | `2` | argon2id lanes. |
190
+
191
+## Testing
192
+
193
+- **Unit tests** (no DB): `internal/auth/password`, `internal/auth/token`, `internal/auth/email`, `internal/auth/reserved`, `internal/passwords`. Run with `go test ./...`.
194
+- **DB-backed tests** (skip when `SHITHUB_TEST_DATABASE_URL` is unset): `internal/auth/throttle`, `internal/web/handlers/auth`. Use the dbtest harness (clone-from-template) for parallel safety.
195
+
196
+The auth integration tests cover:
197
+- Signup → verify → login → logout
198
+- Password reset end-to-end
199
+- Generic notice for unknown email at reset time (no enumeration)
200
+- Login throttled after 6 failed attempts (with `Retry-After`)
201
+- Login response time roughly equal between existing and missing usernames
202
+- Reserved-name rejection
203
+- Common-password rejection
204
+- Honeypot silently accepted
205
+
206
+## Pitfalls / what to remember
207
+
208
+- **Don't leak existence via timing.** Always call `password.VerifyAgainstDummy` on missing-user login attempts.
209
+- **Don't leak existence via reset.** Always render the same notice regardless of whether the email maps to a real account.
210
+- **Username uniqueness** is enforced by the `citext UNIQUE` constraint, not by a pre-check (TOCTOU race otherwise).
211
+- **No password length cap** at the application layer — but cap the request body so the argon2 hasher can't be weaponized.
212
+- **`html/template` HTML-escapes attribute values** including `+` → `&#43;`. Browsers decode this transparently; non-browser clients (e.g. test clients) need to call `html.UnescapeString`.
213
+- **Reserved-name list is load-bearing** — when adding a new top-level route, update `internal/auth/reserved.go` in the same PR.
214
+- **Email send failure must not break signup.** The verify-resend flow handles delivery retries.
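The `html/template` pitfall above is easy to reproduce. The sketch below renders a value containing `+` into an attribute and then recovers it the way a non-browser test client would need to; `renderInput` and the deliberately naive `attrValue` scraper are hypothetical helpers, not code from the test suite.

```go
package main

import (
	"bytes"
	"fmt"
	"html"
	"html/template"
	"strings"
)

// renderInput renders value into an attribute, as a form template would.
func renderInput(value string) (string, error) {
	t, err := template.New("f").Parse(`<input name="token" value="{{.}}">`)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := t.Execute(&buf, value); err != nil {
		return "", err
	}
	return buf.String(), nil
}

// attrValue naively extracts the first name="..." attribute from doc and
// decodes HTML entities in it. Good enough for a test client sketch only.
func attrValue(doc, name string) string {
	marker := name + `="`
	i := strings.Index(doc, marker)
	if i < 0 {
		return ""
	}
	rest := doc[i+len(marker):]
	j := strings.IndexByte(rest, '"')
	if j < 0 {
		return ""
	}
	return html.UnescapeString(rest[:j])
}

func main() {
	out, err := renderInput("a+b")
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
	fmt.Println(attrValue(out, "value"))
}
```

Skipping the unescape step in a test client typically shows up as a token mismatch for any value that happens to contain `+`, such as standard base64.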
docs/internal/config.md modified
@@ -55,6 +55,18 @@ shithubd version # includes a one-line summary of which sinks are confi
5555
 | `storage.s3.bucket` | string | `""` | Single bucket per environment. |
5656
 | `storage.s3.use_ssl` | bool | `false` | True for Spaces, false for local MinIO. |
5757
 | `storage.s3.force_path_style` | bool | `true` | True for MinIO, false for Spaces. |
58
+| `auth.require_email_verification` | bool | `true` | When true, login is rejected until the primary email is verified. |
59
+| `auth.base_url` | string | `http://127.0.0.1:8080` | Used for absolute links in transactional emails. |
60
+| `auth.site_name` | string | `shithub` | Branding token for email subjects/bodies. |
61
+| `auth.email_from` | string | `shithub <noreply@shithub.local>` | Envelope From for outgoing email. |
62
+| `auth.email_backend` | string | `stdout` | One of `stdout \| smtp \| postmark`. |
63
+| `auth.smtp.addr` | string | `127.0.0.1:1025` | Required when `email_backend=smtp`. |
64
+| `auth.smtp.username` | string | `""` | Optional SMTP auth username. |
65
+| `auth.smtp.password` | string | `""` | Optional SMTP auth password. Redacted by `config print`. |
66
+| `auth.postmark.server_token` | string | `""` | Required when `email_backend=postmark`. Redacted. |
67
+| `auth.argon2.memory_kib` | uint32 | `65536` | argon2id memory cost (KiB). |
68
+| `auth.argon2.time` | uint32 | `3` | argon2id iterations. |
69
+| `auth.argon2.threads` | uint8 | `2` | argon2id parallelism. |
5870
 
5971
 ## Env-var examples
6072