@@ -0,0 +1,214 @@ |
| 1 | +# Authentication |
| 2 | + |
| 3 | +S05 brings the first real domain surface to shithub: email/password signup, email verification, login, logout, password reset, and rate-limiting. This doc covers what's wired, where the code lives, and the security choices behind the design. |
| 4 | + |
| 5 | +## What this sprint ships |
| 6 | + |
| 7 | +- Real users in Postgres: `users`, `user_emails`, `password_resets`, `email_verifications`, `auth_throttle`, `username_redirects`. |
| 8 | +- argon2id password hashing with PHC string encoding. |
| 9 | +- Strict username whitelist + reserved-name list. |
| 10 | +- Common-password rejection from an embedded SecLists 10k corpus. |
| 11 | +- Email Sender abstraction with stdout (dev), SMTP (MailHog/local), and Postmark (prod) backends. |
| 12 | +- Counter-based rate-limiting backed by Postgres (no Redis dep). |
| 13 | +- Auth handlers: signup, login, logout, password reset (request + confirm), email verification, verification resend. |
| 14 | +- Constant-time login: missing usernames still trigger an argon2 hash against a pre-computed dummy. |
| 15 | +- Generic password-reset response: no user-existence enumeration via that flow. |
| 16 | +- 4 KiB request-body cap on auth POSTs (anti-DoS for the hashing path). |
| 17 | +- Honeypot field on the signup form. |
| 18 | +- `shithubd admin reset-password <username>` operator escape hatch. |
| 19 | + |
| 20 | +## Out of scope (future sprints) |
| 21 | + |
| 22 | +- 2FA / TOTP (S06) |
| 23 | +- SSH keys (S07) |
| 24 | +- Personal access tokens (S08) |
| 25 | +- OAuth providers (post-MVP) |
| 26 | +- Profile / settings UI (S09 / S10) |
| 27 | + |
| 28 | +## Tables (migrations 0002–0008) |
| 29 | + |
| 30 | +| Table | Purpose | Notes | |
| 31 | +|---|---|---| |
| 32 | +| `users` | identity + password material | `username citext UNIQUE`. PHC argon2id hash. Soft-deletable (`deleted_at`) and suspendable. | |
| 33 | +| `user_emails` | one or more emails per user | `email citext UNIQUE` across all users. Partial unique index enforces one primary per user. | |
| 34 | +| `password_resets` | reset tokens | `token_hash bytea UNIQUE`. 1-hour TTL. Single-use via `used_at`. | |
| 35 | +| `email_verifications` | verification tokens | `token_hash bytea UNIQUE`. 24-hour TTL. Single-use via `used_at`. | |
| 36 | +| `auth_throttle` | rate-limit counters | `(scope, identifier)` UNIQUE. Fixed-window. | |
| 37 | +| `username_redirects` | username-change history | Old name → new user. Acts as a 30-day reservation (S10 wires the cooldown). | |
| 38 | + |
| 39 | +`citext` is enabled in migration 0002 for case-insensitive uniqueness without functional indexes. Display casing for users and emails lives in the same column — citext preserves it on read. |
| 40 | + |
| 41 | +## Password hashing |
| 42 | + |
| 43 | +`internal/auth/password` implements argon2id using `golang.org/x/crypto/argon2`. Defaults: |
| 44 | + |
| 45 | +- Memory: 64 MiB (`m=65536`) |
| 46 | +- Time: 3 iterations (`t=3`) |
| 47 | +- Threads: 2 (`p=2`) |
| 48 | +- Salt: 16 bytes |
| 49 | +- Key: 32 bytes |
| 50 | + |
| 51 | +These are tuned for ~100–300 ms per Hash on dev hardware. Operators can override via `auth.argon2.*` config. The output is a canonical PHC string: |
| 52 | + |
| 53 | +``` |
| 54 | +$argon2id$v=19$m=65536,t=3,p=2$<saltB64>$<hashB64> |
| 55 | +``` |
| 56 | + |
| 57 | +The string is stored in `users.password_hash`; `users.password_algo` is set to `argon2id-v1` so the scheme can be rolled forward later. |
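To make the PHC layout above concrete, here is a minimal, stdlib-only sketch of splitting such a string back into its cost parameters, salt, and hash. The names `phcParams` and `parsePHC` are hypothetical and not taken from `internal/auth/password`, and the sketch assumes the salt and hash use unpadded standard base64, as is conventional for PHC strings.

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
	"strings"
)

// phcParams holds the fields recovered from an argon2id PHC string.
// Hypothetical type, for illustration only.
type phcParams struct {
	MemoryKiB uint32
	Time      uint32
	Threads   uint8
	Salt      []byte
	Hash      []byte
}

// parsePHC splits "$argon2id$v=19$m=..,t=..,p=..$<salt>$<hash>" into its parts.
func parsePHC(encoded string) (phcParams, error) {
	var p phcParams
	// The leading "$" yields an empty first element, so six parts are expected.
	parts := strings.Split(encoded, "$")
	if len(parts) != 6 || parts[0] != "" || parts[1] != "argon2id" {
		return phcParams{}, errors.New("not an argon2id PHC string")
	}
	var version int
	if _, err := fmt.Sscanf(parts[2], "v=%d", &version); err != nil || version != 19 {
		return phcParams{}, errors.New("unsupported argon2 version")
	}
	if _, err := fmt.Sscanf(parts[3], "m=%d,t=%d,p=%d", &p.MemoryKiB, &p.Time, &p.Threads); err != nil {
		return phcParams{}, fmt.Errorf("bad cost parameters: %w", err)
	}
	var err error
	// Assumption: unpadded standard base64 for both salt and hash.
	if p.Salt, err = base64.RawStdEncoding.DecodeString(parts[4]); err != nil {
		return phcParams{}, fmt.Errorf("bad salt: %w", err)
	}
	if p.Hash, err = base64.RawStdEncoding.DecodeString(parts[5]); err != nil {
		return phcParams{}, fmt.Errorf("bad hash: %w", err)
	}
	return p, nil
}

func main() {
	salt := base64.RawStdEncoding.EncodeToString(make([]byte, 16))
	sum := base64.RawStdEncoding.EncodeToString(make([]byte, 32))
	p, err := parsePHC("$argon2id$v=19$m=65536,t=3,p=2$" + salt + "$" + sum)
	if err != nil {
		panic(err)
	}
	fmt.Println(p.MemoryKiB, p.Time, p.Threads, len(p.Salt), len(p.Hash))
}
```

Keeping the parameters inside the stored string is what lets a verifier re-derive the key with the original costs even after the configured defaults change.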
| 58 | + |
| 59 | +A package-level `dummyEncoded` is computed on first use. Login handlers call `password.VerifyAgainstDummy` when the username doesn't exist so the response time matches a real failed-password verification (defense against username enumeration via timing). |
| 60 | + |
| 61 | +The minimum password length is 10 characters. There is **no** maximum — argon2id can hash arbitrary input — but the auth-POST middleware caps `r.Body` at 4 KiB so a misbehaving client can't ship a 10 MB password to weaponize the hashing path. |
| 62 | + |
| 63 | +## Tokens |
| 64 | + |
| 65 | +`internal/auth/token` mints 32-byte cryptographically random tokens, base64url-encodes them for inclusion in URLs and emails, and stores only their SHA-256 hash in the DB. The raw token is never stored. Lookups take the token from the URL path, hash it, and query by hash through the UNIQUE index. Comparisons use `crypto/subtle.ConstantTimeCompare`. |
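The mint, hash, and compare cycle described above can be sketched with the standard library alone. The function names are hypothetical, and hashing the decoded 32 bytes (rather than the base64url text) is an assumption; the actual package may do either.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"crypto/subtle"
	"encoding/base64"
	"errors"
	"fmt"
)

// mintToken returns the URL-safe token to send to the user and the SHA-256
// digest that would be persisted in its place.
func mintToken() (raw string, digest []byte, err error) {
	buf := make([]byte, 32)
	if _, err = rand.Read(buf); err != nil {
		return "", nil, err
	}
	sum := sha256.Sum256(buf)
	return base64.RawURLEncoding.EncodeToString(buf), sum[:], nil
}

// hashPresented decodes a token taken from a request and returns its digest,
// rejecting anything that is not exactly 32 bytes of base64url.
func hashPresented(presented string) ([]byte, error) {
	buf, err := base64.RawURLEncoding.DecodeString(presented)
	if err != nil || len(buf) != 32 {
		return nil, errors.New("malformed token")
	}
	sum := sha256.Sum256(buf)
	return sum[:], nil
}

// matches compares two digests without leaking where they differ.
func matches(stored, candidate []byte) bool {
	return subtle.ConstantTimeCompare(stored, candidate) == 1
}

func main() {
	raw, stored, err := mintToken()
	if err != nil {
		panic(err)
	}
	candidate, err := hashPresented(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(raw), matches(stored, candidate))
}
```

Because only the digest is persisted, a leaked copy of the table does not yield usable reset or verification links.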
| 66 | + |
| 67 | +## Reserved-name list |
| 68 | + |
| 69 | +`internal/auth/reserved.go` maintains a set of usernames that may not be claimed: every static top-level route shithub registers, GitHub-known reservations we mirror for parity, and a buffer of likely-future routes. Signup checks this list (case-insensitive — `LOGIN` is rejected the same as `login`). |
| 70 | + |
| 71 | +When a future sprint adds a new top-level route, the leading path segment **must** be added here. The intended audit mechanism: `internal/web/handlers/handlers_test.go` should walk the registered chi routes and check every static segment against `auth.ReservedNames()`. (Stub-level today; full route-walking lands when the route surface stabilizes in S11+.) |
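A minimal sketch of the case-insensitive check, assuming a plain set keyed by lowercase name. The entries below are only examples drawn from routes this document mentions, not the contents of `internal/auth/reserved.go`, and `isReserved` is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
)

// reserved is an illustrative subset; the real list is much longer.
var reserved = map[string]struct{}{
	"login":        {},
	"logout":       {},
	"signup":       {},
	"password":     {},
	"verify-email": {},
}

// isReserved reports whether a username collides with a reserved name,
// ignoring case.
func isReserved(username string) bool {
	_, ok := reserved[strings.ToLower(username)]
	return ok
}

func main() {
	fmt.Println(isReserved("LOGIN"), isReserved("alice"))
}
```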
| 72 | + |
| 73 | +## Common-password list |
| 74 | + |
| 75 | +`internal/passwords` embeds the SecLists `10k-most-common.txt` corpus (~73 KB on disk). `IsCommon` is case-insensitive. Used at signup and password-reset. |
| 76 | + |
| 77 | +To refresh: replace `internal/passwords/common_passwords.txt` with a newer SecLists snapshot and re-run the test suite. |
| 78 | + |
| 79 | +## Email service |
| 80 | + |
| 81 | +`internal/auth/email` defines the `Sender` interface. Three implementations: |
| 82 | + |
| 83 | +- `StdoutSender` — writes a human-readable dump to a writer. Default in dev when no SMTP is configured. Convenient for tests. |
| 84 | +- `SMTPSender` — plain SMTP for MailHog locally. Authenticated and TLS-upgrade variants supported. |
| 85 | +- `PostmarkSender` — Postmark transactional API. Production default. |
| 86 | + |
| 87 | +`messages.go` hosts the `VerifyMessage` and `ResetMessage` builders. Both produce HTML + plaintext bodies — every transactional email shithub sends works in plain-text-only clients. Templates are inlined (short, rarely change); when they grow, promote to `templates/email/*.{html,txt}`. |
| 88 | + |
| 89 | +### Wiring |
| 90 | + |
| 91 | +`auth.email_backend` chooses the implementation: `stdout | smtp | postmark`. The `smtp` backend additionally requires `auth.smtp.addr`; `postmark` requires `auth.postmark.server_token`. Validation enforces these. |
| 92 | + |
| 93 | +```sh |
| 94 | +# Dev: capture mail in MailHog |
| 95 | +make dev-email |
| 96 | +SHITHUB_AUTH__EMAIL_BACKEND=smtp ./bin/shithubd web |
| 97 | +# Open http://127.0.0.1:8025 to read captured messages |
| 98 | +``` |
| 99 | + |
| 100 | +## Rate limiting |
| 101 | + |
| 102 | +`internal/auth/throttle` implements fixed-window counters via the `auth_throttle` table. The `BumpAuthThrottle` query atomically increments the counter for `(scope, identifier)`, or starts a new window when the existing window began before `now - Window`. |
| 103 | + |
| 104 | +Limits enforced by the auth handlers: |
| 105 | + |
| 106 | +| Scope | Identifier | Max | Window | Where | |
| 107 | +|---|---|---|---|---| |
| 108 | +| `signup` | `ip:<client-ip>` | 5 | 1h | `signupSubmit` | |
| 109 | +| `login` | `ip:<client-ip>\|<username>` | 6 | 15m | `loginSubmit` (reset on success) | |
| 110 | +| `reset` | `email:<addr>` | 3 | 1h | `resetRequestSubmit` | |
| 111 | + |
| 112 | +429 responses include a `Retry-After` header. |
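The fixed-window semantics can be illustrated with an in-memory stand-in for the `auth_throttle` table. This is a sketch under assumptions: the names are hypothetical, blocked attempts are counted toward the window, and the real implementation does the increment in a single Postgres statement rather than in a Go map.

```go
package main

import (
	"fmt"
	"time"
)

const loginWindow = 15 * time.Minute

// demoStart is an arbitrary fixed instant so the example is deterministic.
var demoStart = time.Date(2024, 1, 1, 0, 0, 0, 0, time.UTC)

type window struct {
	start time.Time
	hits  int
}

// limiter is an in-memory stand-in for the auth_throttle table.
type limiter struct {
	max     int
	window  time.Duration
	buckets map[string]*window
}

func newLimiter(max int, w time.Duration) *limiter {
	return &limiter{max: max, window: w, buckets: map[string]*window{}}
}

// bump records one attempt for (scope, identifier). It reports whether the
// attempt is within the limit and, if not, how long until the window ends
// (the value a Retry-After header would carry).
func (l *limiter) bump(scope, identifier string, now time.Time) (bool, time.Duration) {
	key := scope + "\x00" + identifier
	b, ok := l.buckets[key]
	if !ok || now.Sub(b.start) >= l.window {
		// No window yet, or the old one has expired: start a fresh one.
		b = &window{start: now}
		l.buckets[key] = b
	}
	b.hits++
	if b.hits > l.max {
		return false, b.start.Add(l.window).Sub(now)
	}
	return true, 0
}

func main() {
	l := newLimiter(6, loginWindow)
	id := "ip:203.0.113.7|alice"
	for i := 1; i <= 7; i++ {
		ok, retry := l.bump("login", id, demoStart.Add(time.Duration(i)*time.Minute))
		fmt.Println(i, ok, retry)
	}
}
```

A fixed window is simple to express as one upsert, at the cost of allowing a burst of up to twice the limit across a window boundary.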
| 113 | + |
| 114 | +We deliberately use Postgres rather than introducing Redis. At launch scale this is well within Postgres's comfort zone, and avoiding a new dep is worth the marginal latency. Migrate if S36 proves it necessary. |
| 115 | + |
| 116 | +## Sessions |
| 117 | + |
| 118 | +S02 ships an AEAD-encrypted cookie store; S05 extends it to carry `user_id`. `Session.IsAnonymous()` reports whether a user is bound. The cookie store re-encrypts on every Save, producing a fresh ciphertext — that's our defense against session fixation. |
| 119 | + |
| 120 | +Login flow: |
| 121 | + |
| 122 | +1. Verify password. |
| 123 | +2. Mutate the loaded session to set `UserID` and `IssuedAt`. |
| 124 | +3. Call `SessionStore.Save` — re-encrypts the cookie under the new state. |
| 125 | + |
| 126 | +Logout flow: |
| 127 | + |
| 128 | +1. `SessionStore.Clear` — deletes the cookie. |
| 129 | + |
| 130 | +## Auth middleware |
| 131 | + |
| 132 | +`internal/web/middleware/auth.go` adds: |
| 133 | + |
| 134 | +- `OptionalUser(lookup)` — populates `CurrentUser{ID, Username}` into context when the session has a `user_id`. Anonymous requests still pass through. |
| 135 | +- `RequireUser` — redirects to `/login?next=<requested-path>` for anonymous requests. |
| 136 | +- `MaxBodySize(n)` — wraps `r.Body` in `http.MaxBytesReader(w, r.Body, n)`. Used on auth POSTs (4 KiB). |
| 137 | + |
| 138 | +`CurrentUserFromContext(ctx)` returns the bound `CurrentUser` (zero value when anonymous). |
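As an illustration of the body cap, the sketch below wraps a handler with `http.MaxBytesReader` and shows an oversized POST failing at read time. The middleware signature (a `func(http.Handler) http.Handler` factory) and the 413 response are assumptions, and `statusFor` is a hypothetical helper that exists only to drive the example.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// MaxBodySize caps the request body at n bytes for the wrapped handler.
func MaxBodySize(n int64) func(http.Handler) http.Handler {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			r.Body = http.MaxBytesReader(w, r.Body, n)
			next.ServeHTTP(w, r)
		})
	}
}

// statusFor posts a body of the given size through a 4 KiB cap and returns
// the response status.
func statusFor(size int) int {
	h := MaxBodySize(4 << 10)(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// Reading past the cap returns an error instead of more data.
		if _, err := io.ReadAll(r.Body); err != nil {
			http.Error(w, "request body too large", http.StatusRequestEntityTooLarge)
			return
		}
		w.WriteHeader(http.StatusNoContent)
	}))
	body := strings.NewReader(strings.Repeat("a", size))
	req := httptest.NewRequest(http.MethodPost, "/login", body)
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusFor(1024), statusFor(5000)) // prints "204 413"
}
```

The cap is enforced lazily: nothing is rejected until the handler actually reads the body, so form parsing has to check its error.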
| 139 | + |
| 140 | +## CSRF |
| 141 | + |
| 142 | +S02's `nosurf` wrapper guards every state-changing route. S05 fixes a wart: nosurf's default `isTLS` returns true unconditionally, which makes its same-origin Referer check require an `https` scheme even on plain-HTTP requests. We set `isTLS` to a function that consults `r.TLS` and `X-Forwarded-Proto: https` so dev (HTTP) and prod (TLS-terminated) both work correctly. |
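A minimal sketch of the scheme detection described above, written as a standalone predicate rather than against nosurf's API (how the function is registered with nosurf is not shown here). `requestIsTLS` and `probe` are hypothetical names, and trusting `X-Forwarded-Proto` is only reasonable when the header is set by a proxy you control.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// requestIsTLS reports whether the request arrived over TLS, either directly
// or via a TLS-terminating proxy that sets X-Forwarded-Proto.
func requestIsTLS(r *http.Request) bool {
	if r.TLS != nil {
		return true
	}
	return strings.EqualFold(r.Header.Get("X-Forwarded-Proto"), "https")
}

// probe builds a request with the given characteristics and runs the check.
func probe(forwardedProto string, directTLS bool) bool {
	r := httptest.NewRequest(http.MethodPost, "http://example.test/login", nil)
	if forwardedProto != "" {
		r.Header.Set("X-Forwarded-Proto", forwardedProto)
	}
	if directTLS {
		r.TLS = &tls.ConnectionState{}
	}
	return requestIsTLS(r)
}

func main() {
	fmt.Println(probe("", false), probe("https", false), probe("", true)) // prints "false true true"
}
```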
| 143 | + |
| 144 | +## Signup → verify → login → logout |
| 145 | + |
| 146 | +End-to-end flow exercised in `internal/web/handlers/auth/auth_test.go::TestSignup_Verify_Login_Logout`: |
| 147 | + |
| 148 | +1. `GET /signup` — render form. CSRF cookie set by nosurf. |
| 149 | +2. `POST /signup` — validate username (whitelist + reserved-name list), email shape, password length, common-password check. Hash password. In one transaction: create user, create user_email (unverified), set `users.primary_email_id`, create email_verifications row. Send verification email (best-effort — SMTP failure does not break signup). Redirect to `/login?notice=signup-pending`. |
| 150 | +3. `GET /verify-email/{token}` — hash the token, look up the row, validate (not used, not expired). In one transaction: mark the email verified, flip `users.email_verified` if it's the primary, mark the verification row `used_at`. Redirect to `/login?notice=verified`. |
| 151 | +4. `POST /login` — load user (constant-time fallback if missing), Verify password, check suspended/verified flags, reset throttle counter, touch `last_login_at`, mutate session with `user_id`, Save (re-encrypts cookie), redirect to `/` (or `?next=` target if it's a relative path). |
| 152 | +5. `POST /logout` — Clear session. Redirect to `/login?notice=logged-out`. |
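Step 4 only honors `?next=` when it is a relative path. One way to express that check is sketched below; `safeNext` is a hypothetical helper, and the exact rules the handler applies are an assumption. The point is that scheme-relative (`//host`) and backslash forms need rejecting along with absolute URLs, since browsers may treat them as off-site.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// safeNext returns next when it is a same-site path, and "/" otherwise.
func safeNext(next string) string {
	// Must be rooted at "/", but not "//host" or any form with a backslash,
	// which some browsers normalize into a scheme-relative URL.
	if !strings.HasPrefix(next, "/") || strings.HasPrefix(next, "//") || strings.Contains(next, "\\") {
		return "/"
	}
	u, err := url.Parse(next)
	if err != nil || u.IsAbs() || u.Host != "" {
		return "/"
	}
	return next
}

func main() {
	for _, candidate := range []string{"/settings?tab=keys", "//evil.example/x", "https://evil.example", ""} {
		fmt.Printf("%q -> %q\n", candidate, safeNext(candidate))
	}
}
```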
| 153 | + |
| 154 | +## Password reset |
| 155 | + |
| 156 | +`POST /password/reset` always responds with the same generic notice — "If an account is registered to that address, we've sent a password-reset link." — whether or not the email exists. No email is sent for unknown addresses. This is the canonical defense against enumeration via the reset flow. |
| 157 | + |
| 158 | +`POST /password/reset/{token}` validates the token (lookup by sha256, not used, not expired), enforces the password policy (length + common-password), hashes via argon2id, updates `users.password_hash`/`password_algo`/`password_updated_at` and marks the reset row consumed — atomically in one transaction. Redirects to `/login?notice=password-reset`. |
| 159 | + |
| 160 | +## Honeypot |
| 161 | + |
| 162 | +The signup form has a hidden `company` field positioned off-screen with `tabindex="-1"`. Bots tend to fill every field they see; humans don't. Non-empty submissions are silently treated as success (303 redirect to the same `/login?notice=signup-pending` page) so bots can't detect the trap. |
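A stripped-down sketch of the honeypot branch, assuming the check runs before any validation or database work. The handler body, the `accountsCreated` counter, and the `submit` helper are all hypothetical and exist only to make the behavior observable in the example.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/url"
	"strings"
)

// accountsCreated stands in for the real account-creation side effect.
var accountsCreated int

// signupSubmit answers bots and humans identically, but only does work for
// submissions that left the hidden "company" field empty.
func signupSubmit(w http.ResponseWriter, r *http.Request) {
	if r.PostFormValue("company") == "" {
		// Real validation and account creation would run here.
		accountsCreated++
	}
	http.Redirect(w, r, "/login?notice=signup-pending", http.StatusSeeOther)
}

// submit posts a signup form with the given honeypot value and returns the
// response status and Location header.
func submit(company string) (int, string) {
	form := url.Values{"username": {"alice"}, "company": {company}}
	req := httptest.NewRequest(http.MethodPost, "/signup", strings.NewReader(form.Encode()))
	req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
	rec := httptest.NewRecorder()
	signupSubmit(rec, req)
	return rec.Code, rec.Header().Get("Location")
}

func main() {
	status, location := submit("Acme Corp")
	fmt.Println(status, location, accountsCreated)
}
```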
| 163 | + |
| 164 | +## Admin escape hatch |
| 165 | + |
| 166 | +```sh |
| 167 | +shithubd admin reset-password <username> |
| 168 | +``` |
| 169 | + |
| 170 | +Generates a fresh password-reset token (1-hour TTL), persists it, and emails the link to the user's primary email via the configured backend. Useful when a locked-out user can't drive the public reset flow themselves. |
| 171 | + |
| 172 | +## Configuration |
| 173 | + |
| 174 | +All auth settings flow through `internal/infra/config` (see `docs/internal/config.md`): |
| 175 | + |
| 176 | +| Key | Type | Default | Notes | |
| 177 | +|---|---|---|---| |
| 178 | +| `auth.require_email_verification` | bool | `true` | When true, login is rejected until the primary email is verified. | |
| 179 | +| `auth.base_url` | string | `http://127.0.0.1:8080` | Used for absolute links in emails. | |
| 180 | +| `auth.site_name` | string | `shithub` | Branding token for email subjects/bodies. | |
| 181 | +| `auth.email_from` | string | `shithub <noreply@shithub.local>` | Envelope From for outgoing email. | |
| 182 | +| `auth.email_backend` | string | `stdout` | One of `stdout`, `smtp`, `postmark`. | |
| 183 | +| `auth.smtp.addr` | string | `127.0.0.1:1025` | Required when `email_backend=smtp`. | |
| 184 | +| `auth.smtp.username` | string | `""` | Optional. | |
| 185 | +| `auth.smtp.password` | string | `""` | Optional. Redacted by `config print`. | |
| 186 | +| `auth.postmark.server_token` | string | `""` | Required when `email_backend=postmark`. Redacted. | |
| 187 | +| `auth.argon2.memory_kib` | uint32 | `65536` | argon2id memory cost (KiB). | |
| 188 | +| `auth.argon2.time` | uint32 | `3` | argon2id iterations. | |
| 189 | +| `auth.argon2.threads` | uint8 | `2` | argon2id lanes. | |
| 190 | + |
| 191 | +## Testing |
| 192 | + |
| 193 | +- **Unit tests** (no DB): `internal/auth/password`, `internal/auth/token`, `internal/auth/email`, `internal/auth/reserved`, `internal/passwords`. Run with `go test ./...`. |
| 194 | +- **DB-backed tests** (skip when `SHITHUB_TEST_DATABASE_URL` is unset): `internal/auth/throttle`, `internal/web/handlers/auth`. Use the dbtest harness (clone-from-template) for parallel safety. |
| 195 | + |
| 196 | +The auth integration tests cover: |
| 197 | +- Signup → verify → login → logout |
| 198 | +- Password reset end-to-end |
| 199 | +- Generic notice for unknown email at reset time (no enumeration) |
| 200 | +- Login throttled after 6 failed attempts (with `Retry-After`) |
| 201 | +- Login response time roughly equal between existing and missing usernames |
| 202 | +- Reserved-name rejection |
| 203 | +- Common-password rejection |
| 204 | +- Honeypot silently accepted |
| 205 | + |
| 206 | +## Pitfalls / what to remember |
| 207 | + |
| 208 | +- **Don't leak existence via timing.** Always call `password.VerifyAgainstDummy` on missing-user login attempts. |
| 209 | +- **Don't leak existence via reset.** Always render the same notice regardless of whether the email maps to a real account. |
| 210 | +- **Username uniqueness** is enforced by the `citext UNIQUE` constraint, not by a pre-check (TOCTOU race otherwise). |
| 211 | +- **No password length cap** at the application layer — but cap the request body so the argon2 hasher can't be weaponized. |
| 212 | +- **`html/template` HTML-escapes attribute values**, including `+` → `&#43;`. Browsers decode this transparently; non-browser clients (e.g. test clients) need to call `html.UnescapeString`. |
| 213 | +- **Reserved-name list is load-bearing** — when adding a new top-level route, update `internal/auth/reserved.go` in the same PR. |
| 214 | +- **Email send failure must not break signup.** The verify-resend flow handles delivery retries. |
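The `html/template` pitfall above is easy to reproduce: a `+` interpolated into an attribute value comes out as the character reference `&#43;`, so a test client that scrapes the rendered HTML has to unescape before comparing. The template and helper names below are hypothetical.

```go
package main

import (
	"fmt"
	"html"
	"html/template"
	"strings"
)

// field is a tiny template with one interpolated attribute value.
var field = template.Must(template.New("field").Parse(`<input name="q" value="{{.}}">`))

// render executes the template and returns the raw HTML a client would see.
func render(v string) string {
	var b strings.Builder
	if err := field.Execute(&b, v); err != nil {
		panic(err)
	}
	return b.String()
}

// renderDecoded is what a non-browser client should compare against.
func renderDecoded(v string) string {
	return html.UnescapeString(render(v))
}

func main() {
	fmt.Println(render("a+b"))
	fmt.Println(renderDecoded("a+b"))
}
```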