Personal access tokens

S08 ships personal access tokens (PATs) — the authentication primitive for shithub's HTTP API and (eventually) git-over-HTTPS pushes. Tokens are minted via /settings/tokens, displayed once, and revocable from the same page.

What's wired

  • user_tokens table with global token_hash UNIQUE + scope array.
  • internal/auth/pat package: minting, sha256 hashing, scope set, in-memory last-used debouncer.
  • internal/web/middleware PATAuth + RequireScope middlewares.
  • internal/web/handlers/api package with GET /api/v1/user (PAT-only).
  • /settings/tokens (list, create one-time-display, revoke).
  • Recent-auth gate (Session.Recent2FAAt within 10 min) when 2FA is enrolled.
  • token_created notification email.
  • Authorization-header redaction + URL-credential stripping in slog.

Token format

shithub_pat_<32-char-base62>, e.g. shithub_pat_F75zxAWXLBWR3mno8hCa2eBM8p4X5saw.

The fixed shithub_pat_ prefix is intentional:

  • A fixed, distinctive prefix gives secret scanners (GitHub Advanced Security, GitGuardian, etc.) an exact pattern to match, so a leaked token can be flagged in source code or commit history.
  • Our slog redactor scrubs any string containing the prefix.
  • Tokens embedded in URL credentials (https://user:shithub_pat_…@host) are stripped and redacted, so even an accidentally pasted git remote stays safe in logs.

The 32-char payload carries ~190 bits of entropy (log2(62)*32) — well beyond brute-force budgets.
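
A minimal sketch of minting a token in this format and checking the entropy figure. The alphabet ordering, the rejection-sampling approach, and the names mintRaw and entropyBits are assumptions for illustration; the real code lives in internal/auth/pat and may differ.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"math"
)

const (
	prefix     = "shithub_pat_"
	payloadLen = 32
	// Assumed base62 alphabet; the real package may order it differently.
	alphabet = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
)

// mintRaw returns prefix + 32 base62 characters drawn from crypto/rand.
// Bytes >= 248 are rejected so every character stays equally likely
// (248 is the largest multiple of 62 that fits in a byte).
func mintRaw() (string, error) {
	total := len(prefix) + payloadLen
	out := make([]byte, 0, total)
	out = append(out, prefix...)
	buf := make([]byte, 64)
	for len(out) < total {
		if _, err := rand.Read(buf); err != nil {
			return "", err
		}
		for _, b := range buf {
			if b >= 248 {
				continue // would bias the modulo below
			}
			out = append(out, alphabet[int(b)%len(alphabet)])
			if len(out) == total {
				break
			}
		}
	}
	return string(out), nil
}

// entropyBits is log2(alphabet size) * payload length.
func entropyBits() float64 {
	return math.Log2(float64(len(alphabet))) * payloadLen
}

func main() {
	raw, err := mintRaw()
	if err != nil {
		panic(err)
	}
	fmt.Println(raw)
	fmt.Printf("%.1f bits\n", entropyBits())
}
```

Rejection sampling matters here: taking a random byte modulo 62 without discarding the top values would make some characters slightly more likely than others and shave a little off the stated entropy.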

Storage

Only the sha256 hash of the raw token lands in the DB:

| Column | Type | Notes |
|---|---|---|
| token_hash | bytea | sha256 of raw, UNIQUE; auth lookup uses an index seek |
| token_prefix | text | first ~16 chars of raw (incl. shithub_pat_), display-only |
| scopes | text[] | normalized via pat.NormalizeScopes before INSERT |
| expires_at | timestamptz NULL | NULL = no expiry; UI defaults to 90 days |
| last_used_at, last_used_ip | | best-effort, debounced |
| revoked_at | timestamptz NULL | Once set, the token is dead even if not yet expired |

sha256 without salt is acceptable here because the input is itself uniform-random — there is no rainbow-table surface. Constant-time compare is provided by pat.EqualHash and used by callers that compare hashes outside a DB lookup.
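
As a rough illustration of the hashing and comparison described above, using only the standard library. The function names hashToken and equalHash are stand-ins; the source only names pat.EqualHash, and its exact signature is not shown here.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"fmt"
)

// hashToken returns the sha256 digest that would be stored in token_hash.
func hashToken(raw string) []byte {
	sum := sha256.Sum256([]byte(raw))
	return sum[:]
}

// equalHash compares two digests in constant time.
// subtle.ConstantTimeCompare returns 0 right away when lengths differ,
// which is fine because digest length is not secret.
func equalHash(a, b []byte) bool {
	return subtle.ConstantTimeCompare(a, b) == 1
}

func main() {
	stored := hashToken("shithub_pat_F75zxAWXLBWR3mno8hCa2eBM8p4X5saw")
	presented := hashToken("shithub_pat_F75zxAWXLBWR3mno8hCa2eBM8p4X5saw")
	fmt.Println(len(stored), equalHash(stored, presented))
}
```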

Scopes

Coarse, GitHub-classic-PAT-style:

| Scope | Grants |
|---|---|
| repo:read | Read access to private repos the user can access |
| repo:write | Push, branch, PR creation. Implies repo:read. |
| user:read | Read user profile, verified emails |
| user:write | Modify user settings (NOT password / 2FA / tokens; those are session-only ops). Implies user:read. |
| admin:read | Read admin endpoints (only honored for users with admin role) |

Implication is enforced in pat.HasScope so callers don't have to spell it out.
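
One way the implication check could look, as a sketch. Only pat.ScopeUserRead is named in this doc; the other constant names, the implies map, and the hasScope signature are assumptions.

```go
package main

import "fmt"

// Scope values mirror the table above. Identifier names other than
// ScopeUserRead are assumed for this sketch.
type Scope string

const (
	ScopeRepoRead  Scope = "repo:read"
	ScopeRepoWrite Scope = "repo:write"
	ScopeUserRead  Scope = "user:read"
	ScopeUserWrite Scope = "user:write"
	ScopeAdminRead Scope = "admin:read"
)

// implies maps a write scope to the read scope it also grants.
var implies = map[Scope]Scope{
	ScopeRepoWrite: ScopeRepoRead,
	ScopeUserWrite: ScopeUserRead,
}

// hasScope reports whether the granted set satisfies required,
// either directly or through one level of implication.
func hasScope(granted []Scope, required Scope) bool {
	for _, g := range granted {
		if g == required {
			return true
		}
		if implied, ok := implies[g]; ok && implied == required {
			return true
		}
	}
	return false
}

func main() {
	granted := []Scope{ScopeUserWrite}
	fmt.Println(hasScope(granted, ScopeUserRead)) // true: user:write implies user:read
	fmt.Println(hasScope(granted, ScopeRepoRead)) // false: nothing grants repo access
}
```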

A route declares its required scope via middleware.RequireScope(pat.ScopeUserRead). Pure-session callers (no Authorization header) bypass scope checks entirely — sessions have implicit full scope.

Authentication paths

The middleware accepts three forms, all routed to the same lookup:

  1. Authorization: token <pat> (preferred)
  2. Authorization: Bearer <pat>
  3. HTTP Basic where the password is the PAT (matches git's credential helpers)

```sh
# Preferred
curl -H 'Authorization: token shithub_pat_…' https://shithub/api/v1/user

# Bearer (RFC 6750)
curl -H 'Authorization: Bearer shithub_pat_…' https://shithub/api/v1/user

# git-over-HTTPS (git sends HTTP Basic with password=<pat>; push support itself lands in S12)
git push https://alice:shithub_pat_…@shithub/owner/repo.git main
```
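
The three header forms could be normalized along these lines. This is a sketch, not the PATAuth middleware itself: tokenFromRequest and newRequest are hypothetical names, and treating the scheme as case-insensitive is an assumption.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// tokenFromRequest extracts the raw PAT from any of the three accepted
// Authorization forms. ok is false when no usable token is present.
func tokenFromRequest(r *http.Request) (tok string, ok bool) {
	h := r.Header.Get("Authorization")
	scheme, rest, found := strings.Cut(h, " ")
	if !found {
		return "", false
	}
	switch strings.ToLower(scheme) {
	case "token", "bearer":
		tok = strings.TrimSpace(rest)
		return tok, tok != ""
	case "basic":
		// The username is ignored; the password carries the PAT.
		_, pass, basicOK := r.BasicAuth()
		return pass, basicOK && pass != ""
	}
	return "", false
}

// newRequest builds a GET request with the given Authorization header
// (an empty string means no header). Demo helper only.
func newRequest(authorization string) *http.Request {
	r, err := http.NewRequest(http.MethodGet, "https://shithub/api/v1/user", nil)
	if err != nil {
		panic(err)
	}
	if authorization != "" {
		r.Header.Set("Authorization", authorization)
	}
	return r
}

func main() {
	fmt.Println(tokenFromRequest(newRequest("token shithub_pat_example")))
	fmt.Println(tokenFromRequest(newRequest("Bearer shithub_pat_example")))

	basic := newRequest("")
	basic.SetBasicAuth("alice", "shithub_pat_example")
	fmt.Println(tokenFromRequest(basic))
}
```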

On any auth failure the middleware writes a 401 with a WWW-Authenticate: Bearer realm="shithub", error="invalid_token", error_description="<reason>" header. Reasons we emit:

  • invalid token — missing, malformed, unknown, or any unrecoverable error
  • token revoked
  • token expired
  • account suspended

We don't distinguish "unknown" from "malformed" in the response — leaking that distinction would let an attacker probe whether a particular hash exists.
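
A sketch of how a looked-up token might map to one of the four reasons and to the 401 challenge header. The tokenRow type, the helper names, and the order of the checks (revoked, then expired, then suspended) are assumptions; only the header shape and the reason strings come from the text above.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// tokenRow is a hypothetical, trimmed-down view of a user_tokens row.
type tokenRow struct {
	ExpiresAt *time.Time // nil = no expiry
	RevokedAt *time.Time // nil = not revoked
	Suspended bool       // owning account is suspended
}

// failureReason returns the error_description for a lookup result, or ""
// when the token is usable. A nil row covers missing, malformed, and
// unknown tokens alike, so the response does not tell them apart.
func failureReason(row *tokenRow, now time.Time) string {
	switch {
	case row == nil:
		return "invalid token"
	case row.RevokedAt != nil:
		return "token revoked"
	case row.ExpiresAt != nil && !now.Before(*row.ExpiresAt):
		return "token expired"
	case row.Suspended:
		return "account suspended"
	}
	return ""
}

// writeUnauthorized emits the 401 with the Bearer challenge header.
func writeUnauthorized(w http.ResponseWriter, reason string) {
	w.Header().Set("WWW-Authenticate", fmt.Sprintf(
		`Bearer realm="shithub", error="invalid_token", error_description=%q`, reason))
	w.WriteHeader(http.StatusUnauthorized)
}

// challengeFor records the response for a reason. Demo helper only.
func challengeFor(reason string) (int, string) {
	rec := httptest.NewRecorder()
	writeUnauthorized(rec, reason)
	return rec.Code, rec.Header().Get("WWW-Authenticate")
}

// unixPtr is a small helper for building optional timestamps.
func unixPtr(sec int64) *time.Time {
	t := time.Unix(sec, 0)
	return &t
}

func main() {
	expired := &tokenRow{ExpiresAt: unixPtr(1_000)}
	reason := failureReason(expired, *unixPtr(2_000))
	fmt.Println(challengeFor(reason))
}
```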

Recent-auth gate

Creating a PAT requires a recent successful TOTP challenge if the user has 2FA enrolled:

  • Session.Recent2FAAt is set on a successful /login/2fa and on the enrollment-confirm path.
  • recentAuthOK checks the timestamp is within 10 minutes (recentAuthWindow).
  • Users without 2FA are treated as recent (the gate only matters when 2FA is in play).

When the gate fails, the token-create form shows a flash directing the user to sign in again with their authenticator code. This raises the bar for an attacker who tries to mint tokens with a stolen session cookie.
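
The gate logic is small enough to sketch. Session.Recent2FAAt, recentAuthOK, and recentAuthWindow are named in this doc; the signature, the has2FA parameter, and the treatment of a zero or future timestamp are assumptions.

```go
package main

import (
	"fmt"
	"time"
)

// recentAuthWindow is how long a successful TOTP challenge stays "recent".
const recentAuthWindow = 10 * time.Minute

// Session holds only the field this sketch needs.
type Session struct {
	Recent2FAAt time.Time // zero value = no TOTP challenge in this session
}

// recentAuthOK reports whether the session may mint a token at time now.
// Users without 2FA always pass; enrolled users need a challenge inside
// the window. Timestamps in the future are rejected as suspicious.
func recentAuthOK(s Session, has2FA bool, now time.Time) bool {
	if !has2FA {
		return true
	}
	if s.Recent2FAAt.IsZero() {
		return false
	}
	age := now.Sub(s.Recent2FAAt)
	return age >= 0 && age <= recentAuthWindow
}

// demoNow is a fixed clock so the demo output is deterministic.
var demoNow = time.Date(2024, time.January, 1, 12, 0, 0, 0, time.UTC)

func main() {
	fresh := Session{Recent2FAAt: demoNow.Add(-5 * time.Minute)}
	stale := Session{Recent2FAAt: demoNow.Add(-15 * time.Minute)}
	fmt.Println(recentAuthOK(fresh, true, demoNow)) // true: 5 min old
	fmt.Println(recentAuthOK(stale, true, demoNow)) // false: outside the window
	fmt.Println(recentAuthOK(stale, false, demoNow)) // true: no 2FA enrolled
}
```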

One-time display

Raw tokens are shown to the user exactly once at create time, then discarded. The handler:

  1. Mints the raw + hash + display prefix.
  2. Inserts the row with hash + prefix + scopes.
  3. Renders the listing template with JustCreatedRaw set to the raw value.

The template displays JustCreatedRaw in a <pre><code> block with a clear "save this now — won't be shown again" message. The raw value never lands in the session, the URL, or any log line.

Last-used debouncing

Every authenticated request would otherwise burn an UPDATE on user_tokens. To avoid hot-row contention, the middleware uses an in-memory debouncer (pat.Debouncer) keyed by token_id with a 60-second window:

  • First request in a window → ShouldTouch returns true → goroutine writes last_used_at / last_used_ip with a 500ms timeout.
  • Subsequent requests within the window → ShouldTouch returns false → no write.

Trade-offs:

  • Lost on process restart. Acceptable — last_used_at is a UI-display field, not an audit primitive.
  • Per-process. Multi-replica deployments will have ~N writes per token per window where N = replicas. Still bounded.
  • Goroutine detached from r.Context(). Intentional — a debounced touch is best-effort; we'd rather complete it than drop on client disconnect (G118 is suppressed with that justification).

pat.Debouncer.Forget(tokenID) is exposed for revoke-time invalidation in future sprints.
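
A minimal debouncer with the behavior described above might look like this. ShouldTouch and Forget are named in this doc, but their real signatures are not shown; passing now explicitly, the int64 token ID, and the NewDebouncer constructor are assumptions made so the sketch is testable.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// touchWindow is the debounce window from the text above.
const touchWindow = 60 * time.Second

// Debouncer remembers when each token last triggered a write.
type Debouncer struct {
	mu     sync.Mutex
	window time.Duration
	last   map[int64]time.Time
}

func NewDebouncer(window time.Duration) *Debouncer {
	return &Debouncer{window: window, last: make(map[int64]time.Time)}
}

// ShouldTouch returns true at most once per window per token and records
// the touch time when it does.
func (d *Debouncer) ShouldTouch(tokenID int64, now time.Time) bool {
	d.mu.Lock()
	defer d.mu.Unlock()
	if prev, ok := d.last[tokenID]; ok && now.Sub(prev) < d.window {
		return false
	}
	d.last[tokenID] = now
	return true
}

// Forget drops a token's entry, e.g. after revocation.
func (d *Debouncer) Forget(tokenID int64) {
	d.mu.Lock()
	defer d.mu.Unlock()
	delete(d.last, tokenID)
}

// start is a fixed clock so the demo output is deterministic.
var start = time.Date(2024, time.January, 1, 0, 0, 0, 0, time.UTC)

func main() {
	d := NewDebouncer(touchWindow)
	fmt.Println(d.ShouldTouch(42, start))                     // true: first request
	fmt.Println(d.ShouldTouch(42, start.Add(30*time.Second))) // false: inside window
	fmt.Println(d.ShouldTouch(42, start.Add(61*time.Second))) // true: window elapsed
}
```

Note that this sketch never evicts entries on its own, so the map grows with the number of distinct tokens seen by the process; Forget or a periodic sweep would bound it.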

Auto-revoke triggers

  • User suspension → all active tokens revoked via RevokeAllUserTokens.
  • User deletion (post-grace) → tokens hard-deleted via the ON DELETE CASCADE from user_id.
  • Password change → tokens are NOT revoked by default. Matches GitHub's behavior; user-configurable preference lands in S10.

Logging discipline

Tokens leak in three classic places: request logs, error reporters, and panic dumps. Defenses:

  1. Authorization is in the slog redactor's secret-attr list, so its value is always logged as ***.
  2. shithub_pat_ is in the value-marker list — any string containing it collapses to ***.
  3. URL credentials of the form scheme://user:pass@host get the userinfo stripped via regex while preserving host + path so logs stay useful (e.g. postgres://***@127.0.0.1:5432/shithub).
  4. errrep.SlogHandler flows error-level records through the same redactor before forwarding to GlitchTip.

Negative tests cover #1, #2, and #3 in internal/infra/log/log_test.go.
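
Defenses #1 to #3 can be approximated with a slog ReplaceAttr hook (Go 1.21+). This is a sketch, not the project's redactor: the list contents, the regex, and the choice to strip URL userinfo before checking the value markers (so host and path survive) are assumptions.

```go
package main

import (
	"log/slog"
	"os"
	"regexp"
	"strings"
)

const redacted = "***"

// secretAttrs lists attribute keys whose values are never logged.
var secretAttrs = map[string]bool{"authorization": true}

// valueMarkers lists substrings that mark a whole value as secret.
var valueMarkers = []string{"shithub_pat_"}

// urlUserinfo matches scheme://userinfo@ and captures the scheme part.
var urlUserinfo = regexp.MustCompile(`([a-zA-Z][a-zA-Z0-9+.-]*://)[^/\s@]+@`)

// redactString applies the three rules to one key/value pair.
func redactString(key, val string) string {
	if secretAttrs[strings.ToLower(key)] {
		return redacted
	}
	// Strip URL credentials first so host + path stay readable.
	val = urlUserinfo.ReplaceAllString(val, "${1}"+redacted+"@")
	for _, m := range valueMarkers {
		if strings.Contains(val, m) {
			return redacted
		}
	}
	return val
}

// replaceAttr plugs redactString into slog.HandlerOptions.
func replaceAttr(_ []string, a slog.Attr) slog.Attr {
	if a.Value.Kind() == slog.KindString {
		a.Value = slog.StringValue(redactString(a.Key, a.Value.String()))
	} else if secretAttrs[strings.ToLower(a.Key)] {
		a.Value = slog.StringValue(redacted)
	}
	return a
}

func main() {
	h := slog.NewTextHandler(os.Stdout, &slog.HandlerOptions{ReplaceAttr: replaceAttr})
	logger := slog.New(h)
	logger.Info("request",
		"Authorization", "token shithub_pat_example",
		"remote", "https://alice:shithub_pat_example@shithub/owner/repo.git",
		"dsn", "postgres://shithub:secret@127.0.0.1:5432/shithub")
}
```

Because ReplaceAttr also runs on the built-in msg attribute, a token interpolated into the log message itself is collapsed by the same marker rule.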

Settings UI

/settings/tokens shows:

  • Active and revoked tokens (revoked sorted last) with name, prefix, scopes, last-used timestamp, expiry.
  • A create form with name, expiry picker (30/90/365/none), and scope checkboxes.
  • A flash banner when the recent-auth gate is failing.
  • The newly-minted raw token in a one-time-display block at the top of the page when JustCreatedRaw is set.

Operational notes

  • Per-user cap: 50 tokens (pat.MaxTokensPerUser). The cap is a compile-time constant, so changing it means editing the code; the listing handler enforces it on create.
  • Audit: pat_created and pat_revoked rows in auth_audit_log carry {token_id, prefix, scopes} (creation) and {token_id} (revocation). Never the raw token, never the hash.
  • Notification email on every successful create with name + prefix + IP.

API surface (S08 baseline)

  • GET /api/v1/user — returns the authenticated user's public profile JSON. Requires user:read (or user:write via implication).

Future sprints add:

  • GET /api/v1/repos/{owner}/{repo} (S11 unblocks)
  • POST /api/v1/repos (S11)
  • PR / issue endpoints (S22, S23)

Pitfalls / what to remember

  • Don't store the raw token anywhere. The DB has the hash; the handler shows the raw exactly once.
  • Don't compare hashes with bytes.Equal. Use pat.EqualHash (constant-time).
  • Don't put scopes on the request — put them on the token. Sessions have implicit full scope; PATs carry exactly what the user picked.
  • Don't bypass the recent-auth gate. A hijacked session must NOT be able to mint a PAT without a fresh TOTP step.
  • Don't log Authorization headers. The redactor catches it, but a custom logger that bypasses our handler doesn't get the redaction for free.
  • Don't change a token's scopes in place. "Edit scopes" = "revoke + create new" UX. We never silently widen a token's privileges.

Related docs

  • docs/internal/auth.md — email/password auth (S05).
  • docs/internal/2fa.md — TOTP + recovery codes (S06), recent-auth pattern.
  • docs/internal/ssh-deploy.md — git-over-SSH (S07); PAT covers git-over-HTTPS in S12.
  • docs/internal/observability.md — slog redaction.