# Threat model — v1

A short, concrete document describing who shithub defends against,
what we protect, and how the controls map to attackers. This is the
v1 baseline; it evolves with the codebase. Per the S35 spec, "two
or three pages, not thirty." Pair with `security-checklist.md` for
the per-control test references.

## Assets

In rough decreasing order of consequence if compromised:

1. **User credentials.** Password hashes, TOTP secrets, recovery
   codes, PATs, SSH public keys. Direct path to account takeover.
2. **Repository content.** Source code, issues, PR discussions,
   private-repo files. Privacy + IP value; for some users, source
   IS the product.
3. **Webhook secrets + delivery payloads.** A leaked webhook secret
   lets an attacker spoof events to subscriber systems. Payloads
   often contain user content that has not been published.
4. **Site-admin actions.** Suspending users, deleting repos,
   impersonating accounts. Trust violations here cascade.
5. **Server-side resources.** CPU (argon2 hashing, git ops), disk
   (repo storage, response-body cap), DB connections.

## Attackers

### A1 — Compromised user account

The attacker authenticates as a real user (phished password, leaked
PAT). They have whatever the user has.

**Mitigations:**
- Per-account session-epoch invalidation: the user can "log out
  everywhere" and burn every session at once.
- TOTP 2FA gate (S06) — when enabled, the password alone doesn't
  log in. Recovery codes are single-use.
- Audit log on security-relevant actions (auth, settings, admin)
  surfaces the attack to the user + admins.
- Suspended-account gate (`policy.Can`) lets an admin freeze a
  hijacked account without nuking the user's data.
- Common-password blocklist blunts password spraying; argon2id
  makes offline cracking of leaked hashes expensive.
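
The session-epoch mitigation above can be sketched as follows. This is a minimal illustration, not the shithub implementation: `Account`, `Session`, `NewSession`, and `LogOutEverywhere` are hypothetical names, and a real system would persist the epoch in the database and embed it in a signed cookie.

```go
package main

import "fmt"

// Account is a hypothetical stand-in for the user row; SessionEpoch is the
// per-account counter that every issued session is stamped with.
type Account struct {
	ID           int64
	SessionEpoch int64
}

// Session is what a decoded session cookie would carry in this sketch.
type Session struct {
	AccountID int64
	Epoch     int64
}

// NewSession issues a session stamped with the account's current epoch.
func NewSession(a *Account) Session {
	return Session{AccountID: a.ID, Epoch: a.SessionEpoch}
}

// Valid reports whether the session was issued under the current epoch.
func (s Session) Valid(a *Account) bool {
	return s.AccountID == a.ID && s.Epoch == a.SessionEpoch
}

// LogOutEverywhere bumps the epoch, which invalidates every session issued
// before the bump without having to enumerate them.
func LogOutEverywhere(a *Account) {
	a.SessionEpoch++
}

func main() {
	acct := &Account{ID: 1}
	laptop := NewSession(acct)
	phone := NewSession(acct)

	LogOutEverywhere(acct)
	fresh := NewSession(acct)

	fmt.Println(laptop.Valid(acct), phone.Valid(acct), fresh.Valid(acct))
	// prints: false false true
}
```

The appeal of the design is that revocation is a single integer write per account; no session table scan is needed.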

### A2 — Malicious public-repo viewer

An anonymous or freshly-signed-up user crafts payloads against the
public-repo surface: XSS via issue body, CSRF from a public-org page,
SSRF via a webhook to internal infrastructure.

**Mitigations:**
- All user content is markdown-rendered through `internal/markdown`
  with `bluemonday` UGC policy; ad-hoc `template.HTML(...)` is
  lint-blocked outside designated render helpers.
- CSRF protection at the router root (nosurf-derived); state-changing
  routes inherit it. PAT-only API and git transports are explicit
  exemptions documented in the lint script.
- SSRF defense (`internal/security/ssrf`) on every outbound HTTP
  request: IP block-list + dial-the-IP transport (connect to the
  validated address, never a re-resolved hostname) defeats DNS
  rebinding.
- Webhook secret-decrypt failure auto-disables the hook so a
  compromised key doesn't keep delivering forever.
- CSP, `X-Frame-Options: DENY`, COOP, CORP cut off the
  embed/clickjack surface.
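
One common way to get the "check the IP that is actually dialed" property is a `net.Dialer` `Control` hook, which runs after DNS resolution and just before the socket connects. The sketch below assumes that approach; it is not the API of `internal/security/ssrf`, and `blockedAddr`, `checkDialAddress`, and `newGuardedClient` are hypothetical names. The blocked ranges shown are illustrative and not exhaustive.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"net/http"
	"net/netip"
	"syscall"
	"time"
)

var errBlockedIP = errors.New("ssrf: destination IP is not allowed")

// blockedAddr reports whether ip is in a range that outbound requests must
// not reach. IPv4-mapped IPv6 addresses are unwrapped first.
func blockedAddr(ip netip.Addr) bool {
	ip = ip.Unmap()
	return !ip.IsValid() || ip.IsLoopback() || ip.IsPrivate() ||
		ip.IsLinkLocalUnicast() || ip.IsMulticast() || ip.IsUnspecified()
}

// checkDialAddress validates the "ip:port" string handed to Dialer.Control.
// At that point the hostname has already been resolved, so a rebinding
// attacker cannot swap the address between check and connect.
func checkDialAddress(address string) error {
	ap, err := netip.ParseAddrPort(address)
	if err != nil {
		return fmt.Errorf("ssrf: unparseable dial address %q: %w", address, err)
	}
	if blockedAddr(ap.Addr()) {
		return errBlockedIP
	}
	return nil
}

// newGuardedClient returns an http.Client whose every connection attempt
// passes through checkDialAddress. Proxy is left nil so requests are never
// routed around the check.
func newGuardedClient() *http.Client {
	dialer := &net.Dialer{
		Timeout: 5 * time.Second,
		Control: func(network, address string, _ syscall.RawConn) error {
			return checkDialAddress(address)
		},
	}
	return &http.Client{
		Timeout:   10 * time.Second,
		Transport: &http.Transport{DialContext: dialer.DialContext},
	}
}

func main() {
	for _, a := range []string{"127.0.0.1:8080", "169.254.169.254:80", "93.184.216.34:443"} {
		fmt.Printf("%-20s blocked=%v\n", a, checkDialAddress(a) != nil)
	}
	_ = newGuardedClient()
}
```

Because the check runs per connection, it also covers redirects that point at internal addresses.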

### A3 — Abusive signup automation

Botnets create accounts in bulk to spam issues, host abuse content,
or burn through resources.

**Mitigations:**
- Per-IP signup throttle (S05): 5/hour.
- Per-/24 signup throttle (S35): 20/hour. Catches
  spray-from-many-IPs-on-the-same-network patterns.
- Honeypot field on the signup form (a filled-in honeypot silently
  gets a fake success response).
- Email verification required (configurable) gates account
  activation behind a real inbox.
- Captcha integration is deferred (vendor decision pending); the
  per-/24 throttle is the live primary defense.
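
A minimal sketch of how the two throttles above could compose, using the 5/hour and 20/hour limits from the list. Everything else is an assumption: the in-memory fixed-window counter, the names `networkKey`, `Throttle`, and `SignupGuard`, and the choice of a /64 bucket for IPv6 clients (the document only specifies /24).

```go
package main

import (
	"fmt"
	"net/netip"
	"sync"
	"time"
)

// networkKey maps a client IP to its network bucket: the /24 for IPv4.
// The /64 used for IPv6 is an assumption of this sketch.
func networkKey(ip string) (string, error) {
	addr, err := netip.ParseAddr(ip)
	if err != nil {
		return "", err
	}
	addr = addr.Unmap()
	bits := 24
	if addr.Is6() {
		bits = 64
	}
	p, err := addr.Prefix(bits)
	if err != nil {
		return "", err
	}
	return p.String(), nil
}

type window struct {
	start time.Time
	count int
}

// Throttle is a fixed-window counter keyed by string.
type Throttle struct {
	mu     sync.Mutex
	limit  int
	period time.Duration
	seen   map[string]*window
}

func NewThrottle(limit int, period time.Duration) *Throttle {
	return &Throttle{limit: limit, period: period, seen: map[string]*window{}}
}

// Allow counts one event for key and reports whether it is within the limit.
func (t *Throttle) Allow(key string, now time.Time) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	w, ok := t.seen[key]
	if !ok || now.Sub(w.start) >= t.period {
		t.seen[key] = &window{start: now, count: 1}
		return true
	}
	if w.count >= t.limit {
		return false
	}
	w.count++
	return true
}

// SignupGuard applies the per-IP limit first, then the per-network limit.
type SignupGuard struct {
	perIP, perNet *Throttle
}

func NewSignupGuard() *SignupGuard {
	return &SignupGuard{perIP: NewThrottle(5, time.Hour), perNet: NewThrottle(20, time.Hour)}
}

// Allow reports whether a signup from ip may proceed. Attempts already
// rejected by the per-IP limit are not counted against the network bucket.
func (g *SignupGuard) Allow(ip string) bool {
	key, err := networkKey(ip)
	if err != nil {
		return false
	}
	now := time.Now()
	return g.perIP.Allow(ip, now) && g.perNet.Allow(key, now)
}

func main() {
	key, _ := networkKey("203.0.113.57")
	fmt.Println(key) // prints: 203.0.113.0/24
}
```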

### A4 — Supply-chain via webhook subscriber

A compromised subscriber URL receives webhook deliveries and uses
the captured payloads to escalate (phishing, replay).

**Mitigations:**
- HMAC-SHA256 signing on every delivery; subscribers can verify
  authenticity. Per-webhook secret stored AEAD-encrypted at rest.
- Idempotency key on each delivery so replays are detectable.
- SSRF defense rejects subscriber URLs that resolve to private IPs
  (operator can opt in for self-hosted CI via `AllowedHosts`).
- Auto-disable on persistent failure (50 consecutive). Damage is
  bounded.
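
For illustration, HMAC-SHA256 signing and subscriber-side verification can look like the sketch below. The `sha256=<hex>` signature format and the `Sign`/`Verify` names are assumptions borrowed from common webhook conventions; the document does not state shithub's header name or encoding.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Sign returns the signature a sender would attach to a delivery.
// The "sha256=" prefix is an assumed format, not confirmed by the source.
func Sign(secret, body []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return "sha256=" + hex.EncodeToString(mac.Sum(nil))
}

// Verify recomputes the signature over the raw request body and compares it
// in constant time, so the comparison does not leak how many bytes matched.
func Verify(secret, body []byte, signature string) bool {
	expected := Sign(secret, body)
	return hmac.Equal([]byte(expected), []byte(signature))
}

func main() {
	secret := []byte("per-webhook-secret")
	body := []byte(`{"event":"push","delivery":"d-123"}`)

	sig := Sign(secret, body)
	fmt.Println(Verify(secret, body, sig))                        // prints: true
	fmt.Println(Verify(secret, []byte(`{"event":"fork"}`), sig)) // prints: false
}
```

A subscriber should verify against the exact bytes received, before any JSON parsing or re-serialization, since either can change the payload and break the signature.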

### A5 — Insider with admin access

A site admin abuses privileges (looking at user data, mass-deleting
repos, impersonating users to act on their behalf).

**Mitigations:**
- Impersonation defaults read-only (`policy.Can` +
  `DenyImpersonationReadOnly`); writes require a typed-name confirm.
- Visible red sticky banner on every page during impersonation.
- Audit row carries BOTH the real admin id and the impersonated id
  (`meta.impersonated_user_id`) for forensics.
- Bootstrap-admin CLI is the only out-of-band elevation path; all
  subsequent grants happen through `/admin/users/{id}` and are audited.
- 404 (not 403) for non-admin `/admin` access prevents privilege
  enumeration.
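
The 404-instead-of-403 behaviour can be sketched as a small middleware. This is an assumption-laden illustration: `requireAdmin`, `withAdmin`, and `probe` are hypothetical names, and the context flag stands in for whatever session lookup and `policy.Can` check the real router performs.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
)

type adminKey struct{}

// withAdmin attaches an "is site admin" flag to the request context.
// Hypothetical stand-in for the real authentication layer.
func withAdmin(r *http.Request, isAdmin bool) *http.Request {
	return r.WithContext(context.WithValue(r.Context(), adminKey{}, isAdmin))
}

// requireAdmin answers non-admins with the same 404 an unknown route would
// produce, so a probe cannot tell that /admin exists.
func requireAdmin(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if isAdmin, _ := r.Context().Value(adminKey{}).(bool); !isAdmin {
			http.NotFound(w, r)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// probe sends one GET through a tiny router and returns the status code.
func probe(path string, isAdmin bool) int {
	mux := http.NewServeMux()
	mux.Handle("/admin/", requireAdmin(http.HandlerFunc(
		func(w http.ResponseWriter, r *http.Request) {
			fmt.Fprintln(w, "admin console")
		})))

	rec := httptest.NewRecorder()
	req := withAdmin(httptest.NewRequest(http.MethodGet, path, nil), isAdmin)
	mux.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(
		probe("/admin/users/42", false),
		probe("/no-such-page", false),
		probe("/admin/users/42", true),
	)
	// prints: 404 404 200
}
```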

### A6 — Resource exhaustion

A determined attacker tries to consume CPU, DB connections, or disk
faster than our limits.

**Mitigations:**
- Body-size caps on auth POSTs (so 10MB password bodies don't burn
  argon2 cost).
- Repo-create throttle (10/hour/user); content-creation throttles
  on issues/comments/stars/forks.
- Webhook payload cap (25 MiB); response body cap (32 KiB stored).
- Job queue uses `FOR UPDATE SKIP LOCKED` so worker contention stays
  bounded.
- pgx pool max-conns capped per process.
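
As an illustration of the body-size cap on auth POSTs, the sketch below wraps a login handler with `http.MaxBytesReader` so an oversized form is rejected before any password reaches the hash function. The 4 KiB limit and the names `capBody`, `login`, and `postLogin` are hypothetical; the document does not state the real cap.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// maxAuthBody is a hypothetical cap chosen for this sketch.
const maxAuthBody = 4 << 10 // 4 KiB

// capBody limits how many bytes downstream handlers may read from the body.
func capBody(limit int64, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		r.Body = http.MaxBytesReader(w, r.Body, limit)
		next.ServeHTTP(w, r)
	})
}

// login parses the form; an over-limit body fails here with 413, before the
// (expensive) password hash comparison would run.
func login(w http.ResponseWriter, r *http.Request) {
	if err := r.ParseForm(); err != nil {
		var tooBig *http.MaxBytesError
		if errors.As(err, &tooBig) {
			http.Error(w, "request body too large", http.StatusRequestEntityTooLarge)
			return
		}
		http.Error(w, "bad form", http.StatusBadRequest)
		return
	}
	// Password verification would happen here in a real handler.
	w.WriteHeader(http.StatusOK)
}

// postLogin submits a form with a password of the given length and returns
// the response status code.
func postLogin(passwordLen int) int {
	body := strings.NewReader("password=" + strings.Repeat("a", passwordLen))
	req := httptest.NewRequest(http.MethodPost, "/login", body)
	req.Header.Set("Content-Type", "application/x-www-form-urlencoded")

	rec := httptest.NewRecorder()
	capBody(maxAuthBody, http.HandlerFunc(login)).ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(postLogin(32), postLogin(10<<20))
	// prints: 200 413
}
```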

## Out of scope (v1)

Documented here so they don't get assumed:

- **State-actor adversaries.** No claim to defend against an attacker
  with control over CDN/DNS/CA infrastructure.
- **Side-channel attacks on the host.** Spectre/Meltdown class
  defenses are the OS/runtime's responsibility.
- **Physical access to servers.** Standard ops practice; not in app
  layer.
- **Coordinated disclosure pipeline.** Future work.
- **Penetration test by external party.** Future work.

## Review cadence

This document is reviewed at the start of every security-touching
sprint (S35, S39 beta hardening) and on any major architecture
change (S37 deploy, S44 GraphQL API). Significant updates require a
PR with an explicit reviewer note in the description.

## S39 hardening review (2026-05-09)

The S39 internal pen-test (3 days, scoped to the OWASP top set +
auth + git + webhook SSRF) noted the following considerations
for v1 — none introduce a new attacker class, but they sharpen
how A1–A6 are addressed:

- **A1 — compromised account.** The S38 introduction of the
  finalized "sign out everywhere" surface (per-account session
  epoch) is the operator's primary lever. The audit flagged that
  rotating the session signing key
  (`docs/internal/runbooks/rotate-secrets.md`) is also a global
  kill-switch — useful for "we suspect the cookie database
  leaked." Documented; no code change.
- **A2 — public viewer.** The render.go fix landing in S39
  (`internal/web/render/render.go`) closes a class of
  silent-blank-page bugs that, while not vulnerabilities themselves,
  made it harder to notice missing authorization gates during
  development. Fail-loud at parse time is now the rule.
- **A4 — webhook subscriber.** The SSRF defense
  (`internal/security/ssrf/`) gets re-tested every release; S39
  added the `audit-a11y` and `load-test` CI scaffolding but did
  not change the SSRF surface.
- **A6 — resource exhaustion.** The k6 scenarios in
  `tests/load/k6/scenarios/` exercise the rate-limit floors. The
  S39 spec calls out "0% 5xx errors; rate-limit-driven 429s
  expected and counted" — confirmed in the load-test design.

## Out-of-band watchlist (track separately)

These don't fit the A1–A6 attacker model but operators should
keep an eye on them:

- **Dependency-supply-chain on the Go side.** `go.sum` pinning
  is enforced; we don't yet do reproducible-build verification.
- **The docs subdomain serving from Spaces.** A bucket
  policy mistake there could let an attacker stage a phishing
  page on `docs.shithub.sh`. Mitigated by Caddy's CSP
  and the explicit reverse-proxy origin
  (`deploy/docs-site/Caddyfile.snippet`).
- **PAT prefix recognition by external secret scanners.**
  `shp_` is documented in
  `docs/public/user/personal-access-tokens.md` and recognised by
  GitGuardian/GitHub's scanners;
  if we ever rotate the prefix, coordinate with them so leaked
  tokens still get caught upstream.