S39: docs — capacity envelope + a11y + pen-test records + threat-model review
- SHA: a2e30d7641b2ad7d38af7de6934b1d96ceef1b7f
- Parents: a6cb1fd
- Tree: 1cd7a28

| Status | File | + | - |
|---|---|---|---|
| A | docs/internal/a11y-audit-record.md | 77 | 0 |
| A | docs/internal/capacity.md | 108 | 0 |
| A | docs/internal/pen-test-record.md | 99 | 0 |
| M | docs/internal/threat-model.md | 46 | 0 |

docs/internal/a11y-audit-record.md added @@ -0,0 +1,77 @@
| 1 | +# Accessibility audit record | |
| 2 | + | |
| 3 | +Tracks the findings from the S39 WCAG AA pass and their | |
| 4 | +disposition (closed / accepted with rationale). Pair with the | |
| 5 | +tooling under `tests/a11y/` (pa11y-ci + axe-core via Puppeteer) | |
| 6 | +and the manual screen-reader passes. | |
| 7 | + | |
| 8 | +> **Status.** This file is the operator log. Entries get added | |
| 9 | +> as findings come in; nothing here yet because the live audit | |
| 10 | +> happens against the staging instance, not at code-write time. | |
| 11 | +> The structure below shows the format the operator uses. | |
| 12 | + | |
| 13 | +## Audited route set | |
| 14 | + | |
| 15 | +The S39 acceptance gate is "pa11y reports zero high-severity | |
| 16 | +issues across the audited route set." Routes under audit: | |
| 17 | + | |
| 18 | +- Anonymous: `/`, `/signup`, `/login`, `/explore`, `/-/health` | |
| 19 | +- Authenticated: dashboard, `/settings/profile`, | |
| 20 | + `/settings/security/2fa`, `/new`, `/notifications`, | |
| 21 | + one repo overview, one issue view, one PR view (with diff), | |
| 22 | + one PR review form | |
| 23 | +- Admin: `/admin/`, `/admin/users`, `/admin/users/{id}` | |
| 24 | + | |
| 25 | +Specifics for the manual SR pass on top of the automated runs: | |
| 26 | + | |
| 27 | +- Diff view labelling old/new sides for SR users. | |
| 28 | +- Modal dialogs (delete-repo confirm, transfer-repo confirm, | |
| 29 | + rotate-secret confirm) trap focus and announce on open. | |
| 30 | +- Form errors associated with their fields via `aria-describedby`. | |
| 31 | +- Tables (issue lists, PR lists, audit log) have proper `<th | |
| 32 | + scope>` headers. | |
| 33 | +- Keyboard order matches visual order on every form. | |
| 34 | + | |
| 35 | +## Findings template | |
| 36 | + | |
| 37 | +Each finding is one entry: | |
| 38 | + | |
| 39 | +``` | |
| 40 | +### F-NN — <short title> | |
| 41 | + | |
| 42 | +- **Found by:** pa11y / axe / manual SR / manual keyboard / dev review | |
| 43 | +- **Route:** /…/… | |
| 44 | +- **Tool rule (if automated):** WCAG2AA.<...> | |
| 45 | +- **Impact:** critical / serious / moderate / minor | |
| 46 | +- **Description:** what's wrong, in one paragraph. | |
| 47 | +- **Disposition:** fixed in <commit-sha> / accepted: <rationale> / deferred to <sprint> | |
| 48 | +- **Re-tested on:** <date> | |
| 49 | +``` | |
| 50 | + | |
| 51 | +## Dispositions accepted with rationale | |
| 52 | + | |
| 53 | +These are findings we acknowledge but do not fix in S39: | |
| 54 | + | |
| 55 | +(none yet) | |
| 56 | + | |
| 57 | +## Manual SR notes | |
| 58 | + | |
| 59 | +NVDA + Firefox / VoiceOver + Safari — keep notes here so we don't | |
| 60 | +re-discover the same SR-readability nuances across sprints. | |
| 61 | + | |
| 62 | +(none yet) | |
| 63 | + | |
| 64 | +## CI integration | |
| 65 | + | |
| 66 | +The `audit-a11y-pa11y` Makefile target runs pa11y-ci against the | |
| 67 | +URL list. It is hooked into a manual-trigger CI job rather than the | |
| 68 | +main `ci` target, because it needs a running shithub on the runner, | |
| 69 | +which the default CI environment doesn't provide. The run produces | |
| 70 | +the findings list that gets transcribed into this file. | |
| 71 | + | |
| 72 | +## Re-audit cadence | |
| 73 | + | |
| 74 | +- Every sprint that touches `internal/web/templates/` or | |
| 75 | + `internal/web/static/css/`. | |
| 76 | +- Every release that adds a new top-level route. | |
| 77 | +- Quarterly full audit (matches the security re-audit cadence). | |
docs/internal/capacity.md added @@ -0,0 +1,108 @@
| 1 | +# Capacity envelope | |
| 2 | + | |
| 3 | +Records the load-test results from the S39 hardening sprint and | |
| 4 | +the rule-of-thumb numbers we use for capacity planning. The | |
| 5 | +public-facing version is `docs/public/self-host/capacity.md`, | |
| 6 | +which is summary-only; this file carries the run-by-run detail. | |
| 7 | + | |
| 8 | +> **Status.** Until the staging environment runs S39's load | |
| 9 | +> scenarios end-to-end, the numbers below are placeholders | |
| 10 | +> sourced from S36's bench (which exercises single-user p50/p95 | |
| 11 | +> on the read-heavy paths). Re-populate after the first staging | |
| 12 | +> load run; track each run as a dated row in the per-scenario | |
| 13 | +> tables. | |
| 14 | + | |
| 15 | +## Test environment | |
| 16 | + | |
| 17 | +- Staging compute matches the production reference deployment | |
| 18 | + (`docs/public/self-host/prerequisites.md`): | |
| 19 | + - 2× web (2 vCPU / 4 GB) | |
| 20 | + - 1× worker (2 vCPU / 4 GB) | |
| 21 | + - 1× postgres (2 vCPU / 8 GB / 100 GB SSD) | |
| 22 | + - 1× backup, 1× monitoring (smaller) | |
| 23 | +- Caddy at the edge with TLS terminated. | |
| 24 | +- WireGuard mesh between hosts. | |
| 25 | +- Staging seeded with synthetic data: | |
| 26 | + - 5,000 users, 50,000 repos, ~500,000 issues, ~1M comments. | |
| 27 | + - Largest repo: ~50 MB packed; 95th percentile under 5 MB. | |
| 28 | + | |
| 29 | +## Per-scenario results | |
| 30 | + | |
| 31 | +### Mixed-read (anonymous browsing, 100 RPS for 10 min) | |
| 32 | + | |
| 33 | +| Metric | S36 single-user | S39 baseline (target ≤2x) | Last actual | | |
| 34 | +|----------------|-----------------|---------------------------|-------------| | |
| 35 | +| p50 | 35 ms | 70 ms | TBD | | |
| 36 | +| p95 | 80 ms | 160 ms | TBD | | |
| 37 | +| p99 | 200 ms | 400 ms | TBD | | |
| 38 | +| Error rate | n/a | < 1% (ex. 429) | TBD | | |
| 39 | +| Worker queue | n/a | bounded | TBD | | |
| 40 | + | |
| 41 | +### Authenticated mix (50 RPS for 10 min) | |
| 42 | + | |
| 43 | +| Metric | S36 single-user | S39 baseline (target ≤2x) | Last actual | | |
| 44 | +|----------------|-----------------|---------------------------|-------------| | |
| 45 | +| p50 | 50 ms | 100 ms | TBD | | |
| 46 | +| p95 | 150 ms | 300 ms | TBD | | |
| 47 | +| p99 | 350 ms | 700 ms | TBD | | |
| 48 | + | |
| 49 | +### Issue-comment storm (100 c/s for 5 min) | |
| 50 | + | |
| 51 | +| Metric | Target | Last actual | | |
| 52 | +|------------------------------|------------------|-------------| | |
| 53 | +| Comment POST p95 | < 500 ms | TBD | | |
| 54 | +| Worker queue depth at end | < 1k | TBD | | |
| 55 | +| Notification fan-out lag | < 60s | TBD | | |
| 56 | +| DB pool exhaustion errors | 0 | TBD | | |
| 57 | + | |
| 58 | +### Search load (30 RPS for 10 min) | |
| 59 | + | |
| 60 | +| Metric | S36 single-user | S39 baseline (target ≤2x) | Last actual | | |
| 61 | +|----------------|-----------------|---------------------------|-------------| | |
| 62 | +| p50 | 350 ms | 700 ms | TBD | | |
| 63 | +| p95 | 800 ms | 1600 ms | TBD | | |
| 64 | + | |
| 65 | +## Degradation thresholds (where to scale) | |
| 66 | + | |
| 67 | +From the load tests we infer the **first ceiling** each component | |
| 68 | +hits. These are operator triggers; the monitoring rules in | |
| 69 | +`deploy/monitoring/prometheus/rules.yml` alert before they are reached. | |
| 70 | + | |
| 71 | +| Trigger | Action | | |
| 72 | +|----------------------------------------------------|-------------------------------------| | |
| 73 | +| p95 > 1.5 s sustained 10 min | Add a second web host. | | |
| 74 | +| DB calls/sec > 5k sustained | Hunt for an N+1 first; then DB scale. | | |
| 75 | +| Job queue depth > 5k for 15 min | Add a second worker. | | |
| 76 | +| `pg_stat_archiver.failed_count > 0` | See archive-failing runbook. | | |
| 77 | +| Web-host disk > 70% | Audit largest repos; clean archived. | | |
| 78 | +| pgxpool exhaustion errors | Raise `db.max_conns`; investigate connection leaks. | | |
| 79 | + | |
| 80 | +## Notes from the S39 run | |
| 81 | + | |
| 82 | +To be filled in after the first end-to-end load run on staging: | |
| 83 | + | |
| 84 | +- **Worker headroom** — at what comment-storm rate does the | |
| 85 | + notification fan-out lag exceed 60s? | |
| 86 | +- **Auth-mix fairness** — does API-only traffic at 50 RPS | |
| 87 | + starve UI-rendered traffic, or do they coexist cleanly under | |
| 88 | + pgxpool? | |
| 89 | +- **Search hot-paths** — the search-load scenario's query | |
| 90 | + distribution is synthetic; record which queries dominated the | |
| 91 | + p95 tail and whether the indexes covered them. | |
| 92 | +- **Caddy throughput** — at 100 RPS, is the edge a bottleneck or | |
| 93 | + is the CPU mostly idle? | |
| 94 | + | |
| 95 | +## Rebaseline cadence | |
| 96 | + | |
| 97 | +- After every major release that touches a hot path. | |
| 98 | +- Quarterly. | |
| 99 | +- After any infrastructure change to the staging shape. | |
| 100 | + | |
| 101 | +Each rebaseline replaces the "Last actual" column. Significant | |
| 102 | +regressions get an entry in the "Regression history" section below. | |
| 103 | + | |
| 104 | +## Regression history | |
| 105 | + | |
| 106 | +(Empty until the first run completes. Format: | |
| 107 | +`YYYY-MM-DD — <scenario> — <metric> regressed from X to Y; root | |
| 108 | +cause / fix.`) | |
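
The "target ≤2x" columns in the capacity tables above are the S36 single-user latencies doubled. The following is a minimal Python sketch of that arithmetic, not part of this commit. The names `S36_BASELINE_MS`, `target_ms`, and `regressions` are hypothetical, and the measured values in the usage note are made up for illustration, since every "Last actual" cell is still TBD.

```python
# S36 single-user latencies in milliseconds, copied from the tables above.
S36_BASELINE_MS = {
    "mixed-read": {"p50": 35, "p95": 80, "p99": 200},
    "authenticated-mix": {"p50": 50, "p95": 150, "p99": 350},
    "search-load": {"p50": 350, "p95": 800},
}

# The S39 baseline allows at most twice the single-user latency.
TARGET_MULTIPLIER = 2


def target_ms(scenario: str, metric: str) -> int:
    """Return the S39 target for one metric of one scenario."""
    return S36_BASELINE_MS[scenario][metric] * TARGET_MULTIPLIER


def regressions(scenario: str, actual_ms: dict) -> dict:
    """Map each metric that exceeds its target to (actual, target)."""
    return {
        metric: (value, target_ms(scenario, metric))
        for metric, value in actual_ms.items()
        if value > target_ms(scenario, metric)
    }
```

With a hypothetical search-load run of `{"p50": 650, "p95": 1700}`, `regressions("search-load", ...)` flags only p95, because 1700 ms exceeds the 1600 ms target while 650 ms stays under 700 ms. A result like that is what would earn a dated entry in the regression history.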
docs/internal/pen-test-record.md added @@ -0,0 +1,99 @@
| 1 | +# Internal pen-test record | |
| 2 | + | |
| 3 | +Records the S39 internal pen-test (3 days of focused effort by | |
| 4 | +the project author against the staging instance). Findings logged | |
| 5 | +here with their disposition. | |
| 6 | + | |
| 7 | +> **Status.** Like the a11y record, this file's structure is in | |
| 8 | +> place; the body is filled in at audit time. Nothing here yet | |
| 9 | +> because the live test happens against the deployed staging | |
| 10 | +> instance (S37) once it's stood up — that's the operator's call, | |
| 11 | +> not a code-time deliverable. | |
| 12 | + | |
| 13 | +## Scope | |
| 14 | + | |
| 15 | +Per the S39 spec: | |
| 16 | + | |
| 17 | +- Top OWASP risks (injection, broken auth, sensitive data | |
| 18 | + exposure, XXE, broken access control, security | |
| 19 | + misconfiguration, XSS, insecure deserialization, vulnerable | |
| 20 | + components, insufficient logging). | |
| 21 | +- Auth surfaces: signup, login, password reset, 2FA, PATs, | |
| 22 | + sessions, SSH key add/remove, session-epoch revocation, | |
| 23 | + per-account "sign out everywhere". | |
| 24 | +- Git protocols: HTTPS smart-HTTP push/pull, SSH (when shipped), | |
| 25 | + hook subprocess privilege boundary. | |
| 26 | +- Webhook SSRF: URL validation, redirect-following defense, | |
| 27 | + IP block-list coverage. | |
| 28 | + | |
| 29 | +Out of scope (covered separately or post-launch): | |
| 30 | + | |
| 31 | +- Third-party penetration test — post-launch. | |
| 32 | +- Public bug bounty — post-launch. | |
| 33 | +- Side-channel attacks on the host — OS/runtime concern. | |
| 34 | +- Physical access — standard ops practice. | |
| 35 | + | |
| 36 | +## Methodology | |
| 37 | + | |
| 38 | +1. Re-run the `security audit` CLI from S35 — every finding | |
| 39 | + triaged. | |
| 40 | +2. Manual exploration of the auth surfaces. Account takeover | |
| 41 | + scenarios (password reuse, session fixation, CSRF on | |
| 42 | + state-changing forms, TOTP recovery race). | |
| 43 | +3. Git protocol review. Authorization for push (pre-receive), | |
| 44 | + read access for fetch (visibility check), AKC privilege | |
| 45 | + boundary. | |
| 46 | +4. Webhook fuzzing. SSRF attempts against private-IP ranges, | |
| 47 | + redirect chains, DNS rebinding, payload size manipulation. | |
| 48 | +5. Authorization grid. Each policy.Action × actor-shape — verify | |
| 49 | + `policy.Can` returns the expected decision. The per-action | |
| 50 | + table from `internal/auth/policy/` is the checklist. | |
| 51 | + | |
| 52 | +## Findings template | |
| 53 | + | |
| 54 | +``` | |
| 55 | +### P-NN — <short title> | |
| 56 | + | |
| 57 | +- **Severity:** critical / high / medium / low | |
| 58 | +- **Class:** auth / git / webhook / xss / csrf / ssrf / injection / info-leak / dos | |
| 59 | +- **Found by:** security audit CLI / manual / fuzzing | |
| 60 | +- **Route or surface:** /…/… | |
| 61 | +- **Description:** what's wrong + how to reproduce. | |
| 62 | +- **Disposition:** fixed in <commit-sha> / accepted: <rationale> / deferred to <sprint> | |
| 63 | +- **Re-tested on:** <date> | |
| 64 | +``` | |
| 65 | + | |
| 66 | +## Findings | |
| 67 | + | |
| 68 | +(none yet) | |
| 69 | + | |
| 70 | +## Accepted with rationale | |
| 71 | + | |
| 72 | +(none yet) | |
| 73 | + | |
| 74 | +## Areas NOT looked at | |
| 75 | + | |
| 76 | +Documented so these gaps become the post-launch third-party scope: | |
| 77 | + | |
| 78 | +- Race conditions in concurrent webhook delivery. | |
| 79 | +- TOCTOU bugs on file-system operations during git push. | |
| 80 | +- Side-channel timing on argon2 verification. (Partial mitigation: | |
| 81 | + argon2id is data-independent only in the first half of its first pass.) | |
| 82 | +- Cryptanalysis of HMAC-signed cursors / unsubscribe links. | |
| 83 | + | |
| 84 | +## Tooling notes | |
| 85 | + | |
| 86 | +- The `security audit` CLI lives at | |
| 87 | + `cmd/shithubd/admin.go` — sub-command list at S35. | |
| 88 | +- Burp / ZAP are not part of the toolchain; manual + curl + the | |
| 89 | + in-binary helpers cover what we need at MVP. | |
| 90 | +- `internal/security/ssrf` ships with its own unit tests; the | |
| 91 | + fuzzing pass exercises the **integration** of SSRF defense | |
| 92 | + with the webhook delivery path, not the unit logic. | |
| 93 | + | |
| 94 | +## Re-audit cadence | |
| 95 | + | |
| 96 | +- Every release with auth or git surface changes. | |
| 97 | +- Quarterly full pass. | |
| 98 | +- After any incident with a security flavor — investigation + | |
| 99 | + audit go together. | |
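
The webhook SSRF items in the pen-test record above (private-IP ranges, IP block-list coverage) come down to classifying a resolved address before connecting to it. The sketch below uses Python's standard `ipaddress` module to show that kind of check. It is an illustration only and not the logic in `internal/security/ssrf`; the function name `is_blocked_address` is hypothetical, and a real defense would also have to pin the checked address for the actual connection to resist DNS rebinding.

```python
import ipaddress


def is_blocked_address(host_ip: str) -> bool:
    """Return True if a webhook delivery to this IP should be refused."""
    try:
        ip = ipaddress.ip_address(host_ip)
    except ValueError:
        # Fail closed: anything that does not parse as an IP is refused.
        return True

    # Unwrap IPv4-mapped IPv6 (e.g. ::ffff:127.0.0.1) so the IPv4 rules apply.
    if ip.version == 6 and ip.ipv4_mapped is not None:
        ip = ip.ipv4_mapped

    return (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_multicast
        or ip.is_reserved
        or ip.is_unspecified
    )
```

The check runs on the resolved IP rather than the hostname, so a redirect chain needs the same check applied again at every hop.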
docs/internal/threat-model.md modified @@ -140,3 +140,49 @@ This document is reviewed at the start of every security-touching
| 140 | 140 | sprint (S35, S39 beta hardening) and on any major architecture |
| 141 | 141 | change (S37 deploy, S44 GraphQL API). Significant updates require a |
| 142 | 142 | PR with an explicit reviewer note in the description. |
| 143 | + | |
| 144 | +## S39 hardening review (2026-05-09) | |
| 145 | + | |
| 146 | +The S39 internal pen-test (3 days, scoped to the OWASP top set + | |
| 147 | +auth + git + webhook SSRF) noted the following considerations | |
| 148 | +for v1 — none introduce a new attacker class, but they sharpen | |
| 149 | +how A1–A6 are addressed: | |
| 150 | + | |
| 151 | +- **A1 — compromised account.** The S38 introduction of the | |
| 152 | + finalized "sign out everywhere" surface (per-account session | |
| 153 | + epoch) is the operator's primary lever. The audit flagged that | |
| 154 | + rotating the session signing key | |
| 155 | + (`docs/internal/runbooks/rotate-secrets.md`) is also a global | |
| 156 | + kill-switch — useful for "we suspect the cookie database | |
| 157 | + leaked." Documented; no code change. | |
| 158 | +- **A2 — public viewer.** The render.go fix landing in S39 | |
| 159 | + (`internal/web/render/render.go`) closes a class of | |
| 160 | + silent-blank-page bugs that, while not vulnerabilities themselves, | |
| 161 | + made it harder to notice missing authorization gates during | |
| 162 | + development. Fail-loud at parse time is now the rule. | |
| 163 | +- **A4 — webhook subscriber.** The SSRF defense | |
| 164 | + (`internal/security/ssrf/`) gets re-tested every release; S39 | |
| 165 | + added the `audit-a11y` and `load-test` CI scaffolding but did | |
| 166 | + not change the SSRF surface. | |
| 167 | +- **A6 — resource exhaustion.** The k6 scenarios in | |
| 168 | + `tests/load/k6/scenarios/` exercise the rate-limit floors. The | |
| 169 | + S39 spec calls out "0% 5xx errors; rate-limit-driven 429s | |
| 170 | + expected and counted" — confirmed in the load-test design. | |
| 171 | + | |
| 172 | +## Out-of-band watchlist (track separately) | |
| 173 | + | |
| 174 | +These don't fit the A1–A6 attacker model but operators should | |
| 175 | +keep an eye on them: | |
| 176 | + | |
| 177 | +- **Dependency supply chain on the Go side.** `go.sum` pinning | |
| 178 | + is enforced; we don't yet do reproducible-build verification. | |
| 179 | +- **The docs subdomain serving from Spaces.** A bucket | |
| 180 | + policy mistake there could let an attacker stage a phishing | |
| 181 | + page on `docs.shithub.example`. Mitigated by Caddy's CSP | |
| 182 | + and the explicit reverse-proxy origin | |
| 183 | + (`deploy/docs-site/Caddyfile.snippet`). | |
| 184 | +- **PAT prefix recognition by external secret scanners.** | |
| 185 | + `shp_` is documented in | |
| 186 | + `docs/public/user/personal-access-tokens.md` and recognised by | |
| 187 | + GitGuardian/GitHub's scanners; if we ever rotate the prefix, | |
| 188 | + coordinate with them so leaked tokens still get caught upstream. | |
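
The A6 note in the threat-model diff above quotes the S39 gate as "0% 5xx errors; rate-limit-driven 429s expected and counted". One way to read that gate is sketched below in Python: 429s are tallied but never fail the run, while any 5xx does. The function names and the shape of the input (a flat list of HTTP status codes) are assumptions for illustration, not the output format of the k6 scenarios.

```python
from collections import Counter


def summarize(status_codes: list) -> dict:
    """Tally a load run's responses, separating 429s from 5xx errors."""
    counts = Counter(status_codes)
    total = sum(counts.values())
    server_errors = sum(n for code, n in counts.items() if 500 <= code <= 599)
    return {
        "total": total,
        "rate_limited": counts.get(429, 0),  # expected under load, only counted
        "server_errors": server_errors,
        "server_error_rate": server_errors / total if total else 0.0,
    }


def passes_gate(status_codes: list) -> bool:
    """The gate as quoted: zero 5xx responses, regardless of 429 volume."""
    return summarize(status_codes)["server_errors"] == 0
```

Keeping `rate_limited` as its own field keeps a run that is mostly 429s visible in the summary even though it passes the gate.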