# Capacity envelope

Records the load-test results from the S39 hardening sprint and the rule-of-thumb numbers we use for capacity planning. The public-facing version is `docs/public/self-host/capacity.md`, which is summary-only; this file carries the run-by-run detail.

> **Status.** Until the staging environment runs S39's load scenarios
> end-to-end, the numbers below are placeholders sourced from S36's bench
> (which exercises single-user p50/p95 on the read-heavy paths).
> Re-populate after the first staging load run; track each run as a dated
> row in the per-scenario tables.

## Test environment

- Staging compute matches the production reference deployment (`docs/public/self-host/prerequisites.md`):
  - 2× web (2 vCPU / 4 GB)
  - 1× worker (2 vCPU / 4 GB)
  - 1× postgres (2 vCPU / 8 GB / 100 GB SSD)
  - 1× backup, 1× monitoring (smaller)
- Caddy at the edge, terminating TLS.
- WireGuard mesh between hosts.
- Staging seeded with synthetic data:
  - 5,000 users, 50,000 repos, ~500,000 issues, ~1M comments.
  - Largest repo: ~50 MB packed; 95th percentile under 5 MB.

## Per-scenario results

### Mixed-read (anonymous browsing, 100 RPS for 10 min)

| Metric       | S36 single-user | S39 target (≤2× S36) | Last actual |
|--------------|-----------------|----------------------|-------------|
| p50          | 35 ms           | 70 ms                | TBD         |
| p95          | 80 ms           | 160 ms               | TBD         |
| p99          | 200 ms          | 400 ms               | TBD         |
| Error rate   | n/a             | < 1% (excluding 429s)| TBD         |
| Worker queue | n/a             | bounded              | TBD         |

### Authenticated mix (50 RPS for 10 min)

| Metric | S36 single-user | S39 target (≤2× S36) | Last actual |
|--------|-----------------|----------------------|-------------|
| p50    | 50 ms           | 100 ms               | TBD         |
| p95    | 150 ms          | 300 ms               | TBD         |
| p99    | 350 ms          | 700 ms               | TBD         |

### Issue-comment storm (100 comments/s for 5 min)

| Metric                    | Target   | Last actual |
|---------------------------|----------|-------------|
| Comment POST p95          | < 500 ms | TBD         |
| Worker queue depth at end | < 1k     | TBD         |
| Notification fan-out lag  | < 60 s   | TBD         |
| DB pool exhaustion errors | 0        | TBD         |

### Search load (30 RPS for 10 min)

| Metric | S36 single-user | S39 target (≤2× S36) | Last actual |
|--------|-----------------|----------------------|-------------|
| p50    | 350 ms          | 700 ms               | TBD         |
| p95    | 800 ms          | 1600 ms              | TBD         |

## Degradation thresholds (where to scale)

From the load tests we infer the **first ceiling** each component hits. These are operator triggers; the monitoring rules in `deploy/monitoring/prometheus/rules.yml` are set to alert before each threshold is reached (a hedged example rule follows the table).

| Trigger                              | Action                                              |
|--------------------------------------|-----------------------------------------------------|
| p95 > 1.5 s sustained for 10 min     | Add a second web host.                              |
| DB calls/sec > 5k sustained          | Hunt for an N+1 query first; then scale the DB.     |
| Job queue depth > 5k for 15 min      | Add a second worker.                                |
| `pg_stat_archiver.failed_count > 0`  | See the archive-failing runbook.                    |
| Web-host disk > 70%                  | Audit the largest repos; clean up archived ones.    |
| pgxpool exhaustion errors            | Raise `db.max_conns`; investigate connection leaks. |
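As a rough illustration only, the sketch below shows how the first trigger (p95 sustained above 1.5 s for 10 minutes) might be expressed as a Prometheus alerting rule. The metric name `http_request_duration_seconds` and the 1.2 s warning level are assumptions for the sketch, not the actual contents of `deploy/monitoring/prometheus/rules.yml`.

```yaml
# Hedged sketch, not the shipped rule: assumes the web service exposes a
# standard Prometheus latency histogram named http_request_duration_seconds.
groups:
  - name: capacity-envelope
    rules:
      - alert: WebP95LatencyApproachingCeiling
        # p95 over the last 5 minutes, aggregated across web hosts.
        expr: histogram_quantile(0.95, sum by (le) (rate(http_request_duration_seconds_bucket[5m]))) > 1.2
        # Warn well before the 1.5 s scale-out trigger in the table above.
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Web p95 latency above 1.2 s for 10 minutes; review the capacity envelope."
```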
## Notes from the S39 run

To be filled in after the first end-to-end load run on staging:

- **Worker headroom** — at what comment-storm rate does the notification fan-out lag exceed 60 s?
- **Auth-mix fairness** — does API-only traffic at 50 RPS starve UI-rendered traffic, or do the two coexist cleanly under pgxpool?
- **Search hot-paths** — the search-load scenario's query distribution is synthetic; record which queries dominated the p95 tail and whether the indexes covered them.
- **Caddy throughput** — at 100 RPS, is the edge a bottleneck, or is its CPU mostly idle?

## Rebaseline cadence

- After every major release that touches a hot path.
- Quarterly.
- After any infrastructure change to the staging shape.

Each rebaseline replaces the "Last actual" column. Significant regressions get a row in the regression history below.

## Regression history

(Empty until the first run completes. Format: `YYYY-MM-DD — regressed from X to Y; root cause / fix.`)