# Rotate secrets

Quarterly cadence; sooner if compromise is suspected. The secret classes:

| Secret | Where it lives | Rotation procedure |
|---|---|---|
| `session.key_b64` | `web.env` | See "Session signing key" below. |
| `auth.totp_key_b64` | `web.env` | See "TOTP AEAD key" below. |
| Postgres `shithub` password | `web.env` + `worker.env` + Postgres role | See "DB password" below. |
| Postgres `shithub_hook` password | sshd env + `hook-role-grants.sql` apply env | See "DB password" below. |
| S3 access keys | `web.env` + `worker.env` + Spaces dashboard | See "Object store credentials" below. |
| Postmark / SMTP creds | `web.env` | One-step: replace, redeploy. |
| Webhook AEAD key | per-row encrypted; key in `worker.env` | Two-step migration, see below. |
| Operator SSH keys | `~operator/.ssh/authorized_keys` per host | Add new key, verify, remove old. |

## Session signing key

The session key signs the cookie that authenticates a logged-in session. Rotating it logs every user out because every existing cookie's MAC stops verifying.

1. Generate a new key: `openssl rand -base64 32`
2. Update the inventory variable `session_key`. Keep the old key in a comment for one rotation cycle so you can revert.
3. Run `make deploy ANSIBLE_INVENTORY=production ANSIBLE_TAGS=app`.
4. Verify: sign in to your own account with a fresh browser; the cookie set after sign-in is signed by the new key.

User-visible impact: every user is signed out. Notify in-band before doing this if avoidable; do it without notice if the old key may be compromised.
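
Step 1 can be paired with a quick sanity check before the key goes into the inventory. The following is a minimal sketch, not part of the deploy tooling: it generates a candidate key and confirms it decodes back to 32 bytes. The variable names are illustrative.

```sh
# Generate a candidate session key (32 random bytes, base64-encoded).
new_key="$(openssl rand -base64 32)"

# Decode it again and count the bytes. The trailing newline is added
# because older OpenSSL releases need it to decode a single line.
decoded_len="$(printf '%s\n' "$new_key" | openssl base64 -d | wc -c | tr -d ' ')"

if [ "$decoded_len" -ne 32 ]; then
  echo "unexpected key length: $decoded_len bytes" >&2
else
  echo "candidate key decodes to $decoded_len bytes"
fi
```

A truncated paste is the usual way a key ends up the wrong length, so the same decode-and-count check is worth repeating on the value after it has been written to the inventory.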
## TOTP AEAD key

The TOTP AEAD key encrypts every user's TOTP shared secret at rest in the database. Rotating this key requires a re-encryption migration — without it, every 2FA enrollment becomes unreadable.
The procedure is:

1. Add the new key to `web.env` as `auth.totp_key_b64_next` alongside the existing `auth.totp_key_b64`.
2. Restart web (the package supports a "current + next" pair: it reads with current, falls back to next, writes with current).
3. Run the re-encryption job: `shithubd admin re-encrypt-totp --to-key=auth.totp_key_b64_next` (operator-only). This decrypts each row with the old key and re-encrypts with the new.
4. Promote `auth.totp_key_b64_next` to `auth.totp_key_b64` (drop the suffix), remove the old key.
5. Restart web.

Do not skip step 3. Failing to re-encrypt before retiring the old key locks every 2FA-enabled user out of their account; recovery codes are the only path back in, and not everyone has them saved.
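
Before running step 3 it can help to confirm that both keys are actually present and distinct. The sketch below is a hypothetical preflight, not something `shithubd` ships: `env_value` and `check_key_pair` are illustrative helpers, and the `name=value` line format for `web.env` is an assumption that should be checked against the real file.

```sh
# Print the value for an exact key name from a name=value file.
# ASSUMPTION: the env file holds one name=value pair per line.
env_value() {
  awk -v key="$2" 'index($0, key "=") == 1 { print substr($0, length(key) + 2); exit }' "$1"
}

# Succeed only if both keys are set and differ from each other.
check_key_pair() {
  current="$(env_value "$1" "$2")"
  next="$(env_value "$1" "$3")"
  if [ -z "$current" ] || [ -z "$next" ]; then
    echo "missing $2 or $3 in $1" >&2
    return 1
  fi
  if [ "$current" = "$next" ]; then
    echo "$3 is identical to $2; nothing to rotate to" >&2
    return 1
  fi
}
```

Usage would look like `check_key_pair path/to/web.env auth.totp_key_b64 auth.totp_key_b64_next`, with the re-encryption job run only if it succeeds. The same check could be pointed at `webhook.aead_key` and `webhook.aead_key_next` in `worker.env`.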
## DB password

Rotate by setting a new password on the role and redeploying, without downtime.

1. As `postgres`: `ALTER ROLE shithub WITH PASSWORD '<new>';`
2. Update `db_password` in `web.env` and `worker.env`.
3. Run `make deploy ANSIBLE_INVENTORY=production ANSIBLE_TAGS=app`. The web/worker units will restart and reconnect with the new password.

If you suspect the old password was leaked, run steps 1–3 back to back, within minutes. Between (1) and (3) the running web process keeps its open connections (which authenticated under the old password), but every new connection must present the new password, so any connection the process tries to open before the deploy finishes will fail.
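
The `<new>` placeholder in step 1 needs a value that survives both a single-quoted SQL literal and an env file. One option, sketched here as an assumption rather than the project's own convention, is a hex string from `openssl rand`; the variable names are illustrative.

```sh
# 24 random bytes as 48 hex characters: no quotes, backslashes, or
# whitespace, so the value is safe inside a single-quoted SQL literal.
new_password="$(openssl rand -hex 24)"

# Build the statement from step 1. This only prints it; run it as the
# postgres superuser yourself.
alter_sql="ALTER ROLE shithub WITH PASSWORD '${new_password}';"
printf '%s\n' "$alter_sql"
```

If the cleartext statement should stay out of shell history and server logs, psql's `\password shithub` meta-command prompts for the value and sends it to the server already hashed.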
## Object store credentials

1. In the Spaces dashboard, generate a new access key with the same scope as the old.
2. Update inventory `s3_access_key_id` and `s3_secret_access_key`.
3. Run `make deploy ANSIBLE_INVENTORY=production ANSIBLE_TAGS=app`.
4. Verify: trigger a webhook delivery (which writes a body snapshot) and confirm it lands in the bucket.
5. Once confirmed, revoke the old key in the Spaces dashboard.

Do not revoke the old key first; the running process will lose access mid-flight.
## Webhook AEAD key

The webhook secret AEAD key encrypts every webhook's secret at rest. Rotation is two-step like TOTP:

1. Add `webhook.aead_key_next` alongside `webhook.aead_key`.
2. Run `shithubd admin re-encrypt-webhooks --to-key=webhook.aead_key_next`.
3. Promote and restart.

Failing to re-encrypt before retiring the old key disables every webhook (the auto-disable logic kicks in on first decrypt failure).
## Operator SSH keys

Standard procedure: add the new key to every host's `~operator/.ssh/authorized_keys`, log in with the new key to confirm, remove the old. Ansible's `authorized_key` module makes this idempotent; the `base` role will pick up changes if the inventory's `operator_ssh_keys` list is the source of truth.
## Audit

Every rotation is logged in the host's journal (the deploy run's output) and, for DB rotations, in any `pg_stat_activity` snapshots you retain (the view itself only shows current sessions). There's no centralized rotation log; if you want one, capture each rotation in your team's incident channel with date + class + reason.
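
If you do keep such a log, a fixed one-line format makes it easier to search later. This is a minimal sketch of one possible format; `rotation_record` is a hypothetical helper, and the field layout is an assumption, not an existing convention.

```sh
# Format one rotation record: UTC date, secret class, reason.
rotation_record() {
  printf '%s class=%s reason=%s\n' "$(date -u +%Y-%m-%d)" "$1" "$2"
}

# Example: print a record for a routine session key rotation.
rotation_record "session.key_b64" "quarterly rotation"
```

The output can be pasted into the channel or appended to a file of your choosing. Keep secret values themselves out of the record; the class name is enough.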