# Upgrade
Routine release deploys. The deploy is one binary swap + a systemd restart; the only place upgrades get exciting is around DB migrations and the occasional config schema change.
## Standard release
```sh
# from a clean checkout of the release tag
git fetch --tags
git checkout v<version>
make deploy-check ANSIBLE_INVENTORY=staging
make deploy ANSIBLE_INVENTORY=staging
# ... canary period ...
make deploy ANSIBLE_INVENTORY=production
```
`shithubd migrate up` runs as the web service's ExecStartPre, so
the binary that needs the new schema is also the one that applies
it. Order on each host: ExecStartPre runs migrations → web starts
on the new schema.
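To sanity-check the hook on a host, `systemctl cat` works; the unit
name here is an assumption (use whatever the Ansible role actually
installs for the web service):

```sh
# confirm the migration hook is wired as described above
# (unit name "shithubd-web" and binary path are assumptions)
systemctl cat shithubd-web | grep -A1 ExecStartPre
# expected shape, per this doc:
#   ExecStartPre=/usr/local/bin/shithubd migrate up
```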
If a migration is long (>30s), call it out in the release notes and time the deploy outside peak hours. The web service hangs in "activating" until ExecStartPre finishes.
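To watch a slow migration from another terminal during the deploy
(unit name again assumed):

```sh
# during ExecStartPre the unit reports "activating (start-pre)"
systemctl status shithubd-web --no-pager
# migration output streams to the journal
journalctl -u shithubd-web -f
```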
## Canary
We deploy to staging first and watch for 30 min in Grafana. Things to look at:
- p95 latency on the top routes (`shithubd-overview` dashboard).
- DB call rate — a 10× jump usually means a regressed N+1.
- Job queue depth — a stuck migration shows up here.
- Error logs in Loki: `{service="shithubd"} |~ "panic|ERROR"` (CLI sketch below).
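The Loki check also works from a shell via logcli, if you'd rather
not click through Grafana; the Loki address is environment-specific
and assumed here:

```sh
# tail the canary window for panics/errors (LOKI_ADDR is an assumption)
export LOKI_ADDR=http://loki:3100
logcli query --since=30m '{service="shithubd"} |~ "panic|ERROR"'
```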
If anything looks off, **do not** promote to production. Rollback
on staging is cheap; rollback on production is loud.
## Major version (database)
If the release notes flag a major schema change:
1. Take a manual `pg_dump` immediately before the deploy:
   `sudo -u postgres /usr/local/bin/shithub-backup-daily` (sketch below).
2. Confirm it landed in Spaces.
3. Deploy to staging, run `make restore-drill` against the
   *post-deploy* dump to confirm the new schema restores cleanly.
4. Then production.
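A minimal sketch of steps 1–2; the S3 client and bucket name are
assumptions, so use whatever you normally point at the backups
bucket in Spaces:

```sh
# step 1: manual dump via the existing backup script
sudo -u postgres /usr/local/bin/shithub-backup-daily

# step 2: check that the newest object is the dump you just took
# (client, endpoint config, and bucket name are all assumptions)
s3cmd ls s3://shithub-backups/ | sort | tail -n 1
```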
## Config schema changes
When a release adds a required env var, the binary refuses to start
and complains in the journal. Update
`deploy/ansible/roles/shithubd/templates/web.env.j2` (and
`worker.env.j2`), bump the inventory vars, redeploy. There's no
separate migration step for env files.
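A sketch of the failure mode and the fix; the unit name and the
example var are hypothetical:

```sh
# the journal names the missing var when the unit refuses to start
journalctl -u shithubd-web -n 30 --no-pager

# fix: add the var to both templates, e.g. a hypothetical SHITHUB_SMTP_URL:
#   deploy/ansible/roles/shithubd/templates/web.env.j2
#   deploy/ansible/roles/shithubd/templates/worker.env.j2
#     SHITHUB_SMTP_URL={{ shithub_smtp_url }}
# set shithub_smtp_url in the inventory vars, then:
make deploy ANSIBLE_INVENTORY=staging
```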