
# Upgrades & migrations

Routine release deploys. Migrations apply automatically; the only place upgrades get exciting is around long migrations and the occasional config schema change.

## Standard release

```sh
git fetch --tags
git checkout v<version>
make deploy-check ANSIBLE_INVENTORY=staging
make deploy ANSIBLE_INVENTORY=staging
# ... canary period ...
make deploy ANSIBLE_INVENTORY=production
```

`shithubd migrate up` runs as the web service's `ExecStartPre=`, so the binary that needs the new schema is also the one that applies it. Order on each host: `ExecStartPre` runs migrations → web starts on the new schema.
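
For orientation, the relevant part of the unit looks roughly like this (a sketch, not the deployed unit: the unit name, `User=`, and the `web` subcommand are assumptions; `systemctl cat` the real one):

```ini
# Sketch of shithubd-web.service (names and subcommand are assumptions)
[Service]
User=shithub
# Runs to completion before ExecStart; the unit sits in "activating" meanwhile
ExecStartPre=/usr/local/bin/shithubd migrate up
ExecStart=/usr/local/bin/shithubd web
```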

If a migration is long (>30s), the release notes call it out. Schedule the deploy outside peak hours; the web service hangs in "activating" until ExecStartPre finishes.
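
While a long migration runs there is nothing to see on the deploy side; watching the unit from the host makes the hang legible (the unit name here is an assumption, so adjust to whatever `systemctl list-units | grep shithub` shows):

```sh
# "activating" means ExecStartPre (the migration) is still running; this is normal
watch -n 5 systemctl is-active shithubd-web
# Follow the migration's own log lines in the journal
journalctl -u shithubd-web -f
```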

## Canary

Deploy to staging first. Watch for 30 min in Grafana. Things to look at:

- p95 latency on top routes.
- DB call rate — a 10× jump usually means a regressed N+1.
- Job queue depth — a stuck migration reflects here.
- Error logs in Loki: `{service="shithubd"} |~ "panic|ERROR"`.

If anything looks off, **do not** promote. Rollback on staging is cheap; rollback on production is loud.
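
For a quick look without opening Grafana, the same Loki query works from a shell (a sketch, assuming `logcli` is installed and pointed at this stack):

```sh
# Last 30 minutes of panics/errors from the web service
logcli query --since=30m '{service="shithubd"} |~ "panic|ERROR"'
```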

## Major release (database)

If the release notes flag a major schema change (the full pass is sketched after this list):

1. Take a manual `pg_dump` immediately before the deploy: `sudo -u postgres /usr/local/bin/shithub-backup-daily`.
2. Confirm it landed in Spaces.
3. Deploy to staging, run `make restore-drill` against the *post-deploy* dump to confirm the new schema restores cleanly.
4. Then production.
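
The same pass as a shell transcript (a sketch; the Spaces listing command and bucket name are guesses, so check what the backup script actually uploads):

```sh
sudo -u postgres /usr/local/bin/shithub-backup-daily
s3cmd ls s3://shithub-backups/ | tail -n 3   # bucket name is a guess; verify it
make deploy ANSIBLE_INVENTORY=staging
make restore-drill                           # point it at the post-deploy dump
make deploy ANSIBLE_INVENTORY=production
```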

## Config schema changes

When a release adds a required config key, the binary refuses to start and complains in the journal. Update `deploy/ansible/roles/shithubd/templates/web.env.j2` (and `worker.env.j2`), bump the inventory vars, redeploy. There's no separate migration step for env files.
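
The shape of the fix is one template line plus the matching inventory var (key and var names below are hypothetical, purely for illustration):

```jinja
# deploy/ansible/roles/shithubd/templates/web.env.j2
# NEW_FEATURE_TOKEN / shithubd_new_feature_token are made-up names for illustration
NEW_FEATURE_TOKEN={{ shithubd_new_feature_token }}
```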

## Rolling back

See [Rollback (in-repo runbook)](https://github.com/tenseleyFlow/shithub/blob/main/docs/internal/runbooks/rollback.md).

Three rollback shapes, in preference order:

1. **Schema-compatible rollback (best).** If the migration only *added* columns/tables that the old code ignores, the old code runs against the new schema fine. Roll the code back; leave schema alone. Most of our migrations are deliberately additive for this reason.
2. **Roll forward to a hotfix.** If the migration changed semantics that the old code can't tolerate, ship a hotfix on top of the new release rather than reversing the migration.
3. **Migration `down` + code rollback.** Last resort; some `down`s drop columns and *will* lose data.
```sh
# (3) only when (1) and (2) won't work
ssh web-01
sudo -u shithub /usr/local/bin/shithubd migrate down  # ONE step
git checkout v<previous>
make deploy ANSIBLE_INVENTORY=production
```
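
Whichever shape you used, confirm the rolled-back service actually came up (unit name and health path are assumptions):

```sh
systemctl is-active shithubd-web              # unit name assumed
journalctl -u shithubd-web -n 20 --no-pager   # any schema errors on boot?
curl -fsS https://shithub.example/healthz     # health endpoint is a guess
```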