tenseleyflow/shithub / 59c1236

docs(runbook): pgdata migration from root disk to block volume

Authored by espadonne
SHA      59c1236f415245c04b9ccfcbfc09383ea577c9f9
Parents  e2c3943
Tree     c225372

1 changed file

A  docs/internal/runbooks/db-on-block-volume.md  (+244, -0)
@@ -0,0 +1,244 @@
# Move Postgres data from root disk to block volume

One-time migration. After this lands, the droplet's root disk
holds only the OS + binaries; all stateful data (pgdata, repos,
tmp) lives on the attached block volume mounted at `/data`. The
goal is twofold: (1) the root disk can never fill up from
runaway DB growth and freeze the whole system, and (2) the
volume can be detached and reattached to a replacement droplet
without losing state if the host ever needs replacing.

## Preconditions (verify before scheduling)

- **Block volume mounted at `/data`** with enough free space for
  pgdata + headroom for several months of growth (a stricter
  mount check follows this list). Check:
  ```sh
  df -h /data
  ```
- **`/data/pgdata` does not contain a live cluster.** It may
  contain a stale `initdb` from when the volume was first
  provisioned — that's fine and we'll move it aside.
  ```sh
  ls -la /data/pgdata
  # Expect a PG_VERSION file dated to volume-attach time, NOT
  # to "minutes ago". If "minutes ago", STOP — something is
  # already running there.
  ```
- **Recent dump landed in Spaces** (not just locally). Worst-case
  rollback is restoring this dump:
  ```sh
  rclone --config /etc/rclone-shithub.conf --s3-no-check-bucket \
         lsl spaces-prod:shithub-backups/daily/$(date -u +%Y/%m/%d)/ | tail -3
  ```
  If today's directory is empty, run a fresh dump first:
  ```sh
  /usr/local/bin/shithub-backup-daily
  ```
- **WAL archiver is healthy** — gives PITR coverage for changes
  between the last dump and migration time:
  ```sh
  sudo -u postgres psql -tAc "SELECT last_archived_time, last_failed_time FROM pg_stat_archiver;"
  # last_archived_time should be < 5 min ago, last_failed_time blank or older
  ```
- **DigitalOcean snapshot taken** via the DO dashboard
  (Droplets → shithub-prod → Snapshots → Take Snapshot). This
  is the panic-button rollback if everything else fails.
  Snapshots take a few minutes; DON'T start the migration
  until the snapshot completes.
- **Notice users** — site will be down for ~3 minutes for a
  ~115 MB pgdata. Scale the window with the current `du -sh
  /var/lib/postgresql/16/main`.
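
A quick extra check: confirm `/data` really is a separate mount
and not just a directory on the root disk (if nothing is mounted
there, `df -h /data` silently reports the root filesystem):

```sh
findmnt -no SOURCE,FSTYPE,SIZE /data
# Expect the block-volume device, not the root filesystem.
```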

## Migration

Total downtime: ~3 minutes for a 100 MB DB. Most of that is
postgres clean-shutdown + start; the rsync itself takes seconds.

```sh
ssh root@shithub.sh
```

### 1. Drain writes

```sh
systemctl stop shithubd-web
systemctl stop shithubd-cron 2>/dev/null   # if running
```

Verify nothing else has DB sessions open before stopping
postgres (background workers, manual psql sessions):

```sh
sudo -u postgres psql -tAc "SELECT pid, application_name, client_addr, state FROM pg_stat_activity WHERE datname = 'shithub';"
```

If this returns any rows (your own psql connects to the
`postgres` database, so it won't show up here), track down and
stop those processes first.
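
If a leftover session belongs to something you can't cleanly
stop, terminating it from the Postgres side also works (safe
here because writes are already drained; prints one `t` per
session terminated):

```sh
sudo -u postgres psql -tAc "SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE datname = 'shithub' AND pid <> pg_backend_pid();"
```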

### 2. Stop postgres

```sh
systemctl stop postgresql@16-main
systemctl status postgresql@16-main --no-pager | head -5  # should be "inactive (dead)"
```

### 3. Move the stale pre-init cluster aside (don't delete)

Keeping it (rather than deleting it) means the rollback below
can put `/data` back exactly as it was if something looks off:

```sh
mv /data/pgdata /data/pgdata.preinit-$(date -u +%Y%m%d)
```

### 4. Copy live data to the volume

`rsync -aHX --info=progress2` preserves perms, owners, and
timestamps (`-a`), hard links (`-H`), and xattrs (`-X`).
`--info=progress2` shows a single overall progress line:

```sh
rsync -aHX --info=progress2 \
  /var/lib/postgresql/16/main/ \
  /data/pgdata/
```

Verify the copy looks right:

```sh
ls -la /data/pgdata/ | head
diff <(cd /var/lib/postgresql/16/main && find . -printf '%p %s %m %u:%g\n' | sort) \
     <(cd /data/pgdata               && find . -printf '%p %s %m %u:%g\n' | sort) | head
# Empty diff = identical layout (paths, sizes, modes, owners).
```
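
One more check before repointing: postgres refuses to start on a
data directory that isn't owned by `postgres` with mode `700`
(or `750`). `rsync -a` should have preserved both, but it's
cheap to confirm:

```sh
stat -c '%U:%G %a' /data/pgdata
# Expect: postgres:postgres 700
```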

### 5. Repoint the cluster

Edit the active config:

```sh
sed -i.bak "s|^data_directory = .*|data_directory = '/data/pgdata'|" \
  /etc/postgresql/16/main/postgresql.conf
grep ^data_directory /etc/postgresql/16/main/postgresql.conf
# Expect: data_directory = '/data/pgdata'
```

(`.bak` lets you `cp postgresql.conf.bak postgresql.conf` to
revert in 1 second if needed.)
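
If you want postgres itself to confirm the edited config parses
and points where you expect before touching the service, `-C`
prints a single setting and exits (binary path assumed from the
Debian packaging behind the `postgresql@16-main` unit):

```sh
sudo -u postgres /usr/lib/postgresql/16/bin/postgres \
  -c config_file=/etc/postgresql/16/main/postgresql.conf -C data_directory
# Expect: /data/pgdata
```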

### 6. Start postgres on the new path

```sh
systemctl start postgresql@16-main
sleep 2
systemctl is-active postgresql@16-main      # active
sudo -u postgres pg_isready -h /var/run/postgresql
sudo -u postgres psql -tAc "SHOW data_directory;"   # should print /data/pgdata
sudo -u postgres psql -d shithub -tAc "SELECT count(*) FROM repos;"
sudo -u postgres psql -d shithub -tAc "SELECT count(*) FROM users;"
```

If any of those fail, jump to **Rollback** below.
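
If the unit won't start at all, the journal and the postgres log
usually name the reason (bad data_directory path, ownership or
permissions, stale pid file) before you commit to a rollback;
the log path below is the Debian default this install uses:

```sh
journalctl -u postgresql@16-main -n 50 --no-pager
tail -n 50 /var/log/postgresql/postgresql-16-main.log
```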

### 7. Bring the app back up

```sh
systemctl start shithubd-web
systemctl start shithubd-cron 2>/dev/null
systemctl is-active shithubd-web
curl -fsS -o /dev/null -w '%{http_code}\n' http://127.0.0.1:8080/healthz   # 200
```

### 8. Smoke test

From your laptop (not the droplet):

```sh
curl -fsS -o /dev/null -w '%{http_code} %{time_total}s\n' https://shithub.sh/
# Walk the site briefly: load a repo, view an issue, log in.
```

Confirm the WAL archiver picked up where it left off (the
last_archived_time should move past the migration time within a
couple of minutes):

```sh
ssh root@shithub.sh 'sudo -u postgres psql -tAc "SELECT last_archived_time FROM pg_stat_archiver;"'
```
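
If you'd rather not wait for organic write traffic, forcing a
WAL segment switch (a no-op if nothing has been written since
the last switch) gives the archiver something to ship right
away:

```sh
ssh root@shithub.sh 'sudo -u postgres psql -tAc "SELECT pg_switch_wal();"'
```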

Run a fresh backup to confirm the new pgdata is durable end-to-end:

```sh
ssh root@shithub.sh /usr/local/bin/shithub-backup-daily
```

### 9. Cleanup (after a few days of healthy operation)

Don't do this until you've slept on at least one full daily
backup cycle from the new location and confirmed it landed in
Spaces. Then:

```sh
# Reclaim root-disk space: the old copy on the root disk is no longer used.
rm -rf /var/lib/postgresql/16/main/*
# (After confirming /var/lib/postgresql/16/main is empty)
rmdir /var/lib/postgresql/16/main 2>/dev/null
# Drop the stale initdb that was moved aside in step 3.
rm -rf /data/pgdata.preinit-*
```

The systemd unit's `RequiresMountsFor=/var/lib/postgresql/%I`
is a static path — if you remove the empty dir, recreate it as
`mkdir -p /var/lib/postgresql/16/main && chown postgres:postgres
…` before the next reboot, otherwise `pg_ctlcluster` will refuse
to start. Easier: leave the empty dir alone.

## Rollback

### Mid-migration (steps 4–6 failed)

```sh
systemctl stop postgresql@16-main
cp /etc/postgresql/16/main/postgresql.conf.bak /etc/postgresql/16/main/postgresql.conf
mv /data/pgdata /data/pgdata.failed-$(date -u +%Y%m%d_%H%M)
mv /data/pgdata.preinit-* /data/pgdata 2>/dev/null   # restore the stale init in case something checks for it
systemctl start postgresql@16-main
systemctl start shithubd-web
```

The original `/var/lib/postgresql/16/main` was never modified, so
postgres comes back up against unchanged data. Total recovery
time: ~30 seconds.

### Worst case (data corruption observed after step 6)

```sh
# 1. Stop everything that talks to the DB.
systemctl stop shithubd-web shithubd-cron postgresql@16-main

# 2. Find the latest dump in Spaces (sort on the date/time columns,
#    not the size column that rclone lsl prints first).
rclone --config /etc/rclone-shithub.conf --s3-no-check-bucket \
       lsl spaces-prod:shithub-backups/daily/ | sort -k2,3 | tail -1

# 3. Drop and recreate the cluster from the dump. See restore.md
#    for the full pg_restore procedure.
```

Or, faster:

### Nuclear option

Restore the droplet from the DO snapshot taken in the
preconditions. Loses any user activity since the snapshot —
usually that's "the last few minutes" since the snapshot is
taken right before migration. Coordinate via the status page if
you do this.

## Why this layout

- **Root disk is 77 GB**; pgdata growth is unbounded. A
  runaway query log or WAL spike can fill it and freeze the
  whole droplet, including sshd.
- **Block volume is 100 GB**, separately managed, snapshotable
  in DO independently of the droplet, and detachable. If the
  droplet is unrecoverable, attaching the volume to a fresh
  droplet recovers state in minutes.
- **Repos and tmp already live on `/data`**. Moving pgdata
  finishes the layout the volume was provisioned for.
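
For reference, reattaching the volume to a replacement droplet
is roughly the following (a sketch: the volume name is a
placeholder, and ext4 is assumed since that's what DO formats
volumes as by default):

```sh
# After attaching the volume to the new droplet in the DO dashboard:
mkdir -p /data
mount -o defaults,noatime /dev/disk/by-id/scsi-0DO_Volume_<volume-name> /data
# Persist across reboots:
echo '/dev/disk/by-id/scsi-0DO_Volume_<volume-name> /data ext4 defaults,nofail,noatime 0 2' >> /etc/fstab
```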