
# Backing up the production .env

`/etc/shithub/web.env` (and `/etc/shithub/worker.env`, when split) holds every load-bearing secret on the droplet:

- DB password (`SHITHUB_DATABASE_URL`)
- Session signing key, TOTP AEAD key, webhook AEAD key
- Spaces access keys (object store)
- Postmark / SMTP creds
- SSH-git host key seed (when configured)

The DB content is captured by [`backups.md`](backups.md) — those dumps and WAL segments rebuild the data. `web.env` is what re-binds a fresh droplet to that data: without the same secrets, the existing DB rows can't be decrypted (TOTP/webhook payloads), sessions can't be signed, and S3 buckets can't be reached.

This doc covers backing up `web.env` itself.

## The mental model

Treat `web.env` like a master password. The DB backup chain is fully automated and tested; the env file is operator-managed because it has a different lifecycle:

- **Mostly stable**: changes only when secrets rotate (see `rotate-secrets.md`) or new config keys land.
- **Tiny**: a few KB.
- **Maximum sensitivity**: leaking it is a "rotate everything" incident. So it shouldn't ride alongside the DB dumps in Spaces — different blast radius.

That puts it in a different storage tier than DB dumps. Pick one of the three options below.

## Option A (recommended) — password manager

Keep an encrypted copy in your operator password manager (1Password, Bitwarden, KeePassXC, …) as a "Secure Note" or file attachment.

**When to update:**

- After every secret rotation per `rotate-secrets.md`
- After any change to `web.env` for new config keys
- Right after a fresh droplet provision (initial baseline)

**How:**

```sh
ssh root@shithub.sh 'cat /etc/shithub/web.env'
# Paste output into a Secure Note titled e.g. "shithub-prod web.env (YYYY-MM-DD)"
# Tag with the rotation date so you can identify the active copy.
```
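
If you keep secrets in 1Password, the copy-paste step can be scripted. A minimal sketch, assuming the 1Password CLI (`op`, v2) is installed and signed in; the vault name and title format below are illustrative, and the same idea adapts to Bitwarden's `bw` or KeePassXC:

```sh
# Sketch only: push the live env file straight into the PM as a document item.
STAMP=$(date -u +%F)
ssh root@shithub.sh 'cat /etc/shithub/web.env' > "/tmp/web.env.$STAMP"
op document create "/tmp/web.env.$STAMP" \
    --title "shithub-prod web.env ($STAMP)" \
    --vault "Operators"          # hypothetical vault name
shred -u "/tmp/web.env.$STAMP"   # remove the plaintext from local disk
```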

Pros: zero new infrastructure, auditable access (password-manager logs), already in your daily-use security tooling. Cons: a manual step that's easy to skip after a rotation — set a calendar reminder tied to the quarterly rotation cadence.

## Option B — encrypted blob alongside DB backups

Include an encrypted copy of `web.env` in the daily backup chain. Encrypt with `age` or `gpg` so that a compromised Spaces key alone can't decrypt it; the recipient key is held only by operators (in their PM).

Sketch (NOT yet wired up — would be a follow-up if we go this route):

```sh
# In shithub-backup-daily, after the pg_dump succeeds.
# -R reads one or more public recipients from the file, so several operators can be listed.
age -R /etc/shithub/env-backup.recipients \
    -o "$LOCAL_DIR/web.env.${STAMP}.age" \
    /etc/shithub/web.env
rclone copyto "$LOCAL_DIR/web.env.${STAMP}.age" \
       "$BUCKET/env/$(date -u +%Y/%m/%d)/web.env.${STAMP}.age"
```

Pros: automatic, captures rotations as they happen. Cons: adds a dep (`age`), a new key-mgmt surface (the recipient key), and a failure mode (if the recipient key is lost, the backups become useless).
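
For completeness, a sketch of the key handling this option implies; file names and bucket paths below are illustrative. The private identity would live only in the operator PM, the droplet would keep just the public recipient, and a restore would need the identity pasted back out of the PM:

```sh
# One-time setup sketch: the private identity never lives on the droplet.
age-keygen -o env-backup.key                  # save the file's contents in the operator PM
age-keygen -y env-backup.key > /etc/shithub/env-backup.recipients   # public recipient only
rm env-backup.key

# Restore-side sketch: recreate env-backup.key from the PM copy, then:
rclone copyto "$BUCKET/env/<yyyy/mm/dd>/web.env.<stamp>.age" ./web.env.age
age -d -i env-backup.key -o web.env ./web.env.age
```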

## Option C — DO snapshot

DO droplet snapshots include `/etc/shithub/web.env` by virtue of including the whole filesystem. This is "free" coverage but the caveats are real:

- **Point-in-time only**: a snapshot taken before a secret rotation has the OLD secret. Restoring a stale snapshot desynchronizes you from any DB rows encrypted with the new key (TOTP, webhook payloads).
- **Snapshots are scheduled to be deleted** under DO's free policy (4 retained, oldest pruned).
- **Snapshot restore replaces the droplet**, including the block volume's previous state (depending on snapshot type).

Suitable as a *belt-and-suspenders* on top of A or B. Not sufficient as the only backup of the env file.
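
If you want an extra point-in-time copy right before a risky change (on top of the scheduled ones), a sketch assuming `doctl` is installed and authenticated; `<droplet-id>` is a placeholder:

```sh
# Sketch: take an on-demand named snapshot before a major change.
doctl compute droplet-action snapshot <droplet-id> \
    --snapshot-name "shithub-pre-change-$(date -u +%F)" --wait
```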

## Restore procedure

If the live `web.env` is lost (droplet replaced, file deleted, permissions wedged):

1. Stop the running services that need it:
   ```sh
   systemctl stop shithubd-web shithubd-cron
   ```
2. Recreate `/etc/shithub/web.env` with the right ownership and mode (see `deploy/ansible/roles/shithubd/tasks/main.yml` for canonical perms — currently `root:shithub 0640`); a sanity-check sketch for the pasted contents follows this list:
   ```sh
   install -o root -g shithub -m 0640 /dev/stdin /etc/shithub/web.env <<'EOF'
   <paste from your Secure Note>
   EOF
   ```
3. Restart:
   ```sh
   systemctl start shithubd-web shithubd-cron
   curl -fsS http://127.0.0.1:8080/healthz   # 200
   ```
4. If the secrets in the restored copy are stale (you rotated after the backup), follow `rotate-secrets.md` to re-apply the current values.
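
Between steps 2 and 3 it is worth a quick format check on the pasted file, since Secure Note pastes occasionally pick up line wrapping or CRLF endings. A minimal sketch:

```sh
# Sanity-check sketch: flag any line that is not blank, a comment, or KEY=value,
# and check for CRLF endings introduced by a note-app paste.
grep -nvE '^[[:space:]]*($|#)|^[A-Za-z_][A-Za-z0-9_]*=' /etc/shithub/web.env || echo "format looks ok"
file /etc/shithub/web.env   # should not report "with CRLF line terminators"
```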

## What goes wrong if you skip this

Concrete scenarios where lacking an env backup turns a recoverable incident into a re-key-the-world incident:

| Scenario | With env backup | Without |
|---|---|---|
| Droplet kernel panic, fsck loses files | Restore `web.env`, restart, done | Rotate every secret in `rotate-secrets.md`, plus likely re-enroll every TOTP secret and webhook secret (the old ciphertexts can't be decrypted) |
| Operator deletes `/etc/shithub` by mistake | Same as above | Same as above |
| DO destroys droplet (account billing issue) | New droplet + restored env + restored DB → up | Same DB-restore work, plus full secret rotation |
| Provider breach forces DB password rotation | Update one line in PM, redeploy | Same — env backup neutral here, but you should still update the PM copy after the rotation |

## How this relates to the rest of the backup story

| What | Where it's backed up | Cadence |
|---|---|---|
| DB rows | `spaces-prod:shithub-backups/daily/...` (pg_dump) + `spaces-prod:shithub-wal/` (WAL) | Continuous + daily |
| Bare repos | `/data/repos` on the block volume + cross-region Spaces sync | Continuous |
| Object store contents | DO Spaces lifecycle handles versioning | Provider-managed |
| Operator secrets (`web.env`) | Operator password manager (this doc) | Per-rotation |
| Filesystem layout | DO droplet snapshots | Weekly or pre-major-change |