tenseleyflow/shithub / a92c6ef

Add docs/internal/ssh-deploy.md (sshd_config, role separation, AKC contract)

Authored by mfwolffe <wolffemf@dukes.jmu.edu>
SHA: a92c6ef4112d2b29559a2f71f88b5adada815e6f
Parents: 9f965ab
Tree: 90b5bb0

1 changed file

Status  File                          +    -
A       docs/internal/ssh-deploy.md   230  0

docs/internal/ssh-deploy.md (added)
@@ -0,0 +1,230 @@
1
+# SSH deploy notes
2
+
3
+S07 ships the SSH-key data layer + the `AuthorizedKeysCommand` (AKC) integration. To turn it into a working git-over-SSH endpoint, sshd needs a small amount of operator setup. This doc captures what that looks like and the security knobs that matter. The actual git protocol handler lands in S13.
4
+
5
+## Architecture
6
+
7
+```
8
+ssh client
9
+    │  TCP/22
10
+    ▼
11
+sshd (system) ──► AuthorizedKeysCommand: shithubd ssh-authkeys <fingerprint>
12
+                  ─ stdout: a single authorized_keys line, OR empty
13
+                  ─ exit 0 in both cases (failing closed on any error)
14
+                  └──► forced command per the line:
15
+                       shithubd ssh-shell <user_id>
16
+                       (S13 dispatcher; S07 placeholder logs and exits non-zero)
17
+```
18
+
19
+Why AKC instead of a static `~/.ssh/authorized_keys`:
20
+
21
+- The static-file approach requires regenerating a file on every key change and has weak consistency guarantees with replicas.
22
+- AKC is sshd's purpose-built mechanism for dynamic lookups; failure semantics are well-understood.
23
+- The forced command + restrictive options strip every interactive affordance, so a hijacked key can only invoke `shithubd ssh-shell` — never get a real shell.
24
+
25
+## Linux user setup
26
+
27
+A dedicated low-privilege system user runs the AKC binary. It owns nothing except its config directory (`/etc/shithub`, set up below), has no shell, and never logs in interactively.
28
+
29
+```sh
30
+sudo useradd --system --no-create-home --shell /usr/sbin/nologin shithub-ssh
31
+```
32
+
33
+The shithub binary lives somewhere both root and `shithub-ssh` can read:
34
+
35
+```sh
36
+sudo install -m 0755 ./bin/shithubd /usr/local/bin/shithubd
37
+```
38
+
39
+Configuration (`SHITHUB_DATABASE_URL`, etc.) lives in a 0600-mode file readable only by `shithub-ssh`:
40
+
41
+```sh
42
+sudo install -d -m 0750 -o shithub-ssh -g shithub-ssh /etc/shithub
43
+sudoedit /etc/shithub/ssh.env
+sudo chown shithub-ssh:shithub-ssh /etc/shithub/ssh.env   # sudoedit creates it root-owned
+sudo chmod 0600 /etc/shithub/ssh.env
44
+```
45
+
46
+Recommended `/etc/shithub/ssh.env`:
47
+
48
+```sh
49
+SHITHUB_DATABASE_URL=postgres://shithub_ssh:****@127.0.0.1:5432/shithub?sslmode=disable
50
+```
51
+
52
+## sshd configuration
53
+
54
+```
55
+# /etc/ssh/sshd_config.d/shithub.conf
56
+
57
+# Run the AKC for every public-key authentication attempt.
58
+AuthorizedKeysCommand /usr/local/bin/shithubd ssh-authkeys %f
59
+AuthorizedKeysCommandUser shithub-ssh
60
+
61
+# Disable password auth; only key-based auth remains (host-based auth is off by default).
62
+PasswordAuthentication no
63
+PubkeyAuthentication yes
64
+
65
+# AKC only sees the connecting client's pubkey FINGERPRINT (%f), not
66
+# the full key. shithubd looks it up by fingerprint and emits the matching
67
+# stored key + forced-command line, which sshd then uses to authenticate.
68
+
69
+# Mitigate connection floods. Real numbers depend on traffic profile —
70
+# these are reasonable defaults.
71
+MaxStartups 30:30:100
72
+LoginGraceTime 20s
73
+
74
+# Standard hardening. Already on most modern distros; spelled out here for
75
+# review:
76
+PermitRootLogin no
77
+PermitEmptyPasswords no
78
+ChallengeResponseAuthentication no
79
+# We own auth end-to-end via AKC. Kept on its own line: older OpenSSH
+# releases reject trailing comments after a value.
+UsePAM no
80
+ClientAliveInterval 60
81
+ClientAliveCountMax 3
82
+```
83
+
84
+sshd_config has no `EnvironmentFile`-style directive for the AKC, so loading
85
+`SHITHUB_DATABASE_URL` while keeping the secret out of `argv` takes a wrapper.
86
+The simplest path is a short wrapper script in `/usr/local/bin/`:
87
+
88
+```sh
89
+#!/bin/sh
90
+# /usr/local/bin/shithub-ssh-authkeys
91
+set -e
92
+set -a                        # export every variable the env file assigns
+. /etc/shithub/ssh.env
+set +a
93
+exec /usr/local/bin/shithubd ssh-authkeys "$1"
94
+```
95
+
96
+Then `AuthorizedKeysCommand /usr/local/bin/shithub-ssh-authkeys %f`. The
97
+wrapper is owned by root and 0755 so sshd can verify the path's ownership
98
+chain (sshd refuses to invoke an AKC whose path or any parent is
99
+group/world writable).
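
The ownership-chain rule above can be checked before sshd ever reads the config. What follows is a minimal sketch, assuming a POSIX shell and an `ls -ld` that prints the usual ten-character mode string. The helper names `is_loose` and `audit_chain` are made up for illustration and are not part of shithub. It looks at mode bits only; root ownership is not checked and symlinks are not resolved.

```sh
#!/bin/sh
# Sketch: flag group- or world-writable components in an AKC path.
# is_loose / audit_chain are hypothetical helper names.

# Succeeds (exit 0) when the given path is group- or world-writable.
is_loose() {
    mode=$(ls -ld -- "$1" | cut -c1-10)
    case "$mode" in
    ?????w????|????????w?) return 0 ;;   # char 6 = group write, char 9 = other write
    *) return 1 ;;
    esac
}

# Walks from the given absolute path up to /, printing each loose component.
# Returns 0 when the chain is clean, 1 when anything was flagged, 2 on misuse.
audit_chain() {
    case "$1" in
    /*) ;;
    *) echo "audit_chain: need an absolute path" >&2; return 2 ;;
    esac
    p=$1
    rc=0
    while :; do
        if is_loose "$p"; then
            echo "loose: $p"
            rc=1
        fi
        [ "$p" = "/" ] && break
        p=$(dirname -- "$p")
    done
    return "$rc"
}
```

Running `audit_chain /usr/local/bin/shithub-ssh-authkeys` on the target host should print nothing when every component is tight.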
100
+
101
+Validate before reloading:
102
+
103
+```sh
104
+sudo sshd -t                  # config syntax check
105
+sudo systemctl reload sshd    # apply
106
+```
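
`sshd -t` only validates syntax. `sshd -T` prints the effective configuration as lowercase `keyword value` lines, so it can confirm that the drop-in actually took effect. The sketch below assumes the settings from the sshd configuration section; `check_effective` is a hypothetical helper name, and the list of expected pairs is an illustrative subset rather than a complete policy.

```sh
#!/bin/sh
# Sketch: compare `sshd -T` output (read from stdin) with expected settings.
# check_effective is a hypothetical helper name.
check_effective() {
    cfg=$(cat)
    rc=0
    for want in \
        "passwordauthentication no" \
        "pubkeyauthentication yes" \
        "permitrootlogin no" \
        "permitemptypasswords no" \
        "authorizedkeyscommanduser shithub-ssh"
    do
        # -x: whole-line match, -F: literal string, -i: ignore case.
        if ! printf '%s\n' "$cfg" | grep -qixF -- "$want"; then
            echo "mismatch: expected '$want'"
            rc=1
        fi
    done
    return "$rc"
}

# Usage (needs root, so it is left commented out here):
#   sudo sshd -T | check_effective && sudo systemctl reload sshd
```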
107
+
108
+## Postgres role separation
109
+
110
+The AKC binary runs under a low-privilege role with the minimum surface
111
+needed to look up keys and update last-used columns. Everything else
112
+(insert / delete / etc.) goes through the web binary's normal role.
113
+
114
+```sql
115
+CREATE ROLE shithub_ssh LOGIN PASSWORD '****';
116
+GRANT CONNECT ON DATABASE shithub TO shithub_ssh;
117
+GRANT USAGE ON SCHEMA public TO shithub_ssh;
118
+GRANT SELECT ON user_ssh_keys TO shithub_ssh;
119
+GRANT UPDATE (last_used_at, last_used_ip) ON user_ssh_keys TO shithub_ssh;
120
+```
121
+
122
+Read-only would be cleaner but precludes the last-used update. The
123
+column-level UPDATE grant is the smallest privilege that still serves the
124
+operational need.
125
+
126
+## AKC behavior contract
127
+
128
+The AKC subcommand has three outcomes:
129
+
130
+| Input | Output | Exit |
131
+|---|---|---|
132
+| Known fingerprint | `command="..." options... <algo> <b64>` (single line) | 0 |
133
+| Unknown fingerprint | empty stdout | 0 |
134
+| Any error (DB down, panic, malformed input) | empty stdout | 0 |
135
+
136
+**sshd reads stdout content as the auth answer; non-zero exit is a
137
+configuration error, not a deny.** Failing closed (return empty on
138
+unrecoverable conditions) is therefore the correct posture: better to
139
+deny a legitimate connection than to authorize the wrong user.
140
+
141
+The forced-command line emitted on a hit:
142
+
143
+```
144
+command="/usr/local/bin/shithubd ssh-shell <user_id>",no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty <algo> <b64>
145
+```
146
+
147
+The option set strips every interactive affordance — defense in depth on
148
+top of `shithubd ssh-shell`'s own command parsing.
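
The contract table and the forced-command template can be turned into a small checker for tests or synthetic probes. This is a sketch only: `classify_akc_output` is a hypothetical helper, and it assumes `<user_id>` is a decimal integer, which the doc does not state outright.

```sh
#!/bin/sh
# Sketch: classify what the AKC printed, following the contract table.
# classify_akc_output is a hypothetical helper; a numeric user_id is assumed.
classify_akc_output() {
    out=$1
    nl='
'
    # Empty stdout means "deny": unknown fingerprint or any internal error.
    if [ -z "$out" ]; then
        echo deny
        return 0
    fi
    # The template is a single line; an embedded newline is never valid.
    case "$out" in
    *"$nl"*) echo invalid; return 1 ;;
    esac
    # Forced command, the four restrictive options, then <algo> <b64>.
    if printf '%s\n' "$out" | grep -Eqx \
        'command="/usr/local/bin/shithubd ssh-shell [0-9]+",no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty [a-z0-9@.-]+ [A-Za-z0-9+/]+=*'
    then
        echo allow
        return 0
    fi
    echo invalid
    return 1
}
```

A typical call wraps the real binary, for example `classify_akc_output "$(./bin/shithubd ssh-authkeys "SHA256:<...>")"`. Anything other than `allow` or `deny` points at a template regression.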
149
+
150
+## Latency
151
+
152
+AKC runs on every SSH connection. Targets:
153
+
154
+- p99 < 100 ms with a warm DB on the production droplet.
155
+- Per-process pool sized small (`MaxConns: 4`) — sshd spawns a fresh
156
+  process per connection, so the pool's lifetime is the connection's.
157
+- Connect timeout capped at 750 ms so a flaky DB doesn't stall sshd.
158
+
159
+`shithub_http_*` metrics are emitted by the web binary, not by AKC. Add a
160
+synthetic check that calls `shithubd ssh-authkeys <test-fp>` periodically
161
+and alerts if it exceeds the SLO.
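
One possible shape for that synthetic check is sketched below. `probe_akc` and `now_ms` are hypothetical names, the budget is supplied by the caller, and the millisecond clock assumes a `date` that supports `%N` (a GNU extension), falling back to one-second resolution elsewhere. A single probe says nothing about p99 on its own; the alerting side would need to aggregate many samples.

```sh
#!/bin/sh
# Sketch: time one AKC lookup and compare it with a latency budget.
# probe_akc and now_ms are hypothetical helper names.

# Milliseconds since the epoch; 1 s resolution where `date` lacks %N.
now_ms() {
    ns=$(date +%s%N)
    case "$ns" in
    ''|*[!0-9]*) echo $(( $(date +%s) * 1000 )) ;;
    *) echo $(( ns / 1000000 )) ;;
    esac
}

# probe_akc BUDGET_MS CMD [ARGS...]
# Prints "ok <ms>", "slow <ms>", or "empty <ms>"; returns non-zero unless ok.
probe_akc() {
    budget=$1
    shift
    start=$(now_ms)
    out=$("$@" 2>/dev/null) || true
    elapsed=$(( $(now_ms) - start ))
    # The test fingerprint is expected to be known, so empty stdout is a failure.
    if [ -z "$out" ]; then
        echo "empty $elapsed"
        return 2
    fi
    if [ "$elapsed" -gt "$budget" ]; then
        echo "slow $elapsed"
        return 1
    fi
    echo "ok $elapsed"
}
```

For example, `probe_akc 100 /usr/local/bin/shithub-ssh-authkeys "<test-fp>"` run from cron or a monitoring agent, with the exit status feeding the alert.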
162
+
163
+## Last-used tracking
164
+
165
+The AKC subcommand updates `user_ssh_keys.last_used_at` and
166
+`last_used_ip` after a successful lookup. It runs in a fire-and-forget
167
+goroutine with a 500 ms timeout; any error is silently dropped. This
168
+keeps the hot path fast at the cost of occasional missed updates under
169
+DB pressure — acceptable for a UI-only display field.
170
+
171
+If a future sprint needs strict last-used accuracy (anomaly detection,
172
+forensic timeline), promote the path to a small append-only log + worker
173
+roll-up. Don't make AKC's success contingent on the update.
174
+
175
+## Operational notes
176
+
177
+- **Deleting a key while a session is active:** sshd doesn't re-auth
178
+  mid-session. Existing sessions persist until disconnect. Document this
179
+  in user-facing security help; it's expected SSH behavior.
180
+- **Fingerprint canonicalization:** the codebase uses
181
+  `SHA256:<base64-no-padding>`. Both `ssh.FingerprintSHA256` and the
182
+  `ssh-keygen -E sha256 -lf <pubfile>` output produce this exact format.
183
+  When matching by hand, copy from the fingerprint shown in the user's
184
+  SSH-keys settings page, NOT from a bare `ssh-keygen -lf` (OpenSSH 6.8+
185
+  defaults to SHA256; older releases print MD5).
186
+- **Fail2ban / connection-rate limiting:** lives at the OS layer. Document
187
+  desired thresholds for ops in the S37 deploy bundle. AKC alone does
188
+  not throttle — that's sshd's `MaxStartups` and the firewall's job.
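
For the fingerprint-canonicalization note, a small helper can pull the fingerprint out of one `ssh-keygen -l` output line and reject anything that is not in the `SHA256:<base64-no-padding>` form. This is a sketch under the assumption that the line has the usual `<bits> <fingerprint> <comment> (<type>)` layout; `canonical_fp` is a hypothetical name.

```sh
#!/bin/sh
# Sketch: extract and validate the fingerprint from one `ssh-keygen -l` line.
# canonical_fp is a hypothetical helper name.
canonical_fp() {
    # Field 2 of "<bits> <fingerprint> <comment> (<type>)".
    fp=$(printf '%s\n' "$1" | awk '{ print $2 }')
    # A SHA-256 digest is 32 bytes: exactly 43 base64 characters unpadded.
    if printf '%s\n' "$fp" | grep -Eqx 'SHA256:[A-Za-z0-9+/]{43}'; then
        printf '%s\n' "$fp"
        return 0
    fi
    echo "not a canonical SHA256 fingerprint: $fp" >&2
    return 1
}
```

Typical use would be `canonical_fp "$(ssh-keygen -E sha256 -lf <pubfile>)"`, which fails loudly on MD5-style or padded input instead of producing a lookup that silently misses.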
189
+
190
+## Smoke test
191
+
192
+```sh
193
+# 1. Bring up dev Postgres + apply migrations.
194
+make dev-db && make build && SHITHUB_DATABASE_URL=... ./bin/shithubd migrate up
195
+
196
+# 2. Add a key (via the UI or a direct INSERT for testing).
197
+
198
+# 3. Invoke the AKC subcommand with a known fingerprint:
199
+SHITHUB_DATABASE_URL=... ./bin/shithubd ssh-authkeys "SHA256:<...>"
200
+# Expect: a single authorized_keys line on stdout, exit 0.
201
+
202
+# 4. Invoke with an unknown fingerprint:
203
+SHITHUB_DATABASE_URL=... ./bin/shithubd ssh-authkeys "SHA256:not-a-real-fingerprint-xxxx"
204
+# Expect: empty stdout, exit 0.
205
+
206
+# 5. Connect via SSH (placeholder shell):
207
+ssh -p 22 git@<host>
208
+# Expect: stderr line "shithubd ssh-shell: user_id=N original_command=..."
209
+# Exit non-zero with "git over SSH not enabled yet" — replaced fully in S13.
210
+```
211
+
212
+## Pitfalls / risks
213
+
214
+- **Wrong-user authorization** is the catastrophic bug. Audit the
215
+  unknown-fingerprint and DB-error paths in any future change to
216
+  `cmd/shithubd/ssh.go`.
217
+- **Stray newlines / unescaped options** in the emitted line break sshd
218
+  parsing in subtle ways. The codebase emits a strict template; don't
219
+  introduce dynamic options without escaping.
220
+- **Group/world-writable paths in the AKC chain** make sshd refuse to
221
+  invoke AKC. The wrapper script and binary must be root-owned and 0755, parents 0755.
222
+- **Postgres role drift:** if `shithub_ssh` ever gains broader privileges,
223
+  a compromise of the AKC binary becomes a much bigger event. Audit
224
+  grants on every deploy.
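
The role-drift audit can be reduced to an allowlist comparison. The sketch below is illustrative only: `grant_drift`, `expected_grants`, and the `table:privilege:column` line format are all assumptions, and how the actual grant list is produced (for example from a query against `information_schema.role_table_grants`) is left to the deploy tooling.

```sh
#!/bin/sh
# Sketch: fail a deploy when shithub_ssh holds a grant outside the allowlist.
# grant_drift, expected_grants, and the line format are hypothetical.

# Allowlist mirroring the GRANT statements in this doc ("-" = whole table).
expected_grants() {
    cat <<'EOF'
user_ssh_keys:SELECT:-
user_ssh_keys:UPDATE:last_used_at
user_ssh_keys:UPDATE:last_used_ip
EOF
}

# Reads the role's actual grants on stdin, one per line, and prints any
# that are not allowlisted. Returns 0 when there is no drift, 1 otherwise.
grant_drift() {
    rc=0
    while IFS= read -r line; do
        [ -n "$line" ] || continue
        if ! expected_grants | grep -qxF -- "$line"; then
            echo "unexpected grant: $line"
            rc=1
        fi
    done
    return "$rc"
}
```

The check is deliberately one-directional: a missing grant breaks the AKC visibly, while an extra grant is the silent failure worth gating on.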
225
+
226
+## Related docs
227
+
228
+- `docs/internal/auth.md` — email/password auth (S05).
229
+- `docs/internal/2fa.md` — TOTP + recovery codes (S06).
230
+- `docs/internal/observability.md` — slog redaction.