tenseleyflow/shithub / 49ecaa3

docs/runner: document DNS allowlist and secret scrub posture

Authored by mfwolffe <wolffemf@dukes.jmu.edu>
SHA: 49ecaa3d0c90a887d1dd838b75f54200cfb71554
Parents: 07bdfb8
Tree: 53f00c4

11 changed files

Status File + -
M SECURITY.md 19 0
M deploy/ansible/inventory/production.example 1 0
M deploy/ansible/roles/shithubd-runner/defaults/main.yml 12 0
M deploy/ansible/roles/shithubd-runner/tasks/main.yml 8 0
M deploy/ansible/roles/shithubd-runner/templates/config.toml.j2 10 0
M deploy/runner-config/README.md 12 0
A deploy/runner-config/dnsmasq.conf.j2 15 0
M docs/internal/actions-runner-api.md 7 0
M docs/internal/actions-schema.md 6 3
M docs/internal/runbooks/actions-runner.md 11 0
M docs/internal/runbooks/runner-deploy.md 31 0
SECURITY.md (modified)
@@ -91,6 +91,8 @@ fresh Docker or Podman container with these defaults:
 - pinned seccomp profile at `/etc/shithubd-runner/seccomp.json`
 - `--user 65534:65534`
 - PID, file-descriptor, process, CPU, memory, and log-size caps
+- optional per-container DNS servers for operator-managed egress
+  allowlisting
 
 The writable `/workspace` mount is deliberate. The v1 engine starts one
 container per step, so checkout/build outputs need a host-backed job
@@ -104,6 +106,23 @@ host user. It is not a general privilege grant: `CAP_SYS_ADMIN` is not
 present, no-new-privileges is set, and the default seccomp profile still
 filters dangerous syscalls.
 
+Runner network allowlisting is an operator-managed control. The runner
+config records `runner.network_allowlist` and passes
+`engine.dns_servers` to each step container; the deployment role renders
+a dnsmasq allowlist template. DNS filtering must be paired with host
+firewall rules on the runner bridge to block direct-IP egress. Do not
+treat DNS-only filtering as a complete network sandbox.
+
+Actions secrets are decrypted only for the runner job claim that needs
+them. Repo secrets shadow org secrets with the same name. The runner
+receives the resolved secret map plus an exact-value mask set; runner
+logs are scrubbed before upload, and the web API scrubs again before
+persisting chunks. The server-side scrubber carries possible secret
+prefix tails across adjacent chunks so a bypassing runner cannot leak a
+secret by splitting it over multiple log POSTs. Base64-encoded or
+transformed secrets are not masked; workflows must not print secrets in
+derived forms.
+
 Root containers are opt-in per job through an explicit shithub-only
 permissions key:
 
deploy/ansible/inventory/production.example (modified)
@@ -52,3 +52,4 @@ grafana_cloud_prom_token=REPLACE_ME # access-policy token
 # shithub_runner_labels=self-hosted,linux,ubuntu-latest
 # shithub_runner_capacity=1
 # shithub_runner_default_image=ghcr.io/shithub/runner-nix:1.0
+# shithub_runner_dns_servers=172.30.0.1
deploy/ansible/roles/shithubd-runner/defaults/main.yml (modified)
@@ -12,9 +12,21 @@ shithub_runner_capacity: 1
 shithub_runner_poll_interval: 5s
 shithub_runner_workspace_root: /var/lib/shithubd-runner/workspaces
 shithub_runner_workspace_ttl: 24h
+shithub_runner_network_allowlist:
+  - api.github.com
+  - auth.docker.io
+  - codeload.github.com
+  - github.com
+  - objects.githubusercontent.com
+  - production.cloudflare.docker.com
+  - registry-1.docker.io
+  - "*.githubusercontent.com"
 shithub_runner_engine: docker
 shithub_runner_default_image: ghcr.io/shithub/runner-nix:1.0
 shithub_runner_network: bridge
+shithub_runner_dns_servers: []
+shithub_runner_dnsmasq_config: /etc/shithubd-runner/dnsmasq.conf
+shithub_runner_dnsmasq_upstream: 1.1.1.1
 shithub_runner_memory: 2g
 shithub_runner_cpus: "2"
 shithub_runner_seccomp_profile: /etc/shithubd-runner/seccomp.json
deploy/ansible/roles/shithubd-runner/tasks/main.yml (modified)
@@ -102,6 +102,14 @@
     mode: "0640"
   notify: restart shithubd-runner
 
+- name: Runner DNS allowlist template
+  template:
+    src: "{{ playbook_dir }}/../runner-config/dnsmasq.conf.j2"
+    dest: "{{ shithub_runner_dnsmasq_config }}"
+    owner: root
+    group: shithub-runner
+    mode: "0640"
+
 - name: Runner systemd unit
   copy:
     src: "{{ playbook_dir }}/../systemd/shithubd-runner.service"
deploy/ansible/roles/shithubd-runner/templates/config.toml.j2 (modified)
@@ -13,6 +13,11 @@ capacity = {{ shithub_runner_capacity }}
 poll_interval = "{{ shithub_runner_poll_interval }}"
 workspace_root = "{{ shithub_runner_workspace_root }}"
 workspace_ttl = "{{ shithub_runner_workspace_ttl }}"
+{% if shithub_runner_network_allowlist is string %}
+network_allowlist = {{ shithub_runner_network_allowlist.split(",") | map("trim") | list | to_json }}
+{% else %}
+network_allowlist = {{ shithub_runner_network_allowlist | to_json }}
+{% endif %}
 
 [engine]
 kind = "{{ shithub_runner_engine }}"
@@ -23,6 +28,11 @@ cpus = "{{ shithub_runner_cpus }}"
 seccomp_profile = "{{ shithub_runner_seccomp_profile }}"
 user = "{{ shithub_runner_container_user }}"
 pids_limit = {{ shithub_runner_pids_limit }}
+{% if shithub_runner_dns_servers is string %}
+dns_servers = {{ shithub_runner_dns_servers.split(",") | map("trim") | list | to_json }}
+{% else %}
+dns_servers = {{ shithub_runner_dns_servers | to_json }}
+{% endif %}
 
 [log]
 level = "{{ shithub_runner_log_level }}"
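
Note on the two `is string` branches in `config.toml.j2`: they let `shithub_runner_network_allowlist` and `shithub_runner_dns_servers` arrive either as a comma-separated string (presumably the form an INI-style inventory line such as `shithub_runner_dns_servers=172.30.0.1` produces) or as a YAML list. The Python sketch below mirrors that normalization for illustration only. `normalize_list` and `render_toml_array` are hypothetical names and are not part of the role.

```python
import json


def normalize_list(value):
    """Mirror the template: split comma-separated strings, pass lists through."""
    if isinstance(value, str):
        # Equivalent to `value.split(",") | map("trim") | list` in the template.
        return [item.strip() for item in value.split(",")]
    return list(value)


def render_toml_array(key, value):
    """Render `key = [...]` using JSON encoding, as the `to_json` filter does.

    For plain host names and IP addresses, a JSON array of strings is also
    a valid TOML array of basic strings.
    """
    return f"{key} = {json.dumps(normalize_list(value))}"


# Hypothetical inputs: one INI-style string, one YAML-style list.
print(render_toml_array("dns_servers", "172.30.0.1, 172.30.0.2"))
print(render_toml_array("network_allowlist", ["github.com", "*.githubusercontent.com"]))
```

One edge case follows from this logic: an empty string splits to a single empty item, so an empty string value would render `[""]` rather than `[]`. Leaving the variable at its list default avoids that.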
deploy/runner-config/README.md (modified)
@@ -14,3 +14,15 @@ Source: `moby/moby` commit
 
 Update this file deliberately when changing Docker daemon versions or
 runner syscall posture.
+
+`dnsmasq.conf.j2` is the optional runner DNS allowlist template. The
+Ansible role renders it to `/etc/shithubd-runner/dnsmasq.conf` from
+`shithub_runner_network_allowlist`; operators can run dnsmasq bound to
+their Actions Docker bridge and point step containers at it with
+`engine.dns_servers`.
+
+The dnsmasq template intentionally has no default upstream resolver, so
+names outside the allowlist fail resolution. DNS allowlisting alone does
+not block direct-IP egress or a workflow that brings its own resolver;
+pair it with host firewall rules on the runner bridge for a deny-by-
+default network boundary.
deploy/runner-config/dnsmasq.conf.j2 (added)
@@ -0,0 +1,15 @@
+# Managed by Ansible. Optional DNS allowlist resolver for Actions runners.
+#
+# Pair this with a Docker bridge/network that uses this resolver as its only
+# DNS server. This controls name resolution, not direct-IP egress; enforce
+# direct-IP denial with host firewall rules on the runner bridge.
+
+domain-needed
+bogus-priv
+no-resolv
+no-hosts
+
+{% for pattern in shithub_runner_network_allowlist %}
+{% set host = (pattern[2:] if pattern.startswith("*.") else pattern) %}
+server=/{{ host }}/{{ shithub_runner_dnsmasq_upstream }}
+{% endfor %}
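
For reference, the loop in `dnsmasq.conf.j2` reduces each allowlist pattern to a bare domain and emits one `server=/<domain>/<upstream>` line. The Python sketch below reproduces that rendering and approximates which names would then be forwarded, assuming dnsmasq's documented behavior that a `server=/domain/` rule also covers subdomains of that domain. `dnsmasq_server_lines` and `is_forwarded` are hypothetical helper names used only for this illustration.

```python
def strip_wildcard(pattern):
    """Drop a leading "*." the same way the template's `set host` line does."""
    return pattern[2:] if pattern.startswith("*.") else pattern


def dnsmasq_server_lines(allowlist, upstream):
    """Render one `server=/<domain>/<upstream>` line per allowlist pattern."""
    return [f"server=/{strip_wildcard(p)}/{upstream}" for p in allowlist]


def is_forwarded(name, allowlist):
    """Approximate suffix matching: a rule for example.com also covers a.example.com.

    This is an assumption-based model of dnsmasq matching, not dnsmasq itself.
    """
    name = name.rstrip(".").lower()
    for pattern in allowlist:
        domain = strip_wildcard(pattern).lower()
        if name == domain or name.endswith("." + domain):
            return True
    return False


allowlist = ["github.com", "*.githubusercontent.com"]
for line in dnsmasq_server_lines(allowlist, "1.1.1.1"):
    print(line)
print(is_forwarded("raw.githubusercontent.com", allowlist))
print(is_forwarded("example.org", allowlist))
```

Under that assumption, an exact entry such as `github.com` behaves as a suffix rule, so a name like `gist.github.com` would resolve too, and `*.githubusercontent.com` also admits the apex `githubusercontent.com`. Operators who need exact-host matching may want to verify this against their dnsmasq version.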
docs/internal/actions-runner-api.md (modified)
@@ -62,6 +62,8 @@ Returns 204 when no matching job is claimable. Returns 200 with
 `token`, `expires_at`, and `job` when a job is claimed. Capacity is
 enforced server-side by counting current `workflow_jobs.status =
 'running'` rows for the runner while holding a row lock on the runner.
+The job payload includes resolved `secrets` and `mask_values`; repo
+secrets shadow org secrets with the same name.
 
 `POST /api/v1/jobs/{id}/logs`
 
@@ -75,6 +77,10 @@ Auth: job JWT. Body:
 first step in the job receives the chunk. Chunks are base64-decoded,
 capped at 512 KiB raw, and appended to `workflow_step_log_chunks`.
 Duplicate `(step_id, seq)` inserts are accepted as idempotent retries.
+Before append, the API re-scrubs exact secret values from the runner
+claim's visible secret set. It also reprocesses any possible secret
+prefix carried at the end of the prior chunk, so a runner cannot leak a
+secret by splitting it across two log calls.
 
 `POST /api/v1/jobs/{id}/steps/{step_id}/status`
 
@@ -142,3 +148,4 @@ request UI lands later in S41g.
 - `shithub_actions_runner_registrations_total`
 - `shithub_actions_runner_heartbeats_total{result="claimed|no_job"}`
 - `shithub_actions_runner_jwt_total{result="issued|rejected|replay"}`
+- `shithub_actions_log_scrub_replacements_total{location="server"}`
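
The cross-chunk scrubbing described in this file can be illustrated with a small streaming sketch. It is not the `internal/runner/scrub` implementation or the server code. It assumes a `***` mask and holds back any chunk tail that could be the start of a secret until the next chunk arrives. `ChunkScrubber`, `resolve_secrets`, and the mask string are hypothetical names chosen for this example, and building the mask set from the resolved values is likewise an assumption.

```python
MASK = "***"  # assumed mask string, not taken from the project


def resolve_secrets(org_secrets, repo_secrets):
    """Repo secrets shadow org secrets with the same name."""
    resolved = dict(org_secrets)
    resolved.update(repo_secrets)
    return resolved


class ChunkScrubber:
    """Mask exact secret values, including ones split across adjacent chunks."""

    def __init__(self, secrets):
        # Longest first, so a secret that contains another is masked whole.
        self.secrets = sorted({s for s in secrets if s}, key=len, reverse=True)
        self.carry = ""

    def _held_back(self, text):
        """Length of the longest suffix of text that is a proper prefix of a secret."""
        longest = max((len(s) for s in self.secrets), default=0)
        for size in range(min(len(text), longest - 1), 0, -1):
            tail = text[-size:]
            if any(s.startswith(tail) for s in self.secrets):
                return size
        return 0

    def feed(self, chunk):
        """Return the part of carry + chunk that is safe to persist now."""
        text = self.carry + chunk
        for secret in self.secrets:
            text = text.replace(secret, MASK)
        cut = len(text) - self._held_back(text)
        self.carry = text[cut:]
        return text[:cut]

    def flush(self):
        """Emit whatever is still held back once the stream ends."""
        out, self.carry = self.carry, ""
        return out


secrets = resolve_secrets({"TOKEN": "org-value"}, {"TOKEN": "hunter2"})
scrubber = ChunkScrubber(secrets.values())
# The secret is deliberately split over two log POST bodies.
print(scrubber.feed("password=hun") + scrubber.feed("ter2 done") + scrubber.flush())
```

The trade-off in this sketch is latency rather than correctness: a tail that merely looks like the start of a secret is delayed by one chunk and released unchanged if the next chunk does not complete it. As the SECURITY.md text notes, exact-value masking of this kind does not cover base64-encoded or otherwise transformed secrets.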
docs/internal/actions-schema.md (modified)
@@ -294,9 +294,12 @@ load-bearing for S41e's threat model.
 Runner log chunks pass through `internal/runner/scrub` before they are
 posted to the API. It masks exact secret values and preserves enough
 tail bytes between chunks to catch a secret split across chunk
-boundaries. S41e follow-up work wires resolved workflow secrets into
-the runner/API mask set and adds server-side defense in depth before
-persisting chunks.
+boundaries. S41e wires resolved workflow secrets into the runner claim
+payload and mask set, then applies the same exact-value scrub again in
+the runner API before persisting chunks. The server path also carries a
+possible secret-prefix tail from the prior persisted chunk, so a runner
+that bypasses client-side scrubbing cannot leak a secret by splitting
+it across adjacent log POSTs.
 
 ## `shithub.event` payload schema (v1)
 
docs/internal/runbooks/actions-runner.md (modified)
@@ -64,6 +64,16 @@ capacity = 1
 poll_interval = "5s"
 workspace_root = "/var/lib/shithubd-runner/workspaces"
 workspace_ttl = "24h"
+network_allowlist = [
+  "api.github.com",
+  "auth.docker.io",
+  "codeload.github.com",
+  "github.com",
+  "objects.githubusercontent.com",
+  "production.cloudflare.docker.com",
+  "registry-1.docker.io",
+  "*.githubusercontent.com",
+]
 
 [engine]
 kind = "docker"
@@ -74,6 +84,7 @@ cpus = "2"
 seccomp_profile = "/etc/shithubd-runner/seccomp.json"
 user = "65534:65534"
 pids_limit = 512
+dns_servers = []
 ```
 
 The config path defaults to `/etc/shithubd-runner/config.toml`.
docs/internal/runbooks/runner-deploy.md (modified)
@@ -59,6 +59,7 @@ shithub_runner_default_image=ghcr.io/shithub/runner-nix:1.0
 shithub_runner_seccomp_profile=/etc/shithubd-runner/seccomp.json
 shithub_runner_container_user=65534:65534
 shithub_runner_pids_limit=512
+shithub_runner_dns_servers=172.30.0.1
 ```
 
 The role writes non-secret config to
@@ -67,6 +68,12 @@ The role writes non-secret config to
 Keep `shithub_runner_workspace_root` under `/var/lib/shithubd-runner`;
 the systemd unit grants runner writes only to that subtree.
 
+`shithub_runner_network_allowlist` defaults to GitHub source/archive
+hosts plus Docker Hub registry hosts. Override it when a runner must
+fetch from an internal package registry. `shithub_runner_dns_servers`
+is empty by default; set it only after a DNS allowlist resolver exists
+on the runner network.
+
 ## Deploy
 
 For the runner role only:
@@ -82,6 +89,8 @@ The role:
 - creates the `shithub-runner` system user and joins it to `docker`
 - uploads `/usr/local/bin/shithubd-runner`
 - renders `/etc/shithubd-runner/config.toml` and `runner.env`
+- renders `/etc/shithubd-runner/dnsmasq.conf` from the network
+  allowlist for operators who run a local DNS allowlist resolver
 - installs the pinned seccomp profile at
   `/etc/shithubd-runner/seccomp.json`
 - installs `deploy/systemd/shithubd-runner.service`
@@ -143,6 +152,28 @@ Expected state:
 - step logs and systemd journal include the configured image, network,
   CPU/memory limits, PID limit, container user, and seccomp profile
 
+## Network Allowlist
+
+The runner config carries two separate network controls:
+
+- `runner.network_allowlist`: the host patterns allowed by the
+  operator's DNS allowlist resolver.
+- `engine.dns_servers`: DNS servers passed to each step container with
+  Docker `--dns`.
+
+For a single-host deployment, create a dedicated Docker bridge for
+Actions jobs, run dnsmasq bound to that bridge, render
+`/etc/shithubd-runner/dnsmasq.conf`, and set
+`shithub_runner_dns_servers` to the bridge address of that resolver.
+The rendered dnsmasq config has no default upstream resolver; names not
+matching the allowlist fail DNS resolution.
+
+DNS filtering is not a complete egress boundary by itself. Block
+direct-IP egress from the Actions bridge with host firewall rules, and
+allow only DNS to the resolver plus established outbound connections
+opened by that resolver. Keep the runner on a separate host from web
+and database services.
+
 ## Rollback
 
 Stop the runner first so it does not claim new jobs: