tenseleyflow/shithub / 49ecaa3

docs/runner: document DNS allowlist and secret scrub posture

Authored by mfwolffe <wolffemf@dukes.jmu.edu>
SHA: 49ecaa3d0c90a887d1dd838b75f54200cfb71554
Parents: 07bdfb8
Tree: 53f00c4

11 changed files

Status File + -
M SECURITY.md 19 0
M deploy/ansible/inventory/production.example 1 0
M deploy/ansible/roles/shithubd-runner/defaults/main.yml 12 0
M deploy/ansible/roles/shithubd-runner/tasks/main.yml 8 0
M deploy/ansible/roles/shithubd-runner/templates/config.toml.j2 10 0
M deploy/runner-config/README.md 12 0
A deploy/runner-config/dnsmasq.conf.j2 15 0
M docs/internal/actions-runner-api.md 7 0
M docs/internal/actions-schema.md 6 3
M docs/internal/runbooks/actions-runner.md 11 0
M docs/internal/runbooks/runner-deploy.md 31 0
SECURITY.md modified
@@ -91,6 +91,8 @@ fresh Docker or Podman container with these defaults:
 - pinned seccomp profile at `/etc/shithubd-runner/seccomp.json`
 - `--user 65534:65534`
 - PID, file-descriptor, process, CPU, memory, and log-size caps
+- optional per-container DNS servers for operator-managed egress
+  allowlisting
 
 The writable `/workspace` mount is deliberate. The v1 engine starts one
 container per step, so checkout/build outputs need a host-backed job
@@ -104,6 +106,23 @@ host user. It is not a general privilege grant: `CAP_SYS_ADMIN` is not
 present, no-new-privileges is set, and the default seccomp profile still
 filters dangerous syscalls.
 
+Runner network allowlisting is an operator-managed control. The runner
+config records `runner.network_allowlist` and passes
+`engine.dns_servers` to each step container; the deployment role renders
+a dnsmasq allowlist template. DNS filtering must be paired with host
+firewall rules on the runner bridge to block direct-IP egress. Do not
+treat DNS-only filtering as a complete network sandbox.
+
+Actions secrets are decrypted only for the runner job claim that needs
+them. Repo secrets shadow org secrets with the same name. The runner
+receives the resolved secret map plus an exact-value mask set; runner
+logs are scrubbed before upload, and the web API scrubs again before
+persisting chunks. The server-side scrubber carries possible secret
+prefix tails across adjacent chunks so a bypassing runner cannot leak a
+secret by splitting it over multiple log POSTs. Base64-encoded or
+transformed secrets are not masked; workflows must not print secrets in
+derived forms.
+
 Root containers are opt-in per job through an explicit shithub-only
 permissions key:
 
deploy/ansible/inventory/production.example modified
@@ -52,3 +52,4 @@ grafana_cloud_prom_token=REPLACE_ME # access-policy token
 # shithub_runner_labels=self-hosted,linux,ubuntu-latest
 # shithub_runner_capacity=1
 # shithub_runner_default_image=ghcr.io/shithub/runner-nix:1.0
+# shithub_runner_dns_servers=172.30.0.1
deploy/ansible/roles/shithubd-runner/defaults/main.yml modified
@@ -12,9 +12,21 @@ shithub_runner_capacity: 1
 shithub_runner_poll_interval: 5s
 shithub_runner_workspace_root: /var/lib/shithubd-runner/workspaces
 shithub_runner_workspace_ttl: 24h
+shithub_runner_network_allowlist:
+  - api.github.com
+  - auth.docker.io
+  - codeload.github.com
+  - github.com
+  - objects.githubusercontent.com
+  - production.cloudflare.docker.com
+  - registry-1.docker.io
+  - "*.githubusercontent.com"
 shithub_runner_engine: docker
 shithub_runner_default_image: ghcr.io/shithub/runner-nix:1.0
 shithub_runner_network: bridge
+shithub_runner_dns_servers: []
+shithub_runner_dnsmasq_config: /etc/shithubd-runner/dnsmasq.conf
+shithub_runner_dnsmasq_upstream: 1.1.1.1
 shithub_runner_memory: 2g
 shithub_runner_cpus: "2"
 shithub_runner_seccomp_profile: /etc/shithubd-runner/seccomp.json
deploy/ansible/roles/shithubd-runner/tasks/main.yml modified
@@ -102,6 +102,14 @@
     mode: "0640"
   notify: restart shithubd-runner
 
+- name: Runner DNS allowlist template
+  template:
+    src: "{{ playbook_dir }}/../runner-config/dnsmasq.conf.j2"
+    dest: "{{ shithub_runner_dnsmasq_config }}"
+    owner: root
+    group: shithub-runner
+    mode: "0640"
+
 - name: Runner systemd unit
   copy:
     src: "{{ playbook_dir }}/../systemd/shithubd-runner.service"
deploy/ansible/roles/shithubd-runner/templates/config.toml.j2 modified
@@ -13,6 +13,11 @@ capacity = {{ shithub_runner_capacity }}
 poll_interval = "{{ shithub_runner_poll_interval }}"
 workspace_root = "{{ shithub_runner_workspace_root }}"
 workspace_ttl = "{{ shithub_runner_workspace_ttl }}"
+{% if shithub_runner_network_allowlist is string %}
+network_allowlist = {{ shithub_runner_network_allowlist.split(",") | map("trim") | list | to_json }}
+{% else %}
+network_allowlist = {{ shithub_runner_network_allowlist | to_json }}
+{% endif %}
 
 [engine]
 kind = "{{ shithub_runner_engine }}"
@@ -23,6 +28,11 @@ cpus = "{{ shithub_runner_cpus }}"
 seccomp_profile = "{{ shithub_runner_seccomp_profile }}"
 user = "{{ shithub_runner_container_user }}"
 pids_limit = {{ shithub_runner_pids_limit }}
+{% if shithub_runner_dns_servers is string %}
+dns_servers = {{ shithub_runner_dns_servers.split(",") | map("trim") | list | to_json }}
+{% else %}
+dns_servers = {{ shithub_runner_dns_servers | to_json }}
+{% endif %}
 
 [log]
 level = "{{ shithub_runner_log_level }}"
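
Reviewer note: both template branches above accept either an INI-style comma-separated string or a YAML list and emit a JSON array, which is also valid TOML for a list of strings. The sketch below mirrors that branch logic in plain Python so the edge cases are easy to check. The function name `render_toml_array` is hypothetical, and `json.dumps` stands in for Ansible's `to_json` filter on the assumption that both serialize a list of plain strings the same way.

```python
import json


def render_toml_array(value):
    """Mirror the template branch: comma-separated string or list -> JSON array text."""
    if isinstance(value, str):
        # Same steps as the Jinja expression: split(",") then trim each item.
        items = [part.strip() for part in value.split(",")]
    else:
        items = list(value)
    return json.dumps(items)


if __name__ == "__main__":
    print("dns_servers =", render_toml_array("172.30.0.1, 172.30.0.2"))
    print("dns_servers =", render_toml_array([]))
    # An empty string splits into one empty item rather than an empty list.
    print("dns_servers =", render_toml_array(""))
```

Under this reading, an inventory line that sets the variable to an empty string would render `[""]` rather than `[]`, so leaving the variable unset (and inheriting the `[]` default) looks like the safer way to disable the option.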
deploy/runner-config/README.md modified
@@ -14,3 +14,15 @@ Source: `moby/moby` commit
 
 Update this file deliberately when changing Docker daemon versions or
 runner syscall posture.
+
+`dnsmasq.conf.j2` is the optional runner DNS allowlist template. The
+Ansible role renders it to `/etc/shithubd-runner/dnsmasq.conf` from
+`shithub_runner_network_allowlist`; operators can run dnsmasq bound to
+their Actions Docker bridge and point step containers at it with
+`engine.dns_servers`.
+
+The dnsmasq template intentionally has no default upstream resolver, so
+names outside the allowlist fail resolution. DNS allowlisting alone does
+not block direct-IP egress or a workflow that brings its own resolver;
+pair it with host firewall rules on the runner bridge for a deny-by-
+default network boundary.
deploy/runner-config/dnsmasq.conf.j2 added
@@ -0,0 +1,15 @@
+# Managed by Ansible. Optional DNS allowlist resolver for Actions runners.
+#
+# Pair this with a Docker bridge/network that uses this resolver as its only
+# DNS server. This controls name resolution, not direct-IP egress; enforce
+# direct-IP denial with host firewall rules on the runner bridge.
+
+domain-needed
+bogus-priv
+no-resolv
+no-hosts
+
+{% for pattern in shithub_runner_network_allowlist %}
+{% set host = (pattern[2:] if pattern.startswith("*.") else pattern) %}
+server=/{{ host }}/{{ shithub_runner_dnsmasq_upstream }}
+{% endfor %}
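
Reviewer note: the loop above turns each allowlist pattern into one dnsmasq `server=/domain/upstream` line, stripping a leading `*.` first. A minimal Python sketch of the same transformation follows, so the rendered output can be previewed without running Ansible. The function name `render_dnsmasq_servers` is hypothetical; the pattern handling is copied from the template expression.

```python
def render_dnsmasq_servers(allowlist, upstream):
    """Return the server= lines the template loop would emit for an allowlist."""
    lines = []
    for pattern in allowlist:
        # Same rule as the template: drop a leading "*." and keep the rest.
        host = pattern[2:] if pattern.startswith("*.") else pattern
        lines.append(f"server=/{host}/{upstream}")
    return lines


if __name__ == "__main__":
    sample = ["github.com", "registry-1.docker.io", "*.githubusercontent.com"]
    for line in render_dnsmasq_servers(sample, "1.1.1.1"):
        print(line)
```

Since dnsmasq applies a `server=/domain/` rule to the domain and its subdomains, an exact entry such as `github.com` also resolves `api.github.com`. Entries therefore behave as suffix matches rather than exact-host matches, which is worth keeping in mind when adding hosts.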
docs/internal/actions-runner-api.md modified
@@ -62,6 +62,8 @@ Returns 204 when no matching job is claimable. Returns 200 with
 `token`, `expires_at`, and `job` when a job is claimed. Capacity is
 enforced server-side by counting current `workflow_jobs.status =
 'running'` rows for the runner while holding a row lock on the runner.
+The job payload includes resolved `secrets` and `mask_values`; repo
+secrets shadow org secrets with the same name.
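
Reviewer note: the shadowing rule described above amounts to a dictionary merge in which repo entries win. The sketch below illustrates it together with one plausible way to derive the mask set. The function name `resolve_secrets`, the choice to mask only the values that survive shadowing, and the longest-first ordering are assumptions for illustration; only the `secrets` and `mask_values` field names come from the doc.

```python
def resolve_secrets(org_secrets, repo_secrets):
    """Merge org and repo secrets; a repo secret shadows an org secret of the same name."""
    resolved = {**org_secrets, **repo_secrets}
    # Assumption: mask only values visible to the job, longest first so that a
    # secret containing another secret is replaced as a whole.
    unique_values = {value for value in resolved.values() if value}
    mask_values = sorted(unique_values, key=lambda value: (-len(value), value))
    return {"secrets": resolved, "mask_values": mask_values}


if __name__ == "__main__":
    payload = resolve_secrets(
        {"TOKEN": "org-token", "REGION": "us-east"},
        {"TOKEN": "repo-token"},
    )
    print(payload)
```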
 
 `POST /api/v1/jobs/{id}/logs`
 
@@ -75,6 +77,10 @@ Auth: job JWT. Body:
 first step in the job receives the chunk. Chunks are base64-decoded,
 capped at 512 KiB raw, and appended to `workflow_step_log_chunks`.
 Duplicate `(step_id, seq)` inserts are accepted as idempotent retries.
+Before append, the API re-scrubs exact secret values from the runner
+claim's visible secret set. It also reprocesses any possible secret
+prefix carried at the end of the prior chunk, so a runner cannot leak a
+secret by splitting it across two log calls.
 
 `POST /api/v1/jobs/{id}/steps/{step_id}/status`
 
@@ -142,3 +148,4 @@ request UI lands later in S41g.
 - `shithub_actions_runner_registrations_total`
 - `shithub_actions_runner_heartbeats_total{result="claimed|no_job"}`
 - `shithub_actions_runner_jwt_total{result="issued|rejected|replay"}`
+- `shithub_actions_log_scrub_replacements_total{location="server"}`
docs/internal/actions-schema.md modified
@@ -294,9 +294,12 @@ load-bearing for S41e's threat model.
 Runner log chunks pass through `internal/runner/scrub` before they are
 posted to the API. It masks exact secret values and preserves enough
 tail bytes between chunks to catch a secret split across chunk
-boundaries. S41e follow-up work wires resolved workflow secrets into
-the runner/API mask set and adds server-side defense in depth before
-persisting chunks.
+boundaries. S41e wires resolved workflow secrets into the runner claim
+payload and mask set, then applies the same exact-value scrub again in
+the runner API before persisting chunks. The server path also carries a
+possible secret-prefix tail from the prior persisted chunk, so a runner
+that bypasses client-side scrubbing cannot leak a secret by splitting
+it across adjacent log POSTs.
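
Reviewer note: the chunk-boundary behaviour described above can be sketched as a small streaming scrubber. This is an illustration of the general technique, not the `internal/runner/scrub` code. The class name `ChunkScrubber`, the `***` mask, and the choice to withhold the trailing bytes until the next chunk (rather than re-reading an already persisted tail, as the server path is described as doing) are all assumptions.

```python
class ChunkScrubber:
    """Mask exact secret values in a stream of log chunks, including split secrets."""

    def __init__(self, secrets, mask="***"):
        # Longest first, so a secret that contains another is masked whole.
        self._secrets = sorted(
            (s for s in set(secrets) if s), key=lambda s: (-len(s), s)
        )
        self._mask = mask
        self._carry = ""

    def _holdback(self, data):
        """Length of the longest suffix of data that is a proper prefix of a secret."""
        longest = 0
        for secret in self._secrets:
            upper = min(len(secret) - 1, len(data))
            for size in range(upper, longest, -1):
                if data.endswith(secret[:size]):
                    longest = size
                    break
        return longest

    def feed(self, chunk):
        """Scrub one chunk and return the part that is safe to emit now."""
        data = self._carry + chunk
        for secret in self._secrets:
            data = data.replace(secret, self._mask)
        keep = self._holdback(data)
        cut = len(data) - keep
        self._carry = data[cut:]
        return data[:cut]

    def flush(self):
        """Emit whatever is still withheld once the stream has ended."""
        tail, self._carry = self._carry, ""
        return tail


if __name__ == "__main__":
    scrubber = ChunkScrubber(["hunter2-token"])
    parts = [scrubber.feed("login with hun"), scrubber.feed("ter2-token done")]
    parts.append(scrubber.flush())
    print("".join(parts))  # prints "login with *** done"
```

The withheld tail is never longer than the longest secret minus one character, so the extra latency per chunk is bounded. As the security notes in this commit say, encoded or transformed forms of a secret pass through any exact-value scrubber untouched, this sketch included.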
 
 ## `shithub.event` payload schema (v1)
 
docs/internal/runbooks/actions-runner.md modified
@@ -64,6 +64,16 @@ capacity = 1
 poll_interval = "5s"
 workspace_root = "/var/lib/shithubd-runner/workspaces"
 workspace_ttl = "24h"
+network_allowlist = [
+  "api.github.com",
+  "auth.docker.io",
+  "codeload.github.com",
+  "github.com",
+  "objects.githubusercontent.com",
+  "production.cloudflare.docker.com",
+  "registry-1.docker.io",
+  "*.githubusercontent.com",
+]
 
 [engine]
 kind = "docker"
@@ -74,6 +84,7 @@ cpus = "2"
 seccomp_profile = "/etc/shithubd-runner/seccomp.json"
 user = "65534:65534"
 pids_limit = 512
+dns_servers = []
 ```
 
 The config path defaults to `/etc/shithubd-runner/config.toml`.
docs/internal/runbooks/runner-deploy.md modified
@@ -59,6 +59,7 @@ shithub_runner_default_image=ghcr.io/shithub/runner-nix:1.0
 shithub_runner_seccomp_profile=/etc/shithubd-runner/seccomp.json
 shithub_runner_container_user=65534:65534
 shithub_runner_pids_limit=512
+shithub_runner_dns_servers=172.30.0.1
 ```
 
 The role writes non-secret config to
@@ -67,6 +68,12 @@ The role writes non-secret config to
 Keep `shithub_runner_workspace_root` under `/var/lib/shithubd-runner`;
 the systemd unit grants runner writes only to that subtree.
 
+`shithub_runner_network_allowlist` defaults to GitHub source/archive
+hosts plus Docker Hub registry hosts. Override it when a runner must
+fetch from an internal package registry. `shithub_runner_dns_servers`
+is empty by default; set it only after a DNS allowlist resolver exists
+on the runner network.
+
 ## Deploy
 
 For the runner role only:
@@ -82,6 +89,8 @@ The role:
 - creates the `shithub-runner` system user and joins it to `docker`
 - uploads `/usr/local/bin/shithubd-runner`
 - renders `/etc/shithubd-runner/config.toml` and `runner.env`
+- renders `/etc/shithubd-runner/dnsmasq.conf` from the network
+  allowlist for operators who run a local DNS allowlist resolver
 - installs the pinned seccomp profile at
   `/etc/shithubd-runner/seccomp.json`
 - installs `deploy/systemd/shithubd-runner.service`
@@ -143,6 +152,28 @@ Expected state:
 - step logs and systemd journal include the configured image, network,
   CPU/memory limits, PID limit, container user, and seccomp profile
 
+## Network Allowlist
+
+The runner config carries two separate network controls:
+
+- `runner.network_allowlist`: the host patterns allowed by the
+  operator's DNS allowlist resolver.
+- `engine.dns_servers`: DNS servers passed to each step container with
+  Docker `--dns`.
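
Reviewer note: the second control above maps each configured server to one `--dns` flag on the container command line. The sketch below shows how such an argument list might be assembled. The helper name `docker_dns_args` and the surrounding command are illustrative rather than the runner's actual invocation, and the IP validation is an added assumption based on Docker's `--dns` taking addresses, not hostnames.

```python
import ipaddress


def docker_dns_args(dns_servers):
    """Expand a list of resolver addresses into repeated --dns arguments."""
    args = []
    for server in dns_servers:
        # Raises ValueError for anything that is not an IPv4/IPv6 address.
        ipaddress.ip_address(server)
        args += ["--dns", server]
    return args


if __name__ == "__main__":
    command = [
        "docker", "run", "--rm",
        "--network", "bridge",
        *docker_dns_args(["172.30.0.1"]),
        "ghcr.io/shithub/runner-nix:1.0",
    ]
    # Printed rather than executed, so the sketch runs without a Docker daemon.
    print(" ".join(command))
```

With an empty `dns_servers` list the helper returns no flags, which matches the documented default of leaving the container's resolver configuration to Docker.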
+
+For a single-host deployment, create a dedicated Docker bridge for
+Actions jobs, run dnsmasq bound to that bridge, render
+`/etc/shithubd-runner/dnsmasq.conf`, and set
+`shithub_runner_dns_servers` to the bridge address of that resolver.
+The rendered dnsmasq config has no default upstream resolver; names not
+matching the allowlist fail DNS resolution.
+
+DNS filtering is not a complete egress boundary by itself. Block
+direct-IP egress from the Actions bridge with host firewall rules, and
+allow only DNS to the resolver plus established outbound connections
+opened by that resolver. Keep the runner on a separate host from web
+and database services.
+
 ## Rollback
 
 Stop the runner first so it does not claim new jobs: