
# Actions runner smoke runbook

This runbook validates the runner-facing Actions path. `shithubd-runner` now claims jobs and executes containerized `run:` steps through Docker or Podman. The curl flow below remains useful for token/replay debugging.

For host provisioning and the systemd/Ansible path, see [runner-deploy.md](./runner-deploy.md).

Prereqs:

- Database migrations are current through `0055_workflow_job_secret_masks.sql`.
- `SHITHUB_TOTP_KEY` or `auth.totp_key_b64` is set on the web process.
- Object storage is configured if testing artifact upload.
- Docker or Podman is installed on the runner host.
- A repo has a workflow under `.shithub/workflows/*.yml` with `runs-on: ubuntu-latest`, and a push/dispatch has enqueued a run. S41d PR1 supports `run:` steps; checkout and artifact aliases land in the following S41d slices.

`runs-on` is a runner-label selector, not a hard-coded image name. A workflow that says `runs-on: ubuntu-latest` can be claimed by any runner advertising the `ubuntu-latest` label. The container image is selected by the runner host's `engine.default_image` setting; the reproducible Nix-built image is the default, but operators can point it at another OCI image when they need closer Ubuntu parity.
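
The selector behaviour can be sketched in shell. This is only an illustration, not the server's scheduling code: `runner_has_label` is a hypothetical helper, and exact, case-sensitive matching of a single `runs-on` label against a comma-separated label list is an assumption.

```sh
# Hypothetical helper: succeed if the comma-separated label list in $1
# contains the runs-on label in $2 (exact, case-sensitive match assumed).
runner_has_label() {
  case ",$1," in
    *",$2,"*) return 0 ;;
    *) return 1 ;;
  esac
}

# A runner registered with these labels would be eligible for
# runs-on: ubuntu-latest, whatever image engine.default_image names.
if runner_has_label "self-hosted,linux,ubuntu-latest" "ubuntu-latest"; then
  echo "eligible"
fi
```

Wrapping both sides in commas keeps a label such as `ubuntu-latest-arm` from matching `ubuntu-latest` by prefix.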

Register a runner:

shithubd admin runner register \
  --name runner-1 \
  --labels self-hosted,linux,ubuntu-latest \
  --capacity 1

Save the printed token:

export RUNNER_TOKEN='<printed-token>'
export BASE='https://shithub.example'

Run the binary:

shithubd-runner run \
  --server-url "$BASE" \
  --token "$RUNNER_TOKEN" \
  --labels self-hosted,linux,ubuntu-latest \
  --workspace-root /var/lib/shithubd-runner/workspaces \
  --network shithub-actions \
  --dns-servers 172.30.0.1

Equivalent config file:

[server]
base_url = "https://shithub.example"

[runner]
token = "<printed-token>"
labels = ["self-hosted", "linux", "ubuntu-latest"]
capacity = 1
poll_interval = "5s"
workspace_root = "/var/lib/shithubd-runner/workspaces"
workspace_ttl = "24h"
network_allowlist = [
  "api.github.com",
  "auth.docker.io",
  "codeload.github.com",
  "github.com",
  "objects.githubusercontent.com",
  "production.cloudflare.docker.com",
  "registry-1.docker.io",
  "*.githubusercontent.com",
]

[engine]
kind = "docker"
default_image = "ghcr.io/shithub/runner-nix:1.0"
network = "shithub-actions"
memory = "2g"
cpus = "2"
seccomp_profile = "/etc/shithubd-runner/seccomp.json"
user = "65534:65534"
pids_limit = 512
dns_servers = ["172.30.0.1"]

The config path defaults to `/etc/shithubd-runner/config.toml`. Environment variables use the `SHITHUB_RUNNER_` prefix, for example `SHITHUB_RUNNER_TOKEN` or `SHITHUB_RUNNER_SERVER__BASE_URL`.
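
The `SERVER__BASE_URL` example suggests that a nested key is upper-cased and its section separator becomes a double underscore. The sketch below encodes that guess in a hypothetical helper, `env_name_for`. The mapping rule is an assumption inferred from the two examples, not a documented contract. In particular, `SHITHUB_RUNNER_TOKEN` carries no section component, so confirm the exact names against the runner itself before relying on them.

```sh
# Hypothetical helper: derive an environment variable name from a dotted
# config key, assuming "." becomes "__" and letters are upper-cased.
env_name_for() {
  printf 'SHITHUB_RUNNER_%s\n' \
    "$(printf '%s' "$1" | sed 's/\./__/g' | tr '[:lower:]' '[:upper:]')"
}

# Reproduces the documented example for [server] base_url.
env_name_for server.base_url
```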

The Ansible runner role creates the `shithub-actions` bridge, runs the allowlist resolver at `172.30.0.1`, and installs firewall rules that reject direct-IP egress from step containers. If you run the binary without the role, provision equivalent network controls before pointing workflows at the runner.
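
When provisioning those controls by hand, it helps to reason about which names an allowlist such as `network_allowlist` would admit. The sketch below treats each entry as a shell glob. `host_allowed` is a hypothetical helper and the glob semantics are an assumption; the real resolver may match wildcards differently.

```sh
# Hypothetical helper: succeed if host $1 matches any allowlist entry in
# the remaining arguments. Entries are treated as shell glob patterns,
# so "*.githubusercontent.com" matches subdomains but not the bare domain.
host_allowed() {
  host=$1
  shift
  for pattern in "$@"; do
    # $pattern is deliberately unquoted so "*" acts as a wildcard.
    case "$host" in
      $pattern) return 0 ;;
    esac
  done
  return 1
}

host_allowed raw.githubusercontent.com "github.com" "*.githubusercontent.com" \
  && echo "allowed"
host_allowed example.org "github.com" "*.githubusercontent.com" \
  || echo "blocked"
```

Name-based matching alone is not enough. The firewall rules that reject direct-IP egress are what stop a step from bypassing the resolver.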

## Curl token smoke

Claim a job:

curl -fsS "$BASE/api/v1/runners/heartbeat" \
  -H "Authorization: Bearer $RUNNER_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"labels":["self-hosted","linux","ubuntu-latest"],"capacity":1}' \
  | tee /tmp/shithub-claim.json

Extract the job token and id:

export JOB_ID="$(jq -r '.job.id' /tmp/shithub-claim.json)"
export JOB_TOKEN="$(jq -r '.token' /tmp/shithub-claim.json)"
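
`jq -r` prints the literal string `null` for a missing field, so a heartbeat that claimed nothing would leave `JOB_ID` set to `null` and the later calls would target `/jobs/null/...`. A small guard can stop the smoke test early. `require_claim` is a hypothetical helper, and the idea that an idle heartbeat omits `.job` and `.token` is an assumption about the response shape.

```sh
# Hypothetical helper: fail unless both the job id ($1) and the job
# token ($2) look populated. jq -r yields "null" for absent fields.
require_claim() {
  for value in "$1" "$2"; do
    case "$value" in
      ""|null)
        echo "no job claimed; is a run queued?" >&2
        return 1
        ;;
    esac
  done
  return 0
}

# Usage after the exports above:
#   require_claim "$JOB_ID" "$JOB_TOKEN" || exit 1
```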

Append a log chunk:

curl -fsS "$BASE/api/v1/jobs/$JOB_ID/logs" \
  -H "Authorization: Bearer $JOB_TOKEN" \
  -H "Content-Type: application/json" \
  -d "{\"seq\":0,\"chunk\":\"$(printf 'hello from curl\n' | base64)\"}" \
  | tee /tmp/shithub-log.json

export JOB_TOKEN="$(jq -r '.next_token' /tmp/shithub-log.json)"
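
For longer chunks, note that GNU `base64` wraps its output at 76 columns, and the embedded newlines would break the JSON string. The sketch below builds the same `seq`/`chunk` body as the request above and strips any wrapping. `log_payload` is a hypothetical helper name, not part of the runner tooling.

```sh
# Hypothetical helper: print the JSON body for a log append.
# $1 = sequence number, $2 = raw chunk text.
log_payload() {
  # tr -d '\n' removes the line wrapping GNU base64 adds after 76 columns.
  chunk=$(printf '%s' "$2" | base64 | tr -d '\n')
  printf '{"seq":%s,"chunk":"%s"}\n' "$1" "$chunk"
}

log_payload 0 'hello from curl'
```

The result can be passed to curl as `-d "$(log_payload 0 'hello from curl')"`. Each call still consumes the current job token, so re-export `JOB_TOKEN` from `next_token` after every request.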

Complete the job:

curl -fsS "$BASE/api/v1/jobs/$JOB_ID/status" \
  -H "Authorization: Bearer $JOB_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"status":"completed","conclusion":"success"}'

Replay check: the `next_token` issued by the log call was already consumed by the completion call, so presenting it again must fail with 401 because its `jti` is already present in `runner_jwt_used`.

curl -i "$BASE/api/v1/jobs/$JOB_ID/status" \
  -H "Authorization: Bearer $(jq -r '.next_token' /tmp/shithub-log.json)" \
  -H "Content-Type: application/json" \
  -d '{"status":"running"}'
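
To turn the replay check into a pass/fail step rather than reading headers by eye, the status code can be captured and classified. `replay_verdict` is a hypothetical helper; the only behaviour taken from this runbook is that a replayed token must yield 401.

```sh
# Hypothetical helper: classify the HTTP status returned by the replay.
replay_verdict() {
  case "$1" in
    401) echo "ok: replay rejected" ;;
    2??) echo "FAIL: replayed token was accepted"; return 1 ;;
    *)   echo "unexpected status $1"; return 1 ;;
  esac
}

# Usage against a live server (same request as above, status code only):
#   code=$(curl -s -o /dev/null -w '%{http_code}' \
#     "$BASE/api/v1/jobs/$JOB_ID/status" \
#     -H "Authorization: Bearer $(jq -r '.next_token' /tmp/shithub-log.json)" \
#     -H "Content-Type: application/json" \
#     -d '{"status":"running"}')
#   replay_verdict "$code"
```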

Expected results:

- The `workflow_jobs` row has `status = completed` and `conclusion = success`.
- The parent `workflow_runs` row rolls up to completed/success when all jobs are terminal.
- The PR Checks tab shows the matching check run as success.
- `/metrics` includes runner registration, heartbeat, and JWT counters.