# Actions runner API

The runner-facing HTTP surface lives in
`internal/web/handlers/api/runners.go`. It is mounted under `/api/v1`
in the CSRF-exempt API group, but it does not use PAT auth. Runners
authenticate first with a long-lived registration token and then with
short-lived per-job JWTs.
## Auth model

Operators register a runner with:

```sh
shithubd admin runner register \
  --name runner-1 \
  --labels self-hosted,linux,ubuntu-latest,x64 \
  --capacity 1 \
  --output json
```
The command inserts a `workflow_runners` row, stores only a SHA-256 hash in
`runner_tokens`, and returns the raw 32-byte hex token once.
`--expires-in` is optional and should only be used when the deployment rotates
the runner token before it expires, because the runner uses that same token for
heartbeat authentication.
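The register flow can be sketched as follows; the function name and the choice to hex-encode the stored digest are illustrative, not the server's actual code:

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// newRunnerToken mints 32 random bytes, hands the hex form to the
// operator exactly once, and keeps only the SHA-256 digest for the
// runner_tokens row, so a database leak does not leak usable tokens.
func newRunnerToken() (raw string, storedHash string) {
	buf := make([]byte, 32)
	if _, err := rand.Read(buf); err != nil {
		panic(err)
	}
	raw = hex.EncodeToString(buf) // shown to the operator once
	sum := sha256.Sum256([]byte(raw))
	storedHash = hex.EncodeToString(sum[:]) // the only form persisted
	return raw, storedHash
}

func main() {
	raw, hash := newRunnerToken()
	fmt.Println(len(raw), len(hash)) // 64 64
}
```

Heartbeat auth then hashes the presented bearer token the same way and compares against the stored digest.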
Operators can drain, undrain, rotate, and hard-revoke runners with
`shithubd admin runner drain`, `undrain`, `rotate-token`, and `revoke`.
Drained runners keep heartbeating and may finish already-claimed jobs,
but heartbeat claims return 204 until the runner is undrained. Hard
revocation sets the runner offline, records `revoked_at`, revokes all
registration tokens, and invalidates job API JWTs minted for that
runner. This is the token-compromise boundary: a host with an old config
file cannot claim new jobs or update already-claimed jobs after
revocation lands in Postgres.
`POST /api/v1/runners/heartbeat` accepts:

```http
Authorization: Bearer <registration-token>
```

When a queued job matches the runner labels and capacity is available,
the response includes a job payload and a 15-minute job JWT. That JWT
has claims:

```json
{"sub":"runner:<id>","purpose":"api","job_id":1,"run_id":1,"repo_id":1,"exp":0,"jti":"..."}
```
The signing key is derived from `auth.totp_key_b64` with HKDF label
`actions-runner-jwt-v1`; the raw TOTP/secretbox key is not used
directly for JWT signing.
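A minimal HKDF-SHA256 sketch of that derivation (extract, then a single expand block, which covers a 32-byte key): the label comes from the text, while the all-zero salt and 32-byte output length are assumptions of this sketch.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"fmt"
)

// deriveJWTKey derives a signing key from the raw TOTP/secretbox key
// instead of using it directly, so the two uses cannot collide.
func deriveJWTKey(totpKey []byte) []byte {
	// Extract: PRK = HMAC-SHA256(salt, IKM), with an all-zero salt.
	ext := hmac.New(sha256.New, make([]byte, sha256.Size))
	ext.Write(totpKey)
	prk := ext.Sum(nil)
	// Expand: T(1) = HMAC-SHA256(PRK, info || 0x01) yields 32 bytes,
	// enough for one HMAC signing key.
	exp := hmac.New(sha256.New, prk)
	exp.Write([]byte("actions-runner-jwt-v1"))
	exp.Write([]byte{0x01})
	return exp.Sum(nil)
}

func main() {
	key := deriveJWTKey([]byte("example-totp-secretbox-key-32b!!"))
	fmt.Println(len(key)) // 32
}
```

The derivation is deterministic, so every server process signing or verifying job JWTs arrives at the same key from the shared config secret.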
API-purpose job JWTs are single-use. Every job endpoint verifies the
signature and expiry, checks that the path job belongs to the claimed
runner/run, and then inserts `jti` into `runner_jwt_used`. A replay
returns 401. To support multi-step runner flows, successful in-flight job
endpoints return `next_token` and `next_token_expires_at`.
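The replay gate reduces to first-consume-wins on `jti`. This in-memory sketch stands in for the `runner_jwt_used` insert; the type and method names are illustrative:

```go
package main

import (
	"fmt"
	"sync"
)

// jtiGate mimics runner_jwt_used: the first consume of a jti wins, any
// repeat is a replay (a 401 in the real handler).
type jtiGate struct {
	mu   sync.Mutex
	used map[string]bool
}

func newJTIGate() *jtiGate { return &jtiGate{used: make(map[string]bool)} }

// consume returns false when the jti has already been spent.
func (g *jtiGate) consume(jti string) bool {
	g.mu.Lock()
	defer g.mu.Unlock()
	if g.used[jti] {
		return false
	}
	g.used[jti] = true
	return true
}

func main() {
	g := newJTIGate()
	fmt.Println(g.consume("jti-1")) // true: first use succeeds
	fmt.Println(g.consume("jti-1")) // false: replay is rejected
}
```

In the real server the uniqueness check rides on a database insert, so it also holds across processes; the returned `next_token` carries a fresh `jti` for the runner's next call.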
Consumed JWT rows are retained for 30 days after token expiry, then
pruned by the daily `workflow:cleanup` worker. This keeps the replay
gate's audit trail available for recent jobs without letting the table
grow unbounded.
`shithubd-runner` consumes the same token chain: it claims with the
registration token, marks the job `running` with the first API-purpose job
JWT, then uses each returned `next_token` serially for log chunks,
step-status updates, cancel checks, artifact upload requests, and finally
the terminal job-status update. Reusing any consumed API-purpose job JWT
is a replay and must fail with 401.
The heartbeat claim also returns `job.checkout_url` and
`job.checkout_token` for `actions/checkout@v4`. The checkout token is a
separate JWT with `purpose:"checkout"` and the same runner/job/run/repo
scope. It is intentionally reusable while the job is `running`, because
Git smart HTTP performs multiple Basic-authenticated requests during one
checkout. The git HTTP handler accepts it only for `git-upload-pack`, only
for the claimed repository, and only while the database still shows that
the claimed runner is running the job. It is never accepted for pushes or
runner API endpoints.
## Endpoints

`POST /api/v1/runners/heartbeat`

Request body:

```json
{
  "labels": ["self-hosted", "linux", "ubuntu-latest", "x64"],
  "capacity": 1,
  "host_name": "runner-host-1",
  "version": "v0.1.0"
}
```
Returns 204 when no matching job is claimable. Returns 200 with
`token`, `expires_at`, and `job` when a job is claimed. Capacity is
enforced server-side by counting the runner's current
`workflow_jobs.status = 'running'` rows while holding a row lock on the
runner. Claiming also enforces the effective Actions policy for the
repository: jobs are not dispatchable when the repo has Actions
disabled, the run is pending approval, or a per-repo or per-owner/org
concurrent job cap is hit. Approval simply sets
`workflow_runs.approved_by_user_id`; the next heartbeat can claim the
same queued jobs, so no duplicate run is created.

`host_name` and `version` are optional runner metadata. The server stores
trimmed values up to 255 bytes for pool diagnostics and preserves the
previous values when old runners omit them.
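A plausible reading of "a queued job matches the runner labels" is the usual runs-on subset convention: every label the job requires must be present on the runner. The exact matching semantics in the server may differ; this is a sketch of that rule.

```go
package main

import "fmt"

// labelsMatch reports whether the runner advertises every label the
// job requires (assumed subset semantics, mirroring runs-on).
func labelsMatch(runnerLabels, jobLabels []string) bool {
	have := make(map[string]bool, len(runnerLabels))
	for _, l := range runnerLabels {
		have[l] = true
	}
	for _, l := range jobLabels {
		if !have[l] {
			return false
		}
	}
	return true
}

func main() {
	runner := []string{"self-hosted", "linux", "ubuntu-latest", "x64"}
	fmt.Println(labelsMatch(runner, []string{"self-hosted", "linux"})) // true
	fmt.Println(labelsMatch(runner, []string{"windows"}))              // false
}
```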
The job payload includes `checkout_url`, `checkout_token`, resolved
`secrets`, and `mask_values`; repo secrets shadow org secrets with the
same name. The server also stores an encrypted claim-time copy of the mask
values on `workflow_job_secret_masks` so later log uploads are scrubbed
against the secrets that were actually handed to the runner, even if an
operator rotates or deletes a secret mid-job.
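The shadowing rule is an overlay merge: start from org secrets and let repo secrets overwrite any entry with the same name. A sketch, with illustrative names:

```go
package main

import "fmt"

// resolveSecrets merges org and repo secrets; on a name collision the
// repo-level value wins, per the shadowing rule.
func resolveSecrets(org, repo map[string]string) map[string]string {
	out := make(map[string]string, len(org)+len(repo))
	for k, v := range org {
		out[k] = v
	}
	for k, v := range repo { // repo shadows org
		out[k] = v
	}
	return out
}

func main() {
	org := map[string]string{"TOKEN": "org-value", "REGION": "eu"}
	repo := map[string]string{"TOKEN": "repo-value"}
	fmt.Println(resolveSecrets(org, repo)["TOKEN"]) // repo-value
}
```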
Pull request runs receive no org or repo secrets in v1, even after a
maintainer approves dispatch. This is intentionally stricter than the
approval gate until environments/protected deployment secrets exist.
`POST /api/v1/jobs/{id}/logs`

Auth: job JWT. Body:

```json
{"seq":0,"chunk":"aGVsbG8K","step_id":123}
```
`step_id` is optional for the S41c curl smoke path; when omitted, the
first step in the job receives the chunk. Chunks are base64-decoded,
capped at 512 KiB raw, and appended to `workflow_step_log_chunks`.
Duplicate `(step_id, seq)` inserts are accepted as idempotent retries.
Before appending, the API re-scrubs exact secret values using the job's
claim-time mask snapshot. It also reprocesses any possible secret prefix
carried at the end of the prior chunk, so a runner cannot leak a secret
by splitting it across two log calls.
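One way to implement the cross-chunk scrub is to hold back a carry window of the last `maxSecretLen - 1` bytes of each scrubbed chunk and re-scan it together with the next chunk. This sketch assumes that hold-back design; the server's actual buffering may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// chunkScrubber scrubs secrets from a stream of log chunks, carrying a
// tail window between calls so a secret split across two chunks is
// still caught when the second half arrives.
type chunkScrubber struct {
	secrets []string
	maxLen  int
	carry   string
}

func newChunkScrubber(secrets []string) *chunkScrubber {
	longest := 0
	for _, s := range secrets {
		if len(s) > longest {
			longest = len(s)
		}
	}
	return &chunkScrubber{secrets: secrets, maxLen: longest}
}

// scrub returns the bytes safe to append now; the trailing window that
// could still be a secret prefix is carried into the next call.
func (c *chunkScrubber) scrub(chunk string) string {
	buf := c.carry + chunk
	for _, s := range c.secrets {
		buf = strings.ReplaceAll(buf, s, "***")
	}
	hold := c.maxLen - 1
	if hold < 0 {
		hold = 0
	}
	if len(buf) < hold {
		hold = len(buf)
	}
	c.carry = buf[len(buf)-hold:]
	return buf[:len(buf)-hold]
}

// flush emits whatever is still held back once the stream ends.
func (c *chunkScrubber) flush() string {
	out := c.carry
	c.carry = ""
	return out
}

func main() {
	s := newChunkScrubber([]string{"hunter2"})
	out := s.scrub("token=hun") + s.scrub("ter2 done") + s.flush()
	fmt.Println(out) // token=*** done
}
```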
`POST /api/v1/jobs/{id}/steps/{step_id}/status`

Auth: job JWT. Body:

```json
{"status":"completed","conclusion":"success"}
```
Valid transitions are `queued|running -> running|completed|cancelled|skipped`,
with idempotent repeats of the target terminal state. Completed and
skipped steps require a valid check conclusion; a cancelled step's
conclusion defaults to `cancelled` when omitted. The endpoint always
returns a `next_token` because a completed step is not the end of the job.
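The transition table can be written out as a small validator; a sketch, with an illustrative function name:

```go
package main

import "fmt"

// validStepTransition encodes the rule: queued or running may move to
// running or any terminal state, and repeating the same terminal state
// is an idempotent no-op.
func validStepTransition(from, to string) bool {
	terminal := map[string]bool{"completed": true, "cancelled": true, "skipped": true}
	switch {
	case from == "queued" || from == "running":
		return to == "running" || terminal[to]
	case terminal[from]:
		return from == to // idempotent retry of the same terminal state
	default:
		return false
	}
}

func main() {
	fmt.Println(validStepTransition("queued", "completed"))    // true
	fmt.Println(validStepTransition("completed", "completed")) // true
	fmt.Println(validStepTransition("completed", "running"))   // false
}
```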
When object storage is configured, terminal step updates enqueue
`workflow:finalize_step`. The worker concatenates the step's
`workflow_step_log_chunks` in sequence order, uploads the log to
`actions/runs/<run_id>/jobs/<job_id>/steps/<step_id>.log`, stores that
key and byte count on `workflow_steps`, then deletes the SQL chunks.
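The concatenation step is simply a seq-ordered join; a sketch, with illustrative struct and field names standing in for the row shape:

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// logChunk is a stand-in for a workflow_step_log_chunks row, reduced
// to what assembly needs.
type logChunk struct {
	Seq  int
	Data string
}

// assembleLog sorts by seq and concatenates, the ordering the finalize
// worker uses before uploading the step log object.
func assembleLog(chunks []logChunk) string {
	sort.Slice(chunks, func(i, j int) bool { return chunks[i].Seq < chunks[j].Seq })
	var b strings.Builder
	for _, c := range chunks {
		b.WriteString(c.Data)
	}
	return b.String()
}

func main() {
	chunks := []logChunk{{Seq: 1, Data: "world\n"}, {Seq: 0, Data: "hello "}}
	fmt.Print(assembleLog(chunks)) // hello world
}
```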
The repository Actions UI reads logs from the same two-stage storage
model. While chunks remain in SQL, a step log page concatenates them in
sequence order and renders a static snapshot. After finalization, the
page reads `workflow_steps.log_object_key` from object storage and
offers a short-lived signed download URL. Live tailing is intentionally
separate and lands in the S41f SSE slice.
`POST /api/v1/jobs/{id}/status`

Auth: job JWT. Body:

```json
{"status":"completed","conclusion":"success"}
```
Valid transitions are `queued|running -> running|completed|cancelled`.
Completed jobs require a valid check conclusion. The handler updates
`workflow_jobs`, rolls up `workflow_runs`, and best-effort updates the
matching `check_runs` row created by the trigger pipeline.
`timeout-minutes` is enforced by `shithubd-runner` as a whole-job
deadline. When it expires, the runner kills the active container,
reports the current step as `completed/timed_out`, and reports the job
as `completed/timed_out`. The server treats that conclusion as terminal
failure for the workflow run rollup.
When a runner reports `status:"cancelled"`, any still-open steps in the
job are marked cancelled too. This keeps a killed job from leaving queued
step rows that the UI would otherwise treat as live.
Runner execution supports host-side `actions/checkout@v4` followed by
containerized `run:` steps with per-step log streaming and server-side log
finalization. Artifact upload/download aliases remain reserved until the
artifact transfer path lands.
`POST /api/v1/jobs/{id}/artifacts/upload`

Auth: job JWT. Body:

```json
{"name":"test-results.tgz","size_bytes":12345}
```

Creates a `workflow_artifacts` row and returns a pre-signed S3 PUT URL.
The object key is `actions/runs/<run_id>/artifacts/<name>`.
`POST /api/v1/jobs/{id}/cancel`

Auth: PAT with `repo:write`, and the actor must have write permission on
the repository that owns the job's workflow run. Browser UI forms use
CSRF-protected repo routes that call the same lifecycle orchestrator.

Queued jobs are made terminal immediately:

- `workflow_jobs.status = cancelled`
- `workflow_jobs.conclusion = cancelled`
- `workflow_jobs.cancel_requested = true`
- open steps for that job are marked cancelled
Running jobs keep `status = running` and get
`cancel_requested = true`. The runner sees this through
`cancel-check`, kills the active container, then reports terminal
`cancelled`.
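The two cancel paths above can be sketched in one function; the struct and names are illustrative stand-ins for the `workflow_jobs` row, not the orchestrator's code:

```go
package main

import "fmt"

// job is a minimal stand-in for a workflow_jobs row.
type job struct {
	Status          string
	Conclusion      string
	CancelRequested bool
}

// requestCancel applies the split: a queued job goes terminal
// immediately, a running job is only flagged and the runner completes
// the kill through cancel-check.
func requestCancel(j *job) {
	j.CancelRequested = true
	if j.Status == "queued" {
		j.Status = "cancelled"
		j.Conclusion = "cancelled"
		// the real handler also marks the job's open steps cancelled
	}
	// "running" keeps its status until the runner reports terminal state
}

func main() {
	q := &job{Status: "queued"}
	r := &job{Status: "running"}
	requestCancel(q)
	requestCancel(r)
	fmt.Println(q.Status, r.Status, r.CancelRequested) // cancelled running true
}
```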
`POST /api/v1/runs/{id}/rerun`

Auth: PAT with `repo:write`, and the actor must have write permission on
the repository that owns the workflow run. Browser UI forms use
CSRF-protected repo routes for the same operation.

Only terminal workflow runs are rerunnable. A re-run reads the original
workflow file from the source run's `head_sha`, not from the current
branch tip, then enqueues a new `workflow_runs` row with:

- the same `repo_id`, `workflow_file`, `head_sha`, `head_ref`, event,
  and event payload
- `actor_user_id` set to the user requesting the re-run
- `parent_run_id` set to the source run
- a fresh `trigger_event_id` in the `rerun:<source_run_id>:<random>`
  namespace
`POST /api/v1/jobs/{id}/cancel-check`

Auth: job JWT. Returns:

```json
{"cancelled":false,"next_token":"..."}
```

The boolean mirrors `workflow_jobs.cancel_requested`. `shithubd-runner`
polls this endpoint during job execution, serializing it through the
same single-use JWT chain as logs and status updates. On `cancelled:
true`, the runner's Docker engine runs `docker kill <active-container>`
and the runner posts terminal job status `cancelled`.
## Metrics

- `shithub_actions_runner_registrations_total`
- `shithub_actions_runner_heartbeats_total{result="claimed|no_job"}`
- `shithub_actions_runner_jwt_total{result="issued|rejected|replay"}`
- `shithub_actions_queue_depth{resource="runs|jobs"}`
- `shithub_actions_active{resource="runs|jobs"}`
- `shithub_actions_runner_heartbeat_age_seconds{runner,status}`
- `shithub_actions_runner_capacity{runner,status}`
- `shithub_actions_runs_completed_total{event,conclusion}`
- `shithub_actions_run_duration_seconds{event,conclusion}`
- `shithub_actions_steps_completed_total{step_type,conclusion}`
- `shithub_actions_jobs_cancelled_total{reason="user|concurrency|timeout"}`
- `shithub_actions_concurrency_queued_total`
- `shithub_actions_log_scrub_replacements_total{location="server"}`
- `shithub_actions_log_chunks_total{location="server"}`
- `shithub_actions_log_chunk_bytes_total{location="server"}`
- `shithub_actions_runs_pruned_total{kind="chunks|blobs|runs|jwt_used"}`
- `shithub_actions_step_timeouts_total`
- `shithub_actions_storage_objects{kind="artifacts|step_logs|hot_log_chunks"}`
- `shithub_actions_storage_bytes{kind="artifacts|step_logs|hot_log_chunks"}`