# Actions runner API
The runner-facing HTTP surface lives in
`internal/web/handlers/api/runners.go`. It is mounted under `/api/v1`
in the CSRF-exempt API group, but it does not use PAT auth. Runners
authenticate first with a long-lived registration token and then with
short-lived per-job JWTs.

## Auth model
Operators register a runner with:

```sh
shithubd admin runner register --name runner-1 --labels self-hosted,linux,ubuntu-latest
```

The command inserts `workflow_runners`, stores only a SHA-256 hash in
`runner_tokens`, and prints the 32-byte hex token once.

`POST /api/v1/runners/heartbeat` accepts:

```http
Authorization: Bearer <registration-token>
```

When a queued job matches the runner labels and capacity is available,
the response includes a job payload and a 15-minute job JWT. That JWT
has claims:

```json
{"sub":"runner:<id>","job_id":1,"run_id":1,"repo_id":1,"exp":0,"jti":"..."}
```

The signing key is derived from `auth.totp_key_b64` with HKDF label
`actions-runner-jwt-v1`; the raw TOTP/secretbox key is not used
directly for JWT signing.

Job JWTs are single-use. Every job endpoint verifies the signature and
expiry, checks that the path job belongs to the claimed runner/run, and
then inserts `jti` into `runner_jwt_used`. A replay returns 401. To
support multi-step runner flows, successful in-flight job endpoints
return `next_token` and `next_token_expires_at`.

Consumed JWT rows are retained for 30 days after token expiry, then
pruned by the daily `workflow:cleanup` worker. This keeps the replay
gate audit trail available for recent jobs without letting the table
grow unbounded.

`shithubd-runner` consumes the same token chain: it claims with the
registration token, marks the job `running` with the first job JWT, then
uses each returned `next_token` serially for log chunks, step-status
updates, cancel checks, artifact upload requests, and finally the
terminal job-status update. Reusing any consumed job JWT is a replay and
must fail with 401.

## Endpoints

`POST /api/v1/runners/heartbeat`

Request body:

```json
{"labels":["ubuntu-latest","linux"],"capacity":1}
```

Returns 204 when no matching job is claimable. Returns 200 with
`token`, `expires_at`, and `job` when a job is claimed. Capacity is
enforced server-side by counting current `workflow_jobs.status =
'running'` rows for the runner while holding a row lock on the runner.
The job payload includes resolved `secrets` and `mask_values`; repo
secrets shadow org secrets with the same name. The server also stores
an encrypted claim-time copy of the mask values on
`workflow_job_secret_masks` so later log uploads are scrubbed against
the secrets that were actually handed to the runner, even if an
operator rotates or deletes a secret mid-job.
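
A claimed heartbeat response could look like the following. The
top-level `token`, `expires_at`, and `job` fields and the `secrets` and
`mask_values` names come from the text above; the exact shape of `job`,
including the `id` field and the timestamp format, is an assumption:

```json
{
  "token": "<job-jwt>",
  "expires_at": "2025-01-01T00:15:00Z",
  "job": {
    "id": 1,
    "secrets": {"DEPLOY_KEY": "..."},
    "mask_values": ["..."]
  }
}
```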

`POST /api/v1/jobs/{id}/logs`

Auth: job JWT. Body:

```json
{"seq":0,"chunk":"aGVsbG8K","step_id":123}
```

`step_id` is optional for the S41c curl smoke path; when omitted the
first step in the job receives the chunk. Chunks are base64-decoded,
capped at 512 KiB raw, and appended to `workflow_step_log_chunks`.
Duplicate `(step_id, seq)` inserts are accepted as idempotent retries.
Before append, the API re-scrubs exact secret values from the job's
claim-time mask snapshot. It also reprocesses any possible secret prefix
carried at the end of the prior chunk, so a runner cannot leak a secret
by splitting it across two log calls.

`POST /api/v1/jobs/{id}/steps/{step_id}/status`

Auth: job JWT. Body:

```json
{"status":"completed","conclusion":"success"}
```

Valid transitions are `queued|running -> running|completed|cancelled|skipped`,
with idempotent repeats of the target terminal state. Completed and
skipped steps require a valid check conclusion; for cancelled steps the
conclusion defaults to `cancelled` when omitted. The endpoint always
returns a `next_token`, because a completed step is not the end of the
job.

When object storage is configured, terminal step updates enqueue
`workflow:finalize_step`. The worker concatenates
`workflow_step_log_chunks` in sequence order, uploads the log to
`actions/runs/<run_id>/jobs/<job_id>/steps/<step_id>.log`, stores that
key and byte count on `workflow_steps`, then deletes the SQL chunks.

The repository Actions UI reads logs from the same two-stage storage
model. While chunks remain in SQL, a step log page concatenates them in
sequence order and renders a static snapshot. After finalization, the
page fetches the object named by `workflow_steps.log_object_key` from
object storage and offers a short-lived signed download URL. Live
tailing is intentionally separate and lands in the S41f SSE slice.

`POST /api/v1/jobs/{id}/status`

Auth: job JWT. Body:

```json
{"status":"completed","conclusion":"success"}
```

Valid transitions are `queued|running -> running|completed|cancelled`.
Completed jobs require a valid check conclusion. The handler updates
`workflow_jobs`, rolls up `workflow_runs`, and best-effort updates the
matching `check_runs` row created by the trigger pipeline.

`timeout-minutes` is enforced by `shithubd-runner` as a whole-job
deadline. When it expires, the runner kills the active container,
reports the current step as `completed/timed_out`, and reports the job
as `completed/timed_out`. The server treats that conclusion as terminal
failure for the workflow run rollup.

When a runner reports `status:"cancelled"`, any still-open steps in the
job are marked cancelled too. This keeps a killed job from leaving
queued step rows that the UI would otherwise treat as live.

S41d PR2 runner execution supports containerized `run:` steps with
per-step log streaming and server-side log finalization. `uses:` aliases
such as `actions/checkout@v4` and artifact upload/download remain
reserved for the later S41d slices that add checkout metadata and
artifact transfer.

`POST /api/v1/jobs/{id}/artifacts/upload`

Auth: job JWT. Body:

```json
{"name":"test-results.tgz","size_bytes":12345}
```

Creates a `workflow_artifacts` row and returns a pre-signed S3 PUT URL.
The object key is `actions/runs/<run_id>/artifacts/<name>`.
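
A response could look like the following. The text above only specifies
a pre-signed PUT URL and (for in-flight job endpoints generally) a
`next_token`; the `upload_url` field name, the host, and the query
string are assumptions:

```json
{
  "upload_url": "https://s3.example.com/actions/runs/1/artifacts/test-results.tgz?X-Amz-Signature=...",
  "next_token": "..."
}
```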

`POST /api/v1/jobs/{id}/cancel`


Auth: PAT with `repo:write`, and the actor must have write permission on
the repository that owns the job's workflow run. Browser UI forms use
CSRF-protected repo routes that call the same lifecycle orchestrator.

Queued jobs are made terminal immediately:
- `workflow_jobs.status = cancelled`
- `workflow_jobs.conclusion = cancelled`
- `workflow_jobs.cancel_requested = true`
- open steps for that job are marked cancelled

Running jobs keep `status = running` and get
`cancel_requested = true`. The runner sees this through
`cancel-check`, kills the active container, then reports terminal
`cancelled`.

`POST /api/v1/runs/{id}/rerun`


Auth: PAT with `repo:write`, and the actor must have write permission on
the repository that owns the workflow run. Browser UI forms use
CSRF-protected repo routes for the same operation.

Only terminal workflow runs are rerunnable. A re-run reads the original
workflow file from the source run's `head_sha`, not from the current
branch tip, then enqueues a new `workflow_runs` row with:

- the same `repo_id`, `workflow_file`, `head_sha`, `head_ref`, event,
  and event payload
- `actor_user_id` set to the user requesting the re-run
- `parent_run_id` set to the source run
- a fresh `trigger_event_id` in the `rerun:<source_run_id>:<random>`
  namespace

`POST /api/v1/jobs/{id}/cancel-check`

Auth: job JWT. Returns:

```json
{"cancelled":false,"next_token":"..."}
```

The boolean mirrors `workflow_jobs.cancel_requested`. `shithubd-runner`
polls this endpoint during job execution, serializing it through the
same single-use JWT chain as logs and status updates. On
`cancelled: true`, the runner issues `docker kill <active-container>`
and posts terminal job status `cancelled`.

## Metrics

- `shithub_actions_runner_registrations_total`
- `shithub_actions_runner_heartbeats_total{result="claimed|no_job"}`
- `shithub_actions_runner_jwt_total{result="issued|rejected|replay"}`
- `shithub_actions_jobs_cancelled_total{reason="user|concurrency|timeout"}`
- `shithub_actions_concurrency_queued_total`
- `shithub_actions_log_scrub_replacements_total{location="server"}`
- `shithub_actions_runs_pruned_total{kind="chunks|blobs|runs|jwt_used"}`
- `shithub_actions_step_timeouts_total`