- gofumpt fixes across the trigger package + dispatch handler
- drop the stale 'startedAtNow' placeholder var in enqueue.go that
the lint-unused script flagged as a dead 'silence unused import'
shim (it was originally a hint for S41c+, but never used)
- scripts/lint-unused.sh: ${ALLOWED_FILES[@]:-} so an empty
array doesn't trip set -u under macOS bash 3.2
Documents the three-layer flow (caller → worker → enqueue), the
trigger_event_id idempotency convention with the per-caller
construction table, the per-event-kind match semantics + glob
subset, the conservative collaborator gate decision, and the
workflow_dispatch HTTP surface. Calls out the S41b/S41b-2 split
and what's deliberately out of scope until S41c+.
Adds two metrics per the S41 campaign's observability commitments:
- shithub_actions_runs_enqueued_total{event,result}
Incremented in trigger.Enqueue. Result is 'fresh' for newly
inserted runs or 'already_exists' when ON CONFLICT DO NOTHING
fired (worker retry / admin replay landing on the same
trigger_event_id). Lets dashboards show dedup effectiveness.
- shithub_actions_trigger_match_duration_seconds (histogram)
Wall-clock time the trigger handler spends discovering +
parsing + matching workflows for one triggering event. This is the
'HEAD scan cost' pitfall from the sprint spec; alert when p95
drifts up.
Both register against the shared metrics.Registry. Instrumentation
sites: enqueue.Enqueue (counter) and handler.Handler (histogram via
deferred Observe so the timer covers the entire match loop).
The workflow_dispatch HTTP surface from the S41b spec. Synchronous
trigger.Enqueue (no discovery needed since the file is named in the
URL); 204 No Content on success.
Auth: requires repo write (policy.ActionRepoWrite). Body:
{ 'ref': 'refs/heads/main' (optional, defaults to default branch),
'inputs': {key: value ...} }
Validation (defense-in-depth on the URL parameter):
- file must start with .shithub/workflows/ (or be normalized to it)
- no path traversal (.. anywhere)
- must end in .yml or .yaml
- basename only (no nested subdirs in the file param)
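A minimal sketch of that validation, under the rules listed above; validateWorkflowFile is a hypothetical helper name and the real handler's normalization details may differ.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// validateWorkflowFile applies the defense-in-depth rules: reject any
// traversal, normalize to .shithub/workflows/, require a .yml/.yaml
// basename with no nested subdirectories.
func validateWorkflowFile(file string) (string, error) {
	const prefix = ".shithub/workflows/"
	if strings.Contains(file, "..") {
		return "", fmt.Errorf("path traversal rejected: %q", file)
	}
	name := strings.TrimPrefix(file, prefix)
	if name != path.Base(name) {
		return "", fmt.Errorf("nested subdirs not allowed: %q", file)
	}
	if !strings.HasSuffix(name, ".yml") && !strings.HasSuffix(name, ".yaml") {
		return "", fmt.Errorf("not a workflow file: %q", file)
	}
	return prefix + name, nil
}

func main() {
	got, err := validateWorkflowFile("ci.yml")
	fmt.Println(got, err) // .shithub/workflows/ci.yml <nil>
}
```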
Body capped at 64 KiB. Each dispatch click produces a fresh
trigger_event_id ('dispatch:<file>:<sha>:<8-byte-hex>') so the same
workflow at the same SHA can be dispatched multiple times by a human
and each fires distinct runs.
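The per-click ID construction might look like this; dispatchEventID is a hypothetical helper name, but the 'dispatch:<file>:<sha>:<8-byte-hex>' shape is the one described above, with the random suffix guaranteeing each click a fresh key.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// dispatchEventID builds a fresh trigger_event_id per dispatch click.
// The 8 random bytes mean the same workflow at the same SHA can be
// dispatched repeatedly and each run gets a distinct dedup key.
func dispatchEventID(file, sha string) string {
	var b [8]byte
	if _, err := rand.Read(b[:]); err != nil {
		panic(err) // crypto/rand failure is unrecoverable here
	}
	return fmt.Sprintf("dispatch:%s:%s:%s", file, sha, hex.EncodeToString(b[:]))
}

func main() {
	fmt.Println(dispatchEventID("ci.yml", "deadbeef"))
}
```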
Wired through the new RepoActionsAPIMounter field on Deps. Mounted in
a RequireUser group (matches the per-route policy.Can re-check inside
the handler — RequireUser doesn't replace the write check, just
prevents anonymous calls from reaching the policy gate).
UI button is disabled until S41f wires it; this PR exposes the
endpoint so curl + future UI both have a target.
PR-create-side trigger fan-out. Lives in the pulls orchestrator (not
a domain_event watcher) so the open path stays self-contained.
- trigger_event_id = 'pr_opened:<pr_id>:<head_sha>' — distinct from
the synchronize key so the same SHA can produce both an 'opened'
run AND a 'synchronize' run if the workflow author listens for both
- same collaborator gate as the synchronize site (actor must be the
repo's owning user)
- changed_paths derived from base..head; best-effort on diff failure
Both PR sites use the same pattern: build canonical event payload via
internal/actions/event, populate filter hints (action, base_ref,
head_ref_short, changed_paths) on the worker payload, enqueue.
Hooks the trigger pipeline into PR head-movement events. After
pulls.Synchronize finishes (commits + files refresh), enqueue a
workflow:trigger with:
- trigger_event_id = 'pr_synchronize:<pr_id>:<head_sha>' — stable
across worker retries; rebases on the same head no-op
- canonical event payload via internal/actions/event.PullRequest
- action='synchronize'
- changed_paths from the base..head diff
- actor = issue.author_user_id
Collaborator gate: actor must be the repo's owning user. External
contributors + non-owner org members don't trigger in v1 — explicit
collaborator support is parked behind a TODO requiring a richer
policy lookup that the worker context doesn't easily reach today.
Best-effort: failures here log and let the rest of the synchronize
chain (mergeability, notify) complete.
Hooks the actions trigger pipeline into the existing push pipeline.
After the push event is processed (size recalc, webhook stash, PR
sync, stale checks), enqueue a workflow:trigger job with:
- trigger_event_id = 'push:<push_event_id>' — stable, dedups retries
- canonical event payload via internal/actions/event.Push
- branch + tag classification from event.Ref
- changed paths from gitops.ChangedPaths(before, after) so paths:
filters in workflows actually trigger
- actor = pusher_user_id
Skipped when:
- event.Ref isn't refs/heads/* (notes / non-tracked refs)
- after-sha is the zero SHA (branch delete — no commit to scan)
Best-effort enqueue: a failure to schedule the trigger logs at warn
and lets the rest of the push pipeline complete. The push is still
marked processed so it isn't retried indefinitely; the
trigger handler will run on the *next* successful push if the
workflow file shape didn't change.
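The two skip rules reduce to a small predicate; shouldTriggerPush and zeroSHA are hypothetical names for illustration, following only the rules stated above.

```go
package main

import (
	"fmt"
	"strings"
)

// zeroSHA is git's all-zero object id, seen on branch deletes.
const zeroSHA = "0000000000000000000000000000000000000000"

// shouldTriggerPush applies the skip rules: only refs/heads/* refs
// participate, and a zero after-sha (branch delete) has no commit to scan.
func shouldTriggerPush(ref, afterSHA string) bool {
	if !strings.HasPrefix(ref, "refs/heads/") {
		return false // notes / non-tracked refs
	}
	if afterSHA == zeroSHA {
		return false // branch delete
	}
	return true
}

func main() {
	fmt.Println(shouldTriggerPush("refs/heads/main", "abc123")) // true
}
```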
Computes repo-relative paths touched by a push range (before, after].
Used by the actions trigger pipeline to evaluate on.push.paths and
on.pull_request.paths filters.
Special cases:
- before is the all-zero SHA (new-branch creation): list every path
at after instead. Matches GHA's 'new branch surfaces all files as
changed' semantic — workflows with paths: filters still trigger.
- before exists but git can't find it (pruned): same fallback to
'list all paths at after' so we never silently skip a workflow
that should have triggered.
Uses -z + NUL split to handle paths with embedded newlines or
unusual chars correctly. De-duplicates via a small map (rename pairs
emit both old + new in --name-only).
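The NUL-split plus dedup step can be sketched like so; parseNameOnlyZ is a hypothetical helper, and the real code would feed it the raw output of `git diff --name-only -z`.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// parseNameOnlyZ splits `git diff --name-only -z` output on NUL so paths
// with embedded newlines survive intact, and de-duplicates via a map
// because rename pairs emit both the old and the new name.
func parseNameOnlyZ(out []byte) []string {
	seen := make(map[string]struct{})
	var paths []string
	for _, p := range strings.Split(string(out), "\x00") {
		if p == "" {
			continue // the trailing NUL yields one empty element
		}
		if _, dup := seen[p]; dup {
			continue
		}
		seen[p] = struct{}{}
		paths = append(paths, p)
	}
	sort.Strings(paths)
	return paths
}

func main() {
	out := []byte("a.go\x00weird\nname.txt\x00a.go\x00")
	fmt.Println(len(parseNameOnlyZ(out))) // 2
}
```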
Wires trigger.Handler into the worker pool. Lives at the tail of the
Register block alongside the other subsystems; the comment notes the
queued-forever state until S41c+ adds the runner.
The worker-pool surface for the trigger pipeline. Subsystem-local
constant (KindWorkflowTrigger = 'workflow:trigger') mirrors the
webhook subsystem's pattern of owning its kinds alongside the
handler — keeps the central worker/types.go from collecting every
kind across every subsystem.
JobPayload carries:
- identity: repo_id, head_sha, head_ref, trigger_event_id,
actor_user_id, event_kind
- canonical event payload (already shaped by the caller via
internal/actions/event)
- filter hints (branch, tag, action, base_ref, head_ref_short,
changed_paths, cron) extracted from the event payload by the
caller and passed explicitly so the handler doesn't need to
know the canonical schema's internal layout
Handler flow:
1. Load repo + resolve owner login → on-disk gitDir via RepoFS.
2. Discover .shithub/workflows/*.yml at head_sha.
3. Per file: parse, run Match against an Event built from the
filter hints, and Enqueue on a hit. Per-file errors log + skip
so one bad workflow doesn't block its peers on the same commit.
4. Discovery skips (oversized, IO error) log via deps.Logger.
Errors during discovery or repo-lookup return non-nil so the worker
retries. Per-file enqueue failures continue to the next file —
already-existing runs are AlreadyExists from Enqueue, logged but
treated as success.
- TestEnqueue_HappyPath: fresh run lands with jobs+steps+check_run.
- TestEnqueue_IdempotentSecondCall: same trigger_event_id replayed
returns AlreadyExists, same RunID, single workflow_runs row.
- TestEnqueue_DifferentTriggerEventIDsDoNotCollide: rerun with a
different trigger_event_id + parent_run_id produces a new run.
- TestEnqueue_EmptyTriggerEventIDIsRejected: validation catches the
silent-bypass-idempotency footgun.
- TestEnqueue_RunIndexIsPerRepoMonotonic: r2.RunIndex == r1.RunIndex + 1.
- TestEnqueue_ChildRowsExist: the per-tx run+jobs+steps insertion
lands all three layers atomically (no orphan runs).
- TestEnqueue_ConflictDetectsExistingRun: pgx.ErrNoRows from the
ON CONFLICT path is correctly translated to AlreadyExists rather
than bubbling out as an error.
The core of the trigger pipeline. One transaction inserts:
- workflow_runs (via EnqueueWorkflowRun, ON CONFLICT DO NOTHING
against the partial unique index from migration 0051)
- workflow_jobs per parsed job
- workflow_steps per parsed step
Outside the tx, after commit:
- check_runs per job, ExternalID = 'workflow_run:<run_id>:job:<key>'
so a retry of just this phase converges cleanly via checks.Create's
idempotency lookup.
Conflict path: when ON CONFLICT fires (worker retry, admin replay of
the same triggering event), the inner tx commits empty, the function
looks up the existing workflow_run by its trigger_event_id, and
returns Result{AlreadyExists: true}. The handler treats that as a
successful no-op — child rows from the prior successful call are
still in place because they were inserted in the same tx as the run.
Atomicity boundary: the run + its jobs + its steps land together or
not at all. check_runs lag is acceptable — the run is queued and
visible, no runner picks it up yet (S41c+), and ExternalID makes
reconciliation safe.
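The conflict path reduces to this control-flow shape; the errNoRows sentinel stands in for pgx.ErrNoRows, and both query closures are hypothetical stand-ins for the sqlc-generated insert and lookup.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoRows stands in for pgx.ErrNoRows so the sketch stays stdlib-only.
var errNoRows = errors.New("no rows in result set")

type Result struct {
	RunID         int64
	AlreadyExists bool
}

// enqueueRun: when the ON CONFLICT DO NOTHING insert returns no row,
// look up the existing run by its trigger_event_id and report
// AlreadyExists rather than bubbling the error; any other failure
// returns non-nil so the worker retries.
func enqueueRun(insert, lookup func() (int64, error)) (Result, error) {
	id, err := insert()
	if err == nil {
		return Result{RunID: id}, nil
	}
	if !errors.Is(err, errNoRows) {
		return Result{}, err // real failure: retry
	}
	existing, err := lookup()
	if err != nil {
		return Result{}, err
	}
	return Result{RunID: existing, AlreadyExists: true}, nil
}

func main() {
	r, _ := enqueueRun(
		func() (int64, error) { return 0, errNoRows },
		func() (int64, error) { return 42, nil },
	)
	fmt.Println(r.RunID, r.AlreadyExists) // 42 true
}
```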
Validates required params (RepoID, WorkflowFile, HeadSHA,
TriggerEventID, Workflow with at least one job, supported event kind).
Empty TriggerEventID is explicitly rejected — would silently bypass
idempotency (the partial unique index excludes empty).
Two query additions/tweaks for the Enqueue function:
- NextRunIndexForRepo now casts (MAX+1)::bigint so sqlc emits int64
matching the run_index column type. Without the cast it returned
int32 because the +1 integer literal promoted the expression to int4.
- LookupWorkflowRunByTriggerEvent: companion query for the conflict
path. When EnqueueWorkflowRun's ON CONFLICT DO NOTHING returns no
rows, the trigger handler uses this to find the existing row and
surface a stable RunID.
Same shared-schema-dir pattern as prior rounds — every package's
locally-generated WorkflowRun struct picks up the new field.
Captures the regeneration so 'sqlc generate' is idempotent.
Two queries on workflow_runs now:
- InsertWorkflowRun (existing): always inserts, errors on conflict.
Kept stable for tests/admin tooling that wants the force-insert.
- EnqueueWorkflowRun (new): trigger-pipeline path, returns
pgx.ErrNoRows on conflict so the handler treats duplicate
triggering events as a successful no-op.
Both queries return the new trigger_event_id column. ON CONFLICT
predicate matches the partial unique index from migration 0051;
postgres needs both to agree to infer the target.
Idempotency on the *triggering event* — the robust pattern. Each
push, PR transition, dispatch click, or cron tick is a unique
triggering event with a stable identifier constructed by the
enqueueing site:
- push_process → 'push:<push_event_id>'
- PR open/synchronize → 'pr_<action>:<pr_id>:<head_sha>'
- workflow_dispatch → 'dispatch:<workflow_id>:<request_uuid>'
- schedule sweep → 'schedule:<workflow_id>:<window_start_unix>'
Partial UNIQUE on (repo_id, workflow_file, trigger_event_id) WHERE
trigger_event_id <> '' is the dedup key. Trigger handler does
INSERT ... ON CONFLICT DO NOTHING; worker retries + admin replays of
the same triggering event no-op cleanly.
Re-runs (the future Re-run button) explicitly produce a new
trigger_event_id (e.g. 'rerun:<original_run_id>:<request_uuid>') so
they don't collide. parent_run_id chains the new run back to its
ancestor — history preserved.
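The key builders are just stable string formats; a sketch with hypothetical helper names, covering the push, PR, and rerun shapes from the table above.

```go
package main

import "fmt"

// Stability is the point of these builders: a worker retry must rebuild
// the exact same string so ON CONFLICT DO NOTHING dedups it.
func pushKey(pushEventID int64) string {
	return fmt.Sprintf("push:%d", pushEventID)
}

func prKey(action string, prID int64, headSHA string) string {
	return fmt.Sprintf("pr_%s:%d:%s", action, prID, headSHA)
}

// rerunKey deliberately embeds a fresh request UUID so a re-run never
// collides with the original run's trigger_event_id.
func rerunKey(originalRunID int64, requestUUID string) string {
	return fmt.Sprintf("rerun:%d:%s", originalRunID, requestUUID)
}

func main() {
	fmt.Println(pushKey(7))                        // push:7
	fmt.Println(prKey("synchronize", 3, "abc123")) // pr_synchronize:3:abc123
}
```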
DEFAULT '' covers the migration-time backfill (no rows yet); the
partial UNIQUE excludes empty so backfilled rows don't constrain
each other. New code is responsible for always providing a
non-empty trigger_event_id.
Walks .shithub/workflows/ at a given SHA, returns each .yml/.yaml
blob's bytes (capped at workflow.MaxWorkflowFileBytes) plus a
parallel skips list for files that exceeded the cap or had a read
error. The handler logs skips so operators see workflows that have
grown out of bounds without other workflows on the same commit
getting blocked.
Repos without .shithub/workflows/ (the common case for repos that
haven't adopted CI yet) return clean (nil, nil, nil) — typed via
git.ErrNotATree + a stderr-text check for 'not a valid object name'.
Those aren't errors; they're 'no work to do'.
Discovery does NOT parse — separation of concerns. The handler
parses each returned file and logs+skips per-file on parse error or
non-empty Error diagnostics. Keeps discovery work intact when one
workflow file is malformed.
Tests use the existing gitops.InitialCommit fixture builder against
fresh bare repos: the happy path with mixed yaml/non-yaml entries,
the no-CI repo, and an oversized-file split where huge.yml goes to
skips and normal.yml still lands in files.
Match(workflow, event) → bool. No I/O, no DB; cheap to fuzz.
Per-kind semantics:
- push: branch vs tag classified from ev.Branch/ev.Tag, only the
matching filter list applies (branches: filter doesn't accept
tag pushes and vice versa). paths: filter (when set) requires
at least one changed path to match. Empty filter = match-all.
- pull_request: types: defaults to ['opened', 'synchronize',
'reopened'] when omitted (GHA parity). branches: applies to the
BASE ref (the PR's destination). paths: as for push.
- schedule: requires the workflow to declare the cron expression
that fired. The sweep is the source of truth for which cron is
firing; we just gate on declaration. Avoids interpreting cron
semantics in two places.
- workflow_dispatch: matches whenever the workflow declares
on.workflow_dispatch.
Strict-allowlist posture: nil workflow, unknown event kinds, and
events of a kind the workflow doesn't declare all return false
silently — mirrors S41a's parser/evaluator stance.
Tests: 30+ table-driven cases covering branch/tag/path filters,
default + custom PR types, base-ref filter, schedule cron-match,
dispatch declared/undeclared, kind-mismatch + nil/unknown safety.
Pattern matcher for branches:/tags:/paths: filter lists. Translates
each pattern to an anchored regex once (memoized in a sync.Map) so a
workflow's branches: list compiles on first hit and stays warm for
every subsequent candidate string.
Pattern semantics — the GHA-compatible subset that v1 fixtures use:
- literal: main → exact match
- *: any non-/ chars (one path segment)
- **: any chars (zero or more segments + slashes)
- /** at end: optional trailing (feature/** matches feature too)
- **/ at start: optional leading (**/*.go matches main.go)
- !pattern: exclude (last-match-wins; mirrors minimatch)
- empty list: match all
- exclude-only list: implicit-include + exclusions
Tests pin every interesting case including the zero-segment edges
that are easy to get wrong in hand-rolled recursion.
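The pattern-to-regex translation can be sketched as below; the sync.Map memoization and any further edge cases the real matcher handles are elided, and the helper names are illustrative only.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// translate compiles one pattern from the subset above into an anchored regex.
func translate(pattern string) *regexp.Regexp {
	var b strings.Builder
	b.WriteString("^")
	p := pattern
	if strings.HasPrefix(p, "**/") {
		b.WriteString("(.*/)?") // **/ at start: optional leading segments
		p = p[3:]
	}
	optTail := strings.HasSuffix(p, "/**")
	if optTail {
		p = p[:len(p)-3]
	}
	for i := 0; i < len(p); i++ {
		if p[i] == '*' {
			if i+1 < len(p) && p[i+1] == '*' {
				b.WriteString(".*") // **: any chars, slashes included
				i++
			} else {
				b.WriteString("[^/]*") // *: one path segment
			}
		} else {
			b.WriteString(regexp.QuoteMeta(string(p[i])))
		}
	}
	if optTail {
		b.WriteString("(/.*)?") // /** at end: feature/** also matches feature
	}
	b.WriteString("$")
	return regexp.MustCompile(b.String())
}

// matchList applies last-match-wins negation; an exclude-only list gets
// an implicit include, and an empty list matches everything.
func matchList(patterns []string, s string) bool {
	if len(patterns) == 0 {
		return true
	}
	matched := true
	for _, p := range patterns {
		if !strings.HasPrefix(p, "!") {
			matched = false // at least one include: start unmatched
			break
		}
	}
	for _, p := range patterns {
		neg := strings.HasPrefix(p, "!")
		if translate(strings.TrimPrefix(p, "!")).MatchString(s) {
			matched = !neg // last match wins
		}
	}
	return matched
}

func main() {
	fmt.Println(matchList([]string{"feature/**"}, "feature")) // true
	fmt.Println(matchList([]string{"*", "!wip"}, "wip"))      // false
}
```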
Phase 1 of the trigger pipeline. The Event type is the typed input
the (still-to-come) Match function consults — built by each caller
from its triggering source (push_event row, PR transition, cron
sweep, dispatch HTTP body) and passed in.
Four event kinds for v1 (push/pull_request/schedule/workflow_dispatch);
anything else returns false from Match silently — mirrors S41a's
strict-allowlist posture rather than erroring on unrecognized triggers.
Field naming follows GHA convention (Ref/Branch/Tag/Action/BaseRef/
HeadRef/ChangedPaths/Cron). Each field is documented with the kinds
that actually populate it; unrelated fields stay zero.
Pre-L6, an env entry with a dotted key (e.g. {"weird.key": ...}) produced
diagnostic paths like 'env.weird.key' — ambiguous between the
single key 'weird.key' and a nested env.weird.key. Authors reading
admin-actions-parse output had to deduce the boundary.
joinPath helper renders identifier-shaped keys unbracketed
(parent.child) and bracket-quotes anything else
(parent["weird.key"]). Wired into parseEnv where env names are
unconstrained YAML — the worst offender. Other callers can adopt
incrementally.
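A sketch of the joinPath shape; the exact identifier charset and the escaping inside brackets are assumptions, not the real implementation.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// pathIdentRe approximates the identifier shape described above; the
// real charset may differ.
var pathIdentRe = regexp.MustCompile(`^[A-Za-z_][A-Za-z0-9_-]*$`)

// joinPath renders identifier-shaped keys unbracketed (parent.child)
// and bracket-quotes anything else (parent["weird.key"]) so diagnostic
// paths are unambiguous.
func joinPath(parent, child string) string {
	if pathIdentRe.MatchString(child) {
		return parent + "." + child
	}
	// simple escaping for embedded quotes; strconv.Quote-grade escaping elided
	return parent + `["` + strings.ReplaceAll(child, `"`, `\"`) + `"]`
}

func main() {
	fmt.Println(joinPath("env", "GOPATH"))    // env.GOPATH
	fmt.Println(joinPath("env", "weird.key")) // env["weird.key"]
}
```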
Local pathIdentRe rather than reusing the L2 polish-bundle's
canonical identRe so this cleanup PR is independent of #67. Once
both land we can dedupe.
Adds a 'Where the flag lives' subsection so future readers don't
have to deduce the single source of truth from grep. References
the L5 cleanup that dropped the duplicate field.
Pre-L5 workflow.Value had a Tainted bool field plus a Tainted()
constructor — both unused. The parser only ever called V() (which
zeroed Tainted) so the field was always false. Two different
structs in two different packages both named Value claimed to own
the taint contract; the architecture doc has always pointed at
expr.Value.Tainted as load-bearing.
Single source of truth now: workflow.Value carries Raw only;
expr.Value carries the taint flag. Constructor V() preserved with
trimmed signature (no Tainted to zero).
Audit found we relied on go.yaml.in/yaml/v3's built-in defense
without acknowledging it. Reading the source: yaml.v3 has its own
ratio-based check (alias count > 100 AND decode count > 1000 AND
ratio above the size-scaled threshold) but the constants are baked
in and not configurable.
Our 100-alias hard cap fires on documents the ratio defense ignores
(small files where the decode count is too low to trigger). Both
layers are real defense; document the relationship so the next
maintainer knows what 'we own' vs 'the library owns'.