# Forks (S27)

S27 ships fork creation, fork sync (fast-forward only), ahead/behind
stats, and the schema columns + triggers that maintain
`repos.fork_count`. Cross-fork PRs and the S16 hard-delete-cascade
amendment for repacking forks are scoped here too; the cross-fork PR
deferral pointer and the S16 amendment each have their own sub-section.

## Schema

`repos` gained two columns in 0029:

* `fork_count bigint NOT NULL DEFAULT 0`: maintained by the
  `forks_count_inc` / `forks_count_dec` AFTER triggers on `repos`
  insert/delete. Decrement uses `GREATEST(... - 1, 0)` so a
  hand-written DB tweak that bypasses the trigger can't push the count
  into negatives.
* `init_status repo_init_status NOT NULL DEFAULT 'initialized'`: enum
  `('initialized', 'init_pending', 'init_failed')`. Synchronous repo
  creates (the S11 path) write `'initialized'` directly. Forks start at
  `'init_pending'`; the worker job flips to `'initialized'` on success
  or `'init_failed'` on permanent failure.

`fork_of_repo_id` was already present from S11 (the only column the S11
status block actually shipped; `is_fork` and `fork_count` were the
missed ones, same shape as the S11/S26 gap noted in the stars-watchers
doc). We deliberately did NOT add `is_fork`: it would duplicate
`fork_of_repo_id IS NOT NULL` and create the kind of
two-source-of-truth drift that the audit penalises. Use the FK
predicate.

## On-disk layout

`git clone --bare --shared ` creates a fork whose
`objects/info/alternates` file points back at the source's `objects/`
directory. Disk usage of the fork is essentially refs + a small
overhead. The same-volume requirement (S04 `RepoFS.Root()`) is what
makes alternates safe; alternates across volumes is undefined
behaviour for git.

When a fork is created we additionally set
`extensions.preciousObjects = true` on the **source** so a future
`git gc` on the source can't prune objects the fork reaches via
alternates.
The setting is idempotent; the fork-clone worker re-asserts it on every
new fork, so missing config is self-healing.

## Worker job

`KindRepoForkClone` (`internal/worker/jobs/repo_fork_clone.go`) runs
the on-disk clone out of band so fork-create returns fast even for
large source repos. Payload is `{source_repo_id, fork_repo_id}`.

The job's flow:

1. Reload both repos by id (defends against soft-delete between
   enqueue and run).
2. `CloneBareShared(sourcePath, forkPath)`: git clone + alternates.
3. `hooks.Install(forkPath, shithubdPath)`: same hook install as the
   synchronous repo-create path so subsequent user pushes fire
   `push:process`.
4. `SetPreciousObjects(sourcePath)`: pin source's objects.
5. `SetRepoInitStatus(fork.ID, 'initialized')`.

On any permanent failure: flip to `'init_failed'` and return
`worker.PoisonError` (no retries). The repo row stays so the user sees
the failure; we don't auto-clean up because that would race a
concurrent retry.

## Sync (fast-forward fork from upstream)

`fork.Sync(ctx, deps, actorUserID, forkRepoID)` only fast-forwards.
Anything else (merge, rebase) belongs in the user's client; doing
either server-side without the user's resolution preferences risks
producing commits the user doesn't want.

Algorithm:

1. Resolve both default-branch OIDs (`fork`, `source`).
2. If equal → `ErrSyncUpToDate`.
3. If fork is NOT an ancestor of upstream → `ErrSyncDiverged`.
4. CAS update via `repogit.UpdateRefCAS(fork, branch, upstream, fork)`;
   the trailing `fork` argument is git's old-value guard. A concurrent
   push to the fork loses the CAS and surfaces as `ErrSyncRefRaced`.
5. Update `repos.default_branch_oid` so the home view reflects the new
   tip without waiting for `push:process` (update-ref bypasses
   `post-receive`, same shape as the merge handler's fix in the
   audit-remediation sprint).

An empty fork (no branch yet) is handled by passing the 40-zero OID
literal as the old value, which git interprets as "ref must not exist
yet". Syncing an empty fork therefore creates the branch from
upstream's tip.
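Steps 2 to 4 of the sync algorithm, plus the empty-fork case, can be sketched roughly as follows. This is a minimal illustration, not the actual `fork.Sync` code: the `refStore` interface, `errCASMismatch`, `memStore`, and the `syncFastForward` signature are all hypothetical stand-ins for the `repogit` helpers, and the sentinel errors are redeclared here only so the example compiles on its own.

```go
package main

import (
	"errors"
	"fmt"
)

// Sentinel errors named after the ones in the text; the real definitions
// live in the fork package and may differ.
var (
	ErrSyncUpToDate = errors.New("fork sync: already up to date")
	ErrSyncDiverged = errors.New("fork sync: fork has diverged from upstream")
	ErrSyncRefRaced = errors.New("fork sync: ref moved during update")
)

// errCASMismatch is a hypothetical error a ref store returns when the
// old-value guard does not match the ref's current value.
var errCASMismatch = errors.New("ref store: old value mismatch")

// zeroOID is the all-zero object id; as an old value it means
// "the ref must not exist yet".
const zeroOID = "0000000000000000000000000000000000000000"

// refStore is a hypothetical seam standing in for the repogit helpers.
type refStore interface {
	IsAncestor(ancestor, descendant string) (bool, error)
	UpdateRefCAS(branch, newOID, oldOID string) error
}

// syncFastForward covers steps 2-4. forkOID is "" for an empty fork;
// upstreamOID is assumed to be resolved already.
func syncFastForward(rs refStore, branch, forkOID, upstreamOID string) error {
	if forkOID == upstreamOID {
		return ErrSyncUpToDate
	}
	oldOID := forkOID
	if forkOID == "" {
		oldOID = zeroOID // empty fork: the branch must not exist yet
	} else {
		ok, err := rs.IsAncestor(forkOID, upstreamOID)
		if err != nil {
			return err
		}
		if !ok {
			return ErrSyncDiverged
		}
	}
	err := rs.UpdateRefCAS(branch, upstreamOID, oldOID)
	if errors.Is(err, errCASMismatch) {
		return ErrSyncRefRaced // a concurrent push moved the ref first
	}
	return err
}

// memStore is an in-memory refStore with linear history, for illustration.
type memStore struct {
	refs    map[string]string // branch -> current OID
	parents map[string]string // commit -> its single parent
}

func (m *memStore) IsAncestor(ancestor, descendant string) (bool, error) {
	for c := descendant; c != ""; c = m.parents[c] {
		if c == ancestor {
			return true, nil
		}
	}
	return false, nil
}

func (m *memStore) UpdateRefCAS(branch, newOID, oldOID string) error {
	cur, ok := m.refs[branch]
	if !ok {
		cur = zeroOID
	}
	if cur != oldOID {
		return errCASMismatch
	}
	m.refs[branch] = newOID
	return nil
}

func main() {
	s := &memStore{
		refs:    map[string]string{"main": "c1"},
		parents: map[string]string{"c2": "c1"},
	}
	fmt.Println(syncFastForward(s, "main", "c1", "c2")) // fast-forward succeeds
	fmt.Println(syncFastForward(s, "main", "c2", "c2")) // already up to date
}
```

Keeping the ancestry check and the compare-and-swap behind one small interface means the decision logic can be exercised without a git repository on disk; step 5 (updating `repos.default_branch_oid`) would follow a `nil` return.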
## Ahead/behind

`fork.AheadBehind(ctx, deps, forkRepoID)` returns
`{Ahead, Behind, Comparable}` where:

* `Ahead` = commits in fork's default branch not in source's.
* `Behind` = commits in source's default branch not in fork's.
* `Comparable` = false when either side's default ref is missing
  (empty fork, never-initialised source).

Implementation: read both OIDs, then run
`git rev-list --left-right --count` *inside the fork's repo*. Because
the fork shares object alternates with the source, the upstream OID
resolves without an explicit fetch.

This is the floor implementation. S36's perf-pass sprint adds an LRU
cache keyed on `(fork_repo_id, fork_default_oid, upstream_default_oid)`;
that cache is already documented in S36's "Code-tab caching"
deliverables and on the S00–S25 audit's H4 deferral.

## Visibility floor

`fork.allowedTargetVisibility(source, target)` enforces:

| source  | target=public | target=private | target="" |
|---------|---------------|----------------|-----------|
| public  | ✓             | ✓              | public    |
| private | ✗             | ✓              | private   |

An empty target (`""`) resolves to the source's visibility. Forking
private → public would expose previously-private content and is always
rejected (`ErrVisibilityFloor`).

## Permission lattice

`policy.ActionForkCreate` was already in the registry from earlier
sprints. Today's gating shape:

* Anonymous on any repo → deny (`DenyAnonymous`).
* Logged-in on a repo they can read → allow (login-required, no role
  gate).
* Logged-in on a private repo they CAN'T read → deny
  (`DenyVisibility`, leaks as 404 at the handler layer).

Suspended actors are blocked by step 3 of `policy.Can` (suspended +
write action → deny). Fork create counts as write (it mutates the
target owner's namespace).

## Cross-fork PRs (deferred to a follow-up)

S27's spec lists cross-fork PR support as in-scope, but the actual
plumbing (fetching the fork's head into the base repo's
`refs/shithub-pr//head` namespace and routing the merge from the
internal ref) is large enough that this sprint ships fork creation,
sync, and ahead/behind only.
The cross-fork PR work is tracked here as a follow-up:

* Extend `pulls.Create` to accept `head_repo_id != repo_id`.
* Wire in `repogit.FetchIntoNamespace` (already shipped in this sprint
  for the eventual consumer).
* `pulls.Synchronize` reads head from the internal ref when
  `pull_request.head_repo_id != pull_request.repo_id`.
* `pulls.Merge` worktree-add reads head from the internal ref.
* Re-check fork visibility at merge time (the merger may have lost
  read access on the head between PR open and merge).

The internal ref is private: we never advertise it via `info/refs`.
The git-http handler's ref filter already restricts to `refs/heads/*`
and `refs/tags/*`, so the namespace is naturally hidden.

## S16 hard-delete cascade amendment

When a source repo with active forks is hard-deleted, the forks become
orphans (`fork_of_repo_id ON DELETE SET NULL` from the existing FK).
Today the orphan forks hold only the objects added since the fork;
everything up to the fork point still lives in the source.
Hard-deleting the source would remove those objects and break the
orphan forks.

The fix is to repack each fork before removing the source:

```
git repack -a -d
```

Run in the fork's repo without `-l`, this copies every reachable
object, including those borrowed through alternates, into the fork's
own pack. The fork's `objects/info/alternates` file can then be
removed and the source safely deleted. This is a
`KindRepoForkRepackOnSourceDelete` job (deferred from S16; see the
`ListForksOfRepoForRepack` query that this sprint shipped for it).

The lifecycle worker's `repo_hard_delete` step needs to fan out one
repack job per fork, await completion, then proceed with the FS
delete. The query is in place; the job + the cascade wiring land in a
follow-up commit (or in S37 when the deploy plan freezes the
hard-delete sequence).
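The ordering constraint in the hard-delete cascade (repack every fork, wait for all of them, only then delete the source) could look something like the sketch below. It is an in-process approximation under stated assumptions: `repackFunc`, `deleteFunc`, `hardDeleteSource`, and `runDemo` are hypothetical names, goroutines stand in for the per-fork worker jobs, and `errors.Join` assumes Go 1.20 or newer.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
)

// repackFunc and deleteFunc are hypothetical seams: in the real system they
// would enqueue and await a repack job and remove the source from disk.
type repackFunc func(ctx context.Context, forkID int64) error
type deleteFunc func(ctx context.Context, sourceID int64) error

// hardDeleteSource repacks every fork concurrently, waits for all of them,
// and deletes the source only if every repack succeeded.
func hardDeleteSource(ctx context.Context, sourceID int64, forkIDs []int64,
	repack repackFunc, del deleteFunc) error {
	errs := make([]error, len(forkIDs)) // one slot per fork, so no locking
	var wg sync.WaitGroup
	for i, id := range forkIDs {
		wg.Add(1)
		go func(i int, id int64) {
			defer wg.Done()
			if err := repack(ctx, id); err != nil {
				errs[i] = fmt.Errorf("repack fork %d: %w", id, err)
			}
		}(i, id)
	}
	wg.Wait()
	if err := errors.Join(errs...); err != nil {
		return err // leave the source in place; forks may still need it
	}
	return del(ctx, sourceID)
}

// runDemo wires hardDeleteSource to in-memory fakes. failFork names a fork
// whose repack should fail (0 means none). It reports whether the source
// was deleted.
func runDemo(failFork int64) (deleted bool, err error) {
	repack := func(_ context.Context, forkID int64) error {
		if forkID == failFork {
			return errors.New("simulated repack failure")
		}
		return nil
	}
	del := func(_ context.Context, _ int64) error {
		deleted = true
		return nil
	}
	err = hardDeleteSource(context.Background(), 1, []int64{2, 3, 4}, repack, del)
	return deleted, err
}

func main() {
	fmt.Println(runDemo(0)) // all repacks succeed, source is deleted
	fmt.Println(runDemo(3)) // one repack fails, source is kept
}
```

The point of the shape is that a single failed repack aborts the delete, so an orphan fork is never left pointing at objects that no longer exist.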
## Routes

| Method | Path                    | Auth        | Notes                           |
|--------|-------------------------|-------------|---------------------------------|
| POST   | `/{owner}/{repo}/fork`  | RequireUser | Create a fork                   |
| POST   | `/{owner}/{repo}/sync`  | RequireUser | Fast-forward fork from upstream |
| GET    | `/{owner}/{repo}/forks` | public      | Paginated list of forks         |

The `/fork` POST emits a `forked` domain event (kind=`forked`,
source_kind=`repo`) into S26's `domain_events` log so the future
activity feed picks it up. The `/sync` POST emits `repo_fork_synced`
through the audit log only (no public event).

The fork-create handler also auto-watches the new fork at `level=all`
so the user sees fork-side events without having to opt in. This
matches GitHub's "watching your own forks" default.

## Pitfalls noted in code

* Source-repo GC pruning fork-needed objects → `preciousObjects`.
* Source-repo deletion with active forks → S16 amendment (above).
* Cross-fork PR with deleted fork → mark `mergeable_state='blocked'`
  with "head repository deleted" reason at the merge gate (lands with
  cross-fork PR work).
* Fork rename / transfer → `fork_of_repo_id` is by-id so the
  relationship survives.
* Sync race with concurrent push → CAS on update-ref; surfaces as
  `ErrSyncRefRaced`.
* Fork-of-fork chains → spec leans "flatten alternates to root". Today
  the clone uses `--shared` against whatever path we pass; if the
  source is itself a fork, the alternates chain is two levels deep.
  Acceptable for v1; the flattening lands when fork-of-fork becomes a
  real user complaint.