# Commits & blame

S18 ships the commits list, single-commit page, blame view, and the
Atom feed. Every page resolves commit author emails to shithub user
identities (display name + avatar + profile link) when there's a
verified `user_emails` row, and falls back to the raw author name plus a
deterministic identicon seed when there isn't.

## Routes

| Route | Handler |
|---|---|
| `GET /{owner}/{repo}/commits/{ref}/*` | `commitsList` |
| `GET /{owner}/{repo}/commits/{ref}.atom` | `commitsAtom` |
| `GET /{owner}/{repo}/commit/{sha}` | `commitView` |
| `GET /{owner}/{repo}/blame/{ref}/{path...}` | `blameView` |

The `commits/*` and `blame/*` patterns use chi's `*` wildcard so
branches with `/` in their name (`feature/x`, `release/v1.0/beta`)
resolve correctly via `repogit.ResolveRef` (longest-prefix match
against the cached ref list, hex-SHA shortcut for 40-char first
segments). This is the same logic the code-tab uses.

`{sha}` accepts 7..40 hex chars; git itself disambiguates short SHAs.
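
The longest-prefix rule can be sketched as below. This is a minimal illustration, not the actual `repogit.ResolveRef`: the function name `splitRefPath`, its signature, and the in-memory `refs` slice are assumptions made for the example. It follows the precedence described in this document, where a known ref wins over the hex-SHA shortcut.

```go
package main

import (
	"fmt"
	"strings"
)

// isFullSHA reports whether s is exactly 40 hex characters.
func isFullSHA(s string) bool {
	if len(s) != 40 {
		return false
	}
	for _, c := range s {
		isHex := (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F')
		if !isHex {
			return false
		}
	}
	return true
}

// splitRefPath (hypothetical helper) splits a wildcard tail such as
// "release/v1.0/beta/cmd/main.go" into a ref and the remaining path.
// The longest known ref that is a whole-segment prefix wins; only when no
// ref matches is a 40-char hex first segment treated as a commit SHA.
func splitRefPath(rest string, refs []string) (ref, path string, ok bool) {
	best := ""
	for _, r := range refs {
		if r == "" {
			continue
		}
		if (rest == r || strings.HasPrefix(rest, r+"/")) && len(r) > len(best) {
			best = r
		}
	}
	if best != "" {
		return best, strings.TrimPrefix(strings.TrimPrefix(rest, best), "/"), true
	}
	first, tail, _ := strings.Cut(rest, "/")
	if isFullSHA(first) {
		return first, tail, true
	}
	return "", "", false
}

func main() {
	refs := []string{"main", "release", "release/v1.0/beta"}
	ref, path, ok := splitRefPath("release/v1.0/beta/cmd/main.go", refs)
	fmt.Println(ref, path, ok)
}
```

Matching on whole segments (`r + "/"`) keeps a branch named `release` from swallowing a path under a sibling ref such as `release-notes`.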

## git plumbing

* `repogit.Log(ctx, gitDir, opts)`: a single `git log --format=...` call
  with packed-record output (ASCII unit-separator + record-end markers
  so newlines in commit bodies don't confuse the parser). Supports
  `--max-count`, `--skip`, `--author`, `--since`, `--until`, and an
  optional `--follow -- <path>`.
* `repogit.GetCommit(ctx, gitDir, sha)`: `git log -1 --format=...`
  for metadata + parents + tree, then `repogit.DiffStat` for the
  per-file change rows.
* `repogit.DiffStat(ctx, gitDir, sha)`: combines
  `git diff-tree -r --root --name-status -M -C` (status + rename
  pairs) and `git diff-tree -r --root --numstat` (insert/delete
  counts, with `-` in place of the counts for binary files). The
  `--root` flag is what makes the initial parentless commit emit
  anything.
* `repogit.Blame(ctx, gitDir, opts)`: `git blame --line-porcelain`
  parsed into `BlameLine`s, then collapsed via `groupBlame` into
  `BlameChunk`s for the rendered "consecutive lines from the same
  commit collapse the gutter" UX. Caps at 5 MiB / 50k lines (returns
  `ErrBlameTooLarge`); refuses on non-blobs (`ErrBlameOnBinary`).
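
To illustrate the packed-record idea behind `repogit.Log`, here is a minimal sketch of a parser for such output. The field order in `logFormat`, the `commit` struct, and `parseLog` are assumptions for the example; only the use of a unit separator (`0x1f`) between fields and a record-end marker (`0x1e`) comes from the description above.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	unitSep   = "\x1f" // ASCII unit separator between fields
	recordSep = "\x1e" // ASCII record separator after each commit

	// logFormat is an assumed layout for `git log --format=...`; the real
	// field list may differ.
	logFormat = "%H%x1f%an%x1f%ae%x1f%at%x1f%s%x1f%b%x1e"
)

type commit struct {
	SHA, AuthorName, AuthorEmail, AuthorTime, Subject, Body string
}

// parseLog splits packed output into commits. Newlines inside a body are
// harmless because records end at recordSep, not at a line break.
func parseLog(out string) ([]commit, error) {
	var commits []commit
	for _, rec := range strings.Split(out, recordSep) {
		rec = strings.TrimLeft(rec, "\n") // git emits a newline after each record
		if rec == "" {
			continue
		}
		f := strings.SplitN(rec, unitSep, 6)
		if len(f) != 6 {
			return nil, fmt.Errorf("malformed log record: got %d fields", len(f))
		}
		commits = append(commits, commit{
			SHA:         f[0],
			AuthorName:  f[1],
			AuthorEmail: f[2],
			AuthorTime:  f[3],
			Subject:     f[4],
			Body:        strings.TrimRight(f[5], "\n"),
		})
	}
	return commits, nil
}

func main() {
	out := "abc\x1fAnn\x1fann@example.com\x1f1700000000\x1fFix bug\x1fline one\nline two\n\x1e\n"
	commits, err := parseLog(out)
	fmt.Println(len(commits), err)
}
```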

## Identity resolution

`internal/repos/identity/Resolver` is a per-request memo that maps
author emails to `Resolved` records:

* `User=true` → matched a verified `user_emails` row whose user is
  not suspended/deleted. Fields populated: UserID, Username,
  DisplayName, AvatarURL.
* `User=false` → unknown email; render the raw author name with a
  deterministic identicon seed (md5 hex of the lowercased email,
  matching the gravatar/identicon convention).

Construct one resolver per request and pass it through the page render.
The cache is in-process and request-scoped; cross-request caching is
S36 territory.

A user with multiple verified emails (work + personal) resolves on
either: the lookup checks `user_emails` against any verified row, not
just the primary. Document this in the user-facing help when the
`/settings` docs land.
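
A request-scoped memo of this kind might look like the sketch below. It is not the real `identity.Resolver`: `LookupFunc` stands in for the verified `user_emails` query, and the `RawName` and `IdenticonSeed` fields are hypothetical names for the fallback data.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
	"strings"
)

// Resolved mirrors the fields named above; RawName and IdenticonSeed are
// hypothetical names for the fallback values.
type Resolved struct {
	User          bool
	UserID        int64
	Username      string
	DisplayName   string
	AvatarURL     string
	RawName       string
	IdenticonSeed string
}

// LookupFunc is an assumed stand-in for the verified user_emails lookup.
type LookupFunc func(email string) (Resolved, bool)

// Resolver memoizes lookups for one request. It is not safe for
// concurrent use; build a new one per request.
type Resolver struct {
	lookup LookupFunc
	cache  map[string]Resolved
}

func NewResolver(lookup LookupFunc) *Resolver {
	return &Resolver{lookup: lookup, cache: make(map[string]Resolved)}
}

// identiconSeed returns the md5 hex digest of the lowercased email.
func identiconSeed(email string) string {
	sum := md5.Sum([]byte(strings.ToLower(email)))
	return hex.EncodeToString(sum[:])
}

// Resolve returns the identity for an author, hitting lookup at most once
// per distinct (case-insensitive) email.
func (r *Resolver) Resolve(name, email string) Resolved {
	key := strings.ToLower(email)
	res, ok := r.cache[key]
	if !ok {
		if u, found := r.lookup(key); found {
			u.User = true
			res = u
		} else {
			res = Resolved{IdenticonSeed: identiconSeed(key)}
		}
		r.cache[key] = res
	}
	if !res.User {
		res.RawName = name // the raw name can differ between commits
	}
	return res
}

func main() {
	r := NewResolver(func(email string) (Resolved, bool) {
		if email == "ann@example.com" {
			return Resolved{UserID: 1, Username: "ann", DisplayName: "Ann"}, true
		}
		return Resolved{}, false
	})
	fmt.Println(r.Resolve("Ann", "Ann@Example.com").Username)
	fmt.Println(r.Resolve("Bob", "bob@example.com").IdenticonSeed)
}
```

Caching misses as well as hits matters here: a commit list is usually dominated by a handful of authors, and unknown emails would otherwise trigger a lookup per row.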

## File-changed table on the commit view

S18 emits the rows + per-file `+X / -Y` stats; the per-file diff body
slot is left structural so S19's diff renderer drops in without
re-rendering. The status column uses git's letters: `A M D R C T`.
Renames/copies show "old → new" in the path column.
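
As a rough illustration of how a `--name-status` line maps to a table row, consider the sketch below. The `fileChange` type and both helpers are hypothetical, and the parser assumes tab-separated output without `-z`, so paths that git would quote are not handled.

```go
package main

import (
	"fmt"
	"strings"
)

// fileChange is a hypothetical row for the file-changed table.
type fileChange struct {
	Status  byte   // one of A M D R C T
	OldPath string // set only for renames and copies
	Path    string
}

// parseNameStatus parses one line of `git diff-tree --name-status -M -C`.
// Renames and copies carry a similarity score (for example "R100") and
// two paths; every other status carries one path.
func parseNameStatus(line string) (fileChange, error) {
	f := strings.Split(line, "\t")
	if len(f) < 2 || f[0] == "" {
		return fileChange{}, fmt.Errorf("malformed name-status line %q", line)
	}
	st := f[0][0]
	switch st {
	case 'R', 'C':
		if len(f) != 3 {
			return fileChange{}, fmt.Errorf("status %c needs two paths", st)
		}
		return fileChange{Status: st, OldPath: f[1], Path: f[2]}, nil
	case 'A', 'M', 'D', 'T':
		if len(f) != 2 {
			return fileChange{}, fmt.Errorf("status %c needs one path", st)
		}
		return fileChange{Status: st, Path: f[1]}, nil
	}
	return fileChange{}, fmt.Errorf("unknown status %q", f[0])
}

// displayPath renders the path column, using "old → new" for renames
// and copies.
func (c fileChange) displayPath() string {
	if c.OldPath != "" {
		return c.OldPath + " → " + c.Path
	}
	return c.Path
}

func main() {
	for _, line := range []string{"M\tmain.go", "R100\told.go\tnew.go"} {
		c, err := parseNameStatus(line)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%c %s\n", c.Status, c.displayPath())
	}
}
```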

## Atom feed

Lightweight: title, id, updated, link, then per-entry id (full SHA),
title (subject), updated (author time), author name+email, summary
(commit body). The ID URN format `urn:shithub:commit:<sha>` is stable
across renames and visibility flips. The feed is capped at 50 commits.
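
The feed shape above can be sketched with `encoding/xml`. The struct names, the `commit` type, and `buildFeed` are illustrative, and two choices are assumptions: the feed-level `id` reuses the link URL, and the feed-level `updated` takes the newest entry's time.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"time"
)

const maxFeedEntries = 50 // cap stated for the feed

// commit is a hypothetical input type for the example.
type commit struct {
	SHA, Subject, Body      string
	AuthorName, AuthorEmail string
	AuthorUnix              int64
}

type atomFeed struct {
	XMLName xml.Name    `xml:"http://www.w3.org/2005/Atom feed"`
	Title   string      `xml:"title"`
	ID      string      `xml:"id"`
	Updated string      `xml:"updated"`
	Link    atomLink    `xml:"link"`
	Entries []atomEntry `xml:"entry"`
}

type atomLink struct {
	Href string `xml:"href,attr"`
}

type atomEntry struct {
	ID      string     `xml:"id"`
	Title   string     `xml:"title"`
	Updated string     `xml:"updated"`
	Author  atomAuthor `xml:"author"`
	Summary string     `xml:"summary"`
}

type atomAuthor struct {
	Name  string `xml:"name"`
	Email string `xml:"email"`
}

func rfc3339(unix int64) string {
	return time.Unix(unix, 0).UTC().Format(time.RFC3339)
}

// buildFeed maps commits (newest first) to Atom entries, keeping at most
// maxFeedEntries of them.
func buildFeed(title, link string, commits []commit) atomFeed {
	if len(commits) > maxFeedEntries {
		commits = commits[:maxFeedEntries]
	}
	feed := atomFeed{Title: title, ID: link, Link: atomLink{Href: link}}
	for _, c := range commits {
		feed.Entries = append(feed.Entries, atomEntry{
			ID:      "urn:shithub:commit:" + c.SHA,
			Title:   c.Subject,
			Updated: rfc3339(c.AuthorUnix),
			Author:  atomAuthor{Name: c.AuthorName, Email: c.AuthorEmail},
			Summary: c.Body,
		})
	}
	if len(feed.Entries) > 0 {
		feed.Updated = feed.Entries[0].Updated // assumption: newest entry
	}
	return feed
}

func main() {
	feed := buildFeed("Commits to main", "https://example.test/o/r/commits/main", []commit{
		{SHA: "abc", Subject: "Fix bug", AuthorName: "Ann", AuthorEmail: "ann@example.com", AuthorUnix: 1700000000},
	})
	out, err := xml.MarshalIndent(feed, "", "  ")
	if err != nil {
		fmt.Println("marshal error:", err)
		return
	}
	fmt.Println(xml.Header + string(out))
}
```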

## Issue-ref linkification (S21 hook)

Commit-body URLs are linkified inline; `#NNN` and `owner/repo#NNN`
issue refs are emitted as stable tokens
`<span data-ref="#123">#123</span>` so the S21 issue layer can
post-render-link them without re-rendering. The token shape is
documented here so the S21 enhancer doesn't have to re-derive it.
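
A minimal sketch of the escape-then-tokenize order follows. The regular expression and the `renderRefs` name are assumptions, not the shipped implementation, and URL linkification is left out. Because escaping runs first, the pattern refuses a `#` preceded by `&` so that entities such as `&#39;` are not mistaken for issue refs.

```go
package main

import (
	"fmt"
	"html/template"
	"regexp"
)

// issueRef matches "#NNN" or "owner/repo#NNN" when not glued to a
// preceding word character, '/', '#', or '&'. The exact grammar for
// owner and repo names is an assumption.
var issueRef = regexp.MustCompile(`(^|[^\w/#&])((?:[A-Za-z0-9_.-]+/[A-Za-z0-9_.-]+)?#\d+)\b`)

// renderRefs HTML-escapes the body, then wraps issue refs in the stable
// token shape <span data-ref="...">...</span>.
func renderRefs(body string) string {
	escaped := template.HTMLEscapeString(body)
	return issueRef.ReplaceAllString(escaped, `${1}<span data-ref="${2}">${2}</span>`)
}

func main() {
	fmt.Println(renderRefs("fixes #123 and acme/web#7"))
	fmt.Println(renderRefs("it's <b>not</b> #5"))
}
```

Escaping before substitution means the only raw HTML in the output is the markup the renderer itself adds.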

## Caching

Currently **no caching layer**: every request runs the relevant git
plumbing fresh. The S18 spec calls out cache keys
(`(repo_id, ref_oid, page, filters)` for commit lists,
`(repo_id, ref_oid, path)` for blame, `(repo_id, sha)` for single
commits) and push-event invalidation. Wiring lives with the rest of
the code-tab caching deferral in **S36**.

## Pitfalls handled

* **Encoding in commit messages**: bodies are HTML-escaped before any
  link/ref substitution; non-UTF-8 characters render as the closest
  fallback `template.HTMLEscapeString` produces (Go's escaper accepts
  arbitrary byte slices).
* **Path traversal in blame paths**: same `validateSubpath` guard the
  code-tab uses.
* **Memory on `--line-porcelain`**: parsed in a streaming `bufio.Scanner`
  with a 1 MiB max line length; we never `io.ReadAll` the porcelain
  output.
* **Initial-commit DiffStat**: requires the `--root` flag to diff against
  the empty tree. Without it, root commits show no files.
* **SHA collision with `feature/abcdef…`**: ref-list lookup wins over
  the hex-SHA shortcut when the same string appears in both, the same
  rule as the code-tab `ResolveRef`.
* **Blame on binary**: rejected early via `StatPath` (which already
  knows the kind from `cat-file -t`). No expensive `git blame` runs.
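
The streaming blame parse mentioned in the list above could be approached as in this sketch. The types and function names are hypothetical, the mapping of an over-long line to `ErrBlameTooLarge` is an assumption, and only the line-count cap is modeled (not the 5 MiB byte cap). It also assumes 40-character SHA-1 object names in the porcelain headers.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"io"
	"strings"
)

const (
	maxLineBytes = 1 << 20 // 1 MiB per line
	maxLines     = 50_000  // line cap before giving up
)

var ErrBlameTooLarge = errors.New("blame: output too large")

// chunk is a run of consecutive lines attributed to the same commit.
type chunk struct {
	SHA   string
	Lines []string
}

// isHeader reports whether line starts a porcelain entry: 40 lowercase
// hex characters followed by a space.
func isHeader(line string) bool {
	if len(line) < 41 || line[40] != ' ' {
		return false
	}
	for i := 0; i < 40; i++ {
		c := line[i]
		if !((c >= '0' && c <= '9') || (c >= 'a' && c <= 'f')) {
			return false
		}
	}
	return true
}

// scanBlame reads porcelain output line by line, never buffering the
// whole stream, and groups consecutive same-commit lines into chunks.
func scanBlame(r io.Reader) ([]chunk, error) {
	sc := bufio.NewScanner(r)
	sc.Buffer(make([]byte, 0, 64*1024), maxLineBytes)
	var chunks []chunk
	var sha string
	n := 0
	for sc.Scan() {
		line := sc.Text()
		switch {
		case strings.HasPrefix(line, "\t"): // file content is TAB-prefixed
			n++
			if n > maxLines {
				return nil, ErrBlameTooLarge
			}
			if k := len(chunks); k > 0 && chunks[k-1].SHA == sha {
				chunks[k-1].Lines = append(chunks[k-1].Lines, line[1:])
			} else {
				chunks = append(chunks, chunk{SHA: sha, Lines: []string{line[1:]}})
			}
		case isHeader(line):
			sha = line[:40]
		}
	}
	if err := sc.Err(); err != nil {
		if errors.Is(err, bufio.ErrTooLong) {
			return nil, ErrBlameTooLarge
		}
		return nil, err
	}
	return chunks, nil
}

// scanBlameString is a convenience wrapper for examples and tests.
func scanBlameString(s string) ([]chunk, error) {
	return scanBlame(strings.NewReader(s))
}

func main() {
	in := "0123456789abcdef0123456789abcdef01234567 1 1 1\nauthor Ann\n\tpackage main\n"
	chunks, err := scanBlameString(in)
	fmt.Println(len(chunks), err)
}
```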

## Deferred polish

* **Tree last-commit-per-entry column** (`POST /tree-commits` htmx
  fragment) is the S17 deferral that lands here. The shared
  `git log --name-status` walk that powers per-file history can also
  power this column. Not wired in this commit set: the column slot
  exists in `tree.html` and the data shape is straight Log + DiffStat.
  Wire when we want the polish.
* **Caching layer** with push-event invalidation → S36.
* **Older / Newer blame navigation** (walk the chunk's commit's parent
  for that path): UI links not yet emitted; the data path is a
  `git log -1 <chunk_sha>^ -- <path>` away. Wire when needed.
* **Signed-commit verification badge**: placeholder slot only;
  actual verification post-MVP.