Code tab
The code tab is the GitHub-style repo browser: tree listing, blob view
with syntax highlighting, raw view, "Go to file" finder, and the
branch/tag switcher. For populated repos, /{owner}/{repo} renders the
default branch Code tab directly, matching GitHub's canonical repo URL.
Routes
| Route | Handler |
|---|---|
| GET /{owner}/{repo} | default-branch Code tab |
| GET /{owner}/{repo}/tree/{ref}/{path...} | codeTree |
| GET /{owner}/{repo}/blob/{ref}/{path...} | codeBlob |
| GET /{owner}/{repo}/raw/{ref}/{path...} | codeRaw |
| GET /{owner}/{repo}/find/{ref}?q=... | codeFinder |
| GET /{owner}/{repo}/edit/{ref}/{path...} | in-browser file editor |
| POST /{owner}/{repo}/edit/{ref}/{path...} | commit file edit / rename |
| GET /{owner}/{repo}/new/{ref}/{path...} | in-browser new-file editor |
| POST /{owner}/{repo}/new/{ref}/{path...} | commit new file |
| GET /{owner}/{repo}/delete/{ref}/{path...} | delete-file confirmation |
| POST /{owner}/{repo}/delete/{ref}/{path...} | commit file deletion |
| GET /{owner}/{repo}/upload/{ref}/{path...} | upload-files form |
| POST /{owner}/{repo}/upload/{ref}/{path...} | commit uploaded files |
| POST /{owner}/{repo}/markdown-preview | editor Markdown preview fragment |
| GET /{owner}/{repo}/actions | parked product-tab shell |
| GET /{owner}/{repo}/projects | parked product-tab shell |
| GET /{owner}/{repo}/wiki | parked product-tab shell |
| GET /{owner}/{repo}/security | parked product-tab shell |
| GET /{owner}/{repo}/pulse | parked product-tab shell |
| GET /{owner}/{repo}/packages | parked product-tab shell |
| GET /{owner}/{repo}/releases | parked product-tab shell |
| GET /static/css/chroma.css | runtime-generated Chroma theme |
Every code-tab handler runs through policy.Can(... ActionRepoRead) —
private repos hide from anonymous viewers and unrelated users via the
existence-leak 404 guard from S15.
The in-browser mutation routes run through policy.Can(... ActionRepoWrite).
They inherit the same archived-repo, suspended-user, collaborator-role,
site-admin, and private-repo existence behavior as git push surfaces.
Repository product tabs
The repo header intentionally exposes GitHub's major product-map tabs: Code, Issues, Pull requests, Actions, Projects, Wiki, Security and quality, Insights, and Settings when visible to the viewer. Forks remain available from the repo action button and About sidebar, but are not a top-level tab on GitHub.
Actions, Projects, Wiki, Security and quality, Insights, Packages, and
Releases currently render honest parked shells via repo/deferred_tab.
They are public read surfaces gated by ActionRepoRead, so private repo
existence behavior matches Code/Issues/Pull requests while the deeper
systems remain assigned to their later sprints.
Ref + path disambiguation
{ref} is the chi * wildcard, so the URL /tree/feature/x/sub/file.go
arrives as a single string. Resolution:
1. If the first segment is exactly 40 hex chars → treat as a SHA, the rest is the path.
2. Otherwise, longest-prefix match against the cached ref list (branches first, then tags, sorted longest-first). The remainder after the matched ref is the in-tree path.
This handles release/v1.0/beta/CHANGELOG.md correctly without
ambiguity. Resolution lives in internal/repos/git/treeops.go::ResolveRef.
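A minimal sketch of the two-step resolution, not the actual ResolveRef. The name resolveRefPath, its signature, and the sample ref names are hypothetical. It also assumes that a ref literally named like a 40-hex string takes priority over the SHA shortcut, following the "hex collision" note in the pitfalls list.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// isFullSHA reports whether s is exactly 40 hexadecimal characters.
func isFullSHA(s string) bool {
	if len(s) != 40 {
		return false
	}
	for i := 0; i < len(s); i++ {
		c := s[i]
		isHex := (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F')
		if !isHex {
			return false
		}
	}
	return true
}

// resolveRefPath (hypothetical) splits the wildcard remainder into ref and path.
// refs lists branch names first, then tag names.
func resolveRefPath(rest string, refs []string) (ref, path string, ok bool) {
	first, tail, _ := strings.Cut(rest, "/")
	named := false
	for _, r := range refs {
		if r == first {
			named = true // assumption: a real ref name beats the SHA shortcut
		}
	}
	if isFullSHA(first) && !named {
		return first, tail, true
	}
	// Longest names first, so "release/v1.0" is tried before "release".
	// The stable sort keeps branches ahead of tags of equal length.
	sorted := append([]string(nil), refs...)
	sort.SliceStable(sorted, func(i, j int) bool { return len(sorted[i]) > len(sorted[j]) })
	for _, r := range sorted {
		if rest == r {
			return r, "", true
		}
		if strings.HasPrefix(rest, r+"/") {
			return r, rest[len(r)+1:], true
		}
	}
	return "", "", false
}

func main() {
	refs := []string{"main", "release", "release/v1.0"}
	ref, path, ok := resolveRefPath("release/v1.0/beta/CHANGELOG.md", refs)
	fmt.Println(ref, path, ok)
}
```

Matching on `r + "/"` rather than a bare prefix keeps a ref such as `main` from swallowing a path that merely starts with the same letters.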
When the matched ref is a raw 40-character commit SHA, the tree page resolves the top commit summary against that SHA and displays the short SHA in the ref switcher, matching GitHub's detached-commit tree view.
Path validation rejects .., control chars, leading slashes, and
backslashes — defense in depth on top of git's own validation.
Tree listing
git ls-tree --long --full-tree <ref>:<path> is parsed into typed
TreeEntry values (tree | blob | commit | symlink). Sort is
directories first, then files alphabetically.
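As an illustration of the parse-and-sort step, here is a sketch that reads `git ls-tree --long` lines into a struct. The field layout of TreeEntry, the helper names, and the byte-order comparison are assumptions, and the sketch ignores the path quoting git applies to unusual file names.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// TreeEntry here is a hypothetical shape, not the project's actual type.
type TreeEntry struct {
	Kind string // "tree", "blob", "commit", or "symlink"
	OID  string
	Size int64 // -1 when git prints "-" (trees and submodule pointers)
	Name string
}

// parseLsTreeLine parses "<mode> <type> <oid> <size>\t<path>".
func parseLsTreeLine(line string) (TreeEntry, error) {
	meta, name, ok := strings.Cut(line, "\t")
	if !ok {
		return TreeEntry{}, fmt.Errorf("no tab separator in %q", line)
	}
	f := strings.Fields(meta) // size is space-padded, so split on runs of spaces
	if len(f) != 4 {
		return TreeEntry{}, fmt.Errorf("want 4 metadata fields, got %d", len(f))
	}
	e := TreeEntry{Kind: f[1], OID: f[2], Size: -1, Name: name}
	if f[0] == "120000" {
		e.Kind = "symlink" // git reports symlinks as blobs with mode 120000
	}
	if f[3] != "-" {
		n, err := strconv.ParseInt(f[3], 10, 64)
		if err != nil {
			return TreeEntry{}, err
		}
		e.Size = n
	}
	return e, nil
}

// sortEntries puts directories first, then orders each group by name (byte order).
func sortEntries(es []TreeEntry) {
	sort.SliceStable(es, func(i, j int) bool {
		di, dj := es[i].Kind == "tree", es[j].Kind == "tree"
		if di != dj {
			return di
		}
		return es[i].Name < es[j].Name
	})
}

func main() {
	lines := []string{
		"100644 blob 3b18e512dba79e4c8300dd08aeb37f8e728b8dad      12\tREADME.md",
		"040000 tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904       -\tcmd",
	}
	var es []TreeEntry
	for _, l := range lines {
		e, err := parseLsTreeLine(l)
		if err != nil {
			panic(err)
		}
		es = append(es, e)
	}
	sortEntries(es)
	for _, e := range es {
		fmt.Println(e.Kind, e.Name, e.Size)
	}
}
```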
commit entries are git submodule pointers. When .gitmodules exists
on the rendered ref, the Code tab parses it once, matches entries by
submodule path, and links GitHub or configured shithub clone remotes to
the local /{owner}/{repo}/tree/{gitlink-oid} route when the target
repo has that commit.
If the target repo exists locally but does not have the pinned commit
object, the handler first checks repo_source_remotes for that target
repo. A stored source remote is the durable source of truth for imports:
the handler validates it with the shared SSRF defense, performs a
bounded, non-forced fetch of heads/tags, re-checks the object, and then
links to the exact detached-commit tree when it arrived. Successful
backfills update the target repo's default-branch OID when that ref
moved, mark the source remote fetched, and enqueue the same code-index
and size-recalc maintenance used after pushes.
GitHub URL/name inference remains as a compatibility fallback for
legacy repos that were created before source remotes existed: when the
.gitmodules URL is GitHub-hosted or a relative sibling that maps
cleanly to a GitHub owner/repo, shithub may fetch from the inferred
GitHub URL. Diverged local refs are never force-updated; on fetch
failure or still-missing objects, the row links to the target repo's
default Code tab so independently-created mirrors don't produce dead
links. Unknown, external, absent, or malformed remotes stay as plain
name @ shortsha rows.
The S17 ship excludes the htmx-driven "last commit per entry" column that the spec describes. It costs an extra round-trip and can be added later without a schema change, so the current page renders the listing immediately. The column is deferred to S18 (commits-per-entry), a deferral path the spec itself calls out, and the tree template already has the column slot ready.
File view
codeBlob walks four cases:
- Large (>1 MiB): placeholder + raw download link, no body read.
- Binary (NUL byte in first 8 KiB): placeholder. Image extensions (png/jpg/jpeg/gif/webp) ≤5 MiB get an <img> preview pointing at /raw/....
- Markdown (.md/.markdown): Goldmark + bluemonday rendered HTML PLUS a <details> source-toggle with the highlighted source.
- Default text: Chroma highlight by filename extension, content sniffing fallback.
Chroma uses the github style baked at process start; the CSS is
served from /static/css/chroma.css via a tiny in-process generator.
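The four cases can be pictured as a small classifier. This is a sketch under assumptions, not codeBlob itself: classifyBlob and its constants are hypothetical names, and checking the image extension before the 1 MiB cap is a guess made so that the 5 MiB image limit has an effect. The source does not spell out the real ordering.

```go
package main

import (
	"bytes"
	"fmt"
	"path"
	"strings"
)

const (
	maxInlineBytes = 1 << 20 // 1 MiB: larger blobs get a placeholder
	sniffWindow    = 8 << 10 // 8 KiB: window scanned for a NUL byte
	maxImageBytes  = 5 << 20 // 5 MiB: largest image previewed inline
)

var imageExts = map[string]bool{
	".png": true, ".jpg": true, ".jpeg": true, ".gif": true, ".webp": true,
}

// classifyBlob (hypothetical) picks a render mode from the file name, the blob
// size, and the first bytes of the body. head may be nil when nothing was read.
func classifyBlob(name string, size int64, head []byte) string {
	ext := strings.ToLower(path.Ext(name))
	if imageExts[ext] && size <= maxImageBytes {
		return "image" // assumed to be decided by extension, before the size cap
	}
	if size > maxInlineBytes {
		return "large"
	}
	if len(head) > sniffWindow {
		head = head[:sniffWindow]
	}
	if bytes.IndexByte(head, 0) >= 0 {
		return "binary"
	}
	if ext == ".md" || ext == ".markdown" {
		return "markdown"
	}
	return "text"
}

func main() {
	fmt.Println(classifyBlob("README.md", 120, []byte("# Title")))
	fmt.Println(classifyBlob("data.bin", 64, []byte{0x7f, 0x00, 0x01}))
}
```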
In-browser file edits
The Code tab surfaces GitHub-style write affordances for users with
repo:write on a named branch:
- The tree header has an Add file dropdown with create and upload actions.
- Text blob headers show edit and delete icon buttons.
- The rendered README header shows an edit icon when the README was found in the current directory. SECURITY and CONTRIBUTING documents use the same blob-header controls when opened from the document tabs.
Direct web commits are intentionally limited to refs/heads/<branch>.
Tags and detached 40-hex commit views render read-only controls and
direct edit URLs return 400.
internal/repos/webedit owns the mutation path. For each edit it:
1. Resolves the branch to its current commit and compares the submitted hidden base_oid; a mismatch returns a stale-edit conflict.
2. Builds a temporary index from the old commit with git read-tree.
3. Stages file changes via canonical git plumbing (hash-object, update-index, write-tree, commit-tree).
4. Runs protection.Enforce before moving the branch, so protected branches deny direct web commits just like pushes.
5. Advances the branch with git update-ref <ref> <new> <old> CAS.
6. Inserts a push_events row with protocol = 'web', enqueues push:process, and sends the worker NOTIFY. If enqueueing fails after the ref has moved, the commit still succeeds and the failure is logged; the same post-push reconciliation gap exists for hook failures.
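The stale-edit check and the compare-and-swap ref update bracket the whole flow, so they are worth sketching together. The helper names, the error value, and the repository path below are hypothetical; the command is only built here, not executed.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
)

// errStaleEdit is a hypothetical sentinel for the stale-edit conflict.
var errStaleEdit = errors.New("stale edit: branch moved since the form was rendered")

// checkBase compares the branch tip with the base_oid the form submitted.
func checkBase(currentOID, baseOID string) error {
	if currentOID != baseOID {
		return errStaleEdit
	}
	return nil
}

// updateRefCmd builds "git update-ref <ref> <new> <old>". Passing the old OID
// makes git refuse the update when the ref no longer points at it, which is
// what turns the final step into a compare-and-swap.
func updateRefCmd(gitDir, branch, newOID, oldOID string) *exec.Cmd {
	return exec.Command("git", "--git-dir="+gitDir,
		"update-ref", "refs/heads/"+branch, newOID, oldOID)
}

func main() {
	if err := checkBase("aaa111", "aaa111"); err != nil {
		fmt.Println(err)
		return
	}
	cmd := updateRefCmd("/srv/repos/demo.git", "main", "bbb222", "aaa111")
	fmt.Println(cmd.Args)
}
```

The early base_oid check gives the user a friendly conflict message, while the old-OID argument closes the race between that check and the ref move.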
Validation rules:
- Text editor actions are capped at 1 MiB and reject NUL-byte binary content. Existing edit sources must be regular blobs, not symlinks, submodules, trees, or oversized blobs.
- Uploads are capped at 25 MiB per request and 10 MiB per file. Uploads may contain binary data.
- Repository paths reject empty names, leading/trailing slash, duplicate slash, backslash, . and .. segments, control bytes, exact overwrites, duplicate uploads, and parent-path conflicts.
- Default commit messages are generated server-side (Update, Create, Rename, Delete, or Upload) when the form leaves the message blank.
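The purely syntactic path rules can be sketched in a few lines. This is an illustration only: validateFilePath is a hypothetical stand-in for webedit.ValidateFilePath, and it leaves out the checks that need repository state (exact overwrites, duplicate uploads, parent-path conflicts).

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// validateFilePath (hypothetical) applies the string-level rules to a
// repository-relative path such as "docs/guide.md".
func validateFilePath(p string) error {
	if p == "" {
		return errors.New("empty path")
	}
	if strings.Contains(p, "\\") {
		return errors.New("backslash in path")
	}
	for i := 0; i < len(p); i++ {
		if p[i] < 0x20 || p[i] == 0x7f {
			return errors.New("control byte in path")
		}
	}
	for _, seg := range strings.Split(p, "/") {
		switch seg {
		case "":
			// An empty segment means a leading, trailing, or duplicate slash.
			return errors.New("empty path segment")
		case ".", "..":
			return errors.New("dot segment in path")
		}
	}
	return nil
}

func main() {
	for _, p := range []string{"docs/guide.md", "a//b", "../etc/passwd"} {
		fmt.Printf("%q -> %v\n", p, validateFilePath(p))
	}
}
```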
The editor component is still server-rendered Go templates plus a small page-local script. No frontend build pipeline or React/Vite layer is required for this slice.
Raw view
- Content-Type derived from the extension whitelist (code.go::rawContentType).
- X-Content-Type-Options: nosniff always.
- Content-Security-Policy: default-src 'none'; sandbox at the handler level. The global SecureHeaders middleware may overlay a broader CSP; both are restrictive, and when a response carries more than one policy the user agent enforces all of them, so content must satisfy both.
- Content-Disposition: attachment is forced for HTML, SVG, JS, WASM, and anything that could execute on shithub's domain. We don't have a separate raw.shithub.tld host yet (post-MVP); attachment is the safety belt.
- Streamed via git cat-file -p; never buffered. Large blobs don't blow up the worker's memory.
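A sketch of how these headers might be assembled and the body streamed. The whitelist contents, the octet-stream fallback, and the names rawHeaders and serveRaw are assumptions for illustration; the real mapping lives in rawContentType.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"path"
	"strings"
)

// rawTypes is a hypothetical, deliberately short whitelist.
var rawTypes = map[string]string{
	".txt":  "text/plain; charset=utf-8",
	".md":   "text/plain; charset=utf-8",
	".json": "application/json",
	".png":  "image/png",
}

// forceDownload lists extensions that must never render inline.
var forceDownload = map[string]bool{
	".html": true, ".htm": true, ".svg": true, ".js": true, ".wasm": true,
}

// rawHeaders returns the response headers for a raw blob with the given name.
func rawHeaders(name string) map[string]string {
	ext := strings.ToLower(path.Ext(name))
	ct, ok := rawTypes[ext]
	if !ok {
		ct = "application/octet-stream" // assumed fallback for unlisted types
	}
	h := map[string]string{
		"Content-Type":            ct,
		"X-Content-Type-Options":  "nosniff",
		"Content-Security-Policy": "default-src 'none'; sandbox",
	}
	if forceDownload[ext] {
		h["Content-Disposition"] = "attachment"
	}
	return h
}

// serveRaw sets the headers, then copies the blob to the client in chunks.
// In a real handler blob would be the stdout pipe of git cat-file.
func serveRaw(w http.ResponseWriter, name string, blob io.Reader) error {
	for k, v := range rawHeaders(name) {
		w.Header().Set(k, v)
	}
	_, err := io.Copy(w, blob)
	return err
}

func main() {
	rec := httptest.NewRecorder()
	if err := serveRaw(rec, "logo.svg", strings.NewReader("<svg/>")); err != nil {
		panic(err)
	}
	fmt.Println(rec.Header().Get("Content-Disposition"), rec.Header().Get("Content-Type"))
}
```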
Finder ("Go to file")
/find/{ref} lists every blob path on the ref via
git ls-tree -r --name-only, then filters with
internal/repos/finder/finder.go::Filter. The matcher is a
subsequence-with-bonus scorer (boundary, consecutive run, basename
hit) — not as fancy as VS Code's quickopen but good enough for tens of
thousands of paths.
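To make "subsequence-with-bonus" concrete, here is a small scorer in that style. It is not finder.go::Filter: the function names, the greedy left-to-right match, the byte-wise (ASCII-oriented) comparison, and the bonus weights are all assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

func isBoundary(c byte) bool { return c == '/' || c == '_' || c == '-' || c == '.' }

// score reports whether query is a subsequence of p and, if so, how well it fits.
// Each matched byte earns 1 point, plus 3 after a boundary, 2 when it directly
// follows the previous match, and 1 when it falls inside the basename.
func score(query, p string) (int, bool) {
	q, s := strings.ToLower(query), strings.ToLower(p)
	base := strings.LastIndexByte(s, '/') + 1
	total, qi, prev := 0, 0, -2
	for i := 0; i < len(s) && qi < len(q); i++ {
		if s[i] != q[qi] {
			continue
		}
		total++
		if i == 0 || isBoundary(s[i-1]) {
			total += 3
		}
		if i == prev+1 {
			total += 2
		}
		if i >= base {
			total++
		}
		prev = i
		qi++
	}
	return total, qi == len(q)
}

// filter keeps matching paths, best score first, ties broken by path.
func filter(query string, paths []string) []string {
	type hit struct {
		path  string
		score int
	}
	var hits []hit
	for _, p := range paths {
		if sc, ok := score(query, p); ok {
			hits = append(hits, hit{p, sc})
		}
	}
	sort.SliceStable(hits, func(i, j int) bool {
		if hits[i].score != hits[j].score {
			return hits[i].score > hits[j].score
		}
		return hits[i].path < hits[j].path
	})
	out := make([]string, 0, len(hits))
	for _, h := range hits {
		out = append(out, h.path)
	}
	return out
}

func main() {
	paths := []string{"internal/repos/git/treeops.go", "README.md", "docs/topics.md"}
	fmt.Println(filter("tops", paths))
}
```

Each path is scored in a single linear pass, which is why this style of matcher stays cheap even across tens of thousands of paths.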
Key shortcut and live-filter via htmx are spec deliverables that we defer for now — the form-submission flow works without JS and that's the floor S17 commits to.
Caching
Currently no caching layer. Every request runs git for-each-ref,
git ls-tree, etc. That's fine for small-to-medium repos; the cost
shows up on big repos with deep trees. The S17 spec proposes a cache
keyed on (repo_id, ref_oid, dir_path) invalidated on push (S14's
push:process job is the right invalidation hook).
Deferred: the cache is purely performance polish. When we hit a
real-world repo where it matters, wire it in with a package under
internal/cache/ plus an invalidation callback in
worker/jobs/push_process.go. The handlers already take a per-request
policy.Cache, so adding a per-process git cache is mechanically
straightforward.
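One possible shape for such a cache, as a sketch only: the TreeCache type, its methods, and storing plain name slices are hypothetical. Because the key includes ref_oid, an entry can never describe the wrong tree; invalidation on push mainly frees entries for OIDs that are no longer reachable from a ref.

```go
package main

import (
	"fmt"
	"sync"
)

// treeKey mirrors the proposed (repo_id, ref_oid, dir_path) cache key.
type treeKey struct {
	RepoID  int64
	RefOID  string
	DirPath string
}

// TreeCache is a hypothetical per-process cache of directory listings.
type TreeCache struct {
	mu      sync.Mutex
	entries map[treeKey][]string
}

func NewTreeCache() *TreeCache {
	return &TreeCache{entries: make(map[treeKey][]string)}
}

// Get returns the cached listing for the key, if present.
func (c *TreeCache) Get(repoID int64, refOID, dir string) ([]string, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	names, ok := c.entries[treeKey{repoID, refOID, dir}]
	return names, ok
}

// Put stores a listing under the key.
func (c *TreeCache) Put(repoID int64, refOID, dir string, names []string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.entries[treeKey{repoID, refOID, dir}] = names
}

// InvalidateRepo drops every entry for a repo and returns how many were removed.
// A push-processing job could call this after a ref moves.
func (c *TreeCache) InvalidateRepo(repoID int64) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	removed := 0
	for k := range c.entries {
		if k.RepoID == repoID {
			delete(c.entries, k)
			removed++
		}
	}
	return removed
}

func main() {
	c := NewTreeCache()
	c.Put(1, "aaa111", "cmd", []string{"main.go"})
	names, ok := c.Get(1, "aaa111", "cmd")
	fmt.Println(names, ok)
	fmt.Println(c.InvalidateRepo(1))
}
```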
Pitfalls + protections
- XSS via raw HTML/SVG: blocked by Content-Disposition: attachment for those extensions.
- XSS via markdown: Goldmark configured without HTML passthrough + bluemonday's UGC policy on top. Tests in internal/repos/markdown/ (TODO: minimal coverage today).
- Path traversal: validateSubpath in code.go rejects .., controls, leading slashes.
- Web edit path traversal / overwrite: webedit.ValidateFilePath applies the stricter mutation path guard and the service re-checks path existence against the commit being modified.
- Hex collision with SHA: ref-list lookup wins over SHA shortcut when the same string is both.
- Encoding (GBK / Shift-JIS): TODO. Text files outside UTF-8 may render garbled. The body is rendered as-is; a future commit can add golang.org/x/text/encoding autodetection.
Dependencies
- github.com/alecthomas/chroma/v2: syntax highlighting
- github.com/yuin/goldmark: CommonMark + GFM
- github.com/microcosm-cc/bluemonday: HTML sanitizer
Deferred polish (tracked, not blocking)
These items are spec deliverables we ship in a later pass:
- Last-commit-per-entry column with htmx lazy load and pre-walked git log --name-status cache → wire into S18 (commit history) where the same walk powers the per-file history page.
- Tree caching keyed on (repo_id, ref_oid, dir_path), push-event invalidation → wire into S36 (performance pass) once we have a real workload to measure.
- Pagination at 1000 entries per directory → cosmetic for huge trees; add when someone hits node_modules-grade inflation.
- Encoding detection for non-UTF-8 source files → file reads are defensive (io.LimitReader + size cap); render quality is the only loss until this lands.