tenseleyflow/documentlanguagemodel / 1d24029

docs(share): CLI reference + sharing cookbook + mkdocs nav

Authored by espadonne
SHA: 1d24029d58bdd1619013be77c656e0a5db1d77f6
Parents: 5975236
Tree: 0fe5449

3 changed files

Status  File                      +    -
M       docs/cli/reference.md     66   0
A       docs/cookbook/sharing.md  188  0
M       mkdocs.yml                1    0

docs/cli/reference.md (modified)
@@ -220,6 +220,72 @@ dlm unpack <pack> [--force] [--out DIR]
 | `--force` | false | Overwrite an existing store with the same `dlm_id`. |
 | `--out DIR` | pack parent | Where to place the restored `.dlm`. |
 
+### `dlm push`
+
+Upload a `.dlm` (auto-packs) or `.dlm.pack` to a sharing destination
+(Sprint 28).
+
+```
+dlm push <path> --to <destination> [--sign] [pack flags]
+```
+
+| Option | Default | Notes |
+|---|---|---|
+| `--to <destination>` | required | `hf:<org>/<repo>`, `https://...` URL endpoint, or a local path. |
+| `--sign` | false | Sign the pack with `minisign` before upload (requires `minisign` on PATH + key at `~/.dlm/minisign.key`). |
+| `--include-exports` | false | Forwarded to `dlm pack` when auto-packing a `.dlm`. |
+| `--include-base` | false | Same. |
+| `--include-logs` | false | Same. |
+| `--i-am-the-licensee URL` | none | Required with `--include-base` on a non-redistributable base. |
+
+**Destinations:**
+- `hf:<org>/<repo>` — HuggingFace Hub. Uses `$HF_TOKEN` if set. Autogenerates a `README.md` with `library_name: dlm` tag. Creates the repo if missing (your personal namespace needs no approval).
+- `https://…` — any HTTPS endpoint that accepts a POST with an `application/octet-stream` body. Sets `Authorization:` from `$DLM_SHARE_AUTH` when present (e.g. `Bearer <token>`).
+- `<local/path>` — copy the pack to a filesystem path.
+
+### `dlm pull`
+
+Download + verify + unpack a `.dlm.pack` from a remote source.
+
+```
+dlm pull <source> [--out DIR] [--force]
+```
+
+| Option | Default | Notes |
+|---|---|---|
+| `<source>` | required | `hf:<org>/<repo>`, `https://…`, `peer://host:port/<id>?token=…`, or a local path. |
+| `--out DIR` | CWD | Directory for the restored `.dlm`. |
+| `--force` | false | Overwrite an existing store with the same `dlm_id`. |
+
+Pulls always verify sha256 checksums during unpack. If a `.minisig`
+sidecar is served alongside the pack, `dlm pull` tries every key in
+`~/.dlm/trusted-keys/*.pub` — match → `verified`, no match →
+`unverified` warning (still installs, checksums are fine). No sidecar
+→ `unsigned` (still installs).
+
+### `dlm serve`
+
+Serve a `.dlm`'s pack over LAN for peers to pull.
+
+```
+dlm serve <path> [--port N] [--public --i-know-this-is-public]
+                 [--max-concurrency N] [--rate-limit N]
+                 [--token-ttl-minutes N]
+```
+
+| Option | Default | Notes |
+|---|---|---|
+| `--port N` | 7337 | Bind port. |
+| `--public` | false | Bind `0.0.0.0` **only when paired with** `--i-know-this-is-public`. Without the confirmation flag, `--public` logs a refusal and binds `127.0.0.1`. |
+| `--i-know-this-is-public` | false | Acknowledges the public bind. Meaningless without `--public`. |
+| `--max-concurrency N` | 4 | Max concurrent connections per token. Excess returns HTTP 429. |
+| `--rate-limit N` | 30 | Max requests per minute per token. |
+| `--token-ttl-minutes N` | 15 | Issued token lifetime. Ctrl-C invalidates every outstanding token instantly — the session secret lives only in the serving process. |
+
+On start, prints the `peer://` URL (with embedded token) that the
+other side pastes into `dlm pull`. Ctrl-C cleanly stops the server
+and deletes the temp pack.
+
 ### `dlm doctor`
 
 Inspect hardware + print the resolved training plan.
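
Review note: the trust-state rules in the new `dlm pull` section (sidecar present and a key matches, sidecar present and nothing matches, no sidecar) can be read as a small decision function. The sketch below is only an illustration of that rule, not the project's code: `TrustState`, `classify_trust`, and the `verifies` callback are hypothetical names, and the callback stands in for whatever real `minisign` check the tool performs.

```python
from enum import Enum
from typing import Callable, Iterable, Optional


class TrustState(Enum):
    VERIFIED = "verified"
    UNVERIFIED = "unverified"
    UNSIGNED = "unsigned"


def classify_trust(
    signature: Optional[bytes],
    trusted_keys: Iterable[str],
    verifies: Callable[[bytes, str], bool],
) -> TrustState:
    """Map the documented rules onto a trust state.

    `verifies(signature, key)` is a hypothetical stand-in for a real
    minisign check; it returns True when `key` validates `signature`.
    """
    if signature is None:
        # No .minisig sidecar was served with the pack.
        return TrustState.UNSIGNED
    if any(verifies(signature, key) for key in trusted_keys):
        # At least one trusted key validated the signature.
        return TrustState.VERIFIED
    # Sidecar present, but no trusted key matched (or none are configured).
    return TrustState.UNVERIFIED
```

In all three states the documented behaviour is that the pack still installs and sha256 checksums are still checked; the state only reports how much is known about the sender.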
docs/cookbook/sharing.md (added)
@@ -0,0 +1,188 @@
+# Sharing trained adapters
+
+Three ways to move a `.dlm.pack` between machines:
+
+| Channel | When to use | Auth story |
+|---|---|---|
+| **HuggingFace Hub** | Sharing with the world; persistent discoverability | Needs an HF account + write token; personal namespaces need no approval |
+| **Generic URL** | Uploading to your own server (S3, nginx, private bucket) | Optional `$DLM_SHARE_AUTH` header; you control the endpoint |
+| **Peer LAN** | Sending to a teammate on the same network; air-gapped labs | HMAC token, expires in 15 min, no accounts |
+
+All three produce the same artifact: a `.dlm.pack` the receiver
+unpacks via `dlm pull`. Pick the channel that matches your threat model
+and network.
+
+## Channel 1 — HuggingFace Hub
+
+```bash
+# Personal namespace: no approval required.
+dlm push mydoc.dlm --to hf:myusername/my-adapter
+
+# Output:
+# pushed: hf:myusername/my-adapter (45.32 MB)
+# install: dlm pull hf:myusername/my-adapter
+```
+
+Behind the scenes `dlm push`:
+1. Auto-packs `mydoc.dlm` → `mydoc.dlm.pack` in a temp dir.
+2. Creates the HF repo (idempotent — existing repo is reused).
+3. Uploads `adapter.dlm.pack` + autogenerated `README.md`.
+4. Tags `library_name: dlm` so HF filters surface the repo.
+
+**Auth:** HF Hub reads your token from `$HF_TOKEN` or
+`~/.cache/huggingface/token`. Run `huggingface-cli login` once if you
+haven't. The token is a write token from your own account — you're
+pushing to YOUR namespace, not claiming membership in an organization.
+
+**On the other machine:**
+```bash
+dlm pull hf:myusername/my-adapter
+# pulled: hf:myusername/my-adapter → ./my-adapter.dlm (45.32 MB)
+# unsigned (sha256 integrity still validated)
+
+dlm prompt ./my-adapter.dlm "What's in this document?"
+```
+
+## Channel 2 — Generic URL endpoint
+
+For anything that accepts an HTTPS POST with a binary body — S3
+signed URL, your own nginx, an API gateway, whatever.
+
+```bash
+# Optional bearer auth:
+export DLM_SHARE_AUTH="Bearer $MY_API_TOKEN"
+
+dlm push mydoc.dlm --to https://uploads.example.com/mydoc.dlm.pack
+```
+
+The receiver:
+```bash
+dlm pull https://uploads.example.com/mydoc.dlm.pack
+```
+
+If your endpoint needs a different auth header (Basic, custom), set
+`DLM_SHARE_AUTH` to the full header value — `dlm` copies it verbatim
+into the `Authorization:` header on both push and pull.
+
+`http://` (plaintext) works but logs a warning — use HTTPS when you
+can.
+
+## Channel 3 — Peer LAN
+
+You want to hand a `.dlm.pack` to your coworker sitting across the
+hallway, without going through the cloud.
+
+**Machine A:**
+```bash
+dlm serve ~/mydoc.dlm
+
+# serving: mydoc.dlm (dlm_id 01HZ...) on http://127.0.0.1:7337/01HZ...
+# peer URL: peer://192.168.1.42:7337/01HZ...?token=pDzfz1QwRFVUq...
+# token valid for 15 min. Ctrl-C to stop.
+```
+
+**Machine B:**
+```bash
+dlm pull peer://192.168.1.42:7337/01HZ...?token=pDzfz1QwRFVUq...
+```
+
+### Peer security posture
+
+- **Bind default is `127.0.0.1`** (loopback only). Going LAN-public
+  needs both `--public` AND `--i-know-this-is-public`:
+  ```bash
+  dlm serve mydoc.dlm --public --i-know-this-is-public
+  ```
+  Passing just `--public` without the confirmation flag logs a
+  refusal and binds loopback — safer default.
+- **HMAC tokens.** The token in the URL is `HMAC-SHA256(secret,
+  dlm_id || expiry || nonce)` where `secret` lives only in the
+  serving process's memory. Ctrl-C kills the process and every
+  outstanding token becomes unverifiable instantly — no persistent
+  key to revoke.
+- **Rate limits.** Default caps: 4 concurrent connections, 30
+  requests per minute per token. Violations return HTTP 429. Tune
+  via `--max-concurrency` and `--rate-limit`.
+- **Token lifetime.** Default 15 min. Tune via
+  `--token-ttl-minutes` if your pack is large enough that the pull
+  might cross the boundary.
+- **Connection logs only.** Metadata (IP, timestamp, status) goes
+  to stdout. Pack content bytes never hit the log stream.
+
+**Never use `--public` on a coffee-shop wifi.** It binds `0.0.0.0`
+and publishes your pack to every machine on that network until you
+Ctrl-C the server. Use a LAN you control.
+
+## Optional signing with minisign
+
+If you have [`minisign`](https://jedisct1.github.io/minisign/)
+installed (`brew install minisign` on macOS), you can sign packs so
+receivers with your public key get a `verified` marker.
+
+**One-time setup (sender):**
+```bash
+minisign -G -s ~/.dlm/minisign.key
+# Generates a keypair at ~/.dlm/minisign.key (secret) and
+# ~/.dlm/minisign.key.pub (public). Prompt for passphrase.
+```
+
+**Distribute your public key** (`~/.dlm/minisign.key.pub`) to
+receivers by any trusted channel — email it, commit to a repo,
+whatever. It's safe to share.
+
+**Sender (every push):**
+```bash
+dlm push mydoc.dlm --to hf:myusername/my-adapter --sign
+# minisign prompts for passphrase, produces <pack>.minisig sidecar,
+# uploads both.
+```
+
+**Receiver (one-time setup):**
+```bash
+mkdir -p ~/.dlm/trusted-keys
+cp /path/to/senders-key.pub ~/.dlm/trusted-keys/alice.pub
+```
+
+**Receiver (every pull):**
+```bash
+dlm pull hf:myusername/my-adapter
+# pulled: hf:... → ./my-adapter.dlm (45.32 MB)
+# verified: signature matches /Users/you/.dlm/trusted-keys/alice.pub
+```
+
+**What the trust states mean:**
+- `verified` — signature present, matched a key in your trusted-keys
+  dir. Strongest possible guarantee.
+- `unverified` — signature present but no key matched (or no keys
+  configured, or `minisign` not installed). Pack is still installed;
+  sha256 checksums are still validated. Only the sender-identity
+  claim is uncorroborated.
+- `unsigned` — no signature. Fine for casual sharing. Rely on
+  channel-level trust (HF account, URL TLS, peer LAN).
+
+## Licensing and non-redistributable bases
+
+`dlm push` refuses to upload a pack that bundles a non-redistributable
+base model (Llama 3.2, etc.) unless you acknowledge you've accepted
+the base's license in your target channel:
+
+```bash
+dlm push mydoc.dlm --to hf:myuser/my-llama-adapter \
+    --include-base \
+    --i-am-the-licensee https://huggingface.co/meta-llama/Llama-3.2-3B
+```
+
+If you didn't `--include-base`, the pack carries only the LoRA
+adapter (a few MB) and the receiver supplies their own base — no
+licensing friction on the share path. This is the default and
+typically what you want.
+
+## Pulling from a local path
+
+If someone hands you a `.dlm.pack` on a USB drive:
+
+```bash
+dlm pull /Volumes/usb/mydoc.dlm.pack --out ~/Documents/
+```
+
+Same sha256 verification, same signature detection, zero network.
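
Review note: the "HMAC tokens" bullet in the cookbook gives the construction `HMAC-SHA256(secret, dlm_id || expiry || nonce)` but not the wire format. The sketch below shows one way such a token could be issued and checked with the Python standard library. Everything beyond the formula is an assumption made for illustration: the function names, the `|` separator used for concatenation, the `expiry.nonce.tag` layout, and the base64url encoding are not taken from the commit.

```python
import base64
import hashlib
import hmac
import secrets
import time
from typing import Optional

# Session secret held only in process memory; once the process exits,
# no outstanding token can be verified any more.
SECRET = secrets.token_bytes(32)


def _tag(dlm_id: str, expiry: str, nonce: str) -> str:
    # Assumed encoding of "dlm_id || expiry || nonce": joined with "|".
    msg = "|".join((dlm_id, expiry, nonce)).encode()
    mac = hmac.new(SECRET, msg, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(mac).decode().rstrip("=")


def issue_token(dlm_id: str, ttl_minutes: int = 15,
                now: Optional[float] = None) -> str:
    now = time.time() if now is None else now
    expiry = str(int(now) + ttl_minutes * 60)
    nonce = secrets.token_hex(8)
    # Hypothetical layout: the receiver needs expiry and nonce to recompute the MAC.
    return f"{expiry}.{nonce}.{_tag(dlm_id, expiry, nonce)}"


def verify_token(token: str, dlm_id: str, now: Optional[float] = None) -> bool:
    now = time.time() if now is None else now
    try:
        expiry, nonce, tag = token.split(".")
        expires_at = int(expiry)
    except ValueError:
        return False  # malformed token
    expected = _tag(dlm_id, expiry, nonce)
    # Constant-time comparison on bytes, then the expiry check.
    return hmac.compare_digest(tag.encode(), expected.encode()) and now < expires_at
```

Because `SECRET` is regenerated on every start and never written to disk, stopping the server has the effect the cookbook describes: previously issued tokens can no longer be validated, so there is no key to revoke.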
mkdocs.yml (modified)
@@ -72,6 +72,7 @@ nav:
       - Save-to-train (--watch): cookbook/watch-mode.md
       - Metrics & observability: cookbook/metrics.md
       - Template gallery: cookbook/template-gallery.md
+      - Sharing adapters: cookbook/sharing.md
   - Architecture: architecture.md
   - Determinism: determinism.md
   - Hardware: