sway Public

trunk

Matthew Forrester Wolffe Merge pull request #32 from tenseleyFlow/sprint/36-sway-serve

f04e68b 2 weeks ago 476 Commits

Name	Last commit message	Last commit date
__init__.py	sway: scaffold standalone subproject (pyproject, LICENSE, README)	3 weeks ago
test_autogen_quality.py	tests/autogen: F07 skipped-probes rollup coverage	3 weeks ago
test_backend_api.py	tests/unit: backend_api — MockTransport coverage across all three scoring methods, retries, preflight	3 weeks ago
test_backend_dummy.py	sway(backends): DummyDifferentialBackend for unit tests	3 weeks ago
test_backend_hf_helpers.py	tests/backend_hf_helpers: unit cover _resolve_dtype + _detect_device	3 weeks ago
test_backend_instrumentation.py	tests/scoring+instrumentation: new FakeScoring.next_token_dist_batch + cached_batch coverage	2 weeks ago
test_backend_registry.py	tests/backend_registry: cover __sway_protocols__ stamp on full + minimal custom backends (B20)	3 weeks ago
test_batched_backend_s23.py	tests/batched_backend_s23: probe-level batched-path + results-equivalence + footer coverage	2 weeks ago
test_bootstrap_ci_narrowing.py	tests/unit: fix bootstrap-CI-narrowing flake — use hashlib over PYTHONHASHSEED-salted hash()	3 weeks ago
test_cli.py	tests/cli: doctor --json schema-shape snapshot (stronger-test #11)	3 weeks ago
test_cli_compare.py	tests/unit: sway compare CLI — formats, exit codes, --fail-on-regression gate	3 weeks ago
test_cli_mine.py	tests/cli_mine: bump fixture pool size to clear F04 2·top_k floor	3 weeks ago
test_cli_report_html.py	tests/unit: sway report --format html --out — file write, missing-plotly, format guards	3 weeks ago
test_cli_serve.py	Add sway serve CLI safety tests	2 weeks ago
test_cluster_kl_prove_value.py	tests/test_cluster_kl_prove_value: same sklearn-free kmeans stub	3 weeks ago
test_compare.py	tests/unit: compare — build_matrix + renderers + regression gate	3 weeks ago
test_compare_prove_value.py	tests/unit: prove-the-value — sway compare catches planted regression in committed history	3 weeks ago
test_cross_process_determinism.py	tests: subprocess determinism test — pins F08 + stronger-test #12	3 weeks ago
test_cross_verdict_consistency.py	tests: cross-verdict consistency — run/gate/junit tally agree (stronger-test #10)	3 weeks ago
test_determinism.py	tests/determinism: runner calls seed_everything before probes, populates DeterminismReport	3 weeks ago
test_divergence.py	tests/divergence: degenerate uniform TokenDist rejection (stronger-test #9)	3 weeks ago
test_dlm_bridge.py	tests: ruff N818/PT018 fixups on the F06 test additions	2 weeks ago
test_dlm_not_imported.py	tests/dlm_not_imported: dummy suite + autogen clean-error when dlm absent (C7)	3 weeks ago
test_errors.py	sway(core): exception hierarchy	3 weeks ago
test_ext_ppl_vs_calibration_drift.py	tests/unit: prove-the-value — diffuse forgetting splits external_perplexity vs calibration_drift verdicts	3 weeks ago
test_golden_comparator.py	tests/golden_comparator: update tolerance boundaries for 1e-4 default	3 weeks ago
test_mlx_convert.py	tests/mlx_convert: pytest.importorskip safetensors so fast lane (no [hf]) skips cleanly	2 weeks ago
test_model.py	tests/model: cover dtype/endpoint/trust_remote_code branches (DC5)	3 weeks ago
test_no_dead_options.py	tests/no_dead_options: meta-guard against P14/P15 regression (documented-but-unused)	3 weeks ago
test_null_cache.py	probes/null_adapter: on-disk cache keyed by backend identity + calibration params	3 weeks ago
test_null_calibration.py	tests/null_calibration: assert runs=1 is flagged degenerate + refused by z_score (F02)	3 weeks ago
test_null_multi_rank.py	tests/unit: multi-rank null calibration — rank_scale semantics, z-profile emission, prove-the-value rank saturation	3 weeks ago
test_outlier_miner.py	tests/outlier_miner: regress-test small-pool guard + actionable hint (F04)	3 weeks ago
test_pack_unpack.py	tests/pack_unpack: 17 unit tests round-tripping spec + dlm_source + golden + every error path (S26 X3-P6)	2 weeks ago
test_paraphrase_miner.py	tests: paraphrase_miner — ranker + diversity filter + input validation (S17.5)	3 weeks ago
test_paraphrase_miner_prove_value.py	tests: prove-value — mined paraphrases flip memorizing adapter PASS→FAIL (S17.6)	3 weeks ago
test_preflight_check.py	backends: PreflightCheckable protocol + finite-check on HF and dummy	3 weeks ago
test_probe_adapter_ablation.py	tests/adapter_ablation: probe-level saturation reason coverage (C8, B3 test side)	3 weeks ago
test_probe_adapter_revert.py	sway(probes): A2 adapter_revert via sentence embeddings	3 weeks ago
test_probe_base.py	tests/probe_base: cover validate_all_probes — multi-error collection + index-label fallback (B7)	3 weeks ago
test_probe_calibration_drift.py	tests/calibration_drift: split compound asserts (PT018)	3 weeks ago
test_probe_cluster_kl.py	tests/cluster_kl: perturb _dist_broad fixture to clear uniformity guard	3 weeks ago
test_probe_delta_kl.py	tests: delta_kl NaN-routes-to-error at both probe and runner levels (B1 regression)	3 weeks ago
test_probe_external_perplexity.py	tests/ext_ppl: assert runner threads null_stats even when degenerate (F02)	3 weeks ago
test_probe_gradient_ghost.py	probes/gradient_ghost: min-baseline ratio + 17 unit tests covering the verdict ladder (S25 P7)	2 weeks ago
test_probe_leakage.py	tests/leakage: cover the 4 new perturbations + expanded fixture canned-responses (B11)	3 weeks ago
test_probe_multi_turn_coherence.py	tests/unit: 22 tests for multi_turn_coherence probe + curve-fit math	2 weeks ago
test_probe_paraphrase_invariance.py	sway(probes): B2 paraphrase_invariance with intent-aware pass rule	3 weeks ago
test_probe_preference_flip.py	tests/preference_flip: cover one-bad-triple and all-fail paths (B14)	3 weeks ago
test_probe_prompt_collapse.py	tests/prompt_collapse: cover tokenizer path + legacy fallback + spec opt-out (B13)	3 weeks ago
test_probe_section_internalization.py	sway(probes): B1 section_internalization (flagship per-section attribution)	3 weeks ago
test_probe_style_fingerprint.py	probes/style_fingerprint: detect zero-fp ft as ERROR; replace cosine with projection (B4)	3 weeks ago
test_probe_tool_use_fidelity.py	tests/unit: 40 tests covering tool_use_fidelity probe + parsers + schema check	2 weeks ago
test_probe_training_drift.py	tests/unit: 30 tests for training_drift probe + helpers + real-fixture parse	2 weeks ago
test_pytest_plugin.py	tests/unit: pytest_plugin via pytester — expansion, verdict routing, gate, error paths, cache reuse	3 weeks ago
test_report_extras_rollup.py	tests/report: degenerate null rollup coverage (F02)	3 weeks ago
test_report_formatters.py	tests: add D3/D4/D6/D7/D10/D11/D12 coverage — formatters, extras rollup, CLI surfaces	3 weeks ago
test_report_html.py	tests/unit: report_html — renderer + panel divs + snapshot + missing-plotly hint	3 weeks ago
test_report_html_offline.py	tests/unit: prove-the-value — HTML from committed history fixture loads offline, no external URLs	3 weeks ago
test_report_snapshot.py	tests/report_snapshot: fixture probe carries ci_95 — locks F01 path	3 weeks ago
test_result.py	sway(core): ProbeResult, SuiteResult, SwayScore, Verdict	3 weeks ago
test_runner_backend_stats.py	tests/runner: trace writer ↔ analyzer round-trip regression (F09)	3 weeks ago
test_safe_finalize.py	core: safe_finalize helper — non-finite critical fields → Verdict.ERROR	3 weeks ago
test_score_weights_override.py	tests/score_weights: spec field validation + CLI parser + composite override	3 weeks ago
test_scoring.py	tests/scoring+instrumentation: new FakeScoring.next_token_dist_batch + cached_batch coverage	2 weeks ago
test_sections.py	sway(core): Section / SectionProbe / SectionPreference dataclasses	3 weeks ago
test_serve_app.py	Apply ruff format	2 weeks ago
test_serve_cache.py	Add BackendCache unit tests	2 weeks ago
test_serve_client.py	Add ServeClient unit tests	2 weeks ago
test_stats.py	tests/unit: fix type-narrowing and unused-import in stats tests	3 weeks ago
test_style_fingerprint_extended.py	tests/style_fingerprint: extended fingerprint fallback + extended=on SKIP path	3 weeks ago
test_suite_runner.py	probes/null_adapter: per-kind calibration matrix (fixes P02, B2, C9)	3 weeks ago
test_suite_score_report.py	rename CLI + source references to sway; keep dlm-sway as the PyPI wheel name	3 weeks ago
test_suite_spec.py	tests: cover ModelSpec.adapter normalization (tilde, relative, None) + update yaml-roundtrip assertion (B22)	3 weeks ago
test_trace_analysis.py	tests/unit: trace_analysis + trace_cmd CLI coverage (22 tests)	3 weeks ago
test_two_model_differential.py	tests/two_model: concurrency flag composition regression (F06)	3 weeks ago
test_visualize.py	sway(viz): matplotlib plots for SIS, adapter ablation, KL histogram (viz extra)	3 weeks ago
test_zscore_helpers.py	tests/zscore_helpers: degenerate flag rejects valid-floored std (F02)	3 weeks ago
test_zscore_threading.py	tests/zscore_threading: reformat	3 weeks ago
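
The commit message on test_bootstrap_ci_narrowing.py refers to replacing the built-in `hash()` with `hashlib` to fix a flake. Python salts `hash()` for strings per process unless `PYTHONHASHSEED` is pinned, so a seed derived from it changes between runs. The sketch below illustrates the general idea only; `stable_seed` and `bootstrap_mean_ci` are hypothetical names and are not taken from the sway codebase.

```python
import hashlib
import random


def stable_seed(label: str) -> int:
    """Derive a 32-bit seed from a string, identical across interpreter runs.

    Hypothetical helper: hashlib digests are not salted per process,
    unlike the built-in hash() on str.
    """
    digest = hashlib.sha256(label.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def bootstrap_mean_ci(values, label, n_resamples=1000, alpha=0.05):
    """Percentile bootstrap CI for the mean, seeded from a stable label.

    Hypothetical illustration; the real test may compute its CI differently.
    """
    if not values:
        raise ValueError("values must be non-empty")
    rng = random.Random(stable_seed(label))
    n = len(values)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        sum(rng.choices(values, k=n)) / n for _ in range(n_resamples)
    )
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi


if __name__ == "__main__":
    print(bootstrap_mean_ci([0.1, 0.4, 0.35, 0.8, 0.2], "probe:delta_kl"))
```

Because the seed depends only on the label text, two separate processes running this sketch produce the same interval, which is the property a cross-process determinism test would need.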

About

Differential testing for fine-tuned causal LMs — did LoRA/QLoRA training actually change behavior, or is the model defaulting to base?
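
As a rough illustration of what a differential check between a base model and its fine-tuned counterpart can look like, the sketch below averages the KL divergence between the two models' next-token distributions over a set of prompts. It is a minimal sketch under stated assumptions, not sway's implementation: `kl_divergence`, `mean_delta_kl`, and the toy distributions are hypothetical, and real distributions would come from a model backend.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats for two discrete distributions over the same vocabulary."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    # Terms with p_i == 0 contribute nothing; q_i is floored to avoid log(0).
    return sum(
        pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0
    )


def mean_delta_kl(base_dists, ft_dists):
    """Average KL(fine-tuned || base) across prompts.

    Hypothetical helper: a value near zero suggests the fine-tuned model
    behaves like the base model on these prompts.
    """
    if not base_dists or len(base_dists) != len(ft_dists):
        raise ValueError("need one base and one fine-tuned distribution per prompt")
    kls = [kl_divergence(ft, base) for base, ft in zip(base_dists, ft_dists)]
    return sum(kls) / len(kls)


if __name__ == "__main__":
    # Toy next-token distributions over a 3-token vocabulary (made-up numbers).
    base = [[0.7, 0.2, 0.1], [0.5, 0.3, 0.2]]
    tuned = [[0.2, 0.7, 0.1], [0.5, 0.3, 0.2]]
    print(mean_delta_kl(base, tuned))
```

A raw divergence is hard to interpret on its own, so in practice it would be compared against some reference level; the threshold and calibration scheme are left out here because the page does not specify them.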

Contributors 3

espadonne mfwolffe Matthew Forrester Wolffe