# bensch

A POSIX shell testing framework. Pass any shell binary, get a compliance report.

## Quick Start

```bash
# Test bash
./bensch --shell /bin/bash --suite posix

# Test a custom shell with interactive PTY tests
./bensch --shell /path/to/myshell --suite interactive

# Full test run with profile auto-detection
./bensch --shell /bin/zsh --suite all

# See what's available
./bensch --list-suites
./bensch --list-profiles
```

## What It Tests

bensch runs three categories of tests against any shell binary:

| Suite | Tests | What It Checks |
|-------|-------|----------------|
| `posix` | ~3,800 | POSIX compliance: expansion, quoting, redirection, control flow, builtins, heredocs, job control |
| `interactive` | ~1,014 | PTY behavior: line editing, history, tab completion, vi mode, signals, prompts |
| `builtins` | ~650 | Individual builtin commands compared against a reference shell (bash) |

## How It Works

**POSIX tests** run each command in both the shell under test and a reference shell (default: bash), then compare outputs. This tells you exactly where your shell diverges from established behavior.

**Interactive tests** spawn the shell in a pseudo-terminal via pexpect and drive it with keystrokes (typing commands, pressing Tab, Ctrl+R, arrow keys, Ctrl+C), then verify that the terminal output matches expectations.

**Builtin tests** exercise individual commands (`cd`, `export`, `test`, `trap`, etc.) and compare output and exit codes against the reference shell.

## Shell Profiles

Each shell has a profile (`profiles/*.yaml`) describing its capabilities:

```yaml
# profiles/bash.yaml
name: bash
prompt_pattern: '\$ '
prompt_set_command: "PS1='$ '"
mode_reset_command: "set -o emacs"
capabilities:
  readline: true
  vi_mode: true
  job_control: true
  command_completion: true
```

Profiles control prompt detection, session reset, rc-file disabling, and which test suites are applicable. Shells without readline (like dash) automatically skip interactive editing tests.

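The capability-based skipping described above can be pictured with a minimal Python sketch. This is not bensch's actual implementation: the names `PROFILES`, `TESTS`, and `applicable_tests` are hypothetical, the test names are invented for illustration, and the `dash` capability values are assumptions (only the lack of readline comes from the text above). A real run would load the capabilities from `profiles/*.yaml` instead of a hard-coded dict.

```python
# Hypothetical sketch: PROFILES, TESTS, and applicable_tests are illustrative
# names and are not taken from bensch's source.

# Capabilities as they might look after loading profiles/*.yaml.
PROFILES = {
    "bash": {"readline": True, "vi_mode": True, "job_control": True,
             "command_completion": True},
    # Assumed values for a shell without readline, such as dash.
    "dash": {"readline": False, "vi_mode": False, "job_control": True,
             "command_completion": False},
}

# Each (invented) test lists the capabilities it needs; an empty set means
# the test always runs.
TESTS = {
    "echo_basic": set(),
    "ctrl_r_history_search": {"readline"},
    "vi_insert_mode": {"readline", "vi_mode"},
    "bg_fg_jobs": {"job_control"},
}


def applicable_tests(shell, profiles=PROFILES, tests=TESTS):
    """Return (run, skipped) lists of test names for the given shell profile."""
    caps = profiles[shell]
    run, skipped = [], []
    for name, needed in tests.items():
        # A test runs only if every capability it needs is true in the profile.
        if all(caps.get(cap, False) for cap in needed):
            run.append(name)
        else:
            skipped.append(name)
    return run, skipped


if __name__ == "__main__":
    for shell in PROFILES:
        run, skipped = applicable_tests(shell)
        print(f"{shell}: run={run} skipped={skipped}")
```

Declaring requirements per test, rather than hard-coding shell names in the runner, is one way to keep the tests themselves shell-agnostic: supporting a new shell then only needs a new profile.
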
Available profiles: `bash`, `zsh`, `dash`, `ksh`, `fortsh`, `generic`

## Requirements

- Python 3.8+ with pip (for interactive tests)
- bash (as reference shell for comparison tests)
- A shell binary to test

Dependencies are installed automatically into a local `.venv` on first run.

## Project Structure

```
bensch/
├── bensch              # Entry point script
├── framework/          # Core PTY engine and test runner
│   ├── shell_pty.py    # pexpect-based PTY wrapper
│   ├── runner.py       # YAML test runner with session management
│   ├── profile.py      # Shell profile loader
│   └── utils/          # Key definitions, output matchers
├── suites/             # Test specifications
│   ├── interactive/    # YAML PTY tests (posix, editing, history, ...)
│   ├── posix/          # Shell-script POSIX compliance tests
│   └── builtins/       # Builtin command tests (portable, extended)
├── profiles/           # Shell capability profiles
└── docs/               # Documentation
```

## Compliance Matrix

Tested on macOS ARM64 (April 2026). POSIX scores are averaged across 7 test suites. Builtins are 605 assertions. Integration is 479 assertions.

| Shell | Version | POSIX | Builtins | Integration | Notes |
|-------|---------|-------|----------|-------------|-------|
| **bash** | 5.3 | **~100%** | 93% | 100% | Reference shell |
| **osh** | 0.37 | **93%** | 91% | n/a | Oils: bash replacement, excellent compat |
| **/bin/sh** | bash 3.2 | **97%** | n/a | n/a | macOS system shell, lacks bash 4+ features |
| **mksh** | R59c | **93%** | 87% | 99% | MirBSD Korn Shell |
| **yash** | 2.60 | **92%** | 88% | 99% | Designed for strict POSIX compliance |
| **dash** | 0.5.12 | **91%** | 84% | 99% | Debian minimal POSIX shell |
| **zsh** | 5.9 | **89%** | 89% | 99% | Extensions cause POSIX divergence |
| **fish** | 4.1 | **31%** | 86% | 98% | Not POSIX: different syntax entirely |
| **rc** | n/a | **23%** | 84% | n/a | Plan 9 shell, not POSIX |
| **elvish** | 0.21 | **18%** | 75% | n/a | Modern shell, not POSIX |
| **ksh** | macOS built-in | **~0%** | n/a | n/a | SEGFAULTS: broken system binary |

## Origin

bensch was extracted from the [fortsh](https://github.com/FortranGoingOnForty/fortsh) project's test infrastructure, which achieved 1014/1014 interactive test parity across x86 Linux, ARM64 Linux, and macOS ARM64. The framework is 93% shell-agnostic by design.

## License

GPLv3