bensch

A POSIX shell testing framework. Pass any shell binary, get a compliance report.

Quick Start

# Test bash
./bensch --shell /bin/bash --suite posix

# Test a custom shell with interactive PTY tests
./bensch --shell /path/to/myshell --suite interactive

# Full test run with profile auto-detection
./bensch --shell /bin/zsh --suite all

# See what's available
./bensch --list-suites
./bensch --list-profiles

What It Tests

bensch runs three categories of tests against any shell binary:

Suite        Tests    What It Checks
posix        ~3,800   POSIX compliance: expansion, quoting, redirection, control flow, builtins, heredocs, job control
interactive  ~1,014   PTY behavior: line editing, history, tab completion, vi mode, signals, prompts
builtins     ~650     Individual builtin commands compared against a reference shell (bash)

How It Works

POSIX tests run each command in both the shell under test and a reference shell (default: bash), then compare outputs. This tells you exactly where your shell diverges from established behavior.
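
The comparison described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not bensch's actual runner: the names `Result`, `run_in`, and `divergences` are hypothetical, and the sketch assumes both shells accept a command string through `-c`.

```python
import subprocess
from dataclasses import dataclass


@dataclass
class Result:
    stdout: str
    exit_code: int


def run_in(shell, command, timeout=10):
    """Run `command` via `shell -c` and capture its stdout and exit code."""
    proc = subprocess.run(
        [shell, "-c", command],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    # stderr is captured but ignored here: diagnostics usually embed the
    # shell's own name, so they differ even when behavior is identical.
    return Result(proc.stdout, proc.returncode)


def divergences(command, shell, reference="bash"):
    """Return the names of the fields where `shell` differs from `reference`."""
    got = run_in(shell, command)
    want = run_in(reference, command)
    return [
        field
        for field in ("stdout", "exit_code")
        if getattr(got, field) != getattr(want, field)
    ]
```

An empty list from `divergences('echo "$((2 + 3))"', "/path/to/myshell")` would mean the shell under test agrees with the reference on that command; a result such as `["exit_code"]` points at the field that diverged.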

Interactive tests spawn the shell in a pseudo-terminal via pexpect and drive it with keystrokes (typing commands, pressing Tab, Ctrl+R, arrow keys, Ctrl+C), then verify that the terminal output matches expectations.
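
To make the PTY idea concrete, here is a rough sketch of driving a shell through a pseudo-terminal. bensch itself uses pexpect; this example deliberately swaps in the standard library's `pty` module so it runs without extra dependencies. The helper names `read_until` and `type_and_expect` are hypothetical and do not come from the bensch codebase.

```python
import os
import pty
import select
import signal
import time


def read_until(fd, marker, timeout=5.0):
    """Collect PTY output until `marker` shows up; raise if it never does."""
    deadline = time.monotonic() + timeout
    seen = b""
    while marker not in seen:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise TimeoutError(f"never saw {marker!r} in {seen!r}")
        ready, _, _ = select.select([fd], [], [], remaining)
        if not ready:
            continue
        try:
            chunk = os.read(fd, 4096)
        except OSError:  # Linux reports EIO once the shell side is closed
            chunk = b""
        if not chunk:
            raise EOFError(f"shell exited before {marker!r}; got {seen!r}")
        seen += chunk
    return seen


def type_and_expect(shell, keystrokes, marker, timeout=5.0):
    """Start `shell` on a fresh PTY, send raw keystrokes, wait for `marker`."""
    pid, fd = pty.fork()
    if pid == 0:
        # Child process: replace ourselves with the shell under test.
        try:
            os.execvp(shell, [shell])
        except OSError:
            os._exit(127)
    try:
        os.write(fd, keystrokes)
        return read_until(fd, marker, timeout)
    finally:
        os.close(fd)
        os.kill(pid, signal.SIGKILL)  # not yet reaped, so the pid is still ours
        os.waitpid(pid, 0)
```

Keystrokes are plain bytes: `b"\t"` is Tab, `b"\x03"` is Ctrl+C, and `b"\x12"` is Ctrl+R. Because the terminal echoes what is typed, the marker should be text that appears only in the command's output, for example typing `echo $((40 + 2))-ok` and waiting for `42-ok`. A real runner would also wait for the prompt before typing, which is what a profile's prompt pattern is for.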

Builtin tests exercise individual commands (cd, export, test, trap, etc.) and compare output and exit codes against the reference shell.

Shell Profiles

Each shell has a profile (profiles/*.yaml) describing its capabilities:

# profiles/bash.yaml
name: bash
prompt_pattern: '\$ '
prompt_set_command: "PS1='$ '"
mode_reset_command: "set -o emacs"
capabilities:
  readline: true
  vi_mode: true
  job_control: true
  command_completion: true

Profiles control prompt detection, session reset, rc-file disabling, and which test suites are applicable. Shells without readline (like dash) automatically skip interactive editing tests.
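
The capability-based skipping can be pictured as follows. This is only a sketch under assumptions: the profiles are written as plain dicts mirroring the YAML shape above, the dash capability values are illustrative, and the `requires` key on a test is a hypothetical field rather than bensch's actual test schema.

```python
def applicable(test, profile):
    """A test applies when every capability it requires is true in the profile."""
    caps = profile.get("capabilities", {})
    return all(caps.get(name, False) for name in test.get("requires", []))


def partition(tests, profile):
    """Split test names into (to_run, skipped) for the given shell profile."""
    to_run, skipped = [], []
    for test in tests:
        bucket = to_run if applicable(test, profile) else skipped
        bucket.append(test["name"])
    return to_run, skipped


# Hypothetical profiles; the values for dash are illustrative only.
BASH = {"name": "bash",
        "capabilities": {"readline": True, "vi_mode": True, "job_control": True}}
DASH = {"name": "dash",
        "capabilities": {"readline": False, "job_control": True}}

# Hypothetical tests, each listing the capabilities it depends on.
TESTS = [
    {"name": "echo_basic", "requires": []},
    {"name": "ctrl_r_history_search", "requires": ["readline"]},
    {"name": "vi_escape_to_normal", "requires": ["readline", "vi_mode"]},
]
```

With these inputs, `partition(TESTS, DASH)` runs only `echo_basic` and skips the two editing tests, while `partition(TESTS, BASH)` skips nothing. Treating a missing capability as false keeps a sparse profile conservative: tests are skipped rather than run against features the shell may not have.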

Available profiles: bash, zsh, dash, ksh, fortsh, generic

Requirements

  • Python 3.8+ with pip (for interactive tests)
  • bash (as reference shell for comparison tests)
  • A shell binary to test

Dependencies are installed automatically into a local .venv on first run.

Project Structure

bensch/
├── bensch                    # Entry point script
├── framework/                # Core PTY engine and test runner
│   ├── shell_pty.py          # pexpect-based PTY wrapper
│   ├── runner.py             # YAML test runner with session management
│   ├── profile.py            # Shell profile loader
│   └── utils/                # Key definitions, output matchers
├── suites/                   # Test specifications
│   ├── interactive/          # YAML PTY tests (posix, editing, history, ...)
│   ├── posix/                # Shell-script POSIX compliance tests
│   └── builtins/             # Builtin command tests (portable, extended)
├── profiles/                 # Shell capability profiles
└── docs/                     # Documentation

Compliance Matrix

Tested on macOS ARM64 (April 2026). POSIX scores are averaged across 7 test suites; the Builtins column covers 605 assertions and the Integration column covers 479.

Shell Version POSIX Builtins Integration Notes
bash 5.3 ~100% 93% 100% Reference shell
osh 0.37 93% 91% Oils — bash replacement, excellent compat
/bin/sh bash 3.2 97% macOS system shell, lacks bash 4+ features
mksh R59c 93% 87% 99% MirBSD Korn Shell
yash 2.60 92% 88% 99% Designed for strict POSIX compliance
dash 0.5.12 91% 84% 99% Debian minimal POSIX shell
zsh 5.9 89% 89% 99% Extensions cause POSIX divergence
fish 4.1 31% 86% 98% Not POSIX — different syntax entirely
rc 23% 84% Plan 9 shell, not POSIX
elvish 0.21 18% 75% Modern shell, not POSIX
ksh macOS built-in ~0% SEGFAULTS — broken system binary
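
When reading the POSIX column, it helps to remember that an average across suites is not the same as a pooled pass rate. The sketch below assumes the score is the unweighted mean of per-suite pass rates, which is one reading of "averaged across 7 test suites"; the function names and the counts are made up for illustration and are not bensch results.

```python
def pass_rate(passed, total):
    """Percentage of assertions that passed in one suite."""
    return 100.0 * passed / total


def mean_of_suites(results):
    """Unweighted mean of per-suite pass rates: each suite counts equally."""
    rates = [pass_rate(passed, total) for passed, total in results]
    return sum(rates) / len(rates)


def pooled_rate(results):
    """Pass rate over all assertions combined: large suites dominate."""
    passed = sum(p for p, _ in results)
    total = sum(t for _, t in results)
    return pass_rate(passed, total)


# Made-up (passed, total) counts for two suites of very different size.
EXAMPLE = [(900, 1000), (5, 10)]
```

For `EXAMPLE`, `mean_of_suites` gives 70.0 while `pooled_rate` gives roughly 89.6, so a small suite with many failures pulls an averaged score down much more than it would a pooled one.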

Origin

bensch was extracted from the fortsh project's test infrastructure, which achieved 1014/1014 interactive test parity across x86 Linux, ARM64 Linux, and macOS ARM64. The framework is 93% shell-agnostic by design.

License

GPLv3