Interactive Test Expansion - Quick Reference

Current State

  • Interactive Tests: 321 tests (72.6% pass rate)
  • Non-Interactive Tests: ~656 POSIX compliance tests
  • Coverage Gap: ~335 tests (mostly edge cases and builtin testing)

What We're Missing

1. Edge Case Coverage (~200 tests)

From posix_compliance_gaps.sh (180 tests):

  • Parameter expansion edge cases (nested, complex patterns)
  • Builtin edge cases (set, shift, eval, return, break/continue)
  • Quoting and escaping complexity
  • Here-document variations (<<-, <<EOF)
  • Function scope and recursion
  • Special parameter edge cases ($@, $*, IFS interactions)
  • Redirection edge cases (<>, append with FD)

2. Interactive-Specific Depth (~150 tests)

Areas where we have basic tests but need edge cases:

  • Line Editing: Undo, macros, very long lines, Unicode edge cases
  • History: File operations, size limits, multi-line commands, sharing
  • Completion: Programmable completion, special chars, long lists, context-sensitive
  • Job Control: Multiple job specs, notification timing, wait variants

3. Cross-Feature Interactions (~100 tests)

Combinations that reveal bugs:

  • Editing + History (edit recalled command, Ctrl+R while editing)
  • Completion + Variables (complete with spaces, inside ${VAR[TAB]})
  • Job Control + Signals (Ctrl+C during completion, notification while editing)
  • Prompt + Escape Sequences (command substitution in prompt, resize behavior)

4. Error Handling & Resources (~80 tests)

  • Input edge cases (binary input, invalid UTF-8, very fast typing)
  • Output edge cases (output exceeding buffer size, control chars, broken pipe)
  • Resource limits (max history, FD exhaustion, process limits)
  • Error recovery (undefined HOME/PATH, terminal errors)

How to Expand

Quick Wins (Easiest First)

1. Use the Converter Tool (30 minutes)

```bash
# Convert simple tests from the POSIX suite
cd tests/interactive
.venv/bin/python utils/convert_posix_tests.py \
  ../../fortsh/tests/posix_compliance_test.sh \
  test_specs/posix_basic_converted.yaml

# Review and fix MANUAL_REVIEW items
# Most echo commands auto-convert well
```

Expected: ~60-80 usable tests from the 96 in posix_compliance_test.sh
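
The converter itself is not reproduced here. As a rough illustration of the idea only, the sketch below assumes a two-argument call shape, `compare_posix_output "name" "command"`, which may not match the real helper. Anything it cannot split cleanly, or that contains characters needing escaping, is emitted as a `MANUAL_REVIEW` comment. It does not fill in `expect_output`, which would still have to come from a reference run.

```bash
# Sketch only: this is not the real convert_posix_tests.py.
# Assumed (hypothetical) input shape: compare_posix_output "name" "command"
convert_calls() {
    awk -F'"' '
        /^[ \t]*compare_posix_output[ \t]/ {
            # Accept exactly two quoted strings with no $, backtick, or backslash
            if (NF == 5 && $4 !~ /[$`\\]/) {
                printf "- name: \"%s\"\n", $2
                printf "  steps:\n"
                printf "    - send_line: \"%s\"\n", $4
            } else {
                printf "# MANUAL_REVIEW: %s\n", $0
            }
        }'
}

printf '%s\n' \
    'compare_posix_output "echo hello" "echo hello"' \
    'compare_posix_output "home var" "echo $HOME"' |
    convert_calls
```

The first input line becomes a spec entry; the second is flagged because `$HOME` needs a human decision about quoting and expected output.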

2. Hand-Craft Edge Cases (2-3 hours)

Pick 30-40 edge cases from posix_compliance_gaps.sh that are interesting:

```yaml
# Example: Builtin edge cases
- name: "shift with no arguments shifts by one"
  steps:
    - send_line: "set -- a b c"
    - send_line: "shift"
    - send_line: "echo $@"
  expect_output: "b c"

- name: "shift beyond available args is error"
  steps:
    - send_line: "set -- a"
    # Kept on separate lines: a shell may abandon the rest of a line when shift fails
    - send_line: "shift 2"
    # POSIX only requires a non-zero status; "1" is what bash reports
    - send_line: "echo $?"
  expect_output: "1"

- name: "eval with empty string"
  steps:
    - send_line: "eval ''; echo $?"
  expect_output: "0"
```

3. Extend Existing Categories (1-2 hours)

Add variations to existing tests:

```yaml
# In history.yaml, add:
- name: "History with very long command (>1000 chars)"
  steps:
    - send_line: "echo <1000 char string>"
    - send_key: "Up"
  expect_output: "<verify it appears>"

# In completion.yaml, add:
- name: "Complete filename with spaces and quotes"
  steps:
    # Create the file first so the completion has something to match
    - send_line: "touch 'file with spaces.txt'"
    - send: "ls 'file with "
    - send_key: "Tab"
  expect_output: "file with spaces.txt"
```

Medium Effort (More Time Investment)

4. Create New Spec Files (3-4 hours each)

Add comprehensive coverage for specific areas:

test_specs/builtins_edge_cases.yaml:

  • All edge cases for cd, set, shift, eval, return, break, continue
  • Readonly and unset interactions
  • Alias edge cases
  • getopts comprehensive testing
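
As one example of the `getopts` behaviour such a spec would pin down, the sketch below parses a flag and an option with an argument inside a function (the function name and option letters are illustrative). Resetting `OPTIND` matters because `getopts` keeps its position between calls, which is the kind of state an interactive session can carry from one test into the next.

```bash
# Illustrative function; the option letters are arbitrary.
parse_flags() {
    OPTIND=1                 # getopts remembers its position between calls
    verbose=0
    name=
    while getopts "vn:" opt; do
        case $opt in
            v) verbose=1 ;;
            n) name=$OPTARG ;;
            *) return 2 ;;   # unknown option or missing argument
        esac
    done
    shift $((OPTIND - 1))    # drop the options that were parsed
    echo "verbose=$verbose name=$name rest=$*"
}

parse_flags -v -n bob file1 file2   # verbose=1 name=bob rest=file1 file2
parse_flags file1                   # verbose=0 name= rest=file1
```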

test_specs/parameter_expansion_advanced.yaml:

  • Nested parameter expansion
  • All pattern matching variations (%, %%, #, ##)
  • Substring operations
  • Complex default/assign/error patterns
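
The four pattern-removal operators and the default/assign forms are easy to mix up, so a quick reference run helps when writing expected outputs. This is a sketch with illustrative values and function names. Substring expansion (`${var:offset:length}`) is left out because it is not part of POSIX.

```bash
# Illustrative values; any path with several "/" and "." characters works.
demo_pattern_removal() {
    path=/usr/local/lib/libfoo.so.1
    echo "${path#*/}"     # shortest prefix removed: usr/local/lib/libfoo.so.1
    echo "${path##*/}"    # longest prefix removed:  libfoo.so.1
    echo "${path%.*}"     # shortest suffix removed: /usr/local/lib/libfoo.so
    echo "${path%%.*}"    # longest suffix removed:  /usr/local/lib/libfoo
}

demo_defaults() {
    unset x y
    echo "${x:-${y:-deep}}"   # nested default: deep (x and y stay unset)
    echo "${x:=assigned}"     # assigned (and x is now set)
    x=
    echo "[${x-unset}]"       # [] because x is set, though empty
    echo "[${x:-empty}]"      # [empty] because the colon also tests for null
}

demo_pattern_removal
demo_defaults
```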

test_specs/cross_feature_interactions.yaml:

  • Editing while history search active
  • Completion during variable expansion
  • Job control notification during prompt display
  • Signal handling during different interactive states

Tools We Created

  1. convert_posix_tests.py

    • Parses compare_posix_output calls from shell scripts
    • Generates YAML test specs
    • Marks complex cases for manual review
    • ~60-70% of tests auto-convert successfully
  2. Session Reuse Framework

    • Reuses PTY sessions (10 tests per session)
    • Automatic reset between tests
    • Handles ~300+ tests without resource exhaustion
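
The batching idea can be sketched as follows. This is not the framework's code: it only shows how a batch size of 10 maps tests onto sessions, and `sessions_needed` and `assign_sessions` are hypothetical names.

```bash
# Hypothetical helpers illustrating 10-tests-per-session batching.

# Ceiling division: how many PTY sessions a run of $1 tests needs.
sessions_needed() {
    tests=$1
    per_session=${2:-10}
    echo $(( (tests + per_session - 1) / per_session ))
}

# Label each test name read from stdin with its session number.
assign_sessions() {
    awk -v n="${1:-10}" '{ printf "session %d: %s\n", int((NR - 1) / n) + 1, $0 }'
}

sessions_needed 321                          # 33
printf '%s\n' t1 t2 t3 | assign_sessions 2   # t1, t2 in session 1; t3 in session 2
```

With the current 321 tests, a batch size of 10 means 33 sessions instead of one per test.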

Recommended Priorities

Phase 1: Foundation (Week 1) - Target: +100 tests

  1. Convert basic POSIX tests (echo, variables, simple commands)
  2. Add builtin edge cases (shift, set, eval - highest value)
  3. Extend parameter expansion tests

Outcome: 421 tests (~74% pass rate expected)

Phase 2: Interactive Depth (Week 2) - Target: +80 tests

  1. Line editing edge cases (long lines, Unicode, special chars)
  2. History edge cases (multi-line, file operations, size limits)
  3. Completion improvements (special chars, long lists)

Outcome: 501 tests (~73% pass rate expected; interactive-specific tests may reveal bugs)

Phase 3: Interactions & Edge Cases (Week 3) - Target: +70 tests

  1. Cross-feature interaction tests
  2. Job control comprehensive testing
  3. Error handling and resource limit tests

Outcome: 571 tests (~72% pass rate expected)

Phase 4: Coverage Complete (Week 4) - Target: +80 tests

  1. Remaining POSIX gaps conversions
  2. Stress and performance tests
  3. Documentation and CI integration

Outcome: 650+ tests (parity with non-interactive suite)
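
The outcome figures follow from adding each phase target to the current 321 tests. A quick check (the function name is illustrative):

```bash
# Running totals from the phase targets, starting at the current 321 tests.
phase_totals() {
    total=321
    phase=0
    for added in 100 80 70 80; do
        phase=$((phase + 1))
        total=$((total + added))
        echo "Phase $phase: +$added -> $total tests"
    done
}

phase_totals
```

The last line reports 651 tests, which matches the "650+" outcome and sits just under the ~656 non-interactive tests.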

Quick Start Example

Here's how to add 10 new tests in 15 minutes:

  1. Pick a category (e.g., "shift builtin edge cases")

  2. Create tests in existing spec or new file:

```bash
# Add to test_specs/posix.yaml or create test_specs/builtins.yaml
```

  3. Write 10 variations:

```yaml
- name: "shift no args"
  steps:
    - send_line: "set -- a b; shift; echo $@"
  expect_output: "b"

- name: "shift with count"
  steps:
    - send_line: "set -- a b c; shift 2; echo $@"
  expect_output: "c"

- name: "shift all"
  steps:
    - send_line: "set -- a; shift; echo $#"
  expect_output: "0"

# ... 7 more variations
```

  4. Run tests:

```bash
tests/interactive/.venv/bin/python tests/interactive/run_tests.py \
  --fortsh ../fortsh/bin/fortsh --spec builtins.yaml
```

  5. Fix failures and commit.

Metrics

Time Estimates:

  • Convert 100 tests: ~2 hours (with converter)
  • Hand-write 50 tests: ~3 hours
  • Review/fix converted tests: ~1 hour per 50 tests
  • Total for 500+ test expansion: ~20-25 hours

Expected Outcomes:

  • Coverage: Match non-interactive test count (650+)
  • Pass Rate: 70-75% (some new tests will reveal bugs)
  • Quality: Better edge case coverage
  • Maintenance: Easier to identify gaps

Resources

  • Expansion Plan: EXPANSION_PLAN.md (detailed strategy)
  • Converter Tool: utils/convert_posix_tests.py
  • Non-Interactive Tests: ../../fortsh/tests/posix_compliance*.sh
  • Current Tests: test_specs/*.yaml