The call_from_thread method only works from worker threads, not from
the main thread. Added a try/except to handle both cases: if we're on
the main thread, the function is called directly instead.
Fixes RuntimeError: 'call_from_thread' must run in different thread
The 'Chatbot mode detected' error message was confusing users - it's
internal correction logic that should be silent. It is now only logged
to the debug file, not displayed in the UI.
Key changes to make models use tools more consistently:
1. Simplified system prompts (86 lines -> ~25 lines)
- Removed confusing negative examples
- Clear, short positive examples only
- Same format for both native and ReAct modes
2. Added assistant prefilling
- On action-oriented tasks, prefill with '[' to force tool format
- Guides model to start with tool call instead of chatting
3. Lowered temperature (0.5 -> 0.3)
- More deterministic = better instruction following
4. Added few-shot examples in message history
- Shows model actual tool use conversation
- Different examples for bracket vs ReAct format
These changes address the core issue: models ignoring the system
prompt and outputting chatbot text instead of tool calls.
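The prefilling and few-shot changes amount to message assembly like the sketch below; the bracket call syntax, example contents, and `build_messages` helper are illustrative, not the actual prompt text:

```python
# Hypothetical few-shot turn showing a real tool-use exchange.
FEW_SHOT = [
    {"role": "user", "content": "Create hello.txt containing 'hi'"},
    {"role": "assistant",
     "content": "[USE write tool: file_path=hello.txt content=hi]"},
]

def build_messages(system_prompt, history, user_msg, action_task=True):
    messages = [{"role": "system", "content": system_prompt}]
    messages += FEW_SHOT            # demonstrate actual tool use
    messages += history
    messages.append({"role": "user", "content": user_msg})
    if action_task:
        # Prefill the assistant turn with '[' so generation continues
        # in the tool-call format instead of free chat.
        messages.append({"role": "assistant", "content": "["})
    return messages

OPTIONS = {"temperature": 0.3}      # lowered from 0.5 for determinism
```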
Models sometimes describe using tools instead of actually calling them,
outputting text like 'Used bash tool with command...' or 'Here is what
I did:' followed by fake tool descriptions.
- Add hallucination detection patterns in _contains_unexecuted_code()
- Update steering message to explicitly address narration problem
- Add streaming filter to hide hallucinated narrations from users
- Add complete content filter for hallucination patterns
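The narration patterns can be sketched as regexes like these; the exact pattern list in `_contains_unexecuted_code()` may differ:

```python
import re

# Illustrative patterns for "described but never executed" tool use.
NARRATION_PATTERNS = [
    re.compile(r"\bUsed \w+ tool with", re.IGNORECASE),
    re.compile(r"\bHere is what I did:", re.IGNORECASE),
    re.compile(r"\bI (?:ran|executed) the \w+ tool", re.IGNORECASE),
]

def looks_like_fake_tool_use(text: str) -> bool:
    """True when the model narrates tool use instead of calling a tool."""
    return any(p.search(text) for p in NARRATION_PATTERNS)
```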
**The Problem:**
Bracket-format tool calls like [USE write tool: ...] were shown to
users in the stream before being extracted and executed.
**Root Cause:**
on_stream_chunk() was displaying content directly without filtering.
Only complete (non-streaming) content went through safeguards.
**The Fix:**
Apply agent.safeguards.filter_stream_chunk() to streaming content
before appending to StreamingText widget.
**What This Fixes:**
- Bracket tool calls no longer visible to users
- Code blocks removed from stream in real-time
- JSON tool attempts filtered out as they're generated
- Cleaner, more professional output during streaming
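The core of the streaming filter can be sketched as a small stateful buffer; `StreamFilter` and the single bracket pattern are a simplification of the actual `filter_stream_chunk()` safeguards:

```python
import re

BRACKET_CALL = re.compile(r"\[USE [^\]]*\]")

class StreamFilter:
    """Buffers chunks so a bracket tool call split across chunk
    boundaries is still stripped before anything is displayed.
    (Simplified: only holds back text starting at a '[USE ' marker.)"""
    def __init__(self):
        self._buf = ""

    def feed(self, chunk: str) -> str:
        self._buf += chunk
        # Remove any complete bracket calls accumulated so far.
        self._buf = BRACKET_CALL.sub("", self._buf)
        # Hold back a possibly-unfinished call; emit everything else.
        start = self._buf.rfind("[USE ")
        if start != -1:
            out, self._buf = self._buf[:start], self._buf[start:]
        else:
            out, self._buf = self._buf, ""
        return out
```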
**Issues Fixed:**
1. **Approval bar never got focus**
- Set can_focus=True in __init__ and show_approval
- Approval bar can now receive keyboard events (Y, n, e)
2. **Threading deadlock**
- Changed call_later to call_from_thread (worker-safe)
- Added 300s timeout to prevent infinite hangs
- Properly hide bar after confirmation
3. **No visibility into confirmation flow**
- Added extensive debug logging to trace:
- When bar is shown/focused
- When action methods are called
- When app handlers receive events
- When futures are resolved
**What This Fixes:**
- Tool calls now actually wait for user approval
- Pressing Y/n/e now works correctly
- No more 240s hangs with no output
- Approval bar properly shows, focuses, and responds
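The worker-side approval flow can be sketched with a standard future plus timeout; `request_approval`, `show_bar`, and `hide_bar` are illustrative names, and `show_bar` is assumed to arrange (e.g. via call_from_thread) for the future to be resolved when the user presses Y/n/e:

```python
import concurrent.futures

APPROVAL_TIMEOUT = 300  # seconds; replaces the old indefinite wait

def request_approval(show_bar, hide_bar):
    """Block the worker thread until the user decides, or time out."""
    future = concurrent.futures.Future()
    show_bar(future)                # UI resolves the future on keypress
    try:
        decision = future.result(timeout=APPROVAL_TIMEOUT)
    except concurrent.futures.TimeoutError:
        decision = "deny"           # fail closed instead of hanging
    finally:
        hide_bar()                  # always hide the bar afterwards
    return decision
```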
**The Bug:**
Chatbot detection was running AFTER safeguards filtered code blocks from content.
By the time _contains_unexecuted_code() ran, the code blocks were already gone,
so detection always failed!
**The Fix:**
Check response_content (original) instead of content (filtered).
Now detects numbered steps, code blocks, and tutorial patterns correctly.
**Why This Matters:**
Without this fix, even large models like Mixtral would slip into chatbot mode
and never get corrected because the detection was blind.
**Added:**
- Debug logging to trace detection decisions
- Check original content before filtering
- Visibility into what content is being evaluated
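The corrected ordering boils down to detecting on the raw model output before filtering it for display; `process` and the simple code-block regex are stand-ins for the real safeguards pipeline:

```python
import re

CODE_BLOCK = re.compile(r"```.*?```", re.DOTALL)

def contains_unexecuted_code(text: str) -> bool:
    """Simplified detection: any fenced code block counts."""
    return bool(CODE_BLOCK.search(text))

def process(response_content: str):
    # Detect on the ORIGINAL content -- checking the filtered text
    # always failed, because filtering had already removed the blocks.
    chatbot_mode = contains_unexecuted_code(response_content)
    display = CODE_BLOCK.sub("", response_content)   # filtered for UI
    return chatbot_mode, display
```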
Enhances detection of tutorial-style responses to catch small models that give
numbered instructions instead of using tools.
**New Detection Patterns:**
- Numbered lists ('1. Open...', '2. Create...', '3. Navigate...')
- Sequenced steps (First..., Next..., Then...)
- Tutorial starters ('Open your terminal...', 'Navigate to...')
- How-to preambles ('Here's how you can...')
**Stronger Steering:**
- More explicit error message when chatbot mode is detected
- Clear examples of what NOT to do vs what TO do
- User-visible warning when auto-correction is triggered
This helps smaller models like llama3.2:3b stay in agent mode instead
of reverting to chatbot/tutorial behavior.
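Illustrative versions of the new detection patterns; the shipped list is likely broader:

```python
import re

TUTORIAL_PATTERNS = [
    # Numbered steps opening with an imperative verb.
    re.compile(r"^\s*1\.\s+(Open|Create|Navigate|Run)\b", re.MULTILINE),
    # Sequenced prose steps.
    re.compile(r"\bFirst,.*\b(Next|Then),", re.DOTALL | re.IGNORECASE),
    # Classic tutorial starters and how-to preambles.
    re.compile(r"\bOpen your terminal\b", re.IGNORECASE),
    re.compile(r"\bHere'?s how you can\b", re.IGNORECASE),
]

def is_tutorial_response(text: str) -> bool:
    return any(p.search(text) for p in TUTORIAL_PATTERNS)
```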
- fzf-style model selector modal
- Fuzzy filter models as you type
- Keyboard shortcuts: j/k or arrows for navigation
- Used by /model and /models slash commands
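The fuzzy filter can be as small as a case-insensitive subsequence match, fzf-style; `fuzzy_match` and `filter_models` are illustrative names:

```python
def fuzzy_match(query: str, candidate: str) -> bool:
    """Every query character must appear in order in the candidate."""
    it = iter(candidate.lower())
    # `ch in it` advances the iterator, so order is enforced.
    return all(ch in it for ch in query.lower())

def filter_models(query, models):
    return [m for m in models if fuzzy_match(query, m)]
```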
- /help or /h - show available commands
- /exit or /q - exit the application
- /clear or /c - clear conversation
- /model or /m - list models or switch (/model llama3.2:3b)
- /models - list available Ollama models
- Shows current model with green dot, others with dim dot
- Updates status line when model changes
- Auto-detects Native vs ReAct mode for new model
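The command set above suggests a simple dispatch table; the table and `parse_slash` helper below are a sketch, not the actual parser:

```python
# Alias -> canonical action name (handlers are hypothetical).
COMMANDS = {
    "/help": "help", "/h": "help",
    "/exit": "exit", "/q": "exit",
    "/clear": "clear", "/c": "clear",
    "/model": "model", "/m": "model",
    "/models": "models",
}

def parse_slash(line: str):
    """Split '/model llama3.2:3b' into ('model', ['llama3.2:3b']);
    return None for ordinary input or unknown commands."""
    if not line.startswith("/"):
        return None
    name, *args = line.split()
    action = COMMANDS.get(name)
    return (action, args) if action else None
```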
- Changed condition from 'old_string and new_string' to 'old_string is not None'
- This handles edits where old_string='' (inserting new content)
- Added debug logging with character counts
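The condition change in miniature; `should_apply_edit` is an illustrative wrapper around the fixed check:

```python
def should_apply_edit(old_string, new_string):
    """The old truthiness test (`if old_string and new_string`)
    rejected insert-only edits, because old_string == '' is falsy.
    Testing against None keeps them."""
    return old_string is not None and new_string is not None
```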
- For Create: show preview of content being written (first 100 chars)
- For Update: show both old and new content previews with -/+ prefixes
- Makes it clearer what's being changed without expanding full diff
- Skip xdg-open, firefox, chrome, browser commands (don't work in TUI)
- Track executed commands and skip duplicates
- Update prompts to discourage repetition and browser commands
- Streamline rules to be more focused
- Clear command tracking on history clear
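The skip-and-dedupe logic can be sketched as a small gate; `CommandGate` and the blocked-command set are illustrative:

```python
# GUI launchers that cannot work inside a terminal UI.
BLOCKED_COMMANDS = {"xdg-open", "firefox", "chrome", "browser"}

class CommandGate:
    """Tracks executed commands; rejects browser launches and repeats."""
    def __init__(self):
        self.executed = set()

    def allow(self, cmd: str) -> bool:
        parts = cmd.strip().split()
        if not parts or parts[0] in BLOCKED_COMMANDS:
            return False            # browser/GUI commands are useless in a TUI
        if cmd in self.executed:
            return False            # exact duplicate of an earlier command
        self.executed.add(cmd)
        return True

    def clear(self):
        self.executed.clear()       # reset alongside history clear
```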
- Add rule 11: NO PLACEHOLDERS - never use ... as content
- Fix examples that showed ... as placeholder
- Show complete, real content in all examples
- Applies to both native and ReAct prompts
- Avoid duplicate extraction by tracking match positions
- Expand ~ in all file paths using os.path.expanduser
- Better bash command extraction handling 'command=' prefix
- Handle model outputting 'cmd, command=cmd' format
- Use position-based IDs to prevent duplicates across patterns
- Better parsing of file_path with quotes
- Walk backward to find matching end quote for content
- Add smarter bracket patterns with lookahead for common endings
- Add debug logging for write tool extraction
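Position-based dedup across patterns can be sketched like this; the two patterns are illustrative stand-ins for the real extractor's larger set:

```python
import re

# A specific pattern first, then a looser fallback that would re-match
# the same text if duplicates were not suppressed.
PATTERNS = [
    re.compile(r"\[USE (\w+) tool: ([^\]]*)\]"),
    re.compile(r"\[(?:USE )?(\w+) tool: ([^\]]*)\]"),
]

def extract_calls(text: str):
    """Skip any match whose span overlaps a call already extracted by
    an earlier pattern; use the match position as a stable ID."""
    covered, calls = [], []
    for pat in PATTERNS:
        for m in pat.finditer(text):
            if any(m.start() < end and start < m.end()
                   for start, end in covered):
                continue
            covered.append(m.span())
            calls.append({"id": m.start(), "tool": m.group(1),
                          "args": m.group(2).strip()})
    return calls
```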
- Add MAX_EXTRACTED_ITERATIONS (3) to prevent infinite loops
- Track consecutive errors across tool executions
- Stop after 3 consecutive errors or when all batch tools fail
- Prevents model from endlessly retrying failing operations
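The loop guards combine into control flow like the sketch below; `run_tool_loop` and its callbacks are illustrative, and `execute` is assumed to return truthy on success:

```python
MAX_EXTRACTED_ITERATIONS = 3
MAX_CONSECUTIVE_ERRORS = 3

def run_tool_loop(next_batch, execute):
    """Stop after 3 extraction rounds, 3 consecutive tool failures,
    or a round in which every tool in the batch failed."""
    consecutive_errors = 0
    for _ in range(MAX_EXTRACTED_ITERATIONS):
        batch = next_batch()
        if not batch:
            break
        failed = 0
        for call in batch:
            if execute(call):
                consecutive_errors = 0   # a success resets the streak
            else:
                consecutive_errors += 1
                failed += 1
        if failed == len(batch):         # whole batch failed: give up
            break
        if consecutive_errors >= MAX_CONSECUTIVE_ERRORS:
            break
    return consecutive_errors
```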
- Add ClearStream message to adapter for hiding ugly raw tool call JSON
- Add on_clear_stream handler to remove streaming widget when raw JSON detected
- Add _streamed_content flag to track if content was shown
- Finalize previous streaming widget in on_thinking_started to prevent appending
- Show fallback content in on_response_complete when nothing was streamed