Commits

trunk

Commits on January 2, 2026

  1. fix: handle thread context in approval bar
    The call_from_thread method only works from worker threads, not from
    the main thread. Added try/except to handle both cases: if we're on
    the main thread, call the function directly instead.
    
    Fixes RuntimeError: 'call_from_thread' must run in different thread
    mfwolffe committed
  2. fix: hide internal chatbot-mode steering from users
    The 'Chatbot mode detected' error message was confusing users. It is
    internal correction logic that should be silent, so it is now logged
    only to the debug file and no longer displayed in the UI.
    mfwolffe committed
  3. fix: improve agent tool use reliability
    Key changes to make models use tools more consistently:
    
    1. Simplified system prompts (86 lines -> ~25 lines)
       - Removed confusing negative examples
       - Clear, short positive examples only
       - Same format for both native and ReAct modes
    
    2. Added assistant prefilling
       - On action-oriented tasks, prefill with '[' to force tool format
       - Guides model to start with tool call instead of chatting
    
    3. Lowered temperature (0.5 -> 0.3)
       - More deterministic sampling, intended to improve instruction following
    
    4. Added few-shot examples in message history
       - Shows model actual tool use conversation
       - Different examples for bracket vs ReAct format
    
    These changes address the core issue: models ignoring the system
    prompt and outputting chatbot text instead of tool calls.
    mfwolffe committed
  4. fix: detect and filter hallucinated tool use narrations
    Models sometimes describe using tools instead of actually calling them,
    outputting text like 'Used bash tool with command...' or 'Here is what
    I did:' followed by fake tool descriptions.
    
    - Add hallucination detection patterns in _contains_unexecuted_code()
    - Update steering message to explicitly address narration problem
    - Add streaming filter to hide hallucinated narrations from users
    - Add complete content filter for hallucination patterns
    mfwolffe committed
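
The narration filtering described in the last commit above could look roughly like the following. This is a minimal sketch, not the repository's code: the pattern list and the helper names `looks_like_narration` and `strip_narration_lines` are assumptions made for illustration, and the real checks in `_contains_unexecuted_code()` are not shown in this log.

```python
import re

# Hypothetical patterns, modeled on the two examples quoted in the commit
# message ('Used bash tool with command...' and 'Here is what I did:').
NARRATION_PATTERNS = [
    re.compile(r"^\s*used\s+(?:the\s+)?\w+\s+tool\b", re.IGNORECASE | re.MULTILINE),
    re.compile(r"^\s*here\s+is\s+what\s+i\s+did\s*:", re.IGNORECASE | re.MULTILINE),
]


def looks_like_narration(text: str) -> bool:
    """Return True if any line of the text narrates a tool call instead of making one."""
    return any(pattern.search(text) for pattern in NARRATION_PATTERNS)


def strip_narration_lines(text: str) -> str:
    """Drop lines that match a narration pattern and keep everything else."""
    kept = [line for line in text.splitlines() if not looks_like_narration(line)]
    return "\n".join(kept)
```

A line-based filter like this suits complete content. A streaming filter would also need to buffer partial lines, which the sketch leaves out.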

Commits on January 1, 2026

  1. fix: filter streaming content to hide bracket tool calls
    **The Problem:**
    Bracket-format tool calls like [USE write tool: ...] were showing
    to users in the stream before being extracted and executed.
    
    **Root Cause:**
    on_stream_chunk() was displaying content directly without filtering.
    Only complete (non-streaming) content went through safeguards.
    
    **The Fix:**
    Apply agent.safeguards.filter_stream_chunk() to streaming content
    before appending to StreamingText widget.
    
    **What This Fixes:**
    - Bracket tool calls no longer visible to users
    - Code blocks removed from stream in real-time
    - JSON tool attempts filtered out as they're generated
    - Cleaner, more professional output during streaming
    espadonne committed
  2. fix: approval bar focus and threading issues
    **Issues Fixed:**
    
    1. **Approval bar never got focus**
       - Set can_focus=True in __init__ and show_approval
       - Approval bar can now receive keyboard events (Y, n, e)
    
    2. **Threading deadlock**
       - Changed call_later to call_from_thread (worker-safe)
       - Added 300s timeout to prevent infinite hangs
       - Properly hide bar after confirmation
    
    3. **No visibility into confirmation flow**
       - Added extensive debug logging to trace:
         - When bar is shown/focused
         - When action methods are called
         - When app handlers receive events
         - When futures are resolved
    
    **What This Fixes:**
    - Tool calls now actually wait for user approval
    - Pressing Y/n/e now works correctly
    - No more 240s hangs with no output
    - Approval bar properly shows, focuses, and responds
    espadonne committed
  3. fix(critical): chatbot detection was checking filtered content
    **The Bug:**
    Chatbot detection was running AFTER safeguards filtered code blocks from content.
    By the time _contains_unexecuted_code() ran, the code blocks were already gone,
    so detection always failed!
    
    **The Fix:**
    Check response_content (original) instead of content (filtered).
    Now detects numbered steps, code blocks, and tutorial patterns correctly.
    
    **Why This Matters:**
    Without this fix, even large models like Mixtral would slip into chatbot mode
    and never get corrected because the detection was blind.
    
    **Added:**
    - Debug logging to trace detection decisions
    - Check original content before filtering
    - Visibility into what content is being evaluated
    espadonne committed
  4. fix: improve chatbot mode detection and recovery
    Enhances detection of tutorial-style responses to catch small models that give
    numbered instructions instead of using tools.
    
    **New Detection Patterns:**
    - Numbered lists (1., 2., 3. Open..., Create..., Navigate...)
    - Sequenced steps (First..., Next..., Then...)
    - Tutorial starters ('Open your terminal...', 'Navigate to...')
    - How-to preambles ('Here's how you can...')
    
    **Stronger Steering:**
    - More explicit error message when chatbot mode is detected
    - Clear examples of what NOT to do vs what TO do
    - User-visible warning when auto-correction is triggered
    
    This helps smaller models like llama3.2:3b stay in agent mode instead
    of reverting to chatbot/tutorial behavior.
    espadonne committed
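
The detection patterns listed in the last commit above (numbered lists, sequenced steps, tutorial starters, how-to preambles) can be approximated with a few regular expressions. The sketch below is an assumption about how such a check might be written. `TUTORIAL_PATTERNS` and `looks_like_tutorial` are hypothetical names, and the argument is named `response_content` only to echo the earlier fix that runs detection on the original, unfiltered text.

```python
import re

# Hypothetical patterns, one per category named in the commit message.
TUTORIAL_PATTERNS = [
    # Numbered lists: "1. Open...", "2. Create...", "3. Navigate..."
    re.compile(r"^\s*\d+\.\s+(?:open|create|navigate)\b", re.IGNORECASE | re.MULTILINE),
    # Sequenced steps: "First...", "Next...", "Then..."
    re.compile(r"^\s*(?:first|next|then)\b[,:]?\s", re.IGNORECASE | re.MULTILINE),
    # Tutorial starters: "Open your terminal...", "Navigate to..."
    re.compile(r"^\s*(?:open your terminal|navigate to)\b", re.IGNORECASE | re.MULTILINE),
    # How-to preambles: "Here's how you can..."
    re.compile(r"\bhere'?s how you can\b", re.IGNORECASE),
]


def looks_like_tutorial(response_content: str) -> bool:
    """Return True if the unfiltered model response reads like step-by-step instructions."""
    return any(pattern.search(response_content) for pattern in TUTORIAL_PATTERNS)
```

Patterns this broad will produce false positives on ordinary prose that starts a line with "First" or "Then", so a real implementation would likely combine them with other signals, such as the absence of any tool call.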

Commits on December 30, 2025

  1. feat: add model selection modal with fuzzy filtering
    - fzf-style model selector modal
    - Fuzzy filter models as you type
    - Keyboard shortcuts: j/k or arrows for navigation
    - Used by /model and /models slash commands
    espadonne committed
  2. feat: add approval bar and slash command suggestions
    **Approval Bar (Claude Code style):**
    - Inline approval bar above input for confirmations
    - Shows: [tool] command_preview    [Y]es [n]o [e]dit
    - Press 'Y' to approve, 'n' to reject, 'e' to edit
    - Edit puts command in input field for modification
    - Replaces modal dialog with cleaner inline UX
    
    **Slash Command Suggestions:**
    - Shadow text suggestions for /help, /model, /clear, /exit
    - Shows completion in placeholder: "help  (Tab to complete)"
    - Press Tab to accept suggestion
    
    **Other Improvements:**
    - Fix "Show full output" toggle visibility
    - Hide completion check messages from users
    - Refocus input after rejection
    espadonne committed
  3. feat: add runtime safeguards to improve agent behavior
    Implements comprehensive runtime safeguards to help smaller models stay on track:
    
    **Content Filtering:**
    - Filter code blocks, bracket tool calls, preambles from stream
    - Filter raw JSON tool calls from output
    - Filter internal recovery/steering prompts from user display
    
    **Loop Detection:**
    - Action loop detection (glob → write → glob patterns)
    - Text loop detection (repeated responses)
    - Aggressive phrase-based repetition detection
    
    **Pre-Action Validation:**
    - Block dangerous commands (rm -rf /, fork bombs, etc.)
    - Validate empty arguments, invalid paths
    - Block interactive tools (vim, nano, less)
    - Prevent writes to system directories
    
    **Deduplication:**
    - Track files created, commands run, edits made
    - Skip duplicate tool calls automatically
    
    **Improvements:**
    - Hide "Task not complete yet" messages from users
    - Fix completion check false positives for simple tasks
    - Support both "parameters" and "arguments" in JSON extraction
    - More conservative completion detection for web/design tasks
    espadonne committed
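
As an illustration of the pre-action validation listed in the safeguards commit above, the sketch below rejects empty commands, a recursive delete of `/`, the classic shell fork bomb, and the interactive tools the commit names. It is a sketch under stated assumptions: `validate_command` and its helper are hypothetical, the repository's actual rules are not shown here, and the checks do not cover `sudo`, pipes, or command chaining.

```python
import shlex
from typing import List, Tuple

# Interactive tools named in the commit message.
INTERACTIVE_TOOLS = {"vim", "nano", "less"}
# The classic bash fork bomb with all whitespace removed.
FORK_BOMB = ":(){:|:&};:"


def _is_recursive_root_delete(tokens: List[str]) -> bool:
    """True for commands such as 'rm -rf /' (recursive, forced, targeting the root)."""
    if not tokens or tokens[0] != "rm":
        return False
    flags = "".join(t.lstrip("-") for t in tokens[1:] if t.startswith("-"))
    targets = [t for t in tokens[1:] if not t.startswith("-")]
    return "r" in flags and "f" in flags and "/" in targets


def validate_command(command: str) -> Tuple[bool, str]:
    """Return (allowed, reason) for a shell command before it is executed."""
    if not command.strip():
        return False, "empty command"
    if FORK_BOMB in "".join(command.split()):
        return False, "fork bomb"
    try:
        tokens = shlex.split(command)
    except ValueError:
        return False, "unparseable command"
    if _is_recursive_root_delete(tokens):
        return False, "recursive delete of /"
    if tokens and tokens[0] in INTERACTIVE_TOOLS:
        return False, "interactive tool"
    return True, "ok"
```

Checking parsed tokens rather than raw substrings lets `rm -rf /tmp/build` pass while `rm -rf /` is blocked.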

Commits on December 29, 2025

  1. fix: use Textual worker for async model listing
    - Fixes 'event loop already running' error
    - Uses run_worker() to properly schedule async code in Textual
    espadonne committed
  2. feat: add slash commands for /exit, /model, /help, /clear
    - /help or /h - show available commands
    - /exit or /q - exit the application
    - /clear or /c - clear conversation
    - /model or /m - list models or switch (/model llama3.2:3b)
    - /models - list available Ollama models
    - Shows current model with green dot, others with dim dot
    - Updates status line when model changes
    - Auto-detects Native vs ReAct mode for new model
    espadonne committed
  3. fix: show DiffWidget for edits even when old_string is empty
    - Changed condition from 'old_string and new_string' to 'old_string is not None'
    - This handles edits where old_string='' (inserting new content)
    - Added debug logging with character counts
    espadonne committed
  4. ui: improve DiffWidget to show content previews
    - For Create: show preview of content being written (first 100 chars)
    - For Update: show both old and new content previews with -/+ prefixes
    - Makes it clearer what's being changed without expanding full diff
    espadonne committed
  5. fix: discourage code block previews in prompts
    - Add rule: no 'the file will look like:' followed by code
    - Tool results show changes automatically
    - Streamlined rule wording
    espadonne committed
  6. fix: prevent repetitive commands and skip browser commands
    - Skip xdg-open, firefox, chrome, browser commands (they don't work in a TUI)
    - Track executed commands and skip duplicates
    - Update prompts to discourage repetition and browser commands
    - Streamline rules to be more focused
    - Clear command tracking on history clear
    espadonne committed
  7. fix: update prompts to prohibit placeholder content like '...'
    - Add rule 11: NO PLACEHOLDERS - never use ... as content
    - Fix examples that showed ... as placeholder
    - Show complete, real content in all examples
    - Applies to both native and ReAct prompts
    espadonne committed
  8. fix: improve bracket-format tool call extraction
    - Avoid duplicate extraction by tracking match positions
    - Expand ~ in all file paths using os.path.expanduser
    - Better bash command extraction handling 'command=' prefix
    - Handle model outputting 'cmd, command=cmd' format
    - Use position-based IDs to prevent duplicates across patterns
    espadonne committed
  9. espadonne committed
  10. fix: improve content extraction for bracket-format write tool calls
    - Better parsing of file_path with quotes
    - Walk backward to find matching end quote for content
    - Add smarter bracket patterns with lookahead for common endings
    - Add debug logging for write tool extraction
    espadonne committed
  11. fix: limit extracted tool call iterations and stop on consecutive errors
    - Add MAX_EXTRACTED_ITERATIONS (3) to prevent infinite loops
    - Track consecutive errors across tool executions
    - Stop after 3 consecutive errors or when all batch tools fail
    - Prevents model from endlessly retrying failing operations
    espadonne committed
  12. feat: improve agent conversation flow and handle non-standard tool calls
    Multi-turn conversation improvements:
    - Add _current_task field to track original task across turns
    - Pass original_task to _run_inner for better completion detection
    - Add empty response handling with progressive retry prompts (5 retries)
    - Clear _current_task on conversation clear
    
    Raw tool call extraction:
    - Add _extract_raw_json_tool_calls() to parse tool calls from model text
    - Support JSON format: {"name": "bash", "parameters": {...}}
    - Support bracket format: [calls bash tool with: ...], [USE write tool: ...]
    - Emit clear_stream event before showing extracted tool widgets
    - Execute extracted tool calls properly
    
    Other improvements:
    - Add follow-up question after task completion
    - Detect bracket patterns in _contains_unexecuted_code()
    espadonne committed
  13. fix: UI streaming improvements for multi-turn conversations
    - Add ClearStream message to adapter for hiding ugly raw tool call JSON
    - Add on_clear_stream handler to remove streaming widget when raw JSON detected
    - Add _streamed_content flag to track if content was shown
    - Finalize previous streaming widget in on_thinking_started to prevent appending
    - Show fallback content in on_response_complete when nothing was streamed
    espadonne committed
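
The raw JSON tool call extraction described in the "improve agent conversation flow" commit above can be sketched with the standard library alone. The function below is a hypothetical stand-in for `_extract_raw_json_tool_calls()`, whose real implementation is not shown in this log. It scans model text for JSON objects shaped like `{"name": ..., "parameters": {...}}`, accepts `"arguments"` as an alternative key (as a later commit mentions), and records each match position so that duplicates could be filtered. The bracket format is not handled here.

```python
import json
from typing import Any, Dict, List


def extract_json_tool_calls(text: str) -> List[Dict[str, Any]]:
    """Find JSON objects embedded in free text that look like tool calls."""
    decoder = json.JSONDecoder()
    calls: List[Dict[str, Any]] = []
    pos = 0
    while True:
        start = text.find("{", pos)
        if start == -1:
            break
        try:
            # raw_decode parses one JSON value starting at `start` and
            # returns it together with the index where the value ends.
            obj, end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            pos = start + 1  # not valid JSON here, try the next brace
            continue
        args = None
        if isinstance(obj, dict) and isinstance(obj.get("name"), str):
            args = obj.get("parameters", obj.get("arguments", {}))
        if isinstance(args, dict):
            calls.append({"name": obj["name"], "arguments": args, "position": start})
            pos = end  # skip past the object that was just consumed
        else:
            pos = start + 1  # not a tool call, keep scanning inside it
    return calls
```

For example, `extract_json_tool_calls('{"name": "bash", "parameters": {"command": "ls"}}')` yields a single call named `bash` at position 0.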