The call_from_thread method only works from worker threads, not from
the main thread. Added a try/except to handle both cases: if we're on
the main thread, the function is called directly instead.
Fixes RuntimeError: 'call_from_thread' must run in different thread
The 'Chatbot mode detected' error message was confusing users - it's
internal correction logic that should be silent. It is now only logged
to the debug file, not displayed in the UI.
Key changes to make models use tools more consistently:
1. Simplified system prompts (86 lines -> ~25 lines)
- Removed confusing negative examples
- Clear, short positive examples only
- Same format for both native and ReAct modes
2. Added assistant prefilling
- On action-oriented tasks, prefill with '[' to force tool format
- Guides model to start with tool call instead of chatting
3. Lowered temperature (0.5 -> 0.3)
- More deterministic = better instruction following
4. Added few-shot examples in message history
- Shows model actual tool use conversation
- Different examples for bracket vs ReAct format
These changes address the core issue: models ignoring the system
prompt and outputting chatbot text instead of tool calls.
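The prefilling and few-shot changes amount to message assembly like the sketch below; the bracket call syntax, example contents, and `build_messages` helper are illustrative, not the actual prompt text:

```python
# Hypothetical few-shot turn showing a real tool-use exchange.
FEW_SHOT = [
    {"role": "user", "content": "Create hello.txt containing 'hi'"},
    {"role": "assistant",
     "content": "[USE write tool: file_path=hello.txt content=hi]"},
]

def build_messages(system_prompt, history, user_msg, action_task=True):
    messages = [{"role": "system", "content": system_prompt}]
    messages += FEW_SHOT            # demonstrate actual tool use
    messages += history
    messages.append({"role": "user", "content": user_msg})
    if action_task:
        # Prefill the assistant turn with '[' so generation continues
        # in the tool-call format instead of free chat.
        messages.append({"role": "assistant", "content": "["})
    return messages

OPTIONS = {"temperature": 0.3}      # lowered from 0.5 for determinism
```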
Models sometimes describe using tools instead of actually calling them,
outputting text like 'Used bash tool with command...' or 'Here is what
I did:' followed by fake tool descriptions.
- Add hallucination detection patterns in _contains_unexecuted_code()
- Update steering message to explicitly address narration problem
- Add streaming filter to hide hallucinated narrations from users
- Add complete content filter for hallucination patterns
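The narration patterns can be sketched as regexes like these; the exact pattern list in `_contains_unexecuted_code()` may differ:

```python
import re

# Illustrative patterns for "described but never executed" tool use.
NARRATION_PATTERNS = [
    re.compile(r"\bUsed \w+ tool with", re.IGNORECASE),
    re.compile(r"\bHere is what I did:", re.IGNORECASE),
    re.compile(r"\bI (?:ran|executed) the \w+ tool", re.IGNORECASE),
]

def looks_like_fake_tool_use(text: str) -> bool:
    """True when the model narrates tool use instead of calling a tool."""
    return any(p.search(text) for p in NARRATION_PATTERNS)
```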
**The Problem:**
Bracket-format tool calls like [USE write tool: ...] were shown to
users in the stream before being extracted and executed.
**Root Cause:**
on_stream_chunk() was displaying content directly without filtering.
Only complete (non-streaming) content went through safeguards.
**The Fix:**
Apply agent.safeguards.filter_stream_chunk() to streaming content
before appending to StreamingText widget.
**What This Fixes:**
- Bracket tool calls no longer visible to users
- Code blocks removed from stream in real-time
- JSON tool attempts filtered out as they're generated
- Cleaner, more professional output during streaming
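The core of the streaming filter can be sketched as a small stateful buffer; `StreamFilter` and the single bracket pattern are a simplification of the actual `filter_stream_chunk()` safeguards:

```python
import re

BRACKET_CALL = re.compile(r"\[USE [^\]]*\]")

class StreamFilter:
    """Buffers chunks so a bracket tool call split across chunk
    boundaries is still stripped before anything is displayed.
    (Simplified: only holds back text starting at a '[USE ' marker.)"""
    def __init__(self):
        self._buf = ""

    def feed(self, chunk: str) -> str:
        self._buf += chunk
        # Remove any complete bracket calls accumulated so far.
        self._buf = BRACKET_CALL.sub("", self._buf)
        # Hold back a possibly-unfinished call; emit everything else.
        start = self._buf.rfind("[USE ")
        if start != -1:
            out, self._buf = self._buf[:start], self._buf[start:]
        else:
            out, self._buf = self._buf, ""
        return out
```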
**Issues Fixed:**
1. **Approval bar never got focus**
- Set can_focus=True in __init__ and show_approval
- Approval bar can now receive keyboard events (Y, n, e)
2. **Threading deadlock**
- Changed call_later to call_from_thread (worker-safe)
- Added 300s timeout to prevent infinite hangs
- Properly hide bar after confirmation
3. **No visibility into confirmation flow**
- Added extensive debug logging to trace:
- When bar is shown/focused
- When action methods are called
- When app handlers receive events
- When futures are resolved
**What This Fixes:**
- Tool calls now actually wait for user approval
- Pressing Y/n/e now works correctly
- No more 240s hangs with no output
- Approval bar properly shows, focuses, and responds
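The worker-side approval flow can be sketched with a standard future plus timeout; `request_approval`, `show_bar`, and `hide_bar` are illustrative names, and `show_bar` is assumed to arrange (e.g. via call_from_thread) for the future to be resolved when the user presses Y/n/e:

```python
import concurrent.futures

APPROVAL_TIMEOUT = 300  # seconds; replaces the old indefinite wait

def request_approval(show_bar, hide_bar):
    """Block the worker thread until the user decides, or time out."""
    future = concurrent.futures.Future()
    show_bar(future)                # UI resolves the future on keypress
    try:
        decision = future.result(timeout=APPROVAL_TIMEOUT)
    except concurrent.futures.TimeoutError:
        decision = "deny"           # fail closed instead of hanging
    finally:
        hide_bar()                  # always hide the bar afterwards
    return decision
```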
**The Bug:**
Chatbot detection was running AFTER safeguards filtered code blocks from content.
By the time _contains_unexecuted_code() ran, the code blocks were already gone,
so detection always failed!
**The Fix:**
Check response_content (original) instead of content (filtered).
Now detects numbered steps, code blocks, and tutorial patterns correctly.
**Why This Matters:**
Without this fix, even large models like Mixtral would slip into chatbot mode
and never get corrected because the detection was blind.
**Added:**
- Debug logging to trace detection decisions
- Check original content before filtering
- Visibility into what content is being evaluated
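The corrected ordering boils down to detecting on the raw model output before filtering it for display; `process` and the simple code-block regex are stand-ins for the real safeguards pipeline:

```python
import re

CODE_BLOCK = re.compile(r"```.*?```", re.DOTALL)

def contains_unexecuted_code(text: str) -> bool:
    """Simplified detection: any fenced code block counts."""
    return bool(CODE_BLOCK.search(text))

def process(response_content: str):
    # Detect on the ORIGINAL content -- checking the filtered text
    # always failed, because filtering had already removed the blocks.
    chatbot_mode = contains_unexecuted_code(response_content)
    display = CODE_BLOCK.sub("", response_content)   # filtered for UI
    return chatbot_mode, display
```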
Enhances detection of tutorial-style responses to catch small models that give
numbered instructions instead of using tools.
**New Detection Patterns:**
- Numbered lists ('1. Open...', '2. Create...', '3. Navigate...')
- Sequenced steps (First..., Next..., Then...)
- Tutorial starters ('Open your terminal...', 'Navigate to...')
- How-to preambles ('Here's how you can...')
**Stronger Steering:**
- More explicit error message when chatbot mode is detected
- Clear examples of what NOT to do vs what TO do
- User-visible warning when auto-correction is triggered
This helps smaller models like llama3.2:3b stay in agent mode instead
of reverting to chatbot/tutorial behavior.
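Illustrative versions of the new detection patterns; the shipped list is likely broader:

```python
import re

TUTORIAL_PATTERNS = [
    # Numbered steps opening with an imperative verb.
    re.compile(r"^\s*1\.\s+(Open|Create|Navigate|Run)\b", re.MULTILINE),
    # Sequenced prose steps.
    re.compile(r"\bFirst,.*\b(Next|Then),", re.DOTALL | re.IGNORECASE),
    # Classic tutorial starters and how-to preambles.
    re.compile(r"\bOpen your terminal\b", re.IGNORECASE),
    re.compile(r"\bHere'?s how you can\b", re.IGNORECASE),
]

def is_tutorial_response(text: str) -> bool:
    return any(p.search(text) for p in TUTORIAL_PATTERNS)
```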
- fzf-style model selector modal
- Fuzzy filter models as you type
- Keyboard shortcuts: j/k or arrows for navigation
- Used by /model and /models slash commands
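The fuzzy filter can be as small as a case-insensitive subsequence match, fzf-style; `fuzzy_match` and `filter_models` are illustrative names:

```python
def fuzzy_match(query: str, candidate: str) -> bool:
    """Every query character must appear in order in the candidate."""
    it = iter(candidate.lower())
    # `ch in it` advances the iterator, so order is enforced.
    return all(ch in it for ch in query.lower())

def filter_models(query, models):
    return [m for m in models if fuzzy_match(query, m)]
```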
- /help or /h - show available commands
- /exit or /q - exit the application
- /clear or /c - clear conversation
- /model or /m - list models or switch (/model llama3.2:3b)
- /models - list available Ollama models
- Shows current model with green dot, others with dim dot
- Updates status line when model changes
- Auto-detects Native vs ReAct mode for new model
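The command set above suggests a simple dispatch table; the table and `parse_slash` helper below are a sketch, not the actual parser:

```python
# Alias -> canonical action name (handlers are hypothetical).
COMMANDS = {
    "/help": "help", "/h": "help",
    "/exit": "exit", "/q": "exit",
    "/clear": "clear", "/c": "clear",
    "/model": "model", "/m": "model",
    "/models": "models",
}

def parse_slash(line: str):
    """Split '/model llama3.2:3b' into ('model', ['llama3.2:3b']);
    return None for ordinary input or unknown commands."""
    if not line.startswith("/"):
        return None
    name, *args = line.split()
    action = COMMANDS.get(name)
    return (action, args) if action else None
```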
- Changed condition from 'old_string and new_string' to 'old_string is not None'
- This handles edits where old_string='' (inserting new content)
- Added debug logging with character counts
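The condition change in miniature; `should_apply_edit` is an illustrative wrapper around the fixed check:

```python
def should_apply_edit(old_string, new_string):
    """The old truthiness test (`if old_string and new_string`)
    rejected insert-only edits, because old_string == '' is falsy.
    Testing against None keeps them."""
    return old_string is not None and new_string is not None
```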
- For Create: show preview of content being written (first 100 chars)
- For Update: show both old and new content previews with -/+ prefixes
- Makes it clearer what's being changed without expanding full diff
- Skip xdg-open, firefox, chrome, browser commands (don't work in TUI)
- Track executed commands and skip duplicates
- Update prompts to discourage repetition and browser commands
- Streamline rules to be more focused
- Clear command tracking on history clear
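The skip-and-dedupe logic can be sketched as a small gate; `CommandGate` and the blocked-command set are illustrative:

```python
# GUI launchers that cannot work inside a terminal UI.
BLOCKED_COMMANDS = {"xdg-open", "firefox", "chrome", "browser"}

class CommandGate:
    """Tracks executed commands; rejects browser launches and repeats."""
    def __init__(self):
        self.executed = set()

    def allow(self, cmd: str) -> bool:
        parts = cmd.strip().split()
        if not parts or parts[0] in BLOCKED_COMMANDS:
            return False            # browser/GUI commands are useless in a TUI
        if cmd in self.executed:
            return False            # exact duplicate of an earlier command
        self.executed.add(cmd)
        return True

    def clear(self):
        self.executed.clear()       # reset alongside history clear
```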
- Add rule 11: NO PLACEHOLDERS - never use ... as content
- Fix examples that showed ... as placeholder
- Show complete, real content in all examples
- Applies to both native and ReAct prompts
- Avoid duplicate extraction by tracking match positions
- Expand ~ in all file paths using os.path.expanduser
- Better bash command extraction handling 'command=' prefix
- Handle model outputting 'cmd, command=cmd' format
- Use position-based IDs to prevent duplicates across patterns
- Better parsing of file_path with quotes
- Walk backward to find matching end quote for content
- Add smarter bracket patterns with lookahead for common endings
- Add debug logging for write tool extraction
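Position-based dedup across patterns can be sketched like this; the two patterns are illustrative stand-ins for the real extractor's larger set:

```python
import re

# A specific pattern first, then a looser fallback that would re-match
# the same text if duplicates were not suppressed.
PATTERNS = [
    re.compile(r"\[USE (\w+) tool: ([^\]]*)\]"),
    re.compile(r"\[(?:USE )?(\w+) tool: ([^\]]*)\]"),
]

def extract_calls(text: str):
    """Skip any match whose span overlaps a call already extracted by
    an earlier pattern; use the match position as a stable ID."""
    covered, calls = [], []
    for pat in PATTERNS:
        for m in pat.finditer(text):
            if any(m.start() < end and start < m.end()
                   for start, end in covered):
                continue
            covered.append(m.span())
            calls.append({"id": m.start(), "tool": m.group(1),
                          "args": m.group(2).strip()})
    return calls
```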
- Add MAX_EXTRACTED_ITERATIONS (3) to prevent infinite loops
- Track consecutive errors across tool executions
- Stop after 3 consecutive errors or when all batch tools fail
- Prevents model from endlessly retrying failing operations
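The loop guards combine into control flow like the sketch below; `run_tool_loop` and its callbacks are illustrative, and `execute` is assumed to return truthy on success:

```python
MAX_EXTRACTED_ITERATIONS = 3
MAX_CONSECUTIVE_ERRORS = 3

def run_tool_loop(next_batch, execute):
    """Stop after 3 extraction rounds, 3 consecutive tool failures,
    or a round in which every tool in the batch failed."""
    consecutive_errors = 0
    for _ in range(MAX_EXTRACTED_ITERATIONS):
        batch = next_batch()
        if not batch:
            break
        failed = 0
        for call in batch:
            if execute(call):
                consecutive_errors = 0   # a success resets the streak
            else:
                consecutive_errors += 1
                failed += 1
        if failed == len(batch):         # whole batch failed: give up
            break
        if consecutive_errors >= MAX_CONSECUTIVE_ERRORS:
            break
    return consecutive_errors
```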
- Add ClearStream message to adapter for hiding ugly raw tool call JSON
- Add on_clear_stream handler to remove streaming widget when raw JSON detected
- Add _streamed_content flag to track if content was shown
- Finalize previous streaming widget in on_thinking_started to prevent appending
- Show fallback content in on_response_complete when nothing was streamed