Subagent's _run_subagent() was dropping reasoning_content and
thinking_blocks when building assistant messages for the conversation
history. Providers like Deepseek Reasoner require reasoning_content on
every assistant message when thinking mode is active, causing a 400
BadRequestError on the second LLM round-trip.
Align with the main AgentLoop which already preserves these fields via
ContextBuilder.add_assistant_message().
Closes#1834
_extract_absolute_paths() only matched paths starting with / or drive
letters, missing ~ paths that expand to the home directory. This
allowed agents to bypass restrictToWorkspace by using commands like
cat ~/.nanobot/config.json to access files outside the workspace.
Add tilde path extraction regex and use expanduser() before resolving.
Also switch from manual parent-chain check to is_relative_to() for
more robust path containment validation.
Fixes#1817
MCP SDK's anyio cancel scopes can leak CancelledError on timeout or
failure paths. Since CancelledError is a BaseException (not Exception),
it escapes both MCPToolWrapper.execute() and ToolRegistry.execute(),
crashing the agent loop.
Now catches CancelledError and returns a graceful error to the LLM,
while still re-raising genuine task cancellations from /stop.
Also catches general Exception for other MCP failures (connection
drops, invalid responses, etc.).
Related: #1055
Major changes:
- Replace message-count-based memory window with token-budget-based compression
- Add max_tokens_input, compression_start_ratio, compression_target_ratio config
- Implement _maybe_compress_history() that triggers based on prompt token usage
- Use _build_compressed_history_view() to provide compressed history to LLM
- Refactor MemoryStore.consolidate() -> consolidate_chunk() for chunk-based compression
- Remove last_consolidated from Session, use _compressed_until metadata instead
- Add background compression scheduling to avoid blocking message processing
Key improvements:
- Compression now based on actual token usage, not arbitrary message counts
- Better handling of long conversations with large context windows
- Non-destructive compression: old messages remain in session, but excluded from prompt
- Automatic compression when history exceeds configured token thresholds