Commit Graph

1423 Commits

Author SHA1 Message Date
Xubin Ren
25288f9951 feat(whatsapp): add outbound media support via bridge 2026-03-24 01:11:33 +08:00
Xubin Ren
bef88a5ea1 docs: require explicit channel login command 2026-03-24 01:11:33 +08:00
Xubin Ren
d164548d9a docs(weixin): add setup guide and focused channel tests 2026-03-24 01:11:33 +08:00
Xubin Ren
0ca639bf22 fix(cli): use discovered class for channel login 2026-03-24 01:11:33 +08:00
chengyongru
556b21d011 refactor(channels): abstract login() into BaseChannel, unify CLI commands
Move channel-specific login logic from CLI into each channel class via a
new `login(force=False)` method on BaseChannel. The `channels login <name>`
command now dynamically loads the channel and calls its login() method.

- WeixinChannel.login(): calls existing _qr_login(), with force to clear saved token
- WhatsAppChannel.login(): sets up bridge and spawns npm process for QR login
- CLI no longer contains duplicate login logic per channel
- Update CHANNEL_PLUGIN_GUIDE to document the login() hook

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 01:11:33 +08:00
ZhangYuanhan-AI
11e1bbbab7 feat(weixin): add outbound media file sending via CDN upload
Previously the WeChat channel's send() method only handled text messages,
completely ignoring msg.media. When the agent called message(media=[...]),
the file was never delivered to the user.

Implement the full WeChat CDN upload protocol following the reference
@tencent-weixin/openclaw-weixin v1.0.2:
  1. Generate a client-side AES-128 key (16 random bytes)
  2. Call getuploadurl with file metadata + hex-encoded AES key
  3. AES-128-ECB encrypt the file and POST to CDN with filekey param
  4. Read x-encrypted-param from CDN response header as download param
  5. Send message with the media item (image/video/file) referencing
     the CDN upload

Also adds:
- _encrypt_aes_ecb() for AES-128-ECB encryption (reverse of existing
  _decrypt_aes_ecb)
- Media type detection from file extension (image/video/file)
- Graceful error handling: failed media sends notify the user via text
  without blocking subsequent text delivery

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 01:11:33 +08:00
ZhangYuanhan-AI
8abbe8a6df fix(agent): instruct LLM to use message tool for file delivery
During testing, we discovered that when a user requests the agent to
send a file (e.g., "send me IMG_1115.png"), the agent would call
read_file to view the content and then reply with text claiming
"file sent" — but never actually deliver the file to the user.

Root cause: The system prompt stated "Reply directly with text for
conversations. Only use the 'message' tool to send to a specific
chat channel", which led the LLM to believe text replies were
sufficient for all responses, including file delivery.

Fix: Add an explicit IMPORTANT instruction in the system prompt
telling the LLM it MUST use the 'message' tool with the 'media'
parameter to send files, and that read_file only reads content
for its own analysis.

Co-Authored-By: qulllee <qullkui@tencent.com>
2026-03-24 01:11:33 +08:00
qulllee
bc9f861bb1 feat: add media message support in agent context and message tool
Cherry-picked from PR #2355 (ad128a7) — only agent/context.py and agent/tools/message.py.

Co-Authored-By: qulllee <qullkui@tencent.com>
2026-03-24 01:11:33 +08:00
ZhangYuanhan-AI
ebc4c2ec35 feat(weixin): add personal WeChat channel via ilinkai HTTP long-poll API
Add a new WeChat (微信) channel that connects to personal WeChat using
the ilinkai.weixin.qq.com HTTP long-poll API. Protocol reverse-engineered
from @tencent-weixin/openclaw-weixin v1.0.2.

Features:
- QR code login flow (nanobot weixin login)
- HTTP long-poll message receiving (getupdates)
- Text message sending with proper WeixinMessage format
- Media download with AES-128-ECB decryption (image/voice/file/video)
- Voice-to-text from WeChat + Groq Whisper fallback
- Quoted message (ref_msg) support
- Session expiry detection and auto-pause
- Server-suggested poll timeout adaptation
- Context token caching for replies
- Auto-discovery via channel registry

No WebSocket, no Node.js bridge, no local WeChat client needed — pure
HTTP with a bot token obtained via QR code scan.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 01:11:33 +08:00
Xubin Ren
2056061765 refine heartbeat session retention boundaries 2026-03-24 00:33:43 +08:00
flobo3
ba0a3d14d9 fix: clear heartbeat session to prevent token overflow
(cherry picked from commit 5c871d75d5b1aac09a8df31e6d1e04ee3d9b0d2c)
2026-03-24 00:33:43 +08:00
Eric Yang
84a7f8af73 refactor(shell): fix syntax error 2026-03-24 00:02:49 +08:00
Eric Yang
e2e1c9c276 refactor(shell): use finally block to reap zombie processes on timeoutx 2026-03-24 00:02:49 +08:00
Eric Yang
dbcc7cb539 refactor(shell): use finally block to reap zombie processes on timeout 2026-03-24 00:02:49 +08:00
Eric Yang
e423ceef9c fix(shell): reap zombie processes when command timeout kills subprocess 2026-03-24 00:02:49 +08:00
gem12
97fe9ab7d4 feat(agent): replace global lock with per-session locks for concurrent dispatch
Replace the single _processing_lock (asyncio.Lock) with per-session locks
so that different sessions can process LLM requests concurrently, while
messages within the same session remain serialised.

An optional global concurrency cap is available via the
NANOBOT_MAX_CONCURRENT_REQUESTS env var (default 3, <=0 for unlimited).

Also re-binds tool context before each tool execution round to prevent
concurrent sessions from clobbering each other's routing info.

Tested in production and manually reviewed.

(cherry picked from commit c397bb4229e8c3b7f99acea7ffe4bea15e73e957)
2026-03-23 18:57:03 +08:00
Xubin Ren
20494a2c52 refactor command routing for future plugins and clearer CLI structure 2026-03-23 16:48:42 +08:00
Xubin Ren
aba0b83a77 fix(memory): reserve completion headroom for consolidation
Trigger token consolidation before prompt usage reaches the full context window so response tokens and tokenizer estimation drift still fit safely within the model budget.

Made-with: Cursor
2026-03-23 11:54:44 +08:00
Xubin Ren
8f5c2d1a06 fix(cli): stop spinner after non-streaming interactive replies 2026-03-23 03:28:10 +00:00
chengyongru
a46803cbd7 docs(provider): add mistral intro 2026-03-23 11:07:46 +08:00
Desmond Sow
f64ae3b900 feat(provider): add OpenVINO Model Server provider (#2193)
add OpenVINO Model Server provider
2026-03-23 11:07:46 +08:00
Matt von Rohr
7878340031 feat(providers): add Mistral AI provider
Register Mistral as a first-class provider with LiteLLM routing,
MISTRAL_API_KEY env var, and https://api.mistral.ai/v1 default base.

Includes schema field, registry entry, and tests.
2026-03-23 11:07:46 +08:00
Xubin Ren
9d5e511a6e feat(streaming): centralize think-tag filtering and add Telegram streaming
- Add strip_think() to helpers.py as single source of truth
- Filter deltas in agent loop before dispatching to consumers
- Implement send_delta in TelegramChannel with progressive edit_message_text
- Remove duplicate think filtering from CLI stream.py and telegram.py
- Remove legacy fake streaming (send_message_draft) from Telegram
- Default Telegram streaming to true
- Update CHANNEL_PLUGIN_GUIDE.md with streaming documentation

Made-with: Cursor
2026-03-23 10:20:41 +08:00
Xubin Ren
f2e1cb3662 feat(cli): extract streaming renderer to stream.py with Rich Live
Move ThinkingSpinner and StreamRenderer into a dedicated module to keep
commands.py focused on orchestration. Uses Rich Live with manual refresh
(auto_refresh=False) and ellipsis overflow for stable streaming output.

Made-with: Cursor
2026-03-23 10:20:41 +08:00
Xubin Ren
bd621df57f feat: add streaming channel support with automatic fallback
Provider layer: add chat_stream / chat_stream_with_retry to all providers
(base fallback, litellm, custom, azure, codex). Refactor shared kwargs
building in each provider.

Channel layer: BaseChannel gains send_delta (no-op) and supports_streaming
(checks config + method override). ChannelManager routes _stream_delta /
_stream_end to send_delta, skips _streamed final messages.

AgentLoop._dispatch builds bus-backed on_stream/on_stream_end callbacks
when _wants_stream metadata is set. Non-streaming path unchanged.

CLI: clean up spinner ANSI workarounds, simplify commands.py flow.
Made-with: Cursor
2026-03-23 10:20:41 +08:00
Xubin Ren
e79b9f4a83 feat(agent): add streaming groundwork for future TUI
Preserve the provider and agent-loop streaming primitives plus the CLI experiment scaffolding so this work can be resumed later without blocking urgent bug fixes on main.

Made-with: Cursor
2026-03-23 10:20:41 +08:00
Xubin Ren
5fd66cae5c Merge PR #1109: perf: optimize prompt cache hit rate for Anthropic models
perf: optimize prompt cache hit rate for Anthropic models
2026-03-22 14:23:41 +08:00
Xubin Ren
931cec3908 Merge remote-tracking branch 'origin/main' into pr-1109
Resolve conflict in context.py: keep main's build_messages which already
merges runtime context into user message (achieving the same cache goal).
The real value-add from this PR is the second cache breakpoint in
litellm_provider.py.

Made-with: Cursor
2026-03-22 06:14:18 +00:00
Xubin Ren
1c71489121 fix(agent): count all message fields in token estimation
estimate_prompt_tokens() only counted the `content` text field, completely
missing tool_calls JSON (~72% of actual payload), reasoning_content,
tool_call_id, name, and per-message framing overhead. This caused the
memory consolidator to never trigger for tool-heavy sessions (e.g. cron
jobs), leading to context window overflow errors from the LLM provider.

Also adds reasoning_content counting and proper per-message overhead to
estimate_message_tokens() for consistent boundary detection.

Made-with: Cursor
2026-03-22 12:19:44 +08:00
Xubin Ren
48c71bb61e refactor(agent): unify process_direct to return OutboundMessage
Merge process_direct() and process_direct_outbound() into a single
interface returning OutboundMessage | None. This eliminates the
dual-path detection logic in CLI single-message mode that relied on
inspect.iscoroutinefunction to distinguish between the two APIs.

Extract status rendering into a pure function build_status_content()
in utils/helpers.py, decoupling it from AgentLoop internals.

Made-with: Cursor
2026-03-22 00:39:38 +08:00
Xubin Ren
064ca256f5 Merge PR #1985: feat: add /status command to show runtime info
feat: add /status command to show runtime info
2026-03-22 00:11:34 +08:00
Xubin Ren
a8176ef2c6 fix(cli): keep direct-call rendering compatible in tests
Only use process_direct_outbound when the agent loop actually exposes it as an async method, and otherwise fall back to the legacy process_direct path. This keeps the new CLI render-metadata flow without breaking existing test doubles or older direct-call implementations.

Made-with: Cursor
2026-03-21 16:07:14 +00:00
Xubin Ren
e430b1daf5 fix(agent): refine status output and CLI rendering
Keep status output responsive while estimating current context from session history, dropping low-value queue/subagent counters, and marking command-style replies for plain-text rendering in CLI. Also route direct CLI calls through outbound metadata so help/status formatting stays explicit instead of relying on content heuristics.

Made-with: Cursor
2026-03-21 15:52:10 +00:00
Xubin Ren
4d1897609d fix(agent): make status command responsive and accurate
Handle /status at the run-loop level so it can return immediately while the agent is busy, and reset last-usage stats when providers omit usage data. Also keep Telegram help/menu coverage for /status without changing the existing final-response send path.

Made-with: Cursor
2026-03-21 15:21:32 +00:00
Xubin Ren
570ca47483 Merge branch 'main' into pr-1985 2026-03-21 09:48:09 +00:00
Xubin Ren
e87bb0a82d fix(mcp): preserve schema semantics during normalization
Only normalize nullable MCP tool schemas for OpenAI-compatible providers so optional params still work without collapsing unrelated unions. Also teach local validation to honor nullable flags and add regression coverage for nullable and non-nullable schemas.

Made-with: Cursor
2026-03-21 14:35:47 +08:00
haosenwang1018
b6cf7020ac fix: normalize MCP tool schema for OpenAI-compatible providers 2026-03-21 14:35:47 +08:00
Xubin Ren
9f10ce072f Merge PR #2304: feat(agent): implement native multimodal tool perception
Add native image content blocks for read_file and web_fetch, preserve the multimodal tool-result path through the agent loop, and keep session history compact with image placeholders. Also harden web_fetch against redirect-based SSRF bypasses and add regression coverage for image reads and blocked private redirects.
2026-03-21 05:39:17 +00:00
Xubin Ren
445a96ab55 fix(agent): harden multimodal tool result flow
Keep multimodal tool outputs on the native content-block path while
restoring redirect SSRF checks for web_fetch image responses. Also share
image block construction, simplify persisted history sanitization, and
add regression tests for image reads and blocked private redirects.

Made-with: Cursor
2026-03-21 05:34:56 +00:00
Xubin Ren
834f1e3a9f Merge branch 'main' into pr-2304 2026-03-21 04:14:40 +00:00
Xubin Ren
32f4e60145 refactor(providers): hide oauth-only providers from config setup
Exclude openai_codex alongside github_copilot from generated config,
filter OAuth-only providers out of the onboarding wizard, and clarify in
README that OAuth login stores session state outside config. Also unify
the GitHub Copilot login command spelling and add regression tests.

Made-with: Cursor
2026-03-21 03:20:59 +08:00
Harvey Mackie
e029d52e70 chore: remove redundant github_copilot field from config.json 2026-03-21 03:20:59 +08:00
Harvey Mackie
055e2f3816 docs: add github copilot oauth channel setup instructions 2026-03-21 03:20:59 +08:00
Xubin Ren
542455109d fix(email): preserve fetched messages across IMAP retry
Keep messages already collected in the current poll cycle when a stale
IMAP connection dies mid-fetch, so retrying once does not drop emails
that were already parsed and marked seen. Add a regression test covering
a mid-cycle disconnect after the first message succeeds.

Made-with: Cursor
2026-03-21 03:00:39 +08:00
jr_blue_551
b16bd2d9a8 Harden email IMAP polling retries 2026-03-21 03:00:39 +08:00
Kian
d7f6cbbfc4 fix: add openssh-client and use HTTPS for GitHub in Docker build
- Add openssh-client to apt dependencies for git operations
- Configure git to use HTTPS instead of SSH for github.com to avoid
  SSH key requirements during Docker build

Made-with: Cursor
2026-03-21 02:43:11 +08:00
James Wrigley
9aaeb7ebd8 Add support for -h in the CLI 2026-03-21 02:36:48 +08:00
Xubin Ren
09ad9a4673 feat(cron): add run history tracking for cron jobs
Record run_at_ms, status, duration_ms and error for each execution,
keeping the last 20 entries per job in jobs.json. Adds CronRunRecord
dataclass, get_job() lookup, and four regression tests covering
success, error, trimming and persistence.

Closes #1837

Made-with: Cursor
2026-03-21 02:28:35 +08:00
Xubin Ren
ec2e12b028 Merge PR #1824: feat(tools): enhance ExecTool with enable flag
feat(tools): enhance ExecTool with enable flag
2026-03-21 01:54:18 +08:00
Xubin Ren
1c39a4d311 refactor(tools): keep exec enable without configurable deny patterns
Made-with: Cursor
2026-03-20 17:46:08 +00:00