Register Mistral as a first-class provider with LiteLLM routing, the
MISTRAL_API_KEY env var, and https://api.mistral.ai/v1 as the default API base.
Includes schema field, registry entry, and tests.
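A rough sketch of the registry entry, assuming a ProviderSpec dataclass with name, LiteLLM prefix, API-key env var, and base-URL fields (field names here are illustrative, not the actual schema):

```python
# Hypothetical shape of the provider registry entry -- field names are assumptions.
from dataclasses import dataclass

@dataclass
class ProviderSpec:
    name: str
    litellm_prefix: str      # prefix LiteLLM uses to route requests
    api_key_env: str         # env var holding the API key
    default_base_url: str

MISTRAL = ProviderSpec(
    name="mistral",
    litellm_prefix="mistral",
    api_key_env="MISTRAL_API_KEY",
    default_base_url="https://api.mistral.ai/v1",
)
```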
Preserve the provider and agent-loop streaming primitives plus the CLI experiment scaffolding so this work can be resumed later without blocking urgent bug fixes on main.
Made-with: Cursor
Resolve conflict in context.py: keep main's build_messages, which already
merges runtime context into the user message (achieving the same cache goal).
The real value-add from this PR is the second cache breakpoint in
litellm_provider.py.
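Roughly, the second breakpoint looks like this, assuming Anthropic-style cache_control markers passed through LiteLLM (helper name and placement are illustrative):

```python
def add_cache_breakpoints(messages: list[dict]) -> list[dict]:
    """Mark two cache breakpoints: the system prompt and the merged user message."""
    def mark(msg: dict) -> None:
        content = msg.get("content")
        if isinstance(content, str):
            content = [{"type": "text", "text": content}]
            msg["content"] = content
        if isinstance(content, list) and content:
            content[-1]["cache_control"] = {"type": "ephemeral"}

    system_msgs = [m for m in messages if m.get("role") == "system"]
    user_msgs = [m for m in messages if m.get("role") == "user"]
    if system_msgs:
        mark(system_msgs[-1])   # breakpoint 1: system prompt
    if user_msgs:
        mark(user_msgs[-1])     # breakpoint 2: context-merged user message
    return messages
```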
Made-with: Cursor
When an OpenAI-compatible API returns a non-JSON response (e.g. plain
text "unsupported model: xxx" with HTTP 200), the OpenAI SDK raises a
JSONDecodeError whose message is the unhelpful "Expecting value: line 1
column 1 (char 0)". Extract the original response body from
JSONDecodeError.doc (or APIError.response.text) so users see the actual
error message from the API.
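A sketch of the unwrapping (attribute access is hedged with getattr since it varies across SDK versions):

```python
import json

def extract_api_error_body(exc: Exception) -> str:
    """Best-effort recovery of the raw response body behind a parse failure."""
    # json.JSONDecodeError keeps the document it was parsing in .doc
    if isinstance(exc, json.JSONDecodeError) and exc.doc:
        return exc.doc
    # OpenAI SDK status errors carry the httpx response; .text is the raw body
    response = getattr(exc, "response", None)
    if response is not None and getattr(response, "text", None):
        return response.text
    return str(exc)
```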
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace the static provider-level supports_vision check with a
reactive fallback: when a model returns an image-unsupported error,
strip image_url blocks from messages and retry once. This avoids
maintaining an inaccurate vision capability table and correctly
handles gateway/unknown model scenarios.
Also extract _safe_chat() to deduplicate try/except boilerplate
in chat_with_retry().
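Roughly, the fallback looks like this (a sketch: the image-unsupported check and helper names are assumptions, and the real _safe_chat() may carve up responsibilities differently):

```python
def _strip_image_blocks(messages: list[dict]) -> list[dict]:
    """Remove image_url content blocks so non-vision models accept the request."""
    stripped = []
    for msg in messages:
        content = msg.get("content")
        if isinstance(content, list):
            content = [b for b in content if b.get("type") != "image_url"]
        stripped.append({**msg, "content": content})
    return stripped

def chat_with_image_fallback(call, messages: list[dict]):
    """Illustrative wrapper: on an image-unsupported error, retry once without images."""
    try:
        return call(messages)
    except Exception as exc:
        # Crude detection -- the real error matching is an assumption.
        if "image" not in str(exc).lower():
            raise
        return call(_strip_image_blocks(messages))
```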
- Add supports_vision field to ProviderSpec (default True)
- Add and methods in LiteLLMProvider
- Filter image_url content blocks in before sending to non-vision models
- Reverts session-layer filtering from original PR (wrong layer)
This fixes the issue where switching from Claude (vision-capable) to
non-vision models (e.g., Baidu Qianfan) causes API errors due to
unsupported image_url content blocks.
The provider layer is the correct place for this filtering because:
1. It has access to model/provider capabilities
2. It only affects non-vision models
3. It preserves session layer purity (storage should not know about model capabilities)
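As a sketch, the provider-layer gate can be this small (supports_vision is the ProviderSpec field from the list above; the function name is illustrative):

```python
def filter_for_capabilities(spec, messages: list[dict]) -> list[dict]:
    """Hypothetical provider-layer hook: non-vision models never see image_url blocks."""
    if getattr(spec, "supports_vision", True):   # ProviderSpec field, default True
        return messages
    return [
        {**m, "content": [b for b in m["content"] if b.get("type") != "image_url"]}
        if isinstance(m.get("content"), list)
        else m
        for m in messages
    ]
```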
With the custom_llm_provider kwarg handling routing, the openrouter/ prefix
caused model names like anthropic/claude-sonnet-4-6 to become
openrouter/anthropic/claude-sonnet-4-6, which the OpenRouter API rejects.
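A sketch of the fixed call, assuming the provider passes the bare model slug to litellm.completion and lets custom_llm_provider handle routing:

```python
import litellm

# Before: model="openrouter/anthropic/claude-sonnet-4-6" combined with
# custom_llm_provider="openrouter" double-prefixed the slug and was rejected.
# After: pass the bare slug; custom_llm_provider alone selects the route.
response = litellm.completion(
    model="anthropic/claude-sonnet-4-6",
    custom_llm_provider="openrouter",
    messages=[{"role": "user", "content": "ping"}],
)
```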
Add native Ollama support so local models (e.g. nemotron-3-nano) can be
used without an API key. Adds a ProviderSpec with the ollama_chat LiteLLM
prefix and a ProvidersConfig field, and skips API key validation for local
providers.
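A sketch of the registry entry and the relaxed key check; field names are illustrative, only ollama_chat and the no-key behavior come from this change, and the localhost URL is Ollama's usual default rather than something specified here:

```python
import os

# Hypothetical field names on ProviderSpec -- not the actual schema.
OLLAMA = ProviderSpec(
    name="ollama",
    litellm_prefix="ollama_chat",               # LiteLLM's chat-native Ollama route
    api_key_env=None,                           # local provider: no key needed
    default_base_url="http://localhost:11434",  # assumed default Ollama endpoint
    is_local=True,
)

def validate_api_key(spec) -> None:
    """Skip the key check entirely for local providers like Ollama."""
    if getattr(spec, "is_local", False):
        return
    if not os.environ.get(spec.api_key_env or ""):
        raise RuntimeError(f"{spec.api_key_env} is not set")
```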
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
GitHub Copilot and some other providers have a 64-character limit on
tool_call_id. Switching from a provider that generates longer IDs (such as
OpenAI Codex) caused validation errors.
This fix truncates tool_call_id to 64 characters by preserving the first
32 and last 32 characters to maintain uniqueness while respecting the
provider's limit.
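A sketch of the truncation (the 64-character limit and 32+32 split come from the description; the helper name is illustrative):

```python
MAX_TOOL_CALL_ID_LEN = 64

def truncate_tool_call_id(tool_call_id: str) -> str:
    """Keep the first 32 and last 32 chars so long IDs stay unique but fit the limit."""
    if len(tool_call_id) <= MAX_TOOL_CALL_ID_LEN:
        return tool_call_id
    half = MAX_TOOL_CALL_ID_LEN // 2
    return tool_call_id[:half] + tool_call_id[-half:]
```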
Fixes #1554