feat: add Anthropic prompt caching via cache_control

Inject cache_control: {"type": "ephemeral"} on the system message and
last tool definition for providers that support prompt caching. Adds
supports_prompt_caching flag to ProviderSpec (enabled for Anthropic only)
and skips caching when routing through a gateway.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
tercerapersona
2026-02-19 11:05:22 -03:00
parent 8de36d398f
commit 3b4763b3f9
2 changed files with 45 additions and 2 deletions

View File

@@ -57,6 +57,9 @@ class ProviderSpec:
# Direct providers bypass LiteLLM entirely (e.g., CustomProvider)
is_direct: bool = False
# Provider supports cache_control on content blocks (e.g. Anthropic prompt caching)
supports_prompt_caching: bool = False
@property
def label(self) -> str:
return self.display_name or self.name.title()
@@ -155,6 +158,7 @@ PROVIDERS: tuple[ProviderSpec, ...] = (
default_api_base="",
strip_model_prefix=False,
model_overrides=(),
supports_prompt_caching=True,
),
# OpenAI: LiteLLM recognizes "gpt-*" natively, no prefix needed.