Mirrowel committed on Dec 5, 2025

Commit 1d1a62b

1 Parent(s): df7a756

docs: 📚 update documentation to reflect gemini 2.5 removal and claude sonnet dual-mode support

This commit updates both README.md and DOCUMENTATION.md to accurately reflect recent changes to the Antigravity provider:

- Remove all references to Gemini 2.5 models (Pro/Flash) as they are no longer supported
- Document Claude Sonnet 4.5's dual-mode capability (thinking and non-thinking variants)
- Add provider support section explaining credential prioritization implementation for both Gemini CLI and Antigravity providers
- Clarify that Claude Opus 4.5 only supports thinking mode
- Update model-specific logic documentation to reflect current architecture (Gemini 3, Claude Sonnet, Claude Opus)
- Add credential tier reset timing details (paid tier: 5 hours, free tier: weekly)
- Remove outdated "NEW" badges and function call response pairing references

Files changed (2)

DOCUMENTATION.md +15 -10
README.md +5 -4

DOCUMENTATION.md CHANGED

@@ -361,6 +361,13 @@ def get_model_tier_requirement(self, model: str) -> Optional[int]:
     return None  # All other models have no restrictions
 ```
 **Usage Manager Integration:**
 The `acquire_key()` method has been enhanced to:
@@ -391,22 +398,18 @@ A modular, shared caching system for providers to persist conversation state acr
 ### 3.5. Antigravity (`antigravity_provider.py`)
-The most sophisticated provider implementation, supporting Google's internal Antigravity API for Gemini and Claude models (including **Claude Opus 4.5**, Anthropic's most powerful model).
 #### Architecture
 - **Unified Streaming/Non-Streaming**: Single code path handles both response types with optimal transformations
 - **Thought Signature Caching**: Server-side caching of encrypted signatures for multi-turn Gemini 3 conversations
-- **Model-Specific Logic**: Automatic configuration based on model type (Gemini 2.5, Gemini 3, Claude)
 #### Model Support
-**Gemini 2.5 (Pro/Flash):**
-- Uses `thinkingBudget` parameter (integer tokens: -1 for auto, 0 to disable, or specific value)
-- Standard safety settings and toolConfig
-- Stream processing with thinking content separation
-**Gemini 3 (Pro/Image):**
 - Uses `thinkingLevel` parameter (string: "low" or "high")
 - **Tool Hallucination Prevention**:
   - Automatic system instruction injection explaining custom tool schema rules
@@ -427,8 +430,10 @@ The most sophisticated provider implementation, supporting Google's internal Ant
 - Increased default max output tokens to 64000 to accommodate thinking output
 **Claude Sonnet 4.5:**
-- Proxied through Antigravity API (uses internal model name `claude-sonnet-4-5-thinking`)
-- Uses `thinkingBudget` parameter like Gemini 2.5
 - **Thinking Preservation**: Caches thinking content using composite keys (tool_call_id + text_hash)
 - **Schema Cleaning**: Removes unsupported properties (`$schema`, `additionalProperties`, `const` → `enum`)

     return None  # All other models have no restrictions
 ```
+**Provider Support:**
+The following providers implement credential prioritization:
+- **Gemini CLI**: Paid tier (priority 1), Free tier (priority 2), Legacy/Unknown (priority 10). Gemini 3 models require paid tier.
+- **Antigravity**: Same priority system as Gemini CLI. No model-tier restrictions (all models work on all tiers). Paid tier resets every 5 hours, free tier resets weekly.
 **Usage Manager Integration:**
 The `acquire_key()` method has been enhanced to:
 ### 3.5. Antigravity (`antigravity_provider.py`)
+The most sophisticated provider implementation, supporting Google's internal Antigravity API for Gemini 3 and Claude models (including **Claude Opus 4.5**, Anthropic's most powerful model).
 #### Architecture
 - **Unified Streaming/Non-Streaming**: Single code path handles both response types with optimal transformations
 - **Thought Signature Caching**: Server-side caching of encrypted signatures for multi-turn Gemini 3 conversations
+- **Model-Specific Logic**: Automatic configuration based on model type (Gemini 3, Claude Sonnet, Claude Opus)
+- **Credential Prioritization**: Automatic tier detection with paid credentials prioritized over free (paid tier resets every 5 hours, free tier resets weekly)
 #### Model Support
+**Gemini 3 Pro:**
 - Uses `thinkingLevel` parameter (string: "low" or "high")
 - **Tool Hallucination Prevention**:
   - Automatic system instruction injection explaining custom tool schema rules
 - Increased default max output tokens to 64000 to accommodate thinking output
 **Claude Sonnet 4.5:**
+- Proxied through Antigravity API
+- **Supports both thinking and non-thinking modes**:
+  - With `reasoning_effort`: Uses `claude-sonnet-4-5-thinking` variant with `thinkingBudget`
+  - Without `reasoning_effort`: Uses standard `claude-sonnet-4-5` variant
 - **Thinking Preservation**: Caches thinking content using composite keys (tool_call_id + text_hash)
 - **Schema Cleaning**: Removes unsupported properties (`$schema`, `additionalProperties`, `const` → `enum`)
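The Claude Sonnet 4.5 behavior documented in this hunk can be illustrated with a minimal sketch. It is not the provider's actual code: the function names `resolve_sonnet_variant` and `clean_schema` are hypothetical, and only the two model names and the list of cleaned properties come from the documentation above. The assumption here is that any non-empty `reasoning_effort` selects the thinking variant and that `const` becomes a one-item `enum`.

```python
from typing import Any, Optional

# Model names taken from the documentation; everything else is illustrative.
THINKING_VARIANT = "claude-sonnet-4-5-thinking"
STANDARD_VARIANT = "claude-sonnet-4-5"


def resolve_sonnet_variant(reasoning_effort: Optional[str]) -> str:
    """Pick the thinking variant only when the caller supplied reasoning_effort."""
    return THINKING_VARIANT if reasoning_effort else STANDARD_VARIANT


def clean_schema(schema: Any) -> Any:
    """Recursively drop `$schema` and `additionalProperties`; rewrite `const` as `enum`."""
    if isinstance(schema, list):
        return [clean_schema(item) for item in schema]
    if not isinstance(schema, dict):
        return schema

    cleaned = {}
    for key, value in schema.items():
        if key in ("$schema", "additionalProperties"):
            continue  # unsupported keywords are removed outright
        if key == "const":
            cleaned["enum"] = [value]  # assumed mapping: const -> single-value enum
        elif key == "properties" and isinstance(value, dict):
            # Keys under `properties` are field names, not schema keywords,
            # so only their sub-schemas are cleaned.
            cleaned[key] = {name: clean_schema(sub) for name, sub in value.items()}
        else:
            cleaned[key] = clean_schema(value)
    return cleaned
```

For example, `resolve_sonnet_variant("high")` yields the thinking variant, while `resolve_sonnet_variant(None)` falls back to the standard one.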

README.md CHANGED

@@ -28,13 +28,14 @@ This project provides a powerful solution for developers building complex applic
 -   **OpenAI-Compatible Proxy**: Offers a familiar API interface with additional endpoints for model and provider discovery.
 -   **Advanced Model Filtering**: Supports both blacklists and whitelists to give you fine-grained control over which models are available through the proxy.
--   **🆕 Antigravity Provider**: Full support for Google's internal Antigravity API, providing access to Gemini 2.5, Gemini 3, and Claude models with advanced features:
-    - **🚀 NEW: Claude Opus 4.5** - Anthropic's most powerful model, now available via Antigravity!
-    - Claude Sonnet 4.5 with extended thinking support
     - Thought signature caching for multi-turn conversations
     - Tool hallucination prevention via parameter signature injection
     - Automatic thinking block sanitization for Claude models (with recovery strategies)
-    - Improved function call response pairing with three-tier matching strategy
     - Note: Claude thinking mode requires careful conversation state management (see [Antigravity documentation](DOCUMENTATION.md#antigravity-claude-extended-thinking-sanitization) for details)
 -   **🆕 Credential Prioritization**: Automatic tier detection and priority-based credential selection ensures paid-tier credentials are used for premium models that require them.
 -   **🆕 Weighted Random Rotation**: Configurable credential rotation strategy - choose between deterministic (perfect balance) or weighted random (unpredictable, harder to fingerprint) selection.

 -   **OpenAI-Compatible Proxy**: Offers a familiar API interface with additional endpoints for model and provider discovery.
 -   **Advanced Model Filtering**: Supports both blacklists and whitelists to give you fine-grained control over which models are available through the proxy.
+-   **🆕 Antigravity Provider**: Full support for Google's internal Antigravity API, providing access to Gemini 3 and Claude models with advanced features:
+    - **🚀 Claude Opus 4.5** - Anthropic's most powerful model (thinking mode only)
+    - **Claude Sonnet 4.5** - Supports both thinking and non-thinking modes
+    - **Gemini 3 Pro** - With thinkingLevel support (low/high)
+    - Credential prioritization with automatic paid/free tier detection
     - Thought signature caching for multi-turn conversations
     - Tool hallucination prevention via parameter signature injection
     - Automatic thinking block sanitization for Claude models (with recovery strategies)
     - Note: Claude thinking mode requires careful conversation state management (see [Antigravity documentation](DOCUMENTATION.md#antigravity-claude-extended-thinking-sanitization) for details)
 -   **🆕 Credential Prioritization**: Automatic tier detection and priority-based credential selection ensures paid-tier credentials are used for premium models that require them.
 -   **🆕 Weighted Random Rotation**: Configurable credential rotation strategy - choose between deterministic (perfect balance) or weighted random (unpredictable, harder to fingerprint) selection.
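The credential prioritization and rotation features described above could be sketched roughly as follows. This is a hedged illustration rather than the project's `acquire_key()` logic: the `Credential` class, `pick_credential`, and the `1 / (1 + usage_count)` weighting are all assumptions. Only the priority values (paid = 1, free = 2, legacy/unknown = 10) and the two rotation modes come from the documentation, and the sketch assumes a tier requirement means "this priority number or lower".

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Credential:
    name: str
    priority: int          # 1 = paid, 2 = free, 10 = legacy/unknown (from the docs)
    usage_count: int = 0   # hypothetical usage counter used for balancing


def pick_credential(
    creds: List[Credential],
    required_priority: Optional[int] = None,
    weighted_random: bool = False,
    rng: Optional[random.Random] = None,
) -> Optional[Credential]:
    """Choose a credential from the best available tier, or None if none qualifies."""
    # Filter by the model's tier requirement (None means no restriction).
    candidates = [
        c for c in creds
        if required_priority is None or c.priority <= required_priority
    ]
    if not candidates:
        return None

    # Keep only the highest-priority tier (lowest number) among the candidates.
    best = min(c.priority for c in candidates)
    top = [c for c in candidates if c.priority == best]

    if weighted_random:
        # Assumed weighting: less-used credentials are more likely to be picked.
        chooser = rng or random.Random()
        weights = [1.0 / (1 + c.usage_count) for c in top]
        return chooser.choices(top, weights=weights, k=1)[0]

    # Deterministic mode: always take the least-used credential.
    return min(top, key=lambda c: c.usage_count)
```

In deterministic mode the least-used paid credential always wins, which keeps usage balanced. The weighted random mode trades that balance for a less predictable selection pattern.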