Spaces: elmerzole/llm-api-proxy


Mirrowel committed on Dec 14, 2025

Commit ed4dd55

2 Parent(s): 0c82aac c745d73

Merge branch 'dev' into main

Files changed (25)

DOCUMENTATION.md +312 -3
README.md +590 -582
src/proxy_app/detailed_logger.py +60 -30
src/proxy_app/launcher_tui.py +65 -34
src/proxy_app/main.py +23 -15
src/proxy_app/settings_tool.py +707 -84
src/rotator_library/client.py +27 -5
src/rotator_library/credential_manager.py +70 -34
src/rotator_library/credential_tool.py +795 -769
src/rotator_library/failure_logger.py +97 -26
src/rotator_library/providers/antigravity_auth_base.py +620 -3
src/rotator_library/providers/antigravity_provider.py +170 -557
src/rotator_library/providers/gemini_auth_base.py +626 -4
src/rotator_library/providers/gemini_cli_provider.py +577 -562
src/rotator_library/providers/google_oauth_base.py +775 -174
src/rotator_library/providers/iflow_auth_base.py +652 -178
src/rotator_library/providers/iflow_provider.py +173 -74
src/rotator_library/providers/provider_cache.py +161 -133
src/rotator_library/providers/qwen_auth_base.py +576 -176
src/rotator_library/providers/qwen_code_provider.py +209 -90
src/rotator_library/timeout_config.py +102 -0
src/rotator_library/usage_manager.py +37 -10
src/rotator_library/utils/__init__.py +29 -1
src/rotator_library/utils/paths.py +99 -0
src/rotator_library/utils/resilient_io.py +665 -0

DOCUMENTATION.md CHANGED

@@ -856,6 +856,142 @@ class AntigravityAuthBase(GoogleOAuthBase):
 - Headless environment detection
 - Sequential refresh queue processing
 ---
@@ -877,8 +1013,8 @@ The `GeminiCliProvider` is the most complex implementation, mimicking the Google
 #### Authentication (`gemini_auth_base.py`)
- *   **Device Flow**: Uses a standard OAuth 2.0 flow. The `credential_tool` spins up a local web server (`localhost:8085`) to capture the callback from Google's auth page.
-*   **Token Lifecycle**:
     *   **Proactive Refresh**: Tokens are refreshed 5 minutes before expiry.
     *   **Atomic Writes**: Credential files are updated using a temp-file-and-move strategy to prevent corruption during writes.
     *   **Revocation Handling**: If a `400` or `401` occurs during refresh, the token is marked as revoked, preventing infinite retry loops.
@@ -907,7 +1043,7 @@ The provider employs a sophisticated, cached discovery mechanism to find a valid
 ### 3.3. iFlow (`iflow_provider.py`)
 *   **Hybrid Auth**: Uses a custom OAuth flow (Authorization Code) to obtain an `access_token`. However, the *actual* API calls use a separate `apiKey` that is retrieved from the user's profile (`/api/oauth/getUserInfo`) using the access token.
-*   **Callback Server**: The auth flow spins up a local server on port `11451` to capture the redirect.
 *   **Token Management**: Automatically refreshes the OAuth token and re-fetches the API key if needed.
 *   **Schema Cleaning**: Similar to Qwen, it aggressively sanitizes tool schemas to prevent 400 errors.
 *   **Dedicated Logging**: Implements `_IFlowFileLogger` to capture raw chunks for debugging proprietary API behaviors.
@@ -935,4 +1071,177 @@ To facilitate robust debugging, the proxy includes a comprehensive transaction l
 This level of detail allows developers to trace exactly why a request failed or why a specific key was rotated.

 - Headless environment detection
 - Sequential refresh queue processing
+#### OAuth Callback Port Configuration
+Each OAuth provider uses a local callback server during authentication. The callback port can be customized via environment variables to avoid conflicts with other services.
+**Default Ports:**
+| Provider | Default Port | Environment Variable |
+|----------|-------------|---------------------|
+| Gemini CLI | 8085 | `GEMINI_CLI_OAUTH_PORT` |
+| Antigravity | 51121 | `ANTIGRAVITY_OAUTH_PORT` |
+| iFlow | 11451 | `IFLOW_OAUTH_PORT` |
+**Configuration Methods:**
+1. **Via TUI Settings Menu:**
+   - Main Menu → `4. View Provider & Advanced Settings` → `1. Launch Settings Tool`
+   - Select the provider (Gemini CLI, Antigravity, or iFlow)
+   - Modify the `*_OAUTH_PORT` setting
+   - Use "Reset to Default" to restore the original port
+2. **Via `.env` file:**
+   ```env
+   # Custom OAuth callback ports (optional)
+   GEMINI_CLI_OAUTH_PORT=8085
+   ANTIGRAVITY_OAUTH_PORT=51121
+   IFLOW_OAUTH_PORT=11451
+   ```
+**When to Change Ports:**
+- If the default port conflicts with another service on your system
+- If running multiple proxy instances on the same machine
+- If firewall rules require specific port ranges
+**Note:** Port changes take effect on the next OAuth authentication attempt. Existing tokens are not affected.
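As a rough illustration of how a per-provider port override might be resolved, the sketch below reads the environment variable and falls back to the documented default. Only the variable names and default ports come from the table above; the helper `resolve_oauth_port` and its validation rules are assumptions made for this example, not the project's actual code.

```python
import os

# Defaults and variable names taken from the "Default Ports" table.
DEFAULT_OAUTH_PORTS = {
    "GEMINI_CLI_OAUTH_PORT": 8085,
    "ANTIGRAVITY_OAUTH_PORT": 51121,
    "IFLOW_OAUTH_PORT": 11451,
}


def resolve_oauth_port(env_var: str, environ=None) -> int:
    """Hypothetical helper: return the configured port or the documented default."""
    environ = os.environ if environ is None else environ
    default = DEFAULT_OAUTH_PORTS[env_var]
    raw = environ.get(env_var, "").strip()
    if not raw:
        return default
    try:
        port = int(raw)
    except ValueError:
        return default  # Non-numeric value: fall back rather than crash
    # Valid TCP ports are 1-65535; anything else falls back to the default.
    return port if 1 <= port <= 65535 else default
```

Falling back silently is one possible design; a real tool might prefer to warn the user about an invalid value.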
+---
+### 2.14. HTTP Timeout Configuration (`timeout_config.py`)
+Centralized timeout configuration for all HTTP requests to LLM providers.
+#### Purpose
+The `TimeoutConfig` class provides fine-grained control over HTTP timeouts for streaming and non-streaming LLM requests. This addresses the common issue of proxy hangs when upstream providers stall during connection establishment or response generation.
+#### Timeout Types Explained
+| Timeout | Description |
+|---------|-------------|
+| **connect** | Maximum time to establish a TCP/TLS connection to the upstream server |
+| **read** | Maximum time to wait between receiving data chunks (resets on each chunk for streaming) |
+| **write** | Maximum time to wait while sending the request body |
+| **pool** | Maximum time to wait for a connection from the connection pool |
+#### Default Values
+| Setting | Streaming | Non-Streaming | Rationale |
+|---------|-----------|---------------|-----------|
+| **connect** | 30s | 30s | Fast fail if server is unreachable |
+| **read** | 180s (3 min) | 600s (10 min) | Streaming expects periodic chunks; non-streaming may wait for full generation |
+| **write** | 30s | 30s | Request bodies are typically small |
+| **pool** | 60s | 60s | Reasonable wait for connection pool |
+#### Environment Variable Overrides
+All timeout values can be customized via environment variables:
+```env
+# Connection establishment timeout (seconds)
+TIMEOUT_CONNECT=30
+# Request body send timeout (seconds)
+TIMEOUT_WRITE=30
+# Connection pool acquisition timeout (seconds)
+TIMEOUT_POOL=60
+# Read timeout between chunks for streaming requests (seconds)
+# If no data arrives for this duration, the connection is considered stalled
+TIMEOUT_READ_STREAMING=180
+# Read timeout for non-streaming responses (seconds)
+# Longer to accommodate models that take time to generate full responses
+TIMEOUT_READ_NON_STREAMING=600
+```
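The environment overrides listed above could be combined with the documented defaults roughly as follows. This is a minimal sketch using only the standard library: the `Timeouts` dataclass and the `build_timeouts` and `_env_float` helpers are hypothetical names, not the real `TimeoutConfig` API. Only the variable names and default values are taken from the tables.

```python
import os
from dataclasses import dataclass


def _env_float(name: str, default: float, environ=None) -> float:
    """Read a positive number of seconds from the environment, else the default."""
    environ = os.environ if environ is None else environ
    try:
        value = float(environ.get(name, ""))
    except ValueError:
        return default  # Missing or non-numeric value
    return value if value > 0 else default


@dataclass(frozen=True)
class Timeouts:
    """Hypothetical container for the four timeout values, in seconds."""
    connect: float
    read: float
    write: float
    pool: float


def build_timeouts(streaming: bool, environ=None) -> Timeouts:
    # Streaming and non-streaming differ only in the read timeout.
    read_var, read_default = (
        ("TIMEOUT_READ_STREAMING", 180.0) if streaming
        else ("TIMEOUT_READ_NON_STREAMING", 600.0)
    )
    return Timeouts(
        connect=_env_float("TIMEOUT_CONNECT", 30.0, environ),
        read=_env_float(read_var, read_default, environ),
        write=_env_float("TIMEOUT_WRITE", 30.0, environ),
        pool=_env_float("TIMEOUT_POOL", 60.0, environ),
    )
```

If the underlying HTTP client is `httpx` (an assumption here), these four fields map directly onto `httpx.Timeout(connect=..., read=..., write=..., pool=...)`.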
+#### Streaming vs Non-Streaming Behavior
+**Streaming Requests** (`TimeoutConfig.streaming()`):
+- Uses shorter read timeout (default 3 minutes)
+- Timer resets every time a chunk arrives
+- If no data for 3 minutes → connection considered dead → failover to next credential
+- Appropriate for chat completions where tokens should arrive periodically
+**Non-Streaming Requests** (`TimeoutConfig.non_streaming()`):
+- Uses longer read timeout (default 10 minutes)
+- Server may take significant time to generate the complete response before sending anything
+- Complex reasoning tasks or large outputs may legitimately take several minutes
+- Only used by Antigravity provider's `_handle_non_streaming()` method
+#### Provider Usage
+The following providers use `TimeoutConfig`:
+| Provider | Method | Timeout Type |
+|----------|--------|--------------|
+| `antigravity_provider.py` | `_handle_non_streaming()` | `non_streaming()` |
+| `antigravity_provider.py` | `_handle_streaming()` | `streaming()` |
+| `gemini_cli_provider.py` | `acompletion()` | `streaming()` |
+| `iflow_provider.py` | `acompletion()` | `streaming()` |
+| `qwen_code_provider.py` | `acompletion()` | `streaming()` |
+**Note:** iFlow, Qwen Code, and Gemini CLI providers always use streaming internally (even for non-streaming requests), aggregating chunks into a complete response. Only Antigravity has a true non-streaming path.
+#### Tuning Recommendations
+| Use Case | Recommendation |
+|----------|----------------|
+| **Long thinking tasks** | Increase `TIMEOUT_READ_STREAMING` to 300-360s |
+| **Unstable network** | Increase `TIMEOUT_CONNECT` to 60s |
+| **High concurrency** | Increase `TIMEOUT_POOL` if seeing pool exhaustion |
+| **Large context/output** | Increase `TIMEOUT_READ_NON_STREAMING` to 900s+ |
+#### Example Configuration
+```env
+# For environments with complex reasoning tasks
+TIMEOUT_READ_STREAMING=300
+TIMEOUT_READ_NON_STREAMING=900
+# For unstable network conditions
+TIMEOUT_CONNECT=60
+TIMEOUT_POOL=120
+```
 ---
 #### Authentication (`gemini_auth_base.py`)
+ *   **Device Flow**: Uses a standard OAuth 2.0 flow. The `credential_tool` spins up a local web server (default: `localhost:8085`, configurable via `GEMINI_CLI_OAUTH_PORT`) to capture the callback from Google's auth page.
+ *   **Token Lifecycle**:
     *   **Proactive Refresh**: Tokens are refreshed 5 minutes before expiry.
     *   **Atomic Writes**: Credential files are updated using a temp-file-and-move strategy to prevent corruption during writes.
     *   **Revocation Handling**: If a `400` or `401` occurs during refresh, the token is marked as revoked, preventing infinite retry loops.
 ### 3.3. iFlow (`iflow_provider.py`)
 *   **Hybrid Auth**: Uses a custom OAuth flow (Authorization Code) to obtain an `access_token`. However, the *actual* API calls use a separate `apiKey` that is retrieved from the user's profile (`/api/oauth/getUserInfo`) using the access token.
+*   **Callback Server**: The auth flow spins up a local server (default: port `11451`, configurable via `IFLOW_OAUTH_PORT`) to capture the redirect.
 *   **Token Management**: Automatically refreshes the OAuth token and re-fetches the API key if needed.
 *   **Schema Cleaning**: Similar to Qwen, it aggressively sanitizes tool schemas to prevent 400 errors.
 *   **Dedicated Logging**: Implements `_IFlowFileLogger` to capture raw chunks for debugging proprietary API behaviors.
 This level of detail allows developers to trace exactly why a request failed or why a specific key was rotated.
+---
+## 5. Runtime Resilience
+The proxy is engineered to maintain high availability even in the face of runtime filesystem disruptions. This "Runtime Resilience" capability ensures that the service continues to process API requests even if data files or directories are deleted while the application is running.
+### 5.1. Centralized Resilient I/O (`resilient_io.py`)
+All file operations are centralized in a single utility module that provides consistent error handling, graceful degradation, and automatic retry with shutdown flush:
+#### `BufferedWriteRegistry` (Singleton)
+Global registry for buffered writes with periodic retry and shutdown flush. Ensures critical data is saved even if disk writes fail temporarily:
+- **Per-file buffering**: Each file path has its own pending write (latest data always wins)
+- **Periodic retries**: Background thread retries failed writes every 30 seconds
+- **Shutdown flush**: `atexit` hook ensures final write attempt on app exit (Ctrl+C)
+- **Thread-safe**: Safe for concurrent access from multiple threads
+```python
+# Get the singleton instance
+registry = BufferedWriteRegistry.get_instance()
+# Check pending writes (for monitoring)
+pending_count = registry.get_pending_count()
+pending_files = registry.get_pending_paths()
+# Manual flush (optional - atexit handles this automatically)
+results = registry.flush_all()  # Returns {path: success_bool}
+# Manual shutdown (if needed before atexit)
+results = registry.shutdown()
+```
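To make the "latest data always wins" behavior concrete, here is a simplified sketch of a per-path pending-write table. The class `PendingWrites` and its `write_fn` callback are illustrative assumptions; the real `BufferedWriteRegistry` also runs a background retry thread and an `atexit` hook, which are omitted here.

```python
import threading
from typing import Any, Callable, Dict


class PendingWrites:
    """Hypothetical stand-in: one pending payload per path, latest wins."""

    def __init__(self, write_fn: Callable[[str, Any], bool]):
        self._write_fn = write_fn  # Returns True when the disk write succeeded
        self._pending: Dict[str, Any] = {}
        self._lock = threading.Lock()

    def register(self, path: str, data: Any) -> None:
        with self._lock:
            self._pending[path] = data  # Overwrites any older payload for this path

    def pending_count(self) -> int:
        with self._lock:
            return len(self._pending)

    def flush_all(self) -> Dict[str, bool]:
        with self._lock:
            snapshot = dict(self._pending)
        results = {}
        for path, data in snapshot.items():
            try:
                ok = bool(self._write_fn(path, data))
            except OSError:
                ok = False
            results[path] = ok
            if ok:
                with self._lock:
                    # Drop the entry only if no newer payload arrived meanwhile.
                    if self._pending.get(path) is data:
                        del self._pending[path]
        return results
```

Writing outside the lock keeps slow disk I/O from blocking other threads that want to register new data.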
+#### `ResilientStateWriter`
+For stateful files that must persist (usage stats):
+- **Memory-first**: Always updates in-memory state before attempting disk write
+- **Atomic writes**: Uses tempfile + move pattern to prevent corruption
+- **Automatic retry with backoff**: If disk fails, waits `retry_interval` seconds before trying again
+- **Shutdown integration**: Registers with `BufferedWriteRegistry` on failure for final flush
+- **Health monitoring**: Exposes `is_healthy` property for monitoring
+```python
+writer = ResilientStateWriter("data.json", logger, retry_interval=30.0)
+writer.write({"key": "value"})  # Always succeeds (memory update)
+if not writer.is_healthy:
+    logger.warning("Disk writes failing, data in memory only")
+# On next write() call after retry_interval, disk write is attempted again
+# On app exit (Ctrl+C), BufferedWriteRegistry attempts final save
+```
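The memory-first and retry-interval behavior described for `ResilientStateWriter` can be sketched as below. This is an assumption-laden illustration, not the library's implementation: `MemoryFirstWriter`, the `write_fn` callback, and the injectable `clock` are hypothetical names chosen so the cooldown logic is easy to follow and test.

```python
import time


class MemoryFirstWriter:
    """Hypothetical sketch: memory is always updated; disk is retried after a cooldown."""

    def __init__(self, write_fn, retry_interval: float = 30.0, clock=time.monotonic):
        self._write_fn = write_fn
        self._retry_interval = retry_interval
        self._clock = clock
        self._state = None
        self._last_failure = None  # Timestamp of the most recent failed disk write

    @property
    def is_healthy(self) -> bool:
        return self._last_failure is None

    def write(self, data) -> bool:
        """Update memory, then attempt the disk write unless still cooling down."""
        self._state = data  # Memory first: this step cannot fail
        now = self._clock()
        if self._last_failure is not None and now - self._last_failure < self._retry_interval:
            return False  # Still cooling down; skip the disk attempt
        try:
            ok = bool(self._write_fn(data))
        except OSError:
            ok = False
        self._last_failure = None if ok else now
        return ok
```

In this sketch the return value reports whether the data reached disk, while the in-memory state is updated regardless.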
+#### `safe_write_json()`
+For JSON writes with configurable options (credentials, cache):
+| Parameter | Default | Description |
+|-----------|---------|-------------|
+| `path` | required | File path to write to |
+| `data` | required | JSON-serializable data |
+| `logger` | required | Logger for warnings |
+| `atomic` | `True` | Use atomic write pattern (tempfile + move) |
+| `indent` | `2` | JSON indentation level |
+| `ensure_ascii` | `True` | Escape non-ASCII characters |
+| `secure_permissions` | `False` | Set file permissions to 0o600 |
+| `buffer_on_failure` | `False` | Register with BufferedWriteRegistry on failure |
+When `buffer_on_failure=True`:
+- Failed writes are registered with `BufferedWriteRegistry`
+- Data is retried every 30 seconds in background
+- On app exit, final write attempt is made automatically
+- Success unregisters the pending write
+```python
+# For critical data (auth tokens) - use buffer_on_failure
+safe_write_json(path, creds, logger, secure_permissions=True, buffer_on_failure=True)
+# For non-critical data (logs) - no buffering needed
+safe_write_json(path, data, logger)
+```
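The "tempfile + move" pattern behind `atomic=True` can be illustrated with the standard library alone. The function `atomic_write_json` below is a hypothetical simplification written for this example; it does not reproduce the real `safe_write_json()` signature, logging, permissions handling, or buffering.

```python
import json
import os
import tempfile


def atomic_write_json(path: str, data, indent: int = 2) -> bool:
    """Hypothetical helper: write JSON via tempfile + os.replace; returns False on failure."""
    directory = os.path.dirname(os.path.abspath(path))
    tmp_path = None
    try:
        os.makedirs(directory, exist_ok=True)  # Recreate the directory if it was deleted
        # Create the temp file next to the target so the final rename stays on one filesystem.
        fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(data, handle, indent=indent)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_path, path)  # Readers see either the old file or the new one
        return True
    except (OSError, TypeError, ValueError):
        # Clean up the partial temp file; the existing target is left untouched.
        if tmp_path and os.path.exists(tmp_path):
            try:
                os.remove(tmp_path)
            except OSError:
                pass
        return False
```

Because the target is only swapped in after the temp file is fully written, a crash mid-write leaves the previous credentials file intact.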
+#### `safe_log_write()`
+For log files where occasional loss is acceptable:
+- Fire-and-forget pattern
+- Creates parent directories if needed
+- Returns `True`/`False`, never raises
+- **No buffering** - logs are dropped on failure
+#### `safe_mkdir()`
+For directory creation with error handling.
+### 5.2. Resilience Hierarchy
+The system follows a strict hierarchy of survival:
+1. **Core API Handling (Level 1)**: The Python runtime keeps all necessary code in memory. Deleting source code files while the proxy is running will **not** crash active requests.
+2. **Credential Management (Level 2)**: OAuth tokens are cached in memory first. If credential files are deleted, the proxy continues using cached tokens. If a token refresh succeeds but the file cannot be written, the new token is buffered for retry and saved on shutdown.
+3. **Usage Tracking (Level 3)**: Usage statistics (`key_usage.json`) are maintained in memory via `ResilientStateWriter`. If the file is deleted, the system tracks usage internally and attempts to recreate the file on the next save interval. Pending writes are flushed on shutdown.
+4. **Provider Cache (Level 4)**: The provider cache tracks disk health and continues operating in memory-only mode if disk writes fail. Has its own shutdown mechanism.
+5. **Logging (Level 5)**: Logging is treated as non-critical. If the `logs/` directory is removed, the system attempts to recreate it. If creation fails, logging degrades gracefully without interrupting the request flow. **No buffering or retry**.
+### 5.3. Component Integration
+| Component | Utility Used | Behavior on Disk Failure | Shutdown Flush |
+|-----------|--------------|--------------------------|----------------|
+| `UsageManager` | `ResilientStateWriter` | Continues in memory, retries after 30s | Yes (via registry) |
+| `GoogleOAuthBase` | `safe_write_json(buffer_on_failure=True)` | Memory cache preserved, buffered for retry | Yes (via registry) |
+| `QwenAuthBase` | `safe_write_json(buffer_on_failure=True)` | Memory cache preserved, buffered for retry | Yes (via registry) |
+| `IFlowAuthBase` | `safe_write_json(buffer_on_failure=True)` | Memory cache preserved, buffered for retry | Yes (via registry) |
+| `ProviderCache` | `safe_write_json` + own shutdown | Retries via own background loop | Yes (own mechanism) |
+| `DetailedLogger` | `safe_write_json` | Logs dropped, no crash | No |
+| `failure_logger` | Python `logging.RotatingFileHandler` | Falls back to NullHandler | No |
+### 5.4. Shutdown Behavior
+When the application exits (including Ctrl+C):
+1. **atexit handler fires**: `BufferedWriteRegistry._atexit_handler()` is called
+2. **Pending writes counted**: Registry checks how many files have pending writes
+3. **Flush attempted**: Each pending file gets a final write attempt
+4. **Results logged**:
+   - Success: `"Shutdown flush: all N write(s) succeeded"`
+   - Partial: `"Shutdown flush: X succeeded, Y failed"` with failed file names
+**Console output example:**
+```
+INFO:rotator_library.resilient_io:Flushing 2 pending write(s) on shutdown...
+INFO:rotator_library.resilient_io:Shutdown flush: all 2 write(s) succeeded
+```
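The shutdown sequence above (count, flush, summarize) might look roughly like the following. The function `flush_on_exit`, its arguments, and the logger name are assumptions for illustration; the log message wording mirrors the examples quoted above, but this is not the actual `_atexit_handler()` code.

```python
import logging

logger = logging.getLogger("resilient_io_sketch")


def flush_on_exit(pending: dict, write_fn) -> dict:
    """Hypothetical shutdown hook: try each pending write once and log a summary."""
    if not pending:
        return {}
    logger.info("Flushing %d pending write(s) on shutdown...", len(pending))
    results = {}
    for path, data in list(pending.items()):
        try:
            results[path] = bool(write_fn(path, data))
        except Exception:
            results[path] = False  # Never let a failed write abort the shutdown
    failed = [p for p, ok in results.items() if not ok]
    if failed:
        logger.warning(
            "Shutdown flush: %d succeeded, %d failed: %s",
            len(results) - len(failed), len(failed), ", ".join(failed),
        )
    else:
        logger.info("Shutdown flush: all %d write(s) succeeded", len(results))
    return results


# Registration would look like this (pending_writes and write_fn are placeholders):
#   import atexit
#   atexit.register(flush_on_exit, pending_writes, write_fn)
```

Note that `atexit` hooks run on normal interpreter exit, including after an unhandled `KeyboardInterrupt`, but not when the process is killed outright (for example by `SIGKILL`) or exits via `os._exit()`.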
+### 5.5. "Develop While Running"
+This architecture supports a robust development workflow:
+- **Log Cleanup**: You can safely run `rm -rf logs/` while the proxy is serving traffic. The system will recreate the directory structure on the next request.
+- **Config Reset**: Deleting `key_usage.json` resets the persistence layer, but the running instance preserves its current in-memory counts for load balancing consistency.
+- **File Recovery**: If you delete a critical file, the system attempts directory auto-recreation before every write operation.
+- **Safe Exit**: Ctrl+C triggers graceful shutdown with final data flush attempt.
+### 5.6. Graceful Degradation & Data Loss
+While functionality is preserved, persistence may be compromised during filesystem failures:
+- **Logs**: If disk writes fail, detailed request logs may be lost (no buffering).
+- **Usage Stats**: Buffered in memory and flushed on shutdown. Data loss only if shutdown flush also fails.
+- **Credentials**: Buffered in memory and flushed on shutdown. Re-authentication only needed if shutdown flush fails.
+- **Cache**: Provider cache entries may need to be regenerated after restart if its own shutdown mechanism fails.
+### 5.7. Monitoring Disk Health
+Components expose health information for monitoring:
+```python
+# BufferedWriteRegistry
+registry = BufferedWriteRegistry.get_instance()
+pending = registry.get_pending_count()  # Number of files with pending writes
+files = registry.get_pending_paths()    # List of pending file names
+# UsageManager
+writer = usage_manager._state_writer
+health = writer.get_health_info()
+# Returns: {"healthy": True, "failure_count": 0, "last_success": 1234567890.0, ...}
+# ProviderCache
+stats = cache.get_stats()
+# Includes: {"disk_available": True, "disk_errors": 0, ...}
+```

README.md CHANGED

@@ -1,755 +1,763 @@
-# Universal LLM API Proxy & Resilience Library [![ko-fi](https://ko-fi.com/img/githubbutton_sm.svg)](https://ko-fi.com/C0C0UZS4P)
 [![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/Mirrowel/LLM-API-Key-Proxy) [![zread](https://img.shields.io/badge/Ask_Zread-_.svg?style=flat&color=00b0aa&labelColor=000000&logo=data%3Aimage%2Fsvg%2Bxml%3Bbase64%2CPHN2ZyB3aWR0aD0iMTYiIGhlaWdodD0iMTYiIHZpZXdCb3g9IjAgMCAxNiAxNiIgZmlsbD0ibm9uZSIgeG1sbnM9Imh0dHA6Ly93d3cudzMub3JnLzIwMDAvc3ZnIj4KPHBhdGggZD0iTTQuOTYxNTYgMS42MDAxSDIuMjQxNTZDMS44ODgxIDEuNjAwMSAxLjYwMTU2IDEuODg2NjQgMS42MDE1NiAyLjI0MDFWNC45NjAxQzEuNjAxNTYgNS4zMTM1NiAxLjg4ODEgNS42MDAxIDIuMjQxNTYgNS42MDAxSDQuOTYxNTZDNS4zMTUwMiA1LjYwMDEgNS42MDE1NiA1LjMxMzU2IDUuNjAxNTYgNC45NjAxVjIuMjQwMUM1LjYwMTU2IDEuODg2NjQgNS4zMTUwMiAxLjYwMDEgNC45NjE1NiAxLjYwMDFaIiBmaWxsPSIjZmZmIi8%2BCjxwYXRoIGQ9Ik00Ljk2MTU2IDEwLjM5OTlIMi4yNDE1NkMxLjg4ODEgMTAuMzk5OSAxLjYwMTU2IDEwLjY4NjQgMS42MDE1NiAxMS4wMzk5VjEzLjc1OTlDMS42MDE1NiAxNC4xMTM0IDEuODg4MSAxNC4zOTk5IDIuMjQxNTYgMTQuMzk5OUg0Ljk2MTU2QzUuMzE1MDIgMTQuMzk5OSA1LjYwMTU2IDE0LjExMzQgNS42MDE1NiAxMy43NTk5VjExLjAzOTlDNS42MDE1NiAxMC42ODY0IDUuMzE1MDIgMTAuMzk5OSA0Ljk2MTU2IDEwLjM5OTlaIiBmaWxsPSIjZmZmIi8%2BCjxwYXRoIGQ9Ik0xMy43NTg0IDEuNjAwMUgxMS4wMzg0QzEwLjY4NSAxLjYwMDEgMTAuMzk4NCAxLjg4NjY0IDEwLjM5ODQgMi4yNDAxVjQuOTYwMUMxMC4zOTg0IDUuMzEzNTYgMTAuNjg1IDUuNjAwMSAxMS4wMzg0IDUuNjAwMUgxMy43NTg0QzE0LjExMTkgNS42MDAxIDE0LjM5ODQgNS4zMTM1NiAxNC4zOTg0IDQuOTYwMVYyLjI0MDFDMTQuMzk4NCAxLjg4NjY0IDE0LjExMTkgMS42MDAxIDEzLjc1ODQgMS42MDAxWiIgZmlsbD0iI2ZmZiIvPgo8cGF0aCBkPSJNNCAxMkwxMiA0TDQgMTJaIiBmaWxsPSIjZmZmIi8%2BCjxwYXRoIGQ9Ik00IDEyTDEyIDQiIHN0cm9rZT0iI2ZmZiIgc3Ryb2tlLXdpZHRoPSIxLjUiIHN0cm9rZS1saW5lY2FwPSJyb3VuZCIvPgo8L3N2Zz4K&logoColor=ffffff)](https://zread.ai/Mirrowel/LLM-API-Key-Proxy)
-## Detailed Setup and Features
-This project provides a powerful solution for developers building complex applications, such as agentic systems, that interact with multiple Large Language Model (LLM) providers. It consists of two distinct but complementary components:
-1.  **A Universal API Proxy**: A self-hosted FastAPI application that provides a single, OpenAI-compatible endpoint for all your LLM requests. Powered by `litellm`, it allows you to seamlessly switch between different providers and models without altering your application's code.
-2.  **A Resilience & Key Management Library**: The core engine that powers the proxy. This reusable Python library intelligently manages a pool of API keys to ensure your application is highly available and resilient to transient provider errors or performance issues.
-## Features
--   **Universal API Endpoint**: Simplifies development by providing a single, OpenAI-compatible interface for diverse LLM providers.
--   **High Availability**: The underlying library ensures your application remains operational by gracefully handling transient provider errors and API key-specific issues.
--   **Resilient Performance**: A global timeout on all requests prevents your application from hanging on unresponsive provider APIs.
--   **Advanced Concurrency Control**: A single API key can be used for multiple concurrent requests. By default, it supports concurrent requests to *different* models. With configuration (`MAX_CONCURRENT_REQUESTS_PER_KEY_<PROVIDER>`), it can also support multiple concurrent requests to the *same* model using the same key.
--   **Intelligent Key Management**: Optimizes request distribution across your pool of keys by selecting the best available one for each call.
--   **Automated OAuth Discovery**: Automatically discovers, validates, and manages OAuth credentials from standard provider directories (e.g., `~/.gemini/`, `~/.qwen/`, `~/.iflow/`).
--   **Stateless Deployment Support**: Deploy easily to platforms like Railway, Render, or Vercel. The new export tool converts complex OAuth credentials (Gemini CLI, Qwen, iFlow) into simple environment variables, removing the need for persistent storage or file uploads.
--   **Batch Request Processing**: Efficiently aggregates multiple embedding requests into single batch API calls, improving throughput and reducing rate limit hits.
--   **New Provider Support**: Full support for **iFlow** (API Key & OAuth), **Qwen Code** (API Key & OAuth), and **NVIDIA NIM** with DeepSeek thinking support, including special handling for their API quirks (tool schema cleaning, reasoning support, dedicated logging).
--   **Duplicate Credential Detection**: Intelligently detects if multiple local credential files belong to the same user account and logs a warning, preventing redundancy in your key pool.
--   **Escalating Per-Model Cooldowns**: If a key fails for a specific model, it's placed on a temporary, escalating cooldown for that model, allowing it to be used with others.
--   **Automatic Daily Resets**: Cooldowns and usage statistics are automatically reset daily, making the system self-maintaining.
--   **Detailed Request Logging**: Enable comprehensive logging for debugging. Each request gets its own directory with full request/response details, streaming chunks, and performance metadata.
--   **Provider Agnostic**: Compatible with any provider supported by `litellm`.
--   **OpenAI-Compatible Proxy**: Offers a familiar API interface with additional endpoints for model and provider discovery.
--   **Advanced Model Filtering**: Supports both blacklists and whitelists to give you fine-grained control over which models are available through the proxy.
--   **🆕 Antigravity Provider**: Full support for Google's internal Antigravity API, providing access to Gemini 3 and Claude models with advanced features:
-    - **🚀 Claude Opus 4.5** - Anthropic's most powerful model (thinking mode only)
-    - **Claude Sonnet 4.5** - Supports both thinking and non-thinking modes
-    - **Gemini 3 Pro** - With thinkingLevel support (low/high)
-    - Credential prioritization with automatic paid/free tier detection
-    - Thought signature caching for multi-turn conversations
-    - Tool hallucination prevention via parameter signature injection
-    - Automatic thinking block sanitization for Claude models (with recovery strategies)
-    - Note: Claude thinking mode requires careful conversation state management (see [Antigravity documentation](DOCUMENTATION.md#antigravity-claude-extended-thinking-sanitization) for details)
--   **🆕 Credential Prioritization**: Automatic tier detection and priority-based credential selection ensures paid-tier credentials are used for premium models that require them.
--   **🆕 Sequential Rotation Mode**: Choose between balanced (distribute load evenly) or sequential (use until exhausted) credential rotation strategies. Sequential mode maximizes cache hit rates for providers like Antigravity.
--   **🆕 Per-Model Quota Tracking**: Granular per-model usage tracking with authoritative quota reset timestamps from provider error responses. Each model maintains its own window with `window_start_ts` and `quota_reset_ts`.
--   **🆕 Model Quota Groups**: Group models that share quota limits (e.g., Claude Sonnet and Opus). When one model in a group hits quota, all receive the same cooldown timestamp.
--   **🆕 Priority-Based Concurrency**: Assign credentials to priority tiers (1=highest) with configurable concurrency multipliers. Paid-tier credentials can handle more concurrent requests than free-tier ones.
--   **🆕 Provider-Specific Quota Parsing**: Extended provider interface with `parse_quota_error()` method to extract precise retry-after times from provider-specific error formats (e.g., Google RPC format).
--   **🆕 Flexible Rolling Windows**: Support for provider-specific quota reset configurations (5-hour, 7-day, etc.) replacing hardcoded daily resets.
--   **🆕 Weighted Random Rotation**: Configurable credential rotation strategy - choose between deterministic (perfect balance) or weighted random (unpredictable, harder to fingerprint) selection.
--   **🆕 Enhanced Gemini CLI**: Improved project discovery, paid vs free tier detection, and Gemini 3 support with thoughtSignature caching.
--   **🆕 Temperature Override**: Global temperature=0 override option to prevent tool hallucination issues with low-temperature settings.
--   **🆕 Provider Cache System**: Modular caching system for preserving conversation state (thought signatures, thinking content) across requests.
--   **🆕 Refactored OAuth Base**: Shared [`GoogleOAuthBase`](src/rotator_library/providers/google_oauth_base.py) class eliminates code duplication across OAuth providers.
--   **🆕 Interactive Launcher TUI**: Beautiful, cross-platform TUI for configuration and management with an integrated settings tool for advanced configuration.
 ---
-## 1. Quick Start
-### Windows (Simplest)
-1.  **Download the latest release** from the [GitHub Releases page](https://github.com/Mirrowel/LLM-API-Key-Proxy/releases/latest).
-2.  Unzip the downloaded file.
-3.  **Run the executable** (run without arguments). This launches the **interactive TUI launcher** which allows you to:
-    -   🚀 Run the proxy server with your configured settings
-    -   ⚙️ Configure proxy settings (Host, Port, PROXY_API_KEY, Request Logging)
-    -   🔑 Manage credentials (add/edit API keys & OAuth credentials)
-    -   📊 View provider status and advanced settings
-    -   🔧 Configure advanced settings interactively (custom API bases, model definitions, concurrency limits)
-    -   🔄 Reload configuration without restarting
-> **Note:** The legacy `launcher.bat` is deprecated.
 ### macOS / Linux
-**Option A: Using the Executable (Recommended)**
-If you downloaded the pre-compiled binary for your platform, no Python installation is required.
-1.  **Download the latest release** from the GitHub Releases page.
-2.  Open a terminal and make the binary executable:
-    ```bash
-    chmod +x proxy_app
-    ```
-3.  **Run the Interactive Launcher**:
-    ```bash
-    ./proxy_app
-    ```
-    This launches the TUI where you can configure and run the proxy.
-4.  **Or run directly with arguments** to bypass the launcher:
-    ```bash
-    ./proxy_app --host 0.0.0.0 --port 8000
-    ```
-**Option B: Manual Setup (Source Code)**
-If you are running from source, use these commands:
-**1. Install Dependencies**
 ```bash
-# Ensure you have Python 3.10+ installed
-python3 -m venv venv
-source venv/bin/activate
-pip install -r requirements.txt
 ```
-**2. Launch the Interactive TUI**
 ```bash
-export PYTHONPATH=$PYTHONPATH:$(pwd)/src
 python src/proxy_app/main.py
 ```
-**3. Or run directly with arguments to bypass the launcher**
-```bash
-export PYTHONPATH=$PYTHONPATH:$(pwd)/src
-python src/proxy_app/main.py --host 0.0.0.0 --port 8000
-```
-*To enable logging, add `--enable-request-logging` to the command.*
 ---
-## 2. Interactive TUI Launcher
-The proxy now includes a powerful **interactive Text User Interface (TUI)** that makes configuration and management effortless.
-### Features
-- **🎯 Main Menu**:
-  - Run proxy server with saved settings
-  - Configure proxy settings (host, port, API key, logging)
-  - Manage credentials (API keys & OAuth)
-  - View provider & advanced settings status
-  - Reload configuration
-- **🔧 Advanced Settings Tool**:
-  - Configure custom OpenAI-compatible providers
-  - Define provider models (simple or advanced JSON format)
-  - Set concurrency limits per provider
-  - Configure rotation modes (balanced vs sequential)
-  - Manage priority-based concurrency multipliers
-  - Interactive numbered menus for easy selection
-  - Pending changes system with save/discard options
-- **📊 Status Dashboard**:
-  - Shows configured providers and credential counts
-  - Displays custom providers and API bases
-  - Shows active advanced settings
-  - Real-time configuration status
-### How to Use
-**Running without arguments launches the TUI:**
-```bash
-# Windows
-proxy_app.exe
-# macOS/Linux
-./proxy_app
-# From source
-python src/proxy_app/main.py
 ```
-**Running with arguments bypasses the TUI:**
 ```bash
-# Direct startup (skips TUI)
-proxy_app.exe --host 0.0.0.0 --port 8000
 ```
-### Configuration Files
-The TUI manages two configuration files:
-- **`launcher_config.json`**: Stores launcher-specific settings (host, port, logging preference)
-- **`.env`**: Stores all credentials and advanced settings (PROXY_API_KEY, provider credentials, custom settings)
-All advanced settings configured through the TUI are stored in `.env` for compatibility with manual editing and deployment platforms.
----
-## 3. Detailed Setup (From Source)
-This guide is for users who want to run the proxy from the source code on any operating system.
-### Step 1: Clone and Install
-First, clone the repository and install the required dependencies into a virtual environment.
-**Linux/macOS:**
-```bash
-# Clone the repository
-git clone https://github.com/Mirrowel/LLM-API-Key-Proxy.git
-cd LLM-API-Key-Proxy
-# Create and activate a virtual environment
-python3 -m venv venv
-source venv/bin/activate
-# Install dependencies
-pip install -r requirements.txt
-```
-**Windows:**
-```powershell
-# Clone the repository
-git clone https://github.com/Mirrowel/LLM-API-Key-Proxy.git
-cd LLM-API-Key-Proxy
-# Create and activate a virtual environment
-python -m venv venv
-.\venv\Scripts\Activate.ps1
-# Install dependencies
-pip install -r requirements.txt
-```
-### Step 2: Configure API Keys
-Create a `.env` file to store your secret keys. You can do this by copying the example file.
-**Linux/macOS:**
 ```bash
-cp .env.example .env
 ```
-**Windows:**
-```powershell
-copy .env.example .env
 ```
-Now, open the new `.env` file and add your keys.
-**Refer to the `.env.example` file for the correct format and a full list of supported providers.**
-The proxy supports two types of credentials:
-1.  **API Keys**: Standard secret keys from providers like OpenAI, Anthropic, etc.
-2.  **OAuth Credentials**: For services that use OAuth 2.0, like the Gemini CLI.
-#### Automated Credential Discovery (Recommended)
-For many providers, **no configuration is necessary**. The proxy automatically discovers and manages credentials from their default locations:
--   **API Keys**: Scans your environment variables for keys matching the format `PROVIDER_API_KEY_1` (e.g., `GEMINI_API_KEY_1`).
--   **OAuth Credentials**: Scans default system directories (e.g., `~/.gemini/`, `~/.qwen/`, `~/.iflow/`) for all `*.json` credential files.
-You only need to create a `.env` file to set your `PROXY_API_KEY` and to override or add credentials if the automatic discovery doesn't suit your needs.
-#### Interactive Credential Management Tool
-The proxy includes a powerful interactive CLI tool for managing all your credentials. This is the recommended way to set up credentials:
-```bash
-python -m rotator_library.credential_tool
 ```
-**Or use the TUI Launcher** (recommended):
-```bash
-python src/proxy_app/main.py
-# Then select "3. 🔑 Manage Credentials"
-```
-**Main Menu Features:**
-1. **Add OAuth Credential** - Interactive OAuth flow for Gemini CLI, Antigravity, Qwen Code, and iFlow
-   - Automatically opens your browser for authentication
-   - Handles the entire OAuth flow including callbacks
-   - Saves credentials to the local `oauth_creds/` directory
-   - For Gemini CLI: Automatically discovers or creates a Google Cloud project
-   - For Antigravity: Similar to Gemini CLI with Antigravity-specific scopes
-   - For Qwen Code: Uses Device Code flow (you'll enter a code in your browser)
-   - For iFlow: Starts a local callback server on port 11451
-2. **Add API Key** - Add standard API keys for any LiteLLM-supported provider
-   - Interactive prompts guide you through the process
-   - Automatically saves to your `.env` file
-   - Supports multiple keys per provider (numbered automatically)
-3. **Export Credentials to .env** - The "Stateless Deployment" feature
-   - Converts file-based OAuth credentials into environment variables
-   - Essential for platforms without persistent file storage
-   - Generates a ready-to-paste `.env` block for each credential
-**Stateless Deployment Workflow (Railway, Render, Vercel, etc.):**
-If you're deploying to a platform without persistent file storage:
-1. **Setup credentials locally first**:
-   ```bash
-   python -m rotator_library.credential_tool
-   # Select "Add OAuth Credential" and complete the flow
-   ```
-2. **Export to environment variables**:
-   ```bash
-   python -m rotator_library.credential_tool
-   # Select "Export Gemini CLI to .env" (or Qwen/iFlow)
-   # Choose your credential file
-   ```
-3. **Copy the generated output**:
-   - The tool creates a file like `gemini_cli_credential_1.env`
-   - Contains all necessary `GEMINI_CLI_*` variables
-4. **Paste into your hosting platform**:
-   - Add each variable to your platform's environment settings
-   - Set `SKIP_OAUTH_INIT_CHECK=true` to skip interactive validation
-   - No credential files needed; everything loads from environment variables
-**Local-First OAuth Management:**
-The proxy uses a "local-first" approach for OAuth credentials:
-- **Local Storage**: All OAuth credentials are stored in `oauth_creds/` directory
-- **Automatic Discovery**: On first run, the proxy scans system paths (`~/.gemini/`, `~/.qwen/`, `~/.iflow/`) and imports found credentials
-- **Deduplication**: Intelligently detects duplicate accounts (by email/user ID) and warns you
-- **Priority**: Local files take priority over system-wide credentials
-- **No System Pollution**: Your project's credentials are isolated from global system credentials
-**Example `.env` configuration:**
-```env
-# A secret key for your proxy server to authenticate requests.
-# This can be any secret string you choose.
-PROXY_API_KEY="a-very-secret-and-unique-key"
-# --- Provider API Keys (Optional) ---
-# The proxy automatically finds keys in your environment variables.
-# You can also define them here. Add multiple keys by numbering them (_1, _2).
-GEMINI_API_KEY_1="YOUR_GEMINI_API_KEY_1"
-GEMINI_API_KEY_2="YOUR_GEMINI_API_KEY_2"
-OPENROUTER_API_KEY_1="YOUR_OPENROUTER_API_KEY_1"
-# --- OAuth Credentials (Optional) ---
-# The proxy automatically finds credentials in standard system paths.
-# You can override this by specifying a path to your credential file.
-GEMINI_CLI_OAUTH_1="/path/to/your/specific/gemini_creds.json"
-# --- Gemini CLI: Stateless Deployment Support ---
-# For hosts without file persistence (Railway, Render, etc.), you can provide
-# Gemini CLI credentials directly via environment variables:
-GEMINI_CLI_ACCESS_TOKEN="ya29.your-access-token"
-GEMINI_CLI_REFRESH_TOKEN="1//your-refresh-token"
-GEMINI_CLI_EXPIRY_DATE="1234567890000"
-GEMINI_CLI_EMAIL="your-email@gmail.com"
-# Optional: GEMINI_CLI_PROJECT_ID, GEMINI_CLI_CLIENT_ID, etc.
-# See IMPLEMENTATION_SUMMARY.md for full list of supported variables
-# --- Dual Authentication Support ---
-# Some providers (qwen_code, iflow) support BOTH OAuth and direct API keys.
-# You can use either method, or mix both for credential rotation:
-QWEN_CODE_API_KEY_1="your-qwen-api-key"  # Direct API key
-# AND/OR use OAuth: oauth_creds/qwen_code_oauth_1.json
-IFLOW_API_KEY_1="sk-your-iflow-key"      # Direct API key
-# AND/OR use OAuth: oauth_creds/iflow_oauth_1.json
-```
-### 4. Run the Proxy
-You can run the proxy in two ways:
-**A) Using the Compiled Executable (Recommended)**
-A pre-compiled, standalone executable for Windows is available on the [latest GitHub Release](https://github.com/Mirrowel/LLM-API-Key-Proxy/releases/latest). This is the easiest way to get started as it requires no setup.
-For the simplest experience, follow the **Quick Start** guide at the top of this document.
-**B) Running from Source**
-Start the server by running the `main.py` script:
-```bash
-python src/proxy_app/main.py
-```
-This launches the interactive TUI launcher by default. To run the proxy directly, use:
-```bash
-python src/proxy_app/main.py --host 0.0.0.0 --port 8000
-```
-The proxy is now running and available at `http://127.0.0.1:8000`.
-### 5. Make a Request
-You can now send requests to the proxy. The endpoint is `http://127.0.0.1:8000/v1/chat/completions`.
-Remember to:
-1.  Set the `Authorization` header to `Bearer your-super-secret-proxy-key`.
-2.  Specify the `model` in the format `provider/model_name`.
-Here is an example using `curl`:
-```bash
-curl -X POST http://127.0.0.1:8000/v1/chat/completions \
--H "Content-Type: application/json" \
--H "Authorization: Bearer your-super-secret-proxy-key" \
--d '{
-    "model": "gemini/gemini-2.5-flash",
-    "messages": [{"role": "user", "content": "What is the capital of France?"}]
-}'
-```
 ---
-## Advanced Usage
-### Using with the OpenAI Python Library (Recommended)
-The proxy is OpenAI-compatible, so you can use it directly with the `openai` Python client.
-```python
-import openai
-# Point the client to your local proxy
-client = openai.OpenAI(
-    base_url="http://127.0.0.1:8000/v1",
-    api_key="a-very-secret-and-unique-key" # Use your PROXY_API_KEY here
-)
-# Make a request
-response = client.chat.completions.create(
-    model="gemini/gemini-2.5-flash", # Specify provider and model
-    messages=[
-        {"role": "user", "content": "Write a short poem about space."}
-    ]
-)
-print(response.choices[0].message.content)
 ```
-### Using with `curl`
-```bash
-You can also send requests directly using tools like `curl`.
-```bash
-curl -X POST http://127.0.0.1:8000/v1/chat/completions \
--H "Content-Type: application/json" \
--H "Authorization: Bearer a-very-secret-and-unique-key" \
--d '{
-    "model": "gemini/gemini-2.5-flash",
-    "messages": [{"role": "user", "content": "What is the capital of France?"}]
-}'
 ```
-### Available API Endpoints
--   `POST /v1/chat/completions`: The main endpoint for making chat requests.
--   `POST /v1/embeddings`: The endpoint for creating embeddings.
--   `GET /v1/models`: Returns a list of all available models from your configured providers.
--   `GET /v1/providers`: Returns a list of all configured providers.
--   `POST /v1/token-count`: Calculates the token count for a given message payload.
----
-## 4. Advanced Topics
-### Batch Request Processing
-The proxy includes a `Batch Manager` that optimizes high-volume embedding requests.
-- **Automatic Aggregation**: Multiple individual embedding requests are automatically collected into a single batch API call.
-- **Configurable**: Works out of the box, but can be tuned for specific needs.
-- **Benefits**: Significantly reduces the number of HTTP requests to providers, helping you stay within rate limits while improving throughput.
-### How It Works
-The proxy is built on a robust architecture:
-1.  **Intelligent Routing**: The `UsageManager` selects the best available key from your pool. It prioritizes idle keys first, then keys that can handle concurrency, ensuring optimal load balancing.
-2.  **Resilience & Deadlines**: Every request has a strict deadline (`global_timeout`). If a provider is slow or fails, the proxy retries with a different key immediately, ensuring your application never hangs.
-3.  **Batching**: High-volume embedding requests are automatically aggregated into optimized batches, reducing API calls and staying within rate limits.
-4.  **Deep Observability**: (Optional) Detailed logs capture every byte of the transaction, including raw streaming chunks, for precise debugging of complex agentic interactions.
-### Command-Line Arguments and Scripts
-The proxy server can be configured at runtime using the following command-line arguments:
--   `--host`: The IP address to bind the server to. Defaults to `0.0.0.0` (accessible from your local network).
--   `--port`: The port to run the server on. Defaults to `8000`.
--   `--enable-request-logging`: A flag to enable detailed, per-request logging. When active, the proxy creates a unique directory for each transaction in the `logs/detailed_logs/` folder, containing the full request, response, streaming chunks, and performance metadata. This is highly recommended for debugging.
-### New Provider Highlights
-#### **Gemini CLI (Advanced)**
-A powerful provider that mimics the Google Cloud Code extension.
--   **Zero-Config Project Discovery**: Automatically finds your Google Cloud Project ID or onboards you to a free-tier project if none exists.
--   **Internal API Access**: Uses high-limit internal endpoints (`cloudcode-pa.googleapis.com`) rather than the public Vertex AI API.
--   **Smart Rate Limiting**: Automatically falls back to preview models (e.g., `gemini-2.5-pro-preview`) if the main model hits a rate limit.
-#### **Qwen Code**
--   **Dual Authentication**: Use either standard API keys or OAuth 2.0 Device Flow credentials.
--   **Schema Cleaning**: Automatically removes `strict` and `additionalProperties` from tool schemas to prevent API errors.
--   **Stream Stability**: Injects a dummy `do_not_call_me` tool to prevent stream corruption issues when no tools are provided.
--   **Reasoning Support**: Parses `<think>` tags in responses and exposes them as `reasoning_content` (similar to OpenAI's o1 format).
--   **Dedicated Logging**: Optional per-request file logging to `logs/qwen_code_logs/` for debugging.
--   **Custom Models**: Define additional models via `QWEN_CODE_MODELS` environment variable (JSON array format).
-#### **iFlow**
--   **Dual Authentication**: Use either standard API keys or OAuth 2.0 Authorization Code Flow.
--   **Hybrid Auth**: OAuth flow provides an access token, but actual API calls use a separate `apiKey` retrieved from user profile.
--   **Local Callback Server**: OAuth flow runs a temporary server on port 11451 to capture the redirect.
--   **Schema Cleaning**: Same as Qwen Code - removes unsupported properties from tool schemas.
--   **Stream Stability**: Injects placeholder tools to stabilize streaming for empty tool lists.
--   **Dedicated Logging**: Optional per-request file logging to `logs/iflow_logs/` for debugging proprietary API behaviors.
--   **Custom Models**: Define additional models via `IFLOW_MODELS` environment variable (JSON array format).
-### Advanced Configuration
-The following advanced settings can be added to your `.env` file (or configured interactively via the TUI Settings Tool):
-#### OAuth and Refresh Settings
--   **`OAUTH_REFRESH_INTERVAL`**: Controls how often (in seconds) the background refresher checks for expired OAuth tokens. Default is `600` (10 minutes).
-    ```env
-    OAUTH_REFRESH_INTERVAL=600  # Check every 10 minutes
-    ```
--   **`SKIP_OAUTH_INIT_CHECK`**: Set to `true` to skip the interactive OAuth setup/validation check on startup. Essential for non-interactive environments like Docker containers or CI/CD pipelines.
-    ```env
-    SKIP_OAUTH_INIT_CHECK=true
-#### **Antigravity (Advanced - Gemini 3 / Claude Opus 4.5 / Sonnet 4.5 Access)**
-The newest and most sophisticated provider, offering access to cutting-edge models via Google's internal Antigravity API.
 **Supported Models:**
--   Gemini 2.5 (Pro/Flash) with `thinkingBudget` parameter
--   **Gemini 3 Pro (High/Low)** - Latest preview models
--   **🆕 Claude Opus 4.5 + Thinking** - Anthropic's most powerful model via Antigravity proxy
--   **Claude Sonnet 4.5 + Thinking** via Antigravity proxy
-**Advanced Features:**
--   **Thought Signature Caching**: Preserves encrypted signatures for multi-turn Gemini 3 conversations
--   **Tool Hallucination Prevention**: Automatic system instruction and parameter signature injection for Gemini 3 to prevent tools from being called with incorrect parameters
--   **Thinking Preservation**: Caches Claude thinking content for consistency across conversation turns
--   **Automatic Fallback**: Tries sandbox endpoints before falling back to production
--   **Schema Cleaning**: Handles Claude-specific tool schema requirements
-**Configuration:**
--   **OAuth Setup**: Uses Google OAuth similar to Gemini CLI (separate scopes)
--   **Stateless Deployment**: Full environment variable support
--   **Paid Tier Recommended**: Gemini 3 models require a paid Google Cloud project
 **Environment Variables:**
 ```env
-# Stateless deployment
-ANTIGRAVITY_ACCESS_TOKEN="..."
-ANTIGRAVITY_REFRESH_TOKEN="..."
-ANTIGRAVITY_EXPIRY_DATE="..."
-ANTIGRAVITY_EMAIL="user@gmail.com"
 # Feature toggles
-ANTIGRAVITY_ENABLE_SIGNATURE_CACHE=true  # Multi-turn conversation support
-ANTIGRAVITY_GEMINI3_TOOL_FIX=true  # Prevent tool hallucination
 ```
-    ```
-#### Credential Rotation Modes
--   **`ROTATION_MODE_<PROVIDER>`**: Controls how credentials are rotated when multiple are available. Default: `balanced` (except Antigravity which defaults to `sequential`).
-    - `balanced`: Rotate credentials evenly across requests to distribute load. Best for per-minute rate limits.
-    - `sequential`: Use one credential until exhausted (429 error), then switch to next. Best for daily/weekly quotas.
-    ```env
-    ROTATION_MODE_GEMINI=sequential    # Use Gemini keys until quota exhausted
-    ROTATION_MODE_OPENAI=balanced      # Distribute load across OpenAI keys (default)
-    ROTATION_MODE_ANTIGRAVITY=balanced # Override Antigravity's sequential default
-    ```
-#### Priority-Based Concurrency Multipliers
--   **`CONCURRENCY_MULTIPLIER_<PROVIDER>_PRIORITY_<N>`**: Assign concurrency multipliers to priority tiers. Higher-tier credentials handle more concurrent requests.
-    ```env
-    # Universal multipliers (apply to all rotation modes)
-    CONCURRENCY_MULTIPLIER_ANTIGRAVITY_PRIORITY_1=10   # 10x for paid ultra tier
-    CONCURRENCY_MULTIPLIER_ANTIGRAVITY_PRIORITY_3=1    # 1x for lower tiers
-    # Mode-specific overrides
-    CONCURRENCY_MULTIPLIER_ANTIGRAVITY_PRIORITY_2_BALANCED=1  # P2 = 1x in balanced mode only
-    ```
-    **Provider Defaults** (built into provider classes):
-    - **Antigravity**: Priority 1: 5x, Priority 2: 3x, Priority 3+: 2x (sequential) or 1x (balanced)
-    - **Gemini CLI**: Priority 1: 5x, Priority 2: 3x, Others: 1x
-#### Model Quota Groups
--   **`QUOTA_GROUPS_<PROVIDER>_<GROUP>`**: Define models that share quota/cooldown timing. When one model hits quota, all in the group receive the same cooldown timestamp.
-    ```env
-    QUOTA_GROUPS_ANTIGRAVITY_CLAUDE="claude-sonnet-4-5,claude-opus-4-5"
-    QUOTA_GROUPS_ANTIGRAVITY_GEMINI="gemini-3-pro-preview,gemini-3-pro-image-preview"
-    # To disable a default group:
-    QUOTA_GROUPS_ANTIGRAVITY_CLAUDE=""
-    ```
-    **Default Groups**:
-    - **Antigravity**: Claude group (Sonnet 4.5 + Opus 4.5) with Opus counting 2x vs Sonnet
-#### Concurrency Control
--   **`MAX_CONCURRENT_REQUESTS_PER_KEY_<PROVIDER>`**: Set the maximum number of simultaneous requests allowed per API key for a specific provider. Default is `1` (no concurrency). Useful for high-throughput providers.
-    ```env
-    MAX_CONCURRENT_REQUESTS_PER_KEY_OPENAI=3
-    MAX_CONCURRENT_REQUESTS_PER_KEY_ANTHROPIC=2
-    MAX_CONCURRENT_REQUESTS_PER_KEY_GEMINI=1
-    ```
-#### Custom Model Lists
-For providers that support custom model definitions (Qwen Code, iFlow), you can override the default model list:
--   **`QWEN_CODE_MODELS`**: JSON array of custom Qwen Code models. These models take priority over hardcoded defaults.
-    ```env
-    QWEN_CODE_MODELS='["qwen3-coder-plus", "qwen3-coder-flash", "custom-model-id"]'
-    ```
--   **`IFLOW_MODELS`**: JSON array of custom iFlow models. These models take priority over hardcoded defaults.
-    ```env
-    IFLOW_MODELS='["glm-4.6", "qwen3-coder-plus", "deepseek-v3.2"]'
-    ```
-#### Provider-Specific Settings
--   **`GEMINI_CLI_PROJECT_ID`**: Manually specify a Google Cloud Project ID for Gemini CLI OAuth. Only needed if automatic discovery fails.
-#### Antigravity Provider
--   **`ANTIGRAVITY_OAUTH_1`**: Path to Antigravity OAuth credential file (auto-discovered from `~/.antigravity/` or use the credential tool).
-    ```env
-    ANTIGRAVITY_OAUTH_1="/path/to/your/antigravity_creds.json"
-    ```
--   **Stateless Deployment** (Environment Variables):
-    ```env
-    ANTIGRAVITY_ACCESS_TOKEN="ya29.your-access-token"
-#### Credential Rotation Strategy
--   **`ROTATION_TOLERANCE`**: Controls how credentials are selected for requests. Set via environment variable or programmatically.
-    - `0.0`: **Deterministic** - Always selects the least-used credential for perfect load balance
-    - `3.0` (default, recommended): **Weighted Random** - Randomly selects with bias toward less-used credentials. Provides unpredictability (harder to fingerprint/detect) while maintaining good balance
-    - `5.0+`: **High Randomness** - Maximum unpredictability, even heavily-used credentials can be selected
-    ```env
-    # Weighted random: less predictable traffic (default, recommended for production)
-    ROTATION_TOLERANCE=3.0
-    # Deterministic: always pick the least-used credential for perfect load balancing
-    ROTATION_TOLERANCE=0.0
-    ```
-    **Why use weighted random?**
-    - Makes traffic patterns less predictable
-    - Still maintains good load distribution across keys
-    - Recommended for production environments with multiple credentials
-    ANTIGRAVITY_REFRESH_TOKEN="1//your-refresh-token"
-    ANTIGRAVITY_EXPIRY_DATE="1234567890000"
-    ANTIGRAVITY_EMAIL="your-email@gmail.com"
-    ```
--   **`ANTIGRAVITY_ENABLE_SIGNATURE_CACHE`**: Enable/disable thought signature caching for Gemini 3 multi-turn conversations. Default: `true`.
-    ```env
-    ANTIGRAVITY_ENABLE_SIGNATURE_CACHE=true
-    ```
--   **`ANTIGRAVITY_GEMINI3_TOOL_FIX`**: Enable/disable tool hallucination prevention for Gemini 3 models. Default: `true`.
-    ```env
-    ANTIGRAVITY_GEMINI3_TOOL_FIX=true
-    ```
-#### Temperature Override (Global)
--   **`OVERRIDE_TEMPERATURE_ZERO`**: Prevents tool hallucination caused by temperature=0 settings. Modes:
-    - `"remove"`: Deletes temperature=0 from requests (lets provider use default)
-    - `"set"`: Changes temperature=0 to temperature=1.0
-    - `"false"` or unset: Disabled (default)
-#### Credential Prioritization
--   **`GEMINI_CLI_PROJECT_ID`**: Manually specify a Google Cloud Project ID for Gemini CLI OAuth. Auto-discovered unless unexpected failure occurs.
-    ```env
-    GEMINI_CLI_PROJECT_ID="your-gcp-project-id"
-    ```
-    ```env
-    GEMINI_CLI_PROJECT_ID="your-gcp-project-id"
-    ```
-**Example:**
-```bash
-python src/proxy_app/main.py --host 127.0.0.1 --port 9999 --enable-request-logging
-```
-#### Windows Batch Scripts
-For convenience on Windows, you can use the provided `.bat` scripts in the root directory:
--   **`launcher.bat`** *(deprecated)*: Legacy launcher with manual menu system. Still functional but superseded by the new TUI.
-### Troubleshooting
--   **`401 Unauthorized`**: Ensure your `PROXY_API_KEY` is set correctly in the `.env` file and included in the `Authorization: Bearer <key>` header of your request.
--   **`500 Internal Server Error`**: Check the console logs of the `uvicorn` server for detailed error messages. This could indicate an issue with one of your provider API keys (e.g., it's invalid or has been revoked) or a problem with the provider's service. If you have logging enabled (`--enable-request-logging`), inspect the `final_response.json` and `metadata.json` files in the corresponding log directory under `logs/detailed_logs/` for the specific error returned by the upstream provider.
--   **All keys on cooldown**: If you see a message that all keys are on cooldown, it means all your keys for a specific provider have recently failed. If you have logging enabled (`--enable-request-logging`), check the `logs/detailed_logs/` directory to find the logs for the failed requests and inspect the `final_response.json` to see the underlying error from the provider.
----
-## Library and Technical Docs
--   **Using the Library**: For documentation on how to use the `api-key-manager` library directly in your own Python projects, please refer to its [README.md](src/rotator_library/README.md).
--   **Technical Details**: For a more in-depth technical explanation of the library's architecture, components, and internal workings, please refer to the [Technical Documentation](DOCUMENTATION.md).
-### Advanced Model Filtering (Whitelists & Blacklists)
-The proxy provides a powerful way to control which models are available to your applications using environment variables in your `.env` file.
-#### How It Works
-The filtering logic is applied in this order:
-1.  **Whitelist Check**: If a provider has a whitelist defined (`WHITELIST_MODELS_<PROVIDER>`), any model on that list will **always be available**, even if it's on the blacklist.
-2.  **Blacklist Check**: For any model *not* on the whitelist, the proxy checks the blacklist (`IGNORE_MODELS_<PROVIDER>`). If the model is on the blacklist, it will be hidden.
-3.  **Default**: If a model is on neither list, it will be available.
-This allows for two powerful patterns:
-#### Use Case 1: Pure Whitelist Mode
-You can expose *only* the specific models you want. To do this, set the blacklist to `*` to block all models by default, and then add the desired models to the whitelist.
-**Example `.env`:**
-```env
-# Block all Gemini models by default
-IGNORE_MODELS_GEMINI="*"
-# Only allow gemini-1.5-pro and gemini-1.5-flash
-WHITELIST_MODELS_GEMINI="gemini-1.5-pro-latest,gemini-1.5-flash-latest"
 ```
-#### Use Case 2: Exemption Mode
-You can block a broad category of models and then use the whitelist to make specific exceptions.
-**Example `.env`:**
-```env
-# Block all preview models from OpenAI
-IGNORE_MODELS_OPENAI="*-preview*"
-# But make an exception for a specific preview model you want to test
-WHITELIST_MODELS_OPENAI="gpt-4o-2024-08-06-preview"
 ```
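The three-step filtering order described above (whitelist, then blacklist, then default) can be sketched in a few lines of Python. This is an illustration only, not the proxy's actual code: the helper names `parse_list` and `is_model_available` are hypothetical, and matching both lists with shell-style wildcards via `fnmatch` is an assumption based on the `*` and `*-preview*` patterns in the examples.

```python
from fnmatch import fnmatchcase


def parse_list(value: str) -> list[str]:
    """Split a comma-separated env value into trimmed, non-empty patterns."""
    return [item.strip() for item in value.split(",") if item.strip()]


def is_model_available(model: str, whitelist: list[str], blacklist: list[str]) -> bool:
    # 1. Whitelist check: a whitelisted model is always available.
    if any(fnmatchcase(model, pattern) for pattern in whitelist):
        return True
    # 2. Blacklist check: hide anything that matches a blacklist pattern.
    if any(fnmatchcase(model, pattern) for pattern in blacklist):
        return False
    # 3. Default: models on neither list stay available.
    return True


# Values taken from the "Exemption Mode" example above.
whitelist = parse_list("gpt-4o-2024-08-06-preview")
blacklist = parse_list("*-preview*")

print(is_model_available("gpt-4o-2024-08-06-preview", whitelist, blacklist))  # True
print(is_model_available("gpt-4-turbo-preview", whitelist, blacklist))        # False
print(is_model_available("gpt-4o", whitelist, blacklist))                     # True
```

Checking the whitelist first is what makes "Pure Whitelist Mode" work: with the blacklist set to `*`, only whitelisted models pass step 1 and everything else is caught by step 2.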

+# Universal LLM API Proxy & Resilience Library
+[![ko-fi](https://ko-fi.com/img/githubbutton_sm.svg)](https://ko-fi.com/C0C0UZS4P)
 [![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/Mirrowel/LLM-API-Key-Proxy) [![zread](https://img.shields.io/badge/Ask_Zread-_.svg?style=flat&color=00b0aa&labelColor=000000&logo=data%3Aimage%2Fsvg%2Bxml%3Bbase64%2CPHN2ZyB3aWR0aD0iMTYiIGhlaWdodD0iMTYiIHZpZXdCb3g9IjAgMCAxNiAxNiIgZmlsbD0ibm9uZSIgeG1sbnM9Imh0dHA6Ly93d3cudzMub3JnLzIwMDAvc3ZnIj4KPHBhdGggZD0iTTQuOTYxNTYgMS42MDAxSDIuMjQxNTZDMS44ODgxIDEuNjAwMSAxLjYwMTU2IDEuODg2NjQgMS42MDE1NiAyLjI0MDFWNC45NjAxQzEuNjAxNTYgNS4zMTM1NiAxLjg4ODEgNS42MDAxIDIuMjQxNTYgNS42MDAxSDQuOTYxNTZDNS4zMTUwMiA1LjYwMDEgNS42MDE1NiA1LjMxMzU2IDUuNjAxNTYgNC45NjAxVjIuMjQwMUM1LjYwMTU2IDEuODg2NjQgNS4zMTUwMiAxLjYwMDEgNC45NjE1NiAxLjYwMDFaIiBmaWxsPSIjZmZmIi8%2BCjxwYXRoIGQ9Ik00Ljk2MTU2IDEwLjM5OTlIMi4yNDE1NkMxLjg4ODEgMTAuMzk5OSAxLjYwMTU2IDEwLjY4NjQgMS42MDE1NiAxMS4wMzk5VjEzLjc1OTlDMS42MDE1NiAxNC4xMTM0IDEuODg4MSAxNC4zOTk5IDIuMjQxNTYgMTQuMzk5OUg0Ljk2MTU2QzUuMzE1MDIgMTQuMzk5OSA1LjYwMTU2IDE0LjExMzQgNS42MDE1NiAxMy43NTk5VjExLjAzOTlDNS42MDE1NiAxMC42ODY0IDUuMzE1MDIgMTAuMzk5OSA0Ljk2MTU2IDEwLjM5OTlaIiBmaWxsPSIjZmZmIi8%2BCjxwYXRoIGQ9Ik0xMy43NTg0IDEuNjAwMUgxMS4wMzg0QzEwLjY4NSAxLjYwMDEgMTAuMzk4NCAxLjg4NjY0IDEwLjM5ODQgMi4yNDAxVjQuOTYwMUMxMC4zOTg0IDUuMzEzNTYgMTAuNjg1IDUuNjAwMSAxMS4wMzg0IDUuNjAwMUgxMy43NTg0QzE0LjExMTkgNS42MDAxIDE0LjM5ODQgNS4zMTM1NiAxNC4zOTg0IDQuOTYwMVYyLjI0MDFDMTQuMzk4NCAxLjg4NjY0IDE0LjExMTkgMS42MDAxIDEzLjc1ODQgMS42MDAxWiIgZmlsbD0iI2ZmZiIvPgo8cGF0aCBkPSJNNCAxMkwxMiA0TDQgMTJaIiBmaWxsPSIjZmZmIi8%2BCjxwYXRoIGQ9Ik00IDEyTDEyIDQiIHN0cm9rZT0iI2ZmZiIgc3Ryb2tlLXdpZHRoPSIxLjUiIHN0cm9rZS1saW5lY2FwPSJyb3VuZCIvPgo8L3N2Zz4K&logoColor=ffffff)](https://zread.ai/Mirrowel/LLM-API-Key-Proxy)
+**One proxy. Any LLM provider. Zero code changes.**
+A self-hosted proxy that provides a single, OpenAI-compatible API endpoint for all your LLM providers. Works with any application that supports custom OpenAI base URLs—no code changes required in your existing tools.
+This project consists of two components:
+1. **The API Proxy** — A FastAPI application providing a universal `/v1/chat/completions` endpoint
+2. **The Resilience Library** — A reusable Python library for intelligent API key management, rotation, and failover
+---
+## Why Use This?
+- **Universal Compatibility** — Works with any app supporting OpenAI-compatible APIs: Opencode, Continue, Roo/Kilo Code, JanitorAI, SillyTavern, custom applications, and more
+- **One Endpoint, Many Providers** — Configure Gemini, OpenAI, Anthropic, and [any LiteLLM-supported provider](https://docs.litellm.ai/docs/providers) once. Access them all through a single API key
+- **Built-in Resilience** — Automatic key rotation, failover on errors, rate limit handling, and intelligent cooldowns
+- **Exclusive Provider Support** — Includes custom providers not available elsewhere: **Antigravity** (Gemini 3 + Claude Sonnet/Opus 4.5), **Gemini CLI**, **Qwen Code**, and **iFlow**
 ---
+## Quick Start
+### Windows
+1. **Download** the latest release from [GitHub Releases](https://github.com/Mirrowel/LLM-API-Key-Proxy/releases/latest)
+2. **Unzip** the downloaded file
+3. **Run** `proxy_app.exe` — the interactive TUI launcher opens
+<!-- TODO: Add TUI main menu screenshot here -->
 ### macOS / Linux
 ```bash
+# Download and extract the release for your platform
+chmod +x proxy_app
+./proxy_app
 ```
+### From Source
 ```bash
+git clone https://github.com/Mirrowel/LLM-API-Key-Proxy.git
+cd LLM-API-Key-Proxy
+python3 -m venv venv
+source venv/bin/activate  # Windows: venv\Scripts\activate
+pip install -r requirements.txt
 python src/proxy_app/main.py
 ```
+> **Tip:** Running with command-line arguments (e.g., `--host 0.0.0.0 --port 8000`) bypasses the TUI and starts the proxy directly.
 ---
+## Connecting to the Proxy
+Once the proxy is running, configure your application with these settings:
+| Setting | Value |
+|---------|-------|
+| **Base URL / API Endpoint** | `http://127.0.0.1:8000/v1` |
+| **API Key** | Your `PROXY_API_KEY` |
+### Model Format: `provider/model_name`
+**Important:** Models must be specified in the format `provider/model_name`. The `provider/` prefix tells the proxy which backend to route the request to.
+```
+gemini/gemini-2.5-flash          ← Gemini API
+openai/gpt-4o                    ← OpenAI API
+anthropic/claude-3-5-sonnet      ← Anthropic API
+openrouter/anthropic/claude-3-opus  ← OpenRouter
+gemini_cli/gemini-2.5-pro        ← Gemini CLI (OAuth)
+antigravity/gemini-3-pro-preview ← Antigravity (Gemini 3, Claude Opus 4.5)
+```
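As a rough illustration of the format, the provider prefix can be separated from the model name by splitting on the first `/` only, which keeps nested names such as the OpenRouter example intact. The function `split_model_id` is a hypothetical helper for client-side validation; how the proxy itself parses the string is not shown here and may differ.

```python
def split_model_id(model_id: str) -> tuple[str, str]:
    """Split 'provider/model_name' on the first slash only."""
    provider, separator, model_name = model_id.partition("/")
    if not separator or not provider or not model_name:
        raise ValueError(f"expected 'provider/model_name', got {model_id!r}")
    return provider, model_name


print(split_model_id("gemini/gemini-2.5-flash"))
# ('gemini', 'gemini-2.5-flash')
print(split_model_id("openrouter/anthropic/claude-3-opus"))
# ('openrouter', 'anthropic/claude-3-opus')
```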
+### Usage Examples
+<details>
+<summary><b>Python (OpenAI Library)</b></summary>
+```python
+from openai import OpenAI
+client = OpenAI(
+    base_url="http://127.0.0.1:8000/v1",
+    api_key="your-proxy-api-key"
+)
+response = client.chat.completions.create(
+    model="gemini/gemini-2.5-flash",  # provider/model format
+    messages=[{"role": "user", "content": "Hello!"}]
+)
+print(response.choices[0].message.content)
 ```
+</details>
+<details>
+<summary><b>curl</b></summary>
 ```bash
+curl -X POST http://127.0.0.1:8000/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -H "Authorization: Bearer your-proxy-api-key" \
+  -d '{
+    "model": "gemini/gemini-2.5-flash",
+    "messages": [{"role": "user", "content": "What is the capital of France?"}]
+  }'
 ```
+</details>
+<details>
+<summary><b>JanitorAI / SillyTavern / Other Chat UIs</b></summary>
+1. Go to **API Settings**
+2. Select **"Proxy"** or **"Custom OpenAI"** mode
+3. Configure:
+   - **API URL:** `http://127.0.0.1:8000/v1`
+   - **API Key:** Your `PROXY_API_KEY`
+   - **Model:** `provider/model_name` (e.g., `gemini/gemini-2.5-flash`)
+4. Save and start chatting
+</details>
+<details>
+<summary><b>Continue / Cursor / IDE Extensions</b></summary>
+In your configuration file (e.g., `config.json`):
+```json
+{
+  "models": [{
+    "title": "Gemini via Proxy",
+    "provider": "openai",
+    "model": "gemini/gemini-2.5-flash",
+    "apiBase": "http://127.0.0.1:8000/v1",
+    "apiKey": "your-proxy-api-key"
+  }]
+}
+```
+</details>
+### API Endpoints
+| Endpoint | Description |
+|----------|-------------|
+| `GET /` | Status check — confirms proxy is running |
+| `POST /v1/chat/completions` | Chat completions (main endpoint) |
+| `POST /v1/embeddings` | Text embeddings |
+| `GET /v1/models` | List all available models with pricing & capabilities |
+| `GET /v1/models/{model_id}` | Get details for a specific model |
+| `GET /v1/providers` | List configured providers |
+| `POST /v1/token-count` | Calculate token count for a payload |
+| `POST /v1/cost-estimate` | Estimate cost based on token counts |
+> **Tip:** The `/v1/models` endpoint is useful for discovering available models in your client. Many apps can fetch this list automatically. Add `?enriched=false` for a minimal response without pricing data.
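For clients that do not fetch the model list on their own, a minimal standard-library sketch might look like the following. The helper names are hypothetical, and two details are assumptions rather than documented behavior: that `/v1/models` expects the same `Authorization: Bearer` header as the chat endpoint, and that the response follows the OpenAI-style shape `{"data": [{"id": ...}]}`.

```python
import json
import urllib.request


def build_models_request(base_url: str, api_key: str, enriched: bool = True) -> urllib.request.Request:
    """Build a GET request for the model list; enriched=False asks for the minimal response."""
    url = base_url.rstrip("/") + "/models"
    if not enriched:
        url += "?enriched=false"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def list_model_ids(base_url: str, api_key: str) -> list[str]:
    """Fetch the minimal model list and return the model IDs (assumed OpenAI-style payload)."""
    request = build_models_request(base_url, api_key, enriched=False)
    with urllib.request.urlopen(request, timeout=10) as response:
        payload = json.load(response)
    return [entry["id"] for entry in payload.get("data", [])]


# With the proxy running locally:
# print(list_model_ids("http://127.0.0.1:8000/v1", "your-proxy-api-key"))
```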
+---
+## Managing Credentials
+The proxy includes an interactive tool for managing all your API keys and OAuth credentials.
+### Using the TUI
+<!-- TODO: Add TUI credentials menu screenshot here -->
+1. Run the proxy without arguments to open the TUI
+2. Select **"🔑 Manage Credentials"**
+3. Choose to add API keys or OAuth credentials
+### Using the Command Line
 ```bash
+python -m rotator_library.credential_tool
 ```
+### Credential Types
+| Type | Providers | How to Add |
+|------|-----------|------------|
+| **API Keys** | Gemini, OpenAI, Anthropic, OpenRouter, Groq, Mistral, NVIDIA, Cohere, Chutes | Enter key in TUI or add to `.env` |
+| **OAuth** | Gemini CLI, Antigravity, Qwen Code, iFlow | Interactive browser login via credential tool |
+### The `.env` File
+Credentials are stored in a `.env` file. You can edit it directly or use the TUI:
+```env
+# Required: Authentication key for YOUR proxy
+PROXY_API_KEY="your-secret-proxy-key"
+# Provider API Keys (add multiple with _1, _2, etc.)
+GEMINI_API_KEY_1="your-gemini-key"
+GEMINI_API_KEY_2="another-gemini-key"
+OPENAI_API_KEY_1="your-openai-key"
+ANTHROPIC_API_KEY_1="your-anthropic-key"
 ```
+> Copy `.env.example` to `.env` as a starting point.
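The numbered-key convention can be pictured with a small sketch. This is not the proxy's actual discovery code; `collect_api_keys` is a hypothetical helper that groups `<PROVIDER>_API_KEY_<N>` entries from any mapping of environment variables.

```python
import re
from collections import defaultdict
from typing import Dict, List, Mapping

# Matches e.g. GEMINI_API_KEY_1; PROXY_API_KEY has no numeric suffix and is skipped
KEY_PATTERN = re.compile(r"^([A-Z0-9_]+?)_API_KEY_(\d+)$")


def collect_api_keys(env: Mapping[str, str]) -> Dict[str, List[str]]:
    """Group <PROVIDER>_API_KEY_<N> values by lowercase provider, ordered by N."""
    found = defaultdict(list)
    for name, value in env.items():
        match = KEY_PATTERN.match(name)
        if match and value:
            found[match.group(1).lower()].append((int(match.group(2)), value))
    return {provider: [v for _, v in sorted(pairs)] for provider, pairs in found.items()}
```

Calling `collect_api_keys(os.environ)` after loading `.env` would yield a dictionary keyed by provider name, one list of keys per provider.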
+---
+## The Resilience Library
+The proxy is powered by a standalone Python library that you can use directly in your own applications.
+### Key Features
+- **Async-native** with `asyncio` and `httpx`
+- **Intelligent key selection** with tiered, model-aware locking
+- **Deadline-driven requests** with configurable global timeout
+- **Automatic failover** between keys on errors
+- **OAuth support** for Gemini CLI, Antigravity, Qwen, iFlow
+- **Stateless deployment ready** — load credentials from environment variables
+### Basic Usage
+```python
+import asyncio
+from rotator_library import RotatingClient
+async def main():
+    client = RotatingClient(
+        api_keys={"gemini": ["key1", "key2"], "openai": ["key3"]},
+        global_timeout=30,
+        max_retries=2
+    )
+    # `async with` and `await` are only valid inside a coroutine
+    async with client:
+        response = await client.acompletion(
+            model="gemini/gemini-2.5-flash",
+            messages=[{"role": "user", "content": "Hello!"}]
+        )
+asyncio.run(main())
 ```
+### Library Documentation
+See the [Library README](src/rotator_library/README.md) for complete documentation including:
+- All initialization parameters
+- Streaming support
+- Error handling and cooldown strategies
+- Provider plugin system
+- Credential prioritization
+---
+## Interactive TUI
+The proxy includes a powerful text-based UI for configuration and management.
+<!-- TODO: Add TUI main menu screenshot here -->
+### TUI Features
+- **🚀 Run Proxy** — Start the server with saved settings
+- **⚙️ Configure Settings** — Host, port, API key, request logging
+- **🔑 Manage Credentials** — Add/edit API keys and OAuth credentials
+- **📊 View Status** — See configured providers and credential counts
+- **🔧 Advanced Settings** — Custom providers, model definitions, concurrency
+### Configuration Files
+| File | Contents |
+|------|----------|
+| `.env` | All credentials and advanced settings |
+| `launcher_config.json` | TUI-specific settings (host, port, logging) |
+---
+## Features
+### Core Capabilities
+- **Universal OpenAI-compatible endpoint** for all providers
+- **Multi-provider support** via [LiteLLM](https://docs.litellm.ai/docs/providers) fallback
+- **Automatic key rotation** and load balancing
+- **Interactive TUI** for easy configuration
+- **Detailed request logging** for debugging
+<details>
+<summary><b>🛡️ Resilience & High Availability</b></summary>
+- **Global timeout** with deadline-driven retries
+- **Escalating cooldowns** per model (10s → 30s → 60s → 120s)
+- **Key-level lockouts** for consistently failing keys
+- **Stream error detection** and graceful recovery
+- **Batch embedding aggregation** for improved throughput
+- **Automatic daily resets** for cooldowns and usage stats
+</details>
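The escalating cooldown schedule listed above can be sketched as a lookup on the number of consecutive failures. This is an illustration only: capping at the last step is an assumption, `cooldown_seconds` is a hypothetical name, and the daily reset is not modelled.

```python
COOLDOWN_STEPS = (10, 30, 60, 120)  # seconds, taken from the schedule above


def cooldown_seconds(consecutive_failures: int) -> int:
    """Cooldown after the Nth consecutive failure (1-based).

    Assumption: failures beyond the last step keep the longest cooldown.
    """
    if consecutive_failures < 1:
        return 0
    index = min(consecutive_failures, len(COOLDOWN_STEPS)) - 1
    return COOLDOWN_STEPS[index]
```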
+<details>
+<summary><b>🔑 Credential Management</b></summary>
+- **Auto-discovery** of API keys from environment variables
+- **OAuth discovery** from standard paths (`~/.gemini/`, `~/.qwen/`, `~/.iflow/`)
+- **Duplicate detection** warns when the same account is added multiple times
+- **Credential prioritization** — paid tier used before free tier
+- **Stateless deployment** — export OAuth to environment variables
+- **Local-first storage** — credentials isolated in `oauth_creds/` directory
+</details>
+<details>
+<summary><b>⚙️ Advanced Configuration</b></summary>
+- **Model whitelists/blacklists** with wildcard support
+- **Per-provider concurrency limits** (`MAX_CONCURRENT_REQUESTS_PER_KEY_<PROVIDER>`)
+- **Rotation modes** — balanced (distribute load) or sequential (use until exhausted)
+- **Priority multipliers** — higher concurrency for paid credentials
+- **Model quota groups** — shared cooldowns for related models
+- **Temperature override** — prevent tool hallucination issues
+- **Weighted random rotation** — unpredictable selection patterns
+</details>
+<details>
+<summary><b>🔌 Provider-Specific Features</b></summary>
+**Gemini CLI:**
+- Zero-config Google Cloud project discovery
+- Internal API access with higher rate limits
+- Automatic fallback to preview models on rate limit
+- Paid vs free tier detection
+**Antigravity:**
+- Gemini 3 Pro with `thinkingLevel` support
+- Claude Opus 4.5 (thinking mode)
+- Claude Sonnet 4.5 (thinking and non-thinking)
+- Thought signature caching for multi-turn conversations
+- Tool hallucination prevention
+**Qwen Code:**
+- Dual auth (API key + OAuth Device Flow)
+- `<think>` tag parsing as `reasoning_content`
+- Tool schema cleaning
+**iFlow:**
+- Dual auth (API key + OAuth Authorization Code)
+- Hybrid auth with separate API key fetch
+- Tool schema cleaning
+**NVIDIA NIM:**
+- Dynamic model discovery
+- DeepSeek thinking support
+</details>
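To make the `<think>` tag parsing mentioned for Qwen Code concrete, here is a minimal sketch of separating reasoning from the visible answer in a complete (non-streamed) message. It is not the provider's implementation; `split_reasoning` is a hypothetical helper, and handling tags split across streaming chunks is out of scope.

```python
import re
from typing import Optional, Tuple

THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> Tuple[str, Optional[str]]:
    """Return (content without <think> blocks, joined reasoning or None)."""
    reasoning_parts = [part.strip() for part in THINK_BLOCK.findall(text)]
    content = THINK_BLOCK.sub("", text).strip()
    reasoning = "\n".join(part for part in reasoning_parts if part) or None
    return content, reasoning
```

The second element is what would be surfaced as `reasoning_content`, while the first remains the normal message content.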
+<details>
+<summary><b>📝 Logging & Debugging</b></summary>
+- **Per-request file logging** with `--enable-request-logging`
+- **Unique request directories** with full transaction details
+- **Streaming chunk capture** for debugging
+- **Performance metadata** (duration, tokens, model used)
+- **Provider-specific logs** for Qwen, iFlow, Antigravity
+</details>
 ---
+## Advanced Configuration
+<details>
+<summary><b>Environment Variables Reference</b></summary>
+### Proxy Settings
+| Variable | Description | Default |
+|----------|-------------|---------|
+| `PROXY_API_KEY` | Authentication key for your proxy | Required |
+| `OAUTH_REFRESH_INTERVAL` | Token refresh check interval (seconds) | `600` |
+| `SKIP_OAUTH_INIT_CHECK` | Skip interactive OAuth setup on startup | `false` |
+### Per-Provider Settings
+| Pattern | Description | Example |
+|---------|-------------|---------|
+| `<PROVIDER>_API_KEY_<N>` | API key for provider | `GEMINI_API_KEY_1` |
+| `MAX_CONCURRENT_REQUESTS_PER_KEY_<PROVIDER>` | Concurrent request limit | `MAX_CONCURRENT_REQUESTS_PER_KEY_OPENAI=3` |
+| `ROTATION_MODE_<PROVIDER>` | `balanced` or `sequential` | `ROTATION_MODE_GEMINI=sequential` |
+| `IGNORE_MODELS_<PROVIDER>` | Blacklist (comma-separated, supports `*`) | `IGNORE_MODELS_OPENAI=*-preview*` |
+| `WHITELIST_MODELS_<PROVIDER>` | Whitelist (overrides blacklist) | `WHITELIST_MODELS_GEMINI=gemini-2.5-pro` |
+### Advanced Features
+| Variable | Description |
+|----------|-------------|
+| `ROTATION_TOLERANCE` | `0.0`=deterministic, `3.0`=weighted random (default) |
+| `CONCURRENCY_MULTIPLIER_<PROVIDER>_PRIORITY_<N>` | Concurrency multiplier per priority tier |
+| `QUOTA_GROUPS_<PROVIDER>_<GROUP>` | Models sharing quota limits |
+| `OVERRIDE_TEMPERATURE_ZERO` | `remove` or `set` to prevent tool hallucination |
+</details>
+<details>
+<summary><b>Model Filtering (Whitelists & Blacklists)</b></summary>
+Control which models are exposed through your proxy.
+### Blacklist Only
+```env
+# Hide all preview models
+IGNORE_MODELS_OPENAI="*-preview*"
 ```
+### Pure Whitelist Mode
+```env
+# Block all, then allow specific models
+IGNORE_MODELS_GEMINI="*"
+WHITELIST_MODELS_GEMINI="gemini-2.5-pro,gemini-2.5-flash"
+```
+### Exemption Mode
+```env
+# Block preview models, but allow one specific preview
+IGNORE_MODELS_OPENAI="*-preview*"
+WHITELIST_MODELS_OPENAI="gpt-4o-2024-08-06-preview"
+```
+**Logic order:** Whitelist check → Blacklist check → Default allow
+</details>
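The stated logic order (whitelist, then blacklist, then default allow) can be sketched with the standard library's `fnmatch`. This is an illustration under assumptions: `is_model_allowed` and `parse_patterns` are hypothetical names, and `fnmatch` accepts more wildcard syntax than the `*` the proxy documents.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def parse_patterns(raw: str) -> List[str]:
    """Split a comma-separated env value into clean patterns."""
    return [p.strip() for p in raw.split(",") if p.strip()]


def is_model_allowed(
    model: str, whitelist: Iterable[str] = (), blacklist: Iterable[str] = ()
) -> bool:
    """Whitelist check, then blacklist check, then default allow."""
    if any(fnmatchcase(model, pattern) for pattern in whitelist):
        return True  # a whitelist hit overrides any blacklist entry
    if any(fnmatchcase(model, pattern) for pattern in blacklist):
        return False
    return True
```

With `blacklist=["*"]` and a non-empty whitelist this reproduces the pure whitelist mode; with a narrower blacklist it reproduces the exemption mode.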
+<details>
+<summary><b>Concurrency & Rotation Settings</b></summary>
+### Concurrency Limits
+```env
+# Allow 3 concurrent requests per OpenAI key
+MAX_CONCURRENT_REQUESTS_PER_KEY_OPENAI=3
+# Default is 1 (no concurrency)
+MAX_CONCURRENT_REQUESTS_PER_KEY_GEMINI=1
 ```
+### Rotation Modes
+```env
+# balanced (default): Distribute load evenly - best for per-minute rate limits
+ROTATION_MODE_OPENAI=balanced
+# sequential: Use until exhausted - best for daily/weekly quotas
+ROTATION_MODE_GEMINI=sequential
+```
+### Priority Multipliers
+Paid credentials can handle more concurrent requests:
+```env
+# Priority 1 (paid ultra): 10x concurrency
+CONCURRENCY_MULTIPLIER_ANTIGRAVITY_PRIORITY_1=10
+# Priority 2 (standard paid): 3x
+CONCURRENCY_MULTIPLIER_ANTIGRAVITY_PRIORITY_2=3
+```
+### Model Quota Groups
+Models sharing quota limits:
+```env
+# Claude models share quota - when one hits limit, both cool down
+QUOTA_GROUPS_ANTIGRAVITY_CLAUDE="claude-sonnet-4-5,claude-opus-4-5"
+```
+</details>
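The difference between the two rotation modes can be sketched as follows. This is a simplified model, not the library's selection logic: `pick_key` is a hypothetical function, and it ignores `ROTATION_TOLERANCE`, priority tiers, and concurrency limits.

```python
from typing import Dict, Optional, Sequence, Set


def pick_key(
    keys: Sequence[str],
    usage: Dict[str, int],
    exhausted: Set[str],
    mode: str = "balanced",
) -> Optional[str]:
    """Choose the next key, or None when every key is exhausted."""
    available = [k for k in keys if k not in exhausted]
    if not available:
        return None
    if mode == "sequential":
        # Stick with the first key in order until it is marked exhausted
        return available[0]
    if mode == "balanced":
        # Spread load: take the least-used key (first one wins on ties)
        return min(available, key=lambda k: usage.get(k, 0))
    raise ValueError(f"unknown rotation mode: {mode!r}")
```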
+<details>
+<summary><b>Timeout Configuration</b></summary>
+Fine-grained control over HTTP timeouts:
+```env
+TIMEOUT_CONNECT=30              # Connection establishment
+TIMEOUT_WRITE=30                # Request body send
+TIMEOUT_POOL=60                 # Connection pool acquisition
+TIMEOUT_READ_STREAMING=180      # Between streaming chunks (3 min)
+TIMEOUT_READ_NON_STREAMING=600  # Full response wait (10 min)
+```
+**Recommendations:**
+- Long thinking tasks: Increase `TIMEOUT_READ_STREAMING` to 300-360s
+- Unstable network: Increase `TIMEOUT_CONNECT` to 60s
+- Large outputs: Increase `TIMEOUT_READ_NON_STREAMING` to 900s+
+</details>
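A sketch of how these variables might be read and turned into per-request timeout settings. It assumes the values shown above are also the defaults, which the text does not state; `load_timeouts` and `timeout_kwargs` are hypothetical names. The keys of the returned dictionary mirror the `connect`, `read`, `write`, and `pool` arguments of `httpx.Timeout`.

```python
import os
from typing import Dict, Mapping

# Assumption: the example values are used as fallbacks
TIMEOUT_DEFAULTS = {
    "TIMEOUT_CONNECT": 30.0,
    "TIMEOUT_WRITE": 30.0,
    "TIMEOUT_POOL": 60.0,
    "TIMEOUT_READ_STREAMING": 180.0,
    "TIMEOUT_READ_NON_STREAMING": 600.0,
}


def load_timeouts(env: Mapping[str, str] = os.environ) -> Dict[str, float]:
    """Read timeout variables, falling back on invalid or non-positive values."""
    result = {}
    for name, default in TIMEOUT_DEFAULTS.items():
        raw = env.get(name, "").strip()
        try:
            value = float(raw) if raw else default
        except ValueError:
            value = default
        result[name] = value if value > 0 else default
    return result


def timeout_kwargs(timeouts: Mapping[str, float], streaming: bool) -> Dict[str, float]:
    """Pick the read timeout that matches the request type."""
    read_key = "TIMEOUT_READ_STREAMING" if streaming else "TIMEOUT_READ_NON_STREAMING"
    return {
        "connect": timeouts["TIMEOUT_CONNECT"],
        "read": timeouts[read_key],
        "write": timeouts["TIMEOUT_WRITE"],
        "pool": timeouts["TIMEOUT_POOL"],
    }
```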
+---
+## OAuth Providers
+<details>
+<summary><b>Gemini CLI</b></summary>
+Uses Google OAuth to access internal Gemini endpoints with higher rate limits.
+**Setup:**
+1. Run `python -m rotator_library.credential_tool`
+2. Select "Add OAuth Credential" → "Gemini CLI"
+3. Complete browser authentication
+4. Credentials saved to `oauth_creds/gemini_cli_oauth_1.json`
+**Features:**
+- Zero-config project discovery
+- Automatic free-tier project onboarding
+- Paid vs free tier detection
+- Smart fallback on rate limits
+**Environment Variables (for stateless deployment):**
+```env
+GEMINI_CLI_ACCESS_TOKEN="ya29.your-access-token"
+GEMINI_CLI_REFRESH_TOKEN="1//your-refresh-token"
+GEMINI_CLI_EXPIRY_DATE="1234567890000"
+GEMINI_CLI_EMAIL="your-email@gmail.com"
+GEMINI_CLI_PROJECT_ID="your-gcp-project-id"  # Optional
+```
+</details>
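The expiry value can be used to decide when a token is due for refresh. The sketch below assumes `GEMINI_CLI_EXPIRY_DATE` is a Unix timestamp in milliseconds (the example value suggests this, but it is not stated) and applies the five-minute proactive refresh window described in the technical documentation; `needs_refresh` is a hypothetical helper.

```python
import time
from typing import Optional

REFRESH_BUFFER_SECONDS = 5 * 60  # refresh this long before expiry


def needs_refresh(expiry_date_ms: str, now: Optional[float] = None) -> bool:
    """True when the token expires within the buffer (or already has).

    Assumption: expiry_date_ms is epoch time in milliseconds.
    """
    now = time.time() if now is None else now
    expiry_seconds = int(expiry_date_ms) / 1000
    return expiry_seconds - now <= REFRESH_BUFFER_SECONDS
```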
+<details>
+<summary><b>Antigravity (Gemini 3 + Claude Opus 4.5)</b></summary>
+Access Google's internal Antigravity API for cutting-edge models.
 **Supported Models:**
+- **Gemini 3 Pro** — with `thinkingLevel` support (low/high)
+- **Claude Opus 4.5** — Anthropic's most powerful model (thinking mode only)
+- **Claude Sonnet 4.5** — supports both thinking and non-thinking modes
+- Gemini 2.5 Pro/Flash
+**Setup:**
+1. Run `python -m rotator_library.credential_tool`
+2. Select "Add OAuth Credential" → "Antigravity"
+3. Complete browser authentication
+**Advanced Features:**
+- Thought signature caching for multi-turn conversations
+- Tool hallucination prevention via parameter signature injection
+- Automatic thinking block sanitization for Claude
+- Credential prioritization (paid resets every 5 hours, free weekly)
 **Environment Variables:**
 ```env
+ANTIGRAVITY_ACCESS_TOKEN="ya29.your-access-token"
+ANTIGRAVITY_REFRESH_TOKEN="1//your-refresh-token"
+ANTIGRAVITY_EXPIRY_DATE="1234567890000"
+ANTIGRAVITY_EMAIL="your-email@gmail.com"
 # Feature toggles
+ANTIGRAVITY_ENABLE_SIGNATURE_CACHE=true
+ANTIGRAVITY_GEMINI3_TOOL_FIX=true
 ```
+> **Note:** Gemini 3 models require a paid-tier Google Cloud project.
+</details>
+<details>
+<summary><b>Qwen Code</b></summary>
+Uses OAuth Device Flow for Qwen/Dashscope APIs.
+**Setup:**
+1. Run the credential tool
+2. Select "Add OAuth Credential" → "Qwen Code"
+3. Enter the displayed code in your browser
+4. Or add API key directly: `QWEN_CODE_API_KEY_1="your-key"`
+**Features:**
+- Dual auth (API key or OAuth)
+- `<think>` tag parsing as `reasoning_content`
+- Automatic tool schema cleaning
+- Custom models via `QWEN_CODE_MODELS` env var
+</details>
+<details>
+<summary><b>iFlow</b></summary>
+Uses OAuth Authorization Code flow with local callback server.
+**Setup:**
+1. Run the credential tool
+2. Select "Add OAuth Credential" → "iFlow"
+3. Complete browser authentication (callback on port 11451)
+4. Or add API key directly: `IFLOW_API_KEY_1="sk-your-key"`
+**Features:**
+- Dual auth (API key or OAuth)
+- Hybrid auth (OAuth token fetches separate API key)
+- Automatic tool schema cleaning
+- Custom models via `IFLOW_MODELS` env var
+</details>
+<details>
+<summary><b>Stateless Deployment (Export to Environment Variables)</b></summary>
+For platforms without file persistence (Railway, Render, Vercel):
+1. **Set up credentials locally:**
+   ```bash
+   python -m rotator_library.credential_tool
+   # Complete OAuth flows
+   ```
+2. **Export to environment variables:**
+   ```bash
+   python -m rotator_library.credential_tool
+   # Select "Export [Provider] to .env"
+   ```
+3. **Copy generated variables to your platform:**
+   The tool creates files like `gemini_cli_credential_1.env` containing all necessary variables.
+4. **Set `SKIP_OAUTH_INIT_CHECK=true`** to skip interactive validation on startup.
+</details>
+<details>
+<summary><b>OAuth Callback Port Configuration</b></summary>
+Customize OAuth callback ports if defaults conflict:
+| Provider | Default Port | Environment Variable |
+|----------|-------------|---------------------|
+| Gemini CLI | 8085 | `GEMINI_CLI_OAUTH_PORT` |
+| Antigravity | 51121 | `ANTIGRAVITY_OAUTH_PORT` |
+| iFlow | 11451 | `IFLOW_OAUTH_PORT` |
+</details>
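Resolving a callback port from these variables might look like the sketch below. The defaults come from the table above; `callback_port` is a hypothetical helper, and falling back to the default on an invalid value is an assumption about reasonable behaviour, not documented behaviour.

```python
import os
from typing import Mapping

DEFAULT_PORTS = {
    "GEMINI_CLI_OAUTH_PORT": 8085,
    "ANTIGRAVITY_OAUTH_PORT": 51121,
    "IFLOW_OAUTH_PORT": 11451,
}


def callback_port(variable: str, env: Mapping[str, str] = os.environ) -> int:
    """Return the configured port, or the default if unset or invalid."""
    default = DEFAULT_PORTS[variable]
    raw = env.get(variable, "").strip()
    if raw.isdigit() and 1 <= int(raw) <= 65535:
        return int(raw)
    return default
```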
+---
+## Deployment
+<details>
+<summary><b>Command-Line Arguments</b></summary>
+```bash
+python src/proxy_app/main.py [OPTIONS]
+Options:
+  --host TEXT                Host to bind (default: 0.0.0.0)
+  --port INTEGER             Port to run on (default: 8000)
+  --enable-request-logging   Enable detailed per-request logging
+  --add-credential           Launch interactive credential setup tool
 ```
+**Examples:**
+```bash
+# Run on custom port
+python src/proxy_app/main.py --host 127.0.0.1 --port 9000
+# Run with logging
+python src/proxy_app/main.py --enable-request-logging
+# Add credentials without starting proxy
+python src/proxy_app/main.py --add-credential
+```
+</details>
+<details>
+<summary><b>Render / Railway / Vercel</b></summary>
+See the [Deployment Guide](Deployment%20guide.md) for complete instructions.
+**Quick Setup:**
+1. Fork the repository
+2. Create a `.env` file with your credentials
+3. Create a new Web Service pointing to your repo
+4. Set build command: `pip install -r requirements.txt`
+5. Set start command: `uvicorn src.proxy_app.main:app --host 0.0.0.0 --port $PORT`
+6. Upload `.env` as a secret file
+**OAuth Credentials:**
+Export OAuth credentials to environment variables using the credential tool, then add them to your platform's environment settings.
+</details>
+<details>
+<summary><b>Custom VPS / Docker</b></summary>
+**Option 1: Authenticate locally, deploy credentials**
+1. Complete OAuth flows on your local machine
+2. Export to environment variables
+3. Deploy `.env` to your server
+**Option 2: SSH Port Forwarding**
+```bash
+# Forward callback ports through SSH
+ssh -L 51121:localhost:51121 -L 8085:localhost:8085 user@your-vps
+# Then run credential tool on the VPS
+```
+**Systemd Service:**
+```ini
+[Unit]
+Description=LLM API Key Proxy
+After=network.target
+[Service]
+Type=simple
+WorkingDirectory=/path/to/LLM-API-Key-Proxy
+ExecStart=/path/to/python -m uvicorn src.proxy_app.main:app --host 0.0.0.0 --port 8000
+Restart=always
+[Install]
+WantedBy=multi-user.target
 ```
+See [VPS Deployment](Deployment%20guide.md#appendix-deploying-to-a-custom-vps) for the complete guide.
+</details>
+---
+## Troubleshooting
+| Issue | Solution |
+|-------|----------|
+| `401 Unauthorized` | Verify `PROXY_API_KEY` matches your `Authorization: Bearer` header exactly |
+| `500 Internal Server Error` | Check provider key validity; run with `--enable-request-logging` for details |
+| All keys on cooldown | All keys failed recently; check `logs/detailed_logs/` for upstream errors |
+| Model not found | Verify format is `provider/model_name` (e.g., `gemini/gemini-2.5-flash`) |
+| OAuth callback failed | Ensure callback port (8085, 51121, 11451) isn't blocked by firewall |
+| Streaming hangs | Increase `TIMEOUT_READ_STREAMING`; check provider status |
+**Detailed Logs:**
+When `--enable-request-logging` is enabled, check `logs/detailed_logs/` for:
+- `request.json` — Exact request payload
+- `final_response.json` — Complete response or error
+- `streaming_chunks.jsonl` — All SSE chunks received
+- `metadata.json` — Performance metrics
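When debugging, it can help to pull up the metadata of the most recent request. The sketch below relies on the per-request directory naming used by `DetailedLogger` in this commit (`<YYYYmmdd_HHMMSS>_<request_id>`), which sorts chronologically by name; `latest_request_metadata` is a hypothetical helper, not part of the proxy.

```python
import json
from pathlib import Path
from typing import Optional


def latest_request_metadata(detailed_logs_dir: Path) -> Optional[dict]:
    """Load metadata.json from the newest request directory, if any."""
    if not detailed_logs_dir.is_dir():
        return None
    # Timestamp-prefixed names mean lexicographic order is chronological
    request_dirs = sorted(p for p in detailed_logs_dir.iterdir() if p.is_dir())
    for request_dir in reversed(request_dirs):
        metadata_file = request_dir / "metadata.json"
        if metadata_file.is_file():
            return json.loads(metadata_file.read_text(encoding="utf-8"))
    return None
```

For example, `latest_request_metadata(Path("logs/detailed_logs"))` returns the parsed metadata dictionary, or `None` if nothing has been logged yet.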
+---
+## Documentation
+| Document | Description |
+|----------|-------------|
+| [Technical Documentation](DOCUMENTATION.md) | Architecture, internals, provider implementations |
+| [Library README](src/rotator_library/README.md) | Using the resilience library directly |
+| [Deployment Guide](Deployment%20guide.md) | Hosting on Render, Railway, VPS |
+| [.env.example](.env.example) | Complete environment variable reference |
+---
+## License
+This project is dual-licensed:
+- **Proxy Application** (`src/proxy_app/`) — [MIT License](src/proxy_app/LICENSE)
+- **Resilience Library** (`src/rotator_library/`) — [LGPL-3.0](src/rotator_library/COPYING.LESSER)

src/proxy_app/detailed_logger.py CHANGED Viewed

@@ -3,16 +3,33 @@ import time
 import uuid
 from datetime import datetime
 from pathlib import Path
-from typing import Any, Dict, Optional, List
 import logging
-LOGS_DIR = Path(__file__).resolve().parent.parent.parent / "logs"
-DETAILED_LOGS_DIR = LOGS_DIR / "detailed_logs"
 class DetailedLogger:
     """
     Logs comprehensive details of each API transaction to a unique, timestamped directory.
     """
     def __init__(self):
         """
         Initializes the logger for a single request, creating a unique directory to store all related log files.
@@ -20,17 +37,26 @@ class DetailedLogger:
         self.start_time = time.time()
         self.request_id = str(uuid.uuid4())
         timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
-        self.log_dir = DETAILED_LOGS_DIR / f"{timestamp}_{self.request_id}"
-        self.log_dir.mkdir(parents=True, exist_ok=True)
         self.streaming = False
     def _write_json(self, filename: str, data: Dict[str, Any]):
         """Helper to write data to a JSON file in the log directory."""
-        try:
-            with open(self.log_dir / filename, "w", encoding="utf-8") as f:
-                json.dump(data, f, indent=4, ensure_ascii=False)
-        except Exception as e:
-            logging.error(f"[{self.request_id}] Failed to write to {filename}: {e}")
     def log_request(self, headers: Dict[str, Any], body: Dict[str, Any]):
         """Logs the initial request details."""
@@ -39,23 +65,22 @@ class DetailedLogger:
             "request_id": self.request_id,
             "timestamp_utc": datetime.utcnow().isoformat(),
             "headers": dict(headers),
-            "body": body
         }
         self._write_json("request.json", request_data)
     def log_stream_chunk(self, chunk: Dict[str, Any]):
         """Logs an individual chunk from a streaming response to a JSON Lines file."""
-        try:
-            log_entry = {
-                "timestamp_utc": datetime.utcnow().isoformat(),
-                "chunk": chunk
-            }
-            with open(self.log_dir / "streaming_chunks.jsonl", "a", encoding="utf-8") as f:
-                f.write(json.dumps(log_entry, ensure_ascii=False) + "\n")
-        except Exception as e:
-            logging.error(f"[{self.request_id}] Failed to write stream chunk: {e}")
-    def log_final_response(self, status_code: int, headers: Optional[Dict[str, Any]], body: Dict[str, Any]):
         """Logs the complete final response, either from a non-streaming call or after reassembling a stream."""
         end_time = time.time()
         duration_ms = (end_time - self.start_time) * 1000
@@ -66,7 +91,7 @@ class DetailedLogger:
             "status_code": status_code,
             "duration_ms": round(duration_ms),
             "headers": dict(headers) if headers else None,
-            "body": body
         }
         self._write_json("final_response.json", response_data)
         self._log_metadata(response_data)
@@ -75,10 +100,10 @@ class DetailedLogger:
         """Recursively searches for and extracts 'reasoning' fields from the response body."""
         if not isinstance(response_body, dict):
             return None
         if "reasoning" in response_body:
             return response_body["reasoning"]
         if "choices" in response_body and response_body["choices"]:
             message = response_body["choices"][0].get("message", {})
             if "reasoning" in message:
@@ -93,8 +118,13 @@ class DetailedLogger:
         usage = response_data.get("body", {}).get("usage") or {}
         model = response_data.get("body", {}).get("model", "N/A")
         finish_reason = "N/A"
-        if "choices" in response_data.get("body", {}) and response_data["body"]["choices"]:
-            finish_reason = response_data["body"]["choices"][0].get("finish_reason", "N/A")
         metadata = {
             "request_id": self.request_id,
@@ -110,12 +140,12 @@ class DetailedLogger:
             },
             "finish_reason": finish_reason,
             "reasoning_found": False,
-            "reasoning_content": None
         }
         reasoning = self._extract_reasoning(response_data.get("body", {}))
         if reasoning:
             metadata["reasoning_found"] = True
             metadata["reasoning_content"] = reasoning
-        self._write_json("metadata.json", metadata)

 import uuid
 from datetime import datetime
 from pathlib import Path
+from typing import Any, Dict, Optional
 import logging
+from rotator_library.utils.resilient_io import (
+    safe_write_json,
+    safe_log_write,
+    safe_mkdir,
+)
+from rotator_library.utils.paths import get_logs_dir
+def _get_detailed_logs_dir() -> Path:
+    """Get the detailed logs directory, creating it if needed."""
+    logs_dir = get_logs_dir()
+    detailed_dir = logs_dir / "detailed_logs"
+    detailed_dir.mkdir(parents=True, exist_ok=True)
+    return detailed_dir
 class DetailedLogger:
     """
     Logs comprehensive details of each API transaction to a unique, timestamped directory.
+    Uses fire-and-forget logging - if disk writes fail, logs are dropped (not buffered)
+    to prevent memory issues, especially with streaming responses.
     """
     def __init__(self):
         """
         Initializes the logger for a single request, creating a unique directory to store all related log files.
         self.start_time = time.time()
         self.request_id = str(uuid.uuid4())
         timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
+        self.log_dir = _get_detailed_logs_dir() / f"{timestamp}_{self.request_id}"
         self.streaming = False
+        self._dir_available = safe_mkdir(self.log_dir, logging)
     def _write_json(self, filename: str, data: Dict[str, Any]):
         """Helper to write data to a JSON file in the log directory."""
+        if not self._dir_available:
+            # Try to create directory again in case it was recreated
+            self._dir_available = safe_mkdir(self.log_dir, logging)
+            if not self._dir_available:
+                return
+        safe_write_json(
+            self.log_dir / filename,
+            data,
+            logging,
+            atomic=False,
+            indent=4,
+            ensure_ascii=False,
+        )
     def log_request(self, headers: Dict[str, Any], body: Dict[str, Any]):
         """Logs the initial request details."""
             "request_id": self.request_id,
             "timestamp_utc": datetime.utcnow().isoformat(),
             "headers": dict(headers),
+            "body": body,
         }
         self._write_json("request.json", request_data)
     def log_stream_chunk(self, chunk: Dict[str, Any]):
         """Logs an individual chunk from a streaming response to a JSON Lines file."""
+        if not self._dir_available:
+            return
+        log_entry = {"timestamp_utc": datetime.utcnow().isoformat(), "chunk": chunk}
+        content = json.dumps(log_entry, ensure_ascii=False) + "\n"
+        safe_log_write(self.log_dir / "streaming_chunks.jsonl", content, logging)
+    def log_final_response(
+        self, status_code: int, headers: Optional[Dict[str, Any]], body: Dict[str, Any]
+    ):
         """Logs the complete final response, either from a non-streaming call or after reassembling a stream."""
         end_time = time.time()
         duration_ms = (end_time - self.start_time) * 1000
             "status_code": status_code,
             "duration_ms": round(duration_ms),
             "headers": dict(headers) if headers else None,
+            "body": body,
         }
         self._write_json("final_response.json", response_data)
         self._log_metadata(response_data)
         """Recursively searches for and extracts 'reasoning' fields from the response body."""
         if not isinstance(response_body, dict):
             return None
         if "reasoning" in response_body:
             return response_body["reasoning"]
         if "choices" in response_body and response_body["choices"]:
             message = response_body["choices"][0].get("message", {})
             if "reasoning" in message:
         usage = response_data.get("body", {}).get("usage") or {}
         model = response_data.get("body", {}).get("model", "N/A")
         finish_reason = "N/A"
+        if (
+            "choices" in response_data.get("body", {})
+            and response_data["body"]["choices"]
+        ):
+            finish_reason = response_data["body"]["choices"][0].get(
+                "finish_reason", "N/A"
+            )
         metadata = {
             "request_id": self.request_id,
             },
             "finish_reason": finish_reason,
             "reasoning_found": False,
+            "reasoning_content": None,
         }
         reasoning = self._extract_reasoning(response_data.get("body", {}))
         if reasoning:
             metadata["reasoning_found"] = True
             metadata["reasoning_content"] = reasoning
+        self._write_json("metadata.json", metadata)

src/proxy_app/launcher_tui.py CHANGED Viewed

@@ -16,6 +16,20 @@ from dotenv import load_dotenv, set_key
 console = Console()
 def clear_screen():
     """
     Cross-platform terminal clear that works robustly on both
@@ -74,7 +88,7 @@ class LauncherConfig:
     @staticmethod
     def update_proxy_api_key(new_key: str):
         """Update PROXY_API_KEY in .env only"""
-        env_file = Path.cwd() / ".env"
         set_key(str(env_file), "PROXY_API_KEY", new_key)
         load_dotenv(dotenv_path=env_file, override=True)
@@ -85,7 +99,7 @@ class SettingsDetector:
     @staticmethod
     def _load_local_env() -> dict:
         """Load environment variables from local .env file only"""
-        env_file = Path.cwd() / ".env"
         env_dict = {}
         if not env_file.exists():
             return env_dict
@@ -107,7 +121,7 @@ class SettingsDetector:
     @staticmethod
     def get_all_settings() -> dict:
-        """Returns comprehensive settings overview"""
         return {
             "credentials": SettingsDetector.detect_credentials(),
             "custom_bases": SettingsDetector.detect_custom_api_bases(),
@@ -117,6 +131,17 @@ class SettingsDetector:
             "provider_settings": SettingsDetector.detect_provider_settings(),
         }
     @staticmethod
     def detect_credentials() -> dict:
         """Detect API keys and OAuth credentials"""
@@ -260,7 +285,7 @@ class LauncherTUI:
         self.console = Console()
         self.config = LauncherConfig()
         self.running = True
-        self.env_file = Path.cwd() / ".env"
         # Load .env file to ensure environment variables are available
         load_dotenv(dotenv_path=self.env_file, override=True)
@@ -277,8 +302,8 @@ class LauncherTUI:
         """Display main menu and handle selection"""
         clear_screen()
-        # Detect all settings
-        settings = SettingsDetector.get_all_settings()
         credentials = settings["credentials"]
         custom_bases = settings["custom_bases"]
@@ -363,18 +388,17 @@ class LauncherTUI:
         self.console.print("━" * 70)
         provider_count = len(credentials)
         custom_count = len(custom_bases)
-        provider_settings = settings.get("provider_settings", {})
         has_advanced = bool(
             settings["model_definitions"]
             or settings["concurrency_limits"]
             or settings["model_filters"]
-            or provider_settings
         )
-        self.console.print(f"   Providers:           {provider_count} configured")
-        self.console.print(f"   Custom Providers:    {custom_count} configured")
         self.console.print(
-            f"   Advanced Settings:   {'Active (view in menu 4)' if has_advanced else 'None'}"
         )
         # Show menu
@@ -418,7 +442,7 @@ class LauncherTUI:
         elif choice == "4":
             self.show_provider_settings_menu()
         elif choice == "5":
-            load_dotenv(dotenv_path=Path.cwd() / ".env", override=True)
             self.config = LauncherConfig()  # Reload config
             self.console.print("\n[green]✅ Configuration reloaded![/green]")
         elif choice == "6":
@@ -659,13 +683,14 @@ class LauncherTUI:
         """Display provider/advanced settings (read-only + launch tool)"""
         clear_screen()
-        settings = SettingsDetector.get_all_settings()
         credentials = settings["credentials"]
         custom_bases = settings["custom_bases"]
         model_defs = settings["model_definitions"]
         concurrency = settings["concurrency_limits"]
         filters = settings["model_filters"]
-        provider_settings = settings.get("provider_settings", {})
         self.console.print(
             Panel.fit(
@@ -740,23 +765,13 @@ class LauncherTUI:
                 status = " + ".join(status_parts) if status_parts else "None"
                 self.console.print(f"   • {provider:15} ✅ {status}")
-        # Provider-Specific Settings
         self.console.print()
         self.console.print("[bold]🔬 Provider-Specific Settings[/bold]")
         self.console.print("━" * 70)
-        try:
-            from proxy_app.settings_tool import PROVIDER_SETTINGS_MAP
-        except ImportError:
-            from .settings_tool import PROVIDER_SETTINGS_MAP
-        for provider in PROVIDER_SETTINGS_MAP.keys():
-            display_name = provider.replace("_", " ").title()
-            modified = provider_settings.get(provider, 0)
-            if modified > 0:
-                self.console.print(
-                    f"   • {display_name:20} [yellow]{modified} setting{'s' if modified > 1 else ''} modified[/yellow]"
-                )
-            else:
-                self.console.print(f"   • {display_name:20} [dim]using defaults[/dim]")
         # Actions
         self.console.print()
@@ -823,15 +838,31 @@ class LauncherTUI:
         # Run the tool with from_launcher=True to skip duplicate loading screen
         run_credential_tool(from_launcher=True)
         # Reload environment after credential tool
-        load_dotenv(dotenv_path=Path.cwd() / ".env", override=True)
     def launch_settings_tool(self):
         """Launch settings configuration tool"""
-        from proxy_app.settings_tool import run_settings_tool
         run_settings_tool()
         # Reload environment after settings tool
-        load_dotenv(dotenv_path=Path.cwd() / ".env", override=True)
     def show_about(self):
         """Display About page with project information"""
@@ -919,9 +950,9 @@ class LauncherTUI:
             )
             ensure_env_defaults()
-            load_dotenv(dotenv_path=Path.cwd() / ".env", override=True)
             run_credential_tool()
-            load_dotenv(dotenv_path=Path.cwd() / ".env", override=True)
             # Check again after credential tool
             if not os.getenv("PROXY_API_KEY"):

 console = Console()
+def _get_env_file() -> Path:
+    """
+    Get .env file path (lightweight - no heavy imports).
+    Returns:
+        Path to .env file - EXE directory if frozen, else current working directory
+    """
+    if getattr(sys, "frozen", False):
+        # Running as PyInstaller EXE - use EXE's directory
+        return Path(sys.executable).parent / ".env"
+    # Running as script - use current working directory
+    return Path.cwd() / ".env"
 def clear_screen():
     """
     Cross-platform terminal clear that works robustly on both
     @staticmethod
     def update_proxy_api_key(new_key: str):
         """Update PROXY_API_KEY in .env only"""
+        env_file = _get_env_file()
         set_key(str(env_file), "PROXY_API_KEY", new_key)
         load_dotenv(dotenv_path=env_file, override=True)
     @staticmethod
     def _load_local_env() -> dict:
         """Load environment variables from local .env file only"""
+        env_file = _get_env_file()
         env_dict = {}
         if not env_file.exists():
             return env_dict
     @staticmethod
     def get_all_settings() -> dict:
+        """Returns comprehensive settings overview (includes provider_settings which triggers heavy imports)"""
         return {
             "credentials": SettingsDetector.detect_credentials(),
             "custom_bases": SettingsDetector.detect_custom_api_bases(),
             "provider_settings": SettingsDetector.detect_provider_settings(),
         }
+    @staticmethod
+    def get_basic_settings() -> dict:
+        """Returns basic settings overview without provider_settings (avoids heavy imports)"""
+        return {
+            "credentials": SettingsDetector.detect_credentials(),
+            "custom_bases": SettingsDetector.detect_custom_api_bases(),
+            "model_definitions": SettingsDetector.detect_model_definitions(),
+            "concurrency_limits": SettingsDetector.detect_concurrency_limits(),
+            "model_filters": SettingsDetector.detect_model_filters(),
+        }
     @staticmethod
     def detect_credentials() -> dict:
         """Detect API keys and OAuth credentials"""
         self.console = Console()
         self.config = LauncherConfig()
         self.running = True
+        self.env_file = _get_env_file()
         # Load .env file to ensure environment variables are available
         load_dotenv(dotenv_path=self.env_file, override=True)
         """Display main menu and handle selection"""
         clear_screen()
+        # Detect basic settings (excludes provider_settings to avoid heavy imports)
+        settings = SettingsDetector.get_basic_settings()
         credentials = settings["credentials"]
         custom_bases = settings["custom_bases"]
         self.console.print("━" * 70)
         provider_count = len(credentials)
         custom_count = len(custom_bases)
+        self.console.print(f"   Providers:           {provider_count} configured")
+        self.console.print(f"   Custom Providers:    {custom_count} configured")
+        # Note: provider_settings detection is deferred to avoid heavy imports on startup
         has_advanced = bool(
             settings["model_definitions"]
             or settings["concurrency_limits"]
             or settings["model_filters"]
         )
         self.console.print(
+            f"   Advanced Settings:   {'Active (view in menu 4)' if has_advanced else 'None (view menu 4 for details)'}"
         )
         # Show menu
         elif choice == "4":
             self.show_provider_settings_menu()
         elif choice == "5":
+            load_dotenv(dotenv_path=_get_env_file(), override=True)
             self.config = LauncherConfig()  # Reload config
             self.console.print("\n[green]✅ Configuration reloaded![/green]")
         elif choice == "6":
         """Display provider/advanced settings (read-only + launch tool)"""
         clear_screen()
+        # Use basic settings to avoid heavy imports - provider_settings deferred to Settings Tool
+        settings = SettingsDetector.get_basic_settings()
         credentials = settings["credentials"]
         custom_bases = settings["custom_bases"]
         model_defs = settings["model_definitions"]
         concurrency = settings["concurrency_limits"]
         filters = settings["model_filters"]
         self.console.print(
             Panel.fit(
                 status = " + ".join(status_parts) if status_parts else "None"
                 self.console.print(f"   • {provider:15} ✅ {status}")
+        # Provider-Specific Settings (deferred to Settings Tool to avoid heavy imports)
         self.console.print()
         self.console.print("[bold]🔬 Provider-Specific Settings[/bold]")
         self.console.print("━" * 70)
+        self.console.print(
+            "   [dim]Launch Settings Tool to view/configure provider-specific settings[/dim]"
+        )
         # Actions
         self.console.print()
         # Run the tool with from_launcher=True to skip duplicate loading screen
         run_credential_tool(from_launcher=True)
         # Reload environment after credential tool
+        load_dotenv(dotenv_path=_get_env_file(), override=True)
     def launch_settings_tool(self):
         """Launch settings configuration tool"""
+        import time
+        clear_screen()
+        self.console.print("━" * 70)
+        self.console.print("Advanced Settings Configuration Tool")
+        self.console.print("━" * 70)
+        _start_time = time.time()
+        with self.console.status("Initializing settings tool...", spinner="dots"):
+            from proxy_app.settings_tool import run_settings_tool
+        _elapsed = time.time() - _start_time
+        self.console.print(f"✓ Settings tool ready in {_elapsed:.2f}s")
+        time.sleep(0.3)
         run_settings_tool()
         # Reload environment after settings tool
+        load_dotenv(dotenv_path=_get_env_file(), override=True)
     def show_about(self):
         """Display About page with project information"""
             )
             ensure_env_defaults()
+            load_dotenv(dotenv_path=_get_env_file(), override=True)
             run_credential_tool()
+            load_dotenv(dotenv_path=_get_env_file(), override=True)
             # Check again after credential tool
             if not os.getenv("PROXY_API_KEY"):
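
Review note: the new side of `launcher_tui.py` replaces every `Path.cwd() / ".env"` lookup with `_get_env_file()`, which prefers the executable's directory when running as a PyInstaller build. A minimal standalone sketch of that resolution rule follows. The function name `resolve_env_file` and its keyword arguments are hypothetical test hooks added here for illustration; the diff's helper takes no arguments and reads `sys.frozen`, `sys.executable`, and `Path.cwd()` directly.

```python
import sys
from pathlib import Path


def resolve_env_file(frozen=None, executable=None, cwd=None):
    """Return the .env path: next to the EXE when frozen, else in the CWD.

    The keyword arguments are hypothetical hooks for testing; when they are
    left as None the values come from sys and Path.cwd(), as in the diff.
    """
    if frozen is None:
        # PyInstaller sets sys.frozen on bundled executables
        frozen = getattr(sys, "frozen", False)
    if frozen:
        exe = Path(executable if executable is not None else sys.executable)
        return exe.parent / ".env"
    base = Path(cwd) if cwd is not None else Path.cwd()
    return base / ".env"


if __name__ == "__main__":
    print(resolve_env_file())
```

Keeping the helper free of `rotator_library` imports matches the stated goal of the change: the launcher can find its `.env` without paying for heavy imports at startup.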

src/proxy_app/main.py CHANGED

@@ -52,11 +52,17 @@ _start_time = time.time()
 from dotenv import load_dotenv
 from glob import glob
 # Load main .env first
-load_dotenv()
 # Load any additional .env files (e.g., antigravity_all_combined.env, gemini_cli_all_combined.env)
-_root_dir = Path.cwd()
 _env_files_found = list(_root_dir.glob("*.env"))
 for _env_file in sorted(_root_dir.glob("*.env")):
     if _env_file.name != ".env":  # Skip main .env (already loaded)
@@ -234,8 +240,10 @@ print(
 # Note: Debug logging will be added after logging configuration below
 # --- Logging Configuration ---
-LOG_DIR = Path(__file__).resolve().parent.parent.parent / "logs"
-LOG_DIR.mkdir(exist_ok=True)
 # Configure a console handler with color (INFO and above only, no DEBUG)
 console_handler = colorlog.StreamHandler(sys.stdout)
@@ -324,7 +332,7 @@ litellm_logger.propagate = False
 logging.debug(f"Modules loaded in {_elapsed:.2f}s")
 # Load environment variables from .env file
-load_dotenv()
 # --- Configuration ---
 USE_EMBEDDING_BATCHER = False
@@ -570,11 +578,11 @@ async def lifespan(app: FastAPI):
     )
     # Log loaded credentials summary (compact, always visible for deployment verification)
-    #_api_summary = ', '.join([f"{p}:{len(c)}" for p, c in api_keys.items()]) if api_keys else "none"
-    #_oauth_summary = ', '.join([f"{p}:{len(c)}" for p, c in oauth_credentials.items()]) if oauth_credentials else "none"
-    #_total_summary = ', '.join([f"{p}:{len(c)}" for p, c in client.all_credentials.items()])
-    #print(f"🔑 Credentials loaded: {_total_summary} (API: {_api_summary} | OAuth: {_oauth_summary})")
-    client.background_refresher.start() # Start the background task
     app.state.rotating_client = client
     # Warn if no provider credentials are configured
@@ -1263,8 +1271,8 @@ async def cost_estimate(request: Request, _=Depends(verify_api_key)):
 if __name__ == "__main__":
-    # Define ENV_FILE for onboarding checks
-    ENV_FILE = Path.cwd() / ".env"
     # Check if launcher TUI should be shown (no arguments provided)
     if len(sys.argv) == 1:
@@ -1331,7 +1339,7 @@ if __name__ == "__main__":
         ensure_env_defaults()
         # Reload environment variables after ensure_env_defaults creates/updates .env
-        load_dotenv(override=True)
         run_credential_tool()
     else:
         # Check if onboarding is needed
@@ -1349,11 +1357,11 @@ if __name__ == "__main__":
             from rotator_library.credential_tool import ensure_env_defaults
             ensure_env_defaults()
-            load_dotenv(override=True)
             run_credential_tool()
             # After credential tool exits, reload and re-check
-            load_dotenv(override=True)
             # Re-read PROXY_API_KEY from environment
             PROXY_API_KEY = os.getenv("PROXY_API_KEY")

 from dotenv import load_dotenv
 from glob import glob
+# Get the application root directory (EXE dir if frozen, else CWD)
+# Inlined here to avoid triggering heavy rotator_library imports before loading screen
+if getattr(sys, "frozen", False):
+    _root_dir = Path(sys.executable).parent
+else:
+    _root_dir = Path.cwd()
 # Load main .env first
+load_dotenv(_root_dir / ".env")
 # Load any additional .env files (e.g., antigravity_all_combined.env, gemini_cli_all_combined.env)
 _env_files_found = list(_root_dir.glob("*.env"))
 for _env_file in sorted(_root_dir.glob("*.env")):
     if _env_file.name != ".env":  # Skip main .env (already loaded)
 # Note: Debug logging will be added after logging configuration below
 # --- Logging Configuration ---
+# Import path utilities here (after loading screen) to avoid triggering heavy imports early
+from rotator_library.utils.paths import get_logs_dir, get_data_file
+LOG_DIR = get_logs_dir(_root_dir)
 # Configure a console handler with color (INFO and above only, no DEBUG)
 console_handler = colorlog.StreamHandler(sys.stdout)
 logging.debug(f"Modules loaded in {_elapsed:.2f}s")
 # Load environment variables from .env file
+load_dotenv(_root_dir / ".env")
 # --- Configuration ---
 USE_EMBEDDING_BATCHER = False
     )
     # Log loaded credentials summary (compact, always visible for deployment verification)
+    # _api_summary = ', '.join([f"{p}:{len(c)}" for p, c in api_keys.items()]) if api_keys else "none"
+    # _oauth_summary = ', '.join([f"{p}:{len(c)}" for p, c in oauth_credentials.items()]) if oauth_credentials else "none"
+    # _total_summary = ', '.join([f"{p}:{len(c)}" for p, c in client.all_credentials.items()])
+    # print(f"🔑 Credentials loaded: {_total_summary} (API: {_api_summary} | OAuth: {_oauth_summary})")
+    client.background_refresher.start()  # Start the background task
     app.state.rotating_client = client
     # Warn if no provider credentials are configured
 if __name__ == "__main__":
+    # Define ENV_FILE for onboarding checks using centralized path
+    ENV_FILE = get_data_file(".env")
     # Check if launcher TUI should be shown (no arguments provided)
     if len(sys.argv) == 1:
         ensure_env_defaults()
         # Reload environment variables after ensure_env_defaults creates/updates .env
+        load_dotenv(ENV_FILE, override=True)
         run_credential_tool()
     else:
         # Check if onboarding is needed
             from rotator_library.credential_tool import ensure_env_defaults
             ensure_env_defaults()
+            load_dotenv(ENV_FILE, override=True)
             run_credential_tool()
             # After credential tool exits, reload and re-check
+            load_dotenv(ENV_FILE, override=True)
             # Re-read PROXY_API_KEY from environment
             PROXY_API_KEY = os.getenv("PROXY_API_KEY")
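
Review note: `main.py` loads the main `.env` first and then walks every other `*.env` file in the root directory in sorted order, skipping `.env` itself. The discovery step can be sketched with the standard library alone, as below. The function name `find_extra_env_files` is hypothetical, and the sketch only lists the files; actually loading them (with `load_dotenv` in the diff) is left out.

```python
from pathlib import Path


def find_extra_env_files(root_dir):
    """List *.env files in root_dir in sorted order, excluding the main .env."""
    root = Path(root_dir)
    # Filter by name so the main .env is never loaded twice
    return [p for p in sorted(root.glob("*.env")) if p.name != ".env"]


if __name__ == "__main__":
    for env_file in find_extra_env_files(Path.cwd()):
        print(env_file.name)
```

Sorting makes the load order deterministic, which matters if two files define the same variable.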

src/proxy_app/settings_tool.py CHANGED

@@ -12,8 +12,36 @@ from rich.prompt import Prompt, IntPrompt, Confirm
 from rich.panel import Panel
 from dotenv import set_key, unset_key
 console = Console()
 def clear_screen():
     """
@@ -31,7 +59,7 @@ class AdvancedSettings:
     """Manages pending changes to .env"""
     def __init__(self):
-        self.env_file = Path.cwd() / ".env"
         self.pending_changes = {}  # key -> value (None means delete)
         self.load_current_settings()
@@ -39,7 +67,7 @@ class AdvancedSettings:
         """Load current .env values into env vars"""
         from dotenv import load_dotenv
-        load_dotenv(override=True)
     def set(self, key: str, value: str):
         """Stage a change"""
@@ -70,6 +98,70 @@ class AdvancedSettings:
         """Check if there are pending changes"""
         return bool(self.pending_changes)
 class CustomProviderManager:
     """Manages custom provider API bases"""
@@ -383,6 +475,11 @@ ANTIGRAVITY_SETTINGS = {
         "default": "\n\nSTRICT PARAMETERS: {params}.",
         "description": "Template for Claude strict parameter hints in tool descriptions",
     },
 }
 # Gemini CLI provider environment variables
@@ -427,12 +524,27 @@ GEMINI_CLI_SETTINGS = {
         "default": "",
         "description": "GCP Project ID for paid tier users (required for paid tiers)",
     },
 }
 # Map provider names to their settings definitions
 PROVIDER_SETTINGS_MAP = {
     "antigravity": ANTIGRAVITY_SETTINGS,
     "gemini_cli": GEMINI_CLI_SETTINGS,
 }
@@ -516,9 +628,61 @@ class SettingsTool:
         self.provider_settings_mgr = ProviderSettingsManager(self.settings)
         self.running = True
     def get_available_providers(self) -> List[str]:
         """Get list of providers that have credentials configured"""
-        env_file = Path.cwd() / ".env"
         providers = set()
         # Scan for providers with API keys from local .env
@@ -541,7 +705,9 @@ class SettingsTool:
                 pass
         # Also check for OAuth providers from files
-        oauth_dir = Path("oauth_creds")
         if oauth_dir.exists():
             for file in oauth_dir.glob("*_oauth_*.json"):
                 provider = file.name.split("_oauth_")[0]
@@ -579,12 +745,7 @@ class SettingsTool:
         self.console.print()
         self.console.print("━" * 70)
-        if self.settings.has_pending():
-            self.console.print(
-                '[yellow]ℹ️  Changes are pending until you select "Save & Exit"[/yellow]'
-            )
-        else:
-            self.console.print("[dim]ℹ️  No pending changes[/dim]")
         self.console.print()
         self.console.print(
@@ -618,6 +779,7 @@ class SettingsTool:
         while True:
             clear_screen()
             providers = self.provider_mgr.get_current_providers()
             self.console.print(
@@ -631,9 +793,48 @@ class SettingsTool:
             self.console.print("[bold]📋 Configured Custom Providers[/bold]")
             self.console.print("━" * 70)
-            if providers:
-                for name, base in providers.items():
-                    self.console.print(f"   • {name:15} {base}")
             else:
                 self.console.print("   [dim]No custom providers configured[/dim]")
@@ -662,7 +863,7 @@ class SettingsTool:
                     if api_base:
                         self.provider_mgr.add_provider(name, api_base)
                         self.console.print(
-                            f"\n[green]✅ Custom provider '{name}' configured![/green]"
                         )
                         self.console.print(
                             f"   To use: set {name.upper()}_API_KEY in credentials"
@@ -670,14 +871,18 @@ class SettingsTool:
                         input("\nPress Enter to continue...")
             elif choice == "2":
-                if not providers:
                     self.console.print("\n[yellow]No providers to edit[/yellow]")
                     input("\nPress Enter to continue...")
                     continue
                 # Show numbered list
                 self.console.print("\n[bold]Select provider to edit:[/bold]")
-                providers_list = list(providers.keys())
                 for idx, prov in enumerate(providers_list, 1):
                     self.console.print(f"   {idx}. {prov}")
@@ -686,7 +891,9 @@ class SettingsTool:
                     choices=[str(i) for i in range(1, len(providers_list) + 1)],
                 )
                 name = providers_list[choice_idx - 1]
-                current_base = providers.get(name, "")
                 self.console.print(f"\nCurrent API Base: {current_base}")
                 new_base = Prompt.ask(
@@ -703,16 +910,33 @@ class SettingsTool:
                 input("\nPress Enter to continue...")
             elif choice == "3":
-                if not providers:
                     self.console.print("\n[yellow]No providers to remove[/yellow]")
                     input("\nPress Enter to continue...")
                     continue
                 # Show numbered list
                 self.console.print("\n[bold]Select provider to remove:[/bold]")
-                providers_list = list(providers.keys())
                 for idx, prov in enumerate(providers_list, 1):
-                    self.console.print(f"   {idx}. {prov}")
                 choice_idx = IntPrompt.ask(
                     "Select option",
@@ -721,10 +945,18 @@ class SettingsTool:
                 name = providers_list[choice_idx - 1]
                 if Confirm.ask(f"Remove '{name}'?"):
-                    self.provider_mgr.remove_provider(name)
-                    self.console.print(
-                        f"\n[green]✅ Provider '{name}' removed![/green]"
-                    )
                     input("\nPress Enter to continue...")
             elif choice == "4":
@@ -735,7 +967,8 @@ class SettingsTool:
         while True:
             clear_screen()
-            all_providers = self.model_mgr.get_all_providers_with_models()
             self.console.print(
                 Panel.fit(
@@ -748,10 +981,69 @@ class SettingsTool:
             self.console.print("[bold]📋 Configured Provider Models[/bold]")
             self.console.print("━" * 70)
-            if all_providers:
-                for provider, count in all_providers.items():
                     self.console.print(
-                        f"   • {provider:15} {count} model{'s' if count > 1 else ''}"
                     )
             else:
                 self.console.print("   [dim]No model definitions configured[/dim]")
@@ -778,19 +1070,36 @@ class SettingsTool:
             if choice == "1":
                 self.add_model_definitions()
             elif choice == "2":
-                if not all_providers:
                     self.console.print("\n[yellow]No providers to edit[/yellow]")
                     input("\nPress Enter to continue...")
                     continue
-                self.edit_model_definitions(list(all_providers.keys()))
             elif choice == "3":
-                if not all_providers:
                     self.console.print("\n[yellow]No providers to view[/yellow]")
                     input("\nPress Enter to continue...")
                     continue
-                self.view_model_definitions(list(all_providers.keys()))
             elif choice == "4":
-                if not all_providers:
                     self.console.print("\n[yellow]No providers to remove[/yellow]")
                     input("\nPress Enter to continue...")
                     continue
@@ -799,9 +1108,14 @@ class SettingsTool:
                 self.console.print(
                     "\n[bold]Select provider to remove models from:[/bold]"
                 )
-                providers_list = list(all_providers.keys())
                 for idx, prov in enumerate(providers_list, 1):
-                    self.console.print(f"   {idx}. {prov}")
                 choice_idx = IntPrompt.ask(
                     "Select option",
@@ -810,10 +1124,18 @@ class SettingsTool:
                 provider = providers_list[choice_idx - 1]
                 if Confirm.ask(f"Remove all model definitions for '{provider}'?"):
-                    self.model_mgr.remove_models(provider)
-                    self.console.print(
-                        f"\n[green]✅ Model definitions removed for '{provider}'![/green]"
-                    )
                     input("\nPress Enter to continue...")
             elif choice == "5":
                 break
@@ -1140,7 +1462,7 @@ class SettingsTool:
             self.console.print("[bold]📋 Current Settings[/bold]")
             self.console.print("━" * 70)
-            # Display all settings with current values
             settings_list = list(definitions.keys())
             for idx, key in enumerate(settings_list, 1):
                 definition = definitions[key]
@@ -1149,37 +1471,88 @@ class SettingsTool:
                 setting_type = definition.get("type", "str")
                 description = definition.get("description", "")
                 # Format value display
                 if setting_type == "bool":
                     value_display = (
                         "[green]✓ Enabled[/green]"
-                        if current
                         else "[red]✗ Disabled[/red]"
                     )
                 elif setting_type == "int":
-                    value_display = f"[cyan]{current}[/cyan]"
                 else:
                     value_display = (
-                        f"[cyan]{current or '(not set)'}[/cyan]"
-                        if current
                         else "[dim](not set)[/dim]"
                     )
-                # Check if modified from default
-                modified = current != default
-                mod_marker = "[yellow]*[/yellow]" if modified else " "
                 # Short key name for display (strip provider prefix)
                 short_key = key.replace(f"{provider.upper()}_", "")
-                self.console.print(
-                    f"  {mod_marker}{idx:2}. {short_key:35} {value_display}"
-                )
                 self.console.print(f"       [dim]{description}[/dim]")
             self.console.print()
             self.console.print("━" * 70)
-            self.console.print("[dim]* = modified from default[/dim]")
             self.console.print()
             self.console.print("[bold]⚙️  Actions[/bold]")
             self.console.print()
@@ -1299,6 +1672,7 @@ class SettingsTool:
         while True:
             clear_screen()
             modes = self.rotation_mgr.get_current_modes()
             available_providers = self.get_available_providers()
@@ -1322,20 +1696,78 @@ class SettingsTool:
             self.console.print("[bold]📋 Current Rotation Mode Settings[/bold]")
             self.console.print("━" * 70)
-            if modes:
-                for provider, mode in modes.items():
-                    default_mode = self.rotation_mgr.get_default_mode(provider)
-                    is_custom = mode != default_mode
-                    marker = "[yellow]*[/yellow]" if is_custom else " "
                     mode_display = (
                         f"[green]{mode}[/green]"
                         if mode == "sequential"
                         else f"[blue]{mode}[/blue]"
                     )
-                    self.console.print(f"  {marker}• {provider:20} {mode_display}")
             # Show providers with default modes
-            providers_with_defaults = [p for p in available_providers if p not in modes]
             if providers_with_defaults:
                 self.console.print()
                 self.console.print("[dim]Providers using default modes:[/dim]")
@@ -1423,12 +1855,16 @@ class SettingsTool:
                     self.rotation_mgr.set_mode(provider, new_mode)
                     self.console.print(
-                        f"\n[green]✅ Rotation mode for '{provider}' set to {new_mode}![/green]"
                     )
                     input("\nPress Enter to continue...")
             elif choice == "2":
-                if not modes:
                     self.console.print(
                         "\n[yellow]No custom rotation modes to reset[/yellow]"
                     )
@@ -1439,12 +1875,18 @@ class SettingsTool:
                 self.console.print(
                     "\n[bold]Select provider to reset to default:[/bold]"
                 )
-                modes_list = list(modes.keys())
                 for idx, prov in enumerate(modes_list, 1):
                     default_mode = self.rotation_mgr.get_default_mode(prov)
-                    self.console.print(
-                        f"   {idx}. {prov} (will reset to: {default_mode})"
-                    )
                 choice_idx = IntPrompt.ask(
                     "Select option",
@@ -1452,12 +1894,21 @@ class SettingsTool:
                 )
                 provider = modes_list[choice_idx - 1]
                 default_mode = self.rotation_mgr.get_default_mode(provider)
                 if Confirm.ask(f"Reset '{provider}' to default mode ({default_mode})?"):
-                    self.rotation_mgr.remove_mode(provider)
-                    self.console.print(
-                        f"\n[green]✅ Rotation mode for '{provider}' reset to default ({default_mode})![/green]"
-                    )
                     input("\nPress Enter to continue...")
             elif choice == "3":
@@ -1630,6 +2081,7 @@ class SettingsTool:
         while True:
             clear_screen()
             limits = self.concurrency_mgr.get_current_limits()
             self.console.print(
@@ -1643,10 +2095,57 @@ class SettingsTool:
             self.console.print("[bold]📋 Current Concurrency Settings[/bold]")
             self.console.print("━" * 70)
-            if limits:
-                for provider, limit in limits.items():
-                    self.console.print(f"   • {provider:15} {limit} requests/key")
-                self.console.print(f"   • Default:        1 request/key (all others)")
             else:
                 self.console.print("   • Default:        1 request/key (all providers)")
@@ -1704,7 +2203,7 @@ class SettingsTool:
                     if 1 <= limit <= 100:
                         self.concurrency_mgr.set_limit(provider, limit)
                         self.console.print(
-                            f"\n[green]✅ Concurrency limit set for '{provider}': {limit} requests/key[/green]"
                         )
                     else:
                         self.console.print(
@@ -1713,14 +2212,18 @@ class SettingsTool:
                     input("\nPress Enter to continue...")
             elif choice == "2":
-                if not limits:
                     self.console.print("\n[yellow]No limits to edit[/yellow]")
                     input("\nPress Enter to continue...")
                     continue
                 # Show numbered list
                 self.console.print("\n[bold]Select provider to edit:[/bold]")
-                limits_list = list(limits.keys())
                 for idx, prov in enumerate(limits_list, 1):
                     self.console.print(f"   {idx}. {prov}")
@@ -1729,7 +2232,8 @@ class SettingsTool:
                     choices=[str(i) for i in range(1, len(limits_list) + 1)],
                 )
                 provider = limits_list[choice_idx - 1]
-                current_limit = limits.get(provider, 1)
                 self.console.print(f"\nCurrent limit: {current_limit} requests/key")
                 new_limit = IntPrompt.ask(
@@ -1750,7 +2254,18 @@ class SettingsTool:
                 input("\nPress Enter to continue...")
             elif choice == "3":
-                if not limits:
                     self.console.print("\n[yellow]No limits to remove[/yellow]")
                     input("\nPress Enter to continue...")
                     continue
@@ -1759,9 +2274,14 @@ class SettingsTool:
                 self.console.print(
                     "\n[bold]Select provider to remove limit from:[/bold]"
                 )
-                limits_list = list(limits.keys())
                 for idx, prov in enumerate(limits_list, 1):
-                    self.console.print(f"   {idx}. {prov}")
                 choice_idx = IntPrompt.ask(
                     "Select option",
@@ -1772,18 +2292,118 @@ class SettingsTool:
                 if Confirm.ask(
                     f"Remove concurrency limit for '{provider}' (reset to default 1)?"
                 ):
-                    self.concurrency_mgr.remove_limit(provider)
-                    self.console.print(
-                        f"\n[green]✅ Limit removed for '{provider}' - using default (1 request/key)[/green]"
-                    )
                     input("\nPress Enter to continue...")
             elif choice == "4":
                 break
     def save_and_exit(self):
         """Save pending changes and exit"""
         if self.settings.has_pending():
             if Confirm.ask("\n[bold yellow]Save all pending changes?[/bold yellow]"):
                 self.settings.save()
                 self.console.print("\n[green]✅ All changes saved to .env![/green]")
@@ -1801,6 +2421,9 @@ class SettingsTool:
     def exit_without_saving(self):
         """Exit without saving"""
         if self.settings.has_pending():
             if Confirm.ask("\n[bold red]Discard all pending changes?[/bold red]"):
                 self.settings.discard()
                 self.console.print("\n[yellow]Changes discarded[/yellow]")

 from rich.panel import Panel
 from dotenv import set_key, unset_key
+from rotator_library.utils.paths import get_data_file
 console = Console()
+# Sentinel value for distinguishing "no pending change" from "pending change to None"
+_NOT_FOUND = object()
+# Import default OAuth port values from provider modules
+# These serve as the source of truth for default port values
+try:
+    from rotator_library.providers.gemini_auth_base import GeminiAuthBase
+    GEMINI_CLI_DEFAULT_OAUTH_PORT = GeminiAuthBase.CALLBACK_PORT
+except ImportError:
+    GEMINI_CLI_DEFAULT_OAUTH_PORT = 8085
+try:
+    from rotator_library.providers.antigravity_auth_base import AntigravityAuthBase
+    ANTIGRAVITY_DEFAULT_OAUTH_PORT = AntigravityAuthBase.CALLBACK_PORT
+except ImportError:
+    ANTIGRAVITY_DEFAULT_OAUTH_PORT = 51121
+try:
+    from rotator_library.providers.iflow_auth_base import (
+        CALLBACK_PORT as IFLOW_DEFAULT_OAUTH_PORT,
+    )
+except ImportError:
+    IFLOW_DEFAULT_OAUTH_PORT = 11451
 def clear_screen():
     """
     """Manages pending changes to .env"""
     def __init__(self):
+        self.env_file = get_data_file(".env")
         self.pending_changes = {}  # key -> value (None means delete)
         self.load_current_settings()
         """Load current .env values into env vars"""
         from dotenv import load_dotenv
+        load_dotenv(self.env_file, override=True)
     def set(self, key: str, value: str):
         """Stage a change"""
         """Check if there are pending changes"""
         return bool(self.pending_changes)
+    def get_pending_value(self, key: str):
+        """Get pending value for a key. Returns sentinel _NOT_FOUND if no pending change."""
+        return self.pending_changes.get(key, _NOT_FOUND)
+    def get_original_value(self, key: str) -> Optional[str]:
+        """Get the current .env value (before pending changes)"""
+        return os.getenv(key)
+    def get_change_type(self, key: str) -> Optional[str]:
+        """Returns 'add', 'edit', 'remove', or None if no pending change"""
+        if key not in self.pending_changes:
+            return None
+        if self.pending_changes[key] is None:
+            return "remove"
+        elif os.getenv(key) is not None:
+            return "edit"
+        else:
+            return "add"
+    def get_pending_keys_by_pattern(
+        self, prefix: str = "", suffix: str = ""
+    ) -> List[str]:
+        """Get all pending change keys that match prefix and/or suffix"""
+        return [
+            k
+            for k in self.pending_changes.keys()
+            if k.startswith(prefix) and k.endswith(suffix)
+        ]
+    def get_changes_summary(self) -> Dict[str, List[tuple]]:
+        """Get categorized summary of all pending changes.
+        Returns dict with 'add', 'edit', 'remove' keys,
+        each containing list of (key, old_val, new_val) tuples.
+        """
+        summary: Dict[str, List[tuple]] = {"add": [], "edit": [], "remove": []}
+        for key, new_val in self.pending_changes.items():
+            old_val = os.getenv(key)
+            change_type = self.get_change_type(key)
+            if change_type:
+                summary[change_type].append((key, old_val, new_val))
+        # Sort each list alphabetically by key
+        for change_type in summary:
+            summary[change_type].sort(key=lambda x: x[0])
+        return summary
+    def get_pending_counts(self) -> Dict[str, int]:
+        """Get counts of pending changes by type"""
+        # Reuse get_change_type so the add/edit/remove rules live in one place
+        counts = {"add": 0, "edit": 0, "remove": 0}
+        for key in self.pending_changes:
+            change_type = self.get_change_type(key)
+            if change_type:
+                counts[change_type] += 1
+        return counts
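The staging rules added above can be read in isolation. The following is a minimal standalone sketch, not the project's code: `classify_change` and `count_changes` are hypothetical names, and plain dicts stand in for `pending_changes` and the process environment. It assumes, as in the diff, that a pending value of `None` means "delete the key".

```python
from typing import Dict, Optional


def classify_change(
    key: str,
    pending: Dict[str, Optional[str]],
    env: Dict[str, str],
) -> Optional[str]:
    """Return 'add', 'edit', 'remove', or None when nothing is staged for key."""
    if key not in pending:
        return None
    if pending[key] is None:
        # A staged None means the key is marked for deletion
        return "remove"
    # A staged value over an existing key is an edit, otherwise an addition
    return "edit" if env.get(key) is not None else "add"


def count_changes(
    pending: Dict[str, Optional[str]],
    env: Dict[str, str],
) -> Dict[str, int]:
    """Count staged changes by type, using the same classification rules."""
    counts = {"add": 0, "edit": 0, "remove": 0}
    for key in pending:
        change_type = classify_change(key, pending, env)
        if change_type:
            counts[change_type] += 1
    return counts


if __name__ == "__main__":
    env = {"OPENAI_API_BASE": "https://old.example", "ROTATION_MODE_GROQ": "balanced"}
    pending = {
        "OPENAI_API_BASE": "https://new.example",  # key exists in env: edit
        "ROTATION_MODE_GROQ": None,                # staged None: remove
        "MYAPI_API_BASE": "https://api.example",   # key not in env: add
    }
    print(count_changes(pending, env))
```

Deriving the counts from the single classifier keeps the main-menu summary and the per-item markers from drifting apart if the rules change.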
 class CustomProviderManager:
     """Manages custom provider API bases"""
         "default": "\n\nSTRICT PARAMETERS: {params}.",
         "description": "Template for Claude strict parameter hints in tool descriptions",
     },
+    "ANTIGRAVITY_OAUTH_PORT": {
+        "type": "int",
+        "default": ANTIGRAVITY_DEFAULT_OAUTH_PORT,
+        "description": "Local port for OAuth callback server during authentication",
+    },
 }
 # Gemini CLI provider environment variables
         "default": "",
         "description": "GCP Project ID for paid tier users (required for paid tiers)",
     },
+    "GEMINI_CLI_OAUTH_PORT": {
+        "type": "int",
+        "default": GEMINI_CLI_DEFAULT_OAUTH_PORT,
+        "description": "Local port for OAuth callback server during authentication",
+    },
+}
+# iFlow provider environment variables
+IFLOW_SETTINGS = {
+    "IFLOW_OAUTH_PORT": {
+        "type": "int",
+        "default": IFLOW_DEFAULT_OAUTH_PORT,
+        "description": "Local port for OAuth callback server during authentication",
+    },
 }
 # Map provider names to their settings definitions
 PROVIDER_SETTINGS_MAP = {
     "antigravity": ANTIGRAVITY_SETTINGS,
     "gemini_cli": GEMINI_CLI_SETTINGS,
+    "iflow": IFLOW_SETTINGS,
 }
         self.provider_settings_mgr = ProviderSettingsManager(self.settings)
         self.running = True
+    def _format_item(
+        self,
+        name: str,
+        value: str,
+        change_type: Optional[str],
+        old_value: Optional[str] = None,
+        width: int = 15,
+    ) -> str:
+        """Format a list item with change indicator.
+        change_type: None, 'add', 'edit', 'remove'
+        Returns formatted string like:
+          "   + myapi          https://api.example.com" (green)
+          "   ~ openai         1 → 5 requests/key" (yellow)
+          "   - oldapi         https://old.api.com" (red)
+          "   • groq           3 requests/key" (normal)
+        """
+        if change_type == "add":
+            return f"   [green]+ {name:{width}} {value}[/green]"
+        elif change_type == "edit":
+            if old_value is not None:
+                return f"   [yellow]~ {name:{width}} {old_value} → {value}[/yellow]"
+            else:
+                return f"   [yellow]~ {name:{width}} {value}[/yellow]"
+        elif change_type == "remove":
+            return f"   [red]- {name:{width}} {value}[/red]"
+        else:
+            return f"   • {name:{width}} {value}"
+    def _get_pending_status_text(self) -> str:
+        """Get formatted pending changes status text for main menu."""
+        if not self.settings.has_pending():
+            return "[dim]ℹ️  No pending changes[/dim]"
+        counts = self.settings.get_pending_counts()
+        parts = []
+        if counts["add"]:
+            parts.append(
+                f"[green]{counts['add']} addition{'s' if counts['add'] > 1 else ''}[/green]"
+            )
+        if counts["edit"]:
+            parts.append(
+                f"[yellow]{counts['edit']} modification{'s' if counts['edit'] > 1 else ''}[/yellow]"
+            )
+        if counts["remove"]:
+            parts.append(
+                f"[red]{counts['remove']} removal{'s' if counts['remove'] > 1 else ''}[/red]"
+            )
+        return f"[bold]ℹ️  Pending changes: {', '.join(parts)}[/bold]"
+        self.running = True
     def get_available_providers(self) -> List[str]:
         """Get list of providers that have credentials configured"""
+        env_file = get_data_file(".env")
         providers = set()
         # Scan for providers with API keys from local .env
                 pass
         # Also check for OAuth providers from files
+        from rotator_library.utils.paths import get_oauth_dir
+        oauth_dir = get_oauth_dir()
         if oauth_dir.exists():
             for file in oauth_dir.glob("*_oauth_*.json"):
                 provider = file.name.split("_oauth_")[0]
         self.console.print()
         self.console.print("━" * 70)
+        self.console.print(self._get_pending_status_text())
         self.console.print()
         self.console.print(
         while True:
             clear_screen()
+            # Get current providers from env
             providers = self.provider_mgr.get_current_providers()
             self.console.print(
             self.console.print("[bold]📋 Configured Custom Providers[/bold]")
             self.console.print("━" * 70)
+            # Build combined view with pending changes
+            all_providers: Dict[str, Dict[str, Any]] = {}
+            # Add current providers (from env)
+            for name, base in providers.items():
+                key = f"{name.upper()}_API_BASE"
+                change_type = self.settings.get_change_type(key)
+                if change_type == "remove":
+                    all_providers[name] = {"value": base, "type": "remove", "old": None}
+                elif change_type == "edit":
+                    new_val = self.settings.pending_changes[key]
+                    all_providers[name] = {
+                        "value": new_val,
+                        "type": "edit",
+                        "old": base,
+                    }
+                else:
+                    all_providers[name] = {"value": base, "type": None, "old": None}
+            # Add pending new providers (additions)
+            for key in self.settings.get_pending_keys_by_pattern(suffix="_API_BASE"):
+                if self.settings.get_change_type(key) == "add":
+                    name = key[: -len("_API_BASE")].lower()
+                    if name not in all_providers:
+                        all_providers[name] = {
+                            "value": self.settings.pending_changes[key],
+                            "type": "add",
+                            "old": None,
+                        }
+            if all_providers:
+                # Sort alphabetically
+                for name in sorted(all_providers.keys()):
+                    info = all_providers[name]
+                    self.console.print(
+                        self._format_item(
+                            name,
+                            info["value"],
+                            info["type"],
+                            info["old"],
+                        )
+                    )
             else:
                 self.console.print("   [dim]No custom providers configured[/dim]")
                     if api_base:
                         self.provider_mgr.add_provider(name, api_base)
                         self.console.print(
+                            f"\n[green]✅ Custom provider '{name}' staged![/green]"
                         )
                         self.console.print(
                             f"   To use: set {name.upper()}_API_KEY in credentials"
                         input("\nPress Enter to continue...")
             elif choice == "2":
+                # Get editable providers (existing + pending additions, excluding pending removals)
+                editable = {
+                    k: v for k, v in all_providers.items() if v["type"] != "remove"
+                }
+                if not editable:
                     self.console.print("\n[yellow]No providers to edit[/yellow]")
                     input("\nPress Enter to continue...")
                     continue
                 # Show numbered list
                 self.console.print("\n[bold]Select provider to edit:[/bold]")
+                providers_list = sorted(editable.keys())
                 for idx, prov in enumerate(providers_list, 1):
                     self.console.print(f"   {idx}. {prov}")
                     choices=[str(i) for i in range(1, len(providers_list) + 1)],
                 )
                 name = providers_list[choice_idx - 1]
+                info = editable[name]
+                # Get effective current value (could be pending or from env)
+                current_base = info["value"]
                 self.console.print(f"\nCurrent API Base: {current_base}")
                 new_base = Prompt.ask(
                 input("\nPress Enter to continue...")
             elif choice == "3":
+                # Get removable providers (existing ones not already pending removal)
+                removable = {
+                    k: v
+                    for k, v in all_providers.items()
+                    if v["type"] not in ("remove", "add")
+                }
+                # For pending additions, we can "undo" by removing from pending
+                pending_adds = {
+                    k: v for k, v in all_providers.items() if v["type"] == "add"
+                }
+                if not removable and not pending_adds:
                     self.console.print("\n[yellow]No providers to remove[/yellow]")
                     input("\nPress Enter to continue...")
                     continue
                 # Show numbered list
                 self.console.print("\n[bold]Select provider to remove:[/bold]")
+                # Show existing providers first, then pending additions
+                providers_list = sorted(removable.keys()) + sorted(pending_adds.keys())
                 for idx, prov in enumerate(providers_list, 1):
+                    if prov in pending_adds:
+                        self.console.print(
+                            f"   {idx}. {prov} [green](pending add)[/green]"
+                        )
+                    else:
+                        self.console.print(f"   {idx}. {prov}")
                 choice_idx = IntPrompt.ask(
                     "Select option",
                 name = providers_list[choice_idx - 1]
                 if Confirm.ask(f"Remove '{name}'?"):
+                    if name in pending_adds:
+                        # Undo pending addition - remove from pending_changes
+                        key = f"{name.upper()}_API_BASE"
+                        del self.settings.pending_changes[key]
+                        self.console.print(
+                            f"\n[green]✅ Pending addition of '{name}' cancelled![/green]"
+                        )
+                    else:
+                        self.provider_mgr.remove_provider(name)
+                        self.console.print(
+                            f"\n[green]✅ Provider '{name}' marked for removal![/green]"
+                        )
                     input("\nPress Enter to continue...")
             elif choice == "4":
         while True:
             clear_screen()
+            # Get current providers with models from env
+            all_providers_env = self.model_mgr.get_all_providers_with_models()
             self.console.print(
                 Panel.fit(
             self.console.print("[bold]📋 Configured Provider Models[/bold]")
             self.console.print("━" * 70)
+            # Build combined view with pending changes
+            all_models: Dict[str, Dict[str, Any]] = {}
+            suffix = "_MODELS"
+            # Add current providers (from env)
+            for provider, count in all_providers_env.items():
+                key = f"{provider.upper()}{suffix}"
+                change_type = self.settings.get_change_type(key)
+                if change_type == "remove":
+                    all_models[provider] = {
+                        "value": f"{count} model{'s' if count != 1 else ''}",
+                        "type": "remove",
+                        "old": None,
+                    }
+                elif change_type == "edit":
+                    # Get new model count from pending
+                    new_val = self.settings.pending_changes[key]
+                    try:
+                        parsed = json.loads(new_val)
+                        new_count = (
+                            len(parsed) if isinstance(parsed, (dict, list)) else 0
+                        )
+                    except (json.JSONDecodeError, ValueError):
+                        new_count = 0
+                    all_models[provider] = {
+                        "value": f"{new_count} model{'s' if new_count != 1 else ''}",
+                        "type": "edit",
+                        "old": f"{count} model{'s' if count != 1 else ''}",
+                    }
+                else:
+                    all_models[provider] = {
+                        "value": f"{count} model{'s' if count != 1 else ''}",
+                        "type": None,
+                        "old": None,
+                    }
+            # Add pending new model definitions (additions)
+            for key in self.settings.get_pending_keys_by_pattern(suffix=suffix):
+                if self.settings.get_change_type(key) == "add":
+                    provider = key[: -len(suffix)].lower()
+                    if provider not in all_models:
+                        new_val = self.settings.pending_changes[key]
+                        try:
+                            parsed = json.loads(new_val)
+                            new_count = (
+                                len(parsed) if isinstance(parsed, (dict, list)) else 0
+                            )
+                        except (json.JSONDecodeError, ValueError):
+                            new_count = 0
+                        all_models[provider] = {
+                            "value": f"{new_count} model{'s' if new_count != 1 else ''}",
+                            "type": "add",
+                            "old": None,
+                        }
+            if all_models:
+                # Sort alphabetically
+                for provider in sorted(all_models.keys()):
+                    info = all_models[provider]
                     self.console.print(
+                        self._format_item(
+                            provider, info["value"], info["type"], info["old"]
+                        )
                     )
             else:
                 self.console.print("   [dim]No model definitions configured[/dim]")
             if choice == "1":
                 self.add_model_definitions()
             elif choice == "2":
+                # Get editable models (existing + pending additions, excluding pending removals)
+                editable = {
+                    k: v for k, v in all_models.items() if v["type"] != "remove"
+                }
+                if not editable:
                     self.console.print("\n[yellow]No providers to edit[/yellow]")
                     input("\nPress Enter to continue...")
                     continue
+                self.edit_model_definitions(sorted(editable.keys()))
             elif choice == "3":
+                viewable = {
+                    k: v for k, v in all_models.items() if v["type"] != "remove"
+                }
+                if not viewable:
                     self.console.print("\n[yellow]No providers to view[/yellow]")
                     input("\nPress Enter to continue...")
                     continue
+                self.view_model_definitions(sorted(viewable.keys()))
             elif choice == "4":
+                # Get removable models (existing ones not already pending removal)
+                removable = {
+                    k: v
+                    for k, v in all_models.items()
+                    if v["type"] not in ("remove", "add")
+                }
+                pending_adds = {
+                    k: v for k, v in all_models.items() if v["type"] == "add"
+                }
+                if not removable and not pending_adds:
                     self.console.print("\n[yellow]No providers to remove[/yellow]")
                     input("\nPress Enter to continue...")
                     continue
                 self.console.print(
                     "\n[bold]Select provider to remove models from:[/bold]"
                 )
+                providers_list = sorted(removable.keys()) + sorted(pending_adds.keys())
                 for idx, prov in enumerate(providers_list, 1):
+                    if prov in pending_adds:
+                        self.console.print(
+                            f"   {idx}. {prov} [green](pending add)[/green]"
+                        )
+                    else:
+                        self.console.print(f"   {idx}. {prov}")
                 choice_idx = IntPrompt.ask(
                     "Select option",
                 provider = providers_list[choice_idx - 1]
                 if Confirm.ask(f"Remove all model definitions for '{provider}'?"):
+                    if provider in pending_adds:
+                        # Undo pending addition
+                        key = f"{provider.upper()}{suffix}"
+                        del self.settings.pending_changes[key]
+                        self.console.print(
+                            f"\n[green]✅ Pending models for '{provider}' cancelled![/green]"
+                        )
+                    else:
+                        self.model_mgr.remove_models(provider)
+                        self.console.print(
+                            f"\n[green]✅ Model definitions marked for removal for '{provider}'![/green]"
+                        )
                     input("\nPress Enter to continue...")
             elif choice == "5":
                 break
             self.console.print("[bold]📋 Current Settings[/bold]")
             self.console.print("━" * 70)
+            # Display all settings with current values and pending changes
             settings_list = list(definitions.keys())
             for idx, key in enumerate(settings_list, 1):
                 definition = definitions[key]
                 setting_type = definition.get("type", "str")
                 description = definition.get("description", "")
+                # Check for pending changes
+                change_type = self.settings.get_change_type(key)
+                pending_val = self.settings.get_pending_value(key)
+                # Determine effective value to display
+                if pending_val is not _NOT_FOUND and pending_val is not None:
+                    # Has pending change - convert to proper type for display
+                    if setting_type == "bool":
+                        effective = pending_val.lower() in ("true", "1", "yes")
+                    elif setting_type == "int":
+                        try:
+                            effective = int(pending_val)
+                        except (ValueError, TypeError):
+                            effective = pending_val
+                    else:
+                        effective = pending_val
+                elif pending_val is None and change_type == "remove":
+                    # Pending removal - will revert to default
+                    effective = default
+                else:
+                    effective = current
                 # Format value display
                 if setting_type == "bool":
                     value_display = (
                         "[green]✓ Enabled[/green]"
+                        if effective
                         else "[red]✗ Disabled[/red]"
                     )
+                    old_display = (
+                        (
+                            "[green]✓ Enabled[/green]"
+                            if current
+                            else "[red]✗ Disabled[/red]"
+                        )
+                        if change_type
+                        else None
+                    )
                 elif setting_type == "int":
+                    value_display = f"[cyan]{effective}[/cyan]"
+                    old_display = f"[cyan]{current}[/cyan]" if change_type else None
                 else:
                     value_display = (
+                        f"[cyan]{effective}[/cyan]"
+                        if effective
                         else "[dim](not set)[/dim]"
                     )
+                    old_display = (
+                        f"[cyan]{current}[/cyan]" if change_type and current else None
+                    )
                 # Short key name for display (strip provider prefix)
                 short_key = key.replace(f"{provider.upper()}_", "")
+                # Determine display marker based on pending change type
+                if change_type == "add":
+                    self.console.print(
+                        f"  [green]+{idx:2}. {short_key:35} {value_display}[/green]"
+                    )
+                elif change_type == "edit":
+                    self.console.print(
+                        f"  [yellow]~{idx:2}. {short_key:35} {old_display} → {value_display}[/yellow]"
+                    )
+                elif change_type == "remove":
+                    self.console.print(
+                        f"  [red]-{idx:2}. {short_key:35} {old_display} → [dim](default: {default})[/dim][/red]"
+                    )
+                else:
+                    # Check if modified from default (in env, not pending)
+                    modified = current != default
+                    mod_marker = "[yellow]*[/yellow]" if modified else " "
+                    self.console.print(
+                        f"  {mod_marker}{idx:2}. {short_key:35} {value_display}"
+                    )
                 self.console.print(f"       [dim]{description}[/dim]")
             self.console.print()
             self.console.print("━" * 70)
+            self.console.print(
+                "[dim]* = modified from default, + = pending add, ~ = pending edit, - = pending reset[/dim]"
+            )
             self.console.print()
             self.console.print("[bold]⚙️  Actions[/bold]")
             self.console.print()
         while True:
             clear_screen()
+            # Get current modes from env
             modes = self.rotation_mgr.get_current_modes()
             available_providers = self.get_available_providers()
             self.console.print("[bold]📋 Current Rotation Mode Settings[/bold]")
             self.console.print("━" * 70)
+            # Build combined view with pending changes
+            all_modes: Dict[str, Dict[str, Any]] = {}
+            prefix = "ROTATION_MODE_"
+            # Add current modes (from env)
+            for provider, mode in modes.items():
+                key = f"{prefix}{provider.upper()}"
+                change_type = self.settings.get_change_type(key)
+                default_mode = self.rotation_mgr.get_default_mode(provider)
+                if change_type == "remove":
+                    all_modes[provider] = {"value": mode, "type": "remove", "old": None}
+                elif change_type == "edit":
+                    new_val = self.settings.pending_changes[key]
+                    all_modes[provider] = {
+                        "value": new_val,
+                        "type": "edit",
+                        "old": mode,
+                    }
+                else:
+                    all_modes[provider] = {"value": mode, "type": None, "old": None}
+            # Add pending new modes (additions)
+            for key in self.settings.get_pending_keys_by_pattern(prefix=prefix):
+                if self.settings.get_change_type(key) == "add":
+                    provider = key[len(prefix) :].lower()
+                    if provider not in all_modes:
+                        all_modes[provider] = {
+                            "value": self.settings.pending_changes[key],
+                            "type": "add",
+                            "old": None,
+                        }
+            if all_modes:
+                # Sort alphabetically
+                for provider in sorted(all_modes.keys()):
+                    info = all_modes[provider]
+                    mode = info["value"]
                     mode_display = (
                         f"[green]{mode}[/green]"
                         if mode == "sequential"
                         else f"[blue]{mode}[/blue]"
                     )
+                    old_display = None
+                    if info["old"]:
+                        old_display = (
+                            f"[green]{info['old']}[/green]"
+                            if info["old"] == "sequential"
+                            else f"[blue]{info['old']}[/blue]"
+                        )
+                    if info["type"] == "add":
+                        self.console.print(
+                            f"   [green]+ {provider:20} {mode_display}[/green]"
+                        )
+                    elif info["type"] == "edit":
+                        self.console.print(
+                            f"   [yellow]~ {provider:20} {old_display} → {mode_display}[/yellow]"
+                        )
+                    elif info["type"] == "remove":
+                        self.console.print(
+                            f"   [red]- {provider:20} {mode_display}[/red]"
+                        )
+                    else:
+                        default_mode = self.rotation_mgr.get_default_mode(provider)
+                        is_custom = mode != default_mode
+                        marker = "[yellow]*[/yellow]" if is_custom else " "
+                        self.console.print(f"  {marker}• {provider:20} {mode_display}")
             # Show providers with default modes
+            providers_with_defaults = [
+                p for p in available_providers if p not in modes and p not in all_modes
+            ]
             if providers_with_defaults:
                 self.console.print()
                 self.console.print("[dim]Providers using default modes:[/dim]")
                     self.rotation_mgr.set_mode(provider, new_mode)
                     self.console.print(
+                        f"\n[green]✅ Rotation mode for '{provider}' staged as {new_mode}![/green]"
                     )
                     input("\nPress Enter to continue...")
             elif choice == "2":
+                # Get resettable modes (existing + pending adds, excluding pending removes)
+                resettable = {
+                    k: v for k, v in all_modes.items() if v["type"] != "remove"
+                }
+                if not resettable:
                     self.console.print(
                         "\n[yellow]No custom rotation modes to reset[/yellow]"
                     )
                 self.console.print(
                     "\n[bold]Select provider to reset to default:[/bold]"
                 )
+                modes_list = sorted(resettable.keys())
                 for idx, prov in enumerate(modes_list, 1):
                     default_mode = self.rotation_mgr.get_default_mode(prov)
+                    info = resettable[prov]
+                    if info["type"] == "add":
+                        self.console.print(
+                            f"   {idx}. {prov} [green](pending add)[/green] - will cancel"
+                        )
+                    else:
+                        self.console.print(
+                            f"   {idx}. {prov} (will reset to: {default_mode})"
+                        )
                 choice_idx = IntPrompt.ask(
                     "Select option",
                 )
                 provider = modes_list[choice_idx - 1]
                 default_mode = self.rotation_mgr.get_default_mode(provider)
+                info = resettable[provider]
                 if Confirm.ask(f"Reset '{provider}' to default mode ({default_mode})?"):
+                    if info["type"] == "add":
+                        # Undo pending addition
+                        key = f"{prefix}{provider.upper()}"
+                        del self.settings.pending_changes[key]
+                        self.console.print(
+                            f"\n[green]✅ Pending mode for '{provider}' cancelled![/green]"
+                        )
+                    else:
+                        self.rotation_mgr.remove_mode(provider)
+                        self.console.print(
+                            f"\n[green]✅ Rotation mode for '{provider}' marked for reset to default ({default_mode})![/green]"
+                        )
                     input("\nPress Enter to continue...")
             elif choice == "3":
         while True:
             clear_screen()
+            # Get current limits from env
             limits = self.concurrency_mgr.get_current_limits()
             self.console.print(
             self.console.print("[bold]📋 Current Concurrency Settings[/bold]")
             self.console.print("━" * 70)
+            # Build combined view with pending changes
+            all_limits: Dict[str, Dict[str, Any]] = {}
+            prefix = "MAX_CONCURRENT_REQUESTS_PER_KEY_"
+            # Add current limits (from env)
+            for provider, limit in limits.items():
+                key = f"{prefix}{provider.upper()}"
+                change_type = self.settings.get_change_type(key)
+                if change_type == "remove":
+                    all_limits[provider] = {
+                        "value": str(limit),
+                        "type": "remove",
+                        "old": None,
+                    }
+                elif change_type == "edit":
+                    new_val = self.settings.pending_changes[key]
+                    all_limits[provider] = {
+                        "value": new_val,
+                        "type": "edit",
+                        "old": str(limit),
+                    }
+                else:
+                    all_limits[provider] = {
+                        "value": str(limit),
+                        "type": None,
+                        "old": None,
+                    }
+            # Add pending new limits (additions)
+            for key in self.settings.get_pending_keys_by_pattern(prefix=prefix):
+                if self.settings.get_change_type(key) == "add":
+                    provider = key[len(prefix) :].lower()
+                    if provider not in all_limits:
+                        all_limits[provider] = {
+                            "value": self.settings.pending_changes[key],
+                            "type": "add",
+                            "old": None,
+                        }
+            if all_limits:
+                # Sort alphabetically
+                for provider in sorted(all_limits.keys()):
+                    info = all_limits[provider]
+                    value_display = f"{info['value']} requests/key"
+                    old_display = f"{info['old']} requests/key" if info["old"] else None
+                    self.console.print(
+                        self._format_item(
+                            provider, value_display, info["type"], old_display
+                        )
+                    )
+                self.console.print("   • Default:        1 request/key (all others)")
             else:
                 self.console.print("   • Default:        1 request/key (all providers)")
                     if 1 <= limit <= 100:
                         self.concurrency_mgr.set_limit(provider, limit)
                         self.console.print(
+                            f"\n[green]✅ Concurrency limit staged for '{provider}': {limit} requests/key[/green]"
                         )
                     else:
                         self.console.print(
                     input("\nPress Enter to continue...")
             elif choice == "2":
+                # Get editable limits (existing + pending additions, excluding pending removals)
+                editable = {
+                    k: v for k, v in all_limits.items() if v["type"] != "remove"
+                }
+                if not editable:
                     self.console.print("\n[yellow]No limits to edit[/yellow]")
                     input("\nPress Enter to continue...")
                     continue
                 # Show numbered list
                 self.console.print("\n[bold]Select provider to edit:[/bold]")
+                limits_list = sorted(editable.keys())
                 for idx, prov in enumerate(limits_list, 1):
                     self.console.print(f"   {idx}. {prov}")
                     choices=[str(i) for i in range(1, len(limits_list) + 1)],
                 )
                 provider = limits_list[choice_idx - 1]
+                info = editable[provider]
+                current_limit = int(info["value"])
                 self.console.print(f"\nCurrent limit: {current_limit} requests/key")
                 new_limit = IntPrompt.ask(
                 input("\nPress Enter to continue...")
             elif choice == "3":
+                # Get removable limits (existing ones not already pending removal)
+                removable = {
+                    k: v
+                    for k, v in all_limits.items()
+                    if v["type"] != "remove" and v["type"] != "add"
+                }
+                # For pending additions, we can "undo" by removing from pending
+                pending_adds = {
+                    k: v for k, v in all_limits.items() if v["type"] == "add"
+                }
+                if not removable and not pending_adds:
                     self.console.print("\n[yellow]No limits to remove[/yellow]")
                     input("\nPress Enter to continue...")
                     continue
                 self.console.print(
                     "\n[bold]Select provider to remove limit from:[/bold]"
                 )
+                limits_list = sorted(removable.keys()) + sorted(pending_adds.keys())
                 for idx, prov in enumerate(limits_list, 1):
+                    if prov in pending_adds:
+                        self.console.print(
+                            f"   {idx}. {prov} [green](pending add)[/green]"
+                        )
+                    else:
+                        self.console.print(f"   {idx}. {prov}")
                 choice_idx = IntPrompt.ask(
                     "Select option",
                 if Confirm.ask(
                     f"Remove concurrency limit for '{provider}' (reset to default 1)?"
                 ):
+                    if provider in pending_adds:
+                        # Undo pending addition
+                        key = f"{prefix}{provider.upper()}"
+                        del self.settings.pending_changes[key]
+                        self.console.print(
+                            f"\n[green]✅ Pending limit for '{provider}' cancelled![/green]"
+                        )
+                    else:
+                        self.concurrency_mgr.remove_limit(provider)
+                        self.console.print(
+                            f"\n[green]✅ Limit marked for removal for '{provider}'[/green]"
+                        )
                     input("\nPress Enter to continue...")
             elif choice == "4":
                 break
+    def _show_changes_summary(self):
+        """Display categorized summary of all pending changes."""
+        self.console.print(
+            Panel.fit(
+                "[bold cyan]📋 Pending Changes Summary[/bold cyan]",
+                border_style="cyan",
+            )
+        )
+        self.console.print()
+        # Define categories with their key patterns
+        categories = [
+            ("Custom Provider API Bases", "_API_BASE", "suffix"),
+            ("Model Definitions", "_MODELS", "suffix"),
+            ("Concurrency Limits", "MAX_CONCURRENT_REQUESTS_PER_KEY_", "prefix"),
+            ("Rotation Modes", "ROTATION_MODE_", "prefix"),
+            ("Priority Multipliers", "CONCURRENCY_MULTIPLIER_", "prefix"),
+        ]
+        # Get provider-specific settings keys
+        provider_settings_keys = set()
+        for provider_settings in PROVIDER_SETTINGS_MAP.values():
+            provider_settings_keys.update(provider_settings.keys())
+        changes = self.settings.get_changes_summary()
+        displayed_keys = set()
+        for category_name, pattern, pattern_type in categories:
+            category_changes = {"add": [], "edit": [], "remove": []}
+            for change_type in ["add", "edit", "remove"]:
+                for key, old_val, new_val in changes[change_type]:
+                    matches = False
+                    if pattern_type == "suffix" and key.endswith(pattern):
+                        matches = True
+                    elif pattern_type == "prefix" and key.startswith(pattern):
+                        matches = True
+                    if matches:
+                        category_changes[change_type].append((key, old_val, new_val))
+                        displayed_keys.add(key)
+            # Check if this category has any changes
+            has_changes = any(category_changes[t] for t in ["add", "edit", "remove"])
+            if has_changes:
+                self.console.print(f"[bold]{category_name}:[/bold]")
+                # Sort: additions, modifications, removals (alphabetically within each)
+                for change_type in ["add", "edit", "remove"]:
+                    for key, old_val, new_val in sorted(
+                        category_changes[change_type], key=lambda x: x[0]
+                    ):
+                        if change_type == "add":
+                            self.console.print(f"  [green]+ {key} = {new_val}[/green]")
+                        elif change_type == "edit":
+                            self.console.print(
+                                f"  [yellow]~ {key}: {old_val} → {new_val}[/yellow]"
+                            )
+                        else:
+                            self.console.print(f"  [red]- {key}[/red]")
+                self.console.print()
+        # Handle provider-specific settings that don't match the patterns above
+        provider_changes = {"add": [], "edit": [], "remove": []}
+        for change_type in ["add", "edit", "remove"]:
+            for key, old_val, new_val in changes[change_type]:
+                if key not in displayed_keys and key in provider_settings_keys:
+                    provider_changes[change_type].append((key, old_val, new_val))
+        has_provider_changes = any(
+            provider_changes[t] for t in ["add", "edit", "remove"]
+        )
+        if has_provider_changes:
+            self.console.print("[bold]Provider-Specific Settings:[/bold]")
+            for change_type in ["add", "edit", "remove"]:
+                for key, old_val, new_val in sorted(
+                    provider_changes[change_type], key=lambda x: x[0]
+                ):
+                    if change_type == "add":
+                        self.console.print(f"  [green]+ {key} = {new_val}[/green]")
+                    elif change_type == "edit":
+                        self.console.print(
+                            f"  [yellow]~ {key}: {old_val} → {new_val}[/yellow]"
+                        )
+                    else:
+                        self.console.print(f"  [red]- {key}[/red]")
+            self.console.print()
+        self.console.print("━" * 70)
     def save_and_exit(self):
         """Save pending changes and exit"""
         if self.settings.has_pending():
+            clear_screen()
+            self._show_changes_summary()
             if Confirm.ask("\n[bold yellow]Save all pending changes?[/bold yellow]"):
                 self.settings.save()
                 self.console.print("\n[green]✅ All changes saved to .env![/green]")
     def exit_without_saving(self):
         """Exit without saving"""
         if self.settings.has_pending():
+            clear_screen()
+            self._show_changes_summary()
             if Confirm.ask("\n[bold red]Discard all pending changes?[/bold red]"):
                 self.settings.discard()
                 self.console.print("\n[yellow]Changes discarded[/yellow]")
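Review note on `_show_changes_summary`: the grouping step can be read in isolation from the Rich output. The sketch below is a minimal, standalone approximation, assuming `get_changes_summary()` returns a dict with `add`, `edit` and `remove` lists of `(key, old_val, new_val)` tuples, which is how the diff iterates over it. The function name `group_changes` and the sample values are hypothetical and not part of the commit.

```python
from typing import Dict, List, Optional, Tuple

# (key, old_value, new_value), mirroring the tuples iterated in the diff
Change = Tuple[str, Optional[str], Optional[str]]

# Category table copied from the diff: (display name, pattern, pattern type)
CATEGORIES = [
    ("Custom Provider API Bases", "_API_BASE", "suffix"),
    ("Model Definitions", "_MODELS", "suffix"),
    ("Concurrency Limits", "MAX_CONCURRENT_REQUESTS_PER_KEY_", "prefix"),
    ("Rotation Modes", "ROTATION_MODE_", "prefix"),
    ("Priority Multipliers", "CONCURRENCY_MULTIPLIER_", "prefix"),
]


def group_changes(
    changes: Dict[str, List[Change]],
) -> Dict[str, Dict[str, List[Change]]]:
    """Hypothetical helper: bucket pending changes by category and change type."""
    grouped: Dict[str, Dict[str, List[Change]]] = {}
    for name, pattern, pattern_type in CATEGORIES:
        bucket: Dict[str, List[Change]] = {"add": [], "edit": [], "remove": []}
        for change_type in ("add", "edit", "remove"):
            for change in changes.get(change_type, []):
                key = change[0]
                if pattern_type == "suffix":
                    matches = key.endswith(pattern)
                else:
                    matches = key.startswith(pattern)
                if matches:
                    bucket[change_type].append(change)
            # Alphabetical order within each change type, as in the diff
            bucket[change_type].sort(key=lambda c: c[0])
        # Only categories that actually have changes are kept
        if any(bucket.values()):
            grouped[name] = bucket
    return grouped


# Made-up sample data for illustration only
pending = {
    "add": [("MAX_CONCURRENT_REQUESTS_PER_KEY_OPENAI", None, "5")],
    "edit": [("ROTATION_MODE_GEMINI", "balanced", "sequential")],
    "remove": [("MYPROVIDER_API_BASE", "http://localhost:8000/v1", None)],
}
grouped = group_changes(pending)
```

As in the diff, a key that matches more than one pattern would appear under each matching category, because the category loop does not consult `displayed_keys` before appending.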

src/rotator_library/client.py CHANGED

@@ -10,6 +10,7 @@ import litellm
 from litellm.exceptions import APIConnectionError
 from litellm.litellm_core_utils.token_counter import token_counter
 import logging
 from typing import List, Dict, Any, AsyncGenerator, Optional, Union
 lib_logger = logging.getLogger("rotator_library")
@@ -19,7 +20,7 @@ lib_logger = logging.getLogger("rotator_library")
 lib_logger.propagate = False
 from .usage_manager import UsageManager
-from .failure_logger import log_failure
 from .error_handler import (
     PreRequestCallbackError,
     classify_error,
@@ -37,6 +38,7 @@ from .cooldown_manager import CooldownManager
 from .credential_manager import CredentialManager
 from .background_refresher import BackgroundRefresher
 from .model_definitions import ModelDefinitions
 class StreamedAPIError(Exception):
@@ -58,7 +60,7 @@ class RotatingClient:
         api_keys: Optional[Dict[str, List[str]]] = None,
         oauth_credentials: Optional[Dict[str, List[str]]] = None,
         max_retries: int = 2,
-        usage_file_path: str = "key_usage.json",
         configure_logging: bool = True,
         global_timeout: int = 30,
         abort_on_callback_error: bool = True,
@@ -68,6 +70,7 @@ class RotatingClient:
         enable_request_logging: bool = False,
         max_concurrent_requests_per_key: Optional[Dict[str, int]] = None,
         rotation_tolerance: float = 3.0,
     ):
         """
         Initialize the RotatingClient with intelligent credential rotation.
@@ -76,7 +79,7 @@ class RotatingClient:
             api_keys: Dictionary mapping provider names to lists of API keys
             oauth_credentials: Dictionary mapping provider names to OAuth credential paths
             max_retries: Maximum number of retry attempts per credential
-            usage_file_path: Path to store usage statistics
             configure_logging: Whether to configure library logging
             global_timeout: Global timeout for requests in seconds
             abort_on_callback_error: Whether to abort on pre-request callback errors
@@ -89,7 +92,18 @@ class RotatingClient:
                 - 0.0: Deterministic, least-used credential always selected
                 - 2.0 - 4.0 (default, recommended): Balanced randomness, can pick credentials within 2 uses of max
                 - 5.0+: High randomness, more unpredictable selection patterns
         """
         os.environ["LITELLM_LOG"] = "ERROR"
         litellm.set_verbose = False
         litellm.drop_params = True
@@ -124,7 +138,9 @@ class RotatingClient:
         if oauth_credentials:
             self.oauth_credentials = oauth_credentials
         else:
-            self.credential_manager = CredentialManager(os.environ)
             self.oauth_credentials = self.credential_manager.discover_and_prepare()
         self.background_refresher = BackgroundRefresher(self)
         self.oauth_providers = set(self.oauth_credentials.keys())
@@ -242,8 +258,14 @@ class RotatingClient:
                 f"Provider '{provider}' sequential fallback multiplier: {fallback}x"
             )
         self.usage_manager = UsageManager(
-            file_path=usage_file_path,
             rotation_tolerance=rotation_tolerance,
             provider_rotation_modes=provider_rotation_modes,
             provider_plugins=PROVIDER_PLUGINS,

 from litellm.exceptions import APIConnectionError
 from litellm.litellm_core_utils.token_counter import token_counter
 import logging
+from pathlib import Path
 from typing import List, Dict, Any, AsyncGenerator, Optional, Union
 lib_logger = logging.getLogger("rotator_library")
 lib_logger.propagate = False
 from .usage_manager import UsageManager
+from .failure_logger import log_failure, configure_failure_logger
 from .error_handler import (
     PreRequestCallbackError,
     classify_error,
 from .credential_manager import CredentialManager
 from .background_refresher import BackgroundRefresher
 from .model_definitions import ModelDefinitions
+from .utils.paths import get_default_root, get_logs_dir, get_oauth_dir, get_data_file
 class StreamedAPIError(Exception):
         api_keys: Optional[Dict[str, List[str]]] = None,
         oauth_credentials: Optional[Dict[str, List[str]]] = None,
         max_retries: int = 2,
+        usage_file_path: Optional[Union[str, Path]] = None,
         configure_logging: bool = True,
         global_timeout: int = 30,
         abort_on_callback_error: bool = True,
         enable_request_logging: bool = False,
         max_concurrent_requests_per_key: Optional[Dict[str, int]] = None,
         rotation_tolerance: float = 3.0,
+        data_dir: Optional[Union[str, Path]] = None,
     ):
         """
         Initialize the RotatingClient with intelligent credential rotation.
             api_keys: Dictionary mapping provider names to lists of API keys
             oauth_credentials: Dictionary mapping provider names to OAuth credential paths
             max_retries: Maximum number of retry attempts per credential
+            usage_file_path: Path to store usage statistics. If None, uses data_dir/key_usage.json
             configure_logging: Whether to configure library logging
             global_timeout: Global timeout for requests in seconds
             abort_on_callback_error: Whether to abort on pre-request callback errors
                 - 0.0: Deterministic, least-used credential always selected
                 - 2.0 - 4.0 (default, recommended): Balanced randomness, can pick credentials within 2 uses of max
                 - 5.0+: High randomness, more unpredictable selection patterns
+            data_dir: Root directory for all data files (logs, cache, oauth_creds, key_usage.json).
+                      If None, auto-detects: EXE directory if frozen, else current working directory.
         """
+        # Resolve data_dir early - this becomes the root for all file operations
+        if data_dir is not None:
+            self.data_dir = Path(data_dir).resolve()
+        else:
+            self.data_dir = get_default_root()
+        # Configure failure logger to use correct logs directory
+        configure_failure_logger(get_logs_dir(self.data_dir))
         os.environ["LITELLM_LOG"] = "ERROR"
         litellm.set_verbose = False
         litellm.drop_params = True
         if oauth_credentials:
             self.oauth_credentials = oauth_credentials
         else:
+            self.credential_manager = CredentialManager(
+                os.environ, oauth_dir=get_oauth_dir(self.data_dir)
+            )
             self.oauth_credentials = self.credential_manager.discover_and_prepare()
         self.background_refresher = BackgroundRefresher(self)
         self.oauth_providers = set(self.oauth_credentials.keys())
                 f"Provider '{provider}' sequential fallback multiplier: {fallback}x"
             )
+        # Resolve usage file path - use provided path or default to data_dir
+        if usage_file_path is not None:
+            resolved_usage_path = Path(usage_file_path)
+        else:
+            resolved_usage_path = self.data_dir / "key_usage.json"
         self.usage_manager = UsageManager(
+            file_path=resolved_usage_path,
             rotation_tolerance=rotation_tolerance,
             provider_rotation_modes=provider_rotation_modes,
             provider_plugins=PROVIDER_PLUGINS,
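Review note on the new `data_dir` parameter: the path resolution added to `RotatingClient.__init__` can be sketched on its own. This is a rough approximation under stated assumptions. `utils/paths.py` is not shown in this part of the diff, so `default_root()` below is a hypothetical stand-in for `get_default_root()`, written only from the docstring ("EXE directory if frozen, else current working directory"). `resolve_paths` is likewise a name invented for illustration.

```python
import sys
from pathlib import Path
from typing import Optional, Tuple, Union

PathLike = Union[str, Path]


def default_root() -> Path:
    """Hypothetical stand-in for get_default_root(), based on the docstring only."""
    if getattr(sys, "frozen", False):
        # Frozen executables (e.g. PyInstaller builds) set sys.frozen
        return Path(sys.executable).resolve().parent
    return Path.cwd()


def resolve_paths(
    data_dir: Optional[PathLike] = None,
    usage_file_path: Optional[PathLike] = None,
) -> Tuple[Path, Path]:
    """Return (data root, usage file path) following the precedence in the diff."""
    # An explicit data_dir wins; otherwise fall back to the auto-detected root
    root = Path(data_dir).resolve() if data_dir is not None else default_root()
    # An explicit usage_file_path wins; otherwise default to <root>/key_usage.json
    if usage_file_path is not None:
        usage = Path(usage_file_path)
    else:
        usage = root / "key_usage.json"
    return root, usage
```

One detail worth noticing when reading the diff: an explicit `usage_file_path` is wrapped in `Path(...)` but not joined to `data_dir`, so a relative value stays relative to the process working directory rather than to the data root.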

src/rotator_library/credential_manager.py CHANGED

@@ -3,12 +3,11 @@ import re
 import shutil
 import logging
 from pathlib import Path
-from typing import Dict, List, Optional, Set
-lib_logger = logging.getLogger('rotator_library')
-OAUTH_BASE_DIR = Path.cwd() / "oauth_creds"
-OAUTH_BASE_DIR.mkdir(exist_ok=True)
 # Standard directories where tools like `gemini login` store credentials.
 DEFAULT_OAUTH_DIRS = {
@@ -33,38 +32,53 @@ class CredentialManager:
     """
     Discovers OAuth credential files from standard locations, copies them locally,
     and updates the configuration to use the local paths.
     Also discovers environment variable-based OAuth credentials for stateless deployments.
     Supports two env var formats:
     1. Single credential (legacy): PROVIDER_ACCESS_TOKEN, PROVIDER_REFRESH_TOKEN
     2. Multiple credentials (numbered): PROVIDER_1_ACCESS_TOKEN, PROVIDER_2_ACCESS_TOKEN, etc.
     When env-based credentials are detected, virtual paths like "env://provider/1" are created.
     """
-    def __init__(self, env_vars: Dict[str, str]):
         self.env_vars = env_vars
     def _discover_env_oauth_credentials(self) -> Dict[str, List[str]]:
         """
         Discover OAuth credentials defined via environment variables.
         Supports two formats:
         1. Single credential: ANTIGRAVITY_ACCESS_TOKEN + ANTIGRAVITY_REFRESH_TOKEN
         2. Multiple credentials: ANTIGRAVITY_1_ACCESS_TOKEN + ANTIGRAVITY_1_REFRESH_TOKEN, etc.
         Returns:
             Dict mapping provider name to list of virtual paths (e.g., "env://antigravity/1")
         """
         env_credentials: Dict[str, Set[str]] = {}
         for provider, env_prefix in ENV_OAUTH_PROVIDERS.items():
             found_indices: Set[str] = set()
             # Check for numbered credentials (PROVIDER_N_ACCESS_TOKEN pattern)
             # Pattern: ANTIGRAVITY_1_ACCESS_TOKEN, ANTIGRAVITY_2_ACCESS_TOKEN, etc.
             numbered_pattern = re.compile(rf"^{env_prefix}_(\d+)_ACCESS_TOKEN$")
             for key in self.env_vars.keys():
                 match = numbered_pattern.match(key)
                 if match:
@@ -73,28 +87,34 @@ class CredentialManager:
                     refresh_key = f"{env_prefix}_{index}_REFRESH_TOKEN"
                     if refresh_key in self.env_vars and self.env_vars[refresh_key]:
                         found_indices.add(index)
             # Check for legacy single credential (PROVIDER_ACCESS_TOKEN pattern)
             # Only use this if no numbered credentials exist
             if not found_indices:
                 access_key = f"{env_prefix}_ACCESS_TOKEN"
                 refresh_key = f"{env_prefix}_REFRESH_TOKEN"
-                if (access_key in self.env_vars and self.env_vars[access_key] and
-                    refresh_key in self.env_vars and self.env_vars[refresh_key]):
                     # Use "0" as the index for legacy single credential
                     found_indices.add("0")
             if found_indices:
                 env_credentials[provider] = found_indices
-                lib_logger.info(f"Found {len(found_indices)} env-based credential(s) for {provider}")
         # Convert to virtual paths
         result: Dict[str, List[str]] = {}
         for provider, indices in env_credentials.items():
             # Sort indices numerically for consistent ordering
             sorted_indices = sorted(indices, key=lambda x: int(x))
             result[provider] = [f"env://{provider}/{idx}" for idx in sorted_indices]
         return result
     def discover_and_prepare(self) -> Dict[str, List[str]]:
@@ -105,7 +125,9 @@ class CredentialManager:
         # These take priority for stateless deployments
         env_oauth_creds = self._discover_env_oauth_credentials()
         for provider, virtual_paths in env_oauth_creds.items():
-            lib_logger.info(f"Using {len(virtual_paths)} env-based credential(s) for {provider}")
             final_config[provider] = virtual_paths
         # Extract OAuth file paths from environment variables
@@ -115,21 +137,29 @@ class CredentialManager:
                 provider = key.split("_OAUTH_")[0].lower()
                 if provider not in env_oauth_paths:
                     env_oauth_paths[provider] = []
-                if value: # Only consider non-empty values
                     env_oauth_paths[provider].append(value)
         # PHASE 2: Discover file-based OAuth credentials
         for provider, default_dir in DEFAULT_OAUTH_DIRS.items():
             # Skip if already discovered from environment variables
             if provider in final_config:
-                lib_logger.debug(f"Skipping file discovery for {provider} - using env-based credentials")
                 continue
             # Check for existing local credentials first. If found, use them and skip discovery.
-            local_provider_creds = sorted(list(OAUTH_BASE_DIR.glob(f"{provider}_oauth_*.json")))
             if local_provider_creds:
-                lib_logger.info(f"Found {len(local_provider_creds)} existing local credential(s) for {provider}. Skipping discovery.")
-                final_config[provider] = [str(p.resolve()) for p in local_provider_creds]
                 continue
             # If no local credentials exist, proceed with a one-time discovery and copy.
@@ -140,13 +170,13 @@ class CredentialManager:
                 path = Path(path_str).expanduser()
                 if path.exists():
                     discovered_paths.add(path)
             # 2. If no overrides are provided via .env, scan the default directory
             # [MODIFIED] This logic is now disabled to prefer local-first credential management.
             # if not discovered_paths and default_dir.exists():
             #     for json_file in default_dir.glob('*.json'):
             #         discovered_paths.add(json_file)
             if not discovered_paths:
                 lib_logger.debug(f"No credential files found for provider: {provider}")
                 continue
@@ -156,18 +186,24 @@ class CredentialManager:
             for i, source_path in enumerate(sorted(list(discovered_paths))):
                 account_id = i + 1
                 local_filename = f"{provider}_oauth_{account_id}.json"
-                local_path = OAUTH_BASE_DIR / local_filename
                 try:
                     # Since we've established no local files exist, we can copy directly.
                     shutil.copy(source_path, local_path)
-                    lib_logger.info(f"Copied '{source_path.name}' to local pool at '{local_path}'.")
                     prepared_paths.append(str(local_path.resolve()))
                 except Exception as e:
-                    lib_logger.error(f"Failed to process OAuth file from '{source_path}': {e}")
             if prepared_paths:
-                lib_logger.info(f"Discovered and prepared {len(prepared_paths)} credential(s) for provider: {provider}")
                 final_config[provider] = prepared_paths
         lib_logger.info("OAuth credential discovery complete.")

 import shutil
 import logging
 from pathlib import Path
+from typing import Dict, List, Optional, Set, Union
+from .utils.paths import get_oauth_dir
+lib_logger = logging.getLogger("rotator_library")
 # Standard directories where tools like `gemini login` store credentials.
 DEFAULT_OAUTH_DIRS = {
     """
     Discovers OAuth credential files from standard locations, copies them locally,
     and updates the configuration to use the local paths.
     Also discovers environment variable-based OAuth credentials for stateless deployments.
     Supports two env var formats:
     1. Single credential (legacy): PROVIDER_ACCESS_TOKEN, PROVIDER_REFRESH_TOKEN
     2. Multiple credentials (numbered): PROVIDER_1_ACCESS_TOKEN, PROVIDER_2_ACCESS_TOKEN, etc.
     When env-based credentials are detected, virtual paths like "env://provider/1" are created.
     """
+    def __init__(
+        self,
+        env_vars: Dict[str, str],
+        oauth_dir: Optional[Union[Path, str]] = None,
+    ):
+        """
+        Initialize the CredentialManager.
+        Args:
+            env_vars: Dictionary of environment variables (typically os.environ).
+            oauth_dir: Directory for storing OAuth credentials.
+                       If None, uses get_oauth_dir() which respects EXE vs script mode.
+        """
         self.env_vars = env_vars
+        self.oauth_base_dir = Path(oauth_dir) if oauth_dir else get_oauth_dir()
+        self.oauth_base_dir.mkdir(parents=True, exist_ok=True)
     def _discover_env_oauth_credentials(self) -> Dict[str, List[str]]:
         """
         Discover OAuth credentials defined via environment variables.
         Supports two formats:
         1. Single credential: ANTIGRAVITY_ACCESS_TOKEN + ANTIGRAVITY_REFRESH_TOKEN
         2. Multiple credentials: ANTIGRAVITY_1_ACCESS_TOKEN + ANTIGRAVITY_1_REFRESH_TOKEN, etc.
         Returns:
             Dict mapping provider name to list of virtual paths (e.g., "env://antigravity/1")
         """
         env_credentials: Dict[str, Set[str]] = {}
         for provider, env_prefix in ENV_OAUTH_PROVIDERS.items():
             found_indices: Set[str] = set()
             # Check for numbered credentials (PROVIDER_N_ACCESS_TOKEN pattern)
             # Pattern: ANTIGRAVITY_1_ACCESS_TOKEN, ANTIGRAVITY_2_ACCESS_TOKEN, etc.
             numbered_pattern = re.compile(rf"^{env_prefix}_(\d+)_ACCESS_TOKEN$")
             for key in self.env_vars.keys():
                 match = numbered_pattern.match(key)
                 if match:
                     refresh_key = f"{env_prefix}_{index}_REFRESH_TOKEN"
                     if refresh_key in self.env_vars and self.env_vars[refresh_key]:
                         found_indices.add(index)
             # Check for legacy single credential (PROVIDER_ACCESS_TOKEN pattern)
             # Only use this if no numbered credentials exist
             if not found_indices:
                 access_key = f"{env_prefix}_ACCESS_TOKEN"
                 refresh_key = f"{env_prefix}_REFRESH_TOKEN"
+                if (
+                    access_key in self.env_vars
+                    and self.env_vars[access_key]
+                    and refresh_key in self.env_vars
+                    and self.env_vars[refresh_key]
+                ):
                     # Use "0" as the index for legacy single credential
                     found_indices.add("0")
             if found_indices:
                 env_credentials[provider] = found_indices
+                lib_logger.info(
+                    f"Found {len(found_indices)} env-based credential(s) for {provider}"
+                )
         # Convert to virtual paths
         result: Dict[str, List[str]] = {}
         for provider, indices in env_credentials.items():
             # Sort indices numerically for consistent ordering
             sorted_indices = sorted(indices, key=lambda x: int(x))
             result[provider] = [f"env://{provider}/{idx}" for idx in sorted_indices]
         return result
     def discover_and_prepare(self) -> Dict[str, List[str]]:
         # These take priority for stateless deployments
         env_oauth_creds = self._discover_env_oauth_credentials()
         for provider, virtual_paths in env_oauth_creds.items():
+            lib_logger.info(
+                f"Using {len(virtual_paths)} env-based credential(s) for {provider}"
+            )
             final_config[provider] = virtual_paths
         # Extract OAuth file paths from environment variables
                 provider = key.split("_OAUTH_")[0].lower()
                 if provider not in env_oauth_paths:
                     env_oauth_paths[provider] = []
+                if value:  # Only consider non-empty values
                     env_oauth_paths[provider].append(value)
         # PHASE 2: Discover file-based OAuth credentials
         for provider, default_dir in DEFAULT_OAUTH_DIRS.items():
             # Skip if already discovered from environment variables
             if provider in final_config:
+                lib_logger.debug(
+                    f"Skipping file discovery for {provider} - using env-based credentials"
+                )
                 continue
             # Check for existing local credentials first. If found, use them and skip discovery.
+            local_provider_creds = sorted(
+                list(self.oauth_base_dir.glob(f"{provider}_oauth_*.json"))
+            )
             if local_provider_creds:
+                lib_logger.info(
+                    f"Found {len(local_provider_creds)} existing local credential(s) for {provider}. Skipping discovery."
+                )
+                final_config[provider] = [
+                    str(p.resolve()) for p in local_provider_creds
+                ]
                 continue
             # If no local credentials exist, proceed with a one-time discovery and copy.
                 path = Path(path_str).expanduser()
                 if path.exists():
                     discovered_paths.add(path)
             # 2. If no overrides are provided via .env, scan the default directory
             # [MODIFIED] This logic is now disabled to prefer local-first credential management.
             # if not discovered_paths and default_dir.exists():
             #     for json_file in default_dir.glob('*.json'):
             #         discovered_paths.add(json_file)
             if not discovered_paths:
                 lib_logger.debug(f"No credential files found for provider: {provider}")
                 continue
             for i, source_path in enumerate(sorted(list(discovered_paths))):
                 account_id = i + 1
                 local_filename = f"{provider}_oauth_{account_id}.json"
+                local_path = self.oauth_base_dir / local_filename
                 try:
                     # Since we've established no local files exist, we can copy directly.
                     shutil.copy(source_path, local_path)
+                    lib_logger.info(
+                        f"Copied '{source_path.name}' to local pool at '{local_path}'."
+                    )
                     prepared_paths.append(str(local_path.resolve()))
                 except Exception as e:
+                    lib_logger.error(
+                        f"Failed to process OAuth file from '{source_path}': {e}"
+                    )
             if prepared_paths:
+                lib_logger.info(
+                    f"Discovered and prepared {len(prepared_paths)} credential(s) for provider: {provider}"
+                )
                 final_config[provider] = prepared_paths
         lib_logger.info("OAuth credential discovery complete.")
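Review note on `_discover_env_oauth_credentials`: the two supported env var formats (numbered and legacy) and the resulting virtual paths can be illustrated with a single-provider sketch. It is a simplified approximation, not the method from the commit. The function name `discover_env_credentials` is hypothetical, the hunk boundary hides how `index` is extracted (assumed here to be `match.group(1)`), and the non-empty check on the numbered access token is an assumption as well.

```python
import re
from typing import Dict, List, Set


def discover_env_credentials(
    env_vars: Dict[str, str], provider: str, env_prefix: str
) -> List[str]:
    """Hypothetical single-provider version of the env credential discovery."""
    found: Set[str] = set()

    # Numbered format: PREFIX_<N>_ACCESS_TOKEN paired with PREFIX_<N>_REFRESH_TOKEN
    numbered = re.compile(rf"^{re.escape(env_prefix)}_(\d+)_ACCESS_TOKEN$")
    for key, value in env_vars.items():
        match = numbered.match(key)
        if match and value:  # assumption: empty access tokens are ignored
            index = match.group(1)
            if env_vars.get(f"{env_prefix}_{index}_REFRESH_TOKEN"):
                found.add(index)

    # Legacy format: only considered when no numbered credentials exist
    if not found:
        if env_vars.get(f"{env_prefix}_ACCESS_TOKEN") and env_vars.get(
            f"{env_prefix}_REFRESH_TOKEN"
        ):
            found.add("0")  # "0" marks the legacy single credential

    # Numeric sort so that "10" comes after "2"
    return [f"env://{provider}/{idx}" for idx in sorted(found, key=int)]


# Made-up environment for illustration only
example_env = {
    "ANTIGRAVITY_10_ACCESS_TOKEN": "a10",
    "ANTIGRAVITY_10_REFRESH_TOKEN": "r10",
    "ANTIGRAVITY_2_ACCESS_TOKEN": "a2",
    "ANTIGRAVITY_2_REFRESH_TOKEN": "r2",
    "ANTIGRAVITY_3_ACCESS_TOKEN": "a3",  # no refresh token, so skipped
    "ANTIGRAVITY_ACCESS_TOKEN": "legacy",  # ignored: numbered credentials exist
    "ANTIGRAVITY_REFRESH_TOKEN": "legacy",
}
paths = discover_env_credentials(example_env, "antigravity", "ANTIGRAVITY")
```

Sorting with `key=int` matters once more than nine credentials are configured, since a plain string sort would place `"10"` before `"2"`.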

src/rotator_library/credential_tool.py CHANGED

@@ -3,22 +3,31 @@
 import asyncio
 import json
 import os
-import re
 import time
 from pathlib import Path
 from dotenv import set_key, get_key
-# NOTE: Heavy imports (provider_factory, PROVIDER_PLUGINS) are deferred
 # to avoid 6-7 second delay before showing loading screen
 from rich.console import Console
 from rich.panel import Panel
 from rich.prompt import Prompt
 from rich.text import Text
-OAUTH_BASE_DIR = Path.cwd() / "oauth_creds"
-OAUTH_BASE_DIR.mkdir(exist_ok=True)
-# Use a direct path to the .env file in the project root
-ENV_FILE = Path.cwd() / ".env"
 console = Console()
@@ -26,12 +35,14 @@ console = Console()
 _provider_factory = None
 _provider_plugins = None
 def _ensure_providers_loaded():
     """Lazy load provider modules only when needed"""
     global _provider_factory, _provider_plugins
     if _provider_factory is None:
         from . import provider_factory as pf
         from .providers import PROVIDER_PLUGINS as pp
         _provider_factory = pf
         _provider_plugins = pp
     return _provider_factory, _provider_plugins
@@ -39,99 +50,34 @@ def _ensure_providers_loaded():
 def clear_screen():
     """
-    Cross-platform terminal clear that works robustly on both
     classic Windows conhost and modern terminals (Windows Terminal, Linux, Mac).
     Uses native OS commands instead of ANSI escape sequences:
     - Windows (conhost & Windows Terminal): cls
     - Unix-like systems (Linux, Mac): clear
     """
-    os.system('cls' if os.name == 'nt' else 'clear')
-def _get_credential_number_from_filename(filename: str) -> int:
-    """
-    Extract credential number from filename like 'provider_oauth_1.json' -> 1
-    """
-    match = re.search(r'_oauth_(\d+)\.json$', filename)
-    if match:
-        return int(match.group(1))
-    return 1
-def _build_env_export_content(
-    provider_prefix: str,
-    cred_number: int,
-    creds: dict,
-    email: str,
-    extra_fields: dict = None,
-    include_client_creds: bool = True
-) -> tuple[list[str], str]:
-    """
-    Build .env content for OAuth credential export with numbered format.
-    Exports all fields from the JSON file as a 1-to-1 mirror.
-    Args:
-        provider_prefix: Environment variable prefix (e.g., "ANTIGRAVITY", "GEMINI_CLI")
-        cred_number: Credential number for this export (1, 2, 3, etc.)
-        creds: The credential dictionary loaded from JSON
-        email: User email for comments
-        extra_fields: Optional dict of additional fields to include
-        include_client_creds: Whether to include client_id/secret (Google OAuth providers)
-    Returns:
-        Tuple of (env_lines list, numbered_prefix string for display)
-    """
-    # Use numbered format: PROVIDER_N_ACCESS_TOKEN
-    numbered_prefix = f"{provider_prefix}_{cred_number}"
-    env_lines = [
-        f"# {provider_prefix} Credential #{cred_number} for: {email}",
-        f"# Exported from: {provider_prefix.lower()}_oauth_{cred_number}.json",
-        f"# Generated at: {time.strftime('%Y-%m-%d %H:%M:%S')}",
-        f"# ",
-        f"# To combine multiple credentials into one .env file, copy these lines",
-        f"# and ensure each credential has a unique number (1, 2, 3, etc.)",
-        "",
-        f"{numbered_prefix}_ACCESS_TOKEN={creds.get('access_token', '')}",
-        f"{numbered_prefix}_REFRESH_TOKEN={creds.get('refresh_token', '')}",
-        f"{numbered_prefix}_SCOPE={creds.get('scope', '')}",
-        f"{numbered_prefix}_TOKEN_TYPE={creds.get('token_type', 'Bearer')}",
-        f"{numbered_prefix}_ID_TOKEN={creds.get('id_token', '')}",
-        f"{numbered_prefix}_EXPIRY_DATE={creds.get('expiry_date', 0)}",
-    ]
-    if include_client_creds:
-        env_lines.extend([
-            f"{numbered_prefix}_CLIENT_ID={creds.get('client_id', '')}",
-            f"{numbered_prefix}_CLIENT_SECRET={creds.get('client_secret', '')}",
-            f"{numbered_prefix}_TOKEN_URI={creds.get('token_uri', 'https://oauth2.googleapis.com/token')}",
-            f"{numbered_prefix}_UNIVERSE_DOMAIN={creds.get('universe_domain', 'googleapis.com')}",
-        ])
-    env_lines.append(f"{numbered_prefix}_EMAIL={email}")
-    # Add extra provider-specific fields
-    if extra_fields:
-        for key, value in extra_fields.items():
-            if value:  # Only add non-empty values
-                env_lines.append(f"{numbered_prefix}_{key}={value}")
-    return env_lines, numbered_prefix
 def ensure_env_defaults():
     """
     Ensures the .env file exists and contains essential default values like PROXY_API_KEY.
     """
-    if not ENV_FILE.is_file():
-        ENV_FILE.touch()
-        console.print(f"Creating a new [bold yellow]{ENV_FILE.name}[/bold yellow] file...")
     # Check for PROXY_API_KEY, similar to setup_env.bat
-    if get_key(str(ENV_FILE), "PROXY_API_KEY") is None:
         default_key = "VerysecretKey"
-        console.print(f"Adding default [bold cyan]PROXY_API_KEY[/bold cyan] to [bold yellow]{ENV_FILE.name}[/bold yellow]...")
-        set_key(str(ENV_FILE), "PROXY_API_KEY", default_key)
 async def setup_api_key():
     """
@@ -144,41 +90,74 @@ async def setup_api_key():
     # Verified list of LiteLLM providers with their friendly names and API key variables
     LITELLM_PROVIDERS = {
-        "OpenAI": "OPENAI_API_KEY", "Anthropic": "ANTHROPIC_API_KEY",
-        "Google AI Studio (Gemini)": "GEMINI_API_KEY", "Azure OpenAI": "AZURE_API_KEY",
-        "Vertex AI": "GOOGLE_API_KEY", "AWS Bedrock": "AWS_ACCESS_KEY_ID",
-        "Cohere": "COHERE_API_KEY", "Chutes": "CHUTES_API_KEY",
         "Mistral AI": "MISTRAL_API_KEY",
-        "Codestral (Mistral)": "CODESTRAL_API_KEY", "Groq": "GROQ_API_KEY",
-        "Perplexity": "PERPLEXITYAI_API_KEY", "xAI": "XAI_API_KEY",
-        "Together AI": "TOGETHERAI_API_KEY", "Fireworks AI": "FIREWORKS_AI_API_KEY",
-        "Replicate": "REPLICATE_API_KEY", "Hugging Face": "HUGGINGFACE_API_KEY",
-        "Anyscale": "ANYSCALE_API_KEY", "NVIDIA NIM": "NVIDIA_NIM_API_KEY",
-        "Deepseek": "DEEPSEEK_API_KEY", "AI21": "AI21_API_KEY",
-        "Cerebras": "CEREBRAS_API_KEY", "Moonshot": "MOONSHOT_API_KEY",
-        "Ollama": "OLLAMA_API_KEY", "Xinference": "XINFERENCE_API_KEY",
-        "Infinity": "INFINITY_API_KEY", "OpenRouter": "OPENROUTER_API_KEY",
-        "Deepinfra": "DEEPINFRA_API_KEY", "Cloudflare": "CLOUDFLARE_API_KEY",
-        "Baseten": "BASETEN_API_KEY", "Modal": "MODAL_API_KEY",
-        "Databricks": "DATABRICKS_API_KEY", "AWS SageMaker": "AWS_ACCESS_KEY_ID",
-        "IBM watsonx.ai": "WATSONX_APIKEY", "Predibase": "PREDIBASE_API_KEY",
-        "Clarifai": "CLARIFAI_API_KEY", "NLP Cloud": "NLP_CLOUD_API_KEY",
-        "Voyage AI": "VOYAGE_API_KEY", "Jina AI": "JINA_API_KEY",
-        "Hyperbolic": "HYPERBOLIC_API_KEY", "Morph": "MORPH_API_KEY",
-        "Lambda AI": "LAMBDA_API_KEY", "Novita AI": "NOVITA_API_KEY",
-        "Aleph Alpha": "ALEPH_ALPHA_API_KEY", "SambaNova": "SAMBANOVA_API_KEY",
-        "FriendliAI": "FRIENDLI_TOKEN", "Galadriel": "GALADRIEL_API_KEY",
-        "CompactifAI": "COMPACTIFAI_API_KEY", "Lemonade": "LEMONADE_API_KEY",
-        "GradientAI": "GRADIENTAI_API_KEY", "Featherless AI": "FEATHERLESS_AI_API_KEY",
-        "Nebius AI Studio": "NEBIUS_API_KEY", "Dashscope (Qwen)": "DASHSCOPE_API_KEY",
-        "Bytez": "BYTEZ_API_KEY", "Oracle OCI": "OCI_API_KEY",
-        "DataRobot": "DATAROBOT_API_KEY", "OVHCloud": "OVHCLOUD_API_KEY",
-        "Volcengine": "VOLCENGINE_API_KEY", "Snowflake": "SNOWFLAKE_API_KEY",
-        "Nscale": "NSCALE_API_KEY", "Recraft": "RECRAFT_API_KEY",
-        "v0": "V0_API_KEY", "Vercel": "VERCEL_AI_GATEWAY_API_KEY",
-        "Topaz": "TOPAZ_API_KEY", "ElevenLabs": "ELEVENLABS_API_KEY",
         "Deepgram": "DEEPGRAM_API_KEY",
-        "GitHub Models": "GITHUB_TOKEN", "GitHub Copilot": "GITHUB_COPILOT_API_KEY",
     }
     # Discover custom providers and add them to the list
@@ -186,37 +165,37 @@ async def setup_api_key():
     # qwen_code API key support is a fallback
     # iflow API key support is a feature
     _, PROVIDER_PLUGINS = _ensure_providers_loaded()
     # Build a set of environment variables already in LITELLM_PROVIDERS
     # to avoid duplicates based on the actual API key names
     litellm_env_vars = set(LITELLM_PROVIDERS.values())
     # Providers to exclude from API key list
     exclude_providers = {
-        'gemini_cli',  # OAuth-only
-        'antigravity',  # OAuth-only
-        'qwen_code',  # API key is fallback, OAuth is primary - don't advertise
-        'openai_compatible',  # Base class, not a real provider
     }
     discovered_providers = {}
     for provider_key in PROVIDER_PLUGINS.keys():
         if provider_key in exclude_providers:
             continue
         # Create environment variable name
         env_var = provider_key.upper() + "_API_KEY"
         # Check if this env var already exists in LITELLM_PROVIDERS
         # This catches duplicates like GEMINI_API_KEY, MISTRAL_API_KEY, etc.
         if env_var in litellm_env_vars:
             # Already in LITELLM_PROVIDERS with better name, skip this one
             continue
         # Create display name for this custom provider
-        display_name = provider_key.replace('_', ' ').title()
         discovered_providers[display_name] = env_var
     # LITELLM_PROVIDERS takes precedence (comes first in merge)
     combined_providers = {**LITELLM_PROVIDERS, **discovered_providers}
     provider_display_list = sorted(combined_providers.keys())
@@ -231,15 +210,19 @@ async def setup_api_key():
         else:
             provider_text.append(f"  {i + 1}. {provider_name}\n")
-    console.print(Panel(provider_text, title="Available Providers for API Key", style="bold blue"))
     choice = Prompt.ask(
-        Text.from_markup("[bold]Please select a provider or type [red]'b'[/red] to go back[/bold]"),
         choices=[str(i + 1) for i in range(len(provider_display_list))] + ["b"],
-        show_choices=False
     )
-    if choice.lower() == 'b':
         return
     try:
@@ -251,59 +234,88 @@ async def setup_api_key():
             api_key = Prompt.ask(f"Enter the API key for {display_name}")
             # Check for duplicate API key value
-            if ENV_FILE.is_file():
-                with open(ENV_FILE, "r") as f:
                     for line in f:
                         line = line.strip()
                         if line.startswith(api_var_base) and "=" in line:
-                            existing_key_name, _, existing_key_value = line.partition("=")
                             if existing_key_value == api_key:
-                                warning_text = Text.from_markup(f"This API key already exists as [bold yellow]'{existing_key_name}'[/bold yellow]. Overwriting...")
-                                console.print(Panel(warning_text, style="bold yellow", title="Updating API Key"))
-                                set_key(str(ENV_FILE), existing_key_name, api_key)
-                                success_text = Text.from_markup(f"Successfully updated existing key [bold yellow]'{existing_key_name}'[/bold yellow].")
-                                console.print(Panel(success_text, style="bold green", title="Success"))
                                 return
             # Special handling for AWS
             if display_name in ["AWS Bedrock", "AWS SageMaker"]:
-                console.print(Panel(
-                    Text.from_markup(
-                        "This provider requires both an Access Key ID and a Secret Access Key.\n"
-                        f"The key you entered will be saved as [bold yellow]{api_var_base}_1[/bold yellow].\n"
-                        "Please manually add the [bold cyan]AWS_SECRET_ACCESS_KEY_1[/bold cyan] to your .env file."
-                    ),
-                    title="[bold yellow]Additional Step Required[/bold yellow]",
-                    border_style="yellow"
-                ))
             key_index = 1
             while True:
                 key_name = f"{api_var_base}_{key_index}"
-                if ENV_FILE.is_file():
-                     with open(ENV_FILE, "r") as f:
                         if not any(line.startswith(f"{key_name}=") for line in f):
                             break
                 else:
                     break
                 key_index += 1
             key_name = f"{api_var_base}_{key_index}"
-            set_key(str(ENV_FILE), key_name, api_key)
-            success_text = Text.from_markup(f"Successfully added {display_name} API key as [bold yellow]'{key_name}'[/bold yellow].")
             console.print(Panel(success_text, style="bold green", title="Success"))
         else:
             console.print("[bold red]Invalid choice. Please try again.[/bold red]")
     except ValueError:
-        console.print("[bold red]Invalid input. Please enter a number or 'b'.[/bold red]")
 async def setup_new_credential(provider_name: str):
     """
     Interactively sets up a new OAuth credential for a given provider.
     """
     try:
         provider_factory, _ = _ensure_providers_loaded()
@@ -315,668 +327,602 @@ async def setup_new_credential(provider_name: str):
             "gemini_cli": "Gemini CLI (OAuth)",
             "qwen_code": "Qwen Code (OAuth - also supports API keys)",
             "iflow": "iFlow (OAuth - also supports API keys)",
-            "antigravity": "Antigravity (OAuth)"
         }
-        display_name = oauth_friendly_names.get(provider_name, provider_name.replace('_', ' ').title())
-        # Pass provider metadata to auth classes for better display
-        temp_creds = {
-            "_proxy_metadata": {
-                "provider_name": provider_name,
-                "display_name": display_name
-            }
-        }
-        initialized_creds = await auth_instance.initialize_token(temp_creds)
-        user_info = await auth_instance.get_user_info(initialized_creds)
-        email = user_info.get("email")
-        if not email:
-            console.print(Panel(f"Could not retrieve a unique identifier for {provider_name}. Aborting.", style="bold red", title="Error"))
             return
-        for cred_file in OAUTH_BASE_DIR.glob(f"{provider_name}_oauth_*.json"):
-            with open(cred_file, 'r') as f:
-                existing_creds = json.load(f)
-            metadata = existing_creds.get("_proxy_metadata", {})
-            if metadata.get("email") == email:
-                warning_text = Text.from_markup(f"Found existing credential for [bold cyan]'{email}'[/bold cyan] at [bold yellow]'{cred_file.name}'[/bold yellow]. Overwriting...")
-                console.print(Panel(warning_text, style="bold yellow", title="Updating Credential"))
-                # Overwrite the existing file in-place
-                with open(cred_file, 'w') as f:
-                    json.dump(initialized_creds, f, indent=2)
-                success_text = Text.from_markup(f"Successfully updated credential at [bold yellow]'{cred_file.name}'[/bold yellow] for user [bold cyan]'{email}'[/bold cyan].")
-                console.print(Panel(success_text, style="bold green", title="Success"))
-                return
-        existing_files = list(OAUTH_BASE_DIR.glob(f"{provider_name}_oauth_*.json"))
-        next_num = 1
-        if existing_files:
-            nums = [int(re.search(r'_(\d+)\.json$', f.name).group(1)) for f in existing_files if re.search(r'_(\d+)\.json$', f.name)]
-            if nums:
-                next_num = max(nums) + 1
-        new_filename = f"{provider_name}_oauth_{next_num}.json"
-        new_filepath = OAUTH_BASE_DIR / new_filename
-        with open(new_filepath, 'w') as f:
-            json.dump(initialized_creds, f, indent=2)
-        success_text = Text.from_markup(f"Successfully created new credential at [bold yellow]'{new_filepath.name}'[/bold yellow] for user [bold cyan]'{email}'[/bold cyan].")
         console.print(Panel(success_text, style="bold green", title="Success"))
     except Exception as e:
-        console.print(Panel(f"An error occurred during setup for {provider_name}: {e}", style="bold red", title="Error"))
 async def export_gemini_cli_to_env():
     """
     Export a Gemini CLI credential JSON file to .env format.
-    Uses numbered format (GEMINI_CLI_1_*, GEMINI_CLI_2_*) for multiple credential support.
     """
-    console.print(Panel("[bold cyan]Export Gemini CLI Credential to .env[/bold cyan]", expand=False))
-    # Find all gemini_cli credentials
-    gemini_cli_files = sorted(list(OAUTH_BASE_DIR.glob("gemini_cli_oauth_*.json")))
-    if not gemini_cli_files:
-        console.print(Panel("No Gemini CLI credentials found. Please add one first using 'Add OAuth Credential'.",
-                          style="bold red", title="No Credentials"))
         return
     # Display available credentials
     cred_text = Text()
-    for i, cred_file in enumerate(gemini_cli_files):
-        try:
-            with open(cred_file, 'r') as f:
-                creds = json.load(f)
-            email = creds.get("_proxy_metadata", {}).get("email", "unknown")
-            cred_text.append(f"  {i + 1}. {cred_file.name} ({email})\n")
-        except Exception as e:
-            cred_text.append(f"  {i + 1}. {cred_file.name} (error reading: {e})\n")
-    console.print(Panel(cred_text, title="Available Gemini CLI Credentials", style="bold blue"))
     choice = Prompt.ask(
-        Text.from_markup("[bold]Please select a credential to export or type [red]'b'[/red] to go back[/bold]"),
-        choices=[str(i + 1) for i in range(len(gemini_cli_files))] + ["b"],
-        show_choices=False
     )
-    if choice.lower() == 'b':
         return
     try:
         choice_index = int(choice) - 1
-        if 0 <= choice_index < len(gemini_cli_files):
-            cred_file = gemini_cli_files[choice_index]
-            # Load the credential
-            with open(cred_file, 'r') as f:
-                creds = json.load(f)
-            # Extract metadata
-            email = creds.get("_proxy_metadata", {}).get("email", "unknown")
-            project_id = creds.get("_proxy_metadata", {}).get("project_id", "")
-            tier = creds.get("_proxy_metadata", {}).get("tier", "")
-            # Get credential number from filename
-            cred_number = _get_credential_number_from_filename(cred_file.name)
-            # Generate .env file name with credential number
-            safe_email = email.replace("@", "_at_").replace(".", "_")
-            env_filename = f"gemini_cli_{cred_number}_{safe_email}.env"
-            env_filepath = OAUTH_BASE_DIR / env_filename
-            # Build extra fields
-            extra_fields = {}
-            if project_id:
-                extra_fields["PROJECT_ID"] = project_id
-            if tier:
-                extra_fields["TIER"] = tier
-            # Build .env content using helper
-            env_lines, numbered_prefix = _build_env_export_content(
-                provider_prefix="GEMINI_CLI",
-                cred_number=cred_number,
-                creds=creds,
-                email=email,
-                extra_fields=extra_fields,
-                include_client_creds=True
             )
-            # Write to .env file
-            with open(env_filepath, 'w') as f:
-                f.write('\n'.join(env_lines))
-            success_text = Text.from_markup(
-                f"Successfully exported credential to [bold yellow]'{env_filepath}'[/bold yellow]\n\n"
-                f"[bold]Environment variable prefix:[/bold] [cyan]{numbered_prefix}_*[/cyan]\n\n"
-                f"[bold]To use this credential:[/bold]\n"
-                f"1. Copy the contents to your main .env file, OR\n"
-                f"2. Source it: [bold cyan]source {env_filepath.name}[/bold cyan] (Linux/Mac)\n"
-                f"3. Or on Windows: [bold cyan]Get-Content {env_filepath.name} | ForEach-Object {{ $_ -replace '^([^#].*)$', 'set $1' }} | cmd[/bold cyan]\n\n"
-                f"[bold]To combine multiple credentials:[/bold]\n"
-                f"Copy lines from multiple .env files into one file.\n"
-                f"Each credential uses a unique number ({numbered_prefix}_*)."
-            )
-            console.print(Panel(success_text, style="bold green", title="Success"))
         else:
             console.print("[bold red]Invalid choice. Please try again.[/bold red]")
     except ValueError:
-        console.print("[bold red]Invalid input. Please enter a number or 'b'.[/bold red]")
     except Exception as e:
-        console.print(Panel(f"An error occurred during export: {e}", style="bold red", title="Error"))
 async def export_qwen_code_to_env():
     """
     Export a Qwen Code credential JSON file to .env format.
-    Generates one .env file per credential.
     """
-    console.print(Panel("[bold cyan]Export Qwen Code Credential to .env[/bold cyan]", expand=False))
-    # Find all qwen_code credentials
-    qwen_code_files = list(OAUTH_BASE_DIR.glob("qwen_code_oauth_*.json"))
-    if not qwen_code_files:
-        console.print(Panel("No Qwen Code credentials found. Please add one first using 'Add OAuth Credential'.",
-                          style="bold red", title="No Credentials"))
         return
     # Display available credentials
     cred_text = Text()
-    for i, cred_file in enumerate(qwen_code_files):
-        try:
-            with open(cred_file, 'r') as f:
-                creds = json.load(f)
-            email = creds.get("_proxy_metadata", {}).get("email", "unknown")
-            cred_text.append(f"  {i + 1}. {cred_file.name} ({email})\n")
-        except Exception as e:
-            cred_text.append(f"  {i + 1}. {cred_file.name} (error reading: {e})\n")
-    console.print(Panel(cred_text, title="Available Qwen Code Credentials", style="bold blue"))
     choice = Prompt.ask(
-        Text.from_markup("[bold]Please select a credential to export or type [red]'b'[/red] to go back[/bold]"),
-        choices=[str(i + 1) for i in range(len(qwen_code_files))] + ["b"],
-        show_choices=False
     )
-    if choice.lower() == 'b':
         return
     try:
         choice_index = int(choice) - 1
-        if 0 <= choice_index < len(qwen_code_files):
-            cred_file = qwen_code_files[choice_index]
-            # Load the credential
-            with open(cred_file, 'r') as f:
-                creds = json.load(f)
-            # Extract metadata
-            email = creds.get("_proxy_metadata", {}).get("email", "unknown")
-            # Get credential number from filename
-            cred_number = _get_credential_number_from_filename(cred_file.name)
-            # Generate .env file name with credential number
-            safe_email = email.replace("@", "_at_").replace(".", "_")
-            env_filename = f"qwen_code_{cred_number}_{safe_email}.env"
-            env_filepath = OAUTH_BASE_DIR / env_filename
-            # Use numbered format: QWEN_CODE_N_*
-            numbered_prefix = f"QWEN_CODE_{cred_number}"
-            # Build .env content (Qwen has different structure)
-            env_lines = [
-                f"# QWEN_CODE Credential #{cred_number} for: {email}",
-                f"# Generated at: {time.strftime('%Y-%m-%d %H:%M:%S')}",
-                f"# ",
-                f"# To combine multiple credentials into one .env file, copy these lines",
-                f"# and ensure each credential has a unique number (1, 2, 3, etc.)",
-                "",
-                f"{numbered_prefix}_ACCESS_TOKEN={creds.get('access_token', '')}",
-                f"{numbered_prefix}_REFRESH_TOKEN={creds.get('refresh_token', '')}",
-                f"{numbered_prefix}_EXPIRY_DATE={creds.get('expiry_date', 0)}",
-                f"{numbered_prefix}_RESOURCE_URL={creds.get('resource_url', 'https://portal.qwen.ai/v1')}",
-                f"{numbered_prefix}_EMAIL={email}",
-            ]
-            # Write to .env file
-            with open(env_filepath, 'w') as f:
-                f.write('\n'.join(env_lines))
-            success_text = Text.from_markup(
-                f"Successfully exported credential to [bold yellow]'{env_filepath}'[/bold yellow]\n\n"
-                f"[bold]Environment variable prefix:[/bold] [cyan]{numbered_prefix}_*[/cyan]\n\n"
-                f"[bold]To use this credential:[/bold]\n"
-                f"1. Copy the contents to your main .env file, OR\n"
-                f"2. Source it: [bold cyan]source {env_filepath.name}[/bold cyan] (Linux/Mac)\n\n"
-                f"[bold]To combine multiple credentials:[/bold]\n"
-                f"Copy lines from multiple .env files into one file.\n"
-                f"Each credential uses a unique number ({numbered_prefix}_*)."
             )
-            console.print(Panel(success_text, style="bold green", title="Success"))
         else:
             console.print("[bold red]Invalid choice. Please try again.[/bold red]")
     except ValueError:
-        console.print("[bold red]Invalid input. Please enter a number or 'b'.[/bold red]")
     except Exception as e:
-        console.print(Panel(f"An error occurred during export: {e}", style="bold red", title="Error"))
 async def export_iflow_to_env():
     """
     Export an iFlow credential JSON file to .env format.
-    Uses numbered format (IFLOW_1_*, IFLOW_2_*) for multiple credential support.
     """
-    console.print(Panel("[bold cyan]Export iFlow Credential to .env[/bold cyan]", expand=False))
-    # Find all iflow credentials
-    iflow_files = sorted(list(OAUTH_BASE_DIR.glob("iflow_oauth_*.json")))
-    if not iflow_files:
-        console.print(Panel("No iFlow credentials found. Please add one first using 'Add OAuth Credential'.",
-                          style="bold red", title="No Credentials"))
         return
     # Display available credentials
     cred_text = Text()
-    for i, cred_file in enumerate(iflow_files):
-        try:
-            with open(cred_file, 'r') as f:
-                creds = json.load(f)
-            email = creds.get("_proxy_metadata", {}).get("email", "unknown")
-            cred_text.append(f"  {i + 1}. {cred_file.name} ({email})\n")
-        except Exception as e:
-            cred_text.append(f"  {i + 1}. {cred_file.name} (error reading: {e})\n")
-    console.print(Panel(cred_text, title="Available iFlow Credentials", style="bold blue"))
     choice = Prompt.ask(
-        Text.from_markup("[bold]Please select a credential to export or type [red]'b'[/red] to go back[/bold]"),
-        choices=[str(i + 1) for i in range(len(iflow_files))] + ["b"],
-        show_choices=False
     )
-    if choice.lower() == 'b':
         return
     try:
         choice_index = int(choice) - 1
-        if 0 <= choice_index < len(iflow_files):
-            cred_file = iflow_files[choice_index]
-            # Load the credential
-            with open(cred_file, 'r') as f:
-                creds = json.load(f)
-            # Extract metadata
-            email = creds.get("_proxy_metadata", {}).get("email", "unknown")
-            # Get credential number from filename
-            cred_number = _get_credential_number_from_filename(cred_file.name)
-            # Generate .env file name with credential number
-            safe_email = email.replace("@", "_at_").replace(".", "_")
-            env_filename = f"iflow_{cred_number}_{safe_email}.env"
-            env_filepath = OAUTH_BASE_DIR / env_filename
-            # Use numbered format: IFLOW_N_*
-            numbered_prefix = f"IFLOW_{cred_number}"
-            # Build .env content (iFlow has different structure with API key)
-            env_lines = [
-                f"# IFLOW Credential #{cred_number} for: {email}",
-                f"# Generated at: {time.strftime('%Y-%m-%d %H:%M:%S')}",
-                f"# ",
-                f"# To combine multiple credentials into one .env file, copy these lines",
-                f"# and ensure each credential has a unique number (1, 2, 3, etc.)",
-                "",
-                f"{numbered_prefix}_ACCESS_TOKEN={creds.get('access_token', '')}",
-                f"{numbered_prefix}_REFRESH_TOKEN={creds.get('refresh_token', '')}",
-                f"{numbered_prefix}_API_KEY={creds.get('api_key', '')}",
-                f"{numbered_prefix}_EXPIRY_DATE={creds.get('expiry_date', '')}",
-                f"{numbered_prefix}_EMAIL={email}",
-                f"{numbered_prefix}_TOKEN_TYPE={creds.get('token_type', 'Bearer')}",
-                f"{numbered_prefix}_SCOPE={creds.get('scope', 'read write')}",
-            ]
-            # Write to .env file
-            with open(env_filepath, 'w') as f:
-                f.write('\n'.join(env_lines))
-            success_text = Text.from_markup(
-                f"Successfully exported credential to [bold yellow]'{env_filepath}'[/bold yellow]\n\n"
-                f"[bold]Environment variable prefix:[/bold] [cyan]{numbered_prefix}_*[/cyan]\n\n"
-                f"[bold]To use this credential:[/bold]\n"
-                f"1. Copy the contents to your main .env file, OR\n"
-                f"2. Source it: [bold cyan]source {env_filepath.name}[/bold cyan] (Linux/Mac)\n\n"
-                f"[bold]To combine multiple credentials:[/bold]\n"
-                f"Copy lines from multiple .env files into one file.\n"
-                f"Each credential uses a unique number ({numbered_prefix}_*)."
             )
-            console.print(Panel(success_text, style="bold green", title="Success"))
         else:
             console.print("[bold red]Invalid choice. Please try again.[/bold red]")
     except ValueError:
-        console.print("[bold red]Invalid input. Please enter a number or 'b'.[/bold red]")
     except Exception as e:
-        console.print(Panel(f"An error occurred during export: {e}", style="bold red", title="Error"))
 async def export_antigravity_to_env():
     """
     Export an Antigravity credential JSON file to .env format.
-    Uses numbered format (ANTIGRAVITY_1_*, ANTIGRAVITY_2_*) for multiple credential support.
     """
-    console.print(Panel("[bold cyan]Export Antigravity Credential to .env[/bold cyan]", expand=False))
-    # Find all antigravity credentials
-    antigravity_files = sorted(list(OAUTH_BASE_DIR.glob("antigravity_oauth_*.json")))
-    if not antigravity_files:
-        console.print(Panel("No Antigravity credentials found. Please add one first using 'Add OAuth Credential'.",
-                          style="bold red", title="No Credentials"))
         return
     # Display available credentials
     cred_text = Text()
-    for i, cred_file in enumerate(antigravity_files):
-        try:
-            with open(cred_file, 'r') as f:
-                creds = json.load(f)
-            email = creds.get("_proxy_metadata", {}).get("email", "unknown")
-            cred_text.append(f"  {i + 1}. {cred_file.name} ({email})\n")
-        except Exception as e:
-            cred_text.append(f"  {i + 1}. {cred_file.name} (error reading: {e})\n")
-    console.print(Panel(cred_text, title="Available Antigravity Credentials", style="bold blue"))
     choice = Prompt.ask(
-        Text.from_markup("[bold]Please select a credential to export or type [red]'b'[/red] to go back[/bold]"),
-        choices=[str(i + 1) for i in range(len(antigravity_files))] + ["b"],
-        show_choices=False
     )
-    if choice.lower() == 'b':
         return
     try:
         choice_index = int(choice) - 1
-        if 0 <= choice_index < len(antigravity_files):
-            cred_file = antigravity_files[choice_index]
-            # Load the credential
-            with open(cred_file, 'r') as f:
-                creds = json.load(f)
-            # Extract metadata
-            email = creds.get("_proxy_metadata", {}).get("email", "unknown")
-            # Get credential number from filename
-            cred_number = _get_credential_number_from_filename(cred_file.name)
-            # Generate .env file name with credential number
-            safe_email = email.replace("@", "_at_").replace(".", "_")
-            env_filename = f"antigravity_{cred_number}_{safe_email}.env"
-            env_filepath = OAUTH_BASE_DIR / env_filename
-            # Build .env content using helper
-            env_lines, numbered_prefix = _build_env_export_content(
-                provider_prefix="ANTIGRAVITY",
-                cred_number=cred_number,
-                creds=creds,
-                email=email,
-                extra_fields=None,
-                include_client_creds=True
             )
-            # Write to .env file
-            with open(env_filepath, 'w') as f:
-                f.write('\n'.join(env_lines))
-            success_text = Text.from_markup(
-                f"Successfully exported credential to [bold yellow]'{env_filepath}'[/bold yellow]\n\n"
-                f"[bold]Environment variable prefix:[/bold] [cyan]{numbered_prefix}_*[/cyan]\n\n"
-                f"[bold]To use this credential:[/bold]\n"
-                f"1. Copy the contents to your main .env file, OR\n"
-                f"2. Source it: [bold cyan]source {env_filepath.name}[/bold cyan] (Linux/Mac)\n"
-                f"3. Or on Windows: [bold cyan]Get-Content {env_filepath.name} | ForEach-Object {{ $_ -replace '^([^#].*)$', 'set $1' }} | cmd[/bold cyan]\n\n"
-                f"[bold]To combine multiple credentials:[/bold]\n"
-                f"Copy lines from multiple .env files into one file.\n"
-                f"Each credential uses a unique number ({numbered_prefix}_*)."
-            )
-            console.print(Panel(success_text, style="bold green", title="Success"))
         else:
             console.print("[bold red]Invalid choice. Please try again.[/bold red]")
     except ValueError:
-        console.print("[bold red]Invalid input. Please enter a number or 'b'.[/bold red]")
     except Exception as e:
-        console.print(Panel(f"An error occurred during export: {e}", style="bold red", title="Error"))
-def _build_gemini_cli_env_lines(creds: dict, cred_number: int) -> list[str]:
-    """Build .env lines for a Gemini CLI credential."""
-    email = creds.get("_proxy_metadata", {}).get("email", "unknown")
-    project_id = creds.get("_proxy_metadata", {}).get("project_id", "")
-    tier = creds.get("_proxy_metadata", {}).get("tier", "")
-    extra_fields = {}
-    if project_id:
-        extra_fields["PROJECT_ID"] = project_id
-    if tier:
-        extra_fields["TIER"] = tier
-    env_lines, _ = _build_env_export_content(
-        provider_prefix="GEMINI_CLI",
-        cred_number=cred_number,
-        creds=creds,
-        email=email,
-        extra_fields=extra_fields,
-        include_client_creds=True
-    )
-    return env_lines
-def _build_qwen_code_env_lines(creds: dict, cred_number: int) -> list[str]:
-    """Build .env lines for a Qwen Code credential."""
-    email = creds.get("_proxy_metadata", {}).get("email", "unknown")
-    numbered_prefix = f"QWEN_CODE_{cred_number}"
-    env_lines = [
-        f"# QWEN_CODE Credential #{cred_number} for: {email}",
-        f"# Generated at: {time.strftime('%Y-%m-%d %H:%M:%S')}",
-        "",
-        f"{numbered_prefix}_ACCESS_TOKEN={creds.get('access_token', '')}",
-        f"{numbered_prefix}_REFRESH_TOKEN={creds.get('refresh_token', '')}",
-        f"{numbered_prefix}_EXPIRY_DATE={creds.get('expiry_date', 0)}",
-        f"{numbered_prefix}_RESOURCE_URL={creds.get('resource_url', 'https://portal.qwen.ai/v1')}",
-        f"{numbered_prefix}_EMAIL={email}",
-    ]
-    return env_lines
-def _build_iflow_env_lines(creds: dict, cred_number: int) -> list[str]:
-    """Build .env lines for an iFlow credential."""
-    email = creds.get("_proxy_metadata", {}).get("email", "unknown")
-    numbered_prefix = f"IFLOW_{cred_number}"
-    env_lines = [
-        f"# IFLOW Credential #{cred_number} for: {email}",
-        f"# Generated at: {time.strftime('%Y-%m-%d %H:%M:%S')}",
-        "",
-        f"{numbered_prefix}_ACCESS_TOKEN={creds.get('access_token', '')}",
-        f"{numbered_prefix}_REFRESH_TOKEN={creds.get('refresh_token', '')}",
-        f"{numbered_prefix}_API_KEY={creds.get('api_key', '')}",
-        f"{numbered_prefix}_EXPIRY_DATE={creds.get('expiry_date', '')}",
-        f"{numbered_prefix}_EMAIL={email}",
-        f"{numbered_prefix}_TOKEN_TYPE={creds.get('token_type', 'Bearer')}",
-        f"{numbered_prefix}_SCOPE={creds.get('scope', 'read write')}",
-    ]
-    return env_lines
-def _build_antigravity_env_lines(creds: dict, cred_number: int) -> list[str]:
-    """Build .env lines for an Antigravity credential."""
-    email = creds.get("_proxy_metadata", {}).get("email", "unknown")
-    env_lines, _ = _build_env_export_content(
-        provider_prefix="ANTIGRAVITY",
-        cred_number=cred_number,
-        creds=creds,
-        email=email,
-        extra_fields=None,
-        include_client_creds=True
-    )
-    return env_lines
 async def export_all_provider_credentials(provider_name: str):
     """
     Export all credentials for a specific provider to individual .env files.
     """
-    provider_config = {
-        "gemini_cli": ("GEMINI_CLI", _build_gemini_cli_env_lines),
-        "qwen_code": ("QWEN_CODE", _build_qwen_code_env_lines),
-        "iflow": ("IFLOW", _build_iflow_env_lines),
-        "antigravity": ("ANTIGRAVITY", _build_antigravity_env_lines),
-    }
-    if provider_name not in provider_config:
         console.print(f"[bold red]Unknown provider: {provider_name}[/bold red]")
         return
-    prefix, build_func = provider_config[provider_name]
-    display_name = prefix.replace("_", " ").title()
-    console.print(Panel(f"[bold cyan]Export All {display_name} Credentials[/bold cyan]", expand=False))
-    # Find all credentials for this provider
-    cred_files = sorted(list(OAUTH_BASE_DIR.glob(f"{provider_name}_oauth_*.json")))
-    if not cred_files:
-        console.print(Panel(f"No {display_name} credentials found.", style="bold red", title="No Credentials"))
         return
     exported_count = 0
-    for cred_file in cred_files:
         try:
-            with open(cred_file, 'r') as f:
-                creds = json.load(f)
-            email = creds.get("_proxy_metadata", {}).get("email", "unknown")
-            cred_number = _get_credential_number_from_filename(cred_file.name)
-            # Generate .env file name
-            safe_email = email.replace("@", "_at_").replace(".", "_")
-            env_filename = f"{provider_name}_{cred_number}_{safe_email}.env"
-            env_filepath = OAUTH_BASE_DIR / env_filename
-            # Build and write .env content
-            env_lines = build_func(creds, cred_number)
-            with open(env_filepath, 'w') as f:
-                f.write('\n'.join(env_lines))
-            console.print(f"  ✓ Exported [cyan]{cred_file.name}[/cyan] → [yellow]{env_filename}[/yellow]")
-            exported_count += 1
         except Exception as e:
-            console.print(f"  ✗ Failed to export {cred_file.name}: {e}")
-    console.print(Panel(
-        f"Successfully exported {exported_count}/{len(cred_files)} {display_name} credentials to individual .env files.",
-        style="bold green", title="Export Complete"
-    ))
 async def combine_provider_credentials(provider_name: str):
     """
     Combine all credentials for a specific provider into a single .env file.
     """
-    provider_config = {
-        "gemini_cli": ("GEMINI_CLI", _build_gemini_cli_env_lines),
-        "qwen_code": ("QWEN_CODE", _build_qwen_code_env_lines),
-        "iflow": ("IFLOW", _build_iflow_env_lines),
-        "antigravity": ("ANTIGRAVITY", _build_antigravity_env_lines),
-    }
-    if provider_name not in provider_config:
         console.print(f"[bold red]Unknown provider: {provider_name}[/bold red]")
         return
-    prefix, build_func = provider_config[provider_name]
-    display_name = prefix.replace("_", " ").title()
-    console.print(Panel(f"[bold cyan]Combine All {display_name} Credentials[/bold cyan]", expand=False))
-    # Find all credentials for this provider
-    cred_files = sorted(list(OAUTH_BASE_DIR.glob(f"{provider_name}_oauth_*.json")))
-    if not cred_files:
-        console.print(Panel(f"No {display_name} credentials found.", style="bold red", title="No Credentials"))
         return
     combined_lines = [
         f"# Combined {display_name} Credentials",
         f"# Generated at: {time.strftime('%Y-%m-%d %H:%M:%S')}",
-        f"# Total credentials: {len(cred_files)}",
         "#",
         "# Copy all lines below into your main .env file",
         "",
     ]
     combined_count = 0
-    for cred_file in cred_files:
         try:
-            with open(cred_file, 'r') as f:
                 creds = json.load(f)
-            cred_number = _get_credential_number_from_filename(cred_file.name)
-            env_lines = build_func(creds, cred_number)
             combined_lines.extend(env_lines)
             combined_lines.append("")  # Blank line between credentials
             combined_count += 1
         except Exception as e:
-            console.print(f"  ✗ Failed to process {cred_file.name}: {e}")
     # Write combined file
     combined_filename = f"{provider_name}_all_combined.env"
-    combined_filepath = OAUTH_BASE_DIR / combined_filename
-    with open(combined_filepath, 'w') as f:
-        f.write('\n'.join(combined_lines))
-    console.print(Panel(
-        Text.from_markup(
-            f"Successfully combined {combined_count} {display_name} credentials into:\n"
-            f"[bold yellow]{combined_filepath}[/bold yellow]\n\n"
-            f"[bold]To use:[/bold] Copy the contents into your main .env file."
-        ),
-        style="bold green", title="Combine Complete"
-    ))
 async def combine_all_credentials():
     """
     Combine ALL credentials from ALL providers into a single .env file.
     """
-    console.print(Panel("[bold cyan]Combine All Provider Credentials[/bold cyan]", expand=False))
-    provider_config = {
-        "gemini_cli": ("GEMINI_CLI", _build_gemini_cli_env_lines),
-        "qwen_code": ("QWEN_CODE", _build_qwen_code_env_lines),
-        "iflow": ("IFLOW", _build_iflow_env_lines),
-        "antigravity": ("ANTIGRAVITY", _build_antigravity_env_lines),
-    }
     combined_lines = [
         "# Combined All Provider Credentials",
         f"# Generated at: {time.strftime('%Y-%m-%d %H:%M:%S')}",
@@ -984,63 +930,83 @@ async def combine_all_credentials():
         "# Copy all lines below into your main .env file",
         "",
     ]
     total_count = 0
     provider_counts = {}
-    for provider_name, (prefix, build_func) in provider_config.items():
-        cred_files = sorted(list(OAUTH_BASE_DIR.glob(f"{provider_name}_oauth_*.json")))
-        if not cred_files:
             continue
-        display_name = prefix.replace("_", " ").title()
         combined_lines.append(f"# ===== {display_name} Credentials =====")
         combined_lines.append("")
         provider_count = 0
-        for cred_file in cred_files:
             try:
-                with open(cred_file, 'r') as f:
                     creds = json.load(f)
-                cred_number = _get_credential_number_from_filename(cred_file.name)
-                env_lines = build_func(creds, cred_number)
                 combined_lines.extend(env_lines)
                 combined_lines.append("")
                 provider_count += 1
                 total_count += 1
             except Exception as e:
-                console.print(f"  ✗ Failed to process {cred_file.name}: {e}")
         provider_counts[display_name] = provider_count
     if total_count == 0:
-        console.print(Panel("No credentials found to combine.", style="bold red", title="No Credentials"))
         return
     # Write combined file
     combined_filename = "all_providers_combined.env"
-    combined_filepath = OAUTH_BASE_DIR / combined_filename
-    with open(combined_filepath, 'w') as f:
-        f.write('\n'.join(combined_lines))
     # Build summary
-    summary_lines = [f"  • {name}: {count} credential(s)" for name, count in provider_counts.items()]
     summary = "\n".join(summary_lines)
-    console.print(Panel(
-        Text.from_markup(
-            f"Successfully combined {total_count} credentials from {len(provider_counts)} providers:\n"
-            f"{summary}\n\n"
-            f"[bold]Output file:[/bold] [yellow]{combined_filepath}[/yellow]\n\n"
-            f"[bold]To use:[/bold] Copy the contents into your main .env file."
-        ),
-        style="bold green", title="Combine Complete"
-    ))
 async def export_credentials_submenu():
@@ -1049,40 +1015,65 @@ async def export_credentials_submenu():
     """
     while True:
         clear_screen()
-        console.print(Panel("[bold cyan]Export Credentials to .env[/bold cyan]", title="--- API Key Proxy ---", expand=False))
-        console.print(Panel(
-            Text.from_markup(
-                "[bold]Individual Exports:[/bold]\n"
-                "1. Export Gemini CLI credential\n"
-                "2. Export Qwen Code credential\n"
-                "3. Export iFlow credential\n"
-                "4. Export Antigravity credential\n"
-                "\n"
-                "[bold]Bulk Exports (per provider):[/bold]\n"
-                "5. Export ALL Gemini CLI credentials\n"
-                "6. Export ALL Qwen Code credentials\n"
-                "7. Export ALL iFlow credentials\n"
-                "8. Export ALL Antigravity credentials\n"
-                "\n"
-                "[bold]Combine Credentials:[/bold]\n"
-                "9. Combine all Gemini CLI into one file\n"
-                "10. Combine all Qwen Code into one file\n"
-                "11. Combine all iFlow into one file\n"
-                "12. Combine all Antigravity into one file\n"
-                "13. Combine ALL providers into one file"
-            ),
-            title="Choose export option",
-            style="bold blue"
-        ))
         export_choice = Prompt.ask(
-            Text.from_markup("[bold]Please select an option or type [red]'b'[/red] to go back[/bold]"),
-            choices=["1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "b"],
-            show_choices=False
         )
-        if export_choice.lower() == 'b':
             break
         # Individual exports
@@ -1146,39 +1137,53 @@ async def export_credentials_submenu():
 async def main(clear_on_start=True):
     """
     An interactive CLI tool to add new credentials.
     Args:
-        clear_on_start: If False, skip initial screen clear (used when called from launcher
                        to preserve the loading screen)
     """
     ensure_env_defaults()
     # Only show header if we're clearing (standalone mode)
     if clear_on_start:
-        console.print(Panel("[bold cyan]Interactive Credential Setup[/bold cyan]", title="--- API Key Proxy ---", expand=False))
     while True:
         # Clear screen between menu selections for cleaner UX
         clear_screen()
-        console.print(Panel("[bold cyan]Interactive Credential Setup[/bold cyan]", title="--- API Key Proxy ---", expand=False))
-        console.print(Panel(
-            Text.from_markup(
-                "1. Add OAuth Credential\n"
-                "2. Add API Key\n"
-                "3. Export Credentials"
-            ),
-            title="Choose credential type",
-            style="bold blue"
-        ))
         setup_type = Prompt.ask(
-            Text.from_markup("[bold]Please select an option or type [red]'q'[/red] to quit[/bold]"),
             choices=["1", "2", "3", "q"],
-            show_choices=False
         )
-        if setup_type.lower() == 'q':
             break
         if setup_type == "1":
@@ -1190,69 +1195,88 @@ async def main(clear_on_start=True):
                 "iflow": "iFlow (OAuth - also supports API keys)",
                 "antigravity": "Antigravity (OAuth)",
             }
             provider_text = Text()
             for i, provider in enumerate(available_providers):
-                display_name = oauth_friendly_names.get(provider, provider.replace('_', ' ').title())
                 provider_text.append(f"  {i + 1}. {display_name}\n")
-            console.print(Panel(provider_text, title="Available Providers for OAuth", style="bold blue"))
             choice = Prompt.ask(
-                Text.from_markup("[bold]Please select a provider or type [red]'b'[/red] to go back[/bold]"),
                 choices=[str(i + 1) for i in range(len(available_providers))] + ["b"],
-                show_choices=False
             )
-            if choice.lower() == 'b':
                 continue
             try:
                 choice_index = int(choice) - 1
                 if 0 <= choice_index < len(available_providers):
                     provider_name = available_providers[choice_index]
-                    display_name = oauth_friendly_names.get(provider_name, provider_name.replace('_', ' ').title())
-                    console.print(f"\nStarting OAuth setup for [bold cyan]{display_name}[/bold cyan]...")
                     await setup_new_credential(provider_name)
                     # Don't clear after OAuth - user needs to see full flow
                     console.print("\n[dim]Press Enter to return to main menu...[/dim]")
                     input()
                 else:
-                    console.print("[bold red]Invalid choice. Please try again.[/bold red]")
                     await asyncio.sleep(1.5)
             except ValueError:
-                console.print("[bold red]Invalid input. Please enter a number or 'b'.[/bold red]")
                 await asyncio.sleep(1.5)
         elif setup_type == "2":
             await setup_api_key()
-            #console.print("\n[dim]Press Enter to return to main menu...[/dim]")
-            #input()
         elif setup_type == "3":
             await export_credentials_submenu()
 def run_credential_tool(from_launcher=False):
     """
     Entry point for credential tool.
     Args:
         from_launcher: If True, skip loading screen (launcher already showed it)
     """
     # Check if we need to show loading screen
     if not from_launcher:
         # Standalone mode - show full loading UI
-        os.system('cls' if os.name == 'nt' else 'clear')
         _start_time = time.time()
         # Phase 1: Show initial message
         print("━" * 70)
         print("Interactive Credential Setup Tool")
         print("GitHub: https://github.com/Mirrowel/LLM-API-Key-Proxy")
         print("━" * 70)
         print("Loading credential management components...")
         # Phase 2: Load dependencies with spinner
         with console.status("Loading authentication providers...", spinner="dots"):
             _ensure_providers_loaded()
@@ -1261,14 +1285,16 @@ def run_credential_tool(from_launcher=False):
         with console.status("Initializing credential tool...", spinner="dots"):
             time.sleep(0.2)  # Brief pause for UI consistency
         console.print("✓ Credential tool initialized")
         _elapsed = time.time() - _start_time
         _, PROVIDER_PLUGINS = _ensure_providers_loaded()
-        print(f"✓ Tool ready in {_elapsed:.2f}s ({len(PROVIDER_PLUGINS)} providers available)")
         # Small delay to let user see the ready message
         time.sleep(0.5)
     # Run the main async event loop
     # If from launcher, don't clear screen at start to preserve loading messages
     try:

 import asyncio
 import json
 import os
 import time
 from pathlib import Path
 from dotenv import set_key, get_key
+# NOTE: Heavy imports (provider_factory, PROVIDER_PLUGINS) are deferred
 # to avoid 6-7 second delay before showing loading screen
 from rich.console import Console
 from rich.panel import Panel
 from rich.prompt import Prompt
 from rich.text import Text
+from .utils.paths import get_oauth_dir, get_data_file
+def _get_oauth_base_dir() -> Path:
+    """Get the OAuth base directory (lazy, respects EXE vs script mode)."""
+    oauth_dir = get_oauth_dir()
+    oauth_dir.mkdir(parents=True, exist_ok=True)
+    return oauth_dir
+def _get_env_file() -> Path:
+    """Get the .env file path (lazy, respects EXE vs script mode)."""
+    return get_data_file(".env")
 console = Console()
 _provider_factory = None
 _provider_plugins = None
 def _ensure_providers_loaded():
     """Lazy load provider modules only when needed"""
     global _provider_factory, _provider_plugins
     if _provider_factory is None:
         from . import provider_factory as pf
         from .providers import PROVIDER_PLUGINS as pp
         _provider_factory = pf
         _provider_plugins = pp
     return _provider_factory, _provider_plugins
 def clear_screen():
     """
+    Cross-platform terminal clear that works robustly on both
     classic Windows conhost and modern terminals (Windows Terminal, Linux, Mac).
     Uses native OS commands instead of ANSI escape sequences:
     - Windows (conhost & Windows Terminal): cls
     - Unix-like systems (Linux, Mac): clear
     """
+    os.system("cls" if os.name == "nt" else "clear")
 def ensure_env_defaults():
     """
     Ensures the .env file exists and contains essential default values like PROXY_API_KEY.
     """
+    if not _get_env_file().is_file():
+        _get_env_file().touch()
+        console.print(
+            f"Creating a new [bold yellow]{_get_env_file().name}[/bold yellow] file..."
+        )
     # Check for PROXY_API_KEY, similar to setup_env.bat
+    if get_key(str(_get_env_file()), "PROXY_API_KEY") is None:
         default_key = "VerysecretKey"
+        console.print(
+            f"Adding default [bold cyan]PROXY_API_KEY[/bold cyan] to [bold yellow]{_get_env_file().name}[/bold yellow]..."
+        )
+        set_key(str(_get_env_file()), "PROXY_API_KEY", default_key)
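The create-then-default behaviour of `ensure_env_defaults()` can be approximated with the standard library alone. The sketch below is a plain-text stand-in: `ensure_default_key` is a hypothetical helper, and it does not attempt to reproduce python-dotenv's parsing or quoting behaviour.

```python
import tempfile
from pathlib import Path


def ensure_default_key(env_path: Path, key: str, default: str) -> bool:
    """Create env_path if missing and append `key=default` when key is absent.

    Returns True if the key was added, False if it was already present.
    """
    env_path.touch(exist_ok=True)
    text = env_path.read_text()
    if any(line.startswith(f"{key}=") for line in text.splitlines()):
        return False
    # Avoid gluing the new entry onto an unterminated last line.
    separator = "" if text == "" or text.endswith("\n") else "\n"
    with env_path.open("a") as handle:
        handle.write(f"{separator}{key}={default}\n")
    return True


with tempfile.TemporaryDirectory() as tmp:
    env_file = Path(tmp) / ".env"
    added_first = ensure_default_key(env_file, "PROXY_API_KEY", "VerysecretKey")
    added_again = ensure_default_key(env_file, "PROXY_API_KEY", "other")
    contents = env_file.read_text()
```

The second call leaves the file untouched, which is the same idempotence the tool relies on when it runs at every startup.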
 async def setup_api_key():
     """
     # Verified list of LiteLLM providers with their friendly names and API key variables
     LITELLM_PROVIDERS = {
+        "OpenAI": "OPENAI_API_KEY",
+        "Anthropic": "ANTHROPIC_API_KEY",
+        "Google AI Studio (Gemini)": "GEMINI_API_KEY",
+        "Azure OpenAI": "AZURE_API_KEY",
+        "Vertex AI": "GOOGLE_API_KEY",
+        "AWS Bedrock": "AWS_ACCESS_KEY_ID",
+        "Cohere": "COHERE_API_KEY",
+        "Chutes": "CHUTES_API_KEY",
         "Mistral AI": "MISTRAL_API_KEY",
+        "Codestral (Mistral)": "CODESTRAL_API_KEY",
+        "Groq": "GROQ_API_KEY",
+        "Perplexity": "PERPLEXITYAI_API_KEY",
+        "xAI": "XAI_API_KEY",
+        "Together AI": "TOGETHERAI_API_KEY",
+        "Fireworks AI": "FIREWORKS_AI_API_KEY",
+        "Replicate": "REPLICATE_API_KEY",
+        "Hugging Face": "HUGGINGFACE_API_KEY",
+        "Anyscale": "ANYSCALE_API_KEY",
+        "NVIDIA NIM": "NVIDIA_NIM_API_KEY",
+        "Deepseek": "DEEPSEEK_API_KEY",
+        "AI21": "AI21_API_KEY",
+        "Cerebras": "CEREBRAS_API_KEY",
+        "Moonshot": "MOONSHOT_API_KEY",
+        "Ollama": "OLLAMA_API_KEY",
+        "Xinference": "XINFERENCE_API_KEY",
+        "Infinity": "INFINITY_API_KEY",
+        "OpenRouter": "OPENROUTER_API_KEY",
+        "Deepinfra": "DEEPINFRA_API_KEY",
+        "Cloudflare": "CLOUDFLARE_API_KEY",
+        "Baseten": "BASETEN_API_KEY",
+        "Modal": "MODAL_API_KEY",
+        "Databricks": "DATABRICKS_API_KEY",
+        "AWS SageMaker": "AWS_ACCESS_KEY_ID",
+        "IBM watsonx.ai": "WATSONX_APIKEY",
+        "Predibase": "PREDIBASE_API_KEY",
+        "Clarifai": "CLARIFAI_API_KEY",
+        "NLP Cloud": "NLP_CLOUD_API_KEY",
+        "Voyage AI": "VOYAGE_API_KEY",
+        "Jina AI": "JINA_API_KEY",
+        "Hyperbolic": "HYPERBOLIC_API_KEY",
+        "Morph": "MORPH_API_KEY",
+        "Lambda AI": "LAMBDA_API_KEY",
+        "Novita AI": "NOVITA_API_KEY",
+        "Aleph Alpha": "ALEPH_ALPHA_API_KEY",
+        "SambaNova": "SAMBANOVA_API_KEY",
+        "FriendliAI": "FRIENDLI_TOKEN",
+        "Galadriel": "GALADRIEL_API_KEY",
+        "CompactifAI": "COMPACTIFAI_API_KEY",
+        "Lemonade": "LEMONADE_API_KEY",
+        "GradientAI": "GRADIENTAI_API_KEY",
+        "Featherless AI": "FEATHERLESS_AI_API_KEY",
+        "Nebius AI Studio": "NEBIUS_API_KEY",
+        "Dashscope (Qwen)": "DASHSCOPE_API_KEY",
+        "Bytez": "BYTEZ_API_KEY",
+        "Oracle OCI": "OCI_API_KEY",
+        "DataRobot": "DATAROBOT_API_KEY",
+        "OVHCloud": "OVHCLOUD_API_KEY",
+        "Volcengine": "VOLCENGINE_API_KEY",
+        "Snowflake": "SNOWFLAKE_API_KEY",
+        "Nscale": "NSCALE_API_KEY",
+        "Recraft": "RECRAFT_API_KEY",
+        "v0": "V0_API_KEY",
+        "Vercel": "VERCEL_AI_GATEWAY_API_KEY",
+        "Topaz": "TOPAZ_API_KEY",
+        "ElevenLabs": "ELEVENLABS_API_KEY",
         "Deepgram": "DEEPGRAM_API_KEY",
+        "GitHub Models": "GITHUB_TOKEN",
+        "GitHub Copilot": "GITHUB_COPILOT_API_KEY",
     }
     # Discover custom providers and add them to the list
     # qwen_code API key support is a fallback
     # iflow API key support is a feature
     _, PROVIDER_PLUGINS = _ensure_providers_loaded()
     # Build a set of environment variables already in LITELLM_PROVIDERS
     # to avoid duplicates based on the actual API key names
     litellm_env_vars = set(LITELLM_PROVIDERS.values())
     # Providers to exclude from API key list
     exclude_providers = {
+        "gemini_cli",  # OAuth-only
+        "antigravity",  # OAuth-only
+        "qwen_code",  # API key is fallback, OAuth is primary - don't advertise
+        "openai_compatible",  # Base class, not a real provider
     }
     discovered_providers = {}
     for provider_key in PROVIDER_PLUGINS.keys():
         if provider_key in exclude_providers:
             continue
         # Create environment variable name
         env_var = provider_key.upper() + "_API_KEY"
         # Check if this env var already exists in LITELLM_PROVIDERS
         # This catches duplicates like GEMINI_API_KEY, MISTRAL_API_KEY, etc.
         if env_var in litellm_env_vars:
             # Already in LITELLM_PROVIDERS with better name, skip this one
             continue
         # Create display name for this custom provider
+        display_name = provider_key.replace("_", " ").title()
         discovered_providers[display_name] = env_var
     # LITELLM_PROVIDERS is listed first in the merge; if a display name collides, the discovered_providers entry replaces it
     combined_providers = {**LITELLM_PROVIDERS, **discovered_providers}
     provider_display_list = sorted(combined_providers.keys())
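The discovery-and-merge step above can be sketched as a standalone function. This is an illustration under stated assumptions: `merge_providers` and the plugin key `my_custom` are hypothetical, and only the naming rules visible in the listing are reproduced.

```python
def merge_providers(
    known: dict[str, str], plugin_keys: list[str], exclude: set[str]
) -> dict[str, str]:
    """Merge a curated {display name: env var} map with discovered plugin keys."""
    known_env_vars = set(known.values())
    discovered = {}
    for key in plugin_keys:
        if key in exclude:
            continue
        env_var = key.upper() + "_API_KEY"
        if env_var in known_env_vars:
            continue  # already listed under a curated display name
        discovered[key.replace("_", " ").title()] = env_var
    # In a {**a, **b} merge, values from b replace those from a on key clashes.
    return {**known, **discovered}


known = {"Mistral AI": "MISTRAL_API_KEY", "OpenAI": "OPENAI_API_KEY"}
merged = merge_providers(known, ["mistral", "gemini_cli", "my_custom"], {"gemini_cli"})
print(sorted(merged))
```

Deduplicating on the environment variable name rather than the display name is what keeps `mistral` from appearing twice under two different labels.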
         else:
             provider_text.append(f"  {i + 1}. {provider_name}\n")
+    console.print(
+        Panel(provider_text, title="Available Providers for API Key", style="bold blue")
+    )
     choice = Prompt.ask(
+        Text.from_markup(
+            "[bold]Please select a provider or type [red]'b'[/red] to go back[/bold]"
+        ),
         choices=[str(i + 1) for i in range(len(provider_display_list))] + ["b"],
+        show_choices=False,
     )
+    if choice.lower() == "b":
         return
     try:
             api_key = Prompt.ask(f"Enter the API key for {display_name}")
             # Check for duplicate API key value
+            if _get_env_file().is_file():
+                with open(_get_env_file(), "r") as f:
                     for line in f:
                         line = line.strip()
                         if line.startswith(api_var_base) and "=" in line:
+                            existing_key_name, _, existing_key_value = line.partition(
+                                "="
+                            )
                             if existing_key_value == api_key:
+                                warning_text = Text.from_markup(
+                                    f"This API key already exists as [bold yellow]'{existing_key_name}'[/bold yellow]. Overwriting..."
+                                )
+                                console.print(
+                                    Panel(
+                                        warning_text,
+                                        style="bold yellow",
+                                        title="Updating API Key",
+                                    )
+                                )
+                                set_key(
+                                    str(_get_env_file()), existing_key_name, api_key
+                                )
+                                success_text = Text.from_markup(
+                                    f"Successfully updated existing key [bold yellow]'{existing_key_name}'[/bold yellow]."
+                                )
+                                console.print(
+                                    Panel(
+                                        success_text,
+                                        style="bold green",
+                                        title="Success",
+                                    )
+                                )
                                 return
             # Special handling for AWS
             if display_name in ["AWS Bedrock", "AWS SageMaker"]:
+                console.print(
+                    Panel(
+                        Text.from_markup(
+                            "This provider requires both an Access Key ID and a Secret Access Key.\n"
+                            f"The key you entered will be saved as [bold yellow]{api_var_base}_1[/bold yellow].\n"
+                            "Please manually add the [bold cyan]AWS_SECRET_ACCESS_KEY_1[/bold cyan] to your .env file."
+                        ),
+                        title="[bold yellow]Additional Step Required[/bold yellow]",
+                        border_style="yellow",
+                    )
+                )
             key_index = 1
             while True:
                 key_name = f"{api_var_base}_{key_index}"
+                if _get_env_file().is_file():
+                    with open(_get_env_file(), "r") as f:
                         if not any(line.startswith(f"{key_name}=") for line in f):
                             break
                 else:
                     break
                 key_index += 1
             key_name = f"{api_var_base}_{key_index}"
+            set_key(str(_get_env_file()), key_name, api_key)
+            success_text = Text.from_markup(
+                f"Successfully added {display_name} API key as [bold yellow]'{key_name}'[/bold yellow]."
+            )
             console.print(Panel(success_text, style="bold green", title="Success"))
         else:
             console.print("[bold red]Invalid choice. Please try again.[/bold red]")
     except ValueError:
+        console.print(
+            "[bold red]Invalid input. Please enter a number or 'b'.[/bold red]"
+        )
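The numbered-key search in `setup_api_key()` (try `{api_var_base}_1`, then `_2`, and so on until an unused name is found) can be sketched without touching the file system. `next_free_key_name` is a hypothetical helper, and it collects the existing names into a set instead of rescanning the file for each index as the tool does.

```python
def next_free_key_name(env_lines: list[str], base: str) -> str:
    """Return the first `{base}_{n}` (n = 1, 2, ...) not assigned in env_lines."""
    taken = {line.split("=", 1)[0].strip() for line in env_lines if "=" in line}
    index = 1
    while f"{base}_{index}" in taken:
        index += 1
    return f"{base}_{index}"


existing = [
    "OPENAI_API_KEY_1=sk-a",
    "OPENAI_API_KEY_2=sk-b",
    "GROQ_API_KEY_1=gsk-x",
]
print(next_free_key_name(existing, "OPENAI_API_KEY"))   # OPENAI_API_KEY_3
print(next_free_key_name(existing, "MISTRAL_API_KEY"))  # MISTRAL_API_KEY_1
```

Matching on the full name before `=` avoids treating `OPENAI_API_KEY_1` as a hit for `OPENAI_API_KEY_10`.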
 async def setup_new_credential(provider_name: str):
     """
     Interactively sets up a new OAuth credential for a given provider.
+    Delegates all credential management logic to the auth class's setup_credential() method.
     """
     try:
         provider_factory, _ = _ensure_providers_loaded()
             "gemini_cli": "Gemini CLI (OAuth)",
             "qwen_code": "Qwen Code (OAuth - also supports API keys)",
             "iflow": "iFlow (OAuth - also supports API keys)",
+            "antigravity": "Antigravity (OAuth)",
         }
+        display_name = oauth_friendly_names.get(
+            provider_name, provider_name.replace("_", " ").title()
+        )
+        # Call the auth class's setup_credential() method which handles the entire flow:
+        # - OAuth authentication
+        # - Email extraction for deduplication
+        # - File path determination (new or existing)
+        # - Credential file saving
+        # - Post-auth discovery (tier/project for Google OAuth providers)
+        result = await auth_instance.setup_credential(_get_oauth_base_dir())
+        if not result.success:
+            console.print(
+                Panel(
+                    f"Credential setup failed: {result.error}",
+                    style="bold red",
+                    title="Error",
+                )
+            )
             return
+        # Display success message with details
+        if result.is_update:
+            success_text = Text.from_markup(
+                f"Successfully updated credential at [bold yellow]'{Path(result.file_path).name}'[/bold yellow] "
+                f"for user [bold cyan]'{result.email}'[/bold cyan]."
+            )
+        else:
+            success_text = Text.from_markup(
+                f"Successfully created new credential at [bold yellow]'{Path(result.file_path).name}'[/bold yellow] "
+                f"for user [bold cyan]'{result.email}'[/bold cyan]."
+            )
+        # Add tier/project info if available (Google OAuth providers)
+        if hasattr(result, "tier") and result.tier:
+            success_text.append(f"\nTier: {result.tier}")
+        if hasattr(result, "project_id") and result.project_id:
+            success_text.append(f"\nProject: {result.project_id}")
         console.print(Panel(success_text, style="bold green", title="Success"))
     except Exception as e:
+        console.print(
+            Panel(
+                f"An error occurred during setup for {provider_name}: {e}",
+                style="bold red",
+                title="Error",
+            )
+        )
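The result object returned by `setup_credential()` is not shown in this diff. The sketch below infers a possible shape purely from the attributes read above (`success`, `error`, `is_update`, `file_path`, `email`, `tier`, `project_id`). The class name `CredentialSetupResult` and the helper `describe_result` are hypothetical and exist only to illustrate how the success and failure messages are assembled.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class CredentialSetupResult:
    """Hypothetical result shape, inferred only from the attributes read above."""

    success: bool
    file_path: Optional[str] = None
    email: Optional[str] = None
    is_update: bool = False
    error: Optional[str] = None
    tier: Optional[str] = None
    project_id: Optional[str] = None


def describe_result(result: CredentialSetupResult) -> str:
    """Build the plain-text summary that the panel would display."""
    if not result.success:
        return f"Credential setup failed: {result.error}"
    verb = "updated" if result.is_update else "created new"
    lines = [
        f"Successfully {verb} credential at '{Path(result.file_path).name}' "
        f"for user '{result.email}'."
    ]
    # Tier and project are optional extras (Google OAuth providers only).
    if result.tier:
        lines.append(f"Tier: {result.tier}")
    if result.project_id:
        lines.append(f"Project: {result.project_id}")
    return "\n".join(lines)
```

Giving the optional fields a default of `None` would make the `hasattr` checks in the diff unnecessary, although the real class may well differ.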
 async def export_gemini_cli_to_env():
     """
     Export a Gemini CLI credential JSON file to .env format.
+    Uses the auth class's list_credentials() and export_credential_to_env() methods.
     """
+    console.print(
+        Panel(
+            "[bold cyan]Export Gemini CLI Credential to .env[/bold cyan]", expand=False
+        )
+    )
+    # Get auth instance for this provider
+    provider_factory, _ = _ensure_providers_loaded()
+    auth_class = provider_factory.get_provider_auth_class("gemini_cli")
+    auth_instance = auth_class()
+    # List available credentials using auth class
+    credentials = auth_instance.list_credentials(_get_oauth_base_dir())
+    if not credentials:
+        console.print(
+            Panel(
+                "No Gemini CLI credentials found. Please add one first using 'Add OAuth Credential'.",
+                style="bold red",
+                title="No Credentials",
+            )
+        )
         return
     # Display available credentials
     cred_text = Text()
+    for i, cred_info in enumerate(credentials):
+        cred_text.append(
+            f"  {i + 1}. {Path(cred_info['file_path']).name} ({cred_info['email']})\n"
+        )
+    console.print(
+        Panel(cred_text, title="Available Gemini CLI Credentials", style="bold blue")
+    )
     choice = Prompt.ask(
+        Text.from_markup(
+            "[bold]Please select a credential to export or type [red]'b'[/red] to go back[/bold]"
+        ),
+        choices=[str(i + 1) for i in range(len(credentials))] + ["b"],
+        show_choices=False,
     )
+    if choice.lower() == "b":
         return
     try:
         choice_index = int(choice) - 1
+        if 0 <= choice_index < len(credentials):
+            cred_info = credentials[choice_index]
+            # Use auth class to export
+            env_path = auth_instance.export_credential_to_env(
+                cred_info["file_path"], _get_oauth_base_dir()
             )
+            if env_path:
+                numbered_prefix = f"GEMINI_CLI_{cred_info['number']}"
+                success_text = Text.from_markup(
+                    f"Successfully exported credential to [bold yellow]'{Path(env_path).name}'[/bold yellow]\n\n"
+                    f"[bold]Environment variable prefix:[/bold] [cyan]{numbered_prefix}_*[/cyan]\n\n"
+                    f"[bold]To use this credential:[/bold]\n"
+                    f"1. Copy the contents to your main .env file, OR\n"
+                    f"2. Source it: [bold cyan]source {Path(env_path).name}[/bold cyan] (Linux/Mac)\n"
+                    f"3. Or on Windows: [bold cyan]Get-Content {Path(env_path).name} | ForEach-Object {{ $_ -replace '^([^#].*)$', 'set $1' }} | cmd[/bold cyan]\n\n"
+                    f"[bold]To combine multiple credentials:[/bold]\n"
+                    f"Copy lines from multiple .env files into one file.\n"
+                    f"Each credential uses a unique number ({numbered_prefix}_*)."
+                )
+                console.print(Panel(success_text, style="bold green", title="Success"))
+            else:
+                console.print(
+                    Panel(
+                        "Failed to export credential", style="bold red", title="Error"
+                    )
+                )
         else:
             console.print("[bold red]Invalid choice. Please try again.[/bold red]")
     except ValueError:
+        console.print(
+            "[bold red]Invalid input. Please enter a number or 'b'.[/bold red]"
+        )
     except Exception as e:
+        console.print(
+            Panel(
+                f"An error occurred during export: {e}", style="bold red", title="Error"
+            )
+        )
 async def export_qwen_code_to_env():
     """
     Export a Qwen Code credential JSON file to .env format.
+    Uses the auth class's list_credentials() and export_credential_to_env() methods.
     """
+    console.print(
+        Panel(
+            "[bold cyan]Export Qwen Code Credential to .env[/bold cyan]", expand=False
+        )
+    )
+    # Get auth instance for this provider
+    provider_factory, _ = _ensure_providers_loaded()
+    auth_class = provider_factory.get_provider_auth_class("qwen_code")
+    auth_instance = auth_class()
+    # List available credentials using auth class
+    credentials = auth_instance.list_credentials(_get_oauth_base_dir())
+    if not credentials:
+        console.print(
+            Panel(
+                "No Qwen Code credentials found. Please add one first using 'Add OAuth Credential'.",
+                style="bold red",
+                title="No Credentials",
+            )
+        )
         return
     # Display available credentials
     cred_text = Text()
+    for i, cred_info in enumerate(credentials):
+        cred_text.append(
+            f"  {i + 1}. {Path(cred_info['file_path']).name} ({cred_info['email']})\n"
+        )
+    console.print(
+        Panel(cred_text, title="Available Qwen Code Credentials", style="bold blue")
+    )
     choice = Prompt.ask(
+        Text.from_markup(
+            "[bold]Please select a credential to export or type [red]'b'[/red] to go back[/bold]"
+        ),
+        choices=[str(i + 1) for i in range(len(credentials))] + ["b"],
+        show_choices=False,
     )
+    if choice.lower() == "b":
         return
     try:
         choice_index = int(choice) - 1
+        if 0 <= choice_index < len(credentials):
+            cred_info = credentials[choice_index]
+            # Use auth class to export
+            env_path = auth_instance.export_credential_to_env(
+                cred_info["file_path"], _get_oauth_base_dir()
             )
+            if env_path:
+                numbered_prefix = f"QWEN_CODE_{cred_info['number']}"
+                success_text = Text.from_markup(
+                    f"Successfully exported credential to [bold yellow]'{Path(env_path).name}'[/bold yellow]\n\n"
+                    f"[bold]Environment variable prefix:[/bold] [cyan]{numbered_prefix}_*[/cyan]\n\n"
+                    f"[bold]To use this credential:[/bold]\n"
+                    f"1. Copy the contents to your main .env file, OR\n"
+                    f"2. Source it: [bold cyan]source {Path(env_path).name}[/bold cyan] (Linux/Mac)\n\n"
+                    f"[bold]To combine multiple credentials:[/bold]\n"
+                    f"Copy lines from multiple .env files into one file.\n"
+                    f"Each credential uses a unique number ({numbered_prefix}_*)."
+                )
+                console.print(Panel(success_text, style="bold green", title="Success"))
+            else:
+                console.print(
+                    Panel(
+                        "Failed to export credential", style="bold red", title="Error"
+                    )
+                )
         else:
             console.print("[bold red]Invalid choice. Please try again.[/bold red]")
     except ValueError:
+        console.print(
+            "[bold red]Invalid input. Please enter a number or 'b'.[/bold red]"
+        )
     except Exception as e:
+        console.print(
+            Panel(
+                f"An error occurred during export: {e}", style="bold red", title="Error"
+            )
+        )
 async def export_iflow_to_env():
     """
     Export an iFlow credential JSON file to .env format.
+    Uses the auth class's list_credentials() and export_credential_to_env() methods.
     """
+    console.print(
+        Panel("[bold cyan]Export iFlow Credential to .env[/bold cyan]", expand=False)
+    )
+    # Get auth instance for this provider
+    provider_factory, _ = _ensure_providers_loaded()
+    auth_class = provider_factory.get_provider_auth_class("iflow")
+    auth_instance = auth_class()
+    # List available credentials using auth class
+    credentials = auth_instance.list_credentials(_get_oauth_base_dir())
+    if not credentials:
+        console.print(
+            Panel(
+                "No iFlow credentials found. Please add one first using 'Add OAuth Credential'.",
+                style="bold red",
+                title="No Credentials",
+            )
+        )
         return
     # Display available credentials
     cred_text = Text()
+    for i, cred_info in enumerate(credentials):
+        cred_text.append(
+            f"  {i + 1}. {Path(cred_info['file_path']).name} ({cred_info['email']})\n"
+        )
+    console.print(
+        Panel(cred_text, title="Available iFlow Credentials", style="bold blue")
+    )
     choice = Prompt.ask(
+        Text.from_markup(
+            "[bold]Please select a credential to export or type [red]'b'[/red] to go back[/bold]"
+        ),
+        choices=[str(i + 1) for i in range(len(credentials))] + ["b"],
+        show_choices=False,
     )
+    if choice.lower() == "b":
         return
     try:
         choice_index = int(choice) - 1
+        if 0 <= choice_index < len(credentials):
+            cred_info = credentials[choice_index]
+            # Use auth class to export
+            env_path = auth_instance.export_credential_to_env(
+                cred_info["file_path"], _get_oauth_base_dir()
             )
+            if env_path:
+                numbered_prefix = f"IFLOW_{cred_info['number']}"
+                success_text = Text.from_markup(
+                    f"Successfully exported credential to [bold yellow]'{Path(env_path).name}'[/bold yellow]\n\n"
+                    f"[bold]Environment variable prefix:[/bold] [cyan]{numbered_prefix}_*[/cyan]\n\n"
+                    f"[bold]To use this credential:[/bold]\n"
+                    f"1. Copy the contents to your main .env file, OR\n"
+                    f"2. Source it: [bold cyan]source {Path(env_path).name}[/bold cyan] (Linux/Mac)\n\n"
+                    f"[bold]To combine multiple credentials:[/bold]\n"
+                    f"Copy lines from multiple .env files into one file.\n"
+                    f"Each credential uses a unique number ({numbered_prefix}_*)."
+                )
+                console.print(Panel(success_text, style="bold green", title="Success"))
+            else:
+                console.print(
+                    Panel(
+                        "Failed to export credential", style="bold red", title="Error"
+                    )
+                )
         else:
             console.print("[bold red]Invalid choice. Please try again.[/bold red]")
     except ValueError:
+        console.print(
+            "[bold red]Invalid input. Please enter a number or 'b'.[/bold red]"
+        )
     except Exception as e:
+        console.print(
+            Panel(
+                f"An error occurred during export: {e}", style="bold red", title="Error"
+            )
+        )
 async def export_antigravity_to_env():
     """
     Export an Antigravity credential JSON file to .env format.
+    Uses the auth class's list_credentials() and export_credential_to_env() methods.
     """
+    console.print(
+        Panel(
+            "[bold cyan]Export Antigravity Credential to .env[/bold cyan]", expand=False
+        )
+    )
+    # Get auth instance for this provider
+    provider_factory, _ = _ensure_providers_loaded()
+    auth_class = provider_factory.get_provider_auth_class("antigravity")
+    auth_instance = auth_class()
+    # List available credentials using auth class
+    credentials = auth_instance.list_credentials(_get_oauth_base_dir())
+    if not credentials:
+        console.print(
+            Panel(
+                "No Antigravity credentials found. Please add one first using 'Add OAuth Credential'.",
+                style="bold red",
+                title="No Credentials",
+            )
+        )
         return
     # Display available credentials
     cred_text = Text()
+    for i, cred_info in enumerate(credentials):
+        cred_text.append(
+            f"  {i + 1}. {Path(cred_info['file_path']).name} ({cred_info['email']})\n"
+        )
+    console.print(
+        Panel(cred_text, title="Available Antigravity Credentials", style="bold blue")
+    )
     choice = Prompt.ask(
+        Text.from_markup(
+            "[bold]Please select a credential to export or type [red]'b'[/red] to go back[/bold]"
+        ),
+        choices=[str(i + 1) for i in range(len(credentials))] + ["b"],
+        show_choices=False,
     )
+    if choice.lower() == "b":
         return
     try:
         choice_index = int(choice) - 1
+        if 0 <= choice_index < len(credentials):
+            cred_info = credentials[choice_index]
+            # Use auth class to export
+            env_path = auth_instance.export_credential_to_env(
+                cred_info["file_path"], _get_oauth_base_dir()
             )
+            if env_path:
+                numbered_prefix = f"ANTIGRAVITY_{cred_info['number']}"
+                success_text = Text.from_markup(
+                    f"Successfully exported credential to [bold yellow]'{Path(env_path).name}'[/bold yellow]\n\n"
+                    f"[bold]Environment variable prefix:[/bold] [cyan]{numbered_prefix}_*[/cyan]\n\n"
+                    f"[bold]To use this credential:[/bold]\n"
+                    f"1. Copy the contents to your main .env file, OR\n"
+                    f"2. Source it: [bold cyan]source {Path(env_path).name}[/bold cyan] (Linux/Mac)\n"
+                    f"3. Or on Windows: [bold cyan]Get-Content {Path(env_path).name} | ForEach-Object {{ $_ -replace '^([^#].*)$', 'set $1' }} | cmd[/bold cyan]\n\n"
+                    f"[bold]To combine multiple credentials:[/bold]\n"
+                    f"Copy lines from multiple .env files into one file.\n"
+                    f"Each credential uses a unique number ({numbered_prefix}_*)."
+                )
+                console.print(Panel(success_text, style="bold green", title="Success"))
+            else:
+                console.print(
+                    Panel(
+                        "Failed to export credential", style="bold red", title="Error"
+                    )
+                )
         else:
             console.print("[bold red]Invalid choice. Please try again.[/bold red]")
     except ValueError:
+        console.print(
+            "[bold red]Invalid input. Please enter a number or 'b'.[/bold red]"
+        )
     except Exception as e:
+        console.print(
+            Panel(
+                f"An error occurred during export: {e}", style="bold red", title="Error"
+            )
+        )
 async def export_all_provider_credentials(provider_name: str):
     """
     Export all credentials for a specific provider to individual .env files.
+    Uses the auth class's list_credentials() and export_credential_to_env() methods.
     """
+    # Get auth instance for this provider
+    provider_factory, _ = _ensure_providers_loaded()
+    try:
+        auth_class = provider_factory.get_provider_auth_class(provider_name)
+        auth_instance = auth_class()
+    except Exception:
         console.print(f"[bold red]Unknown provider: {provider_name}[/bold red]")
         return
+    display_name = provider_name.replace("_", " ").title()
+    console.print(
+        Panel(
+            f"[bold cyan]Export All {display_name} Credentials[/bold cyan]",
+            expand=False,
+        )
+    )
+    # List all credentials using auth class
+    credentials = auth_instance.list_credentials(_get_oauth_base_dir())
+    if not credentials:
+        console.print(
+            Panel(
+                f"No {display_name} credentials found.",
+                style="bold red",
+                title="No Credentials",
+            )
+        )
         return
     exported_count = 0
+    for cred_info in credentials:
         try:
+            # Use auth class to export
+            env_path = auth_instance.export_credential_to_env(
+                cred_info["file_path"], _get_oauth_base_dir()
+            )
+            if env_path:
+                console.print(
+                    f"  ✓ Exported [cyan]{Path(cred_info['file_path']).name}[/cyan] → [yellow]{Path(env_path).name}[/yellow]"
+                )
+                exported_count += 1
+            else:
+                console.print(
+                    f"  ✗ Failed to export {Path(cred_info['file_path']).name}"
+                )
         except Exception as e:
+            console.print(
+                f"  ✗ Failed to export {Path(cred_info['file_path']).name}: {e}"
+            )
+    console.print(
+        Panel(
+            f"Successfully exported {exported_count}/{len(credentials)} {display_name} credentials to individual .env files.",
+            style="bold green",
+            title="Export Complete",
+        )
+    )
 async def combine_provider_credentials(provider_name: str):
     """
     Combine all credentials for a specific provider into a single .env file.
+    Uses the auth class's list_credentials() and build_env_lines() methods.
     """
+    # Get auth instance for this provider
+    provider_factory, _ = _ensure_providers_loaded()
+    try:
+        auth_class = provider_factory.get_provider_auth_class(provider_name)
+        auth_instance = auth_class()
+    except Exception:
         console.print(f"[bold red]Unknown provider: {provider_name}[/bold red]")
         return
+    display_name = provider_name.replace("_", " ").title()
+    console.print(
+        Panel(
+            f"[bold cyan]Combine All {display_name} Credentials[/bold cyan]",
+            expand=False,
+        )
+    )
+    # List all credentials using auth class
+    credentials = auth_instance.list_credentials(_get_oauth_base_dir())
+    if not credentials:
+        console.print(
+            Panel(
+                f"No {display_name} credentials found.",
+                style="bold red",
+                title="No Credentials",
+            )
+        )
         return
     combined_lines = [
         f"# Combined {display_name} Credentials",
         f"# Generated at: {time.strftime('%Y-%m-%d %H:%M:%S')}",
+        f"# Total credentials: {len(credentials)}",
         "#",
         "# Copy all lines below into your main .env file",
         "",
     ]
     combined_count = 0
+    for cred_info in credentials:
         try:
+            # Load credential file
+            with open(cred_info["file_path"], "r") as f:
                 creds = json.load(f)
+            # Use auth class to build env lines
+            env_lines = auth_instance.build_env_lines(creds, cred_info["number"])
             combined_lines.extend(env_lines)
             combined_lines.append("")  # Blank line between credentials
             combined_count += 1
         except Exception as e:
+            console.print(
+                f"  ✗ Failed to process {Path(cred_info['file_path']).name}: {e}"
+            )
     # Write combined file
     combined_filename = f"{provider_name}_all_combined.env"
+    combined_filepath = _get_oauth_base_dir() / combined_filename
+    with open(combined_filepath, "w") as f:
+        f.write("\n".join(combined_lines))
+    console.print(
+        Panel(
+            Text.from_markup(
+                f"Successfully combined {combined_count} {display_name} credentials into:\n"
+                f"[bold yellow]{combined_filepath}[/bold yellow]\n\n"
+                f"[bold]To use:[/bold] Copy the contents into your main .env file."
+            ),
+            style="bold green",
+            title="Combine Complete",
+        )
+    )
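The combine step above relies on two auth-class methods whose bodies are not part of this hunk. The following is a minimal sketch of that delegation pattern, assuming `list_credentials()` returns dicts with `file_path` and `number` keys (as the loop above reads them) and `build_env_lines()` returns a list of `KEY=value` strings. `DemoAuth`, its file-naming scheme, and the variable names it emits are hypothetical and are not the project's real provider classes.

```python
import json
import tempfile
from pathlib import Path


class DemoAuth:
    """Hypothetical stand-in for a provider auth class; not the real API."""

    ENV_PREFIX = "DEMO"

    def list_credentials(self, base_dir: Path) -> list:
        # Assumed file naming: demo_oauth_<number>.json
        found = []
        for path in sorted(base_dir.glob("demo_oauth_*.json")):
            number = int(path.stem.rsplit("_", 1)[-1])
            found.append({"file_path": str(path), "number": number})
        return found

    def build_env_lines(self, creds: dict, number: int) -> list:
        # Assumed variable names; the real providers may emit different keys.
        prefix = f"{self.ENV_PREFIX}_{number}"
        return [
            f"{prefix}_ACCESS_TOKEN={creds['access_token']}",
            f"{prefix}_REFRESH_TOKEN={creds['refresh_token']}",
        ]


def combine(auth, base_dir: Path) -> list:
    """Collect env lines for every credential, with a blank line after each."""
    lines = []
    for info in auth.list_credentials(base_dir):
        with open(info["file_path"], "r", encoding="utf-8") as f:
            creds = json.load(f)
        lines.extend(auth.build_env_lines(creds, info["number"]))
        lines.append("")
    return lines


# Usage: write two fake credential files, then combine them.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    for n in (1, 2):
        payload = {"access_token": f"at{n}", "refresh_token": f"rt{n}"}
        (base / f"demo_oauth_{n}.json").write_text(
            json.dumps(payload), encoding="utf-8"
        )
    combined = combine(DemoAuth(), base)
    print("\n".join(combined))
```

Because the caller only touches `list_credentials()` and `build_env_lines()`, the same loop can serve every provider, which appears to be the point of moving this logic into the auth classes.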
 async def combine_all_credentials():
     """
     Combine ALL credentials from ALL providers into a single .env file.
+    Uses auth class list_credentials() and build_env_lines() methods.
     """
+    console.print(
+        Panel("[bold cyan]Combine All Provider Credentials[/bold cyan]", expand=False)
+    )
+    # List of providers that support OAuth credentials
+    oauth_providers = ["gemini_cli", "qwen_code", "iflow", "antigravity"]
+    provider_factory, _ = _ensure_providers_loaded()
     combined_lines = [
         "# Combined All Provider Credentials",
         f"# Generated at: {time.strftime('%Y-%m-%d %H:%M:%S')}",
         "# Copy all lines below into your main .env file",
         "",
     ]
     total_count = 0
     provider_counts = {}
+    for provider_name in oauth_providers:
+        try:
+            auth_class = provider_factory.get_provider_auth_class(provider_name)
+            auth_instance = auth_class()
+        except Exception:
+            continue  # Skip providers that don't have auth classes
+        credentials = auth_instance.list_credentials(_get_oauth_base_dir())
+        if not credentials:
             continue
+        display_name = provider_name.replace("_", " ").title()
         combined_lines.append(f"# ===== {display_name} Credentials =====")
         combined_lines.append("")
         provider_count = 0
+        for cred_info in credentials:
             try:
+                # Load credential file
+                with open(cred_info["file_path"], "r") as f:
                     creds = json.load(f)
+                # Use auth class to build env lines
+                env_lines = auth_instance.build_env_lines(creds, cred_info["number"])
                 combined_lines.extend(env_lines)
                 combined_lines.append("")
                 provider_count += 1
                 total_count += 1
             except Exception as e:
+                console.print(
+                    f"  ✗ Failed to process {Path(cred_info['file_path']).name}: {e}"
+                )
         provider_counts[display_name] = provider_count
     if total_count == 0:
+        console.print(
+            Panel(
+                "No credentials found to combine.",
+                style="bold red",
+                title="No Credentials",
+            )
+        )
         return
     # Write combined file
     combined_filename = "all_providers_combined.env"
+    combined_filepath = _get_oauth_base_dir() / combined_filename
+    with open(combined_filepath, "w") as f:
+        f.write("\n".join(combined_lines))
     # Build summary
+    summary_lines = [
+        f"  • {name}: {count} credential(s)" for name, count in provider_counts.items()
+    ]
     summary = "\n".join(summary_lines)
+    console.print(
+        Panel(
+            Text.from_markup(
+                f"Successfully combined {total_count} credentials from {len(provider_counts)} providers:\n"
+                f"{summary}\n\n"
+                f"[bold]Output file:[/bold] [yellow]{combined_filepath}[/yellow]\n\n"
+                f"[bold]To use:[/bold] Copy the contents into your main .env file."
+            ),
+            style="bold green",
+            title="Combine Complete",
+        )
+    )
 async def export_credentials_submenu():
     """
     while True:
         clear_screen()
+        console.print(
+            Panel(
+                "[bold cyan]Export Credentials to .env[/bold cyan]",
+                title="--- API Key Proxy ---",
+                expand=False,
+            )
+        )
+        console.print(
+            Panel(
+                Text.from_markup(
+                    "[bold]Individual Exports:[/bold]\n"
+                    "1. Export Gemini CLI credential\n"
+                    "2. Export Qwen Code credential\n"
+                    "3. Export iFlow credential\n"
+                    "4. Export Antigravity credential\n"
+                    "\n"
+                    "[bold]Bulk Exports (per provider):[/bold]\n"
+                    "5. Export ALL Gemini CLI credentials\n"
+                    "6. Export ALL Qwen Code credentials\n"
+                    "7. Export ALL iFlow credentials\n"
+                    "8. Export ALL Antigravity credentials\n"
+                    "\n"
+                    "[bold]Combine Credentials:[/bold]\n"
+                    "9. Combine all Gemini CLI into one file\n"
+                    "10. Combine all Qwen Code into one file\n"
+                    "11. Combine all iFlow into one file\n"
+                    "12. Combine all Antigravity into one file\n"
+                    "13. Combine ALL providers into one file"
+                ),
+                title="Choose export option",
+                style="bold blue",
+            )
+        )
         export_choice = Prompt.ask(
+            Text.from_markup(
+                "[bold]Please select an option or type [red]'b'[/red] to go back[/bold]"
+            ),
+            choices=[
+                "1",
+                "2",
+                "3",
+                "4",
+                "5",
+                "6",
+                "7",
+                "8",
+                "9",
+                "10",
+                "11",
+                "12",
+                "13",
+                "b",
+            ],
+            show_choices=False,
         )
+        if export_choice.lower() == "b":
             break
         # Individual exports
 async def main(clear_on_start=True):
     """
     An interactive CLI tool to add new credentials.
     Args:
+        clear_on_start: If False, skip initial screen clear (used when called from launcher
                        to preserve the loading screen)
     """
     ensure_env_defaults()
     # Only show header if we're clearing (standalone mode)
     if clear_on_start:
+        console.print(
+            Panel(
+                "[bold cyan]Interactive Credential Setup[/bold cyan]",
+                title="--- API Key Proxy ---",
+                expand=False,
+            )
+        )
     while True:
         # Clear screen between menu selections for cleaner UX
         clear_screen()
+        console.print(
+            Panel(
+                "[bold cyan]Interactive Credential Setup[/bold cyan]",
+                title="--- API Key Proxy ---",
+                expand=False,
+            )
+        )
+        console.print(
+            Panel(
+                Text.from_markup(
+                    "1. Add OAuth Credential\n2. Add API Key\n3. Export Credentials"
+                ),
+                title="Choose credential type",
+                style="bold blue",
+            )
+        )
         setup_type = Prompt.ask(
+            Text.from_markup(
+                "[bold]Please select an option or type [red]'q'[/red] to quit[/bold]"
+            ),
             choices=["1", "2", "3", "q"],
+            show_choices=False,
         )
+        if setup_type.lower() == "q":
             break
         if setup_type == "1":
                 "iflow": "iFlow (OAuth - also supports API keys)",
                 "antigravity": "Antigravity (OAuth)",
             }
             provider_text = Text()
             for i, provider in enumerate(available_providers):
+                display_name = oauth_friendly_names.get(
+                    provider, provider.replace("_", " ").title()
+                )
                 provider_text.append(f"  {i + 1}. {display_name}\n")
+            console.print(
+                Panel(
+                    provider_text,
+                    title="Available Providers for OAuth",
+                    style="bold blue",
+                )
+            )
             choice = Prompt.ask(
+                Text.from_markup(
+                    "[bold]Please select a provider or type [red]'b'[/red] to go back[/bold]"
+                ),
                 choices=[str(i + 1) for i in range(len(available_providers))] + ["b"],
+                show_choices=False,
             )
+            if choice.lower() == "b":
                 continue
             try:
                 choice_index = int(choice) - 1
                 if 0 <= choice_index < len(available_providers):
                     provider_name = available_providers[choice_index]
+                    display_name = oauth_friendly_names.get(
+                        provider_name, provider_name.replace("_", " ").title()
+                    )
+                    console.print(
+                        f"\nStarting OAuth setup for [bold cyan]{display_name}[/bold cyan]..."
+                    )
                     await setup_new_credential(provider_name)
                     # Don't clear after OAuth - user needs to see full flow
                     console.print("\n[dim]Press Enter to return to main menu...[/dim]")
                     input()
                 else:
+                    console.print(
+                        "[bold red]Invalid choice. Please try again.[/bold red]"
+                    )
                     await asyncio.sleep(1.5)
             except ValueError:
+                console.print(
+                    "[bold red]Invalid input. Please enter a number or 'b'.[/bold red]"
+                )
                 await asyncio.sleep(1.5)
         elif setup_type == "2":
             await setup_api_key()
+            # console.print("\n[dim]Press Enter to return to main menu...[/dim]")
+            # input()
         elif setup_type == "3":
             await export_credentials_submenu()
 def run_credential_tool(from_launcher=False):
     """
     Entry point for credential tool.
     Args:
         from_launcher: If True, skip loading screen (launcher already showed it)
     """
     # Check if we need to show loading screen
     if not from_launcher:
         # Standalone mode - show full loading UI
+        os.system("cls" if os.name == "nt" else "clear")
         _start_time = time.time()
         # Phase 1: Show initial message
         print("━" * 70)
         print("Interactive Credential Setup Tool")
         print("GitHub: https://github.com/Mirrowel/LLM-API-Key-Proxy")
         print("━" * 70)
         print("Loading credential management components...")
         # Phase 2: Load dependencies with spinner
         with console.status("Loading authentication providers...", spinner="dots"):
             _ensure_providers_loaded()
         with console.status("Initializing credential tool...", spinner="dots"):
             time.sleep(0.2)  # Brief pause for UI consistency
         console.print("✓ Credential tool initialized")
         _elapsed = time.time() - _start_time
         _, PROVIDER_PLUGINS = _ensure_providers_loaded()
+        print(
+            f"✓ Tool ready in {_elapsed:.2f}s ({len(PROVIDER_PLUGINS)} providers available)"
+        )
         # Small delay to let user see the ready message
         time.sleep(0.5)
     # Run the main async event loop
     # If from launcher, don't clear screen at start to preserve loading messages
     try:

src/rotator_library/failure_logger.py CHANGED

@@ -1,47 +1,93 @@
 import logging
 import json
 from logging.handlers import RotatingFileHandler
-import os
 from datetime import datetime
 from .error_handler import mask_credential
-def setup_failure_logger():
-    """Sets up a dedicated JSON logger for writing detailed failure logs to a file."""
-    log_dir = "logs"
-    if not os.path.exists(log_dir):
-        os.makedirs(log_dir)
-    # Create a logger specifically for failures.
-    # This logger will NOT propagate to the root logger.
     logger = logging.getLogger("failure_logger")
     logger.setLevel(logging.INFO)
     logger.propagate = False
-    # Use a rotating file handler
-    handler = RotatingFileHandler(
-        os.path.join(log_dir, "failures.log"),
-        maxBytes=5 * 1024 * 1024,  # 5 MB
-        backupCount=2,
-    )
-    # Custom JSON formatter for structured logs
-    class JsonFormatter(logging.Formatter):
-        def format(self, record):
-            # The message is already a dict, so we just format it as a JSON string
-            return json.dumps(record.msg)
-    handler.setFormatter(JsonFormatter())
-    # Add handler only if it hasn't been added before
-    if not logger.handlers:
         logger.addHandler(handler)
     return logger
-# Initialize the dedicated logger for detailed failure logs
-failure_logger = setup_failure_logger()
 # Get the main library logger for concise, propagated messages
 main_lib_logger = logging.getLogger("rotator_library")
@@ -52,10 +98,27 @@ def _extract_response_body(error: Exception) -> str:
     Extract the full response body from various error types.
     Handles:
     - httpx.HTTPStatusError: response.text or response.content
     - litellm exceptions: various response attributes
     - Other exceptions: str(error)
     """
     # Try to get response body from httpx errors
     if hasattr(error, "response") and error.response is not None:
         response = error.response
@@ -145,11 +208,19 @@ def log_failure(
         "request_headers": request_headers,
         "error_chain": error_chain if len(error_chain) > 1 else None,
     }
-    failure_logger.error(detailed_log_data)
     # 2. Log a concise summary to the main library logger, which will propagate
     summary_message = (
         f"API call failed for model {model} with key {mask_credential(api_key)}. "
         f"Error: {type(error).__name__}. See failures.log for details."
     )
     main_lib_logger.error(summary_message)

 import logging
 import json
 from logging.handlers import RotatingFileHandler
+from pathlib import Path
 from datetime import datetime
+from typing import Optional, Union
 from .error_handler import mask_credential
+from .utils.paths import get_logs_dir
+class JsonFormatter(logging.Formatter):
+    """Custom JSON formatter for structured logs."""
+    def format(self, record):
+        # The message is already a dict, so we just format it as a JSON string
+        return json.dumps(record.msg)
+# Module-level state for lazy initialization
+_failure_logger: Optional[logging.Logger] = None
+_configured_logs_dir: Optional[Path] = None
+def configure_failure_logger(logs_dir: Optional[Union[Path, str]] = None) -> None:
+    """
+    Configure the failure logger to use a specific logs directory.
+    Call this before first use if you want to override the default location.
+    If not called, the logger will use get_logs_dir() on first use.
+    Args:
+        logs_dir: Path to the logs directory. If None, uses get_logs_dir().
+    """
+    global _configured_logs_dir, _failure_logger
+    _configured_logs_dir = Path(logs_dir) if logs_dir else None
+    # Reset logger so it gets reconfigured on next use
+    _failure_logger = None
+def _setup_failure_logger(logs_dir: Path) -> logging.Logger:
+    """
+    Sets up a dedicated JSON logger for writing detailed failure logs to a file.
+    Args:
+        logs_dir: Path to the logs directory.
+    Returns:
+        Configured logger instance.
+    """
     logger = logging.getLogger("failure_logger")
     logger.setLevel(logging.INFO)
     logger.propagate = False
+    # Clear existing handlers to prevent duplicates on re-setup
+    logger.handlers.clear()
+    try:
+        logs_dir.mkdir(parents=True, exist_ok=True)
+        handler = RotatingFileHandler(
+            logs_dir / "failures.log",
+            maxBytes=5 * 1024 * 1024,  # 5 MB
+            backupCount=2,
+        )
+        handler.setFormatter(JsonFormatter())
         logger.addHandler(handler)
+    except OSError as e:  # PermissionError and IOError are covered by OSError
+        logging.warning(f"Cannot create failure log file handler: {e}")
+        # Add NullHandler to prevent "no handlers" warning
+        logger.addHandler(logging.NullHandler())
     return logger
+def get_failure_logger() -> logging.Logger:
+    """
+    Get the failure logger, initializing it lazily if needed.
+    Returns:
+        The configured failure logger.
+    """
+    global _failure_logger, _configured_logs_dir
+    if _failure_logger is None:
+        logs_dir = _configured_logs_dir if _configured_logs_dir else get_logs_dir()
+        _failure_logger = _setup_failure_logger(logs_dir)
+    return _failure_logger
 # Get the main library logger for concise, propagated messages
 main_lib_logger = logging.getLogger("rotator_library")
     Extract the full response body from various error types.
     Handles:
+    - StreamedAPIError: wraps original exception in .data attribute
     - httpx.HTTPStatusError: response.text or response.content
     - litellm exceptions: various response attributes
     - Other exceptions: str(error)
     """
+    # Handle StreamedAPIError which wraps the original exception in .data
+    # This is used by our streaming wrapper when catching provider errors
+    if hasattr(error, "data") and error.data is not None:
+        inner = error.data
+        # If data is a dict (parsed JSON error), return it as JSON
+        if isinstance(inner, dict):
+            try:
+                return json.dumps(inner, indent=2)
+            except Exception:
+                return str(inner)
+        # If data is an exception, recurse to extract from it
+        if isinstance(inner, Exception):
+            result = _extract_response_body(inner)
+            if result:
+                return result
     # Try to get response body from httpx errors
     if hasattr(error, "response") and error.response is not None:
         response = error.response
         "request_headers": request_headers,
         "error_chain": error_chain if len(error_chain) > 1 else None,
     }
     # 2. Log a concise summary to the main library logger, which will propagate
     summary_message = (
         f"API call failed for model {model} with key {mask_credential(api_key)}. "
         f"Error: {type(error).__name__}. See failures.log for details."
     )
+    # Log to failure logger with resilience - if it fails, just continue
+    try:
+        get_failure_logger().error(detailed_log_data)
+    except OSError as e:  # IOError is an alias of OSError
+        # Log file write failed - log to console instead
+        logging.warning(f"Failed to write to failures.log: {e}")
+    # Console log always succeeds
     main_lib_logger.error(summary_message)
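
The lazy-initialization pattern introduced in `failure_logger.py` above can be tried in isolation with the sketch below. It is a minimal standalone illustration under stated assumptions, not the module itself: the function names `configure` and `get_logger`, the logger name `failure_logger_sketch`, and the temp-directory default (standing in for `get_logs_dir()`) are hypothetical. One deliberate difference: the sketch closes old handlers before removing them, whereas the diff calls `logger.handlers.clear()`.

```python
import json
import logging
import tempfile
from logging.handlers import RotatingFileHandler
from pathlib import Path
from typing import Optional, Union


class JsonFormatter(logging.Formatter):
    """Serialize the record's message (expected to be a dict) as JSON."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps(record.msg)


# Module-level state: nothing touches the filesystem at import time.
_logger: Optional[logging.Logger] = None
_logs_dir: Optional[Path] = None


def configure(logs_dir: Optional[Union[Path, str]] = None) -> None:
    """Override the logs directory and force re-setup on next use."""
    global _logger, _logs_dir
    _logs_dir = Path(logs_dir) if logs_dir else None
    _logger = None


def get_logger() -> logging.Logger:
    """Return the logger, creating its file handler on first use."""
    global _logger
    if _logger is not None:
        return _logger

    # Assumption: the temp dir stands in for the project's get_logs_dir().
    logs_dir = _logs_dir or Path(tempfile.gettempdir()) / "sketch_logs"
    logger = logging.getLogger("failure_logger_sketch")
    logger.setLevel(logging.INFO)
    logger.propagate = False

    # Close and drop old handlers so re-setup does not duplicate output.
    for old in list(logger.handlers):
        old.close()
        logger.removeHandler(old)

    try:
        logs_dir.mkdir(parents=True, exist_ok=True)
        handler = RotatingFileHandler(
            logs_dir / "failures.log", maxBytes=5 * 1024 * 1024, backupCount=2
        )
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    except OSError as exc:
        # Degrade gracefully: logging calls become no-ops instead of raising.
        logging.warning("Cannot create failure log file handler: %s", exc)
        logger.addHandler(logging.NullHandler())

    _logger = logger
    return logger
```

Calling `configure(some_dir)` before the first `get_logger()` redirects output; calling it later resets the cached logger so the next `get_logger()` rebuilds the handler against the new directory.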

src/rotator_library/providers/antigravity_auth_base.py CHANGED

@@ -1,16 +1,36 @@
 # src/rotator_library/providers/antigravity_auth_base.py
 from .google_oauth_base import GoogleOAuthBase
 class AntigravityAuthBase(GoogleOAuthBase):
     """
     Antigravity OAuth2 authentication implementation.
     Inherits all OAuth functionality from GoogleOAuthBase with Antigravity-specific configuration.
     Uses Antigravity's OAuth credentials and includes additional scopes for cclog and experimentsandconfigs.
     """
-    CLIENT_ID = "1071006060591-tmhssin2h21lcre235vtolojh4g403ep.apps.googleusercontent.com"
     CLIENT_SECRET = "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"
     OAUTH_SCOPES = [
         "https://www.googleapis.com/auth/cloud-platform",
@@ -22,3 +42,600 @@ class AntigravityAuthBase(GoogleOAuthBase):
     ENV_PREFIX = "ANTIGRAVITY"
     CALLBACK_PORT = 51121
     CALLBACK_PATH = "/oauthcallback"

 # src/rotator_library/providers/antigravity_auth_base.py
+import asyncio
+import json
+import logging
+import os
+from pathlib import Path
+from typing import Any, Dict, Optional, List
+import httpx
 from .google_oauth_base import GoogleOAuthBase
+lib_logger = logging.getLogger("rotator_library")
+# Code Assist endpoint for project discovery
+CODE_ASSIST_ENDPOINT = "https://cloudcode-pa.googleapis.com/v1internal"
 class AntigravityAuthBase(GoogleOAuthBase):
     """
     Antigravity OAuth2 authentication implementation.
     Inherits all OAuth functionality from GoogleOAuthBase with Antigravity-specific configuration.
     Uses Antigravity's OAuth credentials and includes additional scopes for cclog and experimentsandconfigs.
+    Also provides project/tier discovery functionality that runs during authentication,
+    ensuring credentials have their tier and project_id cached before any API requests.
     """
+    CLIENT_ID = (
+        "1071006060591-tmhssin2h21lcre235vtolojh4g403ep.apps.googleusercontent.com"
+    )
     CLIENT_SECRET = "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"
     OAUTH_SCOPES = [
         "https://www.googleapis.com/auth/cloud-platform",
     ENV_PREFIX = "ANTIGRAVITY"
     CALLBACK_PORT = 51121
     CALLBACK_PATH = "/oauthcallback"
+    def __init__(self):
+        super().__init__()
+        # Project and tier caches - shared between auth base and provider
+        self.project_id_cache: Dict[str, str] = {}
+        self.project_tier_cache: Dict[str, str] = {}
+    # =========================================================================
+    # POST-AUTH DISCOVERY HOOK
+    # =========================================================================
+    async def _post_auth_discovery(
+        self, credential_path: str, access_token: str
+    ) -> None:
+        """
+        Discover and cache tier/project information immediately after OAuth authentication.
+        This is called by GoogleOAuthBase._perform_interactive_oauth() after successful auth,
+        ensuring tier and project_id are cached during the authentication flow rather than
+        waiting for the first API request.
+        Args:
+            credential_path: Path to the credential file
+            access_token: The newly obtained access token
+        """
+        lib_logger.debug(
+            f"Starting post-auth discovery for Antigravity credential: {Path(credential_path).name}"
+        )
+        # Skip if already discovered (shouldn't happen during fresh auth, but be defensive)
+        if (
+            credential_path in self.project_id_cache
+            and credential_path in self.project_tier_cache
+        ):
+            lib_logger.debug(
+                f"Tier and project already cached for {Path(credential_path).name}, skipping discovery"
+            )
+            return
+        # Call _discover_project_id which handles tier/project discovery and persistence
+        # Pass empty litellm_params since we're in auth context (no model-specific overrides)
+        project_id = await self._discover_project_id(
+            credential_path, access_token, litellm_params={}
+        )
+        tier = self.project_tier_cache.get(credential_path, "unknown")
+        lib_logger.info(
+            f"Post-auth discovery complete for {Path(credential_path).name}: "
+            f"tier={tier}, project={project_id}"
+        )
+    # =========================================================================
+    # PROJECT ID DISCOVERY
+    # =========================================================================
+    async def _discover_project_id(
+        self, credential_path: str, access_token: str, litellm_params: Dict[str, Any]
+    ) -> str:
+        """
+        Discovers the Google Cloud Project ID, with caching and onboarding for new accounts.
+        This follows the official Gemini CLI discovery flow adapted for Antigravity:
+        1. Check in-memory cache
+        2. Check configured project_id override (litellm_params or env var)
+        3. Check persisted project_id in credential file
+        4. Call loadCodeAssist to check if user is already known (has currentTier)
+           - If currentTier exists AND cloudaicompanionProject returned: use server's project
+           - If no currentTier: user needs onboarding
+        5. Onboard user (FREE tier: pass cloudaicompanionProject=None for server-managed)
+        6. Fallback to GCP Resource Manager project listing
+        Note: Unlike GeminiCli, Antigravity doesn't use tier-based credential prioritization,
+        but we still cache tier info for debugging and consistency.
+        """
+        lib_logger.debug(
+            f"Starting Antigravity project discovery for credential: {credential_path}"
+        )
+        # Check in-memory cache first
+        if credential_path in self.project_id_cache:
+            cached_project = self.project_id_cache[credential_path]
+            lib_logger.debug(f"Using cached project ID: {cached_project}")
+            return cached_project
+        # Check for configured project ID override (from litellm_params or env var)
+        configured_project_id = (
+            litellm_params.get("project_id")
+            or os.getenv("ANTIGRAVITY_PROJECT_ID")
+            or os.getenv("GOOGLE_CLOUD_PROJECT")
+        )
+        if configured_project_id:
+            lib_logger.debug(
+                f"Found configured project_id override: {configured_project_id}"
+            )
+        # Load credentials from file to check for persisted project_id and tier
+        # Skip for env:// paths (environment-based credentials don't persist to files)
+        credential_index = self._parse_env_credential_path(credential_path)
+        if credential_index is None:
+            # Only try to load from file if it's not an env:// path
+            try:
+                with open(credential_path, "r") as f:
+                    creds = json.load(f)
+                metadata = creds.get("_proxy_metadata", {})
+                persisted_project_id = metadata.get("project_id")
+                persisted_tier = metadata.get("tier")
+                if persisted_project_id:
+                    lib_logger.info(
+                        f"Loaded persisted project ID from credential file: {persisted_project_id}"
+                    )
+                    self.project_id_cache[credential_path] = persisted_project_id
+                    # Also load tier if available (for debugging/logging purposes)
+                    if persisted_tier:
+                        self.project_tier_cache[credential_path] = persisted_tier
+                        lib_logger.debug(f"Loaded persisted tier: {persisted_tier}")
+                    return persisted_project_id
+            except (FileNotFoundError, json.JSONDecodeError, KeyError) as e:
+                lib_logger.debug(f"Could not load persisted project ID from file: {e}")
+        lib_logger.debug(
+            "No cached or persisted project ID found, initiating discovery..."
+        )
+        headers = {
+            "Authorization": f"Bearer {access_token}",
+            "Content-Type": "application/json",
+        }
+        discovered_project_id = None
+        discovered_tier = None
+        async with httpx.AsyncClient() as client:
+            # 1. Try discovery endpoint with loadCodeAssist
+            lib_logger.debug(
+                "Attempting project discovery via Code Assist loadCodeAssist endpoint..."
+            )
+            try:
+                # Build metadata - include duetProject only if we have a configured project
+                core_client_metadata = {
+                    "ideType": "IDE_UNSPECIFIED",
+                    "platform": "PLATFORM_UNSPECIFIED",
+                    "pluginType": "GEMINI",
+                }
+                if configured_project_id:
+                    core_client_metadata["duetProject"] = configured_project_id
+                # Build load request - pass configured_project_id if available, otherwise None
+                load_request = {
+                    "cloudaicompanionProject": configured_project_id,  # Can be None
+                    "metadata": core_client_metadata,
+                }
+                lib_logger.debug(
+                    f"Sending loadCodeAssist request with cloudaicompanionProject={configured_project_id}"
+                )
+                response = await client.post(
+                    f"{CODE_ASSIST_ENDPOINT}:loadCodeAssist",
+                    headers=headers,
+                    json=load_request,
+                    timeout=20,
+                )
+                response.raise_for_status()
+                data = response.json()
+                # Log full response for debugging
+                lib_logger.debug(
+                    f"loadCodeAssist full response keys: {list(data.keys())}"
+                )
+                # Extract tier information
+                allowed_tiers = data.get("allowedTiers", [])
+                current_tier = data.get("currentTier")
+                lib_logger.debug("=== Tier Information ===")
+                lib_logger.debug(f"currentTier: {current_tier}")
+                lib_logger.debug(f"allowedTiers count: {len(allowed_tiers)}")
+                for i, tier in enumerate(allowed_tiers):
+                    tier_id = tier.get("id", "unknown")
+                    is_default = tier.get("isDefault", False)
+                    user_defined = tier.get("userDefinedCloudaicompanionProject", False)
+                    lib_logger.debug(
+                        f"  Tier {i + 1}: id={tier_id}, isDefault={is_default}, userDefinedProject={user_defined}"
+                    )
+                lib_logger.debug("========================")
+                # Determine the current tier ID
+                current_tier_id = None
+                if current_tier:
+                    current_tier_id = current_tier.get("id")
+                    lib_logger.debug(f"User has currentTier: {current_tier_id}")
+                # Check if user is already known to server (has currentTier)
+                if current_tier_id:
+                    # User is already onboarded - check for project from server
+                    server_project = data.get("cloudaicompanionProject")
+                    # Check if this tier requires user-defined project (paid tiers)
+                    requires_user_project = any(
+                        t.get("id") == current_tier_id
+                        and t.get("userDefinedCloudaicompanionProject", False)
+                        for t in allowed_tiers
+                    )
+                    is_free_tier = current_tier_id == "free-tier"
+                    if server_project:
+                        # Server returned a project - use it (server wins)
+                        project_id = server_project
+                        lib_logger.debug(f"Server returned project: {project_id}")
+                    elif configured_project_id:
+                        # No server project but we have configured one - use it
+                        project_id = configured_project_id
+                        lib_logger.debug(
+                            f"No server project, using configured: {project_id}"
+                        )
+                    elif is_free_tier:
+                        # Free tier user without server project - try onboarding
+                        lib_logger.debug(
+                            "Free tier user with currentTier but no project - will try onboarding"
+                        )
+                        project_id = None
+                    elif requires_user_project:
+                        # Paid tier requires a project ID to be set
+                        raise ValueError(
+                            f"Paid tier '{current_tier_id}' requires setting ANTIGRAVITY_PROJECT_ID environment variable."
+                        )
+                    else:
+                        # Unknown tier without project - proceed to onboarding
+                        lib_logger.warning(
+                            f"Tier '{current_tier_id}' has no project and none configured - will try onboarding"
+                        )
+                        project_id = None
+                    if project_id:
+                        # Cache tier info
+                        self.project_tier_cache[credential_path] = current_tier_id
+                        discovered_tier = current_tier_id
+                        # Log appropriately based on tier
+                        is_paid = current_tier_id and current_tier_id not in [
+                            "free-tier",
+                            "legacy-tier",
+                            "unknown",
+                        ]
+                        if is_paid:
+                            lib_logger.info(
+                                f"Using Antigravity paid tier '{current_tier_id}' with project: {project_id}"
+                            )
+                        else:
+                            lib_logger.info(
+                                f"Discovered Antigravity project ID via loadCodeAssist: {project_id}"
+                            )
+                        self.project_id_cache[credential_path] = project_id
+                        discovered_project_id = project_id
+                        # Persist to credential file
+                        await self._persist_project_metadata(
+                            credential_path, project_id, discovered_tier
+                        )
+                        return project_id
+                # 2. User needs onboarding - no currentTier or no project found
+                lib_logger.info(
+                    "No usable Antigravity project found (no currentTier or no server project), attempting to onboard user..."
+                )
+                # Determine which tier to onboard with
+                onboard_tier = None
+                for tier in allowed_tiers:
+                    if tier.get("isDefault"):
+                        onboard_tier = tier
+                        break
+                # Fallback to legacy tier if no default
+                if not onboard_tier and allowed_tiers:
+                    for tier in allowed_tiers:
+                        if tier.get("id") == "legacy-tier":
+                            onboard_tier = tier
+                            break
+                    if not onboard_tier:
+                        onboard_tier = allowed_tiers[0]
+                if not onboard_tier:
+                    raise ValueError("No onboarding tiers available from server")
+                tier_id = onboard_tier.get("id", "free-tier")
+                requires_user_project = onboard_tier.get(
+                    "userDefinedCloudaicompanionProject", False
+                )
+                lib_logger.debug(
+                    f"Onboarding with tier: {tier_id}, requiresUserProject: {requires_user_project}"
+                )
+                # Build onboard request based on tier type
+                # FREE tier: cloudaicompanionProject = None (server-managed)
+                # PAID tier: cloudaicompanionProject = configured_project_id
+                is_free_tier = tier_id == "free-tier"
+                if is_free_tier:
+                    # Free tier uses server-managed project
+                    onboard_request = {
+                        "tierId": tier_id,
+                        "cloudaicompanionProject": None,  # Server will create/manage
+                        "metadata": core_client_metadata,
+                    }
+                    lib_logger.debug(
+                        "Free tier onboarding: using server-managed project"
+                    )
+                else:
+                    # Paid/legacy tier requires user-provided project
+                    if not configured_project_id and requires_user_project:
+                        raise ValueError(
+                            f"Tier '{tier_id}' requires setting ANTIGRAVITY_PROJECT_ID environment variable."
+                        )
+                    onboard_request = {
+                        "tierId": tier_id,
+                        "cloudaicompanionProject": configured_project_id,
+                        "metadata": {
+                            **core_client_metadata,
+                            "duetProject": configured_project_id,
+                        }
+                        if configured_project_id
+                        else core_client_metadata,
+                    }
+                    lib_logger.debug(
+                        f"Paid tier onboarding: using project {configured_project_id}"
+                    )
+                lib_logger.debug("Initiating onboardUser request...")
+                lro_response = await client.post(
+                    f"{CODE_ASSIST_ENDPOINT}:onboardUser",
+                    headers=headers,
+                    json=onboard_request,
+                    timeout=30,
+                )
+                lro_response.raise_for_status()
+                lro_data = lro_response.json()
+                lib_logger.debug(
+                    f"Initial onboarding response: done={lro_data.get('done')}"
+                )
+                # Poll for onboarding completion (up to 5 minutes)
+                for i in range(150):  # 150 × 2s = 5 minutes
+                    if lro_data.get("done"):
+                        lib_logger.debug(
+                            f"Onboarding completed after {i} polling attempts"
+                        )
+                        break
+                    await asyncio.sleep(2)
+                    if (i + 1) % 15 == 0:  # Log every 30 seconds
+                        lib_logger.info(
+                            f"Still waiting for onboarding completion... ({(i + 1) * 2}s elapsed)"
+                        )
+                    lib_logger.debug(
+                        f"Polling onboarding status... (Attempt {i + 1}/150)"
+                    )
+                    lro_response = await client.post(
+                        f"{CODE_ASSIST_ENDPOINT}:onboardUser",
+                        headers=headers,
+                        json=onboard_request,
+                        timeout=30,
+                    )
+                    lro_response.raise_for_status()
+                    lro_data = lro_response.json()
+                if not lro_data.get("done"):
+                    lib_logger.error("Onboarding process timed out after 5 minutes")
+                    raise ValueError(
+                        "Onboarding process timed out after 5 minutes. Please try again or contact support."
+                    )
+                # Extract project ID from LRO response
+                # Note: onboardUser returns response.cloudaicompanionProject as an object with .id
+                lro_response_data = lro_data.get("response", {})
+                lro_project_obj = lro_response_data.get("cloudaicompanionProject", {})
+                project_id = (
+                    lro_project_obj.get("id")
+                    if isinstance(lro_project_obj, dict)
+                    else None
+                )
+                # Fallback to configured project if LRO didn't return one
+                if not project_id and configured_project_id:
+                    project_id = configured_project_id
+                    lib_logger.debug(
+                        f"LRO didn't return project, using configured: {project_id}"
+                    )
+                if not project_id:
+                    lib_logger.error(
+                        "Onboarding completed but no project ID in response and none configured"
+                    )
+                    raise ValueError(
+                        "Onboarding completed, but no project ID was returned. "
+                        "For paid tiers, set ANTIGRAVITY_PROJECT_ID environment variable."
+                    )
+                lib_logger.debug(
+                    f"Successfully extracted project ID from onboarding response: {project_id}"
+                )
+                # Cache tier info
+                self.project_tier_cache[credential_path] = tier_id
+                discovered_tier = tier_id
+                lib_logger.debug(f"Cached tier information: {tier_id}")
+                # Log concise message based on tier
+                is_paid = tier_id and tier_id not in ["free-tier", "legacy-tier"]
+                if is_paid:
+                    lib_logger.info(
+                        f"Using Antigravity paid tier '{tier_id}' with project: {project_id}"
+                    )
+                else:
+                    lib_logger.info(
+                        f"Successfully onboarded user and discovered project ID: {project_id}"
+                    )
+                self.project_id_cache[credential_path] = project_id
+                discovered_project_id = project_id
+                # Persist to credential file
+                await self._persist_project_metadata(
+                    credential_path, project_id, discovered_tier
+                )
+                return project_id
+            except httpx.HTTPStatusError as e:
+                error_body = ""
+                try:
+                    error_body = e.response.text
+                except Exception:
+                    pass
+                if e.response.status_code == 403:
+                    lib_logger.error(
+                        f"Antigravity Code Assist API access denied (403). Response: {error_body}"
+                    )
+                    lib_logger.error(
+                        "Possible causes: 1) cloudaicompanion.googleapis.com API not enabled, 2) Wrong project ID for paid tier, 3) Account lacks permissions"
+                    )
+                elif e.response.status_code == 404:
+                    lib_logger.warning(
+                        "Antigravity Code Assist endpoint not found (404). Falling back to project listing."
+                    )
+                elif e.response.status_code == 412:
+                    # Precondition Failed - often means wrong project for free tier onboarding
+                    lib_logger.error(
+                        f"Precondition failed (412): {error_body}. This may mean the project ID is incompatible with the selected tier."
+                    )
+                else:
+                    lib_logger.warning(
+                        f"Antigravity onboarding/discovery failed with status {e.response.status_code}: {error_body}. Falling back to project listing."
+                    )
+            except httpx.RequestError as e:
+                lib_logger.warning(
+                    f"Antigravity onboarding/discovery network error: {e}. Falling back to project listing."
+                )
+        # 3. Fallback to listing all available GCP projects (last resort)
+        lib_logger.debug(
+            "Attempting to discover project via GCP Resource Manager API..."
+        )
+        try:
+            async with httpx.AsyncClient() as client:
+                lib_logger.debug(
+                    "Querying Cloud Resource Manager for available projects..."
+                )
+                response = await client.get(
+                    "https://cloudresourcemanager.googleapis.com/v1/projects",
+                    headers=headers,
+                    timeout=20,
+                )
+                response.raise_for_status()
+                projects = response.json().get("projects", [])
+                lib_logger.debug(f"Found {len(projects)} total projects")
+                active_projects = [
+                    p for p in projects if p.get("lifecycleState") == "ACTIVE"
+                ]
+                lib_logger.debug(f"Found {len(active_projects)} active projects")
+                if not projects:
+                    lib_logger.error(
+                        "No GCP projects found for this account. Please create a project in Google Cloud Console."
+                    )
+                elif not active_projects:
+                    lib_logger.error(
+                        "No active GCP projects found. Please activate a project in Google Cloud Console."
+                    )
+                else:
+                    project_id = active_projects[0]["projectId"]
+                    lib_logger.info(
+                        f"Discovered Antigravity project ID from active projects list: {project_id}"
+                    )
+                    lib_logger.debug(
+                        f"Selected first active project: {project_id} (out of {len(active_projects)} active projects)"
+                    )
+                    self.project_id_cache[credential_path] = project_id
+                    discovered_project_id = project_id
+                    # Persist to credential file (no tier info from resource manager)
+                    await self._persist_project_metadata(
+                        credential_path, project_id, None
+                    )
+                    return project_id
+        except httpx.HTTPStatusError as e:
+            if e.response.status_code == 403:
+                lib_logger.error(
+                    "Failed to list GCP projects due to a 403 Forbidden error. The Cloud Resource Manager API may not be enabled, or your account lacks the 'resourcemanager.projects.list' permission."
+                )
+            else:
+                lib_logger.error(
+                    f"Failed to list GCP projects with status {e.response.status_code}: {e}"
+                )
+        except httpx.RequestError as e:
+            lib_logger.error(f"Network error while listing GCP projects: {e}")
+        raise ValueError(
+            "Could not auto-discover Antigravity project ID. Possible causes:\n"
+            "  1. The cloudaicompanion.googleapis.com API is not enabled (enable it in Google Cloud Console)\n"
+            "  2. No active GCP projects exist for this account (create one in Google Cloud Console)\n"
+            "  3. Account lacks necessary permissions\n"
+            "To manually specify a project, set ANTIGRAVITY_PROJECT_ID in your .env file."
+        )
+    async def _persist_project_metadata(
+        self, credential_path: str, project_id: str, tier: Optional[str]
+    ):
+        """Persists project ID and tier to the credential file for faster future startups."""
+        # Skip persistence for env:// paths (environment-based credentials)
+        credential_index = self._parse_env_credential_path(credential_path)
+        if credential_index is not None:
+            lib_logger.debug(
+                f"Skipping project metadata persistence for env:// credential path: {credential_path}"
+            )
+            return
+        try:
+            # Load current credentials
+            with open(credential_path, "r") as f:
+                creds = json.load(f)
+            # Update metadata
+            if "_proxy_metadata" not in creds:
+                creds["_proxy_metadata"] = {}
+            creds["_proxy_metadata"]["project_id"] = project_id
+            if tier:
+                creds["_proxy_metadata"]["tier"] = tier
+            # Save back using the existing save method (handles atomic writes and permissions)
+            await self._save_credentials(credential_path, creds)
+            lib_logger.debug(
+                f"Persisted project_id and tier to credential file: {credential_path}"
+            )
+        except Exception as e:
+            lib_logger.warning(
+                f"Failed to persist project metadata to credential file: {e}"
+            )
+            # Non-fatal - just means slower startup next time
+    # =========================================================================
+    # CREDENTIAL MANAGEMENT OVERRIDES
+    # =========================================================================
+    def _get_provider_file_prefix(self) -> str:
+        """Return the file prefix for Antigravity credentials."""
+        return "antigravity"
+    def build_env_lines(self, creds: Dict[str, Any], cred_number: int) -> List[str]:
+        """
+        Generate .env file lines for an Antigravity credential.
+        Includes tier and project_id from _proxy_metadata.
+        """
+        # Get base lines from parent class
+        lines = super().build_env_lines(creds, cred_number)
+        # Add Antigravity-specific fields (tier and project_id)
+        metadata = creds.get("_proxy_metadata", {})
+        prefix = f"{self.ENV_PREFIX}_{cred_number}"
+        project_id = metadata.get("project_id", "")
+        tier = metadata.get("tier", "")
+        if project_id:
+            lines.append(f"{prefix}_PROJECT_ID={project_id}")
+        if tier:
+            lines.append(f"{prefix}_TIER={tier}")
+        return lines
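
Reviewer note on the hunk above: the added code persists `project_id` (and `tier`, when known) under `_proxy_metadata` in the credential file, later turns that metadata into extra `.env` lines, and, as a last resort, picks the first `ACTIVE` project from the Resource Manager listing. The sketch below is a minimal standalone illustration of that flow, not the library's implementation. The function names, the `"ANTIGRAVITY"` default prefix (the real value of `ENV_PREFIX` is not shown in this diff), and the plain file write standing in for `_save_credentials` are all assumptions.

```python
import json
import os
import tempfile
from typing import Any, Dict, List, Optional


def pick_first_active_project(projects: List[Dict[str, Any]]) -> Optional[str]:
    """Return the projectId of the first ACTIVE project, or None if there is none."""
    active = [p for p in projects if p.get("lifecycleState") == "ACTIVE"]
    return active[0]["projectId"] if active else None


def persist_project_metadata(path: str, project_id: str, tier: Optional[str]) -> None:
    """Merge project_id (and tier, when known) into the file's _proxy_metadata."""
    with open(path, "r", encoding="utf-8") as f:
        creds = json.load(f)
    metadata = creds.setdefault("_proxy_metadata", {})
    metadata["project_id"] = project_id
    if tier:
        metadata["tier"] = tier
    # Assumption: the diff delegates to self._save_credentials (atomic write);
    # a plain overwrite stands in for it here.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(creds, f, indent=2)


def extra_env_lines(
    creds: Dict[str, Any], cred_number: int, env_prefix: str = "ANTIGRAVITY"
) -> List[str]:
    """Build the provider-specific .env lines from _proxy_metadata."""
    metadata = creds.get("_proxy_metadata", {})
    prefix = f"{env_prefix}_{cred_number}"  # env_prefix value is hypothetical
    lines: List[str] = []
    if metadata.get("project_id"):
        lines.append(f"{prefix}_PROJECT_ID={metadata['project_id']}")
    if metadata.get("tier"):
        lines.append(f"{prefix}_TIER={metadata['tier']}")
    return lines


def demo() -> List[str]:
    """Round-trip: write a dummy credential, persist metadata, emit env lines."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "antigravity_oauth_1.json")  # hypothetical file name
        with open(path, "w", encoding="utf-8") as f:
            json.dump({"refresh_token": "dummy"}, f)
        # Resource Manager fallback yields no tier, so None is passed, as in the diff.
        persist_project_metadata(path, "my-project", None)
        with open(path, "r", encoding="utf-8") as f:
            creds = json.load(f)
    return extra_env_lines(creds, 1)
```

Because the tier is only written when it is truthy, a credential discovered through the project-listing fallback produces a `_PROJECT_ID` line but no `_TIER` line.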

src/rotator_library/providers/antigravity_provider.py CHANGED

@@ -38,6 +38,8 @@ from .provider_interface import ProviderInterface, UsageResetConfigDef, QuotaGro
 from .antigravity_auth_base import AntigravityAuthBase
 from .provider_cache import ProviderCache
 from ..model_definitions import ModelDefinitions
 # =============================================================================
@@ -105,12 +107,23 @@ DEFAULT_SAFETY_SETTINGS = [
     {"category": "HARM_CATEGORY_CIVIC_INTEGRITY", "threshold": "BLOCK_NONE"},
 ]
-# Directory paths
-_BASE_DIR = Path(__file__).resolve().parent.parent.parent.parent
-LOGS_DIR = _BASE_DIR / "logs" / "antigravity_logs"
-CACHE_DIR = _BASE_DIR / "cache" / "antigravity"
-GEMINI3_SIGNATURE_CACHE_FILE = CACHE_DIR / "gemini3_signatures.json"
-CLAUDE_THINKING_CACHE_FILE = CACHE_DIR / "claude_thinking.json"
 # Gemini 3 tool fix system instruction (prevents hallucination)
 DEFAULT_GEMINI3_SYSTEM_INSTRUCTION = """<CRITICAL_TOOL_USAGE_INSTRUCTIONS>
@@ -327,6 +340,33 @@ def _recursively_parse_json_strings(obj: Any) -> Any:
     return obj
 def _clean_claude_schema(schema: Any) -> Any:
     """
     Recursively clean JSON Schema for Antigravity/Google's Proto-based API.
@@ -384,7 +424,6 @@ def _clean_claude_schema(schema: Any) -> Any:
             return first_option
     cleaned = {}
     # Handle 'const' by converting to 'enum' with single value
     if "const" in schema:
         const_value = schema["const"]
@@ -425,7 +464,9 @@ class AntigravityFileLogger:
         timestamp = datetime.now().strftime("%Y%m%d_%H%M%S_%f")
         safe_model = model_name.replace("/", "_").replace(":", "_")
-        self.log_dir = LOGS_DIR / f"{timestamp}_{safe_model}_{uuid.uuid4()}"
         try:
             self.log_dir.mkdir(parents=True, exist_ok=True)
@@ -658,9 +699,6 @@ class AntigravityProvider(AntigravityAuthBase, ProviderInterface):
         error_obj = data.get("error", data)
         details = error_obj.get("details", [])
-        if not details:
-            return None
         result = {
             "retry_after": None,
             "reason": None,
@@ -711,6 +749,15 @@ class AntigravityProvider(AntigravityAuthBase, ProviderInterface):
         # Return None if we couldn't extract retry_after
         if not result["retry_after"]:
             return None
         return result
@@ -718,12 +765,7 @@ class AntigravityProvider(AntigravityAuthBase, ProviderInterface):
     def __init__(self):
         super().__init__()
         self.model_definitions = ModelDefinitions()
-        self.project_id_cache: Dict[
-            str, str
-        ] = {}  # Cache project ID per credential path
-        self.project_tier_cache: Dict[
-            str, str
-        ] = {}  # Cache project tier per credential path (for debugging)
         # Base URL management
         self._base_url_index = 0
@@ -735,13 +777,13 @@ class AntigravityProvider(AntigravityAuthBase, ProviderInterface):
         # Initialize caches using shared ProviderCache
         self._signature_cache = ProviderCache(
-            GEMINI3_SIGNATURE_CACHE_FILE,
             memory_ttl,
             disk_ttl,
             env_prefix="ANTIGRAVITY_SIGNATURE",
         )
         self._thinking_cache = ProviderCache(
-            CLAUDE_THINKING_CACHE_FILE,
             memory_ttl,
             disk_ttl,
             env_prefix="ANTIGRAVITY_THINKING",
@@ -871,9 +913,48 @@ class AntigravityProvider(AntigravityAuthBase, ProviderInterface):
         This ensures all credential priorities are known before any API calls,
         preventing unknown credentials from getting priority 999.
         """
         await self._load_persisted_tiers(credential_paths)
     async def _load_persisted_tiers(
         self, credential_paths: List[str]
     ) -> Dict[str, str]:
@@ -931,6 +1012,8 @@ class AntigravityProvider(AntigravityAuthBase, ProviderInterface):
         return loaded
     # =========================================================================
     # MODEL UTILITIES
     # =========================================================================
@@ -1007,524 +1090,7 @@ class AntigravityProvider(AntigravityAuthBase, ProviderInterface):
         return "thinking_" + "_".join(key_parts) if key_parts else None
-    # =========================================================================
-    # PROJECT ID DISCOVERY
-    # =========================================================================
-    async def _discover_project_id(
-        self, credential_path: str, access_token: str, litellm_params: Dict[str, Any]
-    ) -> str:
-        """
-        Discovers the Google Cloud Project ID, with caching and onboarding for new accounts.
-        This follows the official Gemini CLI discovery flow adapted for Antigravity:
-        1. Check in-memory cache
-        2. Check configured project_id override (litellm_params or env var)
-        3. Check persisted project_id in credential file
-        4. Call loadCodeAssist to check if user is already known (has currentTier)
-           - If currentTier exists AND cloudaicompanionProject returned: use server's project
-           - If no currentTier: user needs onboarding
-        5. Onboard user (FREE tier: pass cloudaicompanionProject=None for server-managed)
-        6. Fallback to GCP Resource Manager project listing
-        Note: Unlike GeminiCli, Antigravity doesn't use tier-based credential prioritization,
-        but we still cache tier info for debugging and consistency.
-        """
-        lib_logger.debug(
-            f"Starting Antigravity project discovery for credential: {credential_path}"
-        )
-        # Check in-memory cache first
-        if credential_path in self.project_id_cache:
-            cached_project = self.project_id_cache[credential_path]
-            lib_logger.debug(f"Using cached project ID: {cached_project}")
-            return cached_project
-        # Check for configured project ID override (from litellm_params or env var)
-        configured_project_id = (
-            litellm_params.get("project_id")
-            or os.getenv("ANTIGRAVITY_PROJECT_ID")
-            or os.getenv("GOOGLE_CLOUD_PROJECT")
-        )
-        if configured_project_id:
-            lib_logger.debug(
-                f"Found configured project_id override: {configured_project_id}"
-            )
-        # Load credentials from file to check for persisted project_id and tier
-        # Skip for env:// paths (environment-based credentials don't persist to files)
-        credential_index = self._parse_env_credential_path(credential_path)
-        if credential_index is None:
-            # Only try to load from file if it's not an env:// path
-            try:
-                with open(credential_path, "r") as f:
-                    creds = json.load(f)
-                metadata = creds.get("_proxy_metadata", {})
-                persisted_project_id = metadata.get("project_id")
-                persisted_tier = metadata.get("tier")
-                if persisted_project_id:
-                    lib_logger.info(
-                        f"Loaded persisted project ID from credential file: {persisted_project_id}"
-                    )
-                    self.project_id_cache[credential_path] = persisted_project_id
-                    # Also load tier if available (for debugging/logging purposes)
-                    if persisted_tier:
-                        self.project_tier_cache[credential_path] = persisted_tier
-                        lib_logger.debug(f"Loaded persisted tier: {persisted_tier}")
-                    return persisted_project_id
-            except (FileNotFoundError, json.JSONDecodeError, KeyError) as e:
-                lib_logger.debug(f"Could not load persisted project ID from file: {e}")
-        lib_logger.debug(
-            "No cached or configured project ID found, initiating discovery..."
-        )
-        headers = {
-            "Authorization": f"Bearer {access_token}",
-            "Content-Type": "application/json",
-        }
-        discovered_project_id = None
-        discovered_tier = None
-        # Use production endpoint for loadCodeAssist (more reliable than sandbox URLs)
-        code_assist_endpoint = "https://cloudcode-pa.googleapis.com/v1internal"
-        async with httpx.AsyncClient() as client:
-            # 1. Try discovery endpoint with loadCodeAssist
-            lib_logger.debug(
-                "Attempting project discovery via Code Assist loadCodeAssist endpoint..."
-            )
-            try:
-                # Build metadata - include duetProject only if we have a configured project
-                core_client_metadata = {
-                    "ideType": "IDE_UNSPECIFIED",
-                    "platform": "PLATFORM_UNSPECIFIED",
-                    "pluginType": "GEMINI",
-                }
-                if configured_project_id:
-                    core_client_metadata["duetProject"] = configured_project_id
-                # Build load request - pass configured_project_id if available, otherwise None
-                load_request = {
-                    "cloudaicompanionProject": configured_project_id,  # Can be None
-                    "metadata": core_client_metadata,
-                }
-                lib_logger.debug(
-                    f"Sending loadCodeAssist request with cloudaicompanionProject={configured_project_id}"
-                )
-                response = await client.post(
-                    f"{code_assist_endpoint}:loadCodeAssist",
-                    headers=headers,
-                    json=load_request,
-                    timeout=20,
-                )
-                response.raise_for_status()
-                data = response.json()
-                # Log full response for debugging
-                lib_logger.debug(
-                    f"loadCodeAssist full response keys: {list(data.keys())}"
-                )
-                # Extract tier information
-                allowed_tiers = data.get("allowedTiers", [])
-                current_tier = data.get("currentTier")
-                lib_logger.debug(f"=== Tier Information ===")
-                lib_logger.debug(f"currentTier: {current_tier}")
-                lib_logger.debug(f"allowedTiers count: {len(allowed_tiers)}")
-                for i, tier in enumerate(allowed_tiers):
-                    tier_id = tier.get("id", "unknown")
-                    is_default = tier.get("isDefault", False)
-                    user_defined = tier.get("userDefinedCloudaicompanionProject", False)
-                    lib_logger.debug(
-                        f"  Tier {i + 1}: id={tier_id}, isDefault={is_default}, userDefinedProject={user_defined}"
-                    )
-                lib_logger.debug(f"========================")
-                # Determine the current tier ID
-                current_tier_id = None
-                if current_tier:
-                    current_tier_id = current_tier.get("id")
-                    lib_logger.debug(f"User has currentTier: {current_tier_id}")
-                # Check if user is already known to server (has currentTier)
-                if current_tier_id:
-                    # User is already onboarded - check for project from server
-                    server_project = data.get("cloudaicompanionProject")
-                    # Check if this tier requires user-defined project (paid tiers)
-                    requires_user_project = any(
-                        t.get("id") == current_tier_id
-                        and t.get("userDefinedCloudaicompanionProject", False)
-                        for t in allowed_tiers
-                    )
-                    is_free_tier = current_tier_id == "free-tier"
-                    if server_project:
-                        # Server returned a project - use it (server wins)
-                        project_id = server_project
-                        lib_logger.debug(f"Server returned project: {project_id}")
-                    elif configured_project_id:
-                        # No server project but we have configured one - use it
-                        project_id = configured_project_id
-                        lib_logger.debug(
-                            f"No server project, using configured: {project_id}"
-                        )
-                    elif is_free_tier:
-                        # Free tier user without server project - try onboarding
-                        lib_logger.debug(
-                            "Free tier user with currentTier but no project - will try onboarding"
-                        )
-                        project_id = None
-                    elif requires_user_project:
-                        # Paid tier requires a project ID to be set
-                        raise ValueError(
-                            f"Paid tier '{current_tier_id}' requires setting ANTIGRAVITY_PROJECT_ID environment variable."
-                        )
-                    else:
-                        # Unknown tier without project - proceed to onboarding
-                        lib_logger.warning(
-                            f"Tier '{current_tier_id}' has no project and none configured - will try onboarding"
-                        )
-                        project_id = None
-                    if project_id:
-                        # Cache tier info
-                        self.project_tier_cache[credential_path] = current_tier_id
-                        discovered_tier = current_tier_id
-                        # Log appropriately based on tier
-                        is_paid = current_tier_id and current_tier_id not in [
-                            "free-tier",
-                            "legacy-tier",
-                            "unknown",
-                        ]
-                        if is_paid:
-                            lib_logger.info(
-                                f"Using Antigravity paid tier '{current_tier_id}' with project: {project_id}"
-                            )
-                        else:
-                            lib_logger.info(
-                                f"Discovered Antigravity project ID via loadCodeAssist: {project_id}"
-                            )
-                        self.project_id_cache[credential_path] = project_id
-                        discovered_project_id = project_id
-                        # Persist to credential file
-                        await self._persist_project_metadata(
-                            credential_path, project_id, discovered_tier
-                        )
-                        return project_id
-                # 2. User needs onboarding - no currentTier or no project found
-                lib_logger.info(
-                    "No existing Antigravity session found (no currentTier), attempting to onboard user..."
-                )
-                # Determine which tier to onboard with
-                onboard_tier = None
-                for tier in allowed_tiers:
-                    if tier.get("isDefault"):
-                        onboard_tier = tier
-                        break
-                # Fallback to legacy tier if no default
-                if not onboard_tier and allowed_tiers:
-                    for tier in allowed_tiers:
-                        if tier.get("id") == "legacy-tier":
-                            onboard_tier = tier
-                            break
-                    if not onboard_tier:
-                        onboard_tier = allowed_tiers[0]
-                if not onboard_tier:
-                    raise ValueError("No onboarding tiers available from server")
-                tier_id = onboard_tier.get("id", "free-tier")
-                requires_user_project = onboard_tier.get(
-                    "userDefinedCloudaicompanionProject", False
-                )
-                lib_logger.debug(
-                    f"Onboarding with tier: {tier_id}, requiresUserProject: {requires_user_project}"
-                )
-                # Build onboard request based on tier type
-                # FREE tier: cloudaicompanionProject = None (server-managed)
-                # PAID tier: cloudaicompanionProject = configured_project_id
-                is_free_tier = tier_id == "free-tier"
-                if is_free_tier:
-                    # Free tier uses server-managed project
-                    onboard_request = {
-                        "tierId": tier_id,
-                        "cloudaicompanionProject": None,  # Server will create/manage
-                        "metadata": core_client_metadata,
-                    }
-                    lib_logger.debug(
-                        "Free tier onboarding: using server-managed project"
-                    )
-                else:
-                    # Paid/legacy tier requires user-provided project
-                    if not configured_project_id and requires_user_project:
-                        raise ValueError(
-                            f"Tier '{tier_id}' requires setting ANTIGRAVITY_PROJECT_ID environment variable."
-                        )
-                    onboard_request = {
-                        "tierId": tier_id,
-                        "cloudaicompanionProject": configured_project_id,
-                        "metadata": {
-                            **core_client_metadata,
-                            "duetProject": configured_project_id,
-                        }
-                        if configured_project_id
-                        else core_client_metadata,
-                    }
-                    lib_logger.debug(
-                        f"Paid tier onboarding: using project {configured_project_id}"
-                    )
-                lib_logger.debug("Initiating onboardUser request...")
-                lro_response = await client.post(
-                    f"{code_assist_endpoint}:onboardUser",
-                    headers=headers,
-                    json=onboard_request,
-                    timeout=30,
-                )
-                lro_response.raise_for_status()
-                lro_data = lro_response.json()
-                lib_logger.debug(
-                    f"Initial onboarding response: done={lro_data.get('done')}"
-                )
-                # Poll for onboarding completion (up to 5 minutes)
-                for i in range(150):  # 150 × 2s = 5 minutes
-                    if lro_data.get("done"):
-                        lib_logger.debug(
-                            f"Onboarding completed after {i} polling attempts"
-                        )
-                        break
-                    await asyncio.sleep(2)
-                    if (i + 1) % 15 == 0:  # Log every 30 seconds
-                        lib_logger.info(
-                            f"Still waiting for onboarding completion... ({(i + 1) * 2}s elapsed)"
-                        )
-                    lib_logger.debug(
-                        f"Polling onboarding status... (Attempt {i + 1}/150)"
-                    )
-                    lro_response = await client.post(
-                        f"{code_assist_endpoint}:onboardUser",
-                        headers=headers,
-                        json=onboard_request,
-                        timeout=30,
-                    )
-                    lro_response.raise_for_status()
-                    lro_data = lro_response.json()
-                if not lro_data.get("done"):
-                    lib_logger.error("Onboarding process timed out after 5 minutes")
-                    raise ValueError(
-                        "Onboarding process timed out after 5 minutes. Please try again or contact support."
-                    )
-                # Extract project ID from LRO response
-                # Note: onboardUser returns response.cloudaicompanionProject as an object with .id
-                lro_response_data = lro_data.get("response", {})
-                lro_project_obj = lro_response_data.get("cloudaicompanionProject", {})
-                project_id = (
-                    lro_project_obj.get("id")
-                    if isinstance(lro_project_obj, dict)
-                    else None
-                )
-                # Fallback to configured project if LRO didn't return one
-                if not project_id and configured_project_id:
-                    project_id = configured_project_id
-                    lib_logger.debug(
-                        f"LRO didn't return project, using configured: {project_id}"
-                    )
-                if not project_id:
-                    lib_logger.error(
-                        "Onboarding completed but no project ID in response and none configured"
-                    )
-                    raise ValueError(
-                        "Onboarding completed, but no project ID was returned. "
-                        "For paid tiers, set ANTIGRAVITY_PROJECT_ID environment variable."
-                    )
-                lib_logger.debug(
-                    f"Successfully extracted project ID from onboarding response: {project_id}"
-                )
-                # Cache tier info
-                self.project_tier_cache[credential_path] = tier_id
-                discovered_tier = tier_id
-                lib_logger.debug(f"Cached tier information: {tier_id}")
-                # Log concise message based on tier
-                is_paid = tier_id and tier_id not in ["free-tier", "legacy-tier"]
-                if is_paid:
-                    lib_logger.info(
-                        f"Using Antigravity paid tier '{tier_id}' with project: {project_id}"
-                    )
-                else:
-                    lib_logger.info(
-                        f"Successfully onboarded user and discovered project ID: {project_id}"
-                    )
-                self.project_id_cache[credential_path] = project_id
-                discovered_project_id = project_id
-                # Persist to credential file
-                await self._persist_project_metadata(
-                    credential_path, project_id, discovered_tier
-                )
-                return project_id
-            except httpx.HTTPStatusError as e:
-                error_body = ""
-                try:
-                    error_body = e.response.text
-                except Exception:
-                    pass
-                if e.response.status_code == 403:
-                    lib_logger.error(
-                        f"Antigravity Code Assist API access denied (403). Response: {error_body}"
-                    )
-                    lib_logger.error(
-                        "Possible causes: 1) cloudaicompanion.googleapis.com API not enabled, 2) Wrong project ID for paid tier, 3) Account lacks permissions"
-                    )
-                elif e.response.status_code == 404:
-                    lib_logger.warning(
-                        f"Antigravity Code Assist endpoint not found (404). Falling back to project listing."
-                    )
-                elif e.response.status_code == 412:
-                    # Precondition Failed - often means wrong project for free tier onboarding
-                    lib_logger.error(
-                        f"Precondition failed (412): {error_body}. This may mean the project ID is incompatible with the selected tier."
-                    )
-                else:
-                    lib_logger.warning(
-                        f"Antigravity onboarding/discovery failed with status {e.response.status_code}: {error_body}. Falling back to project listing."
-                    )
-            except httpx.RequestError as e:
-                lib_logger.warning(
-                    f"Antigravity onboarding/discovery network error: {e}. Falling back to project listing."
-                )
-        # 3. Fallback to listing all available GCP projects (last resort)
-        lib_logger.debug(
-            "Attempting to discover project via GCP Resource Manager API..."
-        )
-        try:
-            async with httpx.AsyncClient() as client:
-                lib_logger.debug(
-                    "Querying Cloud Resource Manager for available projects..."
-                )
-                response = await client.get(
-                    "https://cloudresourcemanager.googleapis.com/v1/projects",
-                    headers=headers,
-                    timeout=20,
-                )
-                response.raise_for_status()
-                projects = response.json().get("projects", [])
-                lib_logger.debug(f"Found {len(projects)} total projects")
-                active_projects = [
-                    p for p in projects if p.get("lifecycleState") == "ACTIVE"
-                ]
-                lib_logger.debug(f"Found {len(active_projects)} active projects")
-                if not projects:
-                    lib_logger.error(
-                        "No GCP projects found for this account. Please create a project in Google Cloud Console."
-                    )
-                elif not active_projects:
-                    lib_logger.error(
-                        "No active GCP projects found. Please activate a project in Google Cloud Console."
-                    )
-                else:
-                    project_id = active_projects[0]["projectId"]
-                    lib_logger.info(
-                        f"Discovered Antigravity project ID from active projects list: {project_id}"
-                    )
-                    lib_logger.debug(
-                        f"Selected first active project: {project_id} (out of {len(active_projects)} active projects)"
-                    )
-                    self.project_id_cache[credential_path] = project_id
-                    discovered_project_id = project_id
-                    # Persist to credential file (no tier info from resource manager)
-                    await self._persist_project_metadata(
-                        credential_path, project_id, None
-                    )
-                    return project_id
-        except httpx.HTTPStatusError as e:
-            if e.response.status_code == 403:
-                lib_logger.error(
-                    "Failed to list GCP projects due to a 403 Forbidden error. The Cloud Resource Manager API may not be enabled, or your account lacks the 'resourcemanager.projects.list' permission."
-                )
-            else:
-                lib_logger.error(
-                    f"Failed to list GCP projects with status {e.response.status_code}: {e}"
-                )
-        except httpx.RequestError as e:
-            lib_logger.error(f"Network error while listing GCP projects: {e}")
-        raise ValueError(
-            "Could not auto-discover Antigravity project ID. Possible causes:\n"
-            "  1. The cloudaicompanion.googleapis.com API is not enabled (enable it in Google Cloud Console)\n"
-            "  2. No active GCP projects exist for this account (create one in Google Cloud Console)\n"
-            "  3. Account lacks necessary permissions\n"
-            "To manually specify a project, set ANTIGRAVITY_PROJECT_ID in your .env file."
-        )
-    async def _persist_project_metadata(
-        self, credential_path: str, project_id: str, tier: Optional[str]
-    ):
-        """Persists project ID and tier to the credential file for faster future startups."""
-        # Skip persistence for env:// paths (environment-based credentials)
-        credential_index = self._parse_env_credential_path(credential_path)
-        if credential_index is not None:
-            lib_logger.debug(
-                f"Skipping project metadata persistence for env:// credential path: {credential_path}"
-            )
-            return
-        try:
-            # Load current credentials
-            with open(credential_path, "r") as f:
-                creds = json.load(f)
-            # Update metadata
-            if "_proxy_metadata" not in creds:
-                creds["_proxy_metadata"] = {}
-            creds["_proxy_metadata"]["project_id"] = project_id
-            if tier:
-                creds["_proxy_metadata"]["tier"] = tier
-            # Save back using the existing save method (handles atomic writes and permissions)
-            await self._save_credentials(credential_path, creds)
-            lib_logger.debug(
-                f"Persisted project_id and tier to credential file: {credential_path}"
-            )
-        except Exception as e:
-            lib_logger.warning(
-                f"Failed to persist project metadata to credential file: {e}"
-            )
-            # Non-fatal - just means slower startup next time
     # =========================================================================
     # THINKING MODE SANITIZATION
@@ -2424,7 +1990,7 @@ class AntigravityProvider(AntigravityAuthBase, ProviderInterface):
                 elif first_func_in_msg:
                     # Only add bypass to the first function call if no sig available
                     func_part["thoughtSignature"] = "skip_thought_signature_validator"
-                    lib_logger.warning(
                         f"Missing thoughtSignature for first func call {tool_id}, using bypass"
                     )
                 # Subsequent parallel calls: no signature field at all
@@ -2559,9 +2125,9 @@ class AntigravityProvider(AntigravityAuthBase, ProviderInterface):
                                 f"Ignoring duplicate - this may indicate malformed conversation history."
                             )
                             continue
-                        #lib_logger.debug(
                         #    f"[Grouping] Collected response for ID: {resp_id}"
-                        #)
                         collected_responses[resp_id] = resp
                 # Try to satisfy pending groups (newest first)
@@ -2576,10 +2142,10 @@ class AntigravityProvider(AntigravityAuthBase, ProviderInterface):
                             collected_responses.pop(gid) for gid in group_ids
                         ]
                         new_contents.append({"parts": group_responses, "role": "user"})
-                        #lib_logger.debug(
                         #    f"[Grouping] Satisfied group with {len(group_responses)} responses: "
                         #    f"ids={group_ids}"
-                        #)
                         pending_groups.pop(i)
                         break
                 continue
@@ -2599,10 +2165,10 @@ class AntigravityProvider(AntigravityAuthBase, ProviderInterface):
                     ]
                     if call_ids:
-                        #lib_logger.debug(
                         #    f"[Grouping] Created pending group expecting {len(call_ids)} responses: "
                         #    f"ids={call_ids}, names={func_names}"
-                        #)
                         pending_groups.append(
                             {
                                 "ids": call_ids,
@@ -2967,12 +2533,41 @@ class AntigravityProvider(AntigravityAuthBase, ProviderInterface):
             if params and isinstance(params, dict):
                 schema = dict(params)
-                schema.pop("$schema", None)
                 schema.pop("strict", None)
                 schema = _normalize_type_arrays(schema)
                 func_decl["parametersJsonSchema"] = schema
             else:
-                func_decl["parametersJsonSchema"] = {"type": "object", "properties": {}}
             gemini_tools.append({"functionDeclarations": [func_decl]})
@@ -3097,17 +2692,19 @@ class AntigravityProvider(AntigravityAuthBase, ProviderInterface):
         return antigravity_payload
     def _apply_claude_tool_transform(self, payload: Dict[str, Any]) -> None:
-        """Apply Claude-specific tool schema transformations."""
         tools = payload["request"].get("tools", [])
         for tool in tools:
             for func_decl in tool.get("functionDeclarations", []):
                 if "parametersJsonSchema" in func_decl:
                     params = func_decl["parametersJsonSchema"]
-                    params = (
-                        _clean_claude_schema(params)
-                        if isinstance(params, dict)
-                        else params
-                    )
                     func_decl["parameters"] = params
                     del func_decl["parametersJsonSchema"]
@@ -3336,6 +2933,13 @@ class AntigravityProvider(AntigravityAuthBase, ProviderInterface):
         raw_args = func_call.get("args", {})
         parsed_args = _recursively_parse_json_strings(raw_args)
         tool_call = {
             "id": tool_id,
             "type": "function",
@@ -3405,7 +3009,7 @@ class AntigravityProvider(AntigravityAuthBase, ProviderInterface):
         }
         self._thinking_cache.store(cache_key, json.dumps(data))
-        lib_logger.info(f"Cached thinking: {cache_key[:50]}...")
     # =========================================================================
     # PROVIDER INTERFACE IMPLEMENTATION
@@ -3703,7 +3307,12 @@ class AntigravityProvider(AntigravityAuthBase, ProviderInterface):
         file_logger: Optional[AntigravityFileLogger] = None,
     ) -> litellm.ModelResponse:
         """Handle non-streaming completion."""
-        response = await client.post(url, headers=headers, json=payload, timeout=600.0)
         response.raise_for_status()
         data = response.json()
@@ -3736,11 +3345,15 @@ class AntigravityProvider(AntigravityAuthBase, ProviderInterface):
         }
         async with client.stream(
-            "POST", url, headers=headers, json=payload, timeout=600.0
         ) as response:
             if response.status_code >= 400:
-                # Read error body for raise_for_status to include in exception
-                # Terminal logging commented out - errors are logged in failures.log
                 try:
                     await response.aread()
                     # lib_logger.error(

 from .antigravity_auth_base import AntigravityAuthBase
 from .provider_cache import ProviderCache
 from ..model_definitions import ModelDefinitions
+from ..timeout_config import TimeoutConfig
+from ..utils.paths import get_logs_dir, get_cache_dir
 # =============================================================================
     {"category": "HARM_CATEGORY_CIVIC_INTEGRITY", "threshold": "BLOCK_NONE"},
 ]
+# Directory paths - use centralized path management
+def _get_antigravity_logs_dir():
+    return get_logs_dir() / "antigravity_logs"
+def _get_antigravity_cache_dir():
+    return get_cache_dir(subdir="antigravity")
+def _get_gemini3_signature_cache_file():
+    return _get_antigravity_cache_dir() / "gemini3_signatures.json"
+def _get_claude_thinking_cache_file():
+    return _get_antigravity_cache_dir() / "claude_thinking.json"
 # Gemini 3 tool fix system instruction (prevents hallucination)
 DEFAULT_GEMINI3_SYSTEM_INSTRUCTION = """<CRITICAL_TOOL_USAGE_INSTRUCTIONS>
     return obj
+def _inline_schema_refs(schema: Dict[str, Any]) -> Dict[str, Any]:
+    """Inline local $ref definitions before sanitization."""
+    if not isinstance(schema, dict):
+        return schema
+    defs = schema.get("$defs", schema.get("definitions", {}))
+    if not defs:
+        return schema
+    def resolve(node, seen=()):
+        if not isinstance(node, dict):
+            return [resolve(x, seen) for x in node] if isinstance(node, list) else node
+        if "$ref" in node:
+            ref = node["$ref"]
+            if ref in seen:  # Circular - drop it
+                return {k: resolve(v, seen) for k, v in node.items() if k != "$ref"}
+            for prefix in ("#/$defs/", "#/definitions/"):
+                if isinstance(ref, str) and ref.startswith(prefix):
+                    name = ref[len(prefix) :]
+                    if name in defs:
+                        return resolve(copy.deepcopy(defs[name]), seen + (ref,))
+            return {k: resolve(v, seen) for k, v in node.items() if k != "$ref"}
+        return {k: resolve(v, seen) for k, v in node.items()}
+    return resolve(schema)
 def _clean_claude_schema(schema: Any) -> Any:
     """
     Recursively clean JSON Schema for Antigravity/Google's Proto-based API.
             return first_option
     cleaned = {}
     # Handle 'const' by converting to 'enum' with single value
     if "const" in schema:
         const_value = schema["const"]
         timestamp = datetime.now().strftime("%Y%m%d_%H%M%S_%f")
         safe_model = model_name.replace("/", "_").replace(":", "_")
+        self.log_dir = (
+            _get_antigravity_logs_dir() / f"{timestamp}_{safe_model}_{uuid.uuid4()}"
+        )
         try:
             self.log_dir.mkdir(parents=True, exist_ok=True)
         error_obj = data.get("error", data)
         details = error_obj.get("details", [])
         result = {
             "retry_after": None,
             "reason": None,
         # Return None if we couldn't extract retry_after
         if not result["retry_after"]:
+            # Handle bare RESOURCE_EXHAUSTED without timing details
+            error_status = error_obj.get("status", "")
+            error_code = error_obj.get("code")
+            if error_status == "RESOURCE_EXHAUSTED" or error_code == 429:
+                result["retry_after"] = 60  # Default fallback
+                result["reason"] = result.get("reason") or "RESOURCE_EXHAUSTED"
+                return result
             return None
         return result
     def __init__(self):
         super().__init__()
         self.model_definitions = ModelDefinitions()
+        # NOTE: project_id_cache and project_tier_cache are inherited from AntigravityAuthBase
         # Base URL management
         self._base_url_index = 0
         # Initialize caches using shared ProviderCache
         self._signature_cache = ProviderCache(
+            _get_gemini3_signature_cache_file(),
             memory_ttl,
             disk_ttl,
             env_prefix="ANTIGRAVITY_SIGNATURE",
         )
         self._thinking_cache = ProviderCache(
+            _get_claude_thinking_cache_file(),
             memory_ttl,
             disk_ttl,
             env_prefix="ANTIGRAVITY_THINKING",
         This ensures all credential priorities are known before any API calls,
         preventing unknown credentials from getting priority 999.
+        For credentials without persisted tier info (new or corrupted), performs
+        full discovery to ensure proper prioritization in sequential rotation mode.
         """
+        # Step 1: Load persisted tiers from files
         await self._load_persisted_tiers(credential_paths)
+        # Step 2: Identify credentials still missing tier info
+        credentials_needing_discovery = [
+            path
+            for path in credential_paths
+            if path not in self.project_tier_cache
+            and self._parse_env_credential_path(path) is None  # Skip env:// paths
+        ]
+        if not credentials_needing_discovery:
+            return  # All credentials have tier info
+        lib_logger.info(
+            f"Antigravity: Discovering tier info for {len(credentials_needing_discovery)} credential(s)..."
+        )
+        # Step 3: Perform discovery for each missing credential (sequential to avoid rate limits)
+        for credential_path in credentials_needing_discovery:
+            try:
+                auth_header = await self.get_auth_header(credential_path)
+                access_token = auth_header["Authorization"].split(" ")[1]
+                await self._discover_project_id(
+                    credential_path, access_token, litellm_params={}
+                )
+                discovered_tier = self.project_tier_cache.get(
+                    credential_path, "unknown"
+                )
+                lib_logger.debug(
+                    f"Discovered tier '{discovered_tier}' for {Path(credential_path).name}"
+                )
+            except Exception as e:
+                lib_logger.warning(
+                    f"Failed to discover tier for {Path(credential_path).name}: {e}. "
+                    f"Credential will use default priority."
+                )
     async def _load_persisted_tiers(
         self, credential_paths: List[str]
     ) -> Dict[str, str]:
         return loaded
+    # NOTE: _post_auth_discovery() is inherited from AntigravityAuthBase
     # =========================================================================
     # MODEL UTILITIES
     # =========================================================================
         return "thinking_" + "_".join(key_parts) if key_parts else None
+    # NOTE: _discover_project_id() and _persist_project_metadata() are inherited from AntigravityAuthBase
     # =========================================================================
     # THINKING MODE SANITIZATION
                 elif first_func_in_msg:
                     # Only add bypass to the first function call if no sig available
                     func_part["thoughtSignature"] = "skip_thought_signature_validator"
+                    lib_logger.debug(
                         f"Missing thoughtSignature for first func call {tool_id}, using bypass"
                     )
                 # Subsequent parallel calls: no signature field at all
                                 f"Ignoring duplicate - this may indicate malformed conversation history."
                             )
                             continue
+                        # lib_logger.debug(
                         #    f"[Grouping] Collected response for ID: {resp_id}"
+                        # )
                         collected_responses[resp_id] = resp
                 # Try to satisfy pending groups (newest first)
                             collected_responses.pop(gid) for gid in group_ids
                         ]
                         new_contents.append({"parts": group_responses, "role": "user"})
+                        # lib_logger.debug(
                         #    f"[Grouping] Satisfied group with {len(group_responses)} responses: "
                         #    f"ids={group_ids}"
+                        # )
                         pending_groups.pop(i)
                         break
                 continue
                     ]
                     if call_ids:
+                        # lib_logger.debug(
                         #    f"[Grouping] Created pending group expecting {len(call_ids)} responses: "
                         #    f"ids={call_ids}, names={func_names}"
+                        # )
                         pending_groups.append(
                             {
                                 "ids": call_ids,
             if params and isinstance(params, dict):
                 schema = dict(params)
                 schema.pop("strict", None)
+                # Inline $ref definitions, then strip unsupported keywords
+                schema = _inline_schema_refs(schema)
+                schema = _clean_claude_schema(schema)
                 schema = _normalize_type_arrays(schema)
+                # Workaround: Antigravity/Gemini fails to emit functionCall
+                # when a tool has empty properties {}. Inject a dummy required
+                # confirmation parameter so the tool call is emitted: a required
+                # parameter forces the model to commit to the tool call rather
+                # than just thinking about it.
+                props = schema.get("properties", {})
+                if not props:
+                    schema["properties"] = {
+                        "_confirm": {
+                            "type": "string",
+                            "description": "Enter 'yes' to proceed",
+                        }
+                    }
+                    schema["required"] = ["_confirm"]
                 func_decl["parametersJsonSchema"] = schema
             else:
+                # No parameters provided - use default with required confirm param
+                # to ensure the tool call is emitted properly
+                func_decl["parametersJsonSchema"] = {
+                    "type": "object",
+                    "properties": {
+                        "_confirm": {
+                            "type": "string",
+                            "description": "Enter 'yes' to proceed",
+                        }
+                    },
+                    "required": ["_confirm"],
+                }
             gemini_tools.append({"functionDeclarations": [func_decl]})
         return antigravity_payload
     def _apply_claude_tool_transform(self, payload: Dict[str, Any]) -> None:
+        """Apply Claude-specific tool schema transformations.
+        Converts parametersJsonSchema to parameters and applies Claude-specific
+        schema sanitization (inlines $ref, removes unsupported JSON Schema fields).
+        """
         tools = payload["request"].get("tools", [])
         for tool in tools:
             for func_decl in tool.get("functionDeclarations", []):
                 if "parametersJsonSchema" in func_decl:
                     params = func_decl["parametersJsonSchema"]
+                    if isinstance(params, dict):
+                        params = _inline_schema_refs(params)
+                        params = _clean_claude_schema(params)
                     func_decl["parameters"] = params
                     del func_decl["parametersJsonSchema"]
         raw_args = func_call.get("args", {})
         parsed_args = _recursively_parse_json_strings(raw_args)
+        # Strip the injected _confirm parameter ONLY if it's the sole parameter
+        # This ensures we only strip our injection, not legitimate user params
+        if isinstance(parsed_args, dict) and "_confirm" in parsed_args:
+            if len(parsed_args) == 1:
+                # _confirm is the only param - this was our injection
+                parsed_args.pop("_confirm")
         tool_call = {
             "id": tool_id,
             "type": "function",
         }
         self._thinking_cache.store(cache_key, json.dumps(data))
+        lib_logger.debug(f"Cached thinking: {cache_key[:50]}...")
     # =========================================================================
     # PROVIDER INTERFACE IMPLEMENTATION
         file_logger: Optional[AntigravityFileLogger] = None,
     ) -> litellm.ModelResponse:
         """Handle non-streaming completion."""
+        response = await client.post(
+            url,
+            headers=headers,
+            json=payload,
+            timeout=TimeoutConfig.non_streaming(),
+        )
         response.raise_for_status()
         data = response.json()
         }
         async with client.stream(
+            "POST",
+            url,
+            headers=headers,
+            json=payload,
+            timeout=TimeoutConfig.streaming(),
         ) as response:
             if response.status_code >= 400:
+                # Read error body so it's available in response.text for logging
+                # The actual logging happens in failure_logger via _extract_response_body
                 try:
                     await response.aread()
                     # lib_logger.error(
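
The `_inline_schema_refs` helper added in this file can be exercised on its own. The following is a minimal standalone sketch of the same logic, renamed `inline_schema_refs` to make clear it is a copy for illustration and not the module's function; the sample schema is invented. It shows a local `$ref` being replaced by its definition and a self-referencing definition being cut off instead of recursing forever.

```python
import copy
from typing import Any, Dict


def inline_schema_refs(schema: Dict[str, Any]) -> Dict[str, Any]:
    """Standalone copy of the $ref inlining logic shown in the diff."""
    if not isinstance(schema, dict):
        return schema
    defs = schema.get("$defs", schema.get("definitions", {}))
    if not defs:
        return schema

    def resolve(node: Any, seen: tuple = ()) -> Any:
        if not isinstance(node, dict):
            return [resolve(x, seen) for x in node] if isinstance(node, list) else node
        if "$ref" in node:
            ref = node["$ref"]
            if ref in seen:
                # Circular reference: drop the $ref and keep any sibling keys.
                return {k: resolve(v, seen) for k, v in node.items() if k != "$ref"}
            for prefix in ("#/$defs/", "#/definitions/"):
                if isinstance(ref, str) and ref.startswith(prefix):
                    name = ref[len(prefix):]
                    if name in defs:
                        return resolve(copy.deepcopy(defs[name]), seen + (ref,))
            # Unresolvable reference: drop the $ref and keep sibling keys.
            return {k: resolve(v, seen) for k, v in node.items() if k != "$ref"}
        return {k: resolve(v, seen) for k, v in node.items()}

    return resolve(schema)


# Hypothetical tool schema with one plain and one self-referencing definition.
example_schema = {
    "type": "object",
    "properties": {
        "home": {"$ref": "#/$defs/Address"},
        "tree": {"$ref": "#/$defs/Node"},
    },
    "$defs": {
        "Address": {"type": "object", "properties": {"city": {"type": "string"}}},
        "Node": {"type": "object", "properties": {"child": {"$ref": "#/$defs/Node"}}},
    },
}

inlined = inline_schema_refs(example_schema)
```

Note that the `$defs` block itself is still present in the result. In the diff, `_clean_claude_schema` runs immediately afterwards, so removing leftover keywords appears to be left to that pass.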

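The `_confirm` workaround in `antigravity_provider.py` has two halves: injection when a tool declares no properties, and removal when the model's arguments come back. A rough sketch of that round trip, using hypothetical helper names (`inject_confirm`, `strip_confirm`) that do not exist in the codebase:

```python
from typing import Any, Dict, Optional

CONFIRM_PARAM = "_confirm"  # parameter name taken from the diff


def inject_confirm(params: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Return a schema that always declares at least one property."""
    if isinstance(params, dict) and params:
        schema = dict(params)
    else:
        schema = {"type": "object"}
    if not schema.get("properties"):
        schema["properties"] = {
            CONFIRM_PARAM: {
                "type": "string",
                "description": "Enter 'yes' to proceed",
            }
        }
        schema["required"] = [CONFIRM_PARAM]
    return schema


def strip_confirm(args: Any) -> Any:
    """Remove the injected parameter only when it is the sole argument."""
    if isinstance(args, dict) and len(args) == 1 and CONFIRM_PARAM in args:
        return {}
    return args
```

One consequence of the "sole parameter" rule, assuming the sketch matches the diff's behavior: a tool whose only real parameter happens to be named `_confirm` would have that argument stripped as well. The leading underscore makes a collision unlikely, but the response handler never checks whether the parameter was actually injected.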
src/rotator_library/providers/gemini_auth_base.py CHANGED Viewed

@@ -1,15 +1,35 @@
 # src/rotator_library/providers/gemini_auth_base.py
 from .google_oauth_base import GoogleOAuthBase
 class GeminiAuthBase(GoogleOAuthBase):
     """
     Gemini CLI OAuth2 authentication implementation.
     Inherits all OAuth functionality from GoogleOAuthBase with Gemini-specific configuration.
     """
-    CLIENT_ID = "681255809395-oo8ft2oprdrnp9e3aqf6av3hmdib135j.apps.googleusercontent.com"
     CLIENT_SECRET = "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"
     OAUTH_SCOPES = [
         "https://www.googleapis.com/auth/cloud-platform",
@@ -18,4 +38,606 @@ class GeminiAuthBase(GoogleOAuthBase):
     ]
     ENV_PREFIX = "GEMINI_CLI"
     CALLBACK_PORT = 8085
-    CALLBACK_PATH = "/oauth2callback"

 # src/rotator_library/providers/gemini_auth_base.py
+import asyncio
+import json
+import logging
+import os
+from pathlib import Path
+from typing import Any, Dict, Optional, List
+import httpx
 from .google_oauth_base import GoogleOAuthBase
+lib_logger = logging.getLogger("rotator_library")
+# Code Assist endpoint for project discovery
+CODE_ASSIST_ENDPOINT = "https://cloudcode-pa.googleapis.com/v1internal"
 class GeminiAuthBase(GoogleOAuthBase):
     """
     Gemini CLI OAuth2 authentication implementation.
     Inherits all OAuth functionality from GoogleOAuthBase with Gemini-specific configuration.
+    Also provides project/tier discovery functionality that runs during authentication,
+    ensuring credentials have their tier and project_id cached before any API requests.
     """
+    CLIENT_ID = (
+        "681255809395-oo8ft2oprdrnp9e3aqf6av3hmdib135j.apps.googleusercontent.com"
+    )
     CLIENT_SECRET = "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"
     OAUTH_SCOPES = [
         "https://www.googleapis.com/auth/cloud-platform",
     ]
     ENV_PREFIX = "GEMINI_CLI"
     CALLBACK_PORT = 8085
+    CALLBACK_PATH = "/oauth2callback"
+    def __init__(self):
+        super().__init__()
+        # Project and tier caches - shared between auth base and provider
+        self.project_id_cache: Dict[str, str] = {}
+        self.project_tier_cache: Dict[str, str] = {}
+    # =========================================================================
+    # POST-AUTH DISCOVERY HOOK
+    # =========================================================================
+    async def _post_auth_discovery(
+        self, credential_path: str, access_token: str
+    ) -> None:
+        """
+        Discover and cache tier/project information immediately after OAuth authentication.
+        This is called by GoogleOAuthBase._perform_interactive_oauth() after successful auth,
+        ensuring tier and project_id are cached during the authentication flow rather than
+        waiting for the first API request.
+        Args:
+            credential_path: Path to the credential file
+            access_token: The newly obtained access token
+        """
+        lib_logger.debug(
+            f"Starting post-auth discovery for GeminiCli credential: {Path(credential_path).name}"
+        )
+        # Skip if already discovered (shouldn't happen during fresh auth, but be defensive)
+        if (
+            credential_path in self.project_id_cache
+            and credential_path in self.project_tier_cache
+        ):
+            lib_logger.debug(
+                f"Tier and project already cached for {Path(credential_path).name}, skipping discovery"
+            )
+            return
+        # Call _discover_project_id which handles tier/project discovery and persistence
+        # Pass empty litellm_params since we're in auth context (no model-specific overrides)
+        project_id = await self._discover_project_id(
+            credential_path, access_token, litellm_params={}
+        )
+        tier = self.project_tier_cache.get(credential_path, "unknown")
+        lib_logger.info(
+            f"Post-auth discovery complete for {Path(credential_path).name}: "
+            f"tier={tier}, project={project_id}"
+        )
+    # =========================================================================
+    # PROJECT ID DISCOVERY
+    # =========================================================================
+    async def _discover_project_id(
+        self, credential_path: str, access_token: str, litellm_params: Dict[str, Any]
+    ) -> str:
+        """
+        Discovers the Google Cloud Project ID, with caching and onboarding for new accounts.
+        This follows the official Gemini CLI discovery flow:
+        1. Check in-memory cache
+        2. Check configured project_id override (litellm_params or env var)
+        3. Check persisted project_id in credential file
+        4. Call loadCodeAssist to check if user is already known (has currentTier)
+           - If currentTier exists AND cloudaicompanionProject returned: use server's project
+           - If currentTier exists but NO cloudaicompanionProject: use configured project_id (paid tier requires this)
+           - If no currentTier: user needs onboarding
+        5. Onboard user based on tier:
+           - FREE tier: pass cloudaicompanionProject=None (server-managed)
+           - PAID tier: pass cloudaicompanionProject=configured_project_id
+        6. Fallback to GCP Resource Manager project listing
+        """
+        lib_logger.debug(
+            f"Starting project discovery for credential: {credential_path}"
+        )
+        # Check in-memory cache first
+        if credential_path in self.project_id_cache:
+            cached_project = self.project_id_cache[credential_path]
+            lib_logger.debug(f"Using cached project ID: {cached_project}")
+            return cached_project
+        # Check for configured project ID override (from litellm_params or env var)
+        # This is REQUIRED for paid tier users per the official CLI behavior
+        configured_project_id = litellm_params.get("project_id") or os.getenv(
+            "GEMINI_CLI_PROJECT_ID"
+        )
+        if configured_project_id:
+            lib_logger.debug(
+                f"Found configured project_id override: {configured_project_id}"
+            )
+        # Load credentials from file to check for persisted project_id and tier
+        # Skip for env:// paths (environment-based credentials don't persist to files)
+        credential_index = self._parse_env_credential_path(credential_path)
+        if credential_index is None:
+            # Only try to load from file if it's not an env:// path
+            try:
+                with open(credential_path, "r") as f:
+                    creds = json.load(f)
+                metadata = creds.get("_proxy_metadata", {})
+                persisted_project_id = metadata.get("project_id")
+                persisted_tier = metadata.get("tier")
+                if persisted_project_id:
+                    lib_logger.info(
+                        f"Loaded persisted project ID from credential file: {persisted_project_id}"
+                    )
+                    self.project_id_cache[credential_path] = persisted_project_id
+                    # Also load tier if available
+                    if persisted_tier:
+                        self.project_tier_cache[credential_path] = persisted_tier
+                        lib_logger.debug(f"Loaded persisted tier: {persisted_tier}")
+                    return persisted_project_id
+            except (FileNotFoundError, json.JSONDecodeError, KeyError) as e:
+                lib_logger.debug(f"Could not load persisted project ID from file: {e}")
+        lib_logger.debug(
+            "No cached or configured project ID found, initiating discovery..."
+        )
+        headers = {
+            "Authorization": f"Bearer {access_token}",
+            "Content-Type": "application/json",
+        }
+        discovered_project_id = None
+        discovered_tier = None
+        async with httpx.AsyncClient() as client:
+            # 1. Try discovery endpoint with loadCodeAssist
+            lib_logger.debug(
+                "Attempting project discovery via Code Assist loadCodeAssist endpoint..."
+            )
+            try:
+                # Build metadata - include duetProject only if we have a configured project
+                core_client_metadata = {
+                    "ideType": "IDE_UNSPECIFIED",
+                    "platform": "PLATFORM_UNSPECIFIED",
+                    "pluginType": "GEMINI",
+                }
+                if configured_project_id:
+                    core_client_metadata["duetProject"] = configured_project_id
+                # Build load request - pass configured_project_id if available, otherwise None
+                load_request = {
+                    "cloudaicompanionProject": configured_project_id,  # Can be None
+                    "metadata": core_client_metadata,
+                }
+                lib_logger.debug(
+                    f"Sending loadCodeAssist request with cloudaicompanionProject={configured_project_id}"
+                )
+                response = await client.post(
+                    f"{CODE_ASSIST_ENDPOINT}:loadCodeAssist",
+                    headers=headers,
+                    json=load_request,
+                    timeout=20,
+                )
+                response.raise_for_status()
+                data = response.json()
+                # Log full response for debugging
+                lib_logger.debug(
+                    f"loadCodeAssist full response keys: {list(data.keys())}"
+                )
+                # Extract and log ALL tier information for debugging
+                allowed_tiers = data.get("allowedTiers", [])
+                current_tier = data.get("currentTier")
+                lib_logger.debug("=== Tier Information ===")
+                lib_logger.debug(f"currentTier: {current_tier}")
+                lib_logger.debug(f"allowedTiers count: {len(allowed_tiers)}")
+                for i, tier in enumerate(allowed_tiers):
+                    tier_id = tier.get("id", "unknown")
+                    is_default = tier.get("isDefault", False)
+                    user_defined = tier.get("userDefinedCloudaicompanionProject", False)
+                    lib_logger.debug(
+                        f"  Tier {i + 1}: id={tier_id}, isDefault={is_default}, userDefinedProject={user_defined}"
+                    )
+                lib_logger.debug("========================")
+                # Determine the current tier ID
+                current_tier_id = None
+                if current_tier:
+                    current_tier_id = current_tier.get("id")
+                    lib_logger.debug(f"User has currentTier: {current_tier_id}")
+                # Check if user is already known to server (has currentTier)
+                if current_tier_id:
+                    # User is already onboarded - check for project from server
+                    server_project = data.get("cloudaicompanionProject")
+                    # Check if this tier requires user-defined project (paid tiers)
+                    requires_user_project = any(
+                        t.get("id") == current_tier_id
+                        and t.get("userDefinedCloudaicompanionProject", False)
+                        for t in allowed_tiers
+                    )
+                    is_free_tier = current_tier_id == "free-tier"
+                    if server_project:
+                        # Server returned a project - use it (server wins)
+                        # This is the normal case for FREE tier users
+                        project_id = server_project
+                        lib_logger.debug(f"Server returned project: {project_id}")
+                    elif configured_project_id:
+                        # No server project but we have configured one - use it
+                        # This is the PAID TIER case where server doesn't return a project
+                        project_id = configured_project_id
+                        lib_logger.debug(
+                            f"No server project, using configured: {project_id}"
+                        )
+                    elif is_free_tier:
+                        # Free tier user without server project - this shouldn't happen normally
+                        # but let's not fail, just proceed to onboarding
+                        lib_logger.debug(
+                            "Free tier user with currentTier but no project - will try onboarding"
+                        )
+                        project_id = None
+                    elif requires_user_project:
+                        # Paid tier requires a project ID to be set
+                        raise ValueError(
+                            f"Paid tier '{current_tier_id}' requires setting GEMINI_CLI_PROJECT_ID environment variable. "
+                            "See https://goo.gle/gemini-cli-auth-docs#workspace-gca"
+                        )
+                    else:
+                        # Unknown tier without project - proceed carefully
+                        lib_logger.warning(
+                            f"Tier '{current_tier_id}' has no project and none configured - will try onboarding"
+                        )
+                        project_id = None
+                    if project_id:
+                        # Cache tier info
+                        self.project_tier_cache[credential_path] = current_tier_id
+                        discovered_tier = current_tier_id
+                        # Log appropriately based on tier
+                        is_paid = current_tier_id and current_tier_id not in [
+                            "free-tier",
+                            "legacy-tier",
+                            "unknown",
+                        ]
+                        if is_paid:
+                            lib_logger.info(
+                                f"Using Gemini paid tier '{current_tier_id}' with project: {project_id}"
+                            )
+                        else:
+                            lib_logger.info(
+                                f"Discovered Gemini project ID via loadCodeAssist: {project_id}"
+                            )
+                        self.project_id_cache[credential_path] = project_id
+                        discovered_project_id = project_id
+                        # Persist to credential file
+                        await self._persist_project_metadata(
+                            credential_path, project_id, discovered_tier
+                        )
+                        return project_id
+                # 2. User needs onboarding - no currentTier
+                lib_logger.info(
+                    "No existing Gemini session found (no currentTier), attempting to onboard user..."
+                )
+                # Determine which tier to onboard with
+                onboard_tier = None
+                for tier in allowed_tiers:
+                    if tier.get("isDefault"):
+                        onboard_tier = tier
+                        break
+                # Fallback to LEGACY tier if no default (requires user project)
+                if not onboard_tier and allowed_tiers:
+                    # Look for legacy-tier as fallback
+                    for tier in allowed_tiers:
+                        if tier.get("id") == "legacy-tier":
+                            onboard_tier = tier
+                            break
+                    # If still no tier, use first available
+                    if not onboard_tier:
+                        onboard_tier = allowed_tiers[0]
+                if not onboard_tier:
+                    raise ValueError("No onboarding tiers available from server")
+                tier_id = onboard_tier.get("id", "free-tier")
+                requires_user_project = onboard_tier.get(
+                    "userDefinedCloudaicompanionProject", False
+                )
+                lib_logger.debug(
+                    f"Onboarding with tier: {tier_id}, requiresUserProject: {requires_user_project}"
+                )
+                # Build onboard request based on tier type (following official CLI logic)
+                # FREE tier: cloudaicompanionProject = None (server-managed)
+                # PAID tier: cloudaicompanionProject = configured_project_id (user must provide)
+                is_free_tier = tier_id == "free-tier"
+                if is_free_tier:
+                    # Free tier uses server-managed project
+                    onboard_request = {
+                        "tierId": tier_id,
+                        "cloudaicompanionProject": None,  # Server will create/manage
+                        "metadata": core_client_metadata,
+                    }
+                    lib_logger.debug(
+                        "Free tier onboarding: using server-managed project"
+                    )
+                else:
+                    # Paid/legacy tier requires user-provided project
+                    if not configured_project_id and requires_user_project:
+                        raise ValueError(
+                            f"Tier '{tier_id}' requires setting GEMINI_CLI_PROJECT_ID environment variable. "
+                            "See https://goo.gle/gemini-cli-auth-docs#workspace-gca"
+                        )
+                    onboard_request = {
+                        "tierId": tier_id,
+                        "cloudaicompanionProject": configured_project_id,
+                        "metadata": {
+                            **core_client_metadata,
+                            "duetProject": configured_project_id,
+                        }
+                        if configured_project_id
+                        else core_client_metadata,
+                    }
+                    lib_logger.debug(
+                        f"Paid tier onboarding: using project {configured_project_id}"
+                    )
+                lib_logger.debug("Initiating onboardUser request...")
+                lro_response = await client.post(
+                    f"{CODE_ASSIST_ENDPOINT}:onboardUser",
+                    headers=headers,
+                    json=onboard_request,
+                    timeout=30,
+                )
+                lro_response.raise_for_status()
+                lro_data = lro_response.json()
+                lib_logger.debug(
+                    f"Initial onboarding response: done={lro_data.get('done')}"
+                )
+                for i in range(150):  # Poll for up to 5 minutes (150 × 2s)
+                    if lro_data.get("done"):
+                        lib_logger.debug(
+                            f"Onboarding completed after {i} polling attempts"
+                        )
+                        break
+                    await asyncio.sleep(2)
+                    if (i + 1) % 15 == 0:  # Log every 30 seconds
+                        lib_logger.info(
+                            f"Still waiting for onboarding completion... ({(i + 1) * 2}s elapsed)"
+                        )
+                    lib_logger.debug(
+                        f"Polling onboarding status... (Attempt {i + 1}/150)"
+                    )
+                    lro_response = await client.post(
+                        f"{CODE_ASSIST_ENDPOINT}:onboardUser",
+                        headers=headers,
+                        json=onboard_request,
+                        timeout=30,
+                    )
+                    lro_response.raise_for_status()
+                    lro_data = lro_response.json()
+                if not lro_data.get("done"):
+                    lib_logger.error("Onboarding process timed out after 5 minutes")
+                    raise ValueError(
+                        "Onboarding process timed out after 5 minutes. Please try again or contact support."
+                    )
+                # Extract project ID from LRO response
+                # Note: onboardUser returns response.cloudaicompanionProject as an object with .id
+                lro_response_data = lro_data.get("response", {})
+                lro_project_obj = lro_response_data.get("cloudaicompanionProject", {})
+                project_id = (
+                    lro_project_obj.get("id")
+                    if isinstance(lro_project_obj, dict)
+                    else None
+                )
+                # Fallback to configured project if LRO didn't return one
+                if not project_id and configured_project_id:
+                    project_id = configured_project_id
+                    lib_logger.debug(
+                        f"LRO didn't return project, using configured: {project_id}"
+                    )
+                if not project_id:
+                    lib_logger.error(
+                        "Onboarding completed but no project ID in response and none configured"
+                    )
+                    raise ValueError(
+                        "Onboarding completed, but no project ID was returned. "
+                        "For paid tiers, set GEMINI_CLI_PROJECT_ID environment variable."
+                    )
+                lib_logger.debug(
+                    f"Successfully extracted project ID from onboarding response: {project_id}"
+                )
+                # Cache tier info
+                self.project_tier_cache[credential_path] = tier_id
+                discovered_tier = tier_id
+                lib_logger.debug(f"Cached tier information: {tier_id}")
+                # Log concise message for paid projects
+                is_paid = tier_id and tier_id not in ["free-tier", "legacy-tier"]
+                if is_paid:
+                    lib_logger.info(
+                        f"Using Gemini paid tier '{tier_id}' with project: {project_id}"
+                    )
+                else:
+                    lib_logger.info(
+                        f"Successfully onboarded user and discovered project ID: {project_id}"
+                    )
+                self.project_id_cache[credential_path] = project_id
+                discovered_project_id = project_id
+                # Persist to credential file
+                await self._persist_project_metadata(
+                    credential_path, project_id, discovered_tier
+                )
+                return project_id
+            except httpx.HTTPStatusError as e:
+                error_body = ""
+                try:
+                    error_body = e.response.text
+                except Exception:
+                    pass
+                if e.response.status_code == 403:
+                    lib_logger.error(
+                        f"Gemini Code Assist API access denied (403). Response: {error_body}"
+                    )
+                    lib_logger.error(
+                        "Possible causes: 1) cloudaicompanion.googleapis.com API not enabled, 2) Wrong project ID for paid tier, 3) Account lacks permissions"
+                    )
+                elif e.response.status_code == 404:
+                    lib_logger.warning(
+                        "Gemini Code Assist endpoint not found (404). Falling back to project listing."
+                    )
+                elif e.response.status_code == 412:
+                    # Precondition Failed - often means wrong project for free tier onboarding
+                    lib_logger.error(
+                        f"Precondition failed (412): {error_body}. This may mean the project ID is incompatible with the selected tier."
+                    )
+                else:
+                    lib_logger.warning(
+                        f"Gemini onboarding/discovery failed with status {e.response.status_code}: {error_body}. Falling back to project listing."
+                    )
+            except httpx.RequestError as e:
+                lib_logger.warning(
+                    f"Gemini onboarding/discovery network error: {e}. Falling back to project listing."
+                )
+        # 3. Fallback to listing all available GCP projects (last resort)
+        lib_logger.debug(
+            "Attempting to discover project via GCP Resource Manager API..."
+        )
+        try:
+            async with httpx.AsyncClient() as client:
+                lib_logger.debug(
+                    "Querying Cloud Resource Manager for available projects..."
+                )
+                response = await client.get(
+                    "https://cloudresourcemanager.googleapis.com/v1/projects",
+                    headers=headers,
+                    timeout=20,
+                )
+                response.raise_for_status()
+                projects = response.json().get("projects", [])
+                lib_logger.debug(f"Found {len(projects)} total projects")
+                active_projects = [
+                    p for p in projects if p.get("lifecycleState") == "ACTIVE"
+                ]
+                lib_logger.debug(f"Found {len(active_projects)} active projects")
+                if not projects:
+                    lib_logger.error(
+                        "No GCP projects found for this account. Please create a project in Google Cloud Console."
+                    )
+                elif not active_projects:
+                    lib_logger.error(
+                        "No active GCP projects found. Please activate a project in Google Cloud Console."
+                    )
+                else:
+                    project_id = active_projects[0]["projectId"]
+                    lib_logger.info(
+                        f"Discovered Gemini project ID from active projects list: {project_id}"
+                    )
+                    lib_logger.debug(
+                        f"Selected first active project: {project_id} (out of {len(active_projects)} active projects)"
+                    )
+                    self.project_id_cache[credential_path] = project_id
+                    discovered_project_id = project_id
+                    # Persist to credential file (no tier info from resource manager)
+                    await self._persist_project_metadata(
+                        credential_path, project_id, None
+                    )
+                    return project_id
+        except httpx.HTTPStatusError as e:
+            if e.response.status_code == 403:
+                lib_logger.error(
+                    "Failed to list GCP projects due to a 403 Forbidden error. The Cloud Resource Manager API may not be enabled, or your account lacks the 'resourcemanager.projects.list' permission."
+                )
+            else:
+                lib_logger.error(
+                    f"Failed to list GCP projects with status {e.response.status_code}: {e}"
+                )
+        except httpx.RequestError as e:
+            lib_logger.error(f"Network error while listing GCP projects: {e}")
+        raise ValueError(
+            "Could not auto-discover Gemini project ID. Possible causes:\n"
+            "  1. The cloudaicompanion.googleapis.com API is not enabled (enable it in Google Cloud Console)\n"
+            "  2. No active GCP projects exist for this account (create one in Google Cloud Console)\n"
+            "  3. Account lacks necessary permissions\n"
+            "To manually specify a project, set GEMINI_CLI_PROJECT_ID in your .env file."
+        )
+    async def _persist_project_metadata(
+        self, credential_path: str, project_id: str, tier: Optional[str]
+    ):
+        """Persists project ID and tier to the credential file for faster future startups."""
+        # Skip persistence for env:// paths (environment-based credentials)
+        credential_index = self._parse_env_credential_path(credential_path)
+        if credential_index is not None:
+            lib_logger.debug(
+                f"Skipping project metadata persistence for env:// credential path: {credential_path}"
+            )
+            return
+        try:
+            # Load current credentials
+            with open(credential_path, "r") as f:
+                creds = json.load(f)
+            # Update metadata
+            if "_proxy_metadata" not in creds:
+                creds["_proxy_metadata"] = {}
+            creds["_proxy_metadata"]["project_id"] = project_id
+            if tier:
+                creds["_proxy_metadata"]["tier"] = tier
+            # Save back using the existing save method (handles atomic writes and permissions)
+            await self._save_credentials(credential_path, creds)
+            lib_logger.debug(
+                f"Persisted project_id and tier to credential file: {credential_path}"
+            )
+        except Exception as e:
+            lib_logger.warning(
+                f"Failed to persist project metadata to credential file: {e}"
+            )
+            # Non-fatal - just means slower startup next time
+    # =========================================================================
+    # CREDENTIAL MANAGEMENT OVERRIDES
+    # =========================================================================
+    def _get_provider_file_prefix(self) -> str:
+        """Return the file prefix for Gemini CLI credentials."""
+        return "gemini_cli"
+    def build_env_lines(self, creds: Dict[str, Any], cred_number: int) -> List[str]:
+        """
+        Generate .env file lines for a Gemini CLI credential.
+        Includes tier and project_id from _proxy_metadata.
+        """
+        # Get base lines from parent class
+        lines = super().build_env_lines(creds, cred_number)
+        # Add Gemini-specific fields (tier and project_id)
+        metadata = creds.get("_proxy_metadata", {})
+        prefix = f"{self.ENV_PREFIX}_{cred_number}"
+        project_id = metadata.get("project_id", "")
+        tier = metadata.get("tier", "")
+        if project_id:
+            lines.append(f"{prefix}_PROJECT_ID={project_id}")
+        if tier:
+            lines.append(f"{prefix}_TIER={tier}")
+        return lines

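Reviewer note: the onboarding branch added to `gemini_auth_base.py` above makes two decisions that are easier to check in isolation: which tier to onboard with (default tier, then `legacy-tier`, then the first one listed) and how the project ID is read from the `onboardUser` response, with the configured project as a fallback. The sketch below restates both as pure functions. The names `select_onboard_tier` and `extract_project_id` are hypothetical and do not exist in the commit, and the tier ID `standard-tier` in the usage lines is made-up sample data.

```python
from typing import Any, Dict, List, Optional


def select_onboard_tier(allowed_tiers: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Hypothetical helper mirroring the tier choice in the diff above."""
    # 1. Prefer the tier the server marks as default.
    for tier in allowed_tiers:
        if tier.get("isDefault"):
            return tier
    # 2. Otherwise fall back to the legacy tier, if it is offered.
    for tier in allowed_tiers:
        if tier.get("id") == "legacy-tier":
            return tier
    # 3. Otherwise take the first tier the server listed.
    if allowed_tiers:
        return allowed_tiers[0]
    raise ValueError("No onboarding tiers available from server")


def extract_project_id(
    lro_data: Dict[str, Any], configured_project_id: Optional[str] = None
) -> Optional[str]:
    """Hypothetical helper mirroring the project ID extraction in the diff above."""
    # onboardUser is expected to return response.cloudaicompanionProject
    # as an object carrying an "id" field.
    project_obj = lro_data.get("response", {}).get("cloudaicompanionProject", {})
    project_id = project_obj.get("id") if isinstance(project_obj, dict) else None
    # Fall back to the configured project when the response carries none.
    return project_id or configured_project_id


# Usage with made-up sample data (tier IDs other than "free-tier" and
# "legacy-tier" are illustrative only).
sample_tiers = [
    {"id": "standard-tier", "userDefinedCloudaicompanionProject": True},
    {"id": "legacy-tier", "userDefinedCloudaicompanionProject": True},
    {"id": "free-tier", "isDefault": True},
]
chosen = select_onboard_tier(sample_tiers)
project = extract_project_id(
    {"done": True, "response": {"cloudaicompanionProject": {"id": "demo-project"}}}
)
```

Splitting the decisions out this way is only a suggestion for unit testing. The committed code keeps them inline so that the debug logging sits next to each branch.
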
src/rotator_library/providers/gemini_cli_provider.py CHANGED

@@ -11,6 +11,8 @@ from .provider_interface import ProviderInterface
 from .gemini_auth_base import GeminiAuthBase
 from .provider_cache import ProviderCache
 from ..model_definitions import ModelDefinitions
 import litellm
 from litellm.exceptions import RateLimitError
 from ..error_handler import extract_retry_after_from_body
@@ -21,8 +23,22 @@ from datetime import datetime
 lib_logger = logging.getLogger("rotator_library")
-LOGS_DIR = Path(__file__).resolve().parent.parent.parent.parent / "logs"
-GEMINI_CLI_LOGS_DIR = LOGS_DIR / "gemini_cli_logs"
 class _GeminiCliFileLogger:
@@ -38,7 +54,7 @@ class _GeminiCliFileLogger:
         # Sanitize model name for directory
         safe_model_name = model_name.replace("/", "_").replace(":", "_")
         self.log_dir = (
-            GEMINI_CLI_LOGS_DIR / f"{timestamp}_{safe_model_name}_{request_id}"
         )
         try:
             self.log_dir.mkdir(parents=True, exist_ok=True)
@@ -102,12 +118,6 @@ HARDCODED_MODELS = [
     "gemini-3-pro-preview",
 ]
-# Cache directory for Gemini CLI
-CACHE_DIR = (
-    Path(__file__).resolve().parent.parent.parent.parent / "cache" / "gemini_cli"
-)
-GEMINI3_SIGNATURE_CACHE_FILE = CACHE_DIR / "gemini3_signatures.json"
 # Gemini 3 tool fix system instruction (prevents hallucination)
 DEFAULT_GEMINI3_SYSTEM_INSTRUCTION = """<CRITICAL_TOOL_USAGE_INSTRUCTIONS>
 You are operating in a CUSTOM ENVIRONMENT where tool definitions COMPLETELY DIFFER from your training data.
@@ -173,6 +183,98 @@ FINISH_REASON_MAP = {
 }
 def _env_bool(key: str, default: bool = False) -> bool:
     """Get boolean from environment variable."""
     return os.getenv(key, str(default).lower()).lower() in ("true", "1", "yes")
@@ -186,8 +288,8 @@ def _env_int(key: str, default: int) -> int:
 class GeminiCliProvider(GeminiAuthBase, ProviderInterface):
     skip_cost_calculation = True
-    # Balanced by default - Gemini CLI has short cooldowns (seconds, not hours)
-    default_rotation_mode: str = "balanced"
     # =========================================================================
     # TIER CONFIGURATION
@@ -234,32 +336,156 @@ class GeminiCliProvider(GeminiAuthBase, ProviderInterface):
         error: Exception, error_body: Optional[str] = None
     ) -> Optional[Dict[str, Any]]:
         """
-        Parse Gemini CLI quota errors.
-        Uses the same Google RPC format as Antigravity but typically has
-        much shorter cooldown durations (seconds to minutes, not hours).
         Args:
             error: The caught exception
             error_body: Optional raw response body string
         Returns:
-            Same format as AntigravityProvider.parse_quota_error()
         """
-        # Reuse the same parsing logic as Antigravity since both use Google RPC format
-        from .antigravity_provider import AntigravityProvider
-        return AntigravityProvider.parse_quota_error(error, error_body)
     def __init__(self):
         super().__init__()
         self.model_definitions = ModelDefinitions()
-        self.project_id_cache: Dict[
-            str, str
-        ] = {}  # Cache project ID per credential path
-        self.project_tier_cache: Dict[
-            str, str
-        ] = {}  # Cache project tier per credential path
         # Gemini 3 configuration from environment
         memory_ttl = _env_int("GEMINI_CLI_SIGNATURE_CACHE_TTL", 3600)
@@ -267,7 +493,7 @@ class GeminiCliProvider(GeminiAuthBase, ProviderInterface):
         # Initialize signature cache for Gemini 3 thoughtSignatures
         self._signature_cache = ProviderCache(
-            GEMINI3_SIGNATURE_CACHE_FILE,
             memory_ttl,
             disk_ttl,
             env_prefix="GEMINI_CLI_SIGNATURE",
@@ -381,7 +607,7 @@ class GeminiCliProvider(GeminiAuthBase, ProviderInterface):
         # Gemini 3 requires paid tier
         if model_name.startswith("gemini-3-"):
-            return 1  # Only priority 1 (paid) credentials
         return None  # All other models have no restrictions
@@ -391,9 +617,48 @@ class GeminiCliProvider(GeminiAuthBase, ProviderInterface):
         This ensures all credential priorities are known before any API calls,
         preventing unknown credentials from getting priority 999.
         """
         await self._load_persisted_tiers(credential_paths)
     async def _load_persisted_tiers(
         self, credential_paths: List[str]
     ) -> Dict[str, str]:
@@ -451,6 +716,8 @@ class GeminiCliProvider(GeminiAuthBase, ProviderInterface):
         return loaded
     # =========================================================================
     # MODEL UTILITIES
     # =========================================================================
@@ -466,520 +733,7 @@ class GeminiCliProvider(GeminiAuthBase, ProviderInterface):
             return name[len(self._gemini3_tool_prefix) :]
         return name
-    async def _discover_project_id(
-        self, credential_path: str, access_token: str, litellm_params: Dict[str, Any]
-    ) -> str:
-        """
-        Discovers the Google Cloud Project ID, with caching and onboarding for new accounts.
-        This follows the official Gemini CLI discovery flow:
-        1. Check in-memory cache
-        2. Check configured project_id override (litellm_params or env var)
-        3. Check persisted project_id in credential file
-        4. Call loadCodeAssist to check if user is already known (has currentTier)
-           - If currentTier exists AND cloudaicompanionProject returned: use server's project
-           - If currentTier exists but NO cloudaicompanionProject: use configured project_id (paid tier requires this)
-           - If no currentTier: user needs onboarding
-        5. Onboard user based on tier:
-           - FREE tier: pass cloudaicompanionProject=None (server-managed)
-           - PAID tier: pass cloudaicompanionProject=configured_project_id
-        6. Fallback to GCP Resource Manager project listing
-        """
-        lib_logger.debug(
-            f"Starting project discovery for credential: {credential_path}"
-        )
-        # Check in-memory cache first
-        if credential_path in self.project_id_cache:
-            cached_project = self.project_id_cache[credential_path]
-            lib_logger.debug(f"Using cached project ID: {cached_project}")
-            return cached_project
-        # Check for configured project ID override (from litellm_params or env var)
-        # This is REQUIRED for paid tier users per the official CLI behavior
-        configured_project_id = litellm_params.get("project_id")
-        if configured_project_id:
-            lib_logger.debug(
-                f"Found configured project_id override: {configured_project_id}"
-            )
-        # Load credentials from file to check for persisted project_id and tier
-        # Skip for env:// paths (environment-based credentials don't persist to files)
-        credential_index = self._parse_env_credential_path(credential_path)
-        if credential_index is None:
-            # Only try to load from file if it's not an env:// path
-            try:
-                with open(credential_path, "r") as f:
-                    creds = json.load(f)
-                metadata = creds.get("_proxy_metadata", {})
-                persisted_project_id = metadata.get("project_id")
-                persisted_tier = metadata.get("tier")
-                if persisted_project_id:
-                    lib_logger.info(
-                        f"Loaded persisted project ID from credential file: {persisted_project_id}"
-                    )
-                    self.project_id_cache[credential_path] = persisted_project_id
-                    # Also load tier if available
-                    if persisted_tier:
-                        self.project_tier_cache[credential_path] = persisted_tier
-                        lib_logger.debug(f"Loaded persisted tier: {persisted_tier}")
-                    return persisted_project_id
-            except (FileNotFoundError, json.JSONDecodeError, KeyError) as e:
-                lib_logger.debug(f"Could not load persisted project ID from file: {e}")
-        lib_logger.debug(
-            "No cached or configured project ID found, initiating discovery..."
-        )
-        headers = {
-            "Authorization": f"Bearer {access_token}",
-            "Content-Type": "application/json",
-        }
-        discovered_project_id = None
-        discovered_tier = None
-        async with httpx.AsyncClient() as client:
-            # 1. Try discovery endpoint with loadCodeAssist
-            lib_logger.debug(
-                "Attempting project discovery via Code Assist loadCodeAssist endpoint..."
-            )
-            try:
-                # Build metadata - include duetProject only if we have a configured project
-                core_client_metadata = {
-                    "ideType": "IDE_UNSPECIFIED",
-                    "platform": "PLATFORM_UNSPECIFIED",
-                    "pluginType": "GEMINI",
-                }
-                if configured_project_id:
-                    core_client_metadata["duetProject"] = configured_project_id
-                # Build load request - pass configured_project_id if available, otherwise None
-                load_request = {
-                    "cloudaicompanionProject": configured_project_id,  # Can be None
-                    "metadata": core_client_metadata,
-                }
-                lib_logger.debug(
-                    f"Sending loadCodeAssist request with cloudaicompanionProject={configured_project_id}"
-                )
-                response = await client.post(
-                    f"{CODE_ASSIST_ENDPOINT}:loadCodeAssist",
-                    headers=headers,
-                    json=load_request,
-                    timeout=20,
-                )
-                response.raise_for_status()
-                data = response.json()
-                # Log full response for debugging
-                lib_logger.debug(
-                    f"loadCodeAssist full response keys: {list(data.keys())}"
-                )
-                # Extract and log ALL tier information for debugging
-                allowed_tiers = data.get("allowedTiers", [])
-                current_tier = data.get("currentTier")
-                lib_logger.debug(f"=== Tier Information ===")
-                lib_logger.debug(f"currentTier: {current_tier}")
-                lib_logger.debug(f"allowedTiers count: {len(allowed_tiers)}")
-                for i, tier in enumerate(allowed_tiers):
-                    tier_id = tier.get("id", "unknown")
-                    is_default = tier.get("isDefault", False)
-                    user_defined = tier.get("userDefinedCloudaicompanionProject", False)
-                    lib_logger.debug(
-                        f"  Tier {i + 1}: id={tier_id}, isDefault={is_default}, userDefinedProject={user_defined}"
-                    )
-                lib_logger.debug(f"========================")
-                # Determine the current tier ID
-                current_tier_id = None
-                if current_tier:
-                    current_tier_id = current_tier.get("id")
-                    lib_logger.debug(f"User has currentTier: {current_tier_id}")
-                # Check if user is already known to server (has currentTier)
-                if current_tier_id:
-                    # User is already onboarded - check for project from server
-                    server_project = data.get("cloudaicompanionProject")
-                    # Check if this tier requires user-defined project (paid tiers)
-                    requires_user_project = any(
-                        t.get("id") == current_tier_id
-                        and t.get("userDefinedCloudaicompanionProject", False)
-                        for t in allowed_tiers
-                    )
-                    is_free_tier = current_tier_id == "free-tier"
-                    if server_project:
-                        # Server returned a project - use it (server wins)
-                        # This is the normal case for FREE tier users
-                        project_id = server_project
-                        lib_logger.debug(f"Server returned project: {project_id}")
-                    elif configured_project_id:
-                        # No server project but we have configured one - use it
-                        # This is the PAID TIER case where server doesn't return a project
-                        project_id = configured_project_id
-                        lib_logger.debug(
-                            f"No server project, using configured: {project_id}"
-                        )
-                    elif is_free_tier:
-                        # Free tier user without server project - this shouldn't happen normally
-                        # but let's not fail, just proceed to onboarding
-                        lib_logger.debug(
-                            "Free tier user with currentTier but no project - will try onboarding"
-                        )
-                        project_id = None
-                    elif requires_user_project:
-                        # Paid tier requires a project ID to be set
-                        raise ValueError(
-                            f"Paid tier '{current_tier_id}' requires setting GEMINI_CLI_PROJECT_ID environment variable. "
-                            "See https://goo.gle/gemini-cli-auth-docs#workspace-gca"
-                        )
-                    else:
-                        # Unknown tier without project - proceed carefully
-                        lib_logger.warning(
-                            f"Tier '{current_tier_id}' has no project and none configured - will try onboarding"
-                        )
-                        project_id = None
-                    if project_id:
-                        # Cache tier info
-                        self.project_tier_cache[credential_path] = current_tier_id
-                        discovered_tier = current_tier_id
-                        # Log appropriately based on tier
-                        is_paid = current_tier_id and current_tier_id not in [
-                            "free-tier",
-                            "legacy-tier",
-                            "unknown",
-                        ]
-                        if is_paid:
-                            lib_logger.info(
-                                f"Using Gemini paid tier '{current_tier_id}' with project: {project_id}"
-                            )
-                        else:
-                            lib_logger.info(
-                                f"Discovered Gemini project ID via loadCodeAssist: {project_id}"
-                            )
-                        self.project_id_cache[credential_path] = project_id
-                        discovered_project_id = project_id
-                        # Persist to credential file
-                        await self._persist_project_metadata(
-                            credential_path, project_id, discovered_tier
-                        )
-                        return project_id
-                # 2. User needs onboarding - no currentTier
-                lib_logger.info(
-                    "No existing Gemini session found (no currentTier), attempting to onboard user..."
-                )
-                # Determine which tier to onboard with
-                onboard_tier = None
-                for tier in allowed_tiers:
-                    if tier.get("isDefault"):
-                        onboard_tier = tier
-                        break
-                # Fallback to LEGACY tier if no default (requires user project)
-                if not onboard_tier and allowed_tiers:
-                    # Look for legacy-tier as fallback
-                    for tier in allowed_tiers:
-                        if tier.get("id") == "legacy-tier":
-                            onboard_tier = tier
-                            break
-                    # If still no tier, use first available
-                    if not onboard_tier:
-                        onboard_tier = allowed_tiers[0]
-                if not onboard_tier:
-                    raise ValueError("No onboarding tiers available from server")
-                tier_id = onboard_tier.get("id", "free-tier")
-                requires_user_project = onboard_tier.get(
-                    "userDefinedCloudaicompanionProject", False
-                )
-                lib_logger.debug(
-                    f"Onboarding with tier: {tier_id}, requiresUserProject: {requires_user_project}"
-                )
-                # Build onboard request based on tier type (following official CLI logic)
-                # FREE tier: cloudaicompanionProject = None (server-managed)
-                # PAID tier: cloudaicompanionProject = configured_project_id (user must provide)
-                is_free_tier = tier_id == "free-tier"
-                if is_free_tier:
-                    # Free tier uses server-managed project
-                    onboard_request = {
-                        "tierId": tier_id,
-                        "cloudaicompanionProject": None,  # Server will create/manage
-                        "metadata": core_client_metadata,
-                    }
-                    lib_logger.debug(
-                        "Free tier onboarding: using server-managed project"
-                    )
-                else:
-                    # Paid/legacy tier requires user-provided project
-                    if not configured_project_id and requires_user_project:
-                        raise ValueError(
-                            f"Tier '{tier_id}' requires setting GEMINI_CLI_PROJECT_ID environment variable. "
-                            "See https://goo.gle/gemini-cli-auth-docs#workspace-gca"
-                        )
-                    onboard_request = {
-                        "tierId": tier_id,
-                        "cloudaicompanionProject": configured_project_id,
-                        "metadata": {
-                            **core_client_metadata,
-                            "duetProject": configured_project_id,
-                        }
-                        if configured_project_id
-                        else core_client_metadata,
-                    }
-                    lib_logger.debug(
-                        f"Paid tier onboarding: using project {configured_project_id}"
-                    )
-                lib_logger.debug("Initiating onboardUser request...")
-                lro_response = await client.post(
-                    f"{CODE_ASSIST_ENDPOINT}:onboardUser",
-                    headers=headers,
-                    json=onboard_request,
-                    timeout=30,
-                )
-                lro_response.raise_for_status()
-                lro_data = lro_response.json()
-                lib_logger.debug(
-                    f"Initial onboarding response: done={lro_data.get('done')}"
-                )
-                for i in range(150):  # Poll for up to 5 minutes (150 × 2s)
-                    if lro_data.get("done"):
-                        lib_logger.debug(
-                            f"Onboarding completed after {i} polling attempts"
-                        )
-                        break
-                    await asyncio.sleep(2)
-                    if (i + 1) % 15 == 0:  # Log every 30 seconds
-                        lib_logger.info(
-                            f"Still waiting for onboarding completion... ({(i + 1) * 2}s elapsed)"
-                        )
-                    lib_logger.debug(
-                        f"Polling onboarding status... (Attempt {i + 1}/150)"
-                    )
-                    lro_response = await client.post(
-                        f"{CODE_ASSIST_ENDPOINT}:onboardUser",
-                        headers=headers,
-                        json=onboard_request,
-                        timeout=30,
-                    )
-                    lro_response.raise_for_status()
-                    lro_data = lro_response.json()
-                if not lro_data.get("done"):
-                    lib_logger.error("Onboarding process timed out after 5 minutes")
-                    raise ValueError(
-                        "Onboarding process timed out after 5 minutes. Please try again or contact support."
-                    )
-                # Extract project ID from LRO response
-                # Note: onboardUser returns response.cloudaicompanionProject as an object with .id
-                lro_response_data = lro_data.get("response", {})
-                lro_project_obj = lro_response_data.get("cloudaicompanionProject", {})
-                project_id = (
-                    lro_project_obj.get("id")
-                    if isinstance(lro_project_obj, dict)
-                    else None
-                )
-                # Fallback to configured project if LRO didn't return one
-                if not project_id and configured_project_id:
-                    project_id = configured_project_id
-                    lib_logger.debug(
-                        f"LRO didn't return project, using configured: {project_id}"
-                    )
-                if not project_id:
-                    lib_logger.error(
-                        "Onboarding completed but no project ID in response and none configured"
-                    )
-                    raise ValueError(
-                        "Onboarding completed, but no project ID was returned. "
-                        "For paid tiers, set GEMINI_CLI_PROJECT_ID environment variable."
-                    )
-                lib_logger.debug(
-                    f"Successfully extracted project ID from onboarding response: {project_id}"
-                )
-                # Cache tier info
-                self.project_tier_cache[credential_path] = tier_id
-                discovered_tier = tier_id
-                lib_logger.debug(f"Cached tier information: {tier_id}")
-                # Log concise message for paid projects
-                is_paid = tier_id and tier_id not in ["free-tier", "legacy-tier"]
-                if is_paid:
-                    lib_logger.info(
-                        f"Using Gemini paid tier '{tier_id}' with project: {project_id}"
-                    )
-                else:
-                    lib_logger.info(
-                        f"Successfully onboarded user and discovered project ID: {project_id}"
-                    )
-                self.project_id_cache[credential_path] = project_id
-                discovered_project_id = project_id
-                # Persist to credential file
-                await self._persist_project_metadata(
-                    credential_path, project_id, discovered_tier
-                )
-                return project_id
-            except httpx.HTTPStatusError as e:
-                error_body = ""
-                try:
-                    error_body = e.response.text
-                except Exception:
-                    pass
-                if e.response.status_code == 403:
-                    lib_logger.error(
-                        f"Gemini Code Assist API access denied (403). Response: {error_body}"
-                    )
-                    lib_logger.error(
-                        "Possible causes: 1) cloudaicompanion.googleapis.com API not enabled, 2) Wrong project ID for paid tier, 3) Account lacks permissions"
-                    )
-                elif e.response.status_code == 404:
-                    lib_logger.warning(
-                        f"Gemini Code Assist endpoint not found (404). Falling back to project listing."
-                    )
-                elif e.response.status_code == 412:
-                    # Precondition Failed - often means wrong project for free tier onboarding
-                    lib_logger.error(
-                        f"Precondition failed (412): {error_body}. This may mean the project ID is incompatible with the selected tier."
-                    )
-                else:
-                    lib_logger.warning(
-                        f"Gemini onboarding/discovery failed with status {e.response.status_code}: {error_body}. Falling back to project listing."
-                    )
-            except httpx.RequestError as e:
-                lib_logger.warning(
-                    f"Gemini onboarding/discovery network error: {e}. Falling back to project listing."
-                )
-        # 3. Fallback to listing all available GCP projects (last resort)
-        lib_logger.debug(
-            "Attempting to discover project via GCP Resource Manager API..."
-        )
-        try:
-            async with httpx.AsyncClient() as client:
-                lib_logger.debug(
-                    "Querying Cloud Resource Manager for available projects..."
-                )
-                response = await client.get(
-                    "https://cloudresourcemanager.googleapis.com/v1/projects",
-                    headers=headers,
-                    timeout=20,
-                )
-                response.raise_for_status()
-                projects = response.json().get("projects", [])
-                lib_logger.debug(f"Found {len(projects)} total projects")
-                active_projects = [
-                    p for p in projects if p.get("lifecycleState") == "ACTIVE"
-                ]
-                lib_logger.debug(f"Found {len(active_projects)} active projects")
-                if not projects:
-                    lib_logger.error(
-                        "No GCP projects found for this account. Please create a project in Google Cloud Console."
-                    )
-                elif not active_projects:
-                    lib_logger.error(
-                        "No active GCP projects found. Please activate a project in Google Cloud Console."
-                    )
-                else:
-                    project_id = active_projects[0]["projectId"]
-                    lib_logger.info(
-                        f"Discovered Gemini project ID from active projects list: {project_id}"
-                    )
-                    lib_logger.debug(
-                        f"Selected first active project: {project_id} (out of {len(active_projects)} active projects)"
-                    )
-                    self.project_id_cache[credential_path] = project_id
-                    discovered_project_id = project_id
-                    # [NEW] Persist to credential file (no tier info from resource manager)
-                    await self._persist_project_metadata(
-                        credential_path, project_id, None
-                    )
-                    return project_id
-        except httpx.HTTPStatusError as e:
-            if e.response.status_code == 403:
-                lib_logger.error(
-                    "Failed to list GCP projects due to a 403 Forbidden error. The Cloud Resource Manager API may not be enabled, or your account lacks the 'resourcemanager.projects.list' permission."
-                )
-            else:
-                lib_logger.error(
-                    f"Failed to list GCP projects with status {e.response.status_code}: {e}"
-                )
-        except httpx.RequestError as e:
-            lib_logger.error(f"Network error while listing GCP projects: {e}")
-        raise ValueError(
-            "Could not auto-discover Gemini project ID. Possible causes:\n"
-            "  1. The cloudaicompanion.googleapis.com API is not enabled (enable it in Google Cloud Console)\n"
-            "  2. No active GCP projects exist for this account (create one in Google Cloud Console)\n"
-            "  3. Account lacks necessary permissions\n"
-            "To manually specify a project, set GEMINI_CLI_PROJECT_ID in your .env file."
-        )
-    async def _persist_project_metadata(
-        self, credential_path: str, project_id: str, tier: Optional[str]
-    ):
-        """Persists project ID and tier to the credential file for faster future startups."""
-        # Skip persistence for env:// paths (environment-based credentials)
-        credential_index = self._parse_env_credential_path(credential_path)
-        if credential_index is not None:
-            lib_logger.debug(
-                f"Skipping project metadata persistence for env:// credential path: {credential_path}"
-            )
-            return
-        try:
-            # Load current credentials
-            with open(credential_path, "r") as f:
-                creds = json.load(f)
-            # Update metadata
-            if "_proxy_metadata" not in creds:
-                creds["_proxy_metadata"] = {}
-            creds["_proxy_metadata"]["project_id"] = project_id
-            if tier:
-                creds["_proxy_metadata"]["tier"] = tier
-            # Save back using the existing save method (handles atomic writes and permissions)
-            await self._save_credentials(credential_path, creds)
-            lib_logger.debug(
-                f"Persisted project_id and tier to credential file: {credential_path}"
-            )
-        except Exception as e:
-            lib_logger.warning(
-                f"Failed to persist project metadata to credential file: {e}"
-            )
-            # Non-fatal - just means slower startup next time
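The read-modify-write step in the removed `_persist_project_metadata` can be sketched in isolation. This is a minimal illustration, not the library's `_save_credentials`: the name `persist_project_metadata_sketch` is hypothetical, and the temp-file-plus-`os.replace` write is an assumption standing in for the atomic save that the comment above refers to.

```python
import json
import os
import tempfile
from typing import Optional


def persist_project_metadata_sketch(
    path: str, project_id: str, tier: Optional[str]
) -> dict:
    """Hypothetical stand-in: merge project metadata into a credential JSON file."""
    with open(path, "r") as f:
        creds = json.load(f)

    # Create the metadata section on first use, then update it in place.
    metadata = creds.setdefault("_proxy_metadata", {})
    metadata["project_id"] = project_id
    if tier:  # tier is None when the project came from the project-listing fallback
        metadata["tier"] = tier

    # Assumed atomic write: dump to a temp file in the same directory, then
    # swap it in, so a crash mid-write cannot leave a truncated credential file.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(creds, tmp, indent=2)
        os.replace(tmp_path, path)
    except Exception:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return creds
```

Existing keys in the credential file are preserved, and a missing tier leaves any previously stored `tier` value untouched, which matches the `if tier:` guard in the diff.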
     def _check_mixed_tier_warning(self):
         """Check if mixed free/paid tier credentials are loaded and emit warning."""
@@ -1166,7 +920,7 @@ class GeminiCliProvider(GeminiAuthBase, ProviderInterface):
                                     func_part["thoughtSignature"] = (
                                         "skip_thought_signature_validator"
                                     )
-                                    lib_logger.warning(
                                         f"Missing thoughtSignature for first func call {tool_id}, using bypass"
                                     )
                                 # Subsequent parallel calls: no signature field at all
@@ -1178,23 +932,39 @@ class GeminiCliProvider(GeminiAuthBase, ProviderInterface):
             elif role == "tool":
                 tool_call_id = msg.get("tool_call_id")
                 function_name = tool_call_id_to_name.get(tool_call_id)
-                if function_name:
-                    # Add prefix for Gemini 3
-                    if is_gemini_3 and self._enable_gemini3_tool_fix:
-                        function_name = f"{self._gemini3_tool_prefix}{function_name}"
-                    # Wrap the tool response in a 'result' object
-                    response_content = {"result": content}
-                    # Accumulate tool responses - they'll be combined into one user message
-                    pending_tool_parts.append(
-                        {
-                            "functionResponse": {
-                                "name": function_name,
-                                "response": response_content,
-                                "id": tool_call_id,
-                            }
-                        }
                     )
                 # Don't add parts here - tool responses are handled via pending_tool_parts
                 continue
@@ -1210,6 +980,216 @@ class GeminiCliProvider(GeminiAuthBase, ProviderInterface):
         return system_instruction, gemini_contents
     def _handle_reasoning_parameters(
         self, payload: Dict[str, Any], model: str
     ) -> Optional[Dict[str, Any]]:
@@ -1329,13 +1309,24 @@ class GeminiCliProvider(GeminiAuthBase, ProviderInterface):
                 # Get current tool index from accumulator (default 0) and increment
                 current_tool_idx = accumulator.get("tool_idx", 0) if accumulator else 0
                 tool_call = {
                     "index": current_tool_idx,
                     "id": tool_call_id,
                     "type": "function",
                     "function": {
                         "name": function_name,
-                        "arguments": json.dumps(function_call.get("args", {})),
                     },
                 }
@@ -1643,13 +1634,32 @@ class GeminiCliProvider(GeminiAuthBase, ProviderInterface):
                     schema = self._gemini_cli_transform_schema(
                         new_function["parameters"]
                     )
                     new_function["parametersJsonSchema"] = schema
                     del new_function["parameters"]
                 elif "parametersJsonSchema" not in new_function:
-                    # Set default empty schema if neither exists
                     new_function["parametersJsonSchema"] = {
                         "type": "object",
-                        "properties": {},
                     }
                 # Gemini 3 specific transformations
@@ -1889,6 +1899,9 @@ class GeminiCliProvider(GeminiAuthBase, ProviderInterface):
             system_instruction, contents = self._transform_messages(
                 kwargs.get("messages", []), model_name
             )
             request_payload = {
                 "model": model_name,
                 "project": project_id,
@@ -1965,7 +1978,7 @@ class GeminiCliProvider(GeminiAuthBase, ProviderInterface):
                         headers=final_headers,
                         json=request_payload,
                         params={"alt": "sse"},
-                        timeout=600,
                     ) as response:
                         # Read and log error body before raise_for_status for better debugging
                         if response.status_code >= 400:
@@ -2176,6 +2189,8 @@ class GeminiCliProvider(GeminiAuthBase, ProviderInterface):
         # Transform messages to Gemini format
         system_instruction, contents = self._transform_messages(messages)
         # Build request payload
         request_payload = {

 from .gemini_auth_base import GeminiAuthBase
 from .provider_cache import ProviderCache
 from ..model_definitions import ModelDefinitions
+from ..timeout_config import TimeoutConfig
+from ..utils.paths import get_logs_dir, get_cache_dir
 import litellm
 from litellm.exceptions import RateLimitError
 from ..error_handler import extract_retry_after_from_body
 lib_logger = logging.getLogger("rotator_library")
+def _get_gemini_cli_logs_dir() -> Path:
+    """Get the Gemini CLI logs directory."""
+    logs_dir = get_logs_dir() / "gemini_cli_logs"
+    logs_dir.mkdir(parents=True, exist_ok=True)
+    return logs_dir
+def _get_gemini_cli_cache_dir() -> Path:
+    """Get the Gemini CLI cache directory."""
+    return get_cache_dir(subdir="gemini_cli")
+def _get_gemini3_signature_cache_file() -> Path:
+    """Get the Gemini 3 signature cache file path."""
+    return _get_gemini_cli_cache_dir() / "gemini3_signatures.json"
 class _GeminiCliFileLogger:
         # Sanitize model name for directory
         safe_model_name = model_name.replace("/", "_").replace(":", "_")
         self.log_dir = (
+            _get_gemini_cli_logs_dir() / f"{timestamp}_{safe_model_name}_{request_id}"
         )
         try:
             self.log_dir.mkdir(parents=True, exist_ok=True)
     "gemini-3-pro-preview",
 ]
 # Gemini 3 tool fix system instruction (prevents hallucination)
 DEFAULT_GEMINI3_SYSTEM_INSTRUCTION = """<CRITICAL_TOOL_USAGE_INSTRUCTIONS>
 You are operating in a CUSTOM ENVIRONMENT where tool definitions COMPLETELY DIFFER from your training data.
 }
+def _recursively_parse_json_strings(obj: Any) -> Any:
+    """
+    Recursively parse JSON strings in nested data structures.
+    Gemini sometimes returns tool arguments with JSON-stringified values:
+    {"files": "[{...}]"} instead of {"files": [{...}]}.
+    Additionally handles:
+    - Malformed double-encoded JSON (extra trailing '}' or ']')
+    - Escaped string content (\n, \t, etc.)
+    """
+    if isinstance(obj, dict):
+        return {k: _recursively_parse_json_strings(v) for k, v in obj.items()}
+    elif isinstance(obj, list):
+        return [_recursively_parse_json_strings(item) for item in obj]
+    elif isinstance(obj, str):
+        stripped = obj.strip()
+        # Check if string contains control character escape sequences that need unescaping
+        # This handles cases where diff content has literal \n or \t instead of actual newlines/tabs
+        #
+        # IMPORTANT: We intentionally do NOT unescape strings containing \" or \\
+        # because these are typically intentional escapes in code/config content
+        # (e.g., JSON embedded in YAML: BOT_NAMES_JSON: '["mirrobot", ...]')
+        # Unescaping these would corrupt the content and cause issues like
+        # oldString and newString becoming identical when they should differ.
+        has_control_char_escapes = "\\n" in obj or "\\t" in obj
+        has_intentional_escapes = '\\"' in obj or "\\\\" in obj
+        if has_control_char_escapes and not has_intentional_escapes:
+            try:
+                # Use json.loads with quotes to properly unescape the string
+                # This converts \n -> newline, \t -> tab
+                unescaped = json.loads(f'"{obj}"')
+                # Log the fix with a snippet for debugging
+                snippet = obj[:80] + "..." if len(obj) > 80 else obj
+                lib_logger.debug(
+                    f"[GeminiCli] Unescaped control chars in string: "
+                    f"{len(obj) - len(unescaped)} chars changed. Snippet: {snippet!r}"
+                )
+                return unescaped
+            except (json.JSONDecodeError, ValueError):
+                # If unescaping fails, continue with original processing
+                pass
+        # Check if it looks like JSON (starts with { or [)
+        if stripped and stripped[0] in ("{", "["):
+            # Try standard parsing first
+            if (stripped.startswith("{") and stripped.endswith("}")) or (
+                stripped.startswith("[") and stripped.endswith("]")
+            ):
+                try:
+                    parsed = json.loads(obj)
+                    return _recursively_parse_json_strings(parsed)
+                except (json.JSONDecodeError, ValueError):
+                    pass
+            # Handle malformed JSON: array that doesn't end with ]
+            # e.g., '[{"path": "..."}]}' instead of '[{"path": "..."}]'
+            if stripped.startswith("[") and not stripped.endswith("]"):
+                try:
+                    # Find the last ] and truncate there
+                    last_bracket = stripped.rfind("]")
+                    if last_bracket > 0:
+                        cleaned = stripped[: last_bracket + 1]
+                        parsed = json.loads(cleaned)
+                        lib_logger.warning(
+                            f"[GeminiCli] Auto-corrected malformed JSON string: "
+                            f"truncated {len(stripped) - len(cleaned)} extra chars"
+                        )
+                        return _recursively_parse_json_strings(parsed)
+                except (json.JSONDecodeError, ValueError):
+                    pass
+            # Handle malformed JSON: object that doesn't end with }
+            if stripped.startswith("{") and not stripped.endswith("}"):
+                try:
+                    # Find the last } and truncate there
+                    last_brace = stripped.rfind("}")
+                    if last_brace > 0:
+                        cleaned = stripped[: last_brace + 1]
+                        parsed = json.loads(cleaned)
+                        lib_logger.warning(
+                            f"[GeminiCli] Auto-corrected malformed JSON string: "
+                            f"truncated {len(stripped) - len(cleaned)} extra chars"
+                        )
+                        return _recursively_parse_json_strings(parsed)
+                except (json.JSONDecodeError, ValueError):
+                    pass
+    return obj
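The repair strategy in `_recursively_parse_json_strings` can be tried on a small input. The sketch below is a simplified, hypothetical variant (`parse_json_strings_sketch` is not part of the commit): it keeps the recursive decoding and the truncate-at-last-closer fallback, and leaves out the control-character unescaping and the logging.

```python
import json
from typing import Any


def parse_json_strings_sketch(obj: Any) -> Any:
    """Hypothetical, simplified variant: decode JSON-looking strings in nested data."""
    if isinstance(obj, dict):
        return {k: parse_json_strings_sketch(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [parse_json_strings_sketch(v) for v in obj]
    if not isinstance(obj, str):
        return obj

    stripped = obj.strip()
    if not stripped or stripped[0] not in "{[":
        return obj  # plain text stays untouched

    # First try the string as-is; if that fails, retry after cutting
    # everything that follows the last matching closing bracket.
    closer = "}" if stripped[0] == "{" else "]"
    candidates = [stripped]
    last = stripped.rfind(closer)
    if 0 < last < len(stripped) - 1:
        candidates.append(stripped[: last + 1])

    for candidate in candidates:
        try:
            return parse_json_strings_sketch(json.loads(candidate))
        except ValueError:  # json.JSONDecodeError is a ValueError subclass
            continue
    return obj  # not recoverable: hand back the original string


# A stringified array with one stray trailing '}' next to an ordinary string.
args = {"files": '[{"path": "a.py"}]}', "note": "plain text"}
repaired = parse_json_strings_sketch(args)
```

Returning the original string when nothing parses keeps the function safe to apply to every tool argument, since values that are not JSON pass through unchanged.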
 def _env_bool(key: str, default: bool = False) -> bool:
     """Get boolean from environment variable."""
     return os.getenv(key, str(default).lower()).lower() in ("true", "1", "yes")
 class GeminiCliProvider(GeminiAuthBase, ProviderInterface):
     skip_cost_calculation = True
+    # Sequential mode - stick with one credential until it gets a 429, then switch
+    default_rotation_mode: str = "sequential"
     # =========================================================================
     # TIER CONFIGURATION
         error: Exception, error_body: Optional[str] = None
     ) -> Optional[Dict[str, Any]]:
         """
+        Parse Gemini CLI rate limit/quota errors.
+        Handles the Gemini CLI error format which embeds reset time in the message:
+        "You have exhausted your capacity on this model. Your quota will reset after 2s."
+        Unlike Antigravity which uses structured RetryInfo/quotaResetDelay metadata,
+        Gemini CLI embeds the reset time in a human-readable message.
+        Example error format:
+        {
+          "error": {
+            "code": 429,
+            "message": "You have exhausted your capacity on this model. Your quota will reset after 2s.",
+            "status": "RESOURCE_EXHAUSTED",
+            "details": [
+              {
+                "@type": "type.googleapis.com/google.rpc.ErrorInfo",
+                "reason": "RATE_LIMIT_EXCEEDED",
+                "domain": "cloudcode-pa.googleapis.com",
+                "metadata": { "uiMessage": "true", "model": "gemini-3-pro-preview" }
+              }
+            ]
+          }
+        }
         Args:
             error: The caught exception
             error_body: Optional raw response body string
         Returns:
+            None if not a parseable quota error, otherwise:
+            {
+                "retry_after": int,
+                "reason": str | None,
+                "reset_timestamp": str | None,
+                "quota_reset_timestamp": float | None,
+            }
         """
+        import re as regex_module
+        # Get error body from exception if not provided
+        body = error_body
+        if not body:
+            if hasattr(error, "response") and hasattr(error.response, "text"):
+                try:
+                    body = error.response.text
+                except Exception:
+                    pass
+            if not body and hasattr(error, "body"):
+                body = str(error.body)
+            if not body and hasattr(error, "message"):
+                body = str(error.message)
+            if not body:
+                body = str(error)
+        if not body:
+            return None
+        result = {
+            "retry_after": None,
+            "reason": None,
+            "reset_timestamp": None,
+            "quota_reset_timestamp": None,
+        }
+        # 1. Try to extract retry time from human-readable message
+        # Pattern: "Your quota will reset after 2s." or "quota will reset after 156h14m36s"
+        retry_after = extract_retry_after_from_body(body)
+        if retry_after:
+            result["retry_after"] = retry_after
+        # 2. Try to parse JSON to get structured details (reason, any RetryInfo fallback)
+        try:
+            json_match = regex_module.search(r"\{[\s\S]*\}", body)
+            if json_match:
+                data = json.loads(json_match.group(0))
+                error_obj = data.get("error", data)
+                details = error_obj.get("details", [])
+                for detail in details:
+                    detail_type = detail.get("@type", "")
+                    # Extract reason from ErrorInfo
+                    if "ErrorInfo" in detail_type:
+                        if not result["reason"]:
+                            result["reason"] = detail.get("reason")
+                        # Check metadata for any additional timing info
+                        metadata = detail.get("metadata", {})
+                        quota_delay = metadata.get("quotaResetDelay")
+                        if quota_delay and not result["retry_after"]:
+                            parsed = GeminiCliProvider._parse_duration(quota_delay)
+                            if parsed:
+                                result["retry_after"] = parsed
+                    # Check for RetryInfo (fallback, in case format changes)
+                    if "RetryInfo" in detail_type and not result["retry_after"]:
+                        retry_delay = detail.get("retryDelay")
+                        if retry_delay:
+                            parsed = GeminiCliProvider._parse_duration(retry_delay)
+                            if parsed:
+                                result["retry_after"] = parsed
+        except (json.JSONDecodeError, AttributeError, TypeError):
+            pass
+        # Return None if we couldn't extract retry_after
+        if not result["retry_after"]:
+            return None
+        return result
+    @staticmethod
+    def _parse_duration(duration_str: str) -> Optional[int]:
+        """
+        Parse duration strings like '2s', '156h14m36.73s', '515092.73s' to seconds.
+        Args:
+            duration_str: Duration string to parse
+        Returns:
+            Total seconds as integer, or None if parsing fails
+        """
+        import re as regex_module
+        if not duration_str:
+            return None
+        # Handle pure seconds format: "515092.730699158s" or "2s"
+        # The number patterns allow at most one decimal point, so malformed
+        # input such as "1.2.3s" cannot raise ValueError inside float().
+        pure_seconds_match = regex_module.match(r"^(\d+(?:\.\d+)?)s$", duration_str)
+        if pure_seconds_match:
+            return int(float(pure_seconds_match.group(1)))
+        # Handle compound format: "143h4m52.730699158s"
+        total_seconds = 0
+        patterns = [
+            (r"(\d+)h", 3600),  # hours
+            (r"(\d+)m", 60),  # minutes
+            (r"(\d+(?:\.\d+)?)s", 1),  # seconds
+        ]
+        for pattern, multiplier in patterns:
+            match = regex_module.search(pattern, duration_str)
+            if match:
+                total_seconds += float(match.group(1)) * multiplier
+        return int(total_seconds) if total_seconds > 0 else None
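As an illustration of the duration parsing added in `_parse_duration`, here is a minimal standalone sketch. It is not the code from this commit: the function name `parse_duration` and the token-based approach are assumptions made for the example, and it is deliberately stricter than the diff in that it rejects strings containing anything other than `<number><unit>` tokens.

```python
import re
from typing import Optional

# Seconds per unit letter (hypothetical helper table for this sketch).
_UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}
_TOKEN = re.compile(r"(\d+(?:\.\d+)?)([hms])")
_WHOLE = re.compile(r"(?:\d+(?:\.\d+)?[hms])+")


def parse_duration(duration_str: str) -> Optional[int]:
    """Convert strings such as '2s' or '156h14m36.73s' to whole seconds."""
    if not duration_str:
        return None
    # Require the entire string to be a run of <number><unit> tokens.
    if not _WHOLE.fullmatch(duration_str):
        return None
    total = sum(
        float(value) * _UNIT_SECONDS[unit]
        for value, unit in _TOKEN.findall(duration_str)
    )
    # Mirror the diff's behaviour: a zero duration is treated as "no value".
    return int(total) if total > 0 else None
```

Fractional seconds are truncated rather than rounded, matching the `int(...)` conversion in the diff.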
     def __init__(self):
         super().__init__()
         self.model_definitions = ModelDefinitions()
+        # NOTE: project_id_cache and project_tier_cache are inherited from GeminiAuthBase
         # Gemini 3 configuration from environment
         memory_ttl = _env_int("GEMINI_CLI_SIGNATURE_CACHE_TTL", 3600)
         # Initialize signature cache for Gemini 3 thoughtSignatures
         self._signature_cache = ProviderCache(
+            _get_gemini3_signature_cache_file(),
             memory_ttl,
             disk_ttl,
             env_prefix="GEMINI_CLI_SIGNATURE",
         # Gemini 3 requires paid tier
         if model_name.startswith("gemini-3-"):
+            return 2  # Only priority 2 (paid) credentials
         return None  # All other models have no restrictions
         This ensures all credential priorities are known before any API calls,
         preventing unknown credentials from getting priority 999.
+        For credentials without persisted tier info (new or corrupted), performs
+        full discovery to ensure proper prioritization in sequential rotation mode.
         """
+        # Step 1: Load persisted tiers from files
         await self._load_persisted_tiers(credential_paths)
+        # Step 2: Identify credentials still missing tier info
+        credentials_needing_discovery = [
+            path
+            for path in credential_paths
+            if path not in self.project_tier_cache
+            and self._parse_env_credential_path(path) is None  # Skip env:// paths
+        ]
+        if not credentials_needing_discovery:
+            return  # All credentials have tier info
+        lib_logger.info(
+            f"GeminiCli: Discovering tier info for {len(credentials_needing_discovery)} credential(s)..."
+        )
+        # Step 3: Perform discovery for each missing credential (sequential to avoid rate limits)
+        for credential_path in credentials_needing_discovery:
+            try:
+                auth_header = await self.get_auth_header(credential_path)
+                access_token = auth_header["Authorization"].split(" ")[1]
+                await self._discover_project_id(
+                    credential_path, access_token, litellm_params={}
+                )
+                discovered_tier = self.project_tier_cache.get(
+                    credential_path, "unknown"
+                )
+                lib_logger.debug(
+                    f"Discovered tier '{discovered_tier}' for {Path(credential_path).name}"
+                )
+            except Exception as e:
+                lib_logger.warning(
+                    f"Failed to discover tier for {Path(credential_path).name}: {e}. "
+                    f"Credential will use default priority."
+                )
     async def _load_persisted_tiers(
         self, credential_paths: List[str]
     ) -> Dict[str, str]:
         return loaded
+    # NOTE: _post_auth_discovery() is inherited from GeminiAuthBase
     # =========================================================================
     # MODEL UTILITIES
     # =========================================================================
             return name[len(self._gemini3_tool_prefix) :]
         return name
+    # NOTE: _discover_project_id() and _persist_project_metadata() are inherited from GeminiAuthBase
     def _check_mixed_tier_warning(self):
         """Check if mixed free/paid tier credentials are loaded and emit warning."""
                                     func_part["thoughtSignature"] = (
                                         "skip_thought_signature_validator"
                                     )
+                                    lib_logger.debug(
                                         f"Missing thoughtSignature for first func call {tool_id}, using bypass"
                                     )
                                 # Subsequent parallel calls: no signature field at all
             elif role == "tool":
                 tool_call_id = msg.get("tool_call_id")
                 function_name = tool_call_id_to_name.get(tool_call_id)
+                # Log warning if tool_call_id not found in mapping (can happen after context compaction)
+                if not function_name:
+                    lib_logger.warning(
+                        f"[ID Mismatch] Tool response has ID '{tool_call_id}' which was not found in tool_call_id_to_name map. "
+                        f"Available IDs: {list(tool_call_id_to_name.keys())}. Using 'unknown_function' as fallback."
+                    )
+                    function_name = "unknown_function"
+                # Add prefix for Gemini 3
+                if is_gemini_3 and self._enable_gemini3_tool_fix:
+                    function_name = f"{self._gemini3_tool_prefix}{function_name}"
+                # Try to parse content as JSON first, fall back to string
+                try:
+                    parsed_content = (
+                        json.loads(content) if isinstance(content, str) else content
                     )
+                except (json.JSONDecodeError, TypeError):
+                    parsed_content = content
+                # Wrap the tool response in a 'result' object
+                response_content = {"result": parsed_content}
+                # Accumulate tool responses - they'll be combined into one user message
+                pending_tool_parts.append(
+                    {
+                        "functionResponse": {
+                            "name": function_name,
+                            "response": response_content,
+                            "id": tool_call_id,
+                        }
+                    }
+                )
                 # Don't add parts here - tool responses are handled via pending_tool_parts
                 continue
         return system_instruction, gemini_contents
+    def _fix_tool_response_grouping(
+        self, contents: List[Dict[str, Any]]
+    ) -> List[Dict[str, Any]]:
+        """
+        Group function calls with their responses for Gemini CLI compatibility.
+        Converts linear format (call, response, call, response)
+        to grouped format (model with calls, user with all responses).
+        IMPORTANT: Preserves ID-based pairing to prevent mismatches.
+        When IDs don't match, attempts recovery by:
+        1. Matching by function name first
+        2. Matching by order if names don't match
+        3. Inserting placeholder responses if responses are missing
+        4. Inserting responses at the CORRECT position (after their corresponding call)
+        """
+        new_contents = []
+        # Each pending group tracks:
+        # - ids: expected response IDs
+        # - func_names: expected function names (for orphan matching)
+        # - insert_after_idx: position in new_contents where model message was added
+        pending_groups = []
+        collected_responses = {}  # Dict mapping ID -> response_part
+        for content in contents:
+            role = content.get("role")
+            parts = content.get("parts", [])
+            response_parts = [p for p in parts if "functionResponse" in p]
+            if response_parts:
+                # Collect responses by ID (ignore duplicates - keep first occurrence)
+                for resp in response_parts:
+                    resp_id = resp.get("functionResponse", {}).get("id", "")
+                    if resp_id:
+                        if resp_id in collected_responses:
+                            lib_logger.warning(
+                                f"[Grouping] Duplicate response ID detected: {resp_id}. "
+                                f"Ignoring duplicate - this may indicate malformed conversation history."
+                            )
+                            continue
+                        collected_responses[resp_id] = resp
+                # Try to satisfy pending groups (newest first)
+                for i in range(len(pending_groups) - 1, -1, -1):
+                    group = pending_groups[i]
+                    group_ids = group["ids"]
+                    # Check if we have ALL responses for this group
+                    if all(gid in collected_responses for gid in group_ids):
+                        # Extract responses in the same order as the function calls
+                        group_responses = [
+                            collected_responses.pop(gid) for gid in group_ids
+                        ]
+                        new_contents.append({"parts": group_responses, "role": "user"})
+                        pending_groups.pop(i)
+                        break
+                continue
+            if role == "model":
+                func_calls = [p for p in parts if "functionCall" in p]
+                new_contents.append(content)
+                if func_calls:
+                    call_ids = [
+                        fc.get("functionCall", {}).get("id", "") for fc in func_calls
+                    ]
+                    call_ids = [cid for cid in call_ids if cid]  # Filter empty IDs
+                    # Also extract function names for orphan matching
+                    func_names = [
+                        fc.get("functionCall", {}).get("name", "") for fc in func_calls
+                    ]
+                    if call_ids:
+                        pending_groups.append(
+                            {
+                                "ids": call_ids,
+                                "func_names": func_names,
+                                "insert_after_idx": len(new_contents) - 1,
+                            }
+                        )
+            else:
+                new_contents.append(content)
+        # Handle remaining groups (shouldn't happen in well-formed conversations)
+        # Attempt recovery by matching orphans to unsatisfied calls
+        # Process in REVERSE order of insert_after_idx so insertions don't shift indices
+        pending_groups.sort(key=lambda g: g["insert_after_idx"], reverse=True)
+        for group in pending_groups:
+            group_ids = group["ids"]
+            group_func_names = group.get("func_names", [])
+            insert_idx = group["insert_after_idx"] + 1
+            group_responses = []
+            lib_logger.debug(
+                f"[Grouping Recovery] Processing unsatisfied group: "
+                f"ids={group_ids}, names={group_func_names}, insert_at={insert_idx}"
+            )
+            for i, expected_id in enumerate(group_ids):
+                expected_name = group_func_names[i] if i < len(group_func_names) else ""
+                if expected_id in collected_responses:
+                    # Direct ID match
+                    group_responses.append(collected_responses.pop(expected_id))
+                    lib_logger.debug(
+                        f"[Grouping Recovery] Direct ID match for '{expected_id}'"
+                    )
+                elif collected_responses:
+                    # Try to find orphan with matching function name first
+                    matched_orphan_id = None
+                    # First pass: match by function name
+                    for orphan_id, orphan_resp in collected_responses.items():
+                        orphan_name = orphan_resp.get("functionResponse", {}).get(
+                            "name", ""
+                        )
+                        # Match if names are equal
+                        if orphan_name == expected_name:
+                            matched_orphan_id = orphan_id
+                            lib_logger.debug(
+                                f"[Grouping Recovery] Matched orphan '{orphan_id}' by name '{orphan_name}'"
+                            )
+                            break
+                    # Second pass: if no name match, try "unknown_function" orphans
+                    if not matched_orphan_id:
+                        for orphan_id, orphan_resp in collected_responses.items():
+                            orphan_name = orphan_resp.get("functionResponse", {}).get(
+                                "name", ""
+                            )
+                            if orphan_name == "unknown_function":
+                                matched_orphan_id = orphan_id
+                                lib_logger.debug(
+                                    f"[Grouping Recovery] Matched unknown_function orphan '{orphan_id}' "
+                                    f"to expected '{expected_name}'"
+                                )
+                                break
+                    # Third pass: if still no match, take first available (order-based)
+                    if not matched_orphan_id:
+                        matched_orphan_id = next(iter(collected_responses))
+                        lib_logger.debug(
+                            f"[Grouping Recovery] No name match, using first available orphan '{matched_orphan_id}'"
+                        )
+                    if matched_orphan_id:
+                        orphan_resp = collected_responses.pop(matched_orphan_id)
+                        # Fix the ID in the response to match the call
+                        old_id = orphan_resp["functionResponse"].get("id", "")
+                        orphan_resp["functionResponse"]["id"] = expected_id
+                        # Fix the name if it was "unknown_function"
+                        if (
+                            orphan_resp["functionResponse"].get("name")
+                            == "unknown_function"
+                            and expected_name
+                        ):
+                            orphan_resp["functionResponse"]["name"] = expected_name
+                            lib_logger.info(
+                                f"[Grouping Recovery] Fixed function name from 'unknown_function' to '{expected_name}'"
+                            )
+                        lib_logger.warning(
+                            f"[Grouping] Auto-repaired ID mismatch: mapped response '{old_id}' "
+                            f"to call '{expected_id}' (function: {expected_name})"
+                        )
+                        group_responses.append(orphan_resp)
+                else:
+                    # No responses available - create placeholder
+                    placeholder_resp = {
+                        "functionResponse": {
+                            "name": expected_name or "unknown_function",
+                            "response": {
+                                "result": {
+                                    "error": "Tool response was lost during context processing. "
+                                    "This is a recovered placeholder.",
+                                    "recovered": True,
+                                }
+                            },
+                            "id": expected_id,
+                        }
+                    }
+                    lib_logger.warning(
+                        f"[Grouping Recovery] Created placeholder response for missing tool: "
+                        f"id='{expected_id}', name='{expected_name}'"
+                    )
+                    group_responses.append(placeholder_resp)
+            if group_responses:
+                # Insert at the correct position (right after the model message with the calls)
+                new_contents.insert(
+                    insert_idx, {"parts": group_responses, "role": "user"}
+                )
+                lib_logger.info(
+                    f"[Grouping Recovery] Inserted {len(group_responses)} responses at position {insert_idx} "
+                    f"(expected {len(group_ids)})"
+                )
+        # Warn about unmatched responses
+        if collected_responses:
+            lib_logger.warning(
+                f"[Grouping] {len(collected_responses)} unmatched responses remaining: "
+                f"ids={list(collected_responses.keys())}"
+            )
+        return new_contents
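The core of `_fix_tool_response_grouping` is the happy path: buffer responses by ID and emit them as one `user` message once every call in a model turn has been answered. The sketch below shows only that path, under simplifying assumptions: the name `group_tool_responses` is hypothetical, and the recovery steps (name matching, order-based matching, placeholders) are omitted.

```python
from typing import Any, Dict, List


def group_tool_responses(contents: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Regroup linear call/response messages into (model calls, user responses)."""
    grouped: List[Dict[str, Any]] = []
    pending: List[List[str]] = []  # call IDs awaiting responses, one list per model turn
    collected: Dict[str, Dict[str, Any]] = {}  # response ID -> functionResponse part

    for content in contents:
        parts = content.get("parts", [])
        responses = [p for p in parts if "functionResponse" in p]
        if responses:
            for part in responses:
                resp_id = part["functionResponse"].get("id", "")
                if resp_id:
                    # Keep the first occurrence of each ID, ignore duplicates.
                    collected.setdefault(resp_id, part)
            # Flush the newest group whose responses have all arrived.
            for i in range(len(pending) - 1, -1, -1):
                if all(cid in collected for cid in pending[i]):
                    ids = pending.pop(i)
                    grouped.append(
                        {"role": "user", "parts": [collected.pop(cid) for cid in ids]}
                    )
                    break
            continue

        grouped.append(content)
        if content.get("role") == "model":
            ids = [p["functionCall"].get("id", "") for p in parts if "functionCall" in p]
            ids = [cid for cid in ids if cid]
            if ids:
                pending.append(ids)
    return grouped
```

Responses are emitted in the order of the calls, not in arrival order, which is what keeps each `functionResponse` aligned with its `functionCall`.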
     def _handle_reasoning_parameters(
         self, payload: Dict[str, Any], model: str
     ) -> Optional[Dict[str, Any]]:
                 # Get current tool index from accumulator (default 0) and increment
                 current_tool_idx = accumulator.get("tool_idx", 0) if accumulator else 0
+                # Get args, recursively parse any JSON strings, and strip _confirm if sole param
+                raw_args = function_call.get("args", {})
+                tool_args = _recursively_parse_json_strings(raw_args)
+                # Strip _confirm ONLY if it's the sole parameter
+                # This ensures we only strip our injection, not legitimate user params
+                if isinstance(tool_args, dict) and "_confirm" in tool_args:
+                    if len(tool_args) == 1:
+                        # _confirm is the only param - this was our injection
+                        tool_args.pop("_confirm")
                 tool_call = {
                     "index": current_tool_idx,
                     "id": tool_call_id,
                     "type": "function",
                     "function": {
                         "name": function_name,
+                        "arguments": json.dumps(tool_args),
                     },
                 }
                     schema = self._gemini_cli_transform_schema(
                         new_function["parameters"]
                     )
+                    # Workaround: Gemini fails to emit functionCall for tools
+                    # with empty properties {}. Inject a required confirmation param.
+                    # Using a required parameter forces the model to commit to
+                    # the tool call rather than just thinking about it.
+                    props = schema.get("properties", {})
+                    if not props:
+                        schema["properties"] = {
+                            "_confirm": {
+                                "type": "string",
+                                "description": "Enter 'yes' to proceed",
+                            }
+                        }
+                        schema["required"] = ["_confirm"]
                     new_function["parametersJsonSchema"] = schema
                     del new_function["parameters"]
                 elif "parametersJsonSchema" not in new_function:
+                    # Set default schema with required confirm param if neither exists
                     new_function["parametersJsonSchema"] = {
                         "type": "object",
+                        "properties": {
+                            "_confirm": {
+                                "type": "string",
+                                "description": "Enter 'yes' to proceed",
+                            }
+                        },
+                        "required": ["_confirm"],
                     }
                 # Gemini 3 specific transformations
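The `_confirm` workaround has two halves: a required parameter is injected into tool schemas with no properties, and it is stripped from the returned arguments only when it is the sole key. A minimal sketch of that round trip follows. The helper names `inject_confirm` and `strip_confirm` are hypothetical and do not appear in the commit, and unlike the diff, which edits the schema and arguments in place, this sketch returns new objects.

```python
import copy
from typing import Any, Dict

CONFIRM_PARAM = "_confirm"


def inject_confirm(schema: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the schema with a required _confirm param if it has no properties."""
    result = copy.deepcopy(schema)
    if not result.get("properties"):
        result["properties"] = {
            CONFIRM_PARAM: {
                "type": "string",
                "description": "Enter 'yes' to proceed",
            }
        }
        result["required"] = [CONFIRM_PARAM]
    return result


def strip_confirm(args: Any) -> Any:
    """Drop _confirm only when it is the sole argument, i.e. the injected one."""
    if isinstance(args, dict) and set(args) == {CONFIRM_PARAM}:
        return {}
    return args
```

Checking that `_confirm` is the only key is what separates the injected parameter from a tool that legitimately defines its own `_confirm` alongside other arguments.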
             system_instruction, contents = self._transform_messages(
                 kwargs.get("messages", []), model_name
             )
+            # Fix tool response grouping (handles ID mismatches, missing responses)
+            contents = self._fix_tool_response_grouping(contents)
             request_payload = {
                 "model": model_name,
                 "project": project_id,
                         headers=final_headers,
                         json=request_payload,
                         params={"alt": "sse"},
+                        timeout=TimeoutConfig.streaming(),
                     ) as response:
                         # Read and log error body before raise_for_status for better debugging
                         if response.status_code >= 400:
         # Transform messages to Gemini format
         system_instruction, contents = self._transform_messages(messages)
+        # Fix tool response grouping (handles ID mismatches, missing responses)
+        contents = self._fix_tool_response_grouping(contents)
         # Build request payload
         request_payload = {

src/rotator_library/providers/google_oauth_base.py CHANGED

@@ -1,16 +1,17 @@
 # src/rotator_library/providers/google_oauth_base.py
 import os
 import webbrowser
-from typing import Union, Optional
 import json
 import time
 import asyncio
 import logging
 from pathlib import Path
 from typing import Dict, Any
-import tempfile
-import shutil
 import httpx
 from rich.console import Console
@@ -20,12 +21,31 @@ from rich.markup import escape as rich_escape
 from ..utils.headless_detection import is_headless_environment
 from ..utils.reauth_coordinator import get_reauth_coordinator
 lib_logger = logging.getLogger("rotator_library")
 console = Console()
 class GoogleOAuthBase:
     """
     Base class for Google OAuth2 authentication providers.
@@ -55,6 +75,25 @@ class GoogleOAuthBase:
     CALLBACK_PATH: str = "/oauth2callback"
     REFRESH_EXPIRY_BUFFER_SECONDS: int = 30 * 60  # 30 minutes
     def __init__(self):
         # Validate that subclass has set required attributes
         if self.CLIENT_ID is None:
@@ -83,19 +122,36 @@ class GoogleOAuthBase:
             str, float
         ] = {}  # Track backoff timers (Unix timestamp)
-        # [QUEUE SYSTEM] Sequential refresh processing
         self._refresh_queue: asyncio.Queue = asyncio.Queue()
-        self._queued_credentials: set = set()  # Track credentials already in queue
-        # [FIX PR#34] Changed from set to dict mapping credential path to timestamp
-        # This enables TTL-based stale entry cleanup as defense in depth
         self._unavailable_credentials: Dict[
             str, float
         ] = {}  # Maps credential path -> timestamp when marked unavailable
-        self._unavailable_ttl_seconds: int = 300  # 5 minutes TTL for stale entries
         self._queue_tracking_lock = asyncio.Lock()  # Protects queue sets
-        self._queue_processor_task: Optional[asyncio.Task] = (
-            None  # Background worker task
-        )
     def _parse_env_credential_path(self, path: str) -> Optional[str]:
         """
@@ -228,17 +284,7 @@ class GoogleOAuthBase:
                         f"Environment variables for {self.ENV_PREFIX} credential index {credential_index} not found"
                     )
-            # For file paths, first try loading from legacy env vars (for backwards compatibility)
-            env_creds = self._load_from_env()
-            if env_creds:
-                lib_logger.info(
-                    f"Using {self.ENV_PREFIX} credentials from environment variables"
-                )
-                # Cache env-based credentials using the path as key
-                self._credentials_cache[path] = env_creds
-                return env_creds
-            # Fall back to file-based loading
             try:
                 lib_logger.debug(
                     f"Loading {self.ENV_PREFIX} credentials from file: {path}"
@@ -251,6 +297,15 @@ class GoogleOAuthBase:
                 self._credentials_cache[path] = creds
                 return creds
             except FileNotFoundError:
                 raise IOError(
                     f"{self.ENV_PREFIX} OAuth credential file not found at '{path}'"
                 )
@@ -258,70 +313,29 @@ class GoogleOAuthBase:
                 raise IOError(
                     f"Failed to load {self.ENV_PREFIX} OAuth credentials from '{path}': {e}"
                 )
-            except Exception as e:
-                raise IOError(
-                    f"Failed to load {self.ENV_PREFIX} OAuth credentials from '{path}': {e}"
-                )
     async def _save_credentials(self, path: str, creds: Dict[str, Any]):
         # Don't save to file if credentials were loaded from environment
         if creds.get("_proxy_metadata", {}).get("loaded_from_env"):
             lib_logger.debug("Credentials loaded from env, skipping file save")
-            # Still update cache for in-memory consistency
-            self._credentials_cache[path] = creds
             return
-        # [ATOMIC WRITE] Use tempfile + move pattern to ensure atomic writes
-        # This prevents credential corruption if the process is interrupted during write
-        parent_dir = os.path.dirname(os.path.abspath(path))
-        os.makedirs(parent_dir, exist_ok=True)
-        tmp_fd = None
-        tmp_path = None
-        try:
-            # Create temp file in same directory as target (ensures same filesystem)
-            tmp_fd, tmp_path = tempfile.mkstemp(
-                dir=parent_dir, prefix=".tmp_", suffix=".json", text=True
-            )
-            # Write JSON to temp file
-            with os.fdopen(tmp_fd, "w") as f:
-                json.dump(creds, f, indent=2)
-                tmp_fd = None  # fdopen closes the fd
-            # Set secure permissions (0600 = owner read/write only)
-            try:
-                os.chmod(tmp_path, 0o600)
-            except (OSError, AttributeError):
-                # Windows may not support chmod, ignore
-                pass
-            # Atomic move (overwrites target if it exists)
-            shutil.move(tmp_path, path)
-            tmp_path = None  # Successfully moved
-            # Update cache AFTER successful file write (prevents cache/file inconsistency)
-            self._credentials_cache[path] = creds
             lib_logger.debug(
-                f"Saved updated {self.ENV_PREFIX} OAuth credentials to '{path}' (atomic write)."
             )
-        except Exception as e:
-            lib_logger.error(
-                f"Failed to save updated {self.ENV_PREFIX} OAuth credentials to '{path}': {e}"
             )
-            # Clean up temp file if it still exists
-            if tmp_fd is not None:
-                try:
-                    os.close(tmp_fd)
-                except:
-                    pass
-            if tmp_path and os.path.exists(tmp_path):
-                try:
-                    os.unlink(tmp_path)
-                except:
-                    pass
-            raise
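The block removed above implemented an atomic credential write (temp file in the same directory, then move). For reference, a compact sketch of that pattern is shown here. It is not the replacement code from this commit: the function name `atomic_write_json` is hypothetical, and it uses `os.replace` where the removed code used `shutil.move`.

```python
import json
import os
import tempfile
from typing import Any, Dict


def atomic_write_json(path: str, data: Dict[str, Any]) -> None:
    """Write JSON so readers see either the old file or the new one, never a partial write."""
    parent = os.path.dirname(os.path.abspath(path))
    os.makedirs(parent, exist_ok=True)
    # Temp file lives in the target directory so the final rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=parent, prefix=".tmp_", suffix=".json")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(data, f, indent=2)
        try:
            os.chmod(tmp_path, 0o600)  # owner read/write only
        except OSError:
            pass  # some platforms do not support POSIX permissions
        os.replace(tmp_path, path)  # atomic rename, overwrites an existing target
    except BaseException:
        # Remove the temp file if the write or rename did not complete.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

As in the removed code, any in-memory cache should be updated only after the write succeeds, so the cache and the file cannot disagree.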
     def _is_token_expired(self, creds: Dict[str, Any]) -> bool:
         expiry = creds.get("token_expiry")  # gcloud format
@@ -518,7 +532,7 @@ class GoogleOAuthBase:
         """Proactively refresh a credential by queueing it for refresh."""
         creds = await self._load_credentials(credential_path)
         if self._is_token_expired(creds):
-            # Queue for refresh with needs_reauth=False (automated refresh)
             await self._queue_refresh(credential_path, force=False, needs_reauth=False)
     async def _get_lock(self, path: str) -> asyncio.Lock:
@@ -529,34 +543,69 @@ class GoogleOAuthBase:
                 self._refresh_locks[path] = asyncio.Lock()
             return self._refresh_locks[path]
     def is_credential_available(self, path: str) -> bool:
-        """Check if a credential is available for rotation (not queued/refreshing).
-        [FIX PR#34] Now includes TTL-based stale entry cleanup as defense in depth.
-        If a credential has been unavailable for longer than _unavailable_ttl_seconds,
-        it is automatically cleaned up and considered available.
         """
-        if path not in self._unavailable_credentials:
-            return True
-        # [FIX PR#34] Check if the entry is stale (TTL expired)
-        marked_time = self._unavailable_credentials.get(path)
-        if marked_time is not None:
-            now = time.time()
-            if now - marked_time > self._unavailable_ttl_seconds:
-                # Entry is stale - clean it up and return available
-                lib_logger.warning(
-                    f"Credential '{Path(path).name}' was stuck in unavailable state for "
-                    f"{int(now - marked_time)}s (TTL: {self._unavailable_ttl_seconds}s). "
-                    f"Auto-cleaning stale entry."
                 )
-                # Note: This is a sync method, so we can't use async lock here.
-                # However, pop from dict is thread-safe for single operations.
-                # The _queue_tracking_lock protects concurrent modifications in async context.
-                self._unavailable_credentials.pop(path, None)
-                return True
-        return False
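The removed `is_credential_available` logic treated an "unavailable" mark as stale once it was older than a TTL (300 seconds in the removed code). A small sketch of that idea, isolated from the provider class, is given below. The `UnavailableTracker` class and its injectable `clock` parameter are assumptions made for illustration and testing; they are not part of the library.

```python
import time
from typing import Callable, Dict


class UnavailableTracker:
    """Tracks credentials marked unavailable and expires stale marks after a TTL."""

    def __init__(
        self,
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._marked: Dict[str, float] = {}  # credential path -> time it was marked

    def mark_unavailable(self, path: str) -> None:
        self._marked[path] = self._clock()

    def mark_available(self, path: str) -> None:
        self._marked.pop(path, None)

    def is_available(self, path: str) -> bool:
        marked_at = self._marked.get(path)
        if marked_at is None:
            return True
        if self._clock() - marked_at > self._ttl:
            # Stale entry: clean it up and treat the credential as available.
            self._marked.pop(path, None)
            return True
        return False
```

Passing the clock in as a parameter lets the expiry path be exercised in tests without sleeping.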
     async def _ensure_queue_processor_running(self):
         """Lazily starts the queue processor if not already running."""
@@ -565,15 +614,27 @@ class GoogleOAuthBase:
                 self._process_refresh_queue()
             )
     async def _queue_refresh(
         self, path: str, force: bool = False, needs_reauth: bool = False
     ):
-        """Add a credential to the refresh queue if not already queued.
         Args:
             path: Credential file path
             force: Force refresh even if not expired
-            needs_reauth: True if full re-authentication needed (bypasses backoff)
         """
         # IMPORTANT: Only check backoff for simple automated refreshes
         # Re-authentication (interactive OAuth) should BYPASS backoff since it needs user input
@@ -583,108 +644,226 @@ class GoogleOAuthBase:
                 backoff_until = self._next_refresh_after[path]
                 if now < backoff_until:
                     # Credential is in backoff for automated refresh, do not queue
-                    remaining = int(backoff_until - now)
-                    lib_logger.debug(
-                        f"Skipping automated refresh for '{Path(path).name}' (in backoff for {remaining}s)"
-                    )
                     return
         async with self._queue_tracking_lock:
             if path not in self._queued_credentials:
                 self._queued_credentials.add(path)
-                # [FIX PR#34] Store timestamp when marking unavailable (for TTL cleanup)
-                self._unavailable_credentials[path] = time.time()
-                lib_logger.debug(
-                    f"Marked '{Path(path).name}' as unavailable. "
-                    f"Total unavailable: {len(self._unavailable_credentials)}"
-                )
-                await self._refresh_queue.put((path, force, needs_reauth))
-                await self._ensure_queue_processor_running()
     async def _process_refresh_queue(self):
-        """Background worker that processes refresh requests sequentially."""
         while True:
             path = None
             try:
                 # Wait for an item with timeout to allow graceful shutdown
                 try:
-                    path, force, needs_reauth = await asyncio.wait_for(
                         self._refresh_queue.get(), timeout=60.0
                     )
                 except asyncio.TimeoutError:
-                    # [FIX PR#34] Clean up any stale unavailable entries before exiting
-                    # If we're idle for 60s, no refreshes are in progress
                     async with self._queue_tracking_lock:
-                        if self._unavailable_credentials:
-                            stale_count = len(self._unavailable_credentials)
-                            lib_logger.warning(
-                                f"Queue processor idle timeout. Cleaning {stale_count} "
-                                f"stale unavailable credentials: {list(self._unavailable_credentials.keys())}"
-                            )
-                            self._unavailable_credentials.clear()
                     self._queue_processor_task = None
                     return
                 try:
-                    # Perform the actual refresh (still using per-credential lock)
-                    async with await self._get_lock(path):
-                        # Re-check if still expired (may have changed since queueing)
-                        creds = self._credentials_cache.get(path)
-                        if creds and not self._is_token_expired(creds):
-                            # No longer expired, mark as available
-                            async with self._queue_tracking_lock:
-                                self._unavailable_credentials.pop(path, None)
-                                lib_logger.debug(
-                                    f"Credential '{Path(path).name}' no longer expired, marked available. "
-                                    f"Remaining unavailable: {len(self._unavailable_credentials)}"
-                                )
-                            continue
-                        # Perform refresh
-                        if not creds:
-                            creds = await self._load_credentials(path)
-                        await self._refresh_token(path, creds, force=force)
-                        # SUCCESS: Mark as available again
-                        async with self._queue_tracking_lock:
-                            self._unavailable_credentials.pop(path, None)
-                            lib_logger.debug(
-                                f"Refresh SUCCESS for '{Path(path).name}', marked available. "
-                                f"Remaining unavailable: {len(self._unavailable_credentials)}"
                             )
                 finally:
-                    # [FIX PR#34] Remove from BOTH queued set AND unavailable credentials
-                    # This ensures cleanup happens in ALL exit paths (success, exception, etc.)
                     async with self._queue_tracking_lock:
                         self._queued_credentials.discard(path)
-                        # [FIX PR#34] Always clean up unavailable credentials in finally block
                         self._unavailable_credentials.pop(path, None)
-                        lib_logger.debug(
-                            f"Finally cleanup for '{Path(path).name}'. "
-                            f"Remaining unavailable: {len(self._unavailable_credentials)}"
-                        )
-                    self._refresh_queue.task_done()
             except asyncio.CancelledError:
-                # [FIX PR#34] Clean up the current credential before breaking
                 if path:
                     async with self._queue_tracking_lock:
                         self._unavailable_credentials.pop(path, None)
-                        lib_logger.debug(
-                            f"CancelledError cleanup for '{Path(path).name}'. "
-                            f"Remaining unavailable: {len(self._unavailable_credentials)}"
-                        )
                 break
             except Exception as e:
-                lib_logger.error(f"Error in queue processor: {e}")
-                # Even on error, mark as available (backoff will prevent immediate retry)
                 if path:
                     async with self._queue_tracking_lock:
                         self._unavailable_credentials.pop(path, None)
-                        lib_logger.debug(
-                            f"Error cleanup for '{Path(path).name}': {e}. "
-                            f"Remaining unavailable: {len(self._unavailable_credentials)}"
-                        )
     async def _perform_interactive_oauth(
         self, path: str, creds: Dict[str, Any], display_name: str
@@ -744,14 +923,14 @@ class GoogleOAuthBase:
         try:
             server = await asyncio.start_server(
-                handle_callback, "127.0.0.1", self.CALLBACK_PORT
             )
             from urllib.parse import urlencode
             auth_url = "https://accounts.google.com/o/oauth2/v2/auth?" + urlencode(
                 {
                     "client_id": self.CLIENT_ID,
-                    "redirect_uri": f"http://localhost:{self.CALLBACK_PORT}{self.CALLBACK_PATH}",
                     "scope": " ".join(self.OAUTH_SCOPES),
                     "access_type": "offline",
                     "response_type": "code",
@@ -826,7 +1005,7 @@ class GoogleOAuthBase:
                     "code": auth_code.strip(),
                     "client_id": self.CLIENT_ID,
                     "client_secret": self.CLIENT_SECRET,
-                    "redirect_uri": f"http://localhost:{self.CALLBACK_PORT}{self.CALLBACK_PATH}",
                     "grant_type": "authorization_code",
                 },
             )
@@ -864,6 +1043,18 @@ class GoogleOAuthBase:
             lib_logger.info(
                 f"{self.ENV_PREFIX} OAuth initialized successfully for '{display_name}'."
             )
         return new_creds
     async def initialize_token(
@@ -940,10 +1131,51 @@ class GoogleOAuthBase:
             )
     async def get_auth_header(self, credential_path: str) -> Dict[str, str]:
-        creds = await self._load_credentials(credential_path)
-        if self._is_token_expired(creds):
-            creds = await self._refresh_token(credential_path, creds)
-        return {"Authorization": f"Bearer {creds['access_token']}"}
     async def get_user_info(
         self, creds_or_path: Union[Dict[str, Any], str]
@@ -976,3 +1208,372 @@ class GoogleOAuthBase:
             if path:
                 await self._save_credentials(path, creds)
             return {"email": user_info.get("email")}

 # src/rotator_library/providers/google_oauth_base.py
 import os
+import re
 import webbrowser
+from dataclasses import dataclass, field
+from typing import Union, Optional, List
 import json
 import time
 import asyncio
 import logging
 from pathlib import Path
 from typing import Dict, Any
+from glob import glob
 import httpx
 from rich.console import Console
 from ..utils.headless_detection import is_headless_environment
 from ..utils.reauth_coordinator import get_reauth_coordinator
+from ..utils.resilient_io import safe_write_json
 lib_logger = logging.getLogger("rotator_library")
 console = Console()
+@dataclass
+class CredentialSetupResult:
+    """
+    Standardized result structure for credential setup operations.
+    Used by all auth classes to return consistent setup results to the credential tool.
+    """
+    success: bool
+    file_path: Optional[str] = None
+    email: Optional[str] = None
+    tier: Optional[str] = None
+    project_id: Optional[str] = None
+    is_update: bool = False
+    error: Optional[str] = None
+    credentials: Optional[Dict[str, Any]] = field(default=None, repr=False)
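A short illustration of how a result object like `CredentialSetupResult` behaves may help here. The sketch below re-declares the dataclass with the fields shown in the diff so it runs on its own; the sample values (path, email, token) are made up for illustration. It shows that `field(default=None, repr=False)` keeps the credential payload out of `repr()`, which is useful when results end up in logs.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class CredentialSetupResult:
    """Stand-alone copy of the fields shown in the diff, for illustration only."""

    success: bool
    file_path: Optional[str] = None
    email: Optional[str] = None
    tier: Optional[str] = None
    project_id: Optional[str] = None
    is_update: bool = False
    error: Optional[str] = None
    # repr=False keeps the token payload out of repr() and therefore out of log lines
    credentials: Optional[Dict[str, Any]] = field(default=None, repr=False)


# Hypothetical sample values, not taken from the project
result = CredentialSetupResult(
    success=True,
    file_path="oauth_creds/example_oauth_1.json",
    email="user@example.com",
    credentials={"access_token": "secret-token-value"},
)
summary = repr(result)
```

The `credentials` attribute is still reachable as `result.credentials`; only the textual representation omits it.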
 class GoogleOAuthBase:
     """
     Base class for Google OAuth2 authentication providers.
     CALLBACK_PATH: str = "/oauth2callback"
     REFRESH_EXPIRY_BUFFER_SECONDS: int = 30 * 60  # 30 minutes
+    @property
+    def callback_port(self) -> int:
+        """
+        Get the OAuth callback port, checking environment variable first.
+        Reads from {ENV_PREFIX}_OAUTH_PORT environment variable, falling back
+        to the class's CALLBACK_PORT default if not set.
+        """
+        env_var = f"{self.ENV_PREFIX}_OAUTH_PORT"
+        env_value = os.getenv(env_var)
+        if env_value:
+            try:
+                return int(env_value)
+            except ValueError:
+                lib_logger.warning(
+                    f"Invalid {env_var} value: {env_value}, using default {self.CALLBACK_PORT}"
+                )
+        return self.CALLBACK_PORT
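The lookup order in `callback_port` (environment variable first, class default second, with a warning on a non-integer value) can be tried in isolation. This is a minimal sketch rather than the library's code: `resolve_callback_port`, the `DEMO` prefix, and the default of 8085 are hypothetical names and values chosen for the example.

```python
import logging
import os

logger = logging.getLogger("callback_port_sketch")


def resolve_callback_port(env_prefix: str, default_port: int) -> int:
    """Return the port from <env_prefix>_OAUTH_PORT, or default_port if unset or invalid."""
    env_var = f"{env_prefix}_OAUTH_PORT"
    env_value = os.getenv(env_var)
    if env_value:
        try:
            return int(env_value)
        except ValueError:
            # Same fallback as in the diff: warn and keep the default
            logger.warning(
                "Invalid %s value: %r, using default %d", env_var, env_value, default_port
            )
    return default_port
```

With `DEMO_OAUTH_PORT=9090` in the environment, `resolve_callback_port("DEMO", 8085)` returns the override; with the variable unset or set to a non-numeric string, it returns the default.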
     def __init__(self):
         # Validate that subclass has set required attributes
         if self.CLIENT_ID is None:
             str, float
         ] = {}  # Track backoff timers (Unix timestamp)
+        # [QUEUE SYSTEM] Sequential refresh processing with two separate queues
+        # Normal refresh queue: for proactive token refresh (old token still valid)
         self._refresh_queue: asyncio.Queue = asyncio.Queue()
+        self._queue_processor_task: Optional[asyncio.Task] = None
+        # Re-auth queue: for invalid refresh tokens (requires user interaction)
+        self._reauth_queue: asyncio.Queue = asyncio.Queue()
+        self._reauth_processor_task: Optional[asyncio.Task] = None
+        # Tracking sets/dicts
+        self._queued_credentials: set = set()  # Track credentials in either queue
+        # Only credentials in re-auth queue are marked unavailable (not normal refresh)
+        # TTL cleanup is defense-in-depth for edge cases where re-auth processor crashes
         self._unavailable_credentials: Dict[
             str, float
         ] = {}  # Maps credential path -> timestamp when marked unavailable
+        # TTL should exceed reauth timeout (300s) to avoid premature cleanup
+        self._unavailable_ttl_seconds: int = 360  # 6 minutes TTL for stale entries
         self._queue_tracking_lock = asyncio.Lock()  # Protects queue sets
+        # Retry tracking for normal refresh queue
+        self._queue_retry_count: Dict[
+            str, int
+        ] = {}  # Track retry attempts per credential
+        # Configuration constants
+        self._refresh_timeout_seconds: int = 15  # Max time for single refresh
+        self._refresh_interval_seconds: int = 30  # Delay between queue items
+        self._refresh_max_retries: int = 3  # Attempts before kicked out
+        self._reauth_timeout_seconds: int = 300  # Time for user to complete OAuth
     def _parse_env_credential_path(self, path: str) -> Optional[str]:
         """
                         f"Environment variables for {self.ENV_PREFIX} credential index {credential_index} not found"
                     )
+            # Try file-based loading first (preferred for explicit file paths)
             try:
                 lib_logger.debug(
                     f"Loading {self.ENV_PREFIX} credentials from file: {path}"
                 self._credentials_cache[path] = creds
                 return creds
             except FileNotFoundError:
+                # File not found - fall back to legacy env vars for backwards compatibility
+                # This handles the case where only env vars are set and file paths are placeholders
+                env_creds = self._load_from_env()
+                if env_creds:
+                    lib_logger.info(
+                        f"File '{path}' not found, using {self.ENV_PREFIX} credentials from environment variables"
+                    )
+                    self._credentials_cache[path] = env_creds
+                    return env_creds
                 raise IOError(
                     f"{self.ENV_PREFIX} OAuth credential file not found at '{path}'"
                 )
                 raise IOError(
                     f"Failed to load {self.ENV_PREFIX} OAuth credentials from '{path}': {e}"
                 )
     async def _save_credentials(self, path: str, creds: Dict[str, Any]):
+        """Save credentials with in-memory fallback if disk unavailable."""
+        # Always update cache first (memory is reliable)
+        self._credentials_cache[path] = creds
         # Don't save to file if credentials were loaded from environment
         if creds.get("_proxy_metadata", {}).get("loaded_from_env"):
             lib_logger.debug("Credentials loaded from env, skipping file save")
             return
+        # Attempt disk write - if it fails, we still have the cache
+        # buffer_on_failure ensures data is retried periodically and saved on shutdown
+        if safe_write_json(
+            path, creds, lib_logger, secure_permissions=True, buffer_on_failure=True
+        ):
             lib_logger.debug(
+                f"Saved updated {self.ENV_PREFIX} OAuth credentials to '{path}'."
             )
+        else:
+            lib_logger.warning(
+                f"Credentials for {self.ENV_PREFIX} cached in memory only (buffered for retry)."
             )
     def _is_token_expired(self, creds: Dict[str, Any]) -> bool:
         expiry = creds.get("token_expiry")  # gcloud format
         """Proactively refresh a credential by queueing it for refresh."""
         creds = await self._load_credentials(credential_path)
         if self._is_token_expired(creds):
+            # lib_logger.info(f"Proactive refresh triggered for '{Path(credential_path).name}'")
             await self._queue_refresh(credential_path, force=False, needs_reauth=False)
     async def _get_lock(self, path: str) -> asyncio.Lock:
                 self._refresh_locks[path] = asyncio.Lock()
             return self._refresh_locks[path]
+    def _is_token_truly_expired(self, creds: Dict[str, Any]) -> bool:
+        """Check if token is TRULY expired (past actual expiry, not just threshold).
+        This is different from _is_token_expired() which uses a buffer for proactive refresh.
+        This method checks if the token is actually unusable.
+        """
+        expiry = creds.get("token_expiry")  # gcloud format
+        if not expiry:  # gemini-cli format
+            expiry_timestamp = creds.get("expiry_date", 0) / 1000
+        else:
+            expiry_timestamp = time.mktime(time.strptime(expiry, "%Y-%m-%dT%H:%M:%SZ"))
+        return expiry_timestamp < time.time()
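The distinction between "expired for refresh purposes" and "truly expired" can be sketched with two small predicates. This is an illustration under stated assumptions, not the project's implementation: the function names are hypothetical, the buffered comparison (`expiry < now + buffer`) is an assumption about how the proactive check works, and the `Z`-suffixed timestamp is parsed as UTC with `calendar.timegm`, which differs from the `time.mktime` call in the diff (that call interprets the parsed time in the local timezone).

```python
import calendar
import time
from typing import Any, Dict, Optional

REFRESH_BUFFER_SECONDS = 30 * 60  # mirrors REFRESH_EXPIRY_BUFFER_SECONDS in the diff


def expiry_timestamp(creds: Dict[str, Any]) -> float:
    """Return the expiry as a Unix timestamp for either credential format."""
    expiry = creds.get("token_expiry")  # gcloud format, e.g. "2025-01-01T00:00:00Z"
    if expiry:
        # Assumption: the trailing "Z" means UTC, so timegm is used instead of mktime
        return float(calendar.timegm(time.strptime(expiry, "%Y-%m-%dT%H:%M:%SZ")))
    return creds.get("expiry_date", 0) / 1000  # gemini-cli format, milliseconds


def needs_proactive_refresh(creds: Dict[str, Any], now: Optional[float] = None) -> bool:
    """True once the token is inside the refresh buffer (it may still be usable)."""
    now = time.time() if now is None else now
    return expiry_timestamp(creds) < now + REFRESH_BUFFER_SECONDS


def is_truly_expired(creds: Dict[str, Any], now: Optional[float] = None) -> bool:
    """True only once the token is past its actual expiry."""
    now = time.time() if now is None else now
    return expiry_timestamp(creds) < now
```

A token that expires in ten minutes is due for a proactive refresh but is not truly expired, which is why such a credential can stay in rotation while it waits in the normal refresh queue.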
     def is_credential_available(self, path: str) -> bool:
+        """Check if a credential is available for rotation.
+        Credentials are unavailable if:
+        1. In re-auth queue (token is truly broken, requires user interaction)
+        2. Token is TRULY expired (past actual expiry, not just threshold)
+        Note: Credentials in normal refresh queue are still available because
+        the old token is valid until actual expiry.
+        TTL cleanup (defense-in-depth): If a credential has been in the re-auth
+        queue longer than _unavailable_ttl_seconds without being processed, it's
+        cleaned up. This should only happen if the re-auth processor crashes or
+        is cancelled without proper cleanup.
         """
+        # Check if in re-auth queue (truly unavailable)
+        if path in self._unavailable_credentials:
+            marked_time = self._unavailable_credentials.get(path)
+            if marked_time is not None:
+                now = time.time()
+                if now - marked_time > self._unavailable_ttl_seconds:
+                    # Entry is stale - clean it up and return available
+                    # This is a defense-in-depth for edge cases where re-auth
+                    # processor crashed or was cancelled without cleanup
+                    lib_logger.warning(
+                        f"Credential '{Path(path).name}' stuck in re-auth queue for "
+                        f"{int(now - marked_time)}s (TTL: {self._unavailable_ttl_seconds}s). "
+                        f"Re-auth processor may have crashed. Auto-cleaning stale entry."
+                    )
+                    # Clean up both tracking structures for consistency
+                    self._unavailable_credentials.pop(path, None)
+                    self._queued_credentials.discard(path)
+                else:
+                    return False  # Still in re-auth, not available
+        # Check if token is TRULY expired (not just threshold-expired)
+        creds = self._credentials_cache.get(path)
+        if creds and self._is_token_truly_expired(creds):
+            # Token is actually expired - should not be used
+            # Queue for refresh if not already queued
+            if path not in self._queued_credentials:
+                # lib_logger.debug(
+                #     f"Credential '{Path(path).name}' is truly expired, queueing for refresh"
+                # )
+                asyncio.create_task(
+                    self._queue_refresh(path, force=True, needs_reauth=False)
                 )
+            return False
+        return True
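The TTL cleanup described in the docstring above can be reduced to a small stand-alone tracker. The following is a sketch of the idea only: `UnavailableTracker` and its methods are hypothetical names, the default of 360 seconds is borrowed from `_unavailable_ttl_seconds` in the diff, and the expiry check and refresh queueing of the real method are left out.

```python
import time
from typing import Dict, Optional


class UnavailableTracker:
    """Tracks keys marked unavailable and forgets markers older than a TTL."""

    def __init__(self, ttl_seconds: float = 360.0) -> None:
        self.ttl_seconds = ttl_seconds
        self._marked: Dict[str, float] = {}  # key -> time it was marked unavailable

    def mark(self, key: str, now: Optional[float] = None) -> None:
        self._marked[key] = time.time() if now is None else now

    def is_available(self, key: str, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        marked_time = self._marked.get(key)
        if marked_time is None:
            return True  # never marked, or already cleaned up
        if now - marked_time > self.ttl_seconds:
            # Stale marker: whoever set it presumably never cleaned up, so drop it
            self._marked.pop(key, None)
            return True
        return False
```

Passing `now` explicitly makes the behaviour easy to check without sleeping: a key marked at `t=1000` is unavailable at `t=1100` and available again once more than 360 seconds have passed.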
     async def _ensure_queue_processor_running(self):
         """Lazily starts the queue processor if not already running."""
                 self._process_refresh_queue()
             )
+    async def _ensure_reauth_processor_running(self):
+        """Lazily starts the re-auth queue processor if not already running."""
+        if self._reauth_processor_task is None or self._reauth_processor_task.done():
+            self._reauth_processor_task = asyncio.create_task(
+                self._process_reauth_queue()
+            )
     async def _queue_refresh(
         self, path: str, force: bool = False, needs_reauth: bool = False
     ):
+        """Add a credential to the appropriate refresh queue if not already queued.
         Args:
             path: Credential file path
             force: Force refresh even if not expired
+            needs_reauth: True if full re-authentication needed (routes to re-auth queue)
+        Queue routing:
+        - needs_reauth=True: Goes to re-auth queue, marks as unavailable
+        - needs_reauth=False: Goes to normal refresh queue, does NOT mark unavailable
+          (old token is still valid until actual expiry)
         """
         # IMPORTANT: Only check backoff for simple automated refreshes
         # Re-authentication (interactive OAuth) should BYPASS backoff since it needs user input
                 backoff_until = self._next_refresh_after[path]
                 if now < backoff_until:
                     # Credential is in backoff for automated refresh, do not queue
+                    # remaining = int(backoff_until - now)
+                    # lib_logger.debug(
+                    #     f"Skipping automated refresh for '{Path(path).name}' (in backoff for {remaining}s)"
+                    # )
                     return
         async with self._queue_tracking_lock:
             if path not in self._queued_credentials:
                 self._queued_credentials.add(path)
+                if needs_reauth:
+                    # Re-auth queue: mark as unavailable (token is truly broken)
+                    self._unavailable_credentials[path] = time.time()
+                    # lib_logger.debug(
+                    #     f"Queued '{Path(path).name}' for RE-AUTH (marked unavailable). "
+                    #     f"Total unavailable: {len(self._unavailable_credentials)}"
+                    # )
+                    await self._reauth_queue.put(path)
+                    await self._ensure_reauth_processor_running()
+                else:
+                    # Normal refresh queue: do NOT mark unavailable (old token still valid)
+                    # lib_logger.debug(
+                    #     f"Queued '{Path(path).name}' for refresh (still available). "
+                    #     f"Queue size: {self._refresh_queue.qsize() + 1}"
+                    # )
+                    await self._refresh_queue.put((path, force))
+                    await self._ensure_queue_processor_running()
     async def _process_refresh_queue(self):
+        """Background worker that processes normal refresh requests sequentially.
+        Key behaviors:
+        - 15s timeout per refresh operation
+        - 30s delay between processing credentials (prevents thundering herd)
+        - On failure: back of queue, max 3 retries before kicked
+        - If 401/403 detected: routes to re-auth queue
+        - Does NOT mark credentials unavailable (old token still valid)
+        """
+        # lib_logger.info("Refresh queue processor started")
         while True:
             path = None
             try:
                 # Wait for an item with timeout to allow graceful shutdown
                 try:
+                    path, force = await asyncio.wait_for(
                         self._refresh_queue.get(), timeout=60.0
                     )
                 except asyncio.TimeoutError:
+                    # Queue is empty and idle for 60s - clean up and exit
                     async with self._queue_tracking_lock:
+                        # Clear any stale retry counts
+                        self._queue_retry_count.clear()
                     self._queue_processor_task = None
+                    # lib_logger.debug("Refresh queue processor idle, shutting down")
                     return
                 try:
+                    # Quick check if still expired (optimization to avoid unnecessary refresh)
+                    creds = self._credentials_cache.get(path)
+                    if creds and not self._is_token_expired(creds):
+                        # No longer expired, skip refresh
+                        # lib_logger.debug(
+                        #     f"Credential '{Path(path).name}' no longer expired, skipping refresh"
+                        # )
+                        # Clear retry count on skip (not a failure)
+                        self._queue_retry_count.pop(path, None)
+                        continue
+                    # Perform refresh with timeout
+                    if not creds:
+                        creds = await self._load_credentials(path)
+                    try:
+                        async with asyncio.timeout(self._refresh_timeout_seconds):
+                            await self._refresh_token(path, creds, force=force)
+                        # SUCCESS: Clear retry count
+                        self._queue_retry_count.pop(path, None)
+                        # lib_logger.info(f"Refresh SUCCESS for '{Path(path).name}'")
+                    except asyncio.TimeoutError:
+                        lib_logger.warning(
+                            f"Refresh timeout ({self._refresh_timeout_seconds}s) for '{Path(path).name}'"
+                        )
+                        await self._handle_refresh_failure(path, force, "timeout")
+                    except httpx.HTTPStatusError as e:
+                        status_code = e.response.status_code
+                        if status_code in (401, 403):
+                            # Invalid refresh token - route to re-auth queue
+                            lib_logger.warning(
+                                f"Refresh token invalid for '{Path(path).name}' (HTTP {status_code}). "
+                                f"Routing to re-auth queue."
                             )
+                            self._queue_retry_count.pop(path, None)  # Clear retry count
+                            async with self._queue_tracking_lock:
+                                self._queued_credentials.discard(
+                                    path
+                                )  # Remove from queued
+                            await self._queue_refresh(
+                                path, force=True, needs_reauth=True
+                            )
+                        else:
+                            await self._handle_refresh_failure(
+                                path, force, f"HTTP {status_code}"
+                            )
+                    except Exception as e:
+                        await self._handle_refresh_failure(path, force, str(e))
+                finally:
+                    # Remove from queued set (unless re-queued by failure handler)
+                    async with self._queue_tracking_lock:
+                        # Only discard if not re-queued (check if still in queue set from retry)
+                        if (
+                            path in self._queued_credentials
+                            and self._queue_retry_count.get(path, 0) == 0
+                        ):
+                            self._queued_credentials.discard(path)
+                    self._refresh_queue.task_done()
+                # Wait between credentials to spread load
+                await asyncio.sleep(self._refresh_interval_seconds)
+            except asyncio.CancelledError:
+                # lib_logger.debug("Refresh queue processor cancelled")
+                break
+            except Exception as e:
+                lib_logger.error(f"Error in refresh queue processor: {e}")
+                if path:
+                    async with self._queue_tracking_lock:
+                        self._queued_credentials.discard(path)
+    async def _handle_refresh_failure(self, path: str, force: bool, error: str):
+        """Handle a refresh failure with back-of-line retry logic.
+        - Increments retry count
+        - If under max retries: re-adds to END of queue
+        - If at max retries: kicks credential out (retried next BackgroundRefresher cycle)
+        """
+        retry_count = self._queue_retry_count.get(path, 0) + 1
+        self._queue_retry_count[path] = retry_count
+        if retry_count >= self._refresh_max_retries:
+            # Kicked out until next BackgroundRefresher cycle
+            lib_logger.error(
+                f"Max retries ({self._refresh_max_retries}) reached for '{Path(path).name}' "
+                f"(last error: {error}). Will retry next refresh cycle."
+            )
+            self._queue_retry_count.pop(path, None)
+            async with self._queue_tracking_lock:
+                self._queued_credentials.discard(path)
+            return
+        # Re-add to END of queue for retry
+        lib_logger.warning(
+            f"Refresh failed for '{Path(path).name}' ({error}). "
+            f"Retry {retry_count}/{self._refresh_max_retries}, back of queue."
+        )
+        # Keep in queued_credentials set, add back to queue
+        await self._refresh_queue.put((path, force))
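The back-of-queue retry policy can be sketched independently of the provider class. This is a simplified illustration, not the code from the commit: `drain_with_retries` and the `refresh` callback are hypothetical, the loop drains a pre-filled queue instead of running as a long-lived worker, and the per-item timeout and the delay between items are omitted.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List


async def drain_with_retries(
    queue: "asyncio.Queue[str]",
    refresh: Callable[[str], Awaitable[None]],
    max_retries: int = 3,
) -> List[str]:
    """Process items in order; a failed item goes to the back of the queue.

    An item is dropped after max_retries failed attempts. Returns the dropped items.
    """
    retry_count: Dict[str, int] = {}
    dropped: List[str] = []
    while not queue.empty():
        item = queue.get_nowait()
        try:
            await refresh(item)
            retry_count.pop(item, None)  # success clears the counter
        except Exception:
            attempts = retry_count.get(item, 0) + 1
            if attempts >= max_retries:
                retry_count.pop(item, None)
                dropped.append(item)  # give up until something re-queues it
            else:
                retry_count[item] = attempts
                queue.put_nowait(item)  # back of the line, other items go first
        finally:
            queue.task_done()
    return dropped
```

Re-queueing at the end rather than retrying immediately means one failing credential cannot starve the others, and a transient failure gets some time to clear before the next attempt.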
+    async def _process_reauth_queue(self):
+        """Background worker that processes re-auth requests.
+        Key behaviors:
+        - Credentials ARE marked unavailable (token is truly broken)
+        - Uses ReauthCoordinator for interactive OAuth
+        - No automatic retry (requires user action)
+        - Cleans up unavailable status when done
+        """
+        # lib_logger.info("Re-auth queue processor started")
+        while True:
+            path = None
+            try:
+                # Wait for an item with timeout to allow graceful shutdown
+                try:
+                    path = await asyncio.wait_for(
+                        self._reauth_queue.get(), timeout=60.0
+                    )
+                except asyncio.TimeoutError:
+                    # Queue is empty and idle for 60s - exit
+                    self._reauth_processor_task = None
+                    # lib_logger.debug("Re-auth queue processor idle, shutting down")
+                    return
+                try:
+                    lib_logger.info(f"Starting re-auth for '{Path(path).name}'...")
+                    await self.initialize_token(path)
+                    lib_logger.info(f"Re-auth SUCCESS for '{Path(path).name}'")
+                except Exception as e:
+                    lib_logger.error(f"Re-auth FAILED for '{Path(path).name}': {e}")
+                    # No automatic retry for re-auth (requires user action)
                 finally:
+                    # Always clean up
                     async with self._queue_tracking_lock:
                         self._queued_credentials.discard(path)
                         self._unavailable_credentials.pop(path, None)
+                        # lib_logger.debug(
+                        #     f"Re-auth cleanup for '{Path(path).name}'. "
+                        #     f"Remaining unavailable: {len(self._unavailable_credentials)}"
+                        # )
+                    self._reauth_queue.task_done()
             except asyncio.CancelledError:
+                # Clean up current credential before breaking
                 if path:
                     async with self._queue_tracking_lock:
+                        self._queued_credentials.discard(path)
                         self._unavailable_credentials.pop(path, None)
+                # lib_logger.debug("Re-auth queue processor cancelled")
                 break
             except Exception as e:
+                lib_logger.error(f"Error in re-auth queue processor: {e}")
                 if path:
                     async with self._queue_tracking_lock:
+                        self._queued_credentials.discard(path)
                         self._unavailable_credentials.pop(path, None)
     async def _perform_interactive_oauth(
         self, path: str, creds: Dict[str, Any], display_name: str
         try:
             server = await asyncio.start_server(
+                handle_callback, "127.0.0.1", self.callback_port
             )
             from urllib.parse import urlencode
             auth_url = "https://accounts.google.com/o/oauth2/v2/auth?" + urlencode(
                 {
                     "client_id": self.CLIENT_ID,
+                    "redirect_uri": f"http://localhost:{self.callback_port}{self.CALLBACK_PATH}",
                     "scope": " ".join(self.OAUTH_SCOPES),
                     "access_type": "offline",
                     "response_type": "code",
                     "code": auth_code.strip(),
                     "client_id": self.CLIENT_ID,
                     "client_secret": self.CLIENT_SECRET,
+                    "redirect_uri": f"http://localhost:{self.callback_port}{self.CALLBACK_PATH}",
                     "grant_type": "authorization_code",
                 },
             )
             lib_logger.info(
                 f"{self.ENV_PREFIX} OAuth initialized successfully for '{display_name}'."
             )
+            # Perform post-auth discovery (tier, project, etc.) while we have a fresh token
+            if path:
+                try:
+                    await self._post_auth_discovery(path, new_creds["access_token"])
+                except Exception as e:
+                    # Don't fail auth if discovery fails - it can be retried on first request
+                    lib_logger.warning(
+                        f"Post-auth discovery failed for '{display_name}': {e}. "
+                        "Tier/project will be discovered on first request."
+                    )
         return new_creds
     async def initialize_token(
             )
     async def get_auth_header(self, credential_path: str) -> Dict[str, str]:
+        """Get auth header with graceful degradation if refresh fails."""
+        try:
+            creds = await self._load_credentials(credential_path)
+            if self._is_token_expired(creds):
+                try:
+                    creds = await self._refresh_token(credential_path, creds)
+                except Exception as e:
+                    # Check if we have a cached token that might still work
+                    cached = self._credentials_cache.get(credential_path)
+                    if cached and cached.get("access_token"):
+                        lib_logger.warning(
+                            f"Token refresh failed for {Path(credential_path).name}: {e}. "
+                            "Using cached token (may be expired)."
+                        )
+                        creds = cached
+                    else:
+                        raise
+            return {"Authorization": f"Bearer {creds['access_token']}"}
+        except Exception as e:
+            # Check if any cached credential exists as last resort
+            cached = self._credentials_cache.get(credential_path)
+            if cached and cached.get("access_token"):
+                lib_logger.error(
+                    f"Credential load failed for {credential_path}: {e}. "
+                    "Using stale cached token as last resort."
+                )
+                return {"Authorization": f"Bearer {cached['access_token']}"}
+            raise
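The fallback behaviour of `get_auth_header` follows a common pattern: try the fresh path, and fall back to a cached value only if one exists. The sketch below isolates that pattern under simplifying assumptions; `auth_header_with_fallback` and `load_and_refresh` are hypothetical names, and the two nested fallback levels of the original are collapsed into one.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, Optional


async def auth_header_with_fallback(
    load_and_refresh: Callable[[], Awaitable[Dict[str, Any]]],
    cached: Optional[Dict[str, Any]],
) -> Dict[str, str]:
    """Build a Bearer header, using the cached token if loading or refreshing fails."""
    try:
        creds = await load_and_refresh()
    except Exception:
        if cached and cached.get("access_token"):
            # The cached token may already be expired; the caller finds out on use
            creds = cached
        else:
            raise  # nothing to fall back to, so surface the original error
    return {"Authorization": f"Bearer {creds['access_token']}"}
```

The trade-off is that a request may go out with a stale token and fail upstream with an authorization error, in exchange for not failing locally when the refresh endpoint is briefly unreachable.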
+    async def _post_auth_discovery(
+        self, credential_path: str, access_token: str
+    ) -> None:
+        """
+        Hook for subclasses to perform post-authentication discovery.
+        Called after successful OAuth authentication (both initial and re-auth).
+        Subclasses can override this to discover and cache tier/project information
+        during the authentication flow rather than waiting for the first API request.
+        Args:
+            credential_path: Path to the credential file
+            access_token: The newly obtained access token
+        """
+        # Default implementation does nothing - subclasses can override
+        pass
     async def get_user_info(
         self, creds_or_path: Union[Dict[str, Any], str]
             if path:
                 await self._save_credentials(path, creds)
             return {"email": user_info.get("email")}
+    # =========================================================================
+    # CREDENTIAL MANAGEMENT METHODS
+    # =========================================================================
+    def _get_provider_file_prefix(self) -> str:
+        """
+        Get the file name prefix for this provider's credential files.
+        Override in subclasses if the prefix differs from ENV_PREFIX.
+        Default: lowercase ENV_PREFIX with underscores (e.g., "gemini_cli")
+        """
+        return self.ENV_PREFIX.lower()
+    def _get_oauth_base_dir(self) -> Path:
+        """
+        Get the base directory for OAuth credential files.
+        Can be overridden to customize credential storage location.
+        """
+        return Path.cwd() / "oauth_creds"
+    def _find_existing_credential_by_email(
+        self, email: str, base_dir: Optional[Path] = None
+    ) -> Optional[Path]:
+        """
+        Find an existing credential file for the given email.
+        Args:
+            email: Email address to search for
+            base_dir: Directory to search in (defaults to oauth_creds)
+        Returns:
+            Path to existing credential file, or None if not found
+        """
+        if base_dir is None:
+            base_dir = self._get_oauth_base_dir()
+        prefix = self._get_provider_file_prefix()
+        pattern = str(base_dir / f"{prefix}_oauth_*.json")
+        for cred_file in glob(pattern):
+            try:
+                with open(cred_file, "r") as f:
+                    creds = json.load(f)
+                existing_email = creds.get("_proxy_metadata", {}).get("email")
+                if existing_email == email:
+                    return Path(cred_file)
+            except (json.JSONDecodeError, IOError) as e:
+                lib_logger.debug(f"Could not read credential file {cred_file}: {e}")
+                continue
+        return None
+    def _get_next_credential_number(self, base_dir: Optional[Path] = None) -> int:
+        """
+        Get the next available credential number for new credential files.
+        Args:
+            base_dir: Directory to scan (defaults to oauth_creds)
+        Returns:
+            Next available credential number (1, 2, 3, etc.)
+        """
+        if base_dir is None:
+            base_dir = self._get_oauth_base_dir()
+        prefix = self._get_provider_file_prefix()
+        pattern = str(base_dir / f"{prefix}_oauth_*.json")
+        existing_numbers = []
+        for cred_file in glob(pattern):
+            match = re.search(r"_oauth_(\d+)\.json$", cred_file)
+            if match:
+                existing_numbers.append(int(match.group(1)))
+        if not existing_numbers:
+            return 1
+        return max(existing_numbers) + 1
+    def _build_credential_path(
+        self, base_dir: Optional[Path] = None, number: Optional[int] = None
+    ) -> Path:
+        """
+        Build a path for a new credential file.
+        Args:
+            base_dir: Directory for credential files (defaults to oauth_creds)
+            number: Credential number (auto-determined if None)
+        Returns:
+            Path for the new credential file
+        """
+        if base_dir is None:
+            base_dir = self._get_oauth_base_dir()
+        if number is None:
+            number = self._get_next_credential_number(base_dir)
+        prefix = self._get_provider_file_prefix()
+        filename = f"{prefix}_oauth_{number}.json"
+        return base_dir / filename
+    async def setup_credential(
+        self, base_dir: Optional[Path] = None
+    ) -> CredentialSetupResult:
+        """
+        Complete credential setup flow: OAuth -> save -> discovery.
+        This is the main entry point for setting up new credentials.
+        Handles the entire lifecycle:
+        1. Perform OAuth authentication
+        2. Get user info (email) for deduplication
+        3. Find existing credential or create new file path
+        4. Save credentials to file
+        5. Perform post-auth discovery (tier/project for Google OAuth)
+        Args:
+            base_dir: Directory for credential files (defaults to oauth_creds)
+        Returns:
+            CredentialSetupResult with status and details
+        """
+        if base_dir is None:
+            base_dir = self._get_oauth_base_dir()
+        # Ensure directory exists
+        base_dir.mkdir(parents=True, exist_ok=True)
+        try:
+            # Step 1: Perform OAuth authentication (returns credentials dict)
+            temp_creds = {
+                "_proxy_metadata": {"display_name": f"new {self.ENV_PREFIX} credential"}
+            }
+            new_creds = await self.initialize_token(temp_creds)
+            # Step 2: Get user info for deduplication
+            user_info = await self.get_user_info(new_creds)
+            email = user_info.get("email")
+            if not email:
+                return CredentialSetupResult(
+                    success=False, error="Could not retrieve email from OAuth response"
+                )
+            # Step 3: Check for existing credential with same email
+            existing_path = self._find_existing_credential_by_email(email, base_dir)
+            is_update = existing_path is not None
+            if is_update:
+                file_path = existing_path
+                lib_logger.info(
+                    f"Found existing credential for {email}, updating {file_path.name}"
+                )
+            else:
+                file_path = self._build_credential_path(base_dir)
+                lib_logger.info(
+                    f"Creating new credential for {email} at {file_path.name}"
+                )
+            # Step 4: Save credentials to file
+            await self._save_credentials(str(file_path), new_creds)
+            # Step 5: Perform post-auth discovery (tier, project_id)
+            # _perform_interactive_oauth already runs discovery, but it is run again
+            # here in case the credentials were loaded from an existing token.
+            tier = None
+            project_id = None
+            try:
+                await self._post_auth_discovery(
+                    str(file_path), new_creds["access_token"]
+                )
+                # Reload credentials to get discovered metadata
+                with open(file_path, "r") as f:
+                    updated_creds = json.load(f)
+                tier = updated_creds.get("_proxy_metadata", {}).get("tier")
+                project_id = updated_creds.get("_proxy_metadata", {}).get("project_id")
+                new_creds = updated_creds
+            except Exception as e:
+                lib_logger.warning(
+                    f"Post-auth discovery failed: {e}. Tier/project will be discovered on first request."
+                )
+            return CredentialSetupResult(
+                success=True,
+                file_path=str(file_path),
+                email=email,
+                tier=tier,
+                project_id=project_id,
+                is_update=is_update,
+                credentials=new_creds,
+            )
+        except Exception as e:
+            lib_logger.error(f"Credential setup failed: {e}")
+            return CredentialSetupResult(success=False, error=str(e))
+    def build_env_lines(self, creds: Dict[str, Any], cred_number: int) -> List[str]:
+        """
+        Generate .env file lines for a credential.
+        Subclasses should override to include provider-specific fields
+        (e.g., tier, project_id for Google OAuth providers).
+        Args:
+            creds: Credential dictionary loaded from JSON
+            cred_number: Credential number (1, 2, 3, etc.)
+        Returns:
+            List of .env file lines
+        """
+        email = creds.get("_proxy_metadata", {}).get("email", "unknown")
+        prefix = f"{self.ENV_PREFIX}_{cred_number}"
+        lines = [
+            f"# {self.ENV_PREFIX} Credential #{cred_number} for: {email}",
+            f"# Exported from: {self._get_provider_file_prefix()}_oauth_{cred_number}.json",
+            f"# Generated at: {time.strftime('%Y-%m-%d %H:%M:%S')}",
+            "#",
+            "# To combine multiple credentials into one .env file, copy these lines",
+            "# and ensure each credential has a unique number (1, 2, 3, etc.)",
+            "",
+            f"{prefix}_ACCESS_TOKEN={creds.get('access_token', '')}",
+            f"{prefix}_REFRESH_TOKEN={creds.get('refresh_token', '')}",
+            f"{prefix}_SCOPE={creds.get('scope', '')}",
+            f"{prefix}_TOKEN_TYPE={creds.get('token_type', 'Bearer')}",
+            f"{prefix}_ID_TOKEN={creds.get('id_token', '')}",
+            f"{prefix}_EXPIRY_DATE={creds.get('expiry_date', 0)}",
+            f"{prefix}_CLIENT_ID={creds.get('client_id', '')}",
+            f"{prefix}_CLIENT_SECRET={creds.get('client_secret', '')}",
+            f"{prefix}_TOKEN_URI={creds.get('token_uri', 'https://oauth2.googleapis.com/token')}",
+            f"{prefix}_UNIVERSE_DOMAIN={creds.get('universe_domain', 'googleapis.com')}",
+            f"{prefix}_EMAIL={email}",
+        ]
+        return lines
+    def export_credential_to_env(
+        self, credential_path: str, output_dir: Optional[Path] = None
+    ) -> Optional[str]:
+        """
+        Export a credential file to .env format.
+        Args:
+            credential_path: Path to the credential JSON file
+            output_dir: Directory for output .env file (defaults to same as credential)
+        Returns:
+            Path to the exported .env file, or None on error
+        """
+        try:
+            cred_path = Path(credential_path)
+            # Load credential
+            with open(cred_path, "r") as f:
+                creds = json.load(f)
+            # Extract metadata
+            email = creds.get("_proxy_metadata", {}).get("email", "unknown")
+            # Get credential number from filename
+            match = re.search(r"_oauth_(\d+)\.json$", cred_path.name)
+            cred_number = int(match.group(1)) if match else 1
+            # Build output path
+            if output_dir is None:
+                output_dir = cred_path.parent
+            safe_email = email.replace("@", "_at_").replace(".", "_")
+            prefix = self._get_provider_file_prefix()
+            env_filename = f"{prefix}_{cred_number}_{safe_email}.env"
+            env_path = output_dir / env_filename
+            # Build and write content
+            env_lines = self.build_env_lines(creds, cred_number)
+            with open(env_path, "w") as f:
+                f.write("\n".join(env_lines))
+            lib_logger.info(f"Exported credential to {env_path}")
+            return str(env_path)
+        except Exception as e:
+            lib_logger.error(f"Failed to export credential: {e}")
+            return None
+    def list_credentials(self, base_dir: Optional[Path] = None) -> List[Dict[str, Any]]:
+        """
+        List all credential files for this provider.
+        Args:
+            base_dir: Directory to search (defaults to oauth_creds)
+        Returns:
+            List of dicts with credential info:
+            - file_path: Path to credential file
+            - email: User email
+            - tier: Tier info (if available)
+            - project_id: Project ID (if available)
+            - number: Credential number
+        """
+        if base_dir is None:
+            base_dir = self._get_oauth_base_dir()
+        prefix = self._get_provider_file_prefix()
+        pattern = str(base_dir / f"{prefix}_oauth_*.json")
+        credentials = []
+        for cred_file in sorted(glob(pattern)):
+            try:
+                with open(cred_file, "r") as f:
+                    creds = json.load(f)
+                metadata = creds.get("_proxy_metadata", {})
+                # Extract number from filename
+                match = re.search(r"_oauth_(\d+)\.json$", cred_file)
+                number = int(match.group(1)) if match else 0
+                credentials.append(
+                    {
+                        "file_path": cred_file,
+                        "email": metadata.get("email", "unknown"),
+                        "tier": metadata.get("tier"),
+                        "project_id": metadata.get("project_id"),
+                        "number": number,
+                    }
+                )
+            except Exception as e:
+                lib_logger.debug(f"Could not read credential file {cred_file}: {e}")
+                continue
+        return credentials
+    def delete_credential(self, credential_path: str) -> bool:
+        """
+        Delete a credential file.
+        Args:
+            credential_path: Path to the credential file
+        Returns:
+            True if deleted successfully, False otherwise
+        """
+        try:
+            cred_path = Path(credential_path)
+            # Validate that it's one of our credential files
+            prefix = self._get_provider_file_prefix()
+            if not cred_path.name.startswith(f"{prefix}_oauth_"):
+                lib_logger.error(
+                    f"File {cred_path.name} does not appear to be a {self.ENV_PREFIX} credential"
+                )
+                return False
+            if not cred_path.exists():
+                lib_logger.warning(f"Credential file does not exist: {credential_path}")
+                return False
+            # Remove from cache if present
+            self._credentials_cache.pop(credential_path, None)
+            # Delete the file
+            cred_path.unlink()
+            lib_logger.info(f"Deleted credential file: {credential_path}")
+            return True
+        except Exception as e:
+            lib_logger.error(f"Failed to delete credential: {e}")
+            return False
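
The credential-file helpers in this file (`_find_existing_credential_by_email`, `_get_next_credential_number`, `_build_credential_path`) can be tried in isolation. What follows is a minimal standalone sketch, not the library's code: `next_number`, `find_by_email`, `demo`, and the `gemini_cli` prefix are hypothetical names chosen for illustration. Only the `<prefix>_oauth_<N>.json` naming scheme and the `_proxy_metadata.email` field are taken from the diff.

```python
import json
import re
import tempfile
from pathlib import Path
from typing import Optional


def next_number(base_dir: Path, prefix: str) -> int:
    """Return the highest existing credential number plus one, or 1 if none exist."""
    numbers = []
    for cred_file in base_dir.glob(f"{prefix}_oauth_*.json"):
        match = re.search(r"_oauth_(\d+)\.json$", cred_file.name)
        if match:
            numbers.append(int(match.group(1)))
    return max(numbers) + 1 if numbers else 1


def find_by_email(base_dir: Path, prefix: str, email: str) -> Optional[Path]:
    """Return the first credential file whose metadata email matches, else None."""
    for cred_file in sorted(base_dir.glob(f"{prefix}_oauth_*.json")):
        try:
            creds = json.loads(cred_file.read_text(encoding="utf-8"))
        except (json.JSONDecodeError, OSError):
            continue  # unreadable files are skipped rather than aborting the scan
        if creds.get("_proxy_metadata", {}).get("email") == email:
            return cred_file
    return None


def demo() -> dict:
    """Build a throwaway credential directory and query it."""
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp)
        for n, email in [(1, "a@example.com"), (3, "b@example.com")]:
            path = base / f"gemini_cli_oauth_{n}.json"
            path.write_text(json.dumps({"_proxy_metadata": {"email": email}}))
        # A corrupt file must not break the email lookup.
        (base / "gemini_cli_oauth_2.json").write_text("{not json")
        found = find_by_email(base, "gemini_cli", "b@example.com")
        return {
            "next": next_number(base, "gemini_cli"),
            "found": found.name if found else None,
            "missing": find_by_email(base, "gemini_cli", "c@example.com"),
        }
```

Because the number is derived from the maximum rather than the first free slot, gaps left by deleted files are not reused in this sketch, which matches the `max(existing_numbers) + 1` logic in the diff.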

src/rotator_library/providers/iflow_auth_base.py CHANGED

@@ -9,11 +9,12 @@ import logging
 import webbrowser
 import socket
 import os
 from pathlib import Path
-from typing import Dict, Any, Tuple, Union, Optional
 from urllib.parse import urlencode, parse_qs, urlparse
-import tempfile
-import shutil
 import httpx
 from aiohttp import web
@@ -24,6 +25,7 @@ from rich.text import Text
 from rich.markup import escape as rich_escape
 from ..utils.headless_detection import is_headless_environment
 from ..utils.reauth_coordinator import get_reauth_coordinator
 lib_logger = logging.getLogger("rotator_library")
@@ -40,6 +42,39 @@ IFLOW_CLIENT_SECRET = "4Z3YjXycVsQvyGF1etiNlIBB4RsqSDtW"
 # Local callback server port
 CALLBACK_PORT = 11451
 # Refresh tokens 24 hours before expiry
 REFRESH_EXPIRY_BUFFER_SECONDS = 24 * 60 * 60
@@ -171,19 +206,36 @@ class IFlowAuthBase:
             str, float
         ] = {}  # Track backoff timers (Unix timestamp)
-        # [QUEUE SYSTEM] Sequential refresh processing
         self._refresh_queue: asyncio.Queue = asyncio.Queue()
-        self._queued_credentials: set = set()  # Track credentials already in queue
-        # [FIX PR#34] Changed from set to dict mapping credential path to timestamp
-        # This enables TTL-based stale entry cleanup as defense in depth
         self._unavailable_credentials: Dict[
             str, float
         ] = {}  # Maps credential path -> timestamp when marked unavailable
-        self._unavailable_ttl_seconds: int = 300  # 5 minutes TTL for stale entries
         self._queue_tracking_lock = asyncio.Lock()  # Protects queue sets
-        self._queue_processor_task: Optional[asyncio.Task] = (
-            None  # Background worker task
-        )
     def _parse_env_credential_path(self, path: str) -> Optional[str]:
         """
@@ -305,76 +357,40 @@ class IFlowAuthBase:
                         f"Environment variables for iFlow credential index {credential_index} not found"
                     )
-            # For file paths, try loading from legacy env vars first
-            env_creds = self._load_from_env()
-            if env_creds:
-                lib_logger.info("Using iFlow credentials from environment variables")
-                self._credentials_cache[path] = env_creds
-                return env_creds
-            # Fall back to file-based loading
-            return await self._read_creds_from_file(path)
     async def _save_credentials(self, path: str, creds: Dict[str, Any]):
-        """Saves credentials to cache and file using atomic writes."""
         # Don't save to file if credentials were loaded from environment
         if creds.get("_proxy_metadata", {}).get("loaded_from_env"):
             lib_logger.debug("Credentials loaded from env, skipping file save")
-            # Still update cache for in-memory consistency
-            self._credentials_cache[path] = creds
             return
-        # [ATOMIC WRITE] Use tempfile + move pattern to ensure atomic writes
-        # This prevents credential corruption if the process is interrupted during write
-        parent_dir = os.path.dirname(os.path.abspath(path))
-        os.makedirs(parent_dir, exist_ok=True)
-        tmp_fd = None
-        tmp_path = None
-        try:
-            # Create temp file in same directory as target (ensures same filesystem)
-            tmp_fd, tmp_path = tempfile.mkstemp(
-                dir=parent_dir, prefix=".tmp_", suffix=".json", text=True
-            )
-            # Write JSON to temp file
-            with os.fdopen(tmp_fd, "w") as f:
-                json.dump(creds, f, indent=2)
-                tmp_fd = None  # fdopen closes the fd
-            # Set secure permissions (0600 = owner read/write only)
-            try:
-                os.chmod(tmp_path, 0o600)
-            except (OSError, AttributeError):
-                # Windows may not support chmod, ignore
-                pass
-            # Atomic move (overwrites target if it exists)
-            shutil.move(tmp_path, path)
-            tmp_path = None  # Successfully moved
-            # Update cache AFTER successful file write
-            self._credentials_cache[path] = creds
-            lib_logger.debug(
-                f"Saved updated iFlow OAuth credentials to '{path}' (atomic write)."
-            )
-        except Exception as e:
-            lib_logger.error(
-                f"Failed to save updated iFlow OAuth credentials to '{path}': {e}"
             )
-            # Clean up temp file if it still exists
-            if tmp_fd is not None:
-                try:
-                    os.close(tmp_fd)
-                except:
-                    pass
-            if tmp_path and os.path.exists(tmp_path):
-                try:
-                    os.unlink(tmp_path)
-                except:
-                    pass
-            raise
     def _is_token_expired(self, creds: Dict[str, Any]) -> bool:
         """Checks if the token is expired (with buffer for proactive refresh)."""
@@ -399,6 +415,29 @@ class IFlowAuthBase:
         return expiry_timestamp < time.time() + REFRESH_EXPIRY_BUFFER_SECONDS
     async def _fetch_user_info(self, access_token: str) -> Dict[str, Any]:
         """
         Fetches user info (including API key) from iFlow API.
@@ -553,6 +592,26 @@ class IFlowAuthBase:
                         )
                         response.raise_for_status()
                         new_token_data = response.json()
                         break  # Success
                     except httpx.HTTPStatusError as e:
@@ -654,6 +713,16 @@ class IFlowAuthBase:
             # Update tokens
             access_token = new_token_data.get("access_token")
             if not access_token:
                 raise ValueError("Missing access_token in refresh response")
             creds_from_file["access_token"] = access_token
@@ -749,7 +818,7 @@ class IFlowAuthBase:
         Proactively refreshes tokens if they're close to expiry.
         Only applies to OAuth credentials (file paths or env:// paths). Direct API keys are skipped.
         """
-        lib_logger.debug(f"proactively_refresh called for: {credential_identifier}")
         # Try to load credentials - this will fail for direct API keys
         # and succeed for OAuth credentials (file paths or env:// paths)
@@ -757,21 +826,21 @@ class IFlowAuthBase:
             creds = await self._load_credentials(credential_identifier)
         except IOError as e:
             # Not a valid credential path (likely a direct API key string)
-            lib_logger.debug(
-                f"Skipping refresh for '{credential_identifier}' - not an OAuth credential: {e}"
-            )
             return
         is_expired = self._is_token_expired(creds)
-        lib_logger.debug(
-            f"Token expired check for '{Path(credential_identifier).name}': {is_expired}"
-        )
         if is_expired:
-            lib_logger.debug(
-                f"Queueing refresh for '{Path(credential_identifier).name}'"
-            )
-            # Queue for refresh with needs_reauth=False (automated refresh)
             await self._queue_refresh(
                 credential_identifier, force=False, needs_reauth=False
             )
@@ -785,30 +854,55 @@ class IFlowAuthBase:
             return self._refresh_locks[path]
     def is_credential_available(self, path: str) -> bool:
-        """Check if a credential is available for rotation (not queued/refreshing).
-        [FIX PR#34] Now includes TTL-based stale entry cleanup as defense in depth.
-        If a credential has been unavailable for longer than _unavailable_ttl_seconds,
-        it is automatically cleaned up and considered available.
         """
-        if path not in self._unavailable_credentials:
-            return True
-        # [FIX PR#34] Check if the entry is stale (TTL expired)
-        marked_time = self._unavailable_credentials.get(path)
-        if marked_time is not None:
-            now = time.time()
-            if now - marked_time > self._unavailable_ttl_seconds:
-                # Entry is stale - clean it up and return available
-                lib_logger.warning(
-                    f"Credential '{Path(path).name}' was stuck in unavailable state for "
-                    f"{int(now - marked_time)}s (TTL: {self._unavailable_ttl_seconds}s). "
-                    f"Auto-cleaning stale entry."
                 )
-                self._unavailable_credentials.pop(path, None)
-                return True
-        return False
     async def _ensure_queue_processor_running(self):
         """Lazily starts the queue processor if not already running."""
@@ -817,15 +911,27 @@ class IFlowAuthBase:
                 self._process_refresh_queue()
             )
     async def _queue_refresh(
         self, path: str, force: bool = False, needs_reauth: bool = False
     ):
-        """Add a credential to the refresh queue if not already queued.
         Args:
             path: Credential file path
             force: Force refresh even if not expired
-            needs_reauth: True if full re-authentication needed (bypasses backoff)
         """
         # IMPORTANT: Only check backoff for simple automated refreshes
         # Re-authentication (interactive OAuth) should BYPASS backoff since it needs user input
@@ -835,114 +941,223 @@ class IFlowAuthBase:
                 backoff_until = self._next_refresh_after[path]
                 if now < backoff_until:
                     # Credential is in backoff for automated refresh, do not queue
-                    remaining = int(backoff_until - now)
-                    lib_logger.debug(
-                        f"Skipping automated refresh for '{Path(path).name}' (in backoff for {remaining}s)"
-                    )
                     return
         async with self._queue_tracking_lock:
             if path not in self._queued_credentials:
                 self._queued_credentials.add(path)
-                # [FIX PR#34] Store timestamp when marking unavailable (for TTL cleanup)
-                self._unavailable_credentials[path] = time.time()
-                lib_logger.debug(
-                    f"Marked '{Path(path).name}' as unavailable. "
-                    f"Total unavailable: {len(self._unavailable_credentials)}"
-                )
-                await self._refresh_queue.put((path, force, needs_reauth))
-                await self._ensure_queue_processor_running()
     async def _process_refresh_queue(self):
-        """Background worker that processes refresh requests sequentially."""
         while True:
             path = None
             try:
                 # Wait for an item with timeout to allow graceful shutdown
                 try:
-                    path, force, needs_reauth = await asyncio.wait_for(
                         self._refresh_queue.get(), timeout=60.0
                     )
                 except asyncio.TimeoutError:
-                    # [FIX PR#34] Clean up any stale unavailable entries before exiting
-                    # If we're idle for 60s, no refreshes are in progress
                     async with self._queue_tracking_lock:
-                        if self._unavailable_credentials:
-                            stale_count = len(self._unavailable_credentials)
-                            lib_logger.warning(
-                                f"Queue processor idle timeout. Cleaning {stale_count} "
-                                f"stale unavailable credentials: {list(self._unavailable_credentials.keys())}"
-                            )
-                            self._unavailable_credentials.clear()
-                        # [FIX BUG#6] Also clear queued credentials to prevent stuck state
-                        if self._queued_credentials:
-                            lib_logger.debug(
-                                f"Clearing {len(self._queued_credentials)} queued credentials on timeout"
-                            )
-                            self._queued_credentials.clear()
                     self._queue_processor_task = None
                     return
                 try:
-                    # Perform the actual refresh (still using per-credential lock)
-                    async with await self._get_lock(path):
-                        # Re-check if still expired (may have changed since queueing)
-                        creds = self._credentials_cache.get(path)
-                        if creds and not self._is_token_expired(creds):
-                            # No longer expired, mark as available
-                            async with self._queue_tracking_lock:
-                                self._unavailable_credentials.pop(path, None)
-                                lib_logger.debug(
-                                    f"Credential '{Path(path).name}' no longer expired, marked available. "
-                                    f"Remaining unavailable: {len(self._unavailable_credentials)}"
-                                )
-                            continue
-                        # Perform refresh
-                        if not creds:
-                            creds = await self._load_credentials(path)
-                        await self._refresh_token(path, force=force)
-                        # SUCCESS: Mark as available again
-                        async with self._queue_tracking_lock:
-                            self._unavailable_credentials.pop(path, None)
-                            lib_logger.debug(
-                                f"Refresh SUCCESS for '{Path(path).name}', marked available. "
-                                f"Remaining unavailable: {len(self._unavailable_credentials)}"
                             )
                 finally:
-                    # [FIX PR#34] Remove from BOTH queued set AND unavailable credentials
-                    # This ensures cleanup happens in ALL exit paths (success, exception, etc.)
                     async with self._queue_tracking_lock:
                         self._queued_credentials.discard(path)
-                        # [FIX PR#34] Always clean up unavailable credentials in finally block
                         self._unavailable_credentials.pop(path, None)
-                        lib_logger.debug(
-                            f"Finally cleanup for '{Path(path).name}'. "
-                            f"Remaining unavailable: {len(self._unavailable_credentials)}"
-                        )
-                    self._refresh_queue.task_done()
             except asyncio.CancelledError:
-                # [FIX PR#34] Clean up the current credential before breaking
                 if path:
                     async with self._queue_tracking_lock:
                         self._unavailable_credentials.pop(path, None)
-                        lib_logger.debug(
-                            f"CancelledError cleanup for '{Path(path).name}'. "
-                            f"Remaining unavailable: {len(self._unavailable_credentials)}"
-                        )
                 break
             except Exception as e:
-                lib_logger.error(f"Error in queue processor: {e}")
-                # Even on error, mark as available (backoff will prevent immediate retry)
                 if path:
                     async with self._queue_tracking_lock:
                         self._unavailable_credentials.pop(path, None)
-                        lib_logger.debug(
-                            f"Error cleanup for '{Path(path).name}': {e}. "
-                            f"Remaining unavailable: {len(self._unavailable_credentials)}"
-                        )
     async def _perform_interactive_oauth(
         self, path: str, creds: Dict[str, Any], display_name: str
@@ -968,7 +1183,8 @@ class IFlowAuthBase:
         state = secrets.token_urlsafe(32)
         # Build authorization URL
-        redirect_uri = f"http://localhost:{CALLBACK_PORT}/oauth2callback"
         auth_params = {
             "loginMethod": "phone",
             "type": "phone",
@@ -979,7 +1195,7 @@ class IFlowAuthBase:
         auth_url = f"{IFLOW_OAUTH_AUTHORIZE_ENDPOINT}?{urlencode(auth_params)}"
         # Start OAuth callback server
-        callback_server = OAuthCallbackServer(port=CALLBACK_PORT)
         try:
             await callback_server.start(expected_state=state)
@@ -1182,3 +1398,261 @@ class IFlowAuthBase:
         except Exception as e:
             lib_logger.error(f"Failed to get iFlow user info from credentials: {e}")
             return {"email": None}
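
The TTL-based stale-entry cleanup that `is_credential_available` performs on `_unavailable_credentials` can be illustrated on its own. The sketch below is an assumption-laden stand-in, not the `IFlowAuthBase` implementation: `UnavailableTracker` and its method names are hypothetical, and the injectable `clock` parameter exists only so the behavior can be checked without sleeping.

```python
import time
from typing import Callable, Dict


class UnavailableTracker:
    """Tracks credentials marked unavailable, with TTL-based stale cleanup."""

    def __init__(
        self,
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        # Maps credential path -> timestamp when it was marked unavailable
        self._unavailable: Dict[str, float] = {}

    def mark_unavailable(self, path: str) -> None:
        self._unavailable[path] = self._clock()

    def mark_available(self, path: str) -> None:
        self._unavailable.pop(path, None)

    def is_available(self, path: str) -> bool:
        marked = self._unavailable.get(path)
        if marked is None:
            return True
        if self._clock() - marked > self._ttl:
            # Entry outlived its TTL: treat it as stale and clean it up.
            self._unavailable.pop(path, None)
            return True
        return False
```

The point of the TTL is defense in depth: if whatever marked a credential unavailable never clears the mark (for example, because a worker task died), the credential returns to rotation once the TTL elapses instead of staying stuck.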

 import webbrowser
 import socket
 import os
+import re
+from dataclasses import dataclass, field
 from pathlib import Path
+from glob import glob
+from typing import Dict, Any, Tuple, Union, Optional, List
 from urllib.parse import urlencode, parse_qs, urlparse
 import httpx
 from aiohttp import web
 from rich.markup import escape as rich_escape
 from ..utils.headless_detection import is_headless_environment
 from ..utils.reauth_coordinator import get_reauth_coordinator
+from ..utils.resilient_io import safe_write_json
 lib_logger = logging.getLogger("rotator_library")
 # Local callback server port
 CALLBACK_PORT = 11451
+@dataclass
+class IFlowCredentialSetupResult:
+    """
+    Standardized result structure for iFlow credential setup operations.
+    """
+    success: bool
+    file_path: Optional[str] = None
+    email: Optional[str] = None
+    is_update: bool = False
+    error: Optional[str] = None
+    credentials: Optional[Dict[str, Any]] = field(default=None, repr=False)
+def get_callback_port() -> int:
+    """
+    Get the OAuth callback port, checking environment variable first.
+    Reads from IFLOW_OAUTH_PORT environment variable, falling back
+    to the default CALLBACK_PORT if not set.
+    """
+    env_value = os.getenv("IFLOW_OAUTH_PORT")
+    if env_value:
+        try:
+            return int(env_value)
+        except ValueError:
+            lib_logger.warning(
+                f"Invalid IFLOW_OAUTH_PORT value: {env_value}, using default {CALLBACK_PORT}"
+            )
+    return CALLBACK_PORT
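
Reviewer sketch (not part of the commit): the fallback behaviour of `get_callback_port()` can be exercised in isolation. The helper name `read_port`, its parameters, and the logger name are hypothetical; only the `IFLOW_OAUTH_PORT` variable and the default `11451` are taken from the diff.

```python
import logging
import os

logger = logging.getLogger("port_example")  # hypothetical logger name
DEFAULT_PORT = 11451  # same value as CALLBACK_PORT in the diff


def read_port(var_name: str = "IFLOW_OAUTH_PORT", default: int = DEFAULT_PORT) -> int:
    """Return the integer stored in `var_name`, or `default` if unset, empty, or non-numeric."""
    raw = os.getenv(var_name)
    if raw:
        try:
            return int(raw)
        except ValueError:
            logger.warning("Invalid %s value: %r, using default %d", var_name, raw, default)
    return default


if __name__ == "__main__":
    os.environ["IFLOW_OAUTH_PORT"] = "9000"
    print(read_port())  # value parsed from the environment
    os.environ["IFLOW_OAUTH_PORT"] = "not-a-port"
    print(read_port())  # falls back to the default
```

As in the diff, any string that `int()` accepts is returned unchanged; the sketch performs no range check on the port number.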
 # Refresh tokens 24 hours before expiry
 REFRESH_EXPIRY_BUFFER_SECONDS = 24 * 60 * 60
             str, float
         ] = {}  # Track backoff timers (Unix timestamp)
+        # [QUEUE SYSTEM] Sequential refresh processing with two separate queues
+        # Normal refresh queue: for proactive token refresh (old token still valid)
         self._refresh_queue: asyncio.Queue = asyncio.Queue()
+        self._queue_processor_task: Optional[asyncio.Task] = None
+        # Re-auth queue: for invalid refresh tokens (requires user interaction)
+        self._reauth_queue: asyncio.Queue = asyncio.Queue()
+        self._reauth_processor_task: Optional[asyncio.Task] = None
+        # Tracking sets/dicts
+        self._queued_credentials: set = set()  # Track credentials in either queue
+        # Only credentials in re-auth queue are marked unavailable (not normal refresh)
+        # TTL cleanup is defense-in-depth for edge cases where re-auth processor crashes
         self._unavailable_credentials: Dict[
             str, float
         ] = {}  # Maps credential path -> timestamp when marked unavailable
+        # TTL should exceed reauth timeout (300s) to avoid premature cleanup
+        self._unavailable_ttl_seconds: int = 360  # 6 minutes TTL for stale entries
         self._queue_tracking_lock = asyncio.Lock()  # Protects queue sets
+        # Retry tracking for normal refresh queue
+        self._queue_retry_count: Dict[
+            str, int
+        ] = {}  # Track retry attempts per credential
+        # Configuration constants
+        self._refresh_timeout_seconds: int = 15  # Max time for single refresh
+        self._refresh_interval_seconds: int = 30  # Delay between queue items
+        self._refresh_max_retries: int = 3  # Attempts before kicked out
+        self._reauth_timeout_seconds: int = 300  # Time for user to complete OAuth
     def _parse_env_credential_path(self, path: str) -> Optional[str]:
         """
                         f"Environment variables for iFlow credential index {credential_index} not found"
                     )
+            # Try file-based loading first (preferred for explicit file paths)
+            try:
+                return await self._read_creds_from_file(path)
+            except IOError:
+                # File not found - fall back to legacy env vars for backwards compatibility
+                env_creds = self._load_from_env()
+                if env_creds:
+                    lib_logger.info(
+                        f"File '{path}' not found, using iFlow credentials from environment variables"
+                    )
+                    self._credentials_cache[path] = env_creds
+                    return env_creds
+                raise  # Re-raise the original file not found error
     async def _save_credentials(self, path: str, creds: Dict[str, Any]):
+        """Save credentials with in-memory fallback if disk unavailable."""
+        # Always update cache first (memory is reliable)
+        self._credentials_cache[path] = creds
         # Don't save to file if credentials were loaded from environment
         if creds.get("_proxy_metadata", {}).get("loaded_from_env"):
             lib_logger.debug("Credentials loaded from env, skipping file save")
             return
+        # Attempt disk write - if it fails, we still have the cache
+        # buffer_on_failure ensures data is retried periodically and saved on shutdown
+        if safe_write_json(
+            path, creds, lib_logger, secure_permissions=True, buffer_on_failure=True
+        ):
+            lib_logger.debug(f"Saved updated iFlow OAuth credentials to '{path}'.")
+        else:
+            lib_logger.warning(
+                "iFlow credentials cached in memory only (buffered for retry)."
             )
     def _is_token_expired(self, creds: Dict[str, Any]) -> bool:
         """Checks if the token is expired (with buffer for proactive refresh)."""
         return expiry_timestamp < time.time() + REFRESH_EXPIRY_BUFFER_SECONDS
+    def _is_token_truly_expired(self, creds: Dict[str, Any]) -> bool:
+        """Check if token is TRULY expired (past actual expiry, not just threshold).
+        This is different from _is_token_expired() which uses a buffer for proactive refresh.
+        This method checks if the token is actually unusable.
+        """
+        expiry_str = creds.get("expiry_date")
+        if not expiry_str:
+            return True
+        try:
+            from datetime import datetime
+            expiry_dt = datetime.fromisoformat(expiry_str.replace("Z", "+00:00"))
+            expiry_timestamp = expiry_dt.timestamp()
+        except (ValueError, AttributeError):
+            try:
+                expiry_timestamp = float(expiry_str)
+            except (ValueError, TypeError):
+                return True
+        return expiry_timestamp < time.time()
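
Reviewer sketch (not part of the commit): the difference between the buffered check in `_is_token_expired()` and the strict check in `_is_token_truly_expired()` can be shown with two small functions. All function names here are hypothetical, and treating a missing expiry as "due for refresh" in the buffered check is an assumption, since that method's body is not fully visible in this diff.

```python
import time
from datetime import datetime
from typing import Any, Optional

REFRESH_BUFFER_SECONDS = 24 * 60 * 60  # same buffer as REFRESH_EXPIRY_BUFFER_SECONDS


def parse_expiry(value: Any) -> Optional[float]:
    """Parse an ISO-8601 string or a Unix timestamp; return None if unparseable."""
    if value is None or value == "":
        return None
    if isinstance(value, (int, float)):
        return float(value)
    try:
        return datetime.fromisoformat(str(value).replace("Z", "+00:00")).timestamp()
    except ValueError:
        pass
    try:
        return float(value)
    except (ValueError, TypeError):
        return None


def needs_proactive_refresh(expiry: Any, now: Optional[float] = None) -> bool:
    """True once the token is inside the refresh buffer (assumed: missing expiry counts as due)."""
    ts = parse_expiry(expiry)
    now = time.time() if now is None else now
    return ts is None or ts < now + REFRESH_BUFFER_SECONDS


def is_truly_expired(expiry: Any, now: Optional[float] = None) -> bool:
    """True only once the actual expiry time has passed (or cannot be determined)."""
    ts = parse_expiry(expiry)
    now = time.time() if now is None else now
    return ts is None or ts < now
```

A token that expires in one hour is due for proactive refresh but is not truly expired, which is why the normal refresh queue leaves such credentials available for rotation.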
     async def _fetch_user_info(self, access_token: str) -> Dict[str, Any]:
         """
         Fetches user info (including API key) from iFlow API.
                         )
                         response.raise_for_status()
                         new_token_data = response.json()
+                        # [FIX] Handle wrapped response format: {success: bool, data: {...}}
+                        # iFlow API may return tokens nested inside a 'data' key
+                        if (
+                            isinstance(new_token_data, dict)
+                            and "data" in new_token_data
+                        ):
+                            lib_logger.debug(
+                                "iFlow refresh response wrapped in 'data' key, extracting..."
+                            )
+                            # Check for error in wrapped response
+                            if not new_token_data.get("success", True):
+                                error_msg = new_token_data.get(
+                                    "message", "Unknown error"
+                                )
+                                raise ValueError(
+                                    f"iFlow token refresh failed: {error_msg}"
+                                )
+                            new_token_data = new_token_data.get("data", {})
                         break  # Success
                     except httpx.HTTPStatusError as e:
             # Update tokens
             access_token = new_token_data.get("access_token")
             if not access_token:
+                # Log response keys for debugging
+                response_keys = (
+                    list(new_token_data.keys())
+                    if isinstance(new_token_data, dict)
+                    else type(new_token_data).__name__
+                )
+                lib_logger.error(
+                    f"Missing access_token in refresh response for '{Path(path).name}'. "
+                    f"Response keys: {response_keys}"
+                )
                 raise ValueError("Missing access_token in refresh response")
             creds_from_file["access_token"] = access_token
         Proactively refreshes tokens if they're close to expiry.
         Only applies to OAuth credentials (file paths or env:// paths). Direct API keys are skipped.
         """
+        # lib_logger.debug(f"proactively_refresh called for: {credential_identifier}")
         # Try to load credentials - this will fail for direct API keys
         # and succeed for OAuth credentials (file paths or env:// paths)
             creds = await self._load_credentials(credential_identifier)
         except IOError as e:
             # Not a valid credential path (likely a direct API key string)
+            # lib_logger.debug(
+            #     f"Skipping refresh for '{credential_identifier}' - not an OAuth credential: {e}"
+            # )
             return
         is_expired = self._is_token_expired(creds)
+        # lib_logger.debug(
+        #     f"Token expired check for '{Path(credential_identifier).name}': {is_expired}"
+        # )
         if is_expired:
+            # lib_logger.debug(
+            #     f"Queueing refresh for '{Path(credential_identifier).name}'"
+            # )
+            # lib_logger.info(f"Proactive refresh triggered for '{Path(credential_identifier).name}'")
             await self._queue_refresh(
                 credential_identifier, force=False, needs_reauth=False
             )
             return self._refresh_locks[path]
     def is_credential_available(self, path: str) -> bool:
+        """Check if a credential is available for rotation.
+        Credentials are unavailable if:
+        1. In re-auth queue (token is truly broken, requires user interaction)
+        2. Token is TRULY expired (past actual expiry, not just threshold)
+        Note: Credentials in normal refresh queue are still available because
+        the old token is valid until actual expiry.
+        TTL cleanup (defense-in-depth): If a credential has been in the re-auth
+        queue longer than _unavailable_ttl_seconds without being processed, it's
+        cleaned up. This should only happen if the re-auth processor crashes or
+        is cancelled without proper cleanup.
         """
+        # Check if in re-auth queue (truly unavailable)
+        if path in self._unavailable_credentials:
+            marked_time = self._unavailable_credentials.get(path)
+            if marked_time is not None:
+                now = time.time()
+                if now - marked_time > self._unavailable_ttl_seconds:
+                    # Entry is stale - clean it up and return available
+                    # This is defense-in-depth for edge cases where the re-auth
+                    # processor crashed or was cancelled without cleanup
+                    lib_logger.warning(
+                        f"Credential '{Path(path).name}' stuck in re-auth queue for "
+                        f"{int(now - marked_time)}s (TTL: {self._unavailable_ttl_seconds}s). "
+                        f"Re-auth processor may have crashed. Auto-cleaning stale entry."
+                    )
+                    # Clean up both tracking structures for consistency
+                    self._unavailable_credentials.pop(path, None)
+                    self._queued_credentials.discard(path)
+                else:
+                    return False  # Still in re-auth, not available
+        # Check if token is TRULY expired (not just threshold-expired)
+        creds = self._credentials_cache.get(path)
+        if creds and self._is_token_truly_expired(creds):
+            # Token is actually expired - should not be used
+            # Queue for refresh if not already queued
+            if path not in self._queued_credentials:
+                # lib_logger.debug(
+                #     f"Credential '{Path(path).name}' is truly expired, queueing for refresh"
+                # )
+                asyncio.create_task(
+                    self._queue_refresh(path, force=True, needs_reauth=False)
                 )
+            return False
+        return True
     async def _ensure_queue_processor_running(self):
         """Lazily starts the queue processor if not already running."""
                 self._process_refresh_queue()
             )
+    async def _ensure_reauth_processor_running(self):
+        """Lazily starts the re-auth queue processor if not already running."""
+        if self._reauth_processor_task is None or self._reauth_processor_task.done():
+            self._reauth_processor_task = asyncio.create_task(
+                self._process_reauth_queue()
+            )
     async def _queue_refresh(
         self, path: str, force: bool = False, needs_reauth: bool = False
     ):
+        """Add a credential to the appropriate refresh queue if not already queued.
         Args:
             path: Credential file path
             force: Force refresh even if not expired
+            needs_reauth: True if full re-authentication needed (routes to re-auth queue)
+        Queue routing:
+        - needs_reauth=True: Goes to re-auth queue, marks as unavailable
+        - needs_reauth=False: Goes to normal refresh queue, does NOT mark unavailable
+          (old token is still valid until actual expiry)
         """
         # IMPORTANT: Only check backoff for simple automated refreshes
         # Re-authentication (interactive OAuth) should BYPASS backoff since it needs user input
                 backoff_until = self._next_refresh_after[path]
                 if now < backoff_until:
                     # Credential is in backoff for automated refresh, do not queue
+                    # remaining = int(backoff_until - now)
+                    # lib_logger.debug(
+                    #     f"Skipping automated refresh for '{Path(path).name}' (in backoff for {remaining}s)"
+                    # )
                     return
         async with self._queue_tracking_lock:
             if path not in self._queued_credentials:
                 self._queued_credentials.add(path)
+                if needs_reauth:
+                    # Re-auth queue: mark as unavailable (token is truly broken)
+                    self._unavailable_credentials[path] = time.time()
+                    # lib_logger.debug(
+                    #     f"Queued '{Path(path).name}' for RE-AUTH (marked unavailable). "
+                    #     f"Total unavailable: {len(self._unavailable_credentials)}"
+                    # )
+                    await self._reauth_queue.put(path)
+                    await self._ensure_reauth_processor_running()
+                else:
+                    # Normal refresh queue: do NOT mark unavailable (old token still valid)
+                    # lib_logger.debug(
+                    #     f"Queued '{Path(path).name}' for refresh (still available). "
+                    #     f"Queue size: {self._refresh_queue.qsize() + 1}"
+                    # )
+                    await self._refresh_queue.put((path, force))
+                    await self._ensure_queue_processor_running()
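
Reviewer sketch (not part of the commit): the routing rule in `_queue_refresh()` reduces to "one tracking set, two queues, and only the re-auth path marks a credential unavailable". The class `RefreshRouter` and its attribute names are hypothetical, and the backoff check and processor start-up from the diff are omitted.

```python
import asyncio
import time
from typing import Dict, Set


class RefreshRouter:
    """Hypothetical stand-in for the queue-routing part of `_queue_refresh`."""

    def __init__(self) -> None:
        self.refresh_queue: asyncio.Queue = asyncio.Queue()
        self.reauth_queue: asyncio.Queue = asyncio.Queue()
        self.queued: Set[str] = set()  # credentials present in either queue
        self.unavailable: Dict[str, float] = {}  # path -> time marked unavailable
        self._lock = asyncio.Lock()

    async def enqueue(self, path: str, force: bool = False, needs_reauth: bool = False) -> bool:
        """Route `path` to exactly one queue; return False if it was already queued."""
        async with self._lock:
            if path in self.queued:
                return False
            self.queued.add(path)
            if needs_reauth:
                # Token is broken: hide the credential from rotation.
                self.unavailable[path] = time.time()
                await self.reauth_queue.put(path)
            else:
                # Old token still works: keep the credential available.
                await self.refresh_queue.put((path, force))
            return True

    def is_available(self, path: str) -> bool:
        return path not in self.unavailable


async def demo() -> dict:
    router = RefreshRouter()  # created inside the running event loop
    first = await router.enqueue("a.json")
    duplicate = await router.enqueue("a.json", needs_reauth=True)
    broken = await router.enqueue("b.json", needs_reauth=True)
    return {
        "first": first,
        "duplicate": duplicate,
        "broken": broken,
        "a_available": router.is_available("a.json"),
        "b_available": router.is_available("b.json"),
        "refresh_size": router.refresh_queue.qsize(),
        "reauth_size": router.reauth_queue.qsize(),
    }


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Note that a credential already queued for normal refresh is not promoted to the re-auth queue by a second call; in the diff, the 401/403 handler removes it from the tracking set first for that reason.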
     async def _process_refresh_queue(self):
+        """Background worker that processes normal refresh requests sequentially.
+        Key behaviors:
+        - 15s timeout per refresh operation
+        - 30s delay between processing credentials (prevents thundering herd)
+        - On failure: back of queue, max 3 retries before kicked
+        - If 401/403 detected: routes to re-auth queue
+        - Does NOT mark credentials unavailable (old token still valid)
+        """
+        # lib_logger.info("Refresh queue processor started")
         while True:
             path = None
             try:
                 # Wait for an item with timeout to allow graceful shutdown
                 try:
+                    path, force = await asyncio.wait_for(
                         self._refresh_queue.get(), timeout=60.0
                     )
                 except asyncio.TimeoutError:
+                    # Queue is empty and idle for 60s - clean up and exit
                     async with self._queue_tracking_lock:
+                        # Clear any stale retry counts
+                        self._queue_retry_count.clear()
                     self._queue_processor_task = None
+                    # lib_logger.debug("Refresh queue processor idle, shutting down")
                     return
                 try:
+                    # Quick check if still expired (optimization to avoid unnecessary refresh)
+                    creds = self._credentials_cache.get(path)
+                    if creds and not self._is_token_expired(creds):
+                        # No longer expired, skip refresh
+                        # lib_logger.debug(
+                        #     f"Credential '{Path(path).name}' no longer expired, skipping refresh"
+                        # )
+                        # Clear retry count on skip (not a failure)
+                        self._queue_retry_count.pop(path, None)
+                        continue
+                    # Perform refresh with timeout (asyncio.timeout requires Python 3.11+)
+                    try:
+                        async with asyncio.timeout(self._refresh_timeout_seconds):
+                            await self._refresh_token(path, force=force)
+                        # SUCCESS: Clear retry count
+                        self._queue_retry_count.pop(path, None)
+                        # lib_logger.info(f"Refresh SUCCESS for '{Path(path).name}'")
+                    except asyncio.TimeoutError:
+                        lib_logger.warning(
+                            f"Refresh timeout ({self._refresh_timeout_seconds}s) for '{Path(path).name}'"
+                        )
+                        await self._handle_refresh_failure(path, force, "timeout")
+                    except httpx.HTTPStatusError as e:
+                        status_code = e.response.status_code
+                        if status_code in (401, 403):
+                            # Invalid refresh token - route to re-auth queue
+                            lib_logger.warning(
+                                f"Refresh token invalid for '{Path(path).name}' (HTTP {status_code}). "
+                                f"Routing to re-auth queue."
                             )
+                            self._queue_retry_count.pop(path, None)  # Clear retry count
+                            async with self._queue_tracking_lock:
+                                self._queued_credentials.discard(
+                                    path
+                                )  # Remove from queued
+                            await self._queue_refresh(
+                                path, force=True, needs_reauth=True
+                            )
+                        else:
+                            await self._handle_refresh_failure(
+                                path, force, f"HTTP {status_code}"
+                            )
+                    except Exception as e:
+                        await self._handle_refresh_failure(path, force, str(e))
+                finally:
+                    # Remove from queued set (unless re-queued by failure handler)
+                    async with self._queue_tracking_lock:
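+                        # Only discard if not re-queued: a pending retry keeps a non-zero
+                        # retry count, and a credential routed to the re-auth queue is
+                        # tracked in _unavailable_credentials
+                        if (
+                            path in self._queued_credentials
+                            and self._queue_retry_count.get(path, 0) == 0
+                            and path not in self._unavailable_credentials
+                        ):
+                            self._queued_credentials.discard(path)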
+                    self._refresh_queue.task_done()
+                # Wait between credentials to spread load
+                await asyncio.sleep(self._refresh_interval_seconds)
+            except asyncio.CancelledError:
+                # lib_logger.debug("Refresh queue processor cancelled")
+                break
+            except Exception as e:
+                lib_logger.error(f"Error in refresh queue processor: {e}")
+                if path:
+                    async with self._queue_tracking_lock:
+                        self._queued_credentials.discard(path)
+    async def _handle_refresh_failure(self, path: str, force: bool, error: str):
+        """Handle a refresh failure with back-of-line retry logic.
+        - Increments retry count
+        - If under max retries: re-adds to END of queue
+        - If at max retries: kicks credential out (retried next BackgroundRefresher cycle)
+        """
+        retry_count = self._queue_retry_count.get(path, 0) + 1
+        self._queue_retry_count[path] = retry_count
+        if retry_count >= self._refresh_max_retries:
+            # Kicked out until next BackgroundRefresher cycle
+            lib_logger.error(
+                f"Max retries ({self._refresh_max_retries}) reached for '{Path(path).name}' "
+                f"(last error: {error}). Will retry next refresh cycle."
+            )
+            self._queue_retry_count.pop(path, None)
+            async with self._queue_tracking_lock:
+                self._queued_credentials.discard(path)
+            return
+        # Re-add to END of queue for retry
+        lib_logger.warning(
+            f"Refresh failed for '{Path(path).name}' ({error}). "
+            f"Retry {retry_count}/{self._refresh_max_retries}, back of queue."
+        )
+        # Keep in queued_credentials set, add back to queue
+        await self._refresh_queue.put((path, force))
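
Reviewer sketch (not part of the commit): the retry bookkeeping in `_handle_refresh_failure()` can be isolated as a pure function. The name `should_requeue` is hypothetical; the limit of 3 mirrors `_refresh_max_retries`.

```python
from typing import Dict

MAX_RETRIES = 3  # same value as _refresh_max_retries in the diff


def should_requeue(
    retry_counts: Dict[str, int], path: str, max_retries: int = MAX_RETRIES
) -> bool:
    """Record one failure for `path`; return True if it should go to the back of the queue."""
    count = retry_counts.get(path, 0) + 1
    if count >= max_retries:
        # Limit reached: drop the credential and reset its counter for the next cycle.
        retry_counts.pop(path, None)
        return False
    retry_counts[path] = count
    return True


if __name__ == "__main__":
    counts: Dict[str, int] = {}
    print([should_requeue(counts, "a.json") for _ in range(3)])
    print(counts)
```

With a limit of 3, the first two failures re-queue the credential and the third drops it, so each credential gets three attempts per cycle.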
+    async def _process_reauth_queue(self):
+        """Background worker that processes re-auth requests.
+        Key behaviors:
+        - Credentials ARE marked unavailable (token is truly broken)
+        - Uses ReauthCoordinator for interactive OAuth
+        - No automatic retry (requires user action)
+        - Cleans up unavailable status when done
+        """
+        # lib_logger.info("Re-auth queue processor started")
+        while True:
+            path = None
+            try:
+                # Wait for an item with timeout to allow graceful shutdown
+                try:
+                    path = await asyncio.wait_for(
+                        self._reauth_queue.get(), timeout=60.0
+                    )
+                except asyncio.TimeoutError:
+                    # Queue is empty and idle for 60s - exit
+                    self._reauth_processor_task = None
+                    # lib_logger.debug("Re-auth queue processor idle, shutting down")
+                    return
+                try:
+                    lib_logger.info(f"Starting re-auth for '{Path(path).name}'...")
+                    await self.initialize_token(path)
+                    lib_logger.info(f"Re-auth SUCCESS for '{Path(path).name}'")
+                except Exception as e:
+                    lib_logger.error(f"Re-auth FAILED for '{Path(path).name}': {e}")
+                    # No automatic retry for re-auth (requires user action)
                 finally:
+                    # Always clean up
                     async with self._queue_tracking_lock:
                         self._queued_credentials.discard(path)
                         self._unavailable_credentials.pop(path, None)
+                        # lib_logger.debug(
+                        #     f"Re-auth cleanup for '{Path(path).name}'. "
+                        #     f"Remaining unavailable: {len(self._unavailable_credentials)}"
+                        # )
+                    self._reauth_queue.task_done()
             except asyncio.CancelledError:
+                # Clean up current credential before breaking
                 if path:
                     async with self._queue_tracking_lock:
+                        self._queued_credentials.discard(path)
                         self._unavailable_credentials.pop(path, None)
+                # lib_logger.debug("Re-auth queue processor cancelled")
                 break
             except Exception as e:
+                lib_logger.error(f"Error in re-auth queue processor: {e}")
                 if path:
                     async with self._queue_tracking_lock:
+                        self._queued_credentials.discard(path)
                         self._unavailable_credentials.pop(path, None)
     async def _perform_interactive_oauth(
         self, path: str, creds: Dict[str, Any], display_name: str
         state = secrets.token_urlsafe(32)
         # Build authorization URL
+        callback_port = get_callback_port()
+        redirect_uri = f"http://localhost:{callback_port}/oauth2callback"
         auth_params = {
             "loginMethod": "phone",
             "type": "phone",
         auth_url = f"{IFLOW_OAUTH_AUTHORIZE_ENDPOINT}?{urlencode(auth_params)}"
         # Start OAuth callback server
+        callback_server = OAuthCallbackServer(port=callback_port)
         try:
             await callback_server.start(expected_state=state)
         except Exception as e:
             lib_logger.error(f"Failed to get iFlow user info from credentials: {e}")
             return {"email": None}
+    # =========================================================================
+    # CREDENTIAL MANAGEMENT METHODS
+    # =========================================================================
+    def _get_provider_file_prefix(self) -> str:
+        """Return the file prefix for iFlow credentials."""
+        return "iflow"
+    def _get_oauth_base_dir(self) -> Path:
+        """Get the base directory for OAuth credential files."""
+        return Path.cwd() / "oauth_creds"
+    def _find_existing_credential_by_email(
+        self, email: str, base_dir: Optional[Path] = None
+    ) -> Optional[Path]:
+        """Find an existing credential file for the given email."""
+        if base_dir is None:
+            base_dir = self._get_oauth_base_dir()
+        prefix = self._get_provider_file_prefix()
+        pattern = str(base_dir / f"{prefix}_oauth_*.json")
+        for cred_file in glob(pattern):
+            try:
+                with open(cred_file, "r") as f:
+                    creds = json.load(f)
+                existing_email = creds.get("email") or creds.get(
+                    "_proxy_metadata", {}
+                ).get("email")
+                if existing_email == email:
+                    return Path(cred_file)
+            except (json.JSONDecodeError, IOError) as e:
+                lib_logger.debug(f"Could not read credential file {cred_file}: {e}")
+                continue
+        return None
+    def _get_next_credential_number(self, base_dir: Optional[Path] = None) -> int:
+        """Get the next available credential number."""
+        if base_dir is None:
+            base_dir = self._get_oauth_base_dir()
+        prefix = self._get_provider_file_prefix()
+        pattern = str(base_dir / f"{prefix}_oauth_*.json")
+        existing_numbers = []
+        for cred_file in glob(pattern):
+            match = re.search(r"_oauth_(\d+)\.json$", cred_file)
+            if match:
+                existing_numbers.append(int(match.group(1)))
+        if not existing_numbers:
+            return 1
+        return max(existing_numbers) + 1
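
Reviewer sketch (not part of the commit): the numbering rule in `_get_next_credential_number()` can be tried against a temporary directory. The function names and the sample file names are hypothetical; the glob pattern and regular expression are the ones used in the diff.

```python
import re
import tempfile
from glob import glob
from pathlib import Path


def next_credential_number(base_dir: Path, prefix: str = "iflow") -> int:
    """Return one more than the highest `<prefix>_oauth_<N>.json` number in `base_dir`."""
    pattern = str(base_dir / f"{prefix}_oauth_*.json")
    numbers = []
    for name in glob(pattern):
        match = re.search(r"_oauth_(\d+)\.json$", name)
        if match:  # files such as iflow_oauth_backup.json are ignored
            numbers.append(int(match.group(1)))
    return max(numbers, default=0) + 1


def demo() -> tuple:
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp)
        empty = next_credential_number(base)
        for name in ("iflow_oauth_1.json", "iflow_oauth_3.json", "iflow_oauth_backup.json"):
            (base / name).write_text("{}")
        return empty, next_credential_number(base)


if __name__ == "__main__":
    print(demo())
```

Because the rule is "highest number plus one", gaps left by deleted credential files are not reused.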
+    def _build_credential_path(
+        self, base_dir: Optional[Path] = None, number: Optional[int] = None
+    ) -> Path:
+        """Build a path for a new credential file."""
+        if base_dir is None:
+            base_dir = self._get_oauth_base_dir()
+        if number is None:
+            number = self._get_next_credential_number(base_dir)
+        prefix = self._get_provider_file_prefix()
+        filename = f"{prefix}_oauth_{number}.json"
+        return base_dir / filename
+    async def setup_credential(
+        self, base_dir: Optional[Path] = None
+    ) -> IFlowCredentialSetupResult:
+        """
+        Complete credential setup flow: OAuth -> save.
+        This is the main entry point for setting up new credentials.
+        """
+        if base_dir is None:
+            base_dir = self._get_oauth_base_dir()
+        # Ensure directory exists
+        base_dir.mkdir(parents=True, exist_ok=True)
+        try:
+            # Step 1: Perform OAuth authentication
+            temp_creds = {"_proxy_metadata": {"display_name": "new iFlow credential"}}
+            new_creds = await self.initialize_token(temp_creds)
+            # Step 2: Get user info for deduplication
+            email = new_creds.get("email") or new_creds.get("_proxy_metadata", {}).get(
+                "email"
+            )
+            if not email:
+                return IFlowCredentialSetupResult(
+                    success=False, error="Could not retrieve email from OAuth response"
+                )
+            # Step 3: Check for existing credential with same email
+            existing_path = self._find_existing_credential_by_email(email, base_dir)
+            is_update = existing_path is not None
+            if is_update:
+                file_path = existing_path
+                lib_logger.info(
+                    f"Found existing credential for {email}, updating {file_path.name}"
+                )
+            else:
+                file_path = self._build_credential_path(base_dir)
+                lib_logger.info(
+                    f"Creating new credential for {email} at {file_path.name}"
+                )
+            # Step 4: Save credentials to file
+            await self._save_credentials(str(file_path), new_creds)
+            return IFlowCredentialSetupResult(
+                success=True,
+                file_path=str(file_path),
+                email=email,
+                is_update=is_update,
+                credentials=new_creds,
+            )
+        except Exception as e:
+            lib_logger.error(f"Credential setup failed: {e}")
+            return IFlowCredentialSetupResult(success=False, error=str(e))
+    def build_env_lines(self, creds: Dict[str, Any], cred_number: int) -> List[str]:
+        """Generate .env file lines for an iFlow credential."""
+        email = creds.get("email") or creds.get("_proxy_metadata", {}).get(
+            "email", "unknown"
+        )
+        prefix = f"IFLOW_{cred_number}"
+        lines = [
+            f"# IFLOW Credential #{cred_number} for: {email}",
+            f"# Exported from: iflow_oauth_{cred_number}.json",
+            f"# Generated at: {time.strftime('%Y-%m-%d %H:%M:%S')}",
+            "#",
+            "# To combine multiple credentials into one .env file, copy these lines",
+            "# and ensure each credential has a unique number (1, 2, 3, etc.)",
+            "",
+            f"{prefix}_ACCESS_TOKEN={creds.get('access_token', '')}",
+            f"{prefix}_REFRESH_TOKEN={creds.get('refresh_token', '')}",
+            f"{prefix}_API_KEY={creds.get('api_key', '')}",
+            f"{prefix}_EXPIRY_DATE={creds.get('expiry_date', '')}",
+            f"{prefix}_EMAIL={email}",
+            f"{prefix}_TOKEN_TYPE={creds.get('token_type', 'Bearer')}",
+            f"{prefix}_SCOPE={creds.get('scope', 'read write')}",
+        ]
+        return lines
+    def export_credential_to_env(
+        self, credential_path: str, output_dir: Optional[Path] = None
+    ) -> Optional[str]:
+        """Export a credential file to .env format."""
+        try:
+            cred_path = Path(credential_path)
+            # Load credential
+            with open(cred_path, "r") as f:
+                creds = json.load(f)
+            # Extract metadata
+            email = creds.get("email") or creds.get("_proxy_metadata", {}).get(
+                "email", "unknown"
+            )
+            # Get credential number from filename
+            match = re.search(r"_oauth_(\d+)\.json$", cred_path.name)
+            cred_number = int(match.group(1)) if match else 1
+            # Build output path
+            if output_dir is None:
+                output_dir = cred_path.parent
+            safe_email = email.replace("@", "_at_").replace(".", "_")
+            env_filename = f"iflow_{cred_number}_{safe_email}.env"
+            env_path = output_dir / env_filename
+            # Build and write content
+            env_lines = self.build_env_lines(creds, cred_number)
+            with open(env_path, "w") as f:
+                f.write("\n".join(env_lines))
+            lib_logger.info(f"Exported credential to {env_path}")
+            return str(env_path)
+        except Exception as e:
+            lib_logger.error(f"Failed to export credential: {e}")
+            return None
+    def list_credentials(self, base_dir: Optional[Path] = None) -> List[Dict[str, Any]]:
+        """List all iFlow credential files."""
+        if base_dir is None:
+            base_dir = self._get_oauth_base_dir()
+        prefix = self._get_provider_file_prefix()
+        pattern = str(base_dir / f"{prefix}_oauth_*.json")
+        credentials = []
+        for cred_file in sorted(glob(pattern)):
+            try:
+                with open(cred_file, "r") as f:
+                    creds = json.load(f)
+                email = creds.get("email") or creds.get("_proxy_metadata", {}).get(
+                    "email", "unknown"
+                )
+                # Extract number from filename
+                match = re.search(r"_oauth_(\d+)\.json$", cred_file)
+                number = int(match.group(1)) if match else 0
+                credentials.append(
+                    {
+                        "file_path": cred_file,
+                        "email": email,
+                        "number": number,
+                    }
+                )
+            except Exception as e:
+                lib_logger.debug(f"Could not read credential file {cred_file}: {e}")
+                continue
+        return credentials
+    def delete_credential(self, credential_path: str) -> bool:
+        """Delete a credential file."""
+        try:
+            cred_path = Path(credential_path)
+            # Validate that it's one of our credential files
+            prefix = self._get_provider_file_prefix()
+            if not cred_path.name.startswith(f"{prefix}_oauth_"):
+                lib_logger.error(
+                    f"File {cred_path.name} does not appear to be an iFlow credential"
+                )
+                return False
+            if not cred_path.exists():
+                lib_logger.warning(f"Credential file does not exist: {credential_path}")
+                return False
+            # Remove from cache if present
+            self._credentials_cache.pop(credential_path, None)
+            # Delete the file
+            cred_path.unlink()
+            lib_logger.info(f"Deleted credential file: {credential_path}")
+            return True
+        except Exception as e:
+            lib_logger.error(f"Failed to delete credential: {e}")
+            return False
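
The export helpers above rely on two small conventions: the credential number is taken from the `_oauth_<n>.json` filename suffix, and the `.env` filename is derived from a sanitized email address. The following is a minimal standalone sketch of those conventions, not the library's own code. The function names are hypothetical, and the `IFLOW_<n>` prefix is an assumption, since the line that computes `prefix` falls outside the hunk shown here.

```python
import re
from typing import Any, Dict, List


def parse_cred_number(filename: str, default: int = 1) -> int:
    """Read the numeric suffix from names like 'iflow_oauth_3.json'."""
    match = re.search(r"_oauth_(\d+)\.json$", filename)
    return int(match.group(1)) if match else default


def env_filename(cred_number: int, email: str) -> str:
    """Mirror the filename sanitization used by export_credential_to_env."""
    safe_email = email.replace("@", "_at_").replace(".", "_")
    return f"iflow_{cred_number}_{safe_email}.env"


def build_env_lines_sketch(creds: Dict[str, Any], cred_number: int) -> List[str]:
    """Build .env lines for one credential (hypothetical helper)."""
    # Assumption: the real prefix is computed outside the diff lines shown,
    # so "IFLOW_<n>" is used here purely for illustration.
    prefix = f"IFLOW_{cred_number}"
    email = creds.get("email", "unknown")
    return [
        f"# IFLOW Credential #{cred_number} for: {email}",
        "",
        f"{prefix}_ACCESS_TOKEN={creds.get('access_token', '')}",
        f"{prefix}_REFRESH_TOKEN={creds.get('refresh_token', '')}",
        f"{prefix}_API_KEY={creds.get('api_key', '')}",
        f"{prefix}_EXPIRY_DATE={creds.get('expiry_date', '')}",
        f"{prefix}_EMAIL={email}",
        f"{prefix}_TOKEN_TYPE={creds.get('token_type', 'Bearer')}",
        f"{prefix}_SCOPE={creds.get('scope', 'read write')}",
    ]


if __name__ == "__main__":
    number = parse_cred_number("iflow_oauth_2.json")
    print(env_filename(number, "user@example.com"))
    print("\n".join(build_env_lines_sketch({"email": "user@example.com"}, number)))
```

Note that the diff uses different fallbacks when the filename does not match: `export_credential_to_env` falls back to 1, while `list_credentials` falls back to 0. The `default` parameter in the sketch makes that choice explicit.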

src/rotator_library/providers/iflow_provider.py CHANGED

@@ -10,19 +10,27 @@ from typing import Union, AsyncGenerator, List, Dict, Any
 from .provider_interface import ProviderInterface
 from .iflow_auth_base import IFlowAuthBase
 from ..model_definitions import ModelDefinitions
 import litellm
 from litellm.exceptions import RateLimitError, AuthenticationError
 from pathlib import Path
 import uuid
 from datetime import datetime
-lib_logger = logging.getLogger('rotator_library')
-LOGS_DIR = Path(__file__).resolve().parent.parent.parent.parent / "logs"
-IFLOW_LOGS_DIR = LOGS_DIR / "iflow_logs"
 class _IFlowFileLogger:
     """A simple file logger for a single iFlow transaction."""
     def __init__(self, model_name: str, enabled: bool = True):
         self.enabled = enabled
         if not self.enabled:
@@ -31,8 +39,10 @@ class _IFlowFileLogger:
         timestamp = datetime.now().strftime("%Y%m%d_%H%M%S_%f")
         request_id = str(uuid.uuid4())
         # Sanitize model name for directory
-        safe_model_name = model_name.replace('/', '_').replace(':', '_')
-        self.log_dir = IFLOW_LOGS_DIR / f"{timestamp}_{safe_model_name}_{request_id}"
         try:
             self.log_dir.mkdir(parents=True, exist_ok=True)
         except Exception as e:
@@ -41,16 +51,20 @@ class _IFlowFileLogger:
     def log_request(self, payload: Dict[str, Any]):
         """Logs the request payload sent to iFlow."""
-        if not self.enabled: return
         try:
-            with open(self.log_dir / "request_payload.json", "w", encoding="utf-8") as f:
                 json.dump(payload, f, indent=2, ensure_ascii=False)
         except Exception as e:
             lib_logger.error(f"_IFlowFileLogger: Failed to write request: {e}")
     def log_response_chunk(self, chunk: str):
         """Logs a raw chunk from the iFlow response stream."""
-        if not self.enabled: return
         try:
             with open(self.log_dir / "response_stream.log", "a", encoding="utf-8") as f:
                 f.write(chunk + "\n")
@@ -59,7 +73,8 @@ class _IFlowFileLogger:
     def log_error(self, error_message: str):
         """Logs an error message."""
-        if not self.enabled: return
         try:
             with open(self.log_dir / "error.log", "a", encoding="utf-8") as f:
                 f.write(f"[{datetime.utcnow().isoformat()}] {error_message}\n")
@@ -68,13 +83,15 @@ class _IFlowFileLogger:
     def log_final_response(self, response_data: Dict[str, Any]):
         """Logs the final, reassembled response."""
-        if not self.enabled: return
         try:
             with open(self.log_dir / "final_response.json", "w", encoding="utf-8") as f:
                 json.dump(response_data, f, indent=2, ensure_ascii=False)
         except Exception as e:
             lib_logger.error(f"_IFlowFileLogger: Failed to write final response: {e}")
 # Model list can be expanded as iFlow supports more models
 HARDCODED_MODELS = [
     "glm-4.6",
@@ -90,14 +107,25 @@ HARDCODED_MODELS = [
     "deepseek-v3",
     "qwen3-vl-plus",
     "qwen3-235b-a22b-instruct",
-    "qwen3-235b"
 ]
 # OpenAI-compatible parameters supported by iFlow API
 SUPPORTED_PARAMS = {
-    'model', 'messages', 'temperature', 'top_p', 'max_tokens',
-    'stream', 'tools', 'tool_choice', 'presence_penalty',
-    'frequency_penalty', 'n', 'stop', 'seed', 'response_format'
 }
@@ -106,6 +134,7 @@ class IFlowProvider(IFlowAuthBase, ProviderInterface):
     iFlow provider using OAuth authentication with local callback server.
     API requests use the derived API key (NOT OAuth access_token).
     """
     skip_cost_calculation = True
     def __init__(self):
@@ -128,7 +157,9 @@ class IFlowProvider(IFlowAuthBase, ProviderInterface):
         Validates OAuth credentials if applicable.
         """
         models = []
-        env_var_ids = set()  # Track IDs from env vars to prevent hardcoded/dynamic duplicates
         def extract_model_id(item) -> str:
             """Extract model ID from various formats (dict, string with/without provider prefix)."""
@@ -154,7 +185,9 @@ class IFlowProvider(IFlowAuthBase, ProviderInterface):
                 # Track the ID to prevent hardcoded/dynamic duplicates
                 if model_id:
                     env_var_ids.add(model_id)
-            lib_logger.info(f"Loaded {len(static_models)} static models for iflow from environment variables")
         # Source 2: Add hardcoded models (only if ID not already in env vars)
         for model_id in HARDCODED_MODELS:
@@ -172,14 +205,17 @@ class IFlowProvider(IFlowAuthBase, ProviderInterface):
             models_url = f"{api_base.rstrip('/')}/models"
             response = await client.get(
-                models_url,
-                headers={"Authorization": f"Bearer {api_key}"}
             )
             response.raise_for_status()
             dynamic_data = response.json()
             # Handle both {data: [...]} and direct [...] formats
-            model_list = dynamic_data.get("data", dynamic_data) if isinstance(dynamic_data, dict) else dynamic_data
             dynamic_count = 0
             for model in model_list:
@@ -190,7 +226,9 @@ class IFlowProvider(IFlowAuthBase, ProviderInterface):
                     dynamic_count += 1
             if dynamic_count > 0:
-                lib_logger.debug(f"Discovered {dynamic_count} additional models for iflow from API")
         except Exception as e:
             # Silently ignore dynamic discovery errors
@@ -255,7 +293,7 @@ class IFlowProvider(IFlowAuthBase, ProviderInterface):
         payload = {k: v for k, v in kwargs.items() if k in SUPPORTED_PARAMS}
         # Always force streaming for internal processing
-        payload['stream'] = True
         # NOTE: iFlow API does not support stream_options parameter
         # Unlike other providers, we don't include it to avoid HTTP 406 errors
@@ -264,16 +302,22 @@ class IFlowProvider(IFlowAuthBase, ProviderInterface):
         if "tools" in payload and payload["tools"]:
             payload["tools"] = self._clean_tool_schemas(payload["tools"])
             lib_logger.debug(f"Cleaned {len(payload['tools'])} tool schemas")
-        elif "tools" in payload and isinstance(payload["tools"], list) and len(payload["tools"]) == 0:
             # Inject dummy tool for empty arrays to prevent streaming issues (similar to Qwen's behavior)
-            payload["tools"] = [{
-                "type": "function",
-                "function": {
-                    "name": "noop",
-                    "description": "Placeholder tool to stabilise streaming",
-                    "parameters": {"type": "object"}
                 }
-            }]
             lib_logger.debug("Injected placeholder tool for empty tools array")
         return payload
@@ -282,7 +326,7 @@ class IFlowProvider(IFlowAuthBase, ProviderInterface):
         """
         Converts a raw iFlow SSE chunk to an OpenAI-compatible chunk.
         Since iFlow is OpenAI-compatible, minimal conversion is needed.
         CRITICAL FIX: Handle chunks with BOTH usage and choices (final chunk)
         without early return to ensure finish_reason is properly processed.
         """
@@ -302,32 +346,36 @@ class IFlowProvider(IFlowAuthBase, ProviderInterface):
                 "model": model_id,
                 "object": "chat.completion.chunk",
                 "id": chunk.get("id", f"chatcmpl-iflow-{time.time()}"),
-                "created": chunk.get("created", int(time.time()))
             }
             # Then yield the usage chunk
             yield {
-                "choices": [], "model": model_id, "object": "chat.completion.chunk",
                 "id": chunk.get("id", f"chatcmpl-iflow-{time.time()}"),
                 "created": chunk.get("created", int(time.time())),
                 "usage": {
                     "prompt_tokens": usage_data.get("prompt_tokens", 0),
                     "completion_tokens": usage_data.get("completion_tokens", 0),
                     "total_tokens": usage_data.get("total_tokens", 0),
-                }
             }
             return
         # Handle usage-only chunks
         if usage_data:
             yield {
-                "choices": [], "model": model_id, "object": "chat.completion.chunk",
                 "id": chunk.get("id", f"chatcmpl-iflow-{time.time()}"),
                 "created": chunk.get("created", int(time.time())),
                 "usage": {
                     "prompt_tokens": usage_data.get("prompt_tokens", 0),
                     "completion_tokens": usage_data.get("completion_tokens", 0),
                     "total_tokens": usage_data.get("total_tokens", 0),
-                }
             }
             return
@@ -339,13 +387,15 @@ class IFlowProvider(IFlowAuthBase, ProviderInterface):
                 "model": model_id,
                 "object": "chat.completion.chunk",
                 "id": chunk.get("id", f"chatcmpl-iflow-{time.time()}"),
-                "created": chunk.get("created", int(time.time()))
             }
-    def _stream_to_completion_response(self, chunks: List[litellm.ModelResponse]) -> litellm.ModelResponse:
         """
         Manually reassembles streaming chunks into a complete response.
         Key improvements:
         - Determines finish_reason based on accumulated state (tool_calls vs stop)
         - Properly initializes tool_calls with type field
@@ -358,14 +408,16 @@ class IFlowProvider(IFlowAuthBase, ProviderInterface):
         final_message = {"role": "assistant"}
         aggregated_tool_calls = {}
         usage_data = None
-        chunk_finish_reason = None  # Track finish_reason from chunks (but we'll override)
         # Get the first chunk for basic response metadata
         first_chunk = chunks[0]
         # Process each chunk to aggregate content
         for chunk in chunks:
-            if not hasattr(chunk, 'choices') or not chunk.choices:
                 continue
             choice = chunk.choices[0]
@@ -389,25 +441,48 @@ class IFlowProvider(IFlowAuthBase, ProviderInterface):
                     index = tc_chunk.get("index", 0)
                     if index not in aggregated_tool_calls:
                         # Initialize with type field for OpenAI compatibility
-                        aggregated_tool_calls[index] = {"type": "function", "function": {"name": "", "arguments": ""}}
                     if "id" in tc_chunk:
                         aggregated_tool_calls[index]["id"] = tc_chunk["id"]
                     if "type" in tc_chunk:
                         aggregated_tool_calls[index]["type"] = tc_chunk["type"]
                     if "function" in tc_chunk:
-                        if "name" in tc_chunk["function"] and tc_chunk["function"]["name"] is not None:
-                            aggregated_tool_calls[index]["function"]["name"] += tc_chunk["function"]["name"]
-                        if "arguments" in tc_chunk["function"] and tc_chunk["function"]["arguments"] is not None:
-                            aggregated_tool_calls[index]["function"]["arguments"] += tc_chunk["function"]["arguments"]
             # Aggregate function calls (legacy format)
             if "function_call" in delta and delta["function_call"] is not None:
                 if "function_call" not in final_message:
                     final_message["function_call"] = {"name": "", "arguments": ""}
-                if "name" in delta["function_call"] and delta["function_call"]["name"] is not None:
-                    final_message["function_call"]["name"] += delta["function_call"]["name"]
-                if "arguments" in delta["function_call"] and delta["function_call"]["arguments"] is not None:
-                    final_message["function_call"]["arguments"] += delta["function_call"]["arguments"]
             # Track finish_reason from chunks (for reference only)
             if choice.get("finish_reason"):
@@ -415,7 +490,7 @@ class IFlowProvider(IFlowAuthBase, ProviderInterface):
         # Handle usage data from the last chunk that has it
         for chunk in reversed(chunks):
-            if hasattr(chunk, 'usage') and chunk.usage:
                 usage_data = chunk.usage
                 break
@@ -441,7 +516,7 @@ class IFlowProvider(IFlowAuthBase, ProviderInterface):
         final_choice = {
             "index": 0,
             "message": final_message,
-            "finish_reason": finish_reason
         }
         # Create the final ModelResponse
@@ -451,21 +526,20 @@ class IFlowProvider(IFlowAuthBase, ProviderInterface):
             "created": first_chunk.created,
             "model": first_chunk.model,
             "choices": [final_choice],
-            "usage": usage_data
         }
         return litellm.ModelResponse(**final_response_data)
-    async def acompletion(self, client: httpx.AsyncClient, **kwargs) -> Union[litellm.ModelResponse, AsyncGenerator[litellm.ModelResponse, None]]:
         credential_path = kwargs.pop("credential_identifier")
         enable_request_logging = kwargs.pop("enable_request_logging", False)
         model = kwargs["model"]
         # Create dedicated file logger for this request
-        file_logger = _IFlowFileLogger(
-            model_name=model,
-            enabled=enable_request_logging
-        )
         async def make_request():
             """Prepares and makes the actual API call."""
@@ -473,8 +547,8 @@ class IFlowProvider(IFlowAuthBase, ProviderInterface):
             api_base, api_key = await self.get_api_details(credential_path)
             # Strip provider prefix from model name (e.g., "iflow/Qwen3-Coder-Plus" -> "Qwen3-Coder-Plus")
-            model_name = model.split('/')[-1]
-            kwargs_with_stripped_model = {**kwargs, 'model': model_name}
             # Build clean payload with only supported parameters
             payload = self._build_request_payload(**kwargs_with_stripped_model)
@@ -483,7 +557,7 @@ class IFlowProvider(IFlowAuthBase, ProviderInterface):
                 "Authorization": f"Bearer {api_key}",  # Uses api_key from user info
                 "Content-Type": "application/json",
                 "Accept": "text/event-stream",
-                "User-Agent": "iFlow-Cli"
             }
             url = f"{api_base.rstrip('/')}/chat/completions"
@@ -492,7 +566,13 @@ class IFlowProvider(IFlowAuthBase, ProviderInterface):
             file_logger.log_request(payload)
             lib_logger.debug(f"iFlow Request URL: {url}")
-            return client.stream("POST", url, headers=headers, json=payload, timeout=600)
         async def stream_handler(response_stream, attempt=1):
             """Handles the streaming response and converts chunks."""
@@ -501,11 +581,17 @@ class IFlowProvider(IFlowAuthBase, ProviderInterface):
                     # Check for HTTP errors before processing stream
                     if response.status_code >= 400:
                         error_text = await response.aread()
-                        error_text = error_text.decode('utf-8') if isinstance(error_text, bytes) else error_text
                         # Handle 401: Force token refresh and retry once
                         if response.status_code == 401 and attempt == 1:
-                            lib_logger.warning("iFlow returned 401. Forcing token refresh and retrying once.")
                             await self._refresh_token(credential_path, force=True)
                             retry_stream = await make_request()
                             async for chunk in stream_handler(retry_stream, attempt=2):
@@ -513,50 +599,61 @@ class IFlowProvider(IFlowAuthBase, ProviderInterface):
                             return
                         # Handle 429: Rate limit
-                        elif response.status_code == 429 or "slow_down" in error_text.lower():
                             raise RateLimitError(
                                 f"iFlow rate limit exceeded: {error_text}",
                                 llm_provider="iflow",
                                 model=model,
-                                response=response
                             )
                         # Handle other errors
                         else:
-                            error_msg = f"iFlow HTTP {response.status_code} error: {error_text}"
                             file_logger.log_error(error_msg)
                             raise httpx.HTTPStatusError(
                                 f"HTTP {response.status_code}: {error_text}",
                                 request=response.request,
-                                response=response
                             )
                     # Process successful streaming response
                     async for line in response.aiter_lines():
                         file_logger.log_response_chunk(line)
                         # CRITICAL FIX: Handle both "data:" (no space) and "data: " (with space)
-                        if line.startswith('data:'):
                             # Extract data after "data:" prefix, handling both formats
-                            if line.startswith('data: '):
                                 data_str = line[6:]  # Skip "data: "
                             else:
                                 data_str = line[5:]  # Skip "data:"
                             if data_str.strip() == "[DONE]":
                                 break
                             try:
                                 chunk = json.loads(data_str)
-                                for openai_chunk in self._convert_chunk_to_openai(chunk, model):
                                     yield litellm.ModelResponse(**openai_chunk)
                             except json.JSONDecodeError:
-                                lib_logger.warning(f"Could not decode JSON from iFlow: {line}")
             except httpx.HTTPStatusError:
                 raise  # Re-raise HTTP errors we already handled
             except Exception as e:
                 file_logger.log_error(f"Error during iFlow stream processing: {e}")
-                lib_logger.error(f"Error during iFlow stream processing: {e}", exc_info=True)
                 raise
         async def logging_stream_wrapper():
@@ -574,7 +671,9 @@ class IFlowProvider(IFlowAuthBase, ProviderInterface):
         if kwargs.get("stream"):
             return logging_stream_wrapper()
         else:
             async def non_stream_wrapper():
                 chunks = [chunk async for chunk in logging_stream_wrapper()]
                 return self._stream_to_completion_response(chunks)
             return await non_stream_wrapper()
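
Two pieces of the streaming logic in this file can be exercised without a network connection: the tolerant `data:` / `data: ` prefix handling in `stream_handler`, and the per-index tool-call aggregation in `_stream_to_completion_response`. The sketch below reproduces both on plain dicts and strings. It is an illustration under stated assumptions rather than the provider's code: the names `iter_sse_json` and `merge_tool_call_deltas` are hypothetical, the real methods operate on `litellm.ModelResponse` objects, and the real handler logs a warning for undecodable lines where this sketch simply skips them.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List


def iter_sse_json(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Yield decoded JSON payloads from SSE lines until '[DONE]'."""
    for line in lines:
        if not line.startswith("data:"):
            continue  # comments, blank keep-alives, other SSE fields
        # Accept both "data: {...}" and "data:{...}".
        data_str = line[6:] if line.startswith("data: ") else line[5:]
        if data_str.strip() == "[DONE]":
            break
        try:
            yield json.loads(data_str)
        except json.JSONDecodeError:
            continue  # skipped here; the diff logs a warning instead


def merge_tool_call_deltas(deltas: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Concatenate streamed tool-call fragments, grouped by 'index'."""
    merged: Dict[int, Dict[str, Any]] = {}
    for tc in deltas:
        index = tc.get("index", 0)
        # Start every slot with a 'type' field for OpenAI compatibility.
        slot = merged.setdefault(
            index, {"type": "function", "function": {"name": "", "arguments": ""}}
        )
        if "id" in tc:
            slot["id"] = tc["id"]
        if "type" in tc:
            slot["type"] = tc["type"]
        fn = tc.get("function") or {}
        if fn.get("name") is not None:
            slot["function"]["name"] += fn["name"]
        if fn.get("arguments") is not None:
            slot["function"]["arguments"] += fn["arguments"]
    return [merged[i] for i in sorted(merged)]


if __name__ == "__main__":
    stream = [
        'data: {"delta": {"index": 0, "id": "call_1", "function": {"name": "lookup", "arguments": "{\\"q\\":"}}}',
        'data:{"delta": {"index": 0, "function": {"name": null, "arguments": " \\"x\\"}"}}}',
        "data: [DONE]",
    ]
    deltas = [event["delta"] for event in iter_sse_json(stream)]
    print(merge_tool_call_deltas(deltas))
```

Grouping by `index` rather than by `id` matters because, as the diff's aggregation code anticipates, later fragments of the same call may omit the `id` and carry only more `arguments` text.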

 from .provider_interface import ProviderInterface
 from .iflow_auth_base import IFlowAuthBase
 from ..model_definitions import ModelDefinitions
+from ..timeout_config import TimeoutConfig
+from ..utils.paths import get_logs_dir
 import litellm
 from litellm.exceptions import RateLimitError, AuthenticationError
 from pathlib import Path
 import uuid
 from datetime import datetime
+lib_logger = logging.getLogger("rotator_library")
+def _get_iflow_logs_dir() -> Path:
+    """Get the iFlow logs directory."""
+    logs_dir = get_logs_dir() / "iflow_logs"
+    logs_dir.mkdir(parents=True, exist_ok=True)
+    return logs_dir
 class _IFlowFileLogger:
     """A simple file logger for a single iFlow transaction."""
     def __init__(self, model_name: str, enabled: bool = True):
         self.enabled = enabled
         if not self.enabled:
         timestamp = datetime.now().strftime("%Y%m%d_%H%M%S_%f")
         request_id = str(uuid.uuid4())
         # Sanitize model name for directory
+        safe_model_name = model_name.replace("/", "_").replace(":", "_")
+        self.log_dir = (
+            _get_iflow_logs_dir() / f"{timestamp}_{safe_model_name}_{request_id}"
+        )
         try:
             self.log_dir.mkdir(parents=True, exist_ok=True)
         except Exception as e:
     def log_request(self, payload: Dict[str, Any]):
         """Logs the request payload sent to iFlow."""
+        if not self.enabled:
+            return
         try:
+            with open(
+                self.log_dir / "request_payload.json", "w", encoding="utf-8"
+            ) as f:
                 json.dump(payload, f, indent=2, ensure_ascii=False)
         except Exception as e:
             lib_logger.error(f"_IFlowFileLogger: Failed to write request: {e}")
     def log_response_chunk(self, chunk: str):
         """Logs a raw chunk from the iFlow response stream."""
+        if not self.enabled:
+            return
         try:
             with open(self.log_dir / "response_stream.log", "a", encoding="utf-8") as f:
                 f.write(chunk + "\n")
     def log_error(self, error_message: str):
         """Logs an error message."""
+        if not self.enabled:
+            return
         try:
             with open(self.log_dir / "error.log", "a", encoding="utf-8") as f:
                 f.write(f"[{datetime.utcnow().isoformat()}] {error_message}\n")
     def log_final_response(self, response_data: Dict[str, Any]):
         """Logs the final, reassembled response."""
+        if not self.enabled:
+            return
         try:
             with open(self.log_dir / "final_response.json", "w", encoding="utf-8") as f:
                 json.dump(response_data, f, indent=2, ensure_ascii=False)
         except Exception as e:
             lib_logger.error(f"_IFlowFileLogger: Failed to write final response: {e}")
 # Model list can be expanded as iFlow supports more models
 HARDCODED_MODELS = [
     "glm-4.6",
     "deepseek-v3",
     "qwen3-vl-plus",
     "qwen3-235b-a22b-instruct",
+    "qwen3-235b",
 ]
 # OpenAI-compatible parameters supported by iFlow API
 SUPPORTED_PARAMS = {
+    "model",
+    "messages",
+    "temperature",
+    "top_p",
+    "max_tokens",
+    "stream",
+    "tools",
+    "tool_choice",
+    "presence_penalty",
+    "frequency_penalty",
+    "n",
+    "stop",
+    "seed",
+    "response_format",
 }
     iFlow provider using OAuth authentication with local callback server.
     API requests use the derived API key (NOT OAuth access_token).
     """
     skip_cost_calculation = True
     def __init__(self):
         Validates OAuth credentials if applicable.
         """
         models = []
+        env_var_ids = (
+            set()
+        )  # Track IDs from env vars to prevent hardcoded/dynamic duplicates
         def extract_model_id(item) -> str:
             """Extract model ID from various formats (dict, string with/without provider prefix)."""
                 # Track the ID to prevent hardcoded/dynamic duplicates
                 if model_id:
                     env_var_ids.add(model_id)
+            lib_logger.info(
+                f"Loaded {len(static_models)} static models for iflow from environment variables"
+            )
         # Source 2: Add hardcoded models (only if ID not already in env vars)
         for model_id in HARDCODED_MODELS:
             models_url = f"{api_base.rstrip('/')}/models"
             response = await client.get(
+                models_url, headers={"Authorization": f"Bearer {api_key}"}
             )
             response.raise_for_status()
             dynamic_data = response.json()
             # Handle both {data: [...]} and direct [...] formats
+            model_list = (
+                dynamic_data.get("data", dynamic_data)
+                if isinstance(dynamic_data, dict)
+                else dynamic_data
+            )
             dynamic_count = 0
             for model in model_list:
                     dynamic_count += 1
             if dynamic_count > 0:
+                lib_logger.debug(
+                    f"Discovered {dynamic_count} additional models for iflow from API"
+                )
         except Exception as e:
             # Silently ignore dynamic discovery errors
         payload = {k: v for k, v in kwargs.items() if k in SUPPORTED_PARAMS}
         # Always force streaming for internal processing
+        payload["stream"] = True
         # NOTE: iFlow API does not support stream_options parameter
         # Unlike other providers, we don't include it to avoid HTTP 406 errors
         if "tools" in payload and payload["tools"]:
             payload["tools"] = self._clean_tool_schemas(payload["tools"])
             lib_logger.debug(f"Cleaned {len(payload['tools'])} tool schemas")
+        elif (
+            "tools" in payload
+            and isinstance(payload["tools"], list)
+            and len(payload["tools"]) == 0
+        ):
             # Inject dummy tool for empty arrays to prevent streaming issues (similar to Qwen's behavior)
+            payload["tools"] = [
+                {
+                    "type": "function",
+                    "function": {
+                        "name": "noop",
+                        "description": "Placeholder tool to stabilise streaming",
+                        "parameters": {"type": "object"},
+                    },
                 }
+            ]
             lib_logger.debug("Injected placeholder tool for empty tools array")
         return payload
         """
         Converts a raw iFlow SSE chunk to an OpenAI-compatible chunk.
         Since iFlow is OpenAI-compatible, minimal conversion is needed.
         CRITICAL FIX: Handle chunks with BOTH usage and choices (final chunk)
         without early return to ensure finish_reason is properly processed.
         """
                 "model": model_id,
                 "object": "chat.completion.chunk",
                 "id": chunk.get("id", f"chatcmpl-iflow-{time.time()}"),
+                "created": chunk.get("created", int(time.time())),
             }
             # Then yield the usage chunk
             yield {
+                "choices": [],
+                "model": model_id,
+                "object": "chat.completion.chunk",
                 "id": chunk.get("id", f"chatcmpl-iflow-{time.time()}"),
                 "created": chunk.get("created", int(time.time())),
                 "usage": {
                     "prompt_tokens": usage_data.get("prompt_tokens", 0),
                     "completion_tokens": usage_data.get("completion_tokens", 0),
                     "total_tokens": usage_data.get("total_tokens", 0),
+                },
             }
             return
         # Handle usage-only chunks
         if usage_data:
             yield {
+                "choices": [],
+                "model": model_id,
+                "object": "chat.completion.chunk",
                 "id": chunk.get("id", f"chatcmpl-iflow-{time.time()}"),
                 "created": chunk.get("created", int(time.time())),
                 "usage": {
                     "prompt_tokens": usage_data.get("prompt_tokens", 0),
                     "completion_tokens": usage_data.get("completion_tokens", 0),
                     "total_tokens": usage_data.get("total_tokens", 0),
+                },
             }
             return
                 "model": model_id,
                 "object": "chat.completion.chunk",
                 "id": chunk.get("id", f"chatcmpl-iflow-{time.time()}"),
+                "created": chunk.get("created", int(time.time())),
             }
+    def _stream_to_completion_response(
+        self, chunks: List[litellm.ModelResponse]
+    ) -> litellm.ModelResponse:
         """
         Manually reassembles streaming chunks into a complete response.
         Key improvements:
         - Determines finish_reason based on accumulated state (tool_calls vs stop)
         - Properly initializes tool_calls with type field
         final_message = {"role": "assistant"}
         aggregated_tool_calls = {}
         usage_data = None
+        chunk_finish_reason = (
+            None  # Track finish_reason from chunks (but we'll override)
+        )
         # Get the first chunk for basic response metadata
         first_chunk = chunks[0]
         # Process each chunk to aggregate content
         for chunk in chunks:
+            if not hasattr(chunk, "choices") or not chunk.choices:
                 continue
             choice = chunk.choices[0]
                     index = tc_chunk.get("index", 0)
                     if index not in aggregated_tool_calls:
                         # Initialize with type field for OpenAI compatibility
+                        aggregated_tool_calls[index] = {
+                            "type": "function",
+                            "function": {"name": "", "arguments": ""},
+                        }
                     if "id" in tc_chunk:
                         aggregated_tool_calls[index]["id"] = tc_chunk["id"]
                     if "type" in tc_chunk:
                         aggregated_tool_calls[index]["type"] = tc_chunk["type"]
                     if "function" in tc_chunk:
+                        if (
+                            "name" in tc_chunk["function"]
+                            and tc_chunk["function"]["name"] is not None
+                        ):
+                            aggregated_tool_calls[index]["function"]["name"] += (
+                                tc_chunk["function"]["name"]
+                            )
+                        if (
+                            "arguments" in tc_chunk["function"]
+                            and tc_chunk["function"]["arguments"] is not None
+                        ):
+                            aggregated_tool_calls[index]["function"]["arguments"] += (
+                                tc_chunk["function"]["arguments"]
+                            )
             # Aggregate function calls (legacy format)
             if "function_call" in delta and delta["function_call"] is not None:
                 if "function_call" not in final_message:
                     final_message["function_call"] = {"name": "", "arguments": ""}
+                if (
+                    "name" in delta["function_call"]
+                    and delta["function_call"]["name"] is not None
+                ):
+                    final_message["function_call"]["name"] += delta["function_call"][
+                        "name"
+                    ]
+                if (
+                    "arguments" in delta["function_call"]
+                    and delta["function_call"]["arguments"] is not None
+                ):
+                    final_message["function_call"]["arguments"] += delta[
+                        "function_call"
+                    ]["arguments"]
             # Track finish_reason from chunks (for reference only)
             if choice.get("finish_reason"):
         # Handle usage data from the last chunk that has it
         for chunk in reversed(chunks):
+            if hasattr(chunk, "usage") and chunk.usage:
                 usage_data = chunk.usage
                 break
         final_choice = {
             "index": 0,
             "message": final_message,
+            "finish_reason": finish_reason,
         }
         # Create the final ModelResponse
             "created": first_chunk.created,
             "model": first_chunk.model,
             "choices": [final_choice],
+            "usage": usage_data,
         }
         return litellm.ModelResponse(**final_response_data)
+    async def acompletion(
+        self, client: httpx.AsyncClient, **kwargs
+    ) -> Union[litellm.ModelResponse, AsyncGenerator[litellm.ModelResponse, None]]:
         credential_path = kwargs.pop("credential_identifier")
         enable_request_logging = kwargs.pop("enable_request_logging", False)
         model = kwargs["model"]
         # Create dedicated file logger for this request
+        file_logger = _IFlowFileLogger(model_name=model, enabled=enable_request_logging)
         async def make_request():
             """Prepares and makes the actual API call."""
             api_base, api_key = await self.get_api_details(credential_path)
             # Strip provider prefix from model name (e.g., "iflow/Qwen3-Coder-Plus" -> "Qwen3-Coder-Plus")
+            model_name = model.split("/")[-1]
+            kwargs_with_stripped_model = {**kwargs, "model": model_name}
             # Build clean payload with only supported parameters
             payload = self._build_request_payload(**kwargs_with_stripped_model)
                 "Authorization": f"Bearer {api_key}",  # Uses api_key from user info
                 "Content-Type": "application/json",
                 "Accept": "text/event-stream",
+                "User-Agent": "iFlow-Cli",
             }
             url = f"{api_base.rstrip('/')}/chat/completions"
             file_logger.log_request(payload)
             lib_logger.debug(f"iFlow Request URL: {url}")
+            return client.stream(
+                "POST",
+                url,
+                headers=headers,
+                json=payload,
+                timeout=TimeoutConfig.streaming(),
+            )
         async def stream_handler(response_stream, attempt=1):
             """Handles the streaming response and converts chunks."""
                     # Check for HTTP errors before processing stream
                     if response.status_code >= 400:
                         error_text = await response.aread()
+                        error_text = (
+                            error_text.decode("utf-8")
+                            if isinstance(error_text, bytes)
+                            else error_text
+                        )
                         # Handle 401: Force token refresh and retry once
                         if response.status_code == 401 and attempt == 1:
+                            lib_logger.warning(
+                                "iFlow returned 401. Forcing token refresh and retrying once."
+                            )
                             await self._refresh_token(credential_path, force=True)
                             retry_stream = await make_request()
                             async for chunk in stream_handler(retry_stream, attempt=2):
                             return
                         # Handle 429: Rate limit
+                        elif (
+                            response.status_code == 429
+                            or "slow_down" in error_text.lower()
+                        ):
                             raise RateLimitError(
                                 f"iFlow rate limit exceeded: {error_text}",
                                 llm_provider="iflow",
                                 model=model,
+                                response=response,
                             )
                         # Handle other errors
                         else:
+                            error_msg = (
+                                f"iFlow HTTP {response.status_code} error: {error_text}"
+                            )
                             file_logger.log_error(error_msg)
                             raise httpx.HTTPStatusError(
                                 f"HTTP {response.status_code}: {error_text}",
                                 request=response.request,
+                                response=response,
                             )
                     # Process successful streaming response
                     async for line in response.aiter_lines():
                         file_logger.log_response_chunk(line)
                         # CRITICAL FIX: Handle both "data:" (no space) and "data: " (with space)
+                        if line.startswith("data:"):
                             # Extract data after "data:" prefix, handling both formats
+                            if line.startswith("data: "):
                                 data_str = line[6:]  # Skip "data: "
                             else:
                                 data_str = line[5:]  # Skip "data:"
                             if data_str.strip() == "[DONE]":
                                 break
                             try:
                                 chunk = json.loads(data_str)
+                                for openai_chunk in self._convert_chunk_to_openai(
+                                    chunk, model
+                                ):
                                     yield litellm.ModelResponse(**openai_chunk)
                             except json.JSONDecodeError:
+                                lib_logger.warning(
+                                    f"Could not decode JSON from iFlow: {line}"
+                                )
             except httpx.HTTPStatusError:
                 raise  # Re-raise HTTP errors we already handled
             except Exception as e:
                 file_logger.log_error(f"Error during iFlow stream processing: {e}")
+                lib_logger.error(
+                    f"Error during iFlow stream processing: {e}", exc_info=True
+                )
                 raise
         async def logging_stream_wrapper():
         if kwargs.get("stream"):
             return logging_stream_wrapper()
         else:
             async def non_stream_wrapper():
                 chunks = [chunk async for chunk in logging_stream_wrapper()]
                 return self._stream_to_completion_response(chunks)
             return await non_stream_wrapper()

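The stream handler in the iflow_provider.py diff above accepts both `data:` and `data: ` prefixes and stops at the `[DONE]` sentinel. A minimal standalone sketch of that parsing step follows. The helper name `iter_sse_json` is hypothetical and not part of the commit. It strips whitespace after the prefix instead of slicing at index 5 or 6, and it silently skips undecodable lines where the commit logs a warning.

```python
import json
from typing import Iterable, Iterator


def iter_sse_json(lines: Iterable[str]) -> Iterator[dict]:
    """Yield decoded JSON payloads from SSE lines until the [DONE] sentinel.

    Hypothetical helper for illustration; not the provider's actual code.
    """
    for line in lines:
        if not line.startswith("data:"):
            continue  # ignore comments, keep-alives, and other SSE fields
        # Drop the field name, then surrounding whitespace, so that
        # "data:{...}" and "data: {...}" are handled identically.
        data_str = line[len("data:"):].strip()
        if data_str == "[DONE]":
            break  # end-of-stream marker
        try:
            yield json.loads(data_str)
        except json.JSONDecodeError:
            continue  # the commit logs a warning here; this sketch skips


if __name__ == "__main__":
    sample = ['data:{"a": 1}', 'data: {"a": 2}', ": ping", "data: [DONE]"]
    for payload in iter_sse_json(sample):
        print(payload)
```

Normalizing the prefix in one place keeps the two server formats from diverging further down the pipeline.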
src/rotator_library/providers/provider_cache.py CHANGED

@@ -20,19 +20,20 @@ import asyncio
 import json
 import logging
 import os
-import shutil
-import tempfile
 import time
 from pathlib import Path
 from typing import Any, Dict, Optional, Tuple
-lib_logger = logging.getLogger('rotator_library')
 # =============================================================================
 # UTILITY FUNCTIONS
 # =============================================================================
 def _env_bool(key: str, default: bool = False) -> bool:
     """Get boolean from environment variable."""
     return os.getenv(key, str(default).lower()).lower() in ("true", "1", "yes")
@@ -47,18 +48,19 @@ def _env_int(key: str, default: int) -> int:
 # PROVIDER CACHE CLASS
 # =============================================================================
 class ProviderCache:
     """
     Server-side cache for provider conversation state preservation.
     A generic, modular cache supporting any key-value data that providers need
     to persist across requests. Features:
     - Dual-TTL system: configurable memory TTL, longer disk TTL
     - Async disk persistence with batched writes
     - Background cleanup task for expired entries
     - Statistics tracking (hits, misses, writes)
     Args:
         cache_file: Path to disk cache file
         memory_ttl_seconds: In-memory entry lifetime (default: 1 hour)
@@ -67,13 +69,13 @@ class ProviderCache:
         write_interval: Seconds between background disk writes (default: 60)
         cleanup_interval: Seconds between expired entry cleanup (default: 30 min)
         env_prefix: Environment variable prefix for configuration overrides
     Environment Variables (with default prefix "PROVIDER_CACHE"):
         {PREFIX}_ENABLE: Enable/disable disk persistence
         {PREFIX}_WRITE_INTERVAL: Background write interval in seconds
         {PREFIX}_CLEANUP_INTERVAL: Cleanup interval in seconds
     """
     def __init__(
         self,
         cache_file: Path,
@@ -82,7 +84,7 @@ class ProviderCache:
         enable_disk: Optional[bool] = None,
         write_interval: Optional[int] = None,
         cleanup_interval: Optional[int] = None,
-        env_prefix: str = "PROVIDER_CACHE"
     ):
         # In-memory cache: {cache_key: (data, timestamp)}
         self._cache: Dict[str, Tuple[str, float]] = {}
@@ -90,25 +92,42 @@ class ProviderCache:
         self._disk_ttl = disk_ttl_seconds
         self._lock = asyncio.Lock()
         self._disk_lock = asyncio.Lock()
         # Disk persistence configuration
         self._cache_file = cache_file
-        self._enable_disk = enable_disk if enable_disk is not None else _env_bool(f"{env_prefix}_ENABLE", True)
         self._dirty = False
-        self._write_interval = write_interval or _env_int(f"{env_prefix}_WRITE_INTERVAL", 60)
-        self._cleanup_interval = cleanup_interval or _env_int(f"{env_prefix}_CLEANUP_INTERVAL", 1800)
         # Background tasks
         self._writer_task: Optional[asyncio.Task] = None
         self._cleanup_task: Optional[asyncio.Task] = None
         self._running = False
         # Statistics
-        self._stats = {"memory_hits": 0, "disk_hits": 0, "misses": 0, "writes": 0}
         # Metadata about this cache instance
         self._cache_name = cache_file.stem if cache_file else "unnamed"
         if self._enable_disk:
             lib_logger.debug(
                 f"ProviderCache[{self._cache_name}]: Disk enabled "
@@ -117,123 +136,120 @@ class ProviderCache:
             asyncio.create_task(self._async_init())
         else:
             lib_logger.debug(f"ProviderCache[{self._cache_name}]: Memory-only mode")
     # =========================================================================
     # INITIALIZATION
     # =========================================================================
     async def _async_init(self) -> None:
         """Async initialization: load from disk and start background tasks."""
         try:
             await self._load_from_disk()
             await self._start_background_tasks()
         except Exception as e:
-            lib_logger.error(f"ProviderCache[{self._cache_name}] async init failed: {e}")
     async def _load_from_disk(self) -> None:
         """Load cache from disk file with TTL validation."""
         if not self._enable_disk or not self._cache_file.exists():
             return
         try:
             async with self._disk_lock:
-                with open(self._cache_file, 'r', encoding='utf-8') as f:
                     data = json.load(f)
                 if data.get("version") != "1.0":
-                    lib_logger.warning(f"ProviderCache[{self._cache_name}]: Version mismatch, starting fresh")
                     return
                 now = time.time()
                 entries = data.get("entries", {})
                 loaded = expired = 0
                 for cache_key, entry in entries.items():
                     age = now - entry.get("timestamp", 0)
                     if age <= self._disk_ttl:
-                        value = entry.get("value", entry.get("signature", ""))  # Support both formats
                         if value:
                             self._cache[cache_key] = (value, entry["timestamp"])
                             loaded += 1
                     else:
                         expired += 1
                 lib_logger.debug(
                     f"ProviderCache[{self._cache_name}]: Loaded {loaded} entries ({expired} expired)"
                 )
         except json.JSONDecodeError as e:
-            lib_logger.warning(f"ProviderCache[{self._cache_name}]: File corrupted: {e}")
         except Exception as e:
             lib_logger.error(f"ProviderCache[{self._cache_name}]: Load failed: {e}")
     # =========================================================================
     # DISK PERSISTENCE
     # =========================================================================
-    async def _save_to_disk(self) -> None:
-        """Persist cache to disk using atomic write."""
         if not self._enable_disk:
-            return
-        try:
-            async with self._disk_lock:
-                self._cache_file.parent.mkdir(parents=True, exist_ok=True)
-                cache_data = {
-                    "version": "1.0",
-                    "memory_ttl_seconds": self._memory_ttl,
-                    "disk_ttl_seconds": self._disk_ttl,
-                    "entries": {
-                        key: {"value": val, "timestamp": ts}
-                        for key, (val, ts) in self._cache.items()
-                    },
-                    "statistics": {
-                        "total_entries": len(self._cache),
-                        "last_write": time.time(),
-                        **self._stats
-                    }
-                }
-                # Atomic write using temp file
-                parent_dir = self._cache_file.parent
-                tmp_fd, tmp_path = tempfile.mkstemp(dir=parent_dir, prefix='.tmp_', suffix='.json')
-                try:
-                    with os.fdopen(tmp_fd, 'w', encoding='utf-8') as f:
-                        json.dump(cache_data, f, indent=2)
-                    # Set restrictive permissions (if supported)
-                    try:
-                        os.chmod(tmp_path, 0o600)
-                    except (OSError, AttributeError):
-                        pass
-                    shutil.move(tmp_path, self._cache_file)
-                    self._stats["writes"] += 1
-                    lib_logger.debug(
-                        f"ProviderCache[{self._cache_name}]: Saved {len(self._cache)} entries"
-                    )
-                except Exception:
-                    if tmp_path and os.path.exists(tmp_path):
-                        os.unlink(tmp_path)
-                    raise
-        except Exception as e:
-            lib_logger.error(f"ProviderCache[{self._cache_name}]: Disk save failed: {e}")
     # =========================================================================
     # BACKGROUND TASKS
     # =========================================================================
     async def _start_background_tasks(self) -> None:
         """Start background writer and cleanup tasks."""
         if not self._enable_disk or self._running:
             return
         self._running = True
         self._writer_task = asyncio.create_task(self._writer_loop())
         self._cleanup_task = asyncio.create_task(self._cleanup_loop())
         lib_logger.debug(f"ProviderCache[{self._cache_name}]: Started background tasks")
     async def _writer_loop(self) -> None:
         """Background task: periodically flush dirty cache to disk."""
         try:
@@ -241,13 +257,17 @@ class ProviderCache:
                 await asyncio.sleep(self._write_interval)
                 if self._dirty:
                     try:
-                        await self._save_to_disk()
-                        self._dirty = False
                     except Exception as e:
-                        lib_logger.error(f"ProviderCache[{self._cache_name}]: Writer error: {e}")
         except asyncio.CancelledError:
             pass
     async def _cleanup_loop(self) -> None:
         """Background task: periodically clean up expired entries."""
         try:
@@ -256,12 +276,14 @@ class ProviderCache:
                 await self._cleanup_expired()
         except asyncio.CancelledError:
             pass
     async def _cleanup_expired(self) -> None:
         """Remove expired entries from memory cache."""
         async with self._lock:
             now = time.time()
-            expired = [k for k, (_, ts) in self._cache.items() if now - ts > self._memory_ttl]
             for k in expired:
                 del self._cache[k]
             if expired:
@@ -269,42 +291,42 @@ class ProviderCache:
                 lib_logger.debug(
                     f"ProviderCache[{self._cache_name}]: Cleaned {len(expired)} expired entries"
                 )
     # =========================================================================
     # CORE OPERATIONS
     # =========================================================================
     def store(self, key: str, value: str) -> None:
         """
         Store a value synchronously (schedules async storage).
         Args:
             key: Cache key
             value: Value to store (typically JSON-serialized data)
         """
         asyncio.create_task(self._async_store(key, value))
     async def _async_store(self, key: str, value: str) -> None:
         """Async implementation of store."""
         async with self._lock:
             self._cache[key] = (value, time.time())
             self._dirty = True
     async def store_async(self, key: str, value: str) -> None:
         """
         Store a value asynchronously (awaitable).
         Use this when you need to ensure the value is stored before continuing.
         """
         await self._async_store(key, value)
     def retrieve(self, key: str) -> Optional[str]:
         """
         Retrieve a value by key (synchronous, with optional async disk fallback).
         Args:
             key: Cache key
         Returns:
             Cached value if found and not expired, None otherwise
         """
@@ -316,17 +338,17 @@ class ProviderCache:
             else:
                 del self._cache[key]
                 self._dirty = True
         self._stats["misses"] += 1
         if self._enable_disk:
             # Schedule async disk lookup for next time
             asyncio.create_task(self._check_disk_fallback(key))
         return None
     async def retrieve_async(self, key: str) -> Optional[str]:
         """
         Retrieve a value asynchronously (checks disk if not in memory).
         Use this when you can await and need guaranteed disk fallback.
         """
         # Check memory first
@@ -340,24 +362,24 @@ class ProviderCache:
                     if key in self._cache:
                         del self._cache[key]
                         self._dirty = True
         # Check disk
         if self._enable_disk:
             return await self._disk_retrieve(key)
         self._stats["misses"] += 1
         return None
     async def _check_disk_fallback(self, key: str) -> None:
         """Check disk for key and load into memory if found (background)."""
         try:
             if not self._cache_file.exists():
                 return
             async with self._disk_lock:
-                with open(self._cache_file, 'r', encoding='utf-8') as f:
                     data = json.load(f)
                 entries = data.get("entries", {})
                 if key in entries:
                     entry = entries[key]
@@ -372,19 +394,21 @@ class ProviderCache:
                                 f"ProviderCache[{self._cache_name}]: Loaded {key} from disk"
                             )
         except Exception as e:
-            lib_logger.debug(f"ProviderCache[{self._cache_name}]: Disk fallback failed: {e}")
     async def _disk_retrieve(self, key: str) -> Optional[str]:
         """Direct disk retrieval with loading into memory."""
         try:
             if not self._cache_file.exists():
                 self._stats["misses"] += 1
                 return None
             async with self._disk_lock:
-                with open(self._cache_file, 'r', encoding='utf-8') as f:
                     data = json.load(f)
                 entries = data.get("entries", {})
                 if key in entries:
                     entry = entries[key]
@@ -396,34 +420,37 @@ class ProviderCache:
                                 self._cache[key] = (value, ts)
                             self._stats["disk_hits"] += 1
                             return value
             self._stats["misses"] += 1
             return None
         except Exception as e:
-            lib_logger.debug(f"ProviderCache[{self._cache_name}]: Disk retrieve failed: {e}")
             self._stats["misses"] += 1
             return None
     # =========================================================================
     # UTILITY METHODS
     # =========================================================================
     def contains(self, key: str) -> bool:
         """Check if key exists in memory cache (without updating stats)."""
         if key in self._cache:
             _, timestamp = self._cache[key]
             return time.time() - timestamp <= self._memory_ttl
         return False
     def get_stats(self) -> Dict[str, Any]:
-        """Get cache statistics."""
         return {
             **self._stats,
             "memory_entries": len(self._cache),
             "dirty": self._dirty,
-            "disk_enabled": self._enable_disk
         }
     async def clear(self) -> None:
         """Clear all cached data."""
         async with self._lock:
@@ -431,12 +458,12 @@ class ProviderCache:
             self._dirty = True
         if self._enable_disk:
             await self._save_to_disk()
     async def shutdown(self) -> None:
         """Graceful shutdown: flush pending writes and stop background tasks."""
         lib_logger.info(f"ProviderCache[{self._cache_name}]: Shutting down...")
         self._running = False
         # Cancel background tasks
         for task in (self._writer_task, self._cleanup_task):
             if task:
@@ -445,11 +472,11 @@ class ProviderCache:
                     await task
                 except asyncio.CancelledError:
                     pass
         # Final save
         if self._dirty and self._enable_disk:
             await self._save_to_disk()
         lib_logger.info(
             f"ProviderCache[{self._cache_name}]: Shutdown complete "
             f"(stats: mem_hits={self._stats['memory_hits']}, "
@@ -461,38 +488,39 @@ class ProviderCache:
 # CONVENIENCE FACTORY
 # =============================================================================
 def create_provider_cache(
     name: str,
     cache_dir: Optional[Path] = None,
     memory_ttl_seconds: int = 3600,
     disk_ttl_seconds: int = 86400,
-    env_prefix: Optional[str] = None
 ) -> ProviderCache:
     """
     Factory function to create a provider cache with sensible defaults.
     Args:
         name: Cache name (used as filename and for logging)
         cache_dir: Directory for cache file (default: project_root/cache/provider_name)
         memory_ttl_seconds: In-memory TTL
         disk_ttl_seconds: Disk TTL
         env_prefix: Environment variable prefix (default: derived from name)
     Returns:
         Configured ProviderCache instance
     """
     if cache_dir is None:
         cache_dir = Path(__file__).resolve().parent.parent.parent.parent / "cache"
     cache_file = cache_dir / f"{name}.json"
     if env_prefix is None:
         # Convert name to env prefix: "gemini3_signatures" -> "GEMINI3_SIGNATURES_CACHE"
         env_prefix = f"{name.upper().replace('-', '_')}_CACHE"
     return ProviderCache(
         cache_file=cache_file,
         memory_ttl_seconds=memory_ttl_seconds,
         disk_ttl_seconds=disk_ttl_seconds,
-        env_prefix=env_prefix
     )

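The removed `_save_to_disk` body wrote the cache through a temp file and then moved it into place; the new side delegates that to `safe_write_json`, whose implementation is not shown in this part of the diff. The sketch below illustrates the general temp-file-then-rename pattern under that assumption. The function name `atomic_write_json` is hypothetical, and it uses `os.replace` rather than the `shutil.move` seen in the removed code.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Any, Dict


def atomic_write_json(path: Path, data: Dict[str, Any]) -> bool:
    """Write JSON via a temp file in the same directory, then rename.

    Hypothetical stand-in; returns True on success and False on failure,
    leaving any existing file at `path` untouched if the write fails.
    """
    tmp_path = None
    try:
        path.parent.mkdir(parents=True, exist_ok=True)
        fd, tmp_path = tempfile.mkstemp(
            dir=str(path.parent), prefix=".tmp_", suffix=".json"
        )
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(data, f, indent=2)
        try:
            os.chmod(tmp_path, 0o600)  # restrictive permissions where supported
        except OSError:
            pass
        os.replace(tmp_path, path)  # atomic when both paths share a filesystem
        return True
    except (OSError, TypeError, ValueError):
        # Clean up the partial temp file and report failure to the caller.
        if tmp_path and os.path.exists(tmp_path):
            os.unlink(tmp_path)
        return False
```

Returning a boolean instead of raising lets a caller keep its dirty flag set and retry on the next write interval, which matches how the new `_writer_loop` treats a failed save.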
 import json
 import logging
 import os
 import time
 from pathlib import Path
 from typing import Any, Dict, Optional, Tuple
+from ..utils.resilient_io import safe_write_json
+lib_logger = logging.getLogger("rotator_library")
 # =============================================================================
 # UTILITY FUNCTIONS
 # =============================================================================
 def _env_bool(key: str, default: bool = False) -> bool:
     """Get boolean from environment variable."""
     return os.getenv(key, str(default).lower()).lower() in ("true", "1", "yes")
 # PROVIDER CACHE CLASS
 # =============================================================================
 class ProviderCache:
     """
     Server-side cache for provider conversation state preservation.
     A generic, modular cache supporting any key-value data that providers need
     to persist across requests. Features:
     - Dual-TTL system: configurable memory TTL, longer disk TTL
     - Async disk persistence with batched writes
     - Background cleanup task for expired entries
     - Statistics tracking (hits, misses, writes)
     Args:
         cache_file: Path to disk cache file
         memory_ttl_seconds: In-memory entry lifetime (default: 1 hour)
         write_interval: Seconds between background disk writes (default: 60)
         cleanup_interval: Seconds between expired entry cleanup (default: 30 min)
         env_prefix: Environment variable prefix for configuration overrides
     Environment Variables (with default prefix "PROVIDER_CACHE"):
         {PREFIX}_ENABLE: Enable/disable disk persistence
         {PREFIX}_WRITE_INTERVAL: Background write interval in seconds
         {PREFIX}_CLEANUP_INTERVAL: Cleanup interval in seconds
     """
     def __init__(
         self,
         cache_file: Path,
         enable_disk: Optional[bool] = None,
         write_interval: Optional[int] = None,
         cleanup_interval: Optional[int] = None,
+        env_prefix: str = "PROVIDER_CACHE",
     ):
         # In-memory cache: {cache_key: (data, timestamp)}
         self._cache: Dict[str, Tuple[str, float]] = {}
         self._disk_ttl = disk_ttl_seconds
         self._lock = asyncio.Lock()
         self._disk_lock = asyncio.Lock()
         # Disk persistence configuration
         self._cache_file = cache_file
+        self._enable_disk = (
+            enable_disk
+            if enable_disk is not None
+            else _env_bool(f"{env_prefix}_ENABLE", True)
+        )
         self._dirty = False
+        self._write_interval = write_interval or _env_int(
+            f"{env_prefix}_WRITE_INTERVAL", 60
+        )
+        self._cleanup_interval = cleanup_interval or _env_int(
+            f"{env_prefix}_CLEANUP_INTERVAL", 1800
+        )
         # Background tasks
         self._writer_task: Optional[asyncio.Task] = None
         self._cleanup_task: Optional[asyncio.Task] = None
         self._running = False
         # Statistics
+        self._stats = {
+            "memory_hits": 0,
+            "disk_hits": 0,
+            "misses": 0,
+            "writes": 0,
+            "disk_errors": 0,
+        }
+        # Track disk health for monitoring
+        self._disk_available = True
         # Metadata about this cache instance
         self._cache_name = cache_file.stem if cache_file else "unnamed"
         if self._enable_disk:
             lib_logger.debug(
                 f"ProviderCache[{self._cache_name}]: Disk enabled "
             asyncio.create_task(self._async_init())
         else:
             lib_logger.debug(f"ProviderCache[{self._cache_name}]: Memory-only mode")
     # =========================================================================
     # INITIALIZATION
     # =========================================================================
     async def _async_init(self) -> None:
         """Async initialization: load from disk and start background tasks."""
         try:
             await self._load_from_disk()
             await self._start_background_tasks()
         except Exception as e:
+            lib_logger.error(
+                f"ProviderCache[{self._cache_name}] async init failed: {e}"
+            )
     async def _load_from_disk(self) -> None:
         """Load cache from disk file with TTL validation."""
         if not self._enable_disk or not self._cache_file.exists():
             return
         try:
             async with self._disk_lock:
+                with open(self._cache_file, "r", encoding="utf-8") as f:
                     data = json.load(f)
                 if data.get("version") != "1.0":
+                    lib_logger.warning(
+                        f"ProviderCache[{self._cache_name}]: Version mismatch, starting fresh"
+                    )
                     return
                 now = time.time()
                 entries = data.get("entries", {})
                 loaded = expired = 0
                 for cache_key, entry in entries.items():
                     age = now - entry.get("timestamp", 0)
                     if age <= self._disk_ttl:
+                        value = entry.get(
+                            "value", entry.get("signature", "")
+                        )  # Support both formats
                         if value:
                             self._cache[cache_key] = (value, entry["timestamp"])
                             loaded += 1
                     else:
                         expired += 1
                 lib_logger.debug(
                     f"ProviderCache[{self._cache_name}]: Loaded {loaded} entries ({expired} expired)"
                 )
         except json.JSONDecodeError as e:
+            lib_logger.warning(
+                f"ProviderCache[{self._cache_name}]: File corrupted: {e}"
+            )
         except Exception as e:
             lib_logger.error(f"ProviderCache[{self._cache_name}]: Load failed: {e}")
     # =========================================================================
     # DISK PERSISTENCE
     # =========================================================================
+    async def _save_to_disk(self) -> bool:
+        """Persist cache to disk using atomic write with health tracking.
+        Returns:
+            True if write succeeded, False otherwise.
+        """
         if not self._enable_disk:
+            return True  # Not an error if disk is disabled
+        async with self._disk_lock:
+            cache_data = {
+                "version": "1.0",
+                "memory_ttl_seconds": self._memory_ttl,
+                "disk_ttl_seconds": self._disk_ttl,
+                "entries": {
+                    key: {"value": val, "timestamp": ts}
+                    for key, (val, ts) in self._cache.items()
+                },
+                "statistics": {
+                    "total_entries": len(self._cache),
+                    "last_write": time.time(),
+                    **self._stats,
+                },
+            }
+            if safe_write_json(
+                self._cache_file, cache_data, lib_logger, secure_permissions=True
+            ):
+                self._stats["writes"] += 1
+                self._disk_available = True
+                lib_logger.debug(
+                    f"ProviderCache[{self._cache_name}]: Saved {len(self._cache)} entries"
+                )
+                return True
+            else:
+                self._stats["disk_errors"] += 1
+                self._disk_available = False
+                return False
     # =========================================================================
     # BACKGROUND TASKS
     # =========================================================================
     async def _start_background_tasks(self) -> None:
         """Start background writer and cleanup tasks."""
         if not self._enable_disk or self._running:
             return
         self._running = True
         self._writer_task = asyncio.create_task(self._writer_loop())
         self._cleanup_task = asyncio.create_task(self._cleanup_loop())
         lib_logger.debug(f"ProviderCache[{self._cache_name}]: Started background tasks")
     async def _writer_loop(self) -> None:
         """Background task: periodically flush dirty cache to disk."""
         try:
                 await asyncio.sleep(self._write_interval)
                 if self._dirty:
                     try:
+                        success = await self._save_to_disk()
+                        if success:
+                            self._dirty = False
+                        # If save failed, _dirty remains True so we retry next interval
                     except Exception as e:
+                        lib_logger.error(
+                            f"ProviderCache[{self._cache_name}]: Writer error: {e}"
+                        )
         except asyncio.CancelledError:
             pass
     async def _cleanup_loop(self) -> None:
         """Background task: periodically clean up expired entries."""
         try:
                 await self._cleanup_expired()
         except asyncio.CancelledError:
             pass
     async def _cleanup_expired(self) -> None:
         """Remove expired entries from memory cache."""
         async with self._lock:
             now = time.time()
+            expired = [
+                k for k, (_, ts) in self._cache.items() if now - ts > self._memory_ttl
+            ]
             for k in expired:
                 del self._cache[k]
             if expired:
                 lib_logger.debug(
                     f"ProviderCache[{self._cache_name}]: Cleaned {len(expired)} expired entries"
                 )
     # =========================================================================
     # CORE OPERATIONS
     # =========================================================================
     def store(self, key: str, value: str) -> None:
         """
         Store a value synchronously (schedules async storage).
         Args:
             key: Cache key
             value: Value to store (typically JSON-serialized data)
         """
         asyncio.create_task(self._async_store(key, value))
     async def _async_store(self, key: str, value: str) -> None:
         """Async implementation of store."""
         async with self._lock:
             self._cache[key] = (value, time.time())
             self._dirty = True
     async def store_async(self, key: str, value: str) -> None:
         """
         Store a value asynchronously (awaitable).
         Use this when you need to ensure the value is stored before continuing.
         """
         await self._async_store(key, value)
     def retrieve(self, key: str) -> Optional[str]:
         """
         Retrieve a value by key (synchronous, with optional async disk fallback).
         Args:
             key: Cache key
         Returns:
             Cached value if found and not expired, None otherwise
         """
             else:
                 del self._cache[key]
                 self._dirty = True
         self._stats["misses"] += 1
         if self._enable_disk:
             # Schedule async disk lookup for next time
             asyncio.create_task(self._check_disk_fallback(key))
         return None
     async def retrieve_async(self, key: str) -> Optional[str]:
         """
         Retrieve a value asynchronously (checks disk if not in memory).
         Use this when you can await and need guaranteed disk fallback.
         """
         # Check memory first
                     if key in self._cache:
                         del self._cache[key]
                         self._dirty = True
         # Check disk
         if self._enable_disk:
             return await self._disk_retrieve(key)
         self._stats["misses"] += 1
         return None
     async def _check_disk_fallback(self, key: str) -> None:
         """Check disk for key and load into memory if found (background)."""
         try:
             if not self._cache_file.exists():
                 return
             async with self._disk_lock:
+                with open(self._cache_file, "r", encoding="utf-8") as f:
                     data = json.load(f)
                 entries = data.get("entries", {})
                 if key in entries:
                     entry = entries[key]
                                 f"ProviderCache[{self._cache_name}]: Loaded {key} from disk"
                             )
         except Exception as e:
+            lib_logger.debug(
+                f"ProviderCache[{self._cache_name}]: Disk fallback failed: {e}"
+            )
     async def _disk_retrieve(self, key: str) -> Optional[str]:
         """Direct disk retrieval with loading into memory."""
         try:
             if not self._cache_file.exists():
                 self._stats["misses"] += 1
                 return None
             async with self._disk_lock:
+                with open(self._cache_file, "r", encoding="utf-8") as f:
                     data = json.load(f)
                 entries = data.get("entries", {})
                 if key in entries:
                     entry = entries[key]
                                 self._cache[key] = (value, ts)
                             self._stats["disk_hits"] += 1
                             return value
             self._stats["misses"] += 1
             return None
         except Exception as e:
+            lib_logger.debug(
+                f"ProviderCache[{self._cache_name}]: Disk retrieve failed: {e}"
+            )
             self._stats["misses"] += 1
             return None
     # =========================================================================
     # UTILITY METHODS
     # =========================================================================
     def contains(self, key: str) -> bool:
         """Check if key exists in memory cache (without updating stats)."""
         if key in self._cache:
             _, timestamp = self._cache[key]
             return time.time() - timestamp <= self._memory_ttl
         return False
     def get_stats(self) -> Dict[str, Any]:
+        """Get cache statistics including disk health."""
         return {
             **self._stats,
             "memory_entries": len(self._cache),
             "dirty": self._dirty,
+            "disk_enabled": self._enable_disk,
+            "disk_available": self._disk_available,
         }
     async def clear(self) -> None:
         """Clear all cached data."""
         async with self._lock:
             self._dirty = True
         if self._enable_disk:
             await self._save_to_disk()
     async def shutdown(self) -> None:
         """Graceful shutdown: flush pending writes and stop background tasks."""
         lib_logger.info(f"ProviderCache[{self._cache_name}]: Shutting down...")
         self._running = False
         # Cancel background tasks
         for task in (self._writer_task, self._cleanup_task):
             if task:
                     await task
                 except asyncio.CancelledError:
                     pass
         # Final save
         if self._dirty and self._enable_disk:
             await self._save_to_disk()
         lib_logger.info(
             f"ProviderCache[{self._cache_name}]: Shutdown complete "
             f"(stats: mem_hits={self._stats['memory_hits']}, "
 # CONVENIENCE FACTORY
 # =============================================================================
 def create_provider_cache(
     name: str,
     cache_dir: Optional[Path] = None,
     memory_ttl_seconds: int = 3600,
     disk_ttl_seconds: int = 86400,
+    env_prefix: Optional[str] = None,
 ) -> ProviderCache:
     """
     Factory function to create a provider cache with sensible defaults.
     Args:
         name: Cache name (used as filename and for logging)
         cache_dir: Directory for the cache file (default: project_root/cache)
         memory_ttl_seconds: In-memory TTL
         disk_ttl_seconds: Disk TTL
         env_prefix: Environment variable prefix (default: derived from name)
     Returns:
         Configured ProviderCache instance
     """
     if cache_dir is None:
         cache_dir = Path(__file__).resolve().parent.parent.parent.parent / "cache"
     cache_file = cache_dir / f"{name}.json"
     if env_prefix is None:
         # Convert name to env prefix: "gemini3_signatures" -> "GEMINI3_SIGNATURES_CACHE"
         env_prefix = f"{name.upper().replace('-', '_')}_CACHE"
     return ProviderCache(
         cache_file=cache_file,
         memory_ttl_seconds=memory_ttl_seconds,
         disk_ttl_seconds=disk_ttl_seconds,
+        env_prefix=env_prefix,
     )
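
The writer loop in this file clears the dirty flag only when `_save_to_disk()` returns `True`, so a failed write is retried on the next interval instead of being silently dropped. Below is a minimal, self-contained sketch of that pattern. `FlakyStore`, `save`, and `run_demo` are hypothetical names invented for illustration, standing in for `ProviderCache` and its disk write; they are not part of the commit.

```python
import asyncio


class FlakyStore:
    """Hypothetical stand-in for a cache whose first save attempts fail."""

    def __init__(self, fail_first: int = 1) -> None:
        self.dirty = True          # There is unsaved data to flush
        self.attempts = 0          # Number of save() calls made so far
        self.disk_errors = 0       # Number of failed save() calls
        self._fail_first = fail_first

    async def save(self) -> bool:
        """Pretend to write to disk; return True on success, False on failure."""
        self.attempts += 1
        if self.attempts <= self._fail_first:
            self.disk_errors += 1
            return False
        return True

    async def writer_loop(self, interval: float, max_ticks: int) -> None:
        """Flush while dirty; leave the flag set after a failed save so it is retried."""
        for _ in range(max_ticks):
            await asyncio.sleep(interval)
            if self.dirty and await self.save():
                self.dirty = False


def run_demo() -> FlakyStore:
    """Run three writer ticks against a store whose first save fails."""
    store = FlakyStore(fail_first=1)
    asyncio.run(store.writer_loop(interval=0.01, max_ticks=3))
    return store
```

In this sketch the first tick fails and leaves `dirty` set, the second tick succeeds and clears it, and the third tick does no work. The loop is bounded by `max_ticks` only to keep the example terminating; the real loop runs until cancelled.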

src/rotator_library/providers/qwen_auth_base.py CHANGED

@@ -9,10 +9,11 @@ import asyncio
 import logging
 import webbrowser
 import os
 from pathlib import Path
-from typing import Dict, Any, Tuple, Union, Optional
-import tempfile
-import shutil
 import httpx
 from rich.console import Console
@@ -23,6 +24,7 @@ from rich.markup import escape as rich_escape
 from ..utils.headless_detection import is_headless_environment
 from ..utils.reauth_coordinator import get_reauth_coordinator
 lib_logger = logging.getLogger("rotator_library")
@@ -36,6 +38,20 @@ REFRESH_EXPIRY_BUFFER_SECONDS = 3 * 60 * 60  # 3 hours buffer before expiry
 console = Console()
 class QwenAuthBase:
     def __init__(self):
         self._credentials_cache: Dict[str, Dict[str, Any]] = {}
@@ -51,19 +67,36 @@ class QwenAuthBase:
             str, float
         ] = {}  # Track backoff timers (Unix timestamp)
-        # [QUEUE SYSTEM] Sequential refresh processing
         self._refresh_queue: asyncio.Queue = asyncio.Queue()
-        self._queued_credentials: set = set()  # Track credentials already in queue
-        # [FIX PR#34] Changed from set to dict mapping credential path to timestamp
-        # This enables TTL-based stale entry cleanup as defense in depth
         self._unavailable_credentials: Dict[
             str, float
         ] = {}  # Maps credential path -> timestamp when marked unavailable
-        self._unavailable_ttl_seconds: int = 300  # 5 minutes TTL for stale entries
         self._queue_tracking_lock = asyncio.Lock()  # Protects queue sets
-        self._queue_processor_task: Optional[asyncio.Task] = (
-            None  # Background worker task
-        )
     def _parse_env_credential_path(self, path: str) -> Optional[str]:
         """
@@ -188,81 +221,54 @@ class QwenAuthBase:
                         f"Environment variables for Qwen Code credential index {credential_index} not found"
                     )
-            # For file paths, try loading from legacy env vars first
-            env_creds = self._load_from_env()
-            if env_creds:
-                lib_logger.info(
-                    "Using Qwen Code credentials from environment variables"
-                )
-                self._credentials_cache[path] = env_creds
-                return env_creds
-            # Fall back to file-based loading
-            return await self._read_creds_from_file(path)
     async def _save_credentials(self, path: str, creds: Dict[str, Any]):
         # Don't save to file if credentials were loaded from environment
         if creds.get("_proxy_metadata", {}).get("loaded_from_env"):
             lib_logger.debug("Credentials loaded from env, skipping file save")
-            # Still update cache for in-memory consistency
-            self._credentials_cache[path] = creds
             return
-        # [ATOMIC WRITE] Use tempfile + move pattern to ensure atomic writes
-        parent_dir = os.path.dirname(os.path.abspath(path))
-        os.makedirs(parent_dir, exist_ok=True)
-        tmp_fd = None
-        tmp_path = None
-        try:
-            # Create temp file in same directory as target (ensures same filesystem)
-            tmp_fd, tmp_path = tempfile.mkstemp(
-                dir=parent_dir, prefix=".tmp_", suffix=".json", text=True
-            )
-            # Write JSON to temp file
-            with os.fdopen(tmp_fd, "w") as f:
-                json.dump(creds, f, indent=2)
-                tmp_fd = None  # fdopen closes the fd
-            # Set secure permissions (0600 = owner read/write only)
-            try:
-                os.chmod(tmp_path, 0o600)
-            except (OSError, AttributeError):
-                # Windows may not support chmod, ignore
-                pass
-            # Atomic move (overwrites target if it exists)
-            shutil.move(tmp_path, path)
-            tmp_path = None  # Successfully moved
-            # Update cache AFTER successful file write
-            self._credentials_cache[path] = creds
-            lib_logger.debug(
-                f"Saved updated Qwen OAuth credentials to '{path}' (atomic write)."
-            )
-        except Exception as e:
-            lib_logger.error(
-                f"Failed to save updated Qwen OAuth credentials to '{path}': {e}"
             )
-            # Clean up temp file if it still exists
-            if tmp_fd is not None:
-                try:
-                    os.close(tmp_fd)
-                except:
-                    pass
-            if tmp_path and os.path.exists(tmp_path):
-                try:
-                    os.unlink(tmp_path)
-                except:
-                    pass
-            raise
     def _is_token_expired(self, creds: Dict[str, Any]) -> bool:
         expiry_timestamp = creds.get("expiry_date", 0) / 1000
         return expiry_timestamp < time.time() + REFRESH_EXPIRY_BUFFER_SECONDS
     async def _refresh_token(self, path: str, force: bool = False) -> Dict[str, Any]:
         async with await self._get_lock(path):
             cached_creds = self._credentials_cache.get(path)
@@ -476,7 +482,7 @@ class QwenAuthBase:
         Proactively refreshes tokens if they're close to expiry.
         Only applies to OAuth credentials (file paths or env:// paths). Direct API keys are skipped.
         """
-        lib_logger.debug(f"proactively_refresh called for: {credential_identifier}")
         # Try to load credentials - this will fail for direct API keys
         # and succeed for OAuth credentials (file paths or env:// paths)
@@ -484,21 +490,21 @@ class QwenAuthBase:
             creds = await self._load_credentials(credential_identifier)
         except IOError as e:
             # Not a valid credential path (likely a direct API key string)
-            lib_logger.debug(
-                f"Skipping refresh for '{credential_identifier}' - not an OAuth credential: {e}"
-            )
             return
         is_expired = self._is_token_expired(creds)
-        lib_logger.debug(
-            f"Token expired check for '{Path(credential_identifier).name}': {is_expired}"
-        )
         if is_expired:
-            lib_logger.debug(
-                f"Queueing refresh for '{Path(credential_identifier).name}'"
-            )
-            # Queue for refresh with needs_reauth=False (automated refresh)
             await self._queue_refresh(
                 credential_identifier, force=False, needs_reauth=False
             )
@@ -511,30 +517,55 @@ class QwenAuthBase:
             return self._refresh_locks[path]
     def is_credential_available(self, path: str) -> bool:
-        """Check if a credential is available for rotation (not queued/refreshing).
-        [FIX PR#34] Now includes TTL-based stale entry cleanup as defense in depth.
-        If a credential has been unavailable for longer than _unavailable_ttl_seconds,
-        it is automatically cleaned up and considered available.
         """
-        if path not in self._unavailable_credentials:
-            return True
-        # [FIX PR#34] Check if the entry is stale (TTL expired)
-        marked_time = self._unavailable_credentials.get(path)
-        if marked_time is not None:
-            now = time.time()
-            if now - marked_time > self._unavailable_ttl_seconds:
-                # Entry is stale - clean it up and return available
-                lib_logger.warning(
-                    f"Credential '{Path(path).name}' was stuck in unavailable state for "
-                    f"{int(now - marked_time)}s (TTL: {self._unavailable_ttl_seconds}s). "
-                    f"Auto-cleaning stale entry."
                 )
-                self._unavailable_credentials.pop(path, None)
-                return True
-        return False
     async def _ensure_queue_processor_running(self):
         """Lazily starts the queue processor if not already running."""
@@ -543,15 +574,27 @@ class QwenAuthBase:
                 self._process_refresh_queue()
             )
     async def _queue_refresh(
         self, path: str, force: bool = False, needs_reauth: bool = False
     ):
-        """Add a credential to the refresh queue if not already queued.
         Args:
             path: Credential file path
             force: Force refresh even if not expired
-            needs_reauth: True if full re-authentication needed (bypasses backoff)
         """
         # IMPORTANT: Only check backoff for simple automated refreshes
         # Re-authentication (interactive OAuth) should BYPASS backoff since it needs user input
@@ -561,114 +604,223 @@ class QwenAuthBase:
                 backoff_until = self._next_refresh_after[path]
                 if now < backoff_until:
                     # Credential is in backoff for automated refresh, do not queue
-                    remaining = int(backoff_until - now)
-                    lib_logger.debug(
-                        f"Skipping automated refresh for '{Path(path).name}' (in backoff for {remaining}s)"
-                    )
                     return
         async with self._queue_tracking_lock:
             if path not in self._queued_credentials:
                 self._queued_credentials.add(path)
-                # [FIX PR#34] Store timestamp when marking unavailable (for TTL cleanup)
-                self._unavailable_credentials[path] = time.time()
-                lib_logger.debug(
-                    f"Marked '{Path(path).name}' as unavailable. "
-                    f"Total unavailable: {len(self._unavailable_credentials)}"
-                )
-                await self._refresh_queue.put((path, force, needs_reauth))
-                await self._ensure_queue_processor_running()
     async def _process_refresh_queue(self):
-        """Background worker that processes refresh requests sequentially."""
         while True:
             path = None
             try:
                 # Wait for an item with timeout to allow graceful shutdown
                 try:
-                    path, force, needs_reauth = await asyncio.wait_for(
                         self._refresh_queue.get(), timeout=60.0
                     )
                 except asyncio.TimeoutError:
-                    # [FIX PR#34] Clean up any stale unavailable entries before exiting
-                    # If we're idle for 60s, no refreshes are in progress
                     async with self._queue_tracking_lock:
-                        if self._unavailable_credentials:
-                            stale_count = len(self._unavailable_credentials)
-                            lib_logger.warning(
-                                f"Queue processor idle timeout. Cleaning {stale_count} "
-                                f"stale unavailable credentials: {list(self._unavailable_credentials.keys())}"
-                            )
-                            self._unavailable_credentials.clear()
-                        # [FIX BUG#6] Also clear queued credentials to prevent stuck state
-                        if self._queued_credentials:
-                            lib_logger.debug(
-                                f"Clearing {len(self._queued_credentials)} queued credentials on timeout"
-                            )
-                            self._queued_credentials.clear()
                     self._queue_processor_task = None
                     return
                 try:
-                    # Perform the actual refresh (still using per-credential lock)
-                    async with await self._get_lock(path):
-                        # Re-check if still expired (may have changed since queueing)
-                        creds = self._credentials_cache.get(path)
-                        if creds and not self._is_token_expired(creds):
-                            # No longer expired, mark as available
-                            async with self._queue_tracking_lock:
-                                self._unavailable_credentials.pop(path, None)
-                                lib_logger.debug(
-                                    f"Credential '{Path(path).name}' no longer expired, marked available. "
-                                    f"Remaining unavailable: {len(self._unavailable_credentials)}"
-                                )
-                            continue
-                        # Perform refresh
-                        if not creds:
-                            creds = await self._load_credentials(path)
-                        await self._refresh_token(path, force=force)
-                        # SUCCESS: Mark as available again
-                        async with self._queue_tracking_lock:
-                            self._unavailable_credentials.pop(path, None)
-                            lib_logger.debug(
-                                f"Refresh SUCCESS for '{Path(path).name}', marked available. "
-                                f"Remaining unavailable: {len(self._unavailable_credentials)}"
                             )
                 finally:
-                    # [FIX PR#34] Remove from BOTH queued set AND unavailable credentials
-                    # This ensures cleanup happens in ALL exit paths (success, exception, etc.)
                     async with self._queue_tracking_lock:
                         self._queued_credentials.discard(path)
-                        # [FIX PR#34] Always clean up unavailable credentials in finally block
                         self._unavailable_credentials.pop(path, None)
-                        lib_logger.debug(
-                            f"Finally cleanup for '{Path(path).name}'. "
-                            f"Remaining unavailable: {len(self._unavailable_credentials)}"
-                        )
-                    self._refresh_queue.task_done()
             except asyncio.CancelledError:
-                # [FIX PR#34] Clean up the current credential before breaking
                 if path:
                     async with self._queue_tracking_lock:
                         self._unavailable_credentials.pop(path, None)
-                        lib_logger.debug(
-                            f"CancelledError cleanup for '{Path(path).name}'. "
-                            f"Remaining unavailable: {len(self._unavailable_credentials)}"
-                        )
                 break
             except Exception as e:
-                lib_logger.error(f"Error in queue processor: {e}")
-                # Even on error, mark as available (backoff will prevent immediate retry)
                 if path:
                     async with self._queue_tracking_lock:
                         self._unavailable_credentials.pop(path, None)
-                        lib_logger.debug(
-                            f"Error cleanup for '{Path(path).name}': {e}. "
-                            f"Remaining unavailable: {len(self._unavailable_credentials)}"
-                        )
     async def _perform_interactive_oauth(
         self, path: str, creds: Dict[str, Any], display_name: str
@@ -965,3 +1117,251 @@ class QwenAuthBase:
         except Exception as e:
             lib_logger.error(f"Failed to get Qwen user info from credentials: {e}")
             return {"email": None}
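
`is_credential_available` relies on a map from credential path to the time it was marked unavailable, and treats entries older than a TTL as stale. The sketch below isolates that check under stated assumptions: `AvailabilityTracker` and its methods are hypothetical names, and the optional `now` argument is added here only so the behaviour can be exercised without waiting on the clock.

```python
import time
from typing import Dict, Optional


class AvailabilityTracker:
    """Hypothetical helper: path -> timestamp map with TTL-based stale cleanup."""

    def __init__(self, ttl_seconds: float = 300.0) -> None:
        self.ttl_seconds = ttl_seconds
        self.unavailable: Dict[str, float] = {}

    def mark_unavailable(self, path: str, now: Optional[float] = None) -> None:
        """Record when the credential was marked unavailable."""
        self.unavailable[path] = time.time() if now is None else now

    def is_available(self, path: str, now: Optional[float] = None) -> bool:
        """Return True unless the path is marked unavailable and the mark is still fresh."""
        marked = self.unavailable.get(path)
        if marked is None:
            return True
        current = time.time() if now is None else now
        if current - marked > self.ttl_seconds:
            # Stale entry: drop it and treat the credential as available again.
            self.unavailable.pop(path, None)
            return True
        return False
```

Storing a timestamp rather than a bare set membership is what makes the cleanup possible: a set can say a credential is blocked, but not for how long.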

 import logging
 import webbrowser
 import os
+import re
+from dataclasses import dataclass, field
 from pathlib import Path
+from glob import glob
+from typing import Dict, Any, Tuple, Union, Optional, List
 import httpx
 from rich.console import Console
 from ..utils.headless_detection import is_headless_environment
 from ..utils.reauth_coordinator import get_reauth_coordinator
+from ..utils.resilient_io import safe_write_json
 lib_logger = logging.getLogger("rotator_library")
 console = Console()
+@dataclass
+class QwenCredentialSetupResult:
+    """
+    Standardized result structure for Qwen credential setup operations.
+    """
+    success: bool
+    file_path: Optional[str] = None
+    email: Optional[str] = None
+    is_update: bool = False
+    error: Optional[str] = None
+    credentials: Optional[Dict[str, Any]] = field(default=None, repr=False)
 class QwenAuthBase:
     def __init__(self):
         self._credentials_cache: Dict[str, Dict[str, Any]] = {}
             str, float
         ] = {}  # Track backoff timers (Unix timestamp)
+        # [QUEUE SYSTEM] Sequential refresh processing with two separate queues
+        # Normal refresh queue: for proactive token refresh (old token still valid)
         self._refresh_queue: asyncio.Queue = asyncio.Queue()
+        self._queue_processor_task: Optional[asyncio.Task] = None
+        # Re-auth queue: for invalid refresh tokens (requires user interaction)
+        self._reauth_queue: asyncio.Queue = asyncio.Queue()
+        self._reauth_processor_task: Optional[asyncio.Task] = None
+        # Tracking sets/dicts
+        self._queued_credentials: set = set()  # Track credentials in either queue
+        # Only credentials in re-auth queue are marked unavailable (not normal refresh)
+        # TTL cleanup is defense-in-depth for edge cases where re-auth processor crashes
         self._unavailable_credentials: Dict[
             str, float
         ] = {}  # Maps credential path -> timestamp when marked unavailable
+        # TTL should exceed reauth timeout (300s) to avoid premature cleanup
+        self._unavailable_ttl_seconds: int = 360  # 6 minutes TTL for stale entries
         self._queue_tracking_lock = asyncio.Lock()  # Protects queue sets
+        # Retry tracking for normal refresh queue
+        self._queue_retry_count: Dict[
+            str, int
+        ] = {}  # Track retry attempts per credential
+        # Configuration constants
+        self._refresh_timeout_seconds: int = 15  # Max time for single refresh
+        self._refresh_interval_seconds: int = 30  # Delay between queue items
+        self._refresh_max_retries: int = 3  # Attempts before kicked out
+        self._reauth_timeout_seconds: int = 300  # Time for user to complete OAuth
     def _parse_env_credential_path(self, path: str) -> Optional[str]:
         """
                         f"Environment variables for Qwen Code credential index {credential_index} not found"
                     )
+            # Try file-based loading first (preferred for explicit file paths)
+            try:
+                return await self._read_creds_from_file(path)
+            except IOError:
+                # File not found - fall back to legacy env vars for backwards compatibility
+                env_creds = self._load_from_env()
+                if env_creds:
+                    lib_logger.info(
+                        f"File '{path}' not found, using Qwen Code credentials from environment variables"
+                    )
+                    self._credentials_cache[path] = env_creds
+                    return env_creds
+                raise  # Re-raise the original file not found error
     async def _save_credentials(self, path: str, creds: Dict[str, Any]):
+        """Save credentials with in-memory fallback if disk unavailable."""
+        # Always update cache first (memory is reliable)
+        self._credentials_cache[path] = creds
         # Don't save to file if credentials were loaded from environment
         if creds.get("_proxy_metadata", {}).get("loaded_from_env"):
             lib_logger.debug("Credentials loaded from env, skipping file save")
             return
+        # Attempt disk write - if it fails, we still have the cache
+        # buffer_on_failure ensures data is retried periodically and saved on shutdown
+        if safe_write_json(
+            path, creds, lib_logger, secure_permissions=True, buffer_on_failure=True
+        ):
+            lib_logger.debug(f"Saved updated Qwen OAuth credentials to '{path}'.")
+        else:
+            lib_logger.warning(
+                "Qwen credentials cached in memory only (buffered for retry)."
             )
     def _is_token_expired(self, creds: Dict[str, Any]) -> bool:
         expiry_timestamp = creds.get("expiry_date", 0) / 1000
         return expiry_timestamp < time.time() + REFRESH_EXPIRY_BUFFER_SECONDS
+    def _is_token_truly_expired(self, creds: Dict[str, Any]) -> bool:
+        """Check if token is TRULY expired (past actual expiry, not just threshold).
+        This is different from _is_token_expired() which uses a buffer for proactive refresh.
+        This method checks if the token is actually unusable.
+        """
+        expiry_timestamp = creds.get("expiry_date", 0) / 1000
+        return expiry_timestamp < time.time()
     async def _refresh_token(self, path: str, force: bool = False) -> Dict[str, Any]:
         async with await self._get_lock(path):
             cached_creds = self._credentials_cache.get(path)
         Proactively refreshes tokens if they're close to expiry.
         Only applies to OAuth credentials (file paths or env:// paths). Direct API keys are skipped.
         """
+        # lib_logger.debug(f"proactively_refresh called for: {credential_identifier}")
         # Try to load credentials - this will fail for direct API keys
         # and succeed for OAuth credentials (file paths or env:// paths)
             creds = await self._load_credentials(credential_identifier)
         except IOError as e:
             # Not a valid credential path (likely a direct API key string)
+            # lib_logger.debug(
+            #     f"Skipping refresh for '{credential_identifier}' - not an OAuth credential: {e}"
+            # )
             return
         is_expired = self._is_token_expired(creds)
+        # lib_logger.debug(
+        #     f"Token expired check for '{Path(credential_identifier).name}': {is_expired}"
+        # )
         if is_expired:
+            # lib_logger.debug(
+            #     f"Queueing refresh for '{Path(credential_identifier).name}'"
+            # )
+            # lib_logger.info(f"Proactive refresh triggered for '{Path(credential_identifier).name}'")
             await self._queue_refresh(
                 credential_identifier, force=False, needs_reauth=False
             )
             return self._refresh_locks[path]
     def is_credential_available(self, path: str) -> bool:
+        """Check if a credential is available for rotation.
+        Credentials are unavailable if:
+        1. In re-auth queue (token is truly broken, requires user interaction)
+        2. Token is TRULY expired (past actual expiry, not just threshold)
+        Note: Credentials in normal refresh queue are still available because
+        the old token is valid until actual expiry.
+        TTL cleanup (defense-in-depth): If a credential has been in the re-auth
+        queue longer than _unavailable_ttl_seconds without being processed, it's
+        cleaned up. This should only happen if the re-auth processor crashes or
+        is cancelled without proper cleanup.
         """
+        # Check if in re-auth queue (truly unavailable)
+        if path in self._unavailable_credentials:
+            marked_time = self._unavailable_credentials.get(path)
+            if marked_time is not None:
+                now = time.time()
+                if now - marked_time > self._unavailable_ttl_seconds:
+                    # Entry is stale - clean it up and return available
+                    # This is a defense-in-depth for edge cases where re-auth
+                    # processor crashed or was cancelled without cleanup
+                    lib_logger.warning(
+                        f"Credential '{Path(path).name}' stuck in re-auth queue for "
+                        f"{int(now - marked_time)}s (TTL: {self._unavailable_ttl_seconds}s). "
+                        f"Re-auth processor may have crashed. Auto-cleaning stale entry."
+                    )
+                    # Clean up both tracking structures for consistency
+                    self._unavailable_credentials.pop(path, None)
+                    self._queued_credentials.discard(path)
+                else:
+                    return False  # Still in re-auth, not available
+        # Check if token is TRULY expired (not just threshold-expired)
+        creds = self._credentials_cache.get(path)
+        if creds and self._is_token_truly_expired(creds):
+            # Token is actually expired - should not be used
+            # Queue for refresh if not already queued
+            if path not in self._queued_credentials:
+                # lib_logger.debug(
+                #     f"Credential '{Path(path).name}' is truly expired, queueing for refresh"
+                # )
+                asyncio.create_task(
+                    self._queue_refresh(path, force=True, needs_reauth=False)
                 )
+            return False
+        return True
     async def _ensure_queue_processor_running(self):
         """Lazily starts the queue processor if not already running."""
                 self._process_refresh_queue()
             )
+    async def _ensure_reauth_processor_running(self):
+        """Lazily starts the re-auth queue processor if not already running."""
+        if self._reauth_processor_task is None or self._reauth_processor_task.done():
+            self._reauth_processor_task = asyncio.create_task(
+                self._process_reauth_queue()
+            )
     async def _queue_refresh(
         self, path: str, force: bool = False, needs_reauth: bool = False
     ):
+        """Add a credential to the appropriate refresh queue if not already queued.
         Args:
             path: Credential file path
             force: Force refresh even if not expired
+            needs_reauth: True if full re-authentication needed (routes to re-auth queue)
+        Queue routing:
+        - needs_reauth=True: Goes to re-auth queue, marks as unavailable
+        - needs_reauth=False: Goes to normal refresh queue, does NOT mark unavailable
+          (old token is still valid until actual expiry)
         """
         # IMPORTANT: Only check backoff for simple automated refreshes
         # Re-authentication (interactive OAuth) should BYPASS backoff since it needs user input
                 backoff_until = self._next_refresh_after[path]
                 if now < backoff_until:
                     # Credential is in backoff for automated refresh, do not queue
+                    # remaining = int(backoff_until - now)
+                    # lib_logger.debug(
+                    #     f"Skipping automated refresh for '{Path(path).name}' (in backoff for {remaining}s)"
+                    # )
                     return
         async with self._queue_tracking_lock:
             if path not in self._queued_credentials:
                 self._queued_credentials.add(path)
+                if needs_reauth:
+                    # Re-auth queue: mark as unavailable (token is truly broken)
+                    self._unavailable_credentials[path] = time.time()
+                    # lib_logger.debug(
+                    #     f"Queued '{Path(path).name}' for RE-AUTH (marked unavailable). "
+                    #     f"Total unavailable: {len(self._unavailable_credentials)}"
+                    # )
+                    await self._reauth_queue.put(path)
+                    await self._ensure_reauth_processor_running()
+                else:
+                    # Normal refresh queue: do NOT mark unavailable (old token still valid)
+                    # lib_logger.debug(
+                    #     f"Queued '{Path(path).name}' for refresh (still available). "
+                    #     f"Queue size: {self._refresh_queue.qsize() + 1}"
+                    # )
+                    await self._refresh_queue.put((path, force))
+                    await self._ensure_queue_processor_running()
     async def _process_refresh_queue(self):
+        """Background worker that processes normal refresh requests sequentially.
+        Key behaviors:
+        - 15s timeout per refresh operation
+        - 30s delay between processing credentials (prevents thundering herd)
+        - On failure: re-queued at the back of the queue; dropped after max retries (3) until the next refresh cycle
+        - If 401/403 detected: routes to re-auth queue
+        - Does NOT mark credentials unavailable (old token still valid)
+        """
+        # lib_logger.info("Refresh queue processor started")
         while True:
             path = None
             try:
                 # Wait for an item with timeout to allow graceful shutdown
                 try:
+                    path, force = await asyncio.wait_for(
                         self._refresh_queue.get(), timeout=60.0
                     )
                 except asyncio.TimeoutError:
+                    # Queue is empty and idle for 60s - clean up and exit
                     async with self._queue_tracking_lock:
+                        # Clear any stale retry counts
+                        self._queue_retry_count.clear()
                     self._queue_processor_task = None
+                    # lib_logger.debug("Refresh queue processor idle, shutting down")
                     return
                 try:
+                    # Quick check if still expired (optimization to avoid unnecessary refresh)
+                    creds = self._credentials_cache.get(path)
+                    if creds and not self._is_token_expired(creds):
+                        # No longer expired, skip refresh
+                        # lib_logger.debug(
+                        #     f"Credential '{Path(path).name}' no longer expired, skipping refresh"
+                        # )
+                        # Clear retry count on skip (not a failure)
+                        self._queue_retry_count.pop(path, None)
+                        continue
+                    # Perform refresh with timeout
+                    try:
+                        async with asyncio.timeout(self._refresh_timeout_seconds):
+                            await self._refresh_token(path, force=force)
+                        # SUCCESS: Clear retry count
+                        self._queue_retry_count.pop(path, None)
+                        # lib_logger.info(f"Refresh SUCCESS for '{Path(path).name}'")
+                    except asyncio.TimeoutError:
+                        lib_logger.warning(
+                            f"Refresh timeout ({self._refresh_timeout_seconds}s) for '{Path(path).name}'"
+                        )
+                        await self._handle_refresh_failure(path, force, "timeout")
+                    except httpx.HTTPStatusError as e:
+                        status_code = e.response.status_code
+                        if status_code in (401, 403):
+                            # Invalid refresh token - route to re-auth queue
+                            lib_logger.warning(
+                                f"Refresh token invalid for '{Path(path).name}' (HTTP {status_code}). "
+                                f"Routing to re-auth queue."
+                            )
+                            self._queue_retry_count.pop(path, None)  # Clear retry count
+                            async with self._queue_tracking_lock:
+                                self._queued_credentials.discard(
+                                    path
+                                )  # Remove from queued
+                            await self._queue_refresh(
+                                path, force=True, needs_reauth=True
+                            )
+                        else:
+                            await self._handle_refresh_failure(
+                                path, force, f"HTTP {status_code}"
                             )
+                    except Exception as e:
+                        await self._handle_refresh_failure(path, force, str(e))
                 finally:
+                    # Remove from queued set (unless re-queued by failure handler)
+                    async with self._queue_tracking_lock:
+                        # Only discard if not re-queued (check if still in queue set from retry)
+                        if (
+                            path in self._queued_credentials
+                            and self._queue_retry_count.get(path, 0) == 0
+                        ):
+                            self._queued_credentials.discard(path)
+                    self._refresh_queue.task_done()
+                # Wait between credentials to spread load
+                await asyncio.sleep(self._refresh_interval_seconds)
+            except asyncio.CancelledError:
+                # lib_logger.debug("Refresh queue processor cancelled")
+                break
+            except Exception as e:
+                lib_logger.error(f"Error in refresh queue processor: {e}")
+                if path:
+                    async with self._queue_tracking_lock:
+                        self._queued_credentials.discard(path)
+    async def _handle_refresh_failure(self, path: str, force: bool, error: str):
+        """Handle a refresh failure with back-of-line retry logic.
+        - Increments retry count
+        - If under max retries: re-adds to END of queue
+        - If at max retries: kicks credential out (retried next BackgroundRefresher cycle)
+        """
+        retry_count = self._queue_retry_count.get(path, 0) + 1
+        self._queue_retry_count[path] = retry_count
+        if retry_count >= self._refresh_max_retries:
+            # Kicked out until next BackgroundRefresher cycle
+            lib_logger.error(
+                f"Max retries ({self._refresh_max_retries}) reached for '{Path(path).name}' "
+                f"(last error: {error}). Will retry next refresh cycle."
+            )
+            self._queue_retry_count.pop(path, None)
+            async with self._queue_tracking_lock:
+                self._queued_credentials.discard(path)
+            return
+        # Re-add to END of queue for retry
+        lib_logger.warning(
+            f"Refresh failed for '{Path(path).name}' ({error}). "
+            f"Retry {retry_count}/{self._refresh_max_retries}, back of queue."
+        )
+        # Keep in queued_credentials set, add back to queue
+        await self._refresh_queue.put((path, force))
+    async def _process_reauth_queue(self):
+        """Background worker that processes re-auth requests.
+        Key behaviors:
+        - Credentials ARE marked unavailable (token is truly broken)
+        - Uses ReauthCoordinator for interactive OAuth
+        - No automatic retry (requires user action)
+        - Cleans up unavailable status when done
+        """
+        # lib_logger.info("Re-auth queue processor started")
+        while True:
+            path = None
+            try:
+                # Wait for an item with timeout to allow graceful shutdown
+                try:
+                    path = await asyncio.wait_for(
+                        self._reauth_queue.get(), timeout=60.0
+                    )
+                except asyncio.TimeoutError:
+                    # Queue is empty and idle for 60s - exit
+                    self._reauth_processor_task = None
+                    # lib_logger.debug("Re-auth queue processor idle, shutting down")
+                    return
+                try:
+                    lib_logger.info(f"Starting re-auth for '{Path(path).name}'...")
+                    await self.initialize_token(path)
+                    lib_logger.info(f"Re-auth SUCCESS for '{Path(path).name}'")
+                except Exception as e:
+                    lib_logger.error(f"Re-auth FAILED for '{Path(path).name}': {e}")
+                    # No automatic retry for re-auth (requires user action)
+                finally:
+                    # Always clean up
                     async with self._queue_tracking_lock:
                         self._queued_credentials.discard(path)
                         self._unavailable_credentials.pop(path, None)
+                        # lib_logger.debug(
+                        #     f"Re-auth cleanup for '{Path(path).name}'. "
+                        #     f"Remaining unavailable: {len(self._unavailable_credentials)}"
+                        # )
+                    self._reauth_queue.task_done()
             except asyncio.CancelledError:
+                # Clean up current credential before breaking
                 if path:
                     async with self._queue_tracking_lock:
+                        self._queued_credentials.discard(path)
                         self._unavailable_credentials.pop(path, None)
+                # lib_logger.debug("Re-auth queue processor cancelled")
                 break
             except Exception as e:
+                lib_logger.error(f"Error in re-auth queue processor: {e}")
                 if path:
                     async with self._queue_tracking_lock:
+                        self._queued_credentials.discard(path)
                         self._unavailable_credentials.pop(path, None)
     async def _perform_interactive_oauth(
         self, path: str, creds: Dict[str, Any], display_name: str
         except Exception as e:
             lib_logger.error(f"Failed to get Qwen user info from credentials: {e}")
             return {"email": None}
+    # =========================================================================
+    # CREDENTIAL MANAGEMENT METHODS
+    # =========================================================================
+    def _get_provider_file_prefix(self) -> str:
+        """Return the file prefix for Qwen credentials."""
+        return "qwen_code"
+    def _get_oauth_base_dir(self) -> Path:
+        """Get the base directory for OAuth credential files."""
+        return Path.cwd() / "oauth_creds"
+    def _find_existing_credential_by_email(
+        self, email: str, base_dir: Optional[Path] = None
+    ) -> Optional[Path]:
+        """Find an existing credential file for the given email."""
+        if base_dir is None:
+            base_dir = self._get_oauth_base_dir()
+        prefix = self._get_provider_file_prefix()
+        pattern = str(base_dir / f"{prefix}_oauth_*.json")
+        for cred_file in glob(pattern):
+            try:
+                with open(cred_file, "r") as f:
+                    creds = json.load(f)
+                existing_email = creds.get("_proxy_metadata", {}).get("email")
+                if existing_email == email:
+                    return Path(cred_file)
+            except (json.JSONDecodeError, IOError) as e:
+                lib_logger.debug(f"Could not read credential file {cred_file}: {e}")
+                continue
+        return None
+    def _get_next_credential_number(self, base_dir: Optional[Path] = None) -> int:
+        """Get the next available credential number."""
+        if base_dir is None:
+            base_dir = self._get_oauth_base_dir()
+        prefix = self._get_provider_file_prefix()
+        pattern = str(base_dir / f"{prefix}_oauth_*.json")
+        existing_numbers = []
+        for cred_file in glob(pattern):
+            match = re.search(r"_oauth_(\d+)\.json$", cred_file)
+            if match:
+                existing_numbers.append(int(match.group(1)))
+        if not existing_numbers:
+            return 1
+        return max(existing_numbers) + 1
+    def _build_credential_path(
+        self, base_dir: Optional[Path] = None, number: Optional[int] = None
+    ) -> Path:
+        """Build a path for a new credential file."""
+        if base_dir is None:
+            base_dir = self._get_oauth_base_dir()
+        if number is None:
+            number = self._get_next_credential_number(base_dir)
+        prefix = self._get_provider_file_prefix()
+        filename = f"{prefix}_oauth_{number}.json"
+        return base_dir / filename
+    async def setup_credential(
+        self, base_dir: Optional[Path] = None
+    ) -> QwenCredentialSetupResult:
+        """
+        Complete credential setup flow: OAuth -> save.
+        This is the main entry point for setting up new credentials.
+        """
+        if base_dir is None:
+            base_dir = self._get_oauth_base_dir()
+        # Ensure directory exists
+        base_dir.mkdir(parents=True, exist_ok=True)
+        try:
+            # Step 1: Perform OAuth authentication
+            temp_creds = {
+                "_proxy_metadata": {"display_name": "new Qwen Code credential"}
+            }
+            new_creds = await self.initialize_token(temp_creds)
+            # Step 2: Get user info for deduplication
+            email = new_creds.get("_proxy_metadata", {}).get("email")
+            if not email:
+                return QwenCredentialSetupResult(
+                    success=False, error="Could not retrieve email from OAuth response"
+                )
+            # Step 3: Check for existing credential with same email
+            existing_path = self._find_existing_credential_by_email(email, base_dir)
+            is_update = existing_path is not None
+            if is_update:
+                file_path = existing_path
+                lib_logger.info(
+                    f"Found existing credential for {email}, updating {file_path.name}"
+                )
+            else:
+                file_path = self._build_credential_path(base_dir)
+                lib_logger.info(
+                    f"Creating new credential for {email} at {file_path.name}"
+                )
+            # Step 4: Save credentials to file
+            await self._save_credentials(str(file_path), new_creds)
+            return QwenCredentialSetupResult(
+                success=True,
+                file_path=str(file_path),
+                email=email,
+                is_update=is_update,
+                credentials=new_creds,
+            )
+        except Exception as e:
+            lib_logger.error(f"Credential setup failed: {e}")
+            return QwenCredentialSetupResult(success=False, error=str(e))
+    def build_env_lines(self, creds: Dict[str, Any], cred_number: int) -> List[str]:
+        """Generate .env file lines for a Qwen credential."""
+        email = creds.get("_proxy_metadata", {}).get("email", "unknown")
+        prefix = f"QWEN_CODE_{cred_number}"
+        lines = [
+            f"# QWEN_CODE Credential #{cred_number} for: {email}",
+            f"# Exported from: qwen_code_oauth_{cred_number}.json",
+            f"# Generated at: {time.strftime('%Y-%m-%d %H:%M:%S')}",
+            "#",
+            "# To combine multiple credentials into one .env file, copy these lines",
+            "# and ensure each credential has a unique number (1, 2, 3, etc.)",
+            "",
+            f"{prefix}_ACCESS_TOKEN={creds.get('access_token', '')}",
+            f"{prefix}_REFRESH_TOKEN={creds.get('refresh_token', '')}",
+            f"{prefix}_EXPIRY_DATE={creds.get('expiry_date', 0)}",
+            f"{prefix}_RESOURCE_URL={creds.get('resource_url', 'https://portal.qwen.ai/v1')}",
+            f"{prefix}_EMAIL={email}",
+        ]
+        return lines
+    def export_credential_to_env(
+        self, credential_path: str, output_dir: Optional[Path] = None
+    ) -> Optional[str]:
+        """Export a credential file to .env format."""
+        try:
+            cred_path = Path(credential_path)
+            # Load credential
+            with open(cred_path, "r") as f:
+                creds = json.load(f)
+            # Extract metadata
+            email = creds.get("_proxy_metadata", {}).get("email", "unknown")
+            # Get credential number from filename
+            match = re.search(r"_oauth_(\d+)\.json$", cred_path.name)
+            cred_number = int(match.group(1)) if match else 1
+            # Build output path
+            if output_dir is None:
+                output_dir = cred_path.parent
+            safe_email = email.replace("@", "_at_").replace(".", "_")
+            env_filename = f"qwen_code_{cred_number}_{safe_email}.env"
+            env_path = output_dir / env_filename
+            # Build and write content
+            env_lines = self.build_env_lines(creds, cred_number)
+            with open(env_path, "w") as f:
+                f.write("\n".join(env_lines))
+            lib_logger.info(f"Exported credential to {env_path}")
+            return str(env_path)
+        except Exception as e:
+            lib_logger.error(f"Failed to export credential: {e}")
+            return None
+    def list_credentials(self, base_dir: Optional[Path] = None) -> List[Dict[str, Any]]:
+        """List all Qwen credential files."""
+        if base_dir is None:
+            base_dir = self._get_oauth_base_dir()
+        prefix = self._get_provider_file_prefix()
+        pattern = str(base_dir / f"{prefix}_oauth_*.json")
+        credentials = []
+        for cred_file in sorted(glob(pattern)):
+            try:
+                with open(cred_file, "r") as f:
+                    creds = json.load(f)
+                metadata = creds.get("_proxy_metadata", {})
+                # Extract number from filename
+                match = re.search(r"_oauth_(\d+)\.json$", cred_file)
+                number = int(match.group(1)) if match else 0
+                credentials.append(
+                    {
+                        "file_path": cred_file,
+                        "email": metadata.get("email", "unknown"),
+                        "number": number,
+                    }
+                )
+            except Exception as e:
+                lib_logger.debug(f"Could not read credential file {cred_file}: {e}")
+                continue
+        return credentials
+    def delete_credential(self, credential_path: str) -> bool:
+        """Delete a credential file."""
+        try:
+            cred_path = Path(credential_path)
+            # Validate that it's one of our credential files
+            prefix = self._get_provider_file_prefix()
+            if not cred_path.name.startswith(f"{prefix}_oauth_"):
+                lib_logger.error(
+                    f"File {cred_path.name} does not appear to be a Qwen Code credential"
+                )
+                return False
+            if not cred_path.exists():
+                lib_logger.warning(f"Credential file does not exist: {credential_path}")
+                return False
+            # Remove from cache if present
+            self._credentials_cache.pop(credential_path, None)
+            # Delete the file
+            cred_path.unlink()
+            lib_logger.info(f"Deleted credential file: {credential_path}")
+            return True
+        except Exception as e:
+            lib_logger.error(f"Failed to delete credential: {e}")
+            return False
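
The availability rules described in `is_credential_available` and `_queue_refresh` above rest on two different expiry checks. The following is a minimal, standalone sketch of that distinction, not the library's implementation: `REFRESH_BUFFER_SECONDS`, the `now` parameter, and the `classify` helper are hypothetical names introduced for illustration, and the 5-minute buffer is an assumed value.

```python
import time
from typing import Any, Dict, Optional

# Assumption: proactive refresh starts 5 minutes before expiry.
REFRESH_BUFFER_SECONDS = 5 * 60


def _expiry_seconds(creds: Dict[str, Any]) -> float:
    # "expiry_date" is stored in milliseconds, as in the diff above.
    return creds.get("expiry_date", 0) / 1000


def is_token_expired(creds: Dict[str, Any], now: Optional[float] = None) -> bool:
    """Threshold check: True once the token is inside the refresh buffer."""
    now = time.time() if now is None else now
    return _expiry_seconds(creds) < now + REFRESH_BUFFER_SECONDS


def is_token_truly_expired(creds: Dict[str, Any], now: Optional[float] = None) -> bool:
    """Hard check: True only once the actual expiry time has passed."""
    now = time.time() if now is None else now
    return _expiry_seconds(creds) < now


def classify(creds: Dict[str, Any], in_reauth_queue: bool,
             now: Optional[float] = None) -> str:
    """Hypothetical helper summarising the routing rules (illustration only)."""
    if in_reauth_queue:
        return "unavailable"      # needs user interaction
    if is_token_truly_expired(creds, now):
        return "unavailable"      # old token can no longer be used
    if is_token_expired(creds, now):
        return "refresh_pending"  # queued for refresh, still usable
    return "available"
```

Passing `now` explicitly keeps the sketch deterministic; the methods in the diff read `time.time()` directly.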

src/rotator_library/providers/qwen_code_provider.py CHANGED

@@ -10,19 +10,27 @@ from typing import Union, AsyncGenerator, List, Dict, Any
 from .provider_interface import ProviderInterface
 from .qwen_auth_base import QwenAuthBase
 from ..model_definitions import ModelDefinitions
 import litellm
 from litellm.exceptions import RateLimitError, AuthenticationError
 from pathlib import Path
 import uuid
 from datetime import datetime
-lib_logger = logging.getLogger('rotator_library')
-LOGS_DIR = Path(__file__).resolve().parent.parent.parent.parent / "logs"
-QWEN_CODE_LOGS_DIR = LOGS_DIR / "qwen_code_logs"
 class _QwenCodeFileLogger:
     """A simple file logger for a single Qwen Code transaction."""
     def __init__(self, model_name: str, enabled: bool = True):
         self.enabled = enabled
         if not self.enabled:
@@ -31,8 +39,10 @@ class _QwenCodeFileLogger:
         timestamp = datetime.now().strftime("%Y%m%d_%H%M%S_%f")
         request_id = str(uuid.uuid4())
         # Sanitize model name for directory
-        safe_model_name = model_name.replace('/', '_').replace(':', '_')
-        self.log_dir = QWEN_CODE_LOGS_DIR / f"{timestamp}_{safe_model_name}_{request_id}"
         try:
             self.log_dir.mkdir(parents=True, exist_ok=True)
         except Exception as e:
@@ -41,25 +51,32 @@ class _QwenCodeFileLogger:
     def log_request(self, payload: Dict[str, Any]):
         """Logs the request payload sent to Qwen Code."""
-        if not self.enabled: return
         try:
-            with open(self.log_dir / "request_payload.json", "w", encoding="utf-8") as f:
                 json.dump(payload, f, indent=2, ensure_ascii=False)
         except Exception as e:
             lib_logger.error(f"_QwenCodeFileLogger: Failed to write request: {e}")
     def log_response_chunk(self, chunk: str):
         """Logs a raw chunk from the Qwen Code response stream."""
-        if not self.enabled: return
         try:
             with open(self.log_dir / "response_stream.log", "a", encoding="utf-8") as f:
                 f.write(chunk + "\n")
         except Exception as e:
-            lib_logger.error(f"_QwenCodeFileLogger: Failed to write response chunk: {e}")
     def log_error(self, error_message: str):
         """Logs an error message."""
-        if not self.enabled: return
         try:
             with open(self.log_dir / "error.log", "a", encoding="utf-8") as f:
                 f.write(f"[{datetime.utcnow().isoformat()}] {error_message}\n")
@@ -68,28 +85,41 @@ class _QwenCodeFileLogger:
     def log_final_response(self, response_data: Dict[str, Any]):
         """Logs the final, reassembled response."""
-        if not self.enabled: return
         try:
             with open(self.log_dir / "final_response.json", "w", encoding="utf-8") as f:
                 json.dump(response_data, f, indent=2, ensure_ascii=False)
         except Exception as e:
-            lib_logger.error(f"_QwenCodeFileLogger: Failed to write final response: {e}")
-HARDCODED_MODELS = [
-    "qwen3-coder-plus",
-    "qwen3-coder-flash"
-]
 # OpenAI-compatible parameters supported by Qwen Code API
 SUPPORTED_PARAMS = {
-    'model', 'messages', 'temperature', 'top_p', 'max_tokens',
-    'stream', 'tools', 'tool_choice', 'presence_penalty',
-    'frequency_penalty', 'n', 'stop', 'seed', 'response_format'
 }
 class QwenCodeProvider(QwenAuthBase, ProviderInterface):
     skip_cost_calculation = True
-    REASONING_START_MARKER = 'THINK||'
     def __init__(self):
         super().__init__()
@@ -111,7 +141,9 @@ class QwenCodeProvider(QwenAuthBase, ProviderInterface):
         Validates OAuth credentials if applicable.
         """
         models = []
-        env_var_ids = set()  # Track IDs from env vars to prevent hardcoded/dynamic duplicates
         def extract_model_id(item) -> str:
             """Extract model ID from various formats (dict, string with/without provider prefix)."""
@@ -137,7 +169,9 @@ class QwenCodeProvider(QwenAuthBase, ProviderInterface):
                 # Track the ID to prevent hardcoded/dynamic duplicates
                 if model_id:
                     env_var_ids.add(model_id)
-            lib_logger.info(f"Loaded {len(static_models)} static models for qwen_code from environment variables")
         # Source 2: Add hardcoded models (only if ID not already in env vars)
         for model_id in HARDCODED_MODELS:
@@ -155,14 +189,17 @@ class QwenCodeProvider(QwenAuthBase, ProviderInterface):
             models_url = f"{api_base.rstrip('/')}/v1/models"
             response = await client.get(
-                models_url,
-                headers={"Authorization": f"Bearer {access_token}"}
             )
             response.raise_for_status()
             dynamic_data = response.json()
             # Handle both {data: [...]} and direct [...] formats
-            model_list = dynamic_data.get("data", dynamic_data) if isinstance(dynamic_data, dict) else dynamic_data
             dynamic_count = 0
             for model in model_list:
@@ -173,7 +210,9 @@ class QwenCodeProvider(QwenAuthBase, ProviderInterface):
                     dynamic_count += 1
             if dynamic_count > 0:
-                lib_logger.debug(f"Discovered {dynamic_count} additional models for qwen_code from API")
         except Exception as e:
             # Silently ignore dynamic discovery errors
@@ -238,10 +277,10 @@ class QwenCodeProvider(QwenAuthBase, ProviderInterface):
         payload = {k: v for k, v in kwargs.items() if k in SUPPORTED_PARAMS}
         # Always force streaming for internal processing
-        payload['stream'] = True
         # Always include usage data in stream
-        payload['stream_options'] = {"include_usage": True}
         # Handle tool schema cleaning
         if "tools" in payload and payload["tools"]:
@@ -250,22 +289,26 @@ class QwenCodeProvider(QwenAuthBase, ProviderInterface):
         elif not payload.get("tools"):
             # Per Qwen Code API bug (see: https://github.com/qianwen-team/flash-dance/issues/2),
             # injecting a dummy tool prevents stream corruption when no tools are provided
-            payload["tools"] = [{
-                "type": "function",
-                "function": {
-                    "name": "do_not_call_me",
-                    "description": "Do not call this tool.",
-                    "parameters": {"type": "object", "properties": {}}
                 }
-            }]
-            lib_logger.debug("Injected dummy tool to prevent Qwen API stream corruption")
         return payload
     def _convert_chunk_to_openai(self, chunk: Dict[str, Any], model_id: str):
         """
         Converts a raw Qwen SSE chunk to an OpenAI-compatible chunk.
         CRITICAL FIX: Handle chunks with BOTH usage and choices (final chunk)
         without early return to ensure finish_reason is properly processed.
         """
@@ -287,32 +330,42 @@ class QwenCodeProvider(QwenAuthBase, ProviderInterface):
             # Yield the choice chunk first (contains finish_reason)
             yield {
-                "choices": [{"index": 0, "delta": delta, "finish_reason": finish_reason}],
-                "model": model_id, "object": "chat.completion.chunk",
-                "id": chunk_id, "created": chunk_created
             }
             # Then yield the usage chunk
             yield {
-                "choices": [], "model": model_id, "object": "chat.completion.chunk",
-                "id": chunk_id, "created": chunk_created,
                 "usage": {
                     "prompt_tokens": usage_data.get("prompt_tokens", 0),
                     "completion_tokens": usage_data.get("completion_tokens", 0),
                     "total_tokens": usage_data.get("total_tokens", 0),
-                }
             }
             return
         # Handle usage-only chunks
         if usage_data:
             yield {
-                "choices": [], "model": model_id, "object": "chat.completion.chunk",
-                "id": chunk_id, "created": chunk_created,
                 "usage": {
                     "prompt_tokens": usage_data.get("prompt_tokens", 0),
                     "completion_tokens": usage_data.get("completion_tokens", 0),
                     "total_tokens": usage_data.get("total_tokens", 0),
-                }
             }
             return
@@ -327,35 +380,52 @@ class QwenCodeProvider(QwenAuthBase, ProviderInterface):
         # Handle <think> tags for reasoning content
         content = delta.get("content")
         if content and ("<think>" in content or "</think>" in content):
-            parts = content.replace("<think>", f"||{self.REASONING_START_MARKER}").replace("</think>", f"||/{self.REASONING_START_MARKER}").split("||")
             for part in parts:
-                if not part: continue
                 new_delta = {}
                 if part.startswith(self.REASONING_START_MARKER):
-                    new_delta['reasoning_content'] = part.replace(self.REASONING_START_MARKER, "")
                 elif part.startswith(f"/{self.REASONING_START_MARKER}"):
                     continue
                 else:
-                    new_delta['content'] = part
                 yield {
-                    "choices": [{"index": 0, "delta": new_delta, "finish_reason": None}],
-                    "model": model_id, "object": "chat.completion.chunk",
-                    "id": chunk_id, "created": chunk_created
                 }
         else:
             # Standard content chunk
             yield {
-                "choices": [{"index": 0, "delta": delta, "finish_reason": finish_reason}],
-                "model": model_id, "object": "chat.completion.chunk",
-                "id": chunk_id, "created": chunk_created
             }
-    def _stream_to_completion_response(self, chunks: List[litellm.ModelResponse]) -> litellm.ModelResponse:
         """
         Manually reassembles streaming chunks into a complete response.
         Key improvements:
         - Determines finish_reason based on accumulated state (tool_calls vs stop)
         - Properly initializes tool_calls with type field
@@ -368,14 +438,16 @@ class QwenCodeProvider(QwenAuthBase, ProviderInterface):
         final_message = {"role": "assistant"}
         aggregated_tool_calls = {}
         usage_data = None
-        chunk_finish_reason = None  # Track finish_reason from chunks (but we'll override)
         # Get the first chunk for basic response metadata
         first_chunk = chunks[0]
         # Process each chunk to aggregate content
         for chunk in chunks:
-            if not hasattr(chunk, 'choices') or not chunk.choices:
                 continue
             choice = chunk.choices[0]
@@ -399,25 +471,48 @@ class QwenCodeProvider(QwenAuthBase, ProviderInterface):
                     index = tc_chunk.get("index", 0)
                     if index not in aggregated_tool_calls:
                         # Initialize with type field for OpenAI compatibility
-                        aggregated_tool_calls[index] = {"type": "function", "function": {"name": "", "arguments": ""}}
                     if "id" in tc_chunk:
                         aggregated_tool_calls[index]["id"] = tc_chunk["id"]
                     if "type" in tc_chunk:
                         aggregated_tool_calls[index]["type"] = tc_chunk["type"]
                     if "function" in tc_chunk:
-                        if "name" in tc_chunk["function"] and tc_chunk["function"]["name"] is not None:
-                            aggregated_tool_calls[index]["function"]["name"] += tc_chunk["function"]["name"]
-                        if "arguments" in tc_chunk["function"] and tc_chunk["function"]["arguments"] is not None:
-                            aggregated_tool_calls[index]["function"]["arguments"] += tc_chunk["function"]["arguments"]
             # Aggregate function calls (legacy format)
             if "function_call" in delta and delta["function_call"] is not None:
                 if "function_call" not in final_message:
                     final_message["function_call"] = {"name": "", "arguments": ""}
-                if "name" in delta["function_call"] and delta["function_call"]["name"] is not None:
-                    final_message["function_call"]["name"] += delta["function_call"]["name"]
-                if "arguments" in delta["function_call"] and delta["function_call"]["arguments"] is not None:
-                    final_message["function_call"]["arguments"] += delta["function_call"]["arguments"]
             # Track finish_reason from chunks (for reference only)
             if choice.get("finish_reason"):
@@ -425,7 +520,7 @@ class QwenCodeProvider(QwenAuthBase, ProviderInterface):
         # Handle usage data from the last chunk that has it
         for chunk in reversed(chunks):
-            if hasattr(chunk, 'usage') and chunk.usage:
                 usage_data = chunk.usage
                 break
@@ -451,7 +546,7 @@ class QwenCodeProvider(QwenAuthBase, ProviderInterface):
         final_choice = {
             "index": 0,
             "message": final_message,
-            "finish_reason": finish_reason
         }
         # Create the final ModelResponse
@@ -461,20 +556,21 @@ class QwenCodeProvider(QwenAuthBase, ProviderInterface):
             "created": first_chunk.created,
             "model": first_chunk.model,
             "choices": [final_choice],
-            "usage": usage_data
         }
         return litellm.ModelResponse(**final_response_data)
-    async def acompletion(self, client: httpx.AsyncClient, **kwargs) -> Union[litellm.ModelResponse, AsyncGenerator[litellm.ModelResponse, None]]:
         credential_path = kwargs.pop("credential_identifier")
         enable_request_logging = kwargs.pop("enable_request_logging", False)
         model = kwargs["model"]
         # Create dedicated file logger for this request
         file_logger = _QwenCodeFileLogger(
-            model_name=model,
-            enabled=enable_request_logging
         )
         async def make_request():
@@ -482,8 +578,8 @@ class QwenCodeProvider(QwenAuthBase, ProviderInterface):
             api_base, access_token = await self.get_api_details(credential_path)
             # Strip provider prefix from model name (e.g., "qwen_code/qwen3-coder-plus" -> "qwen3-coder-plus")
-            model_name = model.split('/')[-1]
-            kwargs_with_stripped_model = {**kwargs, 'model': model_name}
             # Build clean payload with only supported parameters
             payload = self._build_request_payload(**kwargs_with_stripped_model)
@@ -503,7 +599,13 @@ class QwenCodeProvider(QwenAuthBase, ProviderInterface):
             file_logger.log_request(payload)
             lib_logger.debug(f"Qwen Code Request URL: {url}")
-            return client.stream("POST", url, headers=headers, json=payload, timeout=600)
         async def stream_handler(response_stream, attempt=1):
             """Handles the streaming response and converts chunks."""
@@ -512,11 +614,17 @@ class QwenCodeProvider(QwenAuthBase, ProviderInterface):
                     # Check for HTTP errors before processing stream
                     if response.status_code >= 400:
                         error_text = await response.aread()
-                        error_text = error_text.decode('utf-8') if isinstance(error_text, bytes) else error_text
                         # Handle 401: Force token refresh and retry once
                         if response.status_code == 401 and attempt == 1:
-                            lib_logger.warning("Qwen Code returned 401. Forcing token refresh and retrying once.")
                             await self._refresh_token(credential_path, force=True)
                             retry_stream = await make_request()
                             async for chunk in stream_handler(retry_stream, attempt=2):
@@ -524,12 +632,15 @@ class QwenCodeProvider(QwenAuthBase, ProviderInterface):
                             return
                         # Handle 429: Rate limit
-                        elif response.status_code == 429 or "slow_down" in error_text.lower():
                             raise RateLimitError(
                                 f"Qwen Code rate limit exceeded: {error_text}",
                                 llm_provider="qwen_code",
                                 model=model,
-                                response=response
                             )
                         # Handle other errors
@@ -539,28 +650,34 @@ class QwenCodeProvider(QwenAuthBase, ProviderInterface):
                             raise httpx.HTTPStatusError(
                                 f"HTTP {response.status_code}: {error_text}",
                                 request=response.request,
-                                response=response
                             )
                     # Process successful streaming response
                     async for line in response.aiter_lines():
                         file_logger.log_response_chunk(line)
-                        if line.startswith('data: '):
                             data_str = line[6:]
                             if data_str == "[DONE]":
                                 break
                             try:
                                 chunk = json.loads(data_str)
-                                for openai_chunk in self._convert_chunk_to_openai(chunk, model):
                                     yield litellm.ModelResponse(**openai_chunk)
                             except json.JSONDecodeError:
-                                lib_logger.warning(f"Could not decode JSON from Qwen Code: {line}")
             except httpx.HTTPStatusError:
                 raise  # Re-raise HTTP errors we already handled
             except Exception as e:
                 file_logger.log_error(f"Error during Qwen Code stream processing: {e}")
-                lib_logger.error(f"Error during Qwen Code stream processing: {e}", exc_info=True)
                 raise
         async def logging_stream_wrapper():
@@ -578,7 +695,9 @@ class QwenCodeProvider(QwenAuthBase, ProviderInterface):
         if kwargs.get("stream"):
             return logging_stream_wrapper()
         else:
             async def non_stream_wrapper():
                 chunks = [chunk async for chunk in logging_stream_wrapper()]
                 return self._stream_to_completion_response(chunks)
-            return await non_stream_wrapper()

 from .provider_interface import ProviderInterface
 from .qwen_auth_base import QwenAuthBase
 from ..model_definitions import ModelDefinitions
+from ..timeout_config import TimeoutConfig
+from ..utils.paths import get_logs_dir
 import litellm
 from litellm.exceptions import RateLimitError, AuthenticationError
 from pathlib import Path
 import uuid
 from datetime import datetime
+lib_logger = logging.getLogger("rotator_library")
+def _get_qwen_code_logs_dir() -> Path:
+    """Get the Qwen Code logs directory."""
+    logs_dir = get_logs_dir() / "qwen_code_logs"
+    logs_dir.mkdir(parents=True, exist_ok=True)
+    return logs_dir
 class _QwenCodeFileLogger:
     """A simple file logger for a single Qwen Code transaction."""
     def __init__(self, model_name: str, enabled: bool = True):
         self.enabled = enabled
         if not self.enabled:
         timestamp = datetime.now().strftime("%Y%m%d_%H%M%S_%f")
         request_id = str(uuid.uuid4())
         # Sanitize model name for directory
+        safe_model_name = model_name.replace("/", "_").replace(":", "_")
+        self.log_dir = (
+            _get_qwen_code_logs_dir() / f"{timestamp}_{safe_model_name}_{request_id}"
+        )
         try:
             self.log_dir.mkdir(parents=True, exist_ok=True)
         except Exception as e:
     def log_request(self, payload: Dict[str, Any]):
         """Logs the request payload sent to Qwen Code."""
+        if not self.enabled:
+            return
         try:
+            with open(
+                self.log_dir / "request_payload.json", "w", encoding="utf-8"
+            ) as f:
                 json.dump(payload, f, indent=2, ensure_ascii=False)
         except Exception as e:
             lib_logger.error(f"_QwenCodeFileLogger: Failed to write request: {e}")
     def log_response_chunk(self, chunk: str):
         """Logs a raw chunk from the Qwen Code response stream."""
+        if not self.enabled:
+            return
         try:
             with open(self.log_dir / "response_stream.log", "a", encoding="utf-8") as f:
                 f.write(chunk + "\n")
         except Exception as e:
+            lib_logger.error(
+                f"_QwenCodeFileLogger: Failed to write response chunk: {e}"
+            )
     def log_error(self, error_message: str):
         """Logs an error message."""
+        if not self.enabled:
+            return
         try:
             with open(self.log_dir / "error.log", "a", encoding="utf-8") as f:
                 f.write(f"[{datetime.utcnow().isoformat()}] {error_message}\n")
     def log_final_response(self, response_data: Dict[str, Any]):
         """Logs the final, reassembled response."""
+        if not self.enabled:
+            return
         try:
             with open(self.log_dir / "final_response.json", "w", encoding="utf-8") as f:
                 json.dump(response_data, f, indent=2, ensure_ascii=False)
         except Exception as e:
+            lib_logger.error(
+                f"_QwenCodeFileLogger: Failed to write final response: {e}"
+            )
+HARDCODED_MODELS = ["qwen3-coder-plus", "qwen3-coder-flash"]
 # OpenAI-compatible parameters supported by Qwen Code API
 SUPPORTED_PARAMS = {
+    "model",
+    "messages",
+    "temperature",
+    "top_p",
+    "max_tokens",
+    "stream",
+    "tools",
+    "tool_choice",
+    "presence_penalty",
+    "frequency_penalty",
+    "n",
+    "stop",
+    "seed",
+    "response_format",
 }
 class QwenCodeProvider(QwenAuthBase, ProviderInterface):
     skip_cost_calculation = True
+    REASONING_START_MARKER = "THINK||"
     def __init__(self):
         super().__init__()
         Validates OAuth credentials if applicable.
         """
         models = []
+        env_var_ids = (
+            set()
+        )  # Track IDs from env vars to prevent hardcoded/dynamic duplicates
         def extract_model_id(item) -> str:
             """Extract model ID from various formats (dict, string with/without provider prefix)."""
                 # Track the ID to prevent hardcoded/dynamic duplicates
                 if model_id:
                     env_var_ids.add(model_id)
+            lib_logger.info(
+                f"Loaded {len(static_models)} static models for qwen_code from environment variables"
+            )
         # Source 2: Add hardcoded models (only if ID not already in env vars)
         for model_id in HARDCODED_MODELS:
             models_url = f"{api_base.rstrip('/')}/v1/models"
             response = await client.get(
+                models_url, headers={"Authorization": f"Bearer {access_token}"}
             )
             response.raise_for_status()
             dynamic_data = response.json()
             # Handle both {data: [...]} and direct [...] formats
+            model_list = (
+                dynamic_data.get("data", dynamic_data)
+                if isinstance(dynamic_data, dict)
+                else dynamic_data
+            )
             dynamic_count = 0
             for model in model_list:
                     dynamic_count += 1
             if dynamic_count > 0:
+                lib_logger.debug(
+                    f"Discovered {dynamic_count} additional models for qwen_code from API"
+                )
         except Exception as e:
             # Silently ignore dynamic discovery errors
         payload = {k: v for k, v in kwargs.items() if k in SUPPORTED_PARAMS}
         # Always force streaming for internal processing
+        payload["stream"] = True
         # Always include usage data in stream
+        payload["stream_options"] = {"include_usage": True}
         # Handle tool schema cleaning
         if "tools" in payload and payload["tools"]:
         elif not payload.get("tools"):
             # Per Qwen Code API bug (see: https://github.com/qianwen-team/flash-dance/issues/2),
             # injecting a dummy tool prevents stream corruption when no tools are provided
+            payload["tools"] = [
+                {
+                    "type": "function",
+                    "function": {
+                        "name": "do_not_call_me",
+                        "description": "Do not call this tool.",
+                        "parameters": {"type": "object", "properties": {}},
+                    },
                 }
+            ]
+            lib_logger.debug(
+                "Injected dummy tool to prevent Qwen API stream corruption"
+            )
         return payload
     def _convert_chunk_to_openai(self, chunk: Dict[str, Any], model_id: str):
         """
         Converts a raw Qwen SSE chunk to an OpenAI-compatible chunk.
         CRITICAL FIX: Handle chunks with BOTH usage and choices (final chunk)
         without early return to ensure finish_reason is properly processed.
         """
             # Yield the choice chunk first (contains finish_reason)
             yield {
+                "choices": [
+                    {"index": 0, "delta": delta, "finish_reason": finish_reason}
+                ],
+                "model": model_id,
+                "object": "chat.completion.chunk",
+                "id": chunk_id,
+                "created": chunk_created,
             }
             # Then yield the usage chunk
             yield {
+                "choices": [],
+                "model": model_id,
+                "object": "chat.completion.chunk",
+                "id": chunk_id,
+                "created": chunk_created,
                 "usage": {
                     "prompt_tokens": usage_data.get("prompt_tokens", 0),
                     "completion_tokens": usage_data.get("completion_tokens", 0),
                     "total_tokens": usage_data.get("total_tokens", 0),
+                },
             }
             return
         # Handle usage-only chunks
         if usage_data:
             yield {
+                "choices": [],
+                "model": model_id,
+                "object": "chat.completion.chunk",
+                "id": chunk_id,
+                "created": chunk_created,
                 "usage": {
                     "prompt_tokens": usage_data.get("prompt_tokens", 0),
                     "completion_tokens": usage_data.get("completion_tokens", 0),
                     "total_tokens": usage_data.get("total_tokens", 0),
+                },
             }
             return
         # Handle <think> tags for reasoning content
         content = delta.get("content")
         if content and ("<think>" in content or "</think>" in content):
+            parts = (
+                content.replace("<think>", f"||{self.REASONING_START_MARKER}")
+                .replace("</think>", f"||/{self.REASONING_START_MARKER}")
+                .split("||")
+            )
             for part in parts:
+                if not part:
+                    continue
                 new_delta = {}
                 if part.startswith(self.REASONING_START_MARKER):
+                    new_delta["reasoning_content"] = part.replace(
+                        self.REASONING_START_MARKER, ""
+                    )
                 elif part.startswith(f"/{self.REASONING_START_MARKER}"):
                     continue
                 else:
+                    new_delta["content"] = part
                 yield {
+                    "choices": [
+                        {"index": 0, "delta": new_delta, "finish_reason": None}
+                    ],
+                    "model": model_id,
+                    "object": "chat.completion.chunk",
+                    "id": chunk_id,
+                    "created": chunk_created,
                 }
         else:
             # Standard content chunk
             yield {
+                "choices": [
+                    {"index": 0, "delta": delta, "finish_reason": finish_reason}
+                ],
+                "model": model_id,
+                "object": "chat.completion.chunk",
+                "id": chunk_id,
+                "created": chunk_created,
             }
+    def _stream_to_completion_response(
+        self, chunks: List[litellm.ModelResponse]
+    ) -> litellm.ModelResponse:
         """
         Manually reassembles streaming chunks into a complete response.
         Key improvements:
         - Determines finish_reason based on accumulated state (tool_calls vs stop)
         - Properly initializes tool_calls with type field
         final_message = {"role": "assistant"}
         aggregated_tool_calls = {}
         usage_data = None
+        chunk_finish_reason = (
+            None  # Track finish_reason from chunks (but we'll override)
+        )
         # Get the first chunk for basic response metadata
         first_chunk = chunks[0]
         # Process each chunk to aggregate content
         for chunk in chunks:
+            if not hasattr(chunk, "choices") or not chunk.choices:
                 continue
             choice = chunk.choices[0]
                     index = tc_chunk.get("index", 0)
                     if index not in aggregated_tool_calls:
                         # Initialize with type field for OpenAI compatibility
+                        aggregated_tool_calls[index] = {
+                            "type": "function",
+                            "function": {"name": "", "arguments": ""},
+                        }
                     if "id" in tc_chunk:
                         aggregated_tool_calls[index]["id"] = tc_chunk["id"]
                     if "type" in tc_chunk:
                         aggregated_tool_calls[index]["type"] = tc_chunk["type"]
                     if "function" in tc_chunk:
+                        if (
+                            "name" in tc_chunk["function"]
+                            and tc_chunk["function"]["name"] is not None
+                        ):
+                            aggregated_tool_calls[index]["function"]["name"] += (
+                                tc_chunk["function"]["name"]
+                            )
+                        if (
+                            "arguments" in tc_chunk["function"]
+                            and tc_chunk["function"]["arguments"] is not None
+                        ):
+                            aggregated_tool_calls[index]["function"]["arguments"] += (
+                                tc_chunk["function"]["arguments"]
+                            )
             # Aggregate function calls (legacy format)
             if "function_call" in delta and delta["function_call"] is not None:
                 if "function_call" not in final_message:
                     final_message["function_call"] = {"name": "", "arguments": ""}
+                if (
+                    "name" in delta["function_call"]
+                    and delta["function_call"]["name"] is not None
+                ):
+                    final_message["function_call"]["name"] += delta["function_call"][
+                        "name"
+                    ]
+                if (
+                    "arguments" in delta["function_call"]
+                    and delta["function_call"]["arguments"] is not None
+                ):
+                    final_message["function_call"]["arguments"] += delta[
+                        "function_call"
+                    ]["arguments"]
             # Track finish_reason from chunks (for reference only)
             if choice.get("finish_reason"):
         # Handle usage data from the last chunk that has it
         for chunk in reversed(chunks):
+            if hasattr(chunk, "usage") and chunk.usage:
                 usage_data = chunk.usage
                 break
         final_choice = {
             "index": 0,
             "message": final_message,
+            "finish_reason": finish_reason,
         }
         # Create the final ModelResponse
             "created": first_chunk.created,
             "model": first_chunk.model,
             "choices": [final_choice],
+            "usage": usage_data,
         }
         return litellm.ModelResponse(**final_response_data)
+    async def acompletion(
+        self, client: httpx.AsyncClient, **kwargs
+    ) -> Union[litellm.ModelResponse, AsyncGenerator[litellm.ModelResponse, None]]:
         credential_path = kwargs.pop("credential_identifier")
         enable_request_logging = kwargs.pop("enable_request_logging", False)
         model = kwargs["model"]
         # Create dedicated file logger for this request
         file_logger = _QwenCodeFileLogger(
+            model_name=model, enabled=enable_request_logging
         )
         async def make_request():
             api_base, access_token = await self.get_api_details(credential_path)
             # Strip provider prefix from model name (e.g., "qwen_code/qwen3-coder-plus" -> "qwen3-coder-plus")
+            model_name = model.split("/")[-1]
+            kwargs_with_stripped_model = {**kwargs, "model": model_name}
             # Build clean payload with only supported parameters
             payload = self._build_request_payload(**kwargs_with_stripped_model)
             file_logger.log_request(payload)
             lib_logger.debug(f"Qwen Code Request URL: {url}")
+            return client.stream(
+                "POST",
+                url,
+                headers=headers,
+                json=payload,
+                timeout=TimeoutConfig.streaming(),
+            )
         async def stream_handler(response_stream, attempt=1):
             """Handles the streaming response and converts chunks."""
                     # Check for HTTP errors before processing stream
                     if response.status_code >= 400:
                         error_text = await response.aread()
+                        error_text = (
+                            error_text.decode("utf-8")
+                            if isinstance(error_text, bytes)
+                            else error_text
+                        )
                         # Handle 401: Force token refresh and retry once
                         if response.status_code == 401 and attempt == 1:
+                            lib_logger.warning(
+                                "Qwen Code returned 401. Forcing token refresh and retrying once."
+                            )
                             await self._refresh_token(credential_path, force=True)
                             retry_stream = await make_request()
                             async for chunk in stream_handler(retry_stream, attempt=2):
                             return
                         # Handle 429: Rate limit
+                        elif (
+                            response.status_code == 429
+                            or "slow_down" in error_text.lower()
+                        ):
                             raise RateLimitError(
                                 f"Qwen Code rate limit exceeded: {error_text}",
                                 llm_provider="qwen_code",
                                 model=model,
+                                response=response,
                             )
                         # Handle other errors
                             raise httpx.HTTPStatusError(
                                 f"HTTP {response.status_code}: {error_text}",
                                 request=response.request,
+                                response=response,
                             )
                     # Process successful streaming response
                     async for line in response.aiter_lines():
                         file_logger.log_response_chunk(line)
+                        if line.startswith("data: "):
                             data_str = line[6:]
                             if data_str == "[DONE]":
                                 break
                             try:
                                 chunk = json.loads(data_str)
+                                for openai_chunk in self._convert_chunk_to_openai(
+                                    chunk, model
+                                ):
                                     yield litellm.ModelResponse(**openai_chunk)
                             except json.JSONDecodeError:
+                                lib_logger.warning(
+                                    f"Could not decode JSON from Qwen Code: {line}"
+                                )
             except httpx.HTTPStatusError:
                 raise  # Re-raise HTTP errors we already handled
             except Exception as e:
                 file_logger.log_error(f"Error during Qwen Code stream processing: {e}")
+                lib_logger.error(
+                    f"Error during Qwen Code stream processing: {e}", exc_info=True
+                )
                 raise
         async def logging_stream_wrapper():
         if kwargs.get("stream"):
             return logging_stream_wrapper()
         else:
             async def non_stream_wrapper():
                 chunks = [chunk async for chunk in logging_stream_wrapper()]
                 return self._stream_to_completion_response(chunks)
+            return await non_stream_wrapper()
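
Note on the `<think>` handling in `_convert_chunk_to_openai` above: the marker `THINK||` itself contains the `||` separator, so `"||THINK||abc".split("||")` produces `["", "THINK", "abc"]` and the `startswith(self.REASONING_START_MARKER)` branch may not match as intended. The following is a minimal sketch of one marker-free alternative, not the code in this commit. The function name `split_think_segments` and the `in_reasoning` carry-over flag are hypothetical, introduced only for illustration; it assumes the tags arrive whole within a single chunk.

```python
import re

# Hypothetical helper, not part of the commit: splits a content string on
# <think> / </think> tags and labels each text run by the field it would
# populate in an OpenAI-style delta.
_THINK_TAG = re.compile(r"(</?think>)")


def split_think_segments(content, in_reasoning=False):
    """Return (segments, in_reasoning).

    segments is a list of (field_name, text) tuples, where field_name is
    "reasoning_content" or "content". in_reasoning is the state to pass
    into the next call, so a reasoning block can span several chunks.
    """
    segments = []
    # The capture group makes re.split keep the tags as separate pieces.
    for piece in _THINK_TAG.split(content):
        if piece == "<think>":
            in_reasoning = True
        elif piece == "</think>":
            in_reasoning = False
        elif piece:  # skip empty strings produced around the tags
            field = "reasoning_content" if in_reasoning else "content"
            segments.append((field, piece))
    return segments, in_reasoning
```

Each `(field_name, text)` pair could then be turned into a delta such as `{field_name: text}`. Carrying `in_reasoning` between calls is a design choice that lets text between an opening tag in one chunk and a closing tag in a later chunk stay classified as reasoning.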

src/rotator_library/timeout_config.py ADDED

@@ -0,0 +1,102 @@

+# src/rotator_library/timeout_config.py
+"""
+Centralized timeout configuration for HTTP requests.
+All values can be overridden via environment variables:
+    TIMEOUT_CONNECT - Connection establishment timeout (default: 30s)
+    TIMEOUT_WRITE - Request body send timeout (default: 30s)
+    TIMEOUT_POOL - Connection pool acquisition timeout (default: 60s)
+    TIMEOUT_READ_STREAMING - Read timeout between chunks for streaming (default: 180s / 3 min)
+    TIMEOUT_READ_NON_STREAMING - Read timeout for non-streaming responses (default: 600s / 10 min)
+"""
+import os
+import logging
+import httpx
+lib_logger = logging.getLogger("rotator_library")
+class TimeoutConfig:
+    """
+    Centralized timeout configuration for HTTP requests.
+    All values can be overridden via environment variables.
+    """
+    # Default values (in seconds)
+    _CONNECT = 30.0
+    _WRITE = 30.0
+    _POOL = 60.0
+    _READ_STREAMING = 180.0  # 3 minutes between chunks
+    _READ_NON_STREAMING = 600.0  # 10 minutes for full response
+    @classmethod
+    def _get_env_float(cls, key: str, default: float) -> float:
+        """Get a float value from environment variable, or return default."""
+        value = os.environ.get(key)
+        if value is not None:
+            try:
+                return float(value)
+            except ValueError:
+                lib_logger.warning(
+                    f"Invalid value for {key}: {value}. Using default: {default}"
+                )
+        return default
+    @classmethod
+    def connect(cls) -> float:
+        """Connection establishment timeout."""
+        return cls._get_env_float("TIMEOUT_CONNECT", cls._CONNECT)
+    @classmethod
+    def write(cls) -> float:
+        """Request body send timeout."""
+        return cls._get_env_float("TIMEOUT_WRITE", cls._WRITE)
+    @classmethod
+    def pool(cls) -> float:
+        """Connection pool acquisition timeout."""
+        return cls._get_env_float("TIMEOUT_POOL", cls._POOL)
+    @classmethod
+    def read_streaming(cls) -> float:
+        """Read timeout between chunks for streaming requests."""
+        return cls._get_env_float("TIMEOUT_READ_STREAMING", cls._READ_STREAMING)
+    @classmethod
+    def read_non_streaming(cls) -> float:
+        """Read timeout for non-streaming responses."""
+        return cls._get_env_float("TIMEOUT_READ_NON_STREAMING", cls._READ_NON_STREAMING)
+    @classmethod
+    def streaming(cls) -> httpx.Timeout:
+        """
+        Timeout configuration for streaming LLM requests.
+        Uses a shorter read timeout (default 3 min) since we expect
+        periodic chunks. If no data arrives for this duration, the
+        connection is considered stalled.
+        """
+        return httpx.Timeout(
+            connect=cls.connect(),
+            read=cls.read_streaming(),
+            write=cls.write(),
+            pool=cls.pool(),
+        )
+    @classmethod
+    def non_streaming(cls) -> httpx.Timeout:
+        """
+        Timeout configuration for non-streaming LLM requests.
+        Uses a longer read timeout (default 10 min) since the server
+        may take significant time to generate the complete response
+        before sending anything back.
+        """
+        return httpx.Timeout(
+            connect=cls.connect(),
+            read=cls.read_non_streaming(),
+            write=cls.write(),
+            pool=cls.pool(),
+        )
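
The override-with-fallback behaviour of `_get_env_float` can be sketched without `httpx`. This is a minimal illustration, not the library's code: `env_float` and `streaming_timeouts` are hypothetical names, and the dictionary stands in for the four keyword arguments that `TimeoutConfig.streaming()` passes to `httpx.Timeout`.

```python
import os


def env_float(key: str, default: float) -> float:
    """Return the env var parsed as float, or the default if unset or unparseable."""
    raw = os.environ.get(key)
    if raw is None:
        return default
    try:
        return float(raw)
    except ValueError:
        # Same fallback as the diff above (which also logs a warning).
        return default


def streaming_timeouts() -> dict:
    """Hypothetical helper: the four values used for streaming requests."""
    return {
        "connect": env_float("TIMEOUT_CONNECT", 30.0),
        "read": env_float("TIMEOUT_READ_STREAMING", 180.0),
        "write": env_float("TIMEOUT_WRITE", 30.0),
        "pool": env_float("TIMEOUT_POOL", 60.0),
    }


# A valid override is honoured; an invalid one falls back to the default.
os.environ["TIMEOUT_READ_STREAMING"] = "45"
os.environ["TIMEOUT_CONNECT"] = "not-a-number"
timeouts = streaming_timeouts()
print(timeouts)
```

Note that strings such as `"-5"` or `"inf"` parse successfully with `float()`, so a parser of this shape returns them unchanged; range validation, if wanted, would need a separate check.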

src/rotator_library/usage_manager.py CHANGED

@@ -5,12 +5,15 @@ import logging
 import asyncio
 import random
 from datetime import date, datetime, timezone, time as dt_time
-from typing import Any, Dict, List, Optional, Set, Tuple
 import aiofiles
 import litellm
 from .error_handler import ClassifiedError, NoAvailableKeysError, mask_credential
 from .providers import PROVIDER_PLUGINS
 lib_logger = logging.getLogger("rotator_library")
 lib_logger.propagate = False
@@ -50,7 +53,7 @@ class UsageManager:
     def __init__(
         self,
-        file_path: str = "key_usage.json",
         daily_reset_time_utc: Optional[str] = "03:00",
         rotation_tolerance: float = 0.0,
         provider_rotation_modes: Optional[Dict[str, str]] = None,
@@ -65,7 +68,8 @@ class UsageManager:
         Initialize the UsageManager.
         Args:
-            file_path: Path to the usage data JSON file
             daily_reset_time_utc: Time in UTC when daily stats should reset (HH:MM format)
             rotation_tolerance: Tolerance for weighted random credential rotation.
                 - 0.0: Deterministic, least-used credential always selected
@@ -85,7 +89,14 @@ class UsageManager:
                 Used in sequential mode when priority not in priority_multipliers.
                 Example: {"antigravity": 2}
         """
-        self.file_path = file_path
         self.rotation_tolerance = rotation_tolerance
         self.provider_rotation_modes = provider_rotation_modes or {}
         self.provider_plugins = provider_plugins or PROVIDER_PLUGINS
@@ -103,6 +114,9 @@ class UsageManager:
         self._timeout_lock = asyncio.Lock()
         self._claimed_on_timeout: Set[str] = set()
         if daily_reset_time_utc:
             hour, minute = map(int, daily_reset_time_utc.split(":"))
             self.daily_reset_time_utc = dt_time(
@@ -540,27 +554,40 @@ class UsageManager:
                 self._initialized.set()
     async def _load_usage(self):
-        """Loads usage data from the JSON file asynchronously."""
         async with self._data_lock:
             if not os.path.exists(self.file_path):
                 self._usage_data = {}
                 return
             try:
                 async with aiofiles.open(self.file_path, "r") as f:
                     content = await f.read()
-                    self._usage_data = json.loads(content)
-            except (json.JSONDecodeError, IOError, FileNotFoundError):
                 self._usage_data = {}
     async def _save_usage(self):
-        """Saves the current usage data to the JSON file asynchronously."""
         if self._usage_data is None:
             return
         async with self._data_lock:
             # Add human-readable timestamp fields before saving
             self._add_readable_timestamps(self._usage_data)
-            async with aiofiles.open(self.file_path, "w") as f:
-                await f.write(json.dumps(self._usage_data, indent=2))
     async def _reset_daily_stats_if_needed(self):
         """

 import asyncio
 import random
 from datetime import date, datetime, timezone, time as dt_time
+from pathlib import Path
+from typing import Any, Dict, List, Optional, Set, Tuple, Union
 import aiofiles
 import litellm
 from .error_handler import ClassifiedError, NoAvailableKeysError, mask_credential
 from .providers import PROVIDER_PLUGINS
+from .utils.resilient_io import ResilientStateWriter
+from .utils.paths import get_data_file
 lib_logger = logging.getLogger("rotator_library")
 lib_logger.propagate = False
     def __init__(
         self,
+        file_path: Optional[Union[str, Path]] = None,
         daily_reset_time_utc: Optional[str] = "03:00",
         rotation_tolerance: float = 0.0,
         provider_rotation_modes: Optional[Dict[str, str]] = None,
         Initialize the UsageManager.
         Args:
+            file_path: Path to the usage data JSON file. If None, uses get_data_file("key_usage.json").
+                       Can be absolute Path, relative Path, or string.
             daily_reset_time_utc: Time in UTC when daily stats should reset (HH:MM format)
             rotation_tolerance: Tolerance for weighted random credential rotation.
                 - 0.0: Deterministic, least-used credential always selected
                 Used in sequential mode when priority not in priority_multipliers.
                 Example: {"antigravity": 2}
         """
+        # Resolve file_path - use default if not provided
+        if file_path is None:
+            self.file_path = str(get_data_file("key_usage.json"))
+        elif isinstance(file_path, Path):
+            self.file_path = str(file_path)
+        else:
+            # String path - could be relative or absolute
+            self.file_path = file_path
         self.rotation_tolerance = rotation_tolerance
         self.provider_rotation_modes = provider_rotation_modes or {}
         self.provider_plugins = provider_plugins or PROVIDER_PLUGINS
         self._timeout_lock = asyncio.Lock()
         self._claimed_on_timeout: Set[str] = set()
+        # Resilient writer for usage data persistence
+        self._state_writer = ResilientStateWriter(self.file_path, lib_logger)
         if daily_reset_time_utc:
             hour, minute = map(int, daily_reset_time_utc.split(":"))
             self.daily_reset_time_utc = dt_time(
                 self._initialized.set()
     async def _load_usage(self):
+        """Loads usage data from the JSON file asynchronously with resilience."""
         async with self._data_lock:
             if not os.path.exists(self.file_path):
                 self._usage_data = {}
                 return
             try:
                 async with aiofiles.open(self.file_path, "r") as f:
                     content = await f.read()
+                    self._usage_data = json.loads(content) if content.strip() else {}
+            except FileNotFoundError:
+                # File deleted between exists check and open
+                self._usage_data = {}
+            except json.JSONDecodeError as e:
+                lib_logger.warning(
+                    f"Corrupted usage file {self.file_path}: {e}. Starting fresh."
+                )
+                self._usage_data = {}
+            except (OSError, PermissionError, IOError) as e:
+                lib_logger.warning(
+                    f"Cannot read usage file {self.file_path}: {e}. Using empty state."
+                )
                 self._usage_data = {}
     async def _save_usage(self):
+        """Saves the current usage data using the resilient state writer."""
         if self._usage_data is None:
             return
         async with self._data_lock:
             # Add human-readable timestamp fields before saving
             self._add_readable_timestamps(self._usage_data)
+            # Hand off to resilient writer - handles retries and disk failures
+            self._state_writer.write(self._usage_data)
     async def _reset_daily_stats_if_needed(self):
         """
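
The tolerant loading in `_load_usage` (missing, empty, or corrupted file all yield an empty dict) can be sketched as follows. This is a simplified, synchronous illustration under stated assumptions: it uses `pathlib` instead of `aiofiles`, omits the lock and the logging, and `load_usage` is a hypothetical standalone function rather than the method from the diff.

```python
import json
import tempfile
from pathlib import Path


def load_usage(path: Path) -> dict:
    """Return parsed usage data, or {} when the file is missing, empty, or corrupt."""
    try:
        content = path.read_text(encoding="utf-8")
    except OSError:
        # Covers a missing file (FileNotFoundError) and permission problems.
        return {}
    if not content.strip():
        # An empty or whitespace-only file is treated as "no data yet".
        return {}
    try:
        return json.loads(content)
    except json.JSONDecodeError:
        # Corrupted file: start fresh instead of crashing.
        return {}


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "empty.json").write_text("", encoding="utf-8")
    (root / "corrupt.json").write_text('{"key": ', encoding="utf-8")
    (root / "good.json").write_text('{"key": {"requests": 3}}', encoding="utf-8")

    results = {
        name: load_usage(root / name)
        for name in ("missing.json", "empty.json", "corrupt.json", "good.json")
    }
print(results)
```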

src/rotator_library/utils/__init__.py CHANGED

@@ -1,6 +1,34 @@
 # src/rotator_library/utils/__init__.py
 from .headless_detection import is_headless_environment
 from .reauth_coordinator import get_reauth_coordinator, ReauthCoordinator
-__all__ = ["is_headless_environment", "get_reauth_coordinator", "ReauthCoordinator"]

 # src/rotator_library/utils/__init__.py
 from .headless_detection import is_headless_environment
+from .paths import (
+    get_default_root,
+    get_logs_dir,
+    get_cache_dir,
+    get_oauth_dir,
+    get_data_file,
+)
 from .reauth_coordinator import get_reauth_coordinator, ReauthCoordinator
+from .resilient_io import (
+    BufferedWriteRegistry,
+    ResilientStateWriter,
+    safe_write_json,
+    safe_log_write,
+    safe_mkdir,
+)
+__all__ = [
+    "is_headless_environment",
+    "get_default_root",
+    "get_logs_dir",
+    "get_cache_dir",
+    "get_oauth_dir",
+    "get_data_file",
+    "get_reauth_coordinator",
+    "ReauthCoordinator",
+    "BufferedWriteRegistry",
+    "ResilientStateWriter",
+    "safe_write_json",
+    "safe_log_write",
+    "safe_mkdir",
+]

src/rotator_library/utils/paths.py ADDED

	@@ -0,0 +1,99 @@

+# src/rotator_library/utils/paths.py
+"""
+Centralized path management for the rotator library.
+Supports two runtime modes:
+1. PyInstaller EXE -> files in the directory containing the executable
+2. Script/Library  -> files in the current working directory (overridable)
+Library users can override by passing `data_dir` to RotatingClient.
+"""
+import sys
+from pathlib import Path
+from typing import Optional, Union
+def get_default_root() -> Path:
+    """
+    Get the default root directory for data files.
+    - EXE mode (PyInstaller): directory containing the executable
+    - Otherwise: current working directory
+    Returns:
+        Path to the root directory
+    """
+    if getattr(sys, "frozen", False):
+        # Running as PyInstaller bundle - use executable's directory
+        return Path(sys.executable).parent
+    # Running as script or library - use current working directory
+    return Path.cwd()
+def get_logs_dir(root: Optional[Union[Path, str]] = None) -> Path:
+    """
+    Get the logs directory, creating it if needed.
+    Args:
+        root: Optional root directory. If None, uses get_default_root().
+    Returns:
+        Path to the logs directory
+    """
+    base = Path(root) if root else get_default_root()
+    logs_dir = base / "logs"
+    logs_dir.mkdir(exist_ok=True)
+    return logs_dir
+def get_cache_dir(
+    root: Optional[Union[Path, str]] = None, subdir: Optional[str] = None
+) -> Path:
+    """
+    Get the cache directory, optionally with a subdirectory.
+    Args:
+        root: Optional root directory. If None, uses get_default_root().
+        subdir: Optional subdirectory name (e.g., "gemini_cli", "antigravity")
+    Returns:
+        Path to the cache directory (or subdirectory)
+    """
+    base = Path(root) if root else get_default_root()
+    cache_dir = base / "cache"
+    if subdir:
+        cache_dir = cache_dir / subdir
+    cache_dir.mkdir(parents=True, exist_ok=True)
+    return cache_dir
+def get_oauth_dir(root: Optional[Union[Path, str]] = None) -> Path:
+    """
+    Get the OAuth credentials directory, creating it if needed.
+    Args:
+        root: Optional root directory. If None, uses get_default_root().
+    Returns:
+        Path to the oauth_creds directory
+    """
+    base = Path(root) if root else get_default_root()
+    oauth_dir = base / "oauth_creds"
+    oauth_dir.mkdir(exist_ok=True)
+    return oauth_dir
+def get_data_file(filename: str, root: Optional[Union[Path, str]] = None) -> Path:
+    """
+    Get the path to a data file in the root directory.
+    Args:
+        filename: Name of the file (e.g., "key_usage.json", ".env")
+        root: Optional root directory. If None, uses get_default_root().
+    Returns:
+        Path to the file (does not create the file)
+    """
+    base = Path(root) if root else get_default_root()
+    return base / filename
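
One detail worth noting when passing a custom `root`: `get_logs_dir` and `get_oauth_dir` call `mkdir(exist_ok=True)`, while `get_cache_dir` passes `parents=True` as well. The sketch below reproduces only the two `mkdir` calls on a temporary directory to show the difference when the root does not exist yet; the directory names are illustrative and nothing here imports the library.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    missing_root = Path(tmp) / "not_created_yet"

    # Same call shape as get_logs_dir: without parents=True,
    # a root that does not exist yet raises FileNotFoundError.
    try:
        (missing_root / "logs").mkdir(exist_ok=True)
        logs_created = True
    except FileNotFoundError:
        logs_created = False

    # Same call shape as get_cache_dir: parents=True creates the
    # missing root (and "cache") as a side effect.
    cache_dir = missing_root / "cache" / "gemini_cli"
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_created = cache_dir.is_dir()

print(logs_created, cache_created)
```

In practice this means a caller supplying a fresh `root` may want to create that directory first, or request the cache directory before the logs directory.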

src/rotator_library/utils/resilient_io.py ADDED

	@@ -0,0 +1,665 @@

+# src/rotator_library/utils/resilient_io.py
+"""
+Resilient I/O utilities for handling file operations gracefully.
+Provides four main patterns:
+1. BufferedWriteRegistry - Global singleton for buffered writes with periodic
+   retry and shutdown flush. Ensures data is saved on app exit (Ctrl+C).
+2. ResilientStateWriter - For stateful files (usage.json) that should be
+   buffered in memory and retried on disk failure.
+3. safe_write_json (with buffer_on_failure) - For critical files (auth tokens)
+   that should be buffered and retried if write fails.
+4. safe_log_write - For logs that can be dropped on failure.
+"""
+import atexit
+import json
+import os
+import shutil
+import tempfile
+import threading
+import time
+import logging
+from pathlib import Path
+from typing import Any, Callable, Dict, Optional, Tuple, Union
+# =============================================================================
+# BUFFERED WRITE REGISTRY (SINGLETON)
+# =============================================================================
+class BufferedWriteRegistry:
+    """
+    Global singleton registry for buffered writes with periodic retry and shutdown flush.
+    This ensures that critical data (auth tokens, usage stats) is saved even if
+    disk writes fail temporarily. On app exit (including Ctrl+C), all pending
+    writes are flushed.
+    Features:
+    - Per-file buffering: each file path has its own pending write
+    - Periodic retries: background thread retries failed writes every N seconds
+    - Shutdown flush: atexit hook ensures final write attempt on app exit
+    - Thread-safe: safe for concurrent access from multiple threads
+    Usage:
+        # Get the singleton instance
+        registry = BufferedWriteRegistry.get_instance()
+        # Register a pending write (usually called by safe_write_json on failure)
+        registry.register_pending(path, data, serializer_fn, options)
+        # Manual flush (optional - atexit handles this automatically)
+        results = registry.flush_all()
+    """
+    _instance: Optional["BufferedWriteRegistry"] = None
+    _instance_lock = threading.Lock()
+    def __init__(self, retry_interval: float = 30.0):
+        """
+        Initialize the registry. Use get_instance() instead of direct construction.
+        Args:
+            retry_interval: Seconds between retry attempts (default: 30)
+        """
+        self._pending: Dict[str, Tuple[Any, Callable[[Any], str], Dict[str, Any]]] = {}
+        self._retry_interval = retry_interval
+        self._lock = threading.Lock()
+        self._running = False
+        self._retry_thread: Optional[threading.Thread] = None
+        self._logger = logging.getLogger("rotator_library.resilient_io")
+        # Start background retry thread
+        self._start_retry_thread()
+        # Register atexit handler for shutdown flush
+        atexit.register(self._atexit_handler)
+    @classmethod
+    def get_instance(cls, retry_interval: float = 30.0) -> "BufferedWriteRegistry":
+        """
+        Get or create the singleton instance.
+        Args:
+            retry_interval: Seconds between retry attempts (only used on first call)
+        Returns:
+            The singleton BufferedWriteRegistry instance
+        """
+        if cls._instance is None:
+            with cls._instance_lock:
+                if cls._instance is None:
+                    cls._instance = cls(retry_interval)
+        return cls._instance
+    def _start_retry_thread(self) -> None:
+        """Start the background retry thread."""
+        if self._running:
+            return
+        self._running = True
+        self._retry_thread = threading.Thread(
+            target=self._retry_loop,
+            name="BufferedWriteRegistry-Retry",
+            daemon=True,  # Daemon so it doesn't block app exit
+        )
+        self._retry_thread.start()
+    def _retry_loop(self) -> None:
+        """Background thread: periodically retry pending writes."""
+        while self._running:
+            time.sleep(self._retry_interval)
+            if not self._running:
+                break
+            self._retry_pending()
+    def _retry_pending(self) -> None:
+        """Attempt to write all pending files."""
+        with self._lock:
+            if not self._pending:
+                return
+            # Copy paths to avoid modifying dict during iteration
+            paths = list(self._pending.keys())
+        for path_str in paths:
+            self._try_write(path_str, remove_on_success=True)
+    def register_pending(
+        self,
+        path: Union[str, Path],
+        data: Any,
+        serializer: Callable[[Any], str],
+        options: Optional[Dict[str, Any]] = None,
+    ) -> None:
+        """
+        Register a pending write for later retry.
+        If a write is already pending for this path, it is replaced with the new data
+        (we always want to write the latest state).
+        Args:
+            path: File path to write to
+            data: Data to serialize and write
+            serializer: Function to serialize data to string
+            options: Additional options (e.g., secure_permissions)
+        """
+        path_str = str(Path(path).resolve())
+        with self._lock:
+            self._pending[path_str] = (data, serializer, options or {})
+            self._logger.debug(f"Registered pending write for {Path(path).name}")
+    def unregister(self, path: Union[str, Path]) -> None:
+        """
+        Remove a pending write (called when write succeeds elsewhere).
+        Args:
+            path: File path to remove from pending
+        """
+        path_str = str(Path(path).resolve())
+        with self._lock:
+            self._pending.pop(path_str, None)
+    def _try_write(self, path_str: str, remove_on_success: bool = True) -> bool:
+        """
+        Attempt to write a pending file.
+        Args:
+            path_str: Resolved path string
+            remove_on_success: Remove from pending if successful
+        Returns:
+            True if write succeeded, False otherwise
+        """
+        with self._lock:
+            if path_str not in self._pending:
+                return True
+            data, serializer, options = self._pending[path_str]
+        path = Path(path_str)
+        try:
+            # Ensure directory exists
+            path.parent.mkdir(parents=True, exist_ok=True)
+            # Serialize data
+            content = serializer(data)
+            # Atomic write
+            tmp_fd = None
+            tmp_path = None
+            try:
+                tmp_fd, tmp_path = tempfile.mkstemp(
+                    dir=path.parent, prefix=".tmp_", suffix=".json", text=True
+                )
+                with os.fdopen(tmp_fd, "w", encoding="utf-8") as f:
+                    f.write(content)
+                    tmp_fd = None
+                # Set secure permissions if requested
+                if options.get("secure_permissions"):
+                    try:
+                        os.chmod(tmp_path, 0o600)
+                    except (OSError, AttributeError):
+                        pass
+                shutil.move(tmp_path, path)
+                tmp_path = None
+            finally:
+                if tmp_fd is not None:
+                    try:
+                        os.close(tmp_fd)
+                    except OSError:
+                        pass
+                if tmp_path and os.path.exists(tmp_path):
+                    try:
+                        os.unlink(tmp_path)
+                    except OSError:
+                        pass
+            # Success - remove from pending
+            if remove_on_success:
+                with self._lock:
+                    self._pending.pop(path_str, None)
+            self._logger.debug(f"Retry succeeded for {path.name}")
+            return True
+        except (OSError, PermissionError, IOError) as e:
+            self._logger.debug(f"Retry failed for {path.name}: {e}")
+            return False
+    def flush_all(self) -> Dict[str, bool]:
+        """
+        Attempt to write all pending files immediately.
+        Returns:
+            Dict mapping file paths to success status
+        """
+        with self._lock:
+            paths = list(self._pending.keys())
+        results = {}
+        for path_str in paths:
+            results[path_str] = self._try_write(path_str, remove_on_success=True)
+        return results
+    def _atexit_handler(self) -> None:
+        """Called on app exit to flush pending writes."""
+        self._running = False
+        with self._lock:
+            pending_count = len(self._pending)
+        if pending_count == 0:
+            return
+        self._logger.info(f"Flushing {pending_count} pending write(s) on shutdown...")
+        results = self.flush_all()
+        succeeded = sum(1 for v in results.values() if v)
+        failed = pending_count - succeeded
+        if failed > 0:
+            self._logger.warning(
+                f"Shutdown flush: {succeeded} succeeded, {failed} failed"
+            )
+            for path_str, success in results.items():
+                if not success:
+                    self._logger.warning(f"  Failed to save: {Path(path_str).name}")
+        else:
+            self._logger.info(f"Shutdown flush: all {succeeded} write(s) succeeded")
+    def get_pending_count(self) -> int:
+        """Get the number of pending writes."""
+        with self._lock:
+            return len(self._pending)
+    def get_pending_paths(self) -> list:
+        """Get list of paths with pending writes (for monitoring)."""
+        with self._lock:
+            return [Path(p).name for p in self._pending.keys()]
+    def shutdown(self) -> Dict[str, bool]:
+        """
+        Manually trigger shutdown: stop retry thread and flush all pending writes.
+        Returns:
+            Dict mapping file paths to success status
+        """
+        self._running = False
+        if self._retry_thread and self._retry_thread.is_alive():
+            self._retry_thread.join(timeout=1.0)
+        return self.flush_all()
+# =============================================================================
+# RESILIENT STATE WRITER
+# =============================================================================
+class ResilientStateWriter:
+    """
+    Manages resilient writes for stateful files (usage stats, credentials, cache).
+    Design:
+    - Caller hands off data via write() - always succeeds (memory update)
+    - Attempts disk write immediately
+    - If disk fails, retries periodically in background
+    - On recovery, writes full current state (not just new data)
+    Thread-safe for use in async contexts with sync file I/O.
+    Usage:
+        writer = ResilientStateWriter("data.json", logger)
+        writer.write({"key": "value"})  # Always succeeds
+        # ... later ...
+        if not writer.is_healthy:
+            logger.warning("Disk writes failing, data in memory only")
+    """
+    def __init__(
+        self,
+        path: Union[str, Path],
+        logger: logging.Logger,
+        retry_interval: float = 30.0,
+        serializer: Optional[Callable[[Any], str]] = None,
+    ):
+        """
+        Initialize the resilient writer.
+        Args:
+            path: File path to write to
+            logger: Logger for warnings/errors
+            retry_interval: Seconds between retry attempts when disk is unhealthy
+            serializer: Custom serializer function (defaults to JSON with indent=2)
+        """
+        self.path = Path(path)
+        self.logger = logger
+        self.retry_interval = retry_interval
+        self._serializer = serializer or (lambda d: json.dumps(d, indent=2))
+        self._current_state: Optional[Any] = None
+        self._disk_healthy = True
+        self._last_attempt: float = 0
+        self._last_success: Optional[float] = None
+        self._failure_count = 0
+        self._lock = threading.Lock()
+    def write(self, data: Any) -> bool:
+        """
+        Update state and attempt disk write.
+        Always updates in-memory state (guaranteed to succeed).
+        Attempts disk write - if disk is unhealthy, respects retry_interval
+        before attempting again to avoid flooding with failed writes.
+        Args:
+            data: Data to persist (must be serializable)
+        Returns:
+            True if disk write succeeded, False if failed (data still in memory)
+        """
+        with self._lock:
+            self._current_state = data
+            # If disk is unhealthy, only retry after retry_interval has passed
+            if not self._disk_healthy:
+                now = time.time()
+                if now - self._last_attempt < self.retry_interval:
+                    # Too soon to retry, data is safe in memory
+                    return False
+            return self._try_disk_write()
+    def retry_if_needed(self) -> bool:
+        """
+        Retry disk write if unhealthy and retry interval has passed.
+        Call this periodically (e.g., on each save attempt) to recover
+        from transient disk failures.
+        Returns:
+            True if healthy (no retry needed or retry succeeded)
+        """
+        with self._lock:
+            if self._disk_healthy:
+                return True
+            if self._current_state is None:
+                return True
+            now = time.time()
+            if now - self._last_attempt < self.retry_interval:
+                return False
+            return self._try_disk_write()
+    def _try_disk_write(self) -> bool:
+        """
+        Attempt atomic write to disk. Updates health status.
+        Uses the tempfile + move pattern, which is atomic on POSIX systems.
+        On Windows, shutil.move falls back to copy + delete when the target
+        already exists, so the replace is not atomic (still safe for our use case).
+        Also registers/unregisters with BufferedWriteRegistry for shutdown flush.
+        """
+        if self._current_state is None:
+            return True
+        self._last_attempt = time.time()
+        try:
+            # Ensure directory exists
+            self.path.parent.mkdir(parents=True, exist_ok=True)
+            # Serialize data
+            content = self._serializer(self._current_state)
+            # Atomic write: write to temp file, then move
+            tmp_fd = None
+            tmp_path = None
+            try:
+                tmp_fd, tmp_path = tempfile.mkstemp(
+                    dir=self.path.parent, prefix=".tmp_", suffix=".json", text=True
+                )
+                with os.fdopen(tmp_fd, "w", encoding="utf-8") as f:
+                    f.write(content)
+                    tmp_fd = None  # fdopen closes the fd
+                # Atomic move
+                shutil.move(tmp_path, self.path)
+                tmp_path = None
+            finally:
+                # Cleanup on failure
+                if tmp_fd is not None:
+                    try:
+                        os.close(tmp_fd)
+                    except OSError:
+                        pass
+                if tmp_path and os.path.exists(tmp_path):
+                    try:
+                        os.unlink(tmp_path)
+                    except OSError:
+                        pass
+            # Success - update health and unregister from shutdown flush
+            self._disk_healthy = True
+            self._last_success = time.time()
+            self._failure_count = 0
+            BufferedWriteRegistry.get_instance().unregister(self.path)
+            return True
+        except (OSError, PermissionError, IOError) as e:
+            self._disk_healthy = False
+            self._failure_count += 1
+            # Register with BufferedWriteRegistry for shutdown flush
+            registry = BufferedWriteRegistry.get_instance()
+            registry.register_pending(
+                self.path,
+                self._current_state,
+                self._serializer,
+                {},  # No special options for ResilientStateWriter
+            )
+            # Log warning (rate-limited to avoid flooding)
+            if self._failure_count == 1 or self._failure_count % 10 == 0:
+                self.logger.warning(
+                    f"Failed to write {self.path.name}: {e}. "
+                    f"Data retained in memory (failure #{self._failure_count})."
+                )
+            return False
+    @property
+    def is_healthy(self) -> bool:
+        """Check if disk writes are currently working."""
+        return self._disk_healthy
+    @property
+    def current_state(self) -> Optional[Any]:
+        """Get the current in-memory state (for inspection/debugging)."""
+        return self._current_state
+    def get_health_info(self) -> Dict[str, Any]:
+        """
+        Get detailed health information for monitoring.
+        Returns dict with:
+            - healthy: bool
+            - failure_count: int
+            - last_success: Optional[float] (timestamp)
+            - last_attempt: float (timestamp)
+            - path: str
+        """
+        return {
+            "healthy": self._disk_healthy,
+            "failure_count": self._failure_count,
+            "last_success": self._last_success,
+            "last_attempt": self._last_attempt,
+            "path": str(self.path),
+        }
+def safe_write_json(
+    path: Union[str, Path],
+    data: Dict[str, Any],
+    logger: logging.Logger,
+    atomic: bool = True,
+    indent: int = 2,
+    ensure_ascii: bool = True,
+    secure_permissions: bool = False,
+    buffer_on_failure: bool = False,
+) -> bool:
+    """
+    Write JSON data to file with error handling and optional buffering.
+    When buffer_on_failure is True, failed writes are registered with the
+    BufferedWriteRegistry for periodic retry and shutdown flush. This ensures
+    critical data (like auth tokens) is eventually saved.
+    Args:
+        path: File path to write to
+        data: JSON-serializable data
+        logger: Logger for warnings
+        atomic: Use atomic write pattern (tempfile + move)
+        indent: JSON indentation level (default: 2)
+        ensure_ascii: Escape non-ASCII characters (default: True)
+        secure_permissions: Set file permissions to 0o600 (default: False)
+        buffer_on_failure: Register with BufferedWriteRegistry on failure (default: False)
+    Returns:
+        True on success, False on failure (never raises)
+    """
+    path = Path(path)
+    # Create serializer function that matches the requested formatting
+    def serializer(d: Any) -> str:
+        return json.dumps(d, indent=indent, ensure_ascii=ensure_ascii)
+    try:
+        path.parent.mkdir(parents=True, exist_ok=True)
+        content = serializer(data)
+        if atomic:
+            tmp_fd = None
+            tmp_path = None
+            try:
+                tmp_fd, tmp_path = tempfile.mkstemp(
+                    dir=path.parent, prefix=".tmp_", suffix=".json", text=True
+                )
+                with os.fdopen(tmp_fd, "w", encoding="utf-8") as f:
+                    f.write(content)
+                    tmp_fd = None
+                # Set secure permissions if requested (before move for security)
+                if secure_permissions:
+                    try:
+                        os.chmod(tmp_path, 0o600)
+                    except (OSError, AttributeError):
+                        # Windows may not support chmod, ignore
+                        pass
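+                # os.replace overwrites an existing target atomically on both
+                # POSIX and Windows (the temp file lives in the same directory)
+                os.replace(tmp_path, path)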
+                tmp_path = None
+            finally:
+                if tmp_fd is not None:
+                    try:
+                        os.close(tmp_fd)
+                    except OSError:
+                        pass
+                if tmp_path and os.path.exists(tmp_path):
+                    try:
+                        os.unlink(tmp_path)
+                    except OSError:
+                        pass
+        else:
+            with open(path, "w", encoding="utf-8") as f:
+                f.write(content)
+            # Set secure permissions if requested
+            if secure_permissions:
+                try:
+                    os.chmod(path, 0o600)
+                except (OSError, AttributeError):
+                    pass
+        # Success - remove from pending if it was there
+        if buffer_on_failure:
+            BufferedWriteRegistry.get_instance().unregister(path)
+        return True
+    except (OSError, PermissionError, IOError, TypeError, ValueError) as e:
+        logger.warning(f"Failed to write JSON to {path}: {e}")
+        # Register for retry if buffering is enabled
+        if buffer_on_failure:
+            registry = BufferedWriteRegistry.get_instance()
+            registry.register_pending(
+                path,
+                data,
+                serializer,
+                {"secure_permissions": secure_permissions},
+            )
+            logger.debug(f"Buffered {path.name} for retry on next interval or shutdown")
+        return False
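The atomic branch of `safe_write_json` follows the usual write-to-a-sibling-temp-file-then-rename pattern. Below is a minimal standalone sketch of that pattern, not the library's code: the helper name `atomic_write_text` is hypothetical, it raises instead of returning a bool, and it assumes `os.replace` for the final swap.

```python
import json
import os
import tempfile
from pathlib import Path


def atomic_write_text(path: Path, content: str) -> None:
    """Hypothetical helper: write content to path via a temp file in the same directory."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # Creating the temp file next to the target keeps the rename on one filesystem
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=".tmp_", suffix=".json")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(content)
        # Readers see either the old file or the new one, never a partial write
        os.replace(tmp_name, path)
    except BaseException:
        # Best-effort cleanup of the temp file, then re-raise the original error
        try:
            os.unlink(tmp_name)
        except OSError:
            pass
        raise


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        target = Path(d) / "creds" / "token.json"
        atomic_write_text(target, json.dumps({"v": 1}))
        atomic_write_text(target, json.dumps({"v": 2}))  # overwrites in place
        print(json.loads(target.read_text(encoding="utf-8")))
```

The sketch leaves out the `secure_permissions` and `buffer_on_failure` handling that the real function layers on top of this core.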
+
+
+def safe_log_write(
+    path: Union[str, Path],
+    content: str,
+    logger: logging.Logger,
+    mode: str = "a",
+) -> bool:
+    """
+    Write content to log file with error handling. No buffering or retry.
+    Suitable for log files where occasional loss is acceptable.
+    Creates parent directories if needed.
+    Args:
+        path: File path to write to
+        content: String content to write
+        logger: Logger for warnings
+        mode: File mode ('a' for append, 'w' for overwrite)
+    Returns:
+        True on success, False on failure (never raises)
+    """
+    path = Path(path)
+    try:
+        path.parent.mkdir(parents=True, exist_ok=True)
+        with open(path, mode, encoding="utf-8") as f:
+            f.write(content)
+        return True
+    except (OSError, PermissionError, IOError) as e:
+        logger.warning(f"Failed to write log to {path}: {e}")
+        return False
+
+
+def safe_mkdir(path: Union[str, Path], logger: logging.Logger) -> bool:
+    """
+    Create directory with error handling.
+    Args:
+        path: Directory path to create
+        logger: Logger for warnings
+    Returns:
+        True on success (or already exists), False on failure
+    """
+    try:
+        Path(path).mkdir(parents=True, exist_ok=True)
+        return True
+    except (OSError, PermissionError) as e:
+        logger.warning(f"Failed to create directory {path}: {e}")
+        return False
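The helpers above share one contract: on failure they log a warning and return `False` rather than raising. The sketch below shows how a caller might branch on that result. `safe_mkdir` is reproduced from the diff so the snippet is self-contained; `first_usable_dir` and the candidate paths are hypothetical and not part of the library.

```python
import logging
from pathlib import Path
from typing import Iterable, Optional, Union


def safe_mkdir(path: Union[str, Path], logger: logging.Logger) -> bool:
    """Same behavior as the helper in the diff: True on success, False on failure."""
    try:
        Path(path).mkdir(parents=True, exist_ok=True)
        return True
    except (OSError, PermissionError) as e:
        logger.warning(f"Failed to create directory {path}: {e}")
        return False


def first_usable_dir(
    candidates: Iterable[Path], logger: logging.Logger
) -> Optional[Path]:
    """Hypothetical caller: return the first candidate that can be created."""
    for candidate in candidates:
        # No try/except needed here: failures arrive as a False return value
        if safe_mkdir(candidate, logger):
            return candidate
    return None
```

Because the failure is only a return value, a caller that ignores it will continue silently; checking the boolean, as `first_usable_dir` does, is what makes the fallback possible.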