Mirrowel committed on
Commit ·
5838a8e
1
Parent(s): 21dcb11
feat: Add detailed documentation and installation instructions for the rotating API key client
- .gitignore +1 -0
- DOCUMENTATION.md +81 -0
- LICENSE.MD +21 -0
- README.md +63 -41
- requirements.txt +10 -0
- src/rotator_library/README.md +108 -18
.gitignore
CHANGED
|
@@ -123,3 +123,4 @@ cython_debug/
|
|
| 123 |
test_proxy.py
|
| 124 |
start_proxy.bat
|
| 125 |
key_usage.json
|
|
|
|
|
|
| 123 |
test_proxy.py
|
| 124 |
start_proxy.bat
|
| 125 |
key_usage.json
|
| 126 |
+
staged_changes.txt
|
DOCUMENTATION.md
ADDED
|
@@ -0,0 +1,81 @@
|
| 1 |
+
# Technical Documentation: `rotating-api-key-client`
|
| 2 |
+
|
| 3 |
+
This document provides a detailed technical explanation of the `rotating-api-key-client` library, its components, and its internal workings.
|
| 4 |
+
|
| 5 |
+
## 1. `client.py` - The `RotatingClient`
|
| 6 |
+
|
| 7 |
+
The `RotatingClient` is the central component of the library, orchestrating API calls, key rotation, and error handling.
|
| 8 |
+
|
| 9 |
+
### Request Lifecycle (`acompletion`)
|
| 10 |
+
|
| 11 |
+
When `acompletion` is called, it follows these steps:
|
| 12 |
+
|
| 13 |
+
1. **Model and Provider Validation**: It first checks that a `model` is specified and extracts the provider name from it (e.g., `"gemini"` from `"gemini/gemini-2.5-flash-preview-05-20"`). It ensures that API keys for this provider are available.
|
| 14 |
+
|
| 15 |
+
2. **Key Selection Loop**: The client enters a loop to find a valid key and complete the request.
|
| 16 |
+
a. **Get Next Smart Key**: It calls `self.usage_manager.get_next_smart_key()` to get the least-used key for the given model that is not currently on cooldown.
|
| 17 |
+
b. **No Key Available**: If all keys for the provider are on cooldown, it waits for 5 seconds before restarting the loop.
|
| 18 |
+
|
| 19 |
+
3. **Attempt Loop**: Once a key is selected, it enters a retry loop (`for attempt in range(self.max_retries)`):
|
| 20 |
+
a. **API Call**: It calls `litellm.acompletion` with the selected key and the user-provided arguments.
|
| 21 |
+
b. **Success**:
|
| 22 |
+
- If the call is successful and **non-streaming**, it calls `self.usage_manager.record_success()`, returns the response, and the process ends.
|
| 23 |
+
- If the call is successful and **streaming**, it returns a `_streaming_wrapper` async generator. This wrapper formats the response chunks as Server-Sent Events (SSE) and calls `self.usage_manager.record_success()` only when the stream is fully consumed.
|
| 24 |
+
c. **Failure**: If an exception occurs:
|
| 25 |
+
- The failure is logged using `log_failure()`.
|
| 26 |
+
- **Server Error**: If `is_server_error()` returns `True` and there are retries left, it waits for a moment and continues to the next attempt with the *same key*.
|
| 27 |
+
- **Unrecoverable Error**: If `is_unrecoverable_error()` returns `True`, the exception is immediately raised, terminating the process.
|
| 28 |
+
- **Other Errors (Rate Limit, Auth, etc.)**: For any other error, it's considered a "rotation" error. `self.usage_manager.record_rotation_error()` is called to put the key on cooldown, and the inner `attempt` loop is broken. The outer `while` loop then continues, fetching a new key.
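The lifecycle above can be sketched roughly as follows. This is a simplified, self-contained illustration under stated assumptions, not the library's actual code: `call_api`, `FakeUsageManager`, `ServerError`, and `UnrecoverableError` are hypothetical stand-ins for `litellm.acompletion`, the `UsageManager`, and the checks in `error_handler.py`. Streaming and `log_failure()` are left out, and a server error with no retries left is assumed to fall through to the "any other error" rule.

```python
import asyncio


class ServerError(Exception):
    """Hypothetical stand-in for a transient 5xx error."""


class UnrecoverableError(Exception):
    """Hypothetical stand-in for e.g. invalid request parameters."""


class FakeUsageManager:
    """Hypothetical minimal stand-in for UsageManager (cooldowns never expire here)."""

    def __init__(self, keys):
        self.keys = list(keys)
        self.cooling = set()
        self.successes = []

    def get_next_smart_key(self):
        available = [k for k in self.keys if k not in self.cooling]
        return available[0] if available else None

    def record_success(self, key):
        self.successes.append(key)

    def record_rotation_error(self, key):
        self.cooling.add(key)


async def acompletion_sketch(usage, call_api, max_retries=2, wait=0.0):
    while True:  # key selection loop
        key = usage.get_next_smart_key()
        if key is None:
            # The document describes a 5-second wait; `wait` is shortened here.
            await asyncio.sleep(wait)
            continue
        for attempt in range(max_retries):  # attempt loop, same key
            try:
                response = await call_api(key)
            except ServerError:
                if attempt < max_retries - 1:
                    await asyncio.sleep(wait)
                    continue  # retry with the same key
                # Assumption: no retries left, so treat it as a rotation error.
                usage.record_rotation_error(key)
                break
            except UnrecoverableError:
                raise  # fail fast
            except Exception:
                usage.record_rotation_error(key)  # put the key on cooldown
                break  # back to the outer loop for a new key
            else:
                usage.record_success(key)
                return response
```

Because the stand-in manager never clears cooldowns, at least one key has to succeed for this sketch to return; in the real client, cooldowns are described as expiring over time.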
|
| 29 |
+
|
| 30 |
+
## 2. `usage_manager.py` - The `UsageManager`
|
| 31 |
+
|
| 32 |
+
This class is responsible for all logic related to tracking and selecting API keys.
|
| 33 |
+
|
| 34 |
+
### Key Data Structure
|
| 35 |
+
|
| 36 |
+
Usage data is stored in a JSON file (e.g., `key_usage.json`). Here's a conceptual view of its structure:
|
| 37 |
+
|
| 38 |
+
```json
|
| 39 |
+
{
|
| 40 |
+
"api_key_1_hash": {
|
| 41 |
+
"last_used": "timestamp",
|
| 42 |
+
"cooldown_until": "timestamp",
|
| 43 |
+
"global_usage": 150,
|
| 44 |
+
"daily_usage": {
|
| 45 |
+
"YYYY-MM-DD": 100
|
| 46 |
+
},
|
| 47 |
+
"model_usage": {
|
| 48 |
+
"gemini/gemini-2.5-flash-preview-05-20": 50
|
| 49 |
+
}
|
| 50 |
+
}
|
| 51 |
+
}
|
| 52 |
+
```
|
| 53 |
+
|
| 54 |
+
- **Key Hashing**: Keys are identified by their SHA-256 hash, so raw keys never appear in logs or usage files.
|
| 55 |
+
- `cooldown_until`: If a key fails, this timestamp is set. The key will not be selected until the current time is past this timestamp.
|
| 56 |
+
- `model_usage`: Tracks the usage count for each specific model, which is the primary metric for the "smart" key selection.
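As a rough illustration of the fields described above, the sketch below hashes a key with SHA-256 and checks a record's cooldown. The helper names `hash_key` and `is_on_cooldown` are hypothetical, and storing `cooldown_until` as epoch seconds is an assumption; the document only says "timestamp".

```python
import hashlib
import time


def hash_key(api_key: str) -> str:
    """Return the SHA-256 hex digest used to identify a key's record."""
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


def is_on_cooldown(record: dict, now: float) -> bool:
    """True while `now` is still before the record's cooldown_until value."""
    cooldown_until = record.get("cooldown_until")
    return cooldown_until is not None and now < cooldown_until


# A usage record keyed by hash; epoch seconds are an assumption of this sketch.
usage = {
    hash_key("example-key"): {
        "cooldown_until": time.time() + 60,
        "model_usage": {"gemini/gemini-2.5-flash-preview-05-20": 50},
    }
}
```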
|
| 57 |
+
|
| 58 |
+
### Core Methods
|
| 59 |
+
|
| 60 |
+
- `get_next_smart_key()`: This is the key selection logic. It filters out any keys that are on cooldown and then finds the key with the lowest usage count for the requested `model`.
|
| 61 |
+
- `record_success()`: Increments the usage counters (`global_usage`, `daily_usage`, `model_usage`) for the given key.
|
| 62 |
+
- `record_rotation_error()`: Sets the `cooldown_until` timestamp for the given key, effectively taking it out of rotation for a short period.
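The three methods above might look roughly like this when written as plain functions over an in-memory dictionary. This is a sketch under assumptions, not the `UsageManager` source: the function signatures, the epoch-second timestamps, and the 60-second default cooldown are all illustrative choices (the document only says "a short period").

```python
from datetime import date


def get_next_smart_key(usage, key_ids, model, now):
    """Pick the key id with the lowest usage for `model`, skipping cooldowns."""
    available = [
        k for k in key_ids
        if (usage.get(k, {}).get("cooldown_until") or 0) <= now
    ]
    if not available:
        return None  # every key is cooling down
    return min(
        available,
        key=lambda k: usage.get(k, {}).get("model_usage", {}).get(model, 0),
    )


def record_success(usage, key_id, model, today=None):
    """Increment the global, daily, and per-model counters for a key."""
    today = today or date.today().isoformat()
    rec = usage.setdefault(key_id, {})
    rec["global_usage"] = rec.get("global_usage", 0) + 1
    daily = rec.setdefault("daily_usage", {})
    daily[today] = daily.get(today, 0) + 1
    per_model = rec.setdefault("model_usage", {})
    per_model[model] = per_model.get(model, 0) + 1


def record_rotation_error(usage, key_id, now, cooldown_seconds=60):
    """Take a key out of rotation until `now + cooldown_seconds` (assumed length)."""
    usage.setdefault(key_id, {})["cooldown_until"] = now + cooldown_seconds
```

Selecting by per-model count rather than global count keeps load balanced for each model separately, which matches the description of `model_usage` as the primary metric.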
|
| 63 |
+
|
| 64 |
+
## 3. `error_handler.py`
|
| 65 |
+
|
| 66 |
+
This module contains functions to classify exceptions returned by `litellm`.
|
| 67 |
+
|
| 68 |
+
- `is_server_error(e)`: Checks if the exception is a transient server-side error (typically a `5xx` status code) that is worth retrying with the same key.
|
| 69 |
+
- `is_unrecoverable_error(e)`: Checks for critical errors (e.g., invalid request parameters) that should immediately stop the process. Any error that is not a server error or an unrecoverable error is treated as a "rotation" error by the client.
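The three-way classification can be illustrated with a small sketch. It assumes the exception exposes an integer `status_code` attribute and uses HTTP 400 as the example of an unrecoverable error; both are assumptions for illustration, and `ApiError` and `classify` are hypothetical names rather than part of the module.

```python
class ApiError(Exception):
    """Hypothetical exception carrying an HTTP status code."""

    def __init__(self, message, status_code=None):
        super().__init__(message)
        self.status_code = status_code


def is_server_error(e):
    """Transient server-side failure (5xx): worth retrying with the same key."""
    status = getattr(e, "status_code", None)
    return isinstance(status, int) and 500 <= status < 600


def is_unrecoverable_error(e):
    """Assumption: 400 stands in for 'invalid request parameters'."""
    return getattr(e, "status_code", None) == 400


def classify(e):
    """Map an exception to the action the client takes."""
    if is_server_error(e):
        return "retry_same_key"
    if is_unrecoverable_error(e):
        return "raise"
    return "rotate_key"  # rate limits, auth failures, anything else
```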
|
| 70 |
+
|
| 71 |
+
## 4. `failure_logger.py`
|
| 72 |
+
|
| 73 |
+
- `log_failure()`: This function logs detailed information about a failed API request to a file in the `logs/` directory. This is crucial for debugging issues with specific keys or providers. The log includes the hashed API key, the model, the error message, and the request data.
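A minimal sketch of such a logger is shown below. It writes one JSON line per failure with the four fields listed above. The function signature, the `failures.log` file name, and the JSON-lines layout are assumptions made for the example, not details taken from the module.

```python
import hashlib
import json
from pathlib import Path


def log_failure(api_key, model, error, request_data, log_dir="logs"):
    """Append one JSON line describing a failed request; returns the log path."""
    Path(log_dir).mkdir(parents=True, exist_ok=True)
    entry = {
        # Only the hash is written, never the raw key.
        "api_key_hash": hashlib.sha256(api_key.encode("utf-8")).hexdigest(),
        "model": model,
        "error": str(error),
        "request_data": request_data,
    }
    log_path = Path(log_dir) / "failures.log"  # hypothetical file name
    with log_path.open("a", encoding="utf-8") as f:
        # default=str keeps non-JSON-serializable values from raising.
        f.write(json.dumps(entry, default=str) + "\n")
    return log_path
```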
|
| 74 |
+
|
| 75 |
+
## 5. `providers/` - Provider Plugins
|
| 76 |
+
|
| 77 |
+
The provider plugin system makes it easy to add model-list fetching for new LLM providers.
|
| 78 |
+
|
| 79 |
+
- **`provider_interface.py`**: Defines the abstract base class `ProviderPlugin` with a single abstract method, `get_models`. Any new provider plugin must inherit from this class and implement this method.
|
| 80 |
+
- **Implementations**: Each provider (e.g., `openai_provider.py`, `gemini_provider.py`) has its own file containing a class that implements the `ProviderPlugin` interface. The `get_models` method contains the specific logic to call the provider's API and return a list of their available models.
|
| 81 |
+
- **`__init__.py`**: This file acts as a registry for the available plugins. The `PROVIDER_PLUGINS` dictionary maps provider names to their corresponding plugin classes. The `RotatingClient` uses this dictionary to instantiate the correct plugin at runtime.
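The interface-plus-registry pattern described above can be sketched in a single file. `ProviderPlugin`, `get_models`, and `PROVIDER_PLUGINS` are names from this document; `ExampleProvider` and `list_models` are hypothetical, and the lookup shown is only one plausible way the client might use the registry.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Dict, List, Type


class ProviderPlugin(ABC):
    """Abstract base class: every plugin must implement get_models."""

    @abstractmethod
    async def get_models(self, api_key: str) -> List[str]:
        ...


class ExampleProvider(ProviderPlugin):
    """Hypothetical plugin returning a fixed list instead of calling an API."""

    async def get_models(self, api_key: str) -> List[str]:
        return ["example/model-1", "example/model-2"]


# Registry mapping provider names to plugin classes.
PROVIDER_PLUGINS: Dict[str, Type[ProviderPlugin]] = {"example": ExampleProvider}


async def list_models(provider: str, api_key: str) -> List[str]:
    """Instantiate the registered plugin at call time; empty list if unknown."""
    plugin_cls = PROVIDER_PLUGINS.get(provider)
    if plugin_cls is None:
        return []
    return await plugin_cls().get_models(api_key)
```

Keeping classes (not instances) in the registry means a plugin is only constructed when its provider is actually requested.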
|
LICENSE.MD
ADDED
|
@@ -0,0 +1,21 @@
|
| 1 |
+
MIT License
|
| 2 |
+
|
| 3 |
+
Copyright (c) 2025 Mirrowel
|
| 4 |
+
|
| 5 |
+
Permission is hereby granted, free of charge, to any person obtaining a copy
|
| 6 |
+
of this software and associated documentation files (the "Software"), to deal
|
| 7 |
+
in the Software without restriction, including without limitation the rights
|
| 8 |
+
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
| 9 |
+
copies of the Software, and to permit persons to whom the Software is
|
| 10 |
+
furnished to do so, subject to the following conditions:
|
| 11 |
+
|
| 12 |
+
The above copyright notice and this permission notice shall be included in all
|
| 13 |
+
copies or substantial portions of the Software.
|
| 14 |
+
|
| 15 |
+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
| 16 |
+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
| 17 |
+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
| 18 |
+
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
| 19 |
+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
| 20 |
+
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
| 21 |
+
SOFTWARE.
|
README.md
CHANGED
|
@@ -1,18 +1,31 @@
|
|
| 1 |
# API Key Proxy with Rotating Key Library
|
| 2 |
|
| 3 |
-
This project provides two main components:
|
| 4 |
|
| 5 |
1. A reusable Python library (`rotating-api-key-client`) for intelligently rotating API keys.
|
| 6 |
-
2. A FastAPI proxy application that uses this library to provide an OpenAI-compatible endpoint
|
| 7 |
|
| 8 |
## Features
|
| 9 |
|
| 10 |
-
- **Smart Key Rotation**:
|
| 11 |
-
- **Automatic Retries**:
|
| 12 |
-
- **Cooldowns**:
|
| 13 |
-
- **Usage Tracking**:
|
| 14 |
-
- **Provider Agnostic**:
|
| 15 |
-
- **OpenAI-Compatible Proxy**:
|
| 16 |
|
| 17 |
## Project Structure
|
| 18 |
|
|
@@ -28,10 +41,9 @@ This project provides two main components:
|
|
| 28 |
│ ├── error_handler.py
|
| 29 |
│ ├── failure_logger.py
|
| 30 |
│ ├── usage_manager.py
|
| 31 |
-
│ ├──
|
| 32 |
-
│ └──
|
| 33 |
├── .env.example
|
| 34 |
-
├── .gitignore
|
| 35 |
├── README.md
|
| 36 |
└── requirements.txt
|
| 37 |
```
|
|
@@ -39,77 +51,79 @@ This project provides two main components:
|
|
| 39 |
## Setup and Installation
|
| 40 |
|
| 41 |
1. **Clone the repository:**
|
| 42 |
-
|
| 43 |
```bash
|
| 44 |
git clone <repository-url>
|
| 45 |
cd <repository-name>
|
| 46 |
```
|
| 47 |
|
| 48 |
2. **Create a virtual environment:**
|
| 49 |
-
|
| 50 |
```bash
|
| 51 |
python -m venv venv
|
| 52 |
source venv/bin/activate # On Windows, use `venv\Scripts\activate`
|
| 53 |
```
|
| 54 |
|
| 55 |
-
3. **Install
|
| 56 |
-
|
| 57 |
-
The `requirements.txt` file includes the proxy's dependencies and installs the `rotator_library` in editable mode (`-e`), so you can develop both simultaneously.
|
| 58 |
-
|
| 59 |
```bash
|
| 60 |
pip install -r requirements.txt
|
| 61 |
```
|
| 62 |
|
| 63 |
4. **Configure environment variables:**
|
| 64 |
-
|
| 65 |
-
Create a `.env` file by copying the `.env.example`:
|
| 66 |
-
|
| 67 |
```bash
|
| 68 |
cp .env.example .env
|
| 69 |
```
|
|
|
|
| 70 |
|
| 71 |
-
|
| 72 |
-
|
| 73 |
-
```
|
| 74 |
-
# A secret key for your proxy to prevent unauthorized access
|
| 75 |
PROXY_API_KEY="your-secret-proxy-key"
|
| 76 |
|
| 77 |
-
# Add
|
| 78 |
-
# The keys will be tried in order.
|
| 79 |
GEMINI_API_KEY_1="your-gemini-api-key-1"
|
| 80 |
GEMINI_API_KEY_2="your-gemini-api-key-2"
|
| 81 |
-
|
|
|
|
| 82 |
```
|
| 83 |
|
| 84 |
## Running the Proxy
|
| 85 |
|
| 86 |
-
To
|
| 87 |
-
|
| 88 |
```bash
|
| 89 |
uvicorn src.proxy_app.main:app --reload
|
| 90 |
```
|
| 91 |
-
|
| 92 |
The proxy will be available at `http://127.0.0.1:8000`.
|
| 93 |
|
| 94 |
## Using the Proxy
|
| 95 |
|
| 96 |
-
You can make requests to the proxy as if it were the OpenAI API.
|
| 97 |
|
| 98 |
-
|
| 99 |
|
|
|
|
| 100 |
```bash
|
| 101 |
curl -X POST http://127.0.0.1:8000/v1/chat/completions \
|
| 102 |
-H "Content-Type: application/json" \
|
| 103 |
-H "Authorization: Bearer your-secret-proxy-key" \
|
| 104 |
-d '{
|
| 105 |
-
"model": "gemini/gemini-
|
| 106 |
-
"messages": [{"role": "user", "content": "What is the capital of France?"}]
|
| 107 |
-
"stream": false
|
| 108 |
}'
|
| 109 |
```
|
| 110 |
|
| 111 |
-
### Example with
|
| 112 |
|
|
|
|
| 113 |
```python
|
| 114 |
import requests
|
| 115 |
import json
|
|
@@ -123,16 +137,24 @@ headers = {
|
|
| 123 |
}
|
| 124 |
|
| 125 |
data = {
|
| 126 |
-
"model": "gemini/gemini-
|
| 127 |
-
"messages": [{"role": "user", "content": "What is the capital of France?"}]
|
| 128 |
-
"stream": False
|
| 129 |
}
|
| 130 |
|
| 131 |
response = requests.post(proxy_url, headers=headers, data=json.dumps(data))
|
| 132 |
-
|
| 133 |
print(response.json())
|
| 134 |
```
|
| 135 |
|
| 136 |
## Using the Library in Other Projects
|
| 137 |
|
| 138 |
-
The `rotating-api-key-client` library
|
| 1 |
# API Key Proxy with Rotating Key Library
|
| 2 |
|
| 3 |
+
This project provides a robust solution for managing and rotating API keys for various Large Language Model (LLM) providers. It consists of two main components:
|
| 4 |
|
| 5 |
1. A reusable Python library (`rotating-api-key-client`) for intelligently rotating API keys.
|
| 6 |
+
2. A FastAPI proxy application that uses this library to provide an OpenAI-compatible endpoint.
|
| 7 |
|
| 8 |
## Features
|
| 9 |
|
| 10 |
+
- **Smart Key Rotation**: Intelligently selects the least-used API key to distribute request loads evenly.
|
| 11 |
+
- **Automatic Retries**: Automatically retries requests on transient server errors (e.g., 5xx status codes).
|
| 12 |
+
- **Key Cooldowns**: Temporarily disables keys that encounter rate limits or authentication errors to prevent further issues.
|
| 13 |
+
- **Usage Tracking**: Monitors daily and global usage for each API key.
|
| 14 |
+
- **Provider Agnostic**: Compatible with any provider supported by `litellm`.
|
| 15 |
+
- **OpenAI-Compatible Proxy**: Offers a familiar API interface for seamless interaction with different models.
|
| 16 |
+
|
| 17 |
+
## How It Works
|
| 18 |
+
|
| 19 |
+
The core of this project is the `RotatingClient` library, which manages a pool of API keys. When a request is made, the client:
|
| 20 |
+
|
| 21 |
+
1. **Selects the Best Key**: It identifies the key with the lowest usage count that is not currently in a cooldown period.
|
| 22 |
+
2. **Makes the Request**: It uses the selected key to make the API call via `litellm`.
|
| 23 |
+
3. **Handles Errors**:
|
| 24 |
+
- If a **retriable error** (like a 500 server error) occurs, it waits and retries the request.
|
| 25 |
+
- If a **non-retriable error** (like a rate limit or invalid key error) occurs, it places the key on a temporary cooldown and selects a new key for the next attempt.
|
| 26 |
+
4. **Tracks Usage**: On a successful request, it records the usage for the key.
|
| 27 |
+
|
| 28 |
+
The FastAPI proxy application exposes this functionality through an API endpoint that mimics the OpenAI API, making it easy to integrate with existing tools and applications.
|
| 29 |
|
| 30 |
## Project Structure
|
| 31 |
|
|
|
|
| 41 |
│ ├── error_handler.py
|
| 42 |
│ ├── failure_logger.py
|
| 43 |
│ ├── usage_manager.py
|
| 44 |
+
│ ├── providers/
|
| 45 |
+
│ └── ...
|
| 46 |
├── .env.example
|
|
|
|
| 47 |
├── README.md
|
| 48 |
└── requirements.txt
|
| 49 |
```
|
|
|
|
| 51 |
## Setup and Installation
|
| 52 |
|
| 53 |
1. **Clone the repository:**
|
|
|
|
| 54 |
```bash
|
| 55 |
git clone <repository-url>
|
| 56 |
cd <repository-name>
|
| 57 |
```
|
| 58 |
|
| 59 |
2. **Create a virtual environment:**
|
|
|
|
| 60 |
```bash
|
| 61 |
python -m venv venv
|
| 62 |
source venv/bin/activate # On Windows, use `venv\Scripts\activate`
|
| 63 |
```
|
| 64 |
|
| 65 |
+
3. **Install dependencies:**
|
| 66 |
+
The `requirements.txt` file includes all necessary packages and installs the `rotator_library` in editable mode (`-e`), allowing for simultaneous development of the library and the proxy.
|
|
|
|
|
|
|
| 67 |
```bash
|
| 68 |
pip install -r requirements.txt
|
| 69 |
```
|
| 70 |
|
| 71 |
4. **Configure environment variables:**
|
| 72 |
+
Create a `.env` file by copying the example file:
|
|
|
|
|
|
|
| 73 |
```bash
|
| 74 |
cp .env.example .env
|
| 75 |
```
|
| 76 |
+
Edit the `.env` file to add your API keys. The proxy automatically detects keys for different providers based on the naming convention `PROVIDER_API_KEY_N`.
|
| 77 |
|
| 78 |
+
```env
|
| 79 |
+
# A secret key to protect your proxy from unauthorized access
|
|
|
|
|
|
|
| 80 |
PROXY_API_KEY="your-secret-proxy-key"
|
| 81 |
|
| 82 |
+
# Add API keys for each provider. They will be rotated automatically.
|
|
|
|
| 83 |
GEMINI_API_KEY_1="your-gemini-api-key-1"
|
| 84 |
GEMINI_API_KEY_2="your-gemini-api-key-2"
|
| 85 |
+
|
| 86 |
+
OPENAI_API_KEY_1="your-openai-api-key-1"
|
| 87 |
```
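To make the `PROVIDER_API_KEY_N` convention concrete, here is one way such variables could be grouped by provider. This is a sketch under assumptions: `collect_api_keys` and the regular expression are hypothetical and may differ from how the proxy actually detects keys.

```python
import re
from collections import defaultdict

# Matches names such as GEMINI_API_KEY_1; PROXY_API_KEY has no numeric suffix
# and is therefore skipped.
KEY_PATTERN = re.compile(r"^([A-Z0-9]+)_API_KEY_(\d+)$")


def collect_api_keys(environ):
    """Group PROVIDER_API_KEY_N values by lowercase provider, ordered by N."""
    found = defaultdict(list)
    for name, value in environ.items():
        match = KEY_PATTERN.match(name)
        if match and value:
            provider, index = match.group(1).lower(), int(match.group(2))
            found[provider].append((index, value))
    return {p: [v for _, v in sorted(items)] for p, items in found.items()}
```

Calling `collect_api_keys(os.environ)` after loading the `.env` file would yield a dictionary in the `{"provider": [keys...]}` shape.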
|
| 88 |
|
| 89 |
## Running the Proxy
|
| 90 |
|
| 91 |
+
To start the proxy application, run the following command:
|
|
|
|
| 92 |
```bash
|
| 93 |
uvicorn src.proxy_app.main:app --reload
|
| 94 |
```
|
|
|
|
| 95 |
The proxy will be available at `http://127.0.0.1:8000`.
|
| 96 |
|
| 97 |
## Using the Proxy
|
| 98 |
|
| 99 |
+
You can make requests to the proxy as if it were the OpenAI API. Remember to include your `PROXY_API_KEY` in the `Authorization` header.
|
| 100 |
|
| 101 |
+
The `model` parameter must be specified in the format `provider/model_name` (e.g., `gemini/gemini-2.5-flash-preview-05-20`, `openai/gpt-4`).
|
| 102 |
|
| 103 |
+
### Example with `curl` (Non-Streaming):
|
| 104 |
```bash
|
| 105 |
curl -X POST http://127.0.0.1:8000/v1/chat/completions \
|
| 106 |
-H "Content-Type: application/json" \
|
| 107 |
-H "Authorization: Bearer your-secret-proxy-key" \
|
| 108 |
-d '{
|
| 109 |
+
"model": "gemini/gemini-2.5-flash-preview-05-20",
|
| 110 |
+
"messages": [{"role": "user", "content": "What is the capital of France?"}]
|
|
|
|
| 111 |
}'
|
| 112 |
```
|
| 113 |
|
| 114 |
+
### Example with `curl` (Streaming):
|
| 115 |
+
```bash
|
| 116 |
+
curl -X POST http://127.0.0.1:8000/v1/chat/completions \
|
| 117 |
+
-H "Content-Type: application/json" \
|
| 118 |
+
-H "Authorization: Bearer your-secret-proxy-key" \
|
| 119 |
+
-d '{
|
| 120 |
+
"model": "gemini/gemini-2.5-flash-preview-05-20",
|
| 121 |
+
"messages": [{"role": "user", "content": "Write a short story about a robot."}],
|
| 122 |
+
"stream": true
|
| 123 |
+
}'
|
| 124 |
+
```
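When `stream` is true, the response arrives as Server-Sent Events. The sketch below shows how a client might extract text from such a stream, assuming the proxy follows the usual OpenAI-style format (`data: {...}` lines with `choices[].delta.content`, ending in `data: [DONE]`). The helper name `iter_sse_text` is hypothetical, and the exact chunk shape emitted by this proxy is an assumption.

```python
import json


def iter_sse_text(lines):
    """Yield text fragments from OpenAI-style SSE lines until [DONE]."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text
```

The `lines` argument can be any iterable of decoded strings, for example the output of `response.iter_lines(decode_unicode=True)` on a `requests.post(..., stream=True)` response.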
|
| 125 |
|
| 126 |
+
### Example with Python `requests`:
|
| 127 |
```python
|
| 128 |
import requests
|
| 129 |
import json
|
|
|
|
| 137 |
}
|
| 138 |
|
| 139 |
data = {
|
| 140 |
+
"model": "gemini/gemini-2.5-flash-preview-05-20",
|
| 141 |
+
"messages": [{"role": "user", "content": "What is the capital of France?"}]
|
|
|
|
| 142 |
}
|
| 143 |
|
| 144 |
response = requests.post(proxy_url, headers=headers, data=json.dumps(data))
|
|
|
|
| 145 |
print(response.json())
|
| 146 |
```
|
| 147 |
|
| 148 |
+
## Troubleshooting
|
| 149 |
+
|
| 150 |
+
- **`401 Unauthorized`**: Ensure your `PROXY_API_KEY` is set correctly in the `.env` file and included in the `Authorization` header of your request.
|
| 151 |
+
- **`500 Internal Server Error`**: Check the console logs of the `uvicorn` server for detailed error messages. This could indicate an issue with one of your provider API keys or a problem with the provider's service.
|
| 152 |
+
- **All keys on cooldown**: If you see a message that all keys are on cooldown, it means all your keys for a specific provider have recently failed. Check the `logs/` directory for details on why the failures occurred.
|
| 153 |
+
|
| 154 |
## Using the Library in Other Projects
|
| 155 |
|
| 156 |
+
The `rotating-api-key-client` is a standalone library that can be integrated into any Python project. For detailed documentation on how to use it, please refer to its `README.md` file located at `src/rotator_library/README.md`.
|
| 157 |
+
|
| 158 |
+
## Detailed Documentation
|
| 159 |
+
|
| 160 |
+
For a more in-depth technical explanation of the `rotating-api-key-client` library's architecture, components, and internal workings, please refer to the [Technical Documentation](DOCUMENTATION.md).
|
requirements.txt
CHANGED
|
@@ -1,4 +1,14 @@
|
|
|
|
|
| 1 |
fastapi
|
|
|
|
|
|
|
| 2 |
uvicorn
|
|
|
|
|
|
|
| 3 |
python-dotenv
|
|
|
|
|
|
|
| 4 |
-e src/rotator_library
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
# FastAPI framework for building the proxy server
|
| 2 |
fastapi
|
| 3 |
+
|
| 4 |
+
# ASGI server for running the FastAPI application
|
| 5 |
uvicorn
|
| 6 |
+
|
| 7 |
+
# For loading environment variables from a .env file
|
| 8 |
python-dotenv
|
| 9 |
+
|
| 10 |
+
# Installs the local rotator_library in editable mode
|
| 11 |
-e src/rotator_library
|
| 12 |
+
|
| 13 |
+
# A library for calling LLM APIs with a consistent format
|
| 14 |
+
litellm
|
src/rotator_library/README.md
CHANGED
|
@@ -1,6 +1,6 @@
|
|
| 1 |
# Rotating API Key Client
|
| 2 |
|
| 3 |
-
A simple, thread-safe client that intelligently rotates and retries API keys for use with `litellm`.
|
| 4 |
|
| 5 |
## Features
|
| 6 |
|
|
@@ -9,45 +9,135 @@ A simple, thread-safe client that intelligently rotates and retries API keys for
|
|
| 9 |
- **Cooldowns**: Puts keys on a temporary cooldown after rate limit or authentication errors.
|
| 10 |
- **Usage Tracking**: Tracks daily and global usage for each key.
|
| 11 |
- **Provider Agnostic**: Works with any provider supported by `litellm`.
|
|
|
|
| 12 |
|
| 13 |
## Installation
|
| 14 |
|
| 15 |
-
To install the library, you can install it directly from a
|
| 16 |
-
|
| 17 |
-
### From a local path:
|
| 18 |
|
| 19 |
```bash
|
|
|
|
| 20 |
pip install -e .
|
| 21 |
```
|
| 22 |
|
| 23 |
-
##
|
| 24 |
|
| 25 |
-
|
| 26 |
|
| 27 |
```python
|
| 28 |
import asyncio
|
| 29 |
from rotating_api_key_client import RotatingClient
|
| 30 |
|
| 31 |
async def main():
|
| 32 |
-
|
| 33 |
-
api_keys = ["key1", "key2", "key3"]
|
| 34 |
-
|
| 35 |
-
# Initialize the client
|
| 36 |
client = RotatingClient(api_keys=api_keys)
|
| 37 |
|
| 38 |
-
# Make a request
|
| 39 |
response = await client.acompletion(
|
| 40 |
-
model="gemini/gemini-
|
| 41 |
-
messages=[{"role": "user", "content": "Hello
|
| 42 |
)
|
| 43 |
-
|
| 44 |
print(response)
|
| 45 |
|
| 46 |
-
|
| 47 |
-
asyncio.run(main())
|
| 48 |
```
|
| 49 |
|
| 50 |
-
|
| 51 |
|
| 52 |
```python
|
| 53 |
-
|
| 1 |
# Rotating API Key Client
|
| 2 |
|
| 3 |
+
A simple, thread-safe client that intelligently rotates and retries API keys for use with `litellm`. This library is designed to make your interactions with LLM providers more resilient and efficient.
|
| 4 |
|
| 5 |
## Features
|
| 6 |
|
|
|
|
| 9 |
- **Cooldowns**: Puts keys on a temporary cooldown after rate limit or authentication errors.
|
| 10 |
- **Usage Tracking**: Tracks daily and global usage for each key.
|
| 11 |
- **Provider Agnostic**: Works with any provider supported by `litellm`.
|
| 12 |
+
- **Extensible**: Easily add support for new providers through a plugin-based architecture.
|
| 13 |
|
| 14 |
## Installation
|
| 15 |
|
| 16 |
+
Install the library directly from a local path, which is recommended for development.
|
|
|
|
|
|
|
| 17 |
|
| 18 |
```bash
|
| 19 |
+
# The -e flag installs it in "editable" mode
|
| 20 |
pip install -e .
|
| 21 |
```
|
| 22 |
|
| 23 |
+
## `RotatingClient` Class
|
| 24 |
+
|
| 25 |
+
This is the main class for interacting with the library.
|
| 26 |
+
|
| 27 |
+
### Initialization
|
| 28 |
+
|
| 29 |
+
```python
|
| 30 |
+
from rotating_api_key_client import RotatingClient
|
| 31 |
+
|
| 32 |
+
client = RotatingClient(
|
| 33 |
+
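    api_keys={"gemini": ["key1", "key2"]},  # Dict[str, List[str]]: provider name -> list of keys
|
| 34 |
+
    max_retries=2,  # int, defaults to 2
|
| 35 |
+
    usage_file_path="key_usage.json",  # str, defaults to "key_usage.json"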
|
| 36 |
+
)
|
| 37 |
+
```
|
| 38 |
+
|
| 39 |
+
- `api_keys`: A dictionary where keys are provider names (e.g., `"openai"`, `"gemini"`) and values are lists of API keys for that provider.
|
| 40 |
+
- `max_retries`: The number of times to retry a request with the *same key* if a transient server error occurs.
|
| 41 |
+
- `usage_file_path`: The path to the JSON file where key usage data will be stored.
|
| 42 |
|
| 43 |
+
### Methods
|
| 44 |
+
|
| 45 |
+
#### `async def acompletion(self, **kwargs) -> Any:`
|
| 46 |
+
|
| 47 |
+
This is the primary method for making API calls. It's a wrapper around `litellm.acompletion` that adds key rotation and retry logic.
|
| 48 |
+
|
| 49 |
+
- **Parameters**: Accepts the same keyword arguments as `litellm.acompletion` (e.g., `messages`, `stream`). The `model` parameter is required and must be a string in the format `provider/model_name` (e.g., `"gemini/gemini-2.5-flash-preview-05-20"`).
|
| 50 |
+
- **Returns**:
|
| 51 |
+
- For non-streaming requests, it returns the `litellm` response object.
|
| 52 |
+
- For streaming requests, it returns an async generator that yields OpenAI-compatible Server-Sent Events (SSE).
|
| 53 |
+
|
| 54 |
+
**Example:**
|
| 55 |
|
| 56 |
```python
|
| 57 |
import asyncio
|
| 58 |
from rotating_api_key_client import RotatingClient
|
| 59 |
|
| 60 |
async def main():
|
| 61 |
+
    api_keys = {"gemini": ["key1", "key2"]}
|
| 62 |
    client = RotatingClient(api_keys=api_keys)
|
| 63 |
|
| 64 |
    response = await client.acompletion(
|
| 65 |
+
        model="gemini/gemini-2.5-flash-preview-05-20",
|
| 66 |
+
        messages=[{"role": "user", "content": "Hello!"}]
|
| 67 |
    )
|
| 68 |
    print(response)
|
| 69 |
|
| 70 |
+
asyncio.run(main())
|
|
|
|
| 71 |
```
|
| 72 |
|
| 73 |
+
#### `def token_count(self, model: str, text: str = None, messages: List[Dict[str, str]] = None) -> int:`
|
| 74 |
+
|
| 75 |
+
Calculates the token count for a given text or list of messages using `litellm.token_counter`.
|
| 76 |
+
The `model` parameter is required and must be a string in the format `provider/model_name` (e.g., `"gemini/gemini-2.5-flash-preview-05-20"`).
|
| 77 |
+
**Example:**
|
| 78 |
|
| 79 |
```python
|
| 80 |
+
count = client.token_count(
|
| 81 |
+
model="gemini/gemini-2.5-flash-preview-05-20",
|
| 82 |
+
messages=[{"role": "user", "content": "Count these tokens."}]
|
| 83 |
+
)
|
| 84 |
+
print(f"Token count: {count}")
|
| 85 |
+
```
|
| 86 |
+
|
| 87 |
+
#### `async def get_available_models(self, provider: str) -> List[str]:`
|
| 88 |
+
|
| 89 |
+
Fetches a list of available models for a specific provider. Results are cached.
|
| 90 |
+
|
| 91 |
+
#### `async def get_all_available_models(self) -> Dict[str, List[str]]:`
|
| 92 |
+
|
| 93 |
+
Fetches a dictionary of all available models, grouped by provider.
|
| 94 |
+
|
| 95 |
+
## Error Handling and Cooldowns
|
| 96 |
+
|
| 97 |
+
The client is designed to handle errors gracefully:
|
| 98 |
+
|
| 99 |
+
- **Server Errors (`5xx`)**: The client will retry the request with the *same key* up to `max_retries` times.
|
| 100 |
+
- **Rate Limit / Auth Errors**: These are considered "rotation" errors. The client will immediately place the failing key on a temporary cooldown and try the request again with a different key.
|
| 101 |
+
- **Unrecoverable Errors**: For critical errors, the client will fail fast and raise the exception.
|
| 102 |
+
|
| 103 |
+
Cooldowns are managed by the `UsageManager` and prevent failing keys from being used repeatedly.
|
| 104 |
+
|
| 105 |
+
## Extending with Provider Plugins
|
| 106 |
+
|
| 107 |
+
You can add support for fetching model lists from new providers by creating a custom provider plugin.
|
| 108 |
+
|
| 109 |
+
1. **Create a new provider file** in `src/rotator_library/providers/`, for example, `my_provider.py`.
|
| 110 |
+
2. **Implement the `ProviderPlugin` interface**:
|
| 111 |
+
|
| 112 |
+
```python
|
| 113 |
+
# src/rotator_library/providers/my_provider.py
|
| 114 |
+
from .provider_interface import ProviderPlugin
|
| 115 |
+
from typing import List
|
| 116 |
+
|
| 117 |
+
class MyProvider(ProviderPlugin):
|
| 118 |
+
    async def get_models(self, api_key: str) -> List[str]:
|
| 119 |
+
        # Logic to fetch and return a list of model names
|
| 120 |
+
        # e.g., ["my-provider/model-1", "my-provider/model-2"]
|
| 121 |
+
        pass
|
| 122 |
+
```
|
| 123 |
+
|
| 124 |
+
3. **Register the plugin** in `src/rotator_library/providers/__init__.py`:
|
| 125 |
+
|
| 126 |
+
```python
|
| 127 |
+
# src/rotator_library/providers/__init__.py
|
| 128 |
+
from .openai_provider import OpenAIProvider
|
| 129 |
+
from .gemini_provider import GeminiProvider
|
| 130 |
+
from .my_provider import MyProvider # Import your new provider
|
| 131 |
+
|
| 132 |
+
PROVIDER_PLUGINS = {
|
| 133 |
+
"openai": OpenAIProvider,
|
| 134 |
+
"gemini": GeminiProvider,
|
| 135 |
+
"my_provider": MyProvider, # Add it to the dictionary
|
| 136 |
+
}
|
| 137 |
+
```
|
| 138 |
+
|
| 139 |
+
The `RotatingClient` will automatically use your new plugin when `get_available_models` is called for `"my_provider"`.
|
| 140 |
+
|
| 141 |
+
## Detailed Documentation
|
| 142 |
+
|
| 143 |
+
For a more in-depth technical explanation of the `rotating-api-key-client` library's architecture, components, and internal workings, please refer to the [Technical Documentation](../../DOCUMENTATION.md).
|