Commit fc62d82 · Mirrowel committed
Parent(s): 1b7b9f5

feat(proxy): ✨ implement detailed per-request transaction logging

The new `DetailedLogger` captures comprehensive information for each API transaction, replacing the previous `log_request_response` function.
- Creates a unique, timestamped directory for every incoming request.
- Logs initial request headers and body to a `request.json` file.
- Appends individual chunks for streaming responses to a `streaming_chunks.jsonl` file.
- Records the complete final API response (status, headers, body, duration) to a `final_response.json` file.
- Generates `metadata.json` with a summary including model, token usage, finish reason, and extracted reasoning.
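
Concretely, the per-request layout produced by this change looks like the tree below (timestamp and UUID are placeholders):

```
logs/detailed_logs/
└── YYYYMMDD_HHMMSS_<uuid>/
    ├── request.json
    ├── streaming_chunks.jsonl   (streaming requests only)
    ├── final_response.json
    └── metadata.json
```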
- .gitignore +1 -0
- DOCUMENTATION.md +21 -2
- README.md +4 -4
- src/proxy_app/detailed_logger.py +121 -0
- src/proxy_app/main.py +30 -29
- src/proxy_app/request_logger.py +0 -36
.gitignore CHANGED

@@ -124,3 +124,4 @@ test_proxy.py
 start_proxy.bat
 key_usage.json
 staged_changes.txt
+logs/
DOCUMENTATION.md CHANGED

@@ -165,9 +165,28 @@ For streaming requests, the `chat_completions` endpoint returns a `StreamingResp
 1. It passes the chunks from the `RotatingClient`'s stream directly to the user.
 2. It aggregates the full response in the background so that it can be logged completely once the stream is finished.
 
-### 3.2. `
-
+### 3.2. `detailed_logger.py` - Comprehensive Transaction Logging
+
+To facilitate robust debugging and performance analysis, the proxy includes a powerful detailed logging system, enabled by the `--enable-request-logging` command-line flag. This system is managed by the `DetailedLogger` class in `detailed_logger.py`.
+
+Unlike simple logging, this system creates a **unique directory for every single transaction**, ensuring that all related data is isolated and easy to analyze.
+
+#### Log Directory Structure
+
+When logging is enabled, each request will generate a new directory inside `logs/detailed_logs/` with a name like `YYYYMMDD_HHMMSS_unique-uuid`. Inside this directory, you will find a complete record of the transaction:
+
+- **`request.json`**: Contains the full incoming request, including HTTP headers and the JSON body.
+- **`streaming_chunks.jsonl`**: For streaming requests, this file contains a timestamped log of every individual data chunk received from the provider. This is invaluable for debugging malformed streams or partial responses.
+- **`final_response.json`**: Contains the complete final response from the provider, including the status code, headers, and full JSON body. For streaming requests, this body is the fully reassembled message.
+- **`metadata.json`**: A summary file for quick analysis, containing:
+  - `request_id`: The unique identifier for the transaction.
+  - `duration_ms`: The total time taken for the request to complete.
+  - `status_code`: The final HTTP status code returned by the provider.
+  - `model`: The model used for the request.
+  - `usage`: Token usage statistics (`prompt`, `completion`, `total`).
+  - `finish_reason`: The reason the model stopped generating tokens.
+  - `reasoning_found`: A boolean indicating if a `reasoning` field was detected in the response.
+  - `reasoning_content`: The extracted content of the `reasoning` field, if found.
 
 ### 3.3. `build.py`
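
Because every transaction writes its own `metadata.json`, post-hoc analysis reduces to a glob over `logs/detailed_logs/`. A minimal sketch; the `summarize_logs` helper and the slow-request threshold are illustrative, not part of the proxy:

```python
import json
import tempfile
from pathlib import Path

def summarize_logs(detailed_logs_dir: Path, slow_ms: int = 2000) -> list:
    """Collect each transaction's metadata.json and flag slow requests."""
    summaries = []
    for meta_path in sorted(detailed_logs_dir.glob("*/metadata.json")):
        meta = json.loads(meta_path.read_text(encoding="utf-8"))
        summaries.append({
            "request_id": meta["request_id"],
            "model": meta.get("model"),
            "duration_ms": meta.get("duration_ms"),
            "slow": (meta.get("duration_ms") or 0) >= slow_ms,
        })
    return summaries

# Build a fake log tree to demonstrate the shape of the data.
root = Path(tempfile.mkdtemp()) / "detailed_logs"
txn = root / "20250101_120000_abc"
txn.mkdir(parents=True)
(txn / "metadata.json").write_text(json.dumps({
    "request_id": "abc", "model": "gpt-4o", "duration_ms": 3500,
    "status_code": 200,
}), encoding="utf-8")

report = summarize_logs(root)
```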
README.md CHANGED

@@ -30,7 +30,7 @@ This project provides a powerful solution for developers building complex applic
 - **Intelligent Key Management**: Optimizes request distribution across your pool of keys by selecting the best available one for each call.
 - **Escalating Per-Model Cooldowns**: If a key fails for a specific model, it's placed on a temporary, escalating cooldown for that model, allowing it to be used with others.
 - **Automatic Daily Resets**: Cooldowns and usage statistics are automatically reset daily, making the system self-maintaining.
-- **Request Logging**:
+- **Detailed Request Logging**: Enable comprehensive logging for debugging. Each request gets its own directory with full request/response details, streaming chunks, and performance metadata.
 - **Provider Agnostic**: Compatible with any provider supported by `litellm`.
 - **OpenAI-Compatible Proxy**: Offers a familiar API interface with additional endpoints for model and provider discovery.

@@ -238,7 +238,7 @@ The proxy server can be configured at runtime using the following command-line a
 
 - `--host`: The IP address to bind the server to. Defaults to `0.0.0.0` (accessible from your local network).
 - `--port`: The port to run the server on. Defaults to `8000`.
-- `--enable-request-logging`: A flag to enable logging
+- `--enable-request-logging`: A flag to enable detailed, per-request logging. When active, the proxy creates a unique directory for each transaction in the `logs/detailed_logs/` folder, containing the full request, response, streaming chunks, and performance metadata. This is highly recommended for debugging.
 
 **Example:**
 ```bash

@@ -255,8 +255,8 @@ For convenience on Windows, you can use the provided `.bat` scripts in the root
 ### Troubleshooting
 
 - **`401 Unauthorized`**: Ensure your `PROXY_API_KEY` is set correctly in the `.env` file and included in the `Authorization: Bearer <key>` header of your request.
-- **`500 Internal Server Error`**: Check the console logs of the `uvicorn` server for detailed error messages. This could indicate an issue with one of your provider API keys (e.g., it's invalid or has been revoked) or a problem with the provider's service.
+- **`500 Internal Server Error`**: Check the console logs of the `uvicorn` server for detailed error messages. This could indicate an issue with one of your provider API keys (e.g., it's invalid or has been revoked) or a problem with the provider's service. If you have logging enabled (`--enable-request-logging`), inspect the `final_response.json` and `metadata.json` files in the corresponding log directory under `logs/detailed_logs/` for the specific error returned by the upstream provider.
-- **All keys on cooldown**: If you see a message that all keys are on cooldown, it means all your keys for a specific provider have recently failed.
+- **All keys on cooldown**: If you see a message that all keys are on cooldown, it means all your keys for a specific provider have recently failed. If you have logging enabled (`--enable-request-logging`), check the `logs/detailed_logs/` directory to find the logs for the failed requests and inspect the `final_response.json` to see the underlying error from the provider.
 
 ---
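
The `streaming_chunks.jsonl` file referenced above can be replayed offline when debugging a malformed stream. A sketch, assuming the logged chunks follow the OpenAI `delta` format; `reassemble_stream` is illustrative, not a proxy API:

```python
import json

def reassemble_stream(jsonl_lines):
    """Join the delta fragments from a streaming_chunks.jsonl log back into text."""
    parts = []
    for line in jsonl_lines:
        entry = json.loads(line)          # {"timestamp_utc": ..., "chunk": {...}}
        chunk = entry.get("chunk", {})
        for choice in chunk.get("choices", []):
            delta = choice.get("delta", {})
            if delta.get("content"):
                parts.append(delta["content"])
    return "".join(parts)

# Two sample log lines in the shape DetailedLogger.log_stream_chunk writes.
sample = [
    json.dumps({"timestamp_utc": "t0", "chunk": {"choices": [{"delta": {"content": "Hel"}}]}}),
    json.dumps({"timestamp_utc": "t1", "chunk": {"choices": [{"delta": {"content": "lo"}}]}}),
]
text = reassemble_stream(sample)  # → "Hello"
```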
src/proxy_app/detailed_logger.py ADDED

@@ -0,0 +1,121 @@
+import json
+import time
+import uuid
+from datetime import datetime
+from pathlib import Path
+from typing import Any, Dict, Optional, List
+import logging
+
+LOGS_DIR = Path(__file__).resolve().parent.parent.parent / "logs"
+DETAILED_LOGS_DIR = LOGS_DIR / "detailed_logs"
+
+class DetailedLogger:
+    """
+    Logs comprehensive details of each API transaction to a unique, timestamped directory.
+    """
+    def __init__(self):
+        """
+        Initializes the logger for a single request, creating a unique directory to store all related log files.
+        """
+        self.start_time = time.time()
+        self.request_id = str(uuid.uuid4())
+        timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
+        self.log_dir = DETAILED_LOGS_DIR / f"{timestamp}_{self.request_id}"
+        self.log_dir.mkdir(parents=True, exist_ok=True)
+        self.streaming = False
+
+    def _write_json(self, filename: str, data: Dict[str, Any]):
+        """Helper to write data to a JSON file in the log directory."""
+        try:
+            with open(self.log_dir / filename, "w", encoding="utf-8") as f:
+                json.dump(data, f, indent=4, ensure_ascii=False)
+        except Exception as e:
+            logging.error(f"[{self.request_id}] Failed to write to {filename}: {e}")
+
+    def log_request(self, headers: Dict[str, Any], body: Dict[str, Any]):
+        """Logs the initial request details."""
+        self.streaming = body.get("stream", False)
+        request_data = {
+            "request_id": self.request_id,
+            "timestamp_utc": datetime.utcnow().isoformat(),
+            "headers": dict(headers),
+            "body": body
+        }
+        self._write_json("request.json", request_data)
+
+    def log_stream_chunk(self, chunk: Dict[str, Any]):
+        """Logs an individual chunk from a streaming response to a JSON Lines file."""
+        try:
+            log_entry = {
+                "timestamp_utc": datetime.utcnow().isoformat(),
+                "chunk": chunk
+            }
+            with open(self.log_dir / "streaming_chunks.jsonl", "a", encoding="utf-8") as f:
+                f.write(json.dumps(log_entry, ensure_ascii=False) + "\n")
+        except Exception as e:
+            logging.error(f"[{self.request_id}] Failed to write stream chunk: {e}")
+
+    def log_final_response(self, status_code: int, headers: Optional[Dict[str, Any]], body: Dict[str, Any]):
+        """Logs the complete final response, either from a non-streaming call or after reassembling a stream."""
+        end_time = time.time()
+        duration_ms = (end_time - self.start_time) * 1000
+
+        response_data = {
+            "request_id": self.request_id,
+            "timestamp_utc": datetime.utcnow().isoformat(),
+            "status_code": status_code,
+            "duration_ms": round(duration_ms),
+            "headers": dict(headers) if headers else None,
+            "body": body
+        }
+        self._write_json("final_response.json", response_data)
+        self._log_metadata(response_data)
+
+    def _extract_reasoning(self, response_body: Dict[str, Any]) -> Optional[str]:
+        """Searches for and extracts 'reasoning' fields from the response body."""
+        if not isinstance(response_body, dict):
+            return None
+
+        if "reasoning" in response_body:
+            return response_body["reasoning"]
+
+        if "choices" in response_body and response_body["choices"]:
+            message = response_body["choices"][0].get("message", {})
+            if "reasoning" in message:
+                return message["reasoning"]
+            if "reasoning_content" in message:
+                return message["reasoning_content"]
+
+        return None
+
+    def _log_metadata(self, response_data: Dict[str, Any]):
+        """Logs a summary of the transaction for quick analysis."""
+        usage = response_data.get("body", {}).get("usage", {})
+        model = response_data.get("body", {}).get("model", "N/A")
+        finish_reason = "N/A"
+        if "choices" in response_data.get("body", {}) and response_data["body"]["choices"]:
+            finish_reason = response_data["body"]["choices"][0].get("finish_reason", "N/A")
+
+        metadata = {
+            "request_id": self.request_id,
+            "timestamp_utc": response_data["timestamp_utc"],
+            "duration_ms": response_data["duration_ms"],
+            "status_code": response_data["status_code"],
+            "model": model,
+            "streaming": self.streaming,
+            "usage": {
+                "prompt_tokens": usage.get("prompt_tokens"),
+                "completion_tokens": usage.get("completion_tokens"),
+                "total_tokens": usage.get("total_tokens"),
+            },
+            "finish_reason": finish_reason,
+            "reasoning_found": False,
+            "reasoning_content": None
+        }
+
+        reasoning = self._extract_reasoning(response_data.get("body", {}))
+        if reasoning:
+            metadata["reasoning_found"] = True
+            metadata["reasoning_content"] = reasoning
+
+        self._write_json("metadata.json", metadata)
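
The extraction rule in `_extract_reasoning` checks a top-level `reasoning` key first, then `reasoning`/`reasoning_content` on the first choice's message. A self-contained mirror of that rule for experimentation (this is a standalone sketch, not an import of the module):

```python
from typing import Any, Dict, Optional

def extract_reasoning(body: Dict[str, Any]) -> Optional[str]:
    """Standalone mirror of DetailedLogger._extract_reasoning."""
    if not isinstance(body, dict):
        return None
    # Top-level field wins.
    if "reasoning" in body:
        return body["reasoning"]
    # Otherwise look at the first choice's message.
    if body.get("choices"):
        message = body["choices"][0].get("message", {})
        if "reasoning" in message:
            return message["reasoning"]
        if "reasoning_content" in message:
            return message["reasoning_content"]
    return None

sample = {"choices": [{"message": {"reasoning_content": "step 1 ... step 2"}}]}
```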
src/proxy_app/main.py CHANGED

@@ -52,8 +52,9 @@ args, _ = parser.parse_known_args()
 sys.path.append(str(Path(__file__).resolve().parent.parent))
 
 from rotator_library import RotatingClient, PROVIDER_PLUGINS
-from proxy_app.request_logger import
+from proxy_app.request_logger import log_request_to_console
 from proxy_app.batch_manager import EmbeddingBatcher
+from proxy_app.detailed_logger import DetailedLogger
 
 # --- Logging Configuration ---
 LOG_DIR = Path(__file__).resolve().parent.parent / "logs"

@@ -206,7 +207,8 @@ async def verify_api_key(auth: str = Depends(api_key_header)):
 async def streaming_response_wrapper(
     request: Request,
     request_data: dict,
-    response_stream: AsyncGenerator[str, None]
+    response_stream: AsyncGenerator[str, None],
+    logger: Optional[DetailedLogger] = None
 ) -> AsyncGenerator[str, None]:
     """
     Wraps a streaming response to log the full response after completion

@@ -227,8 +229,10 @@ async def streaming_response_wrapper(
             try:
                 chunk_data = json.loads(content)
                 response_chunks.append(chunk_data)
+                if logger:
+                    logger.log_stream_chunk(chunk_data)
             except json.JSONDecodeError:
                 pass
     except Exception as e:
         logging.error(f"An error occurred during the response stream: {e}")
         # Yield a final error message to the client to ensure they are not left hanging.

@@ -242,13 +246,8 @@ async def streaming_response_wrapper(
         yield f"data: {json.dumps(error_payload)}\n\n"
         yield "data: [DONE]\n\n"
         # Also log this as a failed request
-        if
-            request_data=request_data,
-            response_data={"error": str(e)},
-            is_streaming=True,
-            log_type="completion"
-        )
+        if logger:
+            logger.log_final_response(status_code=500, headers=None, body={"error": str(e)})
         return  # Stop further processing
     finally:
         if response_chunks:

@@ -341,12 +340,11 @@ async def streaming_response_wrapper(
             "usage": usage_data
         }
 
-        if
-            log_type="completion"
-        )
+        if logger:
+            logger.log_final_response(
+                status_code=200,
+                headers=None,  # Headers are not available at this stage
+                body=full_response
+            )
 
 @app.post("/v1/chat/completions")

@@ -359,8 +357,12 @@ async def chat_completions(
     OpenAI-compatible endpoint powered by the RotatingClient.
     Handles both streaming and non-streaming responses and logs them.
     """
+    logger = DetailedLogger() if ENABLE_REQUEST_LOGGING else None
     try:
         request_data = await request.json()
+        if logger:
+            logger.log_request(headers=request.headers, body=request_data)
+
         log_request_to_console(
             url=str(request.url),
             headers=dict(request.headers),

@@ -372,17 +374,20 @@ async def chat_completions(
         if is_streaming:
             response_generator = client.acompletion(request=request, **request_data)
             return StreamingResponse(
-                streaming_response_wrapper(request, request_data, response_generator),
+                streaming_response_wrapper(request, request_data, response_generator, logger),
                 media_type="text/event-stream"
             )
         else:
             response = await client.acompletion(request=request, **request_data)
-            if
+            if logger:
+                # Assuming response has status_code and headers attributes
+                # This might need adjustment based on the actual response object
+                response_headers = response.headers if hasattr(response, 'headers') else None
+                status_code = response.status_code if hasattr(response, 'status_code') else 200
+                logger.log_final_response(
+                    status_code=status_code,
+                    headers=response_headers,
+                    body=response.model_dump()
+                )
             return response

@@ -406,12 +411,8 @@ async def chat_completions(
         request_data = await request.json()
     except json.JSONDecodeError:
         request_data = {"error": "Could not parse request body"}
-
-        response_data={"error": str(e)},
-        is_streaming=request_data.get("stream", False),
-        log_type="completion"
-    )
+    if logger:
+        logger.log_final_response(status_code=500, headers=None, body={"error": str(e)})
     raise HTTPException(status_code=500, detail=str(e))
 
 @app.post("/v1/embeddings")
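
The core pattern in `streaming_response_wrapper` is to forward every SSE line to the client unchanged while opportunistically parsing it for the optional logger. A minimal sketch of that tee pattern (`tee_stream` and the fake stream below are illustrative, not the proxy's code):

```python
import asyncio
import json
from typing import AsyncGenerator, Callable, Optional

async def tee_stream(
    stream: AsyncGenerator[str, None],
    on_chunk: Optional[Callable[[dict], None]] = None,
) -> AsyncGenerator[str, None]:
    """Yield each SSE line unchanged; tee parsed JSON chunks to an optional callback."""
    async for line in stream:
        content = line.removeprefix("data: ").strip()
        if content and content != "[DONE]":
            try:
                parsed = json.loads(content)
                if on_chunk:
                    on_chunk(parsed)
            except json.JSONDecodeError:
                pass  # Malformed chunks are still forwarded, just not logged.
        yield line

async def _demo():
    async def fake_stream():
        yield 'data: {"choices": []}\n\n'
        yield "data: [DONE]\n\n"
    seen = []
    forwarded = [line async for line in tee_stream(fake_stream(), seen.append)]
    return seen, forwarded

seen_chunks, forwarded = asyncio.run(_demo())
```

Passing `None` for the callback reproduces the disabled-logging path: the stream is forwarded with no side effects.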
src/proxy_app/request_logger.py CHANGED

@@ -38,39 +38,3 @@ def log_request_to_console(url: str, headers: dict, client_info: tuple, request_
     log_message = f"{time_str} - {client_info[0]}:{client_info[1]} - provider: {provider}, model: {model_name} - {endpoint_url}"
     logging.info(log_message)
 
-def log_request_response(
-    request_data: dict,
-    response_data: dict,
-    is_streaming: bool,
-    log_type: Literal["completion", "embedding"]
-):
-    """
-    Logs the request and response data to a file in the appropriate log directory.
-    """
-    try:
-        timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
-        unique_id = uuid.uuid4()
-
-        if log_type == "completion":
-            target_dir = COMPLETIONS_LOGS_DIR
-        elif log_type == "embedding":
-            target_dir = EMBEDDINGS_LOGS_DIR
-        else:
-            # Fallback to the main logs directory if log_type is invalid
-            target_dir = LOGS_DIR
-
-        filename = target_dir / f"{timestamp}_{unique_id}.json"
-
-        log_content = {
-            "request": request_data,
-            "response": response_data,
-            "is_streaming": is_streaming
-        }
-
-        with open(filename, "w", encoding="utf-8") as f:
-            json.dump(log_content, f, indent=4, ensure_ascii=False)
-
-    except Exception as e:
-        # In case of logging failure, we don't want to crash the main application
-        # Use the root logger to log the error to the file.
-        logging.error(f"Error logging request/response to file: {e}")