Spaces:
Running
Running
cloudwaddie commited on
Commit Β·
ecddbe3
1
Parent(s): 7e84f85
adjusted
Browse files- PRODUCTION_CHECKLIST.md +125 -0
- PRODUCTION_READY.md +167 -0
- README.md +119 -0
- src/main.py +201 -73
PRODUCTION_CHECKLIST.md
ADDED
|
@@ -0,0 +1,125 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
# Production Deployment Checklist
|
| 2 |
+
|
| 3 |
+
## Pre-Deployment
|
| 4 |
+
|
| 5 |
+
- [ ] **Set DEBUG = False** in `src/main.py`
|
| 6 |
+
- [ ] **Change admin password** in `config.json` from default
|
| 7 |
+
- [ ] **Generate strong API keys** using the dashboard
|
| 8 |
+
- [ ] **Configure rate limits** appropriate for your use case
|
| 9 |
+
- [ ] **Test all endpoints** with sample requests
|
| 10 |
+
- [ ] **Verify image upload** works with test images
|
| 11 |
+
- [ ] **Check LMArena tokens** are valid (arena-auth-prod-v1, cf_clearance)
|
| 12 |
+
|
| 13 |
+
## Security
|
| 14 |
+
|
| 15 |
+
- [ ] **Use HTTPS** via reverse proxy (nginx, Caddy, Traefik)
|
| 16 |
+
- [ ] **Restrict dashboard access** (IP whitelist or VPN)
|
| 17 |
+
- [ ] **Set strong passwords** for all accounts
|
| 18 |
+
- [ ] **Regularly rotate API keys** for security
|
| 19 |
+
- [ ] **Monitor for unauthorized access** in logs
|
| 20 |
+
- [ ] **Backup config.json** regularly
|
| 21 |
+
|
| 22 |
+
## Infrastructure
|
| 23 |
+
|
| 24 |
+
- [ ] **Set up reverse proxy** with SSL certificates
|
| 25 |
+
- [ ] **Configure systemd service** (or equivalent) for auto-restart
|
| 26 |
+
- [ ] **Set up monitoring** (response times, error rates)
|
| 27 |
+
- [ ] **Configure log rotation** to prevent disk fill
|
| 28 |
+
- [ ] **Test failover/restart** behavior
|
| 29 |
+
- [ ] **Document deployment** process for your team
|
| 30 |
+
|
| 31 |
+
## Performance
|
| 32 |
+
|
| 33 |
+
- [ ] **Test concurrent requests** to verify performance
|
| 34 |
+
- [ ] **Monitor memory usage** under load
|
| 35 |
+
- [ ] **Check response times** for acceptable latency
|
| 36 |
+
- [ ] **Test streaming mode** if using long responses
|
| 37 |
+
- [ ] **Verify image upload** doesn't cause timeouts
|
| 38 |
+
|
| 39 |
+
## Monitoring
|
| 40 |
+
|
| 41 |
+
- [ ] **Set up health checks** (e.g., /api/v1/models endpoint)
|
| 42 |
+
- [ ] **Monitor error rates** from logs
|
| 43 |
+
- [ ] **Track model usage** via dashboard statistics
|
| 44 |
+
- [ ] **Set up alerts** for high error rates or downtime
|
| 45 |
+
- [ ] **Monitor disk space** for logs and config backups
|
| 46 |
+
|
| 47 |
+
## Testing
|
| 48 |
+
|
| 49 |
+
- [ ] **Test with OpenAI SDK** to verify compatibility
|
| 50 |
+
- [ ] **Test error handling** (invalid keys, missing fields, etc.)
|
| 51 |
+
- [ ] **Test rate limiting** to verify it works correctly
|
| 52 |
+
- [ ] **Test image uploads** with various formats and sizes
|
| 53 |
+
- [ ] **Test streaming responses** if using that feature
|
| 54 |
+
- [ ] **Test multi-turn conversations** to verify session management
|
| 55 |
+
|
| 56 |
+
## Documentation
|
| 57 |
+
|
| 58 |
+
- [ ] **Document API endpoint** URL for your users
|
| 59 |
+
- [ ] **Document available models** and capabilities
|
| 60 |
+
- [ ] **Document rate limits** and usage policies
|
| 61 |
+
- [ ] **Document image support** and size limits
|
| 62 |
+
- [ ] **Document error codes** and troubleshooting
|
| 63 |
+
- [ ] **Create usage examples** for common scenarios
|
| 64 |
+
|
| 65 |
+
## Post-Deployment
|
| 66 |
+
|
| 67 |
+
- [ ] **Monitor logs** for first 24 hours
|
| 68 |
+
- [ ] **Verify all features** work in production
|
| 69 |
+
- [ ] **Test from external network** to verify accessibility
|
| 70 |
+
- [ ] **Check dashboard** is accessible and functional
|
| 71 |
+
- [ ] **Verify stats tracking** is working correctly
|
| 72 |
+
- [ ] **Document any issues** and resolutions
|
| 73 |
+
|
| 74 |
+
## Maintenance
|
| 75 |
+
|
| 76 |
+
- [ ] **Schedule regular token rotation** (arena-auth-prod-v1, cf_clearance)
|
| 77 |
+
- [ ] **Review API key usage** monthly
|
| 78 |
+
- [ ] **Check for updates** to dependencies
|
| 79 |
+
- [ ] **Monitor LMArena** for API changes
|
| 80 |
+
- [ ] **Backup configuration** weekly
|
| 81 |
+
- [ ] **Review logs** for suspicious activity
|
| 82 |
+
|
| 83 |
+
## Emergency Procedures
|
| 84 |
+
|
| 85 |
+
- [ ] **Document restart procedure** for service
|
| 86 |
+
- [ ] **Document token refresh** process
|
| 87 |
+
- [ ] **Document rollback procedure** if needed
|
| 88 |
+
- [ ] **Create emergency contacts** list
|
| 89 |
+
- [ ] **Test backup restoration** procedure
|
| 90 |
+
- [ ] **Document common issues** and fixes
|
| 91 |
+
|
| 92 |
+
---
|
| 93 |
+
|
| 94 |
+
## Quick Start Commands
|
| 95 |
+
|
| 96 |
+
### Check Service Status
|
| 97 |
+
```bash
|
| 98 |
+
sudo systemctl status lmarenabridge
|
| 99 |
+
```
|
| 100 |
+
|
| 101 |
+
### View Recent Logs
|
| 102 |
+
```bash
|
| 103 |
+
sudo journalctl -u lmarenabridge -n 100 -f
|
| 104 |
+
```
|
| 105 |
+
|
| 106 |
+
### Restart Service
|
| 107 |
+
```bash
|
| 108 |
+
sudo systemctl restart lmarenabridge
|
| 109 |
+
```
|
| 110 |
+
|
| 111 |
+
### Test API Endpoint
|
| 112 |
+
```bash
|
| 113 |
+
curl http://localhost:8000/api/v1/models \
|
| 114 |
+
-H "Authorization: Bearer sk-lmab-your-key-here"
|
| 115 |
+
```
|
| 116 |
+
|
| 117 |
+
### Check Disk Space
|
| 118 |
+
```bash
|
| 119 |
+
df -h
|
| 120 |
+
```
|
| 121 |
+
|
| 122 |
+
### Monitor Process
|
| 123 |
+
```bash
|
| 124 |
+
htop
|
| 125 |
+
```
|
PRODUCTION_READY.md
ADDED
|
@@ -0,0 +1,167 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
# Production Readiness - Summary of Changes
|
| 2 |
+
|
| 3 |
+
## Changes Made for Production Deployment
|
| 4 |
+
|
| 5 |
+
### 1. Debug Mode Disabled β
|
| 6 |
+
- Set `DEBUG = False` in `src/main.py` (line 24)
|
| 7 |
+
- Reduces log verbosity for production
|
| 8 |
+
- Improves performance by reducing I/O operations
|
| 9 |
+
|
| 10 |
+
### 2. Enhanced Error Handling
|
| 11 |
+
|
| 12 |
+
#### Image Upload Function (`upload_image_to_lmarena`)
|
| 13 |
+
- **Input Validation**: Checks for empty data and invalid MIME types
|
| 14 |
+
- **HTTP Error Handling**: Catches and logs `httpx.TimeoutException` and `httpx.HTTPError`
|
| 15 |
+
- **JSON Parsing**: Handles `JSONDecodeError`, `KeyError`, and `IndexError` gracefully
|
| 16 |
+
- **Timeout Configuration**: 30s for requests, 60s for large uploads
|
| 17 |
+
- **Detailed Error Messages**: Clear error messages for debugging
|
| 18 |
+
|
| 19 |
+
#### Image Processing Function (`process_message_content`)
|
| 20 |
+
- **Data URI Validation**: Validates format before parsing
|
| 21 |
+
- **MIME Type Validation**: Ensures only image types are processed
|
| 22 |
+
- **Base64 Decoding**: Catches decoding errors gracefully
|
| 23 |
+
- **Size Limits**: Enforces 10MB maximum per image
|
| 24 |
+
- **Error Isolation**: Continues processing even if one image fails
|
| 25 |
+
|
| 26 |
+
#### Main API Endpoint (`api_chat_completions`)
|
| 27 |
+
- **JSON Parsing**: Validates request body format
|
| 28 |
+
- **Field Validation**: Checks required fields and data types
|
| 29 |
+
- **Empty Array Check**: Validates messages array is not empty
|
| 30 |
+
- **Model Loading**: Catches errors when loading model list
|
| 31 |
+
- **Usage Logging**: Non-critical failures don't break requests
|
| 32 |
+
- **Image Processing**: Catches and reports processing errors
|
| 33 |
+
- **HTTP Errors**: Returns OpenAI-compatible error responses
|
| 34 |
+
- **Timeout Errors**: 120s timeout with clear error messages
|
| 35 |
+
- **Unexpected Errors**: Catches all exceptions with detailed logging
|
| 36 |
+
|
| 37 |
+
### 3. Error Response Format
|
| 38 |
+
|
| 39 |
+
All errors return OpenAI-compatible format:
|
| 40 |
+
```json
|
| 41 |
+
{
|
| 42 |
+
"error": {
|
| 43 |
+
"message": "Descriptive error message",
|
| 44 |
+
"type": "error_type",
|
| 45 |
+
"code": "error_code"
|
| 46 |
+
}
|
| 47 |
+
}
|
| 48 |
+
```
|
| 49 |
+
|
| 50 |
+
Error types include:
|
| 51 |
+
- `rate_limit_error` - 429 Too Many Requests
|
| 52 |
+
- `upstream_error` - LMArena API errors
|
| 53 |
+
- `timeout_error` - Request timeouts
|
| 54 |
+
- `internal_error` - Unexpected server errors
|
| 55 |
+
|
| 56 |
+
### 4. Health Check Endpoint
|
| 57 |
+
|
| 58 |
+
New endpoint: `GET /api/v1/health`
|
| 59 |
+
|
| 60 |
+
Returns:
|
| 61 |
+
```json
|
| 62 |
+
{
|
| 63 |
+
"status": "healthy|degraded|unhealthy",
|
| 64 |
+
"timestamp": "2024-11-06T12:00:00Z",
|
| 65 |
+
"checks": {
|
| 66 |
+
"cf_clearance": true,
|
| 67 |
+
"models_loaded": true,
|
| 68 |
+
"model_count": 45,
|
| 69 |
+
"api_keys_configured": true
|
| 70 |
+
}
|
| 71 |
+
}
|
| 72 |
+
```
|
| 73 |
+
|
| 74 |
+
Use for monitoring and load balancer health checks.
|
| 75 |
+
|
| 76 |
+
### 5. Documentation Updates
|
| 77 |
+
|
| 78 |
+
#### README.md
|
| 79 |
+
- Added "Production Deployment" section
|
| 80 |
+
- Documented error handling capabilities
|
| 81 |
+
- Added debug mode instructions
|
| 82 |
+
- Included monitoring guidelines
|
| 83 |
+
- Security best practices
|
| 84 |
+
- Common issues and solutions
|
| 85 |
+
- Nginx reverse proxy example
|
| 86 |
+
- Systemd service example
|
| 87 |
+
|
| 88 |
+
#### PRODUCTION_CHECKLIST.md (New File)
|
| 89 |
+
- Pre-deployment checklist
|
| 90 |
+
- Security checklist
|
| 91 |
+
- Infrastructure setup
|
| 92 |
+
- Performance testing
|
| 93 |
+
- Monitoring setup
|
| 94 |
+
- Testing procedures
|
| 95 |
+
- Documentation requirements
|
| 96 |
+
- Post-deployment monitoring
|
| 97 |
+
- Maintenance schedule
|
| 98 |
+
- Emergency procedures
|
| 99 |
+
- Quick reference commands
|
| 100 |
+
|
| 101 |
+
## Security Improvements
|
| 102 |
+
|
| 103 |
+
1. **Input Validation**: All user inputs are validated
|
| 104 |
+
2. **Size Limits**: 10MB max per image prevents DOS attacks
|
| 105 |
+
3. **Error Sanitization**: Sensitive data not exposed in errors
|
| 106 |
+
4. **Timeout Protection**: All requests have timeouts
|
| 107 |
+
5. **Rate Limiting**: Existing rate limiting preserved
|
| 108 |
+
|
| 109 |
+
## Performance Optimizations
|
| 110 |
+
|
| 111 |
+
1. **Debug Logging**: Disabled in production mode
|
| 112 |
+
2. **Error Handling**: Fast-fail for invalid requests
|
| 113 |
+
3. **Non-blocking**: Image uploads use async operations
|
| 114 |
+
4. **Resource Cleanup**: Proper exception handling ensures cleanup
|
| 115 |
+
|
| 116 |
+
## Monitoring Capabilities
|
| 117 |
+
|
| 118 |
+
1. **Health Check Endpoint**: `/api/v1/health` for monitoring
|
| 119 |
+
2. **Error Logging**: Structured error messages
|
| 120 |
+
3. **Usage Statistics**: Tracked in dashboard
|
| 121 |
+
4. **Request Logging**: Optional debug mode for troubleshooting
|
| 122 |
+
|
| 123 |
+
## Deployment Ready
|
| 124 |
+
|
| 125 |
+
The application is now ready for production deployment with:
|
| 126 |
+
|
| 127 |
+
β
Debug mode OFF by default
|
| 128 |
+
β
Comprehensive error handling
|
| 129 |
+
β
Input validation on all endpoints
|
| 130 |
+
β
Timeout protection
|
| 131 |
+
β
Health check endpoint
|
| 132 |
+
β
OpenAI-compatible error responses
|
| 133 |
+
β
Detailed documentation
|
| 134 |
+
β
Production checklist
|
| 135 |
+
β
Security best practices
|
| 136 |
+
β
Monitoring guidelines
|
| 137 |
+
|
| 138 |
+
## Testing Recommendations
|
| 139 |
+
|
| 140 |
+
Before deploying to production:
|
| 141 |
+
|
| 142 |
+
1. **Run test_image_support.py** with various image sizes and formats
|
| 143 |
+
2. **Test with invalid inputs** to verify error handling
|
| 144 |
+
3. **Test rate limiting** with concurrent requests
|
| 145 |
+
4. **Test timeout scenarios** with slow networks
|
| 146 |
+
5. **Monitor resource usage** under load
|
| 147 |
+
6. **Test health check endpoint** with monitoring tools
|
| 148 |
+
7. **Verify log output** with DEBUG = False
|
| 149 |
+
|
| 150 |
+
## Next Steps
|
| 151 |
+
|
| 152 |
+
1. Review `PRODUCTION_CHECKLIST.md` and complete all items
|
| 153 |
+
2. Set up reverse proxy with SSL (see README.md)
|
| 154 |
+
3. Configure systemd service (see README.md)
|
| 155 |
+
4. Set up monitoring and alerts
|
| 156 |
+
5. Test all endpoints in production environment
|
| 157 |
+
6. Document your specific deployment details
|
| 158 |
+
7. Create backup procedures for config.json
|
| 159 |
+
|
| 160 |
+
## Support
|
| 161 |
+
|
| 162 |
+
If you encounter issues:
|
| 163 |
+
- Check logs for error messages
|
| 164 |
+
- Use `/api/v1/health` to verify system status
|
| 165 |
+
- Enable DEBUG mode temporarily for troubleshooting
|
| 166 |
+
- Review common issues in README.md
|
| 167 |
+
- Contact cloudwaddie for assistance
|
README.md
CHANGED
|
@@ -86,3 +86,122 @@ print(response.choices[0].message.content)
|
|
| 86 |
- External image URLs (http/https) are not yet supported
|
| 87 |
- Models without image support will ignore image content
|
| 88 |
- Check model capabilities using `/api/v1/models` endpoint
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 86 |
- External image URLs (http/https) are not yet supported
|
| 87 |
- Models without image support will ignore image content
|
| 88 |
- Check model capabilities using `/api/v1/models` endpoint
|
| 89 |
+
- Maximum image size: 10MB per image
|
| 90 |
+
|
| 91 |
+
## Production Deployment
|
| 92 |
+
|
| 93 |
+
### Error Handling
|
| 94 |
+
|
| 95 |
+
LMArenaBridge includes comprehensive error handling for production use:
|
| 96 |
+
|
| 97 |
+
- **Request Validation**: Validates JSON format, required fields, and data types
|
| 98 |
+
- **Model Validation**: Checks model availability and access permissions
|
| 99 |
+
- **Image Processing**: Validates image formats, sizes (max 10MB), and MIME types
|
| 100 |
+
- **Upload Failures**: Gracefully handles image upload failures with retry logic
|
| 101 |
+
- **Timeout Handling**: Configurable timeouts for all HTTP requests (30-120s)
|
| 102 |
+
- **Rate Limiting**: Built-in rate limiting per API key
|
| 103 |
+
- **Error Responses**: OpenAI-compatible error format for easy client integration
|
| 104 |
+
|
| 105 |
+
### Debug Mode
|
| 106 |
+
|
| 107 |
+
Debug mode is **OFF** by default in production. To enable debugging:
|
| 108 |
+
|
| 109 |
+
```python
|
| 110 |
+
# In src/main.py
|
| 111 |
+
DEBUG = True # Set to True for detailed logging
|
| 112 |
+
```
|
| 113 |
+
|
| 114 |
+
When debug mode is enabled, you'll see:
|
| 115 |
+
- Detailed request/response logs
|
| 116 |
+
- Image upload progress
|
| 117 |
+
- Model capability checks
|
| 118 |
+
- Session management details
|
| 119 |
+
|
| 120 |
+
**Important**: Keep debug mode OFF in production to reduce log verbosity and improve performance.
|
| 121 |
+
|
| 122 |
+
### Monitoring
|
| 123 |
+
|
| 124 |
+
Monitor these key metrics in production:
|
| 125 |
+
|
| 126 |
+
- **API Response Times**: Check for slow responses indicating timeout issues
|
| 127 |
+
- **Error Rates**: Track 4xx/5xx errors from `/api/v1/chat/completions`
|
| 128 |
+
- **Model Usage**: Dashboard shows top 10 most-used models
|
| 129 |
+
- **Image Upload Success**: Monitor image upload failures in logs
|
| 130 |
+
|
| 131 |
+
### Security Best Practices
|
| 132 |
+
|
| 133 |
+
1. **API Keys**: Use strong, randomly generated API keys (dashboard auto-generates secure keys)
|
| 134 |
+
2. **Rate Limiting**: Configure appropriate rate limits per key in dashboard
|
| 135 |
+
3. **Admin Password**: Change default admin password in `config.json`
|
| 136 |
+
4. **HTTPS**: Use a reverse proxy (nginx, Caddy) with SSL for production
|
| 137 |
+
5. **Firewall**: Restrict access to dashboard port (default 8000)
|
| 138 |
+
|
| 139 |
+
### Common Issues
|
| 140 |
+
|
| 141 |
+
**"LMArena API error: An error occurred"**
|
| 142 |
+
- Check that your `arena-auth-prod-v1` token is valid
|
| 143 |
+
- Verify `cf_clearance` cookie is not expired
|
| 144 |
+
- Ensure model is available on LMArena
|
| 145 |
+
|
| 146 |
+
**Image Upload Failures**
|
| 147 |
+
- Verify image is under 10MB
|
| 148 |
+
- Check MIME type is supported (image/png, image/jpeg, etc.)
|
| 149 |
+
- Ensure LMArena R2 storage is accessible
|
| 150 |
+
|
| 151 |
+
**Timeout Errors**
|
| 152 |
+
- Increase timeout in `src/main.py` if needed (default 120s)
|
| 153 |
+
- Check network connectivity to LMArena
|
| 154 |
+
- Consider using streaming mode for long responses
|
| 155 |
+
|
| 156 |
+
### Reverse Proxy Example (Nginx)
|
| 157 |
+
|
| 158 |
+
```nginx
|
| 159 |
+
server {
|
| 160 |
+
listen 443 ssl;
|
| 161 |
+
server_name api.yourdomain.com;
|
| 162 |
+
|
| 163 |
+
ssl_certificate /path/to/cert.pem;
|
| 164 |
+
ssl_certificate_key /path/to/key.pem;
|
| 165 |
+
|
| 166 |
+
location / {
|
| 167 |
+
proxy_pass http://localhost:8000;
|
| 168 |
+
proxy_set_header Host $host;
|
| 169 |
+
proxy_set_header X-Real-IP $remote_addr;
|
| 170 |
+
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
|
| 171 |
+
proxy_set_header X-Forwarded-Proto $scheme;
|
| 172 |
+
|
| 173 |
+
# For streaming responses
|
| 174 |
+
proxy_buffering off;
|
| 175 |
+
proxy_cache off;
|
| 176 |
+
}
|
| 177 |
+
}
|
| 178 |
+
```
|
| 179 |
+
|
| 180 |
+
### Running as a Service (systemd)
|
| 181 |
+
|
| 182 |
+
Create `/etc/systemd/system/lmarenabridge.service`:
|
| 183 |
+
|
| 184 |
+
```ini
|
| 185 |
+
[Unit]
|
| 186 |
+
Description=LMArena Bridge API
|
| 187 |
+
After=network.target
|
| 188 |
+
|
| 189 |
+
[Service]
|
| 190 |
+
Type=simple
|
| 191 |
+
User=youruser
|
| 192 |
+
WorkingDirectory=/path/to/lmarenabridge
|
| 193 |
+
Environment="PATH=/path/to/venv/bin"
|
| 194 |
+
ExecStart=/path/to/venv/bin/python src/main.py
|
| 195 |
+
Restart=always
|
| 196 |
+
RestartSec=10
|
| 197 |
+
|
| 198 |
+
[Install]
|
| 199 |
+
WantedBy=multi-user.target
|
| 200 |
+
```
|
| 201 |
+
|
| 202 |
+
Enable and start:
|
| 203 |
+
```bash
|
| 204 |
+
sudo systemctl enable lmarenabridge
|
| 205 |
+
sudo systemctl start lmarenabridge
|
| 206 |
+
sudo systemctl status lmarenabridge
|
| 207 |
+
```
|
src/main.py
CHANGED
|
@@ -64,6 +64,15 @@ async def upload_image_to_lmarena(image_data: bytes, mime_type: str, filename: s
|
|
| 64 |
Tuple of (key, download_url) if successful, or None if upload fails
|
| 65 |
"""
|
| 66 |
try:
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 67 |
# Step 1: Request upload URL
|
| 68 |
debug_print(f"π€ Step 1: Requesting upload URL for {filename}")
|
| 69 |
|
|
@@ -77,72 +86,101 @@ async def upload_image_to_lmarena(image_data: bytes, mime_type: str, filename: s
|
|
| 77 |
})
|
| 78 |
|
| 79 |
async with httpx.AsyncClient() as client:
|
| 80 |
-
|
| 81 |
-
|
| 82 |
-
|
| 83 |
-
|
| 84 |
-
|
| 85 |
-
|
| 86 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 87 |
|
| 88 |
# Parse response - format: 0:{...}\n1:{...}\n
|
| 89 |
-
|
| 90 |
-
|
| 91 |
-
|
| 92 |
-
|
| 93 |
-
|
| 94 |
-
|
| 95 |
-
|
| 96 |
-
|
| 97 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 98 |
return None
|
| 99 |
|
| 100 |
-
upload_url = upload_data['data']['uploadUrl']
|
| 101 |
-
key = upload_data['data']['key']
|
| 102 |
-
debug_print(f"β
Got upload URL and key: {key}")
|
| 103 |
-
|
| 104 |
# Step 2: Upload image to R2 storage
|
| 105 |
debug_print(f"π€ Step 2: Uploading image to R2 storage ({len(image_data)} bytes)")
|
| 106 |
-
|
| 107 |
-
|
| 108 |
-
|
| 109 |
-
|
| 110 |
-
|
| 111 |
-
|
| 112 |
-
|
| 113 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 114 |
|
| 115 |
# Step 3: Get signed download URL (uses different Next-Action)
|
| 116 |
debug_print(f"π€ Step 3: Requesting signed download URL")
|
| 117 |
request_headers_step3 = request_headers.copy()
|
| 118 |
request_headers_step3["Next-Action"] = "6064c365792a3eaf40a60a874b327fe031ea6f22d7"
|
| 119 |
|
| 120 |
-
|
| 121 |
-
|
| 122 |
-
|
| 123 |
-
|
| 124 |
-
|
| 125 |
-
|
| 126 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 127 |
|
| 128 |
# Parse response
|
| 129 |
-
|
| 130 |
-
|
| 131 |
-
|
| 132 |
-
|
| 133 |
-
|
| 134 |
-
|
| 135 |
-
|
| 136 |
-
|
| 137 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 138 |
return None
|
| 139 |
|
| 140 |
-
download_url = download_data['data']['url']
|
| 141 |
-
debug_print(f"β
Got signed download URL: {download_url[:100]}...")
|
| 142 |
-
return (key, download_url)
|
| 143 |
-
|
| 144 |
except Exception as e:
|
| 145 |
-
debug_print(f"β
|
| 146 |
return None
|
| 147 |
|
| 148 |
async def process_message_content(content, model_capabilities: dict) -> tuple[str, List[dict]]:
|
|
@@ -184,11 +222,36 @@ async def process_message_content(content, model_capabilities: dict) -> tuple[st
|
|
| 184 |
if url.startswith('data:'):
|
| 185 |
# Format: data:image/png;base64,iVBORw0KGgo...
|
| 186 |
try:
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 187 |
header, data = url.split(',', 1)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 188 |
mime_type = header.split(';')[0].split(':')[1]
|
| 189 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 190 |
# Decode base64
|
| 191 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 192 |
|
| 193 |
# Generate filename
|
| 194 |
ext = mimetypes.guess_extension(mime_type) or '.png'
|
|
@@ -211,7 +274,7 @@ async def process_message_content(content, model_capabilities: dict) -> tuple[st
|
|
| 211 |
else:
|
| 212 |
debug_print(f"β οΈ Failed to upload image, skipping")
|
| 213 |
except Exception as e:
|
| 214 |
-
debug_print(f"β
|
| 215 |
|
| 216 |
# Handle URL images (direct URLs)
|
| 217 |
elif url.startswith('http://') or url.startswith('https://'):
|
|
@@ -1200,6 +1263,37 @@ async def refresh_tokens(session: str = Depends(get_current_session)):
|
|
| 1200 |
|
| 1201 |
# --- OpenAI Compatible API Endpoints ---
|
| 1202 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1203 |
@app.get("/api/v1/models")
|
| 1204 |
async def list_models(api_key: dict = Depends(rate_limit_api_key)):
|
| 1205 |
models = get_models()
|
|
@@ -1227,32 +1321,63 @@ async def api_chat_completions(request: Request, api_key: dict = Depends(rate_li
|
|
| 1227 |
debug_print("="*80)
|
| 1228 |
|
| 1229 |
try:
|
| 1230 |
-
body
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1231 |
debug_print(f"π₯ Request body keys: {list(body.keys())}")
|
| 1232 |
|
|
|
|
| 1233 |
model_public_name = body.get("model")
|
| 1234 |
messages = body.get("messages", [])
|
| 1235 |
stream = body.get("stream", False)
|
| 1236 |
|
| 1237 |
debug_print(f"π Stream mode: {stream}")
|
| 1238 |
-
|
| 1239 |
debug_print(f"π€ Requested model: {model_public_name}")
|
| 1240 |
debug_print(f"π¬ Number of messages: {len(messages)}")
|
| 1241 |
|
| 1242 |
-
if not model_public_name
|
| 1243 |
-
debug_print("β Missing model
|
| 1244 |
-
raise HTTPException(status_code=400, detail="Missing 'model'
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1245 |
|
| 1246 |
# Find model ID from public name
|
| 1247 |
-
|
| 1248 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1249 |
|
| 1250 |
model_id = None
|
| 1251 |
model_org = None
|
|
|
|
|
|
|
| 1252 |
for m in models:
|
| 1253 |
if m.get("publicName") == model_public_name:
|
| 1254 |
model_id = m.get("id")
|
| 1255 |
model_org = m.get("organization")
|
|
|
|
| 1256 |
break
|
| 1257 |
|
| 1258 |
if not model_id:
|
|
@@ -1271,26 +1396,29 @@ async def api_chat_completions(request: Request, api_key: dict = Depends(rate_li
|
|
| 1271 |
)
|
| 1272 |
|
| 1273 |
debug_print(f"β
Found model ID: {model_id}")
|
| 1274 |
-
|
| 1275 |
-
# Get model capabilities
|
| 1276 |
-
model_capabilities = {}
|
| 1277 |
-
for m in models:
|
| 1278 |
-
if m.get("id") == model_id:
|
| 1279 |
-
model_capabilities = m.get("capabilities", {})
|
| 1280 |
-
break
|
| 1281 |
-
|
| 1282 |
debug_print(f"π§ Model capabilities: {model_capabilities}")
|
| 1283 |
|
| 1284 |
# Log usage
|
| 1285 |
-
|
| 1286 |
-
|
| 1287 |
-
|
| 1288 |
-
|
| 1289 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1290 |
|
| 1291 |
# Process last message content (may include images)
|
| 1292 |
-
|
| 1293 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1294 |
|
| 1295 |
# Validate prompt
|
| 1296 |
if not prompt:
|
|
|
|
| 64 |
Tuple of (key, download_url) if successful, or None if upload fails
|
| 65 |
"""
|
| 66 |
try:
|
| 67 |
+
# Validate inputs
|
| 68 |
+
if not image_data:
|
| 69 |
+
debug_print("β Image data is empty")
|
| 70 |
+
return None
|
| 71 |
+
|
| 72 |
+
if not mime_type or not mime_type.startswith('image/'):
|
| 73 |
+
debug_print(f"β Invalid MIME type: {mime_type}")
|
| 74 |
+
return None
|
| 75 |
+
|
| 76 |
# Step 1: Request upload URL
|
| 77 |
debug_print(f"π€ Step 1: Requesting upload URL for {filename}")
|
| 78 |
|
|
|
|
| 86 |
})
|
| 87 |
|
| 88 |
async with httpx.AsyncClient() as client:
|
| 89 |
+
try:
|
| 90 |
+
response = await client.post(
|
| 91 |
+
"https://lmarena.ai/?mode=direct",
|
| 92 |
+
headers=request_headers,
|
| 93 |
+
content=json.dumps([filename, mime_type]),
|
| 94 |
+
timeout=30.0
|
| 95 |
+
)
|
| 96 |
+
response.raise_for_status()
|
| 97 |
+
except httpx.TimeoutException:
|
| 98 |
+
debug_print("β Timeout while requesting upload URL")
|
| 99 |
+
return None
|
| 100 |
+
except httpx.HTTPError as e:
|
| 101 |
+
debug_print(f"β HTTP error while requesting upload URL: {e}")
|
| 102 |
+
return None
|
| 103 |
|
| 104 |
# Parse response - format: 0:{...}\n1:{...}\n
|
| 105 |
+
try:
|
| 106 |
+
lines = response.text.strip().split('\n')
|
| 107 |
+
upload_data = None
|
| 108 |
+
for line in lines:
|
| 109 |
+
if line.startswith('1:'):
|
| 110 |
+
upload_data = json.loads(line[2:])
|
| 111 |
+
break
|
| 112 |
+
|
| 113 |
+
if not upload_data or not upload_data.get('success'):
|
| 114 |
+
debug_print(f"β Failed to get upload URL: {response.text[:200]}")
|
| 115 |
+
return None
|
| 116 |
+
|
| 117 |
+
upload_url = upload_data['data']['uploadUrl']
|
| 118 |
+
key = upload_data['data']['key']
|
| 119 |
+
debug_print(f"β
Got upload URL and key: {key}")
|
| 120 |
+
except (json.JSONDecodeError, KeyError, IndexError) as e:
|
| 121 |
+
debug_print(f"β Failed to parse upload URL response: {e}")
|
| 122 |
return None
|
| 123 |
|
|
|
|
|
|
|
|
|
|
|
|
|
| 124 |
# Step 2: Upload image to R2 storage
|
| 125 |
debug_print(f"π€ Step 2: Uploading image to R2 storage ({len(image_data)} bytes)")
|
| 126 |
+
try:
|
| 127 |
+
response = await client.put(
|
| 128 |
+
upload_url,
|
| 129 |
+
content=image_data,
|
| 130 |
+
headers={"Content-Type": mime_type},
|
| 131 |
+
timeout=60.0
|
| 132 |
+
)
|
| 133 |
+
response.raise_for_status()
|
| 134 |
+
debug_print(f"β
Image uploaded successfully")
|
| 135 |
+
except httpx.TimeoutException:
|
| 136 |
+
debug_print("β Timeout while uploading image to R2 storage")
|
| 137 |
+
return None
|
| 138 |
+
except httpx.HTTPError as e:
|
| 139 |
+
debug_print(f"β HTTP error while uploading image: {e}")
|
| 140 |
+
return None
|
| 141 |
|
| 142 |
# Step 3: Get signed download URL (uses different Next-Action)
|
| 143 |
debug_print(f"π€ Step 3: Requesting signed download URL")
|
| 144 |
request_headers_step3 = request_headers.copy()
|
| 145 |
request_headers_step3["Next-Action"] = "6064c365792a3eaf40a60a874b327fe031ea6f22d7"
|
| 146 |
|
| 147 |
+
try:
|
| 148 |
+
response = await client.post(
|
| 149 |
+
"https://lmarena.ai/?mode=direct",
|
| 150 |
+
headers=request_headers_step3,
|
| 151 |
+
content=json.dumps([key]),
|
| 152 |
+
timeout=30.0
|
| 153 |
+
)
|
| 154 |
+
response.raise_for_status()
|
| 155 |
+
except httpx.TimeoutException:
|
| 156 |
+
debug_print("β Timeout while requesting download URL")
|
| 157 |
+
return None
|
| 158 |
+
except httpx.HTTPError as e:
|
| 159 |
+
debug_print(f"β HTTP error while requesting download URL: {e}")
|
| 160 |
+
return None
|
| 161 |
|
| 162 |
# Parse response
|
| 163 |
+
try:
|
| 164 |
+
lines = response.text.strip().split('\n')
|
| 165 |
+
download_data = None
|
| 166 |
+
for line in lines:
|
| 167 |
+
if line.startswith('1:'):
|
| 168 |
+
download_data = json.loads(line[2:])
|
| 169 |
+
break
|
| 170 |
+
|
| 171 |
+
if not download_data or not download_data.get('success'):
|
| 172 |
+
debug_print(f"β Failed to get download URL: {response.text[:200]}")
|
| 173 |
+
return None
|
| 174 |
+
|
| 175 |
+
download_url = download_data['data']['url']
|
| 176 |
+
debug_print(f"β
Got signed download URL: {download_url[:100]}...")
|
| 177 |
+
return (key, download_url)
|
| 178 |
+
except (json.JSONDecodeError, KeyError, IndexError) as e:
|
| 179 |
+
debug_print(f"β Failed to parse download URL response: {e}")
|
| 180 |
return None
|
| 181 |
|
|
|
|
|
|
|
|
|
|
|
|
|
| 182 |
except Exception as e:
|
| 183 |
+
debug_print(f"β Unexpected error uploading image: {type(e).__name__}: {e}")
|
| 184 |
return None
|
| 185 |
|
| 186 |
async def process_message_content(content, model_capabilities: dict) -> tuple[str, List[dict]]:
|
|
|
|
| 222 |
if url.startswith('data:'):
|
| 223 |
# Format: data:image/png;base64,iVBORw0KGgo...
|
| 224 |
try:
|
| 225 |
+
# Validate and parse data URI
|
| 226 |
+
if ',' not in url:
|
| 227 |
+
debug_print(f"β Invalid data URI format (no comma separator)")
|
| 228 |
+
continue
|
| 229 |
+
|
| 230 |
header, data = url.split(',', 1)
|
| 231 |
+
|
| 232 |
+
# Parse MIME type
|
| 233 |
+
if ';' not in header or ':' not in header:
|
| 234 |
+
debug_print(f"β Invalid data URI header format")
|
| 235 |
+
continue
|
| 236 |
+
|
| 237 |
mime_type = header.split(';')[0].split(':')[1]
|
| 238 |
|
| 239 |
+
# Validate MIME type
|
| 240 |
+
if not mime_type.startswith('image/'):
|
| 241 |
+
debug_print(f"β Invalid MIME type: {mime_type}")
|
| 242 |
+
continue
|
| 243 |
+
|
| 244 |
# Decode base64
|
| 245 |
+
try:
|
| 246 |
+
image_data = base64.b64decode(data)
|
| 247 |
+
except Exception as e:
|
| 248 |
+
debug_print(f"β Failed to decode base64 data: {e}")
|
| 249 |
+
continue
|
| 250 |
+
|
| 251 |
+
# Validate image size (max 10MB)
|
| 252 |
+
if len(image_data) > 10 * 1024 * 1024:
|
| 253 |
+
debug_print(f"β Image too large: {len(image_data)} bytes (max 10MB)")
|
| 254 |
+
continue
|
| 255 |
|
| 256 |
# Generate filename
|
| 257 |
ext = mimetypes.guess_extension(mime_type) or '.png'
|
|
|
|
| 274 |
else:
|
| 275 |
debug_print(f"β οΈ Failed to upload image, skipping")
|
| 276 |
except Exception as e:
|
| 277 |
+
debug_print(f"β Unexpected error processing base64 image: {type(e).__name__}: {e}")
|
| 278 |
|
| 279 |
# Handle URL images (direct URLs)
|
| 280 |
elif url.startswith('http://') or url.startswith('https://'):
|
|
|
|
| 1263 |
|
| 1264 |
# --- OpenAI Compatible API Endpoints ---
|
| 1265 |
|
| 1266 |
+
@app.get("/api/v1/health")
|
| 1267 |
+
async def health_check():
|
| 1268 |
+
"""Health check endpoint for monitoring"""
|
| 1269 |
+
try:
|
| 1270 |
+
models = get_models()
|
| 1271 |
+
config = get_config()
|
| 1272 |
+
|
| 1273 |
+
# Basic health checks
|
| 1274 |
+
has_cf_clearance = bool(config.get("cf_clearance"))
|
| 1275 |
+
has_models = len(models) > 0
|
| 1276 |
+
has_api_keys = len(config.get("api_keys", [])) > 0
|
| 1277 |
+
|
| 1278 |
+
status = "healthy" if (has_cf_clearance and has_models) else "degraded"
|
| 1279 |
+
|
| 1280 |
+
return {
|
| 1281 |
+
"status": status,
|
| 1282 |
+
"timestamp": datetime.now(timezone.utc).isoformat(),
|
| 1283 |
+
"checks": {
|
| 1284 |
+
"cf_clearance": has_cf_clearance,
|
| 1285 |
+
"models_loaded": has_models,
|
| 1286 |
+
"model_count": len(models),
|
| 1287 |
+
"api_keys_configured": has_api_keys
|
| 1288 |
+
}
|
| 1289 |
+
}
|
| 1290 |
+
except Exception as e:
|
| 1291 |
+
return {
|
| 1292 |
+
"status": "unhealthy",
|
| 1293 |
+
"timestamp": datetime.now(timezone.utc).isoformat(),
|
| 1294 |
+
"error": str(e)
|
| 1295 |
+
}
|
| 1296 |
+
|
| 1297 |
@app.get("/api/v1/models")
|
| 1298 |
async def list_models(api_key: dict = Depends(rate_limit_api_key)):
|
| 1299 |
models = get_models()
|
|
|
|
| 1321 |
debug_print("="*80)
|
| 1322 |
|
| 1323 |
try:
|
| 1324 |
+
# Parse request body with error handling
|
| 1325 |
+
try:
|
| 1326 |
+
body = await request.json()
|
| 1327 |
+
except json.JSONDecodeError as e:
|
| 1328 |
+
debug_print(f"β Invalid JSON in request body: {e}")
|
| 1329 |
+
raise HTTPException(status_code=400, detail=f"Invalid JSON in request body: {str(e)}")
|
| 1330 |
+
except Exception as e:
|
| 1331 |
+
debug_print(f"β Failed to read request body: {e}")
|
| 1332 |
+
raise HTTPException(status_code=400, detail=f"Failed to read request body: {str(e)}")
|
| 1333 |
+
|
| 1334 |
debug_print(f"π₯ Request body keys: {list(body.keys())}")
|
| 1335 |
|
| 1336 |
+
# Validate required fields
|
| 1337 |
model_public_name = body.get("model")
|
| 1338 |
messages = body.get("messages", [])
|
| 1339 |
stream = body.get("stream", False)
|
| 1340 |
|
| 1341 |
debug_print(f"π Stream mode: {stream}")
|
|
|
|
| 1342 |
debug_print(f"π€ Requested model: {model_public_name}")
|
| 1343 |
debug_print(f"π¬ Number of messages: {len(messages)}")
|
| 1344 |
|
| 1345 |
+
if not model_public_name:
|
| 1346 |
+
debug_print("β Missing 'model' in request")
|
| 1347 |
+
raise HTTPException(status_code=400, detail="Missing 'model' in request body.")
|
| 1348 |
+
|
| 1349 |
+
if not messages:
|
| 1350 |
+
debug_print("β Missing 'messages' in request")
|
| 1351 |
+
raise HTTPException(status_code=400, detail="Missing 'messages' in request body.")
|
| 1352 |
+
|
| 1353 |
+
if not isinstance(messages, list):
|
| 1354 |
+
debug_print("β 'messages' must be an array")
|
| 1355 |
+
raise HTTPException(status_code=400, detail="'messages' must be an array.")
|
| 1356 |
+
|
| 1357 |
+
if len(messages) == 0:
|
| 1358 |
+
debug_print("β 'messages' array is empty")
|
| 1359 |
+
raise HTTPException(status_code=400, detail="'messages' array cannot be empty.")
|
| 1360 |
|
| 1361 |
# Find model ID from public name
|
| 1362 |
+
try:
|
| 1363 |
+
models = get_models()
|
| 1364 |
+
debug_print(f"π Total models loaded: {len(models)}")
|
| 1365 |
+
except Exception as e:
|
| 1366 |
+
debug_print(f"β Failed to load models: {e}")
|
| 1367 |
+
raise HTTPException(
|
| 1368 |
+
status_code=503,
|
| 1369 |
+
detail="Failed to load model list from LMArena. Please try again later."
|
| 1370 |
+
)
|
| 1371 |
|
| 1372 |
model_id = None
|
| 1373 |
model_org = None
|
| 1374 |
+
model_capabilities = {}
|
| 1375 |
+
|
| 1376 |
for m in models:
|
| 1377 |
if m.get("publicName") == model_public_name:
|
| 1378 |
model_id = m.get("id")
|
| 1379 |
model_org = m.get("organization")
|
| 1380 |
+
model_capabilities = m.get("capabilities", {})
|
| 1381 |
break
|
| 1382 |
|
| 1383 |
if not model_id:
|
|
|
|
| 1396 |
)
|
| 1397 |
|
| 1398 |
debug_print(f"β
Found model ID: {model_id}")
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1399 |
debug_print(f"π§ Model capabilities: {model_capabilities}")
|
| 1400 |
|
| 1401 |
# Log usage
|
| 1402 |
+
try:
|
| 1403 |
+
model_usage_stats[model_public_name] += 1
|
| 1404 |
+
# Save stats immediately after incrementing
|
| 1405 |
+
config = get_config()
|
| 1406 |
+
config["usage_stats"] = dict(model_usage_stats)
|
| 1407 |
+
save_config(config)
|
| 1408 |
+
except Exception as e:
|
| 1409 |
+
# Don't fail the request if usage logging fails
|
| 1410 |
+
debug_print(f"β οΈ Failed to log usage stats: {e}")
|
| 1411 |
|
| 1412 |
# Process last message content (may include images)
|
| 1413 |
+
try:
|
| 1414 |
+
last_message_content = messages[-1].get("content", "")
|
| 1415 |
+
prompt, experimental_attachments = await process_message_content(last_message_content, model_capabilities)
|
| 1416 |
+
except Exception as e:
|
| 1417 |
+
debug_print(f"β Failed to process message content: {e}")
|
| 1418 |
+
raise HTTPException(
|
| 1419 |
+
status_code=400,
|
| 1420 |
+
detail=f"Failed to process message content: {str(e)}"
|
| 1421 |
+
)
|
| 1422 |
|
| 1423 |
# Validate prompt
|
| 1424 |
if not prompt:
|