Andrew McCracken and Claude committed · Commit 6b0a701 · Parent(s): efd4459
Configure uvicorn for concurrent request handling
Updated uvicorn settings for optimal concurrency:
- workers=1: Share model pool across all requests (can't share across processes)
- limit_concurrency=100: Handle up to 100 simultaneous connections
- timeout_keep_alive=120: Support long streaming responses
- backlog=2048: Queue pending connections
- loop='asyncio': Best async performance
With these settings + ModelPool, the app can:
- Accept 100 concurrent connections
- Process 10 simultaneous inferences (model pool size)
- Queue remaining requests gracefully
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
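The commit message above describes the intended concurrency model: up to 100 accepted connections, but only 10 inferences running at once, with the rest queued. A minimal sketch of that pattern, assuming a semaphore-based pool (the names `ModelPool` and `infer` here are illustrative; the repository's actual ModelPool implementation may differ):

```python
# Hypothetical sketch of the described concurrency model: many accepted
# requests, a bounded number of simultaneous inferences.
import asyncio


class ModelPool:
    """Limit simultaneous inferences while accepting many connections."""

    def __init__(self, size: int = 10):
        # At most `size` coroutines may hold the semaphore at once.
        self._sem = asyncio.Semaphore(size)

    async def infer(self, prompt: str) -> str:
        async with self._sem:          # excess requests wait (queue) here
            await asyncio.sleep(0.01)  # stand-in for real model work
            return f"response to {prompt!r}"


async def main() -> list[str]:
    pool = ModelPool(size=10)
    # 100 concurrent "requests"; only 10 run inference at any moment.
    return await asyncio.gather(*(pool.infer(f"q{i}") for i in range(100)))


results = asyncio.run(main())
print(len(results))
```

Because the semaphore serializes access within one event loop, this only works if every request sees the same pool object, which is why the config below pins `workers=1`.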
main.py CHANGED

```diff
@@ -564,10 +564,19 @@ async def serve_test_interface():
 if __name__ == "__main__":
     import uvicorn
 
-    uvicorn.run(
+    # Configure uvicorn for concurrent request handling
+    config = uvicorn.Config(
         app,
         host="0.0.0.0",
         port=8000,
         log_level="info",
-        access_log=True
+        access_log=True,
+        workers=1,  # Single worker to share model pool across all requests
+        limit_concurrency=100,  # Allow up to 100 concurrent connections
+        timeout_keep_alive=120,  # Keep connections alive for streaming
+        backlog=2048,  # Queue up to 2048 pending connections
+        loop="asyncio",  # Use asyncio event loop for best async performance
     )
+
+    server = uvicorn.Server(config)
+    server.run()
```