Andrew McCracken and Claude committed
Commit 6b0a701 · 1 Parent(s): efd4459

Configure uvicorn for concurrent request handling


Updated uvicorn settings for optimal concurrency:
- workers=1: Share model pool across all requests (can't share across processes)
- limit_concurrency=100: Handle up to 100 simultaneous connections
- timeout_keep_alive=120: Support long streaming responses
- backlog=2048: Queue pending connections
- loop='asyncio': Best async performance

With these settings + ModelPool, the app can:
- Accept 100 concurrent connections
- Process 10 simultaneous inferences (model pool size)
- Queue remaining requests gracefully
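The ModelPool itself is not part of this commit, but the behavior described above — at most 10 simultaneous inferences, with excess requests waiting rather than failing — can be sketched with an `asyncio.Queue` of pre-loaded model handles. This is a hypothetical illustration of the pattern, not the app's actual `ModelPool` implementation; the class name, pool size, and `infer` signature are assumptions:

```python
import asyncio


class ModelPool:
    """Hypothetical pool: N model instances shared within one worker process.

    Requests beyond the pool size await a free model on the queue instead of
    erroring, which matches the "queue remaining requests gracefully" note.
    """

    def __init__(self, size: int = 10):
        self._queue: asyncio.Queue = asyncio.Queue(maxsize=size)
        for i in range(size):
            # Stand-in token for a loaded model; real code would load weights here.
            self._queue.put_nowait(f"model-{i}")

    async def infer(self, prompt: str) -> str:
        model = await self._queue.get()    # blocks if all instances are busy
        try:
            await asyncio.sleep(0.01)      # stand-in for actual inference work
            return f"{model}: {prompt}"
        finally:
            self._queue.put_nowait(model)  # return the instance to the pool


async def main() -> None:
    pool = ModelPool(size=10)
    # 25 concurrent requests, but at most 10 inferences run at any moment.
    results = await asyncio.gather(*(pool.infer(f"req {n}") for n in range(25)))
    print(len(results))


asyncio.run(main())
```

Because uvicorn runs a single worker, one event loop owns the pool, which is why the commit notes the pool can be shared across all requests but not across processes.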

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

Files changed (1)
  1. main.py +11 -2
main.py CHANGED

```diff
@@ -564,10 +564,19 @@ async def serve_test_interface():
 if __name__ == "__main__":
     import uvicorn
 
-    uvicorn.run(
+    # Configure uvicorn for concurrent request handling
+    config = uvicorn.Config(
         app,
         host="0.0.0.0",
         port=8000,
         log_level="info",
-        access_log=True
+        access_log=True,
+        workers=1,  # Single worker to share model pool across all requests
+        limit_concurrency=100,  # Allow up to 100 concurrent connections
+        timeout_keep_alive=120,  # Keep connections alive for streaming
+        backlog=2048,  # Queue up to 2048 pending connections
+        loop="asyncio"  # Use asyncio event loop for best async performance
     )
+
+    server = uvicorn.Server(config)
+    server.run()
```