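Concurrency inside Iterations: The ConnectionManager iterates directly over its live connection lists (self.active_connections and self.agent_connections) to broadcast messages. This is a severe problem in an asynchronous framework like FastAPI. Each await ws.send_text(...) is a suspension point at which other coroutines can run, and one of them may call disconnect() and remove an entry from the list being iterated. A for loop over a Python list does not raise when the list shrinks underneath it; the iterator keeps its index while the remaining elements shift left, so one or more connections are silently skipped and never receive the broadcast. (The same pattern on a set or dict would raise RuntimeError instead.)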
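The effect can be reproduced in isolation. The sketch below is not the reviewed code: FakeSocket is a hypothetical stand-in for a WebSocket, the ConnectionManager is reduced to the active_connections list and disconnect() mentioned above, and the broadcast_unsafe / broadcast_safe names are illustrative only.

```python
import asyncio


class FakeSocket:
    """Hypothetical stand-in for a WebSocket; records what it is sent."""

    def __init__(self, name):
        self.name = name
        self.received = []

    async def send_text(self, message):
        await asyncio.sleep(0)  # suspension point, as a real send would be
        self.received.append(message)


class ConnectionManager:
    """Reduced manager: only the list and disconnect() from the review."""

    def __init__(self):
        self.active_connections = []

    def disconnect(self, ws):
        if ws in self.active_connections:
            self.active_connections.remove(ws)

    async def broadcast_unsafe(self, message):
        # Iterates the live list: a concurrent disconnect() shifts elements.
        for ws in self.active_connections:
            await ws.send_text(message)

    async def broadcast_safe(self, message):
        # Iterates a snapshot: later mutations cannot affect this loop.
        for ws in list(self.active_connections):
            await ws.send_text(message)


async def run(broadcast_name):
    manager = ConnectionManager()
    sockets = [FakeSocket(name) for name in "abc"]
    manager.active_connections.extend(sockets)

    async def drop_first():
        # Simulates another coroutine handling a disconnect mid-broadcast.
        manager.disconnect(sockets[0])

    dropper = asyncio.create_task(drop_first())
    await getattr(manager, broadcast_name)("ping")
    await dropper
    return [s.name for s in sockets if s.received]


def demo():
    return asyncio.run(run("broadcast_unsafe")), asyncio.run(run("broadcast_safe"))
```

Under these assumptions demo() returns (['a', 'c'], ['a', 'b', 'c']): the live-list loop skips b without raising, while the snapshot loop reaches every socket. The snapshot also still sends to the socket that was just disconnected, so a real broadcast would additionally need per-send error handling.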
Silent Heartbeat Drop: The _ws_send_heartbeat coroutine relies on a try-except block that suppresses every exception with pass, so a single failure silently ends the heartbeat loop for the rest of the connection's lifetime and leaves no trace in the logs. If the send_personal_message ping fails transiently, the client receives no further heartbeats, and the proxy in front of the service (Nginx or HF) will eventually terminate the connection as idle.
Task Cancellation Context: In websocket_endpoint, when a WebSocketDisconnect is caught or another exception happens, the heartbeat task is cancelled properly.
Agent Streams: The /ws/agents handler is resilient when decoding image data from Copilot, but the global AI context injection updates shared data structures without a lock or any other thread-safe or async-safe guard, which exposes it to race conditions under heavy load.
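One way to guard such updates between coroutines is an asyncio.Lock around the whole read-modify-write sequence. The sketch below is an assumption about the shape of the fix, not the handler's actual code: SharedContext, apply, and the "updates" counter are hypothetical names introduced for illustration.

```python
import asyncio


class SharedContext:
    """Hypothetical holder for the global AI context."""

    def __init__(self):
        # Created inside a running event loop (see main below).
        self._lock = asyncio.Lock()
        self.data = {"updates": 0}

    async def apply(self, patch):
        # The lock keeps the read-modify-write atomic with respect to other
        # coroutines, even though the body awaits in the middle.
        async with self._lock:
            current = dict(self.data)
            await asyncio.sleep(0)  # stands in for any await mid-update
            current.update(patch)
            current["updates"] += 1
            self.data = current


async def main():
    ctx = SharedContext()
    patches = ({f"agent_{i}": "frame"} for i in range(50))
    await asyncio.gather(*(ctx.apply(patch) for patch in patches))
    return ctx.data


def demo_context():
    return asyncio.run(main())
```

Without the lock, every coroutine in this sketch would copy the same initial state before any of them wrote back, and most updates would be lost. Note that asyncio.Lock only serializes coroutines on a single event loop; if the context is also touched from worker threads, a threading.Lock or a different design would be needed.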
Recommendations
Iterate over a snapshot such as list(self.active_connections), and likewise list(self.agent_connections), rather than the live list.
Refactor the heartbeat to catch send exceptions inside the while True loop so that a transient failure does not terminate it, or at least log the failure explicitly so the orchestrator is notified.
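A possible shape for the heartbeat recommendation is sketched below. It is not the existing _ws_send_heartbeat: send_heartbeat, its send parameter (any async callable, standing in for send_personal_message bound to one connection), the "ping" payload, and the interval value are all hypothetical. The max_failures cutoff is an extra assumption beyond the recommendation, added so that a permanently dead connection does not log forever.

```python
import asyncio
import logging

logger = logging.getLogger("ws.heartbeat")


async def send_heartbeat(send, interval=30.0, max_failures=3):
    """Ping until cancelled; log failed sends instead of swallowing them."""
    failures = 0
    while True:
        await asyncio.sleep(interval)
        try:
            await send("ping")
            failures = 0  # a successful send resets the failure count
        except asyncio.CancelledError:
            raise  # let task.cancel() from the endpoint stop the loop
        except Exception:
            failures += 1
            logger.warning(
                "heartbeat send failed (%d/%d)", failures, max_failures,
                exc_info=True,
            )
            if failures >= max_failures:
                logger.error("heartbeat giving up after repeated failures")
                return


def demo_heartbeat():
    """Usage example: one transient failure must not end the heartbeat."""

    async def scenario():
        calls = []

        async def flaky_send(payload):
            calls.append(payload)
            if len(calls) == 1:
                raise ConnectionError("transient failure")

        task = asyncio.create_task(send_heartbeat(flaky_send, interval=0))
        while len(calls) < 3:
            await asyncio.sleep(0)
        task.cancel()
        try:
            await task
        except asyncio.CancelledError:
            pass
        return len(calls), task.cancelled()

    return asyncio.run(scenario())
```

In the usage example the first send raises, the failure is logged, and later pings still go out until the task is cancelled. Catching the exception inside the loop rather than around it is what keeps the heartbeat alive, and re-raising CancelledError preserves the existing cancellation behaviour in websocket_endpoint.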