ConnectionManager Broadcast Modification: The WebSocket ConnectionManager.broadcast and broadcast_to_agents methods iterated directly over self.active_connections and self.agent_connections. If a client disconnected while a send was being awaited, the collection changed size mid-iteration, which could skip clients or raise iteration errors.
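One common way to avoid this class of bug is to iterate over a snapshot of the collection. The sketch below is illustrative only: FakeSocket, send_text, and the disconnect helper are stand-ins invented for the example, and the real ConnectionManager may differ.

```python
import asyncio


class FakeSocket:
    """Stand-in for a WebSocket; `fail` simulates a client that has gone away."""

    def __init__(self, name, fail=False):
        self.name = name
        self.fail = fail
        self.received = []

    async def send_text(self, message):
        await asyncio.sleep(0)  # yield to the event loop, as a real send would
        if self.fail:
            raise ConnectionError(f"{self.name} disconnected")
        self.received.append(message)


class ConnectionManager:
    def __init__(self):
        self.active_connections = []

    def disconnect(self, ws):
        if ws in self.active_connections:
            self.active_connections.remove(ws)

    async def broadcast(self, message):
        # Iterate over a copy so that removals made while a send is
        # awaited cannot shift the list under the loop.
        for ws in list(self.active_connections):
            try:
                await ws.send_text(message)
            except Exception:
                # Drop the dead client and keep going with the others.
                self.disconnect(ws)


async def demo():
    manager = ConnectionManager()
    a, b, c = FakeSocket("a"), FakeSocket("b", fail=True), FakeSocket("c")
    manager.active_connections.extend([a, b, c])
    await manager.broadcast("hello")
    return manager, a, b, c
```

Copying the list costs one shallow copy per broadcast, which is usually negligible next to the network sends it protects.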
Global Locks: store_locks.sos_lock is used appropriately when appending to sos_events. However, lists such as detections_history have no explicit async lock around them. Because _upsert_detection_sighting is mostly synchronous, it is unlikely to corrupt the list outright, but where it is awaited inside other async handlers, updates from concurrent handlers can interleave.
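A minimal sketch of guarding a read-modify-write on a shared list with asyncio.Lock follows. The function name upsert_sighting, the entry shape, and the dedicated lock are assumptions made for illustration; they are not taken from the codebase.

```python
import asyncio


async def upsert_sighting(history, lock, name):
    """Insert or update an entry; the lock keeps the whole update atomic."""
    async with lock:
        for entry in history:
            if entry["name"] == name:
                await asyncio.sleep(0)  # stand-in for awaited work mid-update
                entry["count"] += 1
                return
        await asyncio.sleep(0)  # another yield point before the insert
        history.append({"name": name, "count": 1})


async def demo():
    history = []
    lock = asyncio.Lock()  # hypothetical lock dedicated to this list
    # Five concurrent upserts for the same name.
    await asyncio.gather(
        *(upsert_sighting(history, lock, "alice") for _ in range(5))
    )
    return history
```

Without the lock, each of the five tasks in demo would see an empty list before yielding and append its own entry; with it, the list ends up with a single entry whose count is 5.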
Memory Issues
Detections History Unbounded Growth: Most in-memory lists (such as alerts_db, sos_events, and agentic_plans) are bounded with _trim_memory_list to prevent memory leaks in the long-running process. However, detections_history currently grows without bound. When a new face is detected (or a unique name is generated), detections_history.append(entry) is called without trimming, which causes a slow memory leak over days or weeks of continuous operation.
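The sketch below shows the general shape of trimming after each append. The signature of _trim_memory_list is not shown in these notes, so trim_in_place and MAX_DETECTIONS are hypothetical stand-ins, and the cap is kept tiny for the demonstration.

```python
MAX_DETECTIONS = 3  # hypothetical cap, deliberately small for the example


def trim_in_place(items, max_len):
    """Drop the oldest entries so that at most max_len remain."""
    excess = len(items) - max_len
    if excess > 0:
        del items[:excess]


detections_history = []
for i in range(5):
    detections_history.append({"id": i})
    # Trim right after the append so the list never exceeds the cap.
    trim_in_place(detections_history, MAX_DETECTIONS)
```

Trimming in place (rather than rebinding the name to a new list) keeps any other references to detections_history pointing at the same object.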
Exception Handling
Heartbeat Silence: The _ws_send_heartbeat background task contains a try...except Exception: pass block. If send_personal_message raises, the exception is swallowed and the heartbeat task exits silently. Heartbeats then stop without warning, and the proxy may drop the idle connection.
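One possible alternative to the bare pass is to log the failure and notify the caller before the task stops. This is only a sketch: send_heartbeat, its parameters, and the on_failure callback are invented for the example and do not mirror the real _ws_send_heartbeat signature.

```python
import asyncio
import logging

logger = logging.getLogger("ws.heartbeat")


async def send_heartbeat(send, interval, on_failure):
    """Send periodic pings; log and report the first failure instead of hiding it."""
    while True:
        await asyncio.sleep(interval)
        try:
            await send("ping")
        except Exception:
            # Record the traceback so the stopped heartbeat is visible in logs.
            logger.exception("heartbeat send failed; stopping heartbeat")
            on_failure()  # e.g. let the owner clean up the connection
            return


async def demo():
    calls = []
    failures = []

    async def flaky_send(message):
        calls.append(message)
        if len(calls) == 3:
            raise ConnectionError("socket closed")

    await send_heartbeat(flaky_send, 0.001, lambda: failures.append("closed"))
    return calls, failures
```

The task still ends on the first failure, as before, but the failure now leaves a log entry and gives the owning code a chance to react.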
Agent Stream Errors: In agents_ws, exceptions from agent_step are caught, but the manager.send_personal_message call made while handling the error can itself fail if the websocket has already disconnected.
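A hedged sketch of making the error report best-effort is shown below. The helper run_step_and_report, the payload shape, and the callables passed in are assumptions for illustration, not the actual agents_ws code.

```python
import asyncio
import logging

logger = logging.getLogger("agents_ws")


async def run_step_and_report(agent_step, send):
    """Run one agent step; on failure, try to tell the client, but never
    let the report itself raise. Returns the step result, or None on failure."""
    try:
        return await agent_step()
    except Exception as exc:
        logger.exception("agent_step failed")
        try:
            # Hypothetical error payload; the real message format may differ.
            await send({"type": "error", "detail": str(exc)})
        except Exception:
            logger.warning("could not deliver error message; client likely gone")
        return None


async def demo(send):
    async def failing_step():
        raise ValueError("boom")

    return await run_step_and_report(failing_step, send)


async def dead_send(payload):
    # Simulates a websocket that has already disconnected.
    raise ConnectionError("websocket closed")
```

With the inner try, a disconnected client results in a warning in the log rather than a second, unhandled exception escaping the handler.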