| print(""" |
| 1. **Project Description** |
| - Vision: Add a seamlessly integrated 'Vision' tab to Magentic-UI for observing agents' visual interactions (screenshots or VNC) in real time. |
| - Integration: The frontend `ChatView` will be updated to include tabs (Chat / Vision). The Vision tab will subscribe to visual data via WebSocket or display a VNC iframe. |
| - FastAPI Setup: |
| - Use existing `/api/ws/runs/{run_id}` for streaming screenshot events, or add `/api/runs/{run_id}/vision` to get VNC connection details. |
| |
| 2. **Tasks and Tests** |
| - Task 1 (Backend): Expose VNC/Vision info endpoint. |
| - Modify `src/magentic_ui/backend/web/routes/runs.py` using `replace_with_git_merge_diff` to add a `GET /runs/{run_id}/vision` route (exposed as `GET /api/runs/{run_id}/vision`) returning the VNC URL or stream status. |
| - Test: Add a unit test in `tests/` checking if the endpoint returns valid connection info. |
| - Task 2 (Backend WS): Broadcast screenshots. |
| - Update `WebSocketManager` in `src/magentic_ui/backend/web/managers/connection.py` using `replace_with_git_merge_diff` to relay `screenshot` type messages from agents like `FaraWebSurfer`. |
| - Test: Unit test the `WebSocketManager` to ensure `screenshot` messages are broadcast to connected clients. |
| - Task 3 (Frontend): Implement Tabs in Chat UI. |
| - Update `frontend/src/components/views/chat/chat.tsx` using `replace_with_git_merge_diff` to wrap the chat interface in an Ant Design `<Tabs>` component (Chat vs Vision). |
| - Test: Write a Playwright test ensuring the Tabs render and clicking 'Vision' switches the view. |
| - Task 4 (Frontend): Implement Vision Component. |
| - Create `frontend/src/components/views/chat/vision.tsx` using `write_file` to render an `<iframe>` for VNC or an `<img>` that updates when a `screenshot` WS message arrives. Verify the file contents using `read_file`. |
| - Test: Write a Playwright test simulating a `screenshot` WS message and verifying the image source updates. |
| - Task 5 (Test Verification): Run all tests. |
| - Use `run_in_bash_session` to run the unit tests (`pytest`) and the frontend Playwright tests (`npm test` or equivalent) and confirm there are no regressions. |
| - Task 6 (Pre Commit): Complete pre commit steps. |
| - Complete pre-commit steps to ensure proper testing, verification, review, and reflection are done. |
| - Task 7 (Submission): Submit code. |
| - Once all tests pass, submit the change. |
| |
| 3. **Functionality Expectations** |
| - User perspective: The user clicks the 'Vision' tab and sees what the agent sees (browser viewport or desktop), via VNC or continuously updated screenshots. Switching tabs doesn't interrupt the run. |
| - Technical perspective: Agents emit visual state. Backend routes it to the frontend via WS or provides a VNC endpoint. Frontend maintains connection regardless of active tab. |
| - Constraints: VNC requires Docker configuration exposing the noVNC port. Fallback to screenshots if VNC is unavailable. |
| |
| 4. **API Endpoints to be Exposed** |
| - `GET /api/runs/{run_id}/vision` |
| - Request: None |
| - Response: `{ "status": true, "vnc_url": "ws://localhost:5900", "has_vnc": true }` |
| - Auth: Inherits existing run access auth. |
| - WebSocket `/api/ws/runs/{run_id}` (Existing, modified) |
| - New message type from server: `{ "type": "screenshot", "data": "base64_encoded_image_string" }` |
| """) |
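The payload shapes from section 4 of the plan and the screenshot fallback from section 3 can be sketched with the standard library alone. This is a minimal illustration, not Magentic-UI code: `build_vision_info`, `build_screenshot_message`, and `choose_vision_mode` are hypothetical helper names, and returning `vnc_url` as `None` (JSON `null`) when VNC is unavailable is an assumption the plan does not specify.

```python
import base64
import json
from typing import Optional, TypedDict


class VisionInfo(TypedDict):
    """Assumed body of GET /api/runs/{run_id}/vision, following the plan's example."""
    status: bool
    vnc_url: Optional[str]
    has_vnc: bool


def build_vision_info(vnc_url: Optional[str]) -> VisionInfo:
    # Assumption: vnc_url is None when the container exposes no noVNC port.
    return {"status": True, "vnc_url": vnc_url, "has_vnc": vnc_url is not None}


def build_screenshot_message(image_bytes: bytes) -> str:
    # Encode the raw image bytes as base64 and wrap them in the plan's
    # `screenshot` WebSocket message shape.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({"type": "screenshot", "data": encoded})


def choose_vision_mode(info: VisionInfo) -> str:
    # Prefer VNC when available; otherwise fall back to streamed screenshots.
    return "vnc" if info["has_vnc"] else "screenshot"


if __name__ == "__main__":
    info = build_vision_info(None)
    print(choose_vision_mode(info))
    print(build_screenshot_message(b"example-bytes"))
```

Keeping the mode decision in one small function would let the endpoint and any client-side check share the same fallback rule, though the real implementation may place that logic in the frontend instead.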