# Plan: Integrate Browser Automation using Playwright

**Problem:** Direct API requests to Notion using `httpx` are failing, likely due to server-side checks (e.g., TLS fingerprinting).

**Solution:** Replace the direct `httpx` calls with browser automation using Playwright to mimic a real browser environment.

**Steps:**

1. **Add Dependency:**
    * Add `playwright` to the [`requirements.txt`](requirements.txt) file.
    * *Note:* After updating requirements, the browser binaries for Playwright will need to be installed (typically via `playwright install` in the terminal).
2. **Modify `stream_notion_response` Function ([`main.py:184`](main.py:184)):**
    * **Remove `httpx`:** Delete the `async with httpx.AsyncClient(...)` block ([`main.py:216-263`](main.py:216)). Keep the surrounding error handling for now.
    * **Initialize Playwright:** Add code to start Playwright, launch a Chromium browser instance, and create a new browser context.
    * **Set Cookie:** Add the `NOTION_COOKIE` ([`main.py:26`](main.py:26)) to the browser context.
    * **Create Page:** Open a new page within the context.
    * **Execute Request via JavaScript:** Use `page.evaluate()` to run JavaScript code within the browser page. This JavaScript code will:
        * Use the `fetch` API to make the POST request to [`NOTION_API_URL`](main.py:24).
        * Include the necessary headers (copied from the original [`headers`](main.py:186) dictionary).
        * Send the `notion_request_body` (serialized as JSON, similar to [`main.py:218`](main.py:218)).
        * Handle the streaming response (`response.body.getReader()`) from `fetch`.
        * Read chunks from the stream (`reader.read()`) and send them back to the Python environment (e.g., using `page.expose_function` to call a Python callback).
    * **Process Streamed Chunks in Python:** The Python callback function (exposed to JS) will receive the raw chunks from the browser. This callback will need to decode the chunks (likely UTF-8) and process the `ndjson` lines similarly to the original code ([`main.py:228-249`](main.py:228)), yielding the formatted SSE chunks.
    * **Handle End of Stream:** Ensure the `[DONE]` message is sent correctly after the browser stream finishes.
    * **Cleanup:** Close the page, context, and browser instance properly (initially on a per-request basis).
    * **Update Error Handling:** Adapt the `try...except` blocks to catch potential Playwright-specific errors.

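One detail the steps above leave open is how a callback invoked from the browser can feed an async generator, since a plain callback cannot `yield`. A minimal sketch, assuming an `asyncio.Queue` sits between the two, is shown here. `ChunkBridge` and `sse_events` are hypothetical names, the chunks are assumed to arrive as already-decoded text, and the pass-through `data:` formatting is only a placeholder for the real formatting at `main.py:228-249`.

```python
import asyncio
import json
from typing import AsyncIterator, Optional


class ChunkBridge:
    """Hypothetical helper: buffers text pushed by a callback and yields NDJSON lines."""

    def __init__(self) -> None:
        # None is used as the end-of-stream sentinel.
        self._queue: "asyncio.Queue[Optional[str]]" = asyncio.Queue()

    def on_chunk(self, text: str) -> None:
        # Would be called (via the exposed function) for each decoded chunk.
        self._queue.put_nowait(text)

    def on_done(self) -> None:
        # Signals that the browser-side stream has finished.
        self._queue.put_nowait(None)

    async def lines(self) -> AsyncIterator[str]:
        buffer = ""
        while True:
            chunk = await self._queue.get()
            if chunk is None:
                break
            buffer += chunk
            # A chunk may end mid-line, so keep the trailing partial line buffered.
            *complete, buffer = buffer.split("\n")
            for line in complete:
                if line.strip():
                    yield line
        if buffer.strip():
            yield buffer


async def sse_events(bridge: ChunkBridge) -> AsyncIterator[str]:
    async for line in bridge.lines():
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines instead of aborting the stream
        # Placeholder formatting: the real code would reshape `record` first.
        yield f"data: {json.dumps(record)}\n\n"
    yield "data: [DONE]\n\n"
```

With this shape, the generator keeps ownership of the SSE formatting and the `[DONE]` marker, while the callback only enqueues text.
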
**Diagram:**

```mermaid
graph TD
    A[FastAPI Request /v1/chat/completions] --> B{Stream?};
    B -- Yes --> C[Call stream_notion_response];
    B -- No --> D[Call stream_notion_response internally];

    subgraph SNR["stream_notion_response (Modified w/ Playwright)"]
        E[Build NotionRequestBody] --> F;
        F[Initialize Playwright & Launch Browser] --> G;
        G[Create Context & Add Cookie] --> H;
        H[Create Page & Expose Python Callback] --> I;
        I["page.evaluate(): JS Fetch POST to Notion"] --> J;
        J[JS: Read Stream Chunks] --> K;
        K[JS: Send Chunk to Python Callback] --> L;
        L[Python Callback: Decode & Process Chunk] --> M;
        M[Yield Formatted SSE Chunk] --> N{End of Stream?};
        N -- No --> J;
        N -- Yes --> O["Yield [DONE] Chunk"];
        O --> P["Cleanup Playwright (Page, Context, Browser)"];
    end

    C --> Q[Return StreamingResponse];
    D --> R[Collect Chunks from stream_notion_response];
    R --> S[Format Non-Streaming Response];
    S --> T[Return JSON Response];
    Q --> U[Client Receives SSE Stream];
    T --> U;

    style F fill:#f9f,stroke:#333,stroke-width:2px
    style G fill:#f9f,stroke:#333,stroke-width:2px
    style H fill:#f9f,stroke:#333,stroke-width:2px
    style I fill:#f9f,stroke:#333,stroke-width:2px
    style J fill:#f9f,stroke:#333,stroke-width:2px
    style K fill:#f9f,stroke:#333,stroke-width:2px
    style L fill:#ccf,stroke:#333,stroke-width:1px
    style M fill:#ccf,stroke:#333,stroke-width:1px
    style O fill:#ccf,stroke:#333,stroke-width:1px
    style P fill:#f9f,stroke:#333,stroke-width:2px
```

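The non-streaming branch of the diagram (collect the chunks, then format a single response) could be sketched roughly as follows. The chunk shape is an assumption: each SSE chunk is taken to carry an OpenAI-style payload with the text under `choices[0].delta.content`, which the plan does not confirm. `collect_content` is a hypothetical helper name.

```python
import json
from typing import AsyncIterator, List


async def collect_content(sse_chunks: AsyncIterator[str]) -> str:
    """Concatenate the text carried by a stream of SSE chunks (assumed payload shape)."""
    parts: List[str] = []
    async for chunk in sse_chunks:
        for line in chunk.splitlines():
            if not line.startswith("data: "):
                continue  # ignore blank separators and non-data fields
            payload = line[len("data: "):]
            if payload == "[DONE]":
                return "".join(parts)
            # Assumed layout: {"choices": [{"delta": {"content": "..."}}]}
            delta = json.loads(payload)["choices"][0].get("delta", {})
            parts.append(delta.get("content") or "")
    return "".join(parts)
```

The returned string would then be placed into whatever JSON body the non-streaming endpoint already produces.
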
**Agreed Choices:**

* Dependency: `playwright`
* Browser: Chromium
* Browser Lifecycle: Launch/Close per request (initial approach)
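
To make the launch-and-close-per-request choice concrete, here is a hedged sketch of the browser side using Playwright's async API. Several things are assumptions rather than part of the plan: `NOTION_COOKIE` is treated as a raw `Cookie` header string, the page first navigates to a caller-supplied `origin` so that `fetch` runs same-origin, chunks are decoded in the browser with `TextDecoder` so only text crosses into Python, and the names `fetch_via_browser`, `cookie_header_to_playwright`, and `pushChunk` are hypothetical.

```python
from typing import Any, Callable, Dict, List

# JavaScript run inside the page: POST via fetch, decode the stream, forward text.
# `pushChunk` is the (hypothetical) name under which the Python callback is exposed.
FETCH_SCRIPT = """
async ({ url, headers, body }) => {
  const response = await fetch(url, {
    method: "POST",
    headers,
    body: JSON.stringify(body),
    credentials: "include",
  });
  const reader = response.body.getReader();
  const decoder = new TextDecoder("utf-8");
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    await window.pushChunk(decoder.decode(value, { stream: true }));
  }
  return response.status;
}
""".strip()


def cookie_header_to_playwright(cookie_header: str, domain: str) -> List[Dict[str, str]]:
    """Split a 'name=value; name2=value2' header into dicts for context.add_cookies."""
    cookies = []
    for pair in cookie_header.split(";"):
        name, sep, value = pair.strip().partition("=")
        if sep and name:
            cookies.append({"name": name, "value": value, "domain": domain, "path": "/"})
    return cookies


async def fetch_via_browser(
    url: str,
    headers: Dict[str, str],
    body: Dict[str, Any],
    cookie_header: str,
    on_chunk: Callable[[str], None],
    origin: str,
    cookie_domain: str,
) -> int:
    # Imported lazily so the helpers above work without Playwright installed.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        try:
            context = await browser.new_context()
            await context.add_cookies(cookie_header_to_playwright(cookie_header, cookie_domain))
            page = await context.new_page()
            await page.expose_function("pushChunk", on_chunk)
            await page.goto(origin)  # assumption: makes the fetch same-origin
            return await page.evaluate(FETCH_SCRIPT, {"url": url, "headers": headers, "body": body})
        finally:
            # Closing the browser also closes its contexts and pages.
            await browser.close()
```

Launching Chromium on every request keeps the first version simple at the cost of start-up latency on each call, which is consistent with the plan labelling it an initial approach.
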