clash-linux committed on
Commit cc93261 · verified · 1 Parent(s): 8e1aeaa

Upload 9 files

Files changed (9)
  1. .gitignore +32 -0
  2. Dockerfile +31 -0
  3. LICENSE +21 -0
  4. PLAN.md +75 -0
  5. README.md +127 -5
  6. docker-compose.yml +8 -0
  7. main.py +853 -0
  8. models.py +100 -0
  9. requirements.txt +6 -0
.gitignore ADDED
@@ -0,0 +1,32 @@
+ # Environment variables
+ .env
+
+ # Python artifacts
+ __pycache__/
+ *.pyc
+ *.pyo
+ *.pyd
+ .Python
+ build/
+ develop-eggs/
+ dist/
+ downloads/
+ eggs/
+ .eggs/
+ lib/
+ lib64/
+ parts/
+ sdist/
+ var/
+ wheels/
+ share/python-wheels/
+ *.egg-info/
+ .installed.cfg
+ *.egg
+ MANIFEST
+
+ # Virtual environment
+ .venv
+ venv/
+ ENV/
+ env/
Dockerfile ADDED
@@ -0,0 +1,31 @@
+ # Use an official Python runtime as a parent image
+ FROM python:3.10-slim
+
+ # Set the working directory in the container
+ WORKDIR /app
+
+ # Copy the requirements file into the container at /app
+ COPY requirements.txt .
+
+ # Install any needed packages specified in requirements.txt
+ # Use --no-cache-dir to reduce image size
+ # Use --upgrade to ensure latest versions are installed
+ RUN pip install --no-cache-dir --upgrade -r requirements.txt
+
+ # Install the Playwright Chromium browser binaries and their system dependencies
+ RUN playwright install chromium --with-deps
+
+ # Copy the application source files into the container at /app
+ COPY main.py .
+ COPY models.py .
+
+ # Document that the app listens on port 7860 (matches the uvicorn command below)
+ EXPOSE 7860
+
+ # Define environment variables (placeholders, will be set at runtime)
+ # main.py reads NOTION_COOKIES, so the placeholder uses that name
+ ENV NOTION_COOKIES=""
+ ENV NOTION_SPACE_ID=""
+
+ # Run uvicorn when the container launches, bound to 0.0.0.0 so it is reachable from outside
+ CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "7860"]
LICENSE ADDED
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2025 gzzhongqi
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
PLAN.md ADDED
@@ -0,0 +1,75 @@
+ # Plan: Integrate Browser Automation using Playwright
+
+ **Problem:** Direct API requests to Notion using `httpx` are failing, likely due to server-side checks (e.g., TLS fingerprinting).
+
+ **Solution:** Replace the direct `httpx` calls with browser automation using Playwright to mimic a real browser environment.
+
+ **Steps:**
+
+ 1. **Add Dependency:**
+     * Add `playwright` to the [`requirements.txt`](requirements.txt) file.
+     * *Note:* After updating requirements, the browser binaries for Playwright will need to be installed (typically via `playwright install` in the terminal).
+
+ 2. **Modify `stream_notion_response` Function ([`main.py:184`](main.py:184)):**
+     * **Remove `httpx`:** Delete the `async with httpx.AsyncClient(...)` block ([`main.py:216-263`](main.py:216)). Keep the surrounding error handling for now.
+     * **Initialize Playwright:** Add code to start Playwright, launch a Chromium browser instance, and create a new browser context.
+     * **Set Cookie:** Add the `NOTION_COOKIE` ([`main.py:26`](main.py:26)) to the browser context.
+     * **Create Page:** Open a new page within the context.
+     * **Execute Request via JavaScript:** Use `page.evaluate()` to run JavaScript code within the browser page. This JavaScript code will:
+         * Use the `fetch` API to make the POST request to [`NOTION_API_URL`](main.py:24).
+         * Include the necessary headers (copied from the original [`headers`](main.py:186) dictionary).
+         * Send the `notion_request_body` (serialized as JSON, similar to [`main.py:218`](main.py:218)).
+         * Handle the streaming response (`response.body.getReader()`) from `fetch`.
+         * Read chunks from the stream (`reader.read()`) and send them back to the Python environment (e.g., using `page.expose_function` to call a Python callback).
+     * **Process Streamed Chunks in Python:** The Python callback function (exposed to JS) will receive the raw chunks from the browser. This callback will need to decode the chunks (likely UTF-8) and process the `ndjson` lines similarly to the original code ([`main.py:228-249`](main.py:228)), yielding the formatted SSE chunks.
+     * **Handle End of Stream:** Ensure the `[DONE]` message is sent correctly after the browser stream finishes.
+     * **Cleanup:** Close the page, context, and browser instance properly (initially on a per-request basis).
+     * **Update Error Handling:** Adapt the `try...except` blocks to catch potential Playwright-specific errors.
+
+ **Diagram:**
+
+ ```mermaid
+ graph TD
+     A["FastAPI Request /v1/chat/completions"] --> B{Stream?};
+     B -- Yes --> C[Call stream_notion_response];
+     B -- No --> D[Call stream_notion_response internally];
+
+     subgraph SNR["stream_notion_response (Modified w/ Playwright)"]
+         E[Build NotionRequestBody] --> F;
+         F["Initialize Playwright & Launch Browser"] --> G;
+         G["Create Context & Add Cookie"] --> H;
+         H["Create Page & Expose Python Callback"] --> I;
+         I["page.evaluate(): JS Fetch POST to Notion"] --> J;
+         J["JS: Read Stream Chunks"] --> K;
+         K["JS: Send Chunk to Python Callback"] --> L;
+         L["Python Callback: Decode & Process Chunk"] --> M;
+         M[Yield Formatted SSE Chunk] --> N{End of Stream?};
+         N -- No --> J;
+         N -- Yes --> O["Yield [DONE] Chunk"];
+         O --> P["Cleanup Playwright (Page, Context, Browser)"];
+     end
+
+     C --> Q[Return StreamingResponse];
+     D --> R[Collect Chunks from stream_notion_response];
+     R --> S[Format Non-Streaming Response];
+     S --> T[Return JSON Response];
+     Q --> U[Client Receives SSE Stream];
+     T --> U;
+
+     style F fill:#f9f,stroke:#333,stroke-width:2px
+     style G fill:#f9f,stroke:#333,stroke-width:2px
+     style H fill:#f9f,stroke:#333,stroke-width:2px
+     style I fill:#f9f,stroke:#333,stroke-width:2px
+     style J fill:#f9f,stroke:#333,stroke-width:2px
+     style K fill:#f9f,stroke:#333,stroke-width:2px
+     style L fill:#ccf,stroke:#333,stroke-width:1px
+     style M fill:#ccf,stroke:#333,stroke-width:1px
+     style O fill:#ccf,stroke:#333,stroke-width:1px
+     style P fill:#f9f,stroke:#333,stroke-width:2px
+ ```
+
+ **Agreed Choices:**
+
+ * Dependency: `playwright`
+ * Browser: Chromium
+ * Browser Lifecycle: Launch/Close per request (initial approach)
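
The "Process Streamed Chunks in Python" step in PLAN.md hinges on buffering: the browser may hand over a chunk that ends in the middle of an `ndjson` line. A minimal sketch of that buffering follows. The function name `ndjson_chunks_to_sse` is hypothetical, and each JSON record is passed through unchanged as a placeholder; the actual `main.py` would reshape records into OpenAI-style chunks first.

```python
import json
from typing import Iterable, Iterator


def ndjson_chunks_to_sse(chunks: Iterable[bytes]) -> Iterator[str]:
    """Turn raw byte chunks of an NDJSON stream into SSE `data:` events."""
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        # Everything before the last newline is complete; the remainder stays buffered.
        *lines, buffer = buffer.split(b"\n")
        for line in lines:
            if line.strip():
                record = json.loads(line.decode("utf-8"))
                yield f"data: {json.dumps(record)}\n\n"
    if buffer.strip():
        # The stream ended without a trailing newline; flush the last record.
        record = json.loads(buffer.decode("utf-8"))
        yield f"data: {json.dumps(record)}\n\n"
    # Signal end of stream, as the plan's "Handle End of Stream" step requires.
    yield "data: [DONE]\n\n"


if __name__ == "__main__":
    # The second record is split across two chunks on purpose.
    for event in ndjson_chunks_to_sse([b'{"a": 1}\n{"b"', b': 2}\n']):
        print(event, end="")
```

Decoding happens per complete line rather than per chunk, so a multi-byte UTF-8 character split across two chunks is never decoded in halves.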
README.md CHANGED
@@ -1,10 +1,132 @@
  ---
- title: Notion2api
- emoji: 🌍
- colorFrom: gray
- colorTo: yellow
+ title: Notion API Bridge
+ emoji: 🌉
+ colorFrom: blue
+ colorTo: purple
  sdk: docker
+ app_port: 7860
  pinned: false
+ license: mit # Or choose another appropriate license if preferred
+ # Add any other relevant tags or configuration if needed
  ---
+ # OpenAI to Notion API Bridge
 
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ This project provides a FastAPI application that acts as a bridge between OpenAI-compatible API calls and the Notion API, allowing you to interact with Notion using standard OpenAI tools and libraries.
+
+ ## Environment Variables
+
+ The application uses the following environment variables:
+
+ * `NOTION_COOKIES`: One or more complete Notion cookie strings, separated by `|` when you supply several accounts (this is the name `main.py` reads). Each cookie authenticates one account with the Notion API. You can typically find the value in your browser's developer tools while logged into Notion.
+ * `NOTION_SPACE_ID`: The ID of the Notion Space you want the API to interact with (sent as the `x-notion-space-id` request header).
+ * `PROXY_AUTH_TOKEN` (Optional): The Bearer token clients must send to access the API endpoints. If not set, it defaults to `default_token`.
+ * `NOTION_ACTIVE_USER_HEADER` (Optional): If set, its value is sent as the `x-notion-active-user-header` header in requests to the Notion API. If not set or empty, the header is omitted.
+ * `PROXY_URL` (Optional): URL of a proxy server to route all network connections through. If not set or empty, no proxy is used. Supports both HTTP and SOCKS5 proxies. Examples:
+     * HTTP proxy: `http://proxy.example.com:8080`
+     * SOCKS5 proxy: `socks5://proxy.example.com:1080`
+
+ This proxy configuration affects all network connections made by the application through the Playwright browser automation.
+
+ ## Running Locally (without Docker)
+
+ 1. Ensure you have Python 3.10+ installed.
+ 2. Install dependencies, then the Playwright Chromium binaries:
+     ```bash
+     pip install -r requirements.txt && playwright install chromium
+     ```
+ 3. Create a `.env` file in the project root with your settings:
+     ```dotenv
+     NOTION_COOKIES="your_cookie_value_here"
+     NOTION_SPACE_ID="your_space_id_here"
+     # PROXY_AUTH_TOKEN="your_secure_token" # Optional, defaults to default_token
+     # NOTION_ACTIVE_USER_HEADER="your_user_id" # Optional
+     # PROXY_URL="http://proxy.example.com:8080" # Optional, for HTTP proxy
+     # PROXY_URL="socks5://proxy.example.com:1080" # Optional, for SOCKS5 proxy
+     ```
+ 4. Run the application using Uvicorn:
+     ```bash
+     uvicorn main:app --reload --port 7860
+     ```
+     The server will be available at `http://localhost:7860`. You will need to provide the correct token (either the default `default_token` or the one set in `.env`) via an `Authorization: Bearer <token>` header. The `NOTION_SPACE_ID` will be loaded from the `.env` file.
+
+ ## Running with Docker Compose (Recommended for Local Dev)
+
+ This method uses the `docker-compose.yml` file for a streamlined local development setup. It automatically builds the image if needed and loads environment variables directly from your `.env` file.
+
+ 1. Ensure you have Docker and Docker Compose installed.
+ 2. Make sure your `.env` file exists in the project root with your `NOTION_COOKIES`, `NOTION_SPACE_ID`, and optionally `PROXY_AUTH_TOKEN`, `NOTION_ACTIVE_USER_HEADER`, and `PROXY_URL`. If `PROXY_AUTH_TOKEN` is not in the `.env` file, the default `default_token` will be used. If `NOTION_ACTIVE_USER_HEADER` is not set or empty, the corresponding header will not be sent. If `PROXY_URL` is not set or empty, no proxy will be used.
+ 3. Run the following command in the project root:
+     ```bash
+     docker-compose up --build -d
+     ```
+     * `--build`: Builds the image before starting the container, picking up any changes to the `Dockerfile` or build context.
+     * `-d`: Runs the container in detached mode (in the background).
+ 4. The application will be accessible locally at `http://localhost:8139`. Environment variables like `NOTION_COOKIES` and `NOTION_SPACE_ID` will be loaded automatically from the `.env` file.
+
+ To stop the service, run:
+ ```bash
+ docker-compose down
+ ```
+
+ ## Running with Docker Command (Manual)
+
+ This method involves building and running the Docker container manually, passing environment variables directly in the command.
+
+ 1. **Build the Docker image:**
+     ```bash
+     docker build -t notion-api-bridge .
+     ```
+ 2. **Run the Docker container:**
+     Replace the placeholder values with your actual Notion cookie, space ID, and token.
+     ```bash
+     # Optional: add -e NOTION_ACTIVE_USER_HEADER="your_user_id" to set the active user header
+     # Optional: add -e PROXY_URL="http://proxy.example.com:8080" to route through a proxy
+     docker run -p 7860:7860 \
+       -e NOTION_COOKIES="your_cookie_value" \
+       -e NOTION_SPACE_ID="your_space_id" \
+       -e PROXY_AUTH_TOKEN="your_token" \
+       notion-api-bridge
+     ```
+     The server will be available at `http://localhost:7860` (or whichever host port you mapped to the container's 7860). You will need to use the token provided in the `-e PROXY_AUTH_TOKEN` flag via an `Authorization: Bearer <token>` header for authentication. The `NOTION_SPACE_ID` is passed directly via the `-e` flag.
+
+ ## Deploying to Hugging Face Spaces
+
+ This application is designed to be easily deployed as a Docker Space on Hugging Face.
+
+ 1. **Create a new Space:** Go to Hugging Face and create a new Space, selecting "Docker" as the Space SDK. Choose a name (e.g., `notion-api-bridge`).
+ 2. **Upload Files:** Upload the `Dockerfile`, `main.py`, `models.py`, and `requirements.txt` to your Space repository. You can do this via the web interface or by cloning the repository and pushing the files. **Do not upload your `.env` file.**
+ 3. **Add Secrets:** In your Space settings, navigate to the "Secrets" section. Add the following secrets:
+     * `NOTION_COOKIES`: Paste your complete Notion cookie string (separate several accounts with `|`).
+     * `NOTION_SPACE_ID`: Paste the ID of the target Notion Space.
+     * `PROXY_AUTH_TOKEN`: Paste the desired Bearer token for API authentication (e.g., a strong, generated token). If you omit this, the default `default_token` will be used.
+     * `NOTION_ACTIVE_USER_HEADER` (Optional): Paste the user ID to be sent in the `x-notion-active-user-header`. If omitted, the header will not be sent.
+     * `PROXY_URL` (Optional): Paste the proxy server URL if you need to route connections through a proxy. Supports both HTTP (e.g., `http://proxy.example.com:8080`) and SOCKS5 (e.g., `socks5://proxy.example.com:1080`) proxies.
+     Hugging Face will securely inject these secrets as environment variables into your running container.
+ 4. **Deployment:** Hugging Face Spaces will automatically build the Docker image from your `Dockerfile` and run the container. It routes traffic to port 7860 (as specified in the `Dockerfile` and the `app_port` metadata).
+ 5. **Accessing the API:** Once the Space is running, you can access the API endpoint at the Space's public URL, providing the token via an `Authorization: Bearer <token>` header. The token must match the `PROXY_AUTH_TOKEN` secret you set (or the default `default_token`). The `NOTION_SPACE_ID` will be used automatically based on the secret you configured.
+
+ **Example using `curl` (replace the token and URL):**
+ ```bash
+ # Example for Hugging Face Space: replace YOUR_HF_TOKEN with the value of the PROXY_AUTH_TOKEN secret.
+ # "notion_model" is required and names the Notion model to use, e.g. "openai-gpt-4.1".
+ curl -X POST https://your-username-your-space-name.hf.space/v1/chat/completions \
+   -H "Authorization: Bearer YOUR_HF_TOKEN" \
+   -H "Content-Type: application/json" \
+   -d '{
+     "model": "notion-model",
+     "messages": [{"role": "user", "content": "Summarize this document."}],
+     "stream": false,
+     "notion_model": "openai-gpt-4.1"
+   }'
+
+ # Example for Localhost (using default token 'default_token')
+ # If you set a different token in .env or via -e, use that instead; "notion_model" is required.
+ curl -X POST http://localhost:7860/v1/chat/completions \
+   -H "Authorization: Bearer default_token" \
+   -H "Content-Type: application/json" \
+   -d '{
+     "model": "notion-model",
+     "messages": [{"role": "user", "content": "What is the capital of France?"}],
+     "stream": true,
+     "notion_model": "anthropic-sonnet-4"
+   }'
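
The same request the README sends with `curl` can be built from Python using only the standard library. This is a sketch under assumptions: `build_chat_request` is a hypothetical helper, and the field names (`model`, `messages`, `stream`, `notion_model`) are taken from the README examples, not from `models.py`.

```python
import json
import urllib.request


def build_chat_request(base_url: str, token: str, notion_model: str,
                       prompt: str, stream: bool = False) -> urllib.request.Request:
    """Build (but do not send) a POST to the bridge's chat completions endpoint."""
    body = {
        "model": "notion-model",
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
        "notion_model": notion_model,  # the README marks this field as required
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_chat_request("http://localhost:7860", "default_token",
                         "openai-gpt-4.1", "What is the capital of France?")
print(req.get_method(), req.full_url)

# Sending the request needs a running bridge, so it is left commented out:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```

Building the body with `json.dumps` guarantees valid JSON, which a hand-written shell payload does not.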
docker-compose.yml ADDED
@@ -0,0 +1,8 @@
+ version: '3.8'
+ services:
+   notion-bridge:
+     build: .
+     ports:
+       - "8139:7860" # Map host port 8139 to container port 7860
+     env_file:
+       - .env
main.py ADDED
@@ -0,0 +1,853 @@
+ import asyncio
+ import sys  # For platform check
+ import logging
+ import os
+ import uuid
+ import json
+ import time
+ import random
+ import asyncio
+ import httpx  # For getSpaces API call
+ from contextlib import asynccontextmanager  # For lifespan
+ from playwright.async_api import async_playwright, Error as PlaywrightError
+ from fastapi import FastAPI, Request, HTTPException, Depends, status
+ from fastapi.security import HTTPBearer, HTTPAuthorizationCredentials
+ from fastapi.responses import StreamingResponse
+ from dotenv import load_dotenv
+ import secrets  # Added for secure comparison
+ from datetime import datetime, timedelta, timezone  # Explicit datetime imports
+ from zoneinfo import ZoneInfo  # For timezone handling
+ from typing import List, Optional  # Add List and Optional for typing
+ from models import (
+     ChatMessage, ChatCompletionRequest, NotionTranscriptConfigValue,
+     NotionTranscriptContextValue, NotionTranscriptItem, NotionDebugOverrides,
+     NotionRequestBody, ChoiceDelta, Choice, ChatCompletionChunk, Model, ModelList
+ )
+
+ # Load environment variables from .env file
+ load_dotenv()
+
+ # --- Logging Configuration ---
+ logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
+
+ # --- Event Loop Policy for Windows ---
+ # For Playwright compatibility, especially with subprocesses on Windows
+ if sys.platform == "win32":
+     asyncio.set_event_loop_policy(asyncio.WindowsProactorEventLoopPolicy())
+     logging.info("Set WindowsProactorEventLoopPolicy for asyncio.")
+
+ # --- Account Management Class ---
+ class NotionAccount:
+     """Represents a single Notion account with its cookie and fetched IDs."""
+     def __init__(self, cookie: str):
+         self.cookie = cookie
+         self.space_id: Optional[str] = None
+         self.user_id: Optional[str] = None
+         self.is_healthy: bool = False
+         self.lock = asyncio.Lock()  # To prevent concurrent fetches for the same account
+
+     def __str__(self):
+         return f"Account(user_id={self.user_id}, space_id={self.space_id}, healthy={self.is_healthy})"
+
+ # --- Configuration ---
+ NOTION_API_URL = "https://www.notion.so/api/v3/runInferenceTranscript"
+ # IMPORTANT: Load multiple Notion cookies securely from environment variables
+ # Use a unique separator like '|' because cookies can contain ';' and ','
+ NOTION_COOKIES_RAW = os.getenv("NOTION_COOKIES")
+
+ # --- Global State for Account Polling ---
+ ACCOUNTS: List[NotionAccount] = []
+ CURRENT_ACCOUNT_INDEX = 0  # Index for round-robin
+
+ if not NOTION_COOKIES_RAW:
+     # This is a critical error, app cannot function without it.
+     logging.error("CRITICAL: NOTION_COOKIES environment variable not set. Application will not work.")
+     # In a real app, you might exit or raise a more severe startup error.
+
+ # --- Proxy Configuration ---
+ PROXY_URL = os.getenv("PROXY_URL", "")  # Empty string as default
+
+ # --- Authentication ---
+ EXPECTED_TOKEN = os.getenv("PROXY_AUTH_TOKEN", "default_token")  # Default token
+ security = HTTPBearer()
+
+ def authenticate(credentials: HTTPAuthorizationCredentials = Depends(security)):
+     """Compares provided token with the expected token."""
+     correct_token = secrets.compare_digest(credentials.credentials, EXPECTED_TOKEN)
+     if not correct_token:
+         raise HTTPException(
+             status_code=status.HTTP_401_UNAUTHORIZED,
+             detail="Invalid authentication credentials",
+             # WWW-Authenticate header removed for Bearer
+         )
+     return True  # Indicate successful authentication
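
One edge case in `authenticate` may be worth a note: `secrets.compare_digest` accepts two `str` arguments only when both are ASCII, and raises `TypeError` otherwise. A minimal sketch of a variant that compares UTF-8 bytes instead, so an odd token is rejected rather than surfacing as a server error; `token_matches` is a hypothetical helper, not part of this commit.

```python
import secrets


def token_matches(provided: str, expected: str) -> bool:
    """Constant-time comparison that also tolerates non-ASCII input."""
    # Encoding to bytes sidesteps the ASCII-only restriction on str arguments.
    return secrets.compare_digest(provided.encode("utf-8"), expected.encode("utf-8"))


print(token_matches("default_token", "default_token"))
print(token_matches("tökén", "default_token"))
```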
+
+ # --- Notion Account Management ---
+
+ def get_next_account() -> NotionAccount:
+     """
+     Selects the next healthy Notion account using a round-robin strategy.
+     The function is synchronous and contains no await points, so the index update
+     cannot interleave with other coroutines on the single-threaded asyncio event loop.
+     """
+     global CURRENT_ACCOUNT_INDEX
+
+     # This check runs on every request to ensure we don't try to select from an empty list
+     # in case all accounts failed during startup.
+     if not ACCOUNTS:
+         raise HTTPException(
+             status_code=status.HTTP_503_SERVICE_UNAVAILABLE,
+             detail="No Notion accounts are configured in the server."
+         )
+
+     healthy_accounts = [acc for acc in ACCOUNTS if acc.is_healthy]
+     if not healthy_accounts:
+         raise HTTPException(
+             status_code=status.HTTP_503_SERVICE_UNAVAILABLE,
+             detail="No healthy Notion accounts available to process the request."
+         )
+
+     # Round-robin logic: iterate through all accounts to find the next healthy one
+     # This ensures that we can recover if an account becomes healthy again later.
+     # A lock is not strictly necessary in asyncio for a simple index increment,
+     # but could be added for thread-safety if using a threaded server.
+     start_index = CURRENT_ACCOUNT_INDEX
+     while True:
+         account = ACCOUNTS[CURRENT_ACCOUNT_INDEX]
+         CURRENT_ACCOUNT_INDEX = (CURRENT_ACCOUNT_INDEX + 1) % len(ACCOUNTS)
+         if account.is_healthy:
+             logging.info(f"Selected Notion account with User ID: {account.user_id} for request.")
+             return account
+         # This check prevents an infinite loop if no accounts are healthy,
+         # although the initial check for healthy_accounts should prevent this.
+         if CURRENT_ACCOUNT_INDEX == start_index:
+             # This part should theoretically not be reached.
+             raise HTTPException(
+                 status_code=status.HTTP_503_SERVICE_UNAVAILABLE,
+                 detail="Critical error in account selection: No healthy accounts found in rotation."
+             )
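
The rotation in `get_next_account` can be read as a pure function of the account list and a starting index. The sketch below restates it that way so it is testable without FastAPI; `Account` and `pick_next` are hypothetical stand-ins, and returning `None` replaces the HTTP 503 that the real function raises.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Account:  # hypothetical stand-in for NotionAccount
    name: str
    is_healthy: bool


def pick_next(accounts: List[Account], start: int) -> Tuple[Optional[Account], int]:
    """Return the first healthy account at or after `start`, plus the next start index."""
    for offset in range(len(accounts)):
        index = (start + offset) % len(accounts)
        if accounts[index].is_healthy:
            # Resume after the chosen account so the load rotates.
            return accounts[index], (index + 1) % len(accounts)
    # No healthy account (or an empty list): leave the index untouched.
    return None, start


accounts = [Account("a", True), Account("b", False), Account("c", True)]
cursor = 0
for _ in range(3):
    chosen, cursor = pick_next(accounts, cursor)
    print(chosen.name if chosen else None)
```

Bounding the scan with `range(len(accounts))` makes termination explicit, so no separate guard against an endless loop is needed.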
+
+ # --- FastAPI App ---
+
+ async def fetch_and_set_notion_ids(account: NotionAccount):
+     """Fetches space ID and user ID for a given Notion account and marks it as healthy on success."""
+     async with account.lock:  # Ensure only one fetch operation per account at a time
+         if account.is_healthy:  # Don't re-fetch if already healthy
+             logging.info(f"Account for user {account.user_id} is already healthy, skipping fetch.")
+             return
+
+         if not account.cookie:
+             logging.error("Cannot fetch Notion IDs: Account cookie is not set.")
+             account.is_healthy = False
+             return
+
+         get_spaces_url = "https://www.notion.so/api/v3/getSpaces"
+         # Headers for the JS fetch call (cookie is handled by context)
+         js_fetch_headers = {
+             'Content-Type': 'application/json',
+             'accept': '*/*',
+             'accept-language': 'en-US,en;q=0.9',
+             'notion-audit-log-platform': 'web',
+             'notion-client-version': '23.13.0.3686',  # Match cURL example or use a recent one
+             'origin': 'https://www.notion.so',
+             'priority': 'u=1, i',
+             'referer': 'https://www.notion.so/',  # Simplified
+             'sec-ch-ua': '"Chromium";v="136", "Google Chrome";v="136", "Not.A/Brand";v="99"',
+             'sec-ch-ua-mobile': '?0',
+             'sec-ch-ua-platform': '"Windows"',
+             'sec-fetch-dest': 'empty',
+             'sec-fetch-mode': 'cors',
+             'sec-fetch-site': 'same-origin',
+             'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/136.0.0.0 Safari/537.36'
+         }
+
+         browser = None
+         context = None
+         page = None
+
+         try:
+             logging.info("Attempting to fetch Notion user/space IDs for an account...")
+             async with async_playwright() as p:
+                 # Configure browser launch with proxy if PROXY_URL is set
+                 launch_args = {
+                     'headless': True,
+                     'args': ['--no-sandbox', '--disable-setuid-sandbox']
+                 }
+                 if PROXY_URL:
+                     launch_args['proxy'] = {'server': PROXY_URL}
+                     logging.info(f"Using proxy for browser launch: {PROXY_URL}")
+
+                 try:
+                     browser = await p.chromium.launch(**launch_args)
+                 except PlaywrightError as e:
+                     if PROXY_URL and "proxy" in str(e).lower():
+                         logging.error(f"Invalid proxy URL or proxy connection failed: {PROXY_URL}. Error: {e}")
+                         raise PlaywrightError(f"Proxy configuration error: {e}") from e
+                     else:
+                         raise
+
+                 context = await browser.new_context(user_agent=js_fetch_headers['user-agent'])
+
+                 # Add cookies from the account's cookie string
+                 cookies_to_add = []
+                 cookie_pairs = account.cookie.split('; ')
+                 for pair in cookie_pairs:
+                     if '=' in pair:
+                         name, value = pair.split('=', 1)
+                         cookies_to_add.append({
+                             'name': name.strip(), 'value': value.strip(),
+                             'domain': '.notion.so', 'path': '/',
+                             'secure': True, 'httpOnly': True, 'sameSite': 'Lax'
+                         })
+                 if cookies_to_add:
+                     await context.add_cookies(cookies_to_add)
+                 else:
+                     logging.error("No valid cookies parsed from account's cookie for getSpaces.")
+                     account.is_healthy = False
+                     return
+
+                 page = await context.new_page()
+                 logging.info("DEBUG: getSpaces - Navigating to notion.so to warm up context...")
+                 try:
+                     await page.goto("https://www.notion.so/", wait_until="domcontentloaded", timeout=15000)  # 15s timeout
+                     logging.info("DEBUG: getSpaces - Warm-up navigation to notion.so complete.")
+                 except PlaywrightError as nav_err:
+                     logging.warning(f"DEBUG: getSpaces - Warm-up navigation to notion.so failed: {nav_err}. Proceeding with fetch anyway.")
+
+                 # JavaScript to perform the fetch for getSpaces
+                 javascript_code_get_spaces = """
+                 async (args) => {
+                     const { apiUrl, headers, body } = args;
+                     try {
+                         const response = await fetch(apiUrl, {
+                             method: 'POST',
+                             headers: headers,
+                             body: JSON.stringify(body) // Ensure body is stringified
+                         });
+                         if (!response.ok) {
+                             console.error('getSpaces Fetch error:', response.status, await response.text());
+                             return { success: false, error: `HTTP ${response.status}` };
+                         }
+                         const data = await response.json();
+                         return { success: true, data: data };
+                     } catch (error) {
+                         console.error('getSpaces JS Exception:', error);
+                         return { success: false, error: error.toString() };
+                     }
+                 }
+                 """
+                 js_args = {"apiUrl": get_spaces_url, "headers": js_fetch_headers, "body": {}}  # Empty JSON body for getSpaces
+
+                 logging.info("Executing Playwright page.evaluate for getSpaces...")
+                 result = await page.evaluate(javascript_code_get_spaces, js_args)
+
+                 if not result or not result.get('success'):
+                     error_detail = (result or {}).get('error', 'Unknown error during getSpaces JS execution')
+                     logging.error(f"Playwright getSpaces call failed for account: {error_detail}")
+                     account.is_healthy = False
+                     return
+
+                 data = result.get('data')
+                 if not data:
+                     logging.error("No data returned from successful getSpaces call for account.")
+                     account.is_healthy = False
+                     return
+
+                 # Extract user ID
+                 user_id_key = next(iter(data), None)
+                 if not user_id_key:
+                     logging.error("Could not extract user ID from getSpaces response for account.")
+                     account.is_healthy = False
+                     return
+                 account.user_id = user_id_key
+                 logging.info(f"Fetched Notion User ID: {account.user_id}")
+
+                 # Extract space ID
+                 user_root = data.get(user_id_key, {}).get("user_root", {}).get(user_id_key, {})
+                 space_view_pointers = user_root.get("value", {}).get("value", {}).get("space_view_pointers", [])
+                 if space_view_pointers and isinstance(space_view_pointers, list) and len(space_view_pointers) > 0:
+                     account.space_id = space_view_pointers[0].get("spaceId")
+                     if account.space_id:
+                         logging.info(f"Fetched Notion Space ID: {account.space_id} for User ID: {account.user_id}")
+                         account.is_healthy = True  # Mark as healthy on complete success
+                     else:
+                         logging.error(f"Could not extract spaceId for User ID: {account.user_id}")
+                         account.is_healthy = False
+                 else:
+                     logging.error(f"Could not find space_view_pointers or spaceId for User ID: {account.user_id}")
+                     account.is_healthy = False
+
+         except PlaywrightError as e:
+             logging.error(f"Playwright error during fetch_and_set_notion_ids for account: {e}")
+             account.is_healthy = False
+         except Exception as e:
+             logging.error(f"General error during fetch_and_set_notion_ids for account: {e}")
+             account.is_healthy = False
+         finally:
+             # Prioritize closing the browser, which should handle its contexts/pages.
+             # Add checks to prevent errors if already closed.
+             if browser and browser.is_connected():
+                 try:
+                     logging.info("DEBUG: fetch_and_set_notion_ids - Closing browser...")
+                     await browser.close()
+                     logging.info("DEBUG: fetch_and_set_notion_ids - Browser closed.")
+                 except PlaywrightError as e:
+                     logging.warning(f"DEBUG: fetch_and_set_notion_ids - Ignoring error during browser close: {e}")
+                 except Exception as e:  # Catch potential unexpected errors during close
+                     logging.warning(f"DEBUG: fetch_and_set_notion_ids - Ignoring unexpected error during browser close: {e}")
+             else:
+                 # If browser is None or not connected, page/context are likely also invalid or already handled.
+                 logging.info("DEBUG: fetch_and_set_notion_ids - Browser already closed or not initialized.")
+
+         logging.info(f"fetch_and_set_notion_ids completed for account. Final status: {account}")
303
+
304
+
305
+ @asynccontextmanager
306
+ async def lifespan(app: FastAPI):
307
+ # On startup
308
+ logging.info("Application startup: Initializing Notion accounts...")
309
+
310
+ if NOTION_COOKIES_RAW:
311
+ # Split cookies by a unique separator, e.g., '|'
312
+ cookie_list = [c.strip() for c in NOTION_COOKIES_RAW.split('|') if c.strip()]
313
+ for cookie in cookie_list:
314
+ ACCOUNTS.append(NotionAccount(cookie=cookie))
315
+ logging.info(f"Loaded {len(ACCOUNTS)} Notion account(s) from environment variable.")
316
+
317
+ if not ACCOUNTS:
318
+ logging.error("CRITICAL: No Notion accounts loaded. The application will not be able to process requests.")
319
+ else:
320
+ # Concurrently fetch IDs for all accounts
321
+ logging.info("Fetching IDs for all loaded Notion accounts...")
322
+ fetch_tasks = [fetch_and_set_notion_ids(acc) for acc in ACCOUNTS]
323
+ await asyncio.gather(*fetch_tasks)
324
+
325
+ healthy_count = sum(1 for acc in ACCOUNTS if acc.is_healthy)
326
+ logging.info(f"Initialization complete. {healthy_count} of {len(ACCOUNTS)} accounts are healthy.")
327
+
328
+ if healthy_count == 0:
329
+ logging.error("CRITICAL: No healthy Notion accounts available after initialization.")
330
+
331
+ yield
332
+ # On shutdown (if any cleanup needed)
333
+ logging.info("Application shutdown.")
334
+
335
+ app = FastAPI(lifespan=lifespan)
336
+
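The startup sequence above (split the `|`-separated cookie string, create one account per cookie, check all accounts concurrently, count the healthy ones) can be sketched in isolation as follows. This is a minimal sketch, not the code from `main.py`: `Account`, `parse_cookie_env`, `check_account`, and `init_accounts` are hypothetical names, and `check_account` is a placeholder for the real Playwright-based fetch. The original `fetch_and_set_notion_ids` catches its own exceptions; `return_exceptions=True` is shown only as an extra safeguard for checks that might raise.

```python
import asyncio
from dataclasses import dataclass
from typing import List


@dataclass
class Account:
    """Hypothetical stand-in for NotionAccount; only the fields used here."""
    cookie: str
    is_healthy: bool = False


def parse_cookie_env(raw: str) -> List[str]:
    """Split a '|'-separated cookie string, dropping empty entries."""
    return [part.strip() for part in raw.split("|") if part.strip()]


async def check_account(account: Account) -> None:
    """Placeholder health check (an assumption, not the real Playwright fetch)."""
    await asyncio.sleep(0)
    if "bad" in account.cookie:
        raise RuntimeError("simulated failure")
    account.is_healthy = True


async def init_accounts(raw: str) -> List[Account]:
    accounts = [Account(cookie=c) for c in parse_cookie_env(raw)]
    # return_exceptions=True keeps one failing check from aborting the others.
    results = await asyncio.gather(
        *(check_account(a) for a in accounts), return_exceptions=True
    )
    for account, result in zip(accounts, results):
        if isinstance(result, Exception):
            account.is_healthy = False
    return accounts
```

Calling `asyncio.run(init_accounts(raw_value))` returns every account with its `is_healthy` flag set, so the caller can count the healthy ones the same way the lifespan handler does.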
337
+ # --- Helper Functions ---
338
+
339
+ def build_notion_request(request_data: ChatCompletionRequest, account: NotionAccount) -> NotionRequestBody:
340
+ """Transforms OpenAI-style messages to Notion transcript format, using the provided account."""
341
+
342
+ # --- Timestamp and User ID Logic ---
343
+ # Use the user ID from the selected account
344
+ user_id = account.user_id
345
+ # Get all non-assistant messages to assign timestamps
346
+ non_assistant_messages = [msg for msg in request_data.messages if msg.role != "assistant"]
347
+ num_non_assistant_messages = len(non_assistant_messages)
348
+ message_timestamps = {} # Store timestamps keyed by message id
349
+
350
+ if num_non_assistant_messages > 0:
351
+ # Get current time specifically in Pacific Time (America/Los_Angeles)
352
+ pacific_tz = ZoneInfo("America/Los_Angeles")
353
+ now_pacific = datetime.now(timezone.utc).astimezone(pacific_tz)
354
+
355
+ # Assign timestamp to the last non-assistant message
356
+ last_msg_id = non_assistant_messages[-1].id
357
+ message_timestamps[last_msg_id] = now_pacific
358
+
359
+ # Calculate timestamps for previous non-assistant messages (random intervals earlier)
360
+ current_timestamp = now_pacific
361
+ for i in range(num_non_assistant_messages - 2, -1, -1): # Iterate backwards from second-to-last
362
+ current_timestamp -= timedelta(minutes=random.randint(3, 20)) # Use random interval (3-20 mins)
363
+ message_timestamps[non_assistant_messages[i].id] = current_timestamp
364
+
365
+ # --- Build Transcript ---
366
+ # Get current time in Pacific timezone for context
367
+ pacific_tz = ZoneInfo("America/Los_Angeles")
368
+ now_pacific = datetime.now(timezone.utc).astimezone(pacific_tz)
369
+ # Format timestamp exactly as YYYY-MM-DDTHH:MM:SS.fff-HH:MM
370
+ dt_str = now_pacific.strftime("%Y-%m-%dT%H:%M:%S")
371
+ ms = f"{now_pacific.microsecond // 1000:03d}" # Ensure 3 digits for milliseconds
372
+ tz_str = now_pacific.strftime("%z") # Gets +HHMM or -HHMM
373
+ formatted_tz = f"{tz_str[:-2]}:{tz_str[-2:]}" # Insert colon
374
+ current_datetime_iso = f"{dt_str}.{ms}{formatted_tz}"
375
+
376
+ # Generate random text for userName and spaceName
377
+ random_words = ["Project", "Workspace", "Team", "Studio", "Lab", "Hub", "Zone", "Space"]
378
+ user_name = f"User{random.randint(100, 999)}"
379
+ space_name = f"{random.choice(random_words)} {random.randint(1, 99)}"
380
+
381
+ transcript = [
382
+ NotionTranscriptItem(
383
+ type="config",
384
+ value=NotionTranscriptConfigValue(model=request_data.notion_model)
385
+ ),
386
+ NotionTranscriptItem(
387
+ type="context",
388
+ value=NotionTranscriptContextValue(
389
+ userId=user_id or "", # Use the user_id from the selected account
390
+ spaceId=account.space_id, # Use space_id from the selected account
391
+ surface="home_module",
392
+ timezone="America/Los_Angeles",
393
+ userName=user_name,
394
+ spaceName=space_name,
395
+ spaceViewId=str(uuid.uuid4()), # Random UUID for spaceViewId
396
+ currentDatetime=current_datetime_iso
397
+ )
398
+ ),
399
+ NotionTranscriptItem(
400
+ type="agent-integration"
401
+ # No value field needed for agent-integration
402
+ )
403
+ ]
404
+
405
+ for message in request_data.messages:
406
+ if message.role == "assistant":
407
+ # Assistant messages get type="markdown-chat" and a traceId
408
+ transcript.append(NotionTranscriptItem(
409
+ type="markdown-chat",
410
+ value=message.content,
411
+ traceId=str(uuid.uuid4()) # Generate unique traceId for assistant message
412
+ ))
413
+ else: # Treat all other roles (user, system, etc.) as "user" type
414
+ created_at_dt = message_timestamps.get(message.id) # Use the unified timestamp dict
415
+ created_at_iso = None
416
+ if created_at_dt:
417
+ # Format timestamp exactly as YYYY-MM-DDTHH:MM:SS.fff-HH:MM
418
+ dt_str = created_at_dt.strftime("%Y-%m-%dT%H:%M:%S")
419
+ ms = f"{created_at_dt.microsecond // 1000:03d}" # Ensure 3 digits for milliseconds
420
+ tz_str = created_at_dt.strftime("%z") # Gets +HHMM or -HHMM
421
+ formatted_tz = f"{tz_str[:-2]}:{tz_str[-2:]}" # Insert colon
422
+ created_at_iso = f"{dt_str}.{ms}{formatted_tz}"
423
+
424
+ content = message.content
425
+ # Ensure content is treated as a string for user/system messages
426
+ if isinstance(content, list):
427
+ # Attempt to extract text from list format, default to empty string
428
+ text_content = ""
429
+ for part in content:
430
+ if isinstance(part, dict) and part.get("type") == "text":
431
+ text_part = part.get("text")
432
+ if isinstance(text_part, str):
433
+ text_content += text_part # Concatenate text parts if needed
434
+ content = text_content if text_content else "" # Use extracted text or empty string
435
+ elif not isinstance(content, str):
436
+ content = "" # Default to empty string if not list or string
437
+
438
+ # Format value as expected by Notion for user type: [[content_string]]
439
+ notion_value = [[content]] if content else [[""]]
440
+
441
+ transcript.append(NotionTranscriptItem(
442
+ type="user", # Set type to "user" for non-assistant roles
443
+ value=notion_value,
444
+ userId=user_id, # Assign userId
445
+ createdAt=created_at_iso # Assign timestamp
446
+ # No traceId for user/system messages
447
+ ))
448
+
449
+ # Use spaceId from the selected account, set createThread=True
450
+ return NotionRequestBody(
451
+ spaceId=account.space_id, # From selected account
452
+ transcript=transcript,
453
+ createThread=True, # Always create a new thread
454
+ # Generate a new traceId for each request
455
+ traceId=str(uuid.uuid4()),
456
+ # Explicitly set debugOverrides, generateTitle, and saveAllThreadOperations
457
+ debugOverrides=NotionDebugOverrides(
458
+ cachedInferences={},
459
+ annotationInferences={},
460
+ emitInferences=False
461
+ ),
462
+ generateTitle=False,
463
+ saveAllThreadOperations=False
464
+ )
465
+
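`build_notion_request` formats timestamps as `YYYY-MM-DDTHH:MM:SS.fff-HH:MM` in two places by assembling the string by hand. As a hedged alternative, the standard library's `datetime.isoformat(timespec="milliseconds")` should produce the same shape for timezone-aware values with whole-minute offsets. The helper name `format_iso_ms` is hypothetical and does not appear in the commit.

```python
from datetime import datetime, timedelta, timezone


def format_iso_ms(dt: datetime) -> str:
    """Format an aware datetime as YYYY-MM-DDTHH:MM:SS.fff+HH:MM (or -HH:MM)."""
    if dt.tzinfo is None:
        # The manual version relies on strftime("%z"), which is empty for
        # naive datetimes, so reject them explicitly here.
        raise ValueError("expected a timezone-aware datetime")
    # timespec="milliseconds" truncates microseconds to three digits and
    # always emits the fractional part, even when it is zero.
    return dt.isoformat(timespec="milliseconds")


# Example with a fixed -07:00 offset (an arbitrary illustrative value).
example = datetime(2025, 5, 1, 12, 30, 45, 123456,
                   tzinfo=timezone(timedelta(hours=-7)))
print(format_iso_ms(example))
```

A single helper like this would let both the context item and the per-message `createdAt` values share one formatting path.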
466
+ # --- Background Playwright Task ---
467
+ async def _run_playwright_fetch(
468
+ chunk_queue: asyncio.Queue,
469
+ notion_request_body: NotionRequestBody,
470
+ headers_template: dict,
471
+ notion_api_url: str,
472
+ account: NotionAccount # Pass the whole account object
473
+ ):
474
+ """Runs Playwright fetch in the background, putting results into a queue."""
475
+ browser = None
476
+ context = None
477
+ page = None
478
+
479
+ # Construct headers for this specific task run
480
+ current_headers = headers_template.copy()
481
+ current_headers['x-notion-space-id'] = account.space_id # Use fetched space_id
482
+ if account.user_id: # Use fetched user_id for active user header
483
+ current_headers['x-notion-active-user-header'] = account.user_id
484
+
485
+ # 'cookie' is handled by context.add_cookies(), so it's not in current_headers for fetch
486
+
487
+ async def handle_chunk(chunk_str: str):
488
+ await chunk_queue.put(chunk_str)
489
+
490
+ async def handle_stream_end():
491
+ await chunk_queue.put(None)
492
+
493
+ try:
494
+ logging.info("DEBUG: Background task starting Playwright.")
495
+ async with async_playwright() as p:
496
+ # Configure browser launch with proxy if PROXY_URL is set
497
+ launch_args = {
498
+ 'headless': True,
499
+ 'args': ['--no-sandbox', '--disable-setuid-sandbox']
500
+ }
501
+ if PROXY_URL:
502
+ launch_args['proxy'] = {'server': PROXY_URL}
503
+ logging.info(f"DEBUG: Background task using proxy: {PROXY_URL}")
504
+
505
+ try:
506
+ browser = await p.chromium.launch(**launch_args)
507
+ except PlaywrightError as e:
508
+ if PROXY_URL and "proxy" in str(e).lower():
509
+ logging.error(f"Invalid proxy URL or proxy connection failed: {PROXY_URL}. Error: {e}")
510
+ await handle_stream_end() # Signal end of stream
511
+ raise PlaywrightError(f"Proxy configuration error: {e}")
512
+ else:
513
+ raise
514
+
515
+ logging.info("DEBUG: Background task browser launched.")
516
+ # Get user-agent from the constructed headers for this task
517
+ user_agent_for_context = current_headers.get('user-agent', 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/136.0.0.0 Safari/537.36') # Default if not in template
518
+ context = await browser.new_context(user_agent=user_agent_for_context)
519
+ logging.info("DEBUG: Background task context created.")
520
+
521
+ if account.cookie: # Use passed account cookie
522
+ cookies_to_add = []
523
+ cookie_pairs = account.cookie.split(';')  # Names and values are stripped below, so "a=1;b=2" and "a=1; b=2" both work
524
+ for pair in cookie_pairs:
525
+ if '=' in pair:
526
+ name, value = pair.split('=', 1)
527
+ cookies_to_add.append({
528
+ 'name': name.strip(), 'value': value.strip(),
529
+ 'domain': '.notion.so', 'path': '/',
530
+ 'secure': True, 'httpOnly': True, 'sameSite': 'Lax'
531
+ })
532
+ if cookies_to_add:
533
+ await context.add_cookies(cookies_to_add)
534
+ logging.info("DEBUG: Background task cookies added.")
535
+ else:
536
+ logging.warning("Warning: No valid cookies found in account cookie for background task.")
537
+ else:
538
+ logging.error("Error: Account cookie is empty for background task.")
539
+ raise ValueError("Server configuration error: Notion cookie not set for background task.")
540
+
541
+ page = await context.new_page()
542
+ logging.info("DEBUG: Background task page created.")
543
+ await page.goto("https://www.notion.so/chat", wait_until="domcontentloaded")
544
+ logging.info("DEBUG: Background task navigation complete.")
545
+
546
+ await page.expose_function("sendChunkToPython", handle_chunk)
547
+ await page.expose_function("signalStreamEnd", handle_stream_end)
548
+ logging.info("DEBUG: Background task functions exposed.")
549
+
550
+ request_body_json_str = notion_request_body.json()
551
+
552
+ # Prepare headers for JS fetch (cookie is handled by context)
553
+ js_fetch_headers = current_headers.copy()
554
+ if 'cookie' in js_fetch_headers: # Should not be there if template is correct
555
+ del js_fetch_headers['cookie']
556
+
557
+
558
+ javascript_code = """
559
+ async (args) => {
560
+ const { apiUrl, headers, body } = args;
561
+ try {
562
+ const response = await fetch(apiUrl, {
563
+ method: 'POST',
564
+ headers: headers,
565
+ body: body
566
+ });
567
+ if (!response.ok) {
568
+ const errorText = await response.text();
569
+ console.error('JS Fetch error:', response.status, errorText);
570
+ await window.signalStreamEnd();
571
+ return { success: false, status: response.status, error: errorText };
572
+ }
573
+ if (!response.body) {
574
+ console.error('JS Response body is null');
575
+ await window.signalStreamEnd();
576
+ return { success: false, error: 'Response body is null' };
577
+ }
578
+ const reader = response.body.getReader();
579
+ const decoder = new TextDecoder();
580
+ while (true) {
581
+ const { done, value } = await reader.read();
582
+ if (done) break;
583
+ await window.sendChunkToPython(decoder.decode(value, { stream: true }));
584
+ }
585
+ await window.signalStreamEnd();
586
+ return { success: true };
587
+ } catch (error) {
588
+ console.error('JS Exception during fetch:', error);
589
+ await window.signalStreamEnd();
590
+ return { success: false, error: error.toString() };
591
+ }
592
+ }
593
+ """
594
+ js_args = {"apiUrl": notion_api_url, "headers": js_fetch_headers, "body": request_body_json_str}
595
+ logging.info("DEBUG: Background task executing page.evaluate()...")
596
+ js_result = await page.evaluate(javascript_code, js_args)
597
+ logging.info(f"DEBUG: Background task page.evaluate() result: {js_result}")
598
+
599
+ if not js_result or not js_result.get('success'):
600
+ error_detail = (js_result or {}).get('error', 'Unknown JS execution error')  # js_result may be None
601
+ logging.error(f"Error in background task JS execution: {error_detail}")
602
+ # Error already signaled to queue by JS calling signalStreamEnd
603
+ # Re-raise to be caught by the task's main try/except
604
+ raise PlaywrightError(f"JS Fetch Error: {error_detail}")
605
+
606
+ except Exception as e:
607
+ logging.error(f"Error in _run_playwright_fetch background task: {e}")
608
+ await chunk_queue.put(None) # Ensure queue is terminated on error
609
+ raise  # Re-raise so playwright_task.exception() in the main generator can see the failure
610
+ finally:
611
+ logging.info("DEBUG: Background task _run_playwright_fetch attempting to close browser.")
612
+ if browser and browser.is_connected():
613
+ try:
614
+ await browser.close()
615
+ logging.info("DEBUG: Background task browser closed.")
616
+ except Exception as e:
617
+ logging.warning(f"Ignoring error during background task browser close: {e}")
618
+ else:
619
+ logging.info("DEBUG: Background task browser already closed or not initialized.")
620
+
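The cookie handling inside `_run_playwright_fetch` turns a raw `Cookie` header string into the list of dictionaries that the browser context's `add_cookies` call receives. A standalone sketch of that conversion is shown below; `parse_cookie_header` is a hypothetical helper name, and the fixed attributes (`.notion.so` domain, `secure`, `httpOnly`, `sameSite`) simply mirror the values used in the commit. Splitting on `;` and stripping each part is an assumption made here so that headers without a space after the semicolon are also handled.

```python
from typing import Dict, List


def parse_cookie_header(raw: str, domain: str = ".notion.so") -> List[Dict[str, object]]:
    """Convert 'a=1; b=2' into Playwright-style cookie dictionaries."""
    cookies: List[Dict[str, object]] = []
    for pair in raw.split(";"):
        if "=" not in pair:
            continue  # Skip empty or malformed fragments
        # Split only on the first '=' so values may themselves contain '='.
        name, value = pair.split("=", 1)
        name = name.strip()
        if not name:
            continue
        cookies.append({
            "name": name, "value": value.strip(),
            "domain": domain, "path": "/",
            "secure": True, "httpOnly": True, "sameSite": "Lax",
        })
    return cookies
```

The returned list can be passed to the context as-is, and an empty list signals that the account cookie contained no usable pairs.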
621
+ # --- Main Generator Called by Endpoint ---
622
+ async def stream_notion_response(notion_request_body: NotionRequestBody, account: NotionAccount):
623
+ """Creates background task for Playwright and yields results from queue."""
624
+ chunk_queue = asyncio.Queue()
625
+ playwright_task = None
626
+
627
+ # These should be defined once per request stream
628
+ chunk_id = f"chatcmpl-{uuid.uuid4()}"
629
+ created_time = int(time.time())
630
+
631
+ # Define the template for headers here, to be passed to the background task
632
+ # The background task will then add/override specific headers like x-notion-space-id
633
+ # It also sets x-notion-active-user-header from account.user_id (no environment lookup is involved)
634
+ headers_template = {
635
+ 'accept': 'application/x-ndjson',
636
+ 'accept-language': 'en-US,en;q=0.9',
637
+ 'content-type': 'application/json',
638
+ 'notion-audit-log-platform': 'web',
639
+ 'notion-client-version': '23.13.0.3668',
640
+ 'origin': 'https://www.notion.so',
641
+ 'priority': 'u=1, i',
642
+ 'referer': 'https://www.notion.so/chat',
643
+ 'sec-ch-ua': '"Chromium";v="136", "Google Chrome";v="136", "Not.A/Brand";v="99"',
644
+ 'sec-ch-ua-mobile': '?0',
645
+ 'sec-ch-ua-platform': '"Windows"',
646
+ 'sec-fetch-dest': 'empty',
647
+ 'sec-fetch-mode': 'cors',
648
+ 'sec-fetch-site': 'same-origin',
649
+ 'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/136.0.0.0 Safari/537.36',
650
+ # 'x-notion-space-id' is added by _run_playwright_fetch; cookies are set on the browser context,
651
+ # both taken from the account object that is passed in
652
+ }
653
+
654
+ try:
655
+ # Health check is now performed by get_next_account() before this function is called.
656
+ logging.info("DEBUG: Main generator starting Playwright background task.")
657
+ playwright_task = asyncio.create_task(
658
+ _run_playwright_fetch(
659
+ chunk_queue,
660
+ notion_request_body,
661
+ headers_template,
662
+ NOTION_API_URL, # Global constant
663
+ account # Pass the selected account
664
+ )
665
+ )
666
+
667
+ accumulated_line = ""
668
+ logging.info("DEBUG: Main generator starting queue processing loop.")
669
+ while True:
670
+ chunk = await chunk_queue.get() # Wait for a chunk from the background task
671
+ if chunk is None:
672
+ logging.info("DEBUG: Main generator received None sentinel from queue.")
673
+ break
674
+
675
+ accumulated_line += chunk
676
+ while '\n' in accumulated_line:
677
+ line, accumulated_line = accumulated_line.split('\n', 1)
678
+ if not line.strip():
679
+ continue
680
+ try:
681
+ data = json.loads(line)
682
+ if data.get("type") == "markdown-chat" and isinstance(data.get("value"), str):
683
+ content_chunk = data["value"]
684
+ if content_chunk:
685
+ sse_chunk = ChatCompletionChunk(
686
+ id=chunk_id, created=created_time,
687
+ choices=[Choice(delta=ChoiceDelta(content=content_chunk))]
688
+ )
689
+ logging.info(f"DEBUG: Main generator yielding chunk: {content_chunk[:50]}...")
690
+ yield f"data: {sse_chunk.json()}\n\n"
691
+ # No asyncio.sleep(0) here, as yielding should be enough
692
+ elif "recordMap" in data:
693
+ logging.info("DEBUG: Main generator detected recordMap, ignoring.")
694
+ except json.JSONDecodeError:
695
+ logging.warning(f"Warning: Main generator could not decode JSON line: {line}")
696
+ except Exception as e:
697
+ logging.error(f"Error processing line in main generator: {line} - {e}")
698
+
699
+ # Process any final accumulated data after None sentinel
700
+ if accumulated_line.strip():
701
+ try:
702
+ data = json.loads(accumulated_line)
703
+ if data.get("type") == "markdown-chat" and isinstance(data.get("value"), str):
704
+ content_chunk = data["value"]
705
+ if content_chunk:
706
+ sse_chunk = ChatCompletionChunk(
707
+ id=chunk_id, created=created_time,
708
+ choices=[Choice(delta=ChoiceDelta(content=content_chunk))]
709
+ )
710
+ logging.info(f"DEBUG: Main generator yielding final accumulated chunk: {content_chunk[:50]}...")
711
+ yield f"data: {sse_chunk.json()}\n\n"
712
+ except json.JSONDecodeError:
713
+ logging.warning(f"Warning: Main generator could not decode final JSON line: {accumulated_line}")
714
+ except Exception as e:
715
+ logging.error(f"Error processing final line in main generator: {accumulated_line} - {e}")
716
+
717
+ # After loop, check if the background task raised an exception
718
+ if playwright_task.done() and playwright_task.exception():
719
+ task_exception = playwright_task.exception()
720
+ logging.error(f"Playwright background task failed: {task_exception}")
721
+ raise HTTPException(status_code=500, detail=f"Error during background browser automation: {task_exception}")
722
+ else:
723
+ logging.info("DEBUG: Main generator background task completed successfully.")
724
+ final_chunk = ChatCompletionChunk(
725
+ id=chunk_id, created=created_time,
726
+ choices=[Choice(delta=ChoiceDelta(), finish_reason="stop")]
727
+ )
728
+ logging.info("DEBUG: Main generator yielding final stop chunk.")
729
+ yield f"data: {final_chunk.json()}\n\n"
730
+ logging.info("DEBUG: Main generator yielding [DONE] marker.")
731
+ yield "data: [DONE]\n\n"
732
+
733
+ except Exception as e:
734
+ logging.error(f"Error in main stream_notion_response generator: {e}")
735
+ if playwright_task and not playwright_task.done():
736
+ logging.info("DEBUG: Main generator cancelling background task due to its own error.")
737
+ playwright_task.cancel()
738
+ raise
739
+ finally:
740
+ logging.info("DEBUG: Main generator finished.")
741
+ if playwright_task and not playwright_task.done():
742
+ logging.info("DEBUG: Main generator ensuring background task is cancelled on exit.")
743
+ playwright_task.cancel()
744
+ try:
745
+ await playwright_task # Allow cancellation to propagate
746
+ except asyncio.CancelledError:
747
+ logging.info("DEBUG: Background task successfully cancelled.")
748
+ except Exception as e:
749
+ logging.error(f"DEBUG: Error during background task cancellation/await: {e}")
750
+
751
+
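`stream_notion_response` consumes text chunks from a queue until a `None` sentinel arrives, and reassembles newline-delimited JSON records that may be split across chunk boundaries. The sketch below isolates that pattern under stated assumptions: `iter_ndjson`, `_parse`, and `demo` are hypothetical names, and the sample records are made up for illustration rather than captured from Notion.

```python
import asyncio
import json
from typing import AsyncIterator, List, Optional


def _parse(line: str) -> Optional[dict]:
    """Return the decoded JSON object, or None if the line is not valid JSON."""
    try:
        return json.loads(line)
    except json.JSONDecodeError:
        return None


async def iter_ndjson(queue: "asyncio.Queue[Optional[str]]") -> AsyncIterator[dict]:
    """Yield one dict per NDJSON line; a None item on the queue ends the stream."""
    buffer = ""
    while True:
        chunk = await queue.get()
        if chunk is None:
            break
        buffer += chunk
        # A chunk may contain zero, one, or several complete lines.
        while "\n" in buffer:
            line, buffer = buffer.split("\n", 1)
            record = _parse(line) if line.strip() else None
            if record is not None:
                yield record
    # Flush a final record that was not newline-terminated.
    record = _parse(buffer) if buffer.strip() else None
    if record is not None:
        yield record


async def demo() -> List[dict]:
    queue: "asyncio.Queue[Optional[str]]" = asyncio.Queue()
    pieces = ['{"type": "markdown-ch', 'at", "value": "Hi"}\n{"rec',
              'ordMap": {}}\n', '{"type": "markdown-chat", "value": "!"}', None]
    for piece in pieces:
        queue.put_nowait(piece)
    return [item async for item in iter_ndjson(queue)]
```

Keeping the buffering in its own generator would also remove the duplicated parsing branch that handles the leftover data after the sentinel.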
752
+ # --- API Endpoint ---
753
+
754
+ @app.get("/v1/models", response_model=ModelList)
755
+ async def list_models(authenticated: bool = Depends(authenticate)):
756
+ """
757
+ Endpoint to list available Notion models, mimicking OpenAI's /v1/models.
758
+ """
759
+ available_models = [
760
+ "openai-gpt-4.1",
761
+ "anthropic-opus-4",
762
+ "anthropic-sonnet-4"
763
+ ]
764
+ model_list = [
765
+ Model(id=model_id, owned_by="notion") # created uses default_factory
766
+ for model_id in available_models
767
+ ]
768
+ return ModelList(data=model_list)
769
+
770
+ @app.post("/v1/chat/completions")
771
+ async def chat_completions(request_data: ChatCompletionRequest, request: Request, authenticated: bool = Depends(authenticate)):
772
+ """
773
+ Endpoint to mimic OpenAI's chat completions, proxying to Notion.
774
+ It uses round-robin to select a healthy Notion account for each request.
775
+ """
776
+ account = get_next_account() # Select a healthy account
777
+
778
+ notion_request_body = build_notion_request(request_data, account)
779
+
780
+ if request_data.stream:
781
+ # Call the Playwright generator, passing the selected account
782
+ return StreamingResponse(
783
+ stream_notion_response(notion_request_body, account),
784
+ media_type="text/event-stream"
785
+ )
786
+ else:
787
+ # --- Non-Streaming Logic (Optional - Collects stream internally) ---
788
+ # Note: The primary goal is streaming, but a non-streaming version
789
+ # might be useful for testing or simpler clients.
790
+ # This requires collecting all chunks from the async generator.
791
+ full_response_content = ""
792
+ final_finish_reason = None
793
+ chunk_id = f"chatcmpl-{uuid.uuid4()}" # Generate ID for the non-streamed response
794
+ created_time = int(time.time())
795
+
796
+ # --- Non-streaming logic needs to call the generator with the selected account ---
797
+ try:
798
+ # Call the Playwright generator, passing the selected account
799
+ async for line in stream_notion_response(notion_request_body, account):
800
+ if line.startswith("data: ") and line.strip() != "data: [DONE]":  # Match only the terminator, not content that contains "[DONE]"
801
+ try:
802
+ data_json = line[len("data: "):].strip()
803
+ if data_json:
804
+ chunk_data = json.loads(data_json)
805
+ if chunk_data.get("choices"):
806
+ delta = chunk_data["choices"][0].get("delta", {})
807
+ content = delta.get("content")
808
+ if content:
809
+ full_response_content += content
810
+ finish_reason = chunk_data["choices"][0].get("finish_reason")
811
+ if finish_reason:
812
+ final_finish_reason = finish_reason
813
+ except json.JSONDecodeError:
814
+ logging.warning(f"Could not decode JSON line in non-streaming mode: {line}")
815
+
816
+ # Construct the final OpenAI-compatible non-streaming response
817
+ return {
818
+ "id": chunk_id,
819
+ "object": "chat.completion",
820
+ "created": created_time,
821
+ "model": request_data.model, # Return the model requested by the client
822
+ "choices": [
823
+ {
824
+ "index": 0,
825
+ "message": {
826
+ "role": "assistant",
827
+ "content": full_response_content,
828
+ },
829
+ "finish_reason": final_finish_reason or "stop", # Default to stop if not explicitly set
830
+ }
831
+ ],
832
+ "usage": { # Note: Token usage is not available from Notion
833
+ "prompt_tokens": None,
834
+ "completion_tokens": None,
835
+ "total_tokens": None,
836
+ },
837
+ }
838
+ except HTTPException as e:
839
+ # Re-raise HTTP exceptions from the streaming function
840
+ raise e
841
+ except Exception as e:
842
+ logging.error(f"Error during non-streaming processing: {e}")
843
+ raise HTTPException(status_code=500, detail="Internal server error processing Notion response")
844
+
845
+
846
+ # --- Uvicorn Runner ---
847
+ # Allows running with `python main.py` for simple testing,
848
+ # but `uvicorn main:app --reload` is recommended for development.
849
+ if __name__ == "__main__":
850
+ import uvicorn
851
+ print("Starting server. Access at http://127.0.0.1:7860")
852
+ print("Ensure NOTION_COOKIES is set in your .env file or environment, separated by '|'.")
853
+ uvicorn.run(app, host="127.0.0.1", port=7860)
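The non-streaming branch of `chat_completions` works by draining the same SSE generator and concatenating the `delta.content` pieces. A compact sketch of that aggregation step follows; `collect_sse` is a hypothetical helper name, and the sample lines are illustrative. It compares the payload against `[DONE]` exactly, on the assumption that model output may itself contain that literal text.

```python
import json
from typing import Iterable, Optional, Tuple


def collect_sse(lines: Iterable[str]) -> Tuple[str, Optional[str]]:
    """Join delta.content from OpenAI-style SSE lines; return (text, finish_reason)."""
    parts = []
    finish_reason = None
    for line in lines:
        if not line.startswith("data: "):
            continue
        payload = line[len("data: "):].strip()
        # Only the bare terminator is skipped, not content mentioning "[DONE]".
        if not payload or payload == "[DONE]":
            continue
        try:
            chunk = json.loads(payload)
        except json.JSONDecodeError:
            continue
        for choice in chunk.get("choices") or []:
            content = (choice.get("delta") or {}).get("content")
            if content:
                parts.append(content)
            finish_reason = choice.get("finish_reason") or finish_reason
    return "".join(parts), finish_reason


sample = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}\n\n',
    'data: {"choices": [{"delta": {"content": "lo"}}]}\n\n',
    'data: {"choices": [{"delta": {}, "finish_reason": "stop"}]}\n\n',
    'data: [DONE]\n\n',
]
print(collect_sse(sample))
```

The returned pair maps directly onto the `message.content` and `finish_reason` fields of the final `chat.completion` object.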
models.py ADDED
@@ -0,0 +1,100 @@
1
+ import time
2
+ import uuid
3
+ from pydantic import BaseModel, Field
4
+ from typing import List, Optional, Dict, Any, Literal, Union
5
+
6
+ # --- Models Moved from main.py ---
7
+
8
+ # Input Models (OpenAI-like)
9
+ class ChatMessage(BaseModel):
10
+ id: uuid.UUID = Field(default_factory=uuid.uuid4)
11
+ role: Literal["system", "user", "assistant"]
12
+ content: Union[str, List[Dict[str, Any]]]
13
+ userId: Optional[str] = None # Added for user messages
14
+ createdAt: Optional[str] = None # Added for timestamping
15
+ traceId: Optional[str] = None # Added for assistant messages
16
+
17
+ class ChatCompletionRequest(BaseModel):
18
+ messages: List[ChatMessage]
19
+ model: str = "notion-proxy" # Model name can be passed, but we map to Notion's model
20
+ stream: bool = False
21
+ # Add other potential OpenAI params if needed, though they might not map directly
22
+ # max_tokens: Optional[int] = None
23
+ # temperature: Optional[float] = None
24
+ # space_id is fetched per account at startup, and a new thread is created for every request
25
+ notion_model: str = "anthropic-opus-4" # Default Notion model, can be overridden
26
+
27
+
28
+ # Notion Models
29
+ class NotionTranscriptConfigValue(BaseModel):
30
+ type: str = "markdown-chat"
31
+ model: str # e.g., "anthropic-opus-4"
32
+
33
+ class NotionTranscriptContextValue(BaseModel):
34
+ userId: str
35
+ spaceId: str
36
+ surface: str = "home_module"
37
+ timezone: str = "America/Los_Angeles"
38
+ userName: str
39
+ spaceName: str
40
+ spaceViewId: str
41
+ currentDatetime: str
42
+
43
+ class NotionTranscriptItem(BaseModel):
44
+ id: uuid.UUID = Field(default_factory=uuid.uuid4)
45
+ type: Literal["config", "user", "markdown-chat", "agent-integration", "context"]
46
+ value: Optional[Union[List[List[str]], str, NotionTranscriptConfigValue, NotionTranscriptContextValue]] = None
47
+ userId: Optional[str] = None # Added for user messages in Notion transcript
48
+ createdAt: Optional[str] = None # Added for timestamping in Notion transcript
49
+ traceId: Optional[str] = None # Added for assistant messages in Notion transcript
50
+
51
+ class NotionDebugOverrides(BaseModel):
52
+ cachedInferences: Dict = Field(default_factory=dict)
53
+ annotationInferences: Dict = Field(default_factory=dict)
54
+ emitInferences: bool = False
55
+
56
+ class NotionRequestBody(BaseModel):
57
+ traceId: str = Field(default_factory=lambda: str(uuid.uuid4()))
58
+ spaceId: str
59
+ transcript: List[NotionTranscriptItem]
60
+ # threadId is removed, createThread will be set to true
61
+ createThread: bool = True
62
+ debugOverrides: NotionDebugOverrides = Field(default_factory=NotionDebugOverrides)
63
+ generateTitle: bool = False
64
+ saveAllThreadOperations: bool = True
65
+
66
+ class Config:
67
+ # Ensure UUIDs are serialized as strings in the final JSON request
68
+ json_encoders = {
69
+ uuid.UUID: str
70
+ }
71
+
72
+
73
+ # Output Models (OpenAI SSE)
74
+ class ChoiceDelta(BaseModel):
75
+ content: Optional[str] = None
76
+
77
+ class Choice(BaseModel):
78
+ index: int = 0
79
+ delta: ChoiceDelta
80
+ finish_reason: Optional[Literal["stop", "length"]] = None
81
+
82
+ class ChatCompletionChunk(BaseModel):
83
+ id: str = Field(default_factory=lambda: f"chatcmpl-{uuid.uuid4()}")
84
+ object: str = "chat.completion.chunk"
85
+ created: int = Field(default_factory=lambda: int(time.time()))
86
+ model: str = "notion-proxy" # Or could reflect the underlying Notion model
87
+ choices: List[Choice]
88
+
89
+
90
+ # --- Models for /v1/models Endpoint ---
91
+
92
+ class Model(BaseModel):
93
+ id: str
94
+ object: str = "model"
95
+ created: int = Field(default_factory=lambda: int(time.time()))
96
+ owned_by: str = "notion" # Or specify based on actual model origin if needed
97
+
98
+ class ModelList(BaseModel):
99
+ object: str = "list"
100
+ data: List[Model]
requirements.txt ADDED
@@ -0,0 +1,6 @@
1
+ fastapi
2
+ uvicorn[standard]
3
+ httpx
4
+ pydantic
5
+ python-dotenv
6
+ playwright