BrejBala committed on
Commit
b09b8a3
·
1 Parent(s): e63c592

final changes with API key

backend/.env.example CHANGED
@@ -18,7 +18,8 @@ LANGCHAIN_API_KEY=your-langsmith-api-key
18
  LANGCHAIN_PROJECT=rag-agent-workbench
19
 
20
  # Optional: basic API protection
21
- # When set, /ingest/*, /documents/*, /search, and /chat* require header X-API-Key
 
22
  API_KEY=your-backend-api-key
23
 
24
  # Optional: CORS
 
18
  LANGCHAIN_PROJECT=rag-agent-workbench
19
 
20
  # Optional: basic API protection
21
+ # When set, all routers except /health and the OpenAPI/Swagger docs require header X-API-Key.
22
+ # In production-like environments (ENV=production or on Hugging Face Spaces), API_KEY must be set.
23
  API_KEY=your-backend-api-key
24
 
25
  # Optional: CORS
backend/README.md CHANGED
@@ -48,7 +48,9 @@ Optional for LangSmith tracing:
48
 
49
  Optional for basic API protection:
50
 
51
- - `API_KEY` – when set, `/ingest/*`, `/documents/*`, `/search`, and `/chat*` require `X-API-Key` header.
 
 
52
 
53
  Optional for CORS:
54
 
@@ -237,21 +239,71 @@ A helper script is provided to seed the index with multiple arXiv and OpenAlex q
237
  python ../scripts/seed_ingest.py --base-url http://localhost:8000 --namespace dev --mailto you@example.com
238
  ```
239
 
240
- ## Docling integration (external script)
241
 
242
- Docling is used via a separate script so the backend container stays small. To convert a local PDF and upload it as text:
 
 
243
 
244
  ```bash
245
  cd scripts
246
- pip install docling
247
  python docling_convert_and_upload.py \
248
- --pdf-path /path/to/file.pdf \
249
  --backend-url http://localhost:8000 \
250
  --namespace dev \
251
- --title "My PDF via Docling" \
252
- --source docling
 
253
  ```
254
 
255
  ## Deploy Backend on Hugging Face Spaces (Docker)
256
 
257
  1. **Create a new Space**
 
48
 
49
  Optional for basic API protection:
50
 
51
+ - `API_KEY` – when set, all routers except `/health` are protected by `X-API-Key` (including `/chat`, `/search`, `/documents/*`, `/ingest/*`, `/metrics`, and the OpenAPI/Swagger docs).
52
+ - In production-like environments (`ENV=production` or on Hugging Face Spaces), `API_KEY` **must** be set or the backend will fail to start.
53
+ - In local development (no Spaces and `ENV` not set to `production`), `API_KEY` is optional; when omitted, the API (including docs) is open.
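For illustration, a client can attach the key like this. This is a minimal sketch using only the standard library; the helper name `build_request` and the choice of `/metrics` as the target are assumptions made for the example, not code from this repository.

```python
import urllib.request
from typing import Optional


def build_request(base_url: str, path: str, api_key: Optional[str]) -> urllib.request.Request:
    """Build a GET request, attaching X-API-Key only when a key is available."""
    url = f"{base_url.rstrip('/')}{path}"
    headers = {"X-API-Key": api_key} if api_key else {}
    return urllib.request.Request(url, headers=headers, method="GET")


# Hypothetical values: a local backend and the placeholder key from .env.example.
req = build_request("http://localhost:8000", "/metrics", "your-backend-api-key")

# urllib stores header names via str.capitalize(), so the lookup key is "X-api-key".
# HTTP header names are case-insensitive, so the backend still sees X-API-Key.
print(req.full_url, req.get_header("X-api-key"))
```

Passing the request to `urllib.request.urlopen(req)` would send it. With no key configured on the backend (local development), the header can simply be omitted.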
54
 
55
  Optional for CORS:
56
 
 
239
  python ../scripts/seed_ingest.py --base-url http://localhost:8000 --namespace dev --mailto you@example.com
240
  ```
241
 
242
+ ## Docling integration (external scripts)
243
 
244
+ Docling is used via separate scripts so the backend container stays small and does not depend on Docling. To convert local documents and upload them as text:
245
+
246
+ ### Single file
247
 
248
  ```bash
249
  cd scripts
250
+ pip install docling # optional but recommended for rich formats
251
  python docling_convert_and_upload.py \
252
+ --file /path/to/file.pdf \
253
+ --backend-url http://localhost:8000 \
254
+ --namespace dev \
255
+ --title "My local document" \
256
+ --source local-file \
257
+ --api-key "$API_KEY"
258
+ ```
259
+
260
+ - Supported formats when Docling is installed include: PDF, DOCX, PPT/PPTX, XLS/XLSX, HTML/HTM, MD, AsciiDoc, and TXT.
261
+ - If Docling is **not** installed:
262
+ - `.txt` and `.md` files are ingested as raw text.
263
+ - Other formats will fail with a clear message instructing you to install Docling.
264
+
265
+ ### Batch ingest a folder
266
+
267
+ ```bash
268
+ cd scripts
269
+ pip install docling # optional but recommended
270
+ python batch_ingest_local_folder.py \
271
+ --folder /path/to/folder \
272
  --backend-url http://localhost:8000 \
273
  --namespace dev \
274
+ --source local-folder \
275
+ --max-files 200 \
276
+ --api-key "$API_KEY"
277
  ```
278
 
279
+ - Recursively scans the folder for supported extensions and ingests up to `max-files` documents.
280
+ - Each file is converted via `docling_convert_and_upload.py` logic and uploaded to `/documents/upload-text`.
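The recursive scan described above could look roughly like the following. This is a sketch under stated assumptions: the function name `find_supported_files` and the exact extension set are illustrative (AsciiDoc is left out because its extension is not given here), and the real `batch_ingest_local_folder.py` may differ.

```python
import tempfile
from pathlib import Path
from typing import List

# Assumed extension set, based on the formats listed in this README.
SUPPORTED_EXTENSIONS = {
    ".pdf", ".docx", ".ppt", ".pptx", ".xls", ".xlsx",
    ".html", ".htm", ".md", ".txt",
}


def find_supported_files(folder: Path, max_files: int) -> List[Path]:
    """Recursively collect supported files in a stable order, capped at max_files."""
    matches = sorted(
        p for p in folder.rglob("*")
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )
    return matches[:max_files]


# Small demonstration on a throwaway folder.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "nested").mkdir()
    for name in ("a.pdf", "b.txt", "nested/c.MD", "skip.png"):
        (root / name).write_text("x", encoding="utf-8")
    found = [p.relative_to(root).as_posix() for p in find_supported_files(root, max_files=2)]

print(found)
```

Sorting before truncating keeps the selection deterministic, so repeated runs with the same `--max-files` value pick the same documents.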
281
+
282
+ ## Upload documents via UI (Streamlit dialog)
283
+
284
+ The Streamlit chat frontend also supports uploading documents directly from the browser:
285
+
286
+ - Click the **“📄 Upload Document”** button at the top of the chat page.
287
+ - A modal dialog opens with:
288
+ - File chooser (`.pdf`, `.md`, `.txt`, `.docx`, `.pptx`, `.xlsx`, `.html`, `.htm`).
289
+ - Title (defaults to filename without extension).
290
+ - Namespace (defaults to the backend namespace, e.g. `dev`).
291
+ - Source label (defaults to `ui-upload`).
292
+ - Optional metadata: tags (comma-separated) and free-form notes.
293
+ - On upload:
294
+ - The frontend converts the file to markdown/text and calls `POST /documents/upload-text` with:
295
+ - `title`, `source`, `text`, `namespace`, and a `metadata` dictionary containing conversion and UI metadata.
296
+ - On success, the upload is recorded in a “Recent uploads” section in the sidebar and can be quickly queried via “Search this document”.
297
+
298
+ Notes:
299
+
300
+ - Conversion happens entirely in the frontend:
301
+ - `.txt` and `.md` files are read as raw text.
302
+ - For richer formats (PDF/Office/HTML), the frontend attempts to use **Docling** if installed.
303
+ - If Docling is not available, an informative error is shown and the user is asked to upload `.md`/`.txt` instead.
304
+ - On Streamlit Cloud, Docling must be added to the app’s Python environment (e.g. `requirements.txt`) for PDF/Office uploads to work.
305
+ - Streamlit’s file uploader has a default maximum size (typically 200 MB); check Streamlit documentation if you need to increase or restrict this limit.
306
+
307
  ## Deploy Backend on Hugging Face Spaces (Docker)
308
 
309
  1. **Create a new Space**
backend/app/core/auth.py ADDED
@@ -0,0 +1,91 @@
1
+ import os
2
+ from functools import lru_cache
3
+ from typing import Optional
4
+
5
+ from fastapi import HTTPException, Security, status
6
+ from fastapi.security import APIKeyHeader
7
+
8
+ from app.core.logging import get_logger
9
+
10
+ logger = get_logger(__name__)
11
+
12
+ api_key_header = APIKeyHeader(name="X-API-Key", auto_error=False)
13
+
14
+
15
+ @lru_cache(maxsize=1)
16
+ def _get_configured_api_key() -> Optional[str]:
17
+ """Return the configured API key, or None if not set.
18
+
19
+ The key is read from the API_KEY environment variable.
20
+ """
21
+ raw = os.getenv("API_KEY")
22
+ if raw is None or not raw.strip():
23
+ return None
24
+ return raw.strip()
25
+
26
+
27
+ def _is_production_like() -> bool:
28
+ """Heuristic to detect production / hosted environments.
29
+
30
+ - ENV=production
31
+ - or running on Hugging Face Spaces (SPACE_ID or HF_HOME set)
32
+ """
33
+ env = os.getenv("ENV", "").strip().lower()
34
+ if env == "production":
35
+ return True
36
+ if os.getenv("SPACE_ID") or os.getenv("HF_HOME"):
37
+ return True
38
+ return False
39
+
40
+
41
+ def validate_api_key_configuration() -> None:
42
+ """Validate API key configuration at startup.
43
+
44
+ Behaviour:
45
+ - In production-like environments (HF Spaces or ENV=production):
46
+ - API_KEY MUST be set, otherwise raise RuntimeError (fail fast).
47
+ - In other environments:
48
+ - If API_KEY is missing, allow running open but log a clear warning.
49
+ """
50
+ configured = _get_configured_api_key()
51
+ if _is_production_like():
52
+ if not configured:
53
+ raise RuntimeError(
54
+ "API_KEY environment variable must be set when running in "
55
+ "production or on Hugging Face Spaces. Configure API_KEY in "
56
+ "your environment or Space secrets."
57
+ )
58
+ logger.info("API key configured for production / hosted environment.")
59
+ else:
60
+ if not configured:
61
+ logger.warning(
62
+ "API_KEY is not set; backend is running without authentication. "
63
+ "This is intended for local development only."
64
+ )
65
+ else:
66
+ logger.info("API key configured for development mode.")
67
+
68
+
69
+ async def require_api_key(api_key: Optional[str] = Security(api_key_header)) -> None:
70
+ """FastAPI dependency that enforces X-API-Key when configured.
71
+
72
+ - If API_KEY is not configured (local/dev), this is a no-op.
73
+ - If API_KEY is configured:
74
+ - Missing or mismatched X-API-Key results in HTTP 403.
75
+ """
76
+ configured = _get_configured_api_key()
77
+ if not configured:
78
+ # No API key configured: dev mode, do not enforce.
79
+ return
80
+
81
+ if not api_key:
82
+ raise HTTPException(
83
+ status_code=status.HTTP_403_FORBIDDEN,
84
+ detail="Missing API key. Provide X-API-Key header.",
85
+ )
86
+
87
+ if api_key != configured:
88
+ raise HTTPException(
89
+ status_code=status.HTTP_403_FORBIDDEN,
90
+ detail="Invalid API key.",
91
+ )
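The decision logic of `require_api_key` can be exercised without FastAPI. The sketch below mirrors the three branches above as a pure function that returns the status code. Note one deliberate difference, which is an alternative and not what this commit does: it compares keys with `hmac.compare_digest` instead of `!=` to avoid leaking timing information. The function name `check_api_key` is hypothetical.

```python
import hmac
from typing import Optional


def check_api_key(provided: Optional[str], configured: Optional[str]) -> int:
    """Return the HTTP status the dependency would produce: 200 or 403."""
    if not configured:
        # No API key configured: dev mode, nothing is enforced.
        return 200
    if not provided:
        return 403
    # Constant-time comparison (alternative to the plain != used in the commit).
    # Encoding to bytes avoids a TypeError on non-ASCII strings.
    if not hmac.compare_digest(provided.encode("utf-8"), configured.encode("utf-8")):
        return 403
    return 200


print(check_api_key(None, None), check_api_key(None, "s3cret"), check_api_key("s3cret", "s3cret"))
```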
backend/app/core/security.py CHANGED
@@ -1,10 +1,8 @@
1
  import os
2
- from typing import Iterable, List, Optional
3
 
4
- from fastapi import FastAPI, Request
5
- from fastapi.responses import JSONResponse
6
  from fastapi.middleware.cors import CORSMiddleware
7
- from starlette.middleware.base import BaseHTTPMiddleware
8
 
9
  from app.core.logging import get_logger
10
 
@@ -23,76 +21,11 @@ def _get_allowed_origins() -> List[str]:
23
  return origins
24
 
25
 
26
- class APIKeyMiddleware(BaseHTTPMiddleware):
27
- """Optional API key protection for selected endpoints.
28
-
29
- When the API_KEY environment variable is set, this middleware enforces the
30
- presence of an `X-API-Key` header with a matching value for:
31
-
32
- - /ingest/*
33
- - /documents/*
34
- - /chat*
35
- - /search
36
-
37
- The following paths remain public regardless of API_KEY:
38
-
39
- - /health
40
- - /docs
41
- - /openapi.json
42
- - /redoc
43
- - /metrics
44
 
45
- When API_KEY is not set, the middleware is not installed and the API is open.
46
  """
47
-
48
- def __init__(self, app: FastAPI, api_key: str) -> None: # type: ignore[override]
49
- super().__init__(app)
50
- self.api_key = api_key
51
-
52
- self._protected_prefixes: List[str] = [
53
- "/ingest",
54
- "/documents",
55
- "/chat",
56
- "/search",
57
- ]
58
- self._public_prefixes: List[str] = [
59
- "/health",
60
- "/docs",
61
- "/openapi.json",
62
- "/redoc",
63
- "/metrics",
64
- ]
65
-
66
- async def dispatch(self, request: Request, call_next): # type: ignore[override]
67
- path = request.url.path or "/"
68
-
69
- # Public endpoints stay open.
70
- if any(path.startswith(prefix) for prefix in self._public_prefixes):
71
- return await call_next(request)
72
-
73
- # Only enforce for protected prefixes.
74
- if not any(path.startswith(prefix) for prefix in self._protected_prefixes):
75
- return await call_next(request)
76
-
77
- header_key: Optional[str] = request.headers.get("X-API-Key")
78
- if not header_key or header_key != self.api_key:
79
- logger.warning("Rejected request with missing/invalid API key path=%s", path)
80
- return JSONResponse(
81
- status_code=401,
82
- content={
83
- "detail": (
84
- "Missing or invalid API key. Provide X-API-Key header with "
85
- "a valid key to access this endpoint."
86
- )
87
- },
88
- )
89
-
90
- return await call_next(request)
91
-
92
-
93
- def configure_security(app: FastAPI) -> None:
94
- """Configure CORS and optional API key protection on the FastAPI app."""
95
- # CORS
96
  origins = _get_allowed_origins()
97
  app.add_middleware(
98
  CORSMiddleware,
@@ -101,16 +34,4 @@ def configure_security(app: FastAPI) -> None:
101
  allow_methods=["*"],
102
  allow_headers=["*"],
103
  )
104
- logger.info("CORS configured allow_origins=%s", origins)
105
-
106
- # Optional API key middleware
107
- api_key = os.getenv("API_KEY")
108
- if not api_key:
109
- logger.warning(
110
- "API key disabled; protected endpoints are open. "
111
- "Set API_KEY environment variable to enable X-API-Key protection."
112
- )
113
- return
114
-
115
- logger.info("API key protection enabled for ingest, documents, search, and chat.")
116
- app.add_middleware(APIKeyMiddleware, api_key=api_key)
 
1
  import os
2
+ from typing import List
3
 
4
+ from fastapi import FastAPI
 
5
  from fastapi.middleware.cors import CORSMiddleware
 
6
 
7
  from app.core.logging import get_logger
8
 
 
21
  return origins
22
 
23
 
24
+ def configure_security(app: FastAPI) -> None:
25
+ """Configure CORS on the FastAPI app.
26
 
27
+ API key enforcement is handled via dependencies in app.core.auth.
28
  """
29
  origins = _get_allowed_origins()
30
  app.add_middleware(
31
  CORSMiddleware,
 
34
  allow_methods=["*"],
35
  allow_headers=["*"],
36
  )
37
+ logger.info("CORS configured allow_origins=%s", origins)
backend/app/main.py CHANGED
@@ -1,6 +1,7 @@
1
- from fastapi import FastAPI
2
  from fastapi.responses import ORJSONResponse
3
 
 
4
  from app.core.config import get_settings
5
  from app.core.errors import PineconeIndexConfigError, setup_exception_handlers
6
  from app.core.logging import configure_logging, get_logger
@@ -20,6 +21,9 @@ settings = get_settings()
20
  configure_logging(settings.LOG_LEVEL)
21
  logger = get_logger(__name__)
22
 
 
 
 
23
  # Log runtime port / environment context at import time for easier diagnostics.
24
  get_port()
25
 
@@ -37,13 +41,14 @@ configure_security(app)
37
  setup_rate_limiter(app)
38
  setup_metrics(app)
39
 
40
- # Register routers with tags and ensure they are included in the schema
 
41
  app.include_router(health_router, tags=["health"])
42
- app.include_router(ingest_router, tags=["ingest"])
43
- app.include_router(search_router, tags=["search"])
44
- app.include_router(documents_router, tags=["documents"])
45
- app.include_router(chat_router, tags=["chat"])
46
- app.include_router(metrics_router, tags=["metrics"])
47
 
48
  # Register exception handlers
49
  setup_exception_handlers(app)
 
1
+ from fastapi import Depends, FastAPI
2
  from fastapi.responses import ORJSONResponse
3
 
4
+ from app.core.auth import require_api_key, validate_api_key_configuration
5
  from app.core.config import get_settings
6
  from app.core.errors import PineconeIndexConfigError, setup_exception_handlers
7
  from app.core.logging import configure_logging, get_logger
 
21
  configure_logging(settings.LOG_LEVEL)
22
  logger = get_logger(__name__)
23
 
24
+ # Validate API key configuration early so hosted deployments fail fast when misconfigured.
25
+ validate_api_key_configuration()
26
+
27
  # Log runtime port / environment context at import time for easier diagnostics.
28
  get_port()
29
 
 
41
  setup_rate_limiter(app)
42
  setup_metrics(app)
43
 
44
+ # Register routers with tags and ensure they are included in the schema.
45
+ # Health and docs remain public; all other routers are protected by API key dependency when configured.
46
  app.include_router(health_router, tags=["health"])
47
+ app.include_router(ingest_router, tags=["ingest"], dependencies=[Depends(require_api_key)])
48
+ app.include_router(search_router, tags=["search"], dependencies=[Depends(require_api_key)])
49
+ app.include_router(documents_router, tags=["documents"], dependencies=[Depends(require_api_key)])
50
+ app.include_router(chat_router, tags=["chat"], dependencies=[Depends(require_api_key)])
51
+ app.include_router(metrics_router, tags=["metrics"], dependencies=[Depends(require_api_key)])
52
 
53
  # Register exception handlers
54
  setup_exception_handlers(app)
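The fail-fast check invoked at import time can be illustrated in isolation. The sketch below restates the rules of `validate_api_key_configuration()` as a function over an explicit environment mapping, which makes the behaviour easy to test. The names `resolve_api_key`, `is_production_like`, and `validate` are assumptions for this example; the committed code reads `os.environ` directly and logs instead of returning a value.

```python
from typing import Mapping, Optional


def resolve_api_key(env: Mapping[str, str]) -> Optional[str]:
    """Return the stripped API_KEY, or None when it is unset or blank."""
    return env.get("API_KEY", "").strip() or None


def is_production_like(env: Mapping[str, str]) -> bool:
    """ENV=production, or Hugging Face Spaces markers (SPACE_ID / HF_HOME)."""
    if env.get("ENV", "").strip().lower() == "production":
        return True
    return bool(env.get("SPACE_ID") or env.get("HF_HOME"))


def validate(env: Mapping[str, str]) -> str:
    """Raise in production-like environments without a key; otherwise report the mode."""
    key = resolve_api_key(env)
    if is_production_like(env) and not key:
        raise RuntimeError("API_KEY must be set in production-like environments.")
    return "protected" if key else "open"


print(validate({}))                                    # local dev, no key
print(validate({"ENV": "production", "API_KEY": "k"}))  # hosted, key present
```

Because the committed `_is_production_like()` also treats `HF_HOME` as a hosting marker, a local machine that sets `HF_HOME` for model caching would be classified as production-like and would need `API_KEY` set.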
docs/CONTEXT.md CHANGED
@@ -275,38 +275,35 @@ RAG Agent Workbench is a lightweight experimentation backend for retrieval-augme
275
  - Logs: `Starting on port=<port> hf_spaces_mode=<bool>` using a simple heuristic (`SPACE_ID` / `SPACE_REPO_ID` env vars).
276
  - Called from `app.main` at import time so the log line is visible in container logs during startup.
277
 
278
- ### API key middleware and CORS
279
 
280
  - **API key protection**
281
- - New module: `backend/app/core/security.py`
282
- - `configure_security(app)`:
283
- - Configures CORS.
284
- - Installs optional `APIKeyMiddleware` when `API_KEY` env var is set.
285
- - `APIKeyMiddleware` rules:
286
- - If `API_KEY` is set:
287
- - Require header `X-API-Key` on:
288
- - `/ingest/*`
289
- - `/documents/*`
290
- - `/search`
291
- - `/chat*` (both `/chat` and `/chat/stream`)
292
- - Public endpoints (no key required):
293
- - `/health`
294
- - `/docs`
295
- - `/openapi.json`
296
- - `/redoc`
297
- - `/metrics`
298
- - Requests with missing/invalid key receive:
299
- - HTTP `401` with JSON `{"detail": "Missing or invalid API key. ..."}`.
300
- - If `API_KEY` is **not** set:
301
- - Middleware is not installed.
302
- - A warning is logged:
303
- - `"API key disabled; protected endpoints are open. Set API_KEY..."`.
304
- - Intended use:
305
- - For public demos, set a simple API key and configure the frontend to pass it.
306
- - For local development, leaving `API_KEY` unset keeps the API open.
307
 
308
  - **CORS configuration**
309
- - Also in `core/security.py`:
310
  - Reads `ALLOWED_ORIGINS` env var as a comma-separated list.
311
  - If unset or empty:
312
  - Defaults to `["*"]` (permissive, useful for local dev and quick demos).
@@ -314,7 +311,7 @@ RAG Agent Workbench is a lightweight experimentation backend for retrieval-augme
314
  - `allow_origins=origins`
315
  - `allow_methods=["*"]`
316
  - `allow_headers=["*"]`
317
- - Documented in `.env.example` and README so operators can lock this down for real deployments.
318
 
319
  ### Rate limiting (SlowAPI)
320
 
@@ -483,22 +480,36 @@ RAG Agent Workbench is a lightweight experimentation backend for retrieval-augme
483
  - `httpx`
484
  - Backend configuration:
485
  - Reads `BACKEND_BASE_URL` from `st.secrets["BACKEND_BASE_URL"]` or the `BACKEND_BASE_URL` environment variable.
486
- - Reads optional `API_KEY` from `st.secrets["API_KEY"]` or the `API_KEY` environment variable.
487
- - Connectivity panel (sidebar):
488
- - Displays the configured backend URL.
489
- - Indicates whether an API key is configured.
490
- - Provides a "Ping /health" button that calls the backend and shows the JSON response.
491
- - Chat UI:
492
- - Text input for namespace (`dev` by default).
493
- - Text area for user question.
494
- - On "Send":
495
- - Calls backend `/chat` with:
496
- - `query`, `namespace`, `top_k=5`, `use_web_fallback=true`.
497
- - Includes `X-API-Key` header when configured.
498
- - Displays:
499
- - Answer text.
500
- - Timings JSON.
501
- - Up to 5 sources with titles, URLs, and snippet text (in expanders).
 
502
 
503
  - Root-level `requirements.txt`
504
  - Added to support Streamlit Community Cloud, where the root requirements file is used:
 
275
  - Logs: `Starting on port=<port> hf_spaces_mode=<bool>` using a simple heuristic (`SPACE_ID` / `SPACE_REPO_ID` env vars).
276
  - Called from `app.main` at import time so the log line is visible in container logs during startup.
277
 
278
+ ### API key protection and CORS
279
 
280
  - **API key protection**
281
+ - New module: `backend/app/core/auth.py`
282
+ - Defines `require_api_key` FastAPI dependency using `APIKeyHeader` (`X-API-Key`).
283
+ - `validate_api_key_configuration()` runs at startup and enforces:
284
+ - In production-like environments (`ENV=production` or on Hugging Face Spaces via `SPACE_ID` / `HF_HOME`):
285
+ - `API_KEY` **must** be set or the backend fails fast with a clear error.
286
+ - In local development:
287
+ - If `API_KEY` is missing, the backend runs open but logs a prominent warning.
288
+ - `require_api_key` behaviour:
289
+ - If `API_KEY` is not configured (dev mode), the dependency is a no-op.
290
+ - If `API_KEY` is configured:
291
+ - Missing or mismatched `X-API-Key` results in HTTP 403.
292
+ - Wiring:
293
+ - All routers except `/health` are registered with `dependencies=[Depends(require_api_key)]`.
294
+ - Docs and OpenAPI endpoints are explicitly secured:
295
+ - `GET /openapi.json` – returns `app.openapi()`, protected by `require_api_key`.
296
+ - `GET /docs` – Swagger UI via `get_swagger_ui_html`, protected by `require_api_key`.
297
+ - `GET /redoc` – ReDoc UI via `get_redoc_html`, protected by `require_api_key`.
298
+ - Effect:
299
+ - In HF Spaces / production:
300
+ - `/docs`, `/redoc`, `/openapi.json`, `/chat`, `/search`, `/documents/*`, `/ingest/*`, `/metrics` all require `X-API-Key`.
301
+ - `/health` remains public for simple uptime checks.
302
+ - In local dev with no `API_KEY`:
303
+ - All endpoints (including docs) are accessible without a key for convenience.
 
 
 
304
 
305
  - **CORS configuration**
306
+ - `backend/app/core/security.py` now focuses solely on CORS:
307
  - Reads `ALLOWED_ORIGINS` env var as a comma-separated list.
308
  - If unset or empty:
309
  - Defaults to `["*"]` (permissive, useful for local dev and quick demos).
 
311
  - `allow_origins=origins`
312
  - `allow_methods=["*"]`
313
  - `allow_headers=["*"]`
314
+ - API key enforcement is handled entirely via `core/auth.py` and router/dependency wiring.
315
 
316
  ### Rate limiting (SlowAPI)
317
 
 
480
  - `httpx`
481
  - Backend configuration:
482
  - Reads `BACKEND_BASE_URL` from `st.secrets["BACKEND_BASE_URL"]` or the `BACKEND_BASE_URL` environment variable.
483
+ - Reads `API_KEY` from `st.secrets["API_KEY"]` or the `API_KEY` environment variable.
484
+ - Sidebar ("Backend" + settings):
485
+ - Shows backend URL and API key status.
486
+ - "Ping /health" button that calls the backend and shows the JSON response.
487
+ - `top_k` slider, `min_score` slider, `use_web_fallback` checkbox.
488
+ - "Show sources" toggle and "Clear chat" button.
489
+ - "Recent uploads" section with quick actions:
490
+ - For each recent upload, displays title, namespace, timestamp.
491
+ - A "Search this document" button pre-fills the chat input with a prompt such as `Summarize: <title>`.
492
+ - Chatbot UI:
493
+ - Uses `st.chat_message` and `st.chat_input` with conversation stored in `st.session_state.messages`.
494
+ - When the user sends a message:
495
+ - Appends it to history and displays it.
496
+ - Calls `/chat/stream` with `X-API-Key` (if available) and streams tokens into the UI.
497
+ - If `/chat/stream` is unavailable (e.g. 404), falls back to `/chat`.
498
+ - Assistant messages:
499
+ - Display the answer text.
500
+ - Optionally show sources in an expandable "Sources" section with titles, URLs, scores, and truncated snippets.
501
+ - If `API_KEY` is not configured in secrets or environment:
502
+ - The app warns and disables sending messages to the protected backend.
503
+ - UI document upload:
504
+ - A top-level “📄 Upload Document” button opens a `@st.dialog` modal.
505
+ - Inside the dialog:
506
+ - `st.file_uploader` for `.pdf`, `.md`, `.txt`, `.docx`, `.pptx`, `.xlsx`, `.html`, `.htm`.
507
+ - Inputs for title (defaulting to filename), namespace, source label, tags, and notes.
508
+ - A checkbox to allow uploading even when extracted text is very short.
509
+ - On submit:
510
+ - The frontend converts the file to text/markdown (using Docling when installed, or raw text for `.md`/`.txt`).
511
+ - Calls backend `POST /documents/upload-text` with `X-API-Key`.
512
+ - On success, records the upload in `st.session_state.recent_uploads` and triggers a rerun to close the dialog.
513
 
514
  - Root-level `requirements.txt`
515
  - Added to support Streamlit Community Cloud, where the root requirements file is used:
docs/WORKLOG.md CHANGED
@@ -39,4 +39,37 @@
39
  - Use port `7860` by default in the Docker image, while respecting the `PORT` environment variable for platforms like Hugging Face Spaces.
40
  - Keep API key protection opt-in via `API_KEY` with clear logging when disabled.
41
  - Enable rate limiting and caching by default, with simple boolean toggles (`RATE_LIMIT_ENABLED`, `CACHE_ENABLED`) for easy operational control.
42
- - Implement metrics as in-memory only (no external storage) and expose them via a JSON `/metrics` endpoint tailored for demos and lightweight monitoring.
39
  - Use port `7860` by default in the Docker image, while respecting the `PORT` environment variable for platforms like Hugging Face Spaces.
40
  - Keep API key protection opt-in via `API_KEY` with clear logging when disabled.
41
  - Enable rate limiting and caching by default, with simple boolean toggles (`RATE_LIMIT_ENABLED`, `CACHE_ENABLED`) for easy operational control.
42
+ - Implement metrics as in-memory only (no external storage) and expose them via a JSON `/metrics` endpoint tailored for demos and lightweight monitoring.
43
+
44
+ ## 2026-01-17 – Security + UI + Ingestion Hardening
45
+
46
+ - **Summary**
47
+ - Hardened the backend for public deployment by enforcing API key protection for all non-health endpoints and (initially) for the OpenAPI/Swagger documentation, then relaxed docs to be publicly viewable while keeping all functional endpoints protected.
48
+ - Upgraded the Streamlit frontend to a conversational chat UI using Streamlit's chat primitives.
49
+ - Improved local document ingestion workflows with Docling-aware scripts for single files and batch folder ingestion.
50
+ - Added a UI-based document upload dialog in the Streamlit app that ingests files via `/documents/upload-text`.
51
+
52
+ - **Key Files Changed**
53
+ - Backend authentication and wiring:
54
+ - `backend/app/core/auth.py`
55
+ - `backend/app/core/security.py`
56
+ - `backend/app/main.py`
57
+ - Frontend chatbot UI and upload:
58
+ - `frontend/app.py`
59
+ - `frontend/services/file_convert.py`
60
+ - `frontend/services/backend_client.py`
61
+ - Local ingestion scripts:
62
+ - `scripts/docling_convert_and_upload.py`
63
+ - `scripts/batch_ingest_local_folder.py`
64
+ - Documentation:
65
+ - `backend/README.md`
66
+ - `docs/CONTEXT.md`
67
+ - `docs/WORKLOG.md` (this file)
68
+
69
+ - **Major Decisions**
70
+ - In production-like environments (`ENV=production` or on Hugging Face Spaces), require `API_KEY` and fail fast at startup when it is missing; Swagger/OpenAPI remain publicly accessible but all non-health API endpoints still enforce `X-API-Key`.
71
+ - Use a single `require_api_key` dependency (based on `APIKeyHeader`) to protect all routers except `/health`.
72
+ - Treat Streamlit as a first-class chat client, using `st.chat_message`/`st.chat_input` with session-based history and optional streaming from `/chat/stream`.
73
+ - Keep Docling as an optional dependency used in:
74
+ - Local ingestion scripts that upload text to `/documents/upload-text`.
75
+ - The frontend upload dialog for converting PDFs/Office/HTML when available, while falling back to raw `.md`/`.txt` and showing clear errors otherwise.
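The optional-Docling decision above can be sketched as follows. This is not the repository's `file_convert.py`; the function name `convert_to_text` is hypothetical, and the sketch assumes Docling's documented `DocumentConverter().convert(...)` and `export_to_markdown()` calls for the rich-format path.

```python
import tempfile
from pathlib import Path

RAW_TEXT_SUFFIXES = {".txt", ".md"}


def convert_to_text(path: Path) -> str:
    """Read .txt/.md as raw text; use Docling for other formats when it is installed."""
    if path.suffix.lower() in RAW_TEXT_SUFFIXES:
        return path.read_text(encoding="utf-8")
    try:
        # Imported lazily so environments without Docling still handle raw text.
        from docling.document_converter import DocumentConverter
    except ImportError as exc:
        raise RuntimeError(
            f"Cannot convert '{path.name}': install Docling (pip install docling) "
            "or upload a .md/.txt file instead."
        ) from exc
    result = DocumentConverter().convert(str(path))
    return result.document.export_to_markdown()


# Demonstration of the raw-text path, which works without Docling.
with tempfile.TemporaryDirectory() as tmp:
    note = Path(tmp) / "note.md"
    note.write_text("# Hello\n", encoding="utf-8")
    text = convert_to_text(note)

print(text)
```

Keeping the import inside the function is what lets Docling stay optional: the module loads everywhere, and only rich-format conversions fail when the package is missing.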
frontend/.streamlit/secrets.toml DELETED
@@ -1 +0,0 @@
1
- BACKEND_BASE_URL = "http://127.0.0.1:8000"
 
 
frontend/app.py CHANGED
@@ -1,131 +1,447 @@
 
1
  import os
2
- from typing import Any, Dict
 
 
3
 
4
  import httpx
5
  import streamlit as st
6
 
 
 
 
7
 
8
  def get_backend_base_url() -> str:
9
- # Prefer Streamlit secrets, then environment variable, then localhost.
10
- secrets = getattr(st, "secrets", {})
11
- base_url = getattr(secrets, "get", lambda _k, _d=None: None)("BACKEND_BASE_URL", None)
12
- if not base_url:
13
  base_url = os.getenv("BACKEND_BASE_URL", "http://localhost:8000")
14
- return base_url.rstrip("/")
15
 
16
 
17
- def get_api_key() -> str | None:
18
- secrets = getattr(st, "secrets", {})
19
- api_key = getattr(secrets, "get", lambda _k, _d=None: None)("API_KEY", None)
20
- if not api_key:
21
- api_key = os.getenv("API_KEY")
22
- return api_key
23
 
24
 
25
- async def ping_health(base_url: str, api_key: str | None) -> Dict[str, Any]:
26
  url = f"{base_url}/health"
27
  headers: Dict[str, str] = {}
28
  if api_key:
29
  headers["X-API-Key"] = api_key
30
- async with httpx.AsyncClient(timeout=10.0) as client:
31
- resp = await client.get(url, headers=headers)
32
  return resp.json()
33
 
34
 
35
- async def call_chat(
36
  base_url: str,
37
- api_key: str | None,
38
- query: str,
39
- namespace: str,
40
  ) -> Dict[str, Any]:
41
  url = f"{base_url}/chat"
42
- payload: Dict[str, Any] = {
43
- "query": query,
44
- "namespace": namespace,
45
- "top_k": 5,
46
- "use_web_fallback": True,
47
- }
48
- headers: Dict[str, str] = {"Content-Type": "application/json"}
49
- if api_key:
50
- headers["X-API-Key"] = api_key
51
-
52
- async with httpx.AsyncClient(timeout=60.0) as client:
53
- resp = await client.post(url, json=payload, headers=headers)
54
  return resp.json()
55
 
56
 
57
- def main() -> None:
58
- st.set_page_config(page_title="RAG Agent Workbench", layout="wide")
59
- st.title("RAG Agent Workbench – Chat Demo")
 
 
 
60
 
61
- backend_base_url = get_backend_base_url()
62
- api_key = get_api_key()
63
 
64
  with st.sidebar:
65
- st.header("Connectivity")
 
66
  st.markdown(f"**Backend URL:** `{backend_base_url}`")
67
  if api_key:
68
- st.markdown("**API key:** configured in Streamlit secrets.")
69
  else:
70
- st.markdown("**API key:** not set (backend may be open).")
 
 
 
71
 
72
  if st.button("Ping /health"):
73
  try:
74
- import asyncio
75
-
76
- health = asyncio.run(ping_health(backend_base_url, api_key))
77
  st.success("Backend reachable.")
78
  st.json(health)
79
  except Exception as exc: # noqa: BLE001
80
  st.error(f"Health check failed: {exc}")
81
 
82
- st.subheader("Chat")
83
 
84
- namespace = st.text_input("Namespace", value="dev", help="Pinecone namespace to query.")
85
- query = st.text_area(
86
- "Your question",
87
- value="What is retrieval-augmented generation?",
88
- height=100,
89
  )
90
 
91
- if st.button("Send"):
92
- if not query.strip():
93
- st.warning("Please enter a question.")
94
  return
95
- with st.spinner("Calling backend /chat..."):
96
- try:
97
- import asyncio
98
 
99
- response = asyncio.run(
100
- call_chat(backend_base_url, api_key, query.strip(), namespace.strip() or "dev")
101
- )
102
- except Exception as exc: # noqa: BLE001
103
- st.error(f"Error calling backend: {exc}")
104
- return
105
-
106
- answer = response.get("answer", "")
107
- timings = response.get("timings", {})
108
- sources = response.get("sources", [])
109
-
110
- st.markdown("### Answer")
111
- st.write(answer or "_No answer returned._")
112
-
113
- st.markdown("### Timings (ms)")
114
- st.json(timings)
115
-
116
- if sources:
117
- st.markdown("### Sources")
118
- for idx, src in enumerate(sources[:5], start=1):
119
- title = src.get("title") or f"Source {idx}"
120
- url = src.get("url") or ""
121
- score = src.get("score", 0.0)
122
- st.markdown(f"**[{idx}] {title}** (score={score:.3f})")
123
- if url:
124
- st.markdown(f"- URL: {url}")
125
- chunk_text = src.get("chunk_text") or ""
126
- if chunk_text:
127
- with st.expander("Snippet", expanded=False):
128
- st.write(chunk_text)
129
 
130
 
131
  if __name__ == "__main__":
 
1
+ import json
2
  import os
3
+ from datetime import datetime
4
+ from pathlib import Path
5
+ from typing import Any, Dict, Generator, List, Optional, Tuple
6
 
7
  import httpx
8
  import streamlit as st
9
 
10
+ from services.backend_client import post_upload_text
11
+ from services.file_convert import convert_uploaded_file_to_text
12
+
13
 
14
  def get_backend_base_url() -> str:
15
+ """Prefer Streamlit secrets, then environment variable, then localhost."""
16
+ if "BACKEND_BASE_URL" in st.secrets:
17
+ base_url = st.secrets["BACKEND_BASE_URL"]
18
+ else:
19
  base_url = os.getenv("BACKEND_BASE_URL", "http://localhost:8000")
20
+ return str(base_url).rstrip("/")
21
 
22
 
23
+ def get_api_key() -> Optional[str]:
24
+ """Read API key from Streamlit secrets or environment."""
25
+ if "API_KEY" in st.secrets:
26
+ return str(st.secrets["API_KEY"])
27
+ return os.getenv("API_KEY")
 
28
 
29
 
30
+ def ping_health(base_url: str, api_key: Optional[str]) -> Dict[str, Any]:
31
  url = f"{base_url}/health"
32
  headers: Dict[str, str] = {}
33
  if api_key:
34
  headers["X-API-Key"] = api_key
35
+ resp = httpx.get(url, headers=headers, timeout=10.0)
36
+ resp.raise_for_status()
37
  return resp.json()
38
 
39
 
40
+ def call_chat(
41
  base_url: str,
42
+ api_key: str,
43
+ payload: Dict[str, Any],
 
44
  ) -> Dict[str, Any]:
45
  url = f"{base_url}/chat"
46
+ headers: Dict[str, str] = {"Content-Type": "application/json", "X-API-Key": api_key}
47
+ resp = httpx.post(url, json=payload, headers=headers, timeout=60.0)
48
+ resp.raise_for_status()
49
  return resp.json()
50
 
51
 
52
+ def iter_chat_stream(
53
+ base_url: str,
54
+ api_key: str,
55
+ payload: Dict[str, Any],
56
+ ) -> Generator[Tuple[str, Optional[Dict[str, Any]]], None, None]:
57
+ """Stream tokens from /chat/stream and yield (partial_answer, final_payload).
58
 
59
+ The final_payload is None for intermediate updates and populated once
60
+ when the terminating SSE event is received.
61
+ """
62
+ url = f"{base_url}/chat/stream"
63
+ headers: Dict[str, str] = {"Content-Type": "application/json", "X-API-Key": api_key}
64
+
65
+ full_answer = ""
66
+ final_payload: Optional[Dict[str, Any]] = None
67
+ current_event: Optional[str] = None
68
+
69
+ with httpx.Client(timeout=60.0) as client:
70
+ with client.stream("POST", url, json=payload, headers=headers) as resp:
71
+ resp.raise_for_status()
72
+ for line in resp.iter_lines():
73
+ if not line:
74
+ continue
75
+
76
+ if line.startswith("event:"):
77
+ current_event = line.split(":", 1)[1].strip()
78
+ continue
79
+
80
+ if line.startswith("data:"):
81
+ data = line.split(":", 1)[1].lstrip()
82
+ if current_event == "end":
83
+ # Final payload with full JSON response.
84
+ try:
85
+ final_payload = json.loads(data)
86
+ except json.JSONDecodeError:
87
+ final_payload = None
88
+ else:
89
+ if data:
90
+ if full_answer:
91
+ full_answer += " "
92
+ full_answer += data
93
+ # Yield intermediate answer text.
94
+ yield full_answer, None
95
 
96
+ # After stream ends, make sure we yield at least once with final payload.
97
+ if final_payload is not None:
98
+ # If the backend included the final answer in the JSON payload, prefer it.
99
+ answer_text = str(final_payload.get("answer") or full_answer)
100
+ yield answer_text, final_payload
101
+ elif full_answer:
102
+ yield full_answer, None
103
+
104
+
105
+ def init_session_state() -> None:
106
+ if "messages" not in st.session_state:
107
+ st.session_state.messages: List[Dict[str, Any]] = []
108
+ if "show_sources" not in st.session_state:
109
+ st.session_state.show_sources = True
110
+ if "supports_stream" not in st.session_state:
111
+ st.session_state.supports_stream = True
112
+ # Namespace is fixed for now; default to "dev".
113
+ if "namespace" not in st.session_state:
114
+ st.session_state.namespace = "dev"
115
+ if "recent_uploads" not in st.session_state:
116
+ st.session_state.recent_uploads: List[Dict[str, Any]] = []
117
+ if "chat_prefill" not in st.session_state:
118
+ st.session_state.chat_prefill = None
119
+
120
+
121
+ def render_sidebar(backend_base_url: str, api_key: Optional[str]) -> Dict[str, Any]:
122
  with st.sidebar:
123
+ st.header("Backend")
124
+
125
  st.markdown(f"**Backend URL:** `{backend_base_url}`")
126
  if api_key:
127
+ st.markdown("**API key:** configured in Streamlit secrets or environment.")
128
  else:
129
+ st.warning(
130
+ "API_KEY is not configured. The backend is expected to be protected; "
131
+ "chat will be disabled until an API key is set."
132
+ )
133
 
134
  if st.button("Ping /health"):
135
  try:
136
+ health = ping_health(backend_base_url, api_key)
 
 
137
  st.success("Backend reachable.")
138
  st.json(health)
139
  except Exception as exc: # noqa: BLE001
140
  st.error(f"Health check failed: {exc}")
141
 
142
+ st.markdown("---")
143
+ st.subheader("Chat settings")
144
+
145
+ top_k = st.slider("Top K", min_value=1, max_value=20, value=5, step=1)
146
+ min_score = st.slider(
147
+ "Minimum relevance score",
148
+ min_value=0.0,
149
+ max_value=1.0,
150
+ value=0.25,
151
+ step=0.05,
152
+ )
153
+ use_web_fallback = st.checkbox(
154
+ "Use web fallback (Tavily)",
155
+ value=True,
156
+ help="When enabled, /chat may call Tavily if retrieval is weak.",
157
+ )
158
+
159
+ st.session_state.show_sources = st.checkbox(
160
+ "Show sources", value=st.session_state.show_sources
161
+ )
162
+
163
+ if st.button("Clear chat"):
164
+ st.session_state.messages = []
165
+
166
+ st.markdown("---")
167
+ st.subheader("Recent uploads")
168
+ recent = st.session_state.get("recent_uploads", [])
169
+ if not recent:
170
+ st.caption("No documents uploaded yet.")
171
+ else:
172
+ for idx, item in enumerate(recent):
173
+ title = item.get("title") or "Untitled"
174
+ ns = item.get("namespace") or st.session_state.get("namespace", "dev")
175
+ ts = item.get("timestamp", "")
176
+ st.markdown(f"- **{title}** \n Namespace: `{ns}` \n Uploaded: {ts}")
177
+ if st.button("Search this document", key=f"search_upload_{idx}"):
178
+ st.session_state.chat_prefill = f"Summarize: {title}"
179
+
180
+ return {
181
+ "top_k": top_k,
182
+ "min_score": float(min_score),
183
+ "use_web_fallback": bool(use_web_fallback),
184
+ }
185
+
186
+
187
+ def render_chat_history(show_sources: bool) -> None:
188
+ for message in st.session_state.messages:
189
+ role = message.get("role", "user")
190
+ content = message.get("content", "")
191
+ with st.chat_message("assistant" if role == "assistant" else "user"):
192
+ st.markdown(content)
193
+ if role == "assistant" and show_sources:
194
+ sources = message.get("sources") or []
195
+ if sources:
196
+ with st.expander("Sources", expanded=False):
197
+ for idx, src in enumerate(sources, start=1):
198
+ title = src.get("title") or f"Source {idx}"
199
+ url = src.get("url") or ""
200
+ score = src.get("score", 0.0)
201
+ st.markdown(f"**[{idx}] {title}** (score={score:.3f})")
202
+ if url:
203
+ st.markdown(f"- URL: {url}")
204
+ chunk_text = src.get("chunk_text") or ""
205
+ if chunk_text:
206
+ st.write(chunk_text[:1000] + ("..." if len(chunk_text) > 1000 else ""))
207
 
208
+
209
+ @st.dialog("Upload document")
210
+ def upload_dialog(backend_base_url: str, api_key: Optional[str]) -> None:
211
+ """Modal dialog for uploading and ingesting a document via /documents/upload-text."""
212
+ st.write("Upload a document to ingest it into the RAG backend.")
213
+
214
+ with st.form("upload_form"):
215
+ uploaded_file = st.file_uploader(
216
+ "Choose a file",
217
+ type=["pdf", "md", "txt", "docx", "pptx", "xlsx", "html", "htm"],
218
+ accept_multiple_files=False,
219
+ )
220
+
221
+ default_title = ""
222
+ if uploaded_file is not None:
223
+ default_title = Path(uploaded_file.name).stem
224
+
225
+ title = st.text_input("Title", value=default_title)
226
+ namespace = st.text_input(
227
+ "Namespace",
228
+ value=st.session_state.get("namespace", "dev"),
229
+ help="Target Pinecone namespace.",
230
+ )
231
+ source = st.text_input("Source label", value="ui-upload")
232
+ tags = st.text_input("Tags (comma separated)", value="")
233
+ notes = st.text_area("Notes", value="", height=80)
234
+
235
+ upload_anyway = st.checkbox(
236
+ "Upload even if extracted text is very short",
237
+ value=False,
238
+ help="Enable to upload even when the extracted text is shorter than 200 characters.",
239
+ )
240
+
241
+ submit = st.form_submit_button("Upload")
242
+ if not submit:
243
+ return
244
+
245
+ if uploaded_file is None:
246
+ st.error("Please select a file to upload.")
247
+ return
248
+
249
+ if not title.strip():
250
+ st.error("Please provide a title.")
251
+ return
252
+
253
+ if not api_key:
254
+ st.error("API_KEY is not configured; cannot upload to a protected backend.")
255
+ return
256
+
257
+ with st.spinner("Converting and uploading document..."):
258
+ try:
259
+ uploaded_file.seek(0)
260
+ text, conv_meta = convert_uploaded_file_to_text(uploaded_file)
261
+ except Exception as exc: # noqa: BLE001
262
+ st.error(f"Error converting file: {exc}")
263
+ return
264
+
265
+ if len(text.strip()) < 200 and not upload_anyway:
266
+ st.warning(
267
+ "Extracted text is very short (< 200 characters). "
268
+ "Check the file or enable the checkbox to upload anyway."
269
+ )
270
+ return
271
+
272
+ meta: Dict[str, Any] = {
273
+ **conv_meta,
274
+ "tags": [t.strip() for t in tags.split(",") if t.strip()],
275
+ "notes": notes,
276
+ }
277
+
278
+ payload = {
279
+ "title": title.strip(),
280
+ "source": source.strip() or "ui-upload",
281
+ "text": text,
282
+ "namespace": namespace.strip() or st.session_state.get("namespace", "dev"),
283
+ "metadata": meta,
284
+ }
285
+
286
+ try:
287
+ response = post_upload_text(backend_base_url, api_key, payload)
288
+ except httpx.HTTPStatusError as exc:
289
+ if exc.response is not None:
290
+ detail = exc.response.text
291
+ status_code = exc.response.status_code
292
+ else:
293
+ detail = str(exc)
294
+ status_code = "error"
295
+ st.error(f"Upload failed ({status_code}): {detail}")
296
+ return
297
+ except Exception as exc: # noqa: BLE001
298
+ st.error(f"Upload failed: {exc}")
299
+ return
300
+
301
+ # Record recent upload and suggest a follow-up chat action.
302
+ rec = {
303
+ "title": title.strip(),
304
+ "namespace": payload["namespace"],
305
+ "timestamp": datetime.utcnow().isoformat() + "Z",
306
+ "response": response,
307
+ }
308
+ recent = st.session_state.get("recent_uploads", [])
309
+ recent.append(rec)
310
+ st.session_state.recent_uploads = recent[-5:]
311
+
312
+ st.success(f"Uploaded and indexed: {title.strip()}")
313
+ st.rerun()
314
+
315
+
316
+ def main() -> None:
317
+ st.set_page_config(page_title="RAG Agent Workbench", layout="wide")
318
+ st.title("RAG Agent Workbench – Chatbot")
319
+
320
+ init_session_state()
321
+
322
+ backend_base_url = get_backend_base_url()
323
+ api_key = get_api_key()
324
+
325
+ # Upload button near the top-level chat UI.
326
+ if st.button("📄 Upload Document"):
327
+ upload_dialog(backend_base_url, api_key)
328
+
329
+ settings = render_sidebar(backend_base_url, api_key)
330
+ render_chat_history(show_sources=st.session_state.show_sources)
331
+
332
+ if not api_key:
333
+ st.info(
334
+ "Configure `API_KEY` in Streamlit secrets (and on the backend) to start chatting."
335
+ )
336
+ return
337
+
338
+ # Pre-fill chat input if a suggestion was set (e.g. from recent uploads).
339
+ prefill = st.session_state.get("chat_prefill")
340
+ if prefill and "chat_input" not in st.session_state:
341
+ st.session_state.chat_input = prefill
342
+
343
+ user_message = st.chat_input(
344
+ "Ask a question about your documents...", key="chat_input"
345
  )
346
+ if not user_message:
347
+ return
348
+
349
+ # Clear any prefill once the user has sent a message.
350
+ st.session_state.chat_prefill = None
351
+
352
+ # Record and display user message
353
+ st.session_state.messages.append({"role": "user", "content": user_message})
354
+ with st.chat_message("user"):
355
+ st.markdown(user_message)
356
+
357
+ # Prepare payload for backend
358
+ chat_history = [
359
+ {"role": msg["role"], "content": msg["content"]}
360
+ for msg in st.session_state.messages
361
+ if msg.get("role") in ("user", "assistant")
362
+ ]
363
+ payload: Dict[str, Any] = {
364
+ "query": user_message,
365
+ "namespace": st.session_state.namespace,
366
+ "top_k": int(settings["top_k"]),
367
+ "use_web_fallback": settings["use_web_fallback"],
368
+ "min_score": float(settings["min_score"]),
369
+ "max_web_results": 5,
370
+ "chat_history": chat_history,
371
+ }
372
 
373
+ # Call backend and stream / display assistant response
374
+ with st.chat_message("assistant"):
375
+ placeholder = st.empty()
376
+ placeholder.markdown("_Thinking..._")
377
+
378
+ response: Optional[Dict[str, Any]] = None
379
+
380
+ try:
381
+ if st.session_state.get("supports_stream", True):
382
+ try:
383
+ # Attempt to use streaming endpoint first.
384
+ for partial_answer, final_payload in iter_chat_stream(
385
+ backend_base_url,
386
+ api_key,
387
+ payload,
388
+ ):
389
+ if partial_answer:
390
+ placeholder.markdown(partial_answer)
391
+ if final_payload is not None:
392
+ response = final_payload
393
+ break
394
+ except httpx.HTTPStatusError as exc:
395
+ # If /chat/stream is not available, fall back to /chat.
396
+ if exc.response is not None and exc.response.status_code == 404:
397
+ st.session_state.supports_stream = False
398
+ else:
399
+ raise
400
+
401
+ if response is None:
402
+ # Fallback to non-streaming /chat.
403
+ response = call_chat(backend_base_url, api_key, payload)
404
+ answer_text = str(response.get("answer") or "")
405
+ if answer_text:
406
+ placeholder.markdown(answer_text)
407
+ else:
408
+ placeholder.markdown("_No answer returned._")
409
+
410
+ except Exception as exc: # noqa: BLE001
411
+ placeholder.markdown("")
412
+ st.error(f"Error calling backend: {exc}")
413
  return
 
 
 
414
 
415
+ if not response:
416
+ return
417
+
418
+ answer = str(response.get("answer") or "")
419
+ sources = response.get("sources") or []
420
+ timings = response.get("timings") or {}
421
+
422
+ # Optionally render sources for this assistant turn.
423
+ if st.session_state.show_sources and sources:
424
+ with st.expander("Sources", expanded=False):
425
+ for idx, src in enumerate(sources, start=1):
426
+ title = src.get("title") or f"Source {idx}"
427
+ url = src.get("url") or ""
428
+ score = src.get("score", 0.0)
429
+ st.markdown(f"**[{idx}] {title}** (score={score:.3f})")
430
+ if url:
431
+ st.markdown(f"- URL: {url}")
432
+ chunk_text = src.get("chunk_text") or ""
433
+ if chunk_text:
434
+ st.write(chunk_text[:1000] + ("..." if len(chunk_text) > 1000 else ""))
435
+
436
+ # Persist assistant message with metadata.
437
+ st.session_state.messages.append(
438
+ {
439
+ "role": "assistant",
440
+ "content": answer,
441
+ "sources": sources,
442
+ "timings": timings,
443
+ }
444
+ )
445
 
446
 
447
  if __name__ == "__main__":
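The `iter_chat_stream` function added in this diff mixes HTTP handling with SSE line parsing. The parsing step alone can be sketched as follows, assuming (as the diff suggests) that the backend sends `event:` / `data:` lines and finishes with an `end` event carrying the full JSON response. `parse_sse_lines` is a hypothetical name used only for illustration; it is not part of the commit.

```python
import json
from typing import Any, Dict, Iterable, Iterator, Optional, Tuple


def parse_sse_lines(
    lines: Iterable[str],
) -> Iterator[Tuple[str, Optional[Dict[str, Any]]]]:
    """Yield (partial_answer, final_payload) pairs from already-split SSE lines."""
    full_answer = ""
    final_payload: Optional[Dict[str, Any]] = None
    current_event: Optional[str] = None

    for line in lines:
        if not line:
            # Blank lines separate SSE events; nothing to parse.
            continue
        if line.startswith("event:"):
            current_event = line.split(":", 1)[1].strip()
            continue
        if line.startswith("data:"):
            data = line.split(":", 1)[1].lstrip()
            if current_event == "end":
                # The terminating event is expected to carry the full JSON response.
                try:
                    final_payload = json.loads(data)
                except json.JSONDecodeError:
                    final_payload = None
            elif data:
                # Token chunks are joined with single spaces, as in the diff.
                full_answer = f"{full_answer} {data}" if full_answer else data
                yield full_answer, None

    if final_payload is not None:
        # Prefer the answer from the final payload when it is present.
        yield str(final_payload.get("answer") or full_answer), final_payload
    elif full_answer:
        yield full_answer, None


if __name__ == "__main__":
    demo = ["event: token", "data: Hello", "", "event: end", 'data: {"answer": "Hello"}']
    for partial, final in parse_sse_lines(demo):
        print(partial, final)
```

Keeping the parser free of network calls makes it possible to exercise the streaming logic with a plain list of strings before wiring it to `httpx`.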
frontend/services/__init__.py ADDED
@@ -0,0 +1 @@
+# Helper package for frontend services (conversion, backend client, etc.).
frontend/services/backend_client.py ADDED
@@ -0,0 +1,25 @@
+from __future__ import annotations
+
+from typing import Any, Dict, Optional
+
+import httpx
+
+
+def post_upload_text(
+    base_url: str,
+    api_key: Optional[str],
+    payload: Dict[str, Any],
+) -> Dict[str, Any]:
+    """Call backend /documents/upload-text with the given payload.
+
+    Sends X-API-Key when provided and raises for HTTP errors.
+    """
+    url = f"{base_url.rstrip('/')}/documents/upload-text"
+    headers: Dict[str, str] = {"Content-Type": "application/json"}
+    if api_key:
+        headers["X-API-Key"] = api_key
+
+    with httpx.Client(timeout=60.0) as client:
+        resp = client.post(url, json=payload, headers=headers)
+        resp.raise_for_status()
+        return resp.json()
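To see the request that `post_upload_text` sends without installing `httpx` or running a backend, the same URL and header logic can be reproduced with the standard library. This is a minimal sketch that swaps `httpx` for `urllib.request` and only builds the request object; `build_upload_request` is a hypothetical helper, not part of the commit.

```python
import json
import urllib.request
from typing import Any, Dict, Optional


def build_upload_request(
    base_url: str,
    api_key: Optional[str],
    payload: Dict[str, Any],
) -> urllib.request.Request:
    """Build (but do not send) a POST request for /documents/upload-text."""
    # Strip a trailing slash so the path is not doubled.
    url = f"{base_url.rstrip('/')}/documents/upload-text"
    headers = {"Content-Type": "application/json"}
    if api_key:
        # The header is only attached when a key is configured.
        headers["X-API-Key"] = api_key
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


if __name__ == "__main__":
    req = build_upload_request(
        "http://localhost:8000/",
        "example-key",
        {"title": "Demo", "source": "ui-upload", "text": "hello", "namespace": "dev"},
    )
    print(req.get_method(), req.full_url)
    print(req.header_items())
```

Note that `urllib.request.Request` stores header names capitalized (for example `X-api-key`); HTTP header names are case-insensitive, so this matches what the backend expects.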
frontend/services/file_convert.py ADDED
@@ -0,0 +1,63 @@
+from __future__ import annotations
+
+from pathlib import Path
+from tempfile import NamedTemporaryFile
+from typing import Any, Dict, Tuple
+
+try:
+    from docling.document_converter import DocumentConverter
+except ImportError:  # pragma: no cover - optional dependency
+    DocumentConverter = None  # type: ignore[assignment]
+
+
+def convert_uploaded_file_to_text(uploaded_file) -> Tuple[str, Dict[str, Any]]:
+    """Convert an uploaded Streamlit file to text/markdown.
+
+    - For .txt and .md, returns raw UTF-8 text.
+    - For other supported formats (PDF/Office/HTML), uses Docling when installed.
+    - Raises a RuntimeError with a user-friendly message when Docling is required
+      but not installed.
+    """
+    filename = uploaded_file.name
+    ext = Path(filename).suffix.lower().lstrip(".")
+    size_bytes = getattr(uploaded_file, "size", None)
+    content_type = getattr(uploaded_file, "type", None)
+
+    metadata: Dict[str, Any] = {
+        "filename": filename,
+        "ext": ext,
+        "size_bytes": size_bytes,
+        "content_type": content_type,
+    }
+
+    # Plain text / markdown: read directly.
+    if ext in {"txt", "md"}:
+        raw_bytes = uploaded_file.read()
+        text = raw_bytes.decode("utf-8", errors="ignore")
+        metadata["converted_by"] = "raw"
+        return text, metadata
+
+    # Rich formats: require Docling.
+    if DocumentConverter is None:
+        raise RuntimeError(
+            "Docling is not installed; conversion for this file type is unavailable. "
+            "Install docling (e.g. `pip install docling`) or upload a .md/.txt file."
+        )
+
+    # Persist to a temporary file so Docling can read it from disk.
+    with NamedTemporaryFile(delete=True, suffix=f".{ext}") as tmp:
+        # Streamlit's UploadedFile exposes getbuffer() for zero-copy writes.
+        tmp.write(uploaded_file.getbuffer())
+        tmp.flush()
+
+        converter = DocumentConverter()
+        result = converter.convert(tmp.name)
+
+        try:
+            text = result.document.export_to_markdown()
+        except Exception:  # noqa: BLE001
+            # Fallback to plain text if markdown export is not available.
+            text = result.document.export_to_text()
+
+    metadata["converted_by"] = "docling"
+    return text, metadata
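The conversion above hands Docling the path of a `NamedTemporaryFile` that is still open. That works on Unix-like systems, but on Windows a temporary file created with `delete=True` generally cannot be opened a second time by name while it is open. One possible alternative, shown here as a sketch under that assumption, is to close the file before passing its path on and to delete it explicitly afterwards. `process_via_temp_file` is a hypothetical helper; the `process` callback stands in for the Docling call.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable, TypeVar

T = TypeVar("T")


def process_via_temp_file(data: bytes, suffix: str, process: Callable[[str], T]) -> T:
    """Write data to a temp file, close it, run process(path), then delete the file."""
    # delete=False keeps the file on disk after it is closed.
    tmp = tempfile.NamedTemporaryFile(delete=False, suffix=suffix)
    try:
        with tmp:
            tmp.write(data)
        # The handle is closed here, so other code can open the path on any platform.
        return process(tmp.name)
    finally:
        # Clean up explicitly, whether or not processing succeeded.
        os.unlink(tmp.name)


if __name__ == "__main__":
    size = process_via_temp_file(b"%PDF-1.4 demo", ".pdf", lambda p: Path(p).stat().st_size)
    print(size)
```

In the Streamlit helper, `data` would be `bytes(uploaded_file.getbuffer())` and `process` would wrap the converter call; both of those wirings are illustrative and not taken from the commit.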
scripts/batch_ingest_local_folder.py ADDED
@@ -0,0 +1,148 @@
+# Optional dependency:
+# pip install docling
+#
+# Batch-ingest a local folder of documents into the backend by converting each
+# supported file to markdown/text (using Docling when available) and uploading
+# it via /documents/upload-text.
+
+import argparse
+import json
+from pathlib import Path
+from typing import Any, Dict, List, Optional
+
+from docling_convert_and_upload import convert_file_to_text, upload_text  # type: ignore[import]
+
+
+def parse_args() -> argparse.Namespace:
+    parser = argparse.ArgumentParser(
+        description=(
+            "Recursively ingest a folder of local documents using Docling (when available) "
+            "and upload them to the backend via /documents/upload-text."
+        )
+    )
+    parser.add_argument(
+        "--folder",
+        type=str,
+        required=True,
+        help="Root folder containing documents to ingest.",
+    )
+    parser.add_argument(
+        "--backend-url",
+        "--backend",
+        dest="backend_url",
+        type=str,
+        default="http://localhost:8000",
+        help="Base URL of the running backend (default: http://localhost:8000).",
+    )
+    parser.add_argument(
+        "--namespace",
+        type=str,
+        default="dev",
+        help="Target Pinecone namespace (default: dev).",
+    )
+    parser.add_argument(
+        "--source",
+        type=str,
+        default="local-folder",
+        help="Source label stored in metadata (default: local-folder).",
+    )
+    parser.add_argument(
+        "--api-key",
+        type=str,
+        default=None,
+        help="Optional API key for the backend (sent as X-API-Key).",
+    )
+    parser.add_argument(
+        "--max-files",
+        type=int,
+        default=200,
+        help="Maximum number of files to ingest (default: 200).",
+    )
+    return parser.parse_args()
+
+
+SUPPORTED_EXTENSIONS = {
+    ".pdf",
+    ".docx",
+    ".ppt",
+    ".pptx",
+    ".xls",
+    ".xlsx",
+    ".html",
+    ".htm",
+    ".md",
+    ".markdown",
+    ".adoc",
+    ".txt",
+}
+
+
+def find_files(root: Path, max_files: int) -> List[Path]:
+    files: List[Path] = []
+    for path in root.rglob("*"):
+        if not path.is_file():
+            continue
+        if path.suffix.lower() not in SUPPORTED_EXTENSIONS:
+            continue
+        files.append(path)
+        if len(files) >= max_files:
+            break
+    return files
+
+
+def main() -> int:
+    args = parse_args()
+    root = Path(args.folder).expanduser().resolve()
+    if not root.is_dir():
+        raise SystemExit(f"Folder not found: {root}")
+
+    files = find_files(root, args.max_files)
+    if not files:
+        print(f"No supported files found in {root}")
+        return 0
+
+    print(f"Found {len(files)} file(s) to ingest in {root} (max {args.max_files}).")
+
+    successes = 0
+    failures: List[Dict[str, Any]] = []
+
+    for idx, file_path in enumerate(files, start=1):
+        print(f"[{idx}/{len(files)}] Converting {file_path}...")
+        try:
+            text = convert_file_to_text(file_path)
+        except Exception as exc:  # noqa: BLE001
+            print(f"  Conversion failed: {exc}")
+            failures.append({"path": str(file_path), "error": str(exc)})
+            continue
+
+        try:
+            response = upload_text(
+                backend_url=args.backend_url,
+                title=file_path.name,
+                source=args.source,
+                text=text,
+                namespace=args.namespace,
+                metadata={
+                    "original_path": str(file_path),
+                    "extension": file_path.suffix.lower(),
+                },
+                api_key=args.api_key,
+            )
+            successes += 1
+            print(f"  Uploaded successfully: {json.dumps(response, indent=2)}")
+        except Exception as exc:  # noqa: BLE001
+            print(f"  Upload failed: {exc}")
+            failures.append({"path": str(file_path), "error": str(exc)})
+
+    print()
+    print(f"Ingestion complete. Successes: {successes}, Failures: {len(failures)}")
+    if failures:
+        print("Failures:")
+        for item in failures:
+            print(f"- {item['path']}: {item['error']}")
+
+    return 0
+
+
+if __name__ == "__main__":
+    raise SystemExit(main())
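Two properties of the batch script may matter when it runs unattended: `rglob` yields files in an order that depends on the filesystem, so the `--max-files` cap can select different files between runs, and `main()` returns 0 even when some files failed. The sketch below shows one way to make the selection deterministic and to surface failures through the exit code. It is an illustrative variation, not the behaviour of the committed script; `find_files_sorted`, `exit_code`, and the shortened `SUPPORTED` set are hypothetical.

```python
import tempfile
from pathlib import Path
from typing import List, Set

# Shortened extension set for the sketch; the script above supports more formats.
SUPPORTED: Set[str] = {".pdf", ".md", ".txt"}


def find_files_sorted(
    root: Path, max_files: int, supported: Set[str] = SUPPORTED
) -> List[Path]:
    """Return up to max_files supported files under root, in sorted path order."""
    matches = [
        path
        for path in sorted(root.rglob("*"))
        if path.is_file() and path.suffix.lower() in supported
    ]
    return matches[:max_files]


def exit_code(failure_count: int) -> int:
    """Map the number of failed files to a process exit status."""
    return 1 if failure_count else 0


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp_dir:
        root = Path(tmp_dir)
        for name in ("b.pdf", "a.txt", "ignored.bin"):
            (root / name).write_text("demo")
        print([p.name for p in find_files_sorted(root, max_files=10)])
        print(exit_code(failure_count=0))
```

Sorting requires listing the whole tree before applying the cap, which costs more on very large folders than the early `break` used in the script.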
scripts/docling_convert_and_upload.py CHANGED
@@ -1,28 +1,44 @@
1
- # pip install docling
2
 
3
  import argparse
4
  import json
5
- from typing import Any, Dict
 
6
 
7
  import httpx
8
- from docling.document_converter import DocumentConverter
 
 
 
 
9
 
10
 
11
  def parse_args() -> argparse.Namespace:
12
  parser = argparse.ArgumentParser(
13
  description=(
14
- "Convert a local PDF using Docling and upload the extracted text "
15
- "to the RAG backend via /documents/upload-text."
16
  )
17
  )
18
  parser.add_argument(
 
19
  "--pdf-path",
 
 
20
  type=str,
21
  required=True,
22
- help="Path to the local PDF file.",
23
  )
24
  parser.add_argument(
25
  "--backend-url",
 
 
26
  type=str,
27
  default="http://localhost:8000",
28
  help="Base URL of the running backend (default: http://localhost:8000).",
@@ -37,21 +53,50 @@ def parse_args() -> argparse.Namespace:
37
  "--title",
38
  type=str,
39
  default=None,
40
- help="Optional title for the document; defaults to the PDF filename.",
41
  )
42
  parser.add_argument(
43
  "--source",
44
  type=str,
45
- default="docling",
46
- help="Source label stored in metadata (default: docling).",
47
  )
48
  return parser.parse_args()
49
 
50
 
51
- def convert_pdf_to_markdown(pdf_path: str) -> str:
52
- converter = DocumentConverter()
53
- result = converter.convert(pdf_path)
54
- return result.document.export_to_markdown()
55
 
56
 
57
  def upload_text(
@@ -60,7 +105,8 @@ def upload_text(
60
  source: str,
61
  text: str,
62
  namespace: str,
63
- metadata: Dict[str, Any] | None = None,
 
64
  ) -> Dict[str, Any]:
65
  url = f"{backend_url.rstrip('/')}/documents/upload-text"
66
  payload = {
@@ -70,18 +116,30 @@ def upload_text(
70
  "namespace": namespace,
71
  "metadata": metadata or {},
72
  }
 
 
 
 
73
  with httpx.Client(timeout=60.0) as client:
74
- response = client.post(url, json=payload)
75
  response.raise_for_status()
76
  return response.json()
77
 
78
 
79
  def main() -> int:
80
  args = parse_args()
81
- title = args.title or args.pdf_path.rsplit("/", 1)[-1]
 
 
 
 
82
 
83
- print(f"Converting PDF at {args.pdf_path} with Docling...")
84
- markdown_text = convert_pdf_to_markdown(args.pdf_path)
 
 
 
 
85
 
86
  print(
87
  f"Uploading converted text to backend at {args.backend_url} "
@@ -91,9 +149,10 @@ def main() -> int:
91
  backend_url=args.backend_url,
92
  title=title,
93
  source=args.source,
94
- text=markdown_text,
95
  namespace=args.namespace,
96
- metadata={"original_path": args.pdf_path},
 
97
  )
98
 
99
  print("Upload response:")
 
1
+ # Optional dependency:
2
+ # pip install docling
3
+ #
4
+ # This script converts local documents (PDF, Markdown, and other formats
5
+ # supported by Docling) to text/markdown and uploads them to the backend via
6
+ # /documents/upload-text. Docling is used when available; for .txt/.md files,
7
+ # the script can fall back to raw text if Docling is not installed.
8
 
9
  import argparse
10
  import json
11
+ from pathlib import Path
12
+ from typing import Any, Dict, Optional
13
 
14
  import httpx
15
+
16
+ try:
17
+ from docling.document_converter import DocumentConverter
18
+ except ImportError: # pragma: no cover - optional dependency
19
+ DocumentConverter = None # type: ignore[assignment]
20
 
21
 
22
  def parse_args() -> argparse.Namespace:
23
  parser = argparse.ArgumentParser(
24
  description=(
25
+ "Convert a local document using Docling (when available) and "
26
+ "upload the extracted text to the RAG backend via /documents/upload-text."
27
  )
28
  )
29
  parser.add_argument(
30
+ "--file",
31
  "--pdf-path",
32
+ "--path",
33
+ dest="file_path",
34
  type=str,
35
  required=True,
36
+ help="Path to the local file (PDF, Markdown, DOCX, HTML, TXT, etc.).",
37
  )
38
  parser.add_argument(
39
  "--backend-url",
40
+ "--backend",
41
+ dest="backend_url",
42
  type=str,
43
  default="http://localhost:8000",
44
  help="Base URL of the running backend (default: http://localhost:8000).",
 
53
  "--title",
54
  type=str,
55
  default=None,
56
+ help="Optional title for the document; defaults to the filename.",
57
  )
58
  parser.add_argument(
59
  "--source",
60
  type=str,
61
+ default="local-file",
62
+ help="Source label stored in metadata (default: local-file).",
63
+ )
64
+ parser.add_argument(
65
+ "--api-key",
66
+ type=str,
67
+ default=None,
68
+ help="Optional API key for the backend (sent as X-API-Key).",
69
  )
70
  return parser.parse_args()
71
 
72
 
73
+ def _docling_available() -> bool:
74
+ return DocumentConverter is not None
75
+
76
+
77
+ def convert_file_to_text(file_path: Path) -> str:
78
+ """Convert a file to markdown/text.
79
+
80
+ - If Docling is installed, it is used for all supported formats.
81
+ - If Docling is not installed:
82
+ - .txt and .md files are read as raw text.
83
+ - Other formats raise a RuntimeError with installation instructions.
84
+ """
85
+ suffix = file_path.suffix.lower()
86
+
87
+ if _docling_available():
88
+ converter = DocumentConverter()
89
+ result = converter.convert(str(file_path))
90
+ return result.document.export_to_markdown()
91
+
92
+ # Docling is not available.
93
+ if suffix in {".txt", ".md"}:
94
+ return file_path.read_text(encoding="utf-8", errors="ignore")
95
+
96
+ raise RuntimeError(
97
+ f"Docling is required to convert '{file_path}'. Install it with:\n"
98
+ " pip install docling"
99
+ )
100
 
101
 
102
  def upload_text(
 
105
  source: str,
106
  text: str,
107
  namespace: str,
108
+ metadata: Optional[Dict[str, Any]] = None,
109
+ api_key: Optional[str] = None,
110
  ) -> Dict[str, Any]:
111
  url = f"{backend_url.rstrip('/')}/documents/upload-text"
112
  payload = {
 
116
  "namespace": namespace,
117
  "metadata": metadata or {},
118
  }
119
+ headers: Dict[str, str] = {"Content-Type": "application/json"}
120
+ if api_key:
121
+ headers["X-API-Key"] = api_key
122
+
123
  with httpx.Client(timeout=60.0) as client:
124
+ response = client.post(url, json=payload, headers=headers)
125
  response.raise_for_status()
126
  return response.json()
127
 
128
 
129
  def main() -> int:
130
  args = parse_args()
131
+ file_path = Path(args.file_path).expanduser().resolve()
132
+ if not file_path.is_file():
133
+ raise SystemExit(f"File not found: {file_path}")
134
+
135
+ title = args.title or file_path.name
136
 
137
+ print(f"Converting file at {file_path}...")
138
+ try:
139
+ text = convert_file_to_text(file_path)
140
+ except Exception as exc: # noqa: BLE001
141
+ print(f"Error converting file: {exc}")
142
+ return 1
143
 
144
  print(
145
  f"Uploading converted text to backend at {args.backend_url} "
 
149
  backend_url=args.backend_url,
150
  title=title,
151
  source=args.source,
152
+ text=text,
153
  namespace=args.namespace,
154
+ metadata={"original_path": str(file_path), "extension": file_path.suffix.lower()},
155
+ api_key=args.api_key,
156
  )
157
 
158
  print("Upload response:")