feat(agent): switch AI backend from OpenRouter to HF Inference Providers
The chat panel and embed studio now call Hugging Face Inference Providers
(OpenAI-compatible router at https://router.huggingface.co/v1) via
@ai-sdk/openai-compatible, using the logged-in editor's OAuth token when
available and falling back to HF_TOKEN. Forking the Space no longer
requires wiring an OpenRouter API key - the user's own HF token funds
their own inference.
- Replace @openrouter/ai-sdk-provider with @ai-sdk/openai-compatible
- Forward the OAuth cookie token per request, falling back to HF_TOKEN
- Refresh AVAILABLE_MODELS to HF-served, tool-calling-capable models
  (gpt-oss 120B/20B, Llama 3.3 70B, Qwen3 Coder 480B, DeepSeek V3.1)
- Add inference-api to hf_oauth_scopes and document the new env vars
  (HF_INFERENCE_MODEL, expanded HF_TOKEN role) in README + SPECIFICATION
- Update agent-chat tests to expect the new "Hugging Face token" error
Co-authored-by: Cursor <cursoragent@cursor.com>
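
For orientation, here is a minimal sketch of what a request to the OpenAI-compatible router looks like once a token has been resolved. The helper name `buildChatRequest` and the `ChatMessage`/`ChatRequest` types are illustrative and not part of this commit, which goes through `@ai-sdk/openai-compatible` instead. The `/chat/completions` path is assumed from the usual OpenAI-style layout under the `/v1` base URL.

```typescript
// Illustrative only: the commit itself delegates this to @ai-sdk/openai-compatible.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

const HF_ROUTER_BASE_URL = "https://router.huggingface.co/v1";

// Build (but do not send) a streaming chat-completions request.
// `token` is either the editor's OAuth token or the HF_TOKEN fallback.
function buildChatRequest(
  token: string,
  model: string,
  messages: ChatMessage[],
): ChatRequest {
  return {
    // Assumed path: OpenAI-style chat completions under the /v1 base URL.
    url: `${HF_ROUTER_BASE_URL}/chat/completions`,
    method: "POST",
    headers: {
      Authorization: `Bearer ${token}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ model, messages, stream: true }),
  };
}
```

The returned object can be handed to any HTTP client; keeping construction separate from sending makes the bearer-token forwarding easy to unit test.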
- README.md +11 -11
- backend/.env.example +24 -17
- backend/package-lock.json +0 -14
- backend/package.json +0 -1
- backend/src/agent/chat.ts +17 -6
- backend/src/agent/stream-handler.ts +55 -12
- backend/tests/agent-chat.test.ts +11 -11
- docs/SPECIFICATION.md +8 -8
|
@@ -9,6 +9,7 @@ pinned: false
|
|
| 9 |
hf_oauth: true
|
| 10 |
hf_oauth_scopes:
|
| 11 |
- manage-repos
|
|
|
|
| 12 |
---
|
| 13 |
|
| 14 |
# Research Article Template Editor
|
|
@@ -41,7 +42,7 @@ A collaborative, real-time editor for web-native scientific articles. It lets mu
|
|
| 41 |
| Collaboration | Y.js, Hocuspocus (WebSocket), y-tiptap |
|
| 42 |
| Backend | Node.js, Express, Vite (dev proxy), Hocuspocus server |
|
| 43 |
| Publishing | Custom TipTap-JSON → HTML renderer, Puppeteer for PDF |
|
| 44 |
-
| AI | Vercel AI SDK v6 (`ai`, `@ai-sdk/react`)
|
| 45 |
| Styling | Plain CSS with custom properties, no framework |
|
| 46 |
| Storage | Local FS or Hugging Face datasets (via `@huggingface/hub`) |
|
| 47 |
| Container | Single-image Docker build, runs on port 8080 |
|
|
@@ -83,7 +84,7 @@ See [`docs/ARCHITECTURE.md`](docs/ARCHITECTURE.md) for a diagram and the full to
|
|
| 83 |
### Prerequisites
|
| 84 |
|
| 85 |
- Node.js 20+
|
| 86 |
-
-
|
| 87 |
- A Hugging Face OAuth app (client id/secret) if you want login + HF dataset persistence
|
| 88 |
|
| 89 |
### Local development
|
|
@@ -93,7 +94,7 @@ Backend and frontend run as two separate processes in dev (Vite proxies `/api`,
|
|
| 93 |
```bash
|
| 94 |
# terminal 1 — backend (Express + Hocuspocus on :8080)
|
| 95 |
cd backend
|
| 96 |
-
cp .env.example .env #
|
| 97 |
npm install
|
| 98 |
npm run dev
|
| 99 |
|
|
@@ -118,13 +119,13 @@ Then open http://localhost:8080.
|
|
| 118 |
|
| 119 |
### Run your own copy on a Hugging Face Space
|
| 120 |
|
| 121 |
-
Want your own editor
|
| 122 |
|
| 123 |
1. **Duplicate the Space.** On https://huggingface.co/spaces/tfrere/research-article-template-editor, click `⋯ → Duplicate this Space`. Pick your namespace and visibility. HF copies the Dockerfile, the OAuth wiring and rebuilds the image automatically.
|
| 124 |
-
2. **Get an OpenRouter API key.** Sign up at https://openrouter.ai and create a key under https://openrouter.ai/keys. The chat agent and the embed studio call OpenRouter through the [Vercel AI SDK](https://ai-sdk.dev/), so any model exposed by OpenRouter works (defaults to `anthropic/claude-sonnet-4`).
|
| 125 |
-
3. **Add the key as a Space secret.** In your duplicated Space, go to `Settings → Variables and secrets → New secret`, name it `OPENROUTER_API_KEY` and paste the value. Optional: add `OPENROUTER_MODEL` as a public variable to override the default model. Save - the Space restarts and the AI features light up.
|
| 126 |
|
| 127 |
-
That's it.
|
| 128 |
|
| 129 |
## Scripts
|
| 130 |
|
|
@@ -155,10 +156,9 @@ Copy `backend/.env.example` to `backend/.env` and fill the relevant values. Key
|
|
| 155 |
| Variable | Purpose |
|
| 156 |
|---|---|
|
| 157 |
| `OAUTH_CLIENT_ID` / `OAUTH_CLIENT_SECRET` | HF OAuth app for user login (required to edit when running on a Space) |
|
| 158 |
-
| `OAUTH_SCOPES` | OAuth scopes (default `openid profile`) |
|
| 159 |
-
| `
|
| 160 |
-
| `
|
| 161 |
-
| `HF_TOKEN` | Server-side Hugging Face token (fallback when no user OAuth token is present) |
|
| 162 |
| `HF_DATASET_ID` | Target HF dataset repo for document persistence (when not running on a Space) |
|
| 163 |
| `SPACE_ID` / `SPACE_HOST` | Auto-set by HF Spaces; drive dataset id + secure cookies in production |
|
| 164 |
| `DATA_DIR` | Where documents, uploads and published bundles are stored on disk (default: `./data`) |
|
|
|
|
| 9 |
hf_oauth: true
|
| 10 |
hf_oauth_scopes:
|
| 11 |
- manage-repos
|
| 12 |
+
- inference-api
|
| 13 |
---
|
| 14 |
|
| 15 |
# Research Article Template Editor
|
|
|
|
| 42 |
| Collaboration | Y.js, Hocuspocus (WebSocket), y-tiptap |
|
| 43 |
| Backend | Node.js, Express, Vite (dev proxy), Hocuspocus server |
|
| 44 |
| Publishing | Custom TipTap-JSON → HTML renderer, Puppeteer for PDF |
|
| 45 |
+
| AI | Vercel AI SDK v6 (`ai`, `@ai-sdk/react`) → Hugging Face Inference Providers (OpenAI-compatible router) |
|
| 46 |
| Styling | Plain CSS with custom properties, no framework |
|
| 47 |
| Storage | Local FS or Hugging Face datasets (via `@huggingface/hub`) |
|
| 48 |
| Container | Single-image Docker build, runs on port 8080 |
|
|
|
|
| 84 |
### Prerequisites
|
| 85 |
|
| 86 |
- Node.js 20+
|
| 87 |
+
- A Hugging Face token with the `Make calls to Inference Providers` permission for the AI features (embed studio, chat agent). Generate one at https://huggingface.co/settings/tokens. On a HF Space the logged-in user's OAuth token is used instead - no manual setup needed.
|
| 88 |
- A Hugging Face OAuth app (client id/secret) if you want login + HF dataset persistence
|
| 89 |
|
| 90 |
### Local development
|
|
|
|
| 94 |
```bash
|
| 95 |
# terminal 1 — backend (Express + Hocuspocus on :8080)
|
| 96 |
cd backend
|
| 97 |
+
cp .env.example .env # set HF_TOKEN, optional OAUTH_* and HF_DATASET_ID
|
| 98 |
npm install
|
| 99 |
npm run dev
|
| 100 |
|
|
|
|
| 119 |
|
| 120 |
### Run your own copy on a Hugging Face Space
|
| 121 |
|
| 122 |
+
Want your own editor? One step:
|
| 123 |
|
| 124 |
1. **Duplicate the Space.** On https://huggingface.co/spaces/tfrere/research-article-template-editor, click `⋯ → Duplicate this Space`. Pick your namespace and visibility. HF copies the Dockerfile, the OAuth wiring and rebuilds the image automatically.
|
| 125 |
|
| 126 |
+
That's it. No API key to wire up. The AI features (chat agent + embed studio) call **Hugging Face Inference Providers** at `https://router.huggingface.co/v1` using the OAuth token of whoever is currently logged in. As long as your duplicated Space requests the `inference-api` scope (already declared in the README frontmatter under `hf_oauth_scopes`), each editor's AI usage is charged to their own Inference Providers quota, at no cost to the Space owner.
|
| 127 |
+
|
| 128 |
+
Optional public variable: `HF_INFERENCE_MODEL` (e.g. `meta-llama/Llama-3.3-70B-Instruct`) to override the default model id. The full list of supported chat-completion models lives at https://huggingface.co/models?inference_provider=all&other=conversational.
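
The override order can be sketched as a small pure function. This mirrors the expression used in `stream-handler.ts` (`model || process.env.HF_INFERENCE_MODEL || DEFAULT_MODEL`); the function name `resolveModelId` and the `Env` type are illustrative, and the environment is passed in explicitly so the sketch stays testable.

```typescript
const DEFAULT_MODEL = "openai/gpt-oss-120b";

// Stand-in for process.env so the function has no global dependency.
type Env = Record<string, string | undefined>;

// Priority: model sent with the request, then HF_INFERENCE_MODEL, then the default.
// `||` also skips empty strings, so HF_INFERENCE_MODEL="" behaves like unset.
function resolveModelId(requested: string | undefined, env: Env): string {
  return requested || env.HF_INFERENCE_MODEL || DEFAULT_MODEL;
}
```

For example, `resolveModelId(undefined, { HF_INFERENCE_MODEL: "meta-llama/Llama-3.3-70B-Instruct" })` yields the Llama id, while a model chosen in the UI picker still wins over the variable.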
|
| 129 |
|
| 130 |
## Scripts
|
| 131 |
|
|
|
|
| 156 |
| Variable | Purpose |
|
| 157 |
|---|---|
|
| 158 |
| `OAUTH_CLIENT_ID` / `OAUTH_CLIENT_SECRET` | HF OAuth app for user login (required to edit when running on a Space) |
|
| 159 |
+
| `OAUTH_SCOPES` | OAuth scopes (default `openid profile`). Add `manage-repos` for dataset persistence and `inference-api` to power the AI features with the user's token |
|
| 160 |
+
| `HF_TOKEN` | Server-side Hugging Face token. Used as a fallback when no user OAuth token is present (e.g. local dev). Needs the `Make calls to Inference Providers` permission to enable the chat agent + embed studio |
|
| 161 |
+
| `HF_INFERENCE_MODEL` | Override the default chat-completion model id (defaults to `openai/gpt-oss-120b`). Any tool-calling-capable model exposed by HF Inference Providers works |
|
|
|
|
| 162 |
| `HF_DATASET_ID` | Target HF dataset repo for document persistence (when not running on a Space) |
|
| 163 |
| `SPACE_ID` / `SPACE_HOST` | Auto-set by HF Spaces; drive dataset id + secure cookies in production |
|
| 164 |
| `DATA_DIR` | Where documents, uploads and published bundles are stored on disk (default: `./data`) |
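
As a rough illustration of how the `OAUTH_SCOPES` row maps to features, the sketch below parses the space-separated value and reports which capabilities the user's token would unlock. The names `parseScopes` and `featuresForScopes` are hypothetical and do not exist in the repository; only the scope names and the `openid profile` default come from the table above.

```typescript
const DEFAULT_SCOPES = "openid profile";

// Split the space-separated OAUTH_SCOPES value, falling back to the default.
function parseScopes(raw: string | undefined): Set<string> {
  const value = raw && raw.trim() !== "" ? raw : DEFAULT_SCOPES;
  return new Set(value.split(/\s+/).filter((s) => s.length > 0));
}

interface ScopeFeatures {
  datasetPersistence: boolean; // needs "manage-repos"
  aiWithUserToken: boolean; // needs "inference-api"
}

// Hypothetical helper: which features can run on the logged-in user's token.
function featuresForScopes(raw: string | undefined): ScopeFeatures {
  const scopes = parseScopes(raw);
  return {
    datasetPersistence: scopes.has("manage-repos"),
    aiWithUserToken: scopes.has("inference-api"),
  };
}
```

With the default scopes both flags are false, which matches the login-only behaviour described for an unset `OAUTH_SCOPES`; AI features can then still work through the server-side `HF_TOKEN` fallback.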
|
|
@@ -35,10 +35,13 @@
|
|
| 35 |
OAUTH_CLIENT_ID=
|
| 36 |
OAUTH_CLIENT_SECRET=
|
| 37 |
|
| 38 |
-
# Space-scoped OAuth
|
| 39 |
-
#
|
| 40 |
-
#
|
| 41 |
-
#
|
| 42 |
|
| 43 |
# -----------------------------------------------------------------------------
|
| 44 |
# HF Space context (auto-injected by HF Spaces, set manually for local dev)
|
|
@@ -64,24 +67,28 @@ OAUTH_CLIENT_SECRET=
|
|
| 64 |
# HF_DATASET_ID=
|
| 65 |
|
| 66 |
# Server-side fallback HF token. Used when no user OAuth token is present yet
|
| 67 |
-
# (e.g. before the first login
|
| 68 |
-
#
|
| 69 |
-
#
|
| 70 |
# HF_TOKEN=
|
| 71 |
|
| 72 |
# -----------------------------------------------------------------------------
|
| 73 |
# AI features (chat panel + embed studio)
|
| 74 |
# -----------------------------------------------------------------------------
|
| 75 |
-
#
|
| 76 |
-
#
|
| 77 |
-
#
|
| 78 |
-
|
| 79 |
-
|
| 80 |
-
|
| 81 |
-
#
|
| 82 |
-
#
|
| 83 |
-
#
|
| 84 |
-
# OPENROUTER_MODEL=anthropic/claude-sonnet-4
|
| 85 |
|
| 86 |
# -----------------------------------------------------------------------------
|
| 87 |
# Publishing
|
|
|
|
| 35 |
OAUTH_CLIENT_ID=
|
| 36 |
OAUTH_CLIENT_SECRET=
|
| 37 |
|
| 38 |
+
# Space-scoped OAuth needs:
|
| 39 |
+
# - "manage-repos" to read/write the dataset that backs persistence
|
| 40 |
+
# - "inference-api" so the user's OAuth token can call HF Inference
|
| 41 |
+
# Providers (powers the chat panel + embed studio)
|
| 42 |
+
# Defaults to "openid profile" when unset, which is enough for login-only
|
| 43 |
+
# flows but disables AI features and dataset persistence.
|
| 44 |
+
# OAUTH_SCOPES=openid profile manage-repos inference-api
|
| 45 |
|
| 46 |
# -----------------------------------------------------------------------------
|
| 47 |
# HF Space context (auto-injected by HF Spaces, set manually for local dev)
|
|
|
|
| 67 |
# HF_DATASET_ID=
|
| 68 |
|
| 69 |
# Server-side fallback HF token. Used when no user OAuth token is present yet
|
| 70 |
+
# (e.g. before the first login, or during local dev without OAuth).
|
| 71 |
+
#
|
| 72 |
+
# The chat panel and embed studio call Hugging Face Inference Providers
|
| 73 |
+
# (https://router.huggingface.co/v1) with this token when no OAuth token is
|
| 74 |
+
# available. Generate one at https://huggingface.co/settings/tokens with
|
| 75 |
+
# "Write" scope (or a fine-grained token with both repo + inference
|
| 76 |
+
# permissions). Optional on a HF Space with OAuth configured - the logged-in
|
| 77 |
+
# user's token is used instead.
|
| 78 |
# HF_TOKEN=
|
| 79 |
|
| 80 |
# -----------------------------------------------------------------------------
|
| 81 |
# AI features (chat panel + embed studio)
|
| 82 |
# -----------------------------------------------------------------------------
|
| 83 |
+
# The AI assistant calls Hugging Face Inference Providers with either the
|
| 84 |
+
# logged-in user's OAuth token or HF_TOKEN above. No extra API key needed -
|
| 85 |
+
# this is the whole point of moving off OpenRouter.
|
| 86 |
+
|
| 87 |
+
# Override the default model id used by the chat agent. The list of
|
| 88 |
+
# supported models is in backend/src/agent/chat.ts (AVAILABLE_MODELS), but
|
| 89 |
+
# any model exposed by HF Inference Providers with tool-calling support
|
| 90 |
+
# works. Defaults to "openai/gpt-oss-120b".
|
| 91 |
+
# HF_INFERENCE_MODEL=openai/gpt-oss-120b
|
|
|
|
| 92 |
|
| 93 |
# -----------------------------------------------------------------------------
|
| 94 |
# Publishing
|
|
@@ -19,7 +19,6 @@
|
|
| 19 |
"@hocuspocus/server": "^3.4.4",
|
| 20 |
"@hocuspocus/transformer": "^3.4.4",
|
| 21 |
"@huggingface/hub": "^2.11.0",
|
| 22 |
-
"@openrouter/ai-sdk-provider": "^2.5.1",
|
| 23 |
"@tiptap/core": "^3.22.3",
|
| 24 |
"@tiptap/extension-image": "^3.22.3",
|
| 25 |
"@tiptap/extension-link": "^3.22.3",
|
|
@@ -817,19 +816,6 @@
|
|
| 817 |
"url": "https://paulmillr.com/funding/"
|
| 818 |
}
|
| 819 |
},
|
| 820 |
-
"node_modules/@openrouter/ai-sdk-provider": {
|
| 821 |
-
"version": "2.5.1",
|
| 822 |
-
"resolved": "https://registry.npmjs.org/@openrouter/ai-sdk-provider/-/ai-sdk-provider-2.5.1.tgz",
|
| 823 |
-
"integrity": "sha512-r1fJL1Cb3gQDa2MpWH/sfx1BsEW0uzlRriJM6eihaKqbtKDmZoBisF32VcVaQYassighX7NGCkF68EsrZA43uQ==",
|
| 824 |
-
"license": "Apache-2.0",
|
| 825 |
-
"engines": {
|
| 826 |
-
"node": ">=18"
|
| 827 |
-
},
|
| 828 |
-
"peerDependencies": {
|
| 829 |
-
"ai": "^6.0.0",
|
| 830 |
-
"zod": "^3.25.0 || ^4.0.0"
|
| 831 |
-
}
|
| 832 |
-
},
|
| 833 |
"node_modules/@opentelemetry/api": {
|
| 834 |
"version": "1.9.0",
|
| 835 |
"resolved": "https://registry.npmjs.org/@opentelemetry/api/-/api-1.9.0.tgz",
|
|
|
|
| 19 |
"@hocuspocus/server": "^3.4.4",
|
| 20 |
"@hocuspocus/transformer": "^3.4.4",
|
| 21 |
"@huggingface/hub": "^2.11.0",
|
|
|
|
| 22 |
"@tiptap/core": "^3.22.3",
|
| 23 |
"@tiptap/extension-image": "^3.22.3",
|
| 24 |
"@tiptap/extension-link": "^3.22.3",
|
|
|
|
| 816 |
"url": "https://paulmillr.com/funding/"
|
| 817 |
}
|
| 818 |
},
|
| 819 |
"node_modules/@opentelemetry/api": {
|
| 820 |
"version": "1.9.0",
|
| 821 |
"resolved": "https://registry.npmjs.org/@opentelemetry/api/-/api-1.9.0.tgz",
|
|
@@ -23,7 +23,6 @@
|
|
| 23 |
"@hocuspocus/server": "^3.4.4",
|
| 24 |
"@hocuspocus/transformer": "^3.4.4",
|
| 25 |
"@huggingface/hub": "^2.11.0",
|
| 26 |
-
"@openrouter/ai-sdk-provider": "^2.5.1",
|
| 27 |
"@tiptap/core": "^3.22.3",
|
| 28 |
"@tiptap/extension-image": "^3.22.3",
|
| 29 |
"@tiptap/extension-link": "^3.22.3",
|
|
|
|
| 23 |
"@hocuspocus/server": "^3.4.4",
|
| 24 |
"@hocuspocus/transformer": "^3.4.4",
|
| 25 |
"@huggingface/hub": "^2.11.0",
|
|
|
|
| 26 |
"@tiptap/core": "^3.22.3",
|
| 27 |
"@tiptap/extension-image": "^3.22.3",
|
| 28 |
"@tiptap/extension-link": "^3.22.3",
|
|
@@ -3,13 +3,24 @@ import { SYSTEM_PROMPT, buildMessages } from "./system-prompt.js";
|
|
| 3 |
import { streamChatResponse } from "./stream-handler.js";
|
| 4 |
import type { Request, Response } from "express";
|
| 5 |
|
| 6 |
export const AVAILABLE_MODELS = [
|
| 7 |
-
{ id: "
|
| 8 |
-
{ id: "
|
| 9 |
-
{ id: "
|
| 10 |
-
{ id: "
|
| 11 |
-
{ id: "
|
| 12 |
-
{ id: "openai/gpt-4.1", label: "GPT-4.1", context: "1M", cost: "$$" },
|
| 13 |
];
|
| 14 |
|
| 15 |
export async function handleChat(req: Request, res: Response) {
|
|
|
|
| 3 |
import { streamChatResponse } from "./stream-handler.js";
|
| 4 |
import type { Request, Response } from "express";
|
| 5 |
|
| 6 |
+
/**
|
| 7 |
+
* Models exposed in the UI picker. All ids must be served by Hugging
|
| 8 |
+
* Face Inference Providers (`https://router.huggingface.co/v1`) and
|
| 9 |
+
* support function/tool calling - the agent loop won't work without it.
|
| 10 |
+
*
|
| 11 |
+
* Discover more conversational models here:
|
| 12 |
+
* https://huggingface.co/models?inference_provider=all&other=conversational
|
| 13 |
+
*
|
| 14 |
+
* `context` is the advertised context window; `cost` is a rough
|
| 15 |
+
* relative price tag ($, $$, $$$) - inference providers charge their
|
| 16 |
+
* own rates, see the docs for the source of truth.
|
| 17 |
+
*/
|
| 18 |
export const AVAILABLE_MODELS = [
|
| 19 |
+
{ id: "openai/gpt-oss-120b", label: "GPT-OSS 120B", context: "131K", cost: "$$" },
|
| 20 |
+
{ id: "openai/gpt-oss-20b", label: "GPT-OSS 20B", context: "131K", cost: "$" },
|
| 21 |
+
{ id: "meta-llama/Llama-3.3-70B-Instruct", label: "Llama 3.3 70B", context: "128K", cost: "$" },
|
| 22 |
+
{ id: "Qwen/Qwen3-Coder-480B-A35B-Instruct", label: "Qwen3 Coder 480B", context: "262K", cost: "$$" },
|
| 23 |
+
{ id: "deepseek-ai/DeepSeek-V3.1", label: "DeepSeek V3.1", context: "128K", cost: "$$" },
|
|
|
|
| 24 |
];
|
| 25 |
|
| 26 |
export async function handleChat(req: Request, res: Response) {
|
|
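
The `AVAILABLE_MODELS` entries in `chat.ts` above are plain object literals. One way to give them a type and look up display metadata is sketched below; `ModelOption` and `describeModel` are illustrative names, not code from this commit, and only two of the five entries are repeated here. Ids outside the list are passed through unchanged, since the commit states that any tool-calling-capable model served by HF Inference Providers works.

```typescript
interface ModelOption {
  id: string;
  label: string;
  context: string; // advertised context window
  cost: "$" | "$$" | "$$$"; // rough relative price tag
}

// Subset of the picker list, copied from the diff for the sake of the example.
const AVAILABLE_MODELS: readonly ModelOption[] = [
  { id: "openai/gpt-oss-120b", label: "GPT-OSS 120B", context: "131K", cost: "$$" },
  { id: "openai/gpt-oss-20b", label: "GPT-OSS 20B", context: "131K", cost: "$" },
];

// Human-readable description for a model id; unknown ids are returned as-is.
function describeModel(id: string): string {
  const known = AVAILABLE_MODELS.find((m) => m.id === id);
  return known ? `${known.label} (${known.context} context, ${known.cost})` : id;
}
```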
@@ -1,15 +1,47 @@
|
|
| 1 |
import { streamText, convertToModelMessages } from "ai";
|
| 2 |
-
import {
|
| 3 |
import type { Request, Response } from "express";
|
|
|
|
| 4 |
|
| 5 |
-
export const DEFAULT_MODEL = "
|
| 6 |
|
| 7 |
-
|
| 8 |
-
|
| 9 |
-
|
| 10 |
-
|
| 11 |
-
|
| 12 |
-
|
| 13 |
}
|
| 14 |
|
| 15 |
interface StreamChatOptions {
|
|
@@ -24,19 +56,30 @@ export async function streamChatResponse(
|
|
| 24 |
{ systemPrompt, tools, logPrefix }: StreamChatOptions,
|
| 25 |
) {
|
| 26 |
try {
|
| 27 |
-
const { messages,
|
| 28 |
|
| 29 |
if (!messages || !Array.isArray(messages)) {
|
| 30 |
res.status(400).json({ error: "messages array is required" });
|
| 31 |
return;
|
| 32 |
}
|
| 33 |
|
| 34 |
-
const
|
| 35 |
-
|
| 36 |
const modelMessages = await convertToModelMessages(messages);
|
| 37 |
|
| 38 |
const result = streamText({
|
| 39 |
-
model: provider.
|
| 40 |
system: systemPrompt,
|
| 41 |
messages: modelMessages,
|
| 42 |
tools,
|
|
|
|
| 1 |
import { streamText, convertToModelMessages } from "ai";
|
| 2 |
+
import { createOpenAICompatible } from "@ai-sdk/openai-compatible";
|
| 3 |
import type { Request, Response } from "express";
|
| 4 |
+
import { extractToken } from "../auth.js";
|
| 5 |
|
| 6 |
+
export const DEFAULT_MODEL = "openai/gpt-oss-120b";
|
| 7 |
|
| 8 |
+
/**
|
| 9 |
+
* Hugging Face Inference Providers exposes an OpenAI-compatible chat
|
| 10 |
+
* completions endpoint at `https://router.huggingface.co/v1` that routes
|
| 11 |
+
* to a fleet of providers (Cerebras, Together, Fireworks, ...). The
|
| 12 |
+
* upside: any HF user token with the `inference-api` scope can call it,
|
| 13 |
+
* so a forked Space gets AI features for free as soon as the user logs
|
| 14 |
+
* in - no extra API key to wire up.
|
| 15 |
+
*
|
| 16 |
+
* See https://huggingface.co/docs/inference-providers
|
| 17 |
+
*/
|
| 18 |
+
const HF_INFERENCE_BASE_URL = "https://router.huggingface.co/v1";
|
| 19 |
+
|
| 20 |
+
/**
|
| 21 |
+
* Resolve the HF token used to authenticate inference calls.
|
| 22 |
+
*
|
| 23 |
+
* Priority:
|
| 24 |
+
* 1. The currently logged-in editor's OAuth token (forwarded from the
|
| 25 |
+
* `hf_access_token` cookie). This is the production path on a HF
|
| 26 |
+
* Space - no environment secret needed.
|
| 27 |
+
* 2. The `HF_TOKEN` env var fallback. Useful for local dev when OAuth
|
| 28 |
+
* isn't configured, or as a server-side default when the OAuth
|
| 29 |
+
* scope doesn't include `inference-api` yet.
|
| 30 |
+
*/
|
| 31 |
+
function resolveHfToken(req: Request): string | undefined {
|
| 32 |
+
const userToken = extractToken(req.headers.cookie);
|
| 33 |
+
if (userToken) return userToken;
|
| 34 |
+
const envToken = process.env.HF_TOKEN;
|
| 35 |
+
if (envToken) return envToken;
|
| 36 |
+
return undefined;
|
| 37 |
+
}
|
| 38 |
+
|
| 39 |
+
function createProvider(apiKey: string) {
|
| 40 |
+
return createOpenAICompatible({
|
| 41 |
+
name: "huggingface",
|
| 42 |
+
baseURL: HF_INFERENCE_BASE_URL,
|
| 43 |
+
apiKey,
|
| 44 |
+
});
|
| 45 |
}
|
| 46 |
|
| 47 |
interface StreamChatOptions {
|
|
|
|
| 56 |
{ systemPrompt, tools, logPrefix }: StreamChatOptions,
|
| 57 |
) {
|
| 58 |
try {
|
| 59 |
+
const { messages, model } = req.body;
|
| 60 |
|
| 61 |
if (!messages || !Array.isArray(messages)) {
|
| 62 |
res.status(400).json({ error: "messages array is required" });
|
| 63 |
return;
|
| 64 |
}
|
| 65 |
|
| 66 |
+
const apiKey = resolveHfToken(req);
|
| 67 |
+
if (!apiKey) {
|
| 68 |
+
res.status(500).json({
|
| 69 |
+
error:
|
| 70 |
+
"No Hugging Face token available. Sign in with your HF account " +
|
| 71 |
+
"(the OAuth token is used to call Inference Providers) or set " +
|
| 72 |
+
"HF_TOKEN in the backend environment.",
|
| 73 |
+
});
|
| 74 |
+
return;
|
| 75 |
+
}
|
| 76 |
+
|
| 77 |
+
const provider = createProvider(apiKey);
|
| 78 |
+
const modelId = model || process.env.HF_INFERENCE_MODEL || DEFAULT_MODEL;
|
| 79 |
const modelMessages = await convertToModelMessages(messages);
|
| 80 |
|
| 81 |
const result = streamText({
|
| 82 |
+
model: provider.chatModel(modelId),
|
| 83 |
system: systemPrompt,
|
| 84 |
messages: modelMessages,
|
| 85 |
tools,
|
|
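
`resolveHfToken` in `stream-handler.ts` above relies on `extractToken` from `../auth.js`, which this diff does not show. The sketch below is one plausible shape for that lookup, assuming the token sits in a cookie named `hf_access_token` as the doc comment says. Both function names here are hypothetical, and the real `extractToken` may differ, for example in how it decodes or validates the value.

```typescript
// Stand-in for process.env so the sketch has no global dependency.
type Env = Record<string, string | undefined>;

// Hypothetical cookie lookup: returns the named cookie's value, if present.
function extractTokenFromCookie(
  cookieHeader: string | undefined,
  name = "hf_access_token",
): string | undefined {
  if (!cookieHeader) return undefined;
  for (const part of cookieHeader.split(";")) {
    const eq = part.indexOf("=");
    if (eq === -1) continue;
    if (part.slice(0, eq).trim() !== name) continue;
    const value = part.slice(eq + 1).trim();
    if (value === "") return undefined;
    try {
      return decodeURIComponent(value);
    } catch {
      return value; // malformed percent-encoding: use the raw value
    }
  }
  return undefined;
}

// Same priority as resolveHfToken: user cookie first, then HF_TOKEN.
function resolveToken(cookieHeader: string | undefined, env: Env): string | undefined {
  return extractTokenFromCookie(cookieHeader) || env.HF_TOKEN || undefined;
}
```

Because the user's token is checked first, a logged-in editor's requests never fall through to the server-side token, which keeps usage attributed to the person making the call.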
@@ -3,7 +3,7 @@
|
|
| 3 |
*
|
| 4 |
* Tests for /api/chat and /api/embed-chat routes:
|
| 5 |
* - Input validation (missing messages)
|
| 6 |
-
* - Missing
|
| 7 |
* - Model list endpoint
|
| 8 |
*/
|
| 9 |
import { describe, it, expect, beforeEach, afterEach, vi } from "vitest";
|
|
@@ -71,9 +71,9 @@ describe("/api/chat - validation", () => {
|
|
| 71 |
expect(res.body).toHaveProperty("error");
|
| 72 |
});
|
| 73 |
|
| 74 |
-
it("returns 500 when
|
| 75 |
-
const original = process.env.
|
| 76 |
-
delete process.env.
|
| 77 |
|
| 78 |
const res = await request(app)
|
| 79 |
.post("/api/chat")
|
|
@@ -83,9 +83,9 @@ describe("/api/chat - validation", () => {
|
|
| 83 |
.expect(500);
|
| 84 |
|
| 85 |
expect(res.body).toHaveProperty("error");
|
| 86 |
-
expect(res.body.error).toContain("
|
| 87 |
|
| 88 |
-
if (original) process.env.
|
| 89 |
});
|
| 90 |
});
|
| 91 |
|
|
@@ -109,9 +109,9 @@ describe("/api/embed-chat - validation", () => {
|
|
| 109 |
expect(res.body).toHaveProperty("error");
|
| 110 |
});
|
| 111 |
|
| 112 |
-
it("returns 500 when
|
| 113 |
-
const original = process.env.
|
| 114 |
-
delete process.env.
|
| 115 |
|
| 116 |
const res = await request(app)
|
| 117 |
.post("/api/embed-chat")
|
|
@@ -121,8 +121,8 @@ describe("/api/embed-chat - validation", () => {
|
|
| 121 |
.expect(500);
|
| 122 |
|
| 123 |
expect(res.body).toHaveProperty("error");
|
| 124 |
-
expect(res.body.error).toContain("
|
| 125 |
|
| 126 |
-
if (original) process.env.
|
| 127 |
});
|
| 128 |
});
|
|
|
|
| 3 |
*
|
| 4 |
* Tests for /api/chat and /api/embed-chat routes:
|
| 5 |
* - Input validation (missing messages)
|
| 6 |
+
* - Missing HF token handling
|
| 7 |
* - Model list endpoint
|
| 8 |
*/
|
| 9 |
import { describe, it, expect, beforeEach, afterEach, vi } from "vitest";
|
|
|
|
| 71 |
expect(res.body).toHaveProperty("error");
|
| 72 |
});
|
| 73 |
|
| 74 |
+
it("returns 500 when no HF token is available", async () => {
|
| 75 |
+
const original = process.env.HF_TOKEN;
|
| 76 |
+
delete process.env.HF_TOKEN;
|
| 77 |
|
| 78 |
const res = await request(app)
|
| 79 |
.post("/api/chat")
|
|
|
|
| 83 |
.expect(500);
|
| 84 |
|
| 85 |
expect(res.body).toHaveProperty("error");
|
| 86 |
+
expect(res.body.error).toContain("Hugging Face token");
|
| 87 |
|
| 88 |
+
if (original) process.env.HF_TOKEN = original;
|
| 89 |
});
|
| 90 |
});
|
| 91 |
|
|
|
|
| 109 |
expect(res.body).toHaveProperty("error");
|
| 110 |
});
|
| 111 |
|
| 112 |
+
it("returns 500 when no HF token is available", async () => {
|
| 113 |
+
const original = process.env.HF_TOKEN;
|
| 114 |
+
delete process.env.HF_TOKEN;
|
| 115 |
|
| 116 |
const res = await request(app)
|
| 117 |
.post("/api/embed-chat")
|
|
|
|
| 121 |
.expect(500);
|
| 122 |
|
| 123 |
expect(res.body).toHaveProperty("error");
|
| 124 |
+
expect(res.body.error).toContain("Hugging Face token");
|
| 125 |
|
| 126 |
+
if (original) process.env.HF_TOKEN = original;
|
| 127 |
});
|
| 128 |
});
|
|
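
In the tests above, `HF_TOKEN` is restored by a statement at the end of each test body, so a failing assertion would skip the restore and leave the variable unset for later tests. A possible hardening is sketched below; `stashEnv` is a hypothetical helper, not part of this commit, and it takes the environment object as a parameter so it can be exercised without touching `process.env`.

```typescript
type Env = Record<string, string | undefined>;

// Remove `key` from `env` and return a function that puts back the exact prior state.
function stashEnv(env: Env, key: string): () => void {
  const had = key in env;
  const original = env[key];
  delete env[key];
  return () => {
    if (had && original !== undefined) {
      env[key] = original;
    } else {
      delete env[key]; // the key was absent before, so make it absent again
    }
  };
}
```

In a test this would be used as `const restore = stashEnv(process.env, "HF_TOKEN");` followed by a `try { ... } finally { restore(); }` block, or by calling `restore()` from an `afterEach` hook.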
@@ -127,8 +127,9 @@ flowchart LR
|
|
| 127 |
|
| 128 |
### 4.6 AI Agent
|
| 129 |
|
| 130 |
-
- Provider:
|
| 131 |
-
-
|
|
|
|
| 132 |
- **Context**: document text, current selection, frontmatter (sent by frontend with each message)
|
| 133 |
- **Tools** (declarative, executed client-side by the frontend):
|
| 134 |
- `replaceSelection` - replace selected text
|
|
@@ -265,7 +266,7 @@ The publisher reads these same CSS files server-side and injects them inline int
|
|
| 265 |
### 6.2 HF Space Configuration (README.md frontmatter)
|
| 266 |
|
| 267 |
- SDK: `docker`, port `8080`
|
| 268 |
-
- OAuth: `hf_oauth: true`, scopes: `manage-repos`
|
| 269 |
- Two git remotes: `space` (tfrere/collab-editor, dev) and `prod` (tfrere/research-article-template-editor, production)
|
| 270 |
|
| 271 |
### 6.3 Environment Variables
|
|
@@ -278,11 +279,10 @@ The publisher reads these same CSS files server-side and injects them inline int
|
|
| 278 |
| `SPACE_HOST` | For OAuth | HTTPS callback URL host |
|
| 279 |
| `OAUTH_CLIENT_ID` | For OAuth | HF OAuth client |
|
| 280 |
| `OAUTH_CLIENT_SECRET` | For OAuth | HF OAuth secret |
|
| 281 |
-
| `OAUTH_SCOPES` | No (default `openid profile`) | OAuth scopes |
|
| 282 |
| `HF_DATASET_ID` | No | Override dataset name (default: `{SPACE_ID}-data`) |
|
| 283 |
-
| `HF_TOKEN` |
|
| 284 |
-
| `
|
| 285 |
-
| `OPENROUTER_MODEL` | No | Default AI model |
|
| 286 |
| `ENABLE_PDF` | No (default true) | Toggle PDF/thumbnail generation |
|
| 287 |
|
| 288 |
### 6.4 Local Development
|
|
@@ -297,7 +297,7 @@ cd frontend && npm install && npm run dev
|
|
| 297 |
# Starts on http://localhost:5678 (proxies /api and /collab to :8080)
|
| 298 |
```
|
| 299 |
|
| 300 |
-
Create a `.env` file in `backend/` with at minimum `
|
| 301 |
|
| 302 |
---
|
| 303 |
|
|
|
|
| 127 |
|
| 128 |
### 4.6 AI Agent
|
| 129 |
|
| 130 |
+
- Provider: Hugging Face Inference Providers (`https://router.huggingface.co/v1`), default model `openai/gpt-oss-120b`
|
| 131 |
+
- Auth: per-request bearer token resolved from the editor's OAuth cookie when available, falling back to the server-side `HF_TOKEN`. On a HF Space with `inference-api` scope, no extra secret is needed - the logged-in user pays for their own inference under their HF quota.
|
| 132 |
+
- Streaming via Vercel AI SDK `streamText` over `@ai-sdk/openai-compatible`
|
| 133 |
- **Context**: document text, current selection, frontmatter (sent by frontend with each message)
|
| 134 |
- **Tools** (declarative, executed client-side by the frontend):
|
| 135 |
- `replaceSelection` - replace selected text
|
|
|
|
| 266 |
### 6.2 HF Space Configuration (README.md frontmatter)
|
| 267 |
|
| 268 |
- SDK: `docker`, port `8080`
|
| 269 |
+
- OAuth: `hf_oauth: true`, scopes: `manage-repos`, `inference-api`
|
| 270 |
- Two git remotes: `space` (tfrere/collab-editor, dev) and `prod` (tfrere/research-article-template-editor, production)
|
| 271 |
|
| 272 |
### 6.3 Environment Variables
|
|
|
|
| 279 |
| `SPACE_HOST` | For OAuth | HTTPS callback URL host |
|
| 280 |
| `OAUTH_CLIENT_ID` | For OAuth | HF OAuth client |
|
| 281 |
| `OAUTH_CLIENT_SECRET` | For OAuth | HF OAuth secret |
|
| 282 |
+
| `OAUTH_SCOPES` | No (default `openid profile`) | OAuth scopes. Add `manage-repos` for dataset persistence and `inference-api` to power AI features with the user's token |
|
| 283 |
| `HF_DATASET_ID` | No | Override dataset name (default: `{SPACE_ID}-data`) |
|
| 284 |
+
| `HF_TOKEN` | For AI chat in local dev | Fallback Hub token for HF API + Inference Providers. Needs the "Make calls to Inference Providers" permission |
|
| 285 |
+
| `HF_INFERENCE_MODEL` | No (default `openai/gpt-oss-120b`) | Default chat-completion model id served by HF Inference Providers |
|
|
|
|
| 286 |
| `ENABLE_PDF` | No (default true) | Toggle PDF/thumbnail generation |
|
| 287 |
|
| 288 |
### 6.4 Local Development
|
|
|
|
| 297 |
# Starts on http://localhost:5678 (proxies /api and /collab to :8080)
|
| 298 |
```
|
| 299 |
|
| 300 |
+
Create a `.env` file in `backend/` with at minimum `HF_TOKEN` for AI chat (must have the "Make calls to Inference Providers" permission). Without `SPACE_ID`, OAuth is disabled and all users can edit.
|
| 301 |
|
| 302 |
---
|
| 303 |
|