Add documentation
Introduces a new documentation structure under docs/source, covering configuration, installation (local, Docker, Helm), architecture, and common issues. Updates .env and README.md to clarify model discovery and LLM router configuration. This improves onboarding, setup, and operational guidance for developers and users.
- .env +1 -1
- README.md +1 -1
- docs/source/_toctree.yml +30 -0
- docs/source/configuration/common-issues.md +37 -0
- docs/source/configuration/llm-router.md +105 -0
- docs/source/configuration/mcp-tools.md +83 -0
- docs/source/configuration/metrics.md +9 -0
- docs/source/configuration/open-id.md +57 -0
- docs/source/configuration/overview.md +88 -0
- docs/source/configuration/theming.md +20 -0
- docs/source/developing/architecture.md +47 -0
- docs/source/index.md +52 -0
- docs/source/installation/docker.md +43 -0
- docs/source/installation/helm.md +43 -0
- docs/source/installation/local.md +62 -0
.env
CHANGED

```diff
@@ -54,7 +54,7 @@ TASK_MODEL=
 LLM_ROUTER_ARCH_BASE_URL=

 ## LLM Router Configuration
-# Path to routes policy (JSON array).
+# Path to routes policy (JSON array). Required when the router is enabled; must point to a valid JSON file.
 LLM_ROUTER_ROUTES_PATH=

 # Model used at the Arch router endpoint for selection
```
README.md
CHANGED

```diff
@@ -122,7 +122,7 @@ PUBLIC_APP_DATA_SHARING=

 ### Models

-
+Models are discovered from `${OPENAI_BASE_URL}/models`, and you can optionally override their metadata via the `MODELS` env var (JSON5). Legacy provider‑specific integrations and GGUF discovery are removed. Authorization uses `OPENAI_API_KEY` (preferred). `HF_TOKEN` remains a legacy alias.

 ### LLM Router (Optional)

```
docs/source/_toctree.yml
ADDED

```yaml
- local: index
  title: Chat UI
- title: Installation
  sections:
    - local: installation/local
      title: Local
    - local: installation/docker
      title: Docker
    - local: installation/helm
      title: Helm
- title: Configuration
  sections:
    - local: configuration/overview
      title: Overview
    - local: configuration/theming
      title: Theming
    - local: configuration/open-id
      title: OpenID
    - local: configuration/mcp-tools
      title: MCP Tools
    - local: configuration/llm-router
      title: LLM Router
    - local: configuration/metrics
      title: Metrics
    - local: configuration/common-issues
      title: Common Issues
- title: Developing
  sections:
    - local: developing/architecture
      title: Architecture
```
docs/source/configuration/common-issues.md
ADDED

# Common Issues

## 403: You don't have access to this conversation

This usually happens when running Chat UI over HTTP without proper cookie configuration.

**Recommended:** Set up a reverse proxy (NGINX, Caddy) to handle HTTPS.

**Alternative:** If you must run over HTTP, configure cookies:

```ini
COOKIE_SECURE=false
COOKIE_SAMESITE=lax
```

Also ensure `PUBLIC_ORIGIN` matches your actual URL:

```ini
PUBLIC_ORIGIN=http://localhost:5173
```

## Models not loading

If models aren't appearing in the UI:

1. Verify `OPENAI_BASE_URL` is correct and accessible
2. Check that `OPENAI_API_KEY` is valid
3. Ensure the endpoint returns models at `${OPENAI_BASE_URL}/models`

## Database connection errors

For development, you can skip MongoDB entirely - Chat UI will use an embedded database.

For production, verify:

- `MONGODB_URL` is a valid connection string
- Your IP is whitelisted (for MongoDB Atlas)
- The database user has read/write permissions
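The cookie guidance above boils down to a small decision rule based on the scheme of `PUBLIC_ORIGIN`. A minimal sketch (not Chat UI's actual implementation; the function name and the HTTPS defaults shown are assumptions for illustration):

```python
from urllib.parse import urlparse

def cookie_flags(public_origin: str) -> dict:
    """Suggest cookie settings based on the scheme of PUBLIC_ORIGIN.

    Over plain HTTP, browsers reject Secure cookies, so the session
    cookie must be sent with COOKIE_SECURE=false and a lax SameSite
    policy; over HTTPS, Secure cookies work as expected.
    """
    scheme = urlparse(public_origin).scheme
    if scheme == "http":
        return {"COOKIE_SECURE": "false", "COOKIE_SAMESITE": "lax"}
    # Assumed HTTPS defaults for the sketch.
    return {"COOKIE_SECURE": "true", "COOKIE_SAMESITE": "lax"}
```
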
docs/source/configuration/llm-router.md
ADDED

# LLM Router

Chat UI includes an intelligent routing system that automatically selects the best model for each request. When enabled, users see a virtual "Omni" model that routes to specialized models based on the conversation context.

The router uses [katanemo/Arch-Router-1.5B](https://huggingface.co/katanemo/Arch-Router-1.5B) for route selection.

## Configuration

### Basic Setup

```ini
# Arch router endpoint (OpenAI-compatible)
LLM_ROUTER_ARCH_BASE_URL=https://router.huggingface.co/v1
LLM_ROUTER_ARCH_MODEL=katanemo/Arch-Router-1.5B

# Path to your routes policy JSON
LLM_ROUTER_ROUTES_PATH=./config/routes.json
```

### Routes Policy

Create a JSON file defining your routes. Each route specifies a name, a description used for route selection, a primary model, and optional fallback models:

```json
[
  {
    "name": "coding",
    "description": "Programming, debugging, code review",
    "primary_model": "Qwen/Qwen3-Coder-480B-A35B-Instruct",
    "fallback_models": ["meta-llama/Llama-3.3-70B-Instruct"]
  },
  {
    "name": "casual_conversation",
    "description": "General chat, questions, explanations",
    "primary_model": "meta-llama/Llama-3.3-70B-Instruct"
  }
]
```

### Fallback Behavior

```ini
# Route to use when Arch returns "other"
LLM_ROUTER_OTHER_ROUTE=casual_conversation

# Model to use if Arch selection fails entirely
LLM_ROUTER_FALLBACK_MODEL=meta-llama/Llama-3.3-70B-Instruct

# Selection timeout (milliseconds)
LLM_ROUTER_ARCH_TIMEOUT_MS=10000
```

## Multimodal Routing

When a user sends an image, the router can bypass Arch and route directly to a vision model:

```ini
LLM_ROUTER_ENABLE_MULTIMODAL=true
LLM_ROUTER_MULTIMODAL_MODEL=meta-llama/Llama-3.2-90B-Vision-Instruct
```

## Tools Routing

When a user has MCP servers enabled, the router can automatically select a tools-capable model:

```ini
LLM_ROUTER_ENABLE_TOOLS=true
LLM_ROUTER_TOOLS_MODEL=meta-llama/Llama-3.3-70B-Instruct
```

## UI Customization

Customize how the router appears in the model selector:

```ini
PUBLIC_LLM_ROUTER_ALIAS_ID=omni
PUBLIC_LLM_ROUTER_DISPLAY_NAME=Omni
PUBLIC_LLM_ROUTER_LOGO_URL=https://example.com/logo.png
```

## How It Works

When a user selects Omni:

1. Chat UI sends the conversation context to the Arch router
2. Arch analyzes the content and returns a route name
3. Chat UI maps the route to the corresponding model
4. The request streams from the selected model
5. On errors, fallback models are tried in order

The route selection is displayed in the UI so users can see which model was chosen.

## Message Length Limits

To optimize router performance, message content is trimmed before sending to Arch:

```ini
# Max characters for assistant messages (default: 500)
LLM_ROUTER_MAX_ASSISTANT_LENGTH=500

# Max characters for previous user messages (default: 400)
LLM_ROUTER_MAX_PREV_USER_LENGTH=400
```

The latest user message is never trimmed.
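The trimming rule above can be sketched in a few lines (an illustration of the documented behavior, not Chat UI's actual code): assistant messages are capped at `LLM_ROUTER_MAX_ASSISTANT_LENGTH`, earlier user messages at `LLM_ROUTER_MAX_PREV_USER_LENGTH`, and the latest user message passes through untouched.

```python
def trim_for_router(messages, max_assistant=500, max_prev_user=400):
    """Trim message content before sending the context to Arch."""
    # Index of the latest user message, which is never trimmed.
    last_user = max(
        (i for i, m in enumerate(messages) if m["role"] == "user"),
        default=-1,
    )
    trimmed = []
    for i, m in enumerate(messages):
        content = m["content"]
        if m["role"] == "assistant":
            content = content[:max_assistant]
        elif m["role"] == "user" and i != last_user:
            content = content[:max_prev_user]
        trimmed.append({**m, "content": content})
    return trimmed
```
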
docs/source/configuration/mcp-tools.md
ADDED

# MCP Tools

Chat UI supports tool calling via the [Model Context Protocol (MCP)](https://modelcontextprotocol.io/). MCP servers expose tools that models can invoke during conversations.

## Server Types

Chat UI supports two types of MCP servers:

### Base Servers (Admin-configured)

Base servers are configured by the administrator via environment variables. They appear for all users and can be enabled/disabled per-user but not removed.

```ini
MCP_SERVERS=[
  {"name": "Web Search (Exa)", "url": "https://mcp.exa.ai/mcp"},
  {"name": "Hugging Face", "url": "https://hf.co/mcp"}
]
```

Each server entry requires:

- `name` - Display name shown in the UI
- `url` - MCP server endpoint URL
- `headers` (optional) - Custom headers for authentication
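A quick way to sanity-check the `MCP_SERVERS` value before deployment (a sketch, not part of Chat UI; it only enforces the required fields listed above):

```python
import json

def check_mcp_servers(raw: str) -> list[str]:
    """Parse an MCP_SERVERS JSON value and return the server names,
    raising if a required field (name, url) is missing."""
    servers = json.loads(raw)
    if not isinstance(servers, list):
        raise ValueError("MCP_SERVERS must be a JSON array")
    for server in servers:
        for field in ("name", "url"):
            if field not in server:
                raise ValueError(f"server entry {server} is missing '{field}'")
    return [s["name"] for s in servers]
```
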

### User Servers (Added from UI)

Users can add their own MCP servers directly from the UI:

1. Open the chat input and click the **+** button (or go to Settings)
2. Select **MCP Servers**
3. Click **Add Server**
4. Enter the server name and URL
5. Run **Health Check** to verify connectivity

User-added servers are stored in the browser and can be removed at any time. They work alongside base servers.

## User Token Forwarding

When users are logged in via Hugging Face, you can forward their access token to MCP servers:

```ini
MCP_FORWARD_HF_USER_TOKEN=true
```

This allows MCP servers to access user-specific resources on their behalf.

## Using Tools

1. Enable the servers you want to use from the MCP Servers panel
2. Start chatting - models will automatically use tools when appropriate

### Model Requirements

Not all models support tool calling. To enable tools for a specific model, add it to your `MODELS` override:

```ini
MODELS=`[
  {
    "id": "meta-llama/Llama-3.3-70B-Instruct",
    "supportsTools": true
  }
]`
```

## Tool Execution Flow

When a model decides to use a tool:

1. The model generates a tool call with parameters
2. Chat UI executes the call against the MCP server
3. Results are displayed in the chat as a collapsible "tool" block
4. Results are fed back to the model for follow-up responses

## Integration with LLM Router

When using the [LLM Router](./llm-router), you can configure automatic routing to a tools-capable model:

```ini
LLM_ROUTER_ENABLE_TOOLS=true
LLM_ROUTER_TOOLS_MODEL=meta-llama/Llama-3.3-70B-Instruct
```

When a user has MCP servers enabled and selects the Omni model, the router will automatically use the specified tools model.
docs/source/configuration/metrics.md
ADDED

# Metrics

The server can expose Prometheus metrics on port `5565`, but they are disabled by default. Enable the metrics server with `METRICS_ENABLED=true` and change the port with `METRICS_PORT=1234`.

<Tip>

In development with `npm run dev`, the metrics server does not shut down gracefully because SvelteKit provides no hooks for restarts. It's recommended to disable the metrics server in this case.

</Tip>
docs/source/configuration/open-id.md
ADDED

# OpenID

By default, users are attributed a unique ID based on their browser session. To authenticate users with OpenID Connect, configure the following:

```ini
OPENID_CLIENT_ID=your_client_id
OPENID_CLIENT_SECRET=your_client_secret
OPENID_SCOPES="openid profile"
```

Use the provider URL for standard OpenID Connect discovery:

```ini
OPENID_PROVIDER_URL=https://your-provider.com
```

Advanced: you can also provide a client metadata document via `OPENID_CONFIG`. This value must be a JSON/JSON5 object (for example, a CIMD document) and is parsed server‑side to populate OpenID settings.

**Redirect URI:** `https://your-domain.com/login/callback`

## Access Control

Restrict access to specific users:

```ini
# Allow only specific email addresses
ALLOWED_USER_EMAILS=["user@example.com", "admin@example.com"]

# Allow all users from specific domains
ALLOWED_USER_DOMAINS=["example.com", "company.org"]
```
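The two allow-lists combine as a simple check: a user is admitted if their email appears in `ALLOWED_USER_EMAILS` or their email's domain appears in `ALLOWED_USER_DOMAINS`. A sketch of that logic (not Chat UI's actual code; the empty-list behavior shown is an assumption):

```python
def is_allowed(email, allowed_emails, allowed_domains):
    """Admit a user if either allow-list matches."""
    # Assumption for the sketch: with no restrictions configured,
    # every authenticated user is admitted.
    if not allowed_emails and not allowed_domains:
        return True
    if email in allowed_emails:
        return True
    domain = email.rsplit("@", 1)[-1]
    return domain in allowed_domains
```
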

## Hugging Face Login

For Hugging Face authentication, you can use automatic client registration:

```ini
OPENID_CLIENT_ID=__CIMD__
```

This creates an OAuth app automatically when deployed. See the [CIMD spec](https://datatracker.ietf.org/doc/draft-ietf-oauth-client-id-metadata-document/) for details.

## User Token Forwarding

When users log in via Hugging Face, you can forward their token for inference:

```ini
USE_USER_TOKEN=true
```

## Auto-Login

Force authentication on all routes:

```ini
AUTOMATIC_LOGIN=true
```
ADDED
|
@@ -0,0 +1,88 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
# Configuration Overview
|
| 2 |
+
|
| 3 |
+
Chat UI is configured through environment variables. Default values are in `.env`; override them in `.env.local` or via your environment.
|
| 4 |
+
|
| 5 |
+
## Required Configuration
|
| 6 |
+
|
| 7 |
+
Chat UI connects to any OpenAI-compatible API endpoint:
|
| 8 |
+
|
| 9 |
+
```ini
|
| 10 |
+
OPENAI_BASE_URL=https://router.huggingface.co/v1
|
| 11 |
+
OPENAI_API_KEY=hf_************************
|
| 12 |
+
```
|
| 13 |
+
|
| 14 |
+
Models are automatically discovered from `${OPENAI_BASE_URL}/models`. No manual model configuration is required.
|
| 15 |
+
|
| 16 |
+
## Database
|
| 17 |
+
|
| 18 |
+
```ini
|
| 19 |
+
MONGODB_URL=mongodb://localhost:27017
|
| 20 |
+
MONGODB_DB_NAME=chat-ui
|
| 21 |
+
```
|
| 22 |
+
|
| 23 |
+
For development, `MONGODB_URL` is optional - Chat UI falls back to an embedded MongoDB that persists to `./db`.
|
| 24 |
+
|
| 25 |
+
## Model Overrides
|
| 26 |
+
|
| 27 |
+
To customize model behavior, use the `MODELS` environment variable (JSON5 format):
|
| 28 |
+
|
| 29 |
+
```ini
|
| 30 |
+
MODELS=`[
|
| 31 |
+
{
|
| 32 |
+
"id": "meta-llama/Llama-3.3-70B-Instruct",
|
| 33 |
+
"name": "Llama 3.3 70B",
|
| 34 |
+
"multimodal": false,
|
| 35 |
+
"supportsTools": true
|
| 36 |
+
}
|
| 37 |
+
]`
|
| 38 |
+
```
|
| 39 |
+
|
| 40 |
+
Override properties:
|
| 41 |
+
- `id` - Model identifier (must match an ID from the `/models` endpoint)
|
| 42 |
+
- `name` - Display name in the UI
|
| 43 |
+
- `multimodal` - Enable image uploads
|
| 44 |
+
- `supportsTools` - Enable MCP tool calling for models that don’t advertise tool support
|
| 45 |
+
- `parameters` - Override default parameters (temperature, max_tokens, etc.)
|
| 46 |
+
|
| 47 |
+
## Task Model
|
| 48 |
+
|
| 49 |
+
Set a specific model for internal tasks (title generation, etc.):
|
| 50 |
+
|
| 51 |
+
```ini
|
| 52 |
+
TASK_MODEL=meta-llama/Llama-3.1-8B-Instruct
|
| 53 |
+
```
|
| 54 |
+
|
| 55 |
+
If not set, the current conversation model is used.
|
| 56 |
+
|
| 57 |
+
## Voice Transcription
|
| 58 |
+
|
| 59 |
+
Enable voice input with Whisper:
|
| 60 |
+
|
| 61 |
+
```ini
|
| 62 |
+
TRANSCRIPTION_MODEL=openai/whisper-large-v3-turbo
|
| 63 |
+
TRANSCRIPTION_BASE_URL=https://router.huggingface.co/hf-inference/models
|
| 64 |
+
```
|
| 65 |
+
|
| 66 |
+
## Feature Flags
|
| 67 |
+
|
| 68 |
+
```ini
|
| 69 |
+
LLM_SUMMARIZATION=true # Enable automatic conversation title generation
|
| 70 |
+
ENABLE_DATA_EXPORT=true # Allow users to export their data
|
| 71 |
+
ALLOW_IFRAME=false # Disallow embedding in iframes (set to true to allow)
|
| 72 |
+
```
|
| 73 |
+
|
| 74 |
+
## User Authentication
|
| 75 |
+
|
| 76 |
+
Use OpenID Connect for authentication:
|
| 77 |
+
|
| 78 |
+
```ini
|
| 79 |
+
OPENID_CLIENT_ID=your_client_id
|
| 80 |
+
OPENID_CLIENT_SECRET=your_client_secret
|
| 81 |
+
OPENID_SCOPES="openid profile"
|
| 82 |
+
```
|
| 83 |
+
|
| 84 |
+
See [OpenID configuration](./open-id) for details.
|
| 85 |
+
|
| 86 |
+
## Environment Variable Reference
|
| 87 |
+
|
| 88 |
+
See the [`.env` file](https://github.com/huggingface/chat-ui/blob/main/.env) for the complete list of available options.
|
docs/source/configuration/theming.md
ADDED

# Theming

Customize the look and feel of Chat UI with these environment variables:

```ini
PUBLIC_APP_NAME=ChatUI
PUBLIC_APP_ASSETS=chatui
PUBLIC_APP_DESCRIPTION="Making the community's best AI chat models available to everyone."
```

- `PUBLIC_APP_NAME` - The name used as a title throughout the app
- `PUBLIC_APP_ASSETS` - Directory for logos & favicons in `static/$PUBLIC_APP_ASSETS`. Options: `chatui`, `huggingchat`
- `PUBLIC_APP_DESCRIPTION` - Description shown in meta tags and about sections

## Additional Options

```ini
PUBLIC_APP_DATA_SHARING=1               # Show data sharing opt-in toggle in settings
PUBLIC_ORIGIN=https://chat.example.com  # Your public URL (required for sharing)
```
docs/source/developing/architecture.md
ADDED

# Architecture

This document provides a high-level overview of the Chat UI codebase. If you're looking to contribute or understand how the codebase works, this is the place for you!

## Overview

Chat UI provides a simple interface connecting LLMs to external tools via MCP. The project uses [MongoDB](https://www.mongodb.com/) and [SvelteKit](https://kit.svelte.dev/) with [Tailwind](https://tailwindcss.com/).

Key architectural decisions:

- **OpenAI-compatible only**: All model interactions use the OpenAI API format
- **MCP for tools**: Tool calling is handled via Model Context Protocol servers
- **Auto-discovery**: Models are discovered from the `/models` endpoint

## Code Map

### `routes`

All routes are rendered with SSR via SvelteKit. The majority of backend and frontend logic lives here, with shared modules in `lib` (client) and `lib/server` (server).

### `textGeneration`

Provides a standard interface for chat features including model output, tool calls, and streaming. Outputs `MessageUpdate`s for fine-grained status updates (new tokens, tool results, etc.).

### `endpoints`

Provides the streaming interface for OpenAI-compatible endpoints. Models are fetched and cached from `${OPENAI_BASE_URL}/models`.

### `mcp`

Implements MCP client functionality for tool discovery and execution. See [MCP Tools](../configuration/mcp-tools) for configuration.

### `llmRouter`

Intelligent routing logic that selects the best model for each request. Uses the Arch router model for classification. See [LLM Router](../configuration/llm-router) for details.

### `migrations`

MongoDB migrations for maintaining backwards compatibility across schema changes. Any schema change must include a migration.

## Development

```bash
npm install
npm run dev
```

The dev server runs at `http://localhost:5173` with hot reloading.
docs/source/index.md
ADDED

# Chat UI

Open source chat interface with support for tools, multimodal inputs, and intelligent routing across models. The app uses MongoDB and SvelteKit behind the scenes. Try the live version called [HuggingChat on hf.co/chat](https://huggingface.co/chat) or [set up your own instance](./installation/local).

Chat UI connects to any OpenAI-compatible API endpoint, making it work with:

- [Hugging Face Inference Providers](https://huggingface.co/docs/inference-providers)
- [Ollama](https://ollama.ai)
- [llama.cpp](https://github.com/ggerganov/llama.cpp)
- [OpenRouter](https://openrouter.ai)
- Any other OpenAI-compatible service

**[MCP Tools](./configuration/mcp-tools)**: Function calling via Model Context Protocol (MCP) servers

**[LLM Router](./configuration/llm-router)**: Intelligent routing to select the best model for each request

**[Multimodal](./configuration/overview)**: Image uploads on models that support vision

**[OpenID](./configuration/open-id)**: Optional user authentication via OpenID Connect

## Quickstart

**Step 1 - Create `.env.local`:**

```ini
OPENAI_BASE_URL=https://router.huggingface.co/v1
OPENAI_API_KEY=hf_************************
```

You can use any OpenAI-compatible endpoint:

| Provider     | `OPENAI_BASE_URL`                  | `OPENAI_API_KEY` |
| ------------ | ---------------------------------- | ---------------- |
| Hugging Face | `https://router.huggingface.co/v1` | `hf_xxx`         |
| Ollama       | `http://127.0.0.1:11434/v1`        | `ollama`         |
| llama.cpp    | `http://127.0.0.1:8080/v1`         | `sk-local`       |
| OpenRouter   | `https://openrouter.ai/api/v1`     | `sk-or-v1-xxx`   |

**Step 2 - Install and run:**

```bash
git clone https://github.com/huggingface/chat-ui
cd chat-ui
npm install
npm run dev -- --open
```

That's it! Chat UI will automatically discover available models from your endpoint.

> [!TIP]
> MongoDB is optional for development. When `MONGODB_URL` is not set, Chat UI uses an embedded database that persists to `./db`.

For production deployments, see the [installation guides](./installation/local).
docs/source/installation/docker.md
ADDED

# Running on Docker

Pre-built Docker images are available:

- **`ghcr.io/huggingface/chat-ui-db`** - Includes MongoDB (recommended for quick setup)
- **`ghcr.io/huggingface/chat-ui`** - Requires external MongoDB

## Quick Start (with bundled MongoDB)

```bash
docker run -p 3000:3000 \
  -e OPENAI_BASE_URL=https://router.huggingface.co/v1 \
  -e OPENAI_API_KEY=hf_*** \
  -v chat-ui-data:/data \
  ghcr.io/huggingface/chat-ui-db
```

## With External MongoDB

If you have an existing MongoDB instance:

```bash
docker run -p 3000:3000 \
  -e OPENAI_BASE_URL=https://router.huggingface.co/v1 \
  -e OPENAI_API_KEY=hf_*** \
  -e MONGODB_URL=mongodb://host.docker.internal:27017 \
  ghcr.io/huggingface/chat-ui
```

Use `host.docker.internal` to reach MongoDB running on your host machine, or provide your MongoDB Atlas connection string.

## Using an Environment File

For more configuration options, use `--env-file` to avoid leaking secrets in shell history:

```bash
docker run -p 3000:3000 \
  --env-file .env.local \
  -v chat-ui-data:/data \
  ghcr.io/huggingface/chat-ui-db
```

See the [configuration overview](../configuration/overview) for all available environment variables.
docs/source/installation/helm.md
ADDED

# Helm

<Tip warning={true}>

The Helm chart is a work in progress and should be considered unstable. Breaking changes may be pushed without migration guides. Contributions welcome!

</Tip>

For Kubernetes deployment, use the Helm chart in `/chart`. No chart repository is published, so clone the repository and install by path.

## Installation

```bash
git clone https://github.com/huggingface/chat-ui
cd chat-ui
helm install chat-ui ./chart -f values.yaml
```

## Example values.yaml

```yaml
replicas: 1

domain: example.com

service:
  type: ClusterIP

resources:
  requests:
    cpu: 100m
    memory: 2Gi
  limits:
    cpu: "4"
    memory: 6Gi

envVars:
  OPENAI_BASE_URL: https://router.huggingface.co/v1
  OPENAI_API_KEY: hf_***
  MONGODB_URL: mongodb://chat-ui-mongo:27017
```

See the [configuration overview](../configuration/overview) for all available environment variables.
docs/source/installation/local.md
ADDED

# Running Locally

## Quick Start

1. Create a `.env.local` file with your API credentials:

```ini
OPENAI_BASE_URL=https://router.huggingface.co/v1
OPENAI_API_KEY=hf_************************
```

2. Install and run:

```bash
npm install
npm run dev -- --open
```

That's it! Chat UI will discover available models automatically from your endpoint.

## Configuration

Chat UI connects to any OpenAI-compatible API. Set `OPENAI_BASE_URL` to your provider:

| Provider     | `OPENAI_BASE_URL`                  |
| ------------ | ---------------------------------- |
| Hugging Face | `https://router.huggingface.co/v1` |
| Ollama       | `http://127.0.0.1:11434/v1`        |
| llama.cpp    | `http://127.0.0.1:8080/v1`         |
| OpenRouter   | `https://openrouter.ai/api/v1`     |

See the [configuration overview](../configuration/overview) for all available options.

## Database

For **development**, MongoDB is optional. When `MONGODB_URL` is not set, Chat UI uses an embedded MongoDB server that persists data to the `./db` folder.

For **production**, you should use a dedicated MongoDB instance:

### Option 1: Local MongoDB (Docker)

```bash
docker run -d -p 27017:27017 -v mongo-chat-ui:/data --name mongo-chat-ui mongo:latest
```

Then set `MONGODB_URL=mongodb://localhost:27017` in `.env.local`.

### Option 2: MongoDB Atlas (Managed)

Use the [MongoDB Atlas free tier](https://www.mongodb.com/pricing) for a managed database. Copy the connection string to `MONGODB_URL`.

## Running in Production

For production deployments:

```bash
npm install
npm run build
npm run preview
```

The server listens on `http://localhost:4173` by default.