pluralchat


victor (HF Staff) committed on Sep 24, 2025

Commit

7bf1507

unverified ·

1 Parent(s): eeaa128

HuggingChat 2026 (#1875)

* refactor: remove tokenizer-related functionality and dependencies

- Removed tokenizer dependencies from package.json.
- Deleted TokensCounter component and its usages across the application.
- Updated model configurations to exclude tokenizer properties.
- Refactored model processing logic to support only OpenAI-compatible endpoints.
- Adjusted API responses to omit tokenizer information.
- Cleaned up related utility functions and imports.

* Remove web search functionality and related components

- Deleted endpoints for various web search APIs (serpApi, serpStack, serper, webLocal, youApi).
- Removed generateQuery function and its usage in search.
- Eliminated web search related types and interfaces from the codebase.
- Updated Assistant and Conversation types to remove references to web search and embedding models.
- Cleaned up related utility functions and message updates for web search.
- Adjusted API routes and components to reflect the removal of web search features.
- Updated Vite configuration to exclude web search dependencies.

* Remove Assistants feature and related code

- Deleted the assistants page and its load function.
- Removed assistantId from conversation handling in server routes.
- Cleaned up conversation page and server routes to eliminate assistant references.
- Removed assistant-related imports and UI components from settings navigation.
- Deleted assistant-specific pages for editing, creating, and displaying avatars.
- Updated tools pages to reflect changes in imports and types.

* refactor: remove AWS endpoint files and update related configurations

* feat: enhance API client to include origin handling and add debug routes

* refactor: prioritize HF_TOKEN for authentication in OpenAI endpoints and update related configurations

* Remove tool management pages and components

- Deleted the ToolEdit component and its associated logic for editing tools.
- Removed the tool search functionality from the tools page.
- Eliminated the tool input component used for handling various input types.
- Removed the layout and page files for individual tool views and editing.
- Cleaned up the new tool creation page by removing the modal and ToolEdit component.

* Refactor codebase to remove tool-related features and improve formatting

- Removed all references to tools in metrics, models, and text generation modules.
- Updated various interfaces and types to reflect the removal of tool functionalities.
- Cleaned up code formatting for better readability and consistency.
- Adjusted API responses and request handling to align with the new structure.
- Ensured all related tests and specifications are updated accordingly.

* refactor: update README to reflect removal of web search and embedding features, and clarify model configuration

* chore: remove search chat feature (UI and /conversations/search API)

* Merge pull request #3 from gary149/remove-most-of-things-2

Remove most of things

* refactor(metrics): remove Prometheus metrics server and usages

- Delete metrics server implementation and all references
- Drop /metrics endpoint and Prometheus counters
- Clean Helm templates (ports, ServiceMonitor) and env
- Remove metrics docs and TOC entry
- Adjust .env defaults and server hooks

* feat(ui): keep New Chat visible, fix toggles, and polish settings UI

- Always show New Chat in desktop and mobile nav
- Fix Switch component to toggle on click/keyboard
- Simplify modal animation and allow disableFly for settings
- Update settings layouts and styles

* chore(deps): remove prom-client and update lockfile

* chore(dev): allow ngrok host via server.allowedHosts

* refactor(metrics): remove monitoring values from Helm chart

* types: add ambient types for web search sources and stream outputs

- Add ambient types for web search sources and stream outputs
- Unblocks TS where older code references these without imports

* ui(settings): consolidate model actions into card and chip-style links

- Group actions into a subtle bordered container
- Promote “New chat” as primary action
- Convert external links and copy action to consistent chips
- Improve wrapping/alignment; adjust modal height and nav behaviors

* ui(chat): tidy message actions and send button styles

- Remove stale comments and unused disabled classes
- Keep send CTA styling consistent across themes

* server(conversation): clean up endpoints and message handling

- Normalize POST/GET handling and error responses
- Simplify retry/continue branches and update storage writes
- Keep rate limiting and guest checks; minor typing tweaks
- Consistent vote/share handlers

* server(models): load models from OPENAI_BASE_URL (OpenAI-compatible)

- Prefer `OPENAI_BASE_URL` (or `OPENAI_MODEL_LIST_URL`) to fetch model list
- Support optional Authorization via HF_TOKEN/OPENAI_API_KEY
- Provide clearer errors when not configured
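
A minimal sketch of the model-list lookup described above. It assumes the list is served at `${OPENAI_BASE_URL}/models` (as the `.env` comments in this commit indicate) and that `OPENAI_API_KEY` takes precedence over the legacy `HF_TOKEN`. The names `ModelEnv`, `ModelListRequest`, and `buildModelListRequest` are hypothetical and are not taken from the chat-ui source.

```typescript
// Hypothetical types and helper; only the environment variable names come from this commit.
interface ModelEnv {
  OPENAI_BASE_URL?: string;
  OPENAI_API_KEY?: string;
  HF_TOKEN?: string;
}

interface ModelListRequest {
  url: string;
  headers: Record<string, string>;
}

function buildModelListRequest(env: ModelEnv): ModelListRequest {
  const base = (env.OPENAI_BASE_URL ?? "").trim();
  if (base === "") {
    // Fail loudly instead of silently falling back to a default endpoint.
    throw new Error("OPENAI_BASE_URL is not set; cannot load the model list");
  }

  // Assumed precedence: OPENAI_API_KEY first, then the legacy HF_TOKEN alias.
  const token = (env.OPENAI_API_KEY || env.HF_TOKEN || "").trim();

  const headers: Record<string, string> = { Accept: "application/json" };
  if (token !== "") {
    // Authorization is optional: unauthenticated local servers need no token.
    headers["Authorization"] = `Bearer ${token}`;
  }

  // Strip trailing slashes so the path is joined with exactly one "/".
  return { url: `${base.replace(/\/+$/, "")}/models`, headers };
}
```

A caller could pass the result to `fetch(request.url, { headers: request.headers })` and read the `data` array of the OpenAI-style list response.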

* server(settings): persist settings fields; minor cleanup

- Keep ethicsModalAccepted optional; set timestamp when provided
- Upsert with createdAt/updatedAt

* utils: message updates iterator and smoothing — minor tidy up

- Keep parsing and smoothing logic intact
- No behavioral changes

* server(models-thumbnail): fix image response typing and return type

- Avoid React type noise by casting result from satori-html
- Return Uint8Array for BodyInit clarity

* ui(layout): minor grid/transition tidy and error toast flow

- Keep layout responsive without behavior changes

* dev: allow dynamic ngrok subdomains in Vite server.allowedHosts

- Use .ngrok-free.app wildcard so fresh tunnels work without edits
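
A hedged sketch of what such a setting might look like in `vite.config.ts`, assuming a Vite version that supports `server.allowedHosts`. In that option, an entry with a leading dot allows the domain itself and all of its subdomains. The project's real config contains more than this fragment.

```typescript
import { defineConfig } from "vite";

export default defineConfig({
  server: {
    // Leading dot: allow ngrok-free.app and every subdomain of it,
    // so a freshly created tunnel hostname works without editing the config.
    allowedHosts: [".ngrok-free.app"],
  },
});
```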

* ui: polish nav icon sizing and share icon contrast

- Use square size for sidebar icon button and center content
- Ensure share icon has consistent contrast in light/dark modes

* ui(share): implement two-step share conversation modal; keep footer wording; remove legacy share flow and dark styles; disable duplicate copy tooltip; include leafId on copied URL

* ui(nav): remove skeleton placeholders from conversation list and InfiniteScroll loader

* ui(modal): ensure Escape closes all modals by listening on window and backdrop

* chore: revert unrelated changes from previous commit; keep only Modal Escape behavior

* chore: remove unused dependencies and playwright installation from Dockerfile and package.json

* Merge pull request #4 from gary149/ui-update

UI: remove conversation list skeletons and ensure Escape closes all modals

* build: re-add fs-extra for Vite config

* docs: update metadata in README for improved clarity

* fix: Vite/Svelte v6/5 compat and Docker build

- Replace deprecated Svelte DOM event directives in Switch.svelte (onclick/onkeydown)
- Fix Dockerfile chown by creating /home/user/.npm before chown
- Use CommonJS export in tailwind.config.cjs to silence ESM warning

* feat: default OPENAI_BASE_URL to HF router when unset

* chore: remove OPENAI_MODEL_LIST_URL usage and docs

- Drop all references to OPENAI_MODEL_LIST_URL in code and debug endpoints
- Default to HF router when OPENAI_BASE_URL is unset
- Update UI copy and .env comments accordingly

* revert: default base URL fallback (revert 07c1aa44)

Require explicit OPENAI_BASE_URL again; remove implicit default to HF router in models loader.

* fix: simplify text and improve button styling in ShareConversationModal

* fix: update .env configuration for clarity and remove deprecated parameters

* fix: update version to 0.20.0 in package.json

* refactor: replace HF_TOKEN with OPENAI_API_KEY as the primary authorization token; update documentation and code references to reflect this change

* refactor: remove deprecated tools and assistant features across multiple files

* feat: add API Base URL display in Application Settings

* feat: implement multimodal support with user-configurable overrides and remove deprecated screenshot functionality

* feat: enhance reasoning handling by implementing autodetection of <think> blocks and updating rendering logic

* feat: sanitize titles by stripping <think> markers across multiple components and endpoints
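
As an illustration of the two `<think>` items above, here is a minimal sketch of splitting reasoning from the visible answer and of stripping markers from a title. The function names `splitReasoning` and `sanitizeTitle` are hypothetical, and the actual chat-ui logic may differ, for example in how it treats unterminated blocks.

```typescript
// Hypothetical helpers; not taken from the chat-ui source.

// Separate <think>...</think> content from the visible answer.
function splitReasoning(text: string): { reasoning: string; answer: string } {
  const parts: string[] = [];
  const answer = text.replace(/<think>([\s\S]*?)<\/think>/gi, (_match, inner: string) => {
    parts.push(inner.trim());
    return "";
  });
  return { reasoning: parts.join("\n\n"), answer: answer.trim() };
}

// Produce a title with no reasoning content or stray markers.
function sanitizeTitle(title: string): string {
  return title
    // Remove complete blocks (non-greedy, spanning newlines).
    .replace(/<think>[\s\S]*?<\/think>/gi, "")
    // Remove an unterminated block, which can appear when output is cut off.
    .replace(/<think>[\s\S]*$/i, "")
    // Remove any leftover opening or closing marker.
    .replace(/<\/?think>/gi, "")
    .trim();
}
```

Treating an unterminated `<think>` as reasoning to be dropped is a design assumption here; it keeps partial reasoning out of titles at the cost of discarding text after a stray opening tag.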

* feat: update multimodal support by replacing CarbonImage with CarbonView and adjusting rendering logic

* feat(models): add model-id filter inputs on models page and settings sidebar; use search input type

* Merge pull request #5 from gary149/feat/model-id-filter-inputs

feat(models): add model-id filtering inputs on models page and settings

* Revert "Merge pull request #5 from gary149/feat/model-id-filter-inputs"

This reverts commit 5dae36913a6e96a5e7c24d5295560a4d40222eca, reversing
changes made to a550b4ece0553de12b0c6730da8815e8a5ec7bfe.

* feat(NavMenu): simplify models link display and always show model count

* fix(CopyToClipBoardBtn): update icon size for better visibility

* fix(layout): replace UserIcon with CarbonSettings for application settings button

* Remove deprecated documentation files and sections related to configuration, installation, and features that are no longer supported in the Chat UI project. This includes the removal of files for common issues, embeddings, multimodal models, OpenAI provider configurations, tools, theming, web search, and local installation instructions. Additionally, the main index file has been cleaned up to reflect the current state of the application.

* fix(svelte.config): enable dotenv override for local environment configuration

* fix(CopyToClipBoardBtn): replace IconCopy with CarbonCopy for consistency
fix(ChatWindow): clean up unused code and simplify conditional rendering
feat(page): update model link copy button to use CarbonCopy icon

* refactor(NavConversationItem): remove unused props and simplify height logic

* fix(package-lock): update version from 0.10.0 to 0.20.0 for consistency
fix(settings page): adjust button spacing and replace CarbonCode icon with CarbonArrowUpRight

* feat(NavConversationItem): adjust height for improved layout consistency
fix(OpenReasoningResults): update hover background color for better visibility
feat(+page): add search filter for model ID in model list
feat(+layout): implement search filter for model ID in settings navigation
fix(tailwind.config): add custom gray shades for enhanced design flexibility

* Refactor APIClient and Chat com

This view is limited to 50 files because it contains too many changes. See raw diff

Files changed (50)

.env +51 -110
Dockerfile +2 -4
PRIVACY.md +26 -10
README.md +63 -1031
chart/env/prod.yaml +0 -5
chart/templates/deployment.yaml +0 -5
chart/templates/service-monitor.yaml +0 -15
chart/templates/service.yaml +0 -6
chart/values.yaml +1 -2
docs/source/_toctree.yml +0 -64
docs/source/configuration/common-issues.md +0 -7
docs/source/configuration/embeddings.md +0 -105
docs/source/configuration/metrics.md +0 -9
docs/source/configuration/models/multimodal.md +0 -24
docs/source/configuration/models/overview.md +0 -147
docs/source/configuration/models/providers/anthropic.md +0 -117
docs/source/configuration/models/providers/aws.md +0 -35
docs/source/configuration/models/providers/cloudflare.md +0 -35
docs/source/configuration/models/providers/cohere.md +0 -26
docs/source/configuration/models/providers/google.md +0 -92
docs/source/configuration/models/providers/langserve.md +0 -22
docs/source/configuration/models/providers/llamacpp.md +0 -49
docs/source/configuration/models/providers/ollama.md +0 -39
docs/source/configuration/models/providers/openai.md +0 -181
docs/source/configuration/models/providers/tgi.md +0 -66
docs/source/configuration/models/tools.md +0 -62
docs/source/configuration/open-id.md +0 -16
docs/source/configuration/overview.md +0 -10
docs/source/configuration/theming.md +0 -18
docs/source/configuration/web-search.md +0 -58
docs/source/developing/architecture.md +0 -35
docs/source/developing/copy-huggingchat.md +0 -71
docs/source/index.md +0 -97
docs/source/installation/docker.md +0 -11
docs/source/installation/helm.md +0 -35
docs/source/installation/local.md +0 -34
docs/source/installation/spaces.md +0 -9
package-lock.json +0 -0
package.json +3 -42
scripts/populate.ts +3 -80
server.log +2 -0
src/ambient.d.ts +3 -0
src/app.html +12 -9
src/hooks.server.ts +13 -27
src/lib/APIClient.ts +13 -30
src/lib/actions/snapScrollToBottom.ts +2 -3
src/lib/buildPrompt.ts +0 -7
src/lib/components/AssistantSettings.svelte +0 -657
src/lib/components/AssistantToolPicker.svelte +0 -150
src/lib/components/CodeBlock.svelte +58 -6

.env CHANGED

@@ -1,79 +1,68 @@
 # Use .env.local to change these variables
 # DO NOT EDIT THIS FILE WITH SENSITIVE DATA
-### Config ###
-ENABLE_CONFIG_MANAGER=true
 ### MongoDB ###
 MONGODB_URL=#your mongodb URL here, use chat-ui-db image if you don't want to set this
 MONGODB_DB_NAME=chat-ui
 MONGODB_DIRECT_CONNECTION=false
 ### Local Storage ###
-MODELS_STORAGE_PATH= # where are .gguf for model inference stored
 MONGO_STORAGE_PATH= # where is the db folder stored
-### Endpoints config ###
-HF_API_ROOT=https://api-inference.huggingface.co/models
-# HF_TOKEN is used for a lot of things, not only for inference but also fetching tokenizers, etc.
-# We recommend using an HF_TOKEN even if you use a local endpoint.
-HF_TOKEN= #get it from https://huggingface.co/settings/token
-# API Keys for providers, you will need to specify models in the MODELS section but these keys can be kept secret
-OPENAI_API_KEY=#your openai api key here
-ANTHROPIC_API_KEY=#your anthropic api key here
-CLOUDFLARE_ACCOUNT_ID=#your cloudflare account id here
-CLOUDFLARE_API_TOKEN=#your cloudflare api token here
-COHERE_API_TOKEN=#your cohere api token here
-GOOGLE_GENAI_API_KEY=#your google genai api token here
-### Models ###
-## Models can support many different endpoints, check the documentation for more details
-MODELS=`[
-    {
-      "name": "NousResearch/Hermes-3-Llama-3.1-8B",
-      "description": "Nous Research's latest Hermes 3 release in 8B size.",
-      "promptExamples": [
-        {
-          "title": "Write an email",
-          "prompt": "As a restaurant owner, write a professional email to the supplier to get these products every week: \n\n- Wine (x10)\n- Eggs (x24)\n- Bread (x12)"
-        }, {
-          "title": "Code a game",
-          "prompt": "Code a basic snake game in python, give explanations for each step."
-        }, {
-          "title": "Recipe help",
-          "prompt": "How do I make a delicious lemon cheesecake?"
-        }
-      ]
-    }
-]`
-LOAD_GGUF_MODELS=true
-## Text Embedding Models used for websearch
-# Default is a model that runs locally on CPU.
-TEXT_EMBEDDING_MODELS = `[
-  {
-    "name": "Xenova/gte-small",
-    "displayName": "Xenova/gte-small",
-    "description": "Local embedding model running on the server.",
-    "chunkCharLength": 512,
-    "endpoints": [
-      { "type": "transformersjs" }
-    ]
-  }
-]`
-REASONING_SUMMARY=true # Change this to false to disable reasoning summary
-## Removed models, useful for migrating conversations
-# { name: string, displayName?: string, id?: string, transferTo?: string }`
-OLD_MODELS=`[]`
 ## Task model
-# name of the model used for tasks such as summarizing title, creating query, etc.
-# if not set, the first model in MODELS will be used
 TASK_MODEL=
 ### Authentication ###
 # Parameters to enable open id login
@@ -97,41 +86,6 @@ TRUSTED_EMAIL_HEADER=# header to use to get the user email, only use if you know
 ADMIN_CLI_LOGIN=true # set to false to disable the CLI login
 ADMIN_TOKEN=#We recommend leaving this empty, you can get the token from the terminal.
-### Websearch ###
-## API Keys used to activate search with web functionality. websearch is disabled if none are defined. choose one of the following:
-YDC_API_KEY=#your docs.you.com api key here
-SERPER_API_KEY=#your serper.dev api key here
-SERPAPI_KEY=#your serpapi key here
-SERPSTACK_API_KEY=#your serpstack api key here
-SEARCHAPI_KEY=#your searchapi api key here
-USE_LOCAL_WEBSEARCH=#set to true to parse google results yourself, overrides other API keys
-SEARXNG_QUERY_URL=# where '<query>' will be replaced with query keywords see https://docs.searxng.org/dev/search_api.html eg https://searxng.yourdomain.com/search?q=<query>&engines=duckduckgo,google&format=json
-BING_SUBSCRIPTION_KEY=#your key
-## Websearch configuration
-PLAYWRIGHT_ADBLOCKER=true
-WEBSEARCH_ALLOWLIST=`[]` # if it's defined, allow websites from only this list.
-WEBSEARCH_BLOCKLIST=`[]` # if it's defined, block websites from this list.
-WEBSEARCH_JAVASCRIPT=true # CPU usage reduces by 60% on average by disabling javascript. Enable to improve website compatibility
-WEBSEARCH_TIMEOUT = 3500 # in milliseconds, determines how long to wait to load a page before timing out
-ENABLE_LOCAL_FETCH=false #set to true to allow fetches on the local network. /!\ Only enable this if you have the proper firewall rules to prevent SSRF attacks and understand the implications.
-## Public app configuration ##
-PUBLIC_APP_GUEST_MESSAGE=# a message to the guest user. If not set, no message will be shown. Only used if you have authentication enabled.
-PUBLIC_APP_NAME=ChatUI # name used as title throughout the app
-PUBLIC_APP_ASSETS=chatui # used to find logos & favicons in static/$PUBLIC_APP_ASSETS
-PUBLIC_APP_DESCRIPTION=# description used throughout the app
-PUBLIC_APP_DATA_SHARING=# Set to 1 to enable an option in the user settings to share conversations with model authors
-PUBLIC_APP_DISCLAIMER=# Set to 1 to show a disclaimer on login page
-PUBLIC_APP_DISCLAIMER_MESSAGE=# Message to show on the login page
-PUBLIC_ANNOUNCEMENT_BANNERS=`[
-    {
-    "title": "chat-ui is now open source!",
-    "linkTitle": "check it out",
-    "linkHref": "https://github.com/huggingface/chat-ui"
-  }
-]`
 PUBLIC_SMOOTH_UPDATES=false # set to true to enable smoothing of messages client-side, can be CPU intensive
 PUBLIC_ORIGIN=#https://huggingface.co
 PUBLIC_SHARE_PREFIX=#https://hf.co/chat
@@ -144,17 +98,10 @@ PUBLIC_APPLE_APP_ID=#1234567890 / Leave empty to disable
 ### Feature Flags ###
 LLM_SUMMARIZATION=true # generate conversation titles with LLMs
-ENABLE_ASSISTANTS=false #set to true to enable assistants feature
-ENABLE_ASSISTANTS_RAG=false # /!\ This will let users specify arbitrary URLs that the server will then request. Make sure you have the proper firewall rules in place.
-REQUIRE_FEATURED_ASSISTANTS=false # require featured assistants to show in the list
-COMMUNITY_TOOLS=false # set to true to enable community tools
 ALLOW_IFRAME=true # Allow the app to be embedded in an iframe
 ENABLE_DATA_EXPORT=true
-### Tools ###
-# Check out public config in `chart/env/prod.yaml` for more details
-TOOLS=`[]`
 ### Rate limits ###
 # See `src/lib/server/usageLimits.ts`
 # {
@@ -167,21 +114,15 @@ TOOLS=`[]`
 # }
 USAGE_LIMITS=`{}`
 ### HuggingFace specific ###
-# Let user authenticate with their HF token in the /api routes. This is only useful if you have OAuth configured with huggingface.
-USE_HF_TOKEN_IN_API=false
 ## Feature flag & admin settings
 # Used for setting early access & admin flags to users
 HF_ORG_ADMIN=
 HF_ORG_EARLY_ACCESS=
 WEBHOOK_URL_REPORT_ASSISTANT=#provide slack webhook url to get notified for reports/feature requests
-IP_TOKEN_SECRET=
 ### Metrics ###
-METRICS_ENABLED=false
-METRICS_PORT=5565
 LOG_LEVEL=info
@@ -191,19 +132,19 @@ PARQUET_EXPORT_DATASET=
 PARQUET_EXPORT_HF_TOKEN=
 ADMIN_API_SECRET=# secret to admin API calls, like computing usage stats or exporting parquet data
 ### Docker build variables ###
 # These values cannot be updated at runtime
 # They need to be passed when building the docker image
 # See https://github.com/huggingface/chat-ui/main/.github/workflows/deploy-prod.yml#L44-L47
 APP_BASE="" # base path of the app, e.g. /chat, left blank as default
-PUBLIC_APP_COLOR=blue # can be any of tailwind colors: https://tailwindcss.com/docs/customizing-colors#default-color-palette
 ### Body size limit for SvelteKit https://svelte.dev/docs/kit/adapter-node#Environment-variables-BODY_SIZE_LIMIT
 BODY_SIZE_LIMIT=15728640
 PUBLIC_COMMIT_SHA=
 ### LEGACY parameters
-HF_ACCESS_TOKEN=#LEGACY! Use HF_TOKEN instead
 ALLOW_INSECURE_COOKIES=false # LEGACY! Use COOKIE_SECURE and COOKIE_SAMESITE instead
 PARQUET_EXPORT_SECRET=#DEPRECATED, use ADMIN_API_SECRET instead
 RATE_LIMIT= # /!\ DEPRECATED definition of messages per minute. Use USAGE_LIMITS.messagesPerMinute instead

 # Use .env.local to change these variables
 # DO NOT EDIT THIS FILE WITH SENSITIVE DATA
+### Models ###
+# Models are sourced exclusively from an OpenAI-compatible base URL.
+# Example: https://router.huggingface.co/v1
+OPENAI_BASE_URL=
+# Canonical auth token for any OpenAI-compatible provider
+OPENAI_API_KEY=#your provider API key (works for HF router, OpenAI, LM Studio, etc.)
+# Legacy alias (still supported): if set and OPENAI_API_KEY is empty, it will be used
+# HF_TOKEN=
 ### MongoDB ###
 MONGODB_URL=#your mongodb URL here, use chat-ui-db image if you don't want to set this
 MONGODB_DB_NAME=chat-ui
 MONGODB_DIRECT_CONNECTION=false
+## Public app configuration ##
+PUBLIC_APP_GUEST_MESSAGE=# a message to the guest user. If not set, no message will be shown. Only used if you have authentication enabled.
+PUBLIC_APP_NAME=ChatUI # name used as title throughout the app
+PUBLIC_APP_ASSETS=chatui # used to find logos & favicons in static/$PUBLIC_APP_ASSETS
+PUBLIC_APP_DESCRIPTION=# description used throughout the app
+PUBLIC_APP_DATA_SHARING=# Set to 1 to enable an option in the user settings to share conversations with model authors
 ### Local Storage ###
 MONGO_STORAGE_PATH= # where is the db folder stored
+REASONING_SUMMARY=false # Change this to false to disable reasoning summary
+## Models overrides
+MODELS=
 ## Task model
+# Optional: set to the model id/name from the `${OPENAI_BASE_URL}/models` list
+# to use for internal tasks (title summarization, etc). If not set, the current model will be used
 TASK_MODEL=
+# Arch router (OpenAI-compatible) endpoint base URL used for route selection
+# Example: https://api.openai.com/v1 or your hosted Arch endpoint
+LLM_ROUTER_ARCH_BASE_URL=
+## LLM Router Configuration
+# Path to routes policy (JSON array). Defaults to llm-router/routes.chat.json
+LLM_ROUTER_ROUTES_PATH=
+# Model used at the Arch router endpoint for selection
+LLM_ROUTER_ARCH_MODEL=
+# Fallback behavior
+# Route to map "other" to (must exist in routes file)
+LLM_ROUTER_OTHER_ROUTE=casual_conversation
+# Model to call if the Arch selection fails entirely
+LLM_ROUTER_FALLBACK_MODEL=
+# Arch selection timeout in milliseconds (default 10000)
+LLM_ROUTER_ARCH_TIMEOUT_MS=10000
+# Router UI overrides (client-visible)
+# Public display name for the router entry in the model list. Defaults to "Omni".
+PUBLIC_LLM_ROUTER_DISPLAY_NAME=Omni
+# Optional: public logo URL for the router entry. If unset, the UI shows a Carbon icon.
+PUBLIC_LLM_ROUTER_LOGO_URL=
+# Public alias id used for the virtual router model (Omni). Defaults to "omni".
+PUBLIC_LLM_ROUTER_ALIAS_ID=omni
 ### Authentication ###
 # Parameters to enable open id login
 ADMIN_CLI_LOGIN=true # set to false to disable the CLI login
 ADMIN_TOKEN=#We recommend leaving this empty, you can get the token from the terminal.
 PUBLIC_SMOOTH_UPDATES=false # set to true to enable smoothing of messages client-side, can be CPU intensive
 PUBLIC_ORIGIN=#https://huggingface.co
 PUBLIC_SHARE_PREFIX=#https://hf.co/chat
 ### Feature Flags ###
 LLM_SUMMARIZATION=true # generate conversation titles with LLMs
 ALLOW_IFRAME=true # Allow the app to be embedded in an iframe
 ENABLE_DATA_EXPORT=true
 ### Rate limits ###
 # See `src/lib/server/usageLimits.ts`
 # {
 # }
 USAGE_LIMITS=`{}`
 ### HuggingFace specific ###
 ## Feature flag & admin settings
 # Used for setting early access & admin flags to users
 HF_ORG_ADMIN=
 HF_ORG_EARLY_ACCESS=
 WEBHOOK_URL_REPORT_ASSISTANT=#provide slack webhook url to get notified for reports/feature requests
 ### Metrics ###
 LOG_LEVEL=info
 PARQUET_EXPORT_HF_TOKEN=
 ADMIN_API_SECRET=# secret to admin API calls, like computing usage stats or exporting parquet data
+### Config ###
+ENABLE_CONFIG_MANAGER=true
 ### Docker build variables ###
 # These values cannot be updated at runtime
 # They need to be passed when building the docker image
 # See https://github.com/huggingface/chat-ui/main/.github/workflows/deploy-prod.yml#L44-L47
 APP_BASE="" # base path of the app, e.g. /chat, left blank as default
 ### Body size limit for SvelteKit https://svelte.dev/docs/kit/adapter-node#Environment-variables-BODY_SIZE_LIMIT
 BODY_SIZE_LIMIT=15728640
 PUBLIC_COMMIT_SHA=
 ### LEGACY parameters
 ALLOW_INSECURE_COOKIES=false # LEGACY! Use COOKIE_SECURE and COOKIE_SAMESITE instead
 PARQUET_EXPORT_SECRET=#DEPRECATED, use ADMIN_API_SECRET instead
 RATE_LIMIT= # /!\ DEPRECATED definition of messages per minute. Use USAGE_LIMITS.messagesPerMinute instead
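
The `LLM_ROUTER_*` comments above describe a selection step with a timeout, a mapping for the "other" route, and a fallback model. The following is a rough sketch of how those settings could interact, under stated assumptions: the `Route` shape, the `select` callback, and all function names are hypothetical, since this diff shows neither the routes file format nor the router code.

```typescript
// Hypothetical shapes: the real routes file format is not shown in this diff.
interface Route {
  name: string;
  model: string;
}

interface RouterSettings {
  otherRoute: string; // LLM_ROUTER_OTHER_ROUTE
  fallbackModel: string; // LLM_ROUTER_FALLBACK_MODEL
  timeoutMs: number; // LLM_ROUTER_ARCH_TIMEOUT_MS
}

// Map a selected route name to a model, applying the "other" and fallback rules.
function resolveModel(selected: string | null, routes: Route[], settings: RouterSettings): string {
  if (selected === null) return settings.fallbackModel;
  const name = selected === "other" ? settings.otherRoute : selected;
  const route = routes.find((r) => r.name === name);
  return route ? route.model : settings.fallbackModel;
}

// Reject if the wrapped promise does not settle within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
    work.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      }
    );
  });
}

// `select` stands in for the call to the Arch router endpoint.
async function pickModel(
  select: () => Promise<string>,
  routes: Route[],
  settings: RouterSettings
): Promise<string> {
  try {
    const selected = await withTimeout(select(), settings.timeoutMs);
    return resolveModel(selected, routes, settings);
  } catch {
    // Selection failed or timed out: use the configured fallback model.
    return settings.fallbackModel;
  }
}
```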

Dockerfile CHANGED

@@ -2,7 +2,6 @@
 ARG INCLUDE_DB=false
 FROM node:20-slim AS base
-ENV PLAYWRIGHT_SKIP_BROWSER_GC=1
 # install dotenv-cli
 RUN npm install -g dotenv-cli
@@ -21,7 +20,6 @@ WORKDIR /app
 RUN touch /app/.env.local
-RUN npm i --no-package-lock --no-save playwright@1.52.0
 USER root
@@ -31,9 +29,9 @@ RUN chown -R 1000:1000 /data/models
 RUN apt-get update
 RUN apt-get install gnupg curl git cmake clang libgomp1 -y
-RUN npx playwright install --with-deps chromium
-RUN chown -R 1000:1000 /home/user/.npm
 USER user

 ARG INCLUDE_DB=false
 FROM node:20-slim AS base
 # install dotenv-cli
 RUN npm install -g dotenv-cli
 RUN touch /app/.env.local
 USER root
 RUN apt-get update
 RUN apt-get install gnupg curl git cmake clang libgomp1 -y
+# ensure npm cache dir exists before adjusting ownership
+RUN mkdir -p /home/user/.npm && chown -R 1000:1000 /home/user/.npm
 USER user

PRIVACY.md CHANGED

@@ -1,22 +1,38 @@
 ## Privacy
-> Last updated: Feb 14, 2025
-Users of HuggingChat are authenticated through their HF user account.
-We endorse Privacy by Design. As such, your conversations are private to you and will not be shared with anyone, including model authors, for any purpose, including for research or model training purposes.
-You conversation data will only be stored to let you access past conversations. You can click on the Delete icon to delete any past conversation at any moment.
 🗓 Please also consult huggingface.co's main privacy policy at <https://huggingface.co/privacy>. To exercise any of your legal privacy rights, please send an email to <privacy@huggingface.co>.
 ## About available LLMs
 The goal of this app is to showcase that it is now possible to build an open source alternative to ChatGPT. 💪
-We aim to always provide a diverse set of state of the art open LLMs, hence we rotate the available models over time. Discuss available models and request new ones on the [models discussion page](https://huggingface.co/spaces/huggingchat/chat-ui/discussions/372).
-Check the [models](https://huggingface.co/chat/models/) page for an up-to-date list of the best available LLMs.
 ## Technical details
@@ -26,10 +42,10 @@ The app is completely open source, and further development takes place on the [h
 You can find the production configuration for HuggingChat [here](https://github.com/huggingface/chat-ui/blob/main/chart/env/prod.yaml).
-The inference backend is running the optimized [text-generation-inference](https://github.com/huggingface/text-generation-inference) on HuggingFace's Inference API infrastructure.
-It is possible to deploy a copy of this app to a Space and customize it (swap model, add some UI elements, or store user messages according to your own Terms and conditions). You can also 1-click deploy your own instance using the [Chat UI Spaces Docker template](https://huggingface.co/new-space?template=huggingchat/chat-ui-template).
-We welcome any feedback on this app: please participate to the public discussion at <https://huggingface.co/spaces/huggingchat/chat-ui/discussions>
 <a target="_blank" href="https://huggingface.co/spaces/huggingchat/chat-ui/discussions"><img src="https://huggingface.co/datasets/huggingface/badges/raw/main/open-a-discussion-xl.svg" title="open a discussion"></a>

 ## Privacy
+> Last updated: Sep 15, 2025
+Basics:
+- Sign-in: You authenticate with your Hugging Face account.
+- Conversation history: Stored so you can access past chats; you can delete any conversation at any time from the UI.
 🗓 Please also consult huggingface.co's main privacy policy at <https://huggingface.co/privacy>. To exercise any of your legal privacy rights, please send an email to <privacy@huggingface.co>.
+## Data handling and processing
+HuggingChat uses Hugging Face’s Inference Providers to access models from multiple partners via a single API. Depending on the model and availability, inference runs with the corresponding provider.
+- Inference Providers documentation: <https://huggingface.co/docs/inference-providers>
+- Security & Compliance: <https://huggingface.co/docs/inference-providers/security>
+Security and routing facts
+- Hugging Face does not store any user data for training purposes.
+- Hugging Face does not store the request body or the response when routing requests through Hugging Face.
+- Logs are kept for debugging purposes for up to 30 days, but no user data or tokens are stored in those logs.
+- Inference Provider routing uses TLS/SSL to encrypt data in transit.
+- The Hugging Face Hub (which Inference Providers is a feature of) is SOC 2 Type 2 certified. See <https://huggingface.co/docs/hub/security>.
+External providers are responsible for their own security and data handling. Please consult each provider’s respective security and privacy policies via the Inference Providers documentation linked above.
 ## About available LLMs
 The goal of this app is to showcase that it is now possible to build an open source alternative to ChatGPT. 💪
+We aim to always provide a diverse set of state‑of‑the‑art open LLMs, and we may update the available models over time. Discuss models or request new ones on the [models discussion page](https://huggingface.co/spaces/huggingchat/chat-ui/discussions/372).
+Check the [models](https://huggingface.co/chat/models/) page for an up‑to‑date list of the best available LLMs.
 ## Technical details
 You can find the production configuration for HuggingChat [here](https://github.com/huggingface/chat-ui/blob/main/chart/env/prod.yaml).
+HuggingChat connects to the OpenAI‑compatible Inference Providers router at `https://router.huggingface.co/v1` to access models across multiple providers. Provider selection may be automatic or fixed depending on the model configuration.
+It is possible to deploy a copy of this app to a Space and customize it (swap models, add UI elements, or store user messages according to your own Terms and Conditions). You can also 1‑click deploy your own instance using the [Chat UI Spaces Docker template](https://huggingface.co/new-space?template=huggingchat/chat-ui-template).
+We welcome any feedback on this app: please participate in the public discussion at <https://huggingface.co/spaces/huggingchat/chat-ui/discussions>
 <a target="_blank" href="https://huggingface.co/spaces/huggingchat/chat-ui/discussions"><img src="https://huggingface.co/datasets/huggingface/badges/raw/main/open-a-discussion-xl.svg" title="open a discussion"></a>

README.md CHANGED

@@ -1,236 +1,107 @@
 # Chat UI
-**Find the docs at [hf.co/docs/chat-ui](https://huggingface.co/docs/chat-ui/index).**
-![Chat UI repository thumbnail](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/chatui-websearch.png)
 A chat interface using open source models, eg OpenAssistant or Llama. It is a SvelteKit app and it powers the [HuggingChat app on hf.co/chat](https://huggingface.co/chat).
 0. [Quickstart](#quickstart)
-1. [No Setup Deploy](#no-setup-deploy)
-2. [Setup](#setup)
-3. [Launch](#launch)
-4. [Web Search](#web-search)
-5. [Text Embedding Models](#text-embedding-models)
-6. [Extra parameters](#extra-parameters)
-7. [Common issues](#common-issues)
-8. [Deploying to a HF Space](#deploying-to-a-hf-space)
-9. [Building](#building)
-## Quickstart
-### Docker image
-You can deploy a chat-ui instance in a single command using the docker image. Get your huggingface token from [here](https://huggingface.co/settings/tokens).
-```bash
-docker run -p 3000 -e HF_TOKEN=hf_*** -v db:/data ghcr.io/huggingface/chat-ui-db:latest
-```
-Take a look at the [`.env` file](https://github.com/huggingface/chat-ui/blob/main/.env) and the readme to see all the environment variables that you can set. We have endpoint support for all OpenAI API compatible local services as well as many other providers like Anthropic, Cloudflare, Google Vertex AI, etc.
-### Local setup
-You can quickly start a locally running chat-ui & LLM text-generation server thanks to chat-ui's [llama.cpp server support](https://huggingface.co/docs/chat-ui/configuration/models/providers/llamacpp).
-**Step 1 (Start llama.cpp server):**
-Install llama.cpp w/ brew (for Mac):
-```bash
-# install llama.cpp
-brew install llama.cpp
-```
-or [build directly from the source](https://github.com/ggerganov/llama.cpp/blob/master/docs/build.md) for your target device:
-```
-git clone https://github.com/ggerganov/llama.cpp && cd llama.cpp && make
-```
-Next, start the server with the [LLM of your choice](https://huggingface.co/models?library=gguf):
-```bash
-# start llama.cpp server (using hf.co/microsoft/Phi-3-mini-4k-instruct-gguf as an example)
-llama-server --hf-repo microsoft/Phi-3-mini-4k-instruct-gguf --hf-file Phi-3-mini-4k-instruct-q4.gguf -c 4096
 ```
-A local LLaMA.cpp HTTP Server will start on `http://localhost:8080`. Read more [here](https://huggingface.co/docs/chat-ui/configuration/models/providers/llamacpp).
-**Step 3 (make sure you have MongoDb running locally):**
-```bash
-docker run -d -p 27017:27017 --name mongo-chatui mongo:latest
-```
-Read more [here](#database).
-**Step 4 (clone chat-ui):**
 ```bash
 git clone https://github.com/huggingface/chat-ui
 cd chat-ui
-```
-**Step 5 (tell chat-ui to use local llama.cpp server):**
-Add the following to your `.env.local`:
-```ini
-MODELS=`[
-  {
-    "name": "microsoft/Phi-3-mini-4k-instruct",
-    "endpoints": [{
-      "type" : "llamacpp",
-      "baseURL": "http://localhost:8080"
-    }],
-  },
-]`
-```
-Read more [here](https://huggingface.co/docs/chat-ui/configuration/models/providers/llamacpp).
-**Step 6 (start chat-ui):**
-```bash
 npm install
 npm run dev -- --open
 ```
-Read more [here](#launch).
-<img class="hidden dark:block" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/chat-ui/llamacpp-dark.png" height="auto"/>
-## No Setup Deploy
-If you don't want to configure, set up, and launch your own Chat UI, you can use this option as a fast deploy alternative.
-You can deploy your own customized Chat UI instance with any supported [LLM](https://huggingface.co/models?pipeline_tag=text-generation&sort=trending) of your choice on [Hugging Face Spaces](https://huggingface.co/spaces). To do so, use the chat-ui template [available here](https://huggingface.co/new-space?template=huggingchat/chat-ui-template).
-Set `HF_TOKEN` in [Space secrets](https://huggingface.co/docs/hub/spaces-overview#managing-secrets) to deploy a model with gated access or a model in a private repository. It's also compatible with [Inference for PROs](https://huggingface.co/blog/inference-pro) curated list of powerful models with higher rate limits. Make sure to create your personal token first in your [User Access Tokens settings](https://huggingface.co/settings/tokens).
-Read the full tutorial [here](https://huggingface.co/docs/hub/spaces-sdks-docker-chatui#chatui-on-spaces).
-## Setup
-The default config for Chat UI is stored in the `.env` file. You will need to override some values to get Chat UI to run locally. This is done in `.env.local`.
-Start by creating a `.env.local` file in the root of the repository. The bare minimum config you need to get Chat UI to run locally is the following:
-```env
-MONGODB_URL=<the URL to your MongoDB instance>
-HF_TOKEN=<your access token>
-```
-### Database
-The chat history is stored in a MongoDB instance, and having a DB instance available is needed for Chat UI to work.
-You can use a local MongoDB instance. The easiest way is to spin one up using docker:
 ```bash
 docker run -d -p 27017:27017 --name mongo-chatui mongo:latest
 ```
-In which case the url of your DB will be `MONGODB_URL=mongodb://localhost:27017`.
-Alternatively, you can use a [free MongoDB Atlas](https://www.mongodb.com/pricing) instance for this, Chat UI should fit comfortably within their free tier. After which you can set the `MONGODB_URL` variable in `.env.local` to match your instance.
-### Hugging Face Access Token
-If you use a remote inference endpoint, you will need a Hugging Face access token to run Chat UI locally. You can get one from [your Hugging Face profile](https://huggingface.co/settings/tokens).
 ## Launch
-After you're done with the `.env.local` file you can run Chat UI locally with:
 ```bash
 npm install
 npm run dev
 ```
-## Web Search
-Chat UI features a powerful Web Search feature. It works by:
-1. Generating an appropriate search query from the user prompt.
-2. Performing web search and extracting content from webpages.
-3. Creating embeddings from texts using a text embedding model.
-4. From these embeddings, find the ones that are closest to the user query using a vector similarity search. Specifically, we use `inner product` distance.
-5. Get the corresponding texts to those closest embeddings and perform [Retrieval-Augmented Generation](https://huggingface.co/papers/2005.11401) (i.e. expand user prompt by adding those texts so that an LLM can use this information).
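The retrieval steps in the removed list above (embed, rank by inner product, then expand the prompt) can be sketched roughly as follows. This is an illustration only, not the removed implementation: the function names, the toy two-dimensional embeddings, and the prompt layout are all assumptions, and a real setup would obtain vectors from an embedding model.

```python
from typing import List, Sequence, Tuple


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner (dot) product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def top_k_passages(
    query_vec: Sequence[float],
    passages: List[Tuple[str, Sequence[float]]],
    k: int,
) -> List[str]:
    """Return the k passage texts whose embeddings score highest against the query."""
    ranked = sorted(
        passages,
        key=lambda item: inner_product(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


def build_augmented_prompt(user_prompt: str, retrieved: List[str]) -> str:
    """Expand the user prompt with retrieved texts (hypothetical layout)."""
    context = "\n".join(f"- {text}" for text in retrieved)
    return f"Context:\n{context}\n\nQuestion: {user_prompt}"


# Toy embeddings standing in for real model output.
toy_passages = [
    ("cats", [1.0, 0.0]),
    ("dogs", [0.0, 1.0]),
    ("kittens", [0.9, 0.1]),
]
closest = top_k_passages([1.0, 0.0], toy_passages, k=2)
prompt = build_augmented_prompt("Tell me about cats", closest)
```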
-## Text Embedding Models
-By default (for backward compatibility), when `TEXT_EMBEDDING_MODELS` environment variable is not defined, [transformers.js](https://huggingface.co/docs/transformers.js) embedding models will be used for embedding tasks, specifically, [Xenova/gte-small](https://huggingface.co/Xenova/gte-small) model.
-You can customize the embedding model by setting `TEXT_EMBEDDING_MODELS` in your `.env.local` file. For example:
-```env
-TEXT_EMBEDDING_MODELS = `[
-  {
-    "name": "Xenova/gte-small",
-    "displayName": "Xenova/gte-small",
-    "description": "locally running embedding",
-    "chunkCharLength": 512,
-    "endpoints": [
-      {"type": "transformersjs"}
-    ]
-  },
-  {
-    "name": "intfloat/e5-base-v2",
-    "displayName": "intfloat/e5-base-v2",
-    "description": "hosted embedding model",
-    "chunkCharLength": 768,
-    "preQuery": "query: ", # See https://huggingface.co/intfloat/e5-base-v2#faq
-    "prePassage": "passage: ", # See https://huggingface.co/intfloat/e5-base-v2#faq
-    "endpoints": [
-      {
-        "type": "tei",
-        "url": "http://127.0.0.1:8080/",
-        "authorization": "TOKEN_TYPE TOKEN" // optional authorization field. Example: "Basic VVNFUjpQQVNT"
-      }
-    ]
-  }
-]`
 ```
-The required fields are `name`, `chunkCharLength` and `endpoints`.
-Supported text embedding backends are: [`transformers.js`](https://huggingface.co/docs/transformers.js), [`TEI`](https://github.com/huggingface/text-embeddings-inference) and [`OpenAI`](https://platform.openai.com/docs/guides/embeddings). `transformers.js` models run locally as part of `chat-ui`, whereas `TEI` models run in a different environment & accessed through an API endpoint. `openai` models are accessed through the [OpenAI API](https://platform.openai.com/docs/guides/embeddings).
-When more than one embedding model is supplied in the `.env.local` file, the first is used by default; the others are used only by LLMs whose `embeddingModel` is set to that model's name.
 ## Extra parameters
-### OpenID connect
-The login feature is disabled by default and users are attributed a unique ID based on their browser. But if you want to use OpenID to authenticate your users, you can add the following to your `.env.local` file:
-```env
-OPENID_CONFIG=`{
-  PROVIDER_URL: "<your OIDC issuer>",
-  CLIENT_ID: "<your OIDC client ID>",
-  CLIENT_SECRET: "<your OIDC client secret>",
-  SCOPES: "openid profile",
-  TOLERANCE: // optional
-  RESOURCE: // optional
-}`
-```
-These variables will enable the openID sign-in modal for users.
-### Trusted header authentication
-You can set the env variable `TRUSTED_EMAIL_HEADER` to point to the header that contains the user's email address. This will allow you to authenticate users from the header. This setup is usually combined with a proxy that will be in front of chat-ui and will handle the auth and set the header.
-> [!WARNING]
-> Make sure to only allow requests to chat-ui through your proxy which handles authentication, otherwise users could authenticate as anyone by setting the header manually! Only set this up if you understand the implications and know how to do it correctly.
-Here is a list of header names for common auth providers:
-- Tailscale Serve: `Tailscale-User-Login`
-- Cloudflare Access: `Cf-Access-Authenticated-User-Email`
-- oauth2-proxy: `X-Forwarded-Email`
 ### Theming
 You can use a few environment variables to customize the look and feel of chat-ui. These are by default:
@@ -241,785 +112,32 @@ PUBLIC_APP_ASSETS=chatui
 PUBLIC_APP_COLOR=blue
 PUBLIC_APP_DESCRIPTION="Making the community's best AI chat models available to everyone."
 PUBLIC_APP_DATA_SHARING=
-PUBLIC_APP_DISCLAIMER=
 ```
 - `PUBLIC_APP_NAME` The name used as a title throughout the app.
 - `PUBLIC_APP_ASSETS` Is used to find logos & favicons in `static/$PUBLIC_APP_ASSETS`, current options are `chatui` and `huggingchat`.
 - `PUBLIC_APP_COLOR` Can be any of the [tailwind colors](https://tailwindcss.com/docs/customizing-colors#default-color-palette).
 - `PUBLIC_APP_DATA_SHARING` Can be set to 1 to add a toggle in the user settings that lets your users opt in to data sharing with model creators.
-- `PUBLIC_APP_DISCLAIMER` If set to 1, we show a disclaimer about generated outputs on login.
-### Web Search config
-You can enable the web search through an API by adding `YDC_API_KEY` ([docs.you.com](https://docs.you.com)) or `SERPER_API_KEY` ([serper.dev](https://serper.dev/)) or `SERPAPI_KEY` ([serpapi.com](https://serpapi.com/)) or `SERPSTACK_API_KEY` ([serpstack.com](https://serpstack.com/)) or `SEARCHAPI_KEY` ([searchapi.io](https://www.searchapi.io/)) to your `.env.local`.
-You can also simply enable the local google websearch by setting `USE_LOCAL_WEBSEARCH=true` in your `.env.local` or specify a SearXNG instance by adding the query URL to `SEARXNG_QUERY_URL`.
-You can enable javascript when parsing webpages to improve compatibility with `WEBSEARCH_JAVASCRIPT=true` at the cost of increased CPU usage. You'll want at least 4 cores when enabling.
-### Custom models
-You can customize the parameters passed to the model or even use a new model by updating the `MODELS` variable in your `.env.local`. The default one can be found in `.env` and looks like this :
-```env
-MODELS=`[
-  {
-    "name": "mistralai/Mistral-7B-Instruct-v0.2",
-    "displayName": "mistralai/Mistral-7B-Instruct-v0.2",
-    "description": "Mistral 7B is a new Apache 2.0 model, released by Mistral AI that outperforms Llama2 13B in benchmarks.",
-    "websiteUrl": "https://mistral.ai/news/announcing-mistral-7b/",
-    "preprompt": "",
-    "chatPromptTemplate" : "<s>{{#each messages}}{{#ifUser}}[INST] {{#if @first}}{{#if @root.preprompt}}{{@root.preprompt}}\n{{/if}}{{/if}}{{content}} [/INST]{{/ifUser}}{{#ifAssistant}}{{content}}</s>{{/ifAssistant}}{{/each}}",
-    "parameters": {
-      "temperature": 0.3,
-      "top_p": 0.95,
-      "repetition_penalty": 1.2,
-      "top_k": 50,
-      "truncate": 3072,
-      "max_new_tokens": 1024,
-      "stop": ["</s>"]
-    },
-    "promptExamples": [
-      {
-        "title": "Write an email",
-        "prompt": "As a restaurant owner, write a professional email to the supplier to get these products every week: \n\n- Wine (x10)\n- Eggs (x24)\n- Bread (x12)"
-      }, {
-        "title": "Code a game",
-        "prompt": "Code a basic snake game in python, give explanations for each step."
-      }, {
-        "title": "Recipe help",
-        "prompt": "How do I make a delicious lemon cheesecake?"
-      }
-    ]
-  }
-]`
-```
-You can change things like the parameters, or customize the preprompt to better suit your needs. You can also add more models by adding more objects to the array, with different preprompts for example.
-#### chatPromptTemplate
-In 2025 most chat-completion endpoints (local or remotely hosted) support the OpenAI-compatible API and take arrays of messages.
-If not, when querying the model for a chat response, the `chatPromptTemplate` template is used. `messages` is an array of chat messages, it has the format `[{ content: string }, ...]`. To identify if a message is a user message or an assistant message the `ifUser` and `ifAssistant` block helpers can be used.
-The following is the default `chatPromptTemplate`, although newlines and indentation have been added for readability. You can find the prompts used in production for HuggingChat [here](https://github.com/huggingface/chat-ui/blob/main/PROMPTS.md).
-```prompt
-{{preprompt}}
-{{#each messages}}
-  {{#ifUser}}{{@root.userMessageToken}}{{content}}{{@root.userMessageEndToken}}{{/ifUser}}
-  {{#ifAssistant}}{{@root.assistantMessageToken}}{{content}}{{@root.assistantMessageEndToken}}{{/ifAssistant}}
-{{/each}}
-{{assistantMessageToken}}
-```
-> [!INFO]
-> We also support Jinja2 templates for the `chatPromptTemplate` in addition to Handlebars templates. On startup we first try to compile with Jinja and if that fails we fall back to interpreting `chatPromptTemplate` as handlebars.
-#### Multi modal model
-We currently support [IDEFICS](https://huggingface.co/blog/idefics) (hosted on TGI), OpenAI and Claude 3 as multimodal models. You can enable it by setting `multimodal: true` in your `MODELS` configuration. For IDEFICS, you must have a [PRO HF Api token](https://huggingface.co/settings/tokens). For OpenAI, see the [OpenAI section](#openai-api-compatible-models). For Anthropic, see the [Anthropic section](#anthropic).
-```env
-    {
-      "name": "HuggingFaceM4/idefics-80b-instruct",
-      "multimodal" : true,
-      "description": "IDEFICS is the new multimodal model by Hugging Face.",
-      "preprompt": "",
-      "chatPromptTemplate" : "{{#each messages}}{{#ifUser}}User: {{content}}{{/ifUser}}<end_of_utterance>\nAssistant: {{#ifAssistant}}{{content}}\n{{/ifAssistant}}{{/each}}",
-      "parameters": {
-        "temperature": 0.1,
-        "top_p": 0.95,
-        "repetition_penalty": 1.2,
-        "top_k": 12,
-        "truncate": 1000,
-        "max_new_tokens": 1024,
-        "stop": ["<end_of_utterance>", "User:", "\nUser:"]
-      }
-    }
-```
-#### Running your own models using a custom endpoint
-If you want to, instead of hitting models on the Hugging Face Inference API, you can run your own models locally.
-A good option is to hit a [text-generation-inference](https://github.com/huggingface/text-generation-inference), or a llama.cpp endpoint. You will find an example for TGI in the official [Chat UI Spaces Docker template](https://huggingface.co/new-space?template=huggingchat/chat-ui-template) for instance: both this app and a text-generation-inference server run inside the same container.
-To do this, you can add your own endpoints to the `MODELS` variable in `.env.local`, by adding an `"endpoints"` key for each model in `MODELS`.
-```env
-{
-// rest of the model config here
-"endpoints": [{
-  "type" : "tgi",
-  "url": "https://HOST:PORT",
-  }]
-}
-```
-If `endpoints` are left unspecified, ChatUI will look for the model on the hosted Hugging Face inference API using the model name.
-##### OpenAI API compatible models
-Chat UI can be used with any API server that supports OpenAI API compatibility, for example [text-generation-webui](https://github.com/oobabooga/text-generation-webui/tree/main/extensions/openai), [LocalAI](https://github.com/go-skynet/LocalAI), [FastChat](https://github.com/lm-sys/FastChat/blob/main/docs/openai_api.md), [llama-cpp-python](https://github.com/abetlen/llama-cpp-python), and [ialacol](https://github.com/chenhunghan/ialacol) and [vllm](https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html).
-The following example config makes Chat UI work with [text-generation-webui](https://github.com/oobabooga/text-generation-webui/tree/main/extensions/openai). `endpoint.baseUrl` is the URL of the OpenAI API compatible server; it overrides the base URL used by the OpenAI instance. `endpoint.completion` determines which endpoint is used: the default, `chat_completions`, uses `v1/chat/completions`; set `endpoint.completion` to `completions` to use the `v1/completions` endpoint.
-Parameters not supported by OpenAI (e.g., top_k, repetition_penalty, etc.) must be set in the extraBody of endpoints. Be aware that setting them in parameters will cause them to be omitted.
-```
-MODELS=`[
-  {
-    "name": "text-generation-webui",
-    "id": "text-generation-webui",
-    "parameters": {
-      "temperature": 0.9,
-      "top_p": 0.95,
-      "max_new_tokens": 1024,
-      "stop": []
-    },
-    "endpoints": [{
-      "type" : "openai",
-      "baseURL": "http://localhost:8000/v1",
-      "extraBody": {
-        "repetition_penalty": 1.2,
-        "top_k": 50,
-        "truncate": 1000
-      }
-    }]
-  }
-]`
-```
-The `openai` type includes official OpenAI models. You can add, for example, GPT4/GPT3.5 as a "openai" model:
-```
-OPENAI_API_KEY=#your openai api key here
-MODELS=`[{
-      "name": "gpt-4",
-      "displayName": "GPT 4",
-      "endpoints" : [{
-        "type": "openai"
-      }]
-},
-      {
-      "name": "gpt-3.5-turbo",
-      "displayName": "GPT 3.5 Turbo",
-      "endpoints" : [{
-        "type": "openai"
-      }]
-}]`
-```
-You may also consume any model provider that provides compatible OpenAI API endpoint. For example, you may self-host [Portkey](https://github.com/Portkey-AI/gateway) gateway and experiment with Claude or GPTs offered by Azure OpenAI. Example for Claude from Anthropic:
-```
-MODELS=`[{
-  "name": "claude-2.1",
-  "displayName": "Claude 2.1",
-  "description": "Anthropic has been founded by former OpenAI researchers...",
-  "parameters": {
-      "temperature": 0.5,
-      "max_new_tokens": 4096,
-  },
-  "endpoints": [
-      {
-          "type": "openai",
-          "baseURL": "https://gateway.example.com/v1",
-          "defaultHeaders": {
-              "x-portkey-config": '{"provider":"anthropic","api_key":"sk-ant-abc...xyz"}'
-          }
-      }
-  ]
-}]`
-```
-Example for GPT 4 deployed on Azure OpenAI:
-```
-MODELS=`[{
-  "id": "gpt-4-1106-preview",
-  "name": "gpt-4-1106-preview",
-  "displayName": "gpt-4-1106-preview",
-  "parameters": {
-      "temperature": 0.5,
-      "max_new_tokens": 4096,
-  },
-  "endpoints": [
-      {
-          "type": "openai",
-          "baseURL": "https://{resource-name}.openai.azure.com/openai/deployments/{deployment-id}",
-          "defaultHeaders": {
-              "api-key": "{api-key}"
-          },
-          "defaultQuery": {
-              "api-version": "2023-05-15"
-          }
-      }
-  ]
-}]`
-```
-Or try Mistral from [Deepinfra](https://deepinfra.com/mistralai/Mistral-7B-Instruct-v0.1/api?example=openai-http):
-> Note, apiKey can either be set custom per endpoint, or globally using `OPENAI_API_KEY` variable.
-```
-MODELS=`[{
-  "name": "mistral-7b",
-  "displayName": "Mistral 7B",
-  "description": "A 7B dense Transformer, fast-deployed and easily customisable. Small, yet powerful for a variety of use cases. Supports English and code, and a 8k context window.",
-  "parameters": {
-      "temperature": 0.5,
-      "max_new_tokens": 4096,
-  },
-  "endpoints": [
-      {
-          "type": "openai",
-          "baseURL": "https://api.deepinfra.com/v1/openai",
-          "apiKey": "abc...xyz"
-      }
-  ]
-}]`
-```
-_Non-streaming endpoints_
-For endpoints that don't support streaming, like o1 on Azure, you can pass `streamingSupported: false` in your endpoint config:
-```
-MODELS=`[{
-  "id": "o1-preview",
-  "name": "o1-preview",
-  "displayName": "o1-preview",
-  "systemRoleSupported": false,
-  "endpoints": [
-    {
-      "type": "openai",
-      "baseURL": "https://my-deployment.openai.azure.com/openai/deployments/o1-preview",
-      "defaultHeaders": {
-        "api-key": "$SECRET"
-      },
-      "streamingSupported": false,
-    }
-  ]
-}]`
-```
-##### Llama.cpp API server
-chat-ui also supports the llama.cpp API server directly without the need for an adapter. You can do this using the `llamacpp` endpoint type.
-If you want to run Chat UI with llama.cpp, you can do the following, using [microsoft/Phi-3-mini-4k-instruct-gguf](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-gguf) as an example model:
-```bash
-# install llama.cpp
-brew install llama.cpp
-# start llama.cpp server
-llama-server --hf-repo microsoft/Phi-3-mini-4k-instruct-gguf --hf-file Phi-3-mini-4k-instruct-q4.gguf -c 4096
-```
-```env
-MODELS=`[
-  {
-      "name": "Local Zephyr",
-      "chatPromptTemplate": "<|system|>\n{{preprompt}}</s>\n{{#each messages}}{{#ifUser}}<|user|>\n{{content}}</s>\n<|assistant|>\n{{/ifUser}}{{#ifAssistant}}{{content}}</s>\n{{/ifAssistant}}{{/each}}",
-      "parameters": {
-        "temperature": 0.1,
-        "top_p": 0.95,
-        "repetition_penalty": 1.2,
-        "top_k": 50,
-        "truncate": 1000,
-        "max_new_tokens": 2048,
-        "stop": ["</s>"]
-      },
-      "endpoints": [
-        {
-         "url": "http://127.0.0.1:8080",
-         "type": "llamacpp"
-        }
-      ]
-  }
-]`
-```
-Start chat-ui with `npm run dev` and you should be able to chat with Zephyr locally.
-#### Ollama
-We also support the Ollama inference server. Spin up a model with
-```cli
-ollama run mistral
-```
-Then specify the endpoints like so:
-```env
-MODELS=`[
-  {
-      "name": "Ollama Mistral",
-      "chatPromptTemplate": "<s>{{#each messages}}{{#ifUser}}[INST] {{#if @first}}{{#if @root.preprompt}}{{@root.preprompt}}\n{{/if}}{{/if}} {{content}} [/INST]{{/ifUser}}{{#ifAssistant}}{{content}}</s> {{/ifAssistant}}{{/each}}",
-      "parameters": {
-        "temperature": 0.1,
-        "top_p": 0.95,
-        "repetition_penalty": 1.2,
-        "top_k": 50,
-        "truncate": 3072,
-        "max_new_tokens": 1024,
-        "stop": ["</s>"]
-      },
-      "endpoints": [
-        {
-         "type": "ollama",
-         "url" : "http://127.0.0.1:11434",
-         "ollamaName" : "mistral"
-        }
-      ]
-  }
-]`
-```
-#### Anthropic
-We also support Anthropic models (including multimodal ones via `multimodal: true`) through the official SDK. You may provide your API key via the `ANTHROPIC_API_KEY` env variable, or alternatively, through the `endpoints.apiKey` as per the following example.
-```
-MODELS=`[
-  {
-      "name": "claude-3-haiku-20240307",
-      "displayName": "Claude 3 Haiku",
-      "description": "Fastest and most compact model for near-instant responsiveness",
-      "multimodal": true,
-      "parameters": {
-        "max_new_tokens": 4096,
-      },
-      "endpoints": [
-        {
-          "type": "anthropic",
-          // optionals
-          "apiKey": "sk-ant-...",
-          "baseURL": "https://api.anthropic.com",
-          "defaultHeaders": {},
-          "defaultQuery": {}
-        }
-      ]
-  },
-  {
-      "name": "claude-3-sonnet-20240229",
-      "displayName": "Claude 3 Sonnet",
-      "description": "Ideal balance of intelligence and speed",
-      "multimodal": true,
-      "parameters": {
-        "max_new_tokens": 4096,
-      },
-      "endpoints": [
-        {
-          "type": "anthropic",
-          // optionals
-          "apiKey": "sk-ant-...",
-          "baseURL": "https://api.anthropic.com",
-          "defaultHeaders": {},
-          "defaultQuery": {}
-        }
-      ]
-  },
-  {
-      "name": "claude-3-opus-20240229",
-      "displayName": "Claude 3 Opus",
-      "description": "Most powerful model for highly complex tasks",
-      "multimodal": true,
-      "parameters": {
-         "max_new_tokens": 4096
-      },
-      "endpoints": [
-        {
-          "type": "anthropic",
-          // optionals
-          "apiKey": "sk-ant-...",
-          "baseURL": "https://api.anthropic.com",
-          "defaultHeaders": {},
-          "defaultQuery": {}
-        }
-      ]
-  }
-]`
-```
-We also support using Anthropic models running on Vertex AI. Authentication is done using Google Application Default Credentials. Project ID can be provided through the `endpoints.projectId` as per the following example:
-```
-MODELS=`[
-  {
-      "name": "claude-3-sonnet@20240229",
-      "displayName": "Claude 3 Sonnet",
-      "description": "Ideal balance of intelligence and speed",
-      "multimodal": true,
-      "parameters": {
-        "max_new_tokens": 4096,
-      },
-      "endpoints": [
-        {
-          "type": "anthropic-vertex",
-          "region": "us-central1",
-          "projectId": "gcp-project-id",
-          // optionals
-          "defaultHeaders": {},
-          "defaultQuery": {}
-        }
-      ]
-  },
-  {
-      "name": "claude-3-haiku@20240307",
-      "displayName": "Claude 3 Haiku",
-      "description": "Fastest, most compact model for near-instant responsiveness",
-      "multimodal": true,
-      "parameters": {
-         "max_new_tokens": 4096
-      },
-      "endpoints": [
-        {
-          "type": "anthropic-vertex",
-          "region": "us-central1",
-          "projectId": "gcp-project-id",
-          // optionals
-          "defaultHeaders": {},
-          "defaultQuery": {}
-        }
-      ]
-  }
-]`
-```
-#### Amazon
-You can also specify your Amazon SageMaker instance as an endpoint for chat-ui. The config goes like this:
-```env
-"endpoints": [
-    {
-      "type" : "aws",
-      "service" : "sagemaker",
-      "url": "",
-      "accessKey": "",
-      "secretKey" : "",
-      "sessionToken": "",
-      "region": "",
-      "weight": 1
-    }
-]
-```
-You can also set `"service" : "lambda"` to use a lambda instance.
-You can get the `accessKey` and `secretKey` from your AWS user, under programmatic access.
-#### Cloudflare Workers AI
-You can also use Cloudflare Workers AI to run your own models with serverless inference.
-You will need to have a Cloudflare account, then get your [account ID](https://developers.cloudflare.com/fundamentals/setup/find-account-and-zone-ids/) as well as your [API token](https://developers.cloudflare.com/workers-ai/get-started/rest-api/#1-get-api-token-and-account-id) for Workers AI.
-You can either specify them directly in your `.env.local` using the `CLOUDFLARE_ACCOUNT_ID` and `CLOUDFLARE_API_TOKEN` variables, or you can set them directly in the endpoint config.
-You can find the list of models available on Cloudflare [here](https://developers.cloudflare.com/workers-ai/models/#text-generation).
-```env
-  {
-  "name" : "nousresearch/hermes-2-pro-mistral-7b",
-  "tokenizer": "nousresearch/hermes-2-pro-mistral-7b",
-  "parameters": {
-    "stop": ["<|im_end|>"]
-  },
-  "endpoints" : [
-    {
-      "type" : "cloudflare"
-      <!-- optionally specify these
-      "accountId": "your-account-id",
-      "authToken": "your-api-token"
-      -->
-    }
-  ]
-}
-```
-#### Cohere
-You can also use Cohere to run their models directly from chat-ui. You will need to have a Cohere account, then get your [API token](https://dashboard.cohere.com/api-keys). You can either specify it directly in your `.env.local` using the `COHERE_API_TOKEN` variable, or you can set it in the endpoint config.
-Here is an example of a Cohere model config. You can set which model you want to use by setting the `id` field to the model name.
-```env
-  {
-    "name" : "CohereForAI/c4ai-command-r-v01",
-    "id": "command-r",
-    "description": "C4AI Command-R is a research release of a 35 billion parameter highly performant generative model",
-    "endpoints": [
-      {
-        "type": "cohere",
-        <!-- optionally specify these, or use COHERE_API_TOKEN
-        "apiKey": "your-api-token"
-        -->
-      }
-    ]
-  }
-```
-##### Google Vertex models
-Chat UI can connect to the Google Vertex API endpoints ([List of supported models](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models)).
-To enable:
-1. [Select](https://console.cloud.google.com/project) or [create](https://cloud.google.com/resource-manager/docs/creating-managing-projects#creating_a_project) a Google Cloud project.
-1. [Enable billing for your project](https://cloud.google.com/billing/docs/how-to/modify-project).
-1. [Enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).
-1. [Set up authentication with a service account](https://cloud.google.com/docs/authentication/getting-started)
-   so you can access the API from your local workstation.
-The service account credentials file can be imported as an environment variable:
-```env
-    GOOGLE_APPLICATION_CREDENTIALS = clientid.json
-```
-Make sure your docker container has access to the file and the variable is correctly set.
-Afterwards, Google Vertex endpoints can be configured as follows:
-```
-MODELS=`[
-//...
-    {
-       "name": "gemini-1.5-pro",
-       "displayName": "Vertex Gemini Pro 1.5",
-       "multimodal": true,
-       "endpoints" : [{
-          "type": "vertex",
-          "project": "abc-xyz",
-          "location": "europe-west3",
-          "extraBody": {
-          "model_version": "gemini-1.5-pro-preview-0409",
-          },
-          // Optional
-          "safetyThreshold": "BLOCK_MEDIUM_AND_ABOVE",
-          "apiEndpoint": "", // alternative api endpoint url,
-          "tools": [{
-            "googleSearchRetrieval": {
-              "disableAttribution": true
-            }
-          }],
-          "multimodal": {
-            "image": {
-              "supportedMimeTypes": ["image/png", "image/jpeg", "image/webp"],
-              "preferredMimeType": "image/png",
-              "maxSizeInMB": 5,
-              "maxWidth": 2000,
-              "maxHeight": 1000,
-            }
-          }
-       }]
-     },
-]`
-```
-##### LangServe
-LangChain applications that are deployed using LangServe can be called with the following config:
-```
-MODELS=`[
-//...
-    {
-       "name": "summarization-chain", //model-name
-       "endpoints" : [{
-         "type": "langserve",
-         "url" : "http://127.0.0.1:8100",
-       }]
-     },
-]`
-```
-### Model Context Protocol (MCP) Support (Upcoming)
-The project is planning to introduce support for the Model Context Protocol (MCP). MCP is a specification designed to standardize how language models receive and understand context from various sources. This will enable more flexible and powerful integrations, allowing models to seamlessly access and utilize a broader range of information, such as user history, external documents, or real-time data, in a structured way.
-This is an upcoming feature, and we believe it will significantly enhance the capabilities and extensibility of Chat UI.
-We are actively seeking contributions from the community to help design, implement, and integrate MCP support into Chat UI. If you are interested in shaping the future of how Chat UI handles model context and want to contribute to this exciting development, please look for issues tagged with 'MCP' or 'Model Context Protocol' on our issue tracker. Your expertise and input would be invaluable!
-### Custom endpoint authorization
-#### Basic and Bearer
-Custom endpoints may require authorization, depending on how you configure them. Authentication will usually be set either with `Basic` or `Bearer`.
-For `Basic` we will need to generate a base64 encoding of the username and password.
-`echo -n "USER:PASS" | base64`
-> VVNFUjpQQVNT
-For `Bearer` you can use a token, which can be grabbed from [here](https://huggingface.co/settings/tokens).
-You can then add the generated information and the `authorization` parameter to your `.env.local`.
-```env
-"endpoints": [
-  {
-    "url": "https://HOST:PORT",
-    "authorization": "Basic VVNFUjpQQVNT",
-  }
-]
-```
-Please note that if `HF_TOKEN` is also set or not empty, it will take precedence.
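For readers of the removed authorization section, here is a minimal Python sketch of how the two header values could be built. The function names are hypothetical and not part of Chat UI; only the `USER:PASS` example and its encoding come from the text above.

```python
import base64


def basic_authorization(username: str, password: str) -> str:
    """Build an `authorization` value for HTTP Basic auth."""
    # Same operation as: echo -n "USER:PASS" | base64
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


def bearer_authorization(token: str) -> str:
    """Build an `authorization` value from an access token."""
    return f"Bearer {token}"


print(basic_authorization("USER", "PASS"))  # Basic VVNFUjpQQVNT
```

The printed value matches the `"authorization": "Basic VVNFUjpQQVNT"` entry in the removed example.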
-#### Models hosted on multiple custom endpoints
-If the model being hosted will be available on multiple servers/instances add the `weight` parameter to your `.env.local`. The `weight` will be used to determine the probability of requesting a particular endpoint.
-```env
-"endpoints": [
-  {
-    "url": "https://HOST:PORT",
-    "weight": 1
-  },
-  {
-    "url": "https://HOST:PORT",
-    "weight": 2
-  }
-  ...
-]
-```
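The removed text says `weight` sets the probability of requesting a given endpoint. A minimal sketch of weight-proportional selection follows, assuming plain weighted random choice; the source does not show how Chat UI implemented it, and `pick_endpoint`, `HOST_A`, and `HOST_B` are hypothetical names.

```python
import random


def pick_endpoint(endpoints, rng=random):
    """Pick one endpoint with probability proportional to its `weight`."""
    # Assumption of this sketch: an endpoint without `weight` counts as weight 1.
    weights = [endpoint.get("weight", 1) for endpoint in endpoints]
    return rng.choices(endpoints, weights=weights, k=1)[0]


endpoints = [
    {"url": "https://HOST_A:PORT", "weight": 1},
    {"url": "https://HOST_B:PORT", "weight": 2},
]
print(pick_endpoint(endpoints)["url"])
```

With weights 1 and 2, the second endpoint would be expected to receive roughly two thirds of the requests over many calls.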
-#### Client Certificate Authentication (mTLS)
-Custom endpoints may require client certificate authentication, depending on how you configure them. To enable mTLS between Chat UI and your custom endpoint, you will need to set the `USE_CLIENT_CERTIFICATE` to `true`, and add the `CERT_PATH` and `KEY_PATH` parameters to your `.env.local`. These parameters should point to the location of the certificate and key files on your local machine. The certificate and key files should be in PEM format. The key file can be encrypted with a passphrase, in which case you will also need to add the `CLIENT_KEY_PASSWORD` parameter to your `.env.local`.
-If you're using a certificate signed by a private CA, you will also need to add the `CA_PATH` parameter to your `.env.local`. This parameter should point to the location of the CA certificate file on your local machine.
-If you're using a self-signed certificate, e.g. for testing or development purposes, you can set the `REJECT_UNAUTHORIZED` parameter to `false` in your `.env.local`. This will disable certificate validation, and allow Chat UI to connect to your custom endpoint.
-#### Specific Embedding Model
-A model can use any of the embedding models defined in `.env.local` (embeddings are currently used when web searching).
-By default it will use the first embedding model, but this can be changed with the `embeddingModel` field:
-```env
-TEXT_EMBEDDING_MODELS = `[
-  {
-    "name": "Xenova/gte-small",
-    "chunkCharLength": 512,
-    "endpoints": [
-      {"type": "transformersjs"}
-    ]
-  },
-  {
-    "name": "intfloat/e5-base-v2",
-    "chunkCharLength": 768,
-    "endpoints": [
-      {"type": "tei", "url": "http://127.0.0.1:8080/", "authorization": "Basic VVNFUjpQQVNT"},
-      {"type": "tei", "url": "http://127.0.0.1:8081/"}
-    ]
-  }
-]`
-MODELS=`[
-  {
-      "name": "Ollama Mistral",
-      "chatPromptTemplate": "...",
-      "embeddingModel": "intfloat/e5-base-v2",
-      "parameters": {
-        ...
-      },
-      "endpoints": [
-        ...
-      ]
-  }
-]`
-```
-### Reasoning Models
-ChatUI supports specialized reasoning/Chain-of-Thought (CoT) models through the `reasoning` configuration field. When properly configured, this displays a UI widget that allows users to view or collapse the model’s reasoning steps. We support three types of reasoning parsing:
-#### Token-Based Delimitations
-For models like DeepSeek R1, token-based delimitations can be used to identify reasoning steps. This is done by specifying the `beginToken` and `endToken` fields in the `reasoning` configuration.
-Example configuration for DeepSeek R1 (token-based):
-```json
-{
-	"name": "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B",
-	// ...
-	"reasoning": {
-		"type": "tokens",
-		"beginToken": "<think>",
-		"endToken": "</think>"
-	}
-}
-```
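To illustrate what the removed `tokens` reasoning type describes, the sketch below splits a model output at the `beginToken`/`endToken` pair. It is an assumption-laden illustration, not the parser Chat UI used; `split_reasoning` is a hypothetical name.

```python
def split_reasoning(output, begin_token="<think>", end_token="</think>"):
    """Return (reasoning, answer) by cutting `output` at the first delimiter pair."""
    start = output.find(begin_token)
    if start == -1:
        # No reasoning block at all: the whole output is the answer.
        return "", output.strip()
    body_start = start + len(begin_token)
    end = output.find(end_token, body_start)
    if end == -1:
        # The end token has not arrived yet, so everything so far is reasoning.
        return output[body_start:].strip(), ""
    reasoning = output[body_start:end].strip()
    answer = (output[:start] + output[end + len(end_token):]).strip()
    return reasoning, answer


print(split_reasoning("<think>2 + 2 is 4</think>The answer is 4."))
```

Handling the missing end token separately matters for streamed output, where the reasoning block may still be open when the text is inspected.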
-#### Summarizing the Chain of Thought
-For models like QwQ, which return a chain of thought but do not explicitly provide a final answer, the `summarize` type can be used. This automatically summarizes the reasoning steps using the `TASK_MODEL` (or the first model in the configuration if `TASK_MODEL` is not specified) and displays the summary as the final answer.
-Example configuration for QwQ (summarize-based):
-```json
-{
-	"name": "Qwen/QwQ-32B-Preview",
-	// ...
-	"reasoning": {
-		"type": "summarize"
-	}
-}
-```
-#### Regex-Based Delimitations
-In some cases, the final answer can be extracted from the model output using a regular expression. This is achieved by specifying the `regex` field in the `reasoning` configuration. For example, if your model wraps the final answer in a `\boxed{}` tag, you can use the following configuration:
-```json
-{
-	"name": "model/yourmodel",
-	// ...
-	"reasoning": {
-		"type": "regex",
-		"regex": "\\\\boxed\\{(.+?)\\}"
-	}
-}
-```
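The escaping in the removed `regex` example is easy to misread, so here is a small Python check of what the pattern matches once the JSON is decoded. It assumes the first capture group is taken as the final answer, which the source implies but does not spell out; `extract_final_answer` is a hypothetical helper.

```python
import json
import re

# The `reasoning` object from the example above, kept verbatim in a raw string.
config = json.loads(r'{"type": "regex", "regex": "\\\\boxed\\{(.+?)\\}"}')


def extract_final_answer(output, pattern):
    """Return the first capture group of `pattern` in `output`, or None."""
    match = re.search(pattern, output)
    return match.group(1) if match else None


print(config["regex"])
print(extract_final_answer(r"so the result is \boxed{42}.", config["regex"]))
```

After JSON decoding, the four backslashes collapse to two, which the regex engine reads as one literal backslash before `boxed`.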
-#### Enabling/Disabling Reasoning Summary
-You can toggle the summaries that are displayed alongside the CoT by changing the `REASONING_SUMMARY` env variable.
-```env
-REASONING_SUMMARY=false
-```
-## Common issues
-### 403: You don't have access to this conversation
-Most likely you are running chat-ui over HTTP. The recommended option is to set up something like NGINX to handle HTTPS and proxy the requests to chat-ui. If you really need to run over HTTP, you can add `COOKIE_SECURE=false` and `COOKIE_SAMESITE=lax` to your `.env.local`.
-Make sure to set your `PUBLIC_ORIGIN` in your `.env.local` to the correct URL as well.
-## Deploying to a HF Space
-Create a `DOTENV_LOCAL` secret in your HF Space with the content of your `.env.local`; it will be picked up automatically when the Space runs.
 ## Building
@@ -1032,89 +150,3 @@ npm run build
 You can preview the production build with `npm run preview`.
 > To deploy your app, you may need to install an [adapter](https://kit.svelte.dev/docs/adapters) for your target environment.
-## Config changes for HuggingChat
-The config file for HuggingChat is stored in the `chart/env/prod.yaml` file. It is the source of truth for the environment variables used for our CI/CD pipeline. For HuggingChat, as we need to customize the app color, as well as the base path, we build a custom docker image. You can find the workflow here.
-> [!TIP]
-> If you want to make changes to the model config used in production for HuggingChat, you should do so against `chart/env/prod.yaml`.
-### Running a copy of HuggingChat locally
-If you want to run an exact copy of HuggingChat locally, you will need to do the following first:
-1. Create an [OAuth App on the hub](https://huggingface.co/settings/applications/new) with `openid profile email` permissions. Make sure to set the callback URL to something like `http://localhost:5173/chat/login/callback` which matches the right path for your local instance.
-2. Create a [HF Token](https://huggingface.co/settings/tokens) with your Hugging Face account. You will need a Pro account to be able to access some of the larger models available through HuggingChat.
-3. Create a free account with [serper.dev](https://serper.dev/) (you will get 2500 free search queries)
-4. Run an instance of mongoDB, however you want. (Local or remote)
-You can then create a new `.env.SECRET_CONFIG` file with the following content
-```env
-MONGODB_URL=<link to your mongo DB from step 4>
-HF_TOKEN=<your HF token from step 2>
-OPENID_CONFIG=`{
-  PROVIDER_URL: "https://huggingface.co",
-  CLIENT_ID: "<your client ID from step 1>",
-  CLIENT_SECRET: "<your client secret from step 1>",
-}`
-SERPER_API_KEY=<your serper API key from step 3>
-MESSAGES_BEFORE_LOGIN=<can be any numerical value, or set to 0 to require login>
-```
-You can then run `npm run updateLocalEnv` in the root of chat-ui. This will create a `.env.local` file which combines the `chart/env/prod.yaml` and the `.env.SECRET_CONFIG` file. You can then run `npm run dev` to start your local instance of HuggingChat.
-### Populate database
-> [!WARNING]
-> The `MONGODB_URL` used for this script will be fetched from `.env.local`. Make sure it's correct! The command runs directly on the database.
-You can populate the database using faker data using the `populate` script:
-```bash
-npm run populate <flags here>
-```
-At least one flag must be specified. The following flags are available:
-- `reset` - resets the database
-- `all` - populates all tables
-- `users` - populates the users table
-- `settings` - populates the settings table for existing users
-- `assistants` - populates the assistants table for existing users
-- `conversations` - populates the conversations table for existing users
-For example, you could use it like so:
-```bash
-npm run populate reset
-```
-to clear out the database. Then log in to the app to create your user and run the following command:
-```bash
-npm run populate users settings assistants conversations
-```
-to populate the database with fake data, including fake conversations and assistants for your user.
-## Building the docker images locally
-You can build the docker images locally using the following commands:
-```bash
-docker build -t chat-ui-db:latest --build-arg INCLUDE_DB=true .
-docker build -t chat-ui:latest --build-arg INCLUDE_DB=false .
-docker build -t huggingchat:latest --build-arg INCLUDE_DB=false --build-arg APP_BASE=/chat --build-arg PUBLIC_APP_COLOR=yellow --build-arg SKIP_LLAMA_CPP_BUILD=true .
-```
-If you want to run the images with your local `.env.local`, you have two options:
-```bash
-DOTENV_LOCAL=$(<.env.local)  docker run --network=host -e DOTENV_LOCAL chat-ui-db
-```
-```bash
-docker run --network=host --mount type=bind,source="$(pwd)/.env.local",target=/app/.env.local chat-ui-db
-```

 # Chat UI
+![Chat UI repository thumbnail](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/chat-ui/Frame%2013.png)
 A chat interface using open source models, e.g. OpenAssistant or Llama. It is a SvelteKit app and it powers the [HuggingChat app on hf.co/chat](https://huggingface.co/chat).
 0. [Quickstart](#quickstart)
+1. [Database Options](#database-options)
+2. [Launch](#launch)
+3. [Optional Docker Image](#optional-docker-image)
+4. [Extra parameters](#extra-parameters)
+5. [Building](#building)
+> Note on models: Chat UI only supports OpenAI-compatible APIs via `OPENAI_BASE_URL` and the `/models` endpoint. Provider-specific integrations (legacy `MODELS` env var, GGUF discovery, embeddings, web-search helpers, etc.) are removed, but any service that speaks the OpenAI protocol—Hugging Face router, llama.cpp server, Ollama’s OpenAI bridge, OpenRouter, Anthropic-on-OpenRouter, etc.—will work.
+## Quickstart
+Chat UI speaks to OpenAI-compatible APIs only. The fastest way to get running is with the Hugging Face Inference Providers router plus your personal Hugging Face access token.
+**Step 1 – Create `.env.local`:**
+```env
+OPENAI_BASE_URL=https://router.huggingface.co/v1
+OPENAI_API_KEY=hf_************************
+# Fill in once you pick a database option below
+MONGODB_URL=
 ```
+`OPENAI_API_KEY` can come from any OpenAI-compatible endpoint you plan to call. Pick the combo that matches your setup and drop the values into `.env.local`:
+| Provider                                      | Example `OPENAI_BASE_URL`          | Example key env                                                         |
+| --------------------------------------------- | ---------------------------------- | ----------------------------------------------------------------------- |
+| Hugging Face Inference Providers router       | `https://router.huggingface.co/v1` | `OPENAI_API_KEY=hf_xxx` (or `HF_TOKEN` legacy alias)                    |
+| llama.cpp server (`llama-server`)             | `http://127.0.0.1:8080/v1`         | `OPENAI_API_KEY=sk-local-demo` (any string works; llama.cpp ignores it) |
+| Ollama (with OpenAI-compatible bridge)        | `http://127.0.0.1:11434/v1`        | `OPENAI_API_KEY=ollama`                                                 |
+| OpenRouter                                    | `https://openrouter.ai/api/v1`     | `OPENAI_API_KEY=sk-or-v1-...`                                           |
+Check the root [`.env` template](./.env) for the full list of optional variables you can override.
+**Step 2 – Choose where MongoDB lives:** Either provision a managed cluster (for example MongoDB Atlas) or run a local container. Both approaches are described in [Database Options](#database-options). After you have the URI, drop it into `MONGODB_URL` (and, if desired, set `MONGODB_DB_NAME`).
+**Step 3 – Install and launch the dev server:**
 ```bash
 git clone https://github.com/huggingface/chat-ui
 cd chat-ui
 npm install
 npm run dev -- --open
 ```
+You now have Chat UI running against the Hugging Face router without needing to host MongoDB yourself.
+## Database Options
+Chat history, users, settings, files, and stats all live in MongoDB. You can point Chat UI at any MongoDB 6/7 deployment.
+### MongoDB Atlas (managed)
+1. Create a free cluster at [mongodb.com](https://www.mongodb.com/pricing).
+2. Add your IP (or `0.0.0.0/0` for development) to the network access list.
+3. Create a database user and copy the connection string.
+4. Paste that string into `MONGODB_URL` in `.env.local`. Keep the default `MONGODB_DB_NAME=chat-ui` or change it per environment.
+Atlas keeps MongoDB off your laptop, which is ideal for teams or cloud deployments.
+### Local MongoDB (container)
+If you prefer to run MongoDB locally:
 ```bash
 docker run -d -p 27017:27017 --name mongo-chatui mongo:latest
 ```
+Then set `MONGODB_URL=mongodb://localhost:27017` in `.env.local`. You can also supply `MONGO_STORAGE_PATH` if you want Chat UI’s fallback in-memory server to persist under a specific folder.
 ## Launch
+After configuring your environment variables, start Chat UI with:
 ```bash
 npm install
 npm run dev
 ```
+The dev server listens on `http://localhost:5173` by default. Use `npm run build` / `npm run preview` for production builds.
+## Optional Docker Image
+Prefer containerized setup? You can run everything in one container as long as you supply a MongoDB URI (local or hosted):
+```bash
+docker run \
+  -p 3000:3000 \
+  -e MONGODB_URL=mongodb://host.docker.internal:27017 \
+  -e OPENAI_BASE_URL=https://router.huggingface.co/v1 \
+  -e OPENAI_API_KEY=hf_*** \
+  -v db:/data \
+  ghcr.io/huggingface/chat-ui-db:latest
 ```
+`host.docker.internal` lets the container reach a MongoDB instance on your host machine; swap it for your Atlas URI if you use the hosted option. All environment variables accepted in `.env.local` can be provided as `-e` flags.
 ## Extra parameters
 ### Theming
 You can use a few environment variables to customize the look and feel of chat-ui. These are by default:
 PUBLIC_APP_COLOR=blue
 PUBLIC_APP_DESCRIPTION="Making the community's best AI chat models available to everyone."
 PUBLIC_APP_DATA_SHARING=
 ```
 - `PUBLIC_APP_NAME` The name used as a title throughout the app.
 - `PUBLIC_APP_ASSETS` Is used to find logos & favicons in `static/$PUBLIC_APP_ASSETS`, current options are `chatui` and `huggingchat`.
 - `PUBLIC_APP_COLOR` Can be any of the [tailwind colors](https://tailwindcss.com/docs/customizing-colors#default-color-palette).
 - `PUBLIC_APP_DATA_SHARING` Can be set to 1 to add a toggle in the user settings that lets your users opt in to data sharing with model creators.
+### Models
+This build does not use the `MODELS` env var or GGUF discovery. Configure models via `OPENAI_BASE_URL` only; Chat UI will fetch `${OPENAI_BASE_URL}/models` and populate the list automatically. Authorization uses `OPENAI_API_KEY` (preferred). `HF_TOKEN` remains a legacy alias.
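To check what Chat UI would discover from a given `OPENAI_BASE_URL`, a sketch like the following can list model IDs. It assumes the standard OpenAI list-models response shape (`{"object": "list", "data": [{"id": ...}]}`) and bearer-token auth; the helper names are hypothetical and this is not Chat UI's own code.

```python
import json
import urllib.request


def models_url(base_url: str) -> str:
    """Build the `/models` URL from an OPENAI_BASE_URL value."""
    return base_url.rstrip("/") + "/models"


def model_ids(payload: dict) -> list:
    """Extract model IDs from an OpenAI-style list-models payload."""
    return [entry["id"] for entry in payload.get("data", [])]


def fetch_model_ids(base_url: str, api_key: str) -> list:
    """Call `${OPENAI_BASE_URL}/models` and return the advertised model IDs."""
    request = urllib.request.Request(
        models_url(base_url),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return model_ids(json.load(response))
```

For example, `fetch_model_ids("https://router.huggingface.co/v1", "hf_xxx")` would need a valid token in place of `hf_xxx`.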
+### LLM Router (Optional)
+Chat UI can perform client-side routing using an Arch Router model without running a separate router service. The UI exposes a virtual model alias called "Omni" (configurable) that, when selected, chooses the best route/model for each message.
+- Provide a routes policy JSON via `LLM_ROUTER_ROUTES_PATH`. No sample file ships with this branch, so you must point the variable to a JSON array you create yourself (for example, commit one in your project like `config/routes.chat.json`). Each route entry needs `name`, `description`, `primary_model`, and optional `fallback_models`.
+- Configure the Arch router selection endpoint with `LLM_ROUTER_ARCH_BASE_URL` (OpenAI-compatible `/chat/completions`) and `LLM_ROUTER_ARCH_MODEL` (e.g. `router/omni`). The Arch call reuses `OPENAI_API_KEY` for auth.
+- Map `other` to a concrete route via `LLM_ROUTER_OTHER_ROUTE` (default: `casual_conversation`). If Arch selection fails, calls fall back to `LLM_ROUTER_FALLBACK_MODEL`.
+- Selection timeout can be tuned via `LLM_ROUTER_ARCH_TIMEOUT_MS` (default 10000).
+- Omni alias configuration: `PUBLIC_LLM_ROUTER_ALIAS_ID` (default `omni`), `PUBLIC_LLM_ROUTER_DISPLAY_NAME` (default `Omni`), and optional `PUBLIC_LLM_ROUTER_LOGO_URL`.
+When you select Omni in the UI, Chat UI will:
+- Call the Arch endpoint once (non-streaming) to pick the best route for the last turns.
+- Emit RouterMetadata immediately (route and actual model used) so the UI can display it.
+- Stream from the selected model via your configured `OPENAI_BASE_URL`. On errors, it tries route fallbacks.
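The routes policy and fallback order described above can be sketched as follows. Only the field names (`name`, `description`, `primary_model`, `fallback_models`) and the `casual_conversation` default come from the README; the functions and model names are hypothetical, and the exact fallback behaviour in Chat UI may differ.

```python
import json

REQUIRED_FIELDS = ("name", "description", "primary_model")


def load_routes(text):
    """Parse a routes policy (JSON array) into a dict keyed by route name."""
    routes = json.loads(text)
    if not isinstance(routes, list):
        raise ValueError("routes policy must be a JSON array")
    for route in routes:
        missing = [field for field in REQUIRED_FIELDS if field not in route]
        if missing:
            raise ValueError(f"route is missing fields: {missing}")
    return {route["name"]: route for route in routes}


def candidate_models(routes, selected, other_route="casual_conversation", fallback_model=None):
    """Return the models to try, in order, for the route the router selected."""
    if selected == "other":
        selected = other_route  # plays the role of LLM_ROUTER_OTHER_ROUTE
    route = routes.get(selected)
    if route is None:
        # Unknown route: use the global fallback (LLM_ROUTER_FALLBACK_MODEL), if any.
        return [fallback_model] if fallback_model else []
    return [route["primary_model"], *route.get("fallback_models", [])]


policy = """[
  {"name": "casual_conversation", "description": "Small talk",
   "primary_model": "example/small-chat-model"},
  {"name": "code_generation", "description": "Writing or fixing code",
   "primary_model": "example/code-model",
   "fallback_models": ["example/small-chat-model"]}
]"""
print(candidate_models(load_routes(policy), "code_generation"))
```

A file with this shape is what `LLM_ROUTER_ROUTES_PATH` would point to.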
 ## Building
 You can preview the production build with `npm run preview`.
 > To deploy your app, you may need to install an [adapter](https://kit.svelte.dev/docs/adapters) for your target environment.

chart/env/prod.yaml CHANGED

@@ -51,11 +51,8 @@ envVars:
   COOKIE_SAMESITE: "lax"
   COOKIE_SECURE: "true"
   ENABLE_ASSISTANTS: "true"
-  ENABLE_ASSISTANTS_RAG: "true"
   ENABLE_CONFIG_MANAGER: "false"
-  METRICS_PORT: 5565
   LOG_LEVEL: "debug"
-  METRICS_ENABLED: "true"
   MODELS: >
     [
       {
@@ -542,10 +539,8 @@ envVars:
   PUBLIC_APP_ASSETS: "huggingchat"
   PUBLIC_APP_COLOR: "yellow"
   PUBLIC_APP_DESCRIPTION: "Making the community's best AI chat models available to everyone."
-  PUBLIC_APP_DISCLAIMER_MESSAGE: "Disclaimer: AI is an area of active research with known problems such as biased generation and misinformation. Do not use this application for high-stakes decisions or advice."
   PUBLIC_APP_GUEST_MESSAGE: "Sign in with a free Hugging Face account to continue using HuggingChat."
   PUBLIC_APP_DATA_SHARING: 0
-  PUBLIC_APP_DISCLAIMER: 1
   PUBLIC_PLAUSIBLE_SCRIPT_URL: "/js/script.js"
   REQUIRE_FEATURED_ASSISTANTS: "true"
   TASK_MODEL: >

   COOKIE_SAMESITE: "lax"
   COOKIE_SECURE: "true"
   ENABLE_ASSISTANTS: "true"
   ENABLE_CONFIG_MANAGER: "false"
   LOG_LEVEL: "debug"
   MODELS: >
     [
       {
   PUBLIC_APP_ASSETS: "huggingchat"
   PUBLIC_APP_COLOR: "yellow"
   PUBLIC_APP_DESCRIPTION: "Making the community's best AI chat models available to everyone."
   PUBLIC_APP_GUEST_MESSAGE: "Sign in with a free Hugging Face account to continue using HuggingChat."
   PUBLIC_APP_DATA_SHARING: 0
   PUBLIC_PLAUSIBLE_SCRIPT_URL: "/js/script.js"
   REQUIRE_FEATURED_ASSISTANTS: "true"
   TASK_MODEL: >

chart/templates/deployment.yaml CHANGED

@@ -53,11 +53,6 @@ spec:
             - containerPort: {{ $.Values.envVars.APP_PORT | default 3000 | int }}
               name: http
               protocol: TCP
-            {{- if $.Values.monitoring.enabled }}
-            - containerPort: {{ $.Values.envVars.METRICS_PORT | default 5565 | int }}
-              name: metrics
-              protocol: TCP
-            {{- end }}
           resources: {{ toYaml .Values.resources | nindent 12 }}
           {{- with $.Values.extraEnv }}
           env:

             - containerPort: {{ $.Values.envVars.APP_PORT | default 3000 | int }}
               name: http
               protocol: TCP
           resources: {{ toYaml .Values.resources | nindent 12 }}
           {{- with $.Values.extraEnv }}
           env:

chart/templates/service-monitor.yaml DELETED

@@ -1,15 +0,0 @@
-{{- if $.Values.monitoring.enabled }}
-apiVersion: monitoring.coreos.com/v1
-kind: ServiceMonitor
-metadata:
-  labels: {{ include "labels.standard" . | nindent 4 }}
-  name: {{ include "name" . }}
-  namespace: {{ .Release.Namespace }}
-spec:
-  selector:
-    matchLabels: {{ include "labels.standard" . | nindent 6 }}
-  endpoints:
-    - port: metrics
-      path: /metrics
-      interval: 15s
-{{- end }}

chart/templates/service.yaml CHANGED

@@ -11,11 +11,5 @@ spec:
     port: 80
     protocol: TCP
     targetPort: http
-  {{- if $.Values.monitoring.enabled }}
-  - name: metrics
-    port: 5565
-    protocol: TCP
-    targetPort: metrics
-  {{- end }}
   selector: {{ include "labels.standard" . | nindent 4 }}
   type: {{.Values.service.type}}

     port: 80
     protocol: TCP
     targetPort: http
   selector: {{ include "labels.standard" . | nindent 4 }}
   type: {{.Values.service.type}}

chart/values.yaml CHANGED

@@ -70,5 +70,4 @@ autoscaling:
   targetMemoryUtilizationPercentage: ""
   targetCPUUtilizationPercentage: ""
-monitoring:
-  enabled: false

   targetMemoryUtilizationPercentage: ""
   targetCPUUtilizationPercentage: ""
+## Metrics removed; monitoring configuration no longer used

docs/source/_toctree.yml DELETED

@@ -1,64 +0,0 @@
-- local: index
-  title: 🤗 Chat UI
-- title: Installation
-  sections:
-    - local: installation/local
-      title: Local
-    - local: installation/spaces
-      title: Spaces
-    - local: installation/docker
-      title: Docker
-    - local: installation/helm
-      title: Helm
-- title: Configuration
-  sections:
-    - local: configuration/overview
-      title: Overview
-    - local: configuration/theming
-      title: Theming
-    - local: configuration/open-id
-      title: OpenID
-    - local: configuration/web-search
-      title: Web Search
-    - local: configuration/metrics
-      title: Metrics
-    - local: configuration/embeddings
-      title: Text Embedding Models
-    - title: Models
-      sections:
-        - local: configuration/models/overview
-          title: Overview
-        - local: configuration/models/multimodal
-          title: Multimodal
-        - local: configuration/models/tools
-          title: Tools
-        - title: Providers
-          sections:
-            - local: configuration/models/providers/anthropic
-              title: Anthropic
-            - local: configuration/models/providers/aws
-              title: AWS
-            - local: configuration/models/providers/cloudflare
-              title: Cloudflare
-            - local: configuration/models/providers/cohere
-              title: Cohere
-            - local: configuration/models/providers/google
-              title: Google
-            - local: configuration/models/providers/langserve
-              title: Langserve
-            - local: configuration/models/providers/llamacpp
-              title: Llama.cpp
-            - local: configuration/models/providers/ollama
-              title: Ollama
-            - local: configuration/models/providers/openai
-              title: OpenAI
-            - local: configuration/models/providers/tgi
-              title: TGI
-    - local: configuration/common-issues
-      title: Common Issues
-- title: Developing
-  sections:
-    - local: developing/architecture
-      title: Architecture
-    - local: developing/copy-huggingchat
-      title: Copy HuggingChat

docs/source/configuration/common-issues.md DELETED

@@ -1,7 +0,0 @@
-# Common Issues
-## 403: You don't have access to this conversation
-Most likely you are running chat-ui over HTTP. The recommended option is to set up something like NGINX to handle HTTPS and proxy the requests to chat-ui. If you really need to run over HTTP, you can add `ALLOW_INSECURE_COOKIES=true` to your `.env.local`.
-Make sure to set your `PUBLIC_ORIGIN` in your `.env.local` to the correct URL as well.

docs/source/configuration/embeddings.md DELETED

@@ -1,105 +0,0 @@
-# Text Embedding Models
-By default (for backward compatibility), when `TEXT_EMBEDDING_MODELS` environment variable is not defined, [transformers.js](https://huggingface.co/docs/transformers.js) embedding models will be used for embedding tasks, specifically, the [Xenova/gte-small](https://huggingface.co/Xenova/gte-small) model.
-You can customize the embedding model by setting `TEXT_EMBEDDING_MODELS` in your `.env.local` file where the required fields are `name`, `chunkCharLength` and `endpoints`.
-Supported text embedding backends are: [`transformers.js`](https://huggingface.co/docs/transformers.js), [`TEI`](https://github.com/huggingface/text-embeddings-inference) and [`OpenAI`](https://platform.openai.com/docs/guides/embeddings). `transformers.js` models run locally as part of `chat-ui`, whereas `TEI` models run in a different environment and are accessed through an API endpoint. `openai` models are accessed through the [OpenAI API](https://platform.openai.com/docs/guides/embeddings).
-When more than one embedding model is supplied in the `.env.local` file, the first is used by default, and the others are only used by LLMs whose `embeddingModel` is set to the name of that embedding model.
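The selection rule in the removed paragraph above (first embedding model by default, otherwise the one named by `embeddingModel`) can be sketched in a few lines. `resolve_embedding_model` is a hypothetical name, and raising on an unknown name is an assumption of this sketch.

```python
def resolve_embedding_model(model_config, embedding_models):
    """Pick the embedding model an LLM entry should use."""
    wanted = model_config.get("embeddingModel")
    if wanted is None:
        # No explicit choice: fall back to the first configured embedding model.
        return embedding_models[0]
    for candidate in embedding_models:
        if candidate["name"] == wanted:
            return candidate
    raise ValueError(f"unknown embedding model: {wanted}")


embedding_models = [
    {"name": "Xenova/gte-small", "chunkCharLength": 512},
    {"name": "intfloat/e5-base-v2", "chunkCharLength": 768},
]
llm = {"name": "Ollama Mistral", "embeddingModel": "intfloat/e5-base-v2"}
print(resolve_embedding_model(llm, embedding_models)["name"])
```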
-## Transformers.js
-The Transformers.js backend uses local CPU for the embedding which can be quite slow. If possible, consider using TEI or OpenAI embeddings instead if you use web search frequently, as performance will improve significantly.
-```ini
-TEXT_EMBEDDING_MODELS = `[
-  {
-    "name": "Xenova/gte-small",
-    "displayName": "Xenova/gte-small",
-    "description": "locally running embedding",
-    "chunkCharLength": 512,
-    "endpoints": [
-      { "type": "transformersjs" }
-    ]
-  }
-]`
-```
-## Text Embeddings Inference (TEI)
-> Text Embeddings Inference (TEI) is a comprehensive toolkit designed for efficient deployment and serving of open source text embeddings models. It enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE, and E5.
-Some recommended models at the time of writing (May 2024) are `Snowflake/snowflake-arctic-embed-m` and `BAAI/bge-large-en-v1.5`. You may run TEI locally with GPU support via Docker:
-`docker run --gpus all -p 8080:80 -v tei-data:/data --name tei ghcr.io/huggingface/text-embeddings-inference:1.2 --model-id YOUR/HF_MODEL`
-You can then hook this up to your Chat UI instance with the following configuration.
-```ini
-TEXT_EMBEDDING_MODELS=`[
-  {
-    "name": "YOUR/HF_MODEL",
-    "displayName": "YOUR/HF_MODEL",
-    "preQuery": "Check the model documentation for the preQuery. Not all models have one",
-    "prePassage": "Check the model documentation for the prePassage. Not all models have one",
-    "chunkCharLength": 512,
-    "endpoints": [{
-      "type": "tei",
-      "url": "http://127.0.0.1:8080/"
-    }]
-  }
-]`
-```
-Examples for `Snowflake/snowflake-arctic-embed-m` and `BAAI/bge-large-en-v1.5`:
-```ini
-TEXT_EMBEDDING_MODELS=`[
-  {
-    "name": "Snowflake/snowflake-arctic-embed-m",
-    "displayName": "Snowflake/snowflake-arctic-embed-m",
-    "preQuery": "Represent this sentence for searching relevant passages: ",
-    "chunkCharLength": 512,
-    "endpoints": [{
-      "type": "tei",
-      "url": "http://127.0.0.1:8080/"
-    }]
-  },{
-    "name": "BAAI/bge-large-en-v1.5",
-    "displayName": "BAAI/bge-large-en-v1.5",
-    "chunkCharLength": 512,
-    "endpoints": [{
-      "type": "tei",
-      "url": "http://127.0.0.1:8080/"
-    }]
-  }
-]`
-```
-## OpenAI
-It's also possible to host your own OpenAI API compatible embedding models. [`Infinity`](https://github.com/michaelfeil/infinity) is one example. You may run it locally with Docker:
-`docker run -it --gpus all -v infinity-data:/app/.cache -p 7997:7997 michaelf34/infinity:latest v2 --model-id nomic-ai/nomic-embed-text-v1 --port 7997`
-You can then hook this up to your Chat UI instance with the following configuration.
-```ini
-TEXT_EMBEDDING_MODELS=`[
-  {
-    "name": "nomic-ai/nomic-embed-text-v1",
-    "displayName": "nomic-ai/nomic-embed-text-v1",
-    "chunkCharLength": 512,
-    "model": {
-      "name": "nomic-ai/nomic-embed-text-v1"
-    },
-    "endpoints": [
-      {
-        "type": "openai",
-        "url": "https://127.0.0.1:7997/embeddings"
-      }
-    ]
-  }
-]`
-```

docs/source/configuration/metrics.md DELETED

@@ -1,9 +0,0 @@
-# Metrics
-The server can expose Prometheus metrics on port `5565`, but this is off by default. You can enable the metrics server with `METRICS_ENABLED=true` and change the port with `METRICS_PORT`, for example `METRICS_PORT=1234`.
-<Tip>
-In development with `npm run dev`, the metrics server does not shut down gracefully because SvelteKit does not provide hooks for restart. It's recommended to disable the metrics server in this case.
-</Tip>

docs/source/configuration/models/multimodal.md DELETED

@@ -1,24 +0,0 @@
-# Multimodal
-We currently support [IDEFICS](https://huggingface.co/blog/idefics) (hosted on [TGI](./providers/tgi)), OpenAI and Anthropic Claude 3 as multimodal models. You can enable it by setting `multimodal: true` in your `MODELS` configuration. For IDEFICS, you must have a [PRO HF Api token](https://huggingface.co/settings/tokens). For OpenAI, see the [OpenAI section](./providers/openai). For Anthropic, see the [Anthropic section](./providers/anthropic).
-```ini
-MODELS=`[
-  {
-    "name": "HuggingFaceM4/idefics-80b-instruct",
-    "multimodal" : true,
-    "description": "IDEFICS is the new multimodal model by Hugging Face.",
-    "preprompt": "",
-    "chatPromptTemplate" : "{{#each messages}}{{#ifUser}}User: {{content}}{{/ifUser}}<end_of_utterance>\nAssistant: {{#ifAssistant}}{{content}}\n{{/ifAssistant}}{{/each}}",
-    "parameters": {
-      "temperature": 0.1,
-      "top_p": 0.95,
-      "repetition_penalty": 1.2,
-      "top_k": 12,
-      "truncate": 1000,
-      "max_new_tokens": 1024,
-      "stop": ["<end_of_utterance>", "User:", "\nUser:"]
-    }
-  }
-]`
-```

docs/source/configuration/models/overview.md DELETED

@@ -1,147 +0,0 @@
-# Models Overview
-You can customize the parameters passed to the model or even use a new model by updating the `MODELS` variable in your `.env.local`. The default one can be found in `.env` and looks like this:
-```ini
-MODELS=`[
-  {
-    "name": "mistralai/Mistral-7B-Instruct-v0.2",
-    "displayName": "mistralai/Mistral-7B-Instruct-v0.2",
-    "description": "Mistral 7B is a new Apache 2.0 model, released by Mistral AI that outperforms Llama2 13B in benchmarks.",
-    "websiteUrl": "https://mistral.ai/news/announcing-mistral-7b/",
-    "preprompt": "",
-    "chatPromptTemplate" : "<s>{{#each messages}}{{#ifUser}}[INST] {{#if @first}}{{#if @root.preprompt}}{{@root.preprompt}}\n{{/if}}{{/if}}{{content}} [/INST]{{/ifUser}}{{#ifAssistant}}{{content}}</s>{{/ifAssistant}}{{/each}}",
-    "parameters": {
-      "temperature": 0.3,
-      "top_p": 0.95,
-      "repetition_penalty": 1.2,
-      "top_k": 50,
-      "truncate": 3072,
-      "max_new_tokens": 1024,
-      "stop": ["</s>"]
-    },
-    "promptExamples": [
-      {
-        "title": "Write an email",
-        "prompt": "As a restaurant owner, write a professional email to the supplier to get these products every week: \n\n- Wine (x10)\n- Eggs (x24)\n- Bread (x12)"
-      }, {
-        "title": "Code a game",
-        "prompt": "Code a basic snake game in python, give explanations for each step."
-      }, {
-        "title": "Recipe help",
-        "prompt": "How do I make a delicious lemon cheesecake?"
-      }
-    ]
-  }
-]`
-```
-You can change things like the parameters, or customize the preprompt to better suit your needs. You can also add more models by adding more objects to the array, with different preprompts for example.
-## Chat Prompt Template
-When querying the model for a chat response, the `chatPromptTemplate` template is used. `messages` is an array of chat messages in the format `[{ content: string }, ...]`. To identify whether a message is a user message or an assistant message, the `ifUser` and `ifAssistant` block helpers can be used.
-The following is the default `chatPromptTemplate`, although newlines and indentation have been added for readability. You can find the prompts used in production for HuggingChat [here](https://github.com/huggingface/chat-ui/blob/main/PROMPTS.md). The templating language used is [Handlebars](https://www.npmjs.com/package/handlebars).
-```handlebars
-{{preprompt}}
-{{#each messages}}
-	{{#ifUser}}{{@root.userMessageToken}}{{content}}{{@root.userMessageEndToken}}{{/ifUser}}
-	{{#ifAssistant
-	}}{{@root.assistantMessageToken}}{{content}}{{@root.assistantMessageEndToken}}{{/ifAssistant}}
-{{/each}}
-{{assistantMessageToken}}
-```
-## Custom endpoint authorization
-### Basic and Bearer
-Custom endpoints may require authorization, depending on how you configure them. Authentication will usually be set either with `Basic` or `Bearer`.
-For `Basic` we will need to generate a base64 encoding of the username and password.
-`echo -n "USER:PASS" | base64`
-> VVNFUjpQQVNT
-For `Bearer` you can use a token, which can be grabbed from [here](https://huggingface.co/settings/tokens).
-You can then add the generated information and the `authorization` parameter to your `.env.local`.
-```ini
-"endpoints": [
-  {
-    "url": "https://HOST:PORT",
-    "authorization": "Basic VVNFUjpQQVNT",
-  }
-]
-```
-Please note that if `HF_TOKEN` is set and not empty, it will take precedence.
-## Models hosted on multiple custom endpoints
-If the model will be available on multiple servers or instances, add the `weight` parameter to each endpoint in your `.env.local`. The `weight` determines the probability of requesting a particular endpoint.
-```ini
-"endpoints": [
-  {
-    "url": "https://HOST:PORT",
-    "weight": 1
-  },
-  {
-    "url": "https://HOST:PORT",
-    "weight": 2
-  }
-  ...
-]
-```
-## Client Certificate Authentication (mTLS)
-Custom endpoints may require client certificate authentication, depending on how you configure them. To enable mTLS between Chat UI and your custom endpoint, set `USE_CLIENT_CERTIFICATE` to `true` and add the `CERT_PATH` and `KEY_PATH` parameters to your `.env.local`. These parameters should point to the certificate and key files on your local machine, both in PEM format. The key file can be encrypted with a passphrase, in which case you will also need to add the `CLIENT_KEY_PASSWORD` parameter to your `.env.local`.
-If you're using a certificate signed by a private CA, you will also need to add the `CA_PATH` parameter to your `.env.local`. This parameter should point to the location of the CA certificate file on your local machine.
-If you're using a self-signed certificate, e.g. for testing or development purposes, you can set the `REJECT_UNAUTHORIZED` parameter to `false` in your `.env.local`. This will disable certificate validation, and allow Chat UI to connect to your custom endpoint.
-## Specific Embedding Model
-A model can use any of the embedding models defined under `TEXT_EMBEDDING_MODELS` (currently used for web search). By default it uses the first embedding model, but this can be changed with the `embeddingModel` field:
-```ini
-TEXT_EMBEDDING_MODELS = `[
-  {
-    "name": "Xenova/gte-small",
-    "chunkCharLength": 512,
-    "endpoints": [
-      {"type": "transformersjs"}
-    ]
-  },
-  {
-    "name": "intfloat/e5-base-v2",
-    "chunkCharLength": 768,
-    "endpoints": [
-      {"type": "tei", "url": "http://127.0.0.1:8080/", "authorization": "Basic VVNFUjpQQVNT"},
-      {"type": "tei", "url": "http://127.0.0.1:8081/"}
-    ]
-  }
-]`
-MODELS=`[
-  {
-      "name": "Ollama Mistral",
-      "chatPromptTemplate": "...",
-      "embeddingModel": "intfloat/e5-base-v2",
-      "parameters": {
-        ...
-      },
-      "endpoints": [
-        ...
-      ]
-  }
-]`
-```
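The `authorization` and `weight` fields documented in the file above can be illustrated with a small Python sketch. It is not Chat UI's own code: the function names, the two placeholder hosts, and the rule that a missing `weight` counts as 1 are assumptions made for the example. It builds the `Basic` header value the same way as the `echo -n "USER:PASS" | base64` command and picks an endpoint with probability proportional to its weight.

```python
import base64
import random


def basic_authorization(user: str, password: str) -> str:
    """Build the value of the `authorization` field for HTTP Basic auth."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def pick_endpoint(endpoints: list[dict], rng: random.Random) -> dict:
    """Pick one endpoint with probability proportional to its `weight`.

    Assumption: an endpoint without a `weight` is treated as weight 1.
    """
    weights = [endpoint.get("weight", 1) for endpoint in endpoints]
    return rng.choices(endpoints, weights=weights, k=1)[0]


# Hypothetical endpoints mirroring the documented config shape.
endpoints = [
    {
        "url": "https://HOST_A:PORT",
        "authorization": basic_authorization("USER", "PASS"),
        "weight": 1,
    },
    {"url": "https://HOST_B:PORT", "weight": 2},
]

chosen = pick_endpoint(endpoints, random.Random())
print(chosen["url"])
```

With weights 1 and 2, the second endpoint should receive roughly two thirds of the requests over many calls.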

docs/source/configuration/models/providers/anthropic.md DELETED

@@ -1,117 +0,0 @@
-# Anthropic
-| Feature                     | Available |
-| --------------------------- | --------- |
-| [Tools](../tools)           | No        |
-| [Multimodal](../multimodal) | Yes       |
-We also support Anthropic models (including multimodal ones via `multimodal: true`) through the official SDK. You may provide your API key via the `ANTHROPIC_API_KEY` environment variable or, alternatively, through `endpoints.apiKey`, as in the following example.
-```ini
-MODELS=`[
-  {
-      "name": "claude-3-haiku-20240307",
-      "displayName": "Claude 3 Haiku",
-      "description": "Fastest and most compact model for near-instant responsiveness",
-      "multimodal": true,
-      "parameters": {
-        "max_new_tokens": 4096,
-      },
-      "endpoints": [
-        {
-          "type": "anthropic",
-          // optionals
-          "apiKey": "sk-ant-...",
-          "baseURL": "https://api.anthropic.com",
-          "defaultHeaders": {},
-          "defaultQuery": {}
-        }
-      ]
-  },
-  {
-      "name": "claude-3-sonnet-20240229",
-      "displayName": "Claude 3 Sonnet",
-      "description": "Ideal balance of intelligence and speed",
-      "multimodal": true,
-      "parameters": {
-        "max_new_tokens": 4096,
-      },
-      "endpoints": [
-        {
-          "type": "anthropic",
-          // optionals
-          "apiKey": "sk-ant-...",
-          "baseURL": "https://api.anthropic.com",
-          "defaultHeaders": {},
-          "defaultQuery": {}
-        }
-      ]
-  },
-  {
-      "name": "claude-3-opus-20240229",
-      "displayName": "Claude 3 Opus",
-      "description": "Most powerful model for highly complex tasks",
-      "multimodal": true,
-      "parameters": {
-         "max_new_tokens": 4096
-      },
-      "endpoints": [
-        {
-          "type": "anthropic",
-          // optionals
-          "apiKey": "sk-ant-...",
-          "baseURL": "https://api.anthropic.com",
-          "defaultHeaders": {},
-          "defaultQuery": {}
-        }
-      ]
-  }
-]`
-```
-## VertexAI
-We also support using Anthropic models running on Vertex AI. Authentication is done using Google Application Default Credentials. The project ID can be provided through `endpoints.projectId`, as in the following example:
-```ini
-MODELS=`[
-  {
-      "name": "claude-3-haiku@20240307",
-      "displayName": "Claude 3 Haiku",
-      "description": "Fastest, most compact model for near-instant responsiveness",
-      "multimodal": true,
-      "parameters": {
-         "max_new_tokens": 4096
-      },
-      "endpoints": [
-        {
-          "type": "anthropic-vertex",
-          "region": "us-central1",
-          "projectId": "gcp-project-id",
-          // optionals
-          "defaultHeaders": {},
-          "defaultQuery": {}
-        }
-      ]
-  },
-  {
-      "name": "claude-3-sonnet@20240229",
-      "displayName": "Claude 3 Sonnet",
-      "description": "Ideal balance of intelligence and speed",
-      "multimodal": true,
-      "parameters": {
-        "max_new_tokens": 4096,
-      },
-      "endpoints": [
-        {
-          "type": "anthropic-vertex",
-          "region": "us-central1",
-          "projectId": "gcp-project-id",
-          // optionals
-          "defaultHeaders": {},
-          "defaultQuery": {}
-        }
-      ]
-  },
-]`
-```

docs/source/configuration/models/providers/aws.md DELETED

@@ -1,35 +0,0 @@
-# Amazon Web Services (AWS)
-| Feature                     | Available |
-| --------------------------- | --------- |
-| [Tools](../tools)           | No        |
-| [Multimodal](../multimodal) | No        |
-You may specify your Amazon SageMaker instance as an endpoint for Chat UI:
-```ini
-MODELS=`[{
-  "name": "your-model",
-  "displayName": "Your Model",
-  "description": "Your description",
-  "parameters": {
-     "max_new_tokens": 4096
-  },
-  "endpoints": [
-    {
-      "type" : "aws",
-      "service" : "sagemaker",
-      "url": "",
-      "accessKey": "",
-      "secretKey" : "",
-      "sessionToken": "",
-      "region": "",
-      "weight": 1
-    }
-  ]
-}]`
-```
-You can also set `"service": "lambda"` to use a lambda instance.
-You can get the `accessKey` and `secretKey` from your AWS user, under programmatic access.

docs/source/configuration/models/providers/cloudflare.md DELETED

@@ -1,35 +0,0 @@
-# Cloudflare
-| Feature                        | Available |
-| ------------------------------ | --------- |
-| [Tools](../tools.md)           | No        |
-| [Multimodal](../multimodal.md) | No        |
-You may use Cloudflare Workers AI to run your own models with serverless inference.
-You will need to have a Cloudflare account, then get your [account ID](https://developers.cloudflare.com/fundamentals/setup/find-account-and-zone-ids/) as well as your [API token](https://developers.cloudflare.com/workers-ai/get-started/rest-api/#1-get-an-api-token) for Workers AI.
-You can either specify them directly in your `.env.local` using the `CLOUDFLARE_ACCOUNT_ID` and `CLOUDFLARE_API_TOKEN` variables, or you can set them directly in the endpoint config.
-You can find the list of models available on Cloudflare [here](https://developers.cloudflare.com/workers-ai/models/#text-generation).
-```ini
-MODELS=`[
-  {
-    "name" : "nousresearch/hermes-2-pro-mistral-7b",
-    "tokenizer": "nousresearch/hermes-2-pro-mistral-7b",
-    "parameters": {
-      "stop": ["<|im_end|>"]
-    },
-    "endpoints" : [
-      {
-        "type" : "cloudflare",
-        // optionally specify these
-        // "accountId": "your-account-id",
-        // "authToken": "your-api-token"
-      }
-    ]
-  }
-]`
-```

docs/source/configuration/models/providers/cohere.md DELETED

@@ -1,26 +0,0 @@
-# Cohere
-| Feature                     | Available |
-| --------------------------- | --------- |
-| [Tools](../tools)           | Yes       |
-| [Multimodal](../multimodal) | No        |
-You may use Cohere to run their models directly from Chat UI. You will need to have a Cohere account, then get your [API token](https://dashboard.cohere.com/api-keys). You can either specify it directly in your `.env.local` using the `COHERE_API_TOKEN` variable, or you can set it in the endpoint config.
-Here is an example of a Cohere model config. You can set which model you want to use by setting the `id` field to the model name.
-```ini
-MODELS=`[
-  {
-    "name": "command-r-plus",
-    "displayName": "Command R+",
-    "tools": true,
-    "endpoints": [{
-      "type": "cohere",
-      // optionally specify this, or use COHERE_API_TOKEN
-      // "apiKey": "your-api-token"
-    }]
-  }
-]`
-```

docs/source/configuration/models/providers/google.md DELETED

@@ -1,92 +0,0 @@
-# Google
-| Feature                     | Available |
-| --------------------------- | --------- |
-| [Tools](../tools)           | No        |
-| [Multimodal](../multimodal) | No        |
-Chat UI can connect to the Google Vertex AI API endpoints ([List of supported models](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models)).
-To enable:
-1. [Select](https://console.cloud.google.com/project) or [create](https://cloud.google.com/resource-manager/docs/creating-managing-projects#creating_a_project) a Google Cloud project.
-1. [Enable billing for your project](https://cloud.google.com/billing/docs/how-to/modify-project).
-1. [Enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).
-1. [Set up authentication with a service account](https://cloud.google.com/docs/authentication/getting-started)
-   so you can access the API from your local workstation.
-The path to the service account credentials file can be set as an environment variable:
-```ini
-GOOGLE_APPLICATION_CREDENTIALS = clientid.json
-```
-Make sure your Docker container has access to the file and that the variable is set correctly.
-Afterwards, Google Vertex endpoints can be configured as follows:
-```ini
-MODELS=`[
-  {
-    "name": "gemini-1.5-pro",
-    "displayName": "Vertex Gemini Pro 1.5",
-    "endpoints" : [{
-      "type": "vertex",
-      "project": "abc-xyz",
-      "location": "europe-west3",
-      "extraBody": {
-         "model_version": "gemini-1.5-pro-002",
-      },
-      // Optional
-      "safetyThreshold": "BLOCK_MEDIUM_AND_ABOVE",
-      "apiEndpoint": "", // alternative api endpoint url,
-      "tools": [{
-        "googleSearchRetrieval": {
-          "disableAttribution": true
-        }
-      }]
-    }]
-  }
-]`
-```
-## GenAI
-Or use the Gemini API provider ([generative-ai-js](https://github.com/google-gemini/generative-ai-js#readme)).
-Make sure that you have an API key from Google Cloud Platform. To get an API key, follow the instructions [here](https://ai.google.dev/gemini-api/docs/api-key).
-You can either specify it directly in your `.env.local` using the `GOOGLE_GENAI_API_KEY` variable, or you can set it directly in the endpoint config.
-You can find the list of models available [here](https://ai.google.dev/gemini-api/docs/models/gemini), and experimental models available [here](https://ai.google.dev/gemini-api/docs/models/experimental-models).
-```ini
-MODELS=`[
-  {
-    "name": "gemini-1.5-flash",
-    "displayName": "Gemini Flash 1.5",
-    "multimodal": true,
-    "endpoints": [
-      {
-        "type": "genai",
-        // Optional
-        "apiKey": "abc...xyz",
-        "safetyThreshold": "BLOCK_MEDIUM_AND_ABOVE",
-      }
-    ]
-  },
-  {
-    "name": "gemini-1.5-pro",
-    "displayName": "Gemini Pro 1.5",
-    "multimodal": false,
-    "endpoints": [
-      {
-        "type": "genai",
-        // Optional
-        "apiKey": "abc...xyz"
-      }
-    ]
-  }
-]`
-```

docs/source/configuration/models/providers/langserve.md DELETED

@@ -1,22 +0,0 @@
-# LangServe
-| Feature                     | Available |
-| --------------------------- | --------- |
-| [Tools](../tools)           | No        |
-| [Multimodal](../multimodal) | No        |
-LangChain applications that are deployed using LangServe can be called with the following config:
-```ini
-MODELS=`[
-  {
-    "name": "summarization-chain",
-    "displayName": "Summarization Chain",
-    "endpoints" : [{
-      "type": "langserve",
-      "url" : "http://127.0.0.1:8100",
-    }]
-  }
-]`
-```

docs/source/configuration/models/providers/llamacpp.md DELETED

@@ -1,49 +0,0 @@
-# Llama.cpp
-| Feature                     | Available |
-| --------------------------- | --------- |
-| [Tools](../tools)           | No        |
-| [Multimodal](../multimodal) | No        |
-Chat UI supports the llama.cpp API server directly without the need for an adapter. You can do this using the `llamacpp` endpoint type.
-If you want to run Chat UI with llama.cpp, you can do the following, using [microsoft/Phi-3-mini-4k-instruct-gguf](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-gguf) as an example model:
-```bash
-# install llama.cpp
-brew install llama.cpp
-# start llama.cpp server
-llama-server --hf-repo microsoft/Phi-3-mini-4k-instruct-gguf --hf-file Phi-3-mini-4k-instruct-q4.gguf -c 4096
-```
-_note: you can swap the `hf-repo` and `hf-file` with your fav GGUF on the [Hub](https://huggingface.co/models?library=gguf). For example: `--hf-repo TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF` for [this repo](https://huggingface.co/TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF) & `--hf-file tinyllama-1.1b-chat-v1.0.Q4_0.gguf` for [this file](https://huggingface.co/TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF/blob/main/tinyllama-1.1b-chat-v1.0.Q4_0.gguf)._
-A local llama.cpp HTTP server will start on `http://localhost:8080`. To change the port or any other default option, see the [llama.cpp HTTP server README](https://github.com/ggml-org/llama.cpp/tree/master/tools/server#readme).
-Add the following to your `.env.local`:
-```ini
-MODELS=`[
-  {
-    "name": "Local microsoft/Phi-3-mini-4k-instruct-gguf",
-    "tokenizer": "microsoft/Phi-3-mini-4k-instruct-gguf",
-    "preprompt": "",
-    "chatPromptTemplate": "<s>{{preprompt}}{{#each messages}}{{#ifUser}}<|user|>\n{{content}}<|end|>\n<|assistant|>\n{{/ifUser}}{{#ifAssistant}}{{content}}<|end|>\n{{/ifAssistant}}{{/each}}",
-    "parameters": {
-      "stop": ["<|end|>", "<|endoftext|>", "<|assistant|>"],
-      "temperature": 0.7,
-      "max_new_tokens": 1024,
-      "truncate": 3071
-    },
-    "endpoints": [{
-      "type" : "llamacpp",
-      "baseURL": "http://localhost:8080"
-    }],
-  },
-]`
-```
-<div class="flex justify-center">
-<img class="block dark:hidden" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/chat-ui/llamacpp-light.png" height="auto"/>
-<img class="hidden dark:block" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/chat-ui/llamacpp-dark.png" height="auto"/>
-</div>

docs/source/configuration/models/providers/ollama.md DELETED

@@ -1,39 +0,0 @@
-# Ollama
-| Feature                     | Available |
-| --------------------------- | --------- |
-| [Tools](../tools)           | No        |
-| [Multimodal](../multimodal) | No        |
-We also support the Ollama inference server. Spin up a model with
-```bash
-ollama run mistral
-```
-Then specify the endpoints like so:
-```ini
-MODELS=`[
-  {
-    "name": "Ollama Mistral",
-    "chatPromptTemplate": "<s>{{#each messages}}{{#ifUser}}[INST] {{#if @first}}{{#if @root.preprompt}}{{@root.preprompt}}\n{{/if}}{{/if}} {{content}} [/INST]{{/ifUser}}{{#ifAssistant}}{{content}}</s> {{/ifAssistant}}{{/each}}",
-    "parameters": {
-      "temperature": 0.1,
-      "top_p": 0.95,
-      "repetition_penalty": 1.2,
-      "top_k": 50,
-      "truncate": 3072,
-      "max_new_tokens": 1024,
-      "stop": ["</s>"]
-    },
-    "endpoints": [
-      {
-        "type": "ollama",
-        "url" : "http://127.0.0.1:11434",
-        "ollamaName" : "mistral"
-      }
-    ]
-  }
-]`
-```
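To make the `chatPromptTemplate` in the Ollama example above easier to follow, here is a plain-Python sketch of the string that the Handlebars template would render. It is only an illustration: the `from` field used to tell user and assistant messages apart is an assumption (the docs only specify `content`), the function name is hypothetical, and no escaping is applied to the message text.

```python
def render_mistral_prompt(messages: list[dict], preprompt: str = "") -> str:
    """Mirror the Mistral-style template from the Ollama example.

    Assumption: each message is {"from": "user" | "assistant", "content": str}.
    """
    parts = ["<s>"]
    for index, message in enumerate(messages):
        if message["from"] == "user":
            parts.append("[INST] ")
            # {{#if @first}}{{#if @root.preprompt}}...: only on the first message.
            if index == 0 and preprompt:
                parts.append(preprompt + "\n")
            parts.append(" " + message["content"] + " [/INST]")
        elif message["from"] == "assistant":
            parts.append(message["content"] + "</s> ")
    return "".join(parts)


conversation = [
    {"from": "user", "content": "Hi"},
    {"from": "assistant", "content": "Hello"},
    {"from": "user", "content": "What is TGI?"},
]
print(render_mistral_prompt(conversation, preprompt="Be brief"))
```

Writing the template out this way makes the literal spaces and the placement of `</s>` explicit, which is where hand-written templates most often drift from the format a model was trained on.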

docs/source/configuration/models/providers/openai.md DELETED

@@ -1,181 +0,0 @@
-# OpenAI
-| Feature                     | Available |
-| --------------------------- | --------- |
-| [Tools](../tools)           | No        |
-| [Multimodal](../multimodal) | Yes       |
-Chat UI can be used with any API server that supports OpenAI API compatibility, for example [text-generation-webui](https://github.com/oobabooga/text-generation-webui/tree/main/extensions/openai), [LocalAI](https://github.com/go-skynet/LocalAI), [FastChat](https://github.com/lm-sys/FastChat/blob/main/docs/openai_api.md), [llama-cpp-python](https://github.com/abetlen/llama-cpp-python), [ialacol](https://github.com/chenhunghan/ialacol), and [vllm](https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html).
-The following example config makes Chat UI work with [text-generation-webui](https://github.com/oobabooga/text-generation-webui/tree/main/extensions/openai). The `endpoint.baseURL` is the URL of the OpenAI API compatible server; it overrides the base URL used by the OpenAI client. The `endpoint.completion` option determines which endpoint is used: the default is `chat_completions`, which uses `/chat/completions`; set `endpoint.completion` to `completions` to use the `/completions` endpoint.
-```ini
-MODELS=`[
-  {
-    "name": "text-generation-webui",
-    "id": "text-generation-webui",
-    "parameters": {
-      "temperature": 0.9,
-      "top_p": 0.95,
-      "repetition_penalty": 1.2,
-      "top_k": 50,
-      "truncate": 1000,
-      "max_new_tokens": 1024,
-      "stop": []
-    },
-    "endpoints": [{
-      "type" : "openai",
-      "baseURL": "http://localhost:8000/v1"
-    }]
-  }
-]`
-```
-The `openai` type includes official OpenAI models. You can add, for example, GPT4/GPT3.5 as an `openai` model:
-```ini
-OPENAI_API_KEY=#your openai api key here
-MODELS=`[{
-  "name": "gpt-4",
-  "displayName": "GPT 4",
-  "endpoints" : [{
-    "type": "openai",
-    "apiKey": "or your openai api key here"
-  }]
-},{
-  "name": "gpt-3.5-turbo",
-  "displayName": "GPT 3.5 Turbo",
-  "endpoints" : [{
-    "type": "openai",
-    "apiKey": "or your openai api key here"
-  }]
-}]`
-```
-We also support models in the `o1` family. You need to add a few more options to the config. Here is an example for `o1-mini`:
-```ini
-MODELS=`[
-  {
-      "name": "o1-mini",
-      "description": "ChatGPT o1-mini",
-      "systemRoleSupported": false,
-      "parameters": {
-        "max_new_tokens": 2048,
-      },
-      "endpoints" : [{
-        "type": "openai",
-        "useCompletionTokens": true,
-      }]
-  }
-]`
-```
-You may also use any model provider that exposes an OpenAI-compatible API endpoint. For example, you may self-host the [Portkey](https://github.com/Portkey-AI/gateway) gateway and experiment with Claude or with GPT models offered by Azure OpenAI. Example for Claude from Anthropic:
-```ini
-MODELS=`[{
-  "name": "claude-2.1",
-  "displayName": "Claude 2.1",
-  "description": "Anthropic has been founded by former OpenAI researchers...",
-  "parameters": {
-    "temperature": 0.5,
-    "max_new_tokens": 4096,
-  },
-  "endpoints": [
-    {
-      "type": "openai",
-      "baseURL": "https://gateway.example.com/v1",
-      "defaultHeaders": {
-        "x-portkey-config": '{"provider":"anthropic","api_key":"sk-ant-abc...xyz"}'
-      }
-    }
-  ]
-}]`
-```
-Example for GPT 4 deployed on Azure OpenAI:
-```ini
-MODELS=`[{
-  "id": "gpt-4-1106-preview",
-  "name": "gpt-4-1106-preview",
-  "displayName": "gpt-4-1106-preview",
-  "parameters": {
-    "temperature": 0.5,
-    "max_new_tokens": 4096,
-  },
-  "endpoints": [
-    {
-      "type": "openai",
-      "baseURL": "https://{resource-name}.openai.azure.com/openai/deployments/{deployment-id}",
-      "defaultHeaders": {
-        "api-key": "{api-key}"
-      },
-      "defaultQuery": {
-        "api-version": "2023-05-15"
-      }
-    }
-  ]
-}]`
-```
-## DeepInfra
-Or try Mistral from [Deepinfra](https://deepinfra.com/mistralai/Mistral-7B-Instruct-v0.1/api?example=openai-http):
-> Note, apiKey can either be set custom per endpoint, or globally using `OPENAI_API_KEY` variable.
-```ini
-MODELS=`[{
-  "name": "mistral-7b",
-  "displayName": "Mistral 7B",
-  "description": "A 7B dense Transformer, fast-deployed and easily customisable. Small, yet powerful for a variety of use cases. Supports English and code, and a 8k context window.",
-  "parameters": {
-    "temperature": 0.5,
-    "max_new_tokens": 4096,
-  },
-  "endpoints": [
-    {
-      "type": "openai",
-      "baseURL": "https://api.deepinfra.com/v1/openai",
-      "apiKey": "abc...xyz"
-    }
-  ]
-}]`
-```
-_Non-streaming endpoints_
-For endpoints that don't support streaming, like o1 on Azure, you can pass `streamingSupported: false` in your endpoint config:
-```ini
-MODELS=`[{
-  "id": "o1-preview",
-  "name": "o1-preview",
-  "displayName": "o1-preview",
-  "systemRoleSupported": false,
-  "endpoints": [
-    {
-      "type": "openai",
-      "baseURL": "https://my-deployment.openai.azure.com/openai/deployments/o1-preview",
-      "defaultHeaders": {
-        "api-key": "$SECRET"
-      },
-      "streamingSupported": false,
-    }
-  ]
-}]`
-```
-## Other
-Some other providers and their `baseURL` for reference.
-[Groq](https://groq.com/): https://api.groq.com/openai/v1
-[Fireworks](https://fireworks.ai/): https://api.fireworks.ai/inference/v1
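As a rough illustration of what `baseURL` and the `completion` option in the file above mean on the wire, the following Python sketch builds (but does not send) a request to an OpenAI-compatible server. The helper names are hypothetical and this is not how Chat UI itself issues requests; it only shows that `chat_completions` maps to `/chat/completions` and `completions` to `/completions`, relative to the configured base URL.

```python
import json
import urllib.request
from typing import Optional

# Mapping described in the docs: option value -> route under baseURL.
COMPLETION_PATHS = {
    "chat_completions": "/chat/completions",
    "completions": "/completions",
}


def endpoint_url(base_url: str, completion: str = "chat_completions") -> str:
    """Join the configured baseURL with the route for the chosen mode."""
    return base_url.rstrip("/") + COMPLETION_PATHS[completion]


def build_chat_request(
    base_url: str,
    model: str,
    messages: list[dict],
    api_key: Optional[str] = None,
) -> urllib.request.Request:
    """Prepare a POST to the chat completions route without sending it."""
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        endpoint_url(base_url), data=body, headers=headers, method="POST"
    )


request = build_chat_request(
    "http://localhost:8000/v1",
    "text-generation-webui",
    [{"role": "user", "content": "Hello"}],
)
print(request.get_method(), request.full_url)
```

Passing the prepared request to `urllib.request.urlopen` would send it, provided a compatible server is listening at that address.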

docs/source/configuration/models/providers/tgi.md DELETED

@@ -1,66 +0,0 @@
-# Text Generation Inference (TGI)
-| Feature                     | Available |
-| --------------------------- | --------- |
-| [Tools](../tools)           | Yes\*     |
-| [Multimodal](../multimodal) | Yes\*     |
-\* Tools are only supported with the Cohere Command R+ model with the Xenova tokenizers. Please see the [Tools](../tools) section.
-\* Multimodal is only supported with the IDEFICS model. Please see the [Multimodal](../multimodal) section.
-By default, if `endpoints` are left unspecified, Chat UI will look for the model on the hosted Hugging Face inference API using the model name, and use your `HF_TOKEN`. Refer to the [overview](../overview) for more information about model configuration.
-```ini
-MODELS=`[
-  {
-    "name": "mistralai/Mistral-7B-Instruct-v0.2",
-    "displayName": "mistralai/Mistral-7B-Instruct-v0.2",
-    "description": "Mistral 7B is a new Apache 2.0 model, released by Mistral AI that outperforms Llama2 13B in benchmarks.",
-    "websiteUrl": "https://mistral.ai/news/announcing-mistral-7b/",
-    "preprompt": "",
-    "chatPromptTemplate" : "<s>{{#each messages}}{{#ifUser}}[INST] {{#if @first}}{{#if @root.preprompt}}{{@root.preprompt}}\n{{/if}}{{/if}}{{content}} [/INST]{{/ifUser}}{{#ifAssistant}}{{content}}</s>{{/ifAssistant}}{{/each}}",
-    "parameters": {
-      "temperature": 0.3,
-      "top_p": 0.95,
-      "repetition_penalty": 1.2,
-      "top_k": 50,
-      "truncate": 3072,
-      "max_new_tokens": 1024,
-      "stop": ["</s>"]
-    },
-    "promptExamples": [
-      {
-        "title": "Write an email",
-        "prompt": "As a restaurant owner, write a professional email to the supplier to get these products every week: \n\n- Wine (x10)\n- Eggs (x24)\n- Bread (x12)"
-      }, {
-        "title": "Code a game",
-        "prompt": "Code a basic snake game in python, give explanations for each step."
-      }, {
-        "title": "Recipe help",
-        "prompt": "How do I make a delicious lemon cheesecake?"
-      }
-    ]
-  }
-]`
-```
-## Running your own models using a custom endpoint
-If you want to, instead of hitting models on the Hugging Face Inference API, you can run your own models locally.
-A good option is to hit a [text-generation-inference](https://github.com/huggingface/text-generation-inference) endpoint. This is what is done in the official [Chat UI Spaces Docker template](https://huggingface.co/new-space?template=huggingchat/chat-ui-template) for instance: both this app and a text-generation-inference server run inside the same container.
-To do this, you can add your own endpoints to the `MODELS` variable in `.env.local`, by adding an `"endpoints"` key for each model in `MODELS`.
-```ini
-MODELS=`[{
-  "name": "your-model-name",
-  "displayName": "Your Model Name",
-  ... other model config
-  "endpoints": [{
-    "type" : "tgi",
-    "url": "https://HOST:PORT",
-  }]
-}]`
-```

docs/source/configuration/models/tools.md DELETED

@@ -1,62 +0,0 @@
-# Tools
-Tool calling instructs the model to generate an output matching a user-defined schema, which may be parsed for invoking external tools. The model simply chooses the tools and their parameters. Currently, only `TGI` and `Cohere` with `Command R+` are supported.
-<div class="flex justify-center">
-<img class="block dark:hidden" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/chat-ui/tools-light.png" height="auto"/>
-<img class="hidden dark:block" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/chat-ui/tools-dark.png" height="auto"/>
-</div>
-## TGI Configuration
-A custom tokenizer is required for prompting the model for generating tool calls, as well as prompting with the results. The expected format for these tools and the resulting tool calls are hard coded for TGI, so it's likely that only the following configuration will work:
-```ini
-MODELS=`[
-  {
-    "name" : "CohereForAI/c4ai-command-r-plus",
-    "displayName": "Command R+",
-    "description": "Command R+ is Cohere's latest LLM and is the first open weight model to beat GPT4 in the Chatbot Arena!",
-    "tools": true,
-    "tokenizer": "Xenova/c4ai-command-r-v01-tokenizer",
-    "modelUrl": "https://huggingface.co/CohereForAI/c4ai-command-r-plus",
-    "websiteUrl": "https://docs.cohere.com/docs/command-r-plus",
-    "logoUrl": "https://huggingface.co/datasets/huggingchat/models-logo/resolve/main/cohere-logo.png",
-    "parameters": {
-      "stop": ["<|END_OF_TURN_TOKEN|>"],
-      "truncate" : 28672,
-      "max_new_tokens" : 4096,
-      "temperature" : 0.3
-    }
-  }
-]`
-```
-## Cohere Configuration
-The Cohere provider supports the endpoint's native method of tool calling. Refer to `endpoints/cohere` for implementation details.
-```ini
-MODELS=`[
-  {
-    "name": "command-r-plus",
-    "displayName": "Command R+",
-    "description": "Command R+ is Cohere's latest LLM and is the first open weight model to beat GPT4 in the Chatbot Arena!",
-    "tools": true,
-    "websiteUrl": "https://docs.cohere.com/docs/command-r-plus",
-    "logoUrl": "https://huggingface.co/datasets/huggingchat/models-logo/resolve/main/cohere-logo.png",
-    "endpoints": [{
-      "type": "cohere",
-      "apiKey": "YOUR_API_KEY"
-    }]
-  }
-]`
-```
-## Adding Tools
-Tool implementations are placed in `src/lib/server/tools`, with helpers available for easy integration with HuggingFace Zero GPU spaces. In the future, there may be an OpenAPI interface for adding tools.
-## Adding Support for Additional Models
-The TGI implementation uses a custom tokenizer and hard coded schema for supporting tools. The Cohere implementation, on the other hand, uses the native support in the SDK to emit tool calls. This is the recommended way to add support for more models. Please see the `endpoints/cohere` section of the code for implementation details.
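The general idea described in the file above, where the model emits structured output that is parsed and used to invoke tools, can be sketched in a few lines of Python. Everything concrete here is hypothetical: the JSON shape (`name` plus `parameters`), the `add` tool, and the registry are assumptions for illustration and do not reflect the format hard coded for TGI or the one used by the Cohere SDK.

```python
import json
from typing import Any, Callable


def add(a: float, b: float) -> float:
    """A stand-in tool; real tools would call external services."""
    return a + b


# Hypothetical registry mapping tool names to Python callables.
TOOLS: dict[str, Callable[..., Any]] = {"add": add}


def run_tool_calls(model_output: str) -> list[Any]:
    """Parse the model's tool calls and invoke each requested tool.

    Assumption: the model returns a JSON list of
    {"name": <tool name>, "parameters": {<keyword arguments>}} objects.
    """
    results = []
    for call in json.loads(model_output):
        tool = TOOLS.get(call["name"])
        if tool is None:
            raise ValueError(f"unknown tool: {call['name']}")
        results.append(tool(**call["parameters"]))
    return results


print(run_tool_calls('[{"name": "add", "parameters": {"a": 2, "b": 3}}]'))
```

The model only chooses the tool and its parameters; execution and error handling stay in application code, which is why an unknown tool name is rejected explicitly.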

docs/source/configuration/open-id.md DELETED

@@ -1,16 +0,0 @@
-# OpenID
-The login feature is disabled by default, and users are assigned a unique ID based on their browser. If you want to use OpenID to authenticate your users, add the following to your `.env.local` file:
-```ini
-OPENID_CONFIG=`{
-  PROVIDER_URL: "<your OIDC issuer>",
-  CLIENT_ID: "<your OIDC client ID>",
-  CLIENT_SECRET: "<your OIDC client secret>",
-  SCOPES: "openid profile",
-  TOLERANCE: // optional
-  RESOURCE: // optional
-}`
-```
-Redirect URI: `/login/callback`

docs/source/configuration/overview.md DELETED

@@ -1,10 +0,0 @@
-# Configuration Overview
-Chat UI handles configuration with environment variables. The default config for Chat UI is stored in the `.env` file, which you may use as a reference. You will need to override some values to get Chat UI to run locally. This can be done in `.env.local` or via your environment. The bare minimum configuration to get Chat UI running is:
-```ini
-MONGODB_URL=mongodb://localhost:27017
-HF_TOKEN=your_token
-```
-The following sections detail the parts of the app you may want to configure.

docs/source/configuration/theming.md DELETED

@@ -1,18 +0,0 @@
-# Theming
-You can use a few environment variables to customize the look and feel of Chat UI. These are by default:
-```ini
-PUBLIC_APP_NAME=ChatUI
-PUBLIC_APP_ASSETS=chatui
-PUBLIC_APP_COLOR=blue
-PUBLIC_APP_DESCRIPTION="Making the community's best AI chat models available to everyone."
-PUBLIC_APP_DATA_SHARING=
-PUBLIC_APP_DISCLAIMER=
-```
-- `PUBLIC_APP_NAME` The name used as a title throughout the app.
-- `PUBLIC_APP_ASSETS` Is used to find logos & favicons in `static/$PUBLIC_APP_ASSETS`, current options are `chatui` and `huggingchat`.
-- `PUBLIC_APP_COLOR` Can be any of the [tailwind colors](https://tailwindcss.com/docs/customizing-colors#default-color-palette).
-- `PUBLIC_APP_DATA_SHARING` Can be set to 1 to add a toggle in the user settings that lets your users opt in to data sharing with model creators.
-- `PUBLIC_APP_DISCLAIMER` If set to 1, we show a disclaimer about generated outputs on login.

docs/source/configuration/web-search.md DELETED

@@ -1,58 +0,0 @@
-# Web Search
-Chat UI features a powerful Web Search feature. A high level overview of how it works:
-1. Generate an appropriate search query from the user prompt using the `TASK_MODEL`
-2. Perform a web search via an external provider (e.g. Serper) or by scraping Google results locally
-3. Load each search result into Playwright and scrape it
-4. Convert scraped HTML to Markdown tree with headings as parents
-5. Create embeddings for each Markdown element
-6. Find the embeddings closest to the user query using a vector similarity search (inner product)
-7. Get the corresponding Markdown elements and their parent, up to 8000 characters
-8. Supply the information as context to the model
-<div class="flex justify-center">
-<img class="block dark:hidden" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/chat-ui/websearch-light.png" height="auto"/>
-<img class="hidden dark:block" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/chat-ui/websearch-dark.png" height="auto"/>
-</div>
-## Providers
-Many providers are supported for the web search, or you can use locally scraped Google results.
-### Local
-For locally scraped Google results, put `USE_LOCAL_WEBSEARCH=true` in your `.env.local`. Please note that you may hit rate limits as we make no attempt to make the traffic look legitimate. To avoid this, you may choose a provider, such as Serper, used on the official instance.
-### SearXNG
-> SearXNG is a free internet metasearch engine which aggregates results from various search services and databases. Users are neither tracked nor profiled.
-You may enable support via the `SEARXNG_QUERY_URL` where `<query>` will be replaced with the query keywords. Please see [the official documentation](https://docs.searxng.org/dev/search_api.html) for more information
-Example: `https://searxng.yourdomain.com/search?q=<query>&engines=duckduckgo,google&format=json`
-### Third Party
-Many third party providers are supported as well. The official instance uses Serper.
-```ini
-YDC_API_KEY=docs.you.com api key here
-SERPER_API_KEY=serper.dev api key here
-SERPAPI_KEY=serpapi key here
-SERPSTACK_API_KEY=serpstack api key here
-SEARCHAPI_KEY=searchapi api key here
-```
-## Block/Allow List
-You may block or allow specific websites from the web search results. When using an allow list, only the links in the allowlist will be used. For supported search engines, the links will be blocked from the results directly. Any URL in the results that **partially or fully matches** the entry will be filtered out.
-```ini
-WEBSEARCH_BLOCKLIST=`["youtube.com", "https://example.com/foo/bar"]`
-WEBSEARCH_ALLOWLIST=`["stackoverflow.com"]`
-```
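The block/allow behaviour described above can be sketched like this. It is only an illustration under assumptions: `filterResults` is a hypothetical name, and "partially or fully matches" is modeled here as a plain substring test, which may differ from what the removed code did.

```typescript
// Hypothetical helper: keeps a URL only if it passes the allowlist (when one is
// given) and matches no blocklist entry. Matching is a substring test (assumption).
function filterResults(urls: string[], blocklist: string[], allowlist: string[]): string[] {
	return urls.filter((url) => {
		const matches = (entry: string) => url.includes(entry);
		// With a non-empty allowlist, only URLs matching an allowlist entry survive.
		if (allowlist.length > 0 && !allowlist.some(matches)) return false;
		// Any partial or full blocklist match removes the URL.
		return !blocklist.some(matches);
	});
}
```

With the example values above, `https://example.com/foo/bar/baz` would be dropped by the blocklist while `https://example.com/other` would be kept.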
-## Disabling Javascript
-By default, Playwright will execute all Javascript on the page. On some webpages this can be intensive, requiring up to 6 cores for full performance. You may block scripts from running by setting `WEBSEARCH_JAVASCRIPT=false`. However, this will not block Javascript inlined in the HTML.

docs/source/developing/architecture.md DELETED

@@ -1,35 +0,0 @@
-# Architecture
-This document discusses the high level overview of the Chat UI codebase. If you're looking to contribute or just want to understand how the codebase works, this is the place for you!
-## Overview
-Chat UI provides a simple interface connecting LLMs to external information and tools. The project uses [MongoDB](https://www.mongodb.com/) and [SvelteKit](https://kit.svelte.dev/) with [Tailwind](https://tailwindcss.com/).
-## Code Map
-This section discusses various modules of the codebase briefly. The headings are not paths since the codebase structure may change.
-### `routes`
-Provides all of the routes rendered with SSR via SvelteKit. The majority of backend and frontend logic can be found here, with some modules being pulled out into `lib` for the client and `lib/server` for the server.
-### `textGeneration`
-Provides a standard interface for most chat features such as model output, web search, assistants and tools. Outputs `MessageUpdate`s which provide fine-grained updates on the request status such as new tokens and web search results.
-### `endpoints`/`embeddingEndpoints`
-Provides a common streaming interface for many third party LLM and embedding providers.
-### `websearch`
-Implements web search querying and RAG. See the [Web Search](../configuration/web-search) section for more information.
-### `tools`
-Provides a common interface for external tools called by LLMs. See the [Tools](../configuration/models/tools.md) section for more information
-### `migrations`
-Includes all MongoDB migrations for maintaining backwards compatibility across schema changes. Any changes to the schema must include a migration

docs/source/developing/copy-huggingchat.md DELETED

@@ -1,71 +0,0 @@
-# Copy HuggingChat
-The config file for HuggingChat is stored in the `chart/env/prod.yaml` file. It is the source of truth for the environment variables used for our CI/CD pipeline. For HuggingChat, as we need to customize the app color, as well as the base path, we build a custom docker image. You can find the workflow here.
-<Tip>
-If you want to make changes to the model config used in production for HuggingChat, you should do so against `chart/env/prod.yaml`.
-</Tip>
-### Running a copy of HuggingChat locally
-If you want to run an exact copy of HuggingChat locally, you will need to do the following first:
-1. Create an [OAuth App on the hub](https://huggingface.co/settings/applications/new) with `openid profile email` permissions. Make sure to set the callback URL to something like `http://localhost:5173/chat/login/callback` which matches the right path for your local instance.
-2. Create a [HF Token](https://huggingface.co/settings/tokens) with your Hugging Face account. You will need a Pro account to be able to access some of the larger models available through HuggingChat.
-3. Create a free account with [serper.dev](https://serper.dev/) (you will get 2500 free search queries)
-4. Run an instance of MongoDB, however you want. (Local or remote)
-You can then create a new `.env.SECRET_CONFIG` file with the following content
-```ini
-MONGODB_URL=<link to your mongo DB from step 4>
-HF_TOKEN=<your HF token from step 2>
-OPENID_CONFIG=`{
-  PROVIDER_URL: "https://huggingface.co",
-  CLIENT_ID: "<your client ID from step 1>",
-  CLIENT_SECRET: "<your client secret from step 1>",
-}`
-SERPER_API_KEY=<your serper API key from step 3>
-MESSAGES_BEFORE_LOGIN=<can be any numerical value, or set to 0 to require login>
-```
-You can then run `npm run updateLocalEnv` in the root of chat-ui. This will create a `.env.local` file which combines the `chart/env/prod.yaml` and the `.env.SECRET_CONFIG` file. You can then run `npm run dev` to start your local instance of HuggingChat.
-### Populate database
-<Tip warning={true}>
-The `MONGODB_URL` used for this script will be fetched from `.env.local`. Make sure it's correct! The command runs directly on the database.
-</Tip>
-You can populate the database with fake data using the `populate` script:
-```bash
-npm run populate <flags here>
-```
-At least one flag must be specified, the following flags are available:
-- `reset` - resets the database
-- `all` - populates all tables
-- `users` - populates the users table
-- `settings` - populates the settings table for existing users
-- `assistants` - populates the assistants table for existing users
-- `conversations` - populates the conversations table for existing users
-For example, you could use it like so:
-```bash
-npm run populate reset
-```
-to clear out the database. Then log in to the app to create your user and run the following command:
-```bash
-npm run populate users settings assistants conversations
-```
-to populate the database with fake data, including fake conversations and assistants for your user.

docs/source/index.md DELETED

@@ -1,97 +0,0 @@
-# 🤗 Chat UI
-Open source chat interface with support for tools, web search, multimodal and many API providers. The app uses MongoDB and SvelteKit behind the scenes. Try the live version of the app called [HuggingChat on hf.co/chat](https://huggingface.co/chat) or [set up your own instance](./installation/spaces).
-🔧 **[Tools](./configuration/models/tools)**: Function calling with custom tools and support for [Zero GPU spaces](https://huggingface.co/spaces/enzostvs/zero-gpu-spaces)
-🔍 **[Web Search](./configuration/web-search)**: Automated web search, scraping and RAG for all models
-🐙 **[Multimodal](./configuration/models/multimodal)**: Accepts image file uploads on supported providers
-👤 **[OpenID](./configuration/open-id)**: Optionally setup OpenID for user authentication
-<div class="flex gap-x-4">
-<div>
-Tools
-<div class="flex justify-center">
-<img class="block dark:hidden" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/chat-ui/tools-light.png" height="auto"/>
-<img class="hidden dark:block" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/chat-ui/tools-dark.png" height="auto"/>
-</div>
-</div>
-<div>
-Web Search
-<div class="flex justify-center">
-<img class="block dark:hidden" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/chat-ui/websearch-light.png" height="auto"/>
-<img class="hidden dark:block" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/chat-ui/websearch-dark.png" height="auto"/>
-</div>
-</div>
-</div>
-## Quickstart
-You can quickly have a locally running chat-ui & LLM text-generation server thanks to chat-ui's [llama.cpp server support](https://huggingface.co/docs/chat-ui/configuration/models/providers/llamacpp).
-**Step 1 (Start llama.cpp server):**
-```bash
-# install llama.cpp
-brew install llama.cpp
-# start llama.cpp server (using hf.co/microsoft/Phi-3-mini-4k-instruct-gguf as an example)
-llama-server --hf-repo microsoft/Phi-3-mini-4k-instruct-gguf --hf-file Phi-3-mini-4k-instruct-q4.gguf -c 4096
-```
-A local LLaMA.cpp HTTP Server will start on `http://localhost:8080`. Read more [here](https://huggingface.co/docs/chat-ui/configuration/models/providers/llamacpp).
-**Step 2 (tell chat-ui to use local llama.cpp server):**
-Add the following to your `.env.local`:
-```ini
-MODELS=`[
-  {
-    "name": "Local microsoft/Phi-3-mini-4k-instruct-gguf",
-    "tokenizer": "microsoft/Phi-3-mini-4k-instruct-gguf",
-    "preprompt": "",
-    "chatPromptTemplate": "<s>{{preprompt}}{{#each messages}}{{#ifUser}}<|user|>\n{{content}}<|end|>\n<|assistant|>\n{{/ifUser}}{{#ifAssistant}}{{content}}<|end|>\n{{/ifAssistant}}{{/each}}",
-    "parameters": {
-      "stop": ["<|end|>", "<|endoftext|>", "<|assistant|>"],
-      "temperature": 0.7,
-      "max_new_tokens": 1024,
-      "truncate": 3071
-    },
-    "endpoints": [{
-      "type" : "llamacpp",
-      "baseURL": "http://localhost:8080"
-    }],
-  },
-]`
-```
-Read more [here](https://huggingface.co/docs/chat-ui/configuration/models/providers/llamacpp).
-**Step 3 (make sure you have MongoDB running locally):**
-```bash
-docker run -d -p 27017:27017 --name mongo-chatui mongo:latest
-```
-Read more [here](https://github.com/huggingface/chat-ui?tab=Readme-ov-file#database).
-**Step 4 (start chat-ui):**
-```bash
-git clone https://github.com/huggingface/chat-ui
-cd chat-ui
-npm install
-npm run dev -- --open
-```
-Read more [here](https://github.com/huggingface/chat-ui?tab=readme-ov-file#launch).
-<div class="flex justify-center">
-<img class="block dark:hidden" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/chat-ui/llamacpp-light.png" height="auto"/>
-<img class="hidden dark:block" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/chat-ui/llamacpp-dark.png" height="auto"/>
-</div>

docs/source/installation/docker.md DELETED

@@ -1,11 +0,0 @@
-# Running on Docker
-Pre-built docker images are provided with and without MongoDB built in. Refer to the [configuration section](../configuration/overview) for env variables that must be provided. We recommend using the `--env-file` option to avoid leaking secrets into your shell history.
-```bash
-# Without built-in DB
-docker run -p 3000:3000 --env-file .env.local --name chat-ui ghcr.io/huggingface/chat-ui
-# With built-in DB
-docker run -p 3000:3000 --env-file .env.local -v chat-ui:/data --name chat-ui ghcr.io/huggingface/chat-ui-db
-```

docs/source/installation/helm.md DELETED

@@ -1,35 +0,0 @@
-# Helm
-<Tip warning={true}>
-**We highly discourage using the chart**. The Helm chart is a work in progress and should be considered unstable. Breaking changes to the chart may be pushed without migration guides or notice. Contributions welcome!
-</Tip>
-For installation on Kubernetes, you may use the helm chart in `/chart`. Please note that no chart repository has been setup, so you'll need to clone the repository and install the chart by path. The production values may be found at `chart/env/prod.yaml`.
-**Example values.yaml**
-```yaml
-replicas: 1
-domain: example.com
-service:
-  type: ClusterIP
-resources:
-  requests:
-    cpu: 100m
-    memory: 2Gi
-  limits:
-    # Recommended to use large limits when web search is enabled
-    cpu: "4"
-    memory: 6Gi
-envVars:
-  MONGODB_URL: mongodb://chat-ui-mongo:27017
-  # Ensure that your values.yaml will not leak anywhere
-  # PRs welcome for a chart rework with envFrom support!
-  HF_TOKEN: secret_token
-```

docs/source/installation/local.md DELETED

@@ -1,34 +0,0 @@
-# Running Locally
-You may start an instance locally for non-production use cases. For production use cases, please see the other installation options.
-## Configuration
-The default config for Chat UI is stored in the `.env` file. You will need to override some values to get Chat UI to run locally. Start by creating a `.env.local` file in the root of the repository as per the [configuration section](../configuration/overview). The bare minimum config you need to get Chat UI to run locally is the following:
-```ini
-MONGODB_URL=<the URL to your MongoDB instance>
-HF_TOKEN=<your access token> # find your token at hf.co/settings/token
-```
-## Database
-The chat history is stored in a MongoDB instance, and having a DB instance available is needed for Chat UI to work.
-You can use a local MongoDB instance. The easiest way is to spin one up using docker with persistence:
-```bash
-docker run -d -p 27017:27017 -v mongo-chat-ui:/data --name mongo-chat-ui mongo:latest
-```
-In that case, the URL of your DB will be `MONGODB_URL=mongodb://localhost:27017`.
-Alternatively, you can use a [free MongoDB Atlas](https://www.mongodb.com/pricing) instance; Chat UI should fit comfortably within the free tier. You can then set the `MONGODB_URL` variable in `.env.local` to match your instance.
-## Starting the server
-```bash
-npm ci # install dependencies
-npm run build # build the project
-npm run preview -- --open # start the server & open your instance at http://localhost:4173
-```

docs/source/installation/spaces.md DELETED

@@ -1,9 +0,0 @@
-# Running on Hugging Face Spaces
-If you don't want to configure, set up, and launch your own Chat UI instance, you can use this option as a fast deployment alternative.
-You can deploy your own customized Chat UI instance with any supported [LLM](https://huggingface.co/models?pipeline_tag=text-generation) of your choice on [Hugging Face Spaces](https://huggingface.co/spaces). To do so, use the chat-ui template [available here](https://huggingface.co/new-space?template=huggingchat/chat-ui-template).
-Set `HF_TOKEN` in [Space secrets](https://huggingface.co/docs/hub/spaces-overview#managing-secrets-and-environment-variables) to deploy a model with gated access or a model in a private repository. It's also compatible with [Inference for PROs](https://huggingface.co/blog/inference-pro) curated list of powerful models with higher rate limits. Make sure to create your personal token first in your [User Access Tokens settings](https://huggingface.co/settings/tokens).
-Read the full tutorial [here](https://huggingface.co/docs/hub/spaces-sdks-docker-chatui#chatui-on-spaces).

package-lock.json CHANGED

The diff for this file is too large to render.

package.json CHANGED

@@ -1,6 +1,6 @@
 {
 	"name": "chat-ui",
-	"version": "0.10.0",
 	"private": true,
 	"packageManager": "npm@9.5.0",
 	"scripts": {
@@ -18,9 +18,7 @@
 		"prepare": "husky"
 	},
 	"devDependencies": {
-		"@elysiajs/cors": "^1.3.3",
 		"@elysiajs/eden": "^1.3.2",
-		"@elysiajs/node": "^1.2.6",
 		"@faker-js/faker": "^8.4.1",
 		"@iconify-json/carbon": "^1.1.16",
 		"@iconify-json/eos-icons": "^1.1.6",
@@ -29,17 +27,12 @@
 		"@sveltejs/vite-plugin-svelte": "^5.0.3",
 		"@tailwindcss/typography": "^0.5.9",
 		"@types/dompurify": "^3.0.5",
-		"@types/express": "^4.17.21",
-		"@types/fs-extra": "^11.0.4",
 		"@types/js-yaml": "^4.0.9",
-		"@types/jsdom": "^21.1.1",
-		"@types/jsonpath": "^0.2.4",
 		"@types/katex": "^0.16.7",
 		"@types/mime-types": "^2.1.4",
 		"@types/minimist": "^1.2.5",
 		"@types/node": "^22.1.0",
 		"@types/parquetjs": "^0.10.3",
-		"@types/sbd": "^1.0.5",
 		"@types/uuid": "^9.0.8",
 		"@types/yazl": "^3.3.0",
 		"@typescript-eslint/eslint-plugin": "^6.x",
@@ -50,23 +43,19 @@
 		"eslint": "^8.28.0",
 		"eslint-config-prettier": "^8.5.0",
 		"eslint-plugin-svelte": "^2.45.1",
-		"fs-extra": "^11.3.0",
 		"isomorphic-dompurify": "^2.13.0",
 		"js-yaml": "^4.1.0",
-		"jsonrepair": "^3.12.0",
 		"minimist": "^1.2.8",
 		"mongodb-memory-server": "^10.1.2",
-		"node-llama-cpp": "^3.6.0",
 		"prettier": "^3.5.3",
 		"prettier-plugin-svelte": "^3.2.6",
 		"prettier-plugin-tailwindcss": "^0.6.11",
-		"prom-client": "^15.1.2",
 		"sade": "^1.8.1",
 		"superjson": "^2.2.2",
 		"svelte": "^5.33.3",
 		"svelte-check": "^4.0.0",
 		"svelte-gestures": "^5.1.3",
-		"ts-node": "^10.9.1",
 		"tslib": "^2.4.1",
 		"typescript": "^5.5.0",
 		"unplugin-icons": "^0.16.1",
@@ -77,52 +66,34 @@
 	},
 	"type": "module",
 	"dependencies": {
-		"@aws-sdk/credential-providers": "^3.592.0",
-		"@cliqz/adblocker-playwright": "^1.34.0",
 		"@elysiajs/swagger": "^1.3.0",
 		"@gradio/client": "^1.8.0",
 		"@huggingface/hub": "^2.2.0",
 		"@huggingface/inference": "^3.12.1",
-		"@huggingface/mcp-client": "^0.1.1",
-		"@huggingface/tasks": "^0.19.1",
-		"@huggingface/transformers": "^3.1.1",
 		"@iconify-json/bi": "^1.1.21",
-		"@playwright/browser-chromium": "^1.52.0",
 		"@resvg/resvg-js": "^2.6.2",
 		"autoprefixer": "^10.4.14",
-		"aws-sigv4-fetch": "^4.0.1",
-		"aws4": "^1.13.0",
 		"date-fns": "^2.29.3",
 		"dotenv": "^16.5.0",
-		"express": "^4.21.2",
 		"file-type": "^21.0.0",
-		"google-auth-library": "^9.13.0",
 		"handlebars": "^4.7.8",
 		"highlight.js": "^11.7.0",
 		"husky": "^9.0.11",
-		"image-size": "^1.2.1",
 		"ip-address": "^9.0.5",
-		"jose": "^5.3.0",
-		"jsdom": "^22.0.0",
 		"json5": "^2.2.3",
-		"jsonpath": "^1.1.1",
 		"katex": "^0.16.21",
 		"lint-staged": "^15.2.7",
 		"marked": "^12.0.1",
 		"mongodb": "^5.8.0",
 		"nanoid": "^5.0.9",
-		"natural": "^8.1.0",
 		"openid-client": "^5.4.2",
 		"parquetjs": "^0.11.2",
 		"pino": "^9.0.0",
 		"pino-pretty": "^11.0.0",
-		"playwright": "^1.52.0",
 		"postcss": "^8.4.31",
-		"saslprep": "^1.0.3",
 		"satori": "^0.10.11",
 		"satori-html": "^0.3.2",
-		"sbd": "^1.0.19",
-		"serpapi": "^1.1.1",
 		"sharp": "^0.33.4",
 		"tailwind-scrollbar": "^3.0.0",
 		"tailwindcss": "^3.4.0",
@@ -130,16 +101,6 @@
 		"vitest-browser-svelte": "^0.1.0",
 		"zod": "^3.22.3"
 	},
-	"optionalDependencies": {
-		"@anthropic-ai/sdk": "^0.32.1",
-		"@anthropic-ai/vertex-sdk": "^0.4.1",
-		"@aws-sdk/client-bedrock-runtime": "^3.631.0",
-		"@google-cloud/vertexai": "^1.1.0",
-		"@google/generative-ai": "^0.24.0",
-		"aws4fetch": "^1.0.17",
-		"cohere-ai": "^7.9.0",
-		"openai": "^4.44.0"
-	},
 	"overrides": {
 		"@reflink/reflink": "file:stub/@reflink/reflink"
 	}

 {
 	"name": "chat-ui",
+	"version": "0.20.0",
 	"private": true,
 	"packageManager": "npm@9.5.0",
 	"scripts": {
 		"prepare": "husky"
 	},
 	"devDependencies": {
 		"@elysiajs/eden": "^1.3.2",
 		"@faker-js/faker": "^8.4.1",
 		"@iconify-json/carbon": "^1.1.16",
 		"@iconify-json/eos-icons": "^1.1.6",
 		"@sveltejs/vite-plugin-svelte": "^5.0.3",
 		"@tailwindcss/typography": "^0.5.9",
 		"@types/dompurify": "^3.0.5",
 		"@types/js-yaml": "^4.0.9",
 		"@types/katex": "^0.16.7",
 		"@types/mime-types": "^2.1.4",
 		"@types/minimist": "^1.2.5",
 		"@types/node": "^22.1.0",
 		"@types/parquetjs": "^0.10.3",
 		"@types/uuid": "^9.0.8",
 		"@types/yazl": "^3.3.0",
 		"@typescript-eslint/eslint-plugin": "^6.x",
 		"eslint": "^8.28.0",
 		"eslint-config-prettier": "^8.5.0",
 		"eslint-plugin-svelte": "^2.45.1",
+		"fs-extra": "^11.3.1",
 		"isomorphic-dompurify": "^2.13.0",
 		"js-yaml": "^4.1.0",
 		"minimist": "^1.2.8",
 		"mongodb-memory-server": "^10.1.2",
 		"prettier": "^3.5.3",
 		"prettier-plugin-svelte": "^3.2.6",
 		"prettier-plugin-tailwindcss": "^0.6.11",
 		"sade": "^1.8.1",
 		"superjson": "^2.2.2",
 		"svelte": "^5.33.3",
 		"svelte-check": "^4.0.0",
 		"svelte-gestures": "^5.1.3",
 		"tslib": "^2.4.1",
 		"typescript": "^5.5.0",
 		"unplugin-icons": "^0.16.1",
 	},
 	"type": "module",
 	"dependencies": {
 		"@elysiajs/swagger": "^1.3.0",
 		"@gradio/client": "^1.8.0",
 		"@huggingface/hub": "^2.2.0",
 		"@huggingface/inference": "^3.12.1",
 		"@iconify-json/bi": "^1.1.21",
 		"@resvg/resvg-js": "^2.6.2",
 		"autoprefixer": "^10.4.14",
 		"date-fns": "^2.29.3",
 		"dotenv": "^16.5.0",
 		"file-type": "^21.0.0",
 		"handlebars": "^4.7.8",
 		"highlight.js": "^11.7.0",
 		"husky": "^9.0.11",
 		"ip-address": "^9.0.5",
 		"json5": "^2.2.3",
 		"katex": "^0.16.21",
 		"lint-staged": "^15.2.7",
 		"marked": "^12.0.1",
 		"mongodb": "^5.8.0",
 		"nanoid": "^5.0.9",
+		"openai": "^4.44.0",
 		"openid-client": "^5.4.2",
 		"parquetjs": "^0.11.2",
 		"pino": "^9.0.0",
 		"pino-pretty": "^11.0.0",
 		"postcss": "^8.4.31",
 		"satori": "^0.10.11",
 		"satori-html": "^0.3.2",
 		"sharp": "^0.33.4",
 		"tailwind-scrollbar": "^3.0.0",
 		"tailwindcss": "^3.4.0",
 		"vitest-browser-svelte": "^0.1.0",
 		"zod": "^3.22.3"
 	},
 	"overrides": {
 		"@reflink/reflink": "file:stub/@reflink/reflink"
 	}

scripts/populate.ts CHANGED

@@ -15,8 +15,6 @@ import type { User } from "../src/lib/types/User";
 import type { Assistant } from "../src/lib/types/Assistant";
 import type { Conversation } from "../src/lib/types/Conversation";
 import type { Settings } from "../src/lib/types/Settings";
-import type { CommunityToolDB, ToolLogoColor, ToolLogoIcon } from "../src/lib/types/Tool";
-import { defaultEmbeddingModel } from "../src/lib/server/embeddingModels.ts";
 import { Message } from "../src/lib/types/Message.ts";
 import { addChildren } from "../src/lib/utils/tree/addChildren.ts";
@@ -40,7 +38,7 @@ rl.on("close", function () {
 const samples = fs.readFileSync(path.join(__dirname, "samples.txt"), "utf8").split("\n---\n");
-const possibleFlags = ["reset", "all", "users", "settings", "assistants", "conversations", "tools"];
 const argv = minimist(process.argv.slice(2));
 const flags = argv["_"].filter((flag) => possibleFlags.includes(flag));
@@ -156,7 +154,6 @@ async function seed() {
 		await collections.settings.deleteMany({});
 		await collections.assistants.deleteMany({});
 		await collections.conversations.deleteMany({});
-		await collections.tools.deleteMany({});
 		await collections.migrationResults.deleteMany({});
 		await collections.semaphores.deleteMany({});
 		console.log("Reset done");
@@ -186,12 +183,12 @@ async function seed() {
 				userId: user._id,
 				shareConversationsWithModelAuthors: faker.datatype.boolean(0.25),
 				hideEmojiOnSidebar: faker.datatype.boolean(0.25),
-				ethicsModalAcceptedAt: faker.date.recent({ days: 30 }),
 				activeModel: faker.helpers.arrayElement(modelIds),
 				createdAt: faker.date.recent({ days: 30 }),
 				updatedAt: faker.date.recent({ days: 30 }),
 				disableStream: faker.datatype.boolean(0.25),
 				directPaste: faker.datatype.boolean(0.25),
 				customPrompts: {},
 				assistants: [],
 			};
@@ -272,7 +269,7 @@ async function seed() {
 							updatedAt: faker.date.recent({ days: 145 }),
 							model: faker.helpers.arrayElement(modelIds),
 							title: faker.internet.emoji() + " " + faker.hacker.phrase(),
-							embeddingModel: defaultEmbeddingModel.id,
 							messages,
 							rootMessageId: messages[0].id,
 						} satisfies Conversation;
@@ -287,80 +284,6 @@ async function seed() {
 		);
 		console.log("Done creating conversations.");
 	}
-	// generate Community Tools
-	if (flags.includes("tools") || flags.includes("all")) {
-		const tools = await Promise.all(
-			faker.helpers.multiple(
-				() => {
-					const _id = new ObjectId();
-					const displayName = faker.company.catchPhrase();
-					const description = faker.company.catchPhrase();
-					const color = faker.helpers.arrayElement([
-						"purple",
-						"blue",
-						"green",
-						"yellow",
-						"red",
-					]) satisfies ToolLogoColor;
-					const icon = faker.helpers.arrayElement([
-						"wikis",
-						"tools",
-						"camera",
-						"code",
-						"email",
-						"cloud",
-						"terminal",
-						"game",
-						"chat",
-						"speaker",
-						"video",
-					]) satisfies ToolLogoIcon;
-					const baseUrl = faker.helpers.arrayElement([
-						"stabilityai/stable-diffusion-3-medium",
-						"multimodalart/cosxl",
-						"gokaygokay/SD3-Long-Captioner",
-						"xichenhku/MimicBrush",
-					]);
-					// keep empty for populate for now
-					const user: User = faker.helpers.arrayElement(users);
-					const createdById = user._id;
-					const createdByName = user.username ?? user.name;
-					return {
-						type: "community" as const,
-						_id,
-						createdById,
-						createdByName,
-						displayName,
-						name: displayName.toLowerCase().replace(" ", "_"),
-						endpoint: "/test",
-						description,
-						color,
-						icon,
-						baseUrl,
-						inputs: [],
-						outputPath: null,
-						outputType: "str" as const,
-						showOutput: false,
-						useCount: faker.number.int({ min: 0, max: 100000 }),
-						last24HoursUseCount: faker.number.int({ min: 0, max: 1000 }),
-						createdAt: faker.date.recent({ days: 30 }),
-						updatedAt: faker.date.recent({ days: 30 }),
-						searchTokens: generateSearchTokens(displayName),
-						review: faker.helpers.enumValue(ReviewStatus),
-						outputComponent: null,
-						outputComponentIdx: null,
-					};
-				},
-				{ count: faker.number.int({ min: 10, max: 200 }) }
-			)
-		);
-		await collections.tools.insertMany(tools satisfies CommunityToolDB[]);
-	}
 }
 // run seed

 import type { Assistant } from "../src/lib/types/Assistant";
 import type { Conversation } from "../src/lib/types/Conversation";
 import type { Settings } from "../src/lib/types/Settings";
 import { Message } from "../src/lib/types/Message.ts";
 import { addChildren } from "../src/lib/utils/tree/addChildren.ts";
 const samples = fs.readFileSync(path.join(__dirname, "samples.txt"), "utf8").split("\n---\n");
+const possibleFlags = ["reset", "all", "users", "settings", "assistants", "conversations"];
 const argv = minimist(process.argv.slice(2));
 const flags = argv["_"].filter((flag) => possibleFlags.includes(flag));
 		await collections.settings.deleteMany({});
 		await collections.assistants.deleteMany({});
 		await collections.conversations.deleteMany({});
 		await collections.migrationResults.deleteMany({});
 		await collections.semaphores.deleteMany({});
 		console.log("Reset done");
 				userId: user._id,
 				shareConversationsWithModelAuthors: faker.datatype.boolean(0.25),
 				hideEmojiOnSidebar: faker.datatype.boolean(0.25),
 				activeModel: faker.helpers.arrayElement(modelIds),
 				createdAt: faker.date.recent({ days: 30 }),
 				updatedAt: faker.date.recent({ days: 30 }),
 				disableStream: faker.datatype.boolean(0.25),
 				directPaste: faker.datatype.boolean(0.25),
+				hidePromptExamples: {},
 				customPrompts: {},
 				assistants: [],
 			};
 							updatedAt: faker.date.recent({ days: 145 }),
 							model: faker.helpers.arrayElement(modelIds),
 							title: faker.internet.emoji() + " " + faker.hacker.phrase(),
+							// embeddings removed in this build
 							messages,
 							rootMessageId: messages[0].id,
 						} satisfies Conversation;
 		);
 		console.log("Done creating conversations.");
 	}
 }
 // run seed
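The flag handling in `populate.ts` can be illustrated in isolation. The sketch below assumes the positional arguments have already been extracted (the script uses minimist's `argv["_"]` for that); `parseFlags` and the `Flag` type are hypothetical names, not part of the script.

```typescript
// The flags accepted after this change; "tools" is no longer in the list.
const possibleFlags = ["reset", "all", "users", "settings", "assistants", "conversations"] as const;
type Flag = (typeof possibleFlags)[number];

// Keep only recognized flags, preserving the order given on the command line.
// Unknown values (including the removed "tools") are silently dropped,
// mirroring the filter in the script.
function parseFlags(positional: string[]): Flag[] {
	return positional.filter((arg): arg is Flag =>
		(possibleFlags as readonly string[]).includes(arg)
	);
}
```

One consequence worth noting: `npm run populate tools` would now yield an empty flag list rather than seeding tools.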

server.log ADDED

@@ -0,0 +1,2 @@
+/Users/vm/.venv/bin/python3: No module named uvicorn
+/Users/vm/.venv/bin/python3: No module named uvicorn

src/ambient.d.ts CHANGED

@@ -2,3 +2,6 @@ declare module "*.ttf" {
 	const value: ArrayBuffer;
 	export default value;
 }

 	const value: ArrayBuffer;
 	export default value;
 }
+// Legacy helpers removed: web search support is deprecated, so we intentionally
+// avoid leaking those shapes into the global ambient types.

src/app.html CHANGED

@@ -5,15 +5,18 @@
 		<meta name="viewport" content="width=device-width, initial-scale=1" />
 		<meta name="theme-color" content="rgb(249, 250, 251)" />
 		<script>
-			if (
-				localStorage.theme === "dark" ||
-				(!("theme" in localStorage) && window.matchMedia("(prefers-color-scheme: dark)").matches)
-			) {
-				document.documentElement.classList.add("dark");
-				document
-					.querySelector('meta[name="theme-color"]')
-					.setAttribute("content", "rgb(26, 36, 50)");
-			}
 			// For some reason, Sveltekit doesn't let us load env variables from .env here, so we load it from hooks.server.ts
 			window.gaId = "%gaId%";

 		<meta name="viewport" content="width=device-width, initial-scale=1" />
 		<meta name="theme-color" content="rgb(249, 250, 251)" />
 		<script>
+			(function () {
+				try {
+					var prefersDark = window.matchMedia("(prefers-color-scheme: dark)").matches;
+					var stored = localStorage.getItem("theme");
+					var followSystem = stored === null || stored === "system";
+					var isDark = stored === "dark" || (followSystem && prefersDark);
+					if (isDark) {
+						document.documentElement.classList.add("dark");
+						document.querySelector('meta[name="theme-color"]').setAttribute("content", "#07090d");
+					}
+				} catch (e) {}
+			})();
 			// For some reason, Sveltekit doesn't let us load env variables from .env here, so we load it from hooks.server.ts
 			window.gaId = "%gaId%";
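The decision made by the new inline theme script can be expressed as a pure function, which makes the change easier to reason about. This is a sketch only; `resolveIsDark` is a hypothetical name and does not appear in the commit.

```typescript
// Returns true when the "dark" class should be applied.
// `stored` is the value of localStorage.getItem("theme"), or null when unset.
// `prefersDark` is the result of the prefers-color-scheme media query.
function resolveIsDark(stored: string | null, prefersDark: boolean): boolean {
	// A missing value or an explicit "system" both defer to the OS preference.
	const followSystem = stored === null || stored === "system";
	return stored === "dark" || (followSystem && prefersDark);
}
```

Compared with the removed script, the new logic treats a stored value of `"system"` the same as no stored value, and the `try`/`catch` wrapper keeps the page usable when `localStorage` access throws.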

src/hooks.server.ts CHANGED

@@ -9,9 +9,7 @@ import { checkAndRunMigrations } from "$lib/migrations/migrations";
 import { building, dev } from "$app/environment";
 import { logger } from "$lib/server/logger";
 import { AbortedGenerations } from "$lib/server/abortedGenerations";
-import { MetricsServer } from "$lib/server/metrics";
 import { initExitHandler } from "$lib/server/exitHandler";
-import { refreshAssistantsCounts } from "$lib/jobs/refresh-assistants-counts";
 import { refreshConversationStats } from "$lib/jobs/refresh-conversation-stats";
 import { adminTokenManager } from "$lib/server/adminToken";
 import { isHostLocalhost } from "$lib/server/isURLLocal";
@@ -22,21 +20,25 @@ export const init: ServerInit = async () => {
 	// TODO: move this code on a started server hook, instead of using a "building" flag
 	if (!building) {
-		// Set HF_TOKEN as a process variable for Transformers.JS to see it
-		process.env.HF_TOKEN ??= config.HF_TOKEN;
 		logger.info("Starting server...");
 		initExitHandler();
 		checkAndRunMigrations();
-		if (config.ENABLE_ASSISTANTS) {
-			refreshAssistantsCounts();
-		}
 		refreshConversationStats();
-		// Init metrics server
-		MetricsServer.getInstance();
 		// Init AbortedGenerations refresh process
 		AbortedGenerations.getInstance();
@@ -186,23 +188,7 @@ export const handle: Handle = async ({ event, resolve }) => {
 			return errorResponse(401, ERROR_MESSAGES.authOnly);
 		}
-		// if login is not required and the call is not from /settings and we display the ethics modal with PUBLIC_APP_DISCLAIMER
-		//  we check if the user has accepted the ethics modal first.
-		// If login is required, `ethicsModalAcceptedAt` is already true at this point, so do not pass this condition. This saves a DB call.
-		if (
-			!requiresUser &&
-			!event.url.pathname.startsWith(`${base}/settings`) &&
-			config.PUBLIC_APP_DISCLAIMER === "1"
-		) {
-			const hasAcceptedEthicsModal = await collections.settings.countDocuments({
-				sessionId: event.locals.sessionId,
-				ethicsModalAcceptedAt: { $exists: true },
-			});
-			if (!hasAcceptedEthicsModal) {
-				return errorResponse(405, "You need to accept the welcome modal first");
-			}
-		}
 	}
 	let replaced = false;

 import { building, dev } from "$app/environment";
 import { logger } from "$lib/server/logger";
 import { AbortedGenerations } from "$lib/server/abortedGenerations";
 import { initExitHandler } from "$lib/server/exitHandler";
 import { refreshConversationStats } from "$lib/jobs/refresh-conversation-stats";
 import { adminTokenManager } from "$lib/server/adminToken";
 import { isHostLocalhost } from "$lib/server/isURLLocal";
 	// TODO: move this code on a started server hook, instead of using a "building" flag
 	if (!building) {
+		// Ensure legacy env expected by some libs: map OPENAI_API_KEY -> HF_TOKEN if absent
+		const canonicalToken = config.OPENAI_API_KEY || config.HF_TOKEN;
+		if (canonicalToken) {
+			process.env.HF_TOKEN ??= canonicalToken;
+		}
+		// Warn if legacy-only var is used
+		if (!config.OPENAI_API_KEY && config.HF_TOKEN) {
+			logger.warn(
+				"HF_TOKEN is deprecated in favor of OPENAI_API_KEY. Please migrate to OPENAI_API_KEY."
+			);
+		}
 		logger.info("Starting server...");
 		initExitHandler();
 		checkAndRunMigrations();
 		refreshConversationStats();
 		// Init AbortedGenerations refresh process
 		AbortedGenerations.getInstance();
 			return errorResponse(401, ERROR_MESSAGES.authOnly);
 		}
+		// Ethics disclaimer gating removed
 	}
 	let replaced = false;
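
The token handling added to the `init` hook above can be read as a small precedence rule: `OPENAI_API_KEY` wins over `HF_TOKEN`, an `HF_TOKEN` already present in the environment is left alone, and a warning fires only when the legacy variable is the sole one set. The sketch below restates that rule as a pure function so it can be checked in isolation. `resolveToken`, `TokenConfig`, and `TokenResolution` are hypothetical names introduced for illustration and do not appear in the commit.

```typescript
// Hypothetical helper; mirrors the precedence used in the init hook, not the actual code.
interface TokenConfig {
	OPENAI_API_KEY?: string;
	HF_TOKEN?: string;
}

interface TokenResolution {
	// Value that process.env.HF_TOKEN would hold after the hook has run
	token: string | undefined;
	// True when only the legacy HF_TOKEN variable is configured
	warnLegacy: boolean;
}

function resolveToken(config: TokenConfig, existingEnvToken?: string): TokenResolution {
	// Same expression as the hook: prefer OPENAI_API_KEY, fall back to HF_TOKEN
	const canonicalToken = config.OPENAI_API_KEY || config.HF_TOKEN;
	// Like `??=`, keep an existing environment value and only fill in a missing one.
	// An empty canonical token is ignored, matching the `if (canonicalToken)` guard.
	const token = existingEnvToken ?? (canonicalToken || undefined);
	const warnLegacy = !config.OPENAI_API_KEY && Boolean(config.HF_TOKEN);
	return { token, warnLegacy };
}

console.log(resolveToken({ OPENAI_API_KEY: "sk-example", HF_TOKEN: "hf-example" }));
console.log(resolveToken({ HF_TOKEN: "hf-example" }));
```

Keeping the rule in a pure function like this is only one possible design; the commit itself mutates `process.env` directly inside the hook.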

src/lib/APIClient.ts CHANGED

@@ -20,28 +20,20 @@ superjson.registerCustom<ObjectId, string>(
 	"ObjectId"
 );
-export function useAPIClient({ fetch }: { fetch?: Treaty.Config["fetcher"] } = {}) {
-	let url;
-	if (!browser) {
-		let port;
-		if (process.argv.includes("--port")) {
-			port = parseInt(process.argv[process.argv.indexOf("--port") + 1]);
-		} else {
-			const mode = process.argv.find((arg) => arg === "preview" || arg === "dev");
-			if (mode === "preview") {
-				port = 4173;
-			} else if (mode === "dev") {
-				port = 5173;
-			} else {
-				port = 3000;
-			}
-		}
-		// Always use localhost for server-side requests to avoid external HTTP calls during SSR
-		url = `http://localhost:${port}${base}/api/v2`;
-	} else {
-		url = `${window.location.origin}${base}/api/v2`;
-	}
 	const app = treaty<App>(url, { fetcher: fetch });
 	return app;
 }
@@ -57,12 +49,3 @@ export function handleResponse<T extends Record<number, unknown>>(
 		typeof response.data === "string" ? response.data : JSON.stringify(response.data)
 	) as T[200];
 }
-// eslint-disable-next-line @typescript-eslint/no-explicit-any
-export type Success<T extends (...args: any) => any> =
-	Awaited<ReturnType<T>> extends {
-		data: infer D;
-		error: unknown;
-	}
-		? D
-		: never;

 	"ObjectId"
 );
+export function useAPIClient({
+	fetch,
+	origin,
+}: {
+	fetch?: Treaty.Config["fetcher"];
+	origin?: string;
+} = {}) {
+	// On the server, use the current request origin when available to avoid
+	// incorrect port guessing and ensure cookies are forwarded properly.
+	// Fall back to a sane default in dev if origin is missing.
+	const url = browser
+		? `${window.location.origin}${base}/api/v2`
+		: `${origin ?? `http://localhost:5173`}${base}/api/v2`;
 	const app = treaty<App>(url, { fetcher: fetch });
 	return app;
 }
 		typeof response.data === "string" ? response.data : JSON.stringify(response.data)
 	) as T[200];
 }
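
The new `useAPIClient` replaces port guessing from `process.argv` with an explicit `origin` parameter. A minimal sketch of the resulting URL resolution follows, written as a standalone function so it runs without SvelteKit. `resolveApiUrl` and the `Runtime` type are assumptions made for illustration; the commit computes the URL inline from `browser`, `window.location.origin`, and `base`.

```typescript
// Hypothetical stand-in for the inline URL logic in useAPIClient.
type Runtime =
	| { kind: "browser"; locationOrigin: string }
	| { kind: "server"; requestOrigin?: string };

function resolveApiUrl(runtime: Runtime, base: string): string {
	const origin =
		runtime.kind === "browser"
			? // In the browser, the page's own origin is always known
				runtime.locationOrigin
			: // On the server, prefer the request origin and fall back to the dev default
				(runtime.requestOrigin ?? "http://localhost:5173");
	return `${origin}${base}/api/v2`;
}

console.log(resolveApiUrl({ kind: "server" }, ""));
console.log(resolveApiUrl({ kind: "server", requestOrigin: "https://example.com" }, "/chat"));
```

On the server, a caller would presumably pass the origin of the current request, for example `url.origin` from a SvelteKit load event, so that production requests do not hit the `localhost:5173` fallback.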

src/lib/actions/snapScrollToBottom.ts CHANGED

@@ -1,6 +1,5 @@
-import { navigating } from "$app/stores";
 import { tick } from "svelte";
-import { get } from "svelte/store";
 const detachedOffset = 10;
@@ -31,7 +30,7 @@ export const snapScrollToBottom = (node: HTMLElement, dependency: unknown) => {
 		const options = { ...defaultOptions, ..._options };
 		const { force } = options;
-		if (!force && isDetached && !get(navigating)) return;
 		// wait for next tick to ensure that the DOM is updated
 		await tick();

+import { navigating } from "$app/state";
 import { tick } from "svelte";
 const detachedOffset = 10;
 		const options = { ...defaultOptions, ..._options };
 		const { force } = options;
+		if (!force && isDetached && !navigating.to) return;
 		// wait for next tick to ensure that the DOM is updated
 		await tick();

src/lib/buildPrompt.ts CHANGED

@@ -1,11 +1,8 @@
 import type { EndpointParameters } from "./server/endpoints/endpoints";
 import type { BackendModel } from "./server/models";
-import type { Tool, ToolResult } from "./types/Tool";
 type buildPromptOptions = Pick<EndpointParameters, "messages" | "preprompt" | "continueMessage"> & {
 	model: BackendModel;
-	tools?: Tool[];
-	toolResults?: ToolResult[];
 };
 export async function buildPrompt({
@@ -13,8 +10,6 @@ export async function buildPrompt({
 	model,
 	preprompt,
 	continueMessage,
-	tools,
-	toolResults,
 }: buildPromptOptions): Promise<string> {
 	const filteredMessages = messages;
@@ -29,8 +24,6 @@ export async function buildPrompt({
 				role: m.from,
 			})),
 			preprompt,
-			tools,
-			toolResults,
 			continueMessage,
 		})
 		// Not super precise, but it's truncated in the model's backend anyway

 import type { EndpointParameters } from "./server/endpoints/endpoints";
 import type { BackendModel } from "./server/models";
 type buildPromptOptions = Pick<EndpointParameters, "messages" | "preprompt" | "continueMessage"> & {
 	model: BackendModel;
 };
 export async function buildPrompt({
 	model,
 	preprompt,
 	continueMessage,
 }: buildPromptOptions): Promise<string> {
 	const filteredMessages = messages;
 				role: m.from,
 			})),
 			preprompt,
 			continueMessage,
 		})
 		// Not super precise, but it's truncated in the model's backend anyway

src/lib/components/AssistantSettings.svelte DELETED

@@ -1,657 +0,0 @@
-<script lang="ts">
-	import type { Model } from "$lib/types/Model";
-	import type { Assistant } from "$lib/types/Assistant";
-	import { onMount } from "svelte";
-	import { page } from "$app/state";
-	import { base } from "$app/paths";
-	import CarbonPen from "~icons/carbon/pen";
-	import CarbonUpload from "~icons/carbon/upload";
-	import CarbonHelpFilled from "~icons/carbon/help";
-	import CarbonSettingsAdjust from "~icons/carbon/settings-adjust";
-	import CarbonTools from "~icons/carbon/tools";
-	import { useSettingsStore } from "$lib/stores/settings";
-	import IconInternet from "./icons/IconInternet.svelte";
-	import TokensCounter from "./TokensCounter.svelte";
-	import HoverTooltip from "./HoverTooltip.svelte";
-	import { findCurrentModel } from "$lib/utils/models";
-	import AssistantToolPicker from "./AssistantToolPicker.svelte";
-	import { error } from "$lib/stores/errors";
-	import { goto } from "$app/navigation";
-	import { usePublicConfig } from "$lib/utils/PublicConfig.svelte";
-	const publicConfig = usePublicConfig();
-	type AssistantFront = Omit<Assistant, "_id" | "createdById"> & { _id: string };
-	interface Props {
-		assistant?: AssistantFront | undefined;
-		models?: Model[];
-	}
-	let errors = $state<
-		{
-			field: string;
-			message: string;
-		}[]
-	>([]);
-	let { assistant = undefined, models = [] }: Props = $props();
-	let files: FileList | null = $state(null);
-	const settings = useSettingsStore();
-	let modelId = $state("");
-	let systemPrompt = $state(assistant?.preprompt ?? "");
-	let dynamicPrompt = $state(assistant?.dynamicPrompt ?? false);
-	let showModelSettings = $state(Object.values(assistant?.generateSettings ?? {}).some((v) => !!v));
-	onMount(async () => {
-		modelId = findCurrentModel(models, assistant ? assistant.modelId : $settings.activeModel).id;
-	});
-	let inputMessage1 = $state(assistant?.exampleInputs[0] ?? "");
-	let inputMessage2 = $state(assistant?.exampleInputs[1] ?? "");
-	let inputMessage3 = $state(assistant?.exampleInputs[2] ?? "");
-	let inputMessage4 = $state(assistant?.exampleInputs[3] ?? "");
-	function clearError(field: string) {
-		errors = errors.filter((e) => e.field !== field);
-	}
-	function onFilesChange(e: Event) {
-		const inputEl = e.target as HTMLInputElement;
-		if (inputEl.files?.length && inputEl.files[0].size > 0) {
-			if (!inputEl.files[0].type.includes("image")) {
-				inputEl.files = null;
-				files = null;
-				errors = [{ field: "avatar", message: "Only images are allowed" }];
-				return;
-			}
-			files = inputEl.files;
-			clearError("avatar");
-			deleteExistingAvatar = false;
-		}
-	}
-	function getError(field: string) {
-		return errors.find((error) => error.field === field)?.message ?? "";
-	}
-	let deleteExistingAvatar = $state(false);
-	let loading = $state(false);
-	let ragMode: false | "links" | "domains" | "all" = $state(
-		assistant?.rag?.allowAllDomains
-			? "all"
-			: (assistant?.rag?.allowedLinks?.length ?? 0 > 0)
-				? "links"
-				: (assistant?.rag?.allowedDomains?.length ?? 0) > 0
-					? "domains"
-					: false
-	);
-	let tools = $state(assistant?.tools ?? []);
-	const regex = /{{\s?(get|post|url|today)(=.*?)?\s?}}/g;
-	let templateVariables = $derived([...systemPrompt.matchAll(regex)]);
-	let selectedModel = $derived(models.find((m) => m.id === modelId));
-</script>
-<form
-	class="relative flex h-full flex-col overflow-y-auto md:p-8 md:pt-0"
-	enctype="multipart/form-data"
-	onsubmit={async (e) => {
-		e.preventDefault();
-		if (!e.target) {
-			return;
-		}
-		const formData = new FormData(e.target as HTMLFormElement, e.submitter);
-		loading = true;
-		if (files?.[0] && files[0].size > 0) {
-			formData.set("avatar", files[0]);
-		}
-		if (deleteExistingAvatar === true) {
-			if (assistant?.avatar) {
-				// if there is an avatar we explicitly remove it
-				formData.set("avatar", "null");
-			} else {
-				// else we just remove it from the input
-				formData.delete("avatar");
-			}
-		} else {
-			if (files === null) {
-				formData.delete("avatar");
-			}
-		}
-		formData.delete("ragMode");
-		if (ragMode === false || !page.data.enableAssistantsRAG) {
-			formData.set("ragAllowAll", "false");
-			formData.set("ragLinkList", "");
-			formData.set("ragDomainList", "");
-		} else if (ragMode === "all") {
-			formData.set("ragAllowAll", "true");
-			formData.set("ragLinkList", "");
-			formData.set("ragDomainList", "");
-		} else if (ragMode === "links") {
-			formData.set("ragAllowAll", "false");
-			formData.set("ragDomainList", "");
-		} else if (ragMode === "domains") {
-			formData.set("ragAllowAll", "false");
-			formData.set("ragLinkList", "");
-		}
-		formData.set("tools", tools.join(","));
-		let response: Response;
-		if (assistant?._id) {
-			response = await fetch(`${base}/api/assistant/${assistant._id}`, {
-				method: "PATCH",
-				body: formData,
-			});
-			if (response.ok) {
-				goto(`${base}/settings/assistants/${assistant?._id}`, { invalidateAll: true });
-			} else {
-				if (response.status === 400) {
-					const data = await response.json();
-					errors = data.errors;
-				} else {
-					$error = response.statusText;
-				}
-				loading = false;
-			}
-		} else {
-			response = await fetch(`${base}/api/assistant`, {
-				method: "POST",
-				body: formData,
-			});
-			if (response.ok) {
-				const { assistantId } = await response.json();
-				goto(`${base}/settings/assistants/${assistantId}`, { invalidateAll: true });
-			} else {
-				if (response.status === 400) {
-					const data = await response.json();
-					errors = data.errors;
-				} else {
-					$error = response.statusText;
-				}
-				loading = false;
-			}
-		}
-	}}
->
-	{#if assistant}
-		<h2 class="text-xl font-semibold">
-			Edit Assistant: {assistant?.name ?? "assistant"}
-		</h2>
-		<p class="mb-6 text-sm text-gray-500">
-			Modifying an existing assistant will propagate the changes to all users.
-		</p>
-	{:else}
-		<h2 class="text-xl font-semibold">Create new assistant</h2>
-		<p class="mb-6 text-sm text-gray-500">
-			Create and share your own AI Assistant. All assistants are <span
-				class="rounded-full border px-2 py-0.5 leading-none">public</span
-			>
-		</p>
-	{/if}
-	<div class="grid h-full w-full flex-1 grid-cols-2 gap-6 text-sm max-sm:grid-cols-1">
-		<div class="col-span-1 flex flex-col gap-4">
-			<div>
-				<div class="mb-1 block pb-2 text-sm font-semibold">Avatar</div>
-				<input
-					type="file"
-					accept="image/*"
-					name="avatar"
-					id="avatar"
-					class="hidden"
-					onchange={onFilesChange}
-				/>
-				{#if (files && files[0]) || (assistant?.avatar && !deleteExistingAvatar)}
-					<div class="group relative mx-auto h-12 w-12">
-						{#if files && files[0]}
-							<img
-								src={URL.createObjectURL(files[0])}
-								alt="avatar"
-								class="crop mx-auto h-12 w-12 cursor-pointer rounded-full object-cover"
-							/>
-						{:else if assistant?.avatar}
-							<img
-								src="{base}/settings/assistants/{assistant._id}/avatar.jpg?hash={assistant.avatar}"
-								alt="avatar"
-								class="crop mx-auto h-12 w-12 cursor-pointer rounded-full object-cover"
-							/>
-						{/if}
-						<label
-							for="avatar"
-							class="invisible absolute bottom-0 h-12 w-12 rounded-full bg-black bg-opacity-50 p-1 group-hover:visible hover:visible"
-						>
-							<CarbonPen class="mx-auto my-auto h-full cursor-pointer text-center text-white" />
-						</label>
-					</div>
-					<div class="mx-auto w-max pt-1">
-						<button
-							type="button"
-							onclick={(e) => {
-								e.preventDefault();
-								e.stopPropagation();
-								files = null;
-								deleteExistingAvatar = true;
-								clearError("avatar");
-							}}
-							class="mx-auto w-max text-center text-xs text-gray-600 hover:underline"
-						>
-							Delete
-						</button>
-					</div>
-				{:else}
-					<div class="mb-1 flex w-max flex-row gap-4">
-						<label
-							for="avatar"
-							class="btn flex h-8 rounded-lg border bg-white px-3 py-1 text-gray-500 shadow-sm transition-all hover:bg-gray-100"
-						>
-							<CarbonUpload class="mr-2 text-xs " /> Upload
-						</label>
-					</div>
-				{/if}
-				<p class="text-xs text-red-500">{getError("avatar")}</p>
-			</div>
-			<label>
-				<div class="mb-1 font-semibold">Name</div>
-				<input
-					name="name"
-					class="w-full rounded-lg border-2 border-gray-200 bg-gray-100 p-2"
-					placeholder="Assistant Name"
-					value={assistant?.name ?? ""}
-					oninput={() => clearError("name")}
-				/>
-				<p class="text-xs text-red-500">{getError("name")}</p>
-			</label>
-			<label>
-				<div class="mb-1 font-semibold">Description</div>
-				<textarea
-					name="description"
-					class="h-15 w-full rounded-lg border-2 border-gray-200 bg-gray-100 p-2"
-					placeholder="It knows everything about python"
-					value={assistant?.description ?? ""}
-					oninput={() => clearError("description")}
-				></textarea>
-				<p class="text-xs text-red-500">{getError("description")}</p>
-			</label>
-			<label>
-				<div class="mb-1 font-semibold">Model</div>
-				<div class="flex gap-2">
-					<select
-						name="modelId"
-						class="w-full rounded-lg border-2 border-gray-200 bg-gray-100 p-2"
-						bind:value={modelId}
-						onchange={() => clearError("modelId")}
-					>
-						{#each models.filter((model) => !model.unlisted) as model}
-							<option value={model.id}>{model.displayName}</option>
-						{/each}
-					</select>
-					<p class="text-xs text-red-500">{getError("modelId")}</p>
-					<button
-						type="button"
-						class="flex aspect-square items-center gap-2 whitespace-nowrap rounded-lg border px-3 {showModelSettings
-							? 'border-blue-500/20 bg-blue-50 text-blue-600'
-							: ''}"
-						onclick={() => (showModelSettings = !showModelSettings)}
-						><CarbonSettingsAdjust class="text-xs" /></button
-					>
-				</div>
-				<div
-					class="mt-2 rounded-lg border border-blue-500/20 bg-blue-500/5 px-2 py-0.5"
-					class:hidden={!showModelSettings}
-				>
-					<p class="text-xs text-red-500">{getError("inputMessage1")}</p>
-					<div class="my-2 grid grid-cols-1 gap-2.5 sm:grid-cols-2 sm:grid-rows-2">
-						<label for="temperature" class="flex justify-between">
-							<span class="m-1 ml-0 flex items-center gap-1.5 whitespace-nowrap text-sm">
-								Temperature
-								<HoverTooltip
-									label="Temperature: Controls creativity, higher values allow more variety."
-								>
-									<CarbonHelpFilled
-										class="inline text-xxs text-gray-500 group-hover/tooltip:text-blue-600"
-									/>
-								</HoverTooltip>
-							</span>
-							<input
-								type="number"
-								name="temperature"
-								min="0.1"
-								max="2"
-								step="0.1"
-								class="w-20 rounded-lg border-2 border-gray-200 bg-gray-100 px-2 py-1"
-								placeholder={selectedModel?.parameters?.temperature?.toString() ?? "1"}
-								value={assistant?.generateSettings?.temperature ?? ""}
-							/>
-						</label>
-						<label for="top_p" class="flex justify-between">
-							<span class="m-1 ml-0 flex items-center gap-1.5 whitespace-nowrap text-sm">
-								Top P
-								<HoverTooltip
-									label="Top P: Sets word choice boundaries, lower values tighten focus."
-								>
-									<CarbonHelpFilled
-										class="inline text-xxs text-gray-500 group-hover/tooltip:text-blue-600"
-									/>
-								</HoverTooltip>
-							</span>
-							<input
-								type="number"
-								name="top_p"
-								class="w-20 rounded-lg border-2 border-gray-200 bg-gray-100 px-2 py-1"
-								min="0.05"
-								max="1"
-								step="0.05"
-								placeholder={selectedModel?.parameters?.top_p?.toString() ?? "1"}
-								value={assistant?.generateSettings?.top_p ?? ""}
-							/>
-						</label>
-						<label for="repetition_penalty" class="flex justify-between">
-							<span class="m-1 ml-0 flex items-center gap-1.5 whitespace-nowrap text-sm">
-								Repetition penalty
-								<HoverTooltip
-									label="Repetition penalty: Prevents reuse, higher values decrease repetition."
-								>
-									<CarbonHelpFilled
-										class="inline text-xxs text-gray-500 group-hover/tooltip:text-blue-600"
-									/>
-								</HoverTooltip>
-							</span>
-							<input
-								type="number"
-								name="repetition_penalty"
-								min="0.1"
-								max="2"
-								step="0.05"
-								class="w-20 rounded-lg border-2 border-gray-200 bg-gray-100 px-2 py-1"
-								placeholder={selectedModel?.parameters?.repetition_penalty?.toString() ?? "1.0"}
-								value={assistant?.generateSettings?.repetition_penalty ?? ""}
-							/>
-						</label>
-						<label for="top_k" class="flex justify-between">
-							<span class="m-1 ml-0 flex items-center gap-1.5 whitespace-nowrap text-sm">
-								Top K <HoverTooltip
-									label="Top K: Restricts word options, lower values for predictability."
-								>
-									<CarbonHelpFilled
-										class="inline text-xxs text-gray-500 group-hover/tooltip:text-blue-600"
-									/>
-								</HoverTooltip>
-							</span>
-							<input
-								type="number"
-								name="top_k"
-								min="5"
-								max="100"
-								step="5"
-								class="w-20 rounded-lg border-2 border-gray-200 bg-gray-100 px-2 py-1"
-								placeholder={selectedModel?.parameters?.top_k?.toString() ?? "50"}
-								value={assistant?.generateSettings?.top_k ?? ""}
-							/>
-						</label>
-					</div>
-				</div>
-			</label>
-			<label>
-				<div class="mb-1 font-semibold">User start messages</div>
-				<div class="grid gap-1.5 text-sm md:grid-cols-2">
-					<input
-						name="exampleInput1"
-						placeholder="Start Message 1"
-						bind:value={inputMessage1}
-						class="w-full rounded-lg border-2 border-gray-200 bg-gray-100 p-2"
-						oninput={() => clearError("inputMessage1")}
-					/>
-					<input
-						name="exampleInput2"
-						placeholder="Start Message 2"
-						bind:value={inputMessage2}
-						class="w-full rounded-lg border-2 border-gray-200 bg-gray-100 p-2"
-						oninput={() => clearError("inputMessage1")}
-					/>
-					<input
-						name="exampleInput3"
-						placeholder="Start Message 3"
-						bind:value={inputMessage3}
-						class="w-full rounded-lg border-2 border-gray-200 bg-gray-100 p-2"
-						oninput={() => clearError("inputMessage1")}
-					/>
-					<input
-						name="exampleInput4"
-						placeholder="Start Message 4"
-						bind:value={inputMessage4}
-						class="w-full rounded-lg border-2 border-gray-200 bg-gray-100 p-2"
-						oninput={() => clearError("inputMessage1")}
-					/>
-				</div>
-				<p class="text-xs text-red-500">{getError("inputMessage1")}</p>
-			</label>
-			{#if selectedModel?.tools}
-				<div>
-					<span class="text-smd font-semibold"
-						>Tools
-						<CarbonTools class="inline text-xs text-purple-600" />
-						<span class="ml-1 rounded bg-gray-100 px-1 py-0.5 text-xxs font-normal text-gray-600"
-							>Experimental</span
-						>
-					</span>
-					<p class="text-xs text-gray-500">
-						Choose up to 3 community tools that will be used with this assistant.
-					</p>
-				</div>
-				<AssistantToolPicker bind:toolIds={tools} />
-			{/if}
-			{#if page.data.enableAssistantsRAG}
-				<div class="flex flex-col flex-nowrap pb-4">
-					<span class="mt-2 text-smd font-semibold"
-						>Internet access
-						<IconInternet classNames="inline text-sm text-blue-600" />
-						{#if publicConfig.isHuggingChat}
-							<a
-								href="https://huggingface.co/spaces/huggingchat/chat-ui/discussions/385"
-								target="_blank"
-								class="ml-0.5 rounded bg-gray-100 px-1 py-0.5 text-xxs font-normal text-gray-700 underline decoration-gray-400"
-								>Give feedback</a
-							>
-						{/if}
-					</span>
-					<label class="mt-1">
-						<input
-							checked={!ragMode}
-							onchange={() => (ragMode = false)}
-							type="radio"
-							name="ragMode"
-							value={false}
-						/>
-						<span class="my-2 text-sm" class:font-semibold={!ragMode}> Default </span>
-						{#if !ragMode}
-							<span class="block text-xs text-gray-500">
-								Assistant will not use internet to do information retrieval and will respond faster.
-								Recommended for most Assistants.
-							</span>
-						{/if}
-					</label>
-					<label class="mt-1">
-						<input
-							checked={ragMode === "all"}
-							onchange={() => (ragMode = "all")}
-							type="radio"
-							name="ragMode"
-							value={"all"}
-						/>
-						<span class="my-2 text-sm" class:font-semibold={ragMode === "all"}> Web search </span>
-						{#if ragMode === "all"}
-							<span class="block text-xs text-gray-500">
-								Assistant will do a web search on each user request to find information.
-							</span>
-						{/if}
-					</label>
-					<label class="mt-1">
-						<input
-							checked={ragMode === "domains"}
-							onchange={() => (ragMode = "domains")}
-							type="radio"
-							name="ragMode"
-							value={false}
-						/>
-						<span class="my-2 text-sm" class:font-semibold={ragMode === "domains"}>
-							Domains search
-						</span>
-					</label>
-					{#if ragMode === "domains"}
-						<span class="mb-2 text-xs text-gray-500">
-							Specify domains and URLs that the application can search, separated by commas.
-						</span>
-						<input
-							name="ragDomainList"
-							class="w-full rounded-lg border-2 border-gray-200 bg-gray-100 p-2"
-							placeholder="wikipedia.org,bbc.com"
-							value={assistant?.rag?.allowedDomains?.join(",") ?? ""}
-							oninput={() => clearError("ragDomainList")}
-						/>
-						<p class="text-xs text-red-500">{getError("ragDomainList")}</p>
-					{/if}
-					<label class="mt-1">
-						<input
-							checked={ragMode === "links"}
-							onchange={() => (ragMode = "links")}
-							type="radio"
-							name="ragMode"
-							value={false}
-						/>
-						<span class="my-2 text-sm" class:font-semibold={ragMode === "links"}>
-							Specific Links
-						</span>
-					</label>
-					{#if ragMode === "links"}
-						<span class="mb-2 text-xs text-gray-500">
-							Specify a maximum of 10 direct URLs that the Assistant will access. HTML & Plain Text
-							only, separated by commas
-						</span>
-						<input
-							name="ragLinkList"
-							class="w-full rounded-lg border-2 border-gray-200 bg-gray-100 p-2"
-							placeholder="https://raw.githubusercontent.com/huggingface/chat-ui/main/README.md"
-							value={assistant?.rag?.allowedLinks.join(",") ?? ""}
-							oninput={() => clearError("ragLinkList")}
-						/>
-						<p class="text-xs text-red-500">{getError("ragLinkList")}</p>
-					{/if}
-				</div>
-			{/if}
-		</div>
-		<div class="relative col-span-1 flex h-full flex-col">
-			<div class="mb-1 flex justify-between text-sm">
-				<span class="block font-semibold"> Instructions (System Prompt) </span>
-				{#if dynamicPrompt && templateVariables.length}
-					<div class="relative">
-						<button
-							type="button"
-							class="peer rounded bg-blue-500/20 px-1 text-xs text-blue-600 focus:bg-blue-500/30 focus:text-blue-800 sm:text-sm"
-						>
-							{templateVariables.length} template variable{templateVariables.length > 1 ? "s" : ""}
-						</button>
-						<div
-							class="invisible absolute right-0 top-6 z-10 rounded-lg border bg-white p-2 text-xs shadow-lg peer-focus:visible hover:visible sm:w-96"
-						>
-							Will perform a GET or POST request and inject the response into the prompt. Works
-							better with plain text, csv or json content.
-							{#each templateVariables as match}
-								<div>
-									<a
-										href={match[1].toLowerCase() === "get" ? match[2] : "#"}
-										target={match[1].toLowerCase() === "get" ? "_blank" : ""}
-										class="text-gray-500 underline decoration-gray-300"
-									>
-										{match[1].toUpperCase()}: {match[2]}
-									</a>
-								</div>
-							{/each}
-						</div>
-					</div>
-				{/if}
-			</div>
-			<label class="pb-2 text-sm has-[:checked]:font-semibold">
-				<input type="checkbox" name="dynamicPrompt" bind:checked={dynamicPrompt} />
-				Dynamic Prompt
-				<p class="mb-2 text-xs font-normal text-gray-500">
-					Allow the use of template variables {"{{get=https://example.com/path}}"}
-					to insert dynamic content into your prompt by making GET requests to specified URLs on each
-					inference. You can also send the user's message as the body of a POST request, using {"{{post=https://example.com/path}}"}.
-					Use {"{{today}}"} to include the current date.
-				</p>
-			</label>
-			<div class="relative mb-20 flex h-full flex-col gap-2">
-				<textarea
-					name="preprompt"
-					class="min-h-[8lh] flex-1 rounded-lg border-2 border-gray-200 bg-gray-100 p-2 text-sm"
-					placeholder="You'll act as..."
-					bind:value={systemPrompt}
-					oninput={() => clearError("preprompt")}
-				></textarea>
-				{#if modelId}
-					{@const model = models.find((_model) => _model.id === modelId)}
-					{#if model?.tokenizer && systemPrompt}
-						<TokensCounter
-							classNames="absolute bottom-4 right-4"
-							prompt={systemPrompt}
-							modelTokenizer={model.tokenizer}
-							truncate={model?.parameters?.truncate}
-						/>
-					{/if}
-				{/if}
-				<p class="text-xs text-red-500">{getError("preprompt")}</p>
-			</div>
-			<div class="absolute bottom-6 flex w-full justify-end gap-2 md:right-0 md:w-fit">
-				<a
-					href={assistant ? `${base}/settings/assistants/${assistant?._id}` : `${base}/settings`}
-					class="flex items-center justify-center rounded-full bg-gray-200 px-5 py-2 font-semibold text-gray-600"
-				>
-					Cancel
-				</a>
-				<button
-					type="submit"
-					disabled={loading}
-					aria-disabled={loading}
-					class="flex items-center justify-center rounded-full bg-black px-8 py-2 font-semibold"
-					class:bg-gray-200={loading}
-					class:text-gray-600={loading}
-					class:text-white={!loading}
-				>
-					{assistant ? "Save" : "Create"}
-				</button>
-			</div>
-		</div>
-	</div>
-</form>

src/lib/components/AssistantToolPicker.svelte DELETED

@@ -1,150 +0,0 @@
-<script lang="ts">
-	import { base } from "$app/paths";
-	import type { ToolLogoColor, ToolLogoIcon } from "$lib/types/Tool";
-	import { debounce } from "$lib/utils/debounce";
-	import { onMount } from "svelte";
-	import ToolLogo from "./ToolLogo.svelte";
-	import CarbonClose from "~icons/carbon/close";
-	interface ToolSuggestion {
-		_id: string;
-		displayName: string;
-		createdByName: string;
-		color: ToolLogoColor;
-		icon: ToolLogoIcon;
-	}
-	interface Props {
-		toolIds?: string[];
-	}
-	let { toolIds = $bindable([]) }: Props = $props();
-	let selectedValues: ToolSuggestion[] = $state([]);
-	onMount(async () => {
-		selectedValues = await Promise.all(
-			toolIds.map(async (id) => await fetch(`${base}/api/tools/${id}`).then((res) => res.json()))
-		);
-		await fetchSuggestions("");
-	});
-	let inputValue = $state("");
-	let maxValues = 3;
-	let suggestions: ToolSuggestion[] = $state([]);
-	async function fetchSuggestions(query: string) {
-		suggestions = (await fetch(`${base}/api/tools/search?q=${query}`).then((res) =>
-			res.json()
-		)) satisfies ToolSuggestion[];
-	}
-	const debouncedFetch = debounce((query: string) => fetchSuggestions(query), 300);
-	function addValue(value: ToolSuggestion) {
-		if (selectedValues.length < maxValues && !selectedValues.includes(value)) {
-			selectedValues = [...selectedValues, value];
-			toolIds = [...toolIds, value._id];
-			inputValue = "";
-			suggestions = [];
-		}
-	}
-	function removeValue(id: ToolSuggestion["_id"]) {
-		selectedValues = selectedValues.filter((v) => v._id !== id);
-		toolIds = selectedValues.map((value) => value._id);
-	}
-</script>
-{#if selectedValues.length > 0}
-	<div class="flex flex-wrap items-center justify-center gap-2">
-		{#each selectedValues as value}
-			<div
-				class="flex items-center justify-center space-x-2 rounded border border-gray-300 bg-gray-200 px-2 py-1"
-			>
-				{#key value.color + value.icon}
-					<ToolLogo color={value.color} icon={value.icon} size="sm" />
-				{/key}
-				<div class="flex flex-col items-center justify-center py-1">
-					<a
-						href={`${base}/tools/${value._id}`}
-						target="_blank"
-						class="line-clamp-1 truncate font-semibold text-blue-600 hover:underline"
-						>{value.displayName}</a
-					>
-					{#if value.createdByName}
-						<p class="text-center text-xs text-gray-500">
-							Created by
-							<a class="underline" href="{base}/tools?user={value.createdByName}" target="_blank"
-								>{value.createdByName}</a
-							>
-						</p>
-					{:else}
-						<p class="text-center text-xs text-gray-500">Official HuggingChat tool</p>
-					{/if}
-				</div>
-				<button
-					onclick={(e) => {
-						e.preventDefault();
-						e.stopPropagation();
-						removeValue(value._id);
-					}}
-					class="text-lg text-gray-600"
-				>
-					<CarbonClose />
-				</button>
-			</div>
-		{/each}
-	</div>
-{/if}
-{#if selectedValues.length < maxValues}
-	<div class="group relative block">
-		<input
-			type="text"
-			bind:value={inputValue}
-			oninput={(ev) => {
-				inputValue = ev.currentTarget.value;
-				debouncedFetch(inputValue);
-			}}
-			disabled={selectedValues.length >= maxValues}
-			class="w-full rounded border border-gray-200 bg-gray-100 px-3 py-2"
-			class:opacity-50={selectedValues.length >= maxValues}
-			class:bg-gray-100={selectedValues.length >= maxValues}
-			placeholder="Type to search tools..."
-			tabindex="0"
-		/>
-		{#if suggestions.length > 0}
-			<div
-				class="invisible absolute z-10 mt-1 w-full rounded border border-gray-300 bg-white shadow-lg group-focus-within:visible"
-				tabindex="-1"
-			>
-				{#if inputValue === ""}
-					<p class="px-3 py-2 text-left text-xs text-gray-500">
-						Start typing to search for tools...
-					</p>
-				{:else}
-					{#each suggestions as suggestion}
-						<button
-							onclick={(e) => {
-								e.preventDefault();
-								e.stopPropagation();
-								addValue(suggestion);
-							}}
-							class="w-full cursor-pointer px-3 py-2 text-left hover:bg-blue-500 hover:text-white"
-							tabindex="0"
-						>
-							{suggestion.displayName}
-							{#if suggestion.createdByName}
-								<span class="text-xs text-gray-500"> by {suggestion.createdByName}</span>
-							{/if}
-						</button>
-					{/each}
-				{/if}
-			</div>
-		{/if}
-	</div>
-{/if}

src/lib/components/CodeBlock.svelte CHANGED

@@ -1,22 +1,74 @@
 <script lang="ts">
 	import CopyToClipBoardBtn from "./CopyToClipBoardBtn.svelte";
 	import DOMPurify from "isomorphic-dompurify";
 	interface Props {
 		code?: string;
 		rawCode?: string;
 	}
-	let { code = "", rawCode = "" }: Props = $props();
 </script>
 <div class="group relative my-4 rounded-lg">
 	<pre
-		class="scrollbar-custom overflow-auto px-5 font-mono scrollbar-thumb-gray-500 hover:scrollbar-thumb-gray-400 dark:scrollbar-thumb-white/10 dark:hover:scrollbar-thumb-white/20"><code
 			><!-- eslint-disable svelte/no-at-html-tags -->{@html DOMPurify.sanitize(code)}</code
 		></pre>
-	<CopyToClipBoardBtn
-		classNames="btn rounded-lg border border-gray-200 px-2 py-2 text-sm shadow-sm transition-all hover:border-gray-300 active:shadow-inner dark:border-gray-700 dark:hover:border-gray-500 absolute top-2 right-2 invisible opacity-0 group-hover:visible group-hover:opacity-100 dark:text-gray-700 text-gray-200"
-		value={rawCode}
-	/>
 </div>

 <script lang="ts">
 	import CopyToClipBoardBtn from "./CopyToClipBoardBtn.svelte";
 	import DOMPurify from "isomorphic-dompurify";
+	import HtmlPreviewModal from "./HtmlPreviewModal.svelte";
+	import PlayFilledAlt from "~icons/carbon/play-filled-alt";
+	import EosIconsLoading from "~icons/eos-icons/loading";
 	interface Props {
 		code?: string;
 		rawCode?: string;
+		loading?: boolean;
 	}
+	let { code = "", rawCode = "", loading = false }: Props = $props();
+	let previewOpen = $state(false);
+	function hasStrictHtml5Doctype(input: string): boolean {
+		if (!input) return false;
+		const withoutBOM = input.replace(/^\uFEFF/, "");
+		const trimmed = withoutBOM.trimStart();
+		// Strict HTML5 doctype: <!doctype html> with optional whitespace before >
+		return /^<!doctype\s+html\s*>/i.test(trimmed);
+	}
+	function isSvgDocument(input: string): boolean {
+		const trimmed = input.trimStart();
+		return /^(?:<\?xml[^>]*>\s*)?(?:<!doctype\s+svg[^>]*>\s*)?<svg[\s>]/i.test(trimmed);
+	}
+	let showPreview = $derived(hasStrictHtml5Doctype(rawCode) || isSvgDocument(rawCode));
 </script>
 <div class="group relative my-4 rounded-lg">
+	<div class="pointer-events-none sticky top-0 z-10 w-full">
+		<div
+			class="pointer-events-auto absolute right-2 top-2 flex items-center gap-1.5 md:right-3 md:top-3"
+		>
+			{#if showPreview}
+				<button
+					class="btn h-7 gap-1 rounded-lg border border-gray-600 bg-gray-600/50 px-2 text-xs text-gray-300 shadow-sm backdrop-blur transition-all hover:border-gray-500 active:shadow-inner disabled:cursor-not-allowed disabled:opacity-60 dark:border-gray-700 dark:text-gray-400 dark:hover:border-gray-500"
+					disabled={loading}
+					onclick={() => {
+						if (!loading) {
+							previewOpen = true;
+						}
+					}}
+					title="Preview HTML"
+					aria-label="Preview HTML"
+				>
+					{#if loading}
+						<EosIconsLoading class="size-3.5" />
+					{:else}
+						<PlayFilledAlt class="size-3.5" />
+					{/if}
+					Preview
+				</button>
+			{/if}
+			<CopyToClipBoardBtn
+				iconClassNames="size-3"
+				classNames="btn rounded-lg border size-7 text-sm shadow-sm transition-all bg-gray-600/50 backdrop-blur dark:hover:border-gray-500  active:shadow-inner border-gray-600 dark:border-gray-700 hover:border-gray-500 dark:text-gray-400 text-gray-300 "
+				value={rawCode}
+			/>
+		</div>
+	</div>
 	<pre
+		class="scrollbar-custom overflow-auto px-5 font-mono transition-[height] scrollbar-thumb-gray-500 hover:scrollbar-thumb-gray-400 dark:scrollbar-thumb-white/10 dark:hover:scrollbar-thumb-white/20"><code
 			><!-- eslint-disable svelte/no-at-html-tags -->{@html DOMPurify.sanitize(code)}</code
 		></pre>
+	{#if previewOpen}
+		<HtmlPreviewModal html={rawCode} onclose={() => (previewOpen = false)} />
+	{/if}
 </div>
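
Note on the new preview gating: the Preview button is rendered only when `rawCode` starts with a strict HTML5 doctype or looks like a standalone SVG document. The sketch below copies the two predicates from the diff into plain TypeScript so they can be exercised outside Svelte. `shouldShowPreview` is a hypothetical helper name, not part of the commit; it stands in for the `$derived` expression and is assumed to behave the same way.

```typescript
// Predicates copied from the new CodeBlock.svelte so they run without Svelte.
function hasStrictHtml5Doctype(input: string): boolean {
	if (!input) return false;
	const withoutBOM = input.replace(/^\uFEFF/, "");
	const trimmed = withoutBOM.trimStart();
	// Only <!doctype html> (case-insensitive, optional whitespace before >) passes.
	return /^<!doctype\s+html\s*>/i.test(trimmed);
}

function isSvgDocument(input: string): boolean {
	const trimmed = input.trimStart();
	// Optional XML declaration, optional SVG doctype, then the root <svg> tag.
	return /^(?:<\?xml[^>]*>\s*)?(?:<!doctype\s+svg[^>]*>\s*)?<svg[\s>]/i.test(trimmed);
}

// Hypothetical helper: mirrors the `showPreview` $derived expression in the diff.
function shouldShowPreview(rawCode: string): boolean {
	return hasStrictHtml5Doctype(rawCode) || isSvgDocument(rawCode);
}

// Full HTML5 documents and standalone SVGs qualify for preview.
console.log(shouldShowPreview("<!DOCTYPE html><html></html>"));
console.log(shouldShowPreview('<?xml version="1.0"?>\n<svg xmlns="http://www.w3.org/2000/svg"></svg>'));
// Fragments, legacy doctypes, and nested SVGs do not.
console.log(shouldShowPreview("<div><svg></svg></div>"));
console.log(shouldShowPreview('<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN">'));
```

Because both patterns are anchored at the start of the string, an HTML fragment or an SVG embedded inside other markup does not get a Preview button; only code blocks that begin as a complete document do.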