Commit 7bf1507 by victor (HF Staff) · unverified · 1 Parent(s): eeaa128

HuggingChat 2026 (#1875)


* refactor: remove tokenizer-related functionality and dependencies

- Removed tokenizer dependencies from package.json.
- Deleted TokensCounter component and its usages across the application.
- Updated model configurations to exclude tokenizer properties.
- Refactored model processing logic to support only OpenAI-compatible endpoints.
- Adjusted API responses to omit tokenizer information.
- Cleaned up related utility functions and imports.

* Remove web search functionality and related components

- Deleted endpoints for various web search APIs (serpApi, serpStack, serper, webLocal, youApi).
- Removed generateQuery function and its usage in search.
- Eliminated web search related types and interfaces from the codebase.
- Updated Assistant and Conversation types to remove references to web search and embedding models.
- Cleaned up related utility functions and message updates for web search.
- Adjusted API routes and components to reflect the removal of web search features.
- Updated Vite configuration to exclude web search dependencies.

* Remove Assistants feature and related code

- Deleted the assistants page and its load function.
- Removed assistantId from conversation handling in server routes.
- Cleaned up conversation page and server routes to eliminate assistant references.
- Removed assistant-related imports and UI components from settings navigation.
- Deleted assistant-specific pages for editing, creating, and displaying avatars.
- Updated tools pages to reflect changes in imports and types.

* refactor: remove AWS endpoint files and update related configurations

* feat: enhance API client to include origin handling and add debug routes

* refactor: prioritize HF_TOKEN for authentication in OpenAI endpoints and update related configurations

* Remove tool management pages and components

- Deleted the ToolEdit component and its associated logic for editing tools.
- Removed the tool search functionality from the tools page.
- Eliminated the tool input component used for handling various input types.
- Removed the layout and page files for individual tool views and editing.
- Cleaned up the new tool creation page by removing the modal and ToolEdit component.

* Refactor codebase to remove tool-related features and improve formatting

- Removed all references to tools in metrics, models, and text generation modules.
- Updated various interfaces and types to reflect the removal of tool functionalities.
- Cleaned up code formatting for better readability and consistency.
- Adjusted API responses and request handling to align with the new structure.
- Ensured all related tests and specifications are updated accordingly.

* refactor: update README to reflect removal of web search and embedding features, and clarify model configuration

* chore: remove search chat feature (UI and /conversations/search API)

* Merge pull request #3 from gary149/remove-most-of-things-2

Remove most of things

* refactor(metrics): remove Prometheus metrics server and usages

- Delete metrics server implementation and all references
- Drop /metrics endpoint and Prometheus counters
- Clean Helm templates (ports, ServiceMonitor) and env
- Remove metrics docs and TOC entry
- Adjust .env defaults and server hooks

* feat(ui): keep New Chat visible, fix toggles, and polish settings UI

- Always show New Chat in desktop and mobile nav
- Fix Switch component to toggle on click/keyboard
- Simplify modal animation and allow disableFly for settings
- Update settings layouts and styles

* chore(deps): remove prom-client and update lockfile

* chore(dev): allow ngrok host via server.allowedHosts

* refactor(metrics): remove monitoring values from Helm chart

* types: add ambient types for web search sources and stream outputs

- Add ambient types for web search sources and stream outputs
- Unblocks TS where older code references these without imports

* ui(settings): consolidate model actions into card and chip-style links

- Group actions into a subtle bordered container
- Promote “New chat” as primary action
- Convert external links and copy action to consistent chips
- Improve wrapping/alignment; adjust modal height and nav behaviors

* ui(chat): tidy message actions and send button styles

- Remove stale comments and unused disabled classes
- Keep send CTA styling consistent across themes

* server(conversation): clean up endpoints and message handling

- Normalize POST/GET handling and error responses
- Simplify retry/continue branches and update storage writes
- Keep rate limiting and guest checks; minor typing tweaks
- Consistent vote/share handlers

* server(models): load models from OPENAI_BASE_URL (OpenAI-compatible)

- Prefer `OPENAI_BASE_URL` (or `OPENAI_MODEL_LIST_URL`) to fetch model list
- Support optional Authorization via HF_TOKEN/OPENAI_API_KEY
- Provide clearer errors when not configured
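
As a rough illustration of the bullets above, a loader along these lines could build the model-list request. This is a sketch under stated assumptions, not the code from this commit: `buildModelsRequest`, `ModelsRequest`, and the parameter names are hypothetical, and the `/models` path is taken from the `${OPENAI_BASE_URL}/models` mention in the `.env` comments.

```typescript
// Hypothetical helper: builds the request used to list models from an
// OpenAI-compatible server. All names here are illustrative assumptions.
interface ModelsRequest {
  url: string;
  headers: Record<string, string>;
}

function buildModelsRequest(baseUrl: string | undefined, token?: string): ModelsRequest {
  const trimmed = (baseUrl ?? "").trim();
  if (!trimmed) {
    // Corresponds to the "clearer errors when not configured" bullet.
    throw new Error("OPENAI_BASE_URL is not set; cannot load the model list");
  }
  // Drop trailing slashes so the result never contains "//models".
  const url = `${trimmed.replace(/\/+$/, "")}/models`;
  const headers: Record<string, string> = { Accept: "application/json" };
  if (token && token.trim()) {
    // Authorization is optional: it is only sent when a token is configured.
    headers["Authorization"] = `Bearer ${token.trim()}`;
  }
  return { url, headers };
}
```

The returned `url` and `headers` could then be handed to `fetch(url, { headers })`. Keeping the request construction separate from the network call makes the configuration errors easy to check in isolation.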

* server(settings): persist settings fields; minor cleanup

- Keep ethicsModalAccepted optional; set timestamp when provided
- Upsert with createdAt/updatedAt
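
The two bullets above can be sketched as a function that builds a MongoDB update document. This is only an illustration: `buildSettingsUpsert`, the `SettingsBody` shape, and the `ethicsModalAcceptedAt` field name are assumptions, not taken from the diff. `$set` and `$setOnInsert` are standard MongoDB update operators.

```typescript
// Hypothetical shapes; every field name other than createdAt/updatedAt
// is an assumption made for illustration.
interface SettingsBody {
  ethicsModalAccepted?: boolean;
  [key: string]: unknown;
}

interface UpsertUpdate {
  $set: Record<string, unknown>;
  $setOnInsert: { createdAt: Date };
}

function buildSettingsUpsert(body: SettingsBody, now: Date): UpsertUpdate {
  // The flag itself is not persisted; only a timestamp is, and only when provided.
  const { ethicsModalAccepted, ...settings } = body;
  const $set: Record<string, unknown> = { ...settings, updatedAt: now };
  if (ethicsModalAccepted) {
    $set["ethicsModalAcceptedAt"] = now;
  }
  // createdAt is written once, when the upsert inserts a new document.
  return { $set, $setOnInsert: { createdAt: now } };
}
```

The result would be passed to a call such as `collection.updateOne(filter, update, { upsert: true })`, so `updatedAt` changes on every write while `createdAt` is set only on the first one.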

* utils: message updates iterator and smoothing — minor tidy up

- Keep parsing and smoothing logic intact
- No behavioral changes

* server(models-thumbnail): fix image response typing and return type

- Avoid React type noise by casting result from satori-html
- Return Uint8Array for BodyInit clarity

* ui(layout): minor grid/transition tidy and error toast flow

- Keep layout responsive without behavior changes

* dev: allow dynamic ngrok subdomains in Vite server.allowedHosts

- Use .ngrok-free.app wildcard so fresh tunnels work without edits
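
To make the wildcard concrete, here is a small sketch of the matching rule as Vite's documentation describes it for `server.allowedHosts`: an entry with a leading dot allows that hostname and all of its subdomains. The `isHostAllowed` function is a hypothetical re-implementation for illustration, not Vite's own code, and the config is shown as a plain object so the block runs without Vite installed.

```typescript
// Shape of the relevant fragment of vite.config.ts, as a plain object.
const server = { allowedHosts: [".ngrok-free.app"] };

// Hypothetical re-implementation of the leading-dot rule, for illustration only.
function isHostAllowed(host: string, allowed: string[]): boolean {
  const name = host.toLowerCase();
  return allowed.some((entry) => {
    const e = entry.toLowerCase();
    if (e.startsWith(".")) {
      // ".example.com" covers "example.com" and any subdomain of it.
      return name === e.slice(1) || name.endsWith(e);
    }
    return name === e;
  });
}
```

With this rule a freshly generated tunnel such as `abc123.ngrok-free.app` is accepted without editing the config, while a lookalike host such as `evil-ngrok-free.app` is not.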

* ui: polish nav icon sizing and share icon contrast

- Use square size for sidebar icon button and center content
- Ensure share icon has consistent contrast in light/dark modes

* ui(share): implement two-step share conversation modal; keep footer wording; remove legacy share flow and dark styles; disable duplicate copy tooltip; include leafId on copied URL

* ui(nav): remove skeleton placeholders from conversation list and InfiniteScroll loader

* ui(modal): ensure Escape closes all modals by listening on window and backdrop

* chore: revert unrelated changes from previous commit; keep only Modal Escape behavior

* chore: remove unused dependencies and playwright installation from Dockerfile and package.json

* Merge pull request #4 from gary149/ui-update

UI: remove conversation list skeletons and ensure Escape closes all modals

* build: re-add fs-extra for Vite config

* docs: update metadata in README for improved clarity

* fix: Vite/Svelte v6/5 compat and Docker build

- Replace deprecated Svelte DOM event directives in Switch.svelte (onclick/onkeydown)
- Fix Dockerfile chown by creating /home/user/.npm before chown
- Use CommonJS export in tailwind.config.cjs to silence ESM warning

* feat: default OPENAI_BASE_URL to HF router when unset

* chore: remove OPENAI_MODEL_LIST_URL usage and docs

- Drop all references to OPENAI_MODEL_LIST_URL in code and debug endpoints
- Default to HF router when OPENAI_BASE_URL is unset
- Update UI copy and .env comments accordingly

* revert: default base URL fallback (revert 07c1aa44)

Require explicit OPENAI_BASE_URL again; remove implicit default to HF router in models loader.

* fix: simplify text and improve button styling in ShareConversationModal

* fix: update .env configuration for clarity and remove deprecated parameters

* fix: update version to 0.20.0 in package.json

* refactor: replace HF_TOKEN with OPENAI_API_KEY as the primary authorization token; update documentation and code references to reflect this change
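
The precedence this commit message describes, together with the `.env` comment stating that `HF_TOKEN` is a legacy alias used only when `OPENAI_API_KEY` is empty, might be sketched as follows. `resolveApiKey` and its `env` parameter are hypothetical names; the actual lookup in the codebase may differ.

```typescript
// Hypothetical helper: picks the token used for the Authorization header.
function resolveApiKey(env: Record<string, string | undefined>): string | undefined {
  // OPENAI_API_KEY is the canonical token and always wins when non-empty.
  const primary = env["OPENAI_API_KEY"]?.trim();
  if (primary) return primary;
  // HF_TOKEN is only consulted as a legacy fallback.
  const legacy = env["HF_TOKEN"]?.trim();
  return legacy || undefined;
}
```

Returning `undefined` when neither variable is set lets the caller skip the Authorization header entirely, which suits local OpenAI-compatible servers that need no key.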

* refactor: remove deprecated tools and assistant features across multiple files

* feat: add API Base URL display in Application Settings

* feat: implement multimodal support with user-configurable overrides and remove deprecated screenshot functionality

* feat: enhance reasoning handling by implementing autodetection of <think> blocks and updating rendering logic

* feat: sanitize titles by stripping <think> markers across multiple components and endpoints
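
A minimal sketch of what the detection and title sanitization could look like, assuming reasoning is wrapped in literal `<think>` and `</think>` tags. `splitThinkBlocks` and `sanitizeTitle` are hypothetical names, and the commit's real implementation may handle more cases than this.

```typescript
interface SplitResult {
  reasoning: string;
  answer: string;
}

// Hypothetical helper: separates reasoning inside <think> blocks from the answer.
function splitThinkBlocks(text: string): SplitResult {
  const reasoning: string[] = [];
  // Matches closed blocks, plus a trailing unclosed block (possible mid-stream).
  const answer = text.replace(/<think>([\s\S]*?)(?:<\/think>|$)/g, (_match: string, inner: string) => {
    reasoning.push(inner.trim());
    return "";
  });
  return { reasoning: reasoning.join("\n"), answer: answer.trim() };
}

// Hypothetical helper: removes whole blocks first, then any stray tags left behind.
function sanitizeTitle(title: string): string {
  return splitThinkBlocks(title).answer.replace(/<\/?think>/g, "").trim();
}
```

Treating an unclosed `<think>` as reasoning up to the end of the text is a design choice in this sketch: it keeps half-streamed reasoning out of a title or answer.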

* feat: update multimodal support by replacing CarbonImage with CarbonView and adjusting rendering logic

* feat(models): add model-id filter inputs on models page and settings sidebar; use search input type

* Merge pull request #5 from gary149/feat/model-id-filter-inputs

feat(models): add model-id filtering inputs on models page and settings

* Revert "Merge pull request #5 from gary149/feat/model-id-filter-inputs"

This reverts commit 5dae36913a6e96a5e7c24d5295560a4d40222eca, reversing
changes made to a550b4ece0553de12b0c6730da8815e8a5ec7bfe.

* feat(NavMenu): simplify models link display and always show model count

* fix(CopyToClipBoardBtn): update icon size for better visibility

* fix(layout): replace UserIcon with CarbonSettings for application settings button

* Remove deprecated documentation files and sections related to configuration, installation, and features that are no longer supported in the Chat UI project. This includes the removal of files for common issues, embeddings, multimodal models, OpenAI provider configurations, tools, theming, web search, and local installation instructions. Additionally, the main index file has been cleaned up to reflect the current state of the application.

* fix(svelte.config): enable dotenv override for local environment configuration

* fix(CopyToClipBoardBtn): replace IconCopy with CarbonCopy for consistency
fix(ChatWindow): clean up unused code and simplify conditional rendering
feat(page): update model link copy button to use CarbonCopy icon

* refactor(NavConversationItem): remove unused props and simplify height logic

* fix(package-lock): update version from 0.10.0 to 0.20.0 for consistency
fix(settings page): adjust button spacing and replace CarbonCode icon with CarbonArrowUpRight

* feat(NavConversationItem): adjust height for improved layout consistency
fix(OpenReasoningResults): update hover background color for better visibility
feat(+page): add search filter for model ID in model list
feat(+layout): implement search filter for model ID in settings navigation
fix(tailwind.config): add custom gray shades for enhanced design flexibility

* Refactor APIClient and Chat com

This view is limited to 50 files because it contains too many changes.
Files changed (50)
  1. .env +51 -110
  2. Dockerfile +2 -4
  3. PRIVACY.md +26 -10
  4. README.md +63 -1031
  5. chart/env/prod.yaml +0 -5
  6. chart/templates/deployment.yaml +0 -5
  7. chart/templates/service-monitor.yaml +0 -15
  8. chart/templates/service.yaml +0 -6
  9. chart/values.yaml +1 -2
  10. docs/source/_toctree.yml +0 -64
  11. docs/source/configuration/common-issues.md +0 -7
  12. docs/source/configuration/embeddings.md +0 -105
  13. docs/source/configuration/metrics.md +0 -9
  14. docs/source/configuration/models/multimodal.md +0 -24
  15. docs/source/configuration/models/overview.md +0 -147
  16. docs/source/configuration/models/providers/anthropic.md +0 -117
  17. docs/source/configuration/models/providers/aws.md +0 -35
  18. docs/source/configuration/models/providers/cloudflare.md +0 -35
  19. docs/source/configuration/models/providers/cohere.md +0 -26
  20. docs/source/configuration/models/providers/google.md +0 -92
  21. docs/source/configuration/models/providers/langserve.md +0 -22
  22. docs/source/configuration/models/providers/llamacpp.md +0 -49
  23. docs/source/configuration/models/providers/ollama.md +0 -39
  24. docs/source/configuration/models/providers/openai.md +0 -181
  25. docs/source/configuration/models/providers/tgi.md +0 -66
  26. docs/source/configuration/models/tools.md +0 -62
  27. docs/source/configuration/open-id.md +0 -16
  28. docs/source/configuration/overview.md +0 -10
  29. docs/source/configuration/theming.md +0 -18
  30. docs/source/configuration/web-search.md +0 -58
  31. docs/source/developing/architecture.md +0 -35
  32. docs/source/developing/copy-huggingchat.md +0 -71
  33. docs/source/index.md +0 -97
  34. docs/source/installation/docker.md +0 -11
  35. docs/source/installation/helm.md +0 -35
  36. docs/source/installation/local.md +0 -34
  37. docs/source/installation/spaces.md +0 -9
  38. package-lock.json +0 -0
  39. package.json +3 -42
  40. scripts/populate.ts +3 -80
  41. server.log +2 -0
  42. src/ambient.d.ts +3 -0
  43. src/app.html +12 -9
  44. src/hooks.server.ts +13 -27
  45. src/lib/APIClient.ts +13 -30
  46. src/lib/actions/snapScrollToBottom.ts +2 -3
  47. src/lib/buildPrompt.ts +0 -7
  48. src/lib/components/AssistantSettings.svelte +0 -657
  49. src/lib/components/AssistantToolPicker.svelte +0 -150
  50. src/lib/components/CodeBlock.svelte +58 -6
.env CHANGED
@@ -1,79 +1,68 @@
 # Use .env.local to change these variables
 # DO NOT EDIT THIS FILE WITH SENSITIVE DATA
 
-### Config ###
-ENABLE_CONFIG_MANAGER=true
+### Models ###
+# Models are sourced exclusively from an OpenAI-compatible base URL.
+# Example: https://router.huggingface.co/v1
+OPENAI_BASE_URL=
+
+# Canonical auth token for any OpenAI-compatible provider
+OPENAI_API_KEY=#your provider API key (works for HF router, OpenAI, LM Studio, etc.)
+# Legacy alias (still supported): if set and OPENAI_API_KEY is empty, it will be used
+# HF_TOKEN=
 
 ### MongoDB ###
 MONGODB_URL=#your mongodb URL here, use chat-ui-db image if you don't want to set this
 MONGODB_DB_NAME=chat-ui
 MONGODB_DIRECT_CONNECTION=false
 
+
+## Public app configuration ##
+PUBLIC_APP_GUEST_MESSAGE=# a message to the guest user. If not set, no message will be shown. Only used if you have authentication enabled.
+PUBLIC_APP_NAME=ChatUI # name used as title throughout the app
+PUBLIC_APP_ASSETS=chatui # used to find logos & favicons in static/$PUBLIC_APP_ASSETS
+PUBLIC_APP_DESCRIPTION=# description used throughout the app
+PUBLIC_APP_DATA_SHARING=# Set to 1 to enable an option in the user settings to share conversations with model authors
+
 ### Local Storage ###
-MODELS_STORAGE_PATH= # where are .gguf for model inference stored
 MONGO_STORAGE_PATH= # where is the db folder stored
 
-### Endpoints config ###
-HF_API_ROOT=https://api-inference.huggingface.co/models
-# HF_TOKEN is used for a lot of things, not only for inference but also fetching tokenizers, etc.
-# We recommend using an HF_TOKEN even if you use a local endpoint.
-HF_TOKEN= #get it from https://huggingface.co/settings/token
-# API Keys for providers, you will need to specify models in the MODELS section but these keys can be kept secret
-OPENAI_API_KEY=#your openai api key here
-ANTHROPIC_API_KEY=#your anthropic api key here
-CLOUDFLARE_ACCOUNT_ID=#your cloudflare account id here
-CLOUDFLARE_API_TOKEN=#your cloudflare api token here
-COHERE_API_TOKEN=#your cohere api token here
-GOOGLE_GENAI_API_KEY=#your google genai api token here
-
+REASONING_SUMMARY=false # Change this to false to disable reasoning summary
 
-### Models ###
-## Models can support many different endpoints, check the documentation for more details
-MODELS=`[
-{
-"name": "NousResearch/Hermes-3-Llama-3.1-8B",
-"description": "Nous Research's latest Hermes 3 release in 8B size.",
-"promptExamples": [
-{
-"title": "Write an email",
-"prompt": "As a restaurant owner, write a professional email to the supplier to get these products every week: \n\n- Wine (x10)\n- Eggs (x24)\n- Bread (x12)"
-}, {
-"title": "Code a game",
-"prompt": "Code a basic snake game in python, give explanations for each step."
-}, {
-"title": "Recipe help",
-"prompt": "How do I make a delicious lemon cheesecake?"
-}
-]
-}
-]`
-
-LOAD_GGUF_MODELS=true
-## Text Embedding Models used for websearch
-# Default is a model that runs locally on CPU.
-TEXT_EMBEDDING_MODELS = `[
-{
-"name": "Xenova/gte-small",
-"displayName": "Xenova/gte-small",
-"description": "Local embedding model running on the server.",
-"chunkCharLength": 512,
-"endpoints": [
-{ "type": "transformersjs" }
-]
-}
-]`
-
-
-REASONING_SUMMARY=true # Change this to false to disable reasoning summary
-## Removed models, useful for migrating conversations
-# { name: string, displayName?: string, id?: string, transferTo?: string }`
-OLD_MODELS=`[]`
+## Models overrides
+MODELS=
 
 ## Task model
-# name of the model used for tasks such as summarizing title, creating query, etc.
-# if not set, the first model in MODELS will be used
+# Optional: set to the model id/name from the `${OPENAI_BASE_URL}/models` list
+# to use for internal tasks (title summarization, etc). If not set, the current model will be used
 TASK_MODEL=
 
+# Arch router (OpenAI-compatible) endpoint base URL used for route selection
+# Example: https://api.openai.com/v1 or your hosted Arch endpoint
+LLM_ROUTER_ARCH_BASE_URL=
+
+## LLM Router Configuration
+# Path to routes policy (JSON array). Defaults to llm-router/routes.chat.json
+LLM_ROUTER_ROUTES_PATH=
+
+# Model used at the Arch router endpoint for selection
+LLM_ROUTER_ARCH_MODEL=
+
+# Fallback behavior
+# Route to map "other" to (must exist in routes file)
+LLM_ROUTER_OTHER_ROUTE=casual_conversation
+# Model to call if the Arch selection fails entirely
+LLM_ROUTER_FALLBACK_MODEL=
+# Arch selection timeout in milliseconds (default 10000)
+LLM_ROUTER_ARCH_TIMEOUT_MS=10000
+
+# Router UI overrides (client-visible)
+# Public display name for the router entry in the model list. Defaults to "Omni".
+PUBLIC_LLM_ROUTER_DISPLAY_NAME=Omni
+# Optional: public logo URL for the router entry. If unset, the UI shows a Carbon icon.
+PUBLIC_LLM_ROUTER_LOGO_URL=
+# Public alias id used for the virtual router model (Omni). Defaults to "omni".
+PUBLIC_LLM_ROUTER_ALIAS_ID=omni
 
 ### Authentication ###
 # Parameters to enable open id login
@@ -97,41 +86,6 @@ TRUSTED_EMAIL_HEADER=# header to use to get the user email, only use if you know
 ADMIN_CLI_LOGIN=true # set to false to disable the CLI login
 ADMIN_TOKEN=#We recommend leaving this empty, you can get the token from the terminal.
 
-
-### Websearch ###
-## API Keys used to activate search with web functionality. websearch is disabled if none are defined. choose one of the following:
-YDC_API_KEY=#your docs.you.com api key here
-SERPER_API_KEY=#your serper.dev api key here
-SERPAPI_KEY=#your serpapi key here
-SERPSTACK_API_KEY=#your serpstack api key here
-SEARCHAPI_KEY=#your searchapi api key here
-USE_LOCAL_WEBSEARCH=#set to true to parse google results yourself, overrides other API keys
-SEARXNG_QUERY_URL=# where '<query>' will be replaced with query keywords see https://docs.searxng.org/dev/search_api.html eg https://searxng.yourdomain.com/search?q=<query>&engines=duckduckgo,google&format=json
-BING_SUBSCRIPTION_KEY=#your key
-## Websearch configuration
-PLAYWRIGHT_ADBLOCKER=true
-WEBSEARCH_ALLOWLIST=`[]` # if it's defined, allow websites from only this list.
-WEBSEARCH_BLOCKLIST=`[]` # if it's defined, block websites from this list.
-WEBSEARCH_JAVASCRIPT=true # CPU usage reduces by 60% on average by disabling javascript. Enable to improve website compatibility
-WEBSEARCH_TIMEOUT = 3500 # in milliseconds, determines how long to wait to load a page before timing out
-ENABLE_LOCAL_FETCH=false #set to true to allow fetches on the local network. /!\ Only enable this if you have the proper firewall rules to prevent SSRF attacks and understand the implications.
-
-
-## Public app configuration ##
-PUBLIC_APP_GUEST_MESSAGE=# a message to the guest user. If not set, no message will be shown. Only used if you have authentication enabled.
-PUBLIC_APP_NAME=ChatUI # name used as title throughout the app
-PUBLIC_APP_ASSETS=chatui # used to find logos & favicons in static/$PUBLIC_APP_ASSETS
-PUBLIC_APP_DESCRIPTION=# description used throughout the app
-PUBLIC_APP_DATA_SHARING=# Set to 1 to enable an option in the user settings to share conversations with model authors
-PUBLIC_APP_DISCLAIMER=# Set to 1 to show a disclaimer on login page
-PUBLIC_APP_DISCLAIMER_MESSAGE=# Message to show on the login page
-PUBLIC_ANNOUNCEMENT_BANNERS=`[
-{
-"title": "chat-ui is now open source!",
-"linkTitle": "check it out",
-"linkHref": "https://github.com/huggingface/chat-ui"
-}
-]`
 PUBLIC_SMOOTH_UPDATES=false # set to true to enable smoothing of messages client-side, can be CPU intensive
 PUBLIC_ORIGIN=#https://huggingface.co
 PUBLIC_SHARE_PREFIX=#https://hf.co/chat
@@ -144,17 +98,10 @@ PUBLIC_APPLE_APP_ID=#1234567890 / Leave empty to disable
 
 ### Feature Flags ###
 LLM_SUMMARIZATION=true # generate conversation titles with LLMs
-ENABLE_ASSISTANTS=false #set to true to enable assistants feature
-ENABLE_ASSISTANTS_RAG=false # /!\ This will let users specify arbitrary URLs that the server will then request. Make sure you have the proper firewall rules in place.
-REQUIRE_FEATURED_ASSISTANTS=false # require featured assistants to show in the list
-COMMUNITY_TOOLS=false # set to true to enable community tools
+
 ALLOW_IFRAME=true # Allow the app to be embedded in an iframe
 ENABLE_DATA_EXPORT=true
 
-### Tools ###
-# Check out public config in `chart/env/prod.yaml` for more details
-TOOLS=`[]`
-
 ### Rate limits ###
 # See `src/lib/server/usageLimits.ts`
 # {
@@ -167,21 +114,15 @@ TOOLS=`[]`
 # }
 USAGE_LIMITS=`{}`
 
-
 ### HuggingFace specific ###
-# Let user authenticate with their HF token in the /api routes. This is only useful if you have OAuth configured with huggingface.
-USE_HF_TOKEN_IN_API=false
 ## Feature flag & admin settings
 # Used for setting early access & admin flags to users
 HF_ORG_ADMIN=
 HF_ORG_EARLY_ACCESS=
 WEBHOOK_URL_REPORT_ASSISTANT=#provide slack webhook url to get notified for reports/feature requests
-IP_TOKEN_SECRET=
 
 
 ### Metrics ###
-METRICS_ENABLED=false
-METRICS_PORT=5565
 LOG_LEVEL=info
 
 
@@ -191,19 +132,19 @@ PARQUET_EXPORT_DATASET=
 PARQUET_EXPORT_HF_TOKEN=
 ADMIN_API_SECRET=# secret to admin API calls, like computing usage stats or exporting parquet data
 
+### Config ###
+ENABLE_CONFIG_MANAGER=true
 
 ### Docker build variables ###
 # These values cannot be updated at runtime
 # They need to be passed when building the docker image
 # See https://github.com/huggingface/chat-ui/main/.github/workflows/deploy-prod.yml#L44-L47
 APP_BASE="" # base path of the app, e.g. /chat, left blank as default
-PUBLIC_APP_COLOR=blue # can be any of tailwind colors: https://tailwindcss.com/docs/customizing-colors#default-color-palette
 ### Body size limit for SvelteKit https://svelte.dev/docs/kit/adapter-node#Environment-variables-BODY_SIZE_LIMIT
 BODY_SIZE_LIMIT=15728640
 PUBLIC_COMMIT_SHA=
 
 ### LEGACY parameters
-HF_ACCESS_TOKEN=#LEGACY! Use HF_TOKEN instead
 ALLOW_INSECURE_COOKIES=false # LEGACY! Use COOKIE_SECURE and COOKIE_SAMESITE instead
 PARQUET_EXPORT_SECRET=#DEPRECATED, use ADMIN_API_SECRET instead
 RATE_LIMIT= # /!\ DEPRECATED definition of messages per minute. Use USAGE_LIMITS.messagesPerMinute instead
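
The `LLM_ROUTER_*` comments in the `.env` diff describe two fallbacks: the `other` route is mapped to `LLM_ROUTER_OTHER_ROUTE`, and `LLM_ROUTER_FALLBACK_MODEL` is called when Arch selection fails entirely. The sketch below shows one way that resolution could work. The schema of the routes file is not visible in this diff, so the `Route` shape, the `resolveModel` name, and the choice to use the fallback model for an unknown route name are all assumptions.

```typescript
// Hypothetical route entry; the real routes file schema is not shown in this diff.
interface Route {
  name: string;
  model: string;
}

interface RouterEnv {
  LLM_ROUTER_OTHER_ROUTE?: string;
  LLM_ROUTER_FALLBACK_MODEL?: string;
}

// `selected` is the route name returned by the Arch router, or null when
// selection failed entirely (for example after an error or a timeout).
function resolveModel(selected: string | null, routes: Route[], env: RouterEnv): string | undefined {
  const fallback = env.LLM_ROUTER_FALLBACK_MODEL || undefined;
  if (selected === null) return fallback;
  // "other" is remapped to the configured route (casual_conversation in the .env default).
  const name = selected === "other" ? (env.LLM_ROUTER_OTHER_ROUTE ?? "casual_conversation") : selected;
  const route = routes.find((r) => r.name === name);
  // Assumption: an unknown route name also falls back to the fallback model.
  return route ? route.model : fallback;
}
```

A caller would presumably wrap the Arch request in a timeout of `LLM_ROUTER_ARCH_TIMEOUT_MS` milliseconds and pass `null` here when it expires.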
Dockerfile CHANGED
@@ -2,7 +2,6 @@
 ARG INCLUDE_DB=false
 
 FROM node:20-slim AS base
-ENV PLAYWRIGHT_SKIP_BROWSER_GC=1
 
 # install dotenv-cli
 RUN npm install -g dotenv-cli
@@ -21,7 +20,6 @@ WORKDIR /app
 RUN touch /app/.env.local
 
 
-RUN npm i --no-package-lock --no-save playwright@1.52.0
 
 USER root
 
@@ -31,9 +29,9 @@ RUN chown -R 1000:1000 /data/models
 RUN apt-get update
 RUN apt-get install gnupg curl git cmake clang libgomp1 -y
 
-RUN npx playwright install --with-deps chromium
 
-RUN chown -R 1000:1000 /home/user/.npm
+# ensure npm cache dir exists before adjusting ownership
+RUN mkdir -p /home/user/.npm && chown -R 1000:1000 /home/user/.npm
 
 USER user
 
PRIVACY.md CHANGED
@@ -1,22 +1,38 @@
1
  ## Privacy
2
 
3
- > Last updated: Feb 14, 2025
4
 
5
- Users of HuggingChat are authenticated through their HF user account.
6
 
7
- We endorse Privacy by Design. As such, your conversations are private to you and will not be shared with anyone, including model authors, for any purpose, including for research or model training purposes.
8
-
9
- You conversation data will only be stored to let you access past conversations. You can click on the Delete icon to delete any past conversation at any moment.
10
 
11
  🗓 Please also consult huggingface.co's main privacy policy at <https://huggingface.co/privacy>. To exercise any of your legal privacy rights, please send an email to <privacy@huggingface.co>.
12
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
13
  ## About available LLMs
14
 
15
  The goal of this app is to showcase that it is now possible to build an open source alternative to ChatGPT. 💪
16
 
17
- We aim to always provide a diverse set of state of the art open LLMs, hence we rotate the available models over time. Discuss available models and request new ones on the [models discussion page](https://huggingface.co/spaces/huggingchat/chat-ui/discussions/372).
18
 
19
- Check the [models](https://huggingface.co/chat/models/) page for an up-to-date list of the best available LLMs.
20
 
21
  ## Technical details
22
 
@@ -26,10 +42,10 @@ The app is completely open source, and further development takes place on the [h
26
 
27
  You can find the production configuration for HuggingChat [here](https://github.com/huggingface/chat-ui/blob/main/chart/env/prod.yaml).
28
 
29
- The inference backend is running the optimized [text-generation-inference](https://github.com/huggingface/text-generation-inference) on HuggingFace's Inference API infrastructure.
30
 
31
- It is possible to deploy a copy of this app to a Space and customize it (swap model, add some UI elements, or store user messages according to your own Terms and conditions). You can also 1-click deploy your own instance using the [Chat UI Spaces Docker template](https://huggingface.co/new-space?template=huggingchat/chat-ui-template).
32
 
33
- We welcome any feedback on this app: please participate to the public discussion at <https://huggingface.co/spaces/huggingchat/chat-ui/discussions>
34
 
35
  <a target="_blank" href="https://huggingface.co/spaces/huggingchat/chat-ui/discussions"><img src="https://huggingface.co/datasets/huggingface/badges/raw/main/open-a-discussion-xl.svg" title="open a discussion"></a>
 
1
  ## Privacy
2
 
3
+ > Last updated: Sep 15, 2025
4
 
5
+ Basics:
6
 
7
+ - Sign-in: You authenticate with your Hugging Face account.
8
+ - Conversation history: Stored so you can access past chats; you can delete any conversation at any time from the UI.
 
9
 
10
  🗓 Please also consult huggingface.co's main privacy policy at <https://huggingface.co/privacy>. To exercise any of your legal privacy rights, please send an email to <privacy@huggingface.co>.
11
 
12
+ ## Data handling and processing
13
+
14
+ HuggingChat uses Hugging Face’s Inference Providers to access models from multiple partners via a single API. Depending on the model and availability, inference runs with the corresponding provider.
15
+
16
+ - Inference Providers documentation: <https://huggingface.co/docs/inference-providers>
17
+ - Security & Compliance: <https://huggingface.co/docs/inference-providers/security>
18
+
19
+ Security and routing facts
20
+
21
+ - Hugging Face does not store any user data for training purposes.
22
+ - Hugging Face does not store the request body or the response when routing requests through Hugging Face.
23
+ - Logs are kept for debugging purposes for up to 30 days, but no user data or tokens are stored in those logs.
24
+ - Inference Provider routing uses TLS/SSL to encrypt data in transit.
25
+ - The Hugging Face Hub (which Inference Providers is a feature of) is SOC 2 Type 2 certified. See <https://huggingface.co/docs/hub/security>.
26
+
27
+ External providers are responsible for their own security and data handling. Please consult each provider’s respective security and privacy policies via the Inference Providers documentation linked above.
28
+
29
  ## About available LLMs
30
 
31
  The goal of this app is to showcase that it is now possible to build an open source alternative to ChatGPT. 💪
32
 
33
+ We aim to always provide a diverse set of state-of-the-art open LLMs, and we may update the available models over time. Discuss models or request new ones on the [models discussion page](https://huggingface.co/spaces/huggingchat/chat-ui/discussions/372).
34
 
35
+ Check the [models](https://huggingface.co/chat/models/) page for an up-to-date list of the best available LLMs.
36
 
37
  ## Technical details
38
 
 
42
 
43
  You can find the production configuration for HuggingChat [here](https://github.com/huggingface/chat-ui/blob/main/chart/env/prod.yaml).
44
 
45
+ HuggingChat connects to the OpenAI‑compatible Inference Providers router at `https://router.huggingface.co/v1` to access models across multiple providers. Provider selection may be automatic or fixed depending on the model configuration.
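
As a rough illustration of what "OpenAI-compatible" means here, the sketch below builds a chat completion request against the router URL quoted above. It is not HuggingChat's own client code: the `/chat/completions` path and Bearer-token header follow the usual OpenAI convention and are assumptions, as are the function names and the placeholder model id.

```python
import json
import urllib.request

ROUTER_BASE_URL = "https://router.huggingface.co/v1"  # router URL from the text above


def build_chat_request(model: str, user_message: str, token: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completion request."""
    payload = {
        "model": model,  # e.g. "example-org/example-model" (hypothetical id)
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url=f"{ROUTER_BASE_URL}/chat/completions",  # assumed OpenAI-compatible path
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # assumed: a Hugging Face access token
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_chat_request(request: urllib.request.Request) -> str:
    """Send the request and return the first reply (assumes an OpenAI-style response)."""
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

Keeping request construction separate from sending makes the payload easy to inspect without network access; `send_chat_request` would only be called with a valid token and a model id that the router actually serves.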
46
 
47
+ It is possible to deploy a copy of this app to a Space and customize it (swap models, add UI elements, or store user messages according to your own Terms and Conditions). You can also 1-click deploy your own instance using the [Chat UI Spaces Docker template](https://huggingface.co/new-space?template=huggingchat/chat-ui-template).
48
 
49
+ We welcome any feedback on this app: please participate in the public discussion at <https://huggingface.co/spaces/huggingchat/chat-ui/discussions>
50
 
51
  <a target="_blank" href="https://huggingface.co/spaces/huggingchat/chat-ui/discussions"><img src="https://huggingface.co/datasets/huggingface/badges/raw/main/open-a-discussion-xl.svg" title="open a discussion"></a>
README.md CHANGED
@@ -1,236 +1,107 @@
1
  # Chat UI
2
 
3
- **Find the docs at [hf.co/docs/chat-ui](https://huggingface.co/docs/chat-ui/index).**
4
-
5
- ![Chat UI repository thumbnail](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/chatui-websearch.png)
6
 
7
  A chat interface using open source models, e.g. OpenAssistant or Llama. It is a SvelteKit app and it powers the [HuggingChat app on hf.co/chat](https://huggingface.co/chat).
8
 
9
  0. [Quickstart](#quickstart)
10
- 1. [No Setup Deploy](#no-setup-deploy)
11
- 2. [Setup](#setup)
12
- 3. [Launch](#launch)
13
- 4. [Web Search](#web-search)
14
- 5. [Text Embedding Models](#text-embedding-models)
15
- 6. [Extra parameters](#extra-parameters)
16
- 7. [Common issues](#common-issues)
17
- 8. [Deploying to a HF Space](#deploying-to-a-hf-space)
18
- 9. [Building](#building)
19
-
20
- ## Quickstart
21
-
22
- ### Docker image
23
-
24
- You can deploy a chat-ui instance in a single command using the docker image. Get your huggingface token from [here](https://huggingface.co/settings/tokens).
25
-
26
- ```bash
27
- docker run -p 3000 -e HF_TOKEN=hf_*** -v db:/data ghcr.io/huggingface/chat-ui-db:latest
28
- ```
29
-
30
- Take a look at the [`.env` file](https://github.com/huggingface/chat-ui/blob/main/.env) and the readme to see all the environment variables that you can set. We have endpoint support for all OpenAI API compatible local services as well as many other providers like Anthropic, Cloudflare, Google Vertex AI, etc.
31
 
32
- ### Local setup
33
 
34
- You can quickly start a locally running chat-ui & LLM text-generation server thanks to chat-ui's [llama.cpp server support](https://huggingface.co/docs/chat-ui/configuration/models/providers/llamacpp).
35
-
36
- **Step 1 (Start llama.cpp server):**
37
-
38
- Install llama.cpp w/ brew (for Mac):
39
-
40
- ```bash
41
- # install llama.cpp
42
- brew install llama.cpp
43
- ```
44
-
45
- or [build directly from the source](https://github.com/ggerganov/llama.cpp/blob/master/docs/build.md) for your target device:
46
 
47
- ```
48
- git clone https://github.com/ggerganov/llama.cpp && cd llama.cpp && make
49
- ```
50
 
51
- Next, start the server with the [LLM of your choice](https://huggingface.co/models?library=gguf):
52
 
53
- ```bash
54
- # start llama.cpp server (using hf.co/microsoft/Phi-3-mini-4k-instruct-gguf as an example)
55
- llama-server --hf-repo microsoft/Phi-3-mini-4k-instruct-gguf --hf-file Phi-3-mini-4k-instruct-q4.gguf -c 4096
 
 
56
  ```
57
 
58
- A local LLaMA.cpp HTTP Server will start on `http://localhost:8080`. Read more [here](https://huggingface.co/docs/chat-ui/configuration/models/providers/llamacpp).
59
 
60
- **Step 3 (make sure you have MongoDB running locally):**
 
 
 
 
 
61
 
62
- ```bash
63
- docker run -d -p 27017:27017 --name mongo-chatui mongo:latest
64
- ```
65
 
66
- Read more [here](#database).
67
 
68
- **Step 4 (clone chat-ui):**
69
 
70
  ```bash
71
  git clone https://github.com/huggingface/chat-ui
72
  cd chat-ui
73
- ```
74
-
75
- **Step 5 (tell chat-ui to use local llama.cpp server):**
76
-
77
- Add the following to your `.env.local`:
78
-
79
- ```ini
80
- MODELS=`[
81
- {
82
- "name": "microsoft/Phi-3-mini-4k-instruct",
83
- "endpoints": [{
84
- "type" : "llamacpp",
85
- "baseURL": "http://localhost:8080"
86
- }],
87
- },
88
- ]`
89
- ```
90
-
91
- Read more [here](https://huggingface.co/docs/chat-ui/configuration/models/providers/llamacpp).
92
-
93
- **Step 6 (start chat-ui):**
94
-
95
- ```bash
96
  npm install
97
  npm run dev -- --open
98
  ```
99
 
100
- Read more [here](#launch).
101
 
102
- <img class="hidden dark:block" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/chat-ui/llamacpp-dark.png" height="auto"/>
103
 
104
- ## No Setup Deploy
105
 
106
- If you don't want to configure, set up, and launch Chat UI yourself, you can use this option as a fast deploy alternative.
107
 
108
- You can deploy your own customized Chat UI instance with any supported [LLM](https://huggingface.co/models?pipeline_tag=text-generation&sort=trending) of your choice on [Hugging Face Spaces](https://huggingface.co/spaces). To do so, use the chat-ui template [available here](https://huggingface.co/new-space?template=huggingchat/chat-ui-template).
 
 
 
109
 
110
- Set `HF_TOKEN` in [Space secrets](https://huggingface.co/docs/hub/spaces-overview#managing-secrets) to deploy a model with gated access or a model in a private repository. It's also compatible with [Inference for PROs](https://huggingface.co/blog/inference-pro) curated list of powerful models with higher rate limits. Make sure to create your personal token first in your [User Access Tokens settings](https://huggingface.co/settings/tokens).
111
 
112
- Read the full tutorial [here](https://huggingface.co/docs/hub/spaces-sdks-docker-chatui#chatui-on-spaces).
113
 
114
- ## Setup
115
-
116
- The default config for Chat UI is stored in the `.env` file. You will need to override some values to get Chat UI to run locally. This is done in `.env.local`.
117
-
118
- Start by creating a `.env.local` file in the root of the repository. The bare minimum config you need to get Chat UI to run locally is the following:
119
-
120
- ```env
121
- MONGODB_URL=<the URL to your MongoDB instance>
122
- HF_TOKEN=<your access token>
123
- ```
124
-
125
- ### Database
126
-
127
- The chat history is stored in a MongoDB instance, and having a DB instance available is needed for Chat UI to work.
128
-
129
- You can use a local MongoDB instance. The easiest way is to spin one up using docker:
130
 
131
  ```bash
132
  docker run -d -p 27017:27017 --name mongo-chatui mongo:latest
133
  ```
134
 
135
- In which case the url of your DB will be `MONGODB_URL=mongodb://localhost:27017`.
136
-
137
- Alternatively, you can use a [free MongoDB Atlas](https://www.mongodb.com/pricing) instance for this; Chat UI should fit comfortably within their free tier. You can then set the `MONGODB_URL` variable in `.env.local` to match your instance.
138
-
139
- ### Hugging Face Access Token
140
-
141
- If you use a remote inference endpoint, you will need a Hugging Face access token to run Chat UI locally. You can get one from [your Hugging Face profile](https://huggingface.co/settings/tokens).
142
 
143
  ## Launch
144
 
145
- After you're done with the `.env.local` file you can run Chat UI locally with:
146
 
147
  ```bash
148
  npm install
149
  npm run dev
150
  ```
151
 
152
- ## Web Search
153
-
154
- Chat UI features a powerful Web Search feature. It works by:
155
-
156
- 1. Generating an appropriate search query from the user prompt.
157
- 2. Performing web search and extracting content from webpages.
158
- 3. Creating embeddings from texts using a text embedding model.
159
- 4. From these embeddings, find the ones that are closest to the user query using a vector similarity search. Specifically, we use `inner product` distance.
160
- 5. Get the corresponding texts to those closest embeddings and perform [Retrieval-Augmented Generation](https://huggingface.co/papers/2005.11401) (i.e. expand user prompt by adding those texts so that an LLM can use this information).
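
Steps 4 and 5 above can be sketched in a few lines. This is an illustrative outline only, not the code chat-ui used: the function names, the tiny hand-written embeddings, and the prompt layout are all assumptions, and a real setup would obtain embeddings from a text embedding model.

```python
from typing import List, Sequence, Tuple


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equal-length embedding vectors."""
    return sum(x * y for x, y in zip(a, b))


def top_k_passages(
    query_embedding: Sequence[float],
    passages: List[Tuple[str, Sequence[float]]],
    k: int = 2,
) -> List[str]:
    """Return the k passage texts whose embeddings score highest against the query."""
    scored = [(inner_product(query_embedding, emb), text) for text, emb in passages]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # larger product = closer
    return [text for _, text in scored[:k]]


def build_augmented_prompt(user_prompt: str, context_texts: List[str]) -> str:
    """Expand the user prompt with retrieved texts (hypothetical prompt layout)."""
    context = "\n\n".join(context_texts)
    return f"Context:\n{context}\n\nQuestion: {user_prompt}"
```

For example, with passages `[("a", [1, 0]), ("b", [0, 1]), ("c", [0.9, 0.1])]` and the query embedding `[1, 0]`, the two closest passages are `"a"` and `"c"`, and those texts would be prepended to the user prompt before it is sent to the LLM.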
161
 
162
- ## Text Embedding Models
163
 
164
- By default (for backward compatibility), when `TEXT_EMBEDDING_MODELS` environment variable is not defined, [transformers.js](https://huggingface.co/docs/transformers.js) embedding models will be used for embedding tasks, specifically, [Xenova/gte-small](https://huggingface.co/Xenova/gte-small) model.
165
 
166
- You can customize the embedding model by setting `TEXT_EMBEDDING_MODELS` in your `.env.local` file. For example:
167
-
168
- ```env
169
- TEXT_EMBEDDING_MODELS = `[
170
- {
171
- "name": "Xenova/gte-small",
172
- "displayName": "Xenova/gte-small",
173
- "description": "locally running embedding",
174
- "chunkCharLength": 512,
175
- "endpoints": [
176
- {"type": "transformersjs"}
177
- ]
178
- },
179
- {
180
- "name": "intfloat/e5-base-v2",
181
- "displayName": "intfloat/e5-base-v2",
182
- "description": "hosted embedding model",
183
- "chunkCharLength": 768,
184
- "preQuery": "query: ", # See https://huggingface.co/intfloat/e5-base-v2#faq
185
- "prePassage": "passage: ", # See https://huggingface.co/intfloat/e5-base-v2#faq
186
- "endpoints": [
187
- {
188
- "type": "tei",
189
- "url": "http://127.0.0.1:8080/",
190
- "authorization": "TOKEN_TYPE TOKEN" // optional authorization field. Example: "Basic VVNFUjpQQVNT"
191
- }
192
- ]
193
- }
194
- ]`
195
  ```
196
 
197
- The required fields are `name`, `chunkCharLength` and `endpoints`.
198
- Supported text embedding backends are: [`transformers.js`](https://huggingface.co/docs/transformers.js), [`TEI`](https://github.com/huggingface/text-embeddings-inference) and [`OpenAI`](https://platform.openai.com/docs/guides/embeddings). `transformers.js` models run locally as part of `chat-ui`, whereas `TEI` models run in a different environment and are accessed through an API endpoint. `openai` models are accessed through the [OpenAI API](https://platform.openai.com/docs/guides/embeddings).
199
-
200
- When more than one embedding model is supplied in the `.env.local` file, the first is used by default, and the others are used only by LLMs whose `embeddingModel` is set to the name of that model.
201
 
202
  ## Extra parameters
203
 
204
- ### OpenID connect
205
-
206
- The login feature is disabled by default and users are attributed a unique ID based on their browser. But if you want to use OpenID to authenticate your users, you can add the following to your `.env.local` file:
207
-
208
- ```env
209
- OPENID_CONFIG=`{
210
- PROVIDER_URL: "<your OIDC issuer>",
211
- CLIENT_ID: "<your OIDC client ID>",
212
- CLIENT_SECRET: "<your OIDC client secret>",
213
- SCOPES: "openid profile",
214
- TOLERANCE: // optional
215
- RESOURCE: // optional
216
- }`
217
- ```
218
-
219
- These variables will enable the openID sign-in modal for users.
220
-
221
- ### Trusted header authentication
222
-
223
- You can set the env variable `TRUSTED_EMAIL_HEADER` to point to the header that contains the user's email address. This will allow you to authenticate users from the header. This setup is usually combined with a proxy that will be in front of chat-ui and will handle the auth and set the header.
224
-
225
- > [!WARNING]
226
- > Make sure to only allow requests to chat-ui through your proxy which handles authentication, otherwise users could authenticate as anyone by setting the header manually! Only set this up if you understand the implications and know how to do it correctly.
227
-
228
- Here is a list of header names for common auth providers:
229
-
230
- - Tailscale Serve: `Tailscale-User-Login`
231
- - Cloudflare Access: `Cf-Access-Authenticated-User-Email`
232
- - oauth2-proxy: `X-Forwarded-Email`
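
The idea behind trusted header authentication can be shown with a small sketch. It is a hypothetical helper rather than chat-ui's implementation: the function name and the plain mapping of headers are assumptions, and it is only safe when every request passes through the authenticating proxy, as the warning above stresses.

```python
from typing import Mapping, Optional


def email_from_trusted_header(
    headers: Mapping[str, str],
    trusted_header: Optional[str],
) -> Optional[str]:
    """Return the user's email from the configured header, or None if absent.

    `trusted_header` plays the role of the TRUSTED_EMAIL_HEADER setting;
    header names are compared case-insensitively, as in HTTP.
    """
    if not trusted_header:
        return None  # feature disabled: never trust any header
    wanted = trusted_header.lower()
    for name, value in headers.items():
        if name.lower() == wanted and value.strip():
            return value.strip()
    return None
```

With `trusted_header` set to `X-Forwarded-Email`, a request carrying `x-forwarded-email: user@example.com` resolves to that address, while a request without the header is treated as unauthenticated.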
233
-
234
  ### Theming
235
 
236
  You can use a few environment variables to customize the look and feel of chat-ui. These are by default:
@@ -241,785 +112,32 @@ PUBLIC_APP_ASSETS=chatui
241
  PUBLIC_APP_COLOR=blue
242
  PUBLIC_APP_DESCRIPTION="Making the community's best AI chat models available to everyone."
243
  PUBLIC_APP_DATA_SHARING=
244
- PUBLIC_APP_DISCLAIMER=
245
  ```
246
 
247
  - `PUBLIC_APP_NAME` The name used as a title throughout the app.
248
  - `PUBLIC_APP_ASSETS` Is used to find logos & favicons in `static/$PUBLIC_APP_ASSETS`, current options are `chatui` and `huggingchat`.
249
  - `PUBLIC_APP_COLOR` Can be any of the [tailwind colors](https://tailwindcss.com/docs/customizing-colors#default-color-palette).
250
  - `PUBLIC_APP_DATA_SHARING` Can be set to 1 to add a toggle in the user settings that lets your users opt in to data sharing with model creators.
251
- - `PUBLIC_APP_DISCLAIMER` If set to 1, we show a disclaimer about generated outputs on login.
252
-
253
- ### Web Search config
254
-
255
- You can enable the web search through an API by adding `YDC_API_KEY` ([docs.you.com](https://docs.you.com)) or `SERPER_API_KEY` ([serper.dev](https://serper.dev/)) or `SERPAPI_KEY` ([serpapi.com](https://serpapi.com/)) or `SERPSTACK_API_KEY` ([serpstack.com](https://serpstack.com/)) or `SEARCHAPI_KEY` ([searchapi.io](https://www.searchapi.io/)) to your `.env.local`.
256
-
257
- You can also simply enable the local google websearch by setting `USE_LOCAL_WEBSEARCH=true` in your `.env.local` or specify a SearXNG instance by adding the query URL to `SEARXNG_QUERY_URL`.
258
-
259
- You can enable javascript when parsing webpages to improve compatibility with `WEBSEARCH_JAVASCRIPT=true` at the cost of increased CPU usage. You'll want at least 4 cores when enabling.
260
-
261
- ### Custom models
262
-
263
- You can customize the parameters passed to the model or even use a new model by updating the `MODELS` variable in your `.env.local`. The default one can be found in `.env` and looks like this :
264
-
265
- ```env
266
- MODELS=`[
267
- {
268
- "name": "mistralai/Mistral-7B-Instruct-v0.2",
269
- "displayName": "mistralai/Mistral-7B-Instruct-v0.2",
270
- "description": "Mistral 7B is a new Apache 2.0 model, released by Mistral AI that outperforms Llama2 13B in benchmarks.",
271
- "websiteUrl": "https://mistral.ai/news/announcing-mistral-7b/",
272
- "preprompt": "",
273
- "chatPromptTemplate" : "<s>{{#each messages}}{{#ifUser}}[INST] {{#if @first}}{{#if @root.preprompt}}{{@root.preprompt}}\n{{/if}}{{/if}}{{content}} [/INST]{{/ifUser}}{{#ifAssistant}}{{content}}</s>{{/ifAssistant}}{{/each}}",
274
- "parameters": {
275
- "temperature": 0.3,
276
- "top_p": 0.95,
277
- "repetition_penalty": 1.2,
278
- "top_k": 50,
279
- "truncate": 3072,
280
- "max_new_tokens": 1024,
281
- "stop": ["</s>"]
282
- },
283
- "promptExamples": [
284
- {
285
- "title": "Write an email",
286
- "prompt": "As a restaurant owner, write a professional email to the supplier to get these products every week: \n\n- Wine (x10)\n- Eggs (x24)\n- Bread (x12)"
287
- }, {
288
- "title": "Code a game",
289
- "prompt": "Code a basic snake game in python, give explanations for each step."
290
- }, {
291
- "title": "Recipe help",
292
- "prompt": "How do I make a delicious lemon cheesecake?"
293
- }
294
- ]
295
- }
296
- ]`
297
-
298
- ```
299
-
300
- You can change things like the parameters, or customize the preprompt to better suit your needs. You can also add more models by adding more objects to the array, with different preprompts for example.
301
-
302
- #### chatPromptTemplate
303
-
304
- In 2025 most chat-completion endpoints (local or remotely hosted) support the OpenAI-compatible API and take arrays of messages.
305
-
306
- If not, when querying the model for a chat response, the `chatPromptTemplate` template is used. `messages` is an array of chat messages, it has the format `[{ content: string }, ...]`. To identify if a message is a user message or an assistant message the `ifUser` and `ifAssistant` block helpers can be used.
307
-
308
- The following is the default `chatPromptTemplate`, although newlines and indentation have been added for readability. You can find the prompts used in production for HuggingChat [here](https://github.com/huggingface/chat-ui/blob/main/PROMPTS.md).
309
-
310
- ```prompt
311
- {{preprompt}}
312
- {{#each messages}}
313
- {{#ifUser}}{{@root.userMessageToken}}{{content}}{{@root.userMessageEndToken}}{{/ifUser}}
314
- {{#ifAssistant}}{{@root.assistantMessageToken}}{{content}}{{@root.assistantMessageEndToken}}{{/ifAssistant}}
315
- {{/each}}
316
- {{assistantMessageToken}}
317
- ```
318
-
319
- > [!INFO]
320
- > We also support Jinja2 templates for the `chatPromptTemplate` in addition to Handlebars templates. On startup we first try to compile with Jinja and if that fails we fall back to interpreting `chatPromptTemplate` as handlebars.
321
-
322
- #### Multi modal model
323
-
324
- We currently support [IDEFICS](https://huggingface.co/blog/idefics) (hosted on TGI), OpenAI and Claude 3 as multimodal models. You can enable it by setting `multimodal: true` in your `MODELS` configuration. For IDEFICS, you must have a [PRO HF Api token](https://huggingface.co/settings/tokens). For OpenAI, see the [OpenAI section](#openai-api-compatible-models). For Anthropic, see the [Anthropic section](#anthropic).
325
-
326
- ```env
327
- {
328
- "name": "HuggingFaceM4/idefics-80b-instruct",
329
- "multimodal" : true,
330
- "description": "IDEFICS is the new multimodal model by Hugging Face.",
331
- "preprompt": "",
332
- "chatPromptTemplate" : "{{#each messages}}{{#ifUser}}User: {{content}}{{/ifUser}}<end_of_utterance>\nAssistant: {{#ifAssistant}}{{content}}\n{{/ifAssistant}}{{/each}}",
333
- "parameters": {
334
- "temperature": 0.1,
335
- "top_p": 0.95,
336
- "repetition_penalty": 1.2,
337
- "top_k": 12,
338
- "truncate": 1000,
339
- "max_new_tokens": 1024,
340
- "stop": ["<end_of_utterance>", "User:", "\nUser:"]
341
- }
342
- }
343
- ```
344
-
345
- #### Running your own models using a custom endpoint
346
-
347
- If you want to, instead of hitting models on the Hugging Face Inference API, you can run your own models locally.
348
-
349
- A good option is to hit a [text-generation-inference](https://github.com/huggingface/text-generation-inference), or a llama.cpp endpoint. You will find an example for TGI in the official [Chat UI Spaces Docker template](https://huggingface.co/new-space?template=huggingchat/chat-ui-template) for instance: both this app and a text-generation-inference server run inside the same container.
350
-
351
- To do this, you can add your own endpoints to the `MODELS` variable in `.env.local`, by adding an `"endpoints"` key for each model in `MODELS`.
352
-
353
- ```env
354
- {
355
- // rest of the model config here
356
- "endpoints": [{
357
- "type" : "tgi",
358
- "url": "https://HOST:PORT",
359
- }]
360
- }
361
- ```
362
-
363
- If `endpoints` are left unspecified, ChatUI will look for the model on the hosted Hugging Face inference API using the model name.
364
-
365
- ##### OpenAI API compatible models
366
-
367
- Chat UI can be used with any API server that supports OpenAI API compatibility, for example [text-generation-webui](https://github.com/oobabooga/text-generation-webui/tree/main/extensions/openai), [LocalAI](https://github.com/go-skynet/LocalAI), [FastChat](https://github.com/lm-sys/FastChat/blob/main/docs/openai_api.md), [llama-cpp-python](https://github.com/abetlen/llama-cpp-python), and [ialacol](https://github.com/chenhunghan/ialacol) and [vllm](https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html).
368
-
369
- The following example config makes Chat UI work with [text-generation-webui](https://github.com/oobabooga/text-generation-webui/tree/main/extensions/openai). The `endpoint.baseURL` is the URL of the OpenAI API compatible server; it overrides the base URL used by the OpenAI instance. The `endpoint.completion` setting determines which endpoint is used: the default is `chat_completions`, which uses `v1/chat/completions`; set `endpoint.completion` to `completions` to use the `v1/completions` endpoint instead.
370
-
371
- Parameters not supported by OpenAI (e.g., top_k, repetition_penalty, etc.) must be set in the extraBody of endpoints. Be aware that setting them in parameters will cause them to be omitted.
372
-
373
- ```
374
- MODELS=`[
375
- {
376
- "name": "text-generation-webui",
377
- "id": "text-generation-webui",
378
- "parameters": {
379
- "temperature": 0.9,
380
- "top_p": 0.95,
381
- "max_new_tokens": 1024,
382
- "stop": []
383
- },
384
- "endpoints": [{
385
- "type" : "openai",
386
- "baseURL": "http://localhost:8000/v1",
387
- "extraBody": {
388
- "repetition_penalty": 1.2,
389
- "top_k": 50,
390
- "truncate": 1000
391
- }
392
- }]
393
- }
394
- ]`
395
-
396
- ```
397
-
398
- The `openai` type includes official OpenAI models. You can add, for example, GPT4/GPT3.5 as an "openai" model:
399
-
400
- ```
401
- OPENAI_API_KEY=#your openai api key here
402
- MODELS=`[{
403
- "name": "gpt-4",
404
- "displayName": "GPT 4",
405
- "endpoints" : [{
406
- "type": "openai"
407
- }]
408
- },
409
- {
410
- "name": "gpt-3.5-turbo",
411
- "displayName": "GPT 3.5 Turbo",
412
- "endpoints" : [{
413
- "type": "openai"
414
- }]
415
- }]`
416
- ```
417
-
418
- You may also use any model provider that offers an OpenAI-compatible API endpoint. For example, you may self-host the [Portkey](https://github.com/Portkey-AI/gateway) gateway and experiment with Claude or GPTs offered by Azure OpenAI. Example for Claude from Anthropic:
419
-
420
- ```
421
- MODELS=`[{
422
- "name": "claude-2.1",
423
- "displayName": "Claude 2.1",
424
- "description": "Anthropic has been founded by former OpenAI researchers...",
425
- "parameters": {
426
- "temperature": 0.5,
427
- "max_new_tokens": 4096,
428
- },
429
- "endpoints": [
430
- {
431
- "type": "openai",
432
- "baseURL": "https://gateway.example.com/v1",
433
- "defaultHeaders": {
434
- "x-portkey-config": '{"provider":"anthropic","api_key":"sk-ant-abc...xyz"}'
435
- }
436
- }
437
- ]
438
- }]`
439
- ```
440
-
441
- Example for GPT 4 deployed on Azure OpenAI:
442
-
443
- ```
444
- MODELS=`[{
445
- "id": "gpt-4-1106-preview",
446
- "name": "gpt-4-1106-preview",
447
- "displayName": "gpt-4-1106-preview",
448
- "parameters": {
449
- "temperature": 0.5,
450
- "max_new_tokens": 4096,
451
- },
452
- "endpoints": [
453
- {
454
- "type": "openai",
455
- "baseURL": "https://{resource-name}.openai.azure.com/openai/deployments/{deployment-id}",
456
- "defaultHeaders": {
457
- "api-key": "{api-key}"
458
- },
459
- "defaultQuery": {
460
- "api-version": "2023-05-15"
461
- }
462
- }
463
- ]
464
- }]`
465
- ```
466
-
467
- Or try Mistral from [Deepinfra](https://deepinfra.com/mistralai/Mistral-7B-Instruct-v0.1/api?example=openai-http):
468
-
469
- > Note, apiKey can either be set custom per endpoint, or globally using `OPENAI_API_KEY` variable.
470
-
471
- ```
472
- MODELS=`[{
473
- "name": "mistral-7b",
474
- "displayName": "Mistral 7B",
475
- "description": "A 7B dense Transformer, fast-deployed and easily customisable. Small, yet powerful for a variety of use cases. Supports English and code, and a 8k context window.",
476
- "parameters": {
477
- "temperature": 0.5,
478
- "max_new_tokens": 4096,
479
- },
480
- "endpoints": [
481
- {
482
- "type": "openai",
483
- "baseURL": "https://api.deepinfra.com/v1/openai",
484
- "apiKey": "abc...xyz"
485
- }
486
- ]
487
- }]`
488
- ```
489
-
490
- _Non-streaming endpoints_
491
-
492
- For endpoints that don't support streaming, like o1 on Azure, you can pass `streamingSupported: false` in your endpoint config:
493
-
494
- ```
495
- MODELS=`[{
496
- "id": "o1-preview",
497
- "name": "o1-preview",
498
- "displayName": "o1-preview",
499
- "systemRoleSupported": false,
500
- "endpoints": [
501
- {
502
- "type": "openai",
503
- "baseURL": "https://my-deployment.openai.azure.com/openai/deployments/o1-preview",
504
- "defaultHeaders": {
505
- "api-key": "$SECRET"
506
- },
507
- "streamingSupported": false,
508
- }
509
- ]
510
- }]`
511
- ```
512
-
513
- ##### Llama.cpp API server
514
-
515
- chat-ui also supports the llama.cpp API server directly without the need for an adapter. You can do this using the `llamacpp` endpoint type.
516
-
517
- If you want to run Chat UI with llama.cpp, you can do the following, using [microsoft/Phi-3-mini-4k-instruct-gguf](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-gguf) as an example model:
518
-
519
- ```bash
520
- # install llama.cpp
521
- brew install llama.cpp
522
- # start llama.cpp server
523
- llama-server --hf-repo microsoft/Phi-3-mini-4k-instruct-gguf --hf-file Phi-3-mini-4k-instruct-q4.gguf -c 4096
524
- ```
525
-
526
- ```env
527
- MODELS=`[
528
- {
529
- "name": "Local Zephyr",
530
- "chatPromptTemplate": "<|system|>\n{{preprompt}}</s>\n{{#each messages}}{{#ifUser}}<|user|>\n{{content}}</s>\n<|assistant|>\n{{/ifUser}}{{#ifAssistant}}{{content}}</s>\n{{/ifAssistant}}{{/each}}",
531
- "parameters": {
532
- "temperature": 0.1,
533
- "top_p": 0.95,
534
- "repetition_penalty": 1.2,
535
- "top_k": 50,
536
- "truncate": 1000,
537
- "max_new_tokens": 2048,
538
- "stop": ["</s>"]
539
- },
540
- "endpoints": [
541
- {
542
- "url": "http://127.0.0.1:8080",
543
- "type": "llamacpp"
544
- }
545
- ]
546
- }
547
- ]`
548
- ```
549
-
550
- Start chat-ui with `npm run dev` and you should be able to chat with Zephyr locally.
551
-
552
- #### Ollama
553
-
554
- We also support the Ollama inference server. Spin up a model with
555
-
556
- ```cli
557
- ollama run mistral
558
- ```
559
-
560
- Then specify the endpoints like so:
561
-
562
- ```env
563
- MODELS=`[
564
- {
565
- "name": "Ollama Mistral",
566
- "chatPromptTemplate": "<s>{{#each messages}}{{#ifUser}}[INST] {{#if @first}}{{#if @root.preprompt}}{{@root.preprompt}}\n{{/if}}{{/if}} {{content}} [/INST]{{/ifUser}}{{#ifAssistant}}{{content}}</s> {{/ifAssistant}}{{/each}}",
567
- "parameters": {
568
- "temperature": 0.1,
569
- "top_p": 0.95,
570
- "repetition_penalty": 1.2,
571
- "top_k": 50,
572
- "truncate": 3072,
573
- "max_new_tokens": 1024,
574
- "stop": ["</s>"]
575
- },
576
- "endpoints": [
577
- {
578
- "type": "ollama",
579
- "url" : "http://127.0.0.1:11434",
580
- "ollamaName" : "mistral"
581
- }
582
- ]
583
- }
584
- ]`
585
- ```
586
-
587
- #### Anthropic
588
-
589
- We also support Anthropic models (including multimodal ones via `multimodal: true`) through the official SDK. You may provide your API key via the `ANTHROPIC_API_KEY` env variable, or alternatively, through the `endpoints.apiKey` as per the following example.
590
 
591
- ```
592
- MODELS=`[
593
- {
594
- "name": "claude-3-haiku-20240307",
595
- "displayName": "Claude 3 Haiku",
596
- "description": "Fastest and most compact model for near-instant responsiveness",
597
- "multimodal": true,
598
- "parameters": {
599
- "max_new_tokens": 4096,
600
- },
601
- "endpoints": [
602
- {
603
- "type": "anthropic",
604
- // optionals
605
- "apiKey": "sk-ant-...",
606
- "baseURL": "https://api.anthropic.com",
607
- "defaultHeaders": {},
608
- "defaultQuery": {}
609
- }
610
- ]
611
- },
612
- {
613
- "name": "claude-3-sonnet-20240229",
614
- "displayName": "Claude 3 Sonnet",
615
- "description": "Ideal balance of intelligence and speed",
616
- "multimodal": true,
617
- "parameters": {
618
- "max_new_tokens": 4096,
619
- },
620
- "endpoints": [
621
- {
622
- "type": "anthropic",
623
- // optionals
624
- "apiKey": "sk-ant-...",
625
- "baseURL": "https://api.anthropic.com",
626
- "defaultHeaders": {},
627
- "defaultQuery": {}
628
- }
629
- ]
630
- },
631
- {
632
- "name": "claude-3-opus-20240229",
633
- "displayName": "Claude 3 Opus",
634
- "description": "Most powerful model for highly complex tasks",
635
- "multimodal": true,
636
- "parameters": {
637
- "max_new_tokens": 4096
638
- },
639
- "endpoints": [
640
- {
641
- "type": "anthropic",
642
- // optionals
643
- "apiKey": "sk-ant-...",
644
- "baseURL": "https://api.anthropic.com",
645
- "defaultHeaders": {},
646
- "defaultQuery": {}
647
- }
648
- ]
649
- }
650
- ]`
651
- ```
652
-
653
- We also support using Anthropic models running on Vertex AI. Authentication is done using Google Application Default Credentials. Project ID can be provided through the `endpoints.projectId` as per the following example:
654
-
655
- ```
656
- MODELS=`[
657
- {
658
- "name": "claude-3-sonnet@20240229",
659
- "displayName": "Claude 3 Sonnet",
660
- "description": "Ideal balance of intelligence and speed",
661
- "multimodal": true,
662
- "parameters": {
663
- "max_new_tokens": 4096,
664
- },
665
- "endpoints": [
666
- {
667
- "type": "anthropic-vertex",
668
- "region": "us-central1",
669
- "projectId": "gcp-project-id",
670
- // optionals
671
- "defaultHeaders": {},
672
- "defaultQuery": {}
673
- }
674
- ]
675
- },
676
- {
677
- "name": "claude-3-haiku@20240307",
678
- "displayName": "Claude 3 Haiku",
679
- "description": "Fastest, most compact model for near-instant responsiveness",
680
- "multimodal": true,
681
- "parameters": {
682
- "max_new_tokens": 4096
683
- },
684
- "endpoints": [
685
- {
686
- "type": "anthropic-vertex",
687
- "region": "us-central1",
688
- "projectId": "gcp-project-id",
689
- // optionals
690
- "defaultHeaders": {},
691
- "defaultQuery": {}
692
- }
693
- ]
694
- }
695
- ]`
696
- ```
697
-
698
- #### Amazon
699
-
700
- You can also specify your Amazon SageMaker instance as an endpoint for chat-ui. The config goes like this:
701
-
702
- ```env
703
- "endpoints": [
704
- {
705
- "type" : "aws",
706
- "service" : "sagemaker"
707
- "url": "",
708
- "accessKey": "",
709
- "secretKey" : "",
710
- "sessionToken": "",
711
- "region": "",
712
-
713
- "weight": 1
714
- }
715
- ]
716
- ```
717
-
718
- You can also set `"service" : "lambda"` to use a lambda instance.
719
-
720
- You can get the `accessKey` and `secretKey` from your AWS user, under programmatic access.
721
-
722
- #### Cloudflare Workers AI
723
-
724
- You can also use Cloudflare Workers AI to run your own models with serverless inference.
725
-
726
- You will need to have a Cloudflare account, then get your [account ID](https://developers.cloudflare.com/fundamentals/setup/find-account-and-zone-ids/) as well as your [API token](https://developers.cloudflare.com/workers-ai/get-started/rest-api/#1-get-api-token-and-account-id) for Workers AI.
727
-
728
- You can either specify them directly in your `.env.local` using the `CLOUDFLARE_ACCOUNT_ID` and `CLOUDFLARE_API_TOKEN` variables, or you can set them directly in the endpoint config.
729
 
730
- You can find the list of models available on Cloudflare [here](https://developers.cloudflare.com/workers-ai/models/#text-generation).
731
 
732
- ```env
733
- {
734
- "name" : "nousresearch/hermes-2-pro-mistral-7b",
735
- "tokenizer": "nousresearch/hermes-2-pro-mistral-7b",
736
- "parameters": {
737
- "stop": ["<|im_end|>"]
738
- },
739
- "endpoints" : [
740
- {
741
- "type" : "cloudflare"
742
- <!-- optionally specify these
743
- "accountId": "your-account-id",
744
- "authToken": "your-api-token"
745
- -->
746
- }
747
- ]
748
- }
749
- ```
750
-
751
- #### Cohere
752
-
753
- You can also use Cohere to run their models directly from chat-ui. You will need to have a Cohere account, then get your [API token](https://dashboard.cohere.com/api-keys). You can either specify it directly in your `.env.local` using the `COHERE_API_TOKEN` variable, or you can set it in the endpoint config.
754
-
755
- Here is an example of a Cohere model config. You can set which model you want to use by setting the `id` field to the model name.
756
-
757
- ```env
758
- {
759
- "name" : "CohereForAI/c4ai-command-r-v01",
760
- "id": "command-r",
761
- "description": "C4AI Command-R is a research release of a 35 billion parameter highly performant generative model",
762
- "endpoints": [
763
- {
764
- "type": "cohere",
765
- <!-- optionally specify these, or use COHERE_API_TOKEN
766
- "apiKey": "your-api-token"
767
- -->
768
- }
769
- ]
770
- }
771
- ```
772
-
773
- ##### Google Vertex models
774
 
775
- Chat UI can connect to the Google Vertex API endpoints ([List of supported models](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models)).
776
 
777
- To enable:
778
 
779
- 1. [Select](https://console.cloud.google.com/project) or [create](https://cloud.google.com/resource-manager/docs/creating-managing-projects#creating_a_project) a Google Cloud project.
780
- 1. [Enable billing for your project](https://cloud.google.com/billing/docs/how-to/modify-project).
781
- 1. [Enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).
782
- 1. [Set up authentication with a service account](https://cloud.google.com/docs/authentication/getting-started)
783
- so you can access the API from your local workstation.
784
 
785
- The service account credentials file can be supplied through an environment variable:
786
-
787
- ```env
788
- GOOGLE_APPLICATION_CREDENTIALS = clientid.json
789
- ```
790
-
791
- Make sure your docker container has access to the file and the variable is correctly set.
792
- Afterwards, Google Vertex endpoints can be configured as follows:
793
-
794
- ```
795
- MODELS=`[
796
- //...
797
- {
798
- "name": "gemini-1.5-pro",
799
- "displayName": "Vertex Gemini Pro 1.5",
800
- "multimodal": true,
801
- "endpoints" : [{
802
- "type": "vertex",
803
- "project": "abc-xyz",
804
- "location": "europe-west3",
805
- "extraBody": {
806
- "model_version": "gemini-1.5-pro-preview-0409",
807
- },
808
-
809
- // Optional
810
- "safetyThreshold": "BLOCK_MEDIUM_AND_ABOVE",
811
- "apiEndpoint": "", // alternative api endpoint url,
812
- "tools": [{
813
- "googleSearchRetrieval": {
814
- "disableAttribution": true
815
- }
816
- }],
817
- "multimodal": {
818
- "image": {
819
- "supportedMimeTypes": ["image/png", "image/jpeg", "image/webp"],
820
- "preferredMimeType": "image/png",
821
- "maxSizeInMB": 5,
822
- "maxWidth": 2000,
823
- "maxHeight": 1000,
824
- }
825
- }
826
- }]
827
- },
828
- ]`
829
-
830
- ```
831
-
832
- ##### LangServe
833
-
834
- LangChain applications that are deployed using LangServe can be called with the following config:
835
-
836
- ```
837
- MODELS=`[
838
- //...
839
- {
840
- "name": "summarization-chain", //model-name
841
- "endpoints" : [{
842
- "type": "langserve",
843
- "url" : "http://127.0.0.1:8100",
844
- }]
845
- },
846
- ]`
847
-
848
- ```
849
-
850
- ### Model Context Protocol (MCP) Support (Upcoming)
851
-
852
- The project is planning to introduce support for the Model Context Protocol (MCP). MCP is a specification designed to standardize how language models receive and understand context from various sources. This will enable more flexible and powerful integrations, allowing models to seamlessly access and utilize a broader range of information, such as user history, external documents, or real-time data, in a structured way.
853
-
854
- This is an upcoming feature, and we believe it will significantly enhance the capabilities and extensibility of Chat UI.
855
-
856
- We are actively seeking contributions from the community to help design, implement, and integrate MCP support into Chat UI. If you are interested in shaping the future of how Chat UI handles model context and want to contribute to this exciting development, please look for issues tagged with 'MCP' or 'Model Context Protocol' on our issue tracker. Your expertise and input would be invaluable!
857
-
858
- ### Custom endpoint authorization
859
-
860
- #### Basic and Bearer
861
-
862
- Custom endpoints may require authorization, depending on how you configure them. Authentication will usually be set either with `Basic` or `Bearer`.
863
-
864
- For `Basic` we will need to generate a base64 encoding of the username and password.
865
-
866
- `echo -n "USER:PASS" | base64`
867
-
868
- > VVNFUjpQQVNT
869
-
870
- For `Bearer` you can use a token, which can be grabbed from [here](https://huggingface.co/settings/tokens).
871
-
872
- You can then add the generated information and the `authorization` parameter to your `.env.local`.
873
-
874
- ```env
875
- "endpoints": [
876
- {
877
- "url": "https://HOST:PORT",
878
- "authorization": "Basic VVNFUjpQQVNT",
879
- }
880
- ]
881
- ```
882
-
883
- Please note that if `HF_TOKEN` is also set or not empty, it will take precedence.
884
-
885
- #### Models hosted on multiple custom endpoints
886
-
887
- If the model is hosted on multiple servers or instances, add the `weight` parameter to each endpoint in your `.env.local`. The `weight` determines the probability of requesting a particular endpoint.
888
-
889
- ```env
890
- "endpoints": [
891
- {
892
- "url": "https://HOST:PORT",
893
- "weight": 1
894
- },
895
- {
896
- "url": "https://HOST:PORT",
897
- "weight": 2
898
- }
899
- ...
900
- ]
901
- ```
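One way to read the `weight` field is as a relative sampling weight. The following is a minimal TypeScript sketch, assuming the probability of an endpoint is its weight divided by the sum of all weights. `pickEndpoint`, the injectable `rand` parameter, and the `HOST-A`/`HOST-B` placeholders are hypothetical and are not taken from the Chat UI source.

```typescript
// Hypothetical shape mirroring the `endpoints` entries above.
interface WeightedEndpoint {
  url: string;
  weight: number;
}

// Pick one endpoint with probability weight / totalWeight.
// `rand` is injectable so the choice can be made deterministic in tests.
function pickEndpoint(
  endpoints: WeightedEndpoint[],
  rand: () => number = Math.random
): WeightedEndpoint {
  if (endpoints.length === 0) {
    throw new Error("at least one endpoint is required");
  }
  const total = endpoints.reduce((sum, e) => sum + e.weight, 0);
  let threshold = rand() * total;
  for (const endpoint of endpoints) {
    threshold -= endpoint.weight;
    if (threshold < 0) {
      return endpoint;
    }
  }
  // Guard against floating-point drift: fall back to the last endpoint.
  return endpoints[endpoints.length - 1];
}

// Placeholder URLs; with weights 1 and 2 the second endpoint is drawn
// about two thirds of the time.
const endpoints: WeightedEndpoint[] = [
  { url: "https://HOST-A:PORT", weight: 1 },
  { url: "https://HOST-B:PORT", weight: 2 },
];
```

Under this reading, weights only matter relative to each other: `1` and `2` behave the same as `10` and `20`.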
902
-
903
- #### Client Certificate Authentication (mTLS)
904
-
905
- Custom endpoints may require client certificate authentication, depending on how you configure them. To enable mTLS between Chat UI and your custom endpoint, you will need to set the `USE_CLIENT_CERTIFICATE` to `true`, and add the `CERT_PATH` and `KEY_PATH` parameters to your `.env.local`. These parameters should point to the location of the certificate and key files on your local machine. The certificate and key files should be in PEM format. The key file can be encrypted with a passphrase, in which case you will also need to add the `CLIENT_KEY_PASSWORD` parameter to your `.env.local`.
906
-
907
- If you're using a certificate signed by a private CA, you will also need to add the `CA_PATH` parameter to your `.env.local`. This parameter should point to the location of the CA certificate file on your local machine.
908
-
909
- If you're using a self-signed certificate, e.g. for testing or development purposes, you can set the `REJECT_UNAUTHORIZED` parameter to `false` in your `.env.local`. This will disable certificate validation, and allow Chat UI to connect to your custom endpoint.
910
-
911
- #### Specific Embedding Model
912
-
913
- A model can use any of the embedding models defined in `.env.local` (currently used when web searching).
914
- By default it will use the first embedding model, but this can be changed with the `embeddingModel` field:
915
-
916
- ```env
917
- TEXT_EMBEDDING_MODELS = `[
918
- {
919
- "name": "Xenova/gte-small",
920
- "chunkCharLength": 512,
921
- "endpoints": [
922
- {"type": "transformersjs"}
923
- ]
924
- },
925
- {
926
- "name": "intfloat/e5-base-v2",
927
- "chunkCharLength": 768,
928
- "endpoints": [
929
- {"type": "tei", "url": "http://127.0.0.1:8080/", "authorization": "Basic VVNFUjpQQVNT"},
930
- {"type": "tei", "url": "http://127.0.0.1:8081/"}
931
- ]
932
- }
933
- ]`
934
-
935
- MODELS=`[
936
- {
937
- "name": "Ollama Mistral",
938
- "chatPromptTemplate": "...",
939
- "embeddingModel": "intfloat/e5-base-v2",
940
- "parameters": {
941
- ...
942
- },
943
- "endpoints": [
944
- ...
945
- ]
946
- }
947
- ]`
948
-
949
- ```
950
-
951
- ### Reasoning Models
952
-
953
- ChatUI supports specialized reasoning/Chain-of-Thought (CoT) models through the `reasoning` configuration field. When properly configured, this displays a UI widget that allows users to view or collapse the model’s reasoning steps. We support three types of reasoning parsing:
954
-
955
- #### Token-Based Delimitations
956
-
957
- For models like DeepSeek R1, token-based delimitations can be used to identify reasoning steps. This is done by specifying the `beginToken` and `endToken` fields in the `reasoning` configuration.
958
-
959
- Example configuration for DeepSeek R1 (token-based):
960
-
961
- ```json
962
- {
963
- "name": "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B",
964
- // ...
965
- "reasoning": {
966
- "type": "tokens",
967
- "beginToken": "<think>",
968
- "endToken": "</think>"
969
- }
970
- }
971
- ```
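To illustrate what token-based delimitation amounts to, here is a minimal TypeScript sketch that splits raw model output on `beginToken` and `endToken`. `splitReasoning` and `ParsedOutput` are hypothetical names, and the handling of an unterminated reasoning block is an assumption rather than the behavior of the Chat UI parser.

```typescript
interface ParsedOutput {
  reasoning: string;
  answer: string;
}

// Split raw model output into the text between the delimiters (reasoning)
// and everything outside them (treated as the final answer).
function splitReasoning(
  output: string,
  beginToken: string,
  endToken: string
): ParsedOutput {
  const start = output.indexOf(beginToken);
  if (start === -1) {
    // No reasoning block: the whole output is the answer.
    return { reasoning: "", answer: output.trim() };
  }
  const reasoningStart = start + beginToken.length;
  const end = output.indexOf(endToken, reasoningStart);
  if (end === -1) {
    // Assumption: an unterminated block means reasoning is still streaming.
    return { reasoning: output.slice(reasoningStart).trim(), answer: "" };
  }
  return {
    reasoning: output.slice(reasoningStart, end).trim(),
    answer: (output.slice(0, start) + output.slice(end + endToken.length)).trim(),
  };
}

const parsed = splitReasoning(
  "<think>2 + 2 = 4</think>The answer is 4.",
  "<think>",
  "</think>"
);
```

Here `parsed.reasoning` holds the collapsible chain of thought and `parsed.answer` holds the text shown as the reply.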
972
-
973
- #### Summarizing the Chain of Thought
974
-
975
- For models like QwQ, which return a chain of thought but do not explicitly provide a final answer, the `summarize` type can be used. This automatically summarizes the reasoning steps using the `TASK_MODEL` (or the first model in the configuration if `TASK_MODEL` is not specified) and displays the summary as the final answer.
976
-
977
- Example configuration for QwQ (summarize-based):
978
-
979
- ```json
980
- {
981
- "name": "Qwen/QwQ-32B-Preview",
982
- // ...
983
- "reasoning": {
984
- "type": "summarize"
985
- }
986
- }
987
- ```
988
-
989
- #### Regex-Based Delimitations
990
-
991
- In some cases, the final answer can be extracted from the model output using a regular expression. This is achieved by specifying the `regex` field in the `reasoning` configuration. For example, if your model wraps the final answer in a `\boxed{}` tag, you can use the following configuration:
992
-
993
- ```json
994
- {
995
- "name": "model/yourmodel",
996
- // ...
997
- "reasoning": {
998
- "type": "regex",
999
- "regex": "\\\\boxed\\{(.+?)\\}"
1000
- }
1001
- }
1002
- ```
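The doubled backslashes in the `regex` value are JSON escaping: once the JSON is parsed, the pattern is `\\boxed\{(.+?)\}`, which matches a literal `\boxed{...}` and captures its contents. A small TypeScript sketch of that extraction follows; `extractFinalAnswer` and the sample sentence are hypothetical, and taking the first capture group as the answer is an assumption.

```typescript
// The `reasoning` object from the config above, kept raw so that the
// backslashes reach JSON.parse exactly as they appear in the file.
const configJson = String.raw`{"type": "regex", "regex": "\\\\boxed\\{(.+?)\\}"}`;
const reasoning: { type: string; regex: string } = JSON.parse(configJson);

// Return the first capture group, or null when the pattern does not match.
function extractFinalAnswer(output: string, pattern: string): string | null {
  const match = new RegExp(pattern).exec(output);
  return match ? match[1] : null;
}

// Sample model output containing a LaTeX-style boxed answer.
const sample = String.raw`Adding the terms gives \boxed{42}.`;
const finalAnswer = extractFinalAnswer(sample, reasoning.regex);
```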
1003
-
1004
- #### Enabling/Disabling Reasoning Summary
1005
-
1006
- You can toggle the summaries that are displayed alongside the CoT by changing the `REASONING_SUMMARY` env variable.
1007
-
1008
- ```env
1009
- REASONING_SUMMARY=false
1010
- ```
1011
-
1012
- ## Common issues
1013
-
1014
- ### 403:You don't have access to this conversation
1015
-
1016
- Most likely you are running chat-ui over HTTP. The recommended option is to set up something like NGINX to handle HTTPS and proxy the requests to chat-ui. If you really need to run over HTTP, you can add `COOKIE_SECURE=false` and `COOKIE_SAMESITE=lax` to your `.env.local`.
1017
-
1018
- Make sure to set your `PUBLIC_ORIGIN` in your `.env.local` to the correct URL as well.
1019
-
1020
- ## Deploying to a HF Space
1021
-
1022
- Create a `DOTENV_LOCAL` secret in your HF Space with the content of your `.env.local`; its variables will be picked up automatically when the Space runs.
1023
 
1024
  ## Building
1025
 
@@ -1032,89 +150,3 @@ npm run build
1032
  You can preview the production build with `npm run preview`.
1033
 
1034
  > To deploy your app, you may need to install an [adapter](https://kit.svelte.dev/docs/adapters) for your target environment.
1035
-
1036
- ## Config changes for HuggingChat
1037
-
1038
- The config file for HuggingChat is stored in the `chart/env/prod.yaml` file. It is the source of truth for the environment variables used for our CI/CD pipeline. For HuggingChat, as we need to customize the app color, as well as the base path, we build a custom docker image. You can find the workflow here.
1039
-
1040
- > [!TIP]
1041
- > If you want to make changes to the model config used in production for HuggingChat, you should do so against `chart/env/prod.yaml`.
1042
-
1043
- ### Running a copy of HuggingChat locally
1044
-
1045
- If you want to run an exact copy of HuggingChat locally, you will need to do the following first:
1046
-
1047
- 1. Create an [OAuth App on the hub](https://huggingface.co/settings/applications/new) with `openid profile email` permissions. Make sure to set the callback URL to something like `http://localhost:5173/chat/login/callback` which matches the right path for your local instance.
1048
- 2. Create a [HF Token](https://huggingface.co/settings/tokens) with your Hugging Face account. You will need a Pro account to be able to access some of the larger models available through HuggingChat.
1049
- 3. Create a free account with [serper.dev](https://serper.dev/) (you will get 2500 free search queries)
1050
- 4. Run an instance of mongoDB, however you want. (Local or remote)
1051
-
1052
- You can then create a new `.env.SECRET_CONFIG` file with the following content
1053
-
1054
- ```env
1055
- MONGODB_URL=<link to your mongo DB from step 4>
1056
- HF_TOKEN=<your HF token from step 2>
1057
- OPENID_CONFIG=`{
1058
- PROVIDER_URL: "https://huggingface.co",
1059
- CLIENT_ID: "<your client ID from step 1>",
1060
- CLIENT_SECRET: "<your client secret from step 1>",
1061
- }`
1062
- SERPER_API_KEY=<your serper API key from step 3>
1063
- MESSAGES_BEFORE_LOGIN=<can be any numerical value, or set to 0 to require login>
1064
- ```
1065
-
1066
- You can then run `npm run updateLocalEnv` in the root of chat-ui. This will create a `.env.local` file which combines the `chart/env/prod.yaml` and the `.env.SECRET_CONFIG` file. You can then run `npm run dev` to start your local instance of HuggingChat.
1067
-
1068
- ### Populate database
1069
-
1070
- > [!WARNING]
1071
- > The `MONGODB_URL` used for this script will be fetched from `.env.local`. Make sure it's correct! The command runs directly on the database.
1072
-
1073
- You can populate the database using faker data using the `populate` script:
1074
-
1075
- ```bash
1076
- npm run populate <flags here>
1077
- ```
1078
-
1079
- At least one flag must be specified, the following flags are available:
1080
-
1081
- - `reset` - resets the database
1082
- - `all` - populates all tables
1083
- - `users` - populates the users table
1084
- - `settings` - populates the settings table for existing users
1085
- - `assistants` - populates the assistants table for existing users
1086
- - `conversations` - populates the conversations table for existing users
1087
-
1088
- For example, you could use it like so:
1089
-
1090
- ```bash
1091
- npm run populate reset
1092
- ```
1093
-
1094
- to clear out the database. Then log in to the app to create your user and run the following command:
1095
-
1096
- ```bash
1097
- npm run populate users settings assistants conversations
1098
- ```
1099
-
1100
- to populate the database with fake data, including fake conversations and assistants for your user.
1101
-
1102
- ## Building the docker images locally
1103
-
1104
- You can build the docker images locally using the following commands:
1105
-
1106
- ```bash
1107
- docker build -t chat-ui-db:latest --build-arg INCLUDE_DB=true .
1108
- docker build -t chat-ui:latest --build-arg INCLUDE_DB=false .
1109
- docker build -t huggingchat:latest --build-arg INCLUDE_DB=false --build-arg APP_BASE=/chat --build-arg PUBLIC_APP_COLOR=yellow --build-arg SKIP_LLAMA_CPP_BUILD=true .
1110
- ```
1111
-
1112
- If you want to run the images with your local `.env.local`, you have two options:
1113
-
1114
- ```bash
1115
- DOTENV_LOCAL=$(<.env.local) docker run --network=host -e DOTENV_LOCAL chat-ui-db
1116
- ```
1117
-
1118
- ```bash
1119
- docker run --network=host --mount type=bind,source="$(pwd)/.env.local",target=/app/.env.local chat-ui-db
1120
- ```
 
1
  # Chat UI
2
 
3
+ ![Chat UI repository thumbnail](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/chat-ui/Frame%2013.png)
 
 
4
 
5
  A chat interface using open source models, e.g. OpenAssistant or Llama. It is a SvelteKit app and it powers the [HuggingChat app on hf.co/chat](https://huggingface.co/chat).
6
 
7
  0. [Quickstart](#quickstart)
8
+ 1. [Database Options](#database-options)
9
+ 2. [Launch](#launch)
10
+ 3. [Optional Docker Image](#optional-docker-image)
11
+ 4. [Extra parameters](#extra-parameters)
12
+ 5. [Building](#building)
13
 
14
+ > Note on models: Chat UI only supports OpenAI-compatible APIs via `OPENAI_BASE_URL` and the `/models` endpoint. Provider-specific integrations (legacy `MODELS` env var, GGUF discovery, embeddings, web-search helpers, etc.) are removed, but any service that speaks the OpenAI protocol—Hugging Face router, llama.cpp server, Ollama’s OpenAI bridge, OpenRouter, Anthropic-on-OpenRouter, etc.—will work.
15
 
16
+ ## Quickstart
17
 
18
+ Chat UI speaks to OpenAI-compatible APIs only. The fastest way to get running is with the Hugging Face Inference Providers router plus your personal Hugging Face access token.
 
 
19
 
20
+ **Step 1 – Create `.env.local`:**
21
 
22
+ ```env
23
+ OPENAI_BASE_URL=https://router.huggingface.co/v1
24
+ OPENAI_API_KEY=hf_************************
25
+ # Fill in once you pick a database option below
26
+ MONGODB_URL=
27
  ```
28
 
29
+ `OPENAI_API_KEY` can come from any OpenAI-compatible endpoint you plan to call. Pick the combo that matches your setup and drop the values into `.env.local`:
30
 
31
+ | Provider | Example `OPENAI_BASE_URL` | Example key env |
32
+ | --------------------------------------------- | ---------------------------------- | ----------------------------------------------------------------------- |
33
+ | Hugging Face Inference Providers router | `https://router.huggingface.co/v1` | `OPENAI_API_KEY=hf_xxx` (or `HF_TOKEN` legacy alias) |
34
+ | llama.cpp server (`llama.cpp --server --api`) | `http://127.0.0.1:8080/v1` | `OPENAI_API_KEY=sk-local-demo` (any string works; llama.cpp ignores it) |
35
+ | Ollama (with OpenAI-compatible bridge) | `http://127.0.0.1:11434/v1` | `OPENAI_API_KEY=ollama` |
36
+ | OpenRouter | `https://openrouter.ai/api/v1` | `OPENAI_API_KEY=sk-or-v1-...` |
37
 
38
+ Check the root [`.env` template](./.env) for the full list of optional variables you can override.
 
 
39
 
40
+ **Step 2 – Choose where MongoDB lives:** Either provision a managed cluster (for example MongoDB Atlas) or run a local container. Both approaches are described in [Database Options](#database-options). After you have the URI, drop it into `MONGODB_URL` (and, if desired, set `MONGODB_DB_NAME`).
41
 
42
+ **Step 3 – Install and launch the dev server:**
43
 
44
  ```bash
45
  git clone https://github.com/huggingface/chat-ui
46
  cd chat-ui
47
  npm install
48
  npm run dev -- --open
49
  ```
50
 
51
+ You now have Chat UI running against the Hugging Face router without needing to host MongoDB yourself.
52
 
53
+ ## Database Options
54
 
55
+ Chat history, users, settings, files, and stats all live in MongoDB. You can point Chat UI at any MongoDB 6/7 deployment.
56
 
57
+ ### MongoDB Atlas (managed)
58
 
59
+ 1. Create a free cluster at [mongodb.com](https://www.mongodb.com/pricing).
60
+ 2. Add your IP (or `0.0.0.0/0` for development) to the network access list.
61
+ 3. Create a database user and copy the connection string.
62
+ 4. Paste that string into `MONGODB_URL` in `.env.local`. Keep the default `MONGODB_DB_NAME=chat-ui` or change it per environment.
63
 
64
+ Atlas keeps MongoDB off your laptop, which is ideal for teams or cloud deployments.
65
 
66
+ ### Local MongoDB (container)
67
 
68
+ If you prefer to run MongoDB locally:
69
 
70
  ```bash
71
  docker run -d -p 27017:27017 --name mongo-chatui mongo:latest
72
  ```
73
 
74
+ Then set `MONGODB_URL=mongodb://localhost:27017` in `.env.local`. You can also supply `MONGO_STORAGE_PATH` if you want Chat UI’s fallback in-memory server to persist under a specific folder.
75
 
76
  ## Launch
77
 
78
+ After configuring your environment variables, start Chat UI with:
79
 
80
  ```bash
81
  npm install
82
  npm run dev
83
  ```
84
 
85
+ The dev server listens on `http://localhost:5173` by default. Use `npm run build` / `npm run preview` for production builds.
86
 
87
+ ## Optional Docker Image
88
 
89
+ Prefer containerized setup? You can run everything in one container as long as you supply a MongoDB URI (local or hosted):
90
 
91
+ ```bash
92
+ docker run \
93
+ -p 3000:3000 \
94
+ -e MONGODB_URL=mongodb://host.docker.internal:27017 \
95
+ -e OPENAI_BASE_URL=https://router.huggingface.co/v1 \
96
+ -e OPENAI_API_KEY=hf_*** \
97
+ -v db:/data \
98
+ ghcr.io/huggingface/chat-ui-db:latest
99
  ```
100
 
101
+ `host.docker.internal` lets the container reach a MongoDB instance on your host machine; swap it for your Atlas URI if you use the hosted option. All environment variables accepted in `.env.local` can be provided as `-e` flags.
 
 
 
102
 
103
  ## Extra parameters
104
 
105
  ### Theming
106
 
107
  You can use a few environment variables to customize the look and feel of chat-ui. These are by default:
 
112
  PUBLIC_APP_COLOR=blue
113
  PUBLIC_APP_DESCRIPTION="Making the community's best AI chat models available to everyone."
114
  PUBLIC_APP_DATA_SHARING=
 
115
  ```
116
 
117
  - `PUBLIC_APP_NAME` The name used as a title throughout the app.
118
  - `PUBLIC_APP_ASSETS` Is used to find logos & favicons in `static/$PUBLIC_APP_ASSETS`, current options are `chatui` and `huggingchat`.
119
  - `PUBLIC_APP_COLOR` Can be any of the [tailwind colors](https://tailwindcss.com/docs/customizing-colors#default-color-palette).
120
  - `PUBLIC_APP_DATA_SHARING` Can be set to 1 to add a toggle in the user settings that lets your users opt in to data sharing with model creators.
121
 
122
+ ### Models
123
 
124
+ This build does not use the `MODELS` env var or GGUF discovery. Configure models via `OPENAI_BASE_URL` only; Chat UI will fetch `${OPENAI_BASE_URL}/models` and populate the list automatically. Authorization uses `OPENAI_API_KEY` (preferred). `HF_TOKEN` remains a legacy alias.
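The discovery step described above can be sketched in TypeScript as follows. This is an illustration under stated assumptions, not the Chat UI implementation: the helper names and the injected `FetchLike` type are hypothetical, the key precedence follows the sentence above (`OPENAI_API_KEY` first, then `HF_TOKEN`), and the response shape is the usual OpenAI-style `{ "data": [{ "id": ... }] }` list.

```typescript
// Minimal subset of an OpenAI-style `GET /models` response.
interface ModelList {
  data: { id: string }[];
}

// Hypothetical fetch-compatible signature, injected so the sketch does not
// depend on a particular runtime's global `fetch`.
type FetchLike = (
  url: string,
  init: { headers: Record<string, string> }
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Build `${OPENAI_BASE_URL}/models`, tolerating a trailing slash.
function modelsUrl(baseUrl: string): string {
  return `${baseUrl.replace(/\/+$/, "")}/models`;
}

// Prefer OPENAI_API_KEY and fall back to the legacy HF_TOKEN alias.
function resolveApiKey(env: Record<string, string | undefined>): string | undefined {
  return env["OPENAI_API_KEY"] || env["HF_TOKEN"] || undefined;
}

// Extract the model ids from a parsed `/models` payload.
function modelIds(payload: ModelList): string[] {
  return payload.data.map((model) => model.id);
}

// Fetch the model list using the injected fetch implementation.
async function listModels(
  env: Record<string, string | undefined>,
  fetchFn: FetchLike
): Promise<string[]> {
  const baseUrl = env["OPENAI_BASE_URL"];
  if (!baseUrl) {
    throw new Error("OPENAI_BASE_URL is not set");
  }
  const key = resolveApiKey(env);
  const headers: Record<string, string> = key ? { Authorization: `Bearer ${key}` } : {};
  const response = await fetchFn(modelsUrl(baseUrl), { headers });
  if (!response.ok) {
    throw new Error(`GET /models failed with status ${response.status}`);
  }
  return modelIds((await response.json()) as ModelList);
}
```

In a runtime with a global `fetch`, a call such as `listModels(env, fetch)` should work, where `env` is an object holding your `.env.local` values.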
125
 
126
+ ### LLM Router (Optional)
127
 
128
+ Chat UI can perform client-side routing using an Arch Router model without running a separate router service. The UI exposes a virtual model alias called "Omni" (configurable) that, when selected, chooses the best route/model for each message.
129
 
130
+ - Provide a routes policy JSON via `LLM_ROUTER_ROUTES_PATH`. No sample file ships with this branch, so you must point the variable to a JSON array you create yourself (for example, commit one in your project like `config/routes.chat.json`). Each route entry needs `name`, `description`, `primary_model`, and optional `fallback_models`.
131
+ - Configure the Arch router selection endpoint with `LLM_ROUTER_ARCH_BASE_URL` (OpenAI-compatible `/chat/completions`) and `LLM_ROUTER_ARCH_MODEL` (e.g. `router/omni`). The Arch call reuses `OPENAI_API_KEY` for auth.
132
+ - Map `other` to a concrete route via `LLM_ROUTER_OTHER_ROUTE` (default: `casual_conversation`). If Arch selection fails, calls fall back to `LLM_ROUTER_FALLBACK_MODEL`.
133
+ - Selection timeout can be tuned via `LLM_ROUTER_ARCH_TIMEOUT_MS` (default 10000).
134
+ - Omni alias configuration: `PUBLIC_LLM_ROUTER_ALIAS_ID` (default `omni`), `PUBLIC_LLM_ROUTER_DISPLAY_NAME` (default `Omni`), and optional `PUBLIC_LLM_ROUTER_LOGO_URL`.
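Since no sample routes file ships with this branch, the shape of the policy can be sketched from the field list above. The TypeScript below is a hedged illustration: `Route` and `parseRoutes` are hypothetical names, and the `code_generation` route and all `example-org/...` model ids are placeholders. Only `casual_conversation` comes from the text above, as the default target for `other`.

```typescript
// Route entry with the fields named above; `fallback_models` is optional.
interface Route {
  name: string;
  description: string;
  primary_model: string;
  fallback_models?: string[];
}

// Parse a routes policy and reject entries that miss a required field.
function parseRoutes(json: string): Route[] {
  const parsed: unknown = JSON.parse(json);
  if (!Array.isArray(parsed)) {
    throw new Error("routes policy must be a JSON array");
  }
  return parsed.map((entry, index): Route => {
    if (typeof entry !== "object" || entry === null) {
      throw new Error(`route ${index}: must be an object`);
    }
    for (const field of ["name", "description", "primary_model"]) {
      if (typeof entry[field] !== "string") {
        throw new Error(`route ${index}: "${field}" must be a string`);
      }
    }
    const fallbacks = entry.fallback_models;
    if (
      fallbacks !== undefined &&
      !(Array.isArray(fallbacks) && fallbacks.every((m) => typeof m === "string"))
    ) {
      throw new Error(`route ${index}: "fallback_models" must be an array of strings`);
    }
    return {
      name: entry.name,
      description: entry.description,
      primary_model: entry.primary_model,
      ...(fallbacks !== undefined ? { fallback_models: fallbacks } : {}),
    };
  });
}

// Example policy; every model id here is a placeholder.
const examplePolicy = `[
  {
    "name": "casual_conversation",
    "description": "Small talk and general questions",
    "primary_model": "example-org/chat-model",
    "fallback_models": ["example-org/backup-model"]
  },
  {
    "name": "code_generation",
    "description": "Writing or debugging code",
    "primary_model": "example-org/code-model"
  }
]`;
const routes = parseRoutes(examplePolicy);
```

The JSON inside `examplePolicy` is what a file such as `config/routes.chat.json` could contain before you point `LLM_ROUTER_ROUTES_PATH` at it.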
135
 
136
+ When you select Omni in the UI, Chat UI will:
137
 
138
+ - Call the Arch endpoint once (non-streaming) to pick the best route based on the most recent conversation turns.
139
+ - Emit RouterMetadata immediately (route and actual model used) so the UI can display it.
140
+ - Stream from the selected model via your configured `OPENAI_BASE_URL`. On errors, it tries route fallbacks.
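The selection and fallback behavior described in this section can be summarized as an ordered list of models to try. The TypeScript sketch below is one reading of that description, not the Chat UI source: `candidateModels` and `RouteEntry` are hypothetical names, and sending an unknown route name to the fallback model is an assumption.

```typescript
// Subset of a route entry needed to decide which models to try.
interface RouteEntry {
  name: string;
  primary_model: string;
  fallback_models?: string[];
}

// Return the models to try, in order, for one message.
// `selected` is the route name returned by the Arch call, or null if the
// selection failed or timed out.
function candidateModels(
  selected: string | null,
  routes: RouteEntry[],
  otherRoute: string, // value of LLM_ROUTER_OTHER_ROUTE
  fallbackModel: string // value of LLM_ROUTER_FALLBACK_MODEL
): string[] {
  if (selected === null) {
    return [fallbackModel];
  }
  // "other" is mapped to a concrete route before lookup.
  const name = selected === "other" ? otherRoute : selected;
  const route = routes.find((r) => r.name === name);
  if (!route) {
    // Assumption: an unknown route name is treated like a failed selection.
    return [fallbackModel];
  }
  return [route.primary_model, ...(route.fallback_models ?? [])];
}

// Placeholder routes and model ids for illustration only.
const demoRoutes: RouteEntry[] = [
  {
    name: "casual_conversation",
    primary_model: "example-org/chat-model",
    fallback_models: ["example-org/backup-model"],
  },
];
const order = candidateModels("other", demoRoutes, "casual_conversation", "example-org/fallback-model");
```

A caller would stream from the first entry of `order` and move to the next entry only if the request fails.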
141
 
142
  ## Building
143
 
 
150
  You can preview the production build with `npm run preview`.
151
 
152
  > To deploy your app, you may need to install an [adapter](https://kit.svelte.dev/docs/adapters) for your target environment.
chart/env/prod.yaml CHANGED
@@ -51,11 +51,8 @@ envVars:
51
  COOKIE_SAMESITE: "lax"
52
  COOKIE_SECURE: "true"
53
  ENABLE_ASSISTANTS: "true"
54
- ENABLE_ASSISTANTS_RAG: "true"
55
  ENABLE_CONFIG_MANAGER: "false"
56
- METRICS_PORT: 5565
57
  LOG_LEVEL: "debug"
58
- METRICS_ENABLED: "true"
59
  MODELS: >
60
  [
61
  {
@@ -542,10 +539,8 @@ envVars:
542
  PUBLIC_APP_ASSETS: "huggingchat"
543
  PUBLIC_APP_COLOR: "yellow"
544
  PUBLIC_APP_DESCRIPTION: "Making the community's best AI chat models available to everyone."
545
- PUBLIC_APP_DISCLAIMER_MESSAGE: "Disclaimer: AI is an area of active research with known problems such as biased generation and misinformation. Do not use this application for high-stakes decisions or advice."
546
  PUBLIC_APP_GUEST_MESSAGE: "Sign in with a free Hugging Face account to continue using HuggingChat."
547
  PUBLIC_APP_DATA_SHARING: 0
548
- PUBLIC_APP_DISCLAIMER: 1
549
  PUBLIC_PLAUSIBLE_SCRIPT_URL: "/js/script.js"
550
  REQUIRE_FEATURED_ASSISTANTS: "true"
551
  TASK_MODEL: >
 
51
  COOKIE_SAMESITE: "lax"
52
  COOKIE_SECURE: "true"
53
  ENABLE_ASSISTANTS: "true"
 
54
  ENABLE_CONFIG_MANAGER: "false"
 
55
  LOG_LEVEL: "debug"
 
56
  MODELS: >
57
  [
58
  {
 
539
  PUBLIC_APP_ASSETS: "huggingchat"
540
  PUBLIC_APP_COLOR: "yellow"
541
  PUBLIC_APP_DESCRIPTION: "Making the community's best AI chat models available to everyone."
 
542
  PUBLIC_APP_GUEST_MESSAGE: "Sign in with a free Hugging Face account to continue using HuggingChat."
543
  PUBLIC_APP_DATA_SHARING: 0
 
544
  PUBLIC_PLAUSIBLE_SCRIPT_URL: "/js/script.js"
545
  REQUIRE_FEATURED_ASSISTANTS: "true"
546
  TASK_MODEL: >
chart/templates/deployment.yaml CHANGED
@@ -53,11 +53,6 @@ spec:
53
  - containerPort: {{ $.Values.envVars.APP_PORT | default 3000 | int }}
54
  name: http
55
  protocol: TCP
56
- {{- if $.Values.monitoring.enabled }}
57
- - containerPort: {{ $.Values.envVars.METRICS_PORT | default 5565 | int }}
58
- name: metrics
59
- protocol: TCP
60
- {{- end }}
61
  resources: {{ toYaml .Values.resources | nindent 12 }}
62
  {{- with $.Values.extraEnv }}
63
  env:
 
53
  - containerPort: {{ $.Values.envVars.APP_PORT | default 3000 | int }}
54
  name: http
55
  protocol: TCP
56
  resources: {{ toYaml .Values.resources | nindent 12 }}
57
  {{- with $.Values.extraEnv }}
58
  env:
chart/templates/service-monitor.yaml DELETED
@@ -1,15 +0,0 @@
1
- {{- if $.Values.monitoring.enabled }}
2
- apiVersion: monitoring.coreos.com/v1
3
- kind: ServiceMonitor
4
- metadata:
5
- labels: {{ include "labels.standard" . | nindent 4 }}
6
- name: {{ include "name" . }}
7
- namespace: {{ .Release.Namespace }}
8
- spec:
9
- selector:
10
- matchLabels: {{ include "labels.standard" . | nindent 6 }}
11
- endpoints:
12
- - port: metrics
13
- path: /metrics
14
- interval: 15s
15
- {{- end }}
chart/templates/service.yaml CHANGED
@@ -11,11 +11,5 @@ spec:
11
  port: 80
12
  protocol: TCP
13
  targetPort: http
14
- {{- if $.Values.monitoring.enabled }}
15
- - name: metrics
16
- port: 5565
17
- protocol: TCP
18
- targetPort: metrics
19
- {{- end }}
20
  selector: {{ include "labels.standard" . | nindent 4 }}
21
  type: {{.Values.service.type}}
 
11
  port: 80
12
  protocol: TCP
13
  targetPort: http
14
  selector: {{ include "labels.standard" . | nindent 4 }}
15
  type: {{.Values.service.type}}
chart/values.yaml CHANGED
@@ -70,5 +70,4 @@ autoscaling:
70
  targetMemoryUtilizationPercentage: ""
71
  targetCPUUtilizationPercentage: ""
72
 
73
- monitoring:
74
- enabled: false
 
70
  targetMemoryUtilizationPercentage: ""
71
  targetCPUUtilizationPercentage: ""
72
 
73
+ ## Metrics removed; monitoring configuration no longer used
 
docs/source/_toctree.yml DELETED
@@ -1,64 +0,0 @@
1
- - local: index
2
- title: 🤗 Chat UI
3
- - title: Installation
4
- sections:
5
- - local: installation/local
6
- title: Local
7
- - local: installation/spaces
8
- title: Spaces
9
- - local: installation/docker
10
- title: Docker
11
- - local: installation/helm
12
- title: Helm
13
- - title: Configuration
14
- sections:
15
- - local: configuration/overview
16
- title: Overview
17
- - local: configuration/theming
18
- title: Theming
19
- - local: configuration/open-id
20
- title: OpenID
21
- - local: configuration/web-search
22
- title: Web Search
23
- - local: configuration/metrics
24
- title: Metrics
25
- - local: configuration/embeddings
26
- title: Text Embedding Models
27
- - title: Models
28
- sections:
29
- - local: configuration/models/overview
30
- title: Overview
31
- - local: configuration/models/multimodal
32
- title: Multimodal
33
- - local: configuration/models/tools
34
- title: Tools
35
- - title: Providers
36
- sections:
37
- - local: configuration/models/providers/anthropic
38
- title: Anthropic
39
- - local: configuration/models/providers/aws
40
- title: AWS
41
- - local: configuration/models/providers/cloudflare
42
- title: Cloudflare
43
- - local: configuration/models/providers/cohere
44
- title: Cohere
45
- - local: configuration/models/providers/google
46
- title: Google
47
- - local: configuration/models/providers/langserve
48
- title: Langserve
49
- - local: configuration/models/providers/llamacpp
50
- title: Llama.cpp
51
- - local: configuration/models/providers/ollama
52
- title: Ollama
53
- - local: configuration/models/providers/openai
54
- title: OpenAI
55
- - local: configuration/models/providers/tgi
56
- title: TGI
57
- - local: configuration/common-issues
58
- title: Common Issues
59
- - title: Developing
60
- sections:
61
- - local: developing/architecture
62
- title: Architecture
63
- - local: developing/copy-huggingchat
64
- title: Copy HuggingChat
docs/source/configuration/common-issues.md DELETED
@@ -1,7 +0,0 @@
1
- # Common Issues
2
-
3
- ## 403:You don't have access to this conversation
4
-
5
- Most likely you are running chat-ui over HTTP. The recommended option is to set up something like NGINX to handle HTTPS and proxy the requests to chat-ui. If you really need to run over HTTP, you can add `ALLOW_INSECURE_COOKIES=true` to your `.env.local`.
6
-
7
- Make sure to set your `PUBLIC_ORIGIN` in your `.env.local` to the correct URL as well.
docs/source/configuration/embeddings.md DELETED
@@ -1,105 +0,0 @@
1
- # Text Embedding Models
2
-
3
- By default (for backward compatibility), when `TEXT_EMBEDDING_MODELS` environment variable is not defined, [transformers.js](https://huggingface.co/docs/transformers.js) embedding models will be used for embedding tasks, specifically, the [Xenova/gte-small](https://huggingface.co/Xenova/gte-small) model.
4
-
5
- You can customize the embedding model by setting `TEXT_EMBEDDING_MODELS` in your `.env.local` file where the required fields are `name`, `chunkCharLength` and `endpoints`.
6
-
7
- Supported text embedding backends are: [`transformers.js`](https://huggingface.co/docs/transformers.js), [`TEI`](https://github.com/huggingface/text-embeddings-inference) and [`OpenAI`](https://platform.openai.com/docs/guides/embeddings). `transformers.js` models run locally as part of `chat-ui`, whereas `TEI` models run in a separate environment and are accessed through an API endpoint. `openai` models are accessed through the [OpenAI API](https://platform.openai.com/docs/guides/embeddings).
8
-
9
- When more than one embedding model is supplied in the `.env.local` file, the first is used by default, and the others are used only by LLMs whose `embeddingModel` is set to the name of that model.
10
-
11
- ## Transformers.js
12
-
13
- The Transformers.js backend uses the local CPU for embedding, which can be quite slow. If you use web search frequently, consider using TEI or OpenAI embeddings instead, as performance will improve significantly.
14
-
15
- ```ini
16
- TEXT_EMBEDDING_MODELS = `[
17
- {
18
- "name": "Xenova/gte-small",
19
- "displayName": "Xenova/gte-small",
20
- "description": "locally running embedding",
21
- "chunkCharLength": 512,
22
- "endpoints": [
23
- { "type": "transformersjs" }
24
- ]
25
- }
26
- ]`
27
- ```
28
-
29
- ## Text Embeddings Inference (TEI)
30
-
31
- > Text Embeddings Inference (TEI) is a comprehensive toolkit designed for efficient deployment and serving of open source text embeddings models. It enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE, and E5.
32
-
33
- Some recommended models at the time of writing (May 2024) are `Snowflake/snowflake-arctic-embed-m` and `BAAI/bge-large-en-v1.5`. You may run TEI locally with GPU support via Docker:
34
-
35
- `docker run --gpus all -p 8080:80 -v tei-data:/data --name tei ghcr.io/huggingface/text-embeddings-inference:1.2 --model-id YOUR/HF_MODEL`
36
-
37
- You can then hook this up to your Chat UI instance with the following configuration.
38
-
39
- ```ini
40
- TEXT_EMBEDDING_MODELS=`[
41
- {
42
- "name": "YOUR/HF_MODEL",
43
- "displayName": "YOUR/HF_MODEL",
44
- "preQuery": "Check the model documentation for the preQuery. Not all models have one",
45
- "prePassage": "Check the model documentation for the prePassage. Not all models have one",
46
- "chunkCharLength": 512,
47
- "endpoints": [{
48
- "type": "tei",
49
- "url": "http://127.0.0.1:8080/"
50
- }]
51
- }
52
- ]`
53
- ```
54
-
55
- Examples for `Snowflake/snowflake-arctic-embed-m` and `BAAI/bge-large-en-v1.5`:
56
-
57
- ```ini
58
- TEXT_EMBEDDING_MODELS=`[
59
- {
60
- "name": "Snowflake/snowflake-arctic-embed-m",
61
- "displayName": "Snowflake/snowflake-arctic-embed-m",
62
- "preQuery": "Represent this sentence for searching relevant passages: ",
63
- "chunkCharLength": 512,
64
- "endpoints": [{
65
- "type": "tei",
66
- "url": "http://127.0.0.1:8080/"
67
- }]
68
- },{
69
- "name": "BAAI/bge-large-en-v1.5",
70
- "displayName": "BAAI/bge-large-en-v1.5",
71
- "chunkCharLength": 512,
72
- "endpoints": [{
73
- "type": "tei",
74
- "url": "http://127.0.0.1:8080/"
75
- }]
76
- }
77
- ]`
78
- ```
79
-
80
- ## OpenAI
81
-
82
- It's also possible to host your own OpenAI API compatible embedding models. [`Infinity`](https://github.com/michaelfeil/infinity) is one example. You may run it locally with Docker:
83
-
84
- `docker run -it --gpus all -v infinity-data:/app/.cache -p 7997:7997 michaelf34/infinity:latest v2 --model-id nomic-ai/nomic-embed-text-v1 --port 7997`
85
-
86
- You can then hook this up to your Chat UI instance with the following configuration.
87
-
88
- ```ini
89
- TEXT_EMBEDDING_MODELS=`[
90
- {
91
- "name": "nomic-ai/nomic-embed-text-v1",
92
- "displayName": "nomic-ai/nomic-embed-text-v1",
93
- "chunkCharLength": 512,
94
- "model": {
95
- "name": "nomic-ai/nomic-embed-text-v1"
96
- },
97
- "endpoints": [
98
- {
99
- "type": "openai",
100
- "url": "https://127.0.0.1:7997/embeddings"
101
- }
102
- ]
103
- }
104
- ]`
105
- ```
 
docs/source/configuration/metrics.md DELETED
@@ -1,9 +0,0 @@
1
- # Metrics
2
-
3
- The server can expose Prometheus metrics on port `5565`, but this is off by default. You can enable the metrics server with `METRICS_ENABLED=true` and change the port with, for example, `METRICS_PORT=1234`.
4
-
5
- <Tip>
6
-
7
- In development with `npm run dev`, the metrics server does not shut down gracefully because SvelteKit does not provide hooks for restart. It's recommended to disable the metrics server in this case.
8
-
9
- </Tip>
 
docs/source/configuration/models/multimodal.md DELETED
@@ -1,24 +0,0 @@
1
- # Multimodal
2
-
3
- We currently support [IDEFICS](https://huggingface.co/blog/idefics) (hosted on [TGI](./providers/tgi)), OpenAI and Anthropic Claude 3 as multimodal models. You can enable multimodal support by setting `multimodal: true` in your `MODELS` configuration. For IDEFICS, you must have a [PRO HF API token](https://huggingface.co/settings/tokens). For OpenAI, see the [OpenAI section](./providers/openai). For Anthropic, see the [Anthropic section](./providers/anthropic).
4
-
5
- ```ini
6
- MODELS=`[
7
- {
8
- "name": "HuggingFaceM4/idefics-80b-instruct",
9
- "multimodal" : true,
10
- "description": "IDEFICS is the new multimodal model by Hugging Face.",
11
- "preprompt": "",
12
- "chatPromptTemplate" : "{{#each messages}}{{#ifUser}}User: {{content}}{{/ifUser}}<end_of_utterance>\nAssistant: {{#ifAssistant}}{{content}}\n{{/ifAssistant}}{{/each}}",
13
- "parameters": {
14
- "temperature": 0.1,
15
- "top_p": 0.95,
16
- "repetition_penalty": 1.2,
17
- "top_k": 12,
18
- "truncate": 1000,
19
- "max_new_tokens": 1024,
20
- "stop": ["<end_of_utterance>", "User:", "\nUser:"]
21
- }
22
- }
23
- ]`
24
- ```
 
docs/source/configuration/models/overview.md DELETED
@@ -1,147 +0,0 @@
1
- # Models Overview
2
-
3
- You can customize the parameters passed to the model or even use a new model by updating the `MODELS` variable in your `.env.local`. The default one can be found in `.env` and looks like this:
4
-
5
- ```ini
6
- MODELS=`[
7
- {
8
- "name": "mistralai/Mistral-7B-Instruct-v0.2",
9
- "displayName": "mistralai/Mistral-7B-Instruct-v0.2",
10
- "description": "Mistral 7B is a new Apache 2.0 model, released by Mistral AI that outperforms Llama2 13B in benchmarks.",
11
- "websiteUrl": "https://mistral.ai/news/announcing-mistral-7b/",
12
- "preprompt": "",
13
- "chatPromptTemplate" : "<s>{{#each messages}}{{#ifUser}}[INST] {{#if @first}}{{#if @root.preprompt}}{{@root.preprompt}}\n{{/if}}{{/if}}{{content}} [/INST]{{/ifUser}}{{#ifAssistant}}{{content}}</s>{{/ifAssistant}}{{/each}}",
14
- "parameters": {
15
- "temperature": 0.3,
16
- "top_p": 0.95,
17
- "repetition_penalty": 1.2,
18
- "top_k": 50,
19
- "truncate": 3072,
20
- "max_new_tokens": 1024,
21
- "stop": ["</s>"]
22
- },
23
- "promptExamples": [
24
- {
25
- "title": "Write an email",
26
- "prompt": "As a restaurant owner, write a professional email to the supplier to get these products every week: \n\n- Wine (x10)\n- Eggs (x24)\n- Bread (x12)"
27
- }, {
28
- "title": "Code a game",
29
- "prompt": "Code a basic snake game in python, give explanations for each step."
30
- }, {
31
- "title": "Recipe help",
32
- "prompt": "How do I make a delicious lemon cheesecake?"
33
- }
34
- ]
35
- }
36
- ]`
37
-
38
- ```
39
-
40
- You can change things like the parameters, or customize the preprompt to better suit your needs. You can also add more models by adding more objects to the array, with different preprompts for example.
41
-
42
- ## Chat Prompt Template
43
-
44
- When querying the model for a chat response, the `chatPromptTemplate` template is used. `messages` is an array of chat messages in the format `[{ content: string }, ...]`. To identify whether a message is a user message or an assistant message, the `ifUser` and `ifAssistant` block helpers can be used.
45
-
46
- The following is the default `chatPromptTemplate`, although newlines and indentation have been added for readability. You can find the prompts used in production for HuggingChat [here](https://github.com/huggingface/chat-ui/blob/main/PROMPTS.md). The templating language used is [Handlebars](https://www.npmjs.com/package/handlebars).
47
-
48
- ```handlebars
49
- {{preprompt}}
50
- {{#each messages}}
51
- {{#ifUser}}{{@root.userMessageToken}}{{content}}{{@root.userMessageEndToken}}{{/ifUser}}
52
- {{#ifAssistant
53
- }}{{@root.assistantMessageToken}}{{content}}{{@root.assistantMessageEndToken}}{{/ifAssistant}}
54
- {{/each}}
55
- {{assistantMessageToken}}
56
- ```
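As a rough illustration of what such a template produces, the sketch below re-implements by hand the Mistral `chatPromptTemplate` from the example config above. It is not Chat UI's renderer (which uses Handlebars); the `from` field used to tell user and assistant messages apart and the function name `render_mistral_prompt` are assumptions made for this example.

```python
def render_mistral_prompt(messages, preprompt=""):
    """Hand-written equivalent of the Mistral chatPromptTemplate shown earlier.

    Assumption: each message is a dict with "from" ("user" or "assistant")
    and "content"; the original docs only state that messages have `content`.
    """
    parts = ["<s>"]
    for index, message in enumerate(messages):
        if message["from"] == "user":
            parts.append("[INST] ")
            # {{#if @first}}{{#if @root.preprompt}}...: the preprompt is only
            # emitted inside the first message, and only when it is non-empty.
            if index == 0 and preprompt:
                parts.append(preprompt + "\n")
            parts.append(message["content"] + " [/INST]")
        elif message["from"] == "assistant":
            parts.append(message["content"] + "</s>")
    return "".join(parts)


conversation = [
    {"from": "user", "content": "Hi"},
    {"from": "assistant", "content": "Hello"},
    {"from": "user", "content": "Bye"},
]
print(render_mistral_prompt(conversation, preprompt="Be brief."))
```

The prompt ends right after the last `[/INST]`, which is where the model is expected to continue with the assistant's reply.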
57
-
58
- ## Custom endpoint authorization
59
-
60
- ### Basic and Bearer
61
-
62
- Custom endpoints may require authorization, depending on how you configure them. Authentication will usually be set either with `Basic` or `Bearer`.
63
-
64
- For `Basic` we will need to generate a base64 encoding of the username and password.
65
-
66
- `echo -n "USER:PASS" | base64`
67
-
68
- > VVNFUjpQQVNT
69
-
70
- For `Bearer` you can use a token, which can be grabbed from [here](https://huggingface.co/settings/tokens).
71
-
72
- You can then add the generated information and the `authorization` parameter to your `.env.local`.
73
-
74
- ```ini
75
- "endpoints": [
76
- {
77
- "url": "https://HOST:PORT",
78
- "authorization": "Basic VVNFUjpQQVNT",
79
- }
80
- ]
81
- ```
82
-
83
- Please note that if `HF_TOKEN` is also set or not empty, it will take precedence.
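For readers who prefer not to use the shell, here is a minimal sketch of how the `Basic` value above can be derived in Python. The helper name `basic_authorization` is hypothetical; only the `url` and `authorization` keys come from the configuration shown above.

```python
import base64
import json


def basic_authorization(user: str, password: str) -> str:
    """Return the value for the `authorization` field, e.g. 'Basic VVNFUjpQQVNT'."""
    raw = f"{user}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


# Build an endpoint entry shaped like the one in the docs above.
endpoint = {
    "url": "https://HOST:PORT",  # placeholder, as in the original example
    "authorization": basic_authorization("USER", "PASS"),
}
print(json.dumps([endpoint], indent=2))
```

With `USER` and `PASS` as inputs, the header value matches the `VVNFUjpQQVNT` string produced by the `echo -n "USER:PASS" | base64` command.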
84
-
85
- ## Models hosted on multiple custom endpoints
86
-
87
- If the model being hosted will be available on multiple servers/instances, add the `weight` parameter to your `.env.local`. The `weight` is used to determine the probability of requesting a particular endpoint.
88
-
89
- ```ini
90
- "endpoints": [
91
- {
92
- "url": "https://HOST:PORT",
93
- "weight": 1
94
- },
95
- {
96
- "url": "https://HOST:PORT",
97
- "weight": 2
98
- }
99
- ...
100
- ]
101
- ```
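To make the effect of `weight` concrete, the sketch below shows one plausible way to turn weights into selection probabilities. It is not taken from the Chat UI source; the function names and the default weight of 1 for endpoints without a `weight` are assumptions for illustration.

```python
import random


def selection_probabilities(endpoints):
    """Probability of each endpoint being chosen, proportional to its weight."""
    weights = [endpoint.get("weight", 1) for endpoint in endpoints]  # default of 1 is assumed
    total = sum(weights)
    return [weight / total for weight in weights]


def pick_endpoint(endpoints, rng=random):
    """Pick one endpoint at random, with probability proportional to its weight."""
    weights = [endpoint.get("weight", 1) for endpoint in endpoints]
    return rng.choices(endpoints, weights=weights, k=1)[0]


endpoints = [
    {"url": "https://HOST-A:PORT", "weight": 1},  # hypothetical hosts
    {"url": "https://HOST-B:PORT", "weight": 2},
]
print(selection_probabilities(endpoints))
print(pick_endpoint(endpoints)["url"])
```

Under this reading, weights of 1 and 2 mean the second endpoint receives roughly two thirds of the requests.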
102
-
103
- ## Client Certificate Authentication (mTLS)
104
-
105
- Custom endpoints may require client certificate authentication, depending on how you configure them. To enable mTLS between Chat UI and your custom endpoint, set `USE_CLIENT_CERTIFICATE` to `true` and add the `CERT_PATH` and `KEY_PATH` parameters to your `.env.local`. These parameters should point to the location of the certificate and key files on your local machine. The certificate and key files should be in PEM format. The key file can be encrypted with a passphrase, in which case you will also need to add the `CLIENT_KEY_PASSWORD` parameter to your `.env.local`.
106
-
107
- If you're using a certificate signed by a private CA, you will also need to add the `CA_PATH` parameter to your `.env.local`. This parameter should point to the location of the CA certificate file on your local machine.
108
-
109
- If you're using a self-signed certificate, e.g. for testing or development purposes, you can set the `REJECT_UNAUTHORIZED` parameter to `false` in your `.env.local`. This will disable certificate validation, and allow Chat UI to connect to your custom endpoint.
110
-
111
- ## Specific Embedding Model
112
-
113
- A model can use any of the embedding models defined under `TEXT_EMBEDDING_MODELS` (currently used when web searching). By default it uses the first embedding model, but this can be changed with the `embeddingModel` field:
114
-
115
- ```ini
116
- TEXT_EMBEDDING_MODELS = `[
117
- {
118
- "name": "Xenova/gte-small",
119
- "chunkCharLength": 512,
120
- "endpoints": [
121
- {"type": "transformersjs"}
122
- ]
123
- },
124
- {
125
- "name": "intfloat/e5-base-v2",
126
- "chunkCharLength": 768,
127
- "endpoints": [
128
- {"type": "tei", "url": "http://127.0.0.1:8080/", "authorization": "Basic VVNFUjpQQVNT"},
129
- {"type": "tei", "url": "http://127.0.0.1:8081/"}
130
- ]
131
- }
132
- ]`
133
-
134
- MODELS=`[
135
- {
136
- "name": "Ollama Mistral",
137
- "chatPromptTemplate": "...",
138
- "embeddingModel": "intfloat/e5-base-v2",
139
- "parameters": {
140
- ...
141
- },
142
- "endpoints": [
143
- ...
144
- ]
145
- }
146
- ]`
147
- ```
 
docs/source/configuration/models/providers/anthropic.md DELETED
@@ -1,117 +0,0 @@
1
- # Anthropic
2
-
3
- | Feature | Available |
4
- | --------------------------- | --------- |
5
- | [Tools](../tools) | No |
6
- | [Multimodal](../multimodal) | Yes |
7
-
8
- We also support Anthropic models (including multimodal ones via `multimodal: true`) through the official SDK. You may provide your API key via the `ANTHROPIC_API_KEY` env variable, or alternatively, through the `endpoints.apiKey` as per the following example.
9
-
10
- ```ini
11
- MODELS=`[
12
- {
13
- "name": "claude-3-haiku-20240307",
14
- "displayName": "Claude 3 Haiku",
15
- "description": "Fastest and most compact model for near-instant responsiveness",
16
- "multimodal": true,
17
- "parameters": {
18
- "max_new_tokens": 4096,
19
- },
20
- "endpoints": [
21
- {
22
- "type": "anthropic",
23
- // optionals
24
- "apiKey": "sk-ant-...",
25
- "baseURL": "https://api.anthropic.com",
26
- "defaultHeaders": {},
27
- "defaultQuery": {}
28
- }
29
- ]
30
- },
31
- {
32
- "name": "claude-3-sonnet-20240229",
33
- "displayName": "Claude 3 Sonnet",
34
- "description": "Ideal balance of intelligence and speed",
35
- "multimodal": true,
36
- "parameters": {
37
- "max_new_tokens": 4096,
38
- },
39
- "endpoints": [
40
- {
41
- "type": "anthropic",
42
- // optionals
43
- "apiKey": "sk-ant-...",
44
- "baseURL": "https://api.anthropic.com",
45
- "defaultHeaders": {},
46
- "defaultQuery": {}
47
- }
48
- ]
49
- },
50
- {
51
- "name": "claude-3-opus-20240229",
52
- "displayName": "Claude 3 Opus",
53
- "description": "Most powerful model for highly complex tasks",
54
- "multimodal": true,
55
- "parameters": {
56
- "max_new_tokens": 4096
57
- },
58
- "endpoints": [
59
- {
60
- "type": "anthropic",
61
- // optionals
62
- "apiKey": "sk-ant-...",
63
- "baseURL": "https://api.anthropic.com",
64
- "defaultHeaders": {},
65
- "defaultQuery": {}
66
- }
67
- ]
68
- }
69
- ]`
70
- ```
71
-
72
- ## VertexAI
73
-
74
- We also support using Anthropic models running on Vertex AI. Authentication is done using Google Application Default Credentials. Project ID can be provided through the `endpoints.projectId` as per the following example:
75
-
76
- ```ini
77
- MODELS=`[
78
- {
79
- "name": "claude-3-haiku@20240307",
80
- "displayName": "Claude 3 Haiku",
81
- "description": "Fastest, most compact model for near-instant responsiveness",
82
- "multimodal": true,
83
- "parameters": {
84
- "max_new_tokens": 4096
85
- },
86
- "endpoints": [
87
- {
88
- "type": "anthropic-vertex",
89
- "region": "us-central1",
90
- "projectId": "gcp-project-id",
91
- // optionals
92
- "defaultHeaders": {},
93
- "defaultQuery": {}
94
- }
95
- ]
96
- },
97
- {
98
- "name": "claude-3-sonnet@20240229",
99
- "displayName": "Claude 3 Sonnet",
100
- "description": "Ideal balance of intelligence and speed",
101
- "multimodal": true,
102
- "parameters": {
103
- "max_new_tokens": 4096,
104
- },
105
- "endpoints": [
106
- {
107
- "type": "anthropic-vertex",
108
- "region": "us-central1",
109
- "projectId": "gcp-project-id",
110
- // optionals
111
- "defaultHeaders": {},
112
- "defaultQuery": {}
113
- }
114
- ]
115
- },
116
- ]`
117
- ```
 
docs/source/configuration/models/providers/aws.md DELETED
@@ -1,35 +0,0 @@
1
- # Amazon Web Services (AWS)
2
-
3
- | Feature | Available |
4
- | --------------------------- | --------- |
5
- | [Tools](../tools) | No |
6
- | [Multimodal](../multimodal) | No |
7
-
8
- You may specify your Amazon SageMaker instance as an endpoint for Chat UI:
9
-
10
- ```ini
11
- MODELS=`[{
12
- "name": "your-model",
13
- "displayName": "Your Model",
14
- "description": "Your description",
15
- "parameters": {
16
- "max_new_tokens": 4096
17
- },
18
- "endpoints": [
19
- {
20
- "type" : "aws",
21
- "service" : "sagemaker",
22
- "url": "",
23
- "accessKey": "",
24
- "secretKey" : "",
25
- "sessionToken": "",
26
- "region": "",
27
- "weight": 1
28
- }
29
- ]
30
- }]`
31
- ```
32
-
33
- You can also set `"service": "lambda"` to use an AWS Lambda function.
34
-
35
- You can get the `accessKey` and `secretKey` from your AWS user, under programmatic access.
 
docs/source/configuration/models/providers/cloudflare.md DELETED
@@ -1,35 +0,0 @@
1
- # Cloudflare
2
-
3
- | Feature | Available |
4
- | ------------------------------ | --------- |
5
- | [Tools](../tools.md) | No |
6
- | [Multimodal](../multimodal.md) | No |
7
-
8
- You may use Cloudflare Workers AI to run your own models with serverless inference.
9
-
10
- You will need to have a Cloudflare account, then get your [account ID](https://developers.cloudflare.com/fundamentals/setup/find-account-and-zone-ids/) as well as your [API token](https://developers.cloudflare.com/workers-ai/get-started/rest-api/#1-get-an-api-token) for Workers AI.
11
-
12
- You can either specify them directly in your `.env.local` using the `CLOUDFLARE_ACCOUNT_ID` and `CLOUDFLARE_API_TOKEN` variables, or you can set them directly in the endpoint config.
13
-
14
- You can find the list of models available on Cloudflare [here](https://developers.cloudflare.com/workers-ai/models/#text-generation).
15
-
16
- ```ini
17
- MODELS=`[
18
- {
19
- "name" : "nousresearch/hermes-2-pro-mistral-7b",
20
- "tokenizer": "nousresearch/hermes-2-pro-mistral-7b",
21
- "parameters": {
22
- "stop": ["<|im_end|>"]
23
- },
24
- "endpoints" : [
25
- {
26
- "type" : "cloudflare"
27
- <!-- optionally specify these
28
- "accountId": "your-account-id",
29
- "authToken": "your-api-token"
30
- -->
31
- }
32
- ]
33
- }
34
- ]`
35
- ```
 
docs/source/configuration/models/providers/cohere.md DELETED
@@ -1,26 +0,0 @@
1
- # Cohere
2
-
3
- | Feature | Available |
4
- | --------------------------- | --------- |
5
- | [Tools](../tools) | Yes |
6
- | [Multimodal](../multimodal) | No |
7
-
8
- You may use Cohere to run their models directly from Chat UI. You will need to have a Cohere account, then get your [API token](https://dashboard.cohere.com/api-keys). You can either specify it directly in your `.env.local` using the `COHERE_API_TOKEN` variable, or you can set it in the endpoint config.
9
-
10
- Here is an example of a Cohere model config. You can set which model you want to use by setting the `id` field to the model name.
11
-
12
- ```ini
13
- MODELS=`[
14
- {
15
- "name": "command-r-plus",
16
- "displayName": "Command R+",
17
- "tools": true,
18
- "endpoints": [{
19
- "type": "cohere",
20
- <!-- optionally specify these, or use COHERE_API_TOKEN
21
- "apiKey": "your-api-token"
22
- -->
23
- }]
24
- }
25
- ]`
26
- ```
 
docs/source/configuration/models/providers/google.md DELETED
@@ -1,92 +0,0 @@
1
- # Google
2
-
3
- | Feature | Available |
4
- | --------------------------- | --------- |
5
- | [Tools](../tools) | No |
6
- | [Multimodal](../multimodal) | No |
7
-
8
- Chat UI can connect to the Google Vertex API endpoints ([List of supported models](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models)).
9
-
10
- To enable:
11
-
12
- 1. [Select](https://console.cloud.google.com/project) or [create](https://cloud.google.com/resource-manager/docs/creating-managing-projects#creating_a_project) a Google Cloud project.
13
- 1. [Enable billing for your project](https://cloud.google.com/billing/docs/how-to/modify-project).
14
- 1. [Enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).
15
- 1. [Set up authentication with a service account](https://cloud.google.com/docs/authentication/getting-started)
16
- so you can access the API from your local workstation.
17
-
18
- The service account credentials file can be imported as an environment variable:
19
-
20
- ```ini
21
- GOOGLE_APPLICATION_CREDENTIALS = clientid.json
22
- ```
23
-
24
- Make sure your Docker container has access to the file and the variable is correctly set.
25
- Afterwards, Google Vertex endpoints can be configured as follows:
26
-
27
- ```ini
28
- MODELS=`[
29
- {
30
- "name": "gemini-1.5-pro",
31
- "displayName": "Vertex Gemini Pro 1.5",
32
- "endpoints" : [{
33
- "type": "vertex",
34
- "project": "abc-xyz",
35
- "location": "europe-west3",
36
- "extraBody": {
37
- "model_version": "gemini-1.5-pro-002",
38
- },
39
- // Optional
40
- "safetyThreshold": "BLOCK_MEDIUM_AND_ABOVE",
41
- "apiEndpoint": "", // alternative api endpoint url,
42
- "tools": [{
43
- "googleSearchRetrieval": {
44
- "disableAttribution": true
45
- }
46
- }]
47
- }]
48
- }
49
- ]`
50
- ```
51
-
52
- ## GenAI
53
-
54
- Or use the Gemini API provider ([generative-ai-js](https://github.com/google-gemini/generative-ai-js#readme)):
55
-
56
- Make sure that you have an API key from Google Cloud Platform. To get an API key, follow the instructions [here](https://ai.google.dev/gemini-api/docs/api-key).
57
-
58
- You can either specify it directly in your `.env.local` using the `GOOGLE_GENAI_API_KEY` variable, or you can set it directly in the endpoint config.
59
-
60
- You can find the list of models available [here](https://ai.google.dev/gemini-api/docs/models/gemini), and experimental models available [here](https://ai.google.dev/gemini-api/docs/models/experimental-models).
61
-
62
- ```ini
63
- MODELS=`[
64
- {
65
- "name": "gemini-1.5-flash",
66
- "displayName": "Gemini Flash 1.5",
67
- "multimodal": true,
68
- "endpoints": [
69
- {
70
- "type": "genai",
71
-
72
- // Optional
73
- "apiKey": "abc...xyz",
74
- "safetyThreshold": "BLOCK_MEDIUM_AND_ABOVE",
75
- }
76
- ]
77
- },
78
- {
79
- "name": "gemini-1.5-pro",
80
- "displayName": "Gemini Pro 1.5",
81
- "multimodal": false,
82
- "endpoints": [
83
- {
84
- "type": "genai",
85
-
86
- // Optional
87
- "apiKey": "abc...xyz"
88
- }
89
- ]
90
- }
91
- ]`
92
- ```
 
docs/source/configuration/models/providers/langserve.md DELETED
@@ -1,22 +0,0 @@
1
- # LangServe
2
-
3
- | Feature | Available |
4
- | --------------------------- | --------- |
5
- | [Tools](../tools) | No |
6
- | [Multimodal](../multimodal) | No |
7
-
8
- LangChain applications that are deployed using LangServe can be called with the following config:
9
-
10
- ```ini
11
- MODELS=`[
12
- {
13
- "name": "summarization-chain",
14
- "displayName": "Summarization Chain",
15
- "endpoints" : [{
16
- "type": "langserve",
17
- "url" : "http://127.0.0.1:8100",
18
- }]
19
- }
20
- ]`
21
-
22
- ```
 
docs/source/configuration/models/providers/llamacpp.md DELETED
@@ -1,49 +0,0 @@
1
- # Llama.cpp
2
-
3
- | Feature | Available |
4
- | --------------------------- | --------- |
5
- | [Tools](../tools) | No |
6
- | [Multimodal](../multimodal) | No |
7
-
8
- Chat UI supports the llama.cpp API server directly without the need for an adapter. You can do this using the `llamacpp` endpoint type.
9
-
10
- If you want to run Chat UI with llama.cpp, you can do the following, using [microsoft/Phi-3-mini-4k-instruct-gguf](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-gguf) as an example model:
11
-
12
- ```bash
13
- # install llama.cpp
14
- brew install llama.cpp
15
- # start llama.cpp server
16
- llama-server --hf-repo microsoft/Phi-3-mini-4k-instruct-gguf --hf-file Phi-3-mini-4k-instruct-q4.gguf -c 4096
17
- ```
18
-
19
- _Note: you can swap the `hf-repo` and `hf-file` with your favorite GGUF on the [Hub](https://huggingface.co/models?library=gguf). For example: `--hf-repo TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF` for [this repo](https://huggingface.co/TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF) and `--hf-file tinyllama-1.1b-chat-v1.0.Q4_0.gguf` for [this file](https://huggingface.co/TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF/blob/main/tinyllama-1.1b-chat-v1.0.Q4_0.gguf)._
20
-
21
- A local LLaMA.cpp HTTP Server will start on `http://localhost:8080` (to change the port or any other default options, please see the [LLaMA.cpp HTTP Server readme](https://github.com/ggml-org/llama.cpp/tree/master/tools/server#readme)).
22
-
23
- Add the following to your `.env.local`:
24
-
25
- ```ini
26
- MODELS=`[
27
- {
28
- "name": "Local microsoft/Phi-3-mini-4k-instruct-gguf",
29
- "tokenizer": "microsoft/Phi-3-mini-4k-instruct-gguf",
30
- "preprompt": "",
31
- "chatPromptTemplate": "<s>{{preprompt}}{{#each messages}}{{#ifUser}}<|user|>\n{{content}}<|end|>\n<|assistant|>\n{{/ifUser}}{{#ifAssistant}}{{content}}<|end|>\n{{/ifAssistant}}{{/each}}",
32
- "parameters": {
33
- "stop": ["<|end|>", "<|endoftext|>", "<|assistant|>"],
34
- "temperature": 0.7,
35
- "max_new_tokens": 1024,
36
- "truncate": 3071
37
- },
38
- "endpoints": [{
39
- "type" : "llamacpp",
40
- "baseURL": "http://localhost:8080"
41
- }],
42
- },
43
- ]`
44
- ```
45
-
46
- <div class="flex justify-center">
47
- <img class="block dark:hidden" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/chat-ui/llamacpp-light.png" height="auto"/>
48
- <img class="hidden dark:block" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/chat-ui/llamacpp-dark.png" height="auto"/>
49
- </div>
 
docs/source/configuration/models/providers/ollama.md DELETED
@@ -1,39 +0,0 @@
1
- # Ollama
2
-
3
- | Feature | Available |
4
- | --------------------------- | --------- |
5
- | [Tools](../tools) | No |
6
- | [Multimodal](../multimodal) | No |
7
-
8
- We also support the Ollama inference server. Spin up a model with
9
-
10
- ```bash
11
- ollama run mistral
12
- ```
13
-
14
- Then specify the endpoints like so:
15
-
16
- ```ini
17
- MODELS=`[
18
- {
19
- "name": "Ollama Mistral",
20
- "chatPromptTemplate": "<s>{{#each messages}}{{#ifUser}}[INST] {{#if @first}}{{#if @root.preprompt}}{{@root.preprompt}}\n{{/if}}{{/if}} {{content}} [/INST]{{/ifUser}}{{#ifAssistant}}{{content}}</s> {{/ifAssistant}}{{/each}}",
21
- "parameters": {
22
- "temperature": 0.1,
23
- "top_p": 0.95,
24
- "repetition_penalty": 1.2,
25
- "top_k": 50,
26
- "truncate": 3072,
27
- "max_new_tokens": 1024,
28
- "stop": ["</s>"]
29
- },
30
- "endpoints": [
31
- {
32
- "type": "ollama",
33
- "url" : "http://127.0.0.1:11434",
34
- "ollamaName" : "mistral"
35
- }
36
- ]
37
- }
38
- ]`
39
- ```
 
docs/source/configuration/models/providers/openai.md DELETED
@@ -1,181 +0,0 @@
1
- # OpenAI
2
-
3
- | Feature | Available |
4
- | --------------------------- | --------- |
5
- | [Tools](../tools) | No |
6
- | [Multimodal](../multimodal) | Yes |
7
-
8
- Chat UI can be used with any API server that supports OpenAI API compatibility, for example [text-generation-webui](https://github.com/oobabooga/text-generation-webui/tree/main/extensions/openai), [LocalAI](https://github.com/go-skynet/LocalAI), [FastChat](https://github.com/lm-sys/FastChat/blob/main/docs/openai_api.md), [llama-cpp-python](https://github.com/abetlen/llama-cpp-python), [ialacol](https://github.com/chenhunghan/ialacol), and [vllm](https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html).
9
-
10
- The following example config makes Chat UI work with [text-generation-webui](https://github.com/oobabooga/text-generation-webui/tree/main/extensions/openai). The `endpoint.baseURL` is the URL of the OpenAI API compatible server; it overrides the base URL used by the OpenAI instance. The `endpoint.completion` setting determines which endpoint is used: the default is `chat_completions`, which uses `/chat/completions`; change `endpoint.completion` to `completions` to use the `/completions` endpoint.
11
-
12
- ```ini
13
- MODELS=`[
14
- {
15
- "name": "text-generation-webui",
16
- "id": "text-generation-webui",
17
- "parameters": {
18
- "temperature": 0.9,
19
- "top_p": 0.95,
20
- "repetition_penalty": 1.2,
21
- "top_k": 50,
22
- "truncate": 1000,
23
- "max_new_tokens": 1024,
24
- "stop": []
25
- },
26
- "endpoints": [{
27
- "type" : "openai",
28
- "baseURL": "http://localhost:8000/v1"
29
- }]
30
- }
31
- ]`
32
-
33
- ```
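The mapping between the `completion` setting and the request path described above can be sketched as follows. This is only an illustration of the documented behavior, not Chat UI's code (which delegates requests to the OpenAI client); the function name `completion_url` is hypothetical.

```python
# Documented values of `endpoint.completion` and the paths they select.
COMPLETION_PATHS = {
    "chat_completions": "/chat/completions",  # default
    "completions": "/completions",
}


def completion_url(base_url: str, completion: str = "chat_completions") -> str:
    """Join the configured baseURL with the path selected by `completion`."""
    if completion not in COMPLETION_PATHS:
        raise ValueError(f"unsupported completion type: {completion!r}")
    return base_url.rstrip("/") + COMPLETION_PATHS[completion]


print(completion_url("http://localhost:8000/v1"))
print(completion_url("http://localhost:8000/v1", "completions"))
```

A quick way to check a self-hosted server is to request the printed URL directly and confirm that it answers before pointing Chat UI at the same `baseURL`.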
34
-
35
- The `openai` type includes official OpenAI models. You can add, for example, GPT4/GPT3.5 as an "openai" model:
36
-
37
- ```ini
38
- OPENAI_API_KEY=#your openai api key here
39
- MODELS=`[{
40
- "name": "gpt-4",
41
- "displayName": "GPT 4",
42
- "endpoints" : [{
43
- "type": "openai",
44
- "apiKey": "or your openai api key here"
45
- }]
46
- },{
47
- "name": "gpt-3.5-turbo",
48
- "displayName": "GPT 3.5 Turbo",
49
- "endpoints" : [{
50
- "type": "openai",
51
- "apiKey": "or your openai api key here"
52
- }]
53
- }]`
54
- ```
55
-
56
- We also support models in the `o1` family. You need to add a few more options to the config. Here is an example for `o1-mini`:
57
-
58
- ```ini
59
- MODELS=`[
60
- {
61
- "name": "o1-mini",
62
- "description": "ChatGPT o1-mini",
63
- "systemRoleSupported": false,
64
- "parameters": {
65
- "max_new_tokens": 2048,
66
- },
67
- "endpoints" : [{
68
- "type": "openai",
69
- "useCompletionTokens": true,
70
- }]
71
- }
72
- ]`
73
- ```
74
-
75
- You may also consume any model provider that offers an OpenAI-compatible API endpoint. For example, you may self-host the [Portkey](https://github.com/Portkey-AI/gateway) gateway and experiment with Claude or GPTs offered by Azure OpenAI. Example for Claude from Anthropic:
76
-
77
- ```ini
78
- MODELS=`[{
79
- "name": "claude-2.1",
80
- "displayName": "Claude 2.1",
81
- "description": "Anthropic has been founded by former OpenAI researchers...",
82
- "parameters": {
83
- "temperature": 0.5,
84
- "max_new_tokens": 4096,
85
- },
86
- "endpoints": [
87
- {
88
- "type": "openai",
89
- "baseURL": "https://gateway.example.com/v1",
90
- "defaultHeaders": {
91
- "x-portkey-config": '{"provider":"anthropic","api_key":"sk-ant-abc...xyz"}'
92
- }
93
- }
94
- ]
95
- }]`
96
- ```
97
-
98
- Example for GPT 4 deployed on Azure OpenAI:
99
-
100
- ```ini
101
- MODELS=`[{
102
- "id": "gpt-4-1106-preview",
103
- "name": "gpt-4-1106-preview",
104
- "displayName": "gpt-4-1106-preview",
105
- "parameters": {
106
- "temperature": 0.5,
107
- "max_new_tokens": 4096,
108
- },
109
- "endpoints": [
110
- {
111
- "type": "openai",
112
- "baseURL": "https://{resource-name}.openai.azure.com/openai/deployments/{deployment-id}",
113
- "defaultHeaders": {
114
- "api-key": "{api-key}"
115
- },
116
- "defaultQuery": {
117
- "api-version": "2023-05-15"
118
- }
119
- }
120
- ]
121
- }]`
122
- ```
123
-
124
- ## DeepInfra
125
-
126
- Or try Mistral from [Deepinfra](https://deepinfra.com/mistralai/Mistral-7B-Instruct-v0.1/api?example=openai-http):
127
-
128
- > Note: `apiKey` can be set per endpoint, or globally using the `OPENAI_API_KEY` variable.
129
-
130
- ```ini
131
- MODELS=`[{
132
- "name": "mistral-7b",
133
- "displayName": "Mistral 7B",
134
- "description": "A 7B dense Transformer, fast-deployed and easily customisable. Small, yet powerful for a variety of use cases. Supports English and code, and a 8k context window.",
135
- "parameters": {
136
- "temperature": 0.5,
137
- "max_new_tokens": 4096,
138
- },
139
- "endpoints": [
140
- {
141
- "type": "openai",
142
- "baseURL": "https://api.deepinfra.com/v1/openai",
143
- "apiKey": "abc...xyz"
144
- }
145
- ]
146
- }]`
147
- ```
148
-
149
- _Non-streaming endpoints_
150
-
151
- For endpoints that don't support streaming, like o1 on Azure, you can pass `streamingSupported: false` in your endpoint config:
152
-
153
- ```
154
- MODELS=`[{
155
- "id": "o1-preview",
156
- "name": "o1-preview",
157
- "displayName": "o1-preview",
158
- "systemRoleSupported": false,
159
- "endpoints": [
160
- {
161
- "type": "openai",
162
- "baseURL": "https://my-deployment.openai.azure.com/openai/deployments/o1-preview",
163
- "defaultHeaders": {
164
- "api-key": "$SECRET"
165
- },
166
- "streamingSupported": false,
167
- }
168
- ]
169
- }]`
170
- ```
171
-
172
- ## Other
173
-
174
- Some other providers and their `baseURL` for reference.
175
-
176
- [Groq](https://groq.com/): https://api.groq.com/openai/v1
177
- [Fireworks](https://fireworks.ai/): https://api.fireworks.ai/inference/v1
178
-
179
- ```
180
-
181
- ```
 
docs/source/configuration/models/providers/tgi.md DELETED
@@ -1,66 +0,0 @@
1
- # Text Generation Inference (TGI)
2
-
3
- | Feature | Available |
4
- | --------------------------- | --------- |
5
- | [Tools](../tools) | Yes\* |
6
- | [Multimodal](../multimodal) | Yes\* |
7
-
8
- \* Tools are only supported with the Cohere Command R+ model with the Xenova tokenizers. Please see the [Tools](../tools) section.
9
-
10
- \* Multimodal is only supported with the IDEFICS model. Please see the [Multimodal](../multimodal) section.
11
-
12
- By default, if `endpoints` are left unspecified, Chat UI will look for the model on the hosted Hugging Face inference API using the model name, and use your `HF_TOKEN`. Refer to the [overview](../overview) for more information about model configuration.
13
-
14
- ```ini
15
- MODELS=`[
16
- {
17
- "name": "mistralai/Mistral-7B-Instruct-v0.2",
18
- "displayName": "mistralai/Mistral-7B-Instruct-v0.2",
19
- "description": "Mistral 7B is a new Apache 2.0 model, released by Mistral AI that outperforms Llama2 13B in benchmarks.",
20
- "websiteUrl": "https://mistral.ai/news/announcing-mistral-7b/",
21
- "preprompt": "",
22
- "chatPromptTemplate" : "<s>{{#each messages}}{{#ifUser}}[INST] {{#if @first}}{{#if @root.preprompt}}{{@root.preprompt}}\n{{/if}}{{/if}}{{content}} [/INST]{{/ifUser}}{{#ifAssistant}}{{content}}</s>{{/ifAssistant}}{{/each}}",
23
- "parameters": {
24
- "temperature": 0.3,
25
- "top_p": 0.95,
26
- "repetition_penalty": 1.2,
27
- "top_k": 50,
28
- "truncate": 3072,
29
- "max_new_tokens": 1024,
30
- "stop": ["</s>"]
31
- },
32
- "promptExamples": [
33
- {
34
- "title": "Write an email",
35
- "prompt": "As a restaurant owner, write a professional email to the supplier to get these products every week: \n\n- Wine (x10)\n- Eggs (x24)\n- Bread (x12)"
36
- }, {
37
- "title": "Code a game",
38
- "prompt": "Code a basic snake game in python, give explanations for each step."
39
- }, {
40
- "title": "Recipe help",
41
- "prompt": "How do I make a delicious lemon cheesecake?"
42
- }
43
- ]
44
- }
45
- ]`
46
- ```
47
-
48
- ## Running your own models using a custom endpoint
49
-
50
- If you want to, instead of hitting models on the Hugging Face Inference API, you can run your own models locally.
51
-
52
- A good option is to hit a [text-generation-inference](https://github.com/huggingface/text-generation-inference) endpoint. This is what is done in the official [Chat UI Spaces Docker template](https://huggingface.co/new-space?template=huggingchat/chat-ui-template) for instance: both this app and a text-generation-inference server run inside the same container.
53
-
54
- To do this, you can add your own endpoints to the `MODELS` variable in `.env.local`, by adding an `"endpoints"` key for each model in `MODELS`.
55
-
56
- ```ini
57
- MODELS=`[{
58
- "name": "your-model-name",
59
- "displayName": "Your Model Name",
60
- ... other model config
61
- "endpoints": [{
62
- "type" : "tgi",
63
- "url": "https://HOST:PORT",
64
- }]
65
- }]`
66
- ```
 
docs/source/configuration/models/tools.md DELETED
@@ -1,62 +0,0 @@
1
- # Tools
2
-
3
- Tool calling instructs the model to generate an output matching a user-defined schema, which may be parsed for invoking external tools. The model simply chooses the tools and their parameters. Currently, only `TGI` and `Cohere` with `Command R+` are supported.
4
-
5
- <div class="flex justify-center">
6
- <img class="block dark:hidden" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/chat-ui/tools-light.png" height="auto"/>
7
- <img class="hidden dark:block" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/chat-ui/tools-dark.png" height="auto"/>
8
- </div>
9
-
10
- ## TGI Configuration
11
-
12
- A custom tokenizer is required for prompting the model for generating tool calls, as well as prompting with the results. The expected format for these tools and the resulting tool calls are hard coded for TGI, so it's likely that only the following configuration will work:
13
-
14
- ```ini
15
- MODELS=`[
16
- {
17
- "name" : "CohereForAI/c4ai-command-r-plus",
18
- "displayName": "Command R+",
19
- "description": "Command R+ is Cohere's latest LLM and is the first open weight model to beat GPT4 in the Chatbot Arena!",
20
- "tools": true,
21
- "tokenizer": "Xenova/c4ai-command-r-v01-tokenizer",
22
- "modelUrl": "https://huggingface.co/CohereForAI/c4ai-command-r-plus",
23
- "websiteUrl": "https://docs.cohere.com/docs/command-r-plus",
24
- "logoUrl": "https://huggingface.co/datasets/huggingchat/models-logo/resolve/main/cohere-logo.png",
25
- "parameters": {
26
- "stop": ["<|END_OF_TURN_TOKEN|>"],
27
- "truncate" : 28672,
28
- "max_new_tokens" : 4096,
29
- "temperature" : 0.3
30
- }
31
- }
32
- ]`
33
- ```
34
-
35
- ## Cohere Configuration
36
-
37
- The Cohere provider supports the endpoint native method of tool calling. Refer to the `endpoints/cohere` for implementation details.
38
-
39
- ```ini
40
- MODELS=`[
41
- {
42
- "name": "command-r-plus",
43
- "displayName": "Command R+",
44
- "description": "Command R+ is Cohere's latest LLM and is the first open weight model to beat GPT4 in the Chatbot Arena!",
45
- "tools": true,
46
- "websiteUrl": "https://docs.cohere.com/docs/command-r-plus",
47
- "logoUrl": "https://huggingface.co/datasets/huggingchat/models-logo/resolve/main/cohere-logo.png",
48
- "endpoints": [{
49
- "type": "cohere",
50
- "apiKey": "YOUR_API_KEY"
51
- }]
52
- }
53
- ]`
54
- ```
55
-
56
- ## Adding Tools
57
-
58
- Tool implementations are placed in `src/lib/server/tools`, with helpers available for easy integration with HuggingFace Zero GPU spaces. In the future, there may be an OpenAPI interface for adding tools.
59
-
60
- ## Adding Support for Additional Models
61
-
62
- The TGI implementation uses a custom tokenizer and hard coded schema for supporting tools. The Cohere implementation, on the other hand, uses the native support in the SDK to emit tool calls. This is the recommended way to add support for more models. Please see the `endpoints/cohere` section of the code for implementation details.
 
docs/source/configuration/open-id.md DELETED
@@ -1,16 +0,0 @@
1
- # OpenID
2
-
3
- The login feature is disabled by default and users are attributed a unique ID based on their browser. But if you want to use OpenID to authenticate your users, you can add the following to your `.env.local` file:
4
-
5
- ```ini
6
- OPENID_CONFIG=`{
7
- PROVIDER_URL: "<your OIDC issuer>",
8
- CLIENT_ID: "<your OIDC client ID>",
9
- CLIENT_SECRET: "<your OIDC client secret>",
10
- SCOPES: "openid profile",
11
- TOLERANCE: // optional
12
- RESOURCE: // optional
13
- }`
14
- ```
15
-
16
- Redirect URI: `/login/callback`
 
docs/source/configuration/overview.md DELETED
@@ -1,10 +0,0 @@
1
- # Configuration Overview
2
-
3
- Chat UI handles configuration with environment variables. The default config for Chat UI is stored in the `.env` file, which you may use as a reference. You will need to override some values to get Chat UI to run locally. This can be done in `.env.local` or via your environment. The bare minimum configuration to get Chat UI running is:
4
-
5
- ```ini
6
- MONGODB_URL=mongodb://localhost:27017
7
- HF_TOKEN=your_token
8
- ```
9
-
10
- The following sections detail various sections of the app you may want to configure.
 
docs/source/configuration/theming.md DELETED
@@ -1,18 +0,0 @@
1
- # Theming
2
-
3
- You can use a few environment variables to customize the look and feel of Chat UI. These are by default:
4
-
5
- ```ini
6
- PUBLIC_APP_NAME=ChatUI
7
- PUBLIC_APP_ASSETS=chatui
8
- PUBLIC_APP_COLOR=blue
9
- PUBLIC_APP_DESCRIPTION="Making the community's best AI chat models available to everyone."
10
- PUBLIC_APP_DATA_SHARING=
11
- PUBLIC_APP_DISCLAIMER=
12
- ```
13
-
14
- - `PUBLIC_APP_NAME` The name used as a title throughout the app.
15
- - `PUBLIC_APP_ASSETS` Is used to find logos & favicons in `static/$PUBLIC_APP_ASSETS`, current options are `chatui` and `huggingchat`.
16
- - `PUBLIC_APP_COLOR` Can be any of the [tailwind colors](https://tailwindcss.com/docs/customizing-colors#default-color-palette).
17
- - `PUBLIC_APP_DATA_SHARING` Can be set to 1 to add a toggle in the user settings that lets your users opt-in to data sharing with models creator.
18
- - `PUBLIC_APP_DISCLAIMER` If set to 1, we show a disclaimer about generated outputs on login.
 
docs/source/configuration/web-search.md DELETED
@@ -1,58 +0,0 @@
1
- # Web Search
2
-
3
- Chat UI features a powerful Web Search feature. A high level overview of how it works:
4
-
5
- 1. Generate an appropriate search query from the user prompt using the `TASK_MODEL`
6
- 2. Perform web search via an external provider (e.g. Serper) or by locally scraping Google results
7
- 3. Load each search result into playwright and scrape
8
- 4. Convert scraped HTML to Markdown tree with headings as parents
9
- 5. Create embeddings for each Markdown element
10
- 6. Find the embeddings closest to the user query using a vector similarity search (inner product)
11
- 7. Get the corresponding Markdown elements and their parent, up to 8000 characters
12
- 8. Supply the information as context to the model
13
-
14
- <div class="flex justify-center">
15
- <img class="block dark:hidden" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/chat-ui/websearch-light.png" height="auto"/>
16
- <img class="hidden dark:block" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/chat-ui/websearch-dark.png" height="auto"/>
17
- </div>
18
-
19
- ## Providers
20
-
21
- Many providers are supported for the web search, or you can use locally scraped Google results.
22
-
23
- ### Local
24
-
25
- For locally scraped Google results, put `USE_LOCAL_WEBSEARCH=true` in your `.env.local`. Please note that you may hit rate limits as we make no attempt to make the traffic look legitimate. To avoid this, you may choose a provider, such as Serper, used on the official instance.
26
-
27
- ### SearXNG
28
-
29
- > SearXNG is a free internet metasearch engine which aggregates results from various search services and databases. Users are neither tracked nor profiled.
30
-
31
- You may enable support via the `SEARXNG_QUERY_URL` where `<query>` will be replaced with the query keywords. Please see [the official documentation](https://docs.searxng.org/dev/search_api.html) for more information
32
-
33
- Example: `https://searxng.yourdomain.com/search?q=<query>&engines=duckduckgo,google&format=json`
34
-
35
- ### Third Party
36
-
37
- Many third party providers are supported as well. The official instance uses Serper.
38
-
39
- ```ini
40
- YDC_API_KEY=docs.you.com api key here
41
- SERPER_API_KEY=serper.dev api key here
42
- SERPAPI_KEY=serpapi key here
43
- SERPSTACK_API_KEY=serpstack api key here
44
- SEARCHAPI_KEY=searchapi api key here
45
- ```
46
-
47
- ## Block/Allow List
48
-
49
- You may block or allow specific websites from the web search results. When using an allow list, only the links in the allowlist will be used. For supported search engines, the links will be blocked from the results directly. Any URL in the results that **partially or fully matches** the entry will be filtered out.
50
-
51
- ```ini
52
- WEBSEARCH_BLOCKLIST=`["youtube.com", "https://example.com/foo/bar"]`
53
- WEBSEARCH_ALLOWLIST=`["stackoverflow.com"]`
54
- ```
55
-
56
- ## Disabling Javascript
57
-
58
- By default, Playwright will execute all JavaScript on the page. On some webpages this can be intensive, requiring up to 6 cores for full performance. You may block scripts from running by setting `WEBSEARCH_JAVASCRIPT=false`. However, this will not block JavaScript inlined in the HTML.
 
docs/source/developing/architecture.md DELETED
@@ -1,35 +0,0 @@
1
- # Architecture
2
-
3
- This document discusses the high level overview of the Chat UI codebase. If you're looking to contribute or just want to understand how the codebase works, this is the place for you!
4
-
5
- ## Overview
6
-
7
- Chat UI provides a simple interface connecting LLMs to external information and tools. The project uses [MongoDB](https://www.mongodb.com/) and [SvelteKit](https://kit.svelte.dev/) with [Tailwind](https://tailwindcss.com/).
8
-
9
- ## Code Map
10
-
11
- This section discusses various modules of the codebase briefly. The headings are not paths since the codebase structure may change.
12
-
13
- ### `routes`
14
-
15
- Provides all of the routes rendered with SSR via SvelteKit. The majority of backend and frontend logic can be found here, with some modules being pulled out into `lib` for the client and `lib/server` for the server.
16
-
17
- ### `textGeneration`
18
-
19
- Provides a standard interface for most chat features such as model output, web search, assistants and tools. Outputs `MessageUpdate`s which provide fine-grained updates on the request status such as new tokens and web search results.
20
-
21
- ### `endpoints`/`embeddingEndpoints`
22
-
23
- Provides a common streaming interface for many third party LLM and embedding providers.
24
-
25
- ### `websearch`
26
-
27
- Implements web search querying and RAG. See the [Web Search](../configuration/web-search) section for more information.
28
-
29
- ### `tools`
30
-
31
- Provides a common interface for external tools called by LLMs. See the [Tools](../configuration/models/tools.md) section for more information
32
-
33
- ### `migrations`
34
-
35
- Includes all MongoDB migrations for maintaining backwards compatibility across schema changes. Any changes to the schema must include a migration
 
docs/source/developing/copy-huggingchat.md DELETED
@@ -1,71 +0,0 @@
1
- # Copy HuggingChat
2
-
3
- The config file for HuggingChat is stored in the `chart/env/prod.yaml` file. It is the source of truth for the environment variables used for our CI/CD pipeline. For HuggingChat, as we need to customize the app color, as well as the base path, we build a custom docker image. You can find the workflow here.
4
-
5
- <Tip>
6
-
7
- If you want to make changes to the model config used in production for HuggingChat, you should do so against `chart/env/prod.yaml`.
8
-
9
- </Tip>
10
-
11
- ### Running a copy of HuggingChat locally
12
-
13
- If you want to run an exact copy of HuggingChat locally, you will need to do the following first:
14
-
15
- 1. Create an [OAuth App on the hub](https://huggingface.co/settings/applications/new) with `openid profile email` permissions. Make sure to set the callback URL to something like `http://localhost:5173/chat/login/callback` which matches the right path for your local instance.
16
- 2. Create a [HF Token](https://huggingface.co/settings/tokens) with your Hugging Face account. You will need a Pro account to be able to access some of the larger models available through HuggingChat.
17
- 3. Create a free account with [serper.dev](https://serper.dev/) (you will get 2500 free search queries)
18
- 4. Run an instance of MongoDB, however you want. (Local or remote)
19
-
20
- You can then create a new `.env.SECRET_CONFIG` file with the following content
21
-
22
- ```ini
23
- MONGODB_URL=<link to your mongo DB from step 4>
24
- HF_TOKEN=<your HF token from step 2>
25
- OPENID_CONFIG=`{
26
- PROVIDER_URL: "https://huggingface.co",
27
- CLIENT_ID: "<your client ID from step 1>",
28
- CLIENT_SECRET: "<your client secret from step 1>",
29
- }`
30
- SERPER_API_KEY=<your serper API key from step 3>
31
- MESSAGES_BEFORE_LOGIN=<can be any numerical value, or set to 0 to require login>
32
- ```
33
-
34
- You can then run `npm run updateLocalEnv` in the root of chat-ui. This will create a `.env.local` file which combines the `chart/env/prod.yaml` and the `.env.SECRET_CONFIG` file. You can then run `npm run dev` to start your local instance of HuggingChat.
35
-
36
- ### Populate database
37
-
38
- <Tip warning={true}>
39
-
40
- The `MONGODB_URL` used for this script will be fetched from `.env.local`. Make sure it's correct! The command runs directly on the database.
41
-
42
- </Tip>
43
-
44
- You can populate the database with fake data using the `populate` script:
45
-
46
- ```bash
47
- npm run populate <flags here>
48
- ```
49
-
50
- At least one flag must be specified, the following flags are available:
51
-
52
- - `reset` - resets the database
53
- - `all` - populates all tables
54
- - `users` - populates the users table
55
- - `settings` - populates the settings table for existing users
56
- - `assistants` - populates the assistants table for existing users
57
- - `conversations` - populates the conversations table for existing users
58
-
59
- For example, you could use it like so:
60
-
61
- ```bash
62
- npm run populate reset
63
- ```
64
-
65
- to clear out the database. Then log in to the app to create your user and run the following command:
66
-
67
- ```bash
68
- npm run populate users settings assistants conversations
69
- ```
70
-
71
- to populate the database with fake data, including fake conversations and assistants for your user.
 
docs/source/index.md DELETED
@@ -1,97 +0,0 @@
1
- # 🤗 Chat UI
2
-
3
- Open source chat interface with support for tools, web search, multimodal and many API providers. The app uses MongoDB and SvelteKit behind the scenes. Try the live version of the app called [HuggingChat on hf.co/chat](https://huggingface.co/chat) or [setup your own instance](./installation/spaces).
4
-
5
- 🔧 **[Tools](./configuration/models/tools)**: Function calling with custom tools and support for [Zero GPU spaces](https://huggingface.co/spaces/enzostvs/zero-gpu-spaces)
6
-
7
- 🔍 **[Web Search](./configuration/web-search)**: Automated web search, scraping and RAG for all models
8
-
9
- 🐙 **[Multimodal](./configuration/models/multimodal)**: Accepts image file uploads on supported providers
10
-
11
- 👤 **[OpenID](./configuration/open-id)**: Optionally setup OpenID for user authentication
12
-
13
- <div class="flex gap-x-4">
14
-
15
- <div>
16
- Tools
17
- <div class="flex justify-center">
18
- <img class="block dark:hidden" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/chat-ui/tools-light.png" height="auto"/>
19
- <img class="hidden dark:block" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/chat-ui/tools-dark.png" height="auto"/>
20
- </div>
21
- </div>
22
-
23
- <div>
24
- Web Search
25
- <div class="flex justify-center">
26
- <img class="block dark:hidden" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/chat-ui/websearch-light.png" height="auto"/>
27
- <img class="hidden dark:block" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/chat-ui/websearch-dark.png" height="auto"/>
28
- </div>
29
- </div>
30
-
31
- </div>
32
-
33
- ## Quickstart
34
-
35
- You can quickly have a locally running chat-ui & LLM text-generation server thanks to chat-ui's [llama.cpp server support](https://huggingface.co/docs/chat-ui/configuration/models/providers/llamacpp).
36
-
37
- **Step 1 (Start llama.cpp server):**
38
-
39
- ```bash
40
- # install llama.cpp
41
- brew install llama.cpp
42
- # start llama.cpp server (using hf.co/microsoft/Phi-3-mini-4k-instruct-gguf as an example)
43
- llama-server --hf-repo microsoft/Phi-3-mini-4k-instruct-gguf --hf-file Phi-3-mini-4k-instruct-q4.gguf -c 4096
44
- ```
45
-
46
- A local LLaMA.cpp HTTP Server will start on `http://localhost:8080`. Read more [here](https://huggingface.co/docs/chat-ui/configuration/models/providers/llamacpp).
47
-
48
- **Step 2 (tell chat-ui to use local llama.cpp server):**
49
-
50
- Add the following to your `.env.local`:
51
-
52
- ```ini
53
- MODELS=`[
54
- {
55
- "name": "Local microsoft/Phi-3-mini-4k-instruct-gguf",
56
- "tokenizer": "microsoft/Phi-3-mini-4k-instruct-gguf",
57
- "preprompt": "",
58
- "chatPromptTemplate": "<s>{{preprompt}}{{#each messages}}{{#ifUser}}<|user|>\n{{content}}<|end|>\n<|assistant|>\n{{/ifUser}}{{#ifAssistant}}{{content}}<|end|>\n{{/ifAssistant}}{{/each}}",
59
- "parameters": {
60
- "stop": ["<|end|>", "<|endoftext|>", "<|assistant|>"],
61
- "temperature": 0.7,
62
- "max_new_tokens": 1024,
63
- "truncate": 3071
64
- },
65
- "endpoints": [{
66
- "type" : "llamacpp",
67
- "baseURL": "http://localhost:8080"
68
- }],
69
- },
70
- ]`
71
- ```
72
-
73
- Read more [here](https://huggingface.co/docs/chat-ui/configuration/models/providers/llamacpp).
74
-
75
- **Step 3 (make sure you have MongoDb running locally):**
76
-
77
- ```bash
78
- docker run -d -p 27017:27017 --name mongo-chatui mongo:latest
79
- ```
80
-
81
- Read more [here](https://github.com/huggingface/chat-ui?tab=Readme-ov-file#database).
82
-
83
- **Step 4 (start chat-ui):**
84
-
85
- ```bash
86
- git clone https://github.com/huggingface/chat-ui
87
- cd chat-ui
88
- npm install
89
- npm run dev -- --open
90
- ```
91
-
92
- Read more [here](https://github.com/huggingface/chat-ui?tab=readme-ov-file#launch).
93
-
94
- <div class="flex justify-center">
95
- <img class="block dark:hidden" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/chat-ui/llamacpp-light.png" height="auto"/>
96
- <img class="hidden dark:block" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/chat-ui/llamacpp-dark.png" height="auto"/>
97
- </div>
 
docs/source/installation/docker.md DELETED
@@ -1,11 +0,0 @@
1
- # Running on Docker
2
-
3
- Pre-built docker images are provided with and without MongoDB built in. Refer to the [configuration section](../configuration/overview) for env variables that must be provided. We recommend using the `--env-file` option to avoid leaking secrets into your shell history.
4
-
5
- ```bash
6
- # Without built-in DB
7
- docker run -p 3000:3000 --env-file .env.local --name chat-ui ghcr.io/huggingface/chat-ui
8
-
9
- # With built-in DB
10
- docker run -p 3000:3000 --env-file .env.local -v chat-ui:/data --name chat-ui ghcr.io/huggingface/chat-ui-db
11
- ```
 
docs/source/installation/helm.md DELETED
@@ -1,35 +0,0 @@
1
- # Helm
2
-
3
- <Tip warning={true}>
4
-
5
- **We highly discourage using the chart**. The Helm chart is a work in progress and should be considered unstable. Breaking changes to the chart may be pushed without migration guides or notice. Contributions welcome!
6
-
7
- </Tip>
8
-
9
- For installation on Kubernetes, you may use the helm chart in `/chart`. Please note that no chart repository has been setup, so you'll need to clone the repository and install the chart by path. The production values may be found at `chart/env/prod.yaml`.
10
-
11
- **Example values.yaml**
12
-
13
- ```yaml
14
- replicas: 1
15
-
16
- domain: example.com
17
-
18
- service:
19
- type: ClusterIP
20
-
21
- resources:
22
- requests:
23
- cpu: 100m
24
- memory: 2Gi
25
- limits:
26
- # Recommended to use large limits when web search is enabled
27
- cpu: "4"
28
- memory: 6Gi
29
-
30
- envVars:
31
- MONGODB_URL: mongodb://chat-ui-mongo:27017
32
- # Ensure that your values.yaml will not leak anywhere
33
- # PRs welcome for a chart rework with envFrom support!
34
- HF_TOKEN: secret_token
35
- ```
 
docs/source/installation/local.md DELETED
@@ -1,34 +0,0 @@
1
- # Running Locally
2
-
3
- You may start an instance locally for non-production use cases. For production use cases, please see the other installation options.
4
-
5
- ## Configuration
6
-
7
- The default config for Chat UI is stored in the `.env` file. You will need to override some values to get Chat UI to run locally. Start by creating a `.env.local` file in the root of the repository as per the [configuration section](../configuration/overview). The bare minimum config you need to get Chat UI to run locally is the following:
8
-
9
- ```ini
10
- MONGODB_URL=<the URL to your MongoDB instance>
11
- HF_TOKEN=<your access token> # find your token at hf.co/settings/token
12
- ```
13
-
14
- ## Database
15
-
16
- The chat history is stored in a MongoDB instance, and having a DB instance available is needed for Chat UI to work.
17
-
18
- You can use a local MongoDB instance. The easiest way is to spin one up using docker with persistence:
19
-
20
- ```bash
21
- docker run -d -p 27017:27017 -v mongo-chat-ui:/data --name mongo-chat-ui mongo:latest
22
- ```
23
-
24
- In which case the url of your DB will be `MONGODB_URL=mongodb://localhost:27017`.
25
-
26
- Alternatively, you can use a [free MongoDB Atlas](https://www.mongodb.com/pricing) instance for this, Chat UI should fit comfortably within their free tier. After which you can set the `MONGODB_URL` variable in `.env.local` to match your instance.
27
-
28
- ## Starting the server
29
-
30
- ```bash
31
- npm ci # install dependencies
32
- npm run build # build the project
33
- npm run preview -- --open # start the server with & open your instance at http://localhost:4173
34
- ```
 
docs/source/installation/spaces.md DELETED
@@ -1,9 +0,0 @@
1
- # Running on Huggingface Spaces
2
-
3
- If you don't want to configure, setup, and launch your own Chat UI yourself, you can use this option as a fast deploy alternative.
4
-
5
- You can deploy your own customized Chat UI instance with any supported [LLM](https://huggingface.co/models?pipeline_tag=text-generation) of your choice on [Hugging Face Spaces](https://huggingface.co/spaces). To do so, use the chat-ui template [available here](https://huggingface.co/new-space?template=huggingchat/chat-ui-template).
6
-
7
- Set `HF_TOKEN` in [Space secrets](https://huggingface.co/docs/hub/spaces-overview#managing-secrets-and-environment-variables) to deploy a model with gated access or a model in a private repository. It's also compatible with [Inference for PROs](https://huggingface.co/blog/inference-pro) curated list of powerful models with higher rate limits. Make sure to create your personal token first in your [User Access Tokens settings](https://huggingface.co/settings/tokens).
8
-
9
- Read the full tutorial [here](https://huggingface.co/docs/hub/spaces-sdks-docker-chatui#chatui-on-spaces).
 
package-lock.json CHANGED
The diff for this file is too large to render. See raw diff
 
package.json CHANGED
@@ -1,6 +1,6 @@
1
  {
2
  "name": "chat-ui",
3
- "version": "0.10.0",
4
  "private": true,
5
  "packageManager": "npm@9.5.0",
6
  "scripts": {
@@ -18,9 +18,7 @@
18
  "prepare": "husky"
19
  },
20
  "devDependencies": {
21
- "@elysiajs/cors": "^1.3.3",
22
  "@elysiajs/eden": "^1.3.2",
23
- "@elysiajs/node": "^1.2.6",
24
  "@faker-js/faker": "^8.4.1",
25
  "@iconify-json/carbon": "^1.1.16",
26
  "@iconify-json/eos-icons": "^1.1.6",
@@ -29,17 +27,12 @@
29
  "@sveltejs/vite-plugin-svelte": "^5.0.3",
30
  "@tailwindcss/typography": "^0.5.9",
31
  "@types/dompurify": "^3.0.5",
32
- "@types/express": "^4.17.21",
33
- "@types/fs-extra": "^11.0.4",
34
  "@types/js-yaml": "^4.0.9",
35
- "@types/jsdom": "^21.1.1",
36
- "@types/jsonpath": "^0.2.4",
37
  "@types/katex": "^0.16.7",
38
  "@types/mime-types": "^2.1.4",
39
  "@types/minimist": "^1.2.5",
40
  "@types/node": "^22.1.0",
41
  "@types/parquetjs": "^0.10.3",
42
- "@types/sbd": "^1.0.5",
43
  "@types/uuid": "^9.0.8",
44
  "@types/yazl": "^3.3.0",
45
  "@typescript-eslint/eslint-plugin": "^6.x",
@@ -50,23 +43,19 @@
50
  "eslint": "^8.28.0",
51
  "eslint-config-prettier": "^8.5.0",
52
  "eslint-plugin-svelte": "^2.45.1",
53
- "fs-extra": "^11.3.0",
54
  "isomorphic-dompurify": "^2.13.0",
55
  "js-yaml": "^4.1.0",
56
- "jsonrepair": "^3.12.0",
57
  "minimist": "^1.2.8",
58
  "mongodb-memory-server": "^10.1.2",
59
- "node-llama-cpp": "^3.6.0",
60
  "prettier": "^3.5.3",
61
  "prettier-plugin-svelte": "^3.2.6",
62
  "prettier-plugin-tailwindcss": "^0.6.11",
63
- "prom-client": "^15.1.2",
64
  "sade": "^1.8.1",
65
  "superjson": "^2.2.2",
66
  "svelte": "^5.33.3",
67
  "svelte-check": "^4.0.0",
68
  "svelte-gestures": "^5.1.3",
69
- "ts-node": "^10.9.1",
70
  "tslib": "^2.4.1",
71
  "typescript": "^5.5.0",
72
  "unplugin-icons": "^0.16.1",
@@ -77,52 +66,34 @@
77
  },
78
  "type": "module",
79
  "dependencies": {
80
- "@aws-sdk/credential-providers": "^3.592.0",
81
- "@cliqz/adblocker-playwright": "^1.34.0",
82
  "@elysiajs/swagger": "^1.3.0",
83
  "@gradio/client": "^1.8.0",
84
  "@huggingface/hub": "^2.2.0",
85
  "@huggingface/inference": "^3.12.1",
86
- "@huggingface/mcp-client": "^0.1.1",
87
- "@huggingface/tasks": "^0.19.1",
88
- "@huggingface/transformers": "^3.1.1",
89
  "@iconify-json/bi": "^1.1.21",
90
- "@playwright/browser-chromium": "^1.52.0",
91
  "@resvg/resvg-js": "^2.6.2",
92
  "autoprefixer": "^10.4.14",
93
- "aws-sigv4-fetch": "^4.0.1",
94
- "aws4": "^1.13.0",
95
  "date-fns": "^2.29.3",
96
  "dotenv": "^16.5.0",
97
- "express": "^4.21.2",
98
  "file-type": "^21.0.0",
99
- "google-auth-library": "^9.13.0",
100
  "handlebars": "^4.7.8",
101
  "highlight.js": "^11.7.0",
102
  "husky": "^9.0.11",
103
- "image-size": "^1.2.1",
104
  "ip-address": "^9.0.5",
105
- "jose": "^5.3.0",
106
- "jsdom": "^22.0.0",
107
  "json5": "^2.2.3",
108
- "jsonpath": "^1.1.1",
109
  "katex": "^0.16.21",
110
  "lint-staged": "^15.2.7",
111
  "marked": "^12.0.1",
112
  "mongodb": "^5.8.0",
113
  "nanoid": "^5.0.9",
114
- "natural": "^8.1.0",
115
  "openid-client": "^5.4.2",
116
  "parquetjs": "^0.11.2",
117
  "pino": "^9.0.0",
118
  "pino-pretty": "^11.0.0",
119
- "playwright": "^1.52.0",
120
  "postcss": "^8.4.31",
121
- "saslprep": "^1.0.3",
122
  "satori": "^0.10.11",
123
  "satori-html": "^0.3.2",
124
- "sbd": "^1.0.19",
125
- "serpapi": "^1.1.1",
126
  "sharp": "^0.33.4",
127
  "tailwind-scrollbar": "^3.0.0",
128
  "tailwindcss": "^3.4.0",
@@ -130,16 +101,6 @@
130
  "vitest-browser-svelte": "^0.1.0",
131
  "zod": "^3.22.3"
132
  },
133
- "optionalDependencies": {
134
- "@anthropic-ai/sdk": "^0.32.1",
135
- "@anthropic-ai/vertex-sdk": "^0.4.1",
136
- "@aws-sdk/client-bedrock-runtime": "^3.631.0",
137
- "@google-cloud/vertexai": "^1.1.0",
138
- "@google/generative-ai": "^0.24.0",
139
- "aws4fetch": "^1.0.17",
140
- "cohere-ai": "^7.9.0",
141
- "openai": "^4.44.0"
142
- },
143
  "overrides": {
144
  "@reflink/reflink": "file:stub/@reflink/reflink"
145
  }
 
1
  {
2
  "name": "chat-ui",
3
+ "version": "0.20.0",
4
  "private": true,
5
  "packageManager": "npm@9.5.0",
6
  "scripts": {
 
18
  "prepare": "husky"
19
  },
20
  "devDependencies": {
 
21
  "@elysiajs/eden": "^1.3.2",
 
22
  "@faker-js/faker": "^8.4.1",
23
  "@iconify-json/carbon": "^1.1.16",
24
  "@iconify-json/eos-icons": "^1.1.6",
 
27
  "@sveltejs/vite-plugin-svelte": "^5.0.3",
28
  "@tailwindcss/typography": "^0.5.9",
29
  "@types/dompurify": "^3.0.5",
 
 
30
  "@types/js-yaml": "^4.0.9",
 
 
31
  "@types/katex": "^0.16.7",
32
  "@types/mime-types": "^2.1.4",
33
  "@types/minimist": "^1.2.5",
34
  "@types/node": "^22.1.0",
35
  "@types/parquetjs": "^0.10.3",
 
36
  "@types/uuid": "^9.0.8",
37
  "@types/yazl": "^3.3.0",
38
  "@typescript-eslint/eslint-plugin": "^6.x",
 
43
  "eslint": "^8.28.0",
44
  "eslint-config-prettier": "^8.5.0",
45
  "eslint-plugin-svelte": "^2.45.1",
46
+ "fs-extra": "^11.3.1",
47
  "isomorphic-dompurify": "^2.13.0",
48
  "js-yaml": "^4.1.0",
 
49
  "minimist": "^1.2.8",
50
  "mongodb-memory-server": "^10.1.2",
 
51
  "prettier": "^3.5.3",
52
  "prettier-plugin-svelte": "^3.2.6",
53
  "prettier-plugin-tailwindcss": "^0.6.11",
 
54
  "sade": "^1.8.1",
55
  "superjson": "^2.2.2",
56
  "svelte": "^5.33.3",
57
  "svelte-check": "^4.0.0",
58
  "svelte-gestures": "^5.1.3",
 
59
  "tslib": "^2.4.1",
60
  "typescript": "^5.5.0",
61
  "unplugin-icons": "^0.16.1",
 
66
  },
67
  "type": "module",
68
  "dependencies": {
 
 
69
  "@elysiajs/swagger": "^1.3.0",
70
  "@gradio/client": "^1.8.0",
71
  "@huggingface/hub": "^2.2.0",
72
  "@huggingface/inference": "^3.12.1",
 
 
 
73
  "@iconify-json/bi": "^1.1.21",
 
74
  "@resvg/resvg-js": "^2.6.2",
75
  "autoprefixer": "^10.4.14",
 
 
76
  "date-fns": "^2.29.3",
77
  "dotenv": "^16.5.0",
 
78
  "file-type": "^21.0.0",
 
79
  "handlebars": "^4.7.8",
80
  "highlight.js": "^11.7.0",
81
  "husky": "^9.0.11",
 
82
  "ip-address": "^9.0.5",
 
 
83
  "json5": "^2.2.3",
 
84
  "katex": "^0.16.21",
85
  "lint-staged": "^15.2.7",
86
  "marked": "^12.0.1",
87
  "mongodb": "^5.8.0",
88
  "nanoid": "^5.0.9",
89
+ "openai": "^4.44.0",
90
  "openid-client": "^5.4.2",
91
  "parquetjs": "^0.11.2",
92
  "pino": "^9.0.0",
93
  "pino-pretty": "^11.0.0",
 
94
  "postcss": "^8.4.31",
 
95
  "satori": "^0.10.11",
96
  "satori-html": "^0.3.2",
 
 
97
  "sharp": "^0.33.4",
98
  "tailwind-scrollbar": "^3.0.0",
99
  "tailwindcss": "^3.4.0",
 
101
  "vitest-browser-svelte": "^0.1.0",
102
  "zod": "^3.22.3"
103
  },
104
  "overrides": {
105
  "@reflink/reflink": "file:stub/@reflink/reflink"
106
  }
scripts/populate.ts CHANGED
@@ -15,8 +15,6 @@ import type { User } from "../src/lib/types/User";
15
  import type { Assistant } from "../src/lib/types/Assistant";
16
  import type { Conversation } from "../src/lib/types/Conversation";
17
  import type { Settings } from "../src/lib/types/Settings";
18
- import type { CommunityToolDB, ToolLogoColor, ToolLogoIcon } from "../src/lib/types/Tool";
19
- import { defaultEmbeddingModel } from "../src/lib/server/embeddingModels.ts";
20
  import { Message } from "../src/lib/types/Message.ts";
21
 
22
  import { addChildren } from "../src/lib/utils/tree/addChildren.ts";
@@ -40,7 +38,7 @@ rl.on("close", function () {
40
 
41
  const samples = fs.readFileSync(path.join(__dirname, "samples.txt"), "utf8").split("\n---\n");
42
 
43
- const possibleFlags = ["reset", "all", "users", "settings", "assistants", "conversations", "tools"];
44
  const argv = minimist(process.argv.slice(2));
45
  const flags = argv["_"].filter((flag) => possibleFlags.includes(flag));
46
 
@@ -156,7 +154,6 @@ async function seed() {
156
  await collections.settings.deleteMany({});
157
  await collections.assistants.deleteMany({});
158
  await collections.conversations.deleteMany({});
159
- await collections.tools.deleteMany({});
160
  await collections.migrationResults.deleteMany({});
161
  await collections.semaphores.deleteMany({});
162
  console.log("Reset done");
@@ -186,12 +183,12 @@ async function seed() {
186
  userId: user._id,
187
  shareConversationsWithModelAuthors: faker.datatype.boolean(0.25),
188
  hideEmojiOnSidebar: faker.datatype.boolean(0.25),
189
- ethicsModalAcceptedAt: faker.date.recent({ days: 30 }),
190
  activeModel: faker.helpers.arrayElement(modelIds),
191
  createdAt: faker.date.recent({ days: 30 }),
192
  updatedAt: faker.date.recent({ days: 30 }),
193
  disableStream: faker.datatype.boolean(0.25),
194
  directPaste: faker.datatype.boolean(0.25),
 
195
  customPrompts: {},
196
  assistants: [],
197
  };
@@ -272,7 +269,7 @@ async function seed() {
272
  updatedAt: faker.date.recent({ days: 145 }),
273
  model: faker.helpers.arrayElement(modelIds),
274
  title: faker.internet.emoji() + " " + faker.hacker.phrase(),
275
- embeddingModel: defaultEmbeddingModel.id,
276
  messages,
277
  rootMessageId: messages[0].id,
278
  } satisfies Conversation;
@@ -287,80 +284,6 @@ async function seed() {
287
  );
288
  console.log("Done creating conversations.");
289
  }
290
-
291
- // generate Community Tools
292
- if (flags.includes("tools") || flags.includes("all")) {
293
- const tools = await Promise.all(
294
- faker.helpers.multiple(
295
- () => {
296
- const _id = new ObjectId();
297
- const displayName = faker.company.catchPhrase();
298
- const description = faker.company.catchPhrase();
299
- const color = faker.helpers.arrayElement([
300
- "purple",
301
- "blue",
302
- "green",
303
- "yellow",
304
- "red",
305
- ]) satisfies ToolLogoColor;
306
- const icon = faker.helpers.arrayElement([
307
- "wikis",
308
- "tools",
309
- "camera",
310
- "code",
311
- "email",
312
- "cloud",
313
- "terminal",
314
- "game",
315
- "chat",
316
- "speaker",
317
- "video",
318
- ]) satisfies ToolLogoIcon;
319
- const baseUrl = faker.helpers.arrayElement([
320
- "stabilityai/stable-diffusion-3-medium",
321
- "multimodalart/cosxl",
322
- "gokaygokay/SD3-Long-Captioner",
323
- "xichenhku/MimicBrush",
324
- ]);
325
-
326
- // keep empty for populate for now
327
-
328
- const user: User = faker.helpers.arrayElement(users);
329
- const createdById = user._id;
330
- const createdByName = user.username ?? user.name;
331
-
332
- return {
333
- type: "community" as const,
334
- _id,
335
- createdById,
336
- createdByName,
337
- displayName,
338
- name: displayName.toLowerCase().replace(" ", "_"),
339
- endpoint: "/test",
340
- description,
341
- color,
342
- icon,
343
- baseUrl,
344
- inputs: [],
345
- outputPath: null,
346
- outputType: "str" as const,
347
- showOutput: false,
348
- useCount: faker.number.int({ min: 0, max: 100000 }),
349
- last24HoursUseCount: faker.number.int({ min: 0, max: 1000 }),
350
- createdAt: faker.date.recent({ days: 30 }),
351
- updatedAt: faker.date.recent({ days: 30 }),
352
- searchTokens: generateSearchTokens(displayName),
353
- review: faker.helpers.enumValue(ReviewStatus),
354
- outputComponent: null,
355
- outputComponentIdx: null,
356
- };
357
- },
358
- { count: faker.number.int({ min: 10, max: 200 }) }
359
- )
360
- );
361
-
362
- await collections.tools.insertMany(tools satisfies CommunityToolDB[]);
363
- }
364
  }
365
 
366
  // run seed
 
15
  import type { Assistant } from "../src/lib/types/Assistant";
16
  import type { Conversation } from "../src/lib/types/Conversation";
17
  import type { Settings } from "../src/lib/types/Settings";
 
 
18
  import { Message } from "../src/lib/types/Message.ts";
19
 
20
  import { addChildren } from "../src/lib/utils/tree/addChildren.ts";
 
38
 
39
  const samples = fs.readFileSync(path.join(__dirname, "samples.txt"), "utf8").split("\n---\n");
40
 
41
+ const possibleFlags = ["reset", "all", "users", "settings", "assistants", "conversations"];
42
  const argv = minimist(process.argv.slice(2));
43
  const flags = argv["_"].filter((flag) => possibleFlags.includes(flag));
44
 
 
154
  await collections.settings.deleteMany({});
155
  await collections.assistants.deleteMany({});
156
  await collections.conversations.deleteMany({});
 
157
  await collections.migrationResults.deleteMany({});
158
  await collections.semaphores.deleteMany({});
159
  console.log("Reset done");
 
183
  userId: user._id,
184
  shareConversationsWithModelAuthors: faker.datatype.boolean(0.25),
185
  hideEmojiOnSidebar: faker.datatype.boolean(0.25),
 
186
  activeModel: faker.helpers.arrayElement(modelIds),
187
  createdAt: faker.date.recent({ days: 30 }),
188
  updatedAt: faker.date.recent({ days: 30 }),
189
  disableStream: faker.datatype.boolean(0.25),
190
  directPaste: faker.datatype.boolean(0.25),
191
+ hidePromptExamples: {},
192
  customPrompts: {},
193
  assistants: [],
194
  };
 
269
  updatedAt: faker.date.recent({ days: 145 }),
270
  model: faker.helpers.arrayElement(modelIds),
271
  title: faker.internet.emoji() + " " + faker.hacker.phrase(),
272
+ // embeddings removed in this build
273
  messages,
274
  rootMessageId: messages[0].id,
275
  } satisfies Conversation;
 
284
  );
285
  console.log("Done creating conversations.");
286
  }
287
  }
288
 
289
  // run seed
server.log ADDED
@@ -0,0 +1,2 @@
 
 
 
1
+ /Users/vm/.venv/bin/python3: No module named uvicorn
2
+ /Users/vm/.venv/bin/python3: No module named uvicorn
src/ambient.d.ts CHANGED
@@ -2,3 +2,6 @@ declare module "*.ttf" {
2
  const value: ArrayBuffer;
3
  export default value;
4
  }
 
 
 
 
2
  const value: ArrayBuffer;
3
  export default value;
4
  }
5
+
6
+ // Legacy helpers removed: web search support is deprecated, so we intentionally
7
+ // avoid leaking those shapes into the global ambient types.
src/app.html CHANGED
@@ -5,15 +5,18 @@
5
  <meta name="viewport" content="width=device-width, initial-scale=1" />
6
  <meta name="theme-color" content="rgb(249, 250, 251)" />
7
  <script>
8
- if (
9
- localStorage.theme === "dark" ||
10
- (!("theme" in localStorage) && window.matchMedia("(prefers-color-scheme: dark)").matches)
11
- ) {
12
- document.documentElement.classList.add("dark");
13
- document
14
- .querySelector('meta[name="theme-color"]')
15
- .setAttribute("content", "rgb(26, 36, 50)");
16
- }
 
 
 
17
 
18
  // For some reason, Sveltekit doesn't let us load env variables from .env here, so we load it from hooks.server.ts
19
  window.gaId = "%gaId%";
 
5
  <meta name="viewport" content="width=device-width, initial-scale=1" />
6
  <meta name="theme-color" content="rgb(249, 250, 251)" />
7
  <script>
8
+ (function () {
9
+ try {
10
+ var prefersDark = window.matchMedia("(prefers-color-scheme: dark)").matches;
11
+ var stored = localStorage.getItem("theme");
12
+ var followSystem = stored === null || stored === "system";
13
+ var isDark = stored === "dark" || (followSystem && prefersDark);
14
+ if (isDark) {
15
+ document.documentElement.classList.add("dark");
16
+ document.querySelector('meta[name="theme-color"]').setAttribute("content", "#07090d");
17
+ }
18
+ } catch (e) {}
19
+ })();
20
 
21
  // For some reason, Sveltekit doesn't let us load env variables from .env here, so we load it from hooks.server.ts
22
  window.gaId = "%gaId%";
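The inline script above decides whether to add the `dark` class from two inputs: the stored `theme` value and the system color-scheme preference. A minimal sketch of that decision as a pure function, assuming the same three cases as the diff (`"dark"`, `"system"` or unset, anything else); the name `resolveIsDark` is hypothetical and does not appear in the commit:

```typescript
// Hypothetical helper mirroring the inline theme script in src/app.html.
// `stored` is the raw localStorage value (null when the key is absent);
// `prefersDark` is the result of the prefers-color-scheme media query.
function resolveIsDark(stored: string | null, prefersDark: boolean): boolean {
  // No stored value, or an explicit "system", means: follow the OS setting.
  const followSystem = stored === null || stored === "system";
  // An explicit "dark" always wins; any other explicit value is treated as light.
  return stored === "dark" || (followSystem && prefersDark);
}
```

Keeping the decision free of `window` and `localStorage` access makes it checkable outside the browser; the real script still has to wrap storage access in `try`/`catch`, as the diff does.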
src/hooks.server.ts CHANGED
@@ -9,9 +9,7 @@ import { checkAndRunMigrations } from "$lib/migrations/migrations";
9
  import { building, dev } from "$app/environment";
10
  import { logger } from "$lib/server/logger";
11
  import { AbortedGenerations } from "$lib/server/abortedGenerations";
12
- import { MetricsServer } from "$lib/server/metrics";
13
  import { initExitHandler } from "$lib/server/exitHandler";
14
- import { refreshAssistantsCounts } from "$lib/jobs/refresh-assistants-counts";
15
  import { refreshConversationStats } from "$lib/jobs/refresh-conversation-stats";
16
  import { adminTokenManager } from "$lib/server/adminToken";
17
  import { isHostLocalhost } from "$lib/server/isURLLocal";
@@ -22,21 +20,25 @@ export const init: ServerInit = async () => {
22
 
23
  // TODO: move this code on a started server hook, instead of using a "building" flag
24
  if (!building) {
25
- // Set HF_TOKEN as a process variable for Transformers.JS to see it
26
- process.env.HF_TOKEN ??= config.HF_TOKEN;
27
 
28
  logger.info("Starting server...");
29
  initExitHandler();
30
 
31
  checkAndRunMigrations();
32
- if (config.ENABLE_ASSISTANTS) {
33
- refreshAssistantsCounts();
34
- }
35
  refreshConversationStats();
36
 
37
- // Init metrics server
38
- MetricsServer.getInstance();
39
-
40
  // Init AbortedGenerations refresh process
41
  AbortedGenerations.getInstance();
42
 
@@ -186,23 +188,7 @@ export const handle: Handle = async ({ event, resolve }) => {
186
  return errorResponse(401, ERROR_MESSAGES.authOnly);
187
  }
188
 
189
- // if login is not required and the call is not from /settings and we display the ethics modal with PUBLIC_APP_DISCLAIMER
190
- // we check if the user has accepted the ethics modal first.
191
- // If login is required, `ethicsModalAcceptedAt` is already true at this point, so do not pass this condition. This saves a DB call.
192
- if (
193
- !requiresUser &&
194
- !event.url.pathname.startsWith(`${base}/settings`) &&
195
- config.PUBLIC_APP_DISCLAIMER === "1"
196
- ) {
197
- const hasAcceptedEthicsModal = await collections.settings.countDocuments({
198
- sessionId: event.locals.sessionId,
199
- ethicsModalAcceptedAt: { $exists: true },
200
- });
201
-
202
- if (!hasAcceptedEthicsModal) {
203
- return errorResponse(405, "You need to accept the welcome modal first");
204
- }
205
- }
206
  }
207
 
208
  let replaced = false;
 
9
  import { building, dev } from "$app/environment";
10
  import { logger } from "$lib/server/logger";
11
  import { AbortedGenerations } from "$lib/server/abortedGenerations";
 
12
  import { initExitHandler } from "$lib/server/exitHandler";
 
13
  import { refreshConversationStats } from "$lib/jobs/refresh-conversation-stats";
14
  import { adminTokenManager } from "$lib/server/adminToken";
15
  import { isHostLocalhost } from "$lib/server/isURLLocal";
 
20
 
21
  // TODO: move this code on a started server hook, instead of using a "building" flag
22
  if (!building) {
23
+ // Ensure legacy env expected by some libs: map OPENAI_API_KEY -> HF_TOKEN if absent
24
+ const canonicalToken = config.OPENAI_API_KEY || config.HF_TOKEN;
25
+ if (canonicalToken) {
26
+ process.env.HF_TOKEN ??= canonicalToken;
27
+ }
28
+
29
+ // Warn if legacy-only var is used
30
+ if (!config.OPENAI_API_KEY && config.HF_TOKEN) {
31
+ logger.warn(
32
+ "HF_TOKEN is deprecated in favor of OPENAI_API_KEY. Please migrate to OPENAI_API_KEY."
33
+ );
34
+ }
35
 
36
  logger.info("Starting server...");
37
  initExitHandler();
38
 
39
  checkAndRunMigrations();
 
 
 
40
  refreshConversationStats();
41
 
 
 
 
42
  // Init AbortedGenerations refresh process
43
  AbortedGenerations.getInstance();
44
 
 
188
  return errorResponse(401, ERROR_MESSAGES.authOnly);
189
  }
190
 
191
+ // Ethics disclaimer gating removed
192
  }
193
 
194
  let replaced = false;
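The startup block above picks one token and only writes `process.env.HF_TOKEN` when it is not already set. A rough sketch of that precedence, with the environment passed in as a plain object so it can be exercised in isolation; `resolveToken` and `TokenConfig` are hypothetical names, not part of the codebase:

```typescript
// Hypothetical shapes for illustration only.
interface TokenConfig {
  OPENAI_API_KEY?: string;
  HF_TOKEN?: string;
}

function resolveToken(
  config: TokenConfig,
  env: Record<string, string | undefined>
): { token: string | undefined; legacyOnly: boolean } {
  // Same precedence as the diff: OPENAI_API_KEY first, then HF_TOKEN.
  const canonicalToken = config.OPENAI_API_KEY || config.HF_TOKEN;
  if (canonicalToken) {
    // `??=` leaves an already-set env value untouched.
    env.HF_TOKEN ??= canonicalToken;
  }
  // True when only the legacy variable is configured (the case that logs a warning).
  const legacyOnly = !config.OPENAI_API_KEY && Boolean(config.HF_TOKEN);
  return { token: env.HF_TOKEN, legacyOnly };
}
```

Note that a value already present in the environment takes priority over both config entries, because the assignment is conditional.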
src/lib/APIClient.ts CHANGED
@@ -20,28 +20,20 @@ superjson.registerCustom<ObjectId, string>(
20
  "ObjectId"
21
  );
22
 
23
- export function useAPIClient({ fetch }: { fetch?: Treaty.Config["fetcher"] } = {}) {
24
- let url;
25
 
26
- if (!browser) {
27
- let port;
28
- if (process.argv.includes("--port")) {
29
- port = parseInt(process.argv[process.argv.indexOf("--port") + 1]);
30
- } else {
31
- const mode = process.argv.find((arg) => arg === "preview" || arg === "dev");
32
- if (mode === "preview") {
33
- port = 4173;
34
- } else if (mode === "dev") {
35
- port = 5173;
36
- } else {
37
- port = 3000;
38
- }
39
- }
40
- // Always use localhost for server-side requests to avoid external HTTP calls during SSR
41
- url = `http://localhost:${port}${base}/api/v2`;
42
- } else {
43
- url = `${window.location.origin}${base}/api/v2`;
44
- }
45
  const app = treaty<App>(url, { fetcher: fetch });
46
  return app;
47
  }
@@ -57,12 +49,3 @@ export function handleResponse<T extends Record<number, unknown>>(
57
  typeof response.data === "string" ? response.data : JSON.stringify(response.data)
58
  ) as T[200];
59
  }
60
-
61
- // eslint-disable-next-line @typescript-eslint/no-explicit-any
62
- export type Success<T extends (...args: any) => any> =
63
- Awaited<ReturnType<T>> extends {
64
- data: infer D;
65
- error: unknown;
66
- }
67
- ? D
68
- : never;
 
20
  "ObjectId"
21
  );
22
 
23
+ export function useAPIClient({
24
+ fetch,
25
+ origin,
26
+ }: {
27
+ fetch?: Treaty.Config["fetcher"];
28
+ origin?: string;
29
+ } = {}) {
30
+ // On the server, use the current request origin when available to avoid
31
+ // incorrect port guessing and ensure cookies are forwarded properly.
32
+ // Fall back to a sane default in dev if origin is missing.
33
+ const url = browser
34
+ ? `${window.location.origin}${base}/api/v2`
35
+ : `${origin ?? `http://localhost:5173`}${base}/api/v2`;
36
 
 
37
  const app = treaty<App>(url, { fetcher: fetch });
38
  return app;
39
  }
 
49
  typeof response.data === "string" ? response.data : JSON.stringify(response.data)
50
  ) as T[200];
51
  }
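The rewritten `useAPIClient` builds its base URL from the browser origin on the client and from the caller-supplied `origin` on the server, falling back to `http://localhost:5173`. A small sketch of that selection, assuming the browser flag, base path and origins are passed in explicitly; `resolveApiUrl` is a hypothetical helper, not an export of `APIClient.ts`:

```typescript
// Hypothetical options object; in the real module these values come from
// `$app/environment` (browser), `$app/paths` (base) and `window.location`.
function resolveApiUrl(opts: {
  isBrowser: boolean;
  base: string;
  windowOrigin?: string;
  origin?: string;
}): string {
  const host = opts.isBrowser
    ? (opts.windowOrigin ?? "")
    : // On the server, prefer the request origin; fall back to the dev default.
      (opts.origin ?? "http://localhost:5173");
  return `${host}${opts.base}/api/v2`;
}
```

Passing the request origin through, rather than guessing a port from `process.argv` as the removed code did, keeps server-side calls pointed at whatever host actually served the request.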
src/lib/actions/snapScrollToBottom.ts CHANGED
@@ -1,6 +1,5 @@
1
- import { navigating } from "$app/stores";
2
  import { tick } from "svelte";
3
- import { get } from "svelte/store";
4
 
5
  const detachedOffset = 10;
6
 
@@ -31,7 +30,7 @@ export const snapScrollToBottom = (node: HTMLElement, dependency: unknown) => {
31
  const options = { ...defaultOptions, ..._options };
32
  const { force } = options;
33
 
34
- if (!force && isDetached && !get(navigating)) return;
35
 
36
  // wait for next tick to ensure that the DOM is updated
37
  await tick();
 
1
+ import { navigating } from "$app/state";
2
  import { tick } from "svelte";
 
3
 
4
  const detachedOffset = 10;
5
 
 
30
  const options = { ...defaultOptions, ..._options };
31
  const { force } = options;
32
 
33
+ if (!force && isDetached && !navigating.to) return;
34
 
35
  // wait for next tick to ensure that the DOM is updated
36
  await tick();
src/lib/buildPrompt.ts CHANGED
@@ -1,11 +1,8 @@
1
  import type { EndpointParameters } from "./server/endpoints/endpoints";
2
  import type { BackendModel } from "./server/models";
3
- import type { Tool, ToolResult } from "./types/Tool";
4
 
5
  type buildPromptOptions = Pick<EndpointParameters, "messages" | "preprompt" | "continueMessage"> & {
6
  model: BackendModel;
7
- tools?: Tool[];
8
- toolResults?: ToolResult[];
9
  };
10
 
11
  export async function buildPrompt({
@@ -13,8 +10,6 @@ export async function buildPrompt({
13
  model,
14
  preprompt,
15
  continueMessage,
16
- tools,
17
- toolResults,
18
  }: buildPromptOptions): Promise<string> {
19
  const filteredMessages = messages;
20
 
@@ -29,8 +24,6 @@ export async function buildPrompt({
29
  role: m.from,
30
  })),
31
  preprompt,
32
- tools,
33
- toolResults,
34
  continueMessage,
35
  })
36
  // Not super precise, but it's truncated in the model's backend anyway
 
1
  import type { EndpointParameters } from "./server/endpoints/endpoints";
2
  import type { BackendModel } from "./server/models";
 
3
 
4
  type buildPromptOptions = Pick<EndpointParameters, "messages" | "preprompt" | "continueMessage"> & {
5
  model: BackendModel;
 
 
6
  };
7
 
8
  export async function buildPrompt({
 
10
  model,
11
  preprompt,
12
  continueMessage,
 
 
13
  }: buildPromptOptions): Promise<string> {
14
  const filteredMessages = messages;
15
 
 
24
  role: m.from,
25
  })),
26
  preprompt,
 
 
27
  continueMessage,
28
  })
29
  // Not super precise, but it's truncated in the model's backend anyway
src/lib/components/AssistantSettings.svelte DELETED
@@ -1,657 +0,0 @@
1
- <script lang="ts">
2
- import type { Model } from "$lib/types/Model";
3
- import type { Assistant } from "$lib/types/Assistant";
4
-
5
- import { onMount } from "svelte";
6
- import { page } from "$app/state";
7
- import { base } from "$app/paths";
8
- import CarbonPen from "~icons/carbon/pen";
9
- import CarbonUpload from "~icons/carbon/upload";
10
- import CarbonHelpFilled from "~icons/carbon/help";
11
- import CarbonSettingsAdjust from "~icons/carbon/settings-adjust";
12
- import CarbonTools from "~icons/carbon/tools";
13
-
14
- import { useSettingsStore } from "$lib/stores/settings";
15
- import IconInternet from "./icons/IconInternet.svelte";
16
- import TokensCounter from "./TokensCounter.svelte";
17
- import HoverTooltip from "./HoverTooltip.svelte";
18
- import { findCurrentModel } from "$lib/utils/models";
19
- import AssistantToolPicker from "./AssistantToolPicker.svelte";
20
- import { error } from "$lib/stores/errors";
21
- import { goto } from "$app/navigation";
22
- import { usePublicConfig } from "$lib/utils/PublicConfig.svelte";
23
-
24
- const publicConfig = usePublicConfig();
25
-
26
- type AssistantFront = Omit<Assistant, "_id" | "createdById"> & { _id: string };
27
-
28
- interface Props {
29
- assistant?: AssistantFront | undefined;
30
- models?: Model[];
31
- }
32
-
33
- let errors = $state<
34
- {
35
- field: string;
36
- message: string;
37
- }[]
38
- >([]);
39
-
40
- let { assistant = undefined, models = [] }: Props = $props();
41
-
42
- let files: FileList | null = $state(null);
43
- const settings = useSettingsStore();
44
- let modelId = $state("");
45
- let systemPrompt = $state(assistant?.preprompt ?? "");
46
- let dynamicPrompt = $state(assistant?.dynamicPrompt ?? false);
47
- let showModelSettings = $state(Object.values(assistant?.generateSettings ?? {}).some((v) => !!v));
48
-
49
- onMount(async () => {
50
- modelId = findCurrentModel(models, assistant ? assistant.modelId : $settings.activeModel).id;
51
- });
52
-
53
- let inputMessage1 = $state(assistant?.exampleInputs[0] ?? "");
54
- let inputMessage2 = $state(assistant?.exampleInputs[1] ?? "");
55
- let inputMessage3 = $state(assistant?.exampleInputs[2] ?? "");
56
- let inputMessage4 = $state(assistant?.exampleInputs[3] ?? "");
57
-
58
- function clearError(field: string) {
59
- errors = errors.filter((e) => e.field !== field);
60
- }
61
-
62
- function onFilesChange(e: Event) {
63
- const inputEl = e.target as HTMLInputElement;
64
- if (inputEl.files?.length && inputEl.files[0].size > 0) {
65
- if (!inputEl.files[0].type.includes("image")) {
66
- inputEl.files = null;
67
- files = null;
68
-
69
- errors = [{ field: "avatar", message: "Only images are allowed" }];
70
- return;
71
- }
72
- files = inputEl.files;
73
- clearError("avatar");
74
- deleteExistingAvatar = false;
75
- }
76
- }
77
-
78
- function getError(field: string) {
79
- return errors.find((error) => error.field === field)?.message ?? "";
80
- }
81
-
82
- let deleteExistingAvatar = $state(false);
83
-
84
- let loading = $state(false);
85
-
86
- let ragMode: false | "links" | "domains" | "all" = $state(
87
- assistant?.rag?.allowAllDomains
88
- ? "all"
89
- : (assistant?.rag?.allowedLinks?.length ?? 0 > 0)
90
- ? "links"
91
- : (assistant?.rag?.allowedDomains?.length ?? 0) > 0
92
- ? "domains"
93
- : false
94
- );
95
-
96
- let tools = $state(assistant?.tools ?? []);
97
- const regex = /{{\s?(get|post|url|today)(=.*?)?\s?}}/g;
98
-
99
- let templateVariables = $derived([...systemPrompt.matchAll(regex)]);
100
- let selectedModel = $derived(models.find((m) => m.id === modelId));
101
- </script>
102
-
103
- <form
104
- class="relative flex h-full flex-col overflow-y-auto md:p-8 md:pt-0"
105
- enctype="multipart/form-data"
106
- onsubmit={async (e) => {
107
- e.preventDefault();
108
- if (!e.target) {
109
- return;
110
- }
111
- const formData = new FormData(e.target as HTMLFormElement, e.submitter);
112
-
113
- loading = true;
114
- if (files?.[0] && files[0].size > 0) {
115
- formData.set("avatar", files[0]);
116
- }
117
-
118
- if (deleteExistingAvatar === true) {
119
- if (assistant?.avatar) {
120
- // if there is an avatar we explicitly removei t
121
- formData.set("avatar", "null");
122
- } else {
123
- // else we just remove it from the input
124
- formData.delete("avatar");
125
- }
126
- } else {
127
- if (files === null) {
128
- formData.delete("avatar");
129
- }
130
- }
131
-
132
- formData.delete("ragMode");
133
-
134
- if (ragMode === false || !page.data.enableAssistantsRAG) {
135
- formData.set("ragAllowAll", "false");
136
- formData.set("ragLinkList", "");
137
- formData.set("ragDomainList", "");
138
- } else if (ragMode === "all") {
139
- formData.set("ragAllowAll", "true");
140
- formData.set("ragLinkList", "");
141
- formData.set("ragDomainList", "");
142
- } else if (ragMode === "links") {
143
- formData.set("ragAllowAll", "false");
144
- formData.set("ragDomainList", "");
145
- } else if (ragMode === "domains") {
146
- formData.set("ragAllowAll", "false");
147
- formData.set("ragLinkList", "");
148
- }
149
-
150
- formData.set("tools", tools.join(","));
151
-
152
- let response: Response;
153
- if (assistant?._id) {
154
- response = await fetch(`${base}/api/assistant/${assistant._id}`, {
155
- method: "PATCH",
156
- body: formData,
157
- });
158
- if (response.ok) {
159
- goto(`${base}/settings/assistants/${assistant?._id}`, { invalidateAll: true });
160
- } else {
161
- if (response.status === 400) {
162
- const data = await response.json();
163
- errors = data.errors;
164
- } else {
165
- $error = response.statusText;
166
- }
167
- loading = false;
168
- }
169
- } else {
170
- response = await fetch(`${base}/api/assistant`, {
171
- method: "POST",
172
- body: formData,
173
- });
174
-
175
- if (response.ok) {
176
- const { assistantId } = await response.json();
177
- goto(`${base}/settings/assistants/${assistantId}`, { invalidateAll: true });
178
- } else {
179
- if (response.status === 400) {
180
- const data = await response.json();
181
- errors = data.errors;
182
- } else {
183
- $error = response.statusText;
184
- }
185
- loading = false;
186
- }
187
- }
188
- }}
189
- >
190
- {#if assistant}
191
- <h2 class="text-xl font-semibold">
192
- Edit Assistant: {assistant?.name ?? "assistant"}
193
- </h2>
194
- <p class="mb-6 text-sm text-gray-500">
195
- Modifying an existing assistant will propagate the changes to all users.
196
- </p>
197
- {:else}
198
- <h2 class="text-xl font-semibold">Create new assistant</h2>
199
- <p class="mb-6 text-sm text-gray-500">
200
- Create and share your own AI Assistant. All assistants are <span
201
- class="rounded-full border px-2 py-0.5 leading-none">public</span
202
- >
203
- </p>
204
- {/if}
205
-
206
- <div class="grid h-full w-full flex-1 grid-cols-2 gap-6 text-sm max-sm:grid-cols-1">
207
- <div class="col-span-1 flex flex-col gap-4">
208
- <div>
209
- <div class="mb-1 block pb-2 text-sm font-semibold">Avatar</div>
210
- <input
211
- type="file"
212
- accept="image/*"
213
- name="avatar"
214
- id="avatar"
215
- class="hidden"
216
- onchange={onFilesChange}
217
- />
218
-
219
- {#if (files && files[0]) || (assistant?.avatar && !deleteExistingAvatar)}
220
- <div class="group relative mx-auto h-12 w-12">
221
- {#if files && files[0]}
222
- <img
223
- src={URL.createObjectURL(files[0])}
224
- alt="avatar"
225
- class="crop mx-auto h-12 w-12 cursor-pointer rounded-full object-cover"
226
- />
227
- {:else if assistant?.avatar}
228
- <img
229
- src="{base}/settings/assistants/{assistant._id}/avatar.jpg?hash={assistant.avatar}"
230
- alt="avatar"
231
- class="crop mx-auto h-12 w-12 cursor-pointer rounded-full object-cover"
232
- />
233
- {/if}
234
-
235
- <label
236
- for="avatar"
237
- class="invisible absolute bottom-0 h-12 w-12 rounded-full bg-black bg-opacity-50 p-1 group-hover:visible hover:visible"
238
- >
239
- <CarbonPen class="mx-auto my-auto h-full cursor-pointer text-center text-white" />
240
- </label>
241
- </div>
242
- <div class="mx-auto w-max pt-1">
243
- <button
244
- type="button"
245
- onclick={(e) => {
246
- e.preventDefault();
247
- e.stopPropagation();
248
- files = null;
249
- deleteExistingAvatar = true;
250
- clearError("avatar");
251
- }}
252
- class="mx-auto w-max text-center text-xs text-gray-600 hover:underline"
253
- >
254
- Delete
255
- </button>
256
- </div>
257
- {:else}
258
- <div class="mb-1 flex w-max flex-row gap-4">
259
- <label
260
- for="avatar"
261
- class="btn flex h-8 rounded-lg border bg-white px-3 py-1 text-gray-500 shadow-sm transition-all hover:bg-gray-100"
262
- >
263
- <CarbonUpload class="mr-2 text-xs " /> Upload
264
- </label>
265
- </div>
266
- {/if}
267
- <p class="text-xs text-red-500">{getError("avatar")}</p>
268
- </div>
269
-
270
- <label>
271
- <div class="mb-1 font-semibold">Name</div>
272
- <input
273
- name="name"
274
- class="w-full rounded-lg border-2 border-gray-200 bg-gray-100 p-2"
275
- placeholder="Assistant Name"
276
- value={assistant?.name ?? ""}
277
- oninput={() => clearError("name")}
278
- />
279
- <p class="text-xs text-red-500">{getError("name")}</p>
280
- </label>
281
-
282
- <label>
283
- <div class="mb-1 font-semibold">Description</div>
284
- <textarea
285
- name="description"
286
- class="h-15 w-full rounded-lg border-2 border-gray-200 bg-gray-100 p-2"
287
- placeholder="It knows everything about python"
288
- value={assistant?.description ?? ""}
289
- oninput={() => clearError("description")}
290
- ></textarea>
291
- <p class="text-xs text-red-500">{getError("description")}</p>
292
- </label>
293
-
294
- <label>
295
- <div class="mb-1 font-semibold">Model</div>
296
- <div class="flex gap-2">
297
- <select
298
- name="modelId"
299
- class="w-full rounded-lg border-2 border-gray-200 bg-gray-100 p-2"
300
- bind:value={modelId}
301
- onchange={() => clearError("modelId")}
302
- >
303
- {#each models.filter((model) => !model.unlisted) as model}
304
- <option value={model.id}>{model.displayName}</option>
305
- {/each}
306
- </select>
307
- <p class="text-xs text-red-500">{getError("modelId")}</p>
308
- <button
309
- type="button"
310
- class="flex aspect-square items-center gap-2 whitespace-nowrap rounded-lg border px-3 {showModelSettings
311
- ? 'border-blue-500/20 bg-blue-50 text-blue-600'
312
- : ''}"
313
- onclick={() => (showModelSettings = !showModelSettings)}
314
- ><CarbonSettingsAdjust class="text-xs" /></button
315
- >
316
- </div>
317
- <div
318
- class="mt-2 rounded-lg border border-blue-500/20 bg-blue-500/5 px-2 py-0.5"
319
- class:hidden={!showModelSettings}
320
- >
321
- <p class="text-xs text-red-500">{getError("inputMessage1")}</p>
322
- <div class="my-2 grid grid-cols-1 gap-2.5 sm:grid-cols-2 sm:grid-rows-2">
323
- <label for="temperature" class="flex justify-between">
324
- <span class="m-1 ml-0 flex items-center gap-1.5 whitespace-nowrap text-sm">
325
- Temperature
326
-
327
- <HoverTooltip
328
- label="Temperature: Controls creativity, higher values allow more variety."
329
- >
330
- <CarbonHelpFilled
331
- class="inline text-xxs text-gray-500 group-hover/tooltip:text-blue-600"
332
- />
333
- </HoverTooltip>
334
- </span>
335
- <input
336
- type="number"
337
- name="temperature"
338
- min="0.1"
339
- max="2"
340
- step="0.1"
341
- class="w-20 rounded-lg border-2 border-gray-200 bg-gray-100 px-2 py-1"
342
- placeholder={selectedModel?.parameters?.temperature?.toString() ?? "1"}
343
- value={assistant?.generateSettings?.temperature ?? ""}
344
- />
345
- </label>
346
- <label for="top_p" class="flex justify-between">
347
- <span class="m-1 ml-0 flex items-center gap-1.5 whitespace-nowrap text-sm">
348
- Top P
349
- <HoverTooltip
350
- label="Top P: Sets word choice boundaries, lower values tighten focus."
351
- >
352
- <CarbonHelpFilled
353
- class="inline text-xxs text-gray-500 group-hover/tooltip:text-blue-600"
354
- />
355
- </HoverTooltip>
356
- </span>
357
-
358
- <input
359
- type="number"
360
- name="top_p"
361
- class="w-20 rounded-lg border-2 border-gray-200 bg-gray-100 px-2 py-1"
362
- min="0.05"
363
- max="1"
364
- step="0.05"
365
- placeholder={selectedModel?.parameters?.top_p?.toString() ?? "1"}
366
- value={assistant?.generateSettings?.top_p ?? ""}
367
- />
368
- </label>
369
- <label for="repetition_penalty" class="flex justify-between">
370
- <span class="m-1 ml-0 flex items-center gap-1.5 whitespace-nowrap text-sm">
371
- Repetition penalty
372
- <HoverTooltip
373
- label="Repetition penalty: Prevents reuse, higher values decrease repetition."
374
- >
375
- <CarbonHelpFilled
376
- class="inline text-xxs text-gray-500 group-hover/tooltip:text-blue-600"
377
- />
378
- </HoverTooltip>
379
- </span>
380
- <input
381
- type="number"
382
- name="repetition_penalty"
383
- min="0.1"
384
- max="2"
385
- step="0.05"
386
- class="w-20 rounded-lg border-2 border-gray-200 bg-gray-100 px-2 py-1"
387
- placeholder={selectedModel?.parameters?.repetition_penalty?.toString() ?? "1.0"}
388
- value={assistant?.generateSettings?.repetition_penalty ?? ""}
389
- />
390
- </label>
391
- <label for="top_k" class="flex justify-between">
392
- <span class="m-1 ml-0 flex items-center gap-1.5 whitespace-nowrap text-sm">
393
- Top K <HoverTooltip
394
- label="Top K: Restricts word options, lower values for predictability."
395
- >
396
- <CarbonHelpFilled
397
- class="inline text-xxs text-gray-500 group-hover/tooltip:text-blue-600"
398
- />
399
- </HoverTooltip>
400
- </span>
401
- <input
402
- type="number"
403
- name="top_k"
404
- min="5"
405
- max="100"
406
- step="5"
407
- class="w-20 rounded-lg border-2 border-gray-200 bg-gray-100 px-2 py-1"
408
- placeholder={selectedModel?.parameters?.top_k?.toString() ?? "50"}
409
- value={assistant?.generateSettings?.top_k ?? ""}
410
- />
411
- </label>
412
- </div>
413
- </div>
414
- </label>
415
-
416
- <label>
417
- <div class="mb-1 font-semibold">User start messages</div>
418
- <div class="grid gap-1.5 text-sm md:grid-cols-2">
419
- <input
420
- name="exampleInput1"
421
- placeholder="Start Message 1"
422
- bind:value={inputMessage1}
423
- class="w-full rounded-lg border-2 border-gray-200 bg-gray-100 p-2"
424
- oninput={() => clearError("inputMessage1")}
425
- />
426
- <input
427
- name="exampleInput2"
428
- placeholder="Start Message 2"
429
- bind:value={inputMessage2}
430
- class="w-full rounded-lg border-2 border-gray-200 bg-gray-100 p-2"
431
- oninput={() => clearError("inputMessage1")}
432
- />
433
-
434
- <input
435
- name="exampleInput3"
436
- placeholder="Start Message 3"
437
- bind:value={inputMessage3}
438
- class="w-full rounded-lg border-2 border-gray-200 bg-gray-100 p-2"
439
- oninput={() => clearError("inputMessage1")}
440
- />
441
- <input
442
- name="exampleInput4"
443
- placeholder="Start Message 4"
444
- bind:value={inputMessage4}
445
- class="w-full rounded-lg border-2 border-gray-200 bg-gray-100 p-2"
446
- oninput={() => clearError("inputMessage1")}
447
- />
448
- </div>
449
- <p class="text-xs text-red-500">{getError("inputMessage1")}</p>
450
- </label>
451
- {#if selectedModel?.tools}
452
- <div>
453
- <span class="text-smd font-semibold"
454
- >Tools
455
- <CarbonTools class="inline text-xs text-purple-600" />
456
- <span class="ml-1 rounded bg-gray-100 px-1 py-0.5 text-xxs font-normal text-gray-600"
457
- >Experimental</span
458
- >
459
- </span>
460
- <p class="text-xs text-gray-500">
461
- Choose up to 3 community tools that will be used with this assistant.
462
- </p>
463
- </div>
464
- <AssistantToolPicker bind:toolIds={tools} />
465
- {/if}
466
- {#if page.data.enableAssistantsRAG}
467
- <div class="flex flex-col flex-nowrap pb-4">
468
- <span class="mt-2 text-smd font-semibold"
469
- >Internet access
470
- <IconInternet classNames="inline text-sm text-blue-600" />
471
-
472
- {#if publicConfig.isHuggingChat}
473
- <a
474
- href="https://huggingface.co/spaces/huggingchat/chat-ui/discussions/385"
475
- target="_blank"
476
- class="ml-0.5 rounded bg-gray-100 px-1 py-0.5 text-xxs font-normal text-gray-700 underline decoration-gray-400"
477
- >Give feedback</a
478
- >
479
- {/if}
480
- </span>
481
-
482
- <label class="mt-1">
483
- <input
484
- checked={!ragMode}
485
- onchange={() => (ragMode = false)}
486
- type="radio"
487
- name="ragMode"
488
- value={false}
489
- />
490
- <span class="my-2 text-sm" class:font-semibold={!ragMode}> Default </span>
491
- {#if !ragMode}
492
- <span class="block text-xs text-gray-500">
493
- Assistant will not use internet to do information retrieval and will respond faster.
494
- Recommended for most Assistants.
495
- </span>
496
- {/if}
497
- </label>
498
-
499
- <label class="mt-1">
500
- <input
501
- checked={ragMode === "all"}
502
- onchange={() => (ragMode = "all")}
503
- type="radio"
504
- name="ragMode"
505
- value={"all"}
506
- />
507
- <span class="my-2 text-sm" class:font-semibold={ragMode === "all"}> Web search </span>
508
- {#if ragMode === "all"}
509
- <span class="block text-xs text-gray-500">
510
- Assistant will do a web search on each user request to find information.
511
- </span>
512
- {/if}
513
- </label>
514
-
515
- <label class="mt-1">
516
- <input
517
- checked={ragMode === "domains"}
518
- onchange={() => (ragMode = "domains")}
519
- type="radio"
520
- name="ragMode"
521
- value={false}
522
- />
523
- <span class="my-2 text-sm" class:font-semibold={ragMode === "domains"}>
524
- Domains search
525
- </span>
526
- </label>
527
- {#if ragMode === "domains"}
528
- <span class="mb-2 text-xs text-gray-500">
529
- Specify domains and URLs that the application can search, separated by commas.
530
- </span>
531
-
532
- <input
533
- name="ragDomainList"
534
- class="w-full rounded-lg border-2 border-gray-200 bg-gray-100 p-2"
535
- placeholder="wikipedia.org,bbc.com"
536
- value={assistant?.rag?.allowedDomains?.join(",") ?? ""}
537
- oninput={() => clearError("ragDomainList")}
538
- />
539
- <p class="text-xs text-red-500">{getError("ragDomainList")}</p>
540
- {/if}
541
-
542
- <label class="mt-1">
543
- <input
544
- checked={ragMode === "links"}
545
- onchange={() => (ragMode = "links")}
546
- type="radio"
547
- name="ragMode"
548
- value={false}
549
- />
550
- <span class="my-2 text-sm" class:font-semibold={ragMode === "links"}>
551
- Specific Links
552
- </span>
553
- </label>
554
- {#if ragMode === "links"}
555
- <span class="mb-2 text-xs text-gray-500">
556
- Specify a maximum of 10 direct URLs that the Assistant will access. HTML & Plain Text
557
- only, separated by commas
558
- </span>
559
- <input
560
- name="ragLinkList"
561
- class="w-full rounded-lg border-2 border-gray-200 bg-gray-100 p-2"
562
- placeholder="https://raw.githubusercontent.com/huggingface/chat-ui/main/README.md"
563
- value={assistant?.rag?.allowedLinks.join(",") ?? ""}
564
- oninput={() => clearError("ragLinkList")}
565
- />
566
- <p class="text-xs text-red-500">{getError("ragLinkList")}</p>
567
- {/if}
568
- </div>
569
- {/if}
570
- </div>
571
-
572
- <div class="relative col-span-1 flex h-full flex-col">
573
- <div class="mb-1 flex justify-between text-sm">
574
- <span class="block font-semibold"> Instructions (System Prompt) </span>
575
- {#if dynamicPrompt && templateVariables.length}
576
- <div class="relative">
577
- <button
578
- type="button"
579
- class="peer rounded bg-blue-500/20 px-1 text-xs text-blue-600 focus:bg-blue-500/30 focus:text-blue-800 sm:text-sm"
580
- >
581
- {templateVariables.length} template variable{templateVariables.length > 1 ? "s" : ""}
582
- </button>
583
- <div
584
- class="invisible absolute right-0 top-6 z-10 rounded-lg border bg-white p-2 text-xs shadow-lg peer-focus:visible hover:visible sm:w-96"
585
- >
586
- Will perform a GET or POST request and inject the response into the prompt. Works
587
- better with plain text, csv or json content.
588
- {#each templateVariables as match}
589
- <div>
590
- <a
591
- href={match[1].toLowerCase() === "get" ? match[2] : "#"}
592
- target={match[1].toLowerCase() === "get" ? "_blank" : ""}
593
- class="text-gray-500 underline decoration-gray-300"
594
- >
595
- {match[1].toUpperCase()}: {match[2]}
596
- </a>
597
- </div>
598
- {/each}
599
- </div>
600
- </div>
601
- {/if}
602
- </div>
603
- <label class="pb-2 text-sm has-[:checked]:font-semibold">
604
- <input type="checkbox" name="dynamicPrompt" bind:checked={dynamicPrompt} />
605
- Dynamic Prompt
606
- <p class="mb-2 text-xs font-normal text-gray-500">
607
- Allow the use of template variables {"{{get=https://example.com/path}}"}
608
- to insert dynamic content into your prompt by making GET requests to specified URLs on each
609
- inference. You can also send the user's message as the body of a POST request, using {"{{post=https://example.com/path}}"}.
610
- Use {"{{today}}"} to include the current date.
611
- </p>
612
- </label>
613
-
614
- <div class="relative mb-20 flex h-full flex-col gap-2">
615
- <textarea
616
- name="preprompt"
617
- class="min-h-[8lh] flex-1 rounded-lg border-2 border-gray-200 bg-gray-100 p-2 text-sm"
618
- placeholder="You'll act as..."
619
- bind:value={systemPrompt}
620
- oninput={() => clearError("preprompt")}
621
- ></textarea>
622
- {#if modelId}
623
- {@const model = models.find((_model) => _model.id === modelId)}
624
- {#if model?.tokenizer && systemPrompt}
625
- <TokensCounter
626
- classNames="absolute bottom-4 right-4"
627
- prompt={systemPrompt}
628
- modelTokenizer={model.tokenizer}
629
- truncate={model?.parameters?.truncate}
630
- />
631
- {/if}
632
- {/if}
633
-
634
- <p class="text-xs text-red-500">{getError("preprompt")}</p>
635
- </div>
636
- <div class="absolute bottom-6 flex w-full justify-end gap-2 md:right-0 md:w-fit">
637
- <a
638
- href={assistant ? `${base}/settings/assistants/${assistant?._id}` : `${base}/settings`}
639
- class="flex items-center justify-center rounded-full bg-gray-200 px-5 py-2 font-semibold text-gray-600"
640
- >
641
- Cancel
642
- </a>
643
- <button
644
- type="submit"
645
- disabled={loading}
646
- aria-disabled={loading}
647
- class="flex items-center justify-center rounded-full bg-black px-8 py-2 font-semibold"
648
- class:bg-gray-200={loading}
649
- class:text-gray-600={loading}
650
- class:text-white={!loading}
651
- >
652
- {assistant ? "Save" : "Create"}
653
- </button>
654
- </div>
655
- </div>
656
- </div>
657
- </form>
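Note on the removed Dynamic Prompt option: the help text above describes `{{get=URL}}`, `{{post=URL}}`, and `{{today}}` placeholders, and the template loop reads `match[1]` as the method and `match[2]` as the URL. A minimal sketch of how such placeholders could be located is shown below. The regular expressions, the names `findTemplateVariables` and `insertToday`, and the `YYYY-MM-DD` date format are assumptions for illustration, not the code removed by this commit.

```typescript
// Hypothetical helpers: names, regexes, and date format are assumptions.
interface TemplateVariable {
	method: "get" | "post";
	url: string;
}

// Collect every {{get=URL}} / {{post=URL}} placeholder in a prompt.
function findTemplateVariables(prompt: string): TemplateVariable[] {
	const pattern = /\{\{\s*(get|post)\s*=\s*([^}\s]+)\s*\}\}/gi;
	const found: TemplateVariable[] = [];
	let match: RegExpExecArray | null;
	while ((match = pattern.exec(prompt)) !== null) {
		found.push({
			method: match[1].toLowerCase() as "get" | "post",
			url: match[2],
		});
	}
	return found;
}

// Replace {{today}} with a date string (UTC, YYYY-MM-DD is an assumed format).
function insertToday(prompt: string, now: Date = new Date()): string {
	return prompt.replace(/\{\{\s*today\s*\}\}/gi, now.toISOString().slice(0, 10));
}
```

The sketch only finds and substitutes placeholders; performing the GET or POST requests and injecting their responses is left out.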
src/lib/components/AssistantToolPicker.svelte DELETED
@@ -1,150 +0,0 @@
1
- <script lang="ts">
2
- import { base } from "$app/paths";
3
- import type { ToolLogoColor, ToolLogoIcon } from "$lib/types/Tool";
4
- import { debounce } from "$lib/utils/debounce";
5
- import { onMount } from "svelte";
6
- import ToolLogo from "./ToolLogo.svelte";
7
-
8
- import CarbonClose from "~icons/carbon/close";
9
-
10
- interface ToolSuggestion {
11
- _id: string;
12
- displayName: string;
13
- createdByName: string;
14
- color: ToolLogoColor;
15
- icon: ToolLogoIcon;
16
- }
17
-
18
- interface Props {
19
- toolIds?: string[];
20
- }
21
-
22
- let { toolIds = $bindable([]) }: Props = $props();
23
-
24
- let selectedValues: ToolSuggestion[] = $state([]);
25
-
26
- onMount(async () => {
27
- selectedValues = await Promise.all(
28
- toolIds.map(async (id) => await fetch(`${base}/api/tools/${id}`).then((res) => res.json()))
29
- );
30
-
31
- await fetchSuggestions("");
32
- });
33
-
34
- let inputValue = $state("");
35
- let maxValues = 3;
36
-
37
- let suggestions: ToolSuggestion[] = $state([]);
38
-
39
- async function fetchSuggestions(query: string) {
40
- suggestions = (await fetch(`${base}/api/tools/search?q=${query}`).then((res) =>
41
- res.json()
42
- )) satisfies ToolSuggestion[];
43
- }
44
-
45
- const debouncedFetch = debounce((query: string) => fetchSuggestions(query), 300);
46
-
47
- function addValue(value: ToolSuggestion) {
48
- if (selectedValues.length < maxValues && !selectedValues.includes(value)) {
49
- selectedValues = [...selectedValues, value];
50
- toolIds = [...toolIds, value._id];
51
- inputValue = "";
52
- suggestions = [];
53
- }
54
- }
55
-
56
- function removeValue(id: ToolSuggestion["_id"]) {
57
- selectedValues = selectedValues.filter((v) => v._id !== id);
58
- toolIds = selectedValues.map((value) => value._id);
59
- }
60
- </script>
61
-
62
- {#if selectedValues.length > 0}
63
- <div class="flex flex-wrap items-center justify-center gap-2">
64
- {#each selectedValues as value}
65
- <div
66
- class="flex items-center justify-center space-x-2 rounded border border-gray-300 bg-gray-200 px-2 py-1"
67
- >
68
- {#key value.color + value.icon}
69
- <ToolLogo color={value.color} icon={value.icon} size="sm" />
70
- {/key}
71
- <div class="flex flex-col items-center justify-center py-1">
72
- <a
73
- href={`${base}/tools/${value._id}`}
74
- target="_blank"
75
- class="line-clamp-1 truncate font-semibold text-blue-600 hover:underline"
76
- >{value.displayName}</a
77
- >
78
- {#if value.createdByName}
79
- <p class="text-center text-xs text-gray-500">
80
- Created by
81
- <a class="underline" href="{base}/tools?user={value.createdByName}" target="_blank"
82
- >{value.createdByName}</a
83
- >
84
- </p>
85
- {:else}
86
- <p class="text-center text-xs text-gray-500">Official HuggingChat tool</p>
87
- {/if}
88
- </div>
89
- <button
90
- onclick={(e) => {
91
- e.preventDefault();
92
- e.stopPropagation();
93
- removeValue(value._id);
94
- }}
95
- class="text-lg text-gray-600"
96
- >
97
- <CarbonClose />
98
- </button>
99
- </div>
100
- {/each}
101
- </div>
102
- {/if}
103
-
104
- {#if selectedValues.length < maxValues}
105
- <div class="group relative block">
106
- <input
107
- type="text"
108
- bind:value={inputValue}
109
- oninput={(ev) => {
110
- inputValue = ev.currentTarget.value;
111
- debouncedFetch(inputValue);
112
- }}
113
- disabled={selectedValues.length >= maxValues}
114
- class="w-full rounded border border-gray-200 bg-gray-100 px-3 py-2"
115
- class:opacity-50={selectedValues.length >= maxValues}
116
- class:bg-gray-100={selectedValues.length >= maxValues}
117
- placeholder="Type to search tools..."
118
- tabindex="0"
119
- />
120
- {#if suggestions.length > 0}
121
- <div
122
- class="invisible absolute z-10 mt-1 w-full rounded border border-gray-300 bg-white shadow-lg group-focus-within:visible"
123
- tabindex="-1"
124
- >
125
- {#if inputValue === ""}
126
- <p class="px-3 py-2 text-left text-xs text-gray-500">
127
- Start typing to search for tools...
128
- </p>
129
- {:else}
130
- {#each suggestions as suggestion}
131
- <button
132
- onclick={(e) => {
133
- e.preventDefault();
134
- e.stopPropagation();
135
- addValue(suggestion);
136
- }}
137
- class="w-full cursor-pointer px-3 py-2 text-left hover:bg-blue-500 hover:text-white"
138
- tabindex="0"
139
- >
140
- {suggestion.displayName}
141
- {#if suggestion.createdByName}
142
- <span class="text-xs text-gray-500"> by {suggestion.createdByName}</span>
143
- {/if}
144
- </button>
145
- {/each}
146
- {/if}
147
- </div>
148
- {/if}
149
- </div>
150
- {/if}
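Note on the removed picker: `fetchSuggestions` interpolates the raw query into the URL, and `addValue` checks for duplicates with `includes`, which compares object references rather than `_id` values. A minimal sketch of the same two steps with URL encoding and id-based de-duplication follows. The names `ToolRef`, `MAX_TOOLS`, `buildSearchUrl`, and `addTool` are hypothetical and were not part of the deleted component.

```typescript
// Hypothetical helpers; none of these names exist in the removed component.
interface ToolRef {
	_id: string;
	displayName: string;
}

// Mirrors `maxValues = 3` in the removed component.
const MAX_TOOLS = 3;

// Build the search URL with the query safely encoded.
function buildSearchUrl(base: string, query: string): string {
	return `${base}/api/tools/search?q=${encodeURIComponent(query)}`;
}

// Return a new selection, ignoring duplicates (by _id) and respecting the limit.
function addTool(selected: ToolRef[], candidate: ToolRef): ToolRef[] {
	const alreadySelected = selected.some((tool) => tool._id === candidate._id);
	if (selected.length >= MAX_TOOLS || alreadySelected) {
		return selected;
	}
	return [...selected, candidate];
}
```

Comparing by `_id` means a suggestion fetched twice is still recognized as the same tool even when the two response objects differ by reference.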
src/lib/components/CodeBlock.svelte CHANGED
@@ -1,22 +1,74 @@
  <script lang="ts">
  import CopyToClipBoardBtn from "./CopyToClipBoardBtn.svelte";
  import DOMPurify from "isomorphic-dompurify";
+ import HtmlPreviewModal from "./HtmlPreviewModal.svelte";
+ import PlayFilledAlt from "~icons/carbon/play-filled-alt";
+ import EosIconsLoading from "~icons/eos-icons/loading";
 
  interface Props {
  code?: string;
  rawCode?: string;
+ loading?: boolean;
  }
 
- let { code = "", rawCode = "" }: Props = $props();
+ let { code = "", rawCode = "", loading = false }: Props = $props();
+
+ let previewOpen = $state(false);
+
+ function hasStrictHtml5Doctype(input: string): boolean {
+ if (!input) return false;
+ const withoutBOM = input.replace(/^\uFEFF/, "");
+ const trimmed = withoutBOM.trimStart();
+ // Strict HTML5 doctype: <!doctype html> with optional whitespace before >
+ return /^<!doctype\s+html\s*>/i.test(trimmed);
+ }
+
+ function isSvgDocument(input: string): boolean {
+ const trimmed = input.trimStart();
+ return /^(?:<\?xml[^>]*>\s*)?(?:<!doctype\s+svg[^>]*>\s*)?<svg[\s>]/i.test(trimmed);
+ }
+
+ let showPreview = $derived(hasStrictHtml5Doctype(rawCode) || isSvgDocument(rawCode));
  </script>
 
  <div class="group relative my-4 rounded-lg">
+ <div class="pointer-events-none sticky top-0 z-10 w-full">
+ <div
+ class="pointer-events-auto absolute right-2 top-2 flex items-center gap-1.5 md:right-3 md:top-3"
+ >
+ {#if showPreview}
+ <button
+ class="btn h-7 gap-1 rounded-lg border border-gray-600 bg-gray-600/50 px-2 text-xs text-gray-300 shadow-sm backdrop-blur transition-all hover:border-gray-500 active:shadow-inner disabled:cursor-not-allowed disabled:opacity-60 dark:border-gray-700 dark:text-gray-400 dark:hover:border-gray-500"
+ disabled={loading}
+ onclick={() => {
+ if (!loading) {
+ previewOpen = true;
+ }
+ }}
+ title="Preview HTML"
+ aria-label="Preview HTML"
+ >
+ {#if loading}
+ <EosIconsLoading class="size-3.5" />
+ {:else}
+ <PlayFilledAlt class="size-3.5" />
+ {/if}
+ Preview
+ </button>
+ {/if}
+ <CopyToClipBoardBtn
+ iconClassNames="size-3"
+ classNames="btn rounded-lg border size-7 text-sm shadow-sm transition-all bg-gray-600/50 backdrop-blur dark:hover:border-gray-500 active:shadow-inner border-gray-600 dark:border-gray-700 hover:border-gray-500 dark:text-gray-400 text-gray-300 "
+ value={rawCode}
+ />
+ </div>
+ </div>
  <pre
- class="scrollbar-custom overflow-auto px-5 font-mono scrollbar-thumb-gray-500 hover:scrollbar-thumb-gray-400 dark:scrollbar-thumb-white/10 dark:hover:scrollbar-thumb-white/20"><code
+ class="scrollbar-custom overflow-auto px-5 font-mono transition-[height] scrollbar-thumb-gray-500 hover:scrollbar-thumb-gray-400 dark:scrollbar-thumb-white/10 dark:hover:scrollbar-thumb-white/20"><code
  ><!-- eslint-disable svelte/no-at-html-tags -->{@html DOMPurify.sanitize(code)}</code
  ></pre>
- <CopyToClipBoardBtn
- classNames="btn rounded-lg border border-gray-200 px-2 py-2 text-sm shadow-sm transition-all hover:border-gray-300 active:shadow-inner dark:border-gray-700 dark:hover:border-gray-500 absolute top-2 right-2 invisible opacity-0 group-hover:visible group-hover:opacity-100 dark:text-gray-700 text-gray-200"
- value={rawCode}
- />
+
+ {#if previewOpen}
+ <HtmlPreviewModal html={rawCode} onclose={() => (previewOpen = false)} />
+ {/if}
  </div>
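Note on the new preview gate: the Preview button is shown only when `rawCode` starts with a strict HTML5 doctype or looks like an SVG document. The standalone sketch below reuses the two regular expressions from the added lines so the behavior can be checked outside Svelte. The wrapper `shouldShowPreview` and the helper `stripLeadingWhitespace` are hypothetical names, and the leading-whitespace regex stands in for `trimStart()` purely for portability to older compile targets.

```typescript
// Standalone sketch; `stripLeadingWhitespace` and `shouldShowPreview` are illustrative names.
function stripLeadingWhitespace(input: string): string {
	// Stand-in for String.prototype.trimStart() used in the component.
	return input.replace(/^\s+/, "");
}

function hasStrictHtml5Doctype(input: string): boolean {
	if (!input) return false;
	const withoutBOM = input.replace(/^\uFEFF/, "");
	// Only <!doctype html> (any case, optional whitespace before >) qualifies.
	return /^<!doctype\s+html\s*>/i.test(stripLeadingWhitespace(withoutBOM));
}

function isSvgDocument(input: string): boolean {
	// Optional XML declaration and SVG doctype, then an <svg> root element.
	return /^(?:<\?xml[^>]*>\s*)?(?:<!doctype\s+svg[^>]*>\s*)?<svg[\s>]/i.test(
		stripLeadingWhitespace(input)
	);
}

function shouldShowPreview(rawCode: string): boolean {
	return hasStrictHtml5Doctype(rawCode) || isSvgDocument(rawCode);
}
```

Under these patterns an HTML fragment such as `<div>hi</div>` gets no Preview button, and neither does a legacy doctype that carries a `PUBLIC` identifier.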