# Configuration

## Environment Variables

All configuration is done via environment variables. Create a `.env` file in the project root.
### Access Control

| Variable | Default | Description |
|---|---|---|
| `ACCESS_KEYS` | `''` | Comma-separated list of valid access keys (e.g., `key1,key2,key3`) |
| `ACCESS_KEY_TIMEOUT_HOURS` | `24` | Hours to cache validated keys in the browser. Set to `0` to require validation on every request |
Example:

```env
ACCESS_KEYS="my-secret-key-1,my-secret-key-2"
ACCESS_KEY_TIMEOUT_HOURS="24"
```
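The timeout behavior can be sketched as a small helper; this is a minimal illustration assuming the app records the epoch-ms time a key was last validated — the function name and storage details are not the actual implementation.

```typescript
// Sketch of the key-cache expiry check. Illustrative names only.
const HOUR_MS = 60 * 60 * 1000;

function isKeyValidationFresh(
  validatedAt: number, // epoch ms of the last successful validation
  timeoutHours: number, // ACCESS_KEY_TIMEOUT_HOURS
  now: number = Date.now(),
): boolean {
  // A timeout of 0 means "validate on every request".
  if (timeoutHours <= 0) return false;
  return now - validatedAt < timeoutHours * HOUR_MS;
}
```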
### AI Model Defaults

Configure default models for different inference types:

| Variable | Default | Description |
|---|---|---|
| `WEBLLM_DEFAULT_F16_MODEL_ID` | `Qwen3-0.6B-q4f16_1-MLC` | Default WebLLM model with F16 shaders (requires WebGPU) |
| `WEBLLM_DEFAULT_F32_MODEL_ID` | `Qwen3-0.6B-q4f32_1-MLC` | Default WebLLM model with F32 shaders (CPU fallback) |
| `WLLAMA_DEFAULT_MODEL_ID` | `qwen-3-0.6b` | Default Wllama model (CPU-based, no WebGPU required) |
**Model Selection Notes:**

- F16 models are faster but require WebGPU with F16 shader support
- F32 models work on all WebGPU-capable devices
- Wllama models run on CPU via WebAssembly (slower but most compatible)
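The fallback order above can be sketched with a capability probe: check for WebGPU, then for F16 shader support. The probe uses the standard WebGPU API; `pickBackend`/`detectBackend` are illustrative names, not the app's real functions.

```typescript
// Minimal sketch of backend selection based on the notes above.
type InferenceBackend = "webllm-f16" | "webllm-f32" | "wllama";

function pickBackend(hasWebGpu: boolean, hasF16Shaders: boolean): InferenceBackend {
  if (!hasWebGpu) return "wllama"; // CPU via WebAssembly: most compatible
  return hasF16Shaders ? "webllm-f16" : "webllm-f32";
}

// Browser-side probe using the standard WebGPU API:
async function detectBackend(): Promise<InferenceBackend> {
  const adapter = await (globalThis as any).navigator?.gpu?.requestAdapter();
  if (!adapter) return pickBackend(false, false);
  return pickBackend(true, adapter.features.has("shader-f16"));
}
```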
### Internal API Configuration

For self-hosted OpenAI-compatible APIs:

| Variable | Default | Description |
|---|---|---|
| `INTERNAL_OPENAI_COMPATIBLE_API_BASE_URL` | `''` | Base URL of your API (e.g., `https://api.internal.company.com/v1`) |
| `INTERNAL_OPENAI_COMPATIBLE_API_KEY` | `''` | API key for authentication |
| `INTERNAL_OPENAI_COMPATIBLE_API_MODEL` | `''` | Model ID to use (auto-detected if empty) |
| `INTERNAL_OPENAI_COMPATIBLE_API_NAME` | `Internal API` | Display name shown in the UI |
Example:

```env
INTERNAL_OPENAI_COMPATIBLE_API_BASE_URL="https://llm.internal.company.com/v1"
INTERNAL_OPENAI_COMPATIBLE_API_KEY="sk-internal-xxx"
INTERNAL_OPENAI_COMPATIBLE_API_MODEL="llama-3.1-8b"
INTERNAL_OPENAI_COMPATIBLE_API_NAME="Company LLM"
```
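With these values, a client request to the API follows the standard OpenAI-compatible shape. The sketch below builds such a request; `buildChatRequest` is illustrative, not the app's actual client code.

```typescript
// Assemble a standard OpenAI-compatible /chat/completions request.
function buildChatRequest(baseUrl: string, apiKey: string, model: string, prompt: string) {
  return {
    url: `${baseUrl.replace(/\/$/, "")}/chat/completions`, // tolerate trailing slash
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({
        model,
        messages: [{ role: "user", content: prompt }],
        stream: true,
      }),
    },
  };
}
```

Usage: `const { url, init } = buildChatRequest(base, key, "llama-3.1-8b", "Hi"); const response = await fetch(url, init);`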
### Default Behavior

| Variable | Default | Description |
|---|---|---|
| `DEFAULT_INFERENCE_TYPE` | `browser` | Default AI inference type (`browser`, `openai`, `horde`, `internal`) |
## Application Settings

Settings are stored in browser localStorage and can be changed via the Settings UI.
### Core Settings

| Setting | Type | Default | Description |
|---|---|---|---|
| `enableAiResponse` | boolean | `false` | Enable AI-generated responses for searches |
| `enableWebGpu` | boolean | `true` | Use WebGPU acceleration when available |
| `enableImageSearch` | boolean | `true` | Include image results in searches |
| `searchResultsToConsider` | number | `3` | Number of top search results to include in AI context |
| `searchResultsLimit` | number | `15` | Maximum search results to fetch |
| `systemPrompt` | string | (template) | Custom system prompt template for AI |
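A common pattern for localStorage-backed settings like these is to merge persisted values over the defaults, so newly added settings still get sane fallbacks. This is a sketch under that assumption; the interface and function names are illustrative (the storage key `settings` matches the debugging snippet later on this page).

```typescript
// Defaults mirror the Core Settings table above.
interface CoreSettings {
  enableAiResponse: boolean;
  enableWebGpu: boolean;
  enableImageSearch: boolean;
  searchResultsToConsider: number;
  searchResultsLimit: number;
}

const defaultSettings: CoreSettings = {
  enableAiResponse: false,
  enableWebGpu: true,
  enableImageSearch: true,
  searchResultsToConsider: 3,
  searchResultsLimit: 15,
};

function loadSettings(stored: string | null): CoreSettings {
  // Persisted values win; anything missing falls back to the default.
  return { ...defaultSettings, ...(stored ? JSON.parse(stored) : {}) };
}

// In the browser: loadSettings(localStorage.getItem("settings"))
```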
### Inference Settings

| Setting | Type | Default | Description |
|---|---|---|---|
| `inferenceType` | enum | `'browser'` | AI provider: `browser`, `openai`, `horde`, `internal` |
| `inferenceTemperature` | number | `0.7` | Sampling temperature (0.0-1.0) |
| `inferenceTopP` | number | `0.9` | Nucleus sampling parameter |
| `inferenceMaxTokens` | number | `4096` | Maximum tokens per generation |
| `inferenceTopK` | number | `40` | Top-K sampling parameter (browser only) |
| `minP` | number | `0.1` | Min-p sampling threshold |
| `repeatPenalty` | number | `1.1` | Penalty for token repetition |
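For the API-backed providers, these camelCase settings map onto the snake_case fields OpenAI-style APIs accept. A minimal sketch of that mapping, with the temperature clamped to the documented 0.0-1.0 range (the function name is an assumption, not the app's real code):

```typescript
// Map inference settings to OpenAI-style generation options.
function toGenerationOptions(s: {
  inferenceTemperature: number;
  inferenceTopP: number;
  inferenceMaxTokens: number;
}) {
  return {
    temperature: Math.min(Math.max(s.inferenceTemperature, 0), 1), // clamp to 0.0-1.0
    top_p: s.inferenceTopP,
    max_tokens: s.inferenceMaxTokens,
  };
}
```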
### Model Selection

**WebLLM Models:**

- Uses the MLC LLM model registry
- Models are loaded from HuggingFace
- Common options: `Qwen3-0.6B`, `SmolLM2-1.7B`, `Llama-3.2-1B`

**Wllama Models:**

- 40+ pre-configured models
- Range from 135M to 3.8B parameters
- All quantized to Q4_K_S or UD-Q4_K_XL
- Stored at `Felladrin/gguf-sharded-*` on HuggingFace

**OpenAI/Internal:**

- Any OpenAI-compatible API
- Auto-detects the model if none is specified
- Supports streaming and reasoning models
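The auto-detection step presumably lists models via the standard OpenAI-compatible `/models` endpoint and falls back to the first entry when no model is configured. A hedged sketch (`pickFirstModelId` is an illustrative name):

```typescript
// Pick the first model ID from a standard /models listing response.
function pickFirstModelId(listing: { data: { id: string }[] }): string | null {
  return listing.data[0]?.id ?? null;
}

// Usage (assuming baseUrl/headers from the Internal API configuration):
//   const listing = await (await fetch(`${baseUrl}/models`, { headers })).json();
//   const model = configuredModel || pickFirstModelId(listing);
```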
**AI Horde:**

- Uses the aihorde.net distributed network
- Anonymous or authenticated access
- Races parallel generation requests, using the first completed response
### History Settings

| Setting | Type | Default | Description |
|---|---|---|---|
| `historyRetentionDays` | number | `30` | Days to keep search history |
| `historyMaxEntries` | number | `1000` | Maximum history entries before cleanup |
| `enableHistorySync` | boolean | `true` | Save history to IndexedDB |
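The retention and cap settings combine naturally into a prune step. This sketch assumes each entry records an epoch-ms timestamp; the entry shape and function name are illustrative, not the app's actual schema.

```typescript
// Drop entries past the retention window, then enforce the entry cap.
interface HistoryEntry { query: string; timestamp: number; }

function pruneHistory(
  entries: HistoryEntry[],
  retentionDays: number,
  maxEntries: number,
  now: number = Date.now(),
): HistoryEntry[] {
  const cutoff = now - retentionDays * 24 * 60 * 60 * 1000;
  return entries
    .filter((e) => e.timestamp >= cutoff)       // expired entries go first
    .sort((a, b) => b.timestamp - a.timestamp)  // newest first
    .slice(0, maxEntries);                      // then enforce the cap
}
```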
### Privacy Settings

| Setting | Type | Default | Description |
|---|---|---|---|
| `enableTelemetry` | boolean | `false` | Enable anonymous usage analytics |
| `shareModelDownloads` | boolean | `true` | Share model downloads via WebRTC (peer-to-peer) |
## Docker Configuration

### docker-compose.yml (Development)

```yaml
services:
  development-server:
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - "7861:7860" # App
      - "8888:8888" # SearXNG
    environment:
      - ACCESS_KEYS=${ACCESS_KEYS:-}
      - ACCESS_KEY_TIMEOUT_HOURS=${ACCESS_KEY_TIMEOUT_HOURS:-24}
      - WEBLLM_DEFAULT_F16_MODEL_ID=${WEBLLM_DEFAULT_F16_MODEL_ID:-Qwen3-0.6B-q4f16_1-MLC}
      # ... more env vars
    volumes:
      - .:/home/user/app # Live code mounting
      - /home/user/app/node_modules
```
### docker-compose.production.yml

Same structure, but without volume mounts and with pre-built assets.
### Dockerfile Environment

The Dockerfile sets up:

- Builder stage: compiles `llama-server` from llama.cpp
- Runtime stage:
  - Node.js LTS
  - Python 3 + SearXNG
  - `llama-server` binary

The multi-service container runs all three concurrently via shell process composition.
## Vite Environment Injection

Environment variables are injected at build time via `vite.config.ts`:

```ts
// Injected into import.meta.env
VITE_SEARCH_TOKEN
VITE_ACCESS_KEYS_ENABLED
VITE_WEBLLM_DEFAULT_F16_MODEL_ID
VITE_WEBLLM_DEFAULT_F32_MODEL_ID
VITE_WLLAMA_DEFAULT_MODEL_ID
VITE_INTERNAL_API_ENABLED
VITE_DEFAULT_INFERENCE_TYPE
```

These are accessed in client code as:

```ts
const token = import.meta.env.VITE_SEARCH_TOKEN;
```
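One common way such values reach `import.meta.env` is through Vite's `define` option. The sketch below is a minimal, hypothetical `vite.config.ts` fragment showing the mechanism only; the project's actual config may differ.

```typescript
// Hypothetical vite.config.ts sketch: inject a server-side env var at build
// time so client code can read import.meta.env.VITE_DEFAULT_INFERENCE_TYPE.
import { defineConfig } from "vite";

export default defineConfig({
  define: {
    "import.meta.env.VITE_DEFAULT_INFERENCE_TYPE": JSON.stringify(
      process.env.DEFAULT_INFERENCE_TYPE ?? "browser",
    ),
  },
});
```

Values injected via `define` are string-replaced into the bundle, which is why they must be serialized with `JSON.stringify`.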
## Configuration Patterns

### Scenario: Private Team Instance

```env
# .env
ACCESS_KEYS="team-alpha-2024,team-beta-2024"
ACCESS_KEY_TIMEOUT_HOURS="8"
DEFAULT_INFERENCE_TYPE="internal"
INTERNAL_OPENAI_COMPATIBLE_API_BASE_URL="https://llm.company.com/v1"
INTERNAL_OPENAI_COMPATIBLE_API_KEY="sk-xxx"
INTERNAL_OPENAI_COMPATIBLE_API_MODEL="llama-3.1-70b"
```

### Scenario: Public Demo (No AI)

```env
# .env - empty, no access keys
# AI disabled by default in settings
```

### Scenario: Browser-Only AI

```env
# .env - minimal or empty
# Users choose WebLLM or Wllama in settings
# Models download to the user's browser (no server AI)
```
## Debugging Configuration

Enable verbose logging:

```js
// In browser console
localStorage.setItem('debug', 'minisearch:*');
```

Check the effective configuration:

```js
// In browser console
console.log('Settings:', JSON.parse(localStorage.getItem('settings') || '{}'));
console.log('Env:', import.meta.env);
```
## Related Topics

- AI Integration: `docs/ai-integration.md` - Detailed inference type configuration
- Security: `docs/security.md` - Access control and privacy details
- Deployment: `docs/overview.md` - Container architecture and production setup