
# Configuration

## Environment Variables

All configuration is done via environment variables. Create a `.env` file in the project root.

### Access Control

| Variable | Default | Description |
| --- | --- | --- |
| `ACCESS_KEYS` | `''` | Comma-separated list of valid access keys (e.g., `'key1,key2,key3'`) |
| `ACCESS_KEY_TIMEOUT_HOURS` | `24` | Hours to cache validated keys in the browser; set to `0` to require validation on every request |

Example:

```sh
ACCESS_KEYS="my-secret-key-1,my-secret-key-2"
ACCESS_KEY_TIMEOUT_HOURS="24"
```
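On the server side, validating a submitted key reduces to a membership check against the parsed `ACCESS_KEYS` list. A minimal sketch, assuming a hypothetical `isValidAccessKey` helper (not MiniSearch's actual implementation):

```typescript
// Hypothetical helper: parse the ACCESS_KEYS value and check a submitted key.
// An empty ACCESS_KEYS disables access control, so every key is accepted.
function isValidAccessKey(accessKeysEnv: string, submitted: string): boolean {
  const keys = accessKeysEnv
    .split(",")
    .map((key) => key.trim())
    .filter((key) => key.length > 0);
  if (keys.length === 0) return true; // no keys configured: open access
  return keys.includes(submitted.trim());
}
```

The `ACCESS_KEY_TIMEOUT_HOURS` caching happens on the client side; once that window expires, the browser submits the key for validation again.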

### AI Model Defaults

Configure default models for the different inference types:

| Variable | Default | Description |
| --- | --- | --- |
| `WEBLLM_DEFAULT_F16_MODEL_ID` | `Qwen3-0.6B-q4f16_1-MLC` | Default WebLLM model with F16 shaders (requires WebGPU with F16 support) |
| `WEBLLM_DEFAULT_F32_MODEL_ID` | `Qwen3-0.6B-q4f32_1-MLC` | Default WebLLM model with F32 shaders (fallback for WebGPU devices without F16 support) |
| `WLLAMA_DEFAULT_MODEL_ID` | `qwen-3-0.6b` | Default Wllama model (CPU-based, no WebGPU required) |

**Model selection notes:**

- F16 models are faster but require WebGPU with F16 shader support
- F32 models work on all WebGPU-capable devices
- Wllama models run on the CPU via WebAssembly (slower but most compatible)
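The fallback order implied by these notes can be expressed as a small selector. This sketch hardcodes the default model IDs from the table above and assumes the WebGPU and F16 capability flags were detected elsewhere (the function itself is illustrative, not MiniSearch's actual logic):

```typescript
// Illustrative selector mirroring the documented fallback order.
// caps.webgpu: WebGPU is available; caps.f16: F16 shader support was detected.
function pickDefaultModel(caps: { webgpu: boolean; f16: boolean }): string {
  if (caps.webgpu && caps.f16) return "Qwen3-0.6B-q4f16_1-MLC"; // fastest: F16 shaders
  if (caps.webgpu) return "Qwen3-0.6B-q4f32_1-MLC"; // F32 shaders, any WebGPU device
  return "qwen-3-0.6b"; // Wllama: CPU via WebAssembly, most compatible
}
```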

### Internal API Configuration

For self-hosted OpenAI-compatible APIs:

| Variable | Default | Description |
| --- | --- | --- |
| `INTERNAL_OPENAI_COMPATIBLE_API_BASE_URL` | `''` | Base URL of your API (e.g., `https://api.internal.company.com/v1`) |
| `INTERNAL_OPENAI_COMPATIBLE_API_KEY` | `''` | API key for authentication |
| `INTERNAL_OPENAI_COMPATIBLE_API_MODEL` | `''` | Model ID to use (auto-detected if empty) |
| `INTERNAL_OPENAI_COMPATIBLE_API_NAME` | `Internal API` | Display name shown in the UI |

Example:

```sh
INTERNAL_OPENAI_COMPATIBLE_API_BASE_URL="https://llm.internal.company.com/v1"
INTERNAL_OPENAI_COMPATIBLE_API_KEY="sk-internal-xxx"
INTERNAL_OPENAI_COMPATIBLE_API_MODEL="llama-3.1-8b"
INTERNAL_OPENAI_COMPATIBLE_API_NAME="Company LLM"
```
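Requests to such an endpoint follow the standard OpenAI chat-completions contract. A hedged sketch of assembling one from the variables above (the helper name and return shape are illustrative, not MiniSearch internals):

```typescript
// Illustrative: assemble a chat-completions request for an OpenAI-compatible API.
// The first three parameters correspond to the INTERNAL_OPENAI_COMPATIBLE_* variables.
function buildChatRequest(
  baseUrl: string,
  apiKey: string,
  model: string,
  userMessage: string,
): { url: string; method: string; headers: Record<string, string>; body: string } {
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/chat/completions`, // tolerate a trailing slash
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({
      model,
      messages: [{ role: "user", content: userMessage }],
    }),
  };
}
```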

### Default Behavior

| Variable | Default | Description |
| --- | --- | --- |
| `DEFAULT_INFERENCE_TYPE` | `browser` | Default AI inference type (`browser`, `openai`, `horde`, or `internal`) |

## Application Settings

Settings are stored in the browser's `localStorage` and can be changed via the Settings UI.

### Core Settings

| Setting | Type | Default | Description |
| --- | --- | --- | --- |
| `enableAiResponse` | boolean | `false` | Enable AI-generated responses for searches |
| `enableWebGpu` | boolean | `true` | Use WebGPU acceleration when available |
| `enableImageSearch` | boolean | `true` | Include image results in searches |
| `searchResultsToConsider` | number | `3` | Number of top search results to include in the AI context |
| `searchResultsLimit` | number | `15` | Maximum number of search results to fetch |
| `systemPrompt` | string | (template) | Custom system prompt template for the AI |

### Inference Settings

| Setting | Type | Default | Description |
| --- | --- | --- | --- |
| `inferenceType` | enum | `'browser'` | AI provider: `browser`, `openai`, `horde`, or `internal` |
| `inferenceTemperature` | number | `0.7` | Sampling temperature (0.0-1.0) |
| `inferenceTopP` | number | `0.9` | Nucleus sampling parameter |
| `inferenceMaxTokens` | number | `4096` | Maximum tokens per generation |
| `inferenceTopK` | number | `40` | Top-K sampling parameter (browser only) |
| `minP` | number | `0.1` | Min-p sampling threshold |
| `repeatPenalty` | number | `1.1` | Penalty for token repetition |
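The table above maps naturally onto a typed settings object. A sketch with the documented defaults (the interface itself is illustrative, not MiniSearch's actual type):

```typescript
// Illustrative type and defaults matching the documented inference settings.
interface InferenceSettings {
  inferenceType: "browser" | "openai" | "horde" | "internal";
  inferenceTemperature: number; // sampling temperature, 0.0-1.0
  inferenceTopP: number; // nucleus sampling
  inferenceMaxTokens: number;
  inferenceTopK: number; // browser inference only
  minP: number;
  repeatPenalty: number;
}

const defaultInferenceSettings: InferenceSettings = {
  inferenceType: "browser",
  inferenceTemperature: 0.7,
  inferenceTopP: 0.9,
  inferenceMaxTokens: 4096,
  inferenceTopK: 40,
  minP: 0.1,
  repeatPenalty: 1.1,
};
```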

### Model Selection

**WebLLM models:**

- Use the MLC LLM model registry
- Models are loaded from HuggingFace
- Common options: Qwen3-0.6B, SmolLM2-1.7B, Llama-3.2-1B

**Wllama models:**

- 40+ pre-configured models
- Range from 135M to 3.8B parameters
- All quantized to Q4_K_S or UD-Q4_K_XL
- Stored at `Felladrin/gguf-sharded-*` on HuggingFace

**OpenAI/Internal:**

- Works with any OpenAI-compatible API
- Auto-detects the model if none is specified
- Supports streaming and reasoning models

**AI Horde:**

- Uses the aihorde.net distributed network
- Anonymous or authenticated access
- Starts generation requests in parallel and uses the first one to complete
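The parallel-generation behavior can be sketched with `Promise.race`: several requests start at once and whichever resolves first wins. The worker functions here are stand-ins, not the actual Horde client:

```typescript
// Illustrative race: start several generation attempts in parallel and
// resolve with whichever finishes first; slower responses are discarded.
async function raceGenerations<T>(workers: Array<() => Promise<T>>): Promise<T> {
  return Promise.race(workers.map((start) => start()));
}
```

With AI Horde, each worker would correspond to one request against the distributed network.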

### History Settings

| Setting | Type | Default | Description |
| --- | --- | --- | --- |
| `historyRetentionDays` | number | `30` | Days to keep search history |
| `historyMaxEntries` | number | `1000` | Maximum history entries before cleanup |
| `enableHistorySync` | boolean | `true` | Save history to IndexedDB |
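Retention days and the entry cap combine into a single pruning rule. A sketch, with a hypothetical entry shape and function name (not MiniSearch's actual history code):

```typescript
// Illustrative pruning rule combining both history limits.
interface HistoryEntry {
  query: string;
  timestamp: number; // milliseconds since the epoch
}

function pruneHistory(
  entries: HistoryEntry[],
  retentionDays: number,
  maxEntries: number,
  now: number = Date.now(),
): HistoryEntry[] {
  const cutoff = now - retentionDays * 24 * 60 * 60 * 1000;
  return entries
    .filter((entry) => entry.timestamp >= cutoff) // drop expired entries
    .sort((a, b) => b.timestamp - a.timestamp) // newest first
    .slice(0, maxEntries); // enforce the entry cap
}
```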

### Privacy Settings

| Setting | Type | Default | Description |
| --- | --- | --- | --- |
| `enableTelemetry` | boolean | `false` | Enable anonymous usage analytics |
| `shareModelDownloads` | boolean | `true` | Share model downloads via WebRTC (peer-to-peer) |

## Docker Configuration

### docker-compose.yml (Development)

```yaml
services:
  development-server:
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - "7861:7860"  # App
      - "8888:8888"  # SearXNG
    environment:
      - ACCESS_KEYS=${ACCESS_KEYS:-}
      - ACCESS_KEY_TIMEOUT_HOURS=${ACCESS_KEY_TIMEOUT_HOURS:-24}
      - WEBLLM_DEFAULT_F16_MODEL_ID=${WEBLLM_DEFAULT_F16_MODEL_ID:-Qwen3-0.6B-q4f16_1-MLC}
      # ... more env vars
    volumes:
      - .:/home/user/app  # Live code mounting
      - /home/user/app/node_modules
```

### docker-compose.production.yml

Same structure but without volume mounts and with pre-built assets.

### Dockerfile Environment

The Dockerfile sets up:

1. **Builder stage:** compiles `llama-server` from llama.cpp
2. **Runtime stage:**
   - Node.js LTS
   - Python 3 + SearXNG
   - `llama-server` binary

The multi-service container runs all three (the Node.js app, SearXNG, and `llama-server`) concurrently via shell process composition.

## Vite Environment Injection

Environment variables are injected at build time via `vite.config.ts`:

```
// Injected into import.meta.env
VITE_SEARCH_TOKEN
VITE_ACCESS_KEYS_ENABLED
VITE_WEBLLM_DEFAULT_F16_MODEL_ID
VITE_WEBLLM_DEFAULT_F32_MODEL_ID
VITE_WLLAMA_DEFAULT_MODEL_ID
VITE_INTERNAL_API_ENABLED
VITE_DEFAULT_INFERENCE_TYPE
```

These variables are accessed in client code as:

```ts
const token = import.meta.env.VITE_SEARCH_TOKEN;
```
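Conceptually, the injection is a prefix filter over the environment. A hedged sketch of such a helper (MiniSearch's actual `vite.config.ts` may wire this differently):

```typescript
// Illustrative prefix filter: collect VITE_-prefixed variables from an
// environment map, as a build-time injection step might.
function collectViteEnv(
  env: Record<string, string | undefined>,
): Record<string, string> {
  const injected: Record<string, string> = {};
  for (const [name, value] of Object.entries(env)) {
    if (name.startsWith("VITE_") && value !== undefined) injected[name] = value;
  }
  return injected;
}
```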

## Configuration Patterns

### Scenario: Private Team Instance

```sh
# .env
ACCESS_KEYS="team-alpha-2024,team-beta-2024"
ACCESS_KEY_TIMEOUT_HOURS="8"
DEFAULT_INFERENCE_TYPE="internal"
INTERNAL_OPENAI_COMPATIBLE_API_BASE_URL="https://llm.company.com/v1"
INTERNAL_OPENAI_COMPATIBLE_API_KEY="sk-xxx"
INTERNAL_OPENAI_COMPATIBLE_API_MODEL="llama-3.1-70b"
```

### Scenario: Public Demo (No AI)

```sh
# .env - empty, no access keys
# AI disabled by default in settings
```

### Scenario: Browser-Only AI

```sh
# .env - minimal or empty
# Users choose WebLLM or Wllama in settings
# Models download to the user's browser (no server-side AI)
```

## Debugging Configuration

Enable verbose logging:

```js
// In browser console
localStorage.setItem('debug', 'minisearch:*');
```

Check the effective configuration:

```js
// In browser console
console.log('Settings:', JSON.parse(localStorage.getItem('settings') || '{}'));
console.log('Env:', import.meta.env);
```

## Related Topics

- **AI Integration** (`docs/ai-integration.md`): detailed inference-type configuration
- **Security** (`docs/security.md`): access control and privacy details
- **Deployment** (`docs/overview.md`): container architecture and production setup