# Glossary

Codebase-specific terms, jargon, and domain concepts used in MiniSearch.

## Core System Concepts

### Search Token & Hash

A security mechanism used to authorize communication between the client and the internal search/AI endpoints.

- **Search Token**: A string generated at build time (`VITE_SEARCH_TOKEN`). Used to verify that requests to the server originate from a trusted build.
- **Search Token Hash**: To avoid exposing the raw token in all requests, the client generates a hash of the token. Managed via the `lastSearchTokenHashPubSub` channel.
- **Verification**: The server verifies these tokens to prevent unauthorized access to the search API. Stored in `server/verifiedTokens.ts` as an in-memory `Set`.

### Inference Types

MiniSearch supports multiple backends for Large Language Model (LLM) inference, configured via `inferenceType` in the application settings.

| Type | Description | Implementation |
|------|-------------|----------------|
| `browser` | Local inference using WASM (Wllama) | Client-side, privacy-preserving |
| `openai` | Connection to any OpenAI-compatible external API | Requires API key |
| `horde` | Crowdsourced inference via the AI Horde network | Distributed, anonymous or authenticated |
| `internal` | Server-side proxy using pre-configured credentials | API key hidden from client |

### PubSub (State Management)

Instead of a heavy state management library like Redux, MiniSearch uses a minimalist Publish-Subscribe pattern powered by the `create-pubsub` library.

- **Data Flow**: Components subscribe to "channels" (e.g., `queryPubSub`, `responsePubSub`)
- **Tuple Pattern**: Each channel is a 3-element tuple: `[update, subscribe, get]`
- **Persistence**: Some channels use `createLocalStoragePubSub` to automatically sync state with `localStorage`
- **Throttling**: UI-heavy updates like AI response streaming are throttled to ~12 updates/sec using `throttleit`

### Reranker

A secondary search stage that takes initial results from SearXNG and re-orders them based on relevance to the query using a cross-encoder model (`jina-reranker-v1-tiny-en`) running on a local `llama-server` instance.

- **Implementation**: Spawns `llama-server` child process with `--reranking` and `--pooling rank` flags
- **Health Check**: Polls `/health` endpoint via `getRerankerStatus`
- **Scoring**: Results filtered using standard deviation thresholds (`kStandardDeviationFactor = 0.3`)
- **Fallback**: If reranker is unhealthy, returns unranked SearXNG results

### Wllama

A WebAssembly (WASM) based integration of `llama.cpp` for running LLMs on the CPU in the browser.

- **Initialization**: Loads models from HuggingFace using `initializeWllama`
- **Warmup**: Includes a warmup phase with a single token completion using `n_threads: 1`
- **OPFS**: Uses the Origin Private File System via Wllama's cache manager to store model shards locally
- **Models**: GGUF format, Q4_K_S or UD-Q4_K_XL quantized, stored at `Felladrin/gguf-sharded-*` on HuggingFace

### AI Horde

A crowdsourced distributed cluster of workers providing AI inference. MiniSearch integrates with it using a polling strategy against the `/generate/text/status` endpoint.

- **Kudos**: Virtual currency used by the Horde.
  Default anonymous key is `0000000000`
- **Polling**: Requests sent to async API, status checked periodically until completion
- **Cancellation**: Can abort generation via `DELETE` on the status endpoint

### Conversation Memory & Rolling Summary

A mechanism to handle long chats that exceed the LLM context window.

- **Summarization**: When older messages are dropped, `createLlmSummary` asks the LLM to condense them under a limit of 800 tokens
- **Extractive Fallback**: If LLM summarization fails, `summarizeDroppedMessages` uses a token-counting extractive approach
- **Token Budget**: Computed based on `openAiContextLength` setting and current message count

## Technical Jargon & Abbreviations

### SearXNG

A privacy-respecting metasearch engine that aggregates results from multiple search engines without tracking. Runs locally on port 8888 within the Docker container.

### GGUF

GGML Universal File format. Binary format for storing LLM weights, optimized for fast loading and inference. Used by Wllama and llama-server.

### Dexie

A minimalist wrapper for IndexedDB used for client-side persistence. MiniSearch uses two Dexie databases:

- **SearchCacheDatabase**: Temporary cache with TTL-based expiration
- **HistoryDatabase**: Long-term search history with retention policies

### Vite Server Hooks

Middleware registered via Vite plugin hooks (`configureServer`, `configurePreviewServer`).
All server-side logic in MiniSearch is implemented as hooks:

| Hook | Purpose |
|------|---------|
| `compressionServerHook` | gzip/brotli compression |
| `crossOriginServerHook` | COOP/COEP headers for SharedArrayBuffer |
| `searchEndpointServerHook` | `/search/text` and `/search/images` endpoints |
| `statusEndpointServerHook` | `/status` health check |
| `cacheServerHook` | Cache-Control headers |
| `validateAccessKeyServerHook` | Access key validation |
| `internalApiEndpointServerHook` | `/inference` proxy |
| `rerankerServiceHook` | llama-server lifecycle management |

### Circuit Breaker

A resilience pattern used in `webSearchService.ts` to handle SearXNG service degradation. Opens after 5 consecutive failures, blocking requests for 60 seconds before attempting reset.

### LRU Pruning

Least Recently Used cache eviction strategy. The search cache prunes oldest entries every 10 writes when `MAX_ENTRIES` (100) is reached.

### Argon2id

A password hashing algorithm used for access key validation. Client hashes the access key before transmission; server verifies against configured keys.
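
To make the Circuit Breaker entry above more concrete, here is a minimal sketch of the pattern. It is not the code in `webSearchService.ts`: the class name `CircuitBreaker`, its method names, and the injectable clock are hypothetical and exist only for illustration. Only the thresholds (5 consecutive failures, 60 seconds) come from this glossary.

```typescript
// Hypothetical sketch of a circuit breaker; names are illustrative and
// are NOT taken from webSearchService.ts.
type CircuitState = "closed" | "open" | "half-open";

class CircuitBreaker {
  private state: CircuitState = "closed";
  private consecutiveFailures = 0;
  private openedAt = 0;
  private failureThreshold: number;
  private resetTimeoutMs: number;
  private now: () => number;

  constructor(
    failureThreshold: number = 5, // from the glossary: 5 consecutive failures
    resetTimeoutMs: number = 60_000, // from the glossary: 60 seconds
    now: () => number = () => Date.now(), // injectable clock (assumption, eases testing)
  ) {
    this.failureThreshold = failureThreshold;
    this.resetTimeoutMs = resetTimeoutMs;
    this.now = now;
  }

  /** Returns true if a request may be attempted right now. */
  canRequest(): boolean {
    const timedOut = this.now() - this.openedAt >= this.resetTimeoutMs;
    if (this.state === "open" && timedOut) {
      // Timeout elapsed: let trial requests through until a result is recorded.
      this.state = "half-open";
    }
    return this.state !== "open";
  }

  /** A successful call closes the circuit and clears the failure count. */
  recordSuccess(): void {
    this.consecutiveFailures = 0;
    this.state = "closed";
  }

  /** A failed trial request, or reaching the threshold, opens the circuit. */
  recordFailure(): void {
    this.consecutiveFailures += 1;
    const thresholdReached = this.consecutiveFailures >= this.failureThreshold;
    if (this.state === "half-open" || thresholdReached) {
      this.state = "open";
      this.openedAt = this.now();
    }
  }

  getState(): CircuitState {
    return this.state;
  }
}
```

A caller would check `canRequest()` before contacting SearXNG, then call `recordSuccess()` or `recordFailure()` depending on the outcome. The half-open state is a common refinement of the pattern and is assumed here; the glossary only states that a reset is attempted after the blocking period.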
## Data Structures

### SearchCacheDatabase Schema

| Store | Primary Key | Indexed Field | Entry Type |
|-------|-------------|---------------|------------|
| `textSearchHistory` | key (hash) | timestamp | TextSearchCache |
| `imageSearchHistory` | key (hash) | timestamp | ImageSearchCache |

### HistoryDatabase Schema

| Table | Purpose |
|-------|---------|
| `searches` | Canonical log of each query with hydrated results payloads |
| `llmResponses` | AI answers tied to their originating search run |
| `chatHistory` | Chronological chat turns scoped by `conversationId` |

### PubSub Channel Types

| Channel | Data Type | Persistence |
|---------|-----------|-------------|
| `queryPubSub` | `string` | Memory |
| `responsePubSub` | `string` | Memory (throttled) |
| `settingsPubSub` | `Settings` | localStorage |
| `textSearchResultsPubSub` | `TextSearchResults` | Memory |
| `textGenerationStatePubSub` | `TextGenerationState` | Memory |
| `chatMessagesPubSub` | `ChatMessage[]` | Memory |
| `conversationSummaryPubSub` | `{id, summary}` | Memory |

## Related Topics

- **Overview**: `docs/overview.md` - System architecture
- **Configuration**: `docs/configuration.md` - Environment variables and settings
- **UI Components**: `docs/ui-components.md` - Component architecture
- **Reranking**: `docs/reranking.md` - Reranker subsystem