# Glossary
Codebase-specific terms, jargon, and domain concepts used in MiniSearch.
## Core System Concepts
### Search Token & Hash
A security mechanism used to authorize communication between the client and the internal search/AI endpoints.
- **Search Token**: A string generated at build time (`VITE_SEARCH_TOKEN`). Used to verify that requests to the server originate from a trusted build.
- **Search Token Hash**: To avoid exposing the raw token in every request, the client generates a hash of the token. The hash is managed via the `lastSearchTokenHashPubSub` channel.
- **Verification**: The server verifies these tokens to prevent unauthorized access to the search API. Verified tokens are stored in `server/verifiedTokens.ts` as an in-memory `Set<string>`.
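
The verification step can be pictured with the following minimal sketch. It assumes the server compares an incoming hash to an expected one and remembers accepted values in a `Set<string>`; the function and variable names are hypothetical and do not come from `server/verifiedTokens.ts`.

```typescript
// Hypothetical in-memory store of values that already passed verification.
const verifiedTokenHashes = new Set<string>();

// Assumption: the caller supplies the hash the server expects for this build.
function verifyTokenHash(candidateHash: string, expectedHash: string): boolean {
  // Fast path: this value was accepted earlier in the process lifetime.
  if (verifiedTokenHashes.has(candidateHash)) return true;
  if (candidateHash !== expectedHash) return false;
  verifiedTokenHashes.add(candidateHash);
  return true;
}
```

Because the set lives in memory, it is emptied whenever the server process restarts.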
### Inference Types
MiniSearch supports multiple backends for Large Language Model (LLM) inference, configured via `inferenceType` in the application settings.
| Type | Description | Implementation |
|------|-------------|----------------|
| `browser` | Local inference using WASM (Wllama) | Client-side, privacy-preserving |
| `openai` | Connection to any OpenAI-compatible external API | Requires API key |
| `horde` | Crowdsourced inference via the AI Horde network | Distributed, anonymous or authenticated |
| `internal` | Server-side proxy using pre-configured credentials | API key hidden from client |
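
As an illustration, the four values in the table can be modeled as a string-literal union with a runtime guard. This is a sketch only: the identifiers `INFERENCE_TYPES`, `inferenceTypeSummaries`, and `isInferenceType` are hypothetical and may not match the names used in the codebase.

```typescript
// The four values listed in the table above.
const INFERENCE_TYPES = ["browser", "openai", "horde", "internal"] as const;
type InferenceType = (typeof INFERENCE_TYPES)[number];

// Short summaries taken from the table; the Record type forces all four keys.
const inferenceTypeSummaries: Record<InferenceType, string> = {
  browser: "Local inference using WASM (Wllama)",
  openai: "Any OpenAI-compatible external API",
  horde: "Crowdsourced inference via the AI Horde network",
  internal: "Server-side proxy using pre-configured credentials",
};

// Runtime guard, e.g. for validating a value read from persisted settings.
function isInferenceType(value: string): value is InferenceType {
  return (INFERENCE_TYPES as readonly string[]).indexOf(value) !== -1;
}
```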
### PubSub (State Management)
Instead of a heavy state management library like Redux, MiniSearch uses a minimalist Publish-Subscribe pattern powered by the `create-pubsub` library.
- **Data Flow**: Components subscribe to "channels" (e.g., `queryPubSub`, `responsePubSub`)
- **Tuple Pattern**: Each channel is a 3-element tuple: `[update, subscribe, get]`
- **Persistence**: Some channels use `createLocalStoragePubSub` to automatically sync state with `localStorage`
- **Throttling**: UI-heavy updates like AI response streaming are throttled to ~12 updates/sec using `throttleit`
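
The tuple pattern can be illustrated with a small stand-in implementation. The sketch below does not use `create-pubsub` itself; `createMiniPubSub` is a hypothetical helper that only mirrors the `[update, subscribe, get]` shape described above.

```typescript
type Subscriber<T> = (value: T) => void;
type PubSubTuple<T> = [
  update: (value: T) => void,
  subscribe: (subscriber: Subscriber<T>) => () => void,
  get: () => T,
];

// Hypothetical stand-in that returns the same 3-element shape.
function createMiniPubSub<T>(initialValue: T): PubSubTuple<T> {
  let current = initialValue;
  const subscribers = new Set<Subscriber<T>>();

  const update = (value: T): void => {
    current = value;
    subscribers.forEach((subscriber) => subscriber(value));
  };
  const subscribe = (subscriber: Subscriber<T>): (() => void) => {
    subscribers.add(subscriber);
    // Returning an unsubscribe function is a choice made for this sketch.
    return () => {
      subscribers.delete(subscriber);
    };
  };
  const get = (): T => current;

  return [update, subscribe, get];
}

// Destructure the tuple into named functions, one set per channel.
const [updateDemoQuery, onDemoQueryChange, getDemoQuery] = createMiniPubSub("");
```

Components that only need the latest value call the getter; components that must react to changes subscribe.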
### Reranker
A secondary search stage that takes initial results from SearXNG and re-orders them based on relevance to the query using a cross-encoder model (`jina-reranker-v1-tiny-en`) running on a local `llama-server` instance.
- **Implementation**: Spawns `llama-server` child process with `--reranking` and `--pooling rank` flags
- **Health Check**: Polls `/health` endpoint via `getRerankerStatus`
- **Scoring**: Results filtered using standard deviation thresholds (`kStandardDeviationFactor = 0.3`)
- **Fallback**: If reranker is unhealthy, returns unranked SearXNG results
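
The glossary does not spell out the exact filtering rule, so the sketch below assumes one plausible reading: keep results whose score is at least the mean minus `kStandardDeviationFactor` standard deviations. The `ScoredResult` type and the function name are hypothetical.

```typescript
const kStandardDeviationFactor = 0.3;

interface ScoredResult {
  title: string;
  score: number;
}

// Assumed rule: threshold = mean - factor * standardDeviation.
function filterByStandardDeviation(
  results: ScoredResult[],
  factor: number = kStandardDeviationFactor,
): ScoredResult[] {
  if (results.length === 0) return [];
  const mean = results.reduce((sum, r) => sum + r.score, 0) / results.length;
  const variance =
    results.reduce((sum, r) => sum + (r.score - mean) * (r.score - mean), 0) /
    results.length;
  const threshold = mean - factor * Math.sqrt(variance);
  // Keep the results above the threshold, highest score first.
  return results
    .filter((r) => r.score >= threshold)
    .sort((a, b) => b.score - a.score);
}
```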
### Wllama
A WebAssembly (WASM) based integration of `llama.cpp` for running LLMs on the CPU in the browser.
- **Initialization**: Loads models from HuggingFace using `initializeWllama`
- **Warmup**: Includes a warmup phase with a single token completion using `n_threads: 1`
- **OPFS**: Uses the Origin Private File System via Wllama's cache manager to store model shards locally
- **Models**: GGUF format, Q4_K_S or UD-Q4_K_XL quantized, stored at `Felladrin/gguf-sharded-*` on HuggingFace
### AI Horde
A crowdsourced distributed cluster of workers providing AI inference. MiniSearch integrates with it using a polling strategy against the `/generate/text/status` endpoint.
- **Kudos**: Virtual currency used by the Horde. Default anonymous key is `0000000000`
- **Polling**: Requests sent to async API, status checked periodically until completion
- **Cancellation**: Can abort generation via `DELETE` on the status endpoint
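
The polling strategy can be sketched as a loop around an injected status check. Everything here is illustrative: `HordeStatusSnapshot`, `pollUntilDone`, and the interval and attempt parameters are hypothetical, and a real caller would pass a function that fetches the status endpoint.

```typescript
// Hypothetical, simplified view of a status response.
interface HordeStatusSnapshot {
  done: boolean;
  text?: string;
}

// Calls `checkStatus` repeatedly until it reports completion.
async function pollUntilDone(
  checkStatus: () => Promise<HordeStatusSnapshot>,
  intervalMs: number,
  maxAttempts: number,
): Promise<string> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const snapshot = await checkStatus();
    if (snapshot.done) return snapshot.text ?? "";
    // Wait before asking again so the API is not hammered.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error("Generation did not finish within the allowed attempts");
}
```

Injecting the status check keeps the loop independent of the network layer and easy to exercise with a fake.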
### Conversation Memory & Rolling Summary
A mechanism to handle long chats that exceed the LLM context window.
- **Summarization**: When older messages are dropped, `createLlmSummary` asks the LLM to condense them into a summary of at most 800 tokens
- **Extractive Fallback**: If LLM summarization fails, `summarizeDroppedMessages` uses a token-counting extractive approach
- **Token Budget**: Computed based on `openAiContextLength` setting and current message count
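
The idea of dropping older messages to fit a budget can be sketched as follows. This is not the project's algorithm: the 4-characters-per-token estimate, the `ChatTurn` type, and `splitTurnsByBudget` are assumptions made for illustration.

```typescript
interface ChatTurn {
  role: "user" | "assistant";
  content: string;
}

// Rough assumption for illustration: about 4 characters per token.
function estimateTokenCount(text: string): number {
  return Math.ceil(text.length / 4);
}

function splitTurnsByBudget(turns: ChatTurn[], tokenBudget: number) {
  let usedTokens = 0;
  let firstKeptIndex = turns.length;
  // Walk from newest to oldest so the most recent turns survive.
  for (let i = turns.length - 1; i >= 0; i--) {
    const cost = estimateTokenCount(turns[i].content);
    if (usedTokens + cost > tokenBudget) break;
    usedTokens += cost;
    firstKeptIndex = i;
  }
  return {
    kept: turns.slice(firstKeptIndex),
    dropped: turns.slice(0, firstKeptIndex), // candidates for summarization
  };
}
```

The `dropped` list is what a summarizer would condense, while `kept` is sent to the model verbatim.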
## Technical Jargon & Abbreviations
### SearXNG
A privacy-respecting metasearch engine that aggregates results from multiple search engines without tracking. Runs locally on port 8888 within the Docker container.
### GGUF
GGML Universal File format. Binary format for storing LLM weights, optimized for fast loading and inference. Used by Wllama and llama-server.
### Dexie
A minimalist wrapper for IndexedDB used for client-side persistence. MiniSearch uses two Dexie databases:
- **SearchCacheDatabase**: Temporary cache with TTL-based expiration
- **HistoryDatabase**: Long-term search history with retention policies
### Vite Server Hooks
Middleware registered via Vite plugin hooks (`configureServer`, `configurePreviewServer`). All server-side logic in MiniSearch is implemented as hooks:
| Hook | Purpose |
|------|---------|
| `compressionServerHook` | gzip/brotli compression |
| `crossOriginServerHook` | COOP/COEP headers for SharedArrayBuffer |
| `searchEndpointServerHook` | `/search/text` and `/search/images` endpoints |
| `statusEndpointServerHook` | `/status` health check |
| `cacheServerHook` | Cache-Control headers |
| `validateAccessKeyServerHook` | Access key validation |
| `internalApiEndpointServerHook` | `/inference` proxy |
| `rerankerServiceHook` | llama-server lifecycle management |
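
A hook of this kind can be sketched without importing Vite by describing only the parts of the server object that are used. The types and names below are hypothetical, and the header values are a common way to enable cross-origin isolation; the project may use different values.

```typescript
// Minimal structural types so the sketch runs without importing Vite.
interface HeaderWritableResponse {
  setHeader(headerName: string, headerValue: string): void;
}

// Hypothetical middleware: sets headers commonly used for cross-origin isolation.
function crossOriginIsolationMiddleware(
  _request: unknown,
  response: HeaderWritableResponse,
  next: () => void,
): void {
  response.setHeader("Cross-Origin-Opener-Policy", "same-origin");
  response.setHeader("Cross-Origin-Embedder-Policy", "require-corp");
  next();
}

// Hypothetical slice of the dev server object passed to `configureServer`.
interface MiddlewareHost {
  middlewares: { use(handler: typeof crossOriginIsolationMiddleware): void };
}

// Plugin-shaped object that registers the middleware in `configureServer`.
const crossOriginPluginSketch = {
  name: "cross-origin-sketch",
  configureServer(server: MiddlewareHost): void {
    server.middlewares.use(crossOriginIsolationMiddleware);
  },
};
```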
### Circuit Breaker
A resilience pattern used in `webSearchService.ts` to handle SearXNG service degradation. Opens after 5 consecutive failures, blocking requests for 60 seconds before attempting reset.
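
The pattern can be sketched as a small class using the thresholds quoted above (5 consecutive failures, 60 seconds). The class and method names are hypothetical and are not taken from `webSearchService.ts`; time is passed in explicitly to keep the sketch deterministic.

```typescript
class SimpleCircuitBreaker {
  private consecutiveFailures = 0;
  private openedAt: number | null = null;
  private readonly failureThreshold: number;
  private readonly resetTimeoutMs: number;

  constructor(failureThreshold: number = 5, resetTimeoutMs: number = 60 * 1000) {
    this.failureThreshold = failureThreshold;
    this.resetTimeoutMs = resetTimeoutMs;
  }

  // Closed circuit: always allow. Open circuit: allow only after the cool-down.
  canRequest(now: number): boolean {
    if (this.openedAt === null) return true;
    return now - this.openedAt >= this.resetTimeoutMs;
  }

  recordSuccess(): void {
    this.consecutiveFailures = 0;
    this.openedAt = null;
  }

  recordFailure(now: number): void {
    this.consecutiveFailures += 1;
    if (this.consecutiveFailures >= this.failureThreshold) {
      // Open (or re-open) the circuit and restart the cool-down.
      this.openedAt = now;
    }
  }
}
```

A caller would check `canRequest(Date.now())` before contacting the search backend and report the outcome with `recordSuccess` or `recordFailure`.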
### LRU Pruning
Least Recently Used cache eviction strategy. Every 10 writes, the search cache prunes its oldest entries if `MAX_ENTRIES` (100) has been reached.
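
The pruning cadence can be sketched with an in-memory `Map` standing in for the real cache storage. Apart from `MAX_ENTRIES`, every name here is hypothetical, and ordering entries by a stored timestamp is an assumption about how "oldest" is determined.

```typescript
const MAX_ENTRIES = 100;
const PRUNE_EVERY_N_WRITES = 10; // hypothetical name for the "every 10 writes" rule

interface CacheEntry {
  value: string;
  timestamp: number;
}

class PrunedCache {
  private entries = new Map<string, CacheEntry>();
  private writesSincePrune = 0;

  set(key: string, value: string, timestamp: number): void {
    this.entries.set(key, { value, timestamp });
    this.writesSincePrune += 1;
    // Only check the size on every tenth write to keep writes cheap.
    if (this.writesSincePrune >= PRUNE_EVERY_N_WRITES) {
      this.writesSincePrune = 0;
      this.prune();
    }
  }

  has(key: string): boolean {
    return this.entries.has(key);
  }

  get size(): number {
    return this.entries.size;
  }

  // Remove the oldest entries until the cache is back at MAX_ENTRIES.
  private prune(): void {
    const excess = this.entries.size - MAX_ENTRIES;
    if (excess <= 0) return;
    const oldestFirst = Array.from(this.entries.entries()).sort(
      (a, b) => a[1].timestamp - b[1].timestamp,
    );
    oldestFirst.slice(0, excess).forEach(([key]) => this.entries.delete(key));
  }
}
```

Between prune passes the cache can temporarily exceed `MAX_ENTRIES` by up to nine entries, which is the trade-off for not checking on every write.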
### Argon2id
A password hashing algorithm used for access key validation. The client hashes the access key before transmission; the server verifies the hash against the configured keys.
## Data Structures
### SearchCacheDatabase Schema
| Store | Primary Key | Indexed Field | Entry Type |
|-------|-------------|---------------|------------|
| `textSearchHistory` | key (hash) | timestamp | TextSearchCache |
| `imageSearchHistory` | key (hash) | timestamp | ImageSearchCache |
### HistoryDatabase Schema
| Table | Purpose |
|-------|---------|
| `searches` | Canonical log of each query with hydrated results payloads |
| `llmResponses` | AI answers tied to their originating search run |
| `chatHistory` | Chronological chat turns scoped by `conversationId` |
### PubSub Channel Types
| Channel | Data Type | Persistence |
|---------|-----------|-------------|
| `queryPubSub` | `string` | Memory |
| `responsePubSub` | `string` | Memory (throttled) |
| `settingsPubSub` | `Settings` | localStorage |
| `textSearchResultsPubSub` | `TextSearchResults` | Memory |
| `textGenerationStatePubSub` | `TextGenerationState` | Memory |
| `chatMessagesPubSub` | `ChatMessage[]` | Memory |
| `conversationSummaryPubSub` | `{id, summary}` | Memory |
## Related Topics
- **Overview**: `docs/overview.md` - System architecture
- **Configuration**: `docs/configuration.md` - Environment variables and settings
- **UI Components**: `docs/ui-components.md` - Component architecture
- **Reranking**: `docs/reranking.md` - Reranker subsystem