# Search-UI Application Architecture
> Comprehensive documentation of the Search-UI full-stack application architecture, designed to serve as a reference for building similar applications.
## Table of Contents
1. [Overview](#overview)
2. [Tech Stack](#tech-stack)
3. [Project Structure](#project-structure)
4. [Backend Architecture](#backend-architecture)
5. [Frontend Architecture](#frontend-architecture)
6. [Database Design](#database-design)
7. [Data Flow Patterns](#data-flow-patterns)
8. [API Design Patterns](#api-design-patterns)
9. [Development Workflow](#development-workflow)
10. [Key Design Decisions](#key-design-decisions)
11. [Extensibility Guide](#extensibility-guide)
12. [Quick Reference](#quick-reference)
13. [Conclusion](#conclusion)
---
## Overview
Search-UI is a **full-stack application** for intelligent content discovery combining:
- **Keyword search** (SQLite FTS5)
- **Semantic search** (vector embeddings)
- **Visual search** (image classification)
- **Face recognition** (person identification)
- **Smart routing** (LLM-powered query intent detection)
The architecture emphasizes:
- **Offline-first**: No cloud dependencies, works completely locally
- **Hardware-aware**: Gracefully scales from low-end to high-end systems
- **Modular design**: Single-responsibility modules for maintainability
- **Extensibility**: Easy to add new search methods and features
---
## Tech Stack
### Backend
| Component | Technology | Purpose |
|-----------|------------|---------|
| Framework | FastAPI | REST API server |
| Runtime | Python 3.11+ | Backend language |
| Database | SQLite + FTS5 | Full-text search |
| Vector DB | sqlite-vec | Semantic search |
| Embeddings | sentence-transformers | Text embeddings |
| Image ML | SigLIP (transformers) | Image classification |
| Face ML | DeepFace + ArcFace | Face recognition |
| LLM | llama.cpp / Ollama | Query routing |
### Frontend
| Component | Technology | Purpose |
|-----------|------------|---------|
| Framework | React 19 | UI components |
| Build Tool | Vite 7 | Fast bundling & HMR |
| Styling | Plain CSS | Dark theme, no framework |
| State | useState hooks | Local component state |
### Infrastructure
| Component | Technology | Purpose |
|-----------|------------|---------|
| Process Manager | tmux | Background dev servers |
| Video Processing | ffmpeg | Thumbnail generation |
| Package Manager | pip (backend), npm (frontend) | Dependencies |
---
## Project Structure
```
Search-UI/
├── README.md                   # User-facing documentation
├── CLAUDE.md                   # Development workflow guide
├── docs/                       # Documentation files
│   └── ARCHITECTURE.md         # This file
├── frontend/                   # React + Vite application
│   ├── src/
│   │   ├── main.jsx            # Entry point
│   │   ├── App.jsx             # Main app with routing
│   │   ├── App.css             # Application styles
│   │   ├── SearchPage.jsx      # Primary search interface
│   │   ├── DownloadPage.jsx    # Data management interface
│   │   ├── PersonsPage.jsx     # Face recognition management
│   │   └── SettingsPage.jsx    # Configuration interface
│   ├── package.json            # Node dependencies
│   ├── vite.config.js          # Vite configuration
│   ├── index.html              # HTML template
│   └── dist/                   # Production build (generated)
├── backend/                    # FastAPI application
│   ├── main.py                 # FastAPI server & endpoints
│   ├── utils.py                # Logging & HTTP utilities
│   ├── search.py               # Hybrid search system
│   ├── smart_search.py         # Intelligent search orchestrator
│   ├── llm_router.py           # LLM-powered routing
│   ├── rule_based_router.py    # Fallback pattern matching
│   ├── hardware_detection.py   # System capability detection
│   ├── llm_client.py           # LLM backend abstraction
│   ├── search_images.py        # Image-based search
│   ├── face_search.py          # Face recognition
│   ├── settings.py             # Configuration management
│   ├── requirements.txt        # Python dependencies
│   ├── venv/                   # Python virtual environment
│   ├── database.db             # Main SQLite database
│   └── settings.db             # Settings database
├── scratchpad/                 # Temporary scripts
├── start-app.sh                # Launch both servers
└── stop-app.sh                 # Stop servers
```
---
## Backend Architecture
### Module Responsibilities
| Module | Lines | Purpose |
|--------|-------|---------|
| `main.py` | ~4500 | FastAPI app, all endpoints, request handling |
| `search.py` | ~1650 | FTS5 + vector hybrid search implementation |
| `smart_search.py` | ~970 | Intelligent search orchestrator |
| `search_images.py` | ~800 | Image classification and visual search |
| `face_search.py` | ~1200 | Face recognition system |
| `llm_router.py` | ~400 | LLM-powered query routing |
| `rule_based_router.py` | ~300 | Pattern-based fallback router |
| `hardware_detection.py` | ~200 | System capability detection |
| `llm_client.py` | ~250 | Ollama/llama.cpp abstraction |
| `settings.py` | ~270 | Settings database management |
| `utils.py` | ~100 | Shared utilities |
### Endpoint Organization
Endpoints in `main.py` are grouped by domain:
```python
# Core endpoints
GET / # Root
GET /api/hello # Health check
GET /docs # Swagger UI (auto-generated)
# Search endpoints
GET /api/search # Hybrid text search
GET /api/search-natural # Smart search (auto-routed)
GET /api/search-title # Title-only search
GET /api/search-scripture # Scripture reference search
POST /api/search-by-image # Visual similarity search
GET /api/search-face # Face recognition search
# Data management
POST /api/download-vod # Download metadata
POST /api/download-subtitles # Download content
POST /api/process-subtitles # Index content
GET /api/download-status # Check progress
GET /api/sync-all # Full synchronization
# Content processing
POST /api/download-video # Download video file
POST /api/process-video # Generate thumbnails
# Settings
GET /api/settings/series # Get settings
POST /api/settings/series # Update settings
# System
GET /api/system-capabilities # Hardware info
POST /api/set-ai-mode # Toggle AI features
```
### Dependency Injection Pattern
```python
# main.py
from search import SubtitleSearch
from smart_search import SmartSearch
from search_images import ImageSearch
from face_search import FaceSearch

# Lazy initialization for memory efficiency
_subtitle_search = None
_smart_search = None


def get_subtitle_search():
    global _subtitle_search
    if _subtitle_search is None:
        _subtitle_search = SubtitleSearch()
    return _subtitle_search


@app.get("/api/search")
async def search(q: str, method: str = "hybrid"):
    search_engine = get_subtitle_search()
    return search_engine.search(q, method)
```
### Error Handling Pattern
```python
from fastapi import HTTPException
from utils import log_message


@app.get("/api/endpoint")
async def endpoint(param: str):
    try:
        result = perform_operation(param)
        return {"status": "success", "data": result}
    except ValueError as e:
        log_message(f"Validation error: {e}")
        raise HTTPException(status_code=400, detail=str(e))
    except Exception as e:
        log_message(f"Unexpected error: {e}")
        raise HTTPException(status_code=500, detail="Internal server error")
```
---
## Frontend Architecture
### Component Hierarchy
```
App.jsx (Router + Layout)
├── SearchPage.jsx      # Main search interface
│   ├── Search input
│   ├── Method selector
│   ├── Filter controls
│   ├── Results grid
│   ├── Video player
│   └── Subtitle viewer
├── DownloadPage.jsx    # Data management
│   ├── Progress indicators
│   └── Action buttons
├── PersonsPage.jsx     # Face management
│   ├── Person list
│   └── Face tagging UI
└── SettingsPage.jsx    # Configuration
    ├── AI mode selector
    └── Series settings
```
### State Management Pattern
```jsx
// SearchPage.jsx - Local state with useState
function SearchPage({ onNavigate }) {
  // Search state
  const [query, setQuery] = useState('');
  const [results, setResults] = useState([]);
  const [isLoading, setIsLoading] = useState(false);
  const [error, setError] = useState(null);

  // Selection state
  const [selectedResult, setSelectedResult] = useState(null);
  const [subtitleEntries, setSubtitleEntries] = useState([]);

  // Filter state
  const [filters, setFilters] = useState({
    dateFrom: null,
    dateTo: null,
    durationMin: null,
    durationMax: null
  });

  // Async search; `finally` clears the loading flag on success and on failure
  const handleSearch = async () => {
    setIsLoading(true);
    setError(null);
    try {
      // Encode the query so characters such as & or # cannot corrupt the URL
      const response = await fetch(
        `${API_BASE_URL}/api/search?q=${encodeURIComponent(query)}`
      );
      if (!response.ok) {
        throw new Error(`HTTP ${response.status}: ${response.statusText}`);
      }
      const data = await response.json();
      setResults(data.results);
    } catch (err) {
      setError(err.message);
    } finally {
      setIsLoading(false);
    }
  };

  return (/* JSX */);
}
```
### API Integration Pattern
```jsx
// Dynamic base URL for dev/prod
const API_BASE_URL = window.location.port === '5173'
  ? `http://${window.location.hostname}:8000` // Dev: Vite on 5173, FastAPI on 8000
  : ''; // Prod: Same origin

// Fetch wrapper with error handling
async function apiCall(endpoint, options = {}) {
  try {
    const response = await fetch(`${API_BASE_URL}${endpoint}`, {
      headers: { 'Content-Type': 'application/json' },
      ...options
    });
    if (!response.ok) {
      throw new Error(`HTTP ${response.status}: ${response.statusText}`);
    }
    return await response.json();
  } catch (error) {
    console.error(`API call failed: ${endpoint}`, error);
    throw error;
  }
}
```
### Cross-Page Navigation
```jsx
// App.jsx - Navigation with parameters
function App() {
  const [currentPage, setCurrentPage] = useState('search');
  const [navigationParams, setNavigationParams] = useState({});

  const handleNavigate = (page, params = {}) => {
    setNavigationParams(params);
    setCurrentPage(page);
  };

  return (
    <div>
      <nav>{/* Tab buttons */}</nav>
      {currentPage === 'search' && (
        <SearchPage
          onNavigate={handleNavigate}
          initialParams={navigationParams}
        />
      )}
      {/* Other pages */}
    </div>
  );
}
```
---
## Database Design
### Schema Overview
```sql
-- database.db

-- Full-text search (FTS5)
CREATE VIRTUAL TABLE subtitles_fts USING fts5(
    natural_key,    -- Unique content identifier
    language,       -- Language code (E, S, etc.)
    content         -- Full text content
);

-- Vector embeddings (sqlite-vec)
CREATE VIRTUAL TABLE subtitle_embeddings USING vec0(
    natural_key TEXT,
    language TEXT,
    embedding bit[1024]    -- Binary quantized (32x smaller than float32)
);

-- Structured data
CREATE TABLE scripture_references (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    natural_key TEXT NOT NULL,
    language TEXT NOT NULL,
    book TEXT NOT NULL,
    chapter INTEGER NOT NULL,
    verse_start INTEGER NOT NULL,
    verse_end INTEGER,
    timestamp REAL,
    original_text TEXT,
    normalized TEXT,
    UNIQUE(natural_key, language, book, chapter, verse_start, verse_end, timestamp)
);

-- Face recognition
CREATE TABLE persons (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT NOT NULL UNIQUE,
    description TEXT,
    created_at TEXT NOT NULL
);

CREATE TABLE face_embeddings (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    natural_key TEXT NOT NULL,
    frame_number INTEGER NOT NULL,
    person_id INTEGER,    -- NULL for unknown faces
    embedding bit[512],
    confidence REAL,
    FOREIGN KEY (person_id) REFERENCES persons(id)
);

CREATE TABLE global_settings (
    key TEXT PRIMARY KEY,
    value TEXT NOT NULL,
    updated_at TEXT NOT NULL
);
```
### Why SQLite?
1. **Offline-first**: No external database server required
2. **Single-file**: Easy backup and distribution
3. **FTS5 built-in**: Native full-text search support
4. **sqlite-vec extension**: Enables vector similarity search
5. **Fast local access**: No network latency
6. **Zero configuration**: Works out of the box
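To make the FTS5 point concrete, here is a minimal sketch using Python's built-in `sqlite3` module against an in-memory table shaped like `subtitles_fts` from the schema above. The sample rows and the `keyword_search` helper are illustrative assumptions, not code from `search.py`, and the sketch presumes the local SQLite build was compiled with FTS5.

```python
import sqlite3


def keyword_search(conn, query, language="E", limit=10):
    """Return (natural_key, snippet) pairs, best match first."""
    sql = """
        SELECT natural_key,
               snippet(subtitles_fts, 2, '<em>', '</em>', '...', 8)
        FROM subtitles_fts
        WHERE subtitles_fts MATCH ? AND language = ?
        ORDER BY rank
        LIMIT ?
    """
    return conn.execute(sql, (query, language, limit)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE VIRTUAL TABLE subtitles_fts USING fts5(natural_key, language, content)"
)
conn.executemany(
    "INSERT INTO subtitles_fts VALUES (?, ?, ?)",
    [
        ("demo-1", "E", "Faith grows when we study and apply what we learn."),
        ("demo-2", "E", "This talk is about kindness toward neighbors."),
        ("demo-3", "S", "La fe crece con el estudio."),
    ],
)

hits = keyword_search(conn, "faith")
print(hits)
```

`ORDER BY rank` uses FTS5's built-in BM25 ordering, and `snippet()` produces highlighted excerpts like the `snippet` field in the search response. Raw user input can contain FTS5 query syntax, so a real endpoint would likely need to quote or sanitize it before passing it to `MATCH`.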
### Binary Quantized Embeddings
```python
# search.py - Why 1024-bit binary instead of 1024-dim float32?
# Float32: 1024 dimensions × 4 bytes = 4,096 bytes per embedding
# Binary: 1024 bits ÷ 8 = 128 bytes per embedding
# = 32x smaller storage, faster similarity computation
import numpy as np


def binarize_embedding(embedding):
    """Convert float embedding to packed bits (threshold at 0)."""
    # packbits stores 8 dimensions per byte; a plain uint8 array of 0/1
    # values would take 1,024 bytes instead of the 128 claimed above
    return np.packbits(embedding > 0)


# Similarity via Hamming distance (bit operations)
# Slight accuracy loss, massive efficiency gain
```
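The storage arithmetic and the Hamming-distance comparison can be checked with a short NumPy sketch. The helper names `binarize` and `hamming_distance` and the random test vectors are assumptions for illustration; the production path may instead rely on sqlite-vec's own distance functions.

```python
import numpy as np


def binarize(embedding: np.ndarray) -> np.ndarray:
    """Threshold at 0 and pack 8 bits per byte (1024 floats -> 128 bytes)."""
    return np.packbits(embedding > 0)


def hamming_distance(a: np.ndarray, b: np.ndarray) -> int:
    """Count the bits that differ between two packed binary embeddings."""
    return int(np.unpackbits(np.bitwise_xor(a, b)).sum())


rng = np.random.default_rng(0)
vec = rng.standard_normal(1024).astype(np.float32)
noisy = vec + 0.1 * rng.standard_normal(1024).astype(np.float32)  # near-duplicate
unrelated = rng.standard_normal(1024).astype(np.float32)

packed = binarize(vec)
print(vec.nbytes, packed.nbytes)  # 4096 128

near = hamming_distance(packed, binarize(noisy))
far = hamming_distance(packed, binarize(unrelated))
print(near, far)  # the near-duplicate differs in far fewer bits
```

A smaller Hamming distance means more sign bits agree, which is the proxy for similarity after quantization.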
---
## Data Flow Patterns
### Content Ingestion Pipeline
```
┌─────────────────────────────────────────────────────────────┐
│  1. DOWNLOAD METADATA                                       │
│     POST /api/download-vod                                  │
│         ↓                                                   │
│     External API → json/{language}/all_media_items.json     │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│  2. DOWNLOAD CONTENT                                        │
│     POST /api/download-subtitles                            │
│         ↓                                                   │
│     For each media item:                                    │
│       URL → subtitles/{language}/{natural_key}.vtt          │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│  3. PROCESS & INDEX                                         │
│     POST /api/process-subtitles                             │
│         ↓                                                   │
│     VTT → Parse → TXT                                       │
│         ↓                                                   │
│     Insert into subtitles_fts (FTS5)                        │
│         ↓                                                   │
│     Generate embeddings → Insert into subtitle_embeddings   │
└─────────────────────────────────────────────────────────────┘
```
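The "VTT → Parse → TXT" step in stage 3 can be sketched as follows. This is a simplified parser under stated assumptions (the `vtt_to_text` helper name is hypothetical, and only header, `NOTE`/`STYLE` blocks, cue identifiers, timing lines, and inline tags are handled); the project's actual parser is not shown in this document.

```python
import re

# A cue timing line, e.g. "00:00:01.000 --> 00:00:04.000 line:90%"
TIMING = re.compile(r"^(?:\d{2,}:)?\d{2}:\d{2}\.\d{3}\s+-->\s+")
# Inline markup such as <i>...</i> or <v Speaker>
TAG = re.compile(r"<[^>]+>")


def vtt_to_text(vtt: str) -> str:
    """Keep only the spoken text from a WebVTT document, one cue per line."""
    lines = []
    for block in re.split(r"\n\s*\n", vtt.replace("\r\n", "\n").strip()):
        rows = block.split("\n")
        # Skip the file header and NOTE/STYLE blocks, which carry no cue text.
        if rows[0].startswith(("WEBVTT", "NOTE", "STYLE")):
            continue
        # A cue is an optional identifier line, a timing line, then the text.
        for i, row in enumerate(rows):
            if TIMING.match(row):
                text = " ".join(TAG.sub("", r).strip() for r in rows[i + 1:])
                if text:
                    lines.append(text)
                break
    return "\n".join(lines)


sample = """WEBVTT

1
00:00:01.000 --> 00:00:04.000
Welcome to the <i>program</i>.

00:00:04.500 --> 00:00:07.000 line:90%
Today we talk
about faith.
"""

print(vtt_to_text(sample))
```

The resulting plain text is what would be inserted into `subtitles_fts` and passed to the embedding model in the later steps.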
### Search Query Execution
```
┌──────────────────────────────────────────────────────────────┐
│  USER QUERY                                                  │
│  "teachings about faith" / [image upload] / [face image]     │
└──────────────────────────────────────────────────────────────┘
┌──────────────────────────────────────────────────────────────┐
│  SMART SEARCH ORCHESTRATOR (smart_search.py)                 │
│                                                              │
│  1. Detect input type (text / image)                         │
│  2. Route via LLM (if available) or rules                    │
│  3. Generate SearchPlan:                                     │
│     - primary_method: "hybrid"                               │
│     - secondary_methods: ["title"]                           │
│     - intent: "topic_search"                                 │
│     - confidence: 0.92                                       │
└──────────────────────────────────────────────────────────────┘
┌──────────────────────────────────────────────────────────────┐
│  PARALLEL EXECUTION                                          │
│                                                              │
│  ┌─────────────────┐  ┌─────────────────┐                    │
│  │ Keyword Search  │  │ Semantic Search │                    │
│  │ (FTS5)          │  │ (Embeddings)    │                    │
│  └────────┬────────┘  └────────┬────────┘                    │
│           └──────────┬─────────┘                             │
│                      ↓                                       │
│            Merge & Rank Results                              │
│        (30% keyword + 70% semantic)                          │
└──────────────────────────────────────────────────────────────┘
┌──────────────────────────────────────────────────────────────┐
│  RESPONSE                                                    │
│  {                                                           │
│    "results": [...],                                         │
│    "search_plan": { method, intent, confidence },            │
│    "router_used": "llm",                                     │
│    "total_time_ms": 145                                      │
│  }                                                           │
└──────────────────────────────────────────────────────────────┘
```
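The "Merge & Rank Results" step with its 30/70 weighting could look roughly like the sketch below. The min-max normalization and the `merge_results` helper are assumptions made for illustration; this document does not specify how `search.py` scales keyword and semantic scores before blending them.

```python
def normalize(scores: dict[str, float]) -> dict[str, float]:
    """Min-max scale scores to [0, 1]; if all scores are equal, map them to 1.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {key: 1.0 for key in scores}
    return {key: (value - lo) / (hi - lo) for key, value in scores.items()}


def merge_results(keyword, semantic, keyword_weight=0.3, semantic_weight=0.7):
    """Blend two {natural_key: score} maps into one list, best score first."""
    kw, sem = normalize(keyword), normalize(semantic)
    combined = {
        key: keyword_weight * kw.get(key, 0.0) + semantic_weight * sem.get(key, 0.0)
        for key in kw.keys() | sem.keys()
    }
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


# Hypothetical scores where higher means a better match in both maps.
keyword_hits = {"a": 12.0, "b": 7.0}
semantic_hits = {"b": 0.91, "c": 0.85, "a": 0.40}

ranked = merge_results(keyword_hits, semantic_hits)
for key, score in ranked:
    print(key, round(score, 3))
```

An item found by only one method still appears in the merged list, it simply contributes zero for the method that missed it.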
---
## API Design Patterns
### Standard Response Format
```json
// Success
{
"status": "success",
"data": { /* operation result */ },
"message": "Optional message"
}
// Error
{
"detail": "Error description"
}
// HTTP status code indicates error type (400, 404, 500, etc.)
```
### Search Response Format
```json
{
"results": [
{
"natural_key": "pub-example_E_1_VIDEO",
"title": "Example Title",
"score": 0.87,
"snippet": "...matching text with <em>highlights</em>...",
"category": "Category Name",
"subcategory": "Subcategory Name",
"publication_date": "2025-01-15",
"thumbnail": "https://example.com/thumb.jpg",
"source_method": "hybrid"
}
],
"count": 25,
"search_plan": {
"primary_method": "hybrid",
"secondary_methods": ["title"],
"intent": "topic_search",
"confidence": 0.92
},
"router_used": "llm",
"total_time_ms": 145
}
```
### Progress Response Format
```json
{
"status": "in_progress",
"step": "processing",
"current": 45,
"total": 320,
"percent": 14,
"message": "Processing item 45/320",
"elapsed_seconds": 120,
"estimated_remaining_seconds": 780
}
```
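A payload of this shape could be assembled as in the sketch below. The `build_progress` helper and its linear time estimate are assumptions; the backend may compute `estimated_remaining_seconds` differently.

```python
def build_progress(step, current, total, elapsed_seconds, status="in_progress"):
    """Assemble a progress payload with the same fields as the sample above."""
    percent = int(current * 100 / total) if total else 0
    # Assumed estimator: remaining items take the average time per item so far.
    remaining = (
        round(elapsed_seconds / current * (total - current)) if current else None
    )
    return {
        "status": status,
        "step": step,
        "current": current,
        "total": total,
        "percent": percent,
        "message": f"{step.capitalize()} item {current}/{total}",
        "elapsed_seconds": elapsed_seconds,
        "estimated_remaining_seconds": remaining,
    }


payload = build_progress("processing", current=45, total=320, elapsed_seconds=120)
print(payload["percent"], payload["estimated_remaining_seconds"])
```

With the sample's inputs, plain linear extrapolation yields 733 seconds rather than the 780 shown above, so the sample values should be read as illustrative rather than as the output of this formula.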
### Query Parameters vs Body
```python
# GET requests: Query parameters for filters
@app.get("/api/search")
async def search(
    q: str,                   # Required
    language: str = "E",      # Optional with default
    method: str = "hybrid",   # Optional with default
    limit: int = 200,         # Optional with default
):
    pass


# POST requests: Body for complex data
@app.post("/api/search-by-image")
async def search_by_image(file: UploadFile = File(...)):
    pass
```
---
## Development Workflow
### Starting Development Servers
```bash
# Using tmux for background processes
# Start backend (IMPORTANT: --host 0.0.0.0 for network access)
tmux new -d -s backend \
"cd backend && source venv/bin/activate && \
uvicorn main:app --reload --host 0.0.0.0 2>&1 | tee ../backend.log"
# Start frontend
tmux new -d -s frontend \
"cd frontend && npm run dev -- --host 0.0.0.0 2>&1 | tee ../frontend.log"
# Monitor logs
tail -f backend.log frontend.log
# Stop servers
tmux kill-session -t backend
tmux kill-session -t frontend
```
### Why `--host 0.0.0.0`?
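Without it, uvicorn binds to `127.0.0.1` and only accepts connections from the same machine. With `--host 0.0.0.0`, it listens on every network interface, which is required when:
- Accessing the app via the machine's network IP (not localhost), for example from another device on the LAN
- Using the Vite dev server from such a device: the frontend builds its API URL from `window.location.hostname`, so the browser calls the backend on that same IP at port 8000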
### Project Conventions
1. **Temporary Scripts**: Store in `scratchpad/YYYY-MM-DD-HHmm-description.py`
2. **Script Results**: Save to `scratchpad/YYYY-MM-DD-HHmm-description.result.txt`
3. **CSS Animations**: Never use `transform: scale()` on hover
4. **Imports**: Organize by stdlib, third-party, local modules
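The scratchpad naming convention can be generated rather than typed by hand. The following is a small sketch; the `scratchpad_paths` helper and its slug rule (lowercase, hyphen-separated) are assumptions, since the convention above only fixes the timestamp prefix and the extensions.

```python
import re
from datetime import datetime
from pathlib import Path


def scratchpad_paths(description, now=None, root="scratchpad"):
    """Return (script_path, result_path) following YYYY-MM-DD-HHmm-description."""
    now = now or datetime.now()
    # Assumed slug rule: lowercase, with runs of other characters collapsed to "-".
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    stem = f"{now:%Y-%m-%d-%H%M}-{slug}"
    return Path(root) / f"{stem}.py", Path(root) / f"{stem}.result.txt"


script, result = scratchpad_paths("Check FTS index", datetime(2025, 1, 15, 9, 30))
print(script.as_posix())
print(result.as_posix())
```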
---
## Key Design Decisions
### Why Hybrid Search?
| Method | Strengths | Weaknesses |
|--------|-----------|------------|
| Keyword (FTS5) | Fast, exact matches, proper nouns | Misses synonyms, typos |
| Semantic | Conceptual understanding, synonyms | Slower, needs embeddings |
| Hybrid | Best of both | More complex |
Default weights: 30% keyword, 70% semantic
### Why Three-Tier Routing?
```
1. LLM Router (Primary)
   - Best accuracy for complex queries
   - Understands nuance and context
   - Requires LLM availability

2. Rule-Based Router (Fallback)
   - Pattern matching for common cases
   - Always available, no dependencies
   - Handles scripture, visual keywords, etc.

3. Direct Method Override
   - User explicitly selects method
   - Bypasses routing entirely
```
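The three tiers can be expressed as a short fallback chain. This is a sketch under assumptions: the `route` and `rule_based_route` functions, the scripture regex, and the `"rules"` and `"override"` labels are hypothetical and are not taken from `llm_router.py` or `rule_based_router.py`.

```python
import re
from typing import Callable, Optional

# Assumed pattern for references such as "John 3:16" or "1 Peter 5:7".
SCRIPTURE = re.compile(r"\b(?:[1-3]\s)?[A-Z][a-z]+\s\d+:\d+(?:-\d+)?\b")


def rule_based_route(query: str) -> str:
    """Tier 2: cheap pattern matching that is always available."""
    if SCRIPTURE.search(query):
        return "scripture"
    return "hybrid"


def route(
    query: str,
    override: Optional[str] = None,
    llm_route: Optional[Callable[[str], str]] = None,
) -> tuple[str, str]:
    """Return (method, router_used) by walking the tiers in priority order."""
    if override:  # Tier 3: an explicit user choice bypasses routing entirely
        return override, "override"
    if llm_route is not None:  # Tier 1: try the LLM when one is configured
        try:
            return llm_route(query), "llm"
        except Exception:
            pass  # LLM unavailable or failed: fall through to the rules
    return rule_based_route(query), "rules"


def broken_llm(query: str) -> str:
    raise TimeoutError("LLM backend not reachable")


print(route("John 3:16"))
print(route("teachings about faith", llm_route=broken_llm))
print(route("anything", override="title"))
```

Because the rule-based tier has no external dependencies, a query always receives some route even when the LLM backend is down.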
### Why Local-First?
1. **Privacy**: No data leaves user's machine
2. **Reliability**: Works without internet
3. **Speed**: No network latency
4. **Cost**: No API fees or cloud costs
5. **Control**: User owns all data
### Why Lazy Loading?
```python
# ML models loaded on first use, not at startup
_subtitle_search = None


def get_subtitle_search():
    global _subtitle_search
    if _subtitle_search is None:
        # Load model only when first needed
        _subtitle_search = SubtitleSearch()
    return _subtitle_search
```
Benefits:
- Faster app startup
- Lower memory if feature unused
- Graceful degradation if model fails to load
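The getter above does not itself show how a failed model load is tolerated. One possible way to combine lazy loading with graceful degradation is sketched below; the `LazyResource` class is a hypothetical helper, not part of the codebase, and it adds a lock so concurrent first requests run the loader only once.

```python
import threading


class LazyResource:
    """Build an expensive object on first use and remember a failed load."""

    def __init__(self, factory):
        self._factory = factory
        self._lock = threading.Lock()
        self._loaded = False
        self._value = None
        self.error = None

    def get(self):
        if not self._loaded:
            with self._lock:  # only one thread runs the factory
                if not self._loaded:
                    try:
                        self._value = self._factory()
                    except Exception as exc:
                        self.error = exc  # feature stays off, app keeps running
                    self._loaded = True
        return self._value


calls = []


def load_model():
    calls.append("load")
    return "model"


def load_broken_model():
    raise RuntimeError("weights missing")


search = LazyResource(load_model)
search.get()
search.get()
print(len(calls))  # 1: the factory ran only once

faces = LazyResource(load_broken_model)
print(faces.get(), repr(faces.error))
```

A caller that receives `None` can return a "feature unavailable" response instead of failing the whole request. This sketch never retries a failed load; whether to retry is a separate design choice.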
---
## Extensibility Guide
### Adding a New Search Method
1. **Define the method** in `rule_based_router.py`:
```python
from enum import Enum


class SearchMethod(Enum):
    KEYWORD = "keyword"
    SEMANTIC = "semantic"
    HYBRID = "hybrid"
    NEW_METHOD = "new_method"  # Add here
```
2. **Implement the search** in appropriate module:
```python
# new_search.py
from typing import Dict, List


class NewSearch:
    def search(self, query: str) -> List[Dict]:
        # Implementation
        pass
```
3. **Add to smart search** in `smart_search.py`:
```python
def _execute_search_plan(self, plan: SearchPlan):
    if plan.primary_method == SearchMethod.NEW_METHOD:
        return self._new_search.search(plan.query)
```
4. **Create endpoint** in `main.py`:
```python
@app.get("/api/search-new")
async def search_new(q: str):
    return get_new_search().search(q)
```
5. **Add UI option** in `SearchPage.jsx`:
```jsx
<select value={searchMethod} onChange={e => setSearchMethod(e.target.value)}>
<option value="hybrid">Hybrid</option>
<option value="new_method">New Method</option>
</select>
```
### Adding a New Frontend Page
1. **Create component** in `frontend/src/NewPage.jsx`
2. **Add to App.jsx**:
```jsx
import NewPage from './NewPage';
// In tab navigation
<button onClick={() => setCurrentPage('new')}>New</button>
// In render
{currentPage === 'new' && <NewPage />}
```
### Adding a New API Endpoint
```python
# main.py
from typing import Dict, Optional


@app.post("/api/new-feature")
async def new_feature(
    param1: str,
    param2: int = 10,
    body: Optional[Dict] = None
):
    """
    New feature endpoint.

    - param1: Required parameter description
    - param2: Optional parameter with default
    - body: Optional JSON body
    """
    try:
        result = process_new_feature(param1, param2, body)
        return {"status": "success", "data": result}
    except ValueError as e:
        raise HTTPException(status_code=400, detail=str(e))
    except Exception as e:
        log_message(f"Error in new_feature: {e}")
        raise HTTPException(status_code=500, detail="Internal error")
```
---
## Quick Reference
### Backend Commands
```bash
# Activate virtual environment
cd backend && source venv/bin/activate
# Install dependencies
pip install -r requirements.txt
# Run server (development)
uvicorn main:app --reload --host 0.0.0.0
# Run server (production)
uvicorn main:app --host 0.0.0.0 --workers 4
```
### Frontend Commands
```bash
# Install dependencies
cd frontend && npm install
# Run development server
npm run dev -- --host 0.0.0.0
# Build for production
npm run build
# Preview production build
npm run preview
```
### API Documentation
- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
---
## Conclusion
This architecture provides a solid foundation for building intelligent, offline-capable applications with:
- **Modular backend** with clear separation of concerns
- **Simple frontend** using React best practices
- **Flexible search** combining multiple methods
- **Hardware-aware** ML integration
- **Easy extensibility** for new features
Use this documentation as a reference when building similar applications, adapting the patterns and components to your specific needs.