# Phase 5 — Full UI, TTS & Access Control

**Status:** ✅ Complete | **Tests:** 55/55 passed | **Files:** 7 modules (3 UI tabs, 2 backend, 1 TTS, 1 updated app.py)

---

## What Was Built

Phase 5 wires all previous phases into a working end-to-end application.

| Module | Responsibility |
|--------|----------------|
| `voicevault/kb/kb_manager.py` | KB lifecycle: create, list, delete, ingest, password auth |
| `voicevault/tts/web_speech.py` | TTS text prep: strip citation markers before speech |
| `ui/tabs/ask_tab.py` | Full voice query pipeline in Gradio |
| `ui/tabs/kb_tab.py` | KB creation, document upload, management |
| `ui/tabs/analytics_tab.py` | Query stats from SQLite audit log |
| `ui/tabs/settings_tab.py` | Configuration panels (display-only) |
| `app.py` | Startup orchestration, pipeline wiring |

---

## KBManager

**File:** [voicevault/kb/kb_manager.py](../voicevault/kb/kb_manager.py)

### Central Database

All KBs share **one** SQLite database at `cfg.data_dir / "voicevault.db"`. This enables cross-KB queries, global analytics, and efficient listing without per-KB filesystem scanning.
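
To illustrate why a single shared database keeps listing cheap, here is a minimal sketch using Python's built-in `sqlite3`. The two-table schema (`kbs`, `documents`) and the helper names `open_central_db` and `list_kbs_with_counts` are assumptions made for this example, not VoiceVault's actual schema or API. One query returns every KB with its protection flag and document count, with no filesystem scan.

```python
import sqlite3

# Hypothetical schema for illustration only; the real VoiceVault tables may differ.
SCHEMA = """
CREATE TABLE kbs (
    name TEXT PRIMARY KEY,
    password_hash TEXT              -- NULL means the KB is public
);
CREATE TABLE documents (
    id INTEGER PRIMARY KEY,
    kb_name TEXT NOT NULL REFERENCES kbs(name) ON DELETE CASCADE,
    filename TEXT NOT NULL
);
"""


def open_central_db(path=":memory:"):
    """Open (or create) the one database shared by all KBs."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    # SQLite only enforces foreign keys (and CASCADE) when this is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    return conn


def list_kbs_with_counts(conn):
    """Return (name, is_protected, document_count) for every KB in one query."""
    rows = conn.execute(
        """
        SELECT k.name, k.password_hash IS NOT NULL AS protected, COUNT(d.id)
        FROM kbs AS k
        LEFT JOIN documents AS d ON d.kb_name = k.name
        GROUP BY k.name
        ORDER BY k.name
        """
    ).fetchall()
    return [(name, bool(protected), count) for name, protected, count in rows]
```

With `ON DELETE CASCADE` in place, deleting a row from `kbs` also removes its `documents` rows, which is the behaviour the `delete_kb` flow relies on.
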
### KB Name Validation

```python
import re

_VALID_KB_NAME = re.compile(r"^[a-z0-9][a-z0-9\-]{0,62}[a-z0-9]$|^[a-z0-9]$")
```

- Lowercase alphanumeric + hyphens only
- 1–64 characters
- Cannot start or end with a hyphen
- Prevents path traversal attacks (no `..`, `/`, `\`, spaces)

### Password Protection (bcrypt)

```python
import bcrypt

password_hash = bcrypt.hashpw(
    password.encode(),
    bcrypt.gensalt(rounds=cfg.bcrypt_rounds),  # default: 12
).decode()
```

- Passwords are hashed at creation time; plaintext is never stored
- `verify_password()` uses `bcrypt.checkpw()` for constant-time comparison
- Public KBs (no password) return `True` for any password check

### verify_password Logic

```
KB has no hash (public)     → True  (always accessible)
KB has hash, no password    → False (protected but no credentials)
KB has hash, with password  → bcrypt.checkpw(password, hash)
```

### ingest_documents Flow

```
ingest_documents(kb_name, file_paths, password=None):
  1. Verify KB exists
  2. Verify password
  3. IndexBuilder(kb_name).ingest_file(path, db_path) per file
  4. Return list[IngestionReport]
```

The method delegates entirely to `IndexBuilder` (Phase 1), which handles parsing, chunking, embedding, ChromaDB upsert, BM25 rebuild, and deduplication.

### delete_kb Flow

```
delete_kb(kb_name):
  1. Verify KB exists (raises KBManagerError if not)
  2. db.delete_kb() → SQLite CASCADE deletes documents, chunks, query_log
  3. shutil.rmtree(cfg.kb_dir(kb_name)) → removes ChromaDB, BM25, files
```

Deletion is irreversible, so the UI asks for confirmation before calling it.

---

## TTS — Web Speech API

**File:** [voicevault/tts/web_speech.py](../voicevault/tts/web_speech.py)

The TTS engine runs entirely in the browser via the `SpeechSynthesis` API, with zero API cost and zero server load. Python's role is text preparation only.

### prepare_for_tts()

```python
def prepare_for_tts(answer: str, is_refusal: bool = False) -> str:
    if is_refusal or not answer:
        return ""
    text = _CITATION_MARKER_RE.sub("", answer)  # strip [Source: ...] markers
    text = re.sub(r"\s{2,}", " ", text).strip()  # collapse leftover whitespace runs
    return text
```

The function removes `[Source: filename, p.N]` markers before the text is passed to the browser, because reading "Source: paper dot pdf, p dot 3" aloud is poor UX.

The JS bridge (`ui/components/audio_controls.py`) takes this cleaned text and calls `window._vv_tts.speak(text, rate, pitch)`.

---

## Ask Tab (Full Pipeline)

**File:** [ui/tabs/ask_tab.py](../ui/tabs/ask_tab.py)

### End-to-End Query Flow

```
1. User records audio → stop_recording event fires
   → WhisperTranscriber.transcribe(audio_path) → transcript text
2. User selects KB(s) → clicks Ask
3. _query_fn():
   a. QueryPreprocessor.process(query) → pq (cleaned, typed)
   b. HybridRetriever(kb_names=selected).search(pq.processed_query) → results
   c. ContextBuilder().build(results) → (context_str, citation_map)
   d. AnswerChain.generate(query, context, citation_map, history, query_type) → generation
   e. db.log_query(...)  ← SHA-256 only, no raw text stored
   f. format_citations_markdown(generation.citations) → citation panel
   g. prepare_for_tts(generation.answer, generation.is_refusal) → TTS text
   h. Update chatbot + citations + history state + TTS state
```

### State Management

- `gr.State([])`: conversation history as `list[tuple[str, str]]`
- `gr.State("")`: last answer text (for TTS playback)

Conversation history is passed to `AnswerChain._build_messages()` as proper `HumanMessage`/`AIMessage` pairs, the standard LangChain pattern for multi-turn conversation.

### Error Handling

Every failure path (no query, no KB selected, pipeline error) produces a user-visible error message in the chatbot rather than crashing. A query-logger failure is treated as non-critical: it is caught and logged as a warning, and never raises.

### Factory Functions

Event handlers are returned as closures from factory functions:

```python
def _make_transcribe_fn(transcriber):
    def _transcribe(audio_path):
        ...
    return _transcribe

def _make_query_fn(answer_chain, db_path):
    def _query(query, kb_names, history, chatbot):
        ...
    return _query
```

This enables dependency injection without globals: the `transcriber` and `answer_chain` objects are passed in from `app.py` and captured in the closure.

---

## KB Tab (Management UI)

**File:** [ui/tabs/kb_tab.py](../ui/tabs/kb_tab.py)

Three operations are wired to Gradio event handlers:

| Button | Handler | Output |
|--------|---------|--------|
| ➕ Create KB | `_create_kb()` | Status message, refreshed dropdowns |
| 📤 Index Documents | `_upload_docs()` | Ingestion report per file |
| 🗑️ Delete KB | `_delete_kb()` | Status message, refreshed table + dropdowns |

After each create/delete, all dropdowns and the KB dataframe are updated via `gr.update(choices=...)`, so no page refresh is needed.

---

## Analytics Tab

**File:** [ui/tabs/analytics_tab.py](../ui/tabs/analytics_tab.py)

The tab pulls data from `sqlite_store.get_query_stats()` when the refresh button is clicked:

| Metric | Source |
|--------|--------|
| Total queries (7d) | `COUNT(*)` from `query_log` |
| Avg end-to-end latency | `AVG(total_latency_ms)` |
| Avg citations per answer | `AVG(citation_count)` |
| Queries by day | `GROUP BY DATE(timestamp)` |
| KB inventory | `KBManager.list_kbs()` |

Stats are not loaded on page load; the user clicks 🔄 Refresh to pull fresh data, which avoids unnecessary DB queries at startup.

---

## app.py — Startup Orchestration

**File:** [app.py](../app.py)

```
_startup() → (kb_manager, transcriber, answer_chain):
  1. cfg.ensure_directories()
  2. KBManager(db_path=data_dir/voicevault.db)  ← initializes SQLite schema
  3. WhisperTranscriber()                       ← lazy: no model loaded at startup
  4. AnswerChain()                              ← lazy: LLM clients created per call
```

All three singletons are created once and passed to the UI tab builders, so model-loading overhead is not repeated on every query.

---

## Security Decisions

### Password Storage

Passwords are hashed with bcrypt at work factor 12, which makes offline brute-force attacks costly even if the SQLite file is exfiltrated. This is above the minimum bcrypt work factor of 10 recommended by OWASP.

### KB Name as Path Component

The KB name regex (`^[a-z0-9][a-z0-9\-]{0,62}[a-z0-9]$|^[a-z0-9]$`) prevents path traversal. All KB filesystem operations use `cfg.kb_dir(kb_name)`, which returns `data_dir / kb_name`. A validated slug contains no `.`, `/`, or `\`, so the resulting path cannot escape `data_dir`.

### Query Audit Log — PII Protection

The raw query text is NEVER stored in SQLite. Only the SHA-256 hash of the query is stored (`voice_query_hash`). This supports the GDPR "data minimization" principle: analytics work on aggregates, not raw user queries.

### No Globals in Event Handlers

All pipeline objects (transcriber, answer_chain, kb_manager) are passed via closures, not module-level globals. This makes the code testable (dependency injection) and prevents accidental shared state mutation.

---

## Test Coverage

**File:** [tests/test_phase5.py](../tests/test_phase5.py) | **55/55 passed**

| Class | Tests | What's verified |
|-------|-------|-----------------|
| `TestKBManagerCreate` | 16 | Create, list, get, duplicate detection, 5 slug validation cases |
| `TestKBManagerDelete` | 3 | Delete removes from list, nonexistent raises, count decreases |
| `TestKBManagerPassword` | 7 | Public access, protected access, wrong pass, no pass, unknown KB, bcrypt format |
| `TestKBManagerStats` | 3 | Returns dict, has required keys, zeros on empty DB |
| `TestPreparForTTS` | 7 | Citation stripping, refusal → empty, normal text unchanged, no double spaces |
| `TestCitationPanel` | 8 | Filename, page, section, excerpt, multiple, numbered, empty, type |
| `TestUIHelpers` | 7 | KB choices, KB table format, protected lock icon, append_chat, no mutation |
| `TestAppStartup` | 4 | build_app returns Blocks, all three tab builders run without error |

### Fixture Design

The `manager` fixture creates a fresh KBManager backed by a temp SQLite path for each test, giving complete isolation with no shared state between tests.
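
The isolation idea behind the `manager` fixture can be sketched with the standard library alone. The example below is an assumption-laden illustration, not the project's test code: `StubKBManager` is a hypothetical stand-in for `KBManager`, and `fresh_manager` is a hypothetical helper that plays the role a pytest fixture built on `tmp_path` would play. Each call gets its own temporary directory and therefore its own empty database.

```python
import sqlite3
import tempfile
from contextlib import contextmanager
from pathlib import Path


class StubKBManager:
    """Hypothetical stand-in for KBManager; stores KB names in one SQLite file."""

    def __init__(self, db_path):
        self.db_path = Path(db_path)
        self._run("CREATE TABLE IF NOT EXISTS kbs (name TEXT PRIMARY KEY)")

    def _run(self, sql, params=()):
        conn = sqlite3.connect(self.db_path)
        try:
            with conn:  # commits on success, rolls back on error
                return conn.execute(sql, params).fetchall()
        finally:
            conn.close()  # release the file so the temp dir can be removed

    def create_kb(self, name):
        self._run("INSERT INTO kbs (name) VALUES (?)", (name,))

    def list_kbs(self):
        return [row[0] for row in self._run("SELECT name FROM kbs ORDER BY name")]


@contextmanager
def fresh_manager():
    """Yield a manager backed by a brand-new temp database, removed afterwards."""
    with tempfile.TemporaryDirectory() as tmp:
        yield StubKBManager(Path(tmp) / "voicevault.db")
```

Because every test starts from a new file, a KB created in one test can never appear in another, and test order does not matter.
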
---

## Full Project Test Summary

| Phase | Tests | Status |
|-------|-------|--------|
| Phase 0 — Foundation | 58 passed | ✅ |
| Phase 1 — Ingestion | 46 passed | ✅ |
| Phase 2 — Retrieval | 33 passed, 0 errors | ✅ |
| Phase 3 — ASR | 45 passed, 2 skipped (soundfile) | ✅ |
| Phase 4 — Generation | 72 passed | ✅ |
| Phase 5 — UI & Access | 55 passed | ✅ |
| **Total** | **309 passed, 2 skipped** | ✅ |

**Note on conftest.py CPU fix:** `CUDA_VISIBLE_DEVICES="-1"` is set in `tests/conftest.py` to force CPU for all tests. This prevents CUDA compatibility errors on the RTX 5070 (sm_120 is not supported by the packaged PyTorch ≤ 2.x). Production deployment on HuggingFace Spaces uses an NVIDIA T4 (sm_75), which is fully compatible.
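
A minimal sketch of what such a CPU-forcing setup could look like is shown below. The helper name `force_cpu` is hypothetical and the project's actual `conftest.py` may be written differently. The point it illustrates is ordering: the variable has to be set before any library initializes CUDA.

```python
import os


def force_cpu(env=os.environ):
    """Hide all CUDA devices from libraries that read this variable (hypothetical helper)."""
    env["CUDA_VISIBLE_DEVICES"] = "-1"


# Run at import time of the test configuration module, before torch
# (or anything else that initializes CUDA) is imported.
force_cpu()
```

Setting the variable after CUDA has already been initialized in the process has no effect, which is why a top-level statement in the test configuration is a natural place for it.
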