| # Phase 5: Full UI, TTS & Access Control | |
| **Status:** ✅ Complete | **Tests:** 55/55 passed | **Files:** 7 modules (3 UI tabs, 2 backend, 1 TTS, 1 updated app.py) | |
| --- | |
| ## What Was Built | |
| Phase 5 wires all previous phases into a working end-to-end application. | |
| | Module | Responsibility | | |
| |--------|----------------| | |
| | `voicevault/kb/kb_manager.py` | KB lifecycle: create, list, delete, ingest, password auth | | |
| | `voicevault/tts/web_speech.py` | TTS text prep: strip citation markers before speech | | |
| | `ui/tabs/ask_tab.py` | Full voice query pipeline in Gradio | | |
| | `ui/tabs/kb_tab.py` | KB creation, document upload, management | | |
| | `ui/tabs/analytics_tab.py` | Query stats from SQLite audit log | | |
| | `ui/tabs/settings_tab.py` | Configuration panels (display-only) | | |
| | `app.py` | Startup orchestration, pipeline wiring | | |
| --- | |
| ## KBManager | |
| **File:** [voicevault/kb/kb_manager.py](../voicevault/kb/kb_manager.py) | |
| ### Central Database | |
| All KBs share **one** SQLite database at `cfg.data_dir / "voicevault.db"`. This enables cross-KB queries, global analytics, and efficient listing without per-KB filesystem scanning. | |
| ### KB Name Validation | |
| ```python | |
| _VALID_KB_NAME = re.compile(r"^[a-z0-9][a-z0-9\-]{0,62}[a-z0-9]$|^[a-z0-9]$") | |
| ``` | |
| - Lowercase alphanumeric + hyphens only | |
| - 1-64 characters | |
| - Cannot start or end with a hyphen | |
| - Prevents path traversal attacks (no `..`, `/`, `\`, spaces) | |
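The rules above can be checked with a small helper. This is a minimal sketch: the pattern is copied from this section, while `is_valid_kb_name` is a hypothetical name that may not match the function in `kb_manager.py`. It uses `fullmatch` because `$` combined with `re.match` would also accept a name followed by a trailing newline.

```python
import re

# Pattern copied from the section above.
_VALID_KB_NAME = re.compile(r"^[a-z0-9][a-z0-9\-]{0,62}[a-z0-9]$|^[a-z0-9]$")


def is_valid_kb_name(name: str) -> bool:
    """Hypothetical helper: True if `name` is an acceptable KB slug."""
    # fullmatch requires the entire string to match, so "abc\n" is rejected.
    return _VALID_KB_NAME.fullmatch(name) is not None


for candidate in ["papers", "research-2024", "-lead", "trail-", "../etc", "Has Space"]:
    print(f"{candidate!r}: {is_valid_kb_name(candidate)}")
```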
| ### Password Protection (bcrypt) | |
| ```python | |
| password_hash = bcrypt.hashpw( | |
|     password.encode(), bcrypt.gensalt(rounds=cfg.bcrypt_rounds)  # default: 12 | |
| ).decode() | |
| ``` | |
| - Passwords are hashed at creation time; plaintext is never stored | |
| - `verify_password()` uses `bcrypt.checkpw()` for constant-time comparison | |
| - Public KBs (no password) return True for any password check | |
| ### verify_password Logic | |
| ``` | |
| KB has no hash (public) → True (always accessible) | |
| KB has hash, no password → False (protected but no credentials) | |
| KB has hash, with password → bcrypt.checkpw(password, hash) | |
| ``` | |
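The three branches above might look like the following in code. This is a sketch rather than the actual `KBManager.verify_password`: the signature is an assumption, and the hash check is injected as a `checkpw` callable so the example runs without the `bcrypt` package installed.

```python
from typing import Callable, Optional


def verify_password(
    stored_hash: Optional[bytes],
    password: Optional[str],
    checkpw: Callable[[bytes, bytes], bool],
) -> bool:
    """Hypothetical stand-alone version of the decision table above."""
    if not stored_hash:
        # Public KB: no hash stored, so any caller is allowed.
        return True
    if not password:
        # Protected KB, but the caller supplied no credentials.
        return False
    # Protected KB with credentials: delegate to the hash check.
    return checkpw(password.encode(), stored_hash)
```

In real use, `bcrypt.checkpw` can be passed as `checkpw`; it takes the candidate password and the stored hash, both as bytes. A hash stored as `str`, as in the creation snippet above, would need `.encode()` first.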
| ### ingest_documents Flow | |
| ```python | |
| def ingest_documents(kb_name, file_paths, password=None): | |
|     # 1. Verify KB exists | |
|     # 2. Verify password | |
|     # 3. IndexBuilder(kb_name).ingest_file(path, db_path) per file | |
|     # 4. Return list[IngestionReport] | |
|     ... | |
| ``` | |
| Ingestion delegates entirely to `IndexBuilder` (Phase 1), which handles parsing, chunking, embedding, ChromaDB upsert, BM25 rebuild, and deduplication. | |
| ### delete_kb Flow | |
| ```python | |
| def delete_kb(kb_name): | |
|     # 1. Verify KB exists (raises KBManagerError if not) | |
|     # 2. db.delete_kb(): SQLite CASCADE deletes documents, chunks, query_log | |
|     # 3. shutil.rmtree(cfg.kb_dir(kb_name)): removes ChromaDB, BM25, files | |
|     ... | |
| ``` | |
| Deletion is irreversible, so the UI asks for confirmation before calling it. | |
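One detail worth illustrating for the cascade step: SQLite only enforces `ON DELETE CASCADE` when foreign keys are enabled on the connection, and they are off by default. The sketch below uses an in-memory database with an assumed two-table schema (`kbs`, `documents`); the real VoiceVault schema is not shown in this document and may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Foreign key enforcement is off by default for each new SQLite connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE kbs (name TEXT PRIMARY KEY);
    CREATE TABLE documents (
        id INTEGER PRIMARY KEY,
        kb_name TEXT NOT NULL REFERENCES kbs(name) ON DELETE CASCADE
    );
    INSERT INTO kbs VALUES ('papers');
    INSERT INTO documents (kb_name) VALUES ('papers'), ('papers');
    """
)

# Deleting the parent row removes the dependent document rows as well.
conn.execute("DELETE FROM kbs WHERE name = ?", ("papers",))
remaining = conn.execute("SELECT COUNT(*) FROM documents").fetchone()[0]
print(remaining)
```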
| --- | |
| ## TTS: Web Speech API | |
| **File:** [voicevault/tts/web_speech.py](../voicevault/tts/web_speech.py) | |
| The TTS engine runs entirely in the browser via the `SpeechSynthesis` API, with zero API cost and zero server load. Python's role is text preparation only. | |
| ### prepare_for_tts() | |
| ```python | |
| def prepare_for_tts(answer: str, is_refusal: bool = False) -> str: | |
|     if is_refusal or not answer: | |
|         return "" | |
|     text = _CITATION_MARKER_RE.sub("", answer)  # strip [Source: ...] | |
|     text = re.sub(r"\s{2,}", " ", text).strip() | |
|     return text | |
| ``` | |
| This removes `[Source: filename, p.N]` markers before the text is passed to the browser, because reading "Source: paper dot pdf, p dot 3" aloud is poor UX. The JS bridge (`ui/components/audio_controls.py`) takes the cleaned text and calls `window._vv_tts.speak(text, rate, pitch)`. | |
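A runnable version of the cleaning step is sketched below. The definition of `_CITATION_MARKER_RE` does not appear in this document, so the pattern here is an assumption that matches the `[Source: ...]` form described above.

```python
import re

# Assumed pattern: matches "[Source: ...]" up to the first closing bracket.
_CITATION_MARKER_RE = re.compile(r"\[Source:[^\]]*\]")


def prepare_for_tts(answer: str, is_refusal: bool = False) -> str:
    if is_refusal or not answer:
        return ""
    text = _CITATION_MARKER_RE.sub("", answer)
    # Removing a marker mid-sentence leaves a double space; collapse it.
    return re.sub(r"\s{2,}", " ", text).strip()


spoken = prepare_for_tts(
    "Attention weighs tokens [Source: paper.pdf, p.3] by relevance."
)
print(spoken)
```

With this pattern, a marker placed directly before punctuation leaves a space in front of that punctuation, which speech synthesis usually tolerates.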
| --- | |
| ## Ask Tab (Full Pipeline) | |
| **File:** [ui/tabs/ask_tab.py](../ui/tabs/ask_tab.py) | |
| ### End-to-End Query Flow | |
| ``` | |
| 1. User records audio → stop_recording event fires | |
|    → WhisperTranscriber.transcribe(audio_path) → transcript text | |
| 2. User selects KB(s) → clicks Ask | |
| 3. _query_fn(): | |
|    a. QueryPreprocessor.process(query) → pq (cleaned, typed) | |
|    b. HybridRetriever(kb_names=selected).search(pq.processed_query) → results | |
|    c. ContextBuilder().build(results) → (context_str, citation_map) | |
|    d. AnswerChain.generate(query, context, citation_map, history, query_type) → generation | |
|    e. db.log_query(...) (SHA-256 hash only, no raw text stored) | |
|    f. format_citations_markdown(generation.citations) → citation panel | |
|    g. prepare_for_tts(generation.answer, generation.is_refusal) → TTS text | |
| h. Update chatbot + citations + history state + TTS state | |
| ``` | |
| ### State Management | |
| - `gr.State([])`: conversation history as `list[tuple[str, str]]` | |
| - `gr.State("")`: last answer text (for TTS playback) | |
| Conversation history is passed to `AnswerChain._build_messages()` as `HumanMessage`/`AIMessage` pairs, the usual LangChain pattern for multi-turn conversation. | |
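The conversion from history tuples to alternating messages can be sketched as follows. To stay runnable without LangChain, this example uses plain dicts as stand-ins for `HumanMessage` and `AIMessage`; `history_to_messages` is a hypothetical name, and the real `_build_messages()` may be structured differently.

```python
from typing import Dict, List, Tuple

History = List[Tuple[str, str]]  # (user_text, assistant_text) per turn


def history_to_messages(history: History, new_query: str) -> List[Dict[str, str]]:
    """Flatten prior turns into alternating human/ai messages, then add the new query."""
    messages: List[Dict[str, str]] = []
    for user_text, assistant_text in history:
        messages.append({"role": "human", "content": user_text})
        messages.append({"role": "ai", "content": assistant_text})
    messages.append({"role": "human", "content": new_query})
    return messages


msgs = history_to_messages([("What is BM25?", "A ranking function.")], "Who proposed it?")
print(len(msgs))
```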
| ### Error Handling | |
| Every failure path (no query, no KB selected, pipeline error) produces a user-visible error message in the chatbot rather than crashing. A query-logging failure is non-critical: it is caught and logged as a warning, and never propagates. | |
| ### Factory Functions | |
| Event handlers are returned as closures from factory functions: | |
| ```python | |
| def _make_transcribe_fn(transcriber): | |
|     def _transcribe(audio_path): ... | |
|     return _transcribe | |
| def _make_query_fn(answer_chain, db_path): | |
|     def _query(query, kb_names, history, chatbot): ... | |
|     return _query | |
| ``` | |
| This enables dependency injection without globals: the `transcriber` and `answer_chain` objects are passed in from `app.py` and captured in the closure. | |
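The testing benefit of this pattern is easy to show with a stub. In the sketch below, `FakeTranscriber` and the empty-input guard are illustrative assumptions; only the factory shape comes from the snippet above.

```python
class FakeTranscriber:
    """Test double standing in for the real Whisper-backed transcriber."""

    def transcribe(self, audio_path: str) -> str:
        return f"transcript of {audio_path}"


def make_transcribe_fn(transcriber):
    # The handler closes over `transcriber`; no module-level global is needed.
    def _transcribe(audio_path):
        if not audio_path:
            return ""  # assumed behavior when nothing was recorded
        return transcriber.transcribe(audio_path)

    return _transcribe


handler = make_transcribe_fn(FakeTranscriber())
print(handler("clip.wav"))
```

Because the dependency is an argument, a test can swap in the stub without patching module state.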
| --- | |
| ## KB Tab (Management UI) | |
| **File:** [ui/tabs/kb_tab.py](../ui/tabs/kb_tab.py) | |
| Three operations wired to Gradio event handlers: | |
| | Button | Handler | Output | | |
| |--------|---------|--------| | |
| | Create KB | `_create_kb()` | Status message, refreshed dropdowns | | |
| | 📤 Index Documents | `_upload_docs()` | Ingestion report per file | | |
| | 🗑️ Delete KB | `_delete_kb()` | Status message, refreshed table + dropdowns | | |
| After each create/delete, all dropdowns and the KB dataframe are updated via `gr.update(choices=...)`, so no page refresh is needed. | |
| --- | |
| ## Analytics Tab | |
| **File:** [ui/tabs/analytics_tab.py](../ui/tabs/analytics_tab.py) | |
| Pulls data from `sqlite_store.get_query_stats()` on refresh button click: | |
| | Metric | Source | | |
| |--------|--------| | |
| | Total queries (7d) | `COUNT(*)` from `query_log` | | |
| | Avg end-to-end latency | `AVG(total_latency_ms)` | | |
| | Avg citations per answer | `AVG(citation_count)` | | |
| | Queries by day | `GROUP BY DATE(timestamp)` | | |
| | KB inventory | `KBManager.list_kbs()` | | |
| Stats are not loaded on page load; the user clicks Refresh to pull fresh data. This avoids unnecessary DB queries at startup. | |
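The aggregates in the table could be computed with queries along these lines. This is a sketch against an in-memory database: the column names come from the table above, but the full `query_log` schema and the exact SQL inside `get_query_stats()` are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE query_log (timestamp TEXT, total_latency_ms REAL, citation_count INTEGER)"
)
# Two recent rows and one row outside the 7-day window.
conn.executemany(
    "INSERT INTO query_log VALUES (datetime('now', ?), ?, ?)",
    [("-1 days", 1200.0, 2), ("-2 days", 800.0, 4), ("-30 days", 5000.0, 0)],
)

total, avg_latency, avg_citations = conn.execute(
    """
    SELECT COUNT(*), AVG(total_latency_ms), AVG(citation_count)
    FROM query_log
    WHERE timestamp >= datetime('now', '-7 days')
    """
).fetchone()

per_day = conn.execute(
    """
    SELECT DATE(timestamp) AS day, COUNT(*)
    FROM query_log
    WHERE timestamp >= datetime('now', '-7 days')
    GROUP BY day
    ORDER BY day
    """
).fetchall()

print(total, avg_latency, avg_citations, per_day)
```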
| --- | |
| ## app.py: Startup Orchestration | |
| **File:** [app.py](../app.py) | |
| ```python | |
| def _startup():  # returns (kb_manager, transcriber, answer_chain) | |
|     # 1. cfg.ensure_directories() | |
|     # 2. KBManager(db_path=data_dir/voicevault.db): initializes SQLite schema | |
|     # 3. WhisperTranscriber(): lazy, no model loaded at startup | |
|     # 4. AnswerChain(): lazy, LLM clients created per call | |
|     ... | |
| ``` | |
| All three singletons are created once and passed to the UI tab builders, so model loading is not repeated on every query. | |
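The lazy-loading behavior noted for `WhisperTranscriber` can be sketched with a loader that runs on first use only. `LazyTranscriber` and its `loader` argument are hypothetical; the real class is defined in an earlier phase and may work differently.

```python
class LazyTranscriber:
    """Defers the expensive model load until the first transcription."""

    def __init__(self, loader):
        self._loader = loader  # callable that returns the loaded model
        self._model = None

    def transcribe(self, audio_path: str) -> str:
        if self._model is None:
            # First call pays the loading cost; later calls reuse the model.
            self._model = self._loader()
        return self._model(audio_path)


load_calls = []


def fake_loader():
    load_calls.append(1)
    return lambda path: f"text from {path}"


transcriber = LazyTranscriber(fake_loader)
calls_at_startup = len(load_calls)
transcriber.transcribe("a.wav")
transcriber.transcribe("b.wav")
print(calls_at_startup, len(load_calls))
```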
| --- | |
| ## Security Decisions | |
| ### Password Storage | |
| bcrypt with a work factor of 12 makes offline brute-force attacks expensive even if the SQLite file is exfiltrated. This is above the minimum bcrypt work factor of 10 that OWASP recommends. | |
| ### KB Name as Path Component | |
| The KB name regex (`^[a-z0-9][a-z0-9\-]{0,62}[a-z0-9]$`, plus the single-character case) prevents path traversal. All KB filesystem operations use `cfg.kb_dir(kb_name)`, which returns `data_dir / kb_name`. A validated slug contains no `/`, `\`, or `.`, so it cannot escape `data_dir`. | |
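As a complement to slug validation, a containment check on the resolved path can act as a second line of defense. The helper below is hypothetical and not part of the code described here; it assumes Python 3.9+ for `Path.is_relative_to`.

```python
from pathlib import Path


def safe_kb_dir(data_dir: Path, kb_name: str) -> Path:
    """Hypothetical guard: resolve the KB path and require it to stay inside data_dir."""
    root = data_dir.resolve()
    target = (root / kb_name).resolve()
    if target == root or not target.is_relative_to(root):
        raise ValueError(f"KB path escapes data directory: {kb_name!r}")
    return target


print(safe_kb_dir(Path("data"), "papers").name)
```

A name such as `../secrets` resolves outside `data`, so it raises `ValueError` even if it somehow bypassed the regex.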
| ### Query Audit Log β PII Protection | |
| The raw query text is never stored in SQLite; only its SHA-256 hash is stored (`voice_query_hash`). This supports GDPR data minimization: analytics work on aggregates, not raw user queries. | |
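Hashing a query before logging might look like the sketch below. The function name `query_hash` and the UTF-8 encoding are assumptions; only the use of SHA-256 is stated in this document.

```python
import hashlib


def query_hash(query: str) -> str:
    """Return the hex SHA-256 digest that would be stored instead of the raw query."""
    return hashlib.sha256(query.encode("utf-8")).hexdigest()


digest = query_hash("what does section 3 say about attention?")
print(len(digest))
```

The digest is deterministic, so repeated queries can still be counted. Because it is unsalted, a short or predictable query could be recovered by hashing guesses, which is worth keeping in mind when treating the log as non-sensitive.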
| ### No Globals in Event Handlers | |
| All pipeline objects (transcriber, answer_chain, kb_manager) are passed via closures, not module-level globals. This makes the code testable (dependency injection) and prevents accidental shared state mutation. | |
| --- | |
| ## Test Coverage | |
| **File:** [tests/test_phase5.py](../tests/test_phase5.py) | **55/55 passed** | |
| | Class | Tests | What's verified | | |
| |-------|-------|----------------| | |
| | `TestKBManagerCreate` | 16 | Create, list, get, duplicate detection, 5 slug validation cases | | |
| | `TestKBManagerDelete` | 3 | Delete removes from list, nonexistent raises, count decreases | | |
| | `TestKBManagerPassword` | 7 | Public access, protected access, wrong pass, no pass, unknown KB, bcrypt format | | |
| | `TestKBManagerStats` | 3 | Returns dict, has required keys, zeros on empty DB | | |
| | `TestPreparForTTS` | 7 | Citation stripping, refusal → empty, normal text unchanged, no double spaces | | |
| | `TestCitationPanel` | 8 | Filename, page, section, excerpt, multiple, numbered, empty, type | | |
| | `TestUIHelpers` | 7 | KB choices, KB table format, protected lock icon, append_chat, no mutation | | |
| | `TestAppStartup` | 4 | build_app returns Blocks, all three tab builders run without error | | |
| ### Fixture Design | |
| The `manager` fixture creates a fresh KBManager backed by a temp SQLite path for each test, giving complete isolation with no shared state between tests. | |
| --- | |
| ## Full Project Test Summary | |
| | Phase | Tests | Status | | |
| |-------|-------|--------| | |
| | Phase 0: Foundation | 58 passed | ✅ | | |
| | Phase 1: Ingestion | 46 passed | ✅ | | |
| | Phase 2: Retrieval | 33 passed, 0 errors | ✅ | | |
| | Phase 3: ASR | 45 passed, 2 skipped (soundfile) | ✅ | | |
| | Phase 4: Generation | 72 passed | ✅ | | |
| | Phase 5: UI & Access | 55 passed | ✅ | | |
| | **Total** | **309 passed, 2 skipped** | ✅ | | |
| **Note on conftest.py CPU fix:** `CUDA_VISIBLE_DEVICES="-1"` is set in `tests/conftest.py` to force CPU for all tests. This prevents CUDA compatibility errors on RTX 5070 (sm_120 is not supported by packaged PyTorch ≤ 2.x). Production deployment on HuggingFace Spaces uses NVIDIA T4 (sm_75), which is fully compatible. | |
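The CPU override described in the note is a one-line environment change. The sketch below shows the general shape only; the actual contents of `tests/conftest.py` are not reproduced in this document.

```python
import os

# Hide all CUDA devices from this process. This needs to run before any
# module initializes CUDA, which is why it sits at the top of conftest.py.
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
```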