Phase 5: Full UI, TTS & Access Control
Status: ✅ Complete | Tests: 55/55 passed | Files: 7 modules (3 UI tabs, 2 backend, 1 TTS, 1 updated app.py)
What Was Built
Phase 5 wires all previous phases into a working end-to-end application.
| Module | Responsibility |
|---|---|
| voicevault/kb/kb_manager.py | KB lifecycle: create, list, delete, ingest, password auth |
| voicevault/tts/web_speech.py | TTS text prep: strip citation markers before speech |
| ui/tabs/ask_tab.py | Full voice query pipeline in Gradio |
| ui/tabs/kb_tab.py | KB creation, document upload, management |
| ui/tabs/analytics_tab.py | Query stats from SQLite audit log |
| ui/tabs/settings_tab.py | Configuration panels (display-only) |
| app.py | Startup orchestration, pipeline wiring |
KBManager
File: voicevault/kb/kb_manager.py
Central Database
All KBs share one SQLite database at cfg.data_dir / "voicevault.db". This enables cross-KB queries, global analytics, and efficient listing without per-KB filesystem scanning.
KB Name Validation
```python
_VALID_KB_NAME = re.compile(r"^[a-z0-9][a-z0-9\-]{0,62}[a-z0-9]$|^[a-z0-9]$")
```
- Lowercase alphanumeric + hyphens only
- 1 to 64 characters
- Cannot start or end with a hyphen
- Prevents path traversal attacks (no `..`, `/`, `\`, or spaces)
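As a minimal sketch, the rules above can be exercised with the pattern from this section. `is_valid_kb_name` is a hypothetical helper name, and the use of `re.fullmatch` is an assumption rather than what kb_manager.py necessarily does; it is chosen here because `$` alone also matches just before a trailing newline.

```python
import re

# Pattern copied from the section above.
_VALID_KB_NAME = re.compile(r"^[a-z0-9][a-z0-9\-]{0,62}[a-z0-9]$|^[a-z0-9]$")


def is_valid_kb_name(name: str) -> bool:
    """Hypothetical helper: True if `name` is an acceptable KB slug."""
    # fullmatch requires the whole string to match, so "abc\n" is rejected.
    return _VALID_KB_NAME.fullmatch(name) is not None


examples = ["research-notes", "a", "-leading", "trailing-", "../etc", "Has Caps"]
for name in examples:
    print(f"{name!r}: {is_valid_kb_name(name)}")
```

Only the first two example names are accepted; the rest fail on a leading or trailing hyphen, a path character, or uppercase and spaces.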
Password Protection (bcrypt)
```python
password_hash = bcrypt.hashpw(
    password.encode(), bcrypt.gensalt(rounds=cfg.bcrypt_rounds)  # default: 12
).decode()
```
- Passwords are hashed at creation time; plaintext is never stored
- verify_password() uses bcrypt.checkpw() for constant-time comparison
- Public KBs (no password) return True for any password check
verify_password Logic
```text
KB has no hash (public)      → True (always accessible)
KB has hash, no password     → False (protected but no credentials)
KB has hash, with password   → bcrypt.checkpw(password, hash)
```
ingest_documents Flow
```text
ingest_documents(kb_name, file_paths, password=None):
  1. Verify KB exists
  2. Verify password
  3. IndexBuilder(kb_name).ingest_file(path, db_path) per file
  4. Return list[IngestionReport]
```
This delegates entirely to IndexBuilder (Phase 1), which handles parsing, chunking, embedding, ChromaDB upsert, BM25 rebuild, and deduplication.
delete_kb Flow
```text
delete_kb(kb_name):
  1. Verify KB exists (raises KBManagerError if not)
  2. db.delete_kb() → SQLite CASCADE deletes documents, chunks, query_log
  3. shutil.rmtree(cfg.kb_dir(kb_name)) → removes ChromaDB, BM25, files
```
This is irreversible; the UI asks for confirmation before calling it.
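A minimal sketch of the same three steps with the standard library follows. The table and column names (`kbs`, `documents`, `kb_name`) and the function signature are assumptions for illustration, and `KBManagerError` is a local stand-in for the project's exception. SQLite only enforces foreign keys, including `ON DELETE CASCADE`, when `PRAGMA foreign_keys = ON` has been set on the connection.

```python
import shutil
import sqlite3
import tempfile
from pathlib import Path


class KBManagerError(Exception):
    """Local stand-in for the project's exception of the same name."""


def delete_kb(conn: sqlite3.Connection, kb_root: Path, kb_name: str) -> None:
    # 1. Verify the KB exists.
    if conn.execute("SELECT 1 FROM kbs WHERE name = ?", (kb_name,)).fetchone() is None:
        raise KBManagerError(f"KB not found: {kb_name}")
    # 2. Delete the KB row; ON DELETE CASCADE removes dependent rows.
    with conn:  # commits on success, rolls back on error
        conn.execute("DELETE FROM kbs WHERE name = ?", (kb_name,))
    # 3. Remove the KB directory (vector store, BM25 index, uploaded files).
    shutil.rmtree(kb_root / kb_name, ignore_errors=True)


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.executescript("""
    CREATE TABLE kbs (name TEXT PRIMARY KEY);
    CREATE TABLE documents (
        id INTEGER PRIMARY KEY,
        kb_name TEXT NOT NULL REFERENCES kbs(name) ON DELETE CASCADE
    );
    INSERT INTO kbs (name) VALUES ('demo');
    INSERT INTO documents (kb_name) VALUES ('demo');
""")

kb_root = Path(tempfile.mkdtemp())
(kb_root / "demo").mkdir()

delete_kb(conn, kb_root, "demo")
remaining_docs = conn.execute("SELECT COUNT(*) FROM documents").fetchone()[0]
print(remaining_docs, (kb_root / "demo").exists())
```

Deleting the database rows before the directory means a failed SQL delete leaves the files untouched.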
TTS: Web Speech API
File: voicevault/tts/web_speech.py
The TTS engine runs entirely in the browser via the SpeechSynthesis API, so there is zero API cost and zero server load. Python's role is text preparation only.
prepare_for_tts()
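```python
import re

# _CITATION_MARKER_RE: module-level compiled pattern for [Source: ...] markers (definition not shown).
def prepare_for_tts(answer: str, is_refusal: bool = False) -> str:
    if is_refusal or not answer:
        return ""  # refusals and empty answers are not spoken
    text = _CITATION_MARKER_RE.sub("", answer)   # strip [Source: ...]
    text = re.sub(r"\s{2,}", " ", text).strip()  # collapse leftover whitespace
    return text
```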
This removes [Source: filename, p.N] markers before the text is passed to the browser, since reading "Source: paper dot pdf, p dot 3" aloud is poor UX. The JS bridge (ui/components/audio_controls.py) takes the cleaned text and calls window._vv_tts.speak(text, rate, pitch).
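For a concrete picture of the cleanup, here is a runnable sketch. The regular expression is hypothetical, since the real `_CITATION_MARKER_RE` is not shown in this document; this one also consumes whitespace before the marker so that no stray space is left in front of punctuation.

```python
import re

# Hypothetical pattern; the project's actual _CITATION_MARKER_RE may differ.
_CITATION_MARKER_RE = re.compile(r"\s*\[Source:[^\]]*\]")


def prepare_for_tts(answer: str, is_refusal: bool = False) -> str:
    if is_refusal or not answer:
        return ""
    text = _CITATION_MARKER_RE.sub("", answer)        # drop citation markers
    return re.sub(r"\s{2,}", " ", text).strip()       # collapse leftover whitespace


spoken = prepare_for_tts(
    "Attention scales quadratically [Source: paper.pdf, p.3]. It is widely used."
)
print(spoken)
```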
Ask Tab (Full Pipeline)
File: ui/tabs/ask_tab.py
End-to-End Query Flow
```text
1. User records audio → stop_recording event fires
   → WhisperTranscriber.transcribe(audio_path) → transcript text
2. User selects KB(s) → clicks Ask
3. _query_fn():
   a. QueryPreprocessor.process(query) → pq (cleaned, typed)
   b. HybridRetriever(kb_names=selected).search(pq.processed_query) → results
   c. ContextBuilder().build(results) → (context_str, citation_map)
   d. AnswerChain.generate(query, context, citation_map, history, query_type) → generation
   e. db.log_query(...) → SHA-256 only, no raw text stored
   f. format_citations_markdown(generation.citations) → citation panel
   g. prepare_for_tts(generation.answer, generation.is_refusal) → TTS text
   h. Update chatbot + citations + history state + TTS state
```
State Management
- gr.State([]): conversation history as list[tuple[str, str]]
- gr.State(""): last answer text (for TTS playback)
Conversation history is passed to AnswerChain._build_messages() as proper HumanMessage/AIMessage pairs, the standard LangChain pattern for multi-turn conversation.
Error Handling
Every failure path (no query, no KB selected, pipeline error) produces a user-visible error message in the chatbot rather than crashing. A query-logger failure is non-critical: it is caught and logged as a warning, and never propagates.
Factory Functions
Event handlers are returned as closures from factory functions:
```python
def _make_transcribe_fn(transcriber):
    def _transcribe(audio_path): ...
    return _transcribe

def _make_query_fn(answer_chain, db_path):
    def _query(query, kb_names, history, chatbot): ...
    return _query
```
This enables dependency injection without globals: the transcriber and answer_chain objects are passed in from app.py and captured in the closure.
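One practical payoff of the closure pattern is that a handler can be exercised with a test double instead of a real model. The sketch below is illustrative only: `FakeTranscriber` is a hypothetical stub, and the empty-path guard inside `_transcribe` is an assumption about the handler body, which this document elides.

```python
from typing import Callable, Optional


class FakeTranscriber:
    """Hypothetical test double standing in for WhisperTranscriber."""

    def __init__(self) -> None:
        self.calls = []

    def transcribe(self, audio_path: str) -> str:
        self.calls.append(audio_path)
        return "what is retrieval augmented generation"


def make_transcribe_fn(transcriber) -> Callable[[Optional[str]], str]:
    def _transcribe(audio_path: Optional[str]) -> str:
        if not audio_path:  # assumed guard: nothing was recorded
            return ""
        return transcriber.transcribe(audio_path)  # captured from the enclosing scope

    return _transcribe


fake = FakeTranscriber()
handler = make_transcribe_fn(fake)
print(handler("clip.wav"))
print(fake.calls)
```

The handler keeps the one-argument shape an event callback needs, while the dependency stays swappable.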
KB Tab (Management UI)
File: ui/tabs/kb_tab.py
Three operations wired to Gradio event handlers:
| Button | Handler | Output |
|---|---|---|
| Create KB | _create_kb() | Status message, refreshed dropdowns |
| Index Documents | _upload_docs() | Ingestion report per file |
| Delete KB | _delete_kb() | Status message, refreshed table + dropdowns |
After each create/delete, all dropdowns and the KB dataframe are updated via gr.update(choices=...), so no page refresh is needed.
Analytics Tab
File: ui/tabs/analytics_tab.py
Pulls data from sqlite_store.get_query_stats() on refresh button click:
| Metric | Source |
|---|---|
| Total queries (7d) | COUNT(*) from query_log |
| Avg end-to-end latency | AVG(total_latency_ms) |
| Avg citations per answer | AVG(citation_count) |
| Queries by day | GROUP BY DATE(timestamp) |
| KB inventory | KBManager.list_kbs() |
Stats are not loaded on page load; the user clicks Refresh to pull fresh data. This avoids unnecessary DB queries at startup.
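The aggregates in the table map onto two SQL queries. The sketch below is an assumption about how such a stats function could look, not the actual `sqlite_store.get_query_stats()`: the signature, the returned keys, and the `timestamp` storage format (UTC text as produced by SQLite's `datetime()`) are all hypothetical; the column names come from the table above.

```python
import sqlite3


def get_query_stats(conn: sqlite3.Connection, days: int = 7) -> dict:
    window = f"-{int(days)} days"  # SQLite date modifier, e.g. "-7 days"
    total, avg_latency, avg_citations = conn.execute(
        """
        SELECT COUNT(*), AVG(total_latency_ms), AVG(citation_count)
        FROM query_log
        WHERE timestamp >= datetime('now', ?)
        """,
        (window,),
    ).fetchone()
    by_day = conn.execute(
        """
        SELECT DATE(timestamp) AS day, COUNT(*) AS n
        FROM query_log
        WHERE timestamp >= datetime('now', ?)
        GROUP BY day
        ORDER BY day
        """,
        (window,),
    ).fetchall()
    return {
        "total_queries": total,
        "avg_latency_ms": avg_latency,
        "avg_citations": avg_citations,
        "queries_by_day": by_day,
    }


# Demo data: two queries inside the 7-day window, one outside it.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE query_log (timestamp TEXT, total_latency_ms REAL, citation_count INTEGER)"
)
conn.executemany(
    "INSERT INTO query_log VALUES (datetime('now', ?), ?, ?)",
    [("+0 days", 100.0, 1), ("-1 days", 300.0, 3), ("-30 days", 900.0, 9)],
)
stats = get_query_stats(conn)
print(stats)
```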
app.py: Startup Orchestration
File: app.py
```text
_startup() → (kb_manager, transcriber, answer_chain):
  1. cfg.ensure_directories()
  2. KBManager(db_path=data_dir/voicevault.db) → initializes SQLite schema
  3. WhisperTranscriber() → lazy: no model loaded at startup
  4. AnswerChain() → lazy: LLM clients created per call
```
All three singletons are created once and passed to the UI tab builders, so model-loading overhead is not repeated on every query.
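The "lazy" behavior noted for the transcriber can be pictured as a wrapper that defers an expensive load until first use and then reuses the result. This is a generic sketch of the technique, not the implementation of WhisperTranscriber; `LazyResource` and `_fake_load_model` are hypothetical names.

```python
import threading
from typing import Any, Callable


class LazyResource:
    """Defers an expensive load until the first get(), then caches the result."""

    def __init__(self, loader: Callable[[], Any]) -> None:
        self._loader = loader
        self._value = None
        self._loaded = False
        self._lock = threading.Lock()

    def get(self) -> Any:
        if not self._loaded:
            with self._lock:          # guard against concurrent first calls
                if not self._loaded:  # re-check after acquiring the lock
                    self._value = self._loader()
                    self._loaded = True
        return self._value


load_calls = []


def _fake_load_model() -> str:
    """Hypothetical stand-in for loading model weights."""
    load_calls.append(1)
    return "model"


lazy_model = LazyResource(_fake_load_model)  # constructing it costs nothing
calls_before = len(load_calls)
first = lazy_model.get()                     # the load happens here
second = lazy_model.get()                    # cached; the loader is not called again
print(calls_before, len(load_calls))
```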
Security Decisions
Password Storage
bcrypt with work factor 12 makes offline brute-force attacks expensive even if the SQLite file is exfiltrated. This is in line with common practice: OWASP recommends a bcrypt work factor of at least 10.
KB Name as Path Component
The KB name regex (^[a-z0-9][a-z0-9\-]{0,62}[a-z0-9]$, plus the single-character alternative) prevents path traversal. All KB filesystem operations use cfg.kb_dir(kb_name), which returns data_dir / kb_name, so a validated slug cannot escape the data directory.
Query Audit Log β PII Protection
The raw query text is never stored in SQLite. Only the SHA-256 hash of the query is stored (voice_query_hash). This supports the GDPR data-minimization principle: analytics work on aggregates, not raw user queries.
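A minimal sketch of what a log row could look like under this policy follows. `hash_query` and `build_log_row` are hypothetical helpers; only the field names `voice_query_hash`, `total_latency_ms`, and `citation_count` appear in this document.

```python
import hashlib


def hash_query(query: str) -> str:
    """Return the hex SHA-256 digest of the query text."""
    return hashlib.sha256(query.encode("utf-8")).hexdigest()


def build_log_row(query: str, total_latency_ms: float, citation_count: int) -> dict:
    """Hypothetical audit-log row: carries the digest, never the raw text."""
    return {
        "voice_query_hash": hash_query(query),
        "total_latency_ms": total_latency_ms,
        "citation_count": citation_count,
    }


question = "What does the contract say about termination?"
row = build_log_row(question, 842.0, 2)
print(row["voice_query_hash"][:12], len(row["voice_query_hash"]))
```

Identical queries produce identical digests, so repeat questions can still be counted without storing their text.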
No Globals in Event Handlers
All pipeline objects (transcriber, answer_chain, kb_manager) are passed via closures, not module-level globals. This makes the code testable (dependency injection) and prevents accidental shared state mutation.
Test Coverage
File: tests/test_phase5.py | 55/55 passed
| Class | Tests | What's verified |
|---|---|---|
| TestKBManagerCreate | 16 | Create, list, get, duplicate detection, 5 slug validation cases |
| TestKBManagerDelete | 3 | Delete removes from list, nonexistent raises, count decreases |
| TestKBManagerPassword | 7 | Public access, protected access, wrong pass, no pass, unknown KB, bcrypt format |
| TestKBManagerStats | 3 | Returns dict, has required keys, zeros on empty DB |
| TestPreparForTTS | 7 | Citation stripping, refusal → empty, normal text unchanged, no double spaces |
| TestCitationPanel | 8 | Filename, page, section, excerpt, multiple, numbered, empty, type |
| TestUIHelpers | 7 | KB choices, KB table format, protected lock icon, append_chat, no mutation |
| TestAppStartup | 4 | build_app returns Blocks, all three tab builders run without error |
Fixture Design
The manager fixture creates a fresh KBManager backed by a temp SQLite path for each test, giving complete isolation with no shared state between tests.
Full Project Test Summary
| Phase | Tests | Status |
|---|---|---|
| Phase 0: Foundation | 58 passed | ✅ |
| Phase 1: Ingestion | 46 passed | ✅ |
| Phase 2: Retrieval | 33 passed, 0 errors | ✅ |
| Phase 3: ASR | 45 passed, 2 skipped (soundfile) | ✅ |
| Phase 4: Generation | 72 passed | ✅ |
| Phase 5: UI & Access | 55 passed | ✅ |
| Total | 309 passed, 2 skipped | ✅ |
Note on conftest.py CPU fix: CUDA_VISIBLE_DEVICES="-1" is set in tests/conftest.py to force CPU for all tests. This prevents CUDA compatibility errors on RTX 5070 (sm_120 not supported by packaged PyTorch ≤ 2.x). Production deployment on HuggingFace Spaces uses NVIDIA T4 (sm_75), which is fully compatible.