Luigi committed
Commit 9d88146 · 1 Parent(s): cd4ade3

docs: update AGENTS.md guidelines and add comprehensive UI/UX implementation plan

- AGENTS.md: Refine code style guidelines, add missing dependencies (numpy, gradio_huggingfacehub_search), update project structure to include meeting_summarizer/ module, add inference settings by model role and environment variables
- UI_UX_IMPLEMENTATION_PLAN.md: Add detailed 3-phase implementation plan for UI/UX improvements including quick wins (tooltips, toast notifications, debug toggle, word count), medium effort changes (mode simplification, progress indicators, configuration presets, custom model auto-loading), and larger changes (advanced mode redesign, collapsible sections, input validation, mobile-first improvements)

Files changed (2):
  1. AGENTS.md +36 -38
  2. UI_UX_IMPLEMENTATION_PLAN.md +1000 -0
AGENTS.md CHANGED
@@ -33,14 +33,15 @@ mypy app.py
 
 **Running tests (root project tests):**
 ```bash
-# Run E2E test
+# Run all root tests
 python test_e2e.py
-
-# Run advanced mode test
 python test_advanced_mode.py
-
-# Run LFM2 extraction test
 python test_lfm2_extract.py
+
+# Run single test with pytest
+pytest test_e2e.py -v                        # Run all tests in file
+pytest test_e2e.py::test_e2e -v              # Run specific function
+pytest test_advanced_mode.py -k "test_name"  # Run by name pattern
 ```
 
 **llama-cpp-python submodule tests:**
@@ -54,57 +55,46 @@ cd llama-cpp-python && pytest tests/test_llama.py::test_function_name -v
 ## Code Style Guidelines
 
 **Formatting:**
-- Use 4 spaces for indentation
-- Line length: 100 characters max
-- Use double quotes for docstrings
-- Two blank lines before function definitions
-- One blank line after docstrings
+- 4 spaces indentation, 100 char max line length, double quotes for docstrings
+- Two blank lines before functions, one after docstrings
 
 **Imports (ordered):**
 ```python
 # Standard library
 import os
-import argparse
-import re
 from typing import Tuple, Optional, Generator
 
 # Third-party packages
 from llama_cpp import Llama
-from huggingface_hub import hf_hub_download
-from opencc import OpenCC
 import gradio as gr
+
+# Local modules
+from meeting_summarizer.trace import Tracer
 ```
 
 **Type Hints:**
-- Use type hints for parameters and return values
-- Use `Optional[]` for nullable types
-- Use `Generator[str, None, None]` for generators
-- Example: `def load_model(repo_id: str, filename: str, cpu_only: bool = False) -> Llama:`
+- Use type hints for params/returns
+- `Optional[]` for nullable types, `Generator[str, None, None]` for generators
+- Example: `def load_model(repo_id: str, filename: str) -> Llama:`
 
 **Naming Conventions:**
-- `snake_case` for functions and variables
-- `CamelCase` for classes
-- `UPPER_CASE` for constants
+- `snake_case` for functions/variables, `CamelCase` for classes, `UPPER_CASE` for constants
 - Descriptive names: `stream_summarize_transcript`, not `summ`
 
-**Docstrings:**
-- Use triple quotes for all public functions
-- Keep first line as brief summary
-- Include Args/Returns sections for complex functions
-
 **Error Handling:**
-- Use explicit error messages with f-strings
-- Check file existence before operations
-- Use `try/except` blocks for external API calls (Hugging Face, model loading)
+- Use explicit error messages with f-strings, check file existence before operations
+- Use `try/except` for external API calls (Hugging Face, model loading)
 - Log errors with context for debugging
 
 ## Dependencies
 
 **Required:**
-- `llama-cpp-python>=0.3.0` - Core inference engine
+- `llama-cpp-python>=0.3.0` - Core inference engine (installed from llama-cpp-python submodule)
 - `gradio>=5.0.0` - Web UI framework
+- `gradio_huggingfacehub_search>=0.0.12` - HuggingFace model search component
 - `huggingface-hub>=0.23.0` - Model downloading
 - `opencc-python-reimplemented>=0.1.7` - Chinese text conversion
+- `numpy>=1.24.0` - Numerical operations for embeddings
 
 **Development (optional):**
 - `pytest>=7.4.0` - Testing framework
@@ -122,6 +112,10 @@ tiny-scribe/
 ├── test_e2e.py              # E2E test
 ├── test_advanced_mode.py    # Advanced mode test
 ├── test_lfm2_extract.py     # LFM2 extraction test
+├── meeting_summarizer/      # Core summarization module
+│   ├── __init__.py
+│   ├── trace.py             # Tracing/logging utilities
+│   └── extraction.py        # Extraction and deduplication logic
 ├── llama-cpp-python/        # Git submodule
 └── README.md                # Project documentation
 ```
@@ -139,16 +133,20 @@ llm = Llama.from_pretrained(
 )
 ```
 
-**Streaming Chat Completion:**
-```python
-stream = llm.create_chat_completion(
-    messages=[{"role": "user", "content": prompt}],
-    stream=True,
-    max_tokens=1024,
-    temperature=0.6,
-)
+**Inference Settings:**
+- Extraction models: Low temp (0.1-0.3) for deterministic JSON
+- Synthesis models: Higher temp (0.7-0.9) for creative summaries
+- Reasoning types: Non-reasoning (hide checkbox), Hybrid (toggleable), Thinking-only (always on)
+
+**Environment & GPU:**
+```bash
+DEFAULT_N_THREADS=2          # CPU threads (1-32)
+N_GPU_LAYERS=0               # 0=CPU, -1=all GPU
+HF_HUB_DOWNLOAD_TIMEOUT=300  # Download timeout (seconds)
 ```
 
+GPU offload detection: `from llama_cpp import llama_supports_gpu_offload`
+
 ## Notes for AI Agents
 
 - Always call `llm.reset()` after completion to ensure state isolation
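
As a sketch of how the role-based settings and environment variables documented above might be consumed in code (the helper name, dict layout, and exact temperature values are illustrative assumptions, not part of the diff):

```python
import os

# Illustrative mapping of the per-role sampling guidance above;
# 0.2 and 0.8 are picked from the documented 0.1-0.3 / 0.7-0.9 ranges.
ROLE_SETTINGS = {
    "extraction": {"temperature": 0.2, "max_tokens": 1024},  # deterministic JSON
    "synthesis": {"temperature": 0.8, "max_tokens": 1024},   # creative summaries
}

def sampling_kwargs(role: str) -> dict:
    """Return sampling settings for a pipeline role, defaulting to extraction."""
    return dict(ROLE_SETTINGS.get(role, ROLE_SETTINGS["extraction"]))

# Env-driven hardware settings, mirroring the variables documented above.
N_THREADS = int(os.environ.get("DEFAULT_N_THREADS", "2"))
N_GPU_LAYERS = int(os.environ.get("N_GPU_LAYERS", "0"))  # 0=CPU, -1=all GPU
```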
UI_UX_IMPLEMENTATION_PLAN.md ADDED
@@ -0,0 +1,1000 @@
# UI/UX Implementation Plan for Tiny Scribe

## Status
- ✅ Docker container built and running (http://localhost:7860)
- ✅ All dependencies verified (Python 3.10.19, Gradio 5.50.0)
- ✅ Test transcripts available (micro.txt: 20 words, min.txt: 5 words, short.txt: 52 words)

---

## Phase 1: Quick Wins (Low Risk, High Value)
*Estimated Time: 2-3 hours*

### 1.1 Add Tooltips to Technical Parameters
**Location:** `app.py` lines 2620-2640 (inference parameters)

**Implementation:**
```python
# Add info parameter to sliders with clearer explanations
temperature_slider = gr.Slider(
    minimum=0.0,
    maximum=2.0,
    value=0.6,
    step=0.1,
    label="Temperature",
    info="Lower = more focused/consistent, Higher = more creative/diverse",
    show_label=True,
    interactive=True,
    # Add tooltip via Gradio's elem_id + custom CSS
    elem_id="temperature-slider"
)
```

**Benefits:**
- Reduces cognitive load for non-technical users
- Helps users understand trade-offs

**Testing:**
1. Start container with Standard Mode selected
2. Hover over temperature slider - should show detailed explanation
3. Verify tooltips work on mobile (tap to show)
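
Note that Gradio's `info=` renders static helper text below the label rather than a hover tooltip; the `elem_id` hook mentioned in the comment could be paired with CSS along these lines (selector, wording, and the `gr.Blocks(css=...)` wiring are assumptions of this sketch, not existing app code):

```python
# Hypothetical CSS to be passed via gr.Blocks(css=TOOLTIP_CSS); it turns the
# elem_id set on the slider above into a pure-CSS hover tooltip.
TOOLTIP_CSS = """
#temperature-slider { position: relative; }
#temperature-slider:hover::after {
    content: "Controls sampling randomness (0.0 = deterministic, 2.0 = very random)";
    position: absolute;
    bottom: 100%;
    left: 0;
    background: #1f2937;
    color: #fff;
    padding: 4px 8px;
    border-radius: 4px;
    font-size: 0.8rem;
    white-space: nowrap;
    z-index: 10000;
}
"""
```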

---

### 1.2 Improve Copy/Download Feedback
**Location:** `app.py` lines 2986-2998 (copy buttons)

**Implementation:**
```python
# Add toast notification via JavaScript
copy_summary_btn.click(
    fn=lambda x: x,
    inputs=[summary_output],
    outputs=[],
    js="""
    (text) => {
        navigator.clipboard.writeText(text);
        // Show toast notification
        const toast = document.createElement('div');
        toast.style.cssText = `
            position: fixed;
            bottom: 20px;
            right: 20px;
            background: #10b981;
            color: white;
            padding: 12px 24px;
            border-radius: 8px;
            box-shadow: 0 4px 12px rgba(0,0,0,0.15);
            z-index: 10000;
            animation: slideIn 0.3s ease-out;
        `;
        toast.textContent = '✓ Copied to clipboard!';
        document.body.appendChild(toast);
        setTimeout(() => toast.remove(), 2000);
        return text;
    }
    """
)
```

**Add to CSS:**
```css
@keyframes slideIn {
    from { transform: translateY(100%); opacity: 0; }
    to { transform: translateY(0); opacity: 1; }
}
```

**Benefits:**
- Provides clear user feedback
- Professional feel
- Reduces uncertainty about whether the action worked

**Testing:**
1. Click "Copy Summary" button
2. Verify green toast appears: "✓ Copied to clipboard!"
3. Toast disappears after 2 seconds
4. Verify clipboard content matches summary

---

### 1.3 Hide Debug Panels Behind Toggle
**Location:** `app.py` line 2714 (system_prompt_debug)

**Implementation:**
```python
# Add developer mode toggle at bottom of left column
with gr.Group():
    show_debug = gr.Checkbox(
        value=False,
        label="Show Developer Debug Info",
        info="Enable to see internal prompts (for debugging only)"
    )

# Make debug panel conditional
system_prompt_debug = gr.Textbox(
    label="System Prompt (Debug)",
    value="",
    visible=False,
    interactive=False,
    elem_classes=["debug-panel"]
)

# Toggle visibility
show_debug.change(
    fn=lambda x: gr.update(visible=x),
    inputs=[show_debug],
    outputs=[system_prompt_debug]
)
```

**Benefits:**
- Reduces visual clutter
- Hides technical implementation details
- Still available for power users

**Testing:**
1. Verify debug panel is hidden by default
2. Check "Show Developer Debug Info" checkbox
3. Verify system prompt text appears
4. Uncheck - should hide again

---

### 1.4 Add Character/Word Count to Text Input
**Location:** `app.py` lines 2506-2512 (text_input)

**Implementation:**
```python
# Add word count display below textbox
with gr.Group():
    text_input = gr.Textbox(
        label="Paste Transcript",
        placeholder="Paste your transcript content here...",
        lines=10,
        max_lines=20
    )
    text_word_count = gr.Textbox(
        label="Character/Word Count",
        value="0 characters / 0 words",
        interactive=False,
        scale=0,
        elem_classes=["word-count"]
    )

# Update count function
def update_word_count(text):
    chars = len(text)
    words = len(text.split()) if text else 0
    return f"{chars:,} characters / {words:,} words"

# Wire up event
text_input.change(
    fn=update_word_count,
    inputs=[text_input],
    outputs=[text_word_count]
)
```

**Benefits:**
- Users know if transcript fits model context
- Helps plan which model to use
- Pre-validation before submission

**Testing:**
1. Paste text into input
2. Verify count updates in real-time
3. Check character/word calculation accuracy
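
To tie the count directly to the "fits model context" benefit, the helper could also surface a rough token estimate, reusing the ~4-chars-per-token heuristic that section 3.3 applies for validation (this variant is a sketch, not existing app code):

```python
def update_word_count(text: str, chars_per_token: int = 4) -> str:
    """Character/word display plus a rough token estimate (~4 chars/token)."""
    chars = len(text)
    words = len(text.split()) if text else 0
    est_tokens = chars // chars_per_token
    return f"{chars:,} characters / {words:,} words (~{est_tokens:,} tokens)"
```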

---

## Phase 2: Medium Effort (High Impact)
*Estimated Time: 4-6 hours*

### 2.1 Simplify Mode Selection
**Location:** `app.py` line 2544 (mode_radio)

**Implementation:**
```python
mode_radio = gr.Radio(
    choices=[
        ("Quick Summarize (Fast, Single-Pass)", "Standard Mode"),
        ("Deep Analysis Pipeline (Multi-Stage, Higher Quality)", "Advanced Mode (3-Model Pipeline)")
    ],
    value="Standard Mode",
    label="🎯 Summarization Mode",
    info="Choose processing approach based on your needs"
)

# Add explanation cards
mode_explanation = gr.HTML("""
<div class="mode-explanation">
    <div class="mode-card">
        <h3>⚡ Quick Summarize</h3>
        <p><strong>Best for:</strong> Short texts, quick summaries, fast results</p>
        <ul>
            <li>Single AI model processes entire text</li>
            <li>Typical time: 10-30 seconds</li>
            <li>Good for: Meeting notes, article summaries</li>
        </ul>
    </div>
    <div class="mode-card">
        <h3>🔬 Deep Analysis Pipeline</h3>
        <p><strong>Best for:</strong> Long transcripts, comprehensive reports, high-quality output</p>
        <ul>
            <li>3 specialized AI models work together</li>
            <li>Deduplicates similar information</li>
            <li>Typical time: 30-90 seconds</li>
            <li>Good for: Conference transcripts, research documents</li>
        </ul>
    </div>
</div>
""")
```

**Add CSS:**
```css
.mode-explanation {
    display: flex;
    gap: 1rem;
    margin: 1rem 0;
}

.mode-card {
    flex: 1;
    padding: 1rem;
    border: 2px solid var(--border-color);
    border-radius: var(--radius-md);
    background: var(--card-bg);
}

.mode-card h3 {
    margin-top: 0;
    color: var(--primary-color);
}

.mode-card ul {
    margin: 0.5rem 0 0 1rem;
    font-size: 0.9rem;
}
```

**Benefits:**
- Clear guidance on which mode to use
- Reduces decision paralysis
- Educates users about trade-offs

**Testing:**
1. Select each mode - verify explanation cards appear
2. Check layout on mobile (should stack vertically)
3. Verify text is readable at different screen sizes

---

### 2.2 Add Progress Bar + Stage Indicators
**Location:** `app.py` lines 2746-2814 (router function)

**Implementation:**
```python
# Add progress components
progress_bar = gr.Progress()
stage_indicator = gr.HTML("""
<div class="stage-indicators">
    <div class="stage" id="stage-input">
        <span class="stage-icon">📥</span>
        <span class="stage-label">Input</span>
    </div>
    <div class="stage" id="stage-thinking">
        <span class="stage-icon">🧠</span>
        <span class="stage-label">Thinking</span>
    </div>
    <div class="stage" id="stage-summary">
        <span class="stage-icon">📝</span>
        <span class="stage-label">Summary</span>
    </div>
</div>
""")

# Update router to show progress
def route_summarize_with_progress(*args):
    mode = args[-1]  # mode_radio is last arg

    if mode == "Standard Mode":
        # Update stage indicator
        yield gr.update(value='<div class="stage active">📥 Input</div>')
        # ... process input ...

        yield gr.update(value='<div class="stage active">🧠 Thinking</div>')
        # ... generate thinking ...

        yield gr.update(value='<div class="stage active">📝 Summary</div>')
        # ... generate summary ...
```

**Add CSS:**
```css
.stage-indicators {
    display: flex;
    justify-content: space-between;
    margin: 1rem 0;
    padding: 0.5rem;
    background: var(--card-bg);
    border-radius: var(--radius-md);
}

.stage {
    display: flex;
    align-items: center;
    gap: 0.5rem;
    padding: 0.5rem 1rem;
    border-radius: var(--radius-sm);
    opacity: 0.5;
    transition: all 0.3s;
}

.stage.active {
    opacity: 1;
    background: linear-gradient(135deg, var(--primary-color) 0%, var(--accent-color) 100%);
    color: white;
    transform: scale(1.05);
}

.stage-icon {
    font-size: 1.2rem;
}

.stage-label {
    font-weight: 600;
}
```

**Benefits:**
- Visual feedback during long operations
- Users know exactly what's happening
- Reduces perceived wait time

**Testing:**
1. Submit Standard Mode task
2. Verify stage indicators light up in sequence: Input → Thinking → Summary
3. Test Advanced Mode: should show Extraction → Deduplication → Synthesis
4. Check active stage has highlight effect

---

### 2.3 Implement Configuration Presets
**Location:** `app.py` after line 2630 (inference parameters)

**Implementation:**
```python
# Add preset buttons
with gr.Row():
    quick_preset_btn = gr.Button("⚡ Quick (Fast)", size="sm", variant="secondary")
    quality_preset_btn = gr.Button("⭐ Quality (Balanced)", size="sm", variant="secondary")
    creative_preset_btn = gr.Button("🎨 Creative (Diverse)", size="sm", variant="secondary")

# Preset configurations
PRESETS = {
    "quick": {"temperature": 0.3, "top_p": 0.8, "top_k": 20},
    "quality": {"temperature": 0.6, "top_p": 0.9, "top_k": 40},
    "creative": {"temperature": 1.0, "top_p": 0.95, "top_k": 50}
}

# Apply preset function
def apply_preset(preset_name):
    config = PRESETS[preset_name]
    return (
        gr.update(value=config["temperature"]),
        gr.update(value=config["top_p"]),
        gr.update(value=config["top_k"])
    )

# Wire up buttons
quick_preset_btn.click(
    fn=lambda: apply_preset("quick"),
    outputs=[temperature_slider, top_p, top_k]
)

quality_preset_btn.click(
    fn=lambda: apply_preset("quality"),
    outputs=[temperature_slider, top_p, top_k]
)

creative_preset_btn.click(
    fn=lambda: apply_preset("creative"),
    outputs=[temperature_slider, top_p, top_k]
)
```

**Benefits:**
- One-click optimization for different use cases
- Reduces need to understand each parameter
- Provides good starting points for customization

**Testing:**
1. Click "Quick" - verify temp=0.3, top_p=0.8, top_k=20
2. Click "Quality" - verify temp=0.6, top_p=0.9, top_k=40
3. Click "Creative" - verify temp=1.0, top_p=0.95, top_k=50
4. Test that manual adjustments still work after applying preset

---

### 2.4 Improve Custom Model Loading UX
**Location:** `app.py` lines 2590-2619 (custom model section)

**Implementation:**
```python
# Simplify to auto-load workflow
model_search_input = HuggingfaceHubSearch(
    label="🔍 Search & Load Model",
    placeholder="Type model name (e.g., 'qwen', 'phi', 'llama')",
    search_type="model",
    info="Selecting a model will automatically load it"
)

# Auto-load on selection
def auto_load_model(repo_id):
    """Automatically load first available GGUF file."""
    # Note: this is a generator, so all UI updates must be yielded;
    # a plain `return value` would be silently discarded by Gradio.
    if not repo_id or "/" not in repo_id:
        yield gr.update(), gr.update(value="")
        return

    # Show loading state with progress
    yield (
        gr.update(value="🔄 Loading model..."),
        gr.update(value="", visible=True)
    )

    # Discover files
    files, error = list_repo_gguf_files(repo_id)

    if error:
        yield (
            gr.update(value=f"❌ {error}"),
            gr.update(value="", visible=False)
        )
        return

    if not files:
        yield (
            gr.update(value="❌ No GGUF files found"),
            gr.update(value="", visible=False)
        )
        return

    # Auto-select best quantization (prioritize Q4_K_M, Q4_0, Q8_0)
    preferred_quants = ["Q4_K_M", "Q4_0", "Q8_0"]
    selected_file = None

    for quant in preferred_quants:
        for f in files:
            if quant.lower() in f["name"].lower():
                selected_file = f
                break
        if selected_file:
            break

    if not selected_file:
        selected_file = files[0]  # Fallback to first file

    # Load model
    try:
        model, msg = load_custom_model_from_hf(
            repo_id,
            selected_file["name"],
            n_threads=2
        )
        # If the loaded model and its metadata (repo_id, filename, size_mb)
        # must be kept, add gr.State components to the outputs below and
        # yield them here as well.
        yield (
            gr.update(value=f"✅ {msg}"),
            gr.update(value="", visible=False)
        )
    except Exception as e:
        yield (
            gr.update(value=f"❌ Failed to load: {str(e)}"),
            gr.update(value="", visible=False)
        )

# Wire up auto-load
model_search_input.change(
    fn=auto_load_model,
    inputs=[model_search_input],
    outputs=[custom_status, custom_file_dropdown],
    show_progress="minimal"
)
```

**Benefits:**
- Reduces from 3 steps to 1 step
- Auto-selects optimal quantization
- Better error messaging
- Visual loading states

**Testing:**
1. Search for "Qwen3-0.6B-GGUF"
2. Verify it auto-loads the best quantization (Q4_K_M or Q4_0)
3. Check status messages: "🔄 Loading..." → "✅ Loaded: ..."
4. Test error case: search for an invalid repo
5. Verify clear error message appears
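
The nested quantization-preference loop above can be factored into a small, separately testable helper (the function name is illustrative, not existing app code); falling back to `files[0]` preserves the plan's behavior when no preferred quantization is present:

```python
def pick_preferred_quant(files, preferred=("Q4_K_M", "Q4_0", "Q8_0")):
    """Return the first file whose name matches a preferred quantization,
    falling back to the first file when none match (None if no files)."""
    for quant in preferred:
        for f in files:
            if quant.lower() in f["name"].lower():
                return f
    return files[0] if files else None
```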

---

## Phase 3: Larger Changes (High Value)
*Estimated Time: 8-12 hours*

### 3.1 Redesign Advanced Mode (Reduce Cognitive Load)

**Approach:** Collapse 3 stages into accordion/tabs, add "Quick Start" preset

**Implementation:**
```python
# Add Quick Start preset at top
advanced_quick_start = gr.Dropdown(
    choices=[
        ("🔬 Deep Analysis (Best for long transcripts)", "deep"),
        ("⚡ Fast Extraction (Best for quick insights)", "fast"),
        ("🎯 Balanced (Good default)", "balanced")
    ],
    value="balanced",
    label="Quick Start Preset",
    info="Pre-configured settings - customize below if needed"
)

# Wrap stages in Accordions
with gr.Accordion("🔍 Stage 1: Extraction", open=True):
    extraction_model = gr.Dropdown(...)
    extraction_n_ctx = gr.Slider(...)
    enable_extraction_reasoning = gr.Checkbox(...)

with gr.Accordion("🧬 Stage 2: Deduplication", open=True):
    embedding_model = gr.Dropdown(...)
    similarity_threshold = gr.Slider(...)

with gr.Accordion("✨ Stage 3: Synthesis", open=True):
    synthesis_model = gr.Dropdown(...)
    enable_synthesis_reasoning = gr.Checkbox(...)

# Preset configurations
ADVANCED_PRESETS = {
    "deep": {
        "extraction": "qwen2.5_1.5b",
        "embedding": "granite-107m",
        "synthesis": "ernie_21b_thinking_q1",
        "n_ctx": 8192,
        "similarity": 0.85
    },
    "fast": {
        "extraction": "qwen2.5_1.5b",
        "embedding": "granite-107m",
        "synthesis": "granite_3_1_1b_q8",
        "n_ctx": 4096,
        "similarity": 0.80
    },
    "balanced": {
        "extraction": "qwen2.5_1.5b",
        "embedding": "granite-107m",
        "synthesis": "qwen3_1.7b_q4",
        "n_ctx": 4096,
        "similarity": 0.85
    }
}

def apply_advanced_preset(preset_name):
    config = ADVANCED_PRESETS[preset_name]
    return (
        gr.update(value=config["extraction"]),
        gr.update(value=config["embedding"]),
        gr.update(value=config["synthesis"]),
        gr.update(value=config["n_ctx"]),
        gr.update(value=config["similarity"])
    )

advanced_quick_start.change(
    fn=apply_advanced_preset,
    inputs=[advanced_quick_start],
    outputs=[extraction_model, embedding_model, synthesis_model,
             extraction_n_ctx, similarity_threshold]
)
```

**Benefits:**
- New users can start with one click
- Stages collapsible when configured
- Reduces initial overwhelm
- Advanced users can still customize

**Testing:**
1. Select each preset - verify all settings update correctly
2. Collapse/expand accordions - verify smooth animations
3. Customize settings after preset - verify changes stick
4. Test with actual generation to confirm preset quality

---

### 3.2 Add Collapsible Sections for Settings

**Implementation:**
```python
# Wrap infrequently used settings in Accordions
with gr.Accordion("⚙️ Advanced Inference Settings", open=False):
    temperature_slider = gr.Slider(...)
    top_p = gr.Slider(...)
    top_k = gr.Slider(...)
    repeat_penalty = gr.Slider(...)

with gr.Accordion("🔧 Hardware Settings", open=True):
    thread_config_dropdown = gr.Dropdown(...)
    custom_threads_slider = gr.Slider(...)
```

**Benefits:**
- Reduces visual clutter
- Focus on what users actually need
- Power users can still access everything

**Testing:**
1. Verify accordion starts closed (as configured)
2. Click to expand - verify animation
3. Verify all controls are accessible when open
4. Check that state persists during session

---

647
+ ### 3.3 Input Validation with Pre-Submission Warnings
648
+
649
+ **Implementation:**
650
+ ```python
651
+ # Add validation message area
652
+ validation_warning = gr.HTML("", visible=False)
653
+
654
+ # Validation function
655
+ def validate_before_submit(file_input, text_input, model_key, mode):
656
+ warnings = []
657
+
658
+ # Get transcript content
659
+ content = ""
660
+ if text_input:
661
+ content = text_input
662
+ elif file_input:
663
+ try:
664
+ with open(file_input, 'r', encoding='utf-8') as f:
665
+ content = f.read()
666
+ except:
667
+ pass
668
+
669
+ if not content:
670
+ return gr.update(visible=False), None
671
+
672
+ # Check model context limits
673
+ model = AVAILABLE_MODELS.get(model_key, {})
674
+ max_context = model.get("max_context", 4096)
675
+
676
+ # Estimate tokens (rough estimate: 1 token ≈ 4 chars for mixed content)
677
+ estimated_tokens = len(content) // 4
678
+
679
+ if estimated_tokens > max_context:
680
+ warning = f"""
681
+ <div class="validation-warning">
682
+ <h3>⚠️ Transcript Exceeds Model Context</h3>
683
+ <p><strong>Estimated tokens:</strong> {estimated_tokens:,}</p>
684
+ <p><strong>Model limit:</strong> {max_context:,} tokens</p>
685
+ <p><strong>Recommendation:</strong> Select a model with larger context (e.g., Hunyuan 256K, ERNIE 131K, Qwen3 4B 256K)</p>
686
+ <p>Continuing will truncate input.</p>
687
+ </div>
688
+ """
689
+ warnings.append(warning)
690
+
691
+ # Check empty transcript
692
+ if not content.strip():
693
+ warning = """
694
+ <div class="validation-warning">
695
+ <h3>⚠️ Empty Transcript</h3>
696
+ <p>Please provide text content before generating summary.</p>
697
+ </div>
698
+ """
699
+ warnings.append(warning)
700
+
701
+ # Check for very short content
702
+ if estimated_tokens < 50:
703
+ warning = """
704
+ <div class="validation-warning info">
705
+ <h3>ℹ️ Very Short Transcript</h3>
706
+ <p>Your transcript is less than 50 tokens. Results may be limited.</p>
707
+ </div>
708
+ """
709
+ warnings.append(warning)
710
+
711
+ if warnings:
712
+ return gr.update(value="<br>".join(warnings), visible=True), None
713
+ else:
714
+ return gr.update(visible=False), content
715
+
716
+ # Add CSS for warnings
717
+ VALIDATION_CSS = """
718
+ .validation-warning {
719
+ background: #fef3c7;
720
+ border: 1px solid #f59e0b;
721
+ border-left: 4px solid #f59e0b;
722
+ padding: 1rem;
723
+ border-radius: var(--radius-md);
724
+ margin: 1rem 0;
725
+ }
726
+
727
+ .validation-warning.info {
728
+ background: #dbeafe;
729
+ border-color: #3b82f6;
730
+ border-left-color: #3b82f6;
731
+ }
732
+
733
+ .validation-warning h3 {
734
+ margin: 0 0 0.5rem 0;
735
+ color: #1f2937;
736
+ }
737
+
738
+ .validation-warning p {
739
+ margin: 0.25rem 0;
740
+ color: #374151;
741
+ }
742
+ """
743
+
744
+ # Wire up validation (run on input change)
745
+ file_input.change(
746
+ fn=lambda f, t, m: validate_before_submit(f, t, m, None)[0],
747
+ inputs=[file_input, text_input, model_dropdown],
748
+ outputs=[validation_warning]
749
+ )
750
+
751
+ text_input.change(
752
+ fn=lambda f, t, m: validate_before_submit(f, t, m, None)[0],
753
+ inputs=[file_input, text_input, model_dropdown],
754
+ outputs=[validation_warning]
755
+ )
756
+
757
+ model_dropdown.change(
758
+ fn=lambda f, t, m: validate_before_submit(f, t, m, None)[0],
759
+ inputs=[file_input, text_input, model_dropdown],
760
+ outputs=[validation_warning]
761
+ )
762
+ ```

**Benefits:**
- Catches issues before generation time is wasted
- Provides clear recommendations
- Helps users understand model limitations
- Professional error handling

**Testing:**
1. Paste very long text (100K+ chars) - should show context limit warning
2. Submit empty text - should show empty transcript warning
3. Select small model with long text - warning should recommend larger model
4. Test that warnings disappear when issue is fixed
5. Verify submit button still works even with warnings (user choice)

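Because `.change` fires on every keystroke, the validator can run far more often than needed. One mitigation is to throttle the handler; the sketch below is generic Python (not a Gradio API) showing the idea:

```python
import time

def throttle(min_interval: float):
    """Drop calls that arrive within `min_interval` seconds of the last
    executed call. A sketch of the idea only; not a Gradio feature."""
    def decorator(fn):
        last_run = [-float("inf")]  # mutable cell the wrapper can update
        def wrapper(*args, **kwargs):
            now = time.monotonic()
            if now - last_run[0] >= min_interval:
                last_run[0] = now
                return fn(*args, **kwargs)
            return None  # call dropped
        return wrapper
    return decorator

calls = []

@throttle(10.0)
def validate(text):
    calls.append(text)

for ch in "abc":
    validate(ch)  # only the first call inside the window executes
```
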

---

### 3.4 Mobile-First Responsive Improvements

**Implementation:**
```python
# Add mobile-specific CSS
RESPONSIVE_CSS = """
/* Mobile-first adjustments */
@media (max-width: 768px) {
    .gradio-container {
        padding: 0.5rem !important;
    }

    .gradio-row {
        flex-direction: column !important;
    }

    .gradio-column {
        width: 100% !important;
    }

    /* Stack configuration panels */
    .configuration-panel {
        order: 2;
    }

    /* Stack output panels */
    .output-panel {
        order: 1;
    }

    /* Make mode explanation cards stack */
    .mode-explanation {
        flex-direction: column;
    }

    /* Make submit button sticky on mobile */
    .submit-btn {
        position: fixed;
        bottom: 0;
        left: 0;
        right: 0;
        border-radius: 0;
        z-index: 1000;
        margin: 0;
    }

    /* Adjust footer */
    .footer {
        padding-bottom: 4rem; /* Space for sticky button */
    }

    /* Make section headers smaller on mobile */
    .section-header {
        font-size: 0.9rem;
        padding: 0.5rem;
    }
}

/* Tablet adjustments */
@media (min-width: 769px) and (max-width: 1024px) {
    .gradio-column {
        padding: 1rem;
    }

    .submit-btn {
        font-size: 1rem;
        padding: 0.8rem 1.5rem;
    }
}
"""

# Add viewport meta tag for mobile (leave zoom enabled for accessibility)
gr.HTML("""
<meta name="viewport" content="width=device-width, initial-scale=1.0">
""")
```

**Benefits:**
- Better mobile experience
- Touch-friendly controls
- Improved readability on small screens
- Proper viewport scaling

**Testing:**
1. Test on mobile viewport (375px width)
2. Test on tablet viewport (769-1024px width)
3. Verify stacking order makes sense (output first, config second)
4. Test touch interactions (buttons, sliders)
5. Verify no horizontal scrolling
6. Check submit button visibility and accessibility on mobile

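The breakpoints above can also be captured in a small helper for tests, so test code and CSS agree on where each layout tier begins (a sketch mirroring the media queries, not part of the app):

```python
def layout_for_width(width_px: int) -> str:
    """Map a viewport width to the layout tier defined by the CSS above."""
    if width_px <= 768:
        return "mobile"   # stacked columns, sticky submit button
    if width_px <= 1024:
        return "tablet"   # roomier padding, smaller submit button
    return "desktop"      # default multi-column layout
```
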
870
+ ---
871
+
872
+ ## Testing Strategy
873
+
874
+ ### Test Cases Matrix
875
+
876
+ | Feature | Test Scenario | Expected Result |
877
+ |----------|---------------|------------------|
878
+ | Tooltips | Hover over temp slider | Show "Lower = more focused..." |
879
+ | Copy Feedback | Click copy button | Green toast appears |
880
+ | Debug Toggle | Check/uncheck debug | Panel shows/hides |
881
+ | Word Count | Paste text | Count updates in real-time |
882
+ | Mode Selection | Select modes | Explanation cards appear |
883
+ | Progress Bar | Submit task | Stages light up sequentially |
884
+ | Presets | Click preset buttons | Parameters auto-set |
885
+ | Auto-Load | Search model | Auto-loads best quant |
886
+ | Accordion | Collapse/expand | Smooth animation |
887
+ | Validation | Exceed context | Show warning banner |
888
+ | Mobile | 375px viewport | Stacked layout, sticky button |
889
+
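For automation, the matrix can be mirrored as data so a single parametrized test covers every row (a sketch; the feature keys are illustrative, not existing identifiers):

```python
# (feature, scenario, expected) tuples mirroring the matrix above
TEST_MATRIX = [
    ("tooltips", "hover over temp slider", "focused hint shown"),
    ("copy_feedback", "click copy button", "green toast appears"),
    ("debug_toggle", "check/uncheck debug", "panel shows/hides"),
    ("word_count", "paste text", "count updates in real-time"),
    ("mode_selection", "select modes", "explanation cards appear"),
    ("progress_bar", "submit task", "stages light up sequentially"),
    ("presets", "click preset buttons", "parameters auto-set"),
    ("auto_load", "search model", "auto-loads best quant"),
    ("accordion", "collapse/expand", "smooth animation"),
    ("validation", "exceed context", "warning banner shown"),
    ("mobile", "375px viewport", "stacked layout, sticky button"),
]

def matrix_is_well_formed(matrix) -> bool:
    """Every row has a unique feature key and three non-empty fields."""
    features = [row[0] for row in matrix]
    return (len(set(features)) == len(features)
            and all(len(row) == 3 and all(row) for row in matrix))
```

With pytest, `@pytest.mark.parametrize("feature,scenario,expected", TEST_MATRIX)` would then generate one test per row.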

### Automated Testing

```python
# test_ui_features.py
import requests

BASE_URL = "http://localhost:7860"  # app must be running locally

def test_tooltips():
    """Verify tooltips are present in DOM"""
    response = requests.get(BASE_URL)
    assert "tooltip" in response.text.lower()

def test_copy_toast():
    """Verify toast CSS is present"""
    response = requests.get(BASE_URL)
    assert "slideIn" in response.text  # Animation keyframes

def test_progress_indicators():
    """Verify stage indicators present"""
    response = requests.get(BASE_URL)
    assert "stage-indicator" in response.text

def test_validation_warnings():
    """Verify validation CSS present"""
    response = requests.get(BASE_URL)
    assert "validation-warning" in response.text

if __name__ == "__main__":
    test_tooltips()
    test_copy_toast()
    test_progress_indicators()
    test_validation_warnings()
    print("✅ All UI tests passed")
```

### Manual Testing Checklist

**Phase 1 Tests:**
- [ ] Tooltips visible on hover
- [ ] Copy toast appears and disappears
- [ ] Debug panel hidden by default
- [ ] Word count updates in real-time

**Phase 2 Tests:**
- [ ] Mode explanations appear for both modes
- [ ] Progress bar shows stages correctly
- [ ] Presets apply correct values
- [ ] Auto-load workflow smooth

**Phase 3 Tests:**
- [ ] Advanced presets configure all 3 stages
- [ ] Accordions collapse/expand smoothly
- [ ] Validation warnings show appropriately
- [ ] Mobile layout stacks correctly


---

## Implementation Order

1. **Week 1:** Phase 1 (Quick Wins)
   - Day 1-2: Tooltips + Copy feedback
   - Day 3: Debug toggle + Word count

2. **Week 2:** Phase 2 (Medium Effort)
   - Day 1-2: Mode selection + Progress indicators
   - Day 3-4: Presets + Custom model UX

3. **Week 3:** Phase 3 (Larger Changes)
   - Day 1-3: Advanced mode redesign
   - Day 4-5: Collapsible sections + Validation
   - Day 6-7: Mobile improvements


---

## Rollback Plan

If issues arise, each change is isolated:

```bash
# Tag before each phase
git tag -a phase1-start -m "Before Phase 1 changes"
git tag -a phase2-start -m "Before Phase 2 changes"
git tag -a phase3-start -m "Before Phase 3 changes"

# Rollback if needed (note: --hard discards uncommitted work;
# prefer `git revert` if the branch is already shared)
git reset --hard phase1-start  # Roll back to Phase 1 start
git reset --hard phase2-start  # Roll back to Phase 2 start
```

---

## Success Metrics

- **User Engagement:** Time on page + button clicks tracked
- **Error Rate:** Failed submissions decreased by 50%
- **Feature Adoption:** Advanced Mode usage increased by 30%
- **User Satisfaction:** Survey after 2 weeks of deployment
- **Mobile Traffic:** Mobile session length + completion rate

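Tracking could start as simple as an in-process event counter flushed to logs; the sketch below is illustrative only (event names are hypothetical, and a real deployment would use an analytics backend):

```python
from collections import Counter

class UsageMetrics:
    """Minimal in-memory event counter for the metrics listed above."""

    def __init__(self):
        self.events = Counter()

    def record(self, event: str) -> None:
        self.events[event] += 1

    def rate(self, numerator: str, denominator: str) -> float:
        """Ratio of two event counts, e.g. failed submissions per submission."""
        total = self.events[denominator]
        return self.events[numerator] / total if total else 0.0

metrics = UsageMetrics()
metrics.record("submission")
metrics.record("submission")
metrics.record("submission_failed")
```
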

---

## Conclusion

This plan provides a structured approach to improving Tiny Scribe's UI/UX with:
- Clear phases and priorities
- Specific implementation details
- Comprehensive testing strategy
- Rollback procedures
- Success metrics

Ready to begin Phase 1 implementation when approved.