Luigi committed
Commit 9d88146 · 1 Parent(s): cd4ade3

docs: update AGENTS.md guidelines and add comprehensive UI/UX implementation plan

- AGENTS.md: Refine code style guidelines, add missing dependencies (numpy, gradio_huggingfacehub_search), update project structure to include meeting_summarizer/ module, add inference settings by model role and environment variables
- UI_UX_IMPLEMENTATION_PLAN.md: Add detailed 3-phase implementation plan for UI/UX improvements including quick wins (tooltips, toast notifications, debug toggle, word count), medium effort changes (mode simplification, progress indicators, configuration presets, custom model auto-loading), and larger changes (advanced mode redesign, collapsible sections, input validation, mobile-first improvements)

Files changed (2):
  1. AGENTS.md +36 -38
  2. UI_UX_IMPLEMENTATION_PLAN.md +1000 -0
AGENTS.md CHANGED
@@ -33,14 +33,15 @@ mypy app.py
 
 **Running tests (root project tests):**
 ```bash
-# Run E2E test
+# Run all root tests
 python test_e2e.py
-
-# Run advanced mode test
 python test_advanced_mode.py
-
-# Run LFM2 extraction test
 python test_lfm2_extract.py
+
+# Run single test with pytest
+pytest test_e2e.py -v                        # Run all tests in file
+pytest test_e2e.py::test_e2e -v              # Run specific function
+pytest test_advanced_mode.py -k "test_name"  # Run by name pattern
 ```
 
 **llama-cpp-python submodule tests:**
@@ -54,57 +55,46 @@ cd llama-cpp-python && pytest tests/test_llama.py::test_function_name -v
 ## Code Style Guidelines
 
 **Formatting:**
-- Use 4 spaces for indentation
-- Line length: 100 characters max
-- Use double quotes for docstrings
-- Two blank lines before function definitions
-- One blank line after docstrings
+- 4 spaces indentation, 100 char max line length, double quotes for docstrings
+- Two blank lines before functions, one after docstrings
 
 **Imports (ordered):**
 ```python
 # Standard library
 import os
-import argparse
-import re
 from typing import Tuple, Optional, Generator
 
 # Third-party packages
 from llama_cpp import Llama
-from huggingface_hub import hf_hub_download
-from opencc import OpenCC
 import gradio as gr
+
+# Local modules
+from meeting_summarizer.trace import Tracer
 ```
 
 **Type Hints:**
-- Use type hints for parameters and return values
-- Use `Optional[]` for nullable types
-- Use `Generator[str, None, None]` for generators
-- Example: `def load_model(repo_id: str, filename: str, cpu_only: bool = False) -> Llama:`
+- Use type hints for params/returns
+- `Optional[]` for nullable types, `Generator[str, None, None]` for generators
+- Example: `def load_model(repo_id: str, filename: str) -> Llama:`
 
 **Naming Conventions:**
-- `snake_case` for functions and variables
-- `CamelCase` for classes
-- `UPPER_CASE` for constants
+- `snake_case` for functions/variables, `CamelCase` for classes, `UPPER_CASE` for constants
 - Descriptive names: `stream_summarize_transcript`, not `summ`
 
-**Docstrings:**
-- Use triple quotes for all public functions
-- Keep first line as brief summary
-- Include Args/Returns sections for complex functions
-
 **Error Handling:**
-- Use explicit error messages with f-strings
-- Check file existence before operations
-- Use `try/except` blocks for external API calls (Hugging Face, model loading)
+- Use explicit error messages with f-strings, check file existence before operations
+- Use `try/except` for external API calls (Hugging Face, model loading)
 - Log errors with context for debugging
 
 ## Dependencies
 
 **Required:**
-- `llama-cpp-python>=0.3.0` - Core inference engine
+- `llama-cpp-python>=0.3.0` - Core inference engine (installed from llama-cpp-python submodule)
 - `gradio>=5.0.0` - Web UI framework
+- `gradio_huggingfacehub_search>=0.0.12` - HuggingFace model search component
 - `huggingface-hub>=0.23.0` - Model downloading
 - `opencc-python-reimplemented>=0.1.7` - Chinese text conversion
+- `numpy>=1.24.0` - Numerical operations for embeddings
 
 **Development (optional):**
 - `pytest>=7.4.0` - Testing framework
@@ -122,6 +112,10 @@ tiny-scribe/
 ├── test_e2e.py              # E2E test
 ├── test_advanced_mode.py    # Advanced mode test
 ├── test_lfm2_extract.py     # LFM2 extraction test
+├── meeting_summarizer/      # Core summarization module
+│   ├── __init__.py
+│   ├── trace.py             # Tracing/logging utilities
+│   └── extraction.py        # Extraction and deduplication logic
 ├── llama-cpp-python/        # Git submodule
 └── README.md                # Project documentation
 ```
@@ -139,16 +133,20 @@ llm = Llama.from_pretrained(
 )
 ```
 
-**Streaming Chat Completion:**
-```python
-stream = llm.create_chat_completion(
-    messages=[{"role": "user", "content": prompt}],
-    stream=True,
-    max_tokens=1024,
-    temperature=0.6,
-)
+**Inference Settings:**
+- Extraction models: Low temp (0.1-0.3) for deterministic JSON
+- Synthesis models: Higher temp (0.7-0.9) for creative summaries
+- Reasoning types: Non-reasoning (hide checkbox), Hybrid (toggleable), Thinking-only (always on)
+
+**Environment & GPU:**
+```bash
+DEFAULT_N_THREADS=2          # CPU threads (1-32)
+N_GPU_LAYERS=0               # 0=CPU, -1=all GPU
+HF_HUB_DOWNLOAD_TIMEOUT=300  # Download timeout (seconds)
 ```
 
+GPU offload detection: `from llama_cpp import llama_supports_gpu_offload`
+
 ## Notes for AI Agents
 
 - Always call `llm.reset()` after completion to ensure state isolation
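
As a sketch of how the role-based settings and environment variables documented above might be consumed in code (the helper name, dict layout, and exact temperature values are illustrative assumptions, not part of the diff):

```python
import os

# Illustrative mapping of the per-role sampling guidance above;
# 0.2 and 0.8 are picked from the documented 0.1-0.3 / 0.7-0.9 ranges.
ROLE_SETTINGS = {
    "extraction": {"temperature": 0.2, "max_tokens": 1024},  # deterministic JSON
    "synthesis": {"temperature": 0.8, "max_tokens": 1024},   # creative summaries
}

def sampling_kwargs(role: str) -> dict:
    """Return sampling settings for a pipeline role, defaulting to extraction."""
    return dict(ROLE_SETTINGS.get(role, ROLE_SETTINGS["extraction"]))

# Env-driven hardware settings, mirroring the variables documented above.
N_THREADS = int(os.environ.get("DEFAULT_N_THREADS", "2"))
N_GPU_LAYERS = int(os.environ.get("N_GPU_LAYERS", "0"))  # 0=CPU, -1=all GPU
```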
UI_UX_IMPLEMENTATION_PLAN.md ADDED
@@ -0,0 +1,1000 @@
# UI/UX Implementation Plan for Tiny Scribe

## Status
- ✅ Docker container built and running (http://localhost:7860)
- ✅ All dependencies verified (Python 3.10.19, Gradio 5.50.0)
- ✅ Test transcripts available (micro.txt: 20 words, min.txt: 5 words, short.txt: 52 words)

---

## Phase 1: Quick Wins (Low Risk, High Value)
*Estimated Time: 2-3 hours*

### 1.1 Add Tooltips to Technical Parameters
**Location:** `app.py` lines 2620-2640 (inference parameters)

**Implementation:**
```python
# Add info parameter to sliders with clearer explanations
temperature_slider = gr.Slider(
    minimum=0.0,
    maximum=2.0,
    value=0.6,
    step=0.1,
    label="Temperature",
    info="Lower = more focused/consistent, Higher = more creative/diverse",
    show_label=True,
    interactive=True,
    # Add tooltip via Gradio's elem_id + custom CSS
    elem_id="temperature-slider"
)
```

**Benefits:**
- Reduces cognitive load for non-technical users
- Helps users understand trade-offs

**Testing:**
1. Start container with Standard Mode selected
2. Hover over temperature slider - should show detailed explanation
3. Verify tooltips work on mobile (tap to show)
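
Note that Gradio's `info=` renders static helper text below the label rather than a hover tooltip; the `elem_id` hook mentioned in the comment could be paired with CSS along these lines (selector, wording, and the `gr.Blocks(css=...)` wiring are assumptions of this sketch, not existing app code):

```python
# Hypothetical CSS to be passed via gr.Blocks(css=TOOLTIP_CSS); it turns the
# elem_id set on the slider above into a pure-CSS hover tooltip.
TOOLTIP_CSS = """
#temperature-slider { position: relative; }
#temperature-slider:hover::after {
    content: "Controls sampling randomness (0.0 = deterministic, 2.0 = very random)";
    position: absolute;
    bottom: 100%;
    left: 0;
    background: #1f2937;
    color: #fff;
    padding: 4px 8px;
    border-radius: 4px;
    font-size: 0.8rem;
    white-space: nowrap;
    z-index: 10000;
}
"""
```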

---

### 1.2 Improve Copy/Download Feedback
**Location:** `app.py` lines 2986-2998 (copy buttons)

**Implementation:**
```python
# Add toast notification via JavaScript
copy_summary_btn.click(
    fn=lambda x: x,
    inputs=[summary_output],
    outputs=[],
    js="""
    (text) => {
        navigator.clipboard.writeText(text);
        // Show toast notification
        const toast = document.createElement('div');
        toast.style.cssText = `
            position: fixed;
            bottom: 20px;
            right: 20px;
            background: #10b981;
            color: white;
            padding: 12px 24px;
            border-radius: 8px;
            box-shadow: 0 4px 12px rgba(0,0,0,0.15);
            z-index: 10000;
            animation: slideIn 0.3s ease-out;
        `;
        toast.textContent = '✓ Copied to clipboard!';
        document.body.appendChild(toast);
        setTimeout(() => toast.remove(), 2000);
        return text;
    }
    """
)
```

**Add to CSS:**
```css
@keyframes slideIn {
    from { transform: translateY(100%); opacity: 0; }
    to { transform: translateY(0); opacity: 1; }
}
```

**Benefits:**
- Provides clear user feedback
- Professional feel
- Reduces uncertainty about whether the action worked

**Testing:**
1. Click "Copy Summary" button
2. Verify green toast appears: "✓ Copied to clipboard!"
3. Toast disappears after 2 seconds
4. Verify clipboard content matches summary

---

### 1.3 Hide Debug Panels Behind Toggle
**Location:** `app.py` line 2714 (system_prompt_debug)

**Implementation:**
```python
# Add developer mode toggle at bottom of left column
with gr.Group():
    show_debug = gr.Checkbox(
        value=False,
        label="Show Developer Debug Info",
        info="Enable to see internal prompts (for debugging only)"
    )

# Make debug panel conditional
system_prompt_debug = gr.Textbox(
    label="System Prompt (Debug)",
    value="",
    visible=False,
    interactive=False,
    elem_classes=["debug-panel"]
)

# Toggle visibility
show_debug.change(
    fn=lambda x: gr.update(visible=x),
    inputs=[show_debug],
    outputs=[system_prompt_debug]
)
```

**Benefits:**
- Reduces visual clutter
- Hides technical implementation details
- Still available for power users

**Testing:**
1. Verify debug panel is hidden by default
2. Check "Show Developer Debug Info" checkbox
3. Verify system prompt text appears
4. Uncheck - should hide again

---

### 1.4 Add Character/Word Count to Text Input
**Location:** `app.py` lines 2506-2512 (text_input)

**Implementation:**
```python
# Add word count display below textbox
with gr.Group():
    text_input = gr.Textbox(
        label="Paste Transcript",
        placeholder="Paste your transcript content here...",
        lines=10,
        max_lines=20
    )
    text_word_count = gr.Textbox(
        label="Character/Word Count",
        value="0 characters / 0 words",
        interactive=False,
        scale=0,
        elem_classes=["word-count"]
    )

# Update count function
def update_word_count(text):
    chars = len(text)
    words = len(text.split()) if text else 0
    return f"{chars:,} characters / {words:,} words"

# Wire up event
text_input.change(
    fn=update_word_count,
    inputs=[text_input],
    outputs=[text_word_count]
)
```

**Benefits:**
- Users know if transcript fits model context
- Helps plan which model to use
- Pre-validation before submission

**Testing:**
1. Paste text into input
2. Verify count updates in real-time
3. Check character/word calculation accuracy
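
To tie the count directly to the "fits model context" benefit, the helper could also surface a rough token estimate, reusing the ~4-chars-per-token heuristic that section 3.3 applies for validation (this variant is a sketch, not existing app code):

```python
def update_word_count(text: str, chars_per_token: int = 4) -> str:
    """Character/word display plus a rough token estimate (~4 chars/token)."""
    chars = len(text)
    words = len(text.split()) if text else 0
    est_tokens = chars // chars_per_token
    return f"{chars:,} characters / {words:,} words (~{est_tokens:,} tokens)"
```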

---

## Phase 2: Medium Effort (High Impact)
*Estimated Time: 4-6 hours*

### 2.1 Simplify Mode Selection
**Location:** `app.py` line 2544 (mode_radio)

**Implementation:**
```python
mode_radio = gr.Radio(
    choices=[
        ("Quick Summarize (Fast, Single-Pass)", "Standard Mode"),
        ("Deep Analysis Pipeline (Multi-Stage, Higher Quality)", "Advanced Mode (3-Model Pipeline)")
    ],
    value="Standard Mode",
    label="🎯 Summarization Mode",
    info="Choose processing approach based on your needs"
)

# Add explanation cards
mode_explanation = gr.HTML("""
<div class="mode-explanation">
    <div class="mode-card">
        <h3>⚡ Quick Summarize</h3>
        <p><strong>Best for:</strong> Short texts, quick summaries, fast results</p>
        <ul>
            <li>Single AI model processes entire text</li>
            <li>Typical time: 10-30 seconds</li>
            <li>Good for: Meeting notes, article summaries</li>
        </ul>
    </div>
    <div class="mode-card">
        <h3>🔬 Deep Analysis Pipeline</h3>
        <p><strong>Best for:</strong> Long transcripts, comprehensive reports, high-quality output</p>
        <ul>
            <li>3 specialized AI models work together</li>
            <li>Deduplicates similar information</li>
            <li>Typical time: 30-90 seconds</li>
            <li>Good for: Conference transcripts, research documents</li>
        </ul>
    </div>
</div>
""")
```

**Add CSS:**
```css
.mode-explanation {
    display: flex;
    gap: 1rem;
    margin: 1rem 0;
}

.mode-card {
    flex: 1;
    padding: 1rem;
    border: 2px solid var(--border-color);
    border-radius: var(--radius-md);
    background: var(--card-bg);
}

.mode-card h3 {
    margin-top: 0;
    color: var(--primary-color);
}

.mode-card ul {
    margin: 0.5rem 0 0 1rem;
    font-size: 0.9rem;
}
```

**Benefits:**
- Clear guidance on which mode to use
- Reduces decision paralysis
- Educates users about trade-offs

**Testing:**
1. Select each mode - verify explanation cards appear
2. Check layout on mobile (should stack vertically)
3. Verify text is readable at different screen sizes

---

### 2.2 Add Progress Bar + Stage Indicators
**Location:** `app.py` lines 2746-2814 (router function)

**Implementation:**
```python
# Add progress components
progress_bar = gr.Progress()
stage_indicator = gr.HTML("""
<div class="stage-indicators">
    <div class="stage" id="stage-input">
        <span class="stage-icon">📥</span>
        <span class="stage-label">Input</span>
    </div>
    <div class="stage" id="stage-thinking">
        <span class="stage-icon">🧠</span>
        <span class="stage-label">Thinking</span>
    </div>
    <div class="stage" id="stage-summary">
        <span class="stage-icon">📝</span>
        <span class="stage-label">Summary</span>
    </div>
</div>
""")

# Update router to show progress
def route_summarize_with_progress(*args):
    mode = args[-1]  # mode_radio is last arg

    if mode == "Standard Mode":
        # Update stage indicator
        yield gr.update(value='<div class="stage active">📥 Input</div>')
        # ... process input ...

        yield gr.update(value='<div class="stage active">🧠 Thinking</div>')
        # ... generate thinking ...

        yield gr.update(value='<div class="stage active">📝 Summary</div>')
        # ... generate summary ...
```

**Add CSS:**
```css
.stage-indicators {
    display: flex;
    justify-content: space-between;
    margin: 1rem 0;
    padding: 0.5rem;
    background: var(--card-bg);
    border-radius: var(--radius-md);
}

.stage {
    display: flex;
    align-items: center;
    gap: 0.5rem;
    padding: 0.5rem 1rem;
    border-radius: var(--radius-sm);
    opacity: 0.5;
    transition: all 0.3s;
}

.stage.active {
    opacity: 1;
    background: linear-gradient(135deg, var(--primary-color) 0%, var(--accent-color) 100%);
    color: white;
    transform: scale(1.05);
}

.stage-icon {
    font-size: 1.2rem;
}

.stage-label {
    font-weight: 600;
}
```

**Benefits:**
- Visual feedback during long operations
- Users know exactly what's happening
- Reduces perceived wait time

**Testing:**
1. Submit Standard Mode task
2. Verify stage indicators light up in sequence: Input → Thinking → Summary
3. Test Advanced Mode: should show Extraction → Deduplication → Synthesis
4. Check active stage has highlight effect

---

### 2.3 Implement Configuration Presets
**Location:** `app.py` after line 2630 (inference parameters)

**Implementation:**
```python
# Add preset buttons
with gr.Row():
    quick_preset_btn = gr.Button("⚡ Quick (Fast)", size="sm", variant="secondary")
    quality_preset_btn = gr.Button("⭐ Quality (Balanced)", size="sm", variant="secondary")
    creative_preset_btn = gr.Button("🎨 Creative (Diverse)", size="sm", variant="secondary")

# Preset configurations
PRESETS = {
    "quick": {"temperature": 0.3, "top_p": 0.8, "top_k": 20},
    "quality": {"temperature": 0.6, "top_p": 0.9, "top_k": 40},
    "creative": {"temperature": 1.0, "top_p": 0.95, "top_k": 50}
}

# Apply preset function
def apply_preset(preset_name):
    config = PRESETS[preset_name]
    return (
        gr.update(value=config["temperature"]),
        gr.update(value=config["top_p"]),
        gr.update(value=config["top_k"])
    )

# Wire up buttons
quick_preset_btn.click(
    fn=lambda: apply_preset("quick"),
    outputs=[temperature_slider, top_p, top_k]
)

quality_preset_btn.click(
    fn=lambda: apply_preset("quality"),
    outputs=[temperature_slider, top_p, top_k]
)

creative_preset_btn.click(
    fn=lambda: apply_preset("creative"),
    outputs=[temperature_slider, top_p, top_k]
)
```

**Benefits:**
- One-click optimization for different use cases
- Reduces need to understand each parameter
- Provides good starting points for customization

**Testing:**
1. Click "Quick" - verify temp=0.3, top_p=0.8, top_k=20
2. Click "Quality" - verify temp=0.6, top_p=0.9, top_k=40
3. Click "Creative" - verify temp=1.0, top_p=0.95, top_k=50
4. Test that manual adjustments still work after applying preset

---

### 2.4 Improve Custom Model Loading UX
**Location:** `app.py` lines 2590-2619 (custom model section)

**Implementation:**
```python
# Simplify to auto-load workflow
model_search_input = HuggingfaceHubSearch(
    label="🔍 Search & Load Model",
    placeholder="Type model name (e.g., 'qwen', 'phi', 'llama')",
    search_type="model",
    info="Selecting a model will automatically load it"
)

# Auto-load on selection
def auto_load_model(repo_id):
    """Automatically load first available GGUF file."""
    # Note: this is a generator, so all UI updates must be yielded;
    # a plain `return value` would be silently discarded by Gradio.
    if not repo_id or "/" not in repo_id:
        yield gr.update(), gr.update(value="")
        return

    # Show loading state with progress
    yield (
        gr.update(value="🔄 Loading model..."),
        gr.update(value="", visible=True)
    )

    # Discover files
    files, error = list_repo_gguf_files(repo_id)

    if error:
        yield (
            gr.update(value=f"❌ {error}"),
            gr.update(value="", visible=False)
        )
        return

    if not files:
        yield (
            gr.update(value="❌ No GGUF files found"),
            gr.update(value="", visible=False)
        )
        return

    # Auto-select best quantization (prioritize Q4_K_M, Q4_0, Q8_0)
    preferred_quants = ["Q4_K_M", "Q4_0", "Q8_0"]
    selected_file = None

    for quant in preferred_quants:
        for f in files:
            if quant.lower() in f["name"].lower():
                selected_file = f
                break
        if selected_file:
            break

    if not selected_file:
        selected_file = files[0]  # Fallback to first file

    # Load model
    try:
        model, msg = load_custom_model_from_hf(
            repo_id,
            selected_file["name"],
            n_threads=2
        )
        # If the loaded model and its metadata (repo_id, filename, size_mb)
        # must be kept, add gr.State components to the outputs below and
        # yield them here as well.
        yield (
            gr.update(value=f"✅ {msg}"),
            gr.update(value="", visible=False)
        )
    except Exception as e:
        yield (
            gr.update(value=f"❌ Failed to load: {str(e)}"),
            gr.update(value="", visible=False)
        )

# Wire up auto-load
model_search_input.change(
    fn=auto_load_model,
    inputs=[model_search_input],
    outputs=[custom_status, custom_file_dropdown],
    show_progress="minimal"
)
```

**Benefits:**
- Reduces from 3 steps to 1 step
- Auto-selects optimal quantization
- Better error messaging
- Visual loading states

**Testing:**
1. Search for "Qwen3-0.6B-GGUF"
2. Verify it auto-loads the best quantization (Q4_K_M or Q4_0)
3. Check status messages: "🔄 Loading..." → "✅ Loaded: ..."
4. Test error case: search for an invalid repo
5. Verify clear error message appears
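
The nested quantization-preference loop above can be factored into a small, separately testable helper (the function name is illustrative, not existing app code); falling back to `files[0]` preserves the plan's behavior when no preferred quantization is present:

```python
def pick_preferred_quant(files, preferred=("Q4_K_M", "Q4_0", "Q8_0")):
    """Return the first file whose name matches a preferred quantization,
    falling back to the first file when none match (None if no files)."""
    for quant in preferred:
        for f in files:
            if quant.lower() in f["name"].lower():
                return f
    return files[0] if files else None
```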

---

## Phase 3: Larger Changes (High Value)
*Estimated Time: 8-12 hours*

### 3.1 Redesign Advanced Mode (Reduce Cognitive Load)

**Approach:** Collapse 3 stages into accordion/tabs, add "Quick Start" preset

**Implementation:**
```python
# Add Quick Start preset at top
advanced_quick_start = gr.Dropdown(
    choices=[
        ("🔬 Deep Analysis (Best for long transcripts)", "deep"),
        ("⚡ Fast Extraction (Best for quick insights)", "fast"),
        ("🎯 Balanced (Good default)", "balanced")
    ],
    value="balanced",
    label="Quick Start Preset",
    info="Pre-configured settings - customize below if needed"
)

# Wrap stages in Accordions
with gr.Accordion("🔍 Stage 1: Extraction", open=True):
    extraction_model = gr.Dropdown(...)
    extraction_n_ctx = gr.Slider(...)
    enable_extraction_reasoning = gr.Checkbox(...)

with gr.Accordion("🧬 Stage 2: Deduplication", open=True):
    embedding_model = gr.Dropdown(...)
    similarity_threshold = gr.Slider(...)

with gr.Accordion("✨ Stage 3: Synthesis", open=True):
    synthesis_model = gr.Dropdown(...)
    enable_synthesis_reasoning = gr.Checkbox(...)

# Preset configurations
ADVANCED_PRESETS = {
    "deep": {
        "extraction": "qwen2.5_1.5b",
        "embedding": "granite-107m",
        "synthesis": "ernie_21b_thinking_q1",
        "n_ctx": 8192,
        "similarity": 0.85
    },
    "fast": {
        "extraction": "qwen2.5_1.5b",
        "embedding": "granite-107m",
        "synthesis": "granite_3_1_1b_q8",
        "n_ctx": 4096,
        "similarity": 0.80
    },
    "balanced": {
        "extraction": "qwen2.5_1.5b",
        "embedding": "granite-107m",
        "synthesis": "qwen3_1.7b_q4",
        "n_ctx": 4096,
        "similarity": 0.85
    }
}

def apply_advanced_preset(preset_name):
    config = ADVANCED_PRESETS[preset_name]
    return (
        gr.update(value=config["extraction"]),
        gr.update(value=config["embedding"]),
        gr.update(value=config["synthesis"]),
        gr.update(value=config["n_ctx"]),
        gr.update(value=config["similarity"])
    )

advanced_quick_start.change(
    fn=apply_advanced_preset,
    inputs=[advanced_quick_start],
    outputs=[extraction_model, embedding_model, synthesis_model,
             extraction_n_ctx, similarity_threshold]
)
```

**Benefits:**
- New users can start with one click
- Stages collapsible when configured
- Reduces initial overwhelm
- Advanced users can still customize

**Testing:**
1. Select each preset - verify all settings update correctly
2. Collapse/expand accordions - verify smooth animations
3. Customize settings after preset - verify changes stick
4. Test with actual generation to confirm preset quality

---

### 3.2 Add Collapsible Sections for Settings

**Implementation:**
```python
# Wrap infrequently used settings in Accordions
with gr.Accordion("⚙️ Advanced Inference Settings", open=False):
    temperature_slider = gr.Slider(...)
    top_p = gr.Slider(...)
    top_k = gr.Slider(...)
    repeat_penalty = gr.Slider(...)

with gr.Accordion("🔧 Hardware Settings", open=True):
    thread_config_dropdown = gr.Dropdown(...)
    custom_threads_slider = gr.Slider(...)
```

**Benefits:**
- Reduces visual clutter
- Focus on what users actually need
- Power users can still access everything

**Testing:**
1. Verify accordion starts closed (as configured)
2. Click to expand - verify animation
3. Verify all controls are accessible when open
4. Check that state persists during session

---

647
+ ### 3.3 Input Validation with Pre-Submission Warnings
648
+
649
+ **Implementation:**
650
+ ```python
651
+ # Add validation message area
652
+ validation_warning = gr.HTML("", visible=False)
653
+
654
+ # Validation function
655
+ def validate_before_submit(file_input, text_input, model_key, mode):
656
+ warnings = []
657
+
658
+ # Get transcript content
659
+ content = ""
660
+ if text_input:
661
+ content = text_input
662
+ elif file_input:
663
+ try:
664
+ with open(file_input, 'r', encoding='utf-8') as f:
665
+ content = f.read()
666
+ except:
667
+ pass
668
+
669
+ if not content:
670
+ return gr.update(visible=False), None
671
+
672
+ # Check model context limits
673
+ model = AVAILABLE_MODELS.get(model_key, {})
674
+ max_context = model.get("max_context", 4096)
675
+
676
+ # Estimate tokens (rough estimate: 1 token ≈ 4 chars for mixed content)
677
+ estimated_tokens = len(content) // 4
678
+
679
+ if estimated_tokens > max_context:
680
+ warning = f"""
681
+ <div class="validation-warning">
682
+ <h3>⚠️ Transcript Exceeds Model Context</h3>
683
+ <p><strong>Estimated tokens:</strong> {estimated_tokens:,}</p>
684
+ <p><strong>Model limit:</strong> {max_context:,} tokens</p>
685
+ <p><strong>Recommendation:</strong> Select a model with larger context (e.g., Hunyuan 256K, ERNIE 131K, Qwen3 4B 256K)</p>
686
+ <p>Continuing will truncate input.</p>
687
+ </div>
688
+ """
689
+ warnings.append(warning)
690
+
691
+ # Check empty transcript
692
+ if not content.strip():
693
+ warning = """
694
+ <div class="validation-warning">
695
+ <h3>⚠️ Empty Transcript</h3>
696
+ <p>Please provide text content before generating summary.</p>
697
+ </div>
698
+ """
699
+ warnings.append(warning)
700
+
701
+ # Check for very short content
702
+ if estimated_tokens < 50:
703
+ warning = """
704
+ <div class="validation-warning info">
705
+ <h3>ℹ️ Very Short Transcript</h3>
706
+ <p>Your transcript is less than 50 tokens. Results may be limited.</p>
707
+ </div>
708
+ """
709
+ warnings.append(warning)
710
+
711
+ if warnings:
712
+ return gr.update(value="<br>".join(warnings), visible=True), None
713
+ else:
714
+ return gr.update(visible=False), content
715
+
716
+ # Add CSS for warnings
717
+ VALIDATION_CSS = """
718
+ .validation-warning {
719
+ background: #fef3c7;
720
+ border: 1px solid #f59e0b;
721
+ border-left: 4px solid #f59e0b;
722
+ padding: 1rem;
723
+ border-radius: var(--radius-md);
724
+ margin: 1rem 0;
725
+ }
726
+
727
+ .validation-warning.info {
728
+ background: #dbeafe;
729
+ border-color: #3b82f6;
730
+ border-left-color: #3b82f6;
731
+ }
732
+
733
+ .validation-warning h3 {
734
+ margin: 0 0 0.5rem 0;
735
+ color: #1f2937;
736
+ }
737
+
738
+ .validation-warning p {
739
+ margin: 0.25rem 0;
740
+ color: #374151;
741
+ }
742
+ """
743
+
744
+ # Wire up validation (run on input change)
745
+ file_input.change(
746
+ fn=lambda f, t, m: validate_before_submit(f, t, m, None)[0],
747
+ inputs=[file_input, text_input, model_dropdown],
748
+ outputs=[validation_warning]
749
+ )
750
+
751
+ text_input.change(
752
+ fn=lambda f, t, m: validate_before_submit(f, t, m, None)[0],
753
+ inputs=[file_input, text_input, model_dropdown],
754
+ outputs=[validation_warning]
755
+ )
756
+
757
+ model_dropdown.change(
758
+ fn=lambda f, t, m: validate_before_submit(f, t, m, None)[0],
759
+ inputs=[file_input, text_input, model_dropdown],
760
+ outputs=[validation_warning]
761
+ )
762
+ ```

**Benefits:**
- Catches issues before generation time is wasted
- Provides clear recommendations
- Helps users understand model limitations
- Professional error handling

**Testing:**
1. Paste very long text (100K+ chars) - should show context limit warning
2. Submit empty text - should show empty transcript warning
3. Select small model with long text - warning should recommend larger model
4. Test that warnings disappear when issue is fixed
5. Verify submit button still works even with warnings (user choice)

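Because `.change` fires on every keystroke, the validator can run far more often than needed. One mitigation is to throttle the handler; the sketch below is generic Python (not a Gradio API) showing the idea:

```python
import time

def throttle(min_interval: float):
    """Drop calls that arrive within `min_interval` seconds of the last
    executed call. A sketch of the idea only; not a Gradio feature."""
    def decorator(fn):
        last_run = [-float("inf")]  # mutable cell the wrapper can update
        def wrapper(*args, **kwargs):
            now = time.monotonic()
            if now - last_run[0] >= min_interval:
                last_run[0] = now
                return fn(*args, **kwargs)
            return None  # call dropped
        return wrapper
    return decorator

calls = []

@throttle(10.0)
def validate(text):
    calls.append(text)

for ch in "abc":
    validate(ch)  # only the first call inside the window executes
```
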

---

### 3.4 Mobile-First Responsive Improvements

**Implementation:**
```python
# Add mobile-specific CSS
RESPONSIVE_CSS = """
/* Mobile-first adjustments */
@media (max-width: 768px) {
    .gradio-container {
        padding: 0.5rem !important;
    }

    .gradio-row {
        flex-direction: column !important;
    }

    .gradio-column {
        width: 100% !important;
    }

    /* Stack configuration panels */
    .configuration-panel {
        order: 2;
    }

    /* Stack output panels */
    .output-panel {
        order: 1;
    }

    /* Make mode explanation cards stack */
    .mode-explanation {
        flex-direction: column;
    }

    /* Make submit button sticky on mobile */
    .submit-btn {
        position: fixed;
        bottom: 0;
        left: 0;
        right: 0;
        border-radius: 0;
        z-index: 1000;
        margin: 0;
    }

    /* Adjust footer */
    .footer {
        padding-bottom: 4rem; /* Space for sticky button */
    }

    /* Make section headers smaller on mobile */
    .section-header {
        font-size: 0.9rem;
        padding: 0.5rem;
    }
}

/* Tablet adjustments */
@media (min-width: 769px) and (max-width: 1024px) {
    .gradio-column {
        padding: 1rem;
    }

    .submit-btn {
        font-size: 1rem;
        padding: 0.8rem 1.5rem;
    }
}
"""

# Add viewport meta tag for mobile (leave zoom enabled for accessibility)
gr.HTML("""
<meta name="viewport" content="width=device-width, initial-scale=1.0">
""")
```

**Benefits:**
- Better mobile experience
- Touch-friendly controls
- Improved readability on small screens
- Proper viewport scaling

**Testing:**
1. Test on mobile viewport (375px width)
2. Test on tablet viewport (769-1024px width)
3. Verify stacking order makes sense (output first, config second)
4. Test touch interactions (buttons, sliders)
5. Verify no horizontal scrolling
6. Check submit button visibility and accessibility on mobile

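The breakpoints above can also be captured in a small helper for tests, so test code and CSS agree on where each layout tier begins (a sketch mirroring the media queries, not part of the app):

```python
def layout_for_width(width_px: int) -> str:
    """Map a viewport width to the layout tier defined by the CSS above."""
    if width_px <= 768:
        return "mobile"   # stacked columns, sticky submit button
    if width_px <= 1024:
        return "tablet"   # roomier padding, smaller submit button
    return "desktop"      # default multi-column layout
```
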
870
+ ---
871
+
872
+ ## Testing Strategy
873
+
874
+ ### Test Cases Matrix
875
+
876
+ | Feature | Test Scenario | Expected Result |
877
+ |----------|---------------|------------------|
878
+ | Tooltips | Hover over temp slider | Show "Lower = more focused..." |
879
+ | Copy Feedback | Click copy button | Green toast appears |
880
+ | Debug Toggle | Check/uncheck debug | Panel shows/hides |
881
+ | Word Count | Paste text | Count updates in real-time |
882
+ | Mode Selection | Select modes | Explanation cards appear |
883
+ | Progress Bar | Submit task | Stages light up sequentially |
884
+ | Presets | Click preset buttons | Parameters auto-set |
885
+ | Auto-Load | Search model | Auto-loads best quant |
886
+ | Accordion | Collapse/expand | Smooth animation |
887
+ | Validation | Exceed context | Show warning banner |
888
+ | Mobile | 375px viewport | Stacked layout, sticky button |
889
+
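For automation, the matrix can be mirrored as data so a single parametrized test covers every row (a sketch; the feature keys are illustrative, not existing identifiers):

```python
# (feature, scenario, expected) tuples mirroring the matrix above
TEST_MATRIX = [
    ("tooltips", "hover over temp slider", "focused hint shown"),
    ("copy_feedback", "click copy button", "green toast appears"),
    ("debug_toggle", "check/uncheck debug", "panel shows/hides"),
    ("word_count", "paste text", "count updates in real-time"),
    ("mode_selection", "select modes", "explanation cards appear"),
    ("progress_bar", "submit task", "stages light up sequentially"),
    ("presets", "click preset buttons", "parameters auto-set"),
    ("auto_load", "search model", "auto-loads best quant"),
    ("accordion", "collapse/expand", "smooth animation"),
    ("validation", "exceed context", "warning banner shown"),
    ("mobile", "375px viewport", "stacked layout, sticky button"),
]

def matrix_is_well_formed(matrix) -> bool:
    """Every row has a unique feature key and three non-empty fields."""
    features = [row[0] for row in matrix]
    return (len(set(features)) == len(features)
            and all(len(row) == 3 and all(row) for row in matrix))
```

With pytest, `@pytest.mark.parametrize("feature,scenario,expected", TEST_MATRIX)` would then generate one test per row.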

### Automated Testing

```python
# test_ui_features.py
import requests

BASE_URL = "http://localhost:7860"  # app must be running locally

def test_tooltips():
    """Verify tooltips are present in DOM"""
    response = requests.get(BASE_URL)
    assert "tooltip" in response.text.lower()

def test_copy_toast():
    """Verify toast CSS is present"""
    response = requests.get(BASE_URL)
    assert "slideIn" in response.text  # Animation keyframes

def test_progress_indicators():
    """Verify stage indicators present"""
    response = requests.get(BASE_URL)
    assert "stage-indicator" in response.text

def test_validation_warnings():
    """Verify validation CSS present"""
    response = requests.get(BASE_URL)
    assert "validation-warning" in response.text

if __name__ == "__main__":
    test_tooltips()
    test_copy_toast()
    test_progress_indicators()
    test_validation_warnings()
    print("✅ All UI tests passed")
```

### Manual Testing Checklist

**Phase 1 Tests:**
- [ ] Tooltips visible on hover
- [ ] Copy toast appears and disappears
- [ ] Debug panel hidden by default
- [ ] Word count updates in real-time

**Phase 2 Tests:**
- [ ] Mode explanations appear for both modes
- [ ] Progress bar shows stages correctly
- [ ] Presets apply correct values
- [ ] Auto-load workflow smooth

**Phase 3 Tests:**
- [ ] Advanced presets configure all 3 stages
- [ ] Accordions collapse/expand smoothly
- [ ] Validation warnings show appropriately
- [ ] Mobile layout stacks correctly


---

## Implementation Order

1. **Week 1:** Phase 1 (Quick Wins)
   - Day 1-2: Tooltips + Copy feedback
   - Day 3: Debug toggle + Word count

2. **Week 2:** Phase 2 (Medium Effort)
   - Day 1-2: Mode selection + Progress indicators
   - Day 3-4: Presets + Custom model UX

3. **Week 3:** Phase 3 (Larger Changes)
   - Day 1-3: Advanced mode redesign
   - Day 4-5: Collapsible sections + Validation
   - Day 6-7: Mobile improvements


---

## Rollback Plan

If issues arise, each change is isolated:

```bash
# Tag before each phase
git tag -a phase1-start -m "Before Phase 1 changes"
git tag -a phase2-start -m "Before Phase 2 changes"
git tag -a phase3-start -m "Before Phase 3 changes"

# Rollback if needed (note: --hard discards uncommitted work;
# prefer `git revert` if the branch is already shared)
git reset --hard phase1-start  # Roll back to Phase 1 start
git reset --hard phase2-start  # Roll back to Phase 2 start
```

---

## Success Metrics

- **User Engagement:** Time on page + button clicks tracked
- **Error Rate:** Failed submissions decreased by 50%
- **Feature Adoption:** Advanced Mode usage increased by 30%
- **User Satisfaction:** Survey after 2 weeks of deployment
- **Mobile Traffic:** Mobile session length + completion rate

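Tracking could start as simple as an in-process event counter flushed to logs; the sketch below is illustrative only (event names are hypothetical, and a real deployment would use an analytics backend):

```python
from collections import Counter

class UsageMetrics:
    """Minimal in-memory event counter for the metrics listed above."""

    def __init__(self):
        self.events = Counter()

    def record(self, event: str) -> None:
        self.events[event] += 1

    def rate(self, numerator: str, denominator: str) -> float:
        """Ratio of two event counts, e.g. failed submissions per submission."""
        total = self.events[denominator]
        return self.events[numerator] / total if total else 0.0

metrics = UsageMetrics()
metrics.record("submission")
metrics.record("submission")
metrics.record("submission_failed")
```
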

---

## Conclusion

This plan provides a structured approach to improving Tiny Scribe's UI/UX with:
- Clear phases and priorities
- Specific implementation details
- Comprehensive testing strategy
- Rollback procedures
- Success metrics

Ready to begin Phase 1 implementation when approved.