Spaces:
Running
Running
Improve UI/UX: Modern glassmorphism design, added Paste Text tab, and optimized visual hierarchy
Browse files- .opencode/plans/debug_and_custom_model.md +439 -0
- .opencode/plans/fix_custom_model_info.md +286 -0
- .opencode/plans/redesign_custom_gguf_loader.md +309 -0
- GEMINI.md +74 -0
- app.py +171 -300
.opencode/plans/debug_and_custom_model.md
ADDED
|
@@ -0,0 +1,439 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
# Implementation Plan: Debug System Prompt & Custom GGUF Loader
|
| 2 |
+
|
| 3 |
+
## Feature 1: Debug System Prompt Display
|
| 4 |
+
|
| 5 |
+
### Purpose
|
| 6 |
+
Show users the exact system prompt that will be sent to the LLM for transparency and debugging.
|
| 7 |
+
|
| 8 |
+
### Current State
|
| 9 |
+
The system prompt is built inline in `summarize_streaming()` (lines ~903-916) but never exposed to the UI.
|
| 10 |
+
|
| 11 |
+
### Implementation Plan
|
| 12 |
+
|
| 13 |
+
#### Step 1: Extract Prompt Builder Function
|
| 14 |
+
**Location**: Add new function in `app.py` around line 880
|
| 15 |
+
|
| 16 |
+
```python
|
| 17 |
+
def build_system_prompt(length: str, format_type: str, language: str, enable_reasoning: bool, supports_think_tags: bool) -> str:
|
| 18 |
+
"""Build the system prompt that will be sent to the LLM.
|
| 19 |
+
|
| 20 |
+
Args:
|
| 21 |
+
length: "tiny", "short", "medium", "long"
|
| 22 |
+
format_type: "bullets", "paragraph", "structured"
|
| 23 |
+
language: "en", "zh-TW"
|
| 24 |
+
enable_reasoning: Whether reasoning mode is enabled
|
| 25 |
+
supports_think_tags: Whether the model supports <think> tags
|
| 26 |
+
|
| 27 |
+
Returns:
|
| 28 |
+
The complete system prompt string
|
| 29 |
+
"""
|
| 30 |
+
# Length configurations (existing)
|
| 31 |
+
length_prompts = {
|
| 32 |
+
"tiny": f"""Provide a {format_type} summary in 2-3 sentences covering:
|
| 33 |
+
- Main topic and key points
|
| 34 |
+
- Most important finding or conclusion
|
| 35 |
+
- Practical takeaway""",
|
| 36 |
+
"short": f"""Provide a {format_type} summary in 3-5 sentences covering:
|
| 37 |
+
- Main topic and purpose
|
| 38 |
+
- 2-3 key points or findings
|
| 39 |
+
- Conclusion or recommendation""",
|
| 40 |
+
"medium": f"""Provide a {format_type} summary in 1-2 paragraphs covering:
|
| 41 |
+
- Main topic and context
|
| 42 |
+
- Key points with brief explanations
|
| 43 |
+
- Supporting details
|
| 44 |
+
- Conclusions and recommendations""",
|
| 45 |
+
"long": f"""Provide a comprehensive {format_type} summary in 3-4 paragraphs covering:
|
| 46 |
+
- Background and context
|
| 47 |
+
- All major points with detailed explanations
|
| 48 |
+
- Supporting evidence and examples
|
| 49 |
+
- Different perspectives if present
|
| 50 |
+
- Conclusions, implications, and actionable recommendations""",
|
| 51 |
+
}
|
| 52 |
+
|
| 53 |
+
base_prompt = length_prompts.get(length, length_prompts["medium"])
|
| 54 |
+
|
| 55 |
+
if language == "zh-TW":
|
| 56 |
+
if enable_reasoning and supports_think_tags:
|
| 57 |
+
system_content = f"You are a helpful assistant that summarizes transcripts. First think through the content in <thinking> tags, then provide the summary.\n\n{base_prompt}\n\nPlease respond in Traditional Chinese (Taiwan)."
|
| 58 |
+
else:
|
| 59 |
+
system_content = f"You are a helpful assistant that summarizes transcripts.\n\n{base_prompt}\n\nPlease respond in Traditional Chinese (Taiwan)."
|
| 60 |
+
else:
|
| 61 |
+
if enable_reasoning and supports_think_tags:
|
| 62 |
+
system_content = f"You are a helpful assistant that summarizes transcripts. First think through the content in <thinking> tags, then provide the summary.\n\n{base_prompt}"
|
| 63 |
+
else:
|
| 64 |
+
system_content = f"You are a helpful assistant that summarizes transcripts.\n\n{base_prompt}"
|
| 65 |
+
|
| 66 |
+
return system_content
|
| 67 |
+
```
|
| 68 |
+
|
| 69 |
+
#### Step 2: Refactor summarize_streaming()
|
| 70 |
+
**Location**: Lines ~903-916 in `app.py`
|
| 71 |
+
|
| 72 |
+
Replace inline prompt building with call to `build_system_prompt()`:
|
| 73 |
+
```python
|
| 74 |
+
# OLD CODE (to replace):
|
| 75 |
+
length_prompts = {...} # Remove this dict
|
| 76 |
+
# ... if language == "zh-TW": logic ...
|
| 77 |
+
|
| 78 |
+
# NEW CODE:
|
| 79 |
+
system_content = build_system_prompt(
|
| 80 |
+
length=length,
|
| 81 |
+
format_type=format_type,
|
| 82 |
+
language=language,
|
| 83 |
+
enable_reasoning=enable_reasoning,
|
| 84 |
+
supports_think_tags=supports_think_tags
|
| 85 |
+
)
|
| 86 |
+
```
|
| 87 |
+
|
| 88 |
+
#### Step 3: Add UI Component
|
| 89 |
+
**Location**: In the right column interface, after the summary output (around line 1370)
|
| 90 |
+
|
| 91 |
+
Add a collapsible accordion:
|
| 92 |
+
```python
|
| 93 |
+
with gr.Accordion("Debug: System Prompt", open=False):
|
| 94 |
+
system_prompt_debug = gr.Textbox(
|
| 95 |
+
label="System Prompt (Read-Only)",
|
| 96 |
+
lines=10,
|
| 97 |
+
max_lines=20,
|
| 98 |
+
interactive=False,
|
| 99 |
+
show_copy_button=True,
|
| 100 |
+
value="Click 'Generate Summary' to see the system prompt that will be used."
|
| 101 |
+
)
|
| 102 |
+
```
|
| 103 |
+
|
| 104 |
+
#### Step 4: Update Event Handlers
|
| 105 |
+
**Location**: In `generate_summary()` function
|
| 106 |
+
|
| 107 |
+
Pass the built system prompt to the output:
|
| 108 |
+
```python
|
| 109 |
+
def generate_summary(model_key, thread_config, custom_threads, transcript_text,
|
| 110 |
+
summary_length, output_format, language, enable_reasoning,
|
| 111 |
+
enable_streaming, progress=gr.Progress()):
|
| 112 |
+
# ... existing code ...
|
| 113 |
+
|
| 114 |
+
# Build system prompt for display
|
| 115 |
+
selected_model = AVAILABLE_MODELS[model_key]
|
| 116 |
+
supports_think_tags = selected_model.get("supports_toggle", False) or selected_model.get("supports_reasoning", False)
|
| 117 |
+
system_prompt_preview = build_system_prompt(
|
| 118 |
+
length=summary_length,
|
| 119 |
+
format_type=output_format,
|
| 120 |
+
language=language,
|
| 121 |
+
enable_reasoning=enable_reasoning,
|
| 122 |
+
supports_think_tags=supports_think_tags
|
| 123 |
+
)
|
| 124 |
+
|
| 125 |
+
# ... rest of summarization logic ...
|
| 126 |
+
|
| 127 |
+
# Return the system prompt along with other outputs
|
| 128 |
+
yield final_summary, thinking_text, json_output, system_prompt_preview, status_msg
|
| 129 |
+
```
|
| 130 |
+
|
| 131 |
+
#### Step 5: Update Gradio Outputs
|
| 132 |
+
**Location**: Line ~1435
|
| 133 |
+
|
| 134 |
+
Add `system_prompt_debug` to outputs list:
|
| 135 |
+
```python
|
| 136 |
+
outputs=[summary_output, thinking_output, json_output, system_prompt_debug, status_message]
|
| 137 |
+
```
|
| 138 |
+
|
| 139 |
+
---
|
| 140 |
+
|
| 141 |
+
## Feature 2: Custom GGUF Loader from HuggingFace
|
| 142 |
+
|
| 143 |
+
### Purpose
|
| 144 |
+
Allow users to load any GGUF model from HuggingFace, not just the predefined list.
|
| 145 |
+
|
| 146 |
+
### Implementation Plan
|
| 147 |
+
|
| 148 |
+
#### Step 1: Add Custom Model Option
|
| 149 |
+
**Location**: In AVAILABLE_MODELS dict (around line 120)
|
| 150 |
+
|
| 151 |
+
Add as the last entry:
|
| 152 |
+
```python
|
| 153 |
+
AVAILABLE_MODELS = {
|
| 154 |
+
# ... existing models ...
|
| 155 |
+
|
| 156 |
+
"custom_hf": {
|
| 157 |
+
"display": "Custom HF GGUF...",
|
| 158 |
+
"repo_id": None, # Will be provided by user
|
| 159 |
+
"filename": None, # Will be provided by user
|
| 160 |
+
"quantization": None,
|
| 161 |
+
"description": "Load any GGUF model from HuggingFace",
|
| 162 |
+
"size_mb": 0, # Unknown
|
| 163 |
+
"n_gpu_layers": 0,
|
| 164 |
+
"n_ctx": 8192,
|
| 165 |
+
"max_tokens": 4096,
|
| 166 |
+
"supports_reasoning": False,
|
| 167 |
+
"supports_toggle": False,
|
| 168 |
+
},
|
| 169 |
+
}
|
| 170 |
+
```
|
| 171 |
+
|
| 172 |
+
#### Step 2: Add Custom Model UI Components
|
| 173 |
+
**Location**: In the left column, after model dropdown (around line 1270)
|
| 174 |
+
|
| 175 |
+
```python
|
| 176 |
+
# Custom model inputs (hidden by default)
|
| 177 |
+
with gr.Group(visible=False) as custom_model_group:
|
| 178 |
+
gr.Markdown("### Custom HuggingFace Model")
|
| 179 |
+
custom_repo_id = gr.Textbox(
|
| 180 |
+
label="HuggingFace Repo ID",
|
| 181 |
+
placeholder="e.g., unsloth/DeepSeek-R1-Distill-Qwen-7B-GGUF",
|
| 182 |
+
info="The HuggingFace repository containing the GGUF file",
|
| 183 |
+
)
|
| 184 |
+
custom_filename = gr.Textbox(
|
| 185 |
+
label="GGUF Filename Pattern",
|
| 186 |
+
placeholder="e.g., *Q4_K_M.gguf or exact filename",
|
| 187 |
+
info="Use * as wildcard or provide exact filename",
|
| 188 |
+
)
|
| 189 |
+
custom_load_btn = gr.Button("Load Custom Model", variant="primary")
|
| 190 |
+
custom_error_message = gr.Textbox(
|
| 191 |
+
label="Status",
|
| 192 |
+
interactive=False,
|
| 193 |
+
visible=False,
|
| 194 |
+
)
|
| 195 |
+
custom_retry_btn = gr.Button("Retry", variant="secondary", visible=False)
|
| 196 |
+
```
|
| 197 |
+
|
| 198 |
+
#### Step 3: Add Visibility Toggle Handler
|
| 199 |
+
**Location**: Add new event handler around line 1490
|
| 200 |
+
|
| 201 |
+
```python
|
| 202 |
+
def update_custom_model_visibility(model_key):
|
| 203 |
+
"""Show/hide custom model inputs based on selection."""
|
| 204 |
+
is_custom = model_key == "custom_hf"
|
| 205 |
+
return gr.update(visible=is_custom)
|
| 206 |
+
|
| 207 |
+
# Add event handler
|
| 208 |
+
model_dropdown.change(
|
| 209 |
+
update_custom_model_visibility,
|
| 210 |
+
inputs=[model_dropdown],
|
| 211 |
+
outputs=[custom_model_group],
|
| 212 |
+
)
|
| 213 |
+
```
|
| 214 |
+
|
| 215 |
+
#### Step 4: Create Custom Model Loader Function
|
| 216 |
+
**Location**: Add new function around line 710
|
| 217 |
+
|
| 218 |
+
```python
|
| 219 |
+
def load_custom_model(repo_id: str, filename: str, cpu_only: bool = False) -> Tuple[Optional[Llama], str]:
|
| 220 |
+
"""Load a custom GGUF model from HuggingFace.
|
| 221 |
+
|
| 222 |
+
Args:
|
| 223 |
+
repo_id: HuggingFace repository ID
|
| 224 |
+
filename: Filename pattern or exact name
|
| 225 |
+
cpu_only: Whether to use CPU only
|
| 226 |
+
|
| 227 |
+
Returns:
|
| 228 |
+
Tuple of (model_instance, error_message)
|
| 229 |
+
If successful, error_message is empty string
|
| 230 |
+
If failed, model_instance is None
|
| 231 |
+
"""
|
| 232 |
+
if not repo_id or not filename:
|
| 233 |
+
return None, "❌ Error: Please provide both Repo ID and Filename"
|
| 234 |
+
|
| 235 |
+
# Validate repo_id format
|
| 236 |
+
if "/" not in repo_id:
|
| 237 |
+
return None, "❌ Error: Repo ID must be in format 'username/repo-name'"
|
| 238 |
+
|
| 239 |
+
try:
|
| 240 |
+
n_gpu_layers = 0 if cpu_only else -1
|
| 241 |
+
n_ctx = 8192 # Conservative default for custom models
|
| 242 |
+
n_batch = 512
|
| 243 |
+
|
| 244 |
+
llm = Llama.from_pretrained(
|
| 245 |
+
repo_id=repo_id,
|
| 246 |
+
filename=filename,
|
| 247 |
+
n_gpu_layers=n_gpu_layers,
|
| 248 |
+
n_ctx=n_ctx,
|
| 249 |
+
n_batch=n_batch,
|
| 250 |
+
verbose=False,
|
| 251 |
+
)
|
| 252 |
+
|
| 253 |
+
return llm, ""
|
| 254 |
+
|
| 255 |
+
except Exception as e:
|
| 256 |
+
error_msg = str(e)
|
| 257 |
+
if "not found" in error_msg.lower():
|
| 258 |
+
return None, f"❌ Error: Model or file not found. Check repo_id and filename.\nDetails: {error_msg}"
|
| 259 |
+
elif "permission" in error_msg.lower() or "access" in error_msg.lower():
|
| 260 |
+
return None, f"❌ Error: Cannot access model. It may be private or gated.\nDetails: {error_msg}"
|
| 261 |
+
else:
|
| 262 |
+
return None, f"❌ Error loading model: {error_msg}"
|
| 263 |
+
```
|
| 264 |
+
|
| 265 |
+
#### Step 5: Add Custom Model Loading Handler
|
| 266 |
+
**Location**: Add around line 1510
|
| 267 |
+
|
| 268 |
+
```python
|
| 269 |
+
def handle_custom_model_load(repo_id, filename, cpu_only):
|
| 270 |
+
"""Handle custom model loading with error display and retry option."""
|
| 271 |
+
llm, error = load_custom_model(repo_id, filename, cpu_only)
|
| 272 |
+
|
| 273 |
+
if llm is None:
|
| 274 |
+
# Show error and retry button
|
| 275 |
+
return (
|
| 276 |
+
gr.update(visible=True, value=error), # error_message
|
| 277 |
+
gr.update(visible=True), # retry_btn
|
| 278 |
+
None, # model_instance (store somewhere accessible)
|
| 279 |
+
)
|
| 280 |
+
else:
|
| 281 |
+
# Success - hide error, show success message
|
| 282 |
+
return (
|
| 283 |
+
gr.update(visible=True, value="✅ Model loaded successfully!"),
|
| 284 |
+
gr.update(visible=False), # retry_btn
|
| 285 |
+
llm, # Store model instance
|
| 286 |
+
)
|
| 287 |
+
|
| 288 |
+
custom_load_btn.click(
|
| 289 |
+
handle_custom_model_load,
|
| 290 |
+
inputs=[custom_repo_id, custom_filename, cpu_only_checkbox],
|
| 291 |
+
outputs=[custom_error_message, custom_retry_btn, model_state], # model_state is gr.State()
|
| 292 |
+
)
|
| 293 |
+
|
| 294 |
+
custom_retry_btn.click(
|
| 295 |
+
handle_custom_model_load,
|
| 296 |
+
inputs=[custom_repo_id, custom_filename, cpu_only_checkbox],
|
| 297 |
+
outputs=[custom_error_message, custom_retry_btn, model_state],
|
| 298 |
+
)
|
| 299 |
+
```
|
| 300 |
+
|
| 301 |
+
#### Step 6: Update Generate Summary for Custom Models
|
| 302 |
+
**Location**: In `generate_summary()` function
|
| 303 |
+
|
| 304 |
+
Modify to handle custom models:
|
| 305 |
+
```python
|
| 306 |
+
def generate_summary(model_key, thread_config, custom_threads, transcript_text,
|
| 307 |
+
summary_length, output_format, language, enable_reasoning,
|
| 308 |
+
enable_streaming, custom_repo_id=None, custom_filename=None,
|
| 309 |
+
progress=gr.Progress()):
|
| 310 |
+
|
| 311 |
+
if model_key == "custom_hf":
|
| 312 |
+
# Load custom model
|
| 313 |
+
llm, error = load_custom_model(custom_repo_id, custom_filename, cpu_only)
|
| 314 |
+
if llm is None:
|
| 315 |
+
yield "", "", "", "", error
|
| 316 |
+
return
|
| 317 |
+
else:
|
| 318 |
+
# Use predefined model
|
| 319 |
+
model_info = AVAILABLE_MODELS[model_key]
|
| 320 |
+
llm = load_model_from_config(model_info)
|
| 321 |
+
|
| 322 |
+
# ... rest of the function ...
|
| 323 |
+
```
|
| 324 |
+
|
| 325 |
+
#### Step 7: Update UI to Pass Custom Model Values
|
| 326 |
+
**Location**: Line ~1429
|
| 327 |
+
|
| 328 |
+
Add custom inputs to the generate summary call:
|
| 329 |
+
```python
|
| 330 |
+
generate_btn.click(
|
| 331 |
+
fn=generate_summary,
|
| 332 |
+
inputs=[
|
| 333 |
+
model_dropdown,
|
| 334 |
+
thread_config,
|
| 335 |
+
custom_n_threads,
|
| 336 |
+
transcript_input,
|
| 337 |
+
summary_length,
|
| 338 |
+
output_format,
|
| 339 |
+
language,
|
| 340 |
+
reasoning_checkbox,
|
| 341 |
+
streaming_toggle,
|
| 342 |
+
custom_repo_id, # NEW
|
| 343 |
+
custom_filename, # NEW
|
| 344 |
+
],
|
| 345 |
+
outputs=[...]
|
| 346 |
+
)
|
| 347 |
+
```
|
| 348 |
+
|
| 349 |
+
#### Step 8: Update generate_summary signature
|
| 350 |
+
**Location**: Function definition around line 870
|
| 351 |
+
|
| 352 |
+
Update function signature to accept custom model parameters:
|
| 353 |
+
```python
|
| 354 |
+
def generate_summary(
|
| 355 |
+
model_key: str,
|
| 356 |
+
thread_config: str,
|
| 357 |
+
custom_threads: int,
|
| 358 |
+
transcript_text: str,
|
| 359 |
+
summary_length: str,
|
| 360 |
+
output_format: str,
|
| 361 |
+
language: str,
|
| 362 |
+
enable_reasoning: bool,
|
| 363 |
+
enable_streaming: bool,
|
| 364 |
+
custom_repo_id: Optional[str] = None, # NEW
|
| 365 |
+
custom_filename: Optional[str] = None, # NEW
|
| 366 |
+
progress: gr.Progress = gr.Progress(),
|
| 367 |
+
) -> Generator:
|
| 368 |
+
```
|
| 369 |
+
|
| 370 |
+
#### Step 9: Update Model State Management
|
| 371 |
+
**Location**: Add near other state declarations (around line 1250)
|
| 372 |
+
|
| 373 |
+
```python
|
| 374 |
+
# Store loaded model to avoid reloading on each generation
|
| 375 |
+
model_state = gr.State(None)
|
| 376 |
+
```
|
| 377 |
+
|
| 378 |
+
---
|
| 379 |
+
|
| 380 |
+
## Implementation Order
|
| 381 |
+
|
| 382 |
+
1. **Feature 1 First** - Debug System Prompt (simpler, self-contained)
|
| 383 |
+
- Step 1: Create `build_system_prompt()` function
|
| 384 |
+
- Step 2: Refactor `summarize_streaming()` to use it
|
| 385 |
+
- Step 3: Add UI accordion component
|
| 386 |
+
- Step 4: Update event handlers and outputs
|
| 387 |
+
|
| 388 |
+
2. **Feature 2 Second** - Custom GGUF Loader (more complex)
|
| 389 |
+
- Step 1: Add "custom_hf" to AVAILABLE_MODELS
|
| 390 |
+
- Step 2: Add UI components for custom model inputs
|
| 391 |
+
- Step 3: Add visibility toggle handler
|
| 392 |
+
- Step 4: Create `load_custom_model()` function
|
| 393 |
+
- Step 5: Add load/retry handlers
|
| 394 |
+
- Step 6: Update generate_summary for custom models
|
| 395 |
+
- Step 7: Update UI inputs
|
| 396 |
+
- Step 8: Update function signature
|
| 397 |
+
- Step 9: Add model state management
|
| 398 |
+
|
| 399 |
+
---
|
| 400 |
+
|
| 401 |
+
## Testing Plan
|
| 402 |
+
|
| 403 |
+
### Feature 1 Tests
|
| 404 |
+
1. Select different models, verify system prompt updates correctly
|
| 405 |
+
2. Toggle reasoning mode, verify /think or /no_think appears
|
| 406 |
+
3. Change language, verify Traditional Chinese prompt appears
|
| 407 |
+
4. Change length/format, verify prompt content changes
|
| 408 |
+
5. Verify prompt is read-only and copyable
|
| 409 |
+
|
| 410 |
+
### Feature 2 Tests
|
| 411 |
+
1. Select "Custom HF GGUF...", verify inputs appear
|
| 412 |
+
2. Enter invalid repo_id, verify error message with retry button
|
| 413 |
+
3. Enter valid but non-existent model, verify error
|
| 414 |
+
4. Enter valid model with wrong filename, verify error
|
| 415 |
+
5. Enter valid model with correct filename, verify success
|
| 416 |
+
6. Click retry after error, verify it retries
|
| 417 |
+
7. Test fallback to predefined models still works
|
| 418 |
+
|
| 419 |
+
---
|
| 420 |
+
|
| 421 |
+
## Risk Mitigation
|
| 422 |
+
|
| 423 |
+
1. **Custom model loading failures**: Already handled with try/except and user-friendly error messages
|
| 424 |
+
2. **Memory issues with large custom models**: Use conservative defaults (n_ctx=8192, CPU-only for HF Spaces)
|
| 425 |
+
3. **UI clutter**: Custom model inputs hidden by default, only show when selected
|
| 426 |
+
4. **Breaking existing functionality**: Feature 1 is additive only, Feature 2 extends existing paths without changing them
|
| 427 |
+
|
| 428 |
+
---
|
| 429 |
+
|
| 430 |
+
## Files to Modify
|
| 431 |
+
|
| 432 |
+
- `/home/luigi/tiny-scribe/app.py` - Main implementation file
|
| 433 |
+
|
| 434 |
+
## Estimated Lines Changed
|
| 435 |
+
|
| 436 |
+
- Feature 1: ~50 lines added, ~20 lines modified
|
| 437 |
+
- Feature 2: ~150 lines added, ~30 lines modified
|
| 438 |
+
|
| 439 |
+
Total: ~250 lines of code changes
|
.opencode/plans/fix_custom_model_info.md
ADDED
|
@@ -0,0 +1,286 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
# Implementation Plan: Fix Model Information for Custom GGUF Models
|
| 2 |
+
|
| 3 |
+
## Overview
|
| 4 |
+
Fix the bug where "Model Information" remains empty when a custom GGUF model is loaded.
|
| 5 |
+
|
| 6 |
+
**Selected Approach**: Option A - Store metadata in Gradio State variables
|
| 7 |
+
**UI Style**: Cards/panels with dense layout
|
| 8 |
+
**Priority**: Bug fix first, then UI improvements
|
| 9 |
+
|
| 10 |
+
---
|
| 11 |
+
|
| 12 |
+
## Bug Analysis
|
| 13 |
+
|
| 14 |
+
### Problem
|
| 15 |
+
`get_model_info()` reads from static `AVAILABLE_MODELS["custom_hf"]` which has:
|
| 16 |
+
- `repo_id = None`
|
| 17 |
+
- `filename = None`
|
| 18 |
+
|
| 19 |
+
The actual values entered by the user are never stored or passed to the info display function.
|
| 20 |
+
|
| 21 |
+
### Solution
|
| 22 |
+
Store actual custom model metadata in dedicated Gradio State variables and pass them to `get_model_info()` when `model_key == "custom_hf"`.
|
| 23 |
+
|
| 24 |
+
---
|
| 25 |
+
|
| 26 |
+
## Implementation Steps
|
| 27 |
+
|
| 28 |
+
### Step 1: Add Custom Model Metadata State
|
| 29 |
+
|
| 30 |
+
**Location**: UI section (~line 1730), alongside other states
|
| 31 |
+
|
| 32 |
+
```python
|
| 33 |
+
# Custom model metadata state (stores actual repo_id and filename when loaded)
|
| 34 |
+
custom_model_metadata = gr.State({
|
| 35 |
+
"repo_id": None,
|
| 36 |
+
"filename": None,
|
| 37 |
+
"size_mb": 0,
|
| 38 |
+
})
|
| 39 |
+
```
|
| 40 |
+
|
| 41 |
+
### Step 2: Modify `get_model_info()` Function
|
| 42 |
+
|
| 43 |
+
**Location**: ~line 904
|
| 44 |
+
|
| 45 |
+
**Current signature**:
|
| 46 |
+
```python
|
| 47 |
+
def get_model_info(model_key: str, n_threads: int = 2):
|
| 48 |
+
```
|
| 49 |
+
|
| 50 |
+
**New signature**:
|
| 51 |
+
```python
|
| 52 |
+
def get_model_info(model_key: str, n_threads: int = 2, custom_metadata: dict = None):
|
| 53 |
+
```
|
| 54 |
+
|
| 55 |
+
**Logic change** (inside function):
|
| 56 |
+
```python
|
| 57 |
+
if model_key == "custom_hf" and custom_metadata:
|
| 58 |
+
# Use actual metadata from loaded custom model
|
| 59 |
+
repo_id = custom_metadata.get("repo_id", "Not loaded")
|
| 60 |
+
filename = custom_metadata.get("filename", "Not selected")
|
| 61 |
+
size_mb = custom_metadata.get("size_mb", 0)
|
| 62 |
+
|
| 63 |
+
# Parse quantization from filename
|
| 64 |
+
quant = parse_quantization(filename) if filename else "Unknown"
|
| 65 |
+
|
| 66 |
+
info_text = (
|
| 67 |
+
f"## 🤖 Custom HF GGUF Model\n\n"
|
| 68 |
+
f"### 📋 Model Metadata\n"
|
| 69 |
+
f"| Property | Value |\n"
|
| 70 |
+
f"|----------|-------|\n"
|
| 71 |
+
f"| **Repository** | `{repo_id}` |\n"
|
| 72 |
+
f"| **GGUF File** | `{filename}` |\n"
|
| 73 |
+
f"| **Quantization** | `{quant}` |\n"
|
| 74 |
+
f"| **File Size** | {size_mb:.1f} MB |\n"
|
| 75 |
+
f"| **Context** | 8,192 tokens |\n"
|
| 76 |
+
f"| **Threads** | {n_threads} |\n\n"
|
| 77 |
+
f"⚠️ Note: Custom models use conservative defaults (CPU-only, smaller context)."
|
| 78 |
+
)
|
| 79 |
+
else:
|
| 80 |
+
# Use existing logic for predefined models
|
| 81 |
+
m = AVAILABLE_MODELS[model_key]
|
| 82 |
+
# ... existing code ...
|
| 83 |
+
```
|
| 84 |
+
|
| 85 |
+
### Step 3: Update `load_custom_model_selected()` to Store Metadata
|
| 86 |
+
|
| 87 |
+
**Location**: Event handler section (~line 1927)
|
| 88 |
+
|
| 89 |
+
**Current function** (simplified):
|
| 90 |
+
```python
|
| 91 |
+
def load_custom_model_selected(repo_id, selected_file_display, files_data):
|
| 92 |
+
filename = selected_file_display.split(" | ")[0].replace("📄 ", "").strip()
|
| 93 |
+
llm, load_msg = load_custom_model_from_hf(repo_id, filename, n_threads)
|
| 94 |
+
if llm is None:
|
| 95 |
+
return gr.update(visible=True, value=error), gr.update(visible=True), None
|
| 96 |
+
else:
|
| 97 |
+
return gr.update(visible=True, value=success), gr.update(visible=False), llm
|
| 98 |
+
```
|
| 99 |
+
|
| 100 |
+
**New function**:
|
| 101 |
+
```python
|
| 102 |
+
def load_custom_model_selected(repo_id, selected_file_display, files_data):
|
| 103 |
+
filename = selected_file_display.split(" | ")[0].replace("📄 ", "").strip()
|
| 104 |
+
|
| 105 |
+
# Extract size from files_data
|
| 106 |
+
size_mb = 0
|
| 107 |
+
for f in files_data:
|
| 108 |
+
if f["name"] == filename:
|
| 109 |
+
size_mb = f.get("size_mb", 0)
|
| 110 |
+
break
|
| 111 |
+
|
| 112 |
+
llm, load_msg = load_custom_model_from_hf(repo_id, filename, n_threads)
|
| 113 |
+
|
| 114 |
+
if llm is None:
|
| 115 |
+
return (
|
| 116 |
+
gr.update(visible=True, value=f"❌ {load_msg}"),
|
| 117 |
+
gr.update(visible=True),
|
| 118 |
+
None,
|
| 119 |
+
{"repo_id": None, "filename": None, "size_mb": 0}, # Clear metadata
|
| 120 |
+
)
|
| 121 |
+
else:
|
| 122 |
+
# Create metadata dict
|
| 123 |
+
metadata = {
|
| 124 |
+
"repo_id": repo_id,
|
| 125 |
+
"filename": filename,
|
| 126 |
+
"size_mb": size_mb,
|
| 127 |
+
}
|
| 128 |
+
return (
|
| 129 |
+
gr.update(visible=True, value=f"✅ {load_msg}"),
|
| 130 |
+
gr.update(visible=False),
|
| 131 |
+
llm,
|
| 132 |
+
metadata, # Return metadata to store in state
|
| 133 |
+
)
|
| 134 |
+
```
|
| 135 |
+
|
| 136 |
+
**Update the click event handler**:
|
| 137 |
+
```python
|
| 138 |
+
load_btn.click(
|
| 139 |
+
fn=load_custom_model_selected,
|
| 140 |
+
inputs=[model_search_input, custom_file_dropdown, custom_repo_files],
|
| 141 |
+
outputs=[custom_status, retry_btn, custom_model_state, custom_model_metadata], # Added metadata
|
| 142 |
+
)
|
| 143 |
+
|
| 144 |
+
retry_btn.click(
|
| 145 |
+
fn=load_custom_model_selected,
|
| 146 |
+
inputs=[model_search_input, custom_file_dropdown, custom_repo_files],
|
| 147 |
+
outputs=[custom_status, retry_btn, custom_model_state, custom_model_metadata], # Added metadata
|
| 148 |
+
)
|
| 149 |
+
```
|
| 150 |
+
|
| 151 |
+
### Step 4: Update Model Info Display Event Handler
|
| 152 |
+
|
| 153 |
+
**Location**: ~line 1711, `update_settings_on_model_change()` function
|
| 154 |
+
|
| 155 |
+
**Current**:
|
| 156 |
+
```python
|
| 157 |
+
def update_settings_on_model_change(model_key, n_threads):
|
| 158 |
+
info, _, _, _ = get_model_info(model_key, n_threads=n_threads)
|
| 159 |
+
# ... return info ...
|
| 160 |
+
```
|
| 161 |
+
|
| 162 |
+
**New**:
|
| 163 |
+
```python
|
| 164 |
+
def update_settings_on_model_change(model_key, n_threads, custom_metadata):
|
| 165 |
+
info, _, _, _ = get_model_info(model_key, n_threads=n_threads, custom_metadata=custom_metadata)
|
| 166 |
+
# ... return info ...
|
| 167 |
+
```
|
| 168 |
+
|
| 169 |
+
**Update the event handler**:
|
| 170 |
+
```python
|
| 171 |
+
model_dropdown.change(
|
| 172 |
+
fn=update_settings_on_model_change,
|
| 173 |
+
inputs=[model_dropdown, n_threads_display, custom_model_metadata], # Added metadata
|
| 174 |
+
outputs=[info_output, max_tokens, reasoning_checkbox, n_ctx_display,
|
| 175 |
+
thinking_accordion, thinking_output, enable_reasoning],
|
| 176 |
+
)
|
| 177 |
+
```
|
| 178 |
+
|
| 179 |
+
### Step 5: Update Submit Button Handler
|
| 180 |
+
|
| 181 |
+
**Location**: ~line 2010
|
| 182 |
+
|
| 183 |
+
**Update inputs to include custom_model_metadata**:
|
| 184 |
+
```python
|
| 185 |
+
submit_btn.click(
|
| 186 |
+
fn=summarize_streaming,
|
| 187 |
+
inputs=[file_input, model_dropdown, enable_reasoning, max_tokens, temperature_slider,
|
| 188 |
+
top_p, top_k, language_selector, thread_config_dropdown, custom_threads_slider,
|
| 189 |
+
custom_model_state, custom_model_metadata], # Added metadata
|
| 190 |
+
outputs=[thinking_output, summary_output, info_output, metrics_state, system_prompt_debug],
|
| 191 |
+
show_progress="full"
|
| 192 |
+
)
|
| 193 |
+
```
|
| 194 |
+
|
| 195 |
+
### Step 6: Update `summarize_streaming()` Function
|
| 196 |
+
|
| 197 |
+
**Location**: ~line 1080
|
| 198 |
+
|
| 199 |
+
**Update function signature**:
|
| 200 |
+
```python
|
| 201 |
+
def summarize_streaming(
|
| 202 |
+
file_obj,
|
| 203 |
+
model_key: str,
|
| 204 |
+
enable_reasoning: bool = True,
|
| 205 |
+
max_tokens: int = 2048,
|
| 206 |
+
temperature: float = 0.6,
|
| 207 |
+
top_p: float = None,
|
| 208 |
+
top_k: int = None,
|
| 209 |
+
output_language: str = "en",
|
| 210 |
+
thread_config: str = "free",
|
| 211 |
+
custom_threads: int = 4,
|
| 212 |
+
custom_model_state: Any = None,
|
| 213 |
+
custom_model_metadata: dict = None, # NEW parameter
|
| 214 |
+
) -> Generator[Tuple[str, str, str, dict, str], None, None]:
|
| 215 |
+
```
|
| 216 |
+
|
| 217 |
+
**Update model info generation**:
|
| 218 |
+
```python
|
| 219 |
+
# Get base model info with current thread configuration
|
| 220 |
+
info_text, _, _, _ = get_model_info(model_key, n_threads=n_threads, custom_metadata=custom_model_metadata)
|
| 221 |
+
```
|
| 222 |
+
|
| 223 |
+
---
|
| 224 |
+
|
| 225 |
+
## Files to Modify
|
| 226 |
+
|
| 227 |
+
1. **app.py** - Main changes:
|
| 228 |
+
- Line ~1730: Add `custom_model_metadata` state
|
| 229 |
+
- Line ~904: Modify `get_model_info()` signature and logic
|
| 230 |
+
- Line ~1927: Update `load_custom_model_selected()` to return metadata
|
| 231 |
+
- Line ~1711: Update `update_settings_on_model_change()` to accept metadata
|
| 232 |
+
- Line ~1080: Update `summarize_streaming()` signature
|
| 233 |
+
- Line ~2010: Update submit button event handler inputs
|
| 234 |
+
|
| 235 |
+
---
|
| 236 |
+
|
| 237 |
+
## Testing Plan
|
| 238 |
+
|
| 239 |
+
1. Select "🔧 Custom HF GGUF..." from model dropdown
|
| 240 |
+
2. Type "llama" in search box and select a model
|
| 241 |
+
3. Verify file dropdown auto-populates
|
| 242 |
+
4. Select a GGUF file
|
| 243 |
+
5. Click "Load Selected Model"
|
| 244 |
+
6. **Verify Model Information now shows**:
|
| 245 |
+
- Repository: actual repo ID
|
| 246 |
+
- GGUF File: actual filename
|
| 247 |
+
- Quantization: parsed from filename
|
| 248 |
+
- File Size: actual size
|
| 249 |
+
- Context: 8192 tokens
|
| 250 |
+
- Note about conservative defaults
|
| 251 |
+
7. Generate a summary
|
| 252 |
+
8. Verify model info remains correct during and after generation
|
| 253 |
+
9. Switch to a different predefined model
|
| 254 |
+
10. Verify model info updates correctly
|
| 255 |
+
11. Switch back to custom model
|
| 256 |
+
12. Verify it shows "Not loaded" state until new custom model is loaded
|
| 257 |
+
|
| 258 |
+
---
|
| 259 |
+
|
| 260 |
+
## Lines to Modify
|
| 261 |
+
|
| 262 |
+
| Function/Component | Line Range | Changes |
|
| 263 |
+
|-------------------|------------|---------|
|
| 264 |
+
| State declarations | ~1730 | Add `custom_model_metadata` |
|
| 265 |
+
| `get_model_info()` | ~904-947 | Add `custom_metadata` param, handle custom_hf |
|
| 266 |
+
| `load_custom_model_selected()` | ~1927-1960 | Return metadata dict |
|
| 267 |
+
| Load button click | ~1970 | Add `custom_model_metadata` to outputs |
|
| 268 |
+
| `update_settings_on_model_change()` | ~1711-1720 | Accept metadata param |
|
| 269 |
+
| Model dropdown change | ~1721 | Add `custom_model_metadata` to inputs |
|
| 270 |
+
| `summarize_streaming()` signature | ~1080 | Add `custom_model_metadata` param |
|
| 271 |
+
| Submit button click | ~2010 | Add `custom_model_metadata` to inputs |
|
| 272 |
+
|
| 273 |
+
---
|
| 274 |
+
|
| 275 |
+
## Expected Result
|
| 276 |
+
|
| 277 |
+
After the fix, when a custom GGUF model is loaded:
|
| 278 |
+
- ✅ Model Information displays actual repo_id and filename
|
| 279 |
+
- ✅ Quantization level is parsed and shown
|
| 280 |
+
- ✅ File size is displayed
|
| 281 |
+
- ✅ Context window shows correct value
|
| 282 |
+
- ✅ Information updates correctly when switching models
|
| 283 |
+
|
| 284 |
+
---
|
| 285 |
+
|
| 286 |
+
Ready to implement the bug fix? Say **"implement bug fix"** and I'll proceed!
|
.opencode/plans/redesign_custom_gguf_loader.md
ADDED
|
@@ -0,0 +1,309 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
# Implementation Plan: Redesign Custom HF GGUF Loader
|
| 2 |
+
|
| 3 |
+
## Overview
|
| 4 |
+
Redesign the custom GGUF model query to use the native `gradio_huggingfacehub_search` component, matching the UX of gguf-my-repo space.
|
| 5 |
+
|
| 6 |
+
**Selected Approach**: Option A1 + Flow 1
|
| 7 |
+
- Use native HF search component
|
| 8 |
+
- Search ALL models on HF Hub
|
| 9 |
+
- Auto-discover GGUF files after selection
|
| 10 |
+
|
| 11 |
+
## Current Issues
|
| 12 |
+
|
| 13 |
+
### Problems with Current Implementation
|
| 14 |
+
1. **Complexity**: Manual textbox + search results dropdown + manual file discovery
|
| 15 |
+
2. **UX friction**: Too many steps, confusing flow
|
| 16 |
+
3. **Maintenance burden**: Custom search logic, event handlers, caching
|
| 17 |
+
4. **Performance**: Multiple API calls without optimization
|
| 18 |
+
|
| 19 |
+
### What Users Want
|
| 20 |
+
1. **Simple search**: Type model name, see suggestions
|
| 21 |
+
2. **Auto-discovery**: Select model โ automatically see available GGUF files
|
| 22 |
+
3. **Quick precision selection**: Choose from discovered files
|
| 23 |
+
4. **Load and go**: One click to load selected GGUF
|
| 24 |
+
|
| 25 |
+
---
|
| 26 |
+
|
| 27 |
+
## New Design
|
| 28 |
+
|
| 29 |
+
### User Flow (Flow 1)
|
| 30 |
+
|
| 31 |
+
```
|
| 32 |
+
1. Select "🔧 Custom HF GGUF..." from model dropdown
|
| 33 |
+
โ
|
| 34 |
+
2. Type model name in HuggingfaceHubSearch component
|
| 35 |
+
โ
|
| 36 |
+
3. See real-time search suggestions from ALL HF models
|
| 37 |
+
โ
|
| 38 |
+
4. Select a model from suggestions
|
| 39 |
+
โ
|
| 40 |
+
5. Auto-trigger: Discover all GGUF files in that repo
|
| 41 |
+
โ
|
| 42 |
+
6. See GGUF files dropdown populated (alphabetically sorted)
|
| 43 |
+
โ
|
| 44 |
+
7. Select desired precision/quantization
|
| 45 |
+
โ
|
| 46 |
+
8. Click "⬇️ Load Selected Model"
|
| 47 |
+
โ
|
| 48 |
+
9. Model loads, ready to use!
|
| 49 |
+
```
|
| 50 |
+
|
| 51 |
+
### UI Components
|
| 52 |
+
|
| 53 |
+
```
|
| 54 |
+
[Model Dropdown: "๐ง Custom HF GGUF..." selected]
|
| 55 |
+
โ
|
| 56 |
+
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
|
| 57 |
+
โ ๐ Search HuggingFace Models โ
|
| 58 |
+
โ [HuggingfaceHubSearch Component] โ
|
| 59 |
+
โ Type to search all HF models... โ
|
| 60 |
+
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
|
| 61 |
+
โ (after selection)
|
| 62 |
+
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
|
| 63 |
+
โ ๐ฆ Available GGUF Files โ
|
| 64 |
+
โ [Dropdown with quant options] โ
|
| 65 |
+
โ e.g., model-Q4_K_M.gguf (4.2GB) โ
|
| 66 |
+
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
|
| 67 |
+
โ
|
| 68 |
+
[โฌ๏ธ Load Selected Model] [Status: Ready]
|
| 69 |
+
```
|
| 70 |
+
|
| 71 |
+
---
|
| 72 |
+
|
| 73 |
+
## Technical Implementation
|
| 74 |
+
|
| 75 |
+
### 1. Dependencies
|
| 76 |
+
|
| 77 |
+
**requirements.txt additions:**
|
| 78 |
+
```
|
| 79 |
+
gradio-huggingfacehub-search>=0.1.0
|
| 80 |
+
```
|
| 81 |
+
|
| 82 |
+
**Import:**
|
| 83 |
+
```python
|
| 84 |
+
from gradio_huggingfacehub_search import HuggingfaceHubSearch
|
| 85 |
+
```
|
| 86 |
+
|
| 87 |
+
### 2. Remove Old Components
|
| 88 |
+
|
| 89 |
+
**Remove:**
|
| 90 |
+
1. `custom_repo_id` textbox
|
| 91 |
+
2. `model_search_results` dropdown
|
| 92 |
+
3. `discover_btn` button (or repurpose)
|
| 93 |
+
4. All custom search functions:
|
| 94 |
+
- `get_popular_gguf_models()`
|
| 95 |
+
- `search_gguf_models()`
|
| 96 |
+
- `search_models_dynamic()`
|
| 97 |
+
- `on_model_selected_from_search()`
|
| 98 |
+
|
| 99 |
+
**Keep:**
|
| 100 |
+
- `custom_file_dropdown` - for selecting GGUF precision
|
| 101 |
+
- `custom_repo_files` state - for storing file metadata
|
| 102 |
+
- `custom_model_state` state - for loaded model
|
| 103 |
+
- `load_btn` and `retry_btn` - for loading model
|
| 104 |
+
- `custom_status` - for status messages
|
| 105 |
+
|
| 106 |
+
### 3. Add New Component
|
| 107 |
+
|
| 108 |
+
**Location:** In custom_model_group (replacing old textbox)
|
| 109 |
+
|
| 110 |
+
```python
|
| 111 |
+
# NEW: Native HF Hub Search Component
|
| 112 |
+
model_search_input = HuggingfaceHubSearch(
|
| 113 |
+
label="🔍 Search HuggingFace Models",
|
| 114 |
+
placeholder="Type model name to search (e.g., 'llama', 'qwen')",
|
| 115 |
+
search_type="model",
|
| 116 |
+
# Optional: Add filters
|
| 117 |
+
# filter="gguf" # if component supports filtering
|
| 118 |
+
)
|
| 119 |
+
|
| 120 |
+
# Keep file dropdown (but update label)
|
| 121 |
+
custom_file_dropdown = gr.Dropdown(
|
| 122 |
+
label="📦 Select GGUF File (Precision)",
|
| 123 |
+
choices=[],
|
| 124 |
+
value=None,
|
| 125 |
+
info="Available GGUF files will appear after selecting a model",
|
| 126 |
+
interactive=True,
|
| 127 |
+
)
|
| 128 |
+
```
|
| 129 |
+
|
| 130 |
+
### 4. New Event Handler
|
| 131 |
+
|
| 132 |
+
**Flow:** model_search_input.change โ auto-discover files
|
| 133 |
+
|
| 134 |
+
```python
|
| 135 |
+
def on_model_selected(repo_id):
|
| 136 |
+
"""Handle model selection from HuggingfaceHubSearch.
|
| 137 |
+
|
| 138 |
+
Automatically discovers GGUF files in the selected repo.
|
| 139 |
+
"""
|
| 140 |
+
if not repo_id:
|
| 141 |
+
return (
|
| 142 |
+
gr.update(choices=[], value=None),
|
| 143 |
+
[],
|
| 144 |
+
gr.update(visible=False),
|
| 145 |
+
)
|
| 146 |
+
|
| 147 |
+
# Show searching status
|
| 148 |
+
yield (
|
| 149 |
+
gr.update(choices=["Searching for GGUF files..."], value=None, interactive=False),
|
| 150 |
+
[],
|
| 151 |
+
gr.update(visible=True, value="🔍 Discovering GGUF files..."),
|
| 152 |
+
)
|
| 153 |
+
|
| 154 |
+
# Discover files
|
| 155 |
+
files, error = list_repo_gguf_files(repo_id)
|
| 156 |
+
|
| 157 |
+
if error:
|
| 158 |
+
yield (
|
| 159 |
+
gr.update(choices=[], value=None, interactive=True),
|
| 160 |
+
[],
|
| 161 |
+
gr.update(visible=True, value=f"โ {error}"),
|
| 162 |
+
)
|
| 163 |
+
elif not files:
|
| 164 |
+
yield (
|
| 165 |
+
gr.update(choices=[], value=None, interactive=True),
|
| 166 |
+
[],
|
| 167 |
+
gr.update(visible=True, value="โ No GGUF files in this repository"),
|
| 168 |
+
)
|
| 169 |
+
else:
|
| 170 |
+
# Format and show files
|
| 171 |
+
choices = [format_file_choice(f) for f in files]
|
| 172 |
+
yield (
|
| 173 |
+
gr.update(choices=choices, value=choices[0] if choices else None, interactive=True),
|
| 174 |
+
files,
|
| 175 |
+
gr.update(visible=True, value=f"✅ Found {len(files)} GGUF files! Select one and click 'Load Model'"),
|
| 176 |
+
)
|
| 177 |
+
|
| 178 |
+
# Connect event handler
|
| 179 |
+
model_search_input.change(
|
| 180 |
+
fn=on_model_selected,
|
| 181 |
+
inputs=[model_search_input],
|
| 182 |
+
outputs=[custom_file_dropdown, custom_repo_files, custom_status],
|
| 183 |
+
)
|
| 184 |
+
```
|
| 185 |
+
|
| 186 |
+
### 5. Update Load Function
|
| 187 |
+
|
| 188 |
+
**Current:** `load_custom_model_selected()` extracts filename from display string
|
| 189 |
+
|
| 190 |
+
**Keep as-is** - already works correctly:
|
| 191 |
+
```python
|
| 192 |
+
def load_custom_model_selected(repo_id, selected_file_display, files_data):
|
| 193 |
+
"""Load the selected custom model."""
|
| 194 |
+
# Extract filename from display string
|
| 195 |
+
filename = selected_file_display.split(" | ")[0].replace("๐ ", "").strip()
|
| 196 |
+
# ... rest of loading logic
|
| 197 |
+
```
|
| 198 |
+
|
| 199 |
+
### 6. Simplified UI Layout
|
| 200 |
+
|
| 201 |
+
```python
|
| 202 |
+
with gr.Group(visible=False) as custom_model_group:
|
| 203 |
+
gr.HTML('<div class="section-header" style="margin-top: 20px;"><span class="section-icon">๐ง</span> Load Custom GGUF Model</div>')
|
| 204 |
+
|
| 205 |
+
# Step 1: Search models
|
| 206 |
+
model_search_input = HuggingfaceHubSearch(...)
|
| 207 |
+
|
| 208 |
+
# Step 2: Select GGUF file (auto-populated)
|
| 209 |
+
custom_file_dropdown = gr.Dropdown(...)
|
| 210 |
+
|
| 211 |
+
# Step 3: Load button
|
| 212 |
+
with gr.Row():
|
| 213 |
+
load_btn = gr.Button("โฌ๏ธ Load Selected Model", variant="primary")
|
| 214 |
+
retry_btn = gr.Button("๐ Retry", variant="secondary", visible=False)
|
| 215 |
+
|
| 216 |
+
# Status
|
| 217 |
+
custom_status = gr.Textbox(label="Status", interactive=False, visible=False)
|
| 218 |
+
|
| 219 |
+
# Hidden states
|
| 220 |
+
custom_repo_files = gr.State([])
|
| 221 |
+
custom_model_state = gr.State(None)
|
| 222 |
+
```
|
| 223 |
+
|
| 224 |
+
---
|
| 225 |
+
|
| 226 |
+
## Files to Modify
|
| 227 |
+
|
| 228 |
+
### 1. requirements.txt
|
| 229 |
+
Add dependency:
|
| 230 |
+
```
|
| 231 |
+
gradio-huggingfacehub-search>=0.1.0
|
| 232 |
+
```
|
| 233 |
+
|
| 234 |
+
### 2. app.py
|
| 235 |
+
|
| 236 |
+
**Import section (~line 1-15):**
|
| 237 |
+
```python
|
| 238 |
+
from gradio_huggingfacehub_search import HuggingfaceHubSearch
|
| 239 |
+
```
|
| 240 |
+
|
| 241 |
+
**Remove functions (~lines 30-200):**
|
| 242 |
+
- Remove `get_popular_gguf_models()`
|
| 243 |
+
- Remove `search_gguf_models()`
|
| 244 |
+
- Remove `POPULAR_GGUF_MODELS` cache
|
| 245 |
+
- Remove `search_models_dynamic()`
|
| 246 |
+
- Remove `on_model_selected_from_search()`
|
| 247 |
+
- Keep `list_repo_gguf_files()` - still needed
|
| 248 |
+
- Keep `parse_quantization()` - still needed
|
| 249 |
+
- Keep `format_file_choice()` - still needed
|
| 250 |
+
- Keep `load_custom_model_from_hf()` - still needed
|
| 251 |
+
|
| 252 |
+
**UI section (~lines 1590-1610):**
|
| 253 |
+
Replace old components with new design
|
| 254 |
+
|
| 255 |
+
**Event handlers (~lines 1950-2050):**
|
| 256 |
+
Replace search event handlers with simplified version
|
| 257 |
+
|
| 258 |
+
---
|
| 259 |
+
|
| 260 |
+
## Migration Checklist
|
| 261 |
+
|
| 262 |
+
- [ ] Add `gradio-huggingfacehub-search` to requirements.txt
|
| 263 |
+
- [ ] Add import statement
|
| 264 |
+
- [ ] Remove unused search functions (3 functions)
|
| 265 |
+
- [ ] Remove unused cache variables
|
| 266 |
+
- [ ] Replace `custom_repo_id` + `model_search_results` with `HuggingfaceHubSearch`
|
| 267 |
+
- [ ] Update custom model UI group
|
| 268 |
+
- [ ] Simplify event handlers
|
| 269 |
+
- [ ] Test search and file discovery flow
|
| 270 |
+
- [ ] Verify model loading still works
|
| 271 |
+
- [ ] Update documentation/comments
|
| 272 |
+
|
| 273 |
+
---
|
| 274 |
+
|
| 275 |
+
## Benefits of This Redesign
|
| 276 |
+
|
| 277 |
+
1. **Better UX**: Native HF search component, professional look
|
| 278 |
+
2. **Less code**: Remove ~150 lines of custom search logic
|
| 279 |
+
3. **Better performance**: Component handles debouncing and caching
|
| 280 |
+
4. **Easier maintenance**: Community-maintained search component
|
| 281 |
+
5. **More reliable**: Uses official HF component
|
| 282 |
+
6. **Simpler flow**: One search box, auto-discovery, select and load
|
| 283 |
+
|
| 284 |
+
---
|
| 285 |
+
|
| 286 |
+
## Testing Plan
|
| 287 |
+
|
| 288 |
+
1. Select "🔧 Custom HF GGUF..." from model dropdown
|
| 289 |
+
2. Type "llama" in search box
|
| 290 |
+
3. Verify suggestions appear (any HF models with "llama" in name)
|
| 291 |
+
4. Select a model (e.g., "meta-llama/Llama-2-7b-hf")
|
| 292 |
+
5. Verify GGUF files auto-discover (if any exist)
|
| 293 |
+
6. Select a GGUF file
|
| 294 |
+
7. Click "Load Selected Model"
|
| 295 |
+
8. Verify model loads successfully
|
| 296 |
+
9. Test with models that have no GGUF files (should show error)
|
| 297 |
+
10. Test error handling for invalid repo IDs
|
| 298 |
+
|
| 299 |
+
---
|
| 300 |
+
|
| 301 |
+
## Questions Before Implementation
|
| 302 |
+
|
| 303 |
+
1. **Requirements check**: Should I add `gradio-huggingfacehub-search` to requirements.txt now, or will you handle dependencies?
|
| 304 |
+
|
| 305 |
+
2. **Component customization**: The HuggingfaceHubSearch component may allow custom filters. Should we try to filter for models that might have GGUF files, or search all models?
|
| 306 |
+
|
| 307 |
+
3. **Manual discovery button**: Keep the "Discover Files" button as a backup option, or remove it since search is now automatic?
|
| 308 |
+
|
| 309 |
+
4. **Ready to implement?** Say "implement the redesign" and I'll proceed with the refactoring.
|
GEMINI.md
ADDED
|
@@ -0,0 +1,74 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
# Tiny Scribe - Project Context
|
| 2 |
+
|
| 3 |
+
## Project Overview
|
| 4 |
+
**Tiny Scribe** is a lightweight, local LLM-powered transcript summarization tool. It is designed to run efficiently on standard hardware (including free CPU tiers on HuggingFace Spaces) using GGUF quantized models.
|
| 5 |
+
|
| 6 |
+
The project features a web interface (Gradio) and a CLI tool, supporting over 24 models ranging from 100M to 30B parameters. It includes specialized features like live streaming, reasoning mode (thinking) for supported models, and dual-language output (English/Traditional Chinese).
|
| 7 |
+
|
| 8 |
+
## Tech Stack
|
| 9 |
+
* **Language:** Python 3.10+
|
| 10 |
+
* **UI Framework:** Gradio (Web), `argparse` (CLI)
|
| 11 |
+
* **Inference Engine:** `llama-cpp-python` (Python bindings for `llama.cpp`)
|
| 12 |
+
* **Model Format:** GGUF (Quantized)
|
| 13 |
+
* **Containerization:** Docker (optimized for HuggingFace Spaces)
|
| 14 |
+
* **Utilities:** `opencc` (Chinese conversion), `huggingface_hub`
|
| 15 |
+
|
| 16 |
+
## Key Files & Directories
|
| 17 |
+
* `app.py`: The main entry point for the Gradio web application. Contains the UI layout, model loading logic, and generation pipeline.
|
| 18 |
+
* `summarize_transcript.py`: Command-line interface for batch processing or local summarization without the web UI.
|
| 19 |
+
* `Dockerfile`: Defines the build environment. **Crucial:** It installs a specific pre-compiled wheel for `llama-cpp-python` to ensure compatibility and performance on HF Spaces (Free CPU tier).
|
| 20 |
+
* `deploy.sh`: Helper script to stage, commit, and push changes to the HuggingFace Space. Enforces non-generic commit messages.
|
| 21 |
+
* `requirements.txt`: Python dependencies (excluding `llama-cpp-python` which is handled specially in Docker).
|
| 22 |
+
* `transcripts/`: Directory for storing input transcript files.
|
| 23 |
+
* `AGENTS.md` / `CLAUDE.md`: Existing context files for other AI assistants.
|
| 24 |
+
|
| 25 |
+
## Build & Run Instructions
|
| 26 |
+
|
| 27 |
+
### 1. Installation
|
| 28 |
+
The project relies on `llama-cpp-python`. For local development, you must install it separately, as it's not in `requirements.txt` to avoid build errors on systems without compilers.
|
| 29 |
+
|
| 30 |
+
```bash
|
| 31 |
+
# Install general dependencies
|
| 32 |
+
pip install -r requirements.txt
|
| 33 |
+
|
| 34 |
+
# Install llama-cpp-python (with CUDA support if available, otherwise CPU)
|
| 35 |
+
# See: https://github.com/abetlen/llama-cpp-python#installation
|
| 36 |
+
pip install llama-cpp-python
|
| 37 |
+
```
|
| 38 |
+
|
| 39 |
+
### 2. Running the Web UI
|
| 40 |
+
```bash
|
| 41 |
+
python app.py
|
| 42 |
+
# Access at http://localhost:7860
|
| 43 |
+
```
|
| 44 |
+
|
| 45 |
+
### 3. Running the CLI
|
| 46 |
+
```bash
|
| 47 |
+
# Basic English summary
|
| 48 |
+
python summarize_transcript.py -i transcripts/your_file.txt
|
| 49 |
+
|
| 50 |
+
# Traditional Chinese output
|
| 51 |
+
python summarize_transcript.py -i transcripts/your_file.txt -l zh-TW
|
| 52 |
+
|
| 53 |
+
# Use a specific model
|
| 54 |
+
python summarize_transcript.py -i transcripts/your_file.txt -m "unsloth/Qwen3-1.7B-GGUF"
|
| 55 |
+
```
|
| 56 |
+
|
| 57 |
+
### 4. Deployment (HuggingFace Spaces)
|
| 58 |
+
Always use the provided script to ensure clean commits and deployment:
|
| 59 |
+
```bash
|
| 60 |
+
./deploy.sh "Your descriptive commit message"
|
| 61 |
+
```
|
| 62 |
+
|
| 63 |
+
## Model Architecture & Categories
|
| 64 |
+
The project categorizes models to help users balance speed vs. quality:
|
| 65 |
+
* **Tiny (0.1-0.6B):** Extremely fast, good for simple formatting (e.g., Qwen3-0.6B).
|
| 66 |
+
* **Compact (1.5-2.6B):** Good balance for free tier (e.g., Granite-3.1-1B, Qwen3-1.7B).
|
| 67 |
+
* **Standard (3-7B):** Higher quality, slower on CPU (e.g., Llama-3-8B variants).
|
| 68 |
+
* **Medium (21-30B):** High performance, requires significant RAM (e.g., Command R, Qwen-30B).
|
| 69 |
+
|
| 70 |
+
## Development Conventions
|
| 71 |
+
* **Dependency Management:** `llama-cpp-python` is pinned in the `Dockerfile` via a custom wheel URL. Do not add it to `requirements.txt` unless you are changing the build strategy.
|
| 72 |
+
* **Code Style:** The project uses `ruff` for linting.
|
| 73 |
+
* **Git:** Use `deploy.sh` to push. Avoid generic commit messages like "update" or "fix".
|
| 74 |
+
* **Environment:** The app is optimized for Linux/Docker environments. Local Windows development may require extra setup for `llama-cpp-python` compilation.
|
app.py
CHANGED
|
@@ -1030,7 +1030,8 @@ def parse_thinking_blocks(content: str, streaming: bool = False) -> Tuple[str, s
|
|
| 1030 |
|
| 1031 |
def summarize_streaming(
|
| 1032 |
file_obj,
|
| 1033 |
-
|
|
|
|
| 1034 |
enable_reasoning: bool = True,
|
| 1035 |
max_tokens: int = 2048,
|
| 1036 |
temperature: float = 0.6,
|
|
@@ -1042,10 +1043,11 @@ def summarize_streaming(
|
|
| 1042 |
custom_model_state: Any = None,
|
| 1043 |
) -> Generator[Tuple[str, str, str, dict, str], None, None]:
|
| 1044 |
"""
|
| 1045 |
-
Stream summary generation from uploaded file.
|
| 1046 |
|
| 1047 |
Args:
|
| 1048 |
file_obj: Gradio file object
|
|
|
|
| 1049 |
model_key: Model identifier from AVAILABLE_MODELS
|
| 1050 |
enable_reasoning: Whether to use reasoning mode (/think) for Qwen3 models
|
| 1051 |
max_tokens: Maximum tokens to generate
|
|
@@ -1102,31 +1104,35 @@ def summarize_streaming(
|
|
| 1102 |
if max_tokens > usable_max - 512:
|
| 1103 |
max_tokens = usable_max - 512
|
| 1104 |
|
| 1105 |
-
# Read
|
| 1106 |
try:
|
| 1107 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1108 |
system_prompt_preview = build_system_prompt(output_language, False, enable_reasoning)
|
| 1109 |
-
yield ("", "Error: Please upload a
|
| 1110 |
return
|
| 1111 |
|
| 1112 |
-
|
| 1113 |
-
# Get file metadata
|
| 1114 |
-
import os
|
| 1115 |
-
file_size = os.path.getsize(path)
|
| 1116 |
-
file_name = os.path.basename(path)
|
| 1117 |
-
|
| 1118 |
-
with open(path, 'r', encoding='utf-8') as f:
|
| 1119 |
-
transcript = f.read()
|
| 1120 |
-
|
| 1121 |
-
# Store file info
|
| 1122 |
metrics["file_info"] = {
|
| 1123 |
-
"
|
| 1124 |
-
"size_bytes":
|
| 1125 |
"original_char_count": len(transcript),
|
| 1126 |
}
|
| 1127 |
except Exception as e:
|
| 1128 |
system_prompt_preview = build_system_prompt(output_language, False, enable_reasoning)
|
| 1129 |
-
yield ("", f"Error reading
|
| 1130 |
return
|
| 1131 |
|
| 1132 |
if not transcript.strip():
|
|
@@ -1348,387 +1354,247 @@ def summarize_streaming(
|
|
| 1348 |
# Custom CSS for better UI
|
| 1349 |
custom_css = """
|
| 1350 |
:root {
|
| 1351 |
-
--primary-color: #
|
| 1352 |
-
--primary-dark: #
|
| 1353 |
-
--primary-light: #
|
| 1354 |
-
--accent-color: #
|
| 1355 |
--bg-color: #f8fafc;
|
| 1356 |
-
--card-bg:
|
| 1357 |
--text-color: #1e293b;
|
| 1358 |
--text-muted: #64748b;
|
| 1359 |
--border-color: #e2e8f0;
|
| 1360 |
--border-light: #f1f5f9;
|
| 1361 |
-
|
| 1362 |
-
|
| 1363 |
-
--
|
| 1364 |
-
--
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1365 |
--shadow-sm: 0 1px 2px rgba(0, 0, 0, 0.05);
|
| 1366 |
--shadow-md: 0 4px 6px -1px rgba(0, 0, 0, 0.1), 0 2px 4px -1px rgba(0, 0, 0, 0.06);
|
| 1367 |
--shadow-lg: 0 10px 15px -3px rgba(0, 0, 0, 0.1), 0 4px 6px -2px rgba(0, 0, 0, 0.05);
|
| 1368 |
-
--radius-sm:
|
| 1369 |
-
--radius-md:
|
| 1370 |
-
--radius-lg:
|
| 1371 |
}
|
| 1372 |
|
| 1373 |
/* ===== LAYOUT & BASE ===== */
|
| 1374 |
.gradio-container {
|
| 1375 |
max-width: 1400px !important;
|
|
|
|
| 1376 |
}
|
| 1377 |
|
| 1378 |
/* ===== HEADER ===== */
|
| 1379 |
.app-header {
|
| 1380 |
text-align: center;
|
| 1381 |
-
padding:
|
| 1382 |
background: linear-gradient(135deg, var(--primary-color) 0%, var(--accent-color) 100%);
|
| 1383 |
border-radius: var(--radius-lg);
|
| 1384 |
-
margin-bottom:
|
| 1385 |
color: white;
|
| 1386 |
box-shadow: var(--shadow-lg);
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1387 |
}
|
| 1388 |
|
| 1389 |
.app-header h1 {
|
| 1390 |
margin: 0 0 0.5rem 0;
|
| 1391 |
-
font-size: 2.
|
| 1392 |
-
font-weight:
|
| 1393 |
-
letter-spacing: -0.
|
|
|
|
|
|
|
| 1394 |
}
|
| 1395 |
|
| 1396 |
.app-header p {
|
| 1397 |
margin: 0;
|
| 1398 |
opacity: 0.9;
|
| 1399 |
-
font-size: 1.
|
|
|
|
|
|
|
|
|
|
| 1400 |
}
|
| 1401 |
|
| 1402 |
.model-badge {
|
| 1403 |
display: inline-flex;
|
| 1404 |
align-items: center;
|
| 1405 |
gap: 0.5rem;
|
| 1406 |
-
background: rgba(255, 255, 255, 0.
|
| 1407 |
-
padding: 0.
|
| 1408 |
-
border-radius:
|
| 1409 |
-
font-size: 0.
|
| 1410 |
-
margin-top:
|
| 1411 |
-
backdrop-filter: blur(
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1412 |
}
|
| 1413 |
|
| 1414 |
/* ===== INSTRUCTIONS ===== */
|
| 1415 |
.instructions {
|
| 1416 |
-
background:
|
| 1417 |
-
border-left:
|
| 1418 |
-
padding:
|
| 1419 |
-
border-radius:
|
| 1420 |
-
margin-bottom:
|
| 1421 |
box-shadow: var(--shadow-sm);
|
| 1422 |
-
|
| 1423 |
-
|
| 1424 |
-
.instructions ul {
|
| 1425 |
-
margin: 0.5rem 0 0 0;
|
| 1426 |
-
padding-left: 1.25rem;
|
| 1427 |
-
}
|
| 1428 |
-
|
| 1429 |
-
.instructions li {
|
| 1430 |
-
margin-bottom: 0.35rem;
|
| 1431 |
-
color: var(--text-color);
|
| 1432 |
}
|
| 1433 |
|
| 1434 |
/* ===== SECTION HEADERS ===== */
|
| 1435 |
.section-header {
|
| 1436 |
-
font-size:
|
| 1437 |
-
font-weight:
|
| 1438 |
color: var(--text-color);
|
| 1439 |
-
margin-bottom:
|
| 1440 |
display: flex;
|
| 1441 |
align-items: center;
|
| 1442 |
-
gap: 0.
|
| 1443 |
-
padding-bottom: 0.
|
| 1444 |
-
border-bottom:
|
|
|
|
|
|
|
| 1445 |
}
|
| 1446 |
|
| 1447 |
.section-icon {
|
| 1448 |
-
font-size: 1.
|
| 1449 |
}
|
| 1450 |
|
| 1451 |
/* ===== TABS STYLING ===== */
|
| 1452 |
.gradio-tabs {
|
| 1453 |
border: 1px solid var(--border-color) !important;
|
| 1454 |
-
border-radius: var(--radius-
|
| 1455 |
overflow: hidden;
|
| 1456 |
box-shadow: var(--shadow-sm);
|
| 1457 |
-
margin-bottom: 1rem;
|
| 1458 |
-
}
|
| 1459 |
-
|
| 1460 |
-
.gradio-tabitem {
|
| 1461 |
-
padding: 1rem !important;
|
| 1462 |
background: var(--card-bg) !important;
|
|
|
|
| 1463 |
}
|
| 1464 |
|
| 1465 |
.tab-nav {
|
| 1466 |
-
background:
|
| 1467 |
-
|
| 1468 |
-
|
| 1469 |
-
gap: 0 !important;
|
| 1470 |
}
|
| 1471 |
|
| 1472 |
.tab-nav button {
|
| 1473 |
-
|
| 1474 |
-
|
| 1475 |
-
color: var(--text-muted) !important;
|
| 1476 |
-
border: none !important;
|
| 1477 |
-
border-bottom: 3px solid transparent !important;
|
| 1478 |
-
background: transparent !important;
|
| 1479 |
-
transition: all 0.2s ease !important;
|
| 1480 |
-
margin: 0 !important;
|
| 1481 |
-
border-radius: 0 !important;
|
| 1482 |
-
}
|
| 1483 |
-
|
| 1484 |
-
.tab-nav button:hover {
|
| 1485 |
-
color: var(--primary-color) !important;
|
| 1486 |
-
background: rgba(102, 126, 234, 0.05) !important;
|
| 1487 |
-
}
|
| 1488 |
-
|
| 1489 |
-
.tab-nav button.selected {
|
| 1490 |
-
color: var(--primary-color) !important;
|
| 1491 |
-
border-bottom-color: var(--primary-color) !important;
|
| 1492 |
-
background: var(--card-bg) !important;
|
| 1493 |
-
font-weight: 600 !important;
|
| 1494 |
}
|
| 1495 |
|
| 1496 |
/* ===== GROUPS & CARDS ===== */
|
| 1497 |
.gradio-group {
|
| 1498 |
border: 1px solid var(--border-color) !important;
|
| 1499 |
border-radius: var(--radius-md) !important;
|
| 1500 |
-
padding:
|
| 1501 |
background: var(--card-bg) !important;
|
| 1502 |
box-shadow: var(--shadow-sm) !important;
|
| 1503 |
-
margin-bottom:
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1504 |
}
|
| 1505 |
|
| 1506 |
/* ===== ACCORDION STYLING ===== */
|
| 1507 |
.gradio-accordion {
|
| 1508 |
border: 1px solid var(--border-color) !important;
|
| 1509 |
border-radius: var(--radius-md) !important;
|
| 1510 |
-
overflow: hidden;
|
| 1511 |
-
box-shadow: var(--shadow-sm);
|
| 1512 |
-
margin-bottom: 1rem;
|
| 1513 |
-
}
|
| 1514 |
-
|
| 1515 |
-
.gradio-accordion > .label-wrap {
|
| 1516 |
-
background: linear-gradient(180deg, #f8fafc 0%, #f1f5f9 100%) !important;
|
| 1517 |
-
padding: 0.875rem 1rem !important;
|
| 1518 |
-
border-bottom: 1px solid var(--border-color);
|
| 1519 |
-
}
|
| 1520 |
-
|
| 1521 |
-
.gradio-accordion > .label-wrap:hover {
|
| 1522 |
-
background: linear-gradient(180deg, #f1f5f9 0%, #e2e8f0 100%) !important;
|
| 1523 |
-
}
|
| 1524 |
-
|
| 1525 |
-
.gradio-accordion > .label-wrap span {
|
| 1526 |
-
font-weight: 600 !important;
|
| 1527 |
-
color: var(--text-color) !important;
|
| 1528 |
-
}
|
| 1529 |
-
|
| 1530 |
-
.gradio-accordion > div:last-child {
|
| 1531 |
-
padding: 1rem !important;
|
| 1532 |
background: var(--card-bg) !important;
|
| 1533 |
}
|
| 1534 |
|
| 1535 |
/* ===== BUTTONS ===== */
|
| 1536 |
-
/* Primary submit button */
|
| 1537 |
.submit-btn {
|
| 1538 |
background: linear-gradient(135deg, var(--primary-color) 0%, var(--accent-color) 100%) !important;
|
| 1539 |
border: none !important;
|
| 1540 |
color: white !important;
|
| 1541 |
-
font-weight:
|
| 1542 |
-
padding:
|
| 1543 |
border-radius: var(--radius-md) !important;
|
| 1544 |
cursor: pointer;
|
| 1545 |
-
transition: all 0.
|
| 1546 |
-
box-shadow:
|
| 1547 |
width: 100% !important;
|
| 1548 |
-
font-size: 1rem !important;
|
|
|
|
| 1549 |
}
|
| 1550 |
|
| 1551 |
.submit-btn:hover {
|
| 1552 |
-
transform: translateY(-
|
| 1553 |
-
box-shadow: 0
|
| 1554 |
-
}
|
| 1555 |
-
|
| 1556 |
-
.submit-btn:active {
|
| 1557 |
-
transform: translateY(0);
|
| 1558 |
-
}
|
| 1559 |
-
|
| 1560 |
-
/* Secondary buttons (Copy, Download, Load) */
|
| 1561 |
-
button.secondary,
|
| 1562 |
-
button[size="sm"] {
|
| 1563 |
-
background: var(--card-bg) !important;
|
| 1564 |
-
border: 1px solid var(--border-color) !important;
|
| 1565 |
-
color: var(--text-color) !important;
|
| 1566 |
-
font-weight: 500 !important;
|
| 1567 |
-
padding: 0.5rem 1rem !important;
|
| 1568 |
-
border-radius: var(--radius-sm) !important;
|
| 1569 |
-
transition: all 0.2s ease !important;
|
| 1570 |
-
box-shadow: var(--shadow-sm) !important;
|
| 1571 |
-
}
|
| 1572 |
-
|
| 1573 |
-
button.secondary:hover,
|
| 1574 |
-
button[size="sm"]:hover {
|
| 1575 |
-
background: var(--bg-color) !important;
|
| 1576 |
-
border-color: var(--primary-color) !important;
|
| 1577 |
-
color: var(--primary-color) !important;
|
| 1578 |
-
box-shadow: var(--shadow-md) !important;
|
| 1579 |
-
}
|
| 1580 |
-
|
| 1581 |
-
/* Small primary buttons (Load Model) */
|
| 1582 |
-
button.primary[size="sm"] {
|
| 1583 |
-
background: linear-gradient(135deg, var(--primary-color) 0%, var(--accent-color) 100%) !important;
|
| 1584 |
-
border: none !important;
|
| 1585 |
-
color: white !important;
|
| 1586 |
-
font-weight: 600 !important;
|
| 1587 |
-
}
|
| 1588 |
-
|
| 1589 |
-
button.primary[size="sm"]:hover {
|
| 1590 |
-
box-shadow: 0 4px 12px rgba(102, 126, 234, 0.4) !important;
|
| 1591 |
-
color: white !important;
|
| 1592 |
-
}
|
| 1593 |
-
|
| 1594 |
-
/* ===== INPUT COMPONENTS ===== */
|
| 1595 |
-
/* File upload area */
|
| 1596 |
-
.file-upload-area {
|
| 1597 |
-
border: 2px dashed var(--border-color) !important;
|
| 1598 |
-
border-radius: var(--radius-lg) !important;
|
| 1599 |
-
padding: 1.5rem !important;
|
| 1600 |
-
text-align: center;
|
| 1601 |
-
transition: all 0.3s ease !important;
|
| 1602 |
-
background: var(--bg-color) !important;
|
| 1603 |
-
}
|
| 1604 |
-
|
| 1605 |
-
.file-upload-area:hover {
|
| 1606 |
-
border-color: var(--primary-color) !important;
|
| 1607 |
-
background: rgba(102, 126, 234, 0.05) !important;
|
| 1608 |
-
}
|
| 1609 |
-
|
| 1610 |
-
/* Dropdowns */
|
| 1611 |
-
.gradio-dropdown {
|
| 1612 |
-
border-radius: var(--radius-sm) !important;
|
| 1613 |
-
}
|
| 1614 |
-
|
| 1615 |
-
.gradio-dropdown > div > input {
|
| 1616 |
-
border-radius: var(--radius-sm) !important;
|
| 1617 |
-
}
|
| 1618 |
-
|
| 1619 |
-
/* Sliders */
|
| 1620 |
-
input[type="range"] {
|
| 1621 |
-
accent-color: var(--primary-color);
|
| 1622 |
}
|
| 1623 |
|
| 1624 |
/* ===== OUTPUT BOXES ===== */
|
| 1625 |
.thinking-box {
|
| 1626 |
-
background: var(--
|
| 1627 |
-
border: 1px solid var(--
|
|
|
|
| 1628 |
border-radius: var(--radius-md) !important;
|
| 1629 |
-
font-family: '
|
| 1630 |
-
|
| 1631 |
}
|
| 1632 |
|
| 1633 |
-
.thinking-box
|
| 1634 |
-
|
| 1635 |
-
border: none !important;
|
| 1636 |
}
|
| 1637 |
|
| 1638 |
.summary-box {
|
| 1639 |
-
background: var(--
|
| 1640 |
-
border: 1px solid var(--
|
| 1641 |
border-radius: var(--radius-md) !important;
|
| 1642 |
-
padding:
|
| 1643 |
-
|
| 1644 |
-
|
| 1645 |
-
|
| 1646 |
-
|
| 1647 |
-
font-size: 0.9rem;
|
| 1648 |
-
}
|
| 1649 |
-
|
| 1650 |
-
.stats-grid table {
|
| 1651 |
-
width: 100%;
|
| 1652 |
-
border-collapse: collapse;
|
| 1653 |
-
}
|
| 1654 |
-
|
| 1655 |
-
.stats-grid th {
|
| 1656 |
-
text-align: left;
|
| 1657 |
-
padding: 0.5rem;
|
| 1658 |
-
background: var(--bg-color);
|
| 1659 |
-
font-weight: 600;
|
| 1660 |
-
color: var(--text-muted);
|
| 1661 |
-
font-size: 0.8rem;
|
| 1662 |
-
text-transform: uppercase;
|
| 1663 |
-
letter-spacing: 0.025em;
|
| 1664 |
-
}
|
| 1665 |
-
|
| 1666 |
-
.stats-grid td {
|
| 1667 |
-
padding: 0.5rem;
|
| 1668 |
-
border-bottom: 1px solid var(--border-light);
|
| 1669 |
-
}
|
| 1670 |
-
|
| 1671 |
-
/* ===== FOOTER ===== */
|
| 1672 |
-
.footer {
|
| 1673 |
-
text-align: center;
|
| 1674 |
-
margin-top: 2rem;
|
| 1675 |
-
padding: 1.25rem;
|
| 1676 |
-
color: var(--text-muted);
|
| 1677 |
-
font-size: 0.85rem;
|
| 1678 |
-
border-top: 1px solid var(--border-color);
|
| 1679 |
-
background: linear-gradient(180deg, var(--bg-color) 0%, #f1f5f9 100%);
|
| 1680 |
-
border-radius: 0 0 var(--radius-lg) var(--radius-lg);
|
| 1681 |
}
|
| 1682 |
|
| 1683 |
/* ===== RESPONSIVE ADJUSTMENTS ===== */
|
| 1684 |
-
@media (max-width:
|
| 1685 |
-
.
|
| 1686 |
-
|
| 1687 |
}
|
| 1688 |
-
|
| 1689 |
-
.app-header p {
|
| 1690 |
-
font-size: 1rem;
|
| 1691 |
-
}
|
| 1692 |
-
|
| 1693 |
.submit-btn {
|
| 1694 |
-
|
|
|
|
|
|
|
| 1695 |
}
|
| 1696 |
}
|
| 1697 |
|
| 1698 |
-
|
| 1699 |
-
|
| 1700 |
-
|
| 1701 |
-
|
| 1702 |
-
|
| 1703 |
-
|
| 1704 |
-
|
| 1705 |
-
}
|
| 1706 |
-
|
| 1707 |
-
/* ===== SCROLLBAR STYLING ===== */
|
| 1708 |
-
.thinking-box textarea::-webkit-scrollbar,
|
| 1709 |
-
.summary-box::-webkit-scrollbar {
|
| 1710 |
-
width: 8px;
|
| 1711 |
-
}
|
| 1712 |
-
|
| 1713 |
-
.thinking-box textarea::-webkit-scrollbar-track,
|
| 1714 |
-
.summary-box::-webkit-scrollbar-track {
|
| 1715 |
-
background: var(--border-light);
|
| 1716 |
-
border-radius: 4px;
|
| 1717 |
-
}
|
| 1718 |
-
|
| 1719 |
-
.thinking-box textarea::-webkit-scrollbar-thumb,
|
| 1720 |
-
.summary-box::-webkit-scrollbar-thumb {
|
| 1721 |
-
background: var(--border-color);
|
| 1722 |
-
border-radius: 4px;
|
| 1723 |
-
}
|
| 1724 |
-
|
| 1725 |
-
.thinking-box textarea::-webkit-scrollbar-thumb:hover,
|
| 1726 |
-
.summary-box::-webkit-scrollbar-thumb:hover {
|
| 1727 |
-
background: var(--text-muted);
|
| 1728 |
}
|
| 1729 |
"""
|
| 1730 |
|
| 1731 |
|
|
|
|
| 1732 |
# Create Gradio interface
|
| 1733 |
def create_interface():
|
| 1734 |
"""Create and configure the Gradio interface."""
|
|
@@ -1769,24 +1635,36 @@ def create_interface():
|
|
| 1769 |
with gr.Column(scale=1):
|
| 1770 |
|
| 1771 |
# ==========================================
|
| 1772 |
-
# Section 1: Input Configuration (Language +
|
| 1773 |
# ==========================================
|
| 1774 |
with gr.Group():
|
| 1775 |
-
gr.HTML('<div class="section-header"><span class="section-icon">
|
| 1776 |
|
| 1777 |
language_selector = gr.Dropdown(
|
| 1778 |
choices=[("English", "en"), ("Traditional Chinese (zh-TW)", "zh-TW")],
|
| 1779 |
value="en",
|
| 1780 |
-
label="
|
| 1781 |
-
info="
|
| 1782 |
)
|
|
|
|
|
|
|
|
|
|
| 1783 |
|
| 1784 |
-
|
| 1785 |
-
|
| 1786 |
-
|
| 1787 |
-
|
| 1788 |
-
|
| 1789 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1790 |
|
| 1791 |
# ==========================================
|
| 1792 |
# Section 2: Model Selection (Tabs)
|
|
@@ -2260,17 +2138,10 @@ def create_interface():
|
|
| 2260 |
outputs=[custom_info_output],
|
| 2261 |
)
|
| 2262 |
|
| 2263 |
-
# Also update submit button to use custom model state
|
| 2264 |
-
# Note: We'll modify the summarize_streaming function to accept custom_model_state
|
| 2265 |
-
|
| 2266 |
-
# ==========================================
|
| 2267 |
-
# END: Custom Model Loader Event Handlers
|
| 2268 |
-
# ==========================================
|
| 2269 |
-
|
| 2270 |
# Update submit button to include custom_model_state in inputs and system_prompt_debug in outputs
|
| 2271 |
submit_btn.click(
|
| 2272 |
fn=summarize_streaming,
|
| 2273 |
-
inputs=[file_input, model_dropdown, enable_reasoning, max_tokens, temperature_slider, top_p, top_k, language_selector, thread_config_dropdown, custom_threads_slider, custom_model_state],
|
| 2274 |
outputs=[thinking_output, summary_output, info_output, metrics_state, system_prompt_debug],
|
| 2275 |
show_progress="full"
|
| 2276 |
)
|
|
|
|
| 1030 |
|
| 1031 |
def summarize_streaming(
|
| 1032 |
file_obj,
|
| 1033 |
+
text_input: str = "",
|
| 1034 |
+
model_key: str = "qwen3_600m_q4",
|
| 1035 |
enable_reasoning: bool = True,
|
| 1036 |
max_tokens: int = 2048,
|
| 1037 |
temperature: float = 0.6,
|
|
|
|
| 1043 |
custom_model_state: Any = None,
|
| 1044 |
) -> Generator[Tuple[str, str, str, dict, str], None, None]:
|
| 1045 |
"""
|
| 1046 |
+
Stream summary generation from uploaded file or text input.
|
| 1047 |
|
| 1048 |
Args:
|
| 1049 |
file_obj: Gradio file object
|
| 1050 |
+
text_input: Direct text input from user
|
| 1051 |
model_key: Model identifier from AVAILABLE_MODELS
|
| 1052 |
enable_reasoning: Whether to use reasoning mode (/think) for Qwen3 models
|
| 1053 |
max_tokens: Maximum tokens to generate
|
|
|
|
| 1104 |
if max_tokens > usable_max - 512:
|
| 1105 |
max_tokens = usable_max - 512
|
| 1106 |
|
| 1107 |
+
# Read input source (prioritize text_input)
|
| 1108 |
try:
|
| 1109 |
+
transcript = ""
|
| 1110 |
+
source_name = "Direct Input"
|
| 1111 |
+
source_size = 0
|
| 1112 |
+
|
| 1113 |
+
if text_input and text_input.strip():
|
| 1114 |
+
transcript = text_input
|
| 1115 |
+
source_size = len(transcript.encode('utf-8'))
|
| 1116 |
+
elif file_obj is not None:
|
| 1117 |
+
path = file_obj.name if hasattr(file_obj, 'name') else file_obj
|
| 1118 |
+
source_name = os.path.basename(path)
|
| 1119 |
+
source_size = os.path.getsize(path)
|
| 1120 |
+
with open(path, 'r', encoding='utf-8') as f:
|
| 1121 |
+
transcript = f.read()
|
| 1122 |
+
else:
|
| 1123 |
system_prompt_preview = build_system_prompt(output_language, False, enable_reasoning)
|
| 1124 |
+
yield ("", "Error: Please upload a file or paste text first", "", metrics, system_prompt_preview)
|
| 1125 |
return
|
| 1126 |
|
| 1127 |
+
# Store input info
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1128 |
metrics["file_info"] = {
|
| 1129 |
+
"source": source_name,
|
| 1130 |
+
"size_bytes": source_size,
|
| 1131 |
"original_char_count": len(transcript),
|
| 1132 |
}
|
| 1133 |
except Exception as e:
|
| 1134 |
system_prompt_preview = build_system_prompt(output_language, False, enable_reasoning)
|
| 1135 |
+
yield ("", f"Error reading input: {e}", "", metrics, system_prompt_preview)
|
| 1136 |
return
|
| 1137 |
|
| 1138 |
if not transcript.strip():
|
|
|
|
| 1354 |
# Custom CSS for better UI
|
| 1355 |
custom_css = """
|
| 1356 |
:root {
|
| 1357 |
+
--primary-color: #6366f1;
|
| 1358 |
+
--primary-dark: #4f46e5;
|
| 1359 |
+
--primary-light: #c7d2fe;
|
| 1360 |
+
--accent-color: #8b5cf6;
|
| 1361 |
--bg-color: #f8fafc;
|
| 1362 |
+
--card-bg: rgba(255, 255, 255, 0.85);
|
| 1363 |
--text-color: #1e293b;
|
| 1364 |
--text-muted: #64748b;
|
| 1365 |
--border-color: #e2e8f0;
|
| 1366 |
--border-light: #f1f5f9;
|
| 1367 |
+
|
| 1368 |
+
/* Semantic Colors */
|
| 1369 |
+
--thinking-bg: #f5f3ff;
|
| 1370 |
+
--thinking-border: #ddd6fe;
|
| 1371 |
+
--thinking-accent: #8b5cf6;
|
| 1372 |
+
--summary-bg: #f0fdf4;
|
| 1373 |
+
--summary-border: #dcfce7;
|
| 1374 |
+
--summary-accent: #22c55e;
|
| 1375 |
+
|
| 1376 |
--shadow-sm: 0 1px 2px rgba(0, 0, 0, 0.05);
|
| 1377 |
--shadow-md: 0 4px 6px -1px rgba(0, 0, 0, 0.1), 0 2px 4px -1px rgba(0, 0, 0, 0.06);
|
| 1378 |
--shadow-lg: 0 10px 15px -3px rgba(0, 0, 0, 0.1), 0 4px 6px -2px rgba(0, 0, 0, 0.05);
|
| 1379 |
+
--radius-sm: 8px;
|
| 1380 |
+
--radius-md: 12px;
|
| 1381 |
+
--radius-lg: 20px;
|
| 1382 |
}
|
| 1383 |
|
| 1384 |
/* ===== LAYOUT & BASE ===== */
|
| 1385 |
.gradio-container {
|
| 1386 |
max-width: 1400px !important;
|
| 1387 |
+
background: radial-gradient(circle at top right, #eef2ff 0%, #f8fafc 40%) !important;
|
| 1388 |
}
|
| 1389 |
|
| 1390 |
/* ===== HEADER ===== */
|
| 1391 |
.app-header {
|
| 1392 |
text-align: center;
|
| 1393 |
+
padding: 2.5rem 1.5rem;
|
| 1394 |
background: linear-gradient(135deg, var(--primary-color) 0%, var(--accent-color) 100%);
|
| 1395 |
border-radius: var(--radius-lg);
|
| 1396 |
+
margin-bottom: 2rem;
|
| 1397 |
color: white;
|
| 1398 |
box-shadow: var(--shadow-lg);
|
| 1399 |
+
position: relative;
|
| 1400 |
+
overflow: hidden;
|
| 1401 |
+
}
|
| 1402 |
+
|
| 1403 |
+
.app-header::before {
|
| 1404 |
+
content: "";
|
| 1405 |
+
position: absolute;
|
| 1406 |
+
top: -50%;
|
| 1407 |
+
left: -50%;
|
| 1408 |
+
width: 200%;
|
| 1409 |
+
height: 200%;
|
| 1410 |
+
background: radial-gradient(circle, rgba(255,255,255,0.1) 0%, transparent 60%);
|
| 1411 |
+
animation: rotate 20s linear infinite;
|
| 1412 |
+
}
|
| 1413 |
+
|
| 1414 |
+
@keyframes rotate {
|
| 1415 |
+
from { transform: rotate(0deg); }
|
| 1416 |
+
to { transform: rotate(360deg); }
|
| 1417 |
}
|
| 1418 |
|
| 1419 |
.app-header h1 {
|
| 1420 |
margin: 0 0 0.5rem 0;
|
| 1421 |
+
font-size: 2.5rem;
|
| 1422 |
+
font-weight: 800;
|
| 1423 |
+
letter-spacing: -0.04em;
|
| 1424 |
+
position: relative;
|
| 1425 |
+
z-index: 1;
|
| 1426 |
}
|
| 1427 |
|
| 1428 |
.app-header p {
|
| 1429 |
margin: 0;
|
| 1430 |
opacity: 0.9;
|
| 1431 |
+
font-size: 1.15rem;
|
| 1432 |
+
font-weight: 400;
|
| 1433 |
+
position: relative;
|
| 1434 |
+
z-index: 1;
|
| 1435 |
}
|
| 1436 |
|
| 1437 |
.model-badge {
|
| 1438 |
display: inline-flex;
|
| 1439 |
align-items: center;
|
| 1440 |
gap: 0.5rem;
|
| 1441 |
+
background: rgba(255, 255, 255, 0.15);
|
| 1442 |
+
padding: 0.6rem 1.25rem;
|
| 1443 |
+
border-radius: 30px;
|
| 1444 |
+
font-size: 0.9rem;
|
| 1445 |
+
margin-top: 1.25rem;
|
| 1446 |
+
backdrop-filter: blur(8px);
|
| 1447 |
+
border: 1px solid rgba(255, 255, 255, 0.2);
|
| 1448 |
+
position: relative;
|
| 1449 |
+
z-index: 1;
|
| 1450 |
+
font-weight: 500;
|
| 1451 |
}
|
| 1452 |
|
| 1453 |
/* ===== INSTRUCTIONS ===== */
|
| 1454 |
.instructions {
|
| 1455 |
+
background: var(--card-bg);
|
| 1456 |
+
border-left: 5px solid var(--primary-color);
|
| 1457 |
+
padding: 1.25rem 1.5rem;
|
| 1458 |
+
border-radius: var(--radius-sm) var(--radius-md) var(--radius-md) var(--radius-sm);
|
| 1459 |
+
margin-bottom: 2rem;
|
| 1460 |
box-shadow: var(--shadow-sm);
|
| 1461 |
+
backdrop-filter: blur(10px);
|
| 1462 |
+
border: 1px solid var(--border-color);
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1463 |
}
|
| 1464 |
|
| 1465 |
/* ===== SECTION HEADERS ===== */
|
| 1466 |
.section-header {
|
| 1467 |
+
font-size: 0.95rem;
|
| 1468 |
+
font-weight: 700;
|
| 1469 |
color: var(--text-color);
|
| 1470 |
+
margin-bottom: 1rem;
|
| 1471 |
display: flex;
|
| 1472 |
align-items: center;
|
| 1473 |
+
gap: 0.6rem;
|
| 1474 |
+
padding-bottom: 0.6rem;
|
| 1475 |
+
border-bottom: 2px solid var(--border-light);
|
| 1476 |
+
text-transform: uppercase;
|
| 1477 |
+
letter-spacing: 0.05em;
|
| 1478 |
}
|
| 1479 |
|
| 1480 |
.section-icon {
|
| 1481 |
+
font-size: 1.2rem;
|
| 1482 |
}
|
| 1483 |
|
| 1484 |
/* ===== TABS STYLING ===== */
|
| 1485 |
.gradio-tabs {
|
| 1486 |
border: 1px solid var(--border-color) !important;
|
| 1487 |
+
border-radius: var(--radius-md) !important;
|
| 1488 |
overflow: hidden;
|
| 1489 |
box-shadow: var(--shadow-sm);
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1490 |
background: var(--card-bg) !important;
|
| 1491 |
+
backdrop-filter: blur(10px);
|
| 1492 |
}
|
| 1493 |
|
| 1494 |
.tab-nav {
|
| 1495 |
+
background: #f1f5f9 !important;
|
| 1496 |
+
padding: 0.25rem 0.25rem 0 0.25rem !important;
|
| 1497 |
+
gap: 4px !important;
|
|
|
|
| 1498 |
}
|
| 1499 |
|
| 1500 |
.tab-nav button {
|
| 1501 |
+
border-radius: 8px 8px 0 0 !important;
|
| 1502 |
+
padding: 0.75rem 1rem !important;
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1503 |
}
|
| 1504 |
|
| 1505 |
/* ===== GROUPS & CARDS ===== */
|
| 1506 |
.gradio-group {
|
| 1507 |
border: 1px solid var(--border-color) !important;
|
| 1508 |
border-radius: var(--radius-md) !important;
|
| 1509 |
+
padding: 1.25rem !important;
|
| 1510 |
background: var(--card-bg) !important;
|
| 1511 |
box-shadow: var(--shadow-sm) !important;
|
| 1512 |
+
margin-bottom: 1.5rem !important;
|
| 1513 |
+
backdrop-filter: blur(10px);
|
| 1514 |
+
transition: transform 0.2s ease, box-shadow 0.2s ease !important;
|
| 1515 |
+
}
|
| 1516 |
+
|
| 1517 |
+
.gradio-group:hover {
|
| 1518 |
+
box-shadow: var(--shadow-md) !important;
|
| 1519 |
}
|
| 1520 |
|
| 1521 |
/* ===== ACCORDION STYLING ===== */
|
| 1522 |
.gradio-accordion {
|
| 1523 |
border: 1px solid var(--border-color) !important;
|
| 1524 |
border-radius: var(--radius-md) !important;
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1525 |
background: var(--card-bg) !important;
|
| 1526 |
}
|
| 1527 |
|
| 1528 |
/* ===== BUTTONS ===== */
|
|
|
|
| 1529 |
.submit-btn {
|
| 1530 |
background: linear-gradient(135deg, var(--primary-color) 0%, var(--accent-color) 100%) !important;
|
| 1531 |
border: none !important;
|
| 1532 |
color: white !important;
|
| 1533 |
+
font-weight: 700 !important;
|
| 1534 |
+
padding: 1rem 2rem !important;
|
| 1535 |
border-radius: var(--radius-md) !important;
|
| 1536 |
cursor: pointer;
|
| 1537 |
+
transition: all 0.3s cubic-bezier(0.4, 0, 0.2, 1) !important;
|
| 1538 |
+
box-shadow: 0 4px 15px rgba(99, 102, 241, 0.4) !important;
|
| 1539 |
width: 100% !important;
|
| 1540 |
+
font-size: 1.1rem !important;
|
| 1541 |
+
letter-spacing: 0.02em;
|
| 1542 |
}
|
| 1543 |
|
| 1544 |
.submit-btn:hover {
|
| 1545 |
+
transform: translateY(-3px) scale(1.02);
|
| 1546 |
+
box-shadow: 0 8px 25px rgba(99, 102, 241, 0.5) !important;
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1547 |
}
|
| 1548 |
|
| 1549 |
/* ===== OUTPUT BOXES ===== */
|
| 1550 |
.thinking-box {
|
| 1551 |
+
background: var(--thinking-bg) !important;
|
| 1552 |
+
border: 1px solid var(--thinking-border) !important;
|
| 1553 |
+
border-left: 4px solid var(--thinking-accent) !important;
|
| 1554 |
border-radius: var(--radius-md) !important;
|
| 1555 |
+
font-family: 'JetBrains Mono', 'Fira Code', monospace !important;
|
| 1556 |
+
transition: all 0.3s ease !important;
|
| 1557 |
}
|
| 1558 |
|
| 1559 |
+
.thinking-box:focus-within {
|
| 1560 |
+
box-shadow: 0 0 0 3px rgba(139, 92, 246, 0.1) !important;
|
|
|
|
| 1561 |
}
|
| 1562 |
|
| 1563 |
.summary-box {
|
| 1564 |
+
background: var(--summary-bg) !important;
|
| 1565 |
+
border: 1px solid var(--summary-border) !important;
|
| 1566 |
border-radius: var(--radius-md) !important;
|
| 1567 |
+
padding: 1.5rem !important;
|
| 1568 |
+
font-size: 1.1rem !important;
|
| 1569 |
+
line-height: 1.7 !important;
|
| 1570 |
+
color: #0f172a !important;
|
| 1571 |
+
box-shadow: var(--shadow-sm);
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1572 |
}
|
| 1573 |
|
| 1574 |
/* ===== RESPONSIVE ADJUSTMENTS ===== */
|
| 1575 |
+
@media (max-width: 1024px) {
|
| 1576 |
+
.gradio-container {
|
| 1577 |
+
padding: 1rem !important;
|
| 1578 |
}
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1579 |
.submit-btn {
|
| 1580 |
+
position: sticky;
|
| 1581 |
+
bottom: 1rem;
|
| 1582 |
+
z-index: 100;
|
| 1583 |
}
|
| 1584 |
}
|
| 1585 |
|
| 1586 |
+
@media (max-width: 768px) {
|
| 1587 |
+
.app-header {
|
| 1588 |
+
padding: 1.5rem 1rem;
|
| 1589 |
+
}
|
| 1590 |
+
.app-header h1 {
|
| 1591 |
+
font-size: 1.8rem;
|
| 1592 |
+
}
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1593 |
}
|
| 1594 |
"""
|
| 1595 |
|
| 1596 |
|
| 1597 |
+
|
| 1598 |
# Create Gradio interface
|
| 1599 |
def create_interface():
|
| 1600 |
"""Create and configure the Gradio interface."""
|
|
|
|
| 1635 |
with gr.Column(scale=1):
|
| 1636 |
|
| 1637 |
# ==========================================
|
| 1638 |
+
# Section 1: Input Configuration (Language + Source)
|
| 1639 |
# ==========================================
|
| 1640 |
with gr.Group():
|
| 1641 |
+
gr.HTML('<div class="section-header"><span class="section-icon">๐</span> Global Settings</div>')
|
| 1642 |
|
| 1643 |
language_selector = gr.Dropdown(
|
| 1644 |
choices=[("English", "en"), ("Traditional Chinese (zh-TW)", "zh-TW")],
|
| 1645 |
value="en",
|
| 1646 |
+
label="Output Language",
|
| 1647 |
+
info="Target language for the summary"
|
| 1648 |
)
|
| 1649 |
+
|
| 1650 |
+
with gr.Group():
|
| 1651 |
+
gr.HTML('<div class="section-header"><span class="section-icon">๐ฅ</span> Input Content</div>')
|
| 1652 |
|
| 1653 |
+
with gr.Tabs() as input_tabs:
|
| 1654 |
+
with gr.TabItem("๐ Upload File", id=0):
|
| 1655 |
+
file_input = gr.File(
|
| 1656 |
+
label="Transcript (.txt)",
|
| 1657 |
+
file_types=[".txt"],
|
| 1658 |
+
type="filepath",
|
| 1659 |
+
elem_classes=["file-upload-area"]
|
| 1660 |
+
)
|
| 1661 |
+
with gr.TabItem("โ๏ธ Paste Text", id=1):
|
| 1662 |
+
text_input = gr.Textbox(
|
| 1663 |
+
label="Paste Transcript",
|
| 1664 |
+
placeholder="Paste your transcript content here...",
|
| 1665 |
+
lines=10,
|
| 1666 |
+
max_lines=20
|
| 1667 |
+
)
|
| 1668 |
|
| 1669 |
# ==========================================
|
| 1670 |
# Section 2: Model Selection (Tabs)
|
|
|
|
| 2138 |
outputs=[custom_info_output],
|
| 2139 |
)
|
| 2140 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 2141 |
# Update submit button to include custom_model_state in inputs and system_prompt_debug in outputs
|
| 2142 |
submit_btn.click(
|
| 2143 |
fn=summarize_streaming,
|
| 2144 |
+
inputs=[file_input, text_input, model_dropdown, enable_reasoning, max_tokens, temperature_slider, top_p, top_k, language_selector, thread_config_dropdown, custom_threads_slider, custom_model_state],
|
| 2145 |
outputs=[thinking_output, summary_output, info_output, metrics_state, system_prompt_debug],
|
| 2146 |
show_progress="full"
|
| 2147 |
)
|