docs: update README with UI improvements and Advanced Mode features

- Add Dual Summarization Modes section (Standard vs Advanced)
- Document new UI improvements (two-column layout, model source selection)
- Add detailed Advanced Mode pipeline description (3 stages)
- Update usage instructions with step-by-step guide
- Document real-time outputs (thinking, summary, metrics)
- Reflect recent UI refactoring (radio buttons, unified model info)
README.md (CHANGED)

@@ -12,32 +12,68 @@ license: mit
# Tiny Scribe

A lightweight transcript summarization tool powered by local LLMs. Features 24+ preset models ranging from 100M to 30B parameters, plus the ability to load any GGUF model from HuggingFace Hub. Includes two summarization modes (Standard and Advanced 3-model pipeline), live streaming output, reasoning modes, and flexible deployment options.

## Features

### Core Capabilities

- **24+ Preset Models**: From tiny 100M models to powerful 30B models
- **Custom GGUF Loading**: Load any GGUF model from HuggingFace Hub with live search
- **Dual Summarization Modes**:
  - **Standard Mode**: Single-model direct summarization
  - **Advanced Mode**: 3-stage pipeline (Extraction → Deduplication → Synthesis)
- **Live Streaming**: Real-time summary generation with token-by-token output
- **Reasoning Modes**: Toggle thinking/reasoning for supported models (Qwen3, ERNIE, LFM2)
- **Thinking Buffer**: Automatic 50% context window extension when reasoning enabled
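
The thinking buffer is simple arithmetic; a minimal sketch (the function name is hypothetical, not the app's actual code):

```python
def effective_context(n_ctx: int, reasoning: bool) -> int:
    """Return the context window to allocate: 50% extra headroom
    for hidden thinking tokens when reasoning mode is enabled."""
    return int(n_ctx * 1.5) if reasoning else n_ctx
```

For example, a 4096-token window grows to 6144 tokens when reasoning is on.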

### User Interface

- **Clean Two-Column Layout**: Configuration (left) and output (right)
- **Model Source Selection**: Radio button toggle between Preset and Custom models
- **Real-Time Outputs**:
  - **Model Thinking Process**: See the AI's reasoning in real-time
  - **Final Summary**: Polished, formatted summary
  - **Generation Metrics**: Separate section for performance stats
- **Unified Model Information**: Displays specs for Standard (1 model) or Advanced (3 models)
- **Hardware Presets**: Free Tier (2 vCPUs), Upgrade (8 vCPUs), or Custom thread count
- **Language Support**: English or Traditional Chinese (zh-TW) output via OpenCC
- **Auto Settings**: Temperature, top_p, and top_k auto-populate per model

## Usage

### Quick Start (Standard Mode)

1. **Configure Global Settings**:
   - **Output Language**: Choose English or Traditional Chinese (zh-TW)
   - **Input Content**: Upload a .txt file or paste your transcript
   - **Hardware Configuration**: Select CPU thread preset (Free Tier, Upgrade, or Custom)
2. **Select Summarization Mode**:
   - **Standard Mode**: Single-model direct summarization (faster, simpler)
   - **Advanced Mode**: 3-model pipeline with extraction, deduplication, synthesis (higher quality)
3. **Choose Model** (Standard Mode):
   - **Preset Models**: Select from 24+ curated models
   - **Custom GGUF**: Search and load any GGUF from HuggingFace Hub
4. **Configure Inference Parameters** (optional):
   - Temperature, Top-p, Top-k (auto-populated with model defaults)
   - Max Output Tokens
   - Enable/disable reasoning mode (for supported models)
5. **Generate Summary**: Click "✨ Generate Summary" and watch:
   - **Model Thinking Process** (left): AI's reasoning in real-time
   - **Final Summary** (right): Polished result
   - **Generation Metrics**: Performance stats (tokens/sec, generation time)
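
Showing separate thinking and summary panes implies splitting the model's raw output on its reasoning tags. A sketch assuming Qwen3-style `<think>…</think>` markers (the helper name is hypothetical):

```python
def split_thinking(raw: str) -> tuple[str, str]:
    """Split raw model output into (thinking, summary).

    Assumes the model wraps its reasoning in <think>...</think>,
    as Qwen3-style models do; output without the tags yields an
    empty thinking section.
    """
    open_tag, close_tag = "<think>", "</think>"
    if open_tag in raw and close_tag in raw:
        start = raw.index(open_tag) + len(open_tag)
        end = raw.index(close_tag)
        return raw[start:end].strip(), raw[end + len(close_tag):].strip()
    return "", raw.strip()
```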

### Advanced Mode (3-Model Pipeline)

For higher quality summarization with large transcripts:

1. **Stage 1 - Extraction**: Small model (≤1.7B) extracts key points from windows
2. **Stage 2 - Deduplication**: Embedding model removes duplicate items
3. **Stage 3 - Synthesis**: Large model (1B-30B) generates executive summary

Configure each stage independently with dedicated model, context window, and inference settings.
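
Wired together, the three stages might look like the following structural sketch, with each model call abstracted as a callable. The character-based windowing and the cosine-similarity threshold are illustrative assumptions, not the app's actual implementation:

```python
from typing import Callable

def advanced_summarize(
    transcript: str,
    extract: Callable[[str], list[str]],   # Stage 1: small model (≤1.7B)
    embed: Callable[[str], list[float]],   # Stage 2: embedding model
    synthesize: Callable[[str], str],      # Stage 3: large model (1B-30B)
    window_chars: int = 4000,
    sim_threshold: float = 0.9,
) -> str:
    # Stage 1: extract key points from fixed-size windows of the transcript.
    windows = [transcript[i:i + window_chars]
               for i in range(0, len(transcript), window_chars)]
    points = [p for w in windows for p in extract(w)]

    # Stage 2: drop points whose embeddings are near-duplicates of kept ones.
    def cosine(a: list[float], b: list[float]) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        na = sum(x * x for x in a) ** 0.5
        nb = sum(y * y for y in b) ** 0.5
        return dot / (na * nb) if na and nb else 0.0

    kept: list[str] = []
    kept_vecs: list[list[float]] = []
    for p in points:
        v = embed(p)
        if all(cosine(v, kv) < sim_threshold for kv in kept_vecs):
            kept.append(p)
            kept_vecs.append(v)

    # Stage 3: synthesize an executive summary from the deduplicated points.
    return synthesize("\n".join(kept))
```

In the app, `extract` and `synthesize` would wrap the small and large GGUF models and `embed` the embedding model, each configured with its own context window and inference settings.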

## Custom GGUF Models