docs: update README with UI improvements and Advanced Mode features

- Add Dual Summarization Modes section (Standard vs Advanced)
- Document new UI improvements (two-column layout, model source selection)
- Add detailed Advanced Mode pipeline description (3 stages)
- Update usage instructions with step-by-step guide
- Document real-time outputs (thinking, summary, metrics)
- Reflect recent UI refactoring (radio buttons, unified model info)
README.md (CHANGED)

@@ -12,32 +12,68 @@ license: mit
# Tiny Scribe

A lightweight transcript summarization tool powered by local LLMs. Features 24+ preset models ranging from 100M to 30B parameters, plus the ability to load any GGUF model from HuggingFace Hub. Includes two summarization modes (Standard and Advanced 3-model pipeline), live streaming output, reasoning modes, and flexible deployment options.

## Features

### Core Capabilities

- **24+ Preset Models**: From tiny 100M models to powerful 30B models
- **Custom GGUF Loading**: Load any GGUF model from HuggingFace Hub with live search
- **Dual Summarization Modes**:
  - **Standard Mode**: Single-model direct summarization
  - **Advanced Mode**: 3-stage pipeline (Extraction → Deduplication → Synthesis)
- **Live Streaming**: Real-time summary generation with token-by-token output
- **Reasoning Modes**: Toggle thinking/reasoning for supported models (Qwen3, ERNIE, LFM2)
- **Thinking Buffer**: Automatic 50% context window extension when reasoning enabled
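
The thinking buffer is simple arithmetic; a minimal sketch (the function name is hypothetical, not the app's actual code):

```python
def effective_context(n_ctx: int, reasoning: bool) -> int:
    """Return the context window to allocate: 50% extra headroom
    for hidden thinking tokens when reasoning mode is enabled."""
    return int(n_ctx * 1.5) if reasoning else n_ctx
```

For example, a 4096-token window grows to 6144 tokens when reasoning is on.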

### User Interface

- **Clean Two-Column Layout**: Configuration (left) and output (right)
- **Model Source Selection**: Radio button toggle between Preset and Custom models
- **Real-Time Outputs**:
  - **Model Thinking Process**: See the AI's reasoning in real-time
  - **Final Summary**: Polished, formatted summary
  - **Generation Metrics**: Separate section for performance stats
- **Unified Model Information**: Displays specs for Standard (1 model) or Advanced (3 models)
- **Hardware Presets**: Free Tier (2 vCPUs), Upgrade (8 vCPUs), or Custom thread count
- **Language Support**: English or Traditional Chinese (zh-TW) output via OpenCC
- **Auto Settings**: Temperature, top_p, and top_k auto-populate per model

## Usage

### Quick Start (Standard Mode)

1. **Configure Global Settings**:
   - **Output Language**: Choose English or Traditional Chinese (zh-TW)
   - **Input Content**: Upload a .txt file or paste your transcript
   - **Hardware Configuration**: Select CPU thread preset (Free Tier, Upgrade, or Custom)
2. **Select Summarization Mode**:
   - **Standard Mode**: Single-model direct summarization (faster, simpler)
   - **Advanced Mode**: 3-model pipeline with extraction, deduplication, synthesis (higher quality)
3. **Choose Model** (Standard Mode):
   - **Preset Models**: Select from 24+ curated models
   - **Custom GGUF**: Search and load any GGUF from HuggingFace Hub
4. **Configure Inference Parameters** (optional):
   - Temperature, Top-p, Top-k (auto-populated with model defaults)
   - Max Output Tokens
   - Enable/disable reasoning mode (for supported models)
5. **Generate Summary**: Click "✨ Generate Summary" and watch:
   - **Model Thinking Process** (left): AI's reasoning in real-time
   - **Final Summary** (right): Polished result
   - **Generation Metrics**: Performance stats (tokens/sec, generation time)
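
Showing separate thinking and summary panes implies splitting the model's raw output on its reasoning tags. A sketch assuming Qwen3-style `<think>…</think>` markers (the helper name is hypothetical):

```python
def split_thinking(raw: str) -> tuple[str, str]:
    """Split raw model output into (thinking, summary).

    Assumes the model wraps its reasoning in <think>...</think>,
    as Qwen3-style models do; output without the tags yields an
    empty thinking section.
    """
    open_tag, close_tag = "<think>", "</think>"
    if open_tag in raw and close_tag in raw:
        start = raw.index(open_tag) + len(open_tag)
        end = raw.index(close_tag)
        return raw[start:end].strip(), raw[end + len(close_tag):].strip()
    return "", raw.strip()
```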

### Advanced Mode (3-Model Pipeline)

For higher quality summarization with large transcripts:

1. **Stage 1 - Extraction**: Small model (≤1.7B) extracts key points from windows
2. **Stage 2 - Deduplication**: Embedding model removes duplicate items
3. **Stage 3 - Synthesis**: Large model (1B-30B) generates executive summary

Configure each stage independently with dedicated model, context window, and inference settings.
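
Wired together, the three stages might look like the following structural sketch, with each model call abstracted as a callable. The character-based windowing and the cosine-similarity threshold are illustrative assumptions, not the app's actual implementation:

```python
from typing import Callable

def advanced_summarize(
    transcript: str,
    extract: Callable[[str], list[str]],   # Stage 1: small model (≤1.7B)
    embed: Callable[[str], list[float]],   # Stage 2: embedding model
    synthesize: Callable[[str], str],      # Stage 3: large model (1B-30B)
    window_chars: int = 4000,
    sim_threshold: float = 0.9,
) -> str:
    # Stage 1: extract key points from fixed-size windows of the transcript.
    windows = [transcript[i:i + window_chars]
               for i in range(0, len(transcript), window_chars)]
    points = [p for w in windows for p in extract(w)]

    # Stage 2: drop points whose embeddings are near-duplicates of kept ones.
    def cosine(a: list[float], b: list[float]) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        na = sum(x * x for x in a) ** 0.5
        nb = sum(y * y for y in b) ** 0.5
        return dot / (na * nb) if na and nb else 0.0

    kept: list[str] = []
    kept_vecs: list[list[float]] = []
    for p in points:
        v = embed(p)
        if all(cosine(v, kv) < sim_threshold for kv in kept_vecs):
            kept.append(p)
            kept_vecs.append(v)

    # Stage 3: synthesize an executive summary from the deduplicated points.
    return synthesize("\n".join(kept))
```

In the app, `extract` and `synthesize` would wrap the small and large GGUF models and `embed` the embedding model, each configured with its own context window and inference settings.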

## Custom GGUF Models