Luigi committed
Commit 7b25e5e · 1 Parent(s): 37c9868

docs: update README with UI improvements and Advanced Mode features

- Add Dual Summarization Modes section (Standard vs Advanced)
- Document new UI improvements (two-column layout, model source selection)
- Add detailed Advanced Mode pipeline description (3 stages)
- Update usage instructions with step-by-step guide
- Document real-time outputs (thinking, summary, metrics)
- Reflect recent UI refactoring (radio buttons, unified model info)

Files changed (1):
  README.md (+49, -13)
@@ -12,32 +12,68 @@ license: mit
 
 # Tiny Scribe
 
-A lightweight transcript summarization tool powered by local LLMs. Features 24+ preset models ranging from 100M to 30B parameters, plus the ability to load any GGUF model from HuggingFace Hub. Includes live streaming output, reasoning modes, and flexible deployment options.
+A lightweight transcript summarization tool powered by local LLMs. Features 24+ preset models ranging from 100M to 30B parameters, plus the ability to load any GGUF model from HuggingFace Hub. Includes two summarization modes (Standard and Advanced 3-model pipeline), live streaming output, reasoning modes, and flexible deployment options.
 
 ## Features
 
+### Core Capabilities
 - **24+ Preset Models**: From tiny 100M models to powerful 30B models
-- **Custom GGUF Loading**: Load any GGUF model from HuggingFace Hub
-- **Tabbed Interface**: Clean separation between Preset Models and Custom GGUF
+- **Custom GGUF Loading**: Load any GGUF model from HuggingFace Hub with live search
+- **Dual Summarization Modes**:
+  - **Standard Mode**: Single-model direct summarization
+  - **Advanced Mode**: 3-stage pipeline (Extraction → Deduplication → Synthesis)
 - **Live Streaming**: Real-time summary generation with token-by-token output
 - **Reasoning Modes**: Toggle thinking/reasoning for supported models (Qwen3, ERNIE, LFM2)
 - **Thinking Buffer**: Automatic 50% context window extension when reasoning enabled
+
+### User Interface
+- **Clean Two-Column Layout**: Configuration (left) and output (right)
+- **Model Source Selection**: Radio button toggle between Preset and Custom models
+- **Real-Time Outputs**:
+  - **Model Thinking Process**: See the AI's reasoning in real-time
+  - **Final Summary**: Polished, formatted summary
+  - **Generation Metrics**: Separate section for performance stats
+- **Unified Model Information**: Displays specs for Standard (1 model) or Advanced (3 models)
 - **Hardware Presets**: Free Tier (2 vCPUs), Upgrade (8 vCPUs), or Custom thread count
-- **File Upload**: Upload .txt files to summarize
 - **Language Support**: English or Traditional Chinese (zh-TW) output via OpenCC
 - **Auto Settings**: Temperature, top_p, and top_k auto-populate per model
 
 ## Usage
 
-1. **Upload File**: Upload a .txt file containing your transcript
-2. **Select Output Language**: Choose English or Traditional Chinese (zh-TW)
-3. **Choose Model**:
-   - **Preset Models tab**: Select from 24+ curated models
-   - **Custom GGUF tab**: Search and load any GGUF from HuggingFace
-4. **Configure Settings** (optional in Advanced Settings):
-   - Hardware tier (CPU threads)
-   - Temperature, Top-p, Top-k inference parameters
-5. **Click Generate Summary**: Watch the thinking process and summary appear in real-time!
+### Quick Start (Standard Mode)
+
+1. **Configure Global Settings**:
+   - **Output Language**: Choose English or Traditional Chinese (zh-TW)
+   - **Input Content**: Upload a .txt file or paste your transcript
+   - **Hardware Configuration**: Select CPU thread preset (Free Tier, Upgrade, or Custom)
+
+2. **Select Summarization Mode**:
+   - **Standard Mode**: Single-model direct summarization (faster, simpler)
+   - **Advanced Mode**: 3-model pipeline with extraction, deduplication, synthesis (higher quality)
+
+3. **Choose Model** (Standard Mode):
+   - **Preset Models**: Select from 24+ curated models
+   - **Custom GGUF**: Search and load any GGUF from HuggingFace Hub
+
+4. **Configure Inference Parameters** (optional):
+   - Temperature, Top-p, Top-k (auto-populated with model defaults)
+   - Max Output Tokens
+   - Enable/disable reasoning mode (for supported models)
+
+5. **Generate Summary**: Click "✨ Generate Summary" and watch:
+   - **Model Thinking Process** (left): AI's reasoning in real-time
+   - **Final Summary** (right): Polished result
+   - **Generation Metrics**: Performance stats (tokens/sec, generation time)
+
+### Advanced Mode (3-Model Pipeline)
+
+For higher quality summarization with large transcripts:
+
+1. **Stage 1 - Extraction**: Small model (≤1.7B) extracts key points from windows
+2. **Stage 2 - Deduplication**: Embedding model removes duplicate items
+3. **Stage 3 - Synthesis**: Large model (1B-30B) generates executive summary
+
+Configure each stage independently with dedicated model, context window, and inference settings.
 
 ## Custom GGUF Models
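The Extraction → Deduplication → Synthesis flow the updated README describes can be sketched roughly as below. This is a hypothetical stand-in, not the app's actual code: each stage is stubbed with trivial string logic so only the control flow is shown, and every function name (`split_into_windows`, `extract_key_points`, etc.) is invented for illustration. In the real pipeline, Stage 1 would call a small GGUF model per window, Stage 2 would compare embeddings rather than exact strings, and Stage 3 would prompt a larger model.

```python
def split_into_windows(transcript: str, window_size: int = 400) -> list[str]:
    """Chunk the transcript into fixed-size word windows for Stage 1."""
    words = transcript.split()
    return [" ".join(words[i:i + window_size])
            for i in range(0, len(words), window_size)]

def extract_key_points(window: str) -> list[str]:
    """Stage 1 stand-in: a small model (<=1.7B) would extract key points;
    here we just split the window into sentences."""
    return [s.strip() for s in window.split(".") if s.strip()]

def deduplicate(points: list[str]) -> list[str]:
    """Stage 2 stand-in: an embedding model would drop near-duplicates;
    here we drop exact repeats (case-insensitive), preserving order."""
    seen: set[str] = set()
    unique = []
    for point in points:
        key = point.lower()
        if key not in seen:
            seen.add(key)
            unique.append(point)
    return unique

def synthesize(points: list[str]) -> str:
    """Stage 3 stand-in: a large model would write the executive summary;
    here we just render the surviving points as a bullet list."""
    return "Summary:\n" + "\n".join(f"- {p}" for p in points)

def advanced_summarize(transcript: str) -> str:
    """Run the 3-stage pipeline end to end."""
    points: list[str] = []
    for window in split_into_windows(transcript):
        points.extend(extract_key_points(window))
    return synthesize(deduplicate(points))
```

The windowing in Stage 1 is what lets a small model cover transcripts far larger than its context; deduplication then keeps the Stage 3 prompt from being padded with repeated points from overlapping or recurring content.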