feat: Add Breeze 3B Q4 model (mradermacher/breeze-3b-GGUF)

- Added Breeze 3B with Q4_K_M quantization (2.0GB)
- 32K context window based on Qwen2.5-Coder architecture
- Updated README: 22→23 models
README.md
CHANGED
@@ -12,11 +12,11 @@ license: mit
 
 # Tiny Scribe
 
-A lightweight transcript summarization tool powered by local LLMs. Features 22 models ranging from 100M to 30B parameters with live streaming output, reasoning modes, and flexible deployment options.
+A lightweight transcript summarization tool powered by local LLMs. Features 23 models ranging from 100M to 30B parameters with live streaming output, reasoning modes, and flexible deployment options.
 
 ## Features
 
-- **22 Local Models**: From tiny 100M models to powerful 30B models
+- **23 Local Models**: From tiny 100M models to powerful 30B models
 - **Live Streaming**: Real-time summary generation with token-by-token output
 - **Model Selection**: Dropdown to choose from 22 available models
 - **Reasoning Modes**: Toggle thinking/reasoning for supported models (Qwen3, ERNIE, LFM2)
@@ -26,7 +26,7 @@ A lightweight transcript summarization tool powered by local LLMs. Features 22 m
 - **Language Support**: English or Traditional Chinese (zh-TW) output via OpenCC
 - **Auto Settings**: Temperature, top_p, and top_k sliders auto-populate per model
 
-## Model Registry (22 Models)
+## Model Registry (23 Models)
 
 ### Tiny Models (0.1-0.6B)
 - **Falcon-H1-100M** - 100M parameters, 4K context
@@ -48,6 +48,7 @@ A lightweight transcript summarization tool powered by local LLMs. Features 22 m
 
 ### Standard Models (3-7B)
 - **Granite-3.1-3B-A800M** - 3B parameters, 4K context
+- **Breeze-3B-Q4** - 3B parameters, 32K context
 - **Qwen3-4B-Thinking** - 4B parameters, 8K context (reasoning)
 - **Granite-4.0-Tiny-7B** - 7B parameters, 8K context
 
@@ -61,7 +62,7 @@ A lightweight transcript summarization tool powered by local LLMs. Features 22 m
 ## Usage
 
 1. **Select Output Language**: Choose English or Traditional Chinese (zh-TW)
-2. **Select Model**: Choose from the dropdown of 22 available models
+2. **Select Model**: Choose from the dropdown of 23 available models
 3. **Configure Settings** (optional):
    - Enable "Use Reasoning Mode" for thinking models
    - Adjust Temperature, Top-p, and Top-k (auto-populated per model)
app.py
CHANGED
@@ -218,6 +218,20 @@ AVAILABLE_MODELS = {
             "repeat_penalty": 1.1,
         },
     },
+    "breeze_3b_q4": {
+        "name": "Breeze 3B Q4 (32K Context)",
+        "repo_id": "mradermacher/breeze-3b-GGUF",
+        "filename": "*Q4_K_M.gguf",
+        "max_context": 32768,
+        "default_temperature": 0.6,
+        "supports_toggle": False,
+        "inference_settings": {
+            "temperature": 0.6,
+            "top_p": 0.95,
+            "top_k": 20,
+            "repeat_penalty": 1.0,
+        },
+    },
     "granite_3_1_3b_q4": {
         "name": "Granite 3.1 3B-A800M Instruct (128K Context)",
         "repo_id": "bartowski/granite-3.1-3b-a800m-instruct-GGUF",