Luigi committed on
Commit e187d3a · 1 Parent(s): fd459ca

feat: Add Breeze 3B Q4 model (mradermacher/breeze-3b-GGUF)


- Added Breeze 3B with Q4_K_M quantization (2.0GB)
- 32K context window based on Qwen2.5-Coder architecture
- Updated README: 22→23 models

Files changed (2)
  1. README.md +5 -4
  2. app.py +14 -0
README.md CHANGED
@@ -12,11 +12,11 @@ license: mit
 
 # Tiny Scribe
 
-A lightweight transcript summarization tool powered by local LLMs. Features 22 models ranging from 100M to 30B parameters with live streaming output, reasoning modes, and flexible deployment options.
+A lightweight transcript summarization tool powered by local LLMs. Features 23 models ranging from 100M to 30B parameters with live streaming output, reasoning modes, and flexible deployment options.
 
 ## Features
 
-- **22 Local Models**: From tiny 100M models to powerful 30B models
+- **23 Local Models**: From tiny 100M models to powerful 30B models
 - **Live Streaming**: Real-time summary generation with token-by-token output
 - **Model Selection**: Dropdown to choose from 22 available models
 - **Reasoning Modes**: Toggle thinking/reasoning for supported models (Qwen3, ERNIE, LFM2)
@@ -26,7 +26,7 @@ A lightweight transcript summarization tool powered by local LLMs. Features 22 m
 - **Language Support**: English or Traditional Chinese (zh-TW) output via OpenCC
 - **Auto Settings**: Temperature, top_p, and top_k sliders auto-populate per model
 
-## Model Registry (22 Models)
+## Model Registry (23 Models)
 
 ### Tiny Models (0.1-0.6B)
 - **Falcon-H1-100M** - 100M parameters, 4K context
@@ -48,6 +48,7 @@ A lightweight transcript summarization tool powered by local LLMs. Features 22 m
 
 ### Standard Models (3-7B)
 - **Granite-3.1-3B-A800M** - 3B parameters, 4K context
+- **Breeze-3B-Q4** - 3B parameters, 32K context
 - **Qwen3-4B-Thinking** - 4B parameters, 8K context (reasoning)
 - **Granite-4.0-Tiny-7B** - 7B parameters, 8K context
 
@@ -61,7 +62,7 @@ A lightweight transcript summarization tool powered by local LLMs. Features 22 m
 ## Usage
 
 1. **Select Output Language**: Choose English or Traditional Chinese (zh-TW)
-2. **Select Model**: Choose from the dropdown of 22 available models
+2. **Select Model**: Choose from the dropdown of 23 available models
 3. **Configure Settings** (optional):
    - Enable "Use Reasoning Mode" for thinking models
    - Adjust Temperature, Top-p, and Top-k (auto-populated per model)
app.py CHANGED
@@ -218,6 +218,20 @@ AVAILABLE_MODELS = {
             "repeat_penalty": 1.1,
         },
     },
+    "breeze_3b_q4": {
+        "name": "Breeze 3B Q4 (32K Context)",
+        "repo_id": "mradermacher/breeze-3b-GGUF",
+        "filename": "*Q4_K_M.gguf",
+        "max_context": 32768,
+        "default_temperature": 0.6,
+        "supports_toggle": False,
+        "inference_settings": {
+            "temperature": 0.6,
+            "top_p": 0.95,
+            "top_k": 20,
+            "repeat_penalty": 1.0,
+        },
+    },
     "granite_3_1_3b_q4": {
         "name": "Granite 3.1 3B-A800M Instruct (128K Context)",
         "repo_id": "bartowski/granite-3.1-3b-a800m-instruct-GGUF",