NeoRoth committed on
Commit 14de29c · verified · 1 parent: 33a7ce1

Upload README.md with huggingface_hub

  1. README.md +59 -0
README.md ADDED
@@ -0,0 +1,59 @@
+ ---
+ license: apache-2.0
+ base_model: mistralai/Voxtral-Mini-4B-Realtime-2602
+ tags:
+ - mlx
+ - safetensors
+ - voxtral
+ - realtime
+ - mxfp4
+ - mixed-precision
+ - apple-silicon
+ - speech-to-text
+ language:
+ - fr
+ - en
+ - de
+ - es
+ - it
+ - pt
+ - nl
+ - pl
+ - ru
+ - uk
+ - ja
+ - ko
+ - zh
+ - ar
+ - hi
+ library_name: mlx
+ pipeline_tag: automatic-speech-recognition
+ ---
+
+ # Voxtral Mini 4B Realtime 2602 - MLX
+
+ MLX quantized weights for [Voxtral-Mini-4B-Realtime-2602](https://huggingface.co/mistralai/Voxtral-Mini-4B-Realtime-2602) on Apple Silicon.
+
+ ## Architecture
+
+ Voxtral Realtime differs from the standard Voxtral Mini 3B in two ways. It uses **additive fusion** (audio embeddings are summed with token embeddings) instead of concatenating the two along the sequence axis. It also uses a **causal transformer encoder** with sliding-window attention (window=750) instead of a bidirectional Whisper-style encoder. Together, these choices make it suitable for streaming, low-latency inference.
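
The two architectural differences described above can be illustrated with a small pure-Python sketch. It is not the model's implementation: the function names are hypothetical, the toy shapes are made up, and it assumes one audio frame per token position and a window that counts the current position plus the positions immediately before it.

```python
def additive_fusion(audio_emb, token_emb):
    """Sum two equally long [T][D] embedding sequences position by position."""
    if len(audio_emb) != len(token_emb):
        # Assumption: one audio frame is aligned with each token position.
        raise ValueError("additive fusion needs one audio frame per token position")
    return [[a + t for a, t in zip(a_row, t_row)]
            for a_row, t_row in zip(audio_emb, token_emb)]


def concat_fusion(audio_emb, token_emb):
    """Baseline for contrast: join the two sequences along the sequence axis."""
    return list(audio_emb) + list(token_emb)


def sliding_window_causal_mask(seq_len, window):
    """mask[i][j] is True when query position i may attend to key position j.

    Assumed convention: position i sees itself and the window - 1 positions
    before it, and never a future position.
    """
    return [[(j <= i) and (i - j < window) for j in range(seq_len)]
            for i in range(seq_len)]


audio = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]    # 3 frames, embedding dim 2
tokens = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]   # 3 token positions, dim 2

fused = additive_fusion(audio, tokens)    # sequence length stays 3
stacked = concat_fusion(audio, tokens)    # sequence length grows to 6
mask = sliding_window_causal_mask(seq_len=5, window=3)
```

With additive fusion the sequence length does not grow with the audio, and with a causal, windowed mask each position depends only on a bounded amount of past context, which is what allows output to be produced as audio arrives.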
+
+ ## Variants
+
+ | Folder | Quantization | Description |
+ |--------|-------------|-------------|
+ | `mlx-mxfp4-mixed/` | **MXFP4 mixed precision** | Text decoder: MXFP4 4-bit (group_size=32). Audio encoder/projector: 8-bit affine (group_size=64). Embeddings: full precision. |
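
As a rough guide to what the group sizes in the table mean for storage, the sketch below estimates the effective bits per weight of each scheme. The per-group metadata sizes are assumptions, not figures from this repository: one shared 8-bit scale per MXFP4 block, and a 16-bit scale plus a 16-bit bias per affine group. The helper names and the parameter count in the last line are purely illustrative.

```python
def bits_per_weight(element_bits, group_size, metadata_bits_per_group):
    """Effective storage cost: element bits plus per-group metadata spread over the group."""
    return element_bits + metadata_bits_per_group / group_size


def size_gib(num_params, bpw):
    """Approximate size in GiB of num_params weights stored at bpw bits each."""
    return num_params * bpw / 8 / 2**30


# Text decoder: 4-bit elements, group_size=32.
# ASSUMPTION: one shared 8-bit scale per group.
mxfp4_bpw = bits_per_weight(4, 32, 8)

# Audio encoder/projector: 8-bit elements, group_size=64.
# ASSUMPTION: one 16-bit scale and one 16-bit bias per group.
affine8_bpw = bits_per_weight(8, 64, 32)

print(f"MXFP4, group_size=32:        {mxfp4_bpw} bits/weight")
print(f"8-bit affine, group_size=64: {affine8_bpw} bits/weight")

# Illustrative only: 1e9 is a placeholder, not the decoder's real parameter count.
print(f"1e9 weights at MXFP4: {size_gib(1e9, mxfp4_bpw):.2f} GiB")
```

Under these assumptions the metadata overhead is small in both cases (0.25 and 0.5 bits per weight), so the choice of element width dominates the footprint.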
+
+ ## License
+
+ This model is distributed under the **Apache 2.0** license, matching the license of the upstream [Voxtral-Mini-4B-Realtime-2602](https://huggingface.co/mistralai/Voxtral-Mini-4B-Realtime-2602) model.
+
+ ## Requirements
+
+ - Apple Silicon (M1 or later; M5 or later recommended for native MXFP4)
+ - MLX >= 0.30.0
+ - Python 3.11+
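
The requirements above can be checked before loading the weights. The following is a minimal sketch using only the standard library; `check_requirements` and `parse_version` are hypothetical helpers, and the sketch assumes that Apple Silicon shows up as macOS (`Darwin`) with machine type `arm64`, which will not hold for an x86_64 Python running under Rosetta. It does not try to detect the chip generation, so the M5 recommendation is not checked.

```python
import platform
import sys
from importlib import metadata


def parse_version(text):
    """Turn '0.30.1' into (0, 30, 1).

    Keeps the leading digits of each dot-separated component, stops at the
    first component that has none (e.g. 'dev1'), and pads to three components.
    """
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def check_requirements(min_mlx=(0, 30, 0), min_python=(3, 11)):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    # ASSUMPTION: Apple Silicon is reported as Darwin + arm64.
    if platform.system() != "Darwin" or platform.machine() != "arm64":
        problems.append("not running natively on Apple Silicon macOS")
    if sys.version_info[:2] < min_python:
        problems.append("Python %d.%d+ is required" % min_python)
    try:
        installed = metadata.version("mlx")
    except metadata.PackageNotFoundError:
        problems.append("mlx is not installed")
    else:
        if parse_version(installed) < min_mlx:
            problems.append("mlx %s is older than %d.%d.%d" % ((installed,) + min_mlx))
    return problems


for problem in check_requirements():
    print("requirement not met:", problem)
```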
+
+ ## Source
+
+ Converted from [mistralai/Voxtral-Mini-4B-Realtime-2602](https://huggingface.co/mistralai/Voxtral-Mini-4B-Realtime-2602) using [oriloq-mlx](https://github.com/oriloq-s/oriloq-mlx).