Update README.md
update description
README.md
CHANGED
@@ -17,6 +17,8 @@ pipeline_tag: automatic-speech-recognition
 
 This is an 8-bit quantized [MLX](https://github.com/ml-explore/mlx) version of [Voxtral Mini 4B Realtime](https://huggingface.co/mistralai/Voxtral-Mini-4B-Realtime-2602) by Mistral AI, converted using [voxmlx](https://github.com/awni/voxmlx).
 
+This version was created for use with [Supervoxtral](https://github.com/jphme/supervoxtral), enabling blazingly-fast realtime transcription on MacOS.
+
 ## Model Details
 
 - **Base model:** [mistralai/Voxtral-Mini-4B-Realtime-2602](https://huggingface.co/mistralai/Voxtral-Mini-4B-Realtime-2602)
@@ -29,42 +31,6 @@ This is an 8-bit quantized [MLX](https://github.com/ml-explore/mlx) version of [
 
 Voxtral Mini is a speech-to-text model that supports 13+ languages with sub-500ms latency. This version has been quantized to 8-bit precision for efficient inference on Apple Silicon using the MLX framework.
 
-## Usage
-
-Install voxmlx:
-
-```bash
-pip install voxmlx
-```
-
-Transcribe from a file:
-
-```python
-import voxmlx
-
-model = voxmlx.load("ellamind/Voxtral-Mini-4B-Realtime-8bit-mlx")
-text = voxmlx.transcribe(model, "audio.wav")
-print(text)
-```
-
-Transcribe from microphone:
-
-```bash
-voxmlx-transcribe --model ellamind/Voxtral-Mini-4B-Realtime-8bit-mlx
-```
-
-## Conversion
-
-This model was converted using [voxmlx](https://github.com/awni/voxmlx):
-
-```bash
-pip install voxmlx
-voxmlx-convert \
---hf-path mistralai/Voxtral-Mini-4B-Realtime-2602 \
---mlx-path voxtral-mlx-8bit \
---quantize \
---bits 8
-```
 
 ## Credits
 