tags:
- stt
pipeline_tag: automatic-speech-recognition
language:
- multilingual
---

# mlx-community/Fun-ASR-MLT-Nano-2512-8bit

This model was converted to MLX format from [FunAudioLLM/Fun-ASR-MLT-Nano-2512](https://huggingface.co/FunAudioLLM/Fun-ASR-MLT-Nano-2512) using [mlx-audio-plus](https://github.com/DePasqualeOrg/mlx-audio-plus) version **0.1.4**.

| Feature | Description |
|---------|-------------|
| **Multilingual** | Supports 31 languages, with a focus on East and Southeast Asian languages |
| **Chinese dialects** | Supports 7 major Chinese dialects |
| **Code-switching** | Handles mixed-language speech within sentences |
| **Translation** | Translates speech directly to English text |
| **Custom prompting** | Guides recognition with domain-specific context |
| **Streaming** | Real-time token-by-token output |

## Installation

```bash
pip install -U mlx-audio-plus
```

## Usage

```python
from mlx_audio.stt.models.funasr import Model

# Load the model
model = Model.from_pretrained("mlx-community/Fun-ASR-MLT-Nano-2512-8bit")

# Transcribe audio
result = model.generate("audio.wav")
```

### Streaming

```python
for chunk in model.generate("audio.wav", stream=True):
    print(chunk, end="", flush=True)
```
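For programmatic use, the streamed chunks can also be collected into a single string. A minimal sketch (the `stream_transcript` helper is illustrative, not part of the library; it assumes only that `generate(..., stream=True)` yields text chunks, as in the example above):

```python
def stream_transcript(model, audio_path):
    """Collect streamed text chunks into one transcript string.

    Assumes `model.generate(audio_path, stream=True)` yields text
    chunks, as in the streaming example above.
    """
    return "".join(model.generate(audio_path, stream=True))
```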
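Multiple files can be transcribed by looping the single-file call. A minimal sketch (the helper names are illustrative; it assumes `model.generate(path)` returns a result with a `.text` attribute, as in the usage example above):

```python
from pathlib import Path

def collect_wavs(folder):
    """Return sorted .wav file paths under `folder` (illustrative helper)."""
    return sorted(str(p) for p in Path(folder).glob("*.wav"))

def transcribe_folder(model, folder):
    """Map each .wav path under `folder` to its transcript.

    Assumes `model.generate(path)` returns a result object with a
    `.text` attribute.
    """
    return {path: model.generate(path).text for path in collect_wavs(folder)}
```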
## Supported Languages

See [original model](https://huggingface.co/FunAudioLLM/Fun-ASR-MLT-Nano-2512) for the full list of supported languages.