Improve OpenMOSS audio tokenizer model card
README.md
CHANGED
@@ -4,7 +4,9 @@ language:
 - en
 license: apache-2.0
 library_name: mlx
-pipeline_tag:
+pipeline_tag: feature-extraction
+base_model: OpenMOSS-Team/MOSS-Audio-Tokenizer
+base_model_relation: quantized
 tags:
 - mlx
 - audio
@@ -16,11 +18,11 @@ tags:
 - 8bit
 ---
 
-# OpenMOSS Audio Tokenizer — MLX
+# OpenMOSS Audio Tokenizer — MLX 8-bit
 
-
+This repository contains an MLX-native int8 conversion of the OpenMOSS audio tokenizer for Apple Silicon.
 
-
+It is a supporting model that encodes and decodes audio tokens for the OpenMOSS TTS family. It is not a standalone speech generation model.
 
 ## Variants
 
@@ -28,9 +30,17 @@ This is a supporting model — it encodes and decodes audio tokens for the MOSS
 | --- | --- |
 | `mlx-int8/` | int8 quantized weights |
 
+## Model Details
+
+- Developed by: AppAutomaton
+- Shared by: AppAutomaton on Hugging Face
+- Upstream model: [`OpenMOSS-Team/MOSS-Audio-Tokenizer`](https://huggingface.co/OpenMOSS-Team/MOSS-Audio-Tokenizer)
+- Task: audio tokenization and codec decoding
+- Runtime: MLX on Apple Silicon
+
 ## How to Get Started
 
-Load
+Load it directly with [`mlx-speech`](https://github.com/appautomaton/mlx-speech):
 
 ```python
 from mlx_speech.models.moss_audio_tokenizer import MossAudioTokenizerModel
@@ -38,7 +48,7 @@ from mlx_speech.models.moss_audio_tokenizer import MossAudioTokenizerModel
 model = MossAudioTokenizerModel.from_path("mlx-int8")
 ```
 
-The tokenizer is loaded automatically when you run
+The tokenizer is loaded automatically when you run OpenMOSS generation scripts. You usually do not need to instantiate it directly.
 
 ```bash
 python scripts/generate_moss_local.py \
@@ -46,12 +56,17 @@ python scripts/generate_moss_local.py \
 --output outputs/out.wav
 ```
 
-##
+## Notes
+
+- This repo contains the quantized MLX runtime artifact only.
+- The conversion remaps the original OpenMOSS audio tokenizer weights explicitly for MLX inference.
+- The artifact is shared by the OpenMOSS local TTS, TTSD, and SoundEffect runtime paths in this repo.
 
-
+## Links
 
-
+- Source code: [mlx-speech](https://github.com/appautomaton/mlx-speech)
+- More examples: [AppAutomaton](https://github.com/appautomaton)
 
 ## License
 
-Apache 2.0 — following the upstream [OpenMOSS](https://
+Apache 2.0 — following the upstream license published with [`OpenMOSS-Team/MOSS-Audio-Tokenizer`](https://huggingface.co/OpenMOSS-Team/MOSS-Audio-Tokenizer).
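
The metadata this change adds to the front matter (`pipeline_tag`, `base_model`, `base_model_relation`) can be sanity-checked with a short script. The following is a minimal sketch and not part of the commit: `read_front_matter`, `missing_keys`, and `REQUIRED_KEYS` are hypothetical names, the parser only handles flat `key: value` lines rather than full YAML, and `SAMPLE` is an abbreviated stand-in for the real README built from the lines visible in the diff.

```python
# Hypothetical list of keys a reviewer might want present after this change.
REQUIRED_KEYS = (
    "license",
    "library_name",
    "pipeline_tag",
    "base_model",
    "base_model_relation",
)


def read_front_matter(text):
    """Return flat `key: value` pairs found between the leading `---` fences."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front matter block at the top of the file
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break  # closing fence reached
        # Skip list items, indented lines, comments, and lines without a colon.
        if line.startswith((" ", "-", "#")) or ":" not in line:
            continue
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


def missing_keys(text):
    """List the required keys that are absent or have an empty value."""
    fields = read_front_matter(text)
    return [key for key in REQUIRED_KEYS if not fields.get(key)]


# Abbreviated sample assembled from the lines shown in the diff (assumption:
# the real file contains more tags than are listed here).
SAMPLE = """---
language:
- en
license: apache-2.0
library_name: mlx
pipeline_tag: feature-extraction
base_model: OpenMOSS-Team/MOSS-Audio-Tokenizer
base_model_relation: quantized
tags:
- mlx
- audio
---
# OpenMOSS Audio Tokenizer
"""

if __name__ == "__main__":
    print(missing_keys(SAMPLE))
```

Run against the pre-change front matter, where `pipeline_tag` has no usable value and the two `base_model` keys do not exist, the same check would report those three keys as missing.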