Commit 8b690ef (verified), committed by tamarher
1 Parent(s): 51a2eb1

Improve OpenMOSS audio tokenizer model card

Files changed (1)
  1. README.md +25 -10
README.md CHANGED
@@ -4,7 +4,9 @@ language:
  - en
  license: apache-2.0
  library_name: mlx
- pipeline_tag: audio-to-audio
+ pipeline_tag: feature-extraction
+ base_model: OpenMOSS-Team/MOSS-Audio-Tokenizer
+ base_model_relation: quantized
  tags:
  - mlx
  - audio
@@ -16,11 +18,11 @@ tags:
  - 8bit
  ---
 
- # OpenMOSS Audio Tokenizer — MLX
+ # OpenMOSS Audio Tokenizer — MLX 8-bit
 
- The CAT codec component from the [OpenMOSS](https://github.com/open-moss) project, converted and quantized for native MLX inference on Apple Silicon.
+ This repository contains an MLX-native int8 conversion of the OpenMOSS audio tokenizer for Apple Silicon.
 
- This is a supporting model — it encodes and decodes audio tokens for the MOSS TTS model family. It is not a standalone TTS model.
+ It is a supporting model that encodes and decodes audio tokens for the OpenMOSS TTS family. It is not a standalone speech generation model.
 
  ## Variants
 
@@ -28,9 +30,17 @@ This is a supporting model — it encodes and decodes audio tokens for the MOSS
  | --- | --- |
  | `mlx-int8/` | int8 quantized weights |
 
+ ## Model Details
+
+ - Developed by: AppAutomaton
+ - Shared by: AppAutomaton on Hugging Face
+ - Upstream model: [`OpenMOSS-Team/MOSS-Audio-Tokenizer`](https://huggingface.co/OpenMOSS-Team/MOSS-Audio-Tokenizer)
+ - Task: audio tokenization and codec decoding
+ - Runtime: MLX on Apple Silicon
+
  ## How to Get Started
 
- Load via [mlx-speech](https://github.com/appautomaton/mlx-speech):
+ Load it directly with [`mlx-speech`](https://github.com/appautomaton/mlx-speech):
 
  ```python
  from mlx_speech.models.moss_audio_tokenizer import MossAudioTokenizerModel
@@ -38,7 +48,7 @@ from mlx_speech.models.moss_audio_tokenizer import MossAudioTokenizerModel
  model = MossAudioTokenizerModel.from_path("mlx-int8")
  ```
 
- The tokenizer is loaded automatically when you run any MOSS TTS generation script. You typically do not need to load it directly.
+ The tokenizer is loaded automatically when you run OpenMOSS generation scripts. You usually do not need to instantiate it directly.
 
  ```bash
  python scripts/generate_moss_local.py \
@@ -46,12 +56,17 @@ python scripts/generate_moss_local.py \
  --output outputs/out.wav
  ```
 
- ## Model Details
+ ## Notes
+
+ - This repo contains the quantized MLX runtime artifact only.
+ - The conversion remaps the original OpenMOSS audio tokenizer weights explicitly for MLX inference.
+ - The artifact is shared by the OpenMOSS local TTS, TTSD, and SoundEffect runtime paths in this repo.
 
- Converted from the original OpenMOSS checkpoint using explicit MLX weight remapping — no PyTorch at inference time. Quantized to int8 with `W8Abf16` mixed precision.
+ ## Links
 
- See [mlx-speech](https://github.com/appautomaton/mlx-speech) for the full conversion pipeline and runtime code.
+ - Source code: [mlx-speech](https://github.com/appautomaton/mlx-speech)
+ - More examples: [AppAutomaton](https://github.com/appautomaton)
 
  ## License
 
- Apache 2.0 — following the upstream [OpenMOSS](https://github.com/open-moss) license terms.
+ Apache 2.0 — following the upstream license published with [`OpenMOSS-Team/MOSS-Audio-Tokenizer`](https://huggingface.co/OpenMOSS-Team/MOSS-Audio-Tokenizer).