OpenAudio-S1 Mini Int8 (model + code)

Files changed (10)

openaudio-s1-mini-int8/.gitattributes +35 -0
openaudio-s1-mini-int8/README.md +82 -0
openaudio-s1-mini-int8/code/fish-speech-int8.zip +3 -0
openaudio-s1-mini-int8/codec_int8.pth +3 -0
openaudio-s1-mini-int8/config.json +32 -0
openaudio-s1-mini-int8/languages.txt +13 -0
openaudio-s1-mini-int8/model.pth +3 -0
openaudio-s1-mini-int8/source.txt +1 -0
openaudio-s1-mini-int8/special_tokens.json +0 -0
openaudio-s1-mini-int8/tokenizer.tiktoken +0 -0

openaudio-s1-mini-int8/.gitattributes ADDED

	@@ -0,0 +1,35 @@

+*.7z filter=lfs diff=lfs merge=lfs -text
+*.arrow filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.ckpt filter=lfs diff=lfs merge=lfs -text
+*.ftz filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.joblib filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.mlmodel filter=lfs diff=lfs merge=lfs -text
+*.model filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.npy filter=lfs diff=lfs merge=lfs -text
+*.npz filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.parquet filter=lfs diff=lfs merge=lfs -text
+*.pb filter=lfs diff=lfs merge=lfs -text
+*.pickle filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.pth filter=lfs diff=lfs merge=lfs -text
+*.rar filter=lfs diff=lfs merge=lfs -text
+*.safetensors filter=lfs diff=lfs merge=lfs -text
+saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+*.tar.* filter=lfs diff=lfs merge=lfs -text
+*.tar filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.wasm filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zst filter=lfs diff=lfs merge=lfs -text
+*tfevents* filter=lfs diff=lfs merge=lfs -text
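
To see which files in this commit the rules above route through Git LFS, here is a minimal sketch. It uses Python's `fnmatch` as a stand-in for Git's own attribute matching, which is only an approximation and is assumed to hold for simple basename patterns such as `*.pth`. The helper name `is_lfs_tracked` and the pattern subset are illustrative, not part of the commit.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# Subset of the basename-only patterns from the .gitattributes above.
LFS_PATTERNS = ["*.bin", "*.pt", "*.pth", "*.safetensors", "*.zip"]

def is_lfs_tracked(path: str) -> bool:
    """Approximate Git's rule: a pattern without '/' is matched against the basename."""
    name = PurePosixPath(path).name
    return any(fnmatch(name, pattern) for pattern in LFS_PATTERNS)

files = [
    "openaudio-s1-mini-int8/model.pth",
    "openaudio-s1-mini-int8/codec_int8.pth",
    "openaudio-s1-mini-int8/code/fish-speech-int8.zip",
    "openaudio-s1-mini-int8/config.json",
    "openaudio-s1-mini-int8/tokenizer.tiktoken",
]
tracked = [f for f in files if is_lfs_tracked(f)]
```

Under this approximation, `tracked` holds the two `.pth` checkpoints and the `.zip` archive, which are the three files that appear in this diff as 3-line LFS pointers.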

openaudio-s1-mini-int8/README.md ADDED

	@@ -0,0 +1,82 @@

+---
+tags:
+- text-to-speech
+license: cc-by-nc-sa-4.0
+language:
+- zh
+- en
+- de
+- ja
+- fr
+- es
+- ko
+- ar
+- nl
+- ru
+- it
+- pl
+- pt
+pipeline_tag: text-to-speech
+inference: false
+base_model: fishaudio/openaudio-s1-mini
+---
+# OpenAudio S1-mini INT8 Quantized
+**INT8 weight-only quantized version** of [fishaudio/openaudio-s1-mini](https://huggingface.co/fishaudio/openaudio-s1-mini) for efficient GPU inference.
+## Model Size Comparison
+| Model | Original | INT8 | Reduction |
+|-------|----------|------|-----------|
+| LLaMA (model.pth) | 1.64 GB | 1.02 GB | -38% |
+| Codec (codec_int8.pth) | 1.74 GB | 0.91 GB | -48% |
+| **Total** | **3.38 GB** | **1.93 GB** | **-43%** |
+## Performance
+- RTF (Real-Time Factor): ~1.9x with reference caching
+- Tested on RTX 3090
+- Quality comparable to original FP16/BF16 model
+## Usage
+```python
+from voice_clone_tts import VoiceCloneTTS
+# Both checkpoints are loaded from the same quantized repository.
+tts = VoiceCloneTTS(
+    llama_checkpoint_path="ORI-Muchim/openaudio-s1-mini-int8",
+    decoder_checkpoint_path="ORI-Muchim/openaudio-s1-mini-int8",
+)
+# Returns the synthesized audio and its sample rate.
+audio, sr = tts.synthesize(
+    text="Hello, this is a test.",
+    reference_audio="reference.wav",  # Optional: for voice cloning
+)
+```
+## Files
+- `model.pth` - INT8 quantized LLaMA model (1.02 GB)
+- `codec_int8.pth` - INT8 quantized DAC codec (0.91 GB)
+- `config.json` - Model configuration
+- `tokenizer.tiktoken` - Tokenizer
+- `special_tokens.json` - Special tokens
+## Quantization Method
+Weight-only INT8 quantization with per-channel scales:
+- Weights stored as INT8
+- Scales stored as BF16
+- Activations remain in FP16/BF16
+## Credits
+- Original model: [Fish Audio](https://fish.audio) / [fishaudio/openaudio-s1-mini](https://huggingface.co/fishaudio/openaudio-s1-mini)
+- Quantization: ORI-Muchim
+## License
+CC-BY-NC-SA-4.0 (Non-commercial use only)
+See the original model for full license terms.
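
The "Quantization Method" section of the README above describes weight-only INT8 quantization with per-channel scales. A minimal sketch of that idea follows. It assumes a symmetric scheme with one scale per output channel (one row of the weight matrix) and uses plain Python floats in place of BF16. Neither detail is confirmed by the README, and the function names are illustrative rather than taken from the shipped code.

```python
def quantize_per_channel(weight):
    """Symmetric INT8 quantization with one scale per row (assumed output channel)."""
    q_rows, scales = [], []
    for row in weight:
        max_abs = max(abs(v) for v in row)
        # Map the largest magnitude in the row to 127; guard against all-zero rows.
        scale = max_abs / 127.0 if max_abs > 0 else 1.0
        q_rows.append([max(-127, min(127, round(v / scale))) for v in row])
        scales.append(scale)
    return q_rows, scales

def dequantize(q_rows, scales):
    """Recover approximate float weights: w is roughly q * scale, row by row."""
    return [[q * s for q in row] for row, s in zip(q_rows, scales)]

# Two rows with very different magnitudes; each gets its own scale.
weight = [[0.50, -1.27, 0.02], [0.001, 0.004, -0.002]]
q, scales = quantize_per_channel(weight)
restored = dequantize(q, scales)
max_error = max(
    abs(a - b) for ra, rb in zip(weight, restored) for a, b in zip(ra, rb)
)
```

Because each row has its own scale, the small-magnitude second row keeps its precision instead of being rounded to zero by a scale chosen for the first row. The rounding error per element is bounded by half of that row's scale.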

openaudio-s1-mini-int8/code/fish-speech-int8.zip ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:bcf677b22b20214bbae56f256fee48907578105d53fe7bc6a18a523c4b3170f7
+size 121812
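
The three added lines above are a Git LFS pointer: a spec version, a SHA-256 object ID, and the size in bytes of the real file. As a hedged sketch, a downloaded file could be checked against such a pointer as follows. The helper names `parse_lfs_pointer` and `matches_pointer` are illustrative, and the leading `+` diff markers are assumed to have been stripped beforehand.

```python
import hashlib

def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }

def matches_pointer(data: bytes, pointer: dict) -> bool:
    """Check a blob against the pointer's byte size and SHA-256 digest."""
    return (
        len(data) == pointer["size"]
        and hashlib.sha256(data).hexdigest() == pointer["digest"]
    )

pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:bcf677b22b20214bbae56f256fee48907578105d53fe7bc6a18a523c4b3170f7\n"
    "size 121812\n"
)
```

For the multi-hundred-megabyte checkpoints in this commit, hashing in chunks with `hashlib.sha256().update(...)` would avoid reading the whole file into memory.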

openaudio-s1-mini-int8/codec_int8.pth ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:60bbf6126044d4192e5373dc9fb2834c1e571ae9ec89bb10f514d42addb513ee
+size 953503175

openaudio-s1-mini-int8/config.json ADDED

	@@ -0,0 +1,32 @@

+{
+    "attention_o_bias": false,
+    "attention_qk_norm": true,
+    "attention_qkv_bias": false,
+    "codebook_size": 4096,
+    "dim": 1024,
+    "dropout": 0.0,
+    "fast_attention_o_bias": false,
+    "fast_attention_qk_norm": false,
+    "fast_attention_qkv_bias": false,
+    "fast_dim": 1024,
+    "fast_head_dim": 64,
+    "fast_intermediate_size": 3072,
+    "fast_n_head": 16,
+    "fast_n_local_heads": 8,
+    "head_dim": 128,
+    "initializer_range": 0.03125,
+    "intermediate_size": 3072,
+    "max_seq_len": 8192,
+    "model_type": "dual_ar",
+    "n_fast_layer": 4,
+    "n_head": 16,
+    "n_layer": 28,
+    "n_local_heads": 8,
+    "norm_eps": 1e-06,
+    "num_codebooks": 10,
+    "rope_base": 1000000,
+    "scale_codebook_embeddings": true,
+    "tie_word_embeddings": false,
+    "use_gradient_checkpointing": true,
+    "vocab_size": 155776
+}
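
The attention fields in the config above imply projection widths that differ from `dim`. The sketch below derives them for both the main stack and the `fast_` stack. It assumes that `n_local_heads` counts key/value heads (grouped-query attention), which is a reading of the field name and is not stated anywhere in this diff. The helper `attention_shapes` is illustrative.

```python
import json

# Only the attention-related fields from config.json are reproduced here.
config = json.loads("""{
    "dim": 1024, "head_dim": 128, "n_head": 16, "n_local_heads": 8,
    "fast_dim": 1024, "fast_head_dim": 64, "fast_n_head": 16, "fast_n_local_heads": 8
}""")

def attention_shapes(cfg: dict, prefix: str = "") -> dict:
    """Derive projection widths, assuming n_local_heads is the number of KV heads."""
    n_head = cfg[f"{prefix}n_head"]
    n_kv = cfg[f"{prefix}n_local_heads"]
    head_dim = cfg[f"{prefix}head_dim"]
    return {
        "q_width": n_head * head_dim,    # total width of the query projection
        "kv_width": n_kv * head_dim,     # width of each of the key and value projections
        "queries_per_kv_head": n_head // n_kv,
    }

main = attention_shapes(config)
fast = attention_shapes(config, prefix="fast_")
```

Under that assumption, the main stack projects its 1024-wide hidden state to a 2048-wide query (16 heads of 128), while the fast stack keeps the query at 1024 (16 heads of 64). In both cases two query heads would share each key/value head.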

openaudio-s1-mini-int8/languages.txt ADDED

	@@ -0,0 +1,13 @@

+Chinese
+English
+German
+Japanese
+French
+Spanish
+Korean
+Arabic
+Dutch
+Russian
+Italian
+Polish
+Portuguese

openaudio-s1-mini-int8/model.pth ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:b7d191474dc196df3ab99ba57aa6f320c37bb6f57c225f24b94f74f52626fb58
+size 1067112303
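
The byte sizes in the two checkpoint pointers can be compared with the sizes quoted in the README (1.02 GB for `model.pth`, 0.91 GB for `codec_int8.pth`). The sketch below converts them under three conventions. The `MiB_over_1000` column is a guess at how the README figures may have been produced, not something the commit states.

```python
# Byte sizes copied from the Git LFS pointers in this diff.
SIZES_BYTES = {"model.pth": 1067112303, "codec_int8.pth": 953503175}

def describe(n_bytes: int) -> dict:
    """Express a byte count under decimal, binary, and mixed unit conventions."""
    return {
        "GB": n_bytes / 1e9,                      # decimal gigabytes
        "GiB": n_bytes / 2**30,                   # binary gibibytes
        "MiB_over_1000": n_bytes / 2**20 / 1000,  # mebibytes divided by 1000
    }

report = {
    name: {unit: round(value, 2) for unit, value in describe(size).items()}
    for name, size in SIZES_BYTES.items()
}
```

Here `model.pth` comes out as 1.07 GB, 0.99 GiB, or 1.02 under the mixed convention, and `codec_int8.pth` as 0.95 GB, 0.89 GiB, or 0.91. Only the mixed convention reproduces both README figures, so the README's "GB" values should be read as approximate.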

openaudio-s1-mini-int8/source.txt ADDED

	@@ -0,0 +1 @@


+https://huggingface.co/ORI-Muchim/openaudio-s1-mini-int8

openaudio-s1-mini-int8/special_tokens.json ADDED

The diff for this file is too large to render. See raw diff

openaudio-s1-mini-int8/tokenizer.tiktoken ADDED

The diff for this file is too large to render. See raw diff