Commit ab66182 (verified), committed by niobures · 1 Parent(s): d46ddbc

OpenAudio-S1 Mini Int8 (model + code)

openaudio-s1-mini-int8/.gitattributes ADDED
@@ -0,0 +1,35 @@
+ *.7z filter=lfs diff=lfs merge=lfs -text
+ *.arrow filter=lfs diff=lfs merge=lfs -text
+ *.bin filter=lfs diff=lfs merge=lfs -text
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
+ *.ftz filter=lfs diff=lfs merge=lfs -text
+ *.gz filter=lfs diff=lfs merge=lfs -text
+ *.h5 filter=lfs diff=lfs merge=lfs -text
+ *.joblib filter=lfs diff=lfs merge=lfs -text
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
+ *.mlmodel filter=lfs diff=lfs merge=lfs -text
+ *.model filter=lfs diff=lfs merge=lfs -text
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
+ *.npy filter=lfs diff=lfs merge=lfs -text
+ *.npz filter=lfs diff=lfs merge=lfs -text
+ *.onnx filter=lfs diff=lfs merge=lfs -text
+ *.ot filter=lfs diff=lfs merge=lfs -text
+ *.parquet filter=lfs diff=lfs merge=lfs -text
+ *.pb filter=lfs diff=lfs merge=lfs -text
+ *.pickle filter=lfs diff=lfs merge=lfs -text
+ *.pkl filter=lfs diff=lfs merge=lfs -text
+ *.pt filter=lfs diff=lfs merge=lfs -text
+ *.pth filter=lfs diff=lfs merge=lfs -text
+ *.rar filter=lfs diff=lfs merge=lfs -text
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
+ *.tar filter=lfs diff=lfs merge=lfs -text
+ *.tflite filter=lfs diff=lfs merge=lfs -text
+ *.tgz filter=lfs diff=lfs merge=lfs -text
+ *.wasm filter=lfs diff=lfs merge=lfs -text
+ *.xz filter=lfs diff=lfs merge=lfs -text
+ *.zip filter=lfs diff=lfs merge=lfs -text
+ *.zst filter=lfs diff=lfs merge=lfs -text
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
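The `.gitattributes` rules above decide which files in this commit are stored as Git LFS pointers rather than as regular blobs. The following is a minimal sketch of that matching, assuming only slash-free glob patterns (so `saved_model/**/*` is deliberately not handled) and using a three-line subset of the file. The helper names `lfs_patterns` and `is_lfs_tracked` are hypothetical, and `fnmatchcase` only approximates Git's own pattern matching.

```python
from fnmatch import fnmatchcase

# Illustrative subset of the .gitattributes file above (not the full list).
GITATTRIBUTES = """\
*.pth filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
"""


def lfs_patterns(text):
    """Return the glob patterns whose attributes include filter=lfs."""
    patterns = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) > 1 and "filter=lfs" in parts[1:]:
            patterns.append(parts[0])
    return patterns


def is_lfs_tracked(path, patterns=None):
    """Approximate check: slash-free patterns are matched against the basename."""
    if patterns is None:
        patterns = lfs_patterns(GITATTRIBUTES)
    name = path.rsplit("/", 1)[-1]
    return any(fnmatchcase(name, pattern) for pattern in patterns)


if __name__ == "__main__":
    for path in ("openaudio-s1-mini-int8/model.pth",
                 "openaudio-s1-mini-int8/config.json"):
        print(path, is_lfs_tracked(path))
```

Under these assumptions, `model.pth`, `codec_int8.pth`, and `fish-speech-int8.zip` match an LFS rule, while `config.json` and `README.md` do not, which is consistent with the pointer files shown for the former in this commit.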
openaudio-s1-mini-int8/README.md ADDED
@@ -0,0 +1,82 @@
+ ---
+ tags:
+ - text-to-speech
+ license: cc-by-nc-sa-4.0
+ language:
+ - zh
+ - en
+ - de
+ - ja
+ - fr
+ - es
+ - ko
+ - ar
+ - nl
+ - ru
+ - it
+ - pl
+ - pt
+ pipeline_tag: text-to-speech
+ inference: false
+ base_model: fishaudio/openaudio-s1-mini
+ ---
+
+ # OpenAudio S1-mini INT8 Quantized
+
+ **INT8 weight-only quantized version** of [fishaudio/openaudio-s1-mini](https://huggingface.co/fishaudio/openaudio-s1-mini) for efficient GPU inference.
+
+ ## Model Size Comparison
+
+ | Model | Original | INT8 | Reduction |
+ |-------|----------|------|-----------|
+ | LLaMA (model.pth) | 1.64 GB | 1.02 GB | -38% |
+ | Codec (codec_int8.pth) | 1.74 GB | 0.91 GB | -48% |
+ | **Total** | **3.38 GB** | **1.93 GB** | **-43%** |
+
+ ## Performance
+
+ - RTF (Real-Time Factor): ~1.9x with reference caching
+ - Tested on RTX 3090
+ - Quality comparable to original FP16/BF16 model
+
+ ## Usage
+
+ ```python
+ from voice_clone_tts import VoiceCloneTTS
+
+ tts = VoiceCloneTTS(
+     llama_checkpoint_path="ORI-Muchim/openaudio-s1-mini-int8",
+     decoder_checkpoint_path="ORI-Muchim/openaudio-s1-mini-int8",
+ )
+
+ audio, sr = tts.synthesize(
+     text="Hello, this is a test.",
+     reference_audio="reference.wav",  # Optional: for voice cloning
+ )
+ ```
+
+ ## Files
+
+ - `model.pth` - INT8 quantized LLaMA model (1.02 GB)
+ - `codec_int8.pth` - INT8 quantized DAC codec (0.91 GB)
+ - `config.json` - Model configuration
+ - `tokenizer.tiktoken` - Tokenizer
+ - `special_tokens.json` - Special tokens
+
+ ## Quantization Method
+
+ Weight-only INT8 quantization with per-channel scales:
+ - Weights stored as INT8
+ - Scales stored as BF16
+ - Activations remain in FP16/BF16
+
+ ## Credits
+
+ - Original model: [Fish Audio](https://fish.audio) / [fishaudio/openaudio-s1-mini](https://huggingface.co/fishaudio/openaudio-s1-mini)
+ - Quantization: ORI-Muchim
+
+ ## License
+
+ CC-BY-NC-SA-4.0 (Non-commercial use only)
+
+ See the original model for full license terms.
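The "Quantization Method" section of the README above describes weight-only INT8 quantization with per-channel scales. A minimal pure-Python sketch of that idea follows. It assumes symmetric absolute-maximum scaling with one scale per output channel (row), which the README does not confirm, and it keeps scales as ordinary Python floats rather than BF16. The function names are hypothetical and are not taken from the bundled code.

```python
def quantize_per_channel(weight):
    """Quantize each row (output channel) to INT8 with its own scale.

    Assumption: symmetric absmax scaling, scale = max(|w|) / 127.
    """
    q_rows, scales = [], []
    for row in weight:
        max_abs = max(abs(v) for v in row)
        scale = max_abs / 127.0 if max_abs > 0 else 1.0
        # Round to the nearest integer and clamp to the symmetric INT8 range.
        q_rows.append([max(-127, min(127, round(v / scale))) for v in row])
        scales.append(scale)
    return q_rows, scales


def dequantize(q_rows, scales):
    """Recover approximate weights; activations would stay in floating point."""
    return [[q * s for q in row] for row, s in zip(q_rows, scales)]


if __name__ == "__main__":
    weight = [[0.5, -1.0, 0.25], [0.02, 0.01, -0.04]]
    q, s = quantize_per_channel(weight)
    print(q)
    print(dequantize(q, s))
```

Because each row gets its own scale, a row with small weights (the second row here) keeps roughly the same relative precision as a row with large weights, which is the usual motivation for per-channel rather than per-tensor scales.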
openaudio-s1-mini-int8/code/fish-speech-int8.zip ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:bcf677b22b20214bbae56f256fee48907578105d53fe7bc6a18a523c4b3170f7
+ size 121812
openaudio-s1-mini-int8/codec_int8.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:60bbf6126044d4192e5373dc9fb2834c1e571ae9ec89bb10f514d42addb513ee
+ size 953503175
openaudio-s1-mini-int8/config.json ADDED
@@ -0,0 +1,32 @@
+ {
+   "attention_o_bias": false,
+   "attention_qk_norm": true,
+   "attention_qkv_bias": false,
+   "codebook_size": 4096,
+   "dim": 1024,
+   "dropout": 0.0,
+   "fast_attention_o_bias": false,
+   "fast_attention_qk_norm": false,
+   "fast_attention_qkv_bias": false,
+   "fast_dim": 1024,
+   "fast_head_dim": 64,
+   "fast_intermediate_size": 3072,
+   "fast_n_head": 16,
+   "fast_n_local_heads": 8,
+   "head_dim": 128,
+   "initializer_range": 0.03125,
+   "intermediate_size": 3072,
+   "max_seq_len": 8192,
+   "model_type": "dual_ar",
+   "n_fast_layer": 4,
+   "n_head": 16,
+   "n_layer": 28,
+   "n_local_heads": 8,
+   "norm_eps": 1e-06,
+   "num_codebooks": 10,
+   "rope_base": 1000000,
+   "scale_codebook_embeddings": true,
+   "tie_word_embeddings": false,
+   "use_gradient_checkpointing": true,
+   "vocab_size": 155776
+ }
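A few shapes can be derived from the attention-related fields in the `config.json` above. The sketch below is a rough reading, assuming that `n_local_heads` is the number of key/value heads in a grouped-query attention layout and that the `fast_`-prefixed keys describe a second, smaller transformer with the same conventions. Neither assumption is stated in the config itself, and `attention_shapes` is a hypothetical helper.

```python
import json

# Subset of the config.json above, limited to the fields used here.
CONFIG_SUBSET = """
{
  "dim": 1024, "n_head": 16, "n_local_heads": 8, "head_dim": 128,
  "fast_dim": 1024, "fast_n_head": 16, "fast_n_local_heads": 8,
  "fast_head_dim": 64
}
"""


def attention_shapes(cfg, prefix=""):
    """Derive projection widths, assuming n_local_heads counts KV heads."""
    n_head = cfg[prefix + "n_head"]
    n_kv_head = cfg[prefix + "n_local_heads"]
    head_dim = cfg[prefix + "head_dim"]
    return {
        "q_width": n_head * head_dim,          # total query projection width
        "kv_width": n_kv_head * head_dim,      # width of each of K and V
        "queries_per_kv_head": n_head // n_kv_head,
    }


if __name__ == "__main__":
    cfg = json.loads(CONFIG_SUBSET)
    print("main:", attention_shapes(cfg))
    print("fast:", attention_shapes(cfg, prefix="fast_"))
```

Under that reading, the main stack's query width (16 × 128) would be larger than `dim`, which is allowed when `head_dim` is set explicitly instead of being derived as `dim / n_head`.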
openaudio-s1-mini-int8/languages.txt ADDED
@@ -0,0 +1,13 @@
+ Chinese
+ English
+ German
+ Japanese
+ French
+ Spanish
+ Korean
+ Arabic
+ Dutch
+ Russian
+ Italian
+ Polish
+ Portuguese
openaudio-s1-mini-int8/model.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b7d191474dc196df3ab99ba57aa6f320c37bb6f57c225f24b94f74f52626fb58
+ size 1067112303
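The three-line entries shown for `model.pth`, `codec_int8.pth`, and the zip archive are Git LFS pointer files: a spec version, a SHA-256 object ID, and a size in bytes. A downloaded file can be checked against its pointer, as in the sketch below. The helper names `parse_lfs_pointer` and `verify_file` are hypothetical, and the local path in the usage note is an assumption about where the file was saved.

```python
import hashlib

# Pointer text for model.pth, copied from the diff above.
MODEL_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:b7d191474dc196df3ab99ba57aa6f320c37bb6f57c225f24b94f74f52626fb58
size 1067112303
"""


def parse_lfs_pointer(text):
    """Split a pointer file into its version, hash algorithm, digest, and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def verify_file(path, pointer, chunk_size=1 << 20):
    """Return True if the file's size and hash both match the pointer."""
    hasher = hashlib.new(pointer["algo"])
    size = 0
    with open(path, "rb") as handle:
        # Read in chunks so a ~1 GB checkpoint is never fully held in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]
```

For example, `verify_file("model.pth", parse_lfs_pointer(MODEL_POINTER))` would return `True` only if a local `model.pth` is byte-identical to the object referenced by this commit.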
openaudio-s1-mini-int8/source.txt ADDED
@@ -0,0 +1 @@
+ https://huggingface.co/ORI-Muchim/openaudio-s1-mini-int8
openaudio-s1-mini-int8/special_tokens.json ADDED
The diff for this file is too large to render. See raw diff
 
openaudio-s1-mini-int8/tokenizer.tiktoken ADDED
The diff for this file is too large to render. See raw diff