Improve model card: add paper link and fix citation
Hi! I'm Niels from the community science team at Hugging Face. I'm opening this PR to improve your model card by adding a link to the [Fish Audio S2 Technical Report](https://huggingface.co/papers/2603.08823) and fixing a small syntax error in the BibTeX citation. This will help users better discover and cite your work.
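For reviewers who want to sanity-check the citation fix, the following is a minimal sketch of a brace-balance check, assuming the syntax error is an unclosed entry (the diff adds a closing `}` before the code fence). The function name `brace_balance`, the entry type `@misc`, and the key `example2026` are hypothetical and used only for illustration; the three fields are the ones visible in the diff.

```python
def brace_balance(bibtex: str) -> int:
    """Return the number of unclosed '{' in a BibTeX string (0 means balanced)."""
    depth = 0
    for ch in bibtex:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:
                raise ValueError("closing brace without a matching opening brace")
    return depth


# Hypothetical entry: the type and key are placeholders, not the real citation.
broken = """@misc{example2026,
archivePrefix={arXiv},
primaryClass={cs.SD},
url={https://arxiv.org/abs/2603.08823},
"""
fixed = broken + "}\n"

print(brace_balance(broken))  # 1: the entry is never closed
print(brace_balance(fixed))   # 0: balanced after adding the final brace
```

This simple counter does not handle escaped braces such as `\{`, so it is only a rough check rather than a BibTeX parser.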
README.md
CHANGED
@@ -1,9 +1,4 @@
 ---
-tags:
-- text-to-speech
-license: other
-license_name: fish-audio-research-license
-license_link: LICENSE.md
 language:
 - zh
 - en
@@ -18,7 +13,7 @@ language:
 - sv
 - it
 - tr
--
+- 'no'
 - nl
 - cy
 - eu
@@ -88,23 +83,26 @@ language:
 - as
 - gu
 - fo
+license: other
+license_name: fish-audio-research-license
+license_link: LICENSE.md
 pipeline_tag: text-to-speech
+tags:
+- text-to-speech
 inference: false
-extra_gated_prompt:
-
-  laws.
+extra_gated_prompt: You agree to not use the model to generate contents that violate
+  DMCA or local laws.
 extra_gated_fields:
   Country: country
   Specific date: date_picker
   I agree to use this model for non-commercial use ONLY: checkbox
 ---
 
-
 # Fish Audio S2 Pro
 
 <img src="overview.png" alt="Fish Audio S2 Pro overview — fine-grained control, multi-speaker multi-turn generation, low-latency streaming, and long-context inference." width="100%">
 
-**Fish Audio S2 Pro** is a leading text-to-speech (TTS) model with fine-grained inline control of prosody and emotion. Trained on over 10M+ hours of audio data across 80+ languages, the system combines reinforcement learning alignment with a dual-autoregressive architecture. The release includes model weights, fine-tuning code, and an SGLang-based streaming inference engine.
+**Fish Audio S2 Pro** is a leading text-to-speech (TTS) model with fine-grained inline control of prosody and emotion, introduced in the [Fish Audio S2 Technical Report](https://huggingface.co/papers/2603.08823). Trained on over 10M+ hours of audio data across 80+ languages, the system combines reinforcement learning alignment with a dual-autoregressive architecture. The release includes model weights, fine-tuning code, and an SGLang-based streaming inference engine.
 
 ## Architecture
 
@@ -160,8 +158,9 @@ If you find our work useful, please consider citing our report:
 archivePrefix={arXiv},
 primaryClass={cs.SD},
 url={https://arxiv.org/abs/2603.08823},
+}
 ```
 
 ## License
 
-This model is licensed under the [Fish Audio Research License](LICENSE.md). Research and non-commercial use is permitted free of charge. Commercial use requires a separate license from Fish Audio — contact business@fish.audio.
+This model is licensed under the [Fish Audio Research License](LICENSE.md). Research and non-commercial use is permitted free of charge. Commercial use requires a separate license from Fish Audio — contact business@fish.audio.