---
license: other
license_name: fish-audio-research
license_link: LICENSE
tags:
  - text-to-speech
  - tts
  - voice-cloning
  - speech-synthesis
language:
  - en
  - zh
---

# Fish Speech S2 Pro — Mirror

Mirror of the Fish Speech S2 Pro model by [Fish Audio](https://fish.audio).

**Original model:** [fishaudio/fish-speech-1.5](https://huggingface.co/fishaudio/fish-speech-1.5)

## Available Files

| File | Size | Description |
|---|---|---|
| `model.safetensors` | 9.12 GB | Main language model weights |
| `codec.pth` | 1.87 GB | Audio codec (encoder/decoder) |
| `config.json` | 1.86 KB | Model configuration |
| `tokenizer.json` | 12.2 MB | Tokenizer data |
| `tokenizer_config.json` | 861 KB | Tokenizer configuration |
| `special_tokens_map.json` | 102 KB | Special tokens mapping |
| `chat_template.jinja` | 4.12 KB | Chat template |
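
Once the files are downloaded, it can be useful to confirm that nothing from the table is missing. The sketch below is a minimal illustration, not part of Fish Speech or ComfyUI-FFMPEGA: the helper name `missing_files` and the directory path are hypothetical, and only the file names come from the table above.

```python
from pathlib import Path

# File names taken from the "Available Files" table.
EXPECTED_FILES = [
    "model.safetensors",
    "codec.pth",
    "config.json",
    "tokenizer.json",
    "tokenizer_config.json",
    "special_tokens_map.json",
    "chat_template.jinja",
]


def missing_files(model_dir):
    """Return the expected file names that are not present in model_dir.

    Hypothetical helper for illustration; it checks presence only,
    not file size or integrity.
    """
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    # Hypothetical path: replace with wherever the mirror was downloaded.
    missing = missing_files("models/fish-speech-s2-pro")
    if missing:
        print(f"Missing files: {missing}")
    else:
        print("All expected files are present.")
```

An empty list means every file in the table was found. The check does not validate contents, so a truncated download would still pass.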

## Model Details

Fish Speech is an open-source text-to-speech (TTS) model that supports high-quality voice cloning and multilingual speech synthesis. The S2 Pro variant offers improved quality and zero-shot voice cloning.

- **Architecture:** Qwen3-based language model + audio codec
- **Task:** Text-to-speech, voice cloning
- **Languages:** English, Chinese, Japanese, and more
- **Code:** [github.com/fishaudio/fish-speech](https://github.com/fishaudio/fish-speech)

## Usage with ComfyUI-FFMPEGA

The [ComfyUI-FFMPEGA](https://github.com/AEmotionStudio/ComfyUI-FFMPEGA) extension downloads this model automatically and uses it for its TTS and voice cloning features.

## License

**Fish Audio Research License** — see [LICENSE](LICENSE) file.

- ✅ Free for research and non-commercial use
- ❌ Commercial use requires a separate license from [Fish Audio](https://fish.audio) (contact: business@fish.audio)