---
language:
- en
base_model:
- sesame/csm-1b
- senstella/csm-expressiva-1b
- meta-llama/Llama-3.2-1B
- Vikhrmodels/Vikhr-Llama-3.2-1B-Instruct
- fixie-ai/ultravox-v0_5-llama-3_2-1b
pipeline_tag: text-to-speech
---

**The model supports multilingual transcription, but voice output is only in English or English-like languages.**

Models:

- CSM: [sesame/csm-1b](https://huggingface.co/sesame/csm-1b)
- CSM-EXPRESSIVA (WHISPERING & NO VC): [senstella/csm-expressiva-1b](https://huggingface.co/senstella/csm-expressiva-1b)
- LLAMA: [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B)
- LLAMA-VIKHR: [Vikhrmodels/Vikhr-Llama-3.2-1B-Instruct](https://huggingface.co/Vikhrmodels/Vikhr-Llama-3.2-1B-Instruct)
- LLAMA-ULTRAVOX: [fixie-ai/ultravox-v0_5-llama-3_2-1b](https://huggingface.co/fixie-ai/ultravox-v0_5-llama-3_2-1b)

### CSM:

<audio controls>
<source src="https://huggingface.co/Derur/csm-models/resolve/main/csm/examples/conversational_a.wav?download=true" type="audio/wav">
</audio>

<audio controls>
<source src="https://huggingface.co/Derur/csm-models/resolve/main/csm/examples/conversational_b.wav?download=true" type="audio/wav">
</audio>

<audio controls>
<source src="https://huggingface.co/Derur/csm-models/resolve/main/csm/examples/read_speech_a.wav?download=true" type="audio/wav">
</audio>

<audio controls>
<source src="https://huggingface.co/Derur/csm-models/resolve/main/csm/examples/read_speech_b.wav?download=true" type="audio/wav">
</audio>

<audio controls>
<source src="https://huggingface.co/Derur/csm-models/resolve/main/csm/examples/read_speech_c.wav?download=true" type="audio/wav">
</audio>

<audio controls>
<source src="https://huggingface.co/Derur/csm-models/resolve/main/csm/examples/read_speech_d.wav?download=true" type="audio/wav">
</audio>

### CSM-EXPRESSIVA (WHISPERING & NO VC):

<audio controls>
<source src="https://huggingface.co/Derur/csm-models/resolve/main/csm-expressiva/examples/demo.wav?download=true" type="audio/wav">
</audio>

### LLAMA:

<audio controls>
<source src="https://huggingface.co/Derur/csm-models/resolve/main/llama/real-examples/audio.wav?download=true" type="audio/wav">
</audio>

<audio controls>
<source src="https://huggingface.co/Derur/csm-models/resolve/main/llama/real-examples/audio_1(VC).wav?download=true" type="audio/wav">
</audio>

<audio controls>
<source src="https://huggingface.co/Derur/csm-models/resolve/main/llama/real-examples/audio_2(VC).wav?download=true" type="audio/wav">
</audio>

### LLAMA-VIKHR:

<audio controls>
<source src="https://huggingface.co/Derur/csm-models/resolve/main/llama-vikhr/real-examples/audio.wav?download=true" type="audio/wav">
</audio>

<audio controls>
<source src="https://huggingface.co/Derur/csm-models/resolve/main/llama-vikhr/real-examples/audio_1(VC).wav?download=true" type="audio/wav">
</audio>

<audio controls>
<source src="https://huggingface.co/Derur/csm-models/resolve/main/llama-vikhr/real-examples/audio_2(VC).wav?download=true" type="audio/wav">
</audio>

### LLAMA-ULTRAVOX:

<audio controls>
<source src="https://huggingface.co/Derur/csm-models/resolve/main/llama-ultravox/real-examples/audio.wav?download=true" type="audio/wav">
</audio>

<audio controls>
<source src="https://huggingface.co/Derur/csm-models/resolve/main/llama-ultravox/real-examples/audio_1(VC).wav?download=true" type="audio/wav">
</audio>

<audio controls>
<source src="https://huggingface.co/Derur/csm-models/resolve/main/llama-ultravox/real-examples/audio_2(VC).wav?download=true" type="audio/wav">
</audio>

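
The sample links above share one URL shape: repository, `resolve`, revision, then the file path. As a minimal sketch, assuming that shape also holds for other files in the repository, the hypothetical helper `sample_url` below rebuilds such a link using only the Python standard library. The function name and its parameters are illustrative and are not part of any Hugging Face API.

```python
from urllib.parse import quote

# Repository named in the sample links of this card.
REPO_ID = "Derur/csm-models"


def sample_url(path: str, repo_id: str = REPO_ID, revision: str = "main") -> str:
    """Build a direct-download URL for a file in the repository.

    Hypothetical helper: it only mirrors the URL shape of the sample links.
    """
    # Leave "/" and the parentheses in names such as "audio_1(VC).wav"
    # unescaped, matching the links as they are written in this card.
    quoted_path = quote(path, safe="/()")
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{quoted_path}?download=true"


if __name__ == "__main__":
    print(sample_url("llama/real-examples/audio_1(VC).wav"))
```

Passing a different `revision` would point the link at another branch or commit, assuming one exists in the repository.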