Salt-Cosy: CosyVoice2 Fine-tuned on Russian Audiobooks

This repository contains the LLM component of CosyVoice2-0.5B, fine-tuned on the ToneBooks dataset of Russian audiobooks.

Usage

This repository contains only the fine-tuned llm.pt checkpoint; every other component comes from the base CosyVoice2-0.5B model. To use it:

  1. Download the base CosyVoice2-0.5B model from ModelScope:
modelscope download --model FunAudioLLM/CosyVoice2-0.5B --local_dir pretrained_models/CosyVoice2-0.5B
  2. Replace the llm.pt with this fine-tuned version:
# Backup original
mv pretrained_models/CosyVoice2-0.5B/llm.pt pretrained_models/CosyVoice2-0.5B/llm.pt.original

# Download fine-tuned weights
wget https://huggingface.co/AlexWortega/Salt-cosy/resolve/main/llm.pt -O pretrained_models/CosyVoice2-0.5B/llm.pt
  3. Use CosyVoice2 as normal:
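import torchaudio
from cosyvoice.cli.cosyvoice import CosyVoice2

# Loads the base model directory, which now contains the fine-tuned llm.pt
cosyvoice = CosyVoice2('pretrained_models/CosyVoice2-0.5B', fp16=True)

# Zero-shot TTS
# tts_text is Russian for "Hello! This is a test speech generation."
# prompt_text is a placeholder ("Text of the reference audio");
# replace it with the actual transcript of reference.wav.
for i, result in enumerate(cosyvoice.inference_zero_shot(
    tts_text="Привет! Это тестовая генерация речи.",
    prompt_text="Текст референсного аудио",
    prompt_wav="reference.wav",
    stream=False
)):
    # result['tts_speech'] contains the generated audio
    torchaudio.save(f"output_{i}.wav", result['tts_speech'], cosyvoice.sample_rate)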
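
The backup-and-replace step above can also be scripted, which makes it easy to switch between the original and fine-tuned weights. The following is a minimal sketch, not part of CosyVoice: the helper names `swap_llm_checkpoint` and `restore_llm_checkpoint` are hypothetical, and it assumes the fine-tuned llm.pt has already been downloaded to a local path.

```python
import shutil
from pathlib import Path


def swap_llm_checkpoint(model_dir, finetuned_path, backup_suffix=".original"):
    """Back up model_dir/llm.pt once, then copy the fine-tuned file over it.

    Hypothetical helper: mirrors the manual mv + wget steps, using a copy
    instead of a move so the original file is never lost mid-way.
    """
    target = Path(model_dir) / "llm.pt"
    backup = target.with_name(target.name + backup_suffix)
    if not target.is_file():
        raise FileNotFoundError(f"base checkpoint not found: {target}")
    # Only create the backup on the first run, so running the swap twice
    # does not overwrite the original weights with the fine-tuned ones.
    if not backup.exists():
        shutil.copy2(target, backup)
    shutil.copy2(finetuned_path, target)
    return backup


def restore_llm_checkpoint(model_dir, backup_suffix=".original"):
    """Copy the backed-up original weights back to model_dir/llm.pt."""
    target = Path(model_dir) / "llm.pt"
    backup = target.with_name(target.name + backup_suffix)
    if not backup.is_file():
        raise FileNotFoundError(f"backup not found: {backup}")
    shutil.copy2(backup, target)
    return target
```

For example, `swap_llm_checkpoint("pretrained_models/CosyVoice2-0.5B", "llm.pt")` would install a fine-tuned llm.pt saved in the current directory, and `restore_llm_checkpoint("pretrained_models/CosyVoice2-0.5B")` would bring back the base weights.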

Training Details

  • Base Model: CosyVoice2-0.5B
  • Dataset: ToneBooks (Russian audiobooks)
  • Epochs: 18
  • Component: LLM only

License

Apache 2.0, following the CosyVoice license.
