StyleTTS2 Japanese - Aiyoshida Voice

A Japanese text-to-speech (TTS) model, fine-tuned from StyleTTS2 on Aiyoshida voice data.

Model Info

  • Base Model: StyleTTS2 Japanese (MOE 70k pretrained)
  • Training: fine-tuned for 50 epochs
  • Language: Japanese
  • Sample Rate: 24 kHz
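
Because the model produces 24 kHz audio, a generated file can be sanity-checked by reading the sample rate from its header. The following is a minimal sketch using only the Python standard library. It assumes the inference script writes an integer PCM WAV file, which the `wave` module can read; the helper names are illustrative and not part of the StyleTTS2 repo.

```python
import wave

# Sample rate stated in this model card (24 kHz).
EXPECTED_RATE = 24_000


def wav_sample_rate(path):
    """Return the sample rate in Hz recorded in a PCM WAV file header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def has_expected_rate(path, expected=EXPECTED_RATE):
    """Return True if the WAV file at `path` uses the expected sample rate."""
    return wav_sample_rate(path) == expected
```

For example, `has_expected_rate("output.wav")` should return `True` for a file produced at the rate listed above. If the file is stored as floating-point WAV, `wave.open` raises `wave.Error`, and a different audio reader would be needed.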

Files

  • model.pth - Main model checkpoint (2.1 GB)
  • voicepack.pt - Voice embedding for inference
  • config.yml - Model configuration

Usage

# Clone the StyleTTS2 Japanese repo (uncomment on first use)
# git clone https://github.com/livetoon-dev/styletts2-second

# Inference
python fast_infer_speed_f0.py \
  -t "島根の吉田です" \
  -o output.wav \
  --voice voicepack.pt \
  --config config.yml \
  --ckpt model.pth
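
To synthesize several sentences in one go, the command above can be wrapped in a small Python script. This is a sketch only: it reuses the flags shown above and assumes `fast_infer_speed_f0.py`, `voicepack.pt`, `config.yml`, and `model.pth` are all in the current working directory. The function names and the `sample_NN.wav` output naming are illustrative choices, not part of the repo.

```python
import subprocess
import sys

# The three sample sentences listed in this model card.
SENTENCES = [
    "たーかーのーつーめー",
    "夜中すぎたら子供たちに甘いものをあげないって約束したじゃない",
    "島根の吉田です",
]


def build_command(text, out_path, voice="voicepack.pt",
                  config="config.yml", ckpt="model.pth"):
    """Build the argument list for one inference call, mirroring the CLI above."""
    return [
        sys.executable, "fast_infer_speed_f0.py",
        "-t", text,
        "-o", out_path,
        "--voice", voice,
        "--config", config,
        "--ckpt", ckpt,
    ]


def synthesize_all(sentences, run=False):
    """Return one command per sentence; execute each only if run=True."""
    commands = []
    for index, text in enumerate(sentences, start=1):
        command = build_command(text, f"sample_{index:02d}.wav")
        commands.append(command)
        if run:
            # check=True raises CalledProcessError if the script exits non-zero.
            subprocess.run(command, check=True)
    return commands
```

Calling `synthesize_all(SENTENCES)` only builds the commands so they can be inspected first; `synthesize_all(SENTENCES, run=True)` executes them one after another. Passing the text as a list element rather than through a shell string avoids quoting problems with Japanese input.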

Sample Outputs

Generated with this model:

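  • たーかーのーつーめー ("Ta-ka-no-tsu-me", a drawn-out "Taka no Tsume")
  • 夜中すぎたら子供たちに甘いものをあげないって約束したじゃない ("You promised not to give the children sweets after midnight, didn't you?")
  • 島根の吉田です ("I'm Yoshida from Shimane.")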

License

For research and personal use only.
