
StyleTTS2 Japanese - Aiyoshida Voice

Japanese TTS model fine-tuned on Aiyoshida voice data using StyleTTS2.

Model Info

Base Model: StyleTTS2 Japanese (MOE 70k pretrained)
Training: Fine-tuned for 50 epochs
Language: Japanese
Sample Rate: 24 kHz
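
As a quick sanity check on the sample rate listed above, the following sketch reads the header of a generated file (for example the output.wav produced in the Usage section) with Python's standard wave module. The helper name wav_summary is illustrative and not part of the StyleTTS2 code. It assumes the inference script writes an uncompressed PCM WAV; a float-encoded WAV would need a different reader.

```python
import wave

EXPECTED_RATE = 24_000  # 24 kHz, as stated in Model Info


def wav_summary(path):
    """Return (sample_rate_hz, channels, duration_seconds) for a PCM WAV file.

    Hypothetical helper for illustration; assumes PCM encoding.
    """
    with wave.open(str(path), "rb") as wf:
        rate = wf.getframerate()
        return rate, wf.getnchannels(), wf.getnframes() / rate
```

Comparing the first element of the result against EXPECTED_RATE flags a file that was resampled or produced with a different configuration.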

Files

model.pth - Main model checkpoint (2.1 GB)
voicepack.pt - Voice embedding for inference
config.yml - Model configuration
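
Since inference needs all three files listed above, it can help to confirm they are in place before launching a run. A minimal sketch, assuming the files sit together in one directory; the function name missing_files is hypothetical.

```python
from pathlib import Path

# File names exactly as listed in this model card.
REQUIRED_FILES = ("model.pth", "voicepack.pt", "config.yml")


def missing_files(model_dir):
    """Return the required file names that are not present in model_dir."""
    root = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]
```

An empty list means every file was found; otherwise the list names what still needs to be downloaded.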

Usage

# Clone StyleTTS2 Japanese repo
# git clone https://github.com/livetoon-dev/styletts2-second

# Inference
python fast_infer_speed_f0.py \
  -t "島根の吉田です" \
  -o output.wav \
  --voice voicepack.pt \
  --config config.yml \
  --ckpt model.pth
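
To synthesize several sentences, the command above can be wrapped from Python. The sketch below uses only the flags shown in that command (-t, -o, --voice, --config, --ckpt); the function names are hypothetical. It assumes it is run from the directory that contains fast_infer_speed_f0.py, and it uses the current interpreter (sys.executable) in place of a bare python.

```python
import subprocess
import sys


def build_infer_command(text, output="output.wav", voice="voicepack.pt",
                        config="config.yml", ckpt="model.pth",
                        script="fast_infer_speed_f0.py"):
    """Build the argument list for the inference command shown above."""
    return [
        sys.executable, script,
        "-t", text,
        "-o", output,
        "--voice", voice,
        "--config", config,
        "--ckpt", ckpt,
    ]


def run_inference(text, **kwargs):
    """Run the script; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(build_infer_command(text, **kwargs), check=True)
```

Passing the arguments as a list avoids shell quoting problems with Japanese text; for example, run_inference("島根の吉田です", output="yoshida.wav") would write to a separate file.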

Sample Outputs

Generated with this model:

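たーかーのーつーめー ("Ta-ka-no-tsu-me", drawn out)
夜中すぎたら子供たちに甘いものをあげないって約束したじゃない ("You promised not to give the children sweets after midnight, didn't you?")
島根の吉田です ("This is Yoshida from Shimane.")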

License

For research and personal use only.
