Real Nepali v0.4

Model artifact repository for the real_nepali_v0.4 Piper/VITS checkpoint.

This is an experimental, punctuation-aware continuation of ampixa/real-nepali-v0.2-kala. It keeps the same six-speaker map and the same clear Nepali frontend direction, but adds four explicit punctuation tokens to the model input:

<period> <question> <exclaim> <comma>
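As a rough illustration of how text could be tagged with the four tokens above, here is a minimal sketch. The character-to-token mapping (including treating the Devanagari danda "।" as `<period>`) and the function name `tag_punctuation` are assumptions made for this example; the actual frontend used to train this checkpoint may normalize text differently.

```python
import re

# Hypothetical mapping for illustration only; the real frontend may differ.
PUNCT_TOKENS = {
    "।": "<period>",    # Devanagari danda (assumed to map to <period>)
    ".": "<period>",
    "?": "<question>",
    "!": "<exclaim>",
    ",": "<comma>",
}

# Character class that matches any single mapped punctuation mark.
_PUNCT_RE = re.compile("[" + re.escape("".join(PUNCT_TOKENS)) + "]")


def tag_punctuation(text: str) -> str:
    """Replace each mapped punctuation mark with its space-separated token."""
    tagged = _PUNCT_RE.sub(lambda m: f" {PUNCT_TOKENS[m.group()]} ", text)
    # Collapse the extra whitespace introduced around the tokens.
    return " ".join(tagged.split())
```

Under these assumptions, `tag_punctuation("a, b?")` yields `a <comma> b <question>`.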

Expected files:

checkpoint.ckpt
config.json
speaker_id_map.json

Training run:

real_nepali_v0.3_punct_4h_replay/run_epoch999_punct4h_lr2em6_prosodyfreeze_bs96sps16_1000
epoch=999-step=67200.ckpt

The remote path contains v0.3 because that is the internal name of the punctuation-token frontend profile. The release and evaluation label for this checkpoint is v0.4.

Checkpoint SHA-256:

46ba2bf465eecceff345b7f118353d7b498dfcb5eb283e3556ba9ea260dcdff9
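To confirm that a downloaded checkpoint matches the hash above, something like the following sketch can be used. It assumes the file has been saved locally as `checkpoint.ckpt`; the helper names `sha256_of` and `verify_checkpoint` are made up for this example.

```python
import hashlib

# SHA-256 published in this model card.
EXPECTED_SHA256 = "46ba2bf465eecceff345b7f118353d7b498dfcb5eb283e3556ba9ea260dcdff9"


def sha256_of(path, chunk_size=1 << 20) -> str:
    """Return the hex SHA-256 of a file, reading it in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checkpoint(path="checkpoint.ckpt") -> bool:
    """True if the file at `path` (assumed local filename) matches the published hash."""
    return sha256_of(path) == EXPECTED_SHA256
```

A `False` result from `verify_checkpoint()` suggests a truncated or different file, in which case the download is worth repeating before any listening tests.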

Training mix:

| Source | Rows | Duration |
|---|---|---|
| v0.2 six-speaker replay set | 4,338 | 8.61 h |
| Algenib punctuation/intent synthetic batch | 1,470 | 2.99 h |
| Total | 5,808 | 11.60 h |

Speaker IDs:

{
  "algenib": 0,
  "barsha": 1,
  "kala": 2,
  "slr143_F": 3,
  "slr43_0546": 4,
  "slr43_2099": 5
}
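A small sketch of resolving a speaker name to its integer ID, assuming `speaker_id_map.json` holds the flat name-to-ID mapping shown above. The helper names `load_speaker_map` and `speaker_id` are hypothetical and not part of this repository.

```python
import json


def load_speaker_map(path="speaker_id_map.json"):
    """Load the name -> integer ID mapping (assumed to be a flat JSON object)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def speaker_id(speaker_map, name):
    """Look up a speaker's ID, listing the known names if the lookup fails."""
    try:
        return speaker_map[name]
    except KeyError:
        known = ", ".join(sorted(speaker_map))
        raise ValueError(f"Unknown speaker {name!r}; expected one of: {known}") from None
```

With the map above, `speaker_id(load_speaker_map(), "kala")` would return `2`, which is the value to pass wherever the synthesis tool expects a numeric speaker ID.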

Important caveat: judge this checkpoint by listening. It was trained to test whether punctuation tokens improve prosody without repeating the quality loss seen in the earlier, tiny synthetic-only v0.3 repair run. Listen to the punctuation review samples before treating it as the production or default model.
