Real Nepali v0.4

Model artifact repository for the real_nepali_v0.4 Piper/VITS checkpoint.

This is an experimental, punctuation-aware continuation of ampixa/real-nepali-v0.2-kala. It keeps the same six-speaker map and the same clear Nepali frontend direction, but adds four explicit punctuation tokens to the model input:

<period> <question> <exclaim> <comma>
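
As a rough illustration of what a punctuation-aware frontend could do, the sketch below swaps punctuation marks for the four tokens above before the text reaches the model. The mapping table, the function name tag_punctuation, and the choice to treat both the danda (।) and the Latin full stop as <period> are assumptions made for this example; the actual frontend rules are not documented here.

```python
# Hypothetical mapping from punctuation marks to the four tokens.
# The real frontend may use different rules; this is only a sketch.
PUNCT_TOKENS = {
    "।": "<period>",   # Devanagari danda (assumed to act as a full stop)
    ".": "<period>",
    "?": "<question>",
    "!": "<exclaim>",
    ",": "<comma>",
}


def tag_punctuation(text: str) -> str:
    """Replace each known punctuation mark with its token, space-separated."""
    pieces = []
    for ch in text:
        token = PUNCT_TOKENS.get(ch)
        # Pad tokens with spaces so they never fuse with neighbouring words.
        pieces.append(ch if token is None else f" {token} ")
    # Collapse any repeated whitespace introduced by the padding.
    return " ".join("".join(pieces).split())


if __name__ == "__main__":
    print(tag_punctuation("नमस्ते, तपाईंलाई कस्तो छ?"))
```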

Expected files:

checkpoint.ckpt
config.json
speaker_id_map.json
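
A quick way to confirm that a local copy of the repository is complete is to check for the three files listed above. This is a minimal sketch; the function name missing_files and the idea that all three files sit in one flat directory are assumptions.

```python
from pathlib import Path

# File names taken from the "Expected files" list.
EXPECTED_FILES = ("checkpoint.ckpt", "config.json", "speaker_id_map.json")


def missing_files(repo_dir):
    """Return the expected file names that are absent from repo_dir."""
    root = Path(repo_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    absent = missing_files(".")
    print("all expected files present" if not absent else f"missing: {absent}")
```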

Training run:

real_nepali_v0.3_punct_4h_replay/run_epoch999_punct4h_lr2em6_prosodyfreeze_bs96sps16_1000
epoch=999-step=67200.ckpt

The remote run path says v0.3 because that is the internal name of the punctuation-token frontend profile. The release and evaluation label for this checkpoint is v0.4.

Checkpoint SHA-256:

46ba2bf465eecceff345b7f118353d7b498dfcb5eb283e3556ba9ea260dcdff9
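
The digest above can be checked against a downloaded checkpoint with Python's standard hashlib module. The sketch below assumes the file is saved locally as checkpoint.ckpt; the helper names sha256_of and checkpoint_matches are illustrative only.

```python
import hashlib

# Digest copied from the "Checkpoint SHA-256" line.
EXPECTED_SHA256 = "46ba2bf465eecceff345b7f118353d7b498dfcb5eb283e3556ba9ea260dcdff9"


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so a large checkpoint is never loaded whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checkpoint_matches(path):
    """True if the file at path has the published SHA-256 digest."""
    return sha256_of(path) == EXPECTED_SHA256


if __name__ == "__main__":
    import sys

    # Local path is an assumption; pass another path as the first argument.
    target = sys.argv[1] if len(sys.argv) > 1 else "checkpoint.ckpt"
    print("OK" if checkpoint_matches(target) else "MISMATCH")
```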

Training mix:

Source	Rows	Duration
v0.2 six-speaker replay set	4,338	8.61 h
Algenib punctuation/intent synthetic batch	1,470	2.99 h
Total	5,808	11.60 h
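
The totals in the table can be re-derived from the two source rows, which also shows that the synthetic batch makes up roughly a quarter of the mix by both row count and duration. The dictionary layout and the helper name share are choices made for this example.

```python
# (rows, hours) per source, copied from the training-mix table.
MIX = {
    "v0.2 six-speaker replay set": (4338, 8.61),
    "Algenib punctuation/intent synthetic batch": (1470, 2.99),
}

total_rows = sum(rows for rows, _ in MIX.values())
total_hours = sum(hours for _, hours in MIX.values())


def share(part, whole):
    """Percentage of whole represented by part, rounded to one decimal."""
    return round(100.0 * part / whole, 1)


if __name__ == "__main__":
    for name, (rows, hours) in MIX.items():
        print(f"{name}: {share(rows, total_rows)}% of rows, "
              f"{share(hours, total_hours)}% of duration")
    print(f"Total: {total_rows} rows, {total_hours:.2f} h")
```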

Speaker IDs:

{
  "algenib": 0,
  "barsha": 1,
  "kala": 2,
  "slr143_F": 3,
  "slr43_0546": 4,
  "slr43_2099": 5
}
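
Inference code usually needs the integer ID rather than the speaker name. The sketch below loads the map and resolves a name, assuming speaker_id_map.json contains exactly the flat name-to-ID object shown above; load_speaker_ids and speaker_id are hypothetical helper names.

```python
import json


def load_speaker_ids(path="speaker_id_map.json"):
    """Load the name -> integer ID map (assumed to be a flat JSON object)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def speaker_id(speaker_map, name):
    """Return the ID for name, with a readable error for unknown speakers."""
    try:
        return speaker_map[name]
    except KeyError:
        known = ", ".join(sorted(speaker_map))
        raise ValueError(f"unknown speaker {name!r}; known speakers: {known}") from None


if __name__ == "__main__":
    ids = load_speaker_ids()
    print(speaker_id(ids, "kala"))
```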

Important caveat: judge this checkpoint by listening. It was trained to test whether punctuation tokens improve prosody without repeating the quality loss seen in the earlier, tiny, synthetic-only v0.3 repair run. Listen to the punctuation review samples before treating it as the production or default model.
