Real Nepali v0.4
Model artifact repository for the real_nepali_v0.4 Piper/VITS checkpoint.
This is an experimental punctuation-aware continuation of
ampixa/real-nepali-v0.2-kala. It keeps the same six-speaker map and clear
Nepali frontend direction, but adds explicit punctuation tokens to the model
input:
<period> <question> <exclaim> <comma>
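As a rough illustration of what punctuation-aware input could look like, the sketch below rewrites punctuation marks in raw text as the four tokens listed above. The character-to-token mapping (including treating the Devanagari danda `।` as `<period>`) and the name `tag_punctuation` are assumptions made for this example. The frontend actually used for training is not described here and may differ.

```python
import re

# Hypothetical mapping from punctuation characters to the four tokens.
# The real frontend's rules are not documented in this card.
PUNCT_TOKENS = {
    "।": "<period>",    # Devanagari danda (assumed to act as a full stop)
    ".": "<period>",
    "?": "<question>",
    "!": "<exclaim>",
    ",": "<comma>",
}

# Character class matching any single supported punctuation mark.
_PUNCT_RE = re.compile("[" + re.escape("".join(PUNCT_TOKENS)) + "]")


def tag_punctuation(text: str) -> str:
    """Replace each supported punctuation mark with a space-separated token."""
    tagged = _PUNCT_RE.sub(lambda m: f" {PUNCT_TOKENS[m.group(0)]} ", text)
    # Collapse the extra whitespace introduced around the tokens.
    return " ".join(tagged.split())


print(tag_punctuation("नमस्ते, तपाईं कस्तो हुनुहुन्छ?"))
```

Unsupported marks (quotes, dashes, and so on) pass through unchanged in this sketch. A real frontend would need an explicit policy for them.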
Expected files:
checkpoint.ckpt
config.json
speaker_id_map.json
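A quick way to confirm that a local copy of the repository is complete is to check for the three files listed above. This is a minimal sketch. The helper name `missing_files` and the directory name `real_nepali_v0.4` in the usage line are placeholders, not part of the release.

```python
from pathlib import Path

# File names taken from the "Expected files" list in this card.
EXPECTED_FILES = ("checkpoint.ckpt", "config.json", "speaker_id_map.json")


def missing_files(model_dir):
    """Return the expected file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    # "real_nepali_v0.4" is a placeholder for wherever the files were saved.
    missing = missing_files("real_nepali_v0.4")
    print("all files present" if not missing else f"missing: {missing}")
```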
Training run:
real_nepali_v0.3_punct_4h_replay/run_epoch999_punct4h_lr2em6_prosodyfreeze_bs96sps16_1000
epoch=999-step=67200.ckpt
The remote path contains v0.3 because that is the internal
punctuation-token frontend profile name. The release/evaluation label for this
checkpoint is v0.4.
Checkpoint SHA-256:
46ba2bf465eecceff345b7f118353d7b498dfcb5eb283e3556ba9ea260dcdff9
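The digest above can be checked against a downloaded checkpoint with Python's standard `hashlib` module. The sketch below assumes the file is saved locally as `checkpoint.ckpt`. The function names are illustrative.

```python
import hashlib

# Digest copied from this card.
EXPECTED_SHA256 = "46ba2bf465eecceff345b7f118353d7b498dfcb5eb283e3556ba9ea260dcdff9"


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so large checkpoints need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checkpoint_matches(path, expected=EXPECTED_SHA256):
    """Return True when the file's SHA-256 equals the published digest."""
    return sha256_of(path) == expected.lower()


if __name__ == "__main__":
    # Assumed local file name; adjust to where the checkpoint was saved.
    print(checkpoint_matches("checkpoint.ckpt"))
```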
Training mix:
| Source | Rows | Duration |
|---|---|---|
| v0.2 six-speaker replay set | 4,338 | 8.61 h |
| Algenib punctuation/intent synthetic batch | 1,470 | 2.99 h |
| Total | 5,808 | 11.60 h |
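To see how the replay set and the synthetic batch are balanced, the shares can be computed directly from the table. The figures below are copied from the table. The dictionary and function names are illustrative only.

```python
# (rows, hours) per source, copied from the training mix table.
TRAINING_MIX = {
    "v0.2 six-speaker replay set": (4338, 8.61),
    "Algenib punctuation/intent synthetic batch": (1470, 2.99),
}


def mix_shares(mix):
    """Return each source's fraction of total rows and of total hours."""
    total_rows = sum(rows for rows, _ in mix.values())
    total_hours = sum(hours for _, hours in mix.values())
    return {
        name: (rows / total_rows, hours / total_hours)
        for name, (rows, hours) in mix.items()
    }


for name, (row_share, hour_share) in mix_shares(TRAINING_MIX).items():
    print(f"{name}: {row_share:.1%} of rows, {hour_share:.1%} of hours")
```

By either measure the synthetic punctuation batch makes up roughly a quarter of the mix, with the v0.2 replay data forming the rest.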
Speaker IDs:
{
"algenib": 0,
"barsha": 1,
"kala": 2,
"slr143_F": 3,
"slr43_0546": 4,
"slr43_2099": 5
}
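When selecting a voice at inference time, the speaker name has to be resolved to its integer ID. The sketch below assumes `speaker_id_map.json` contains exactly the flat name-to-ID object shown above. The helper names are hypothetical, and how the ID is then passed to the synthesis tool is outside the scope of this example.

```python
import json


def load_speaker_map(path="speaker_id_map.json"):
    """Load the name-to-ID mapping (assumed to be a flat JSON object)."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def speaker_id(speaker_map, name):
    """Resolve a speaker name to its integer ID, with a readable error."""
    try:
        return speaker_map[name]
    except KeyError:
        known = ", ".join(sorted(speaker_map))
        raise KeyError(f"unknown speaker {name!r}; expected one of: {known}") from None
```

For example, `speaker_id(load_speaker_map(), "kala")` would return `2` with the mapping listed above.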
Important caveat: judge this checkpoint by listening, not by training metrics alone. It was trained to test whether punctuation tokens improve prosody without repeating the quality loss seen in the earlier, tiny, synthetic-only v0.3 repair run. Listen to the punctuation review samples before treating it as the production or default model.