VoxCPM2 Bedtime Story LoRA

LoRA adapter fine-tuned on VoxCPM2 for bedtime story narration in English.

Training

  • Base: openbmb/VoxCPM2
  • Dataset: LJSpeech (6,550 clips, ~12h; half-data experiment)
  • Method: LoRA (r=32, alpha=64, targeting DiT attention layers)
  • Best: exp6 (r=32, lr=5e-5, half data), val loss 0.872
  • GPU: A100-80GB via Modal
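
As a rough illustration of what r=32 and alpha=64 imply, the sketch below assumes the common LoRA convention in which the low-rank update is scaled by alpha / r. The 1024 x 1024 projection size is hypothetical; the actual DiT attention dimensions of VoxCPM2 are not stated here.

```python
def lora_scaling(r, alpha):
    """Scaling applied to the low-rank update under the common alpha / r convention."""
    return alpha / r


def lora_param_count(d_in, d_out, r):
    """Trainable parameters for one adapted matrix: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


# Values from the training setup above.
r, alpha = 32, 64
print("scaling:", lora_scaling(r, alpha))

# Hypothetical projection size, for illustration only (not taken from VoxCPM2).
d_in = d_out = 1024
adapter = lora_param_count(d_in, d_out, r)
full = d_in * d_out
print(f"adapter params: {adapter} ({adapter / full:.2%} of the full matrix)")
```

With these settings the update is scaled by 2.0, and the adapter for one such hypothetical matrix trains about 6% as many parameters as the matrix itself.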

Evaluation (6 experiments)

Exp   Config                     Train Loss  Val Loss
exp1  r32, lr=1e-4               0.823       0.918
exp2  r16, lr=5e-5               0.765       0.899
exp3  r64, lr=1e-4               0.838       0.908
exp4  r32, lr=5e-5, 200 steps    0.803       0.884
exp5  r32, lr=2e-4               0.832       0.896
exp6  r32, lr=5e-5, half data    0.935       0.872
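
The selection of exp6 can be reproduced from the numbers in the table. The sketch below ranks the runs by validation loss and prints the gap between validation and training loss. The tuple layout is only an illustrative choice, not the format of any training log.

```python
# (name, config, train_loss, val_loss), copied from the evaluation table.
experiments = [
    ("exp1", "r32, lr=1e-4", 0.823, 0.918),
    ("exp2", "r16, lr=5e-5", 0.765, 0.899),
    ("exp3", "r64, lr=1e-4", 0.838, 0.908),
    ("exp4", "r32, lr=5e-5, 200 steps", 0.803, 0.884),
    ("exp5", "r32, lr=2e-4", 0.832, 0.896),
    ("exp6", "r32, lr=5e-5, half data", 0.935, 0.872),
]

# Rank by validation loss, lowest first.
ranked = sorted(experiments, key=lambda e: e[3])
best = ranked[0]

for name, config, train_loss, val_loss in ranked:
    gap = val_loss - train_loss  # positive gap: val loss is above train loss
    print(f"{name}  val={val_loss:.3f}  gap={gap:+.3f}  ({config})")

print("best by val loss:", best[0])
```

Note that exp6 is the only run whose validation loss is lower than its training loss. The other five runs show a gap of roughly +0.06 to +0.13.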

Demo Audio

English bedtime stories (VoxCPM2)

LoRA vs Stock comparison

Type            Audio
Stock VoxCPM2   comparison_stock_0.wav
LoRA exp6       comparison_lora_exp6_0.wav

Reference voice

Files

  • lora_weights.safetensors: LoRA adapter weights (12.8 MB)
  • lora_config.json: LoRA configuration
  • training_state.json: Training state
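
To check what the adapter file contains without loading any tensor data, the safetensors header can be read with the standard library alone. A safetensors file begins with an 8-byte little-endian length, followed by a JSON header describing each tensor. The sketch below is a minimal reader. It assumes the file has been downloaded next to the script, and it makes no assumption about the tensor names inside.

```python
import json
import os
import struct


def read_safetensors_header(path):
    """Return the tensor entries of a safetensors header without reading tensor data."""
    with open(path, "rb") as f:
        # First 8 bytes: unsigned little-endian 64-bit length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # "__metadata__" is an optional string-to-string map, not a tensor entry.
    header.pop("__metadata__", None)
    return header


def count_parameters(header):
    """Sum the element counts of all tensors described in the header."""
    total = 0
    for info in header.values():
        n = 1
        for dim in info["shape"]:
            n *= dim
        total += n
    return total


path = "lora_weights.safetensors"  # assumed local path to the downloaded file
if os.path.exists(path):
    header = read_safetensors_header(path)
    for name, info in sorted(header.items()):
        print(name, info["dtype"], info["shape"])
    print("total parameters:", count_parameters(header))
```

The printed names and shapes show which layers the adapter touches and at what rank, which can then be compared against lora_config.json.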

Part of DreamVoice

DreamVoice: bedtime stories in a parent's cloned voice.

Model tree for sush0401/VoxCPM2-bedtime-lora

  • Base model: openbmb/VoxCPM2
  • Adapter: this model