
VoxCPM2 Bedtime Story LoRA

LoRA adapter for VoxCPM2, fine-tuned for English bedtime story narration.

Training

Base: openbmb/VoxCPM2
Dataset: LJSpeech (6,550 of its 13,100 clips, ~12 h of audio; the half-data subset used in exp6)
Method: LoRA (r=32, alpha=64, targeting DiT attention layers)
Best: exp6 (r=32, lr=5e-5, half data), with the lowest val loss of the six runs: 0.872
GPU: A100-80GB via Modal
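
To make the r=32, alpha=64 setting above concrete, here is a minimal pure-Python sketch of how a LoRA update modifies a frozen linear layer. It assumes the conventional alpha/r scaling (this card does not say whether a variant such as rank-stabilized scaling was used). `LoRASpec`, `lora_forward`, and `HYPOTHETICAL_DIM` are illustrative names, and the layer width is not the real VoxCPM2 DiT size.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoRASpec:
    r: int       # adapter rank (32 in this card)
    alpha: int   # scaling numerator (64 in this card)

    @property
    def scale(self) -> float:
        # Conventional LoRA scaling applied to the low-rank update B @ A.
        return self.alpha / self.r

    def extra_params(self, d_in: int, d_out: int) -> int:
        # A has shape (r, d_in) and B has shape (d_out, r).
        return self.r * (d_in + d_out)


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def lora_forward(W, A, B, x, scale):
    """y = W x + scale * B (A x); W stays frozen, only A and B are trained."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    return [b + scale * u for b, u in zip(base, update)]


spec = LoRASpec(r=32, alpha=64)
print(spec.scale)  # 2.0

# Assumption: an illustrative square projection, not the actual model width.
HYPOTHETICAL_DIM = 1024
print(spec.extra_params(HYPOTHETICAL_DIM, HYPOTHETICAL_DIM))  # 65536
```

With these settings the low-rank update is weighted by 2, and each adapted projection of the assumed width would add 65,536 trainable parameters.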

Evaluation (6 experiments)

Exp	Config	Train Loss	Val Loss
exp1	r32, lr=1e-4	0.823	0.918
exp2	r16, lr=5e-5	0.765	0.899
exp3	r64, lr=1e-4	0.838	0.908
exp4	r32, lr=5e-5, 200 steps	0.803	0.884
exp5	r32, lr=2e-4	0.832	0.896
exp6	r32, lr=5e-5, half data	0.935	0.872
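
The selection of exp6 can be reproduced from the table with a few lines of Python. This is a sketch over the numbers copied from the rows above, not the author's evaluation script. The helper names are hypothetical.

```python
# Rows copied from the evaluation table: (name, config, train_loss, val_loss).
EXPERIMENTS = [
    ("exp1", "r32, lr=1e-4", 0.823, 0.918),
    ("exp2", "r16, lr=5e-5", 0.765, 0.899),
    ("exp3", "r64, lr=1e-4", 0.838, 0.908),
    ("exp4", "r32, lr=5e-5, 200 steps", 0.803, 0.884),
    ("exp5", "r32, lr=2e-4", 0.832, 0.896),
    ("exp6", "r32, lr=5e-5, half data", 0.935, 0.872),
]


def best_by_val_loss(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[3])


def generalization_gap(row):
    """Validation loss minus training loss, rounded to table precision."""
    return round(row[3] - row[2], 3)


best = best_by_val_loss(EXPERIMENTS)
print(best[0], best[3])

# List all runs from best to worst validation loss, with their gaps.
for row in sorted(EXPERIMENTS, key=lambda r: r[3]):
    print(f"{row[0]}  val={row[3]:.3f}  gap={generalization_gap(row):+.3f}")
```

By these numbers, exp6 is the only run whose validation loss is lower than its training loss, so the ranking reflects validation loss alone, not training fit.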

Demo Audio

English bedtime stories (VoxCPM2)

Mood	Audio
Magical	english_magical_mid.wav
Funny	english_funny_high.wav
Calming	english_calming_low.wav
Dreamy	english_dreamy_low.wav

LoRA vs Stock comparison

Type	Audio
Stock VoxCPM2	comparison_stock_0.wav
LoRA exp6	comparison_lora_exp6_0.wav

Reference voice

reference_voice.wav — voice used for cloning

Files

lora_weights.safetensors — LoRA adapter weights (12.8 MB)
lora_config.json — LoRA configuration
training_state.json — Training state
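
To inspect the adapter without a deep-learning framework, the safetensors header can be read with the standard library alone: the file starts with an 8-byte little-endian length, followed by a JSON header describing each tensor. The sketch below lists the tensor count and total parameter count. The function names are hypothetical, and the tensor names inside this particular file are not documented here, so none are assumed.

```python
import json
import os
import struct
from math import prod


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file without loading tensor data."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # little-endian u64
        return json.loads(f.read(header_len).decode("utf-8"))


def summarize(header):
    """Return (number of tensors, total number of parameters)."""
    # "__metadata__" is an optional free-form entry, not a tensor.
    tensors = {k: v for k, v in header.items() if k != "__metadata__"}
    n_params = sum(prod(info["shape"]) for info in tensors.values())
    return len(tensors), n_params


if __name__ == "__main__":
    path = "lora_weights.safetensors"  # file name taken from the list above
    if os.path.exists(path):
        count, params = summarize(read_safetensors_header(path))
        print(f"{count} tensors, {params} parameters")
```

Run next to the downloaded weights, this prints the tensor and parameter counts, which can be compared against the rank in lora_config.json.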

Part of DreamVoice

DreamVoice — bedtime stories in a parent's cloned voice.


Model tree for sush0401/VoxCPM2-bedtime-lora

Base model: openbmb/VoxCPM2
Adapter (5): this model