Please help

by MRvood - opened Aug 3, 2025

I can't manage to continue training with, or even use, chatterbox-finetuning. Please think of me as someone who needs a bit of help. I'm not super smart, and I really want to train this model a little more on my segment_001.wav and segment_001.txt. I've spent a couple of days trying to get this to work. I tried asking ChatGPT for help, but it ends up hallucinating things, and the hours go by while I bang my head against the wall or the table. A how-to on setting up and preparing the data for training, and on starting the retraining, would be nice. Sorry for asking, but after days of getting nowhere I'm desperate.

akhbar (Owner) - Aug 3, 2025, edited Aug 3, 2025

Hi MRvood.
You can use this repo to finetune the model: https://github.com/vaaale/chatterbox-streaming.git
This is my clone of: https://github.com/davidbrowne17/chatterbox-streaming.git

You must create a dataset. For example:
Create a directory $HOME/dataset.
Create a subdirectory 'wav' ($HOME/dataset/wav) and put the audio files you want to train on in there.
Create a CSV file with two columns, file name and transcript, like this (I don't remember whether they are tab-, comma-, or semicolon-separated...):
sample1.wav "I really want to finetune Chatterbox"
sample2.wav "Getting this to work would be so awesome"
.....
(I don't remember whether the metadata file was actually necessary here. Check the documentation on GitHub.)
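
If your clips come as pairs like segment_001.wav and segment_001.txt, the dataset steps above could be scripted roughly like this. This is only a sketch, not something taken from the repo: the function name build_dataset, the file name metadata.csv, and the tab delimiter are all assumptions, so check the documentation on GitHub for the layout the training script really expects.

```python
import csv
import shutil
from pathlib import Path


def build_dataset(source_dir, dataset_dir, delimiter="\t", metadata_name="metadata.csv"):
    """Copy WAV files into <dataset_dir>/wav and write a two-column metadata file.

    Assumes every clip has a transcript with the same stem next to it,
    e.g. segment_001.wav + segment_001.txt. The metadata file name and
    the delimiter are guesses, not values confirmed by the repo.
    """
    source = Path(source_dir)
    wav_dir = Path(dataset_dir) / "wav"
    wav_dir.mkdir(parents=True, exist_ok=True)

    rows = []
    for wav_path in sorted(source.glob("*.wav")):
        txt_path = wav_path.with_suffix(".txt")
        if not txt_path.exists():
            print(f"skipping {wav_path.name}: no matching .txt transcript")
            continue

        # Collapse newlines and repeated spaces so each transcript fits on one row.
        text = " ".join(txt_path.read_text(encoding="utf-8").split())
        if not text:
            print(f"skipping {wav_path.name}: transcript is empty")
            continue

        target = wav_dir / wav_path.name
        # Only copy when the clip is not already sitting in the wav folder.
        if wav_path.resolve() != target.resolve():
            shutil.copy2(wav_path, target)
        rows.append((wav_path.name, text))

    metadata_path = Path(dataset_dir) / metadata_name
    with metadata_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle, delimiter=delimiter, lineterminator="\n")
        writer.writerows(rows)

    return metadata_path, len(rows)
```

For example, build_dataset("recordings", Path.home() / "dataset") would read the pairs from a (hypothetical) recordings folder and fill $HOME/dataset/wav. If the repo turns out to want commas or semicolons, pass delimiter="," or delimiter=";" instead.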

You can either train LoRAs or fine-tune the whole model with GRPO. GRPO gave me the best results.
