Choosing the same voice for all generated audio files
#58 by abesimi
Hi,
I have recently been using csm-1b to generate audio in Python. It works fine, but I cannot seem to keep the same speaker across different executions.
Each time I run the script, a different voice is produced, even though the speaker role in the conversation is always "0".
Any suggestions on how to keep the speaker consistent are welcome.
Something like what Google Gemini offers, a set of enumerated speakers, would be ideal.
Here is my code:
import os

# aprocessor, model and device are created elsewhere in the script
def getAudioFromText(text: str, tempID: str) -> bool:
    # single-turn conversation; role "0" is the speaker id
    conversation = [
        {"role": "0", "content": [{"type": "text", "text": text}]},
    ]
    inputs = aprocessor.apply_chat_template(
        conversation,
        tokenize=True,
        return_dict=True,
    ).to(device)
    # run inference and save the result
    try:
        audio = model.generate(**inputs, output_audio=True)
        audio_path = os.path.join(tempID, "output_audio.wav")
        aprocessor.save_audio(audio, audio_path, sampling_rate=24000, format="wav")
        # why is the audio file missing the last word or second?
        # fix by adding silence at the end of the audio for 200 ms
        return True
    except Exception as e:
        # report the error instead of failing silently
        print(f"Audio generation failed: {e}")
        return False
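The 200 ms silence workaround mentioned in the comments is not implemented in the snippet above. A minimal sketch of it could look like the following. It assumes the waveform is already available as a NumPy array with time on the last axis; `pad_with_silence` is a hypothetical helper of my own, not part of transformers or csm-1b, and it only pads the end of the signal rather than addressing why the last word gets cut off.

```python
import numpy as np


def pad_with_silence(
    samples: np.ndarray,
    sampling_rate: int = 24000,
    silence_ms: int = 200,
) -> np.ndarray:
    """Return a copy of `samples` with `silence_ms` of zeros appended.

    Hypothetical helper: assumes time is the last axis of `samples`.
    """
    # number of zero samples needed for the requested duration
    n_pad = sampling_rate * silence_ms // 1000
    # keep any leading (channel/batch) dimensions unchanged
    pad_shape = samples.shape[:-1] + (n_pad,)
    silence = np.zeros(pad_shape, dtype=samples.dtype)
    return np.concatenate([samples, silence], axis=-1)


# one second of dummy audio at 24 kHz
one_second = np.ones(24000, dtype=np.float32)
padded = pad_with_silence(one_second)
print(padded.shape)  # (28800,)
```

The padded array would then be written to disk in place of the original waveform; how to convert the output of `model.generate` to a NumPy array depends on what it returns, which I have not verified here.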