It's really stubborn?
It looks like the model is stubborn. I can't get it to just start reading a chunk of text; it won't begin until I ask it something about the text. Any suggestions? My test script: https://gist.github.com/sssemil/41b0e57616c55c907240682306f23c44
Honestly, no idea on what to do, good luck with the issue tho
The model is reactive and waits for user audio input to talk.
I also want it to greet the user. I achieve this by feeding a few empty audio tokens into the model, which triggers the greeting in my case.
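To make the trick above concrete, here is a minimal sketch of priming a reactive model with a few silent frames before any real user audio arrives. Everything in it is an assumption for illustration: `prime_with_silence`, `silent_frame`, and `FakeModel` are hypothetical names, the frame size is a guess (80 ms at 24 kHz), and `step` stands in for whatever per-frame call your own script uses. It is not the actual API of the model.

```python
from typing import Callable, List, Optional

# Assumption: 24 kHz audio in 80 ms frames. Adjust to your setup.
FRAME_SAMPLES = 1_920


def silent_frame(n_samples: int = FRAME_SAMPLES) -> List[float]:
    """Return one frame of silence (all-zero PCM samples)."""
    return [0.0] * n_samples


def prime_with_silence(
    step: Callable[[List[float]], Optional[str]],
    n_frames: int = 5,
) -> List[str]:
    """Feed `n_frames` silent frames and collect whatever the model emits."""
    outputs: List[str] = []
    for _ in range(n_frames):
        out = step(silent_frame())
        if out is not None:  # the model may stay quiet for the first frames
            outputs.append(out)
    return outputs


class FakeModel:
    """Hypothetical stand-in: quiet for `warmup` frames, then starts talking."""

    def __init__(self, warmup: int = 2) -> None:
        self.seen = 0
        self.warmup = warmup

    def step(self, frame: List[float]) -> Optional[str]:
        self.seen += 1
        if self.seen <= self.warmup:
            return None
        return f"greeting-{self.seen}"


if __name__ == "__main__":
    model = FakeModel(warmup=2)
    print(prime_with_silence(model.step, n_frames=5))
```

In a real script you would swap `FakeModel.step` for your own encode-and-step call. How many silent frames it takes before the greeting starts is something to find by experiment.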
I'm trying to make it read a text, reply to whatever the user says in between, and then go back to reading the text.
this is cool
The model only has three modes: 1) QA assistant, 2) customer support, 3) general chit-chat. So unfortunately it does not handle custom modes like "be a TTS model and read back this text". Will focus on making future models more general like that. The only way reading out text may be possible is by injecting text into the agent text channel, since the agent audio is delayed by one step relative to the agent text. You would have to replace lm_gen.step() with your own custom function that feeds in your own text tokens instead of the text tokens from the previous step's output.
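A rough sketch of that injection idea, in case it helps. This is not the real `lm_gen.step()`; it is a toy loop under stated assumptions. `run_with_forced_text` and `toy_step` are hypothetical names, tokens are plain integers, and the model is reduced to a function that takes the previous text token plus one user audio token and returns the next text token plus one agent audio token. The only point is the override: while forced tokens remain, they are fed in place of the text token sampled at the previous step.

```python
from collections import deque
from typing import Callable, Deque, Iterable, List, Tuple

# Hypothetical step signature:
# (prev_text_token, user_audio_token) -> (next_text_token, agent_audio_token)
StepFn = Callable[[int, int], Tuple[int, int]]


def run_with_forced_text(
    model_step: StepFn,
    user_audio: Iterable[int],
    forced_text: Iterable[int],
    start_token: int = 0,
) -> Tuple[List[int], List[int]]:
    """Run the step loop, overriding the fed-back text token while the queue lasts."""
    queue: Deque[int] = deque(forced_text)
    prev_text = start_token
    fed_text: List[int] = []
    agent_audio: List[int] = []
    for audio_in in user_audio:
        if queue:
            # Inject our own token instead of the one sampled last step.
            prev_text = queue.popleft()
        fed_text.append(prev_text)
        next_text, audio_out = model_step(prev_text, audio_in)
        agent_audio.append(audio_out)
        prev_text = next_text  # normal feedback once the queue is empty
    return fed_text, agent_audio


def toy_step(prev_text: int, audio_in: int) -> Tuple[int, int]:
    """Toy model: audio follows the text it was fed; the next text is prev + 1."""
    return prev_text + 1, prev_text * 10


if __name__ == "__main__":
    fed, audio = run_with_forced_text(toy_step, [0] * 5, [7, 8, 9])
    print(fed)    # text tokens actually fed to the model at each step
    print(audio)  # agent audio tokens produced from those text tokens
```

Once the forced queue runs dry, the loop falls back to normal feedback, which is roughly where the model could answer the user before you queue up the next chunk of text. Whether the real model produces sensible audio for injected text is untested here.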