It's really stubborn?

#20

by emilss - opened Jan 27

Jan 27

It looks like the model is stubborn. I can't get it to start reading a chunk of text on its own; it only starts once I ask it something about the text. Any suggestions? My test script: https://gist.github.com/sssemil/41b0e57616c55c907240682306f23c44

HEISOV

Jan 28

Honestly, no idea what to do. Good luck with the issue tho.

emilss

Jan 28

> Honestly, no idea what to do. Good luck with the issue tho.

😩

pekopeter

Jan 30

The model is reactive and waits for user audio input before it talks.
I also want it to greet the user. I achieve that by feeding a few empty audio tokens into the model, which triggers the greeting in my case.
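
A minimal sketch of that priming trick, under stated assumptions: "empty" is read here as digital silence, and the 24 kHz sample rate, the 80 ms frame length, and the `feed_frame` callable are all hypothetical placeholders, not this model's actual API. Check the model's real codec settings and streaming loop before reusing any of it.

```python
from typing import Callable, Iterator, List

SAMPLE_RATE = 24_000  # assumption: samples per second, check the model's codec
FRAME_MS = 80         # assumption: duration of one audio frame in milliseconds
FRAME_SIZE = SAMPLE_RATE * FRAME_MS // 1000  # samples per frame


def silent_frames(n_frames: int, frame_size: int = FRAME_SIZE) -> Iterator[List[float]]:
    """Yield n_frames frames of all-zero PCM samples (digital silence)."""
    for _ in range(n_frames):
        yield [0.0] * frame_size


def prime_with_silence(feed_frame: Callable[[List[float]], object], n_frames: int = 5) -> int:
    """Push silent frames through feed_frame and return how many were fed.

    feed_frame is a hypothetical callable standing in for whatever encodes
    one PCM frame and steps the model in your own streaming loop.
    """
    count = 0
    for frame in silent_frames(n_frames):
        feed_frame(frame)
        count += 1
    return count
```

How many frames are needed before the model starts greeting is something to find by experiment; the default of 5 is arbitrary.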

emilss

Jan 30

> The model is reactive and waits for user audio input before it talks.
> I also want it to greet the user. I achieve that by feeding a few empty audio tokens into the model, which triggers the greeting in my case.

I'm trying to make it read a text aloud, reply to whatever the user says in between, and then go back to reading the text.

caspiandonavon

Feb 3

this is cool

royrajarshi

NVIDIA org Feb 19

The model only has three modes: 1) QA assistant, 2) customer support, and 3) general chit-chat. So unfortunately it does not handle custom modes like "be a TTS model and read back this text". We will focus on making future models more general in that way. The only way reading out text may be possible is by injecting text into the agent text channel, since the agent audio is delayed by one step relative to the agent text. You would have to replace lm_gen.step() with your own custom function that feeds in your own text tokens instead of the text tokens from the previous step's output.
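
The injection idea can be sketched with a toy stand-in. This is not the real lm_gen.step() and makes no claim about its signature: `ForcedTextStepper`, `toy_model_step`, and the padding token value are hypothetical names invented for illustration. The sketch only shows the control flow, where queued text tokens replace the sampled text token that is fed back into the next step, so the audio for each forced token comes out one step later.

```python
from collections import deque
from typing import Callable, Deque, Iterable, Optional, Tuple

PAD_TOKEN = 0  # hypothetical "nothing to say" text token

# (previous text token, user audio) -> (sampled text token, agent audio)
StepFn = Callable[[Optional[int], object], Tuple[int, Optional[int]]]


def toy_model_step(prev_text: Optional[int], user_audio: object) -> Tuple[int, Optional[int]]:
    """Toy model: always samples PAD_TOKEN as text and emits audio for the
    previous text token, mimicking the one-step text-to-audio delay."""
    return PAD_TOKEN, prev_text


class ForcedTextStepper:
    """Wrap a step function so queued tokens override the sampled text."""

    def __init__(self, step_fn: StepFn) -> None:
        self.step_fn = step_fn
        self.pending: Deque[int] = deque()
        self.prev_text: Optional[int] = None  # text token fed into the next step

    def queue_text(self, tokens: Iterable[int]) -> None:
        """Queue text tokens to be spoken, one per step."""
        self.pending.extend(tokens)

    def step(self, user_audio: object) -> Tuple[int, Optional[int]]:
        sampled_text, agent_audio = self.step_fn(self.prev_text, user_audio)
        # Use a forced token if one is queued, otherwise keep the sampled one.
        self.prev_text = self.pending.popleft() if self.pending else sampled_text
        return self.prev_text, agent_audio
```

For example, after `queue_text([7, 8])`, three calls to `step(None)` on the toy model return the text tokens 7, 8, and then the sampled padding token, with the audio for 7 and 8 arriving one step after each. Because the queue persists across steps, a real version could in principle pause forcing while the user speaks and resume afterwards, though how the actual model behaves under forced text is untested here.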
