Is thinking turned off by default?
#3, opened by Fernanda24
When I run the official FP8 repo, thinking is on by default, but after loading this one in SGLang with essentially the same command, thinking seems to be off?
Furthermore, it doesn't stop generating its response; it just goes on and on. Maybe I did something wrong?
I have tried this version as well.
I deployed it using vLLM 0.10.2 on 4 H100 GPUs and the response never ends. The model appears to be in a conversation with itself: each response is a question to itself, which it then answers, in a never-ending loop.
Setting the temperature to 1.0 doesn't help.