Reasoning on/off?

#5
by TahirC - opened

Can we limit the thinking to n tokens, or turn it off when needed?

Not yet. We plan to release a 'Flash' version that matches the latency of the instruct model while retaining current capabilities.
