Controlling generation length

by gsarti - opened Jun 30, 2022

BigScience Workshop org Jun 30, 2022

I'm trying to use Bloom for generation with the Inference API, but unless the input is very short I only get 1-2 extra tokens. I checked the detailed parameters in the docs, but there doesn't seem to be a way to control length for text generation models (as opposed to seq2seq). Moreover, if I try to pass parameters in the API request, I get a message saying "Parameters are not accepted for this specific model".

Is there a way to enable longer generation with Bloom?

LincolnStein

Jul 11, 2022

I’m frustrated with this as well. Text generation is useless.

TimeRobber

BigScience Workshop org Nov 14, 2022

Hi, sorry for the very long delay. We currently hard-limit the number of tokens one can request, to prevent users from requesting too many tokens and crashing the entire system. That said, I'm quite surprised it only generates 1-2 extra tokens; it should be a lot more.
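
For anyone landing here later, here is a rough sketch of how a length request is usually passed to the hosted Inference API. It assumes the endpoint accepts the `max_new_tokens` parameter documented for text-generation tasks, which (as this thread shows) was not always the case for Bloom. The endpoint URL, the cap value, and the helper names `build_payload` and `query` are illustrative assumptions, not the actual server-side configuration.

```python
import json
import urllib.request

# Assumed endpoint for the hosted Inference API; adjust if yours differs.
API_URL = "https://api-inference.huggingface.co/models/bigscience/bloom"

# Hypothetical cap: the real server-side limit is not stated in this thread.
ASSUMED_TOKEN_CAP = 64


def build_payload(prompt, max_new_tokens, cap=ASSUMED_TOKEN_CAP):
    """Build a request body, clamping the requested length to [1, cap]."""
    requested = max(1, min(max_new_tokens, cap))
    return {"inputs": prompt, "parameters": {"max_new_tokens": requested}}


def query(payload, api_token):
    """POST the payload to the API and return the decoded JSON response."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `query(build_payload("Once upon a time", 50), api_token)` with your own token would send the request. Clamping on the client side only avoids asking for more than the assumed cap; the server still has the final say on how many tokens come back.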

I'm closing this discussion since it is a few months old and quite a few things have changed since then. Please feel free to re-open it if you still see this issue.

TimeRobber changed discussion status to closed Nov 14, 2022
