Instructions for using bigscience/bloom with libraries, inference providers, notebooks, and local apps. Follow the links below to get started.
- Libraries
- Transformers
How to use bigscience/bloom with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="bigscience/bloom")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom")
model = AutoModelForCausalLM.from_pretrained("bigscience/bloom")
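As a quick check, the pipeline can be called directly. A minimal sketch (the prompt and max_new_tokens value are arbitrary, and loading the full BLOOM checkpoint locally requires very substantial memory):

from transformers import pipeline

# Generate a short continuation with the text-generation pipeline.
pipe = pipeline("text-generation", model="bigscience/bloom")
output = pipe("Once upon a time,", max_new_tokens=30)
print(output[0]["generated_text"])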
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use bigscience/bloom with vLLM:
Install from pip and serve the model
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "bigscience/bloom"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "bigscience/bloom",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
    }'
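The running server can also be queried from Python. A minimal sketch using the openai client (the base_url and placeholder api_key are assumptions matching the server defaults above):

from openai import OpenAI

# Point the OpenAI client at the local vLLM server (OpenAI-compatible API).
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

completion = client.completions.create(
    model="bigscience/bloom",
    prompt="Once upon a time,",
    max_tokens=512,
    temperature=0.5,
)
print(completion.choices[0].text)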
- SGLang
How to use bigscience/bloom with SGLang:
Install from pip and serve the model
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "bigscience/bloom" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "bigscience/bloom",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
    }'
Use Docker images
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "bigscience/bloom" \
        --host 0.0.0.0 \
        --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "bigscience/bloom",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
    }'
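The same completions endpoint can be called from Python instead of curl. A minimal sketch using requests (the port and sampling values mirror the commands above):

import requests

# Call the SGLang server's OpenAI-compatible completions endpoint.
response = requests.post(
    "http://localhost:30000/v1/completions",
    json={
        "model": "bigscience/bloom",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5,
    },
)
print(response.json()["choices"][0]["text"])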
- Docker Model Runner
How to use bigscience/bloom with Docker Model Runner:
docker model run hf.co/bigscience/bloom
BLOOM Inference API issues: return_full_text and punctuation
The following issues have been observed when using the InferenceApi class:
Issue 1: Incorrect return_full_text parameter
When the return_full_text parameter is set to False, the full text is still returned in the output.
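For reference, a minimal sketch of a call that should reproduce Issue 1, assuming generation options are passed through the params dict of InferenceApi:

from huggingface_hub import InferenceApi

inference = InferenceApi("bigscience/bloom", token=True)

# "return_full_text": False should strip the prompt from the output,
# but the prompt is still included in "generated_text".
output = inference("Hello", params={"return_full_text": False, "max_new_tokens": 20})
print(output[0]["generated_text"])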
Issue 2: Incorrect handling of punctuation in generated text
When a punctuation character (only !, ., ?, or ,) is preceded by a space in the prompt, the space is not preserved in the generated_text output, even if the prompt was itself generated by the model. This can be seen in the following example:
>>> inference = InferenceApi("bigscience/bloom", token=HfFolder.get_token())
>>> generated_text = inference("Hello") # Output: "Hello , world"
>>> inference(generated_text)
'Hello, world and some other text' # Space before "," is not preserved
It is possible that these two issues are related. We hope that a fix will be implemented soon.
Hi!
When the return_full_text parameter is set to False, the full text is still returned in the output.
Yeah, this is somewhat expected: BLOOM being so large, it has to go through a special deployment mechanism where we essentially re-coded some of the generate functions. Right now that deployment has very limited features.
Issue 2: Incorrect handling of punctuation in generated text
Given the code snippet you shared, can you try running inference("Hello, world")? I'm suspecting that your generated_text doesn't correspond to the comment you shared.
I don't think I understand what you are trying to point out about the second issue.
The inputs and outputs I wrote in the example are just there to convey what I mean more clearly and in a shorter form; they are not actual inputs and outputs.
Here you can see an actual input and output:
>>> generated_text = inference(" .. ? ! ,")[0]["generated_text"]
>>> generated_text
' .. ? ! ,! ! ! ! ! ! ! ! ! ! '
>>> inference(generated_text)[0]["generated_text"]
'..?!,!!!!!!!!!! !!!!!!!!!!!!!!!!!!!!!!!!!!!!!'
I couldn't reproduce the first example. Ran the following:
from huggingface_hub import InferenceApi
inference = InferenceApi("bigscience/bloom", token=True)
generated_text = inference("Hello")[0]["generated_text"]
print(generated_text) # prints "Hello, I am a young woman of 28 years old who has just arrived in New Braunfels for"
new_generated_text = inference(generated_text)[0]["generated_text"]
print(new_generated_text) # prints "Hello, I am a young woman of 28 years old who has just arrived in New Braunfels for the first time. I like everything but above all a perfect hygiene. I take my time because quality"
But the second example does reproduce the issue:
from huggingface_hub import InferenceApi
inference = InferenceApi("bigscience/bloom", token=True)
generated_text = inference(" .. ? ! ,")[0]["generated_text"]
print(generated_text) # prints " .. ? ! ,! ! ! ! ! ! ! ! ! ! "
new_generated_text = inference(generated_text)[0]["generated_text"]
print(new_generated_text) # prints "..?!,!!!!!!!!!! !!!!!!!!!!!!!!!!!!!!!!!!!!!!!"
Thanks for reporting. I'm guessing this has something to do with the inference API somehow. cc @Narsil
Okay, just to let you know, we've found the issue. It's linked to our tokenization mechanism.
>>> from transformers import AutoTokenizer
>>> tok = AutoTokenizer.from_pretrained("bigscience/bloom", use_fast=True)
>>> tok.decode(tok.encode("Hello , there"))
'Hello, there'
>>> tok.decode(tok.encode("Hello , there"), clean_up_tokenization_spaces=False)
'Hello , there'
We'll work towards fixing this ASAP!
How is this issue moving forward? I have the same problem: the full text is returned although I specified not to :(
Sorry, we're just coming back from holidays + our HF offsite. https://github.com/huggingface/text-generation-inference/pull/13 should patch BLOOM. It just needs to be deployed, which should happen ASAP.
We still want to improve the fix, notably handling the return_full_text option. If you feel like contributing, please feel free to do so.
Closing, as @olivierdehaene has deployed a newer version. It fixes the bug from https://huggingface.co/bigscience/bloom/discussions/153#63a07cd82fabbbb8998f60de
Feel free to re-open if you still see an issue.