Cannot generate with mode: Fill-in-the-middle

#22

by miraclezst - opened May 10, 2023

Discussion

miraclezst

May 10, 2023

My code is below:

# Run in a shell first: pip install -q transformers
from transformers import AutoModelForCausalLM, AutoTokenizer
import os

checkpoint = "bigcode/starcoder"
device = "cuda"  # for GPU usage or "cpu" for CPU usage

tokenizer = AutoTokenizer.from_pretrained(checkpoint, use_auth_token=True)
model = AutoModelForCausalLM.from_pretrained(checkpoint, trust_remote_code=True, load_in_8bit=True, device_map={"": 0})

input_text = "def print_hello_world():\n \n print('Hello world!')"
inputs = tokenizer.encode(input_text, return_tensors="pt").to(device)
outputs = model.generate(inputs)
print(tokenizer.decode(outputs[0]))

output:

Does anyone know the reason for this?

jahansen

May 19, 2023

I ran into the same issue and have not been able to resolve it.

nandovallec

May 23, 2023

edited May 26, 2023

In your case, increasing the number of generated tokens may help, for example by passing max_new_tokens to model.generate.

loubnabnl

BigCode org May 25, 2023

You can run FIM using the following code:

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("bigcode/starcoder", truncation_side="left")
model = AutoModelForCausalLM.from_pretrained("bigcode/starcoder", torch_dtype=torch.bfloat16).cuda()

# FIM prompt layout: <fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>
input_text = "<fim_prefix>def fib(n):<fim_suffix>    else:\n        return fib(n - 2) + fib(n - 1)<fim_middle>"
inputs = tokenizer(input_text, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=25)
# Keep special tokens so the FIM markers stay visible in the decoded text
generation = [tokenizer.decode(tensor, skip_special_tokens=False) for tensor in outputs]
print(generation[0])

<fim_prefix>def fib(n):<fim_suffix>    else:
        return fib(n - 2) + fib(n - 1)<fim_middle>
    if n < 2:
        return n
<|endoftext|>
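
If it helps, the prompt layout and the decoded output above can be handled with a few small string helpers. This is only a sketch: build_fim_prompt, extract_middle, and splice are hypothetical names, not part of transformers, and the code assumes the special tokens appear exactly as in the output above (<fim_prefix>, <fim_suffix>, <fim_middle>, <|endoftext|>).

```python
# Hypothetical helpers (not part of transformers), assuming the FIM special
# tokens are spelled exactly as in the decoded output shown above.
FIM_PREFIX = "<fim_prefix>"
FIM_SUFFIX = "<fim_suffix>"
FIM_MIDDLE = "<fim_middle>"
EOS = "<|endoftext|>"


def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Arrange prefix and suffix so the model generates the missing middle."""
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"


def extract_middle(decoded: str) -> str:
    """Return the text generated after <fim_middle>, cut at <|endoftext|>."""
    _, marker, middle = decoded.partition(FIM_MIDDLE)
    if not marker:
        # No <fim_middle> marker found, so there is nothing to extract.
        return ""
    return middle.split(EOS, 1)[0]


def splice(prefix: str, suffix: str, middle: str) -> str:
    """Put the generated middle back between the prefix and the suffix."""
    return prefix + middle + suffix


if __name__ == "__main__":
    prefix = "def fib(n):"
    suffix = "    else:\n        return fib(n - 2) + fib(n - 1)"
    prompt = build_fim_prompt(prefix, suffix)
    # Stand-in for tokenizer.decode(outputs[0], skip_special_tokens=False)
    decoded = prompt + "\n    if n < 2:\n        return n\n" + EOS
    print(splice(prefix, suffix, extract_middle(decoded)))
```

Decoding with skip_special_tokens=False matters here, since extract_middle relies on the <fim_middle> marker still being present in the decoded string.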

loubnabnl changed discussion status to closed Jun 6, 2023
