Instructions for using openai-community/gpt2 with libraries, inference providers, notebooks, and local apps. Follow the links below to get started.
- Libraries
- Transformers
How to use openai-community/gpt2 with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="openai-community/gpt2")
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("openai-community/gpt2")
model = AutoModelForCausalLM.from_pretrained("openai-community/gpt2")
```

A short generation sketch using the pipeline follows the notebook links below.
- Notebooks
- Google Colab
- Kaggle
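As mentioned above, here is a minimal generation sketch with the pipeline; the prompt and sampling parameters are illustrative assumptions, not part of the official snippet.

```python
# Minimal sketch: generate text with the high-level pipeline.
# The prompt and sampling settings are illustrative assumptions.
from transformers import pipeline

pipe = pipeline("text-generation", model="openai-community/gpt2")
result = pipe(
    "Once upon a time,",  # example prompt (assumed)
    max_new_tokens=50,    # cap on newly generated tokens
    do_sample=True,       # sample rather than greedy-decode
    temperature=0.7,      # soften the output distribution
)
print(result[0]["generated_text"])
```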
- Local Apps
- vLLM
How to use openai-community/gpt2 with vLLM:
Install from pip and serve the model:
```sh
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "openai-community/gpt2"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "openai-community/gpt2",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```

Use Docker

```sh
docker model run hf.co/openai-community/gpt2
```
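Since vLLM exposes an OpenAI-compatible API, you can also call the server from Python with the official openai client. A minimal sketch, assuming the server above is running on localhost:8000 (the api_key value is a placeholder; the same code works against the SGLang server below by changing the port to 30000):

```python
# Minimal sketch: query the vLLM server via its OpenAI-compatible API.
# Assumes the server from the step above is running on localhost:8000;
# the api_key is a placeholder (vLLM accepts any value by default).
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

completion = client.completions.create(
    model="openai-community/gpt2",
    prompt="Once upon a time,",  # example prompt (assumed)
    max_tokens=512,
    temperature=0.5,
)
print(completion.choices[0].text)
```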
- SGLang
How to use openai-community/gpt2 with SGLang:
Install from pip and serve the model:
```sh
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "openai-community/gpt2" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "openai-community/gpt2",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```

Use Docker images
```sh
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "openai-community/gpt2" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "openai-community/gpt2",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```

- Docker Model Runner
How to use openai-community/gpt2 with Docker Model Runner:
```sh
docker model run hf.co/openai-community/gpt2
```
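Running the command with no arguments starts an interactive chat. Docker Model Runner also accepts a one-shot prompt as a trailing argument; the prompt below is an illustrative assumption:

```sh
# One-shot prompt (assumed example); prints the completion and exits.
docker model run hf.co/openai-community/gpt2 "Once upon a time,"
```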
gpt2 output error
Why were all the results I got from the GPT-2 model the same no matter what I fed into it?
Here are the steps I followed.
First, I downloaded the required files from the official website: config.json, merges.txt, pytorch_model.bin, tokenizer.json, tokenizer_config.json, and vocab.json. I stored them in the project root at ./gpt2.
Second, I loaded the model and tried to predict the next word from the input context. The code is shown below.
```python
import torch
from transformers import GPT2Model, GPT2Tokenizer

model = GPT2Model.from_pretrained('./gpt2')
gpt_tokenizer = GPT2Tokenizer.from_pretrained('./gpt2')

start_context = "The white man worked as a "
ids_text = gpt_tokenizer(start_context, return_tensors='pt')
output = model(**ids_text)
output = output.last_hidden_state[:, -1, :]   # shape: (1, 768)
idx_next = torch.argmax(output, dim=-1, keepdim=True)
ids = idx_next.squeeze(0)
text = gpt_tokenizer.decode(ids.tolist())
print(text)
```
Here, the decoded text always indicates "age", even when I change start_context to something else, like "I see a cat under".
768 is the dimensionality of GPT-2's hidden states, not of the vocabulary. GPT2Model has no language-modeling head, so last_hidden_state[:, -1, :] is a 768-dimensional feature vector; taking argmax over it yields an index into the hidden dimension, not a token id, which is why decoding it gives the same unrelated word for every input. Try GPT2LMHeadModel, which projects hidden states onto vocabulary logits.
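A minimal corrected sketch along those lines, assuming the same local ./gpt2 directory and greedy decoding of a single next token:

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# GPT2LMHeadModel adds the LM head that maps hidden states to vocab logits.
model = GPT2LMHeadModel.from_pretrained('./gpt2')
gpt_tokenizer = GPT2Tokenizer.from_pretrained('./gpt2')
model.eval()

# No trailing space: GPT-2's BPE folds the leading space into the next token.
start_context = "The white man worked as a"
ids_text = gpt_tokenizer(start_context, return_tensors='pt')

with torch.no_grad():
    output = model(**ids_text)

# logits has shape (batch, seq_len, vocab_size); take the last position.
logits = output.logits[:, -1, :]
idx_next = torch.argmax(logits, dim=-1)  # greedy next-token id
print(gpt_tokenizer.decode(idx_next.tolist()))
```

With the LM head in place, the argmax runs over the 50,257-token vocabulary, so the predicted word now changes with the context.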