RuntimeError: "LayerNormKernelImpl" not implemented for 'Half'

#14

by HassanStar - opened Dec 14, 2023

Discussion

HassanStar

Dec 14, 2023

Which version of PyTorch should I use in this case?

akas911

Dec 14, 2023

Are you running it on a CPU or a GPU?

susnato

Dec 14, 2023

Did you push the model to GPU before running?

leeedylan

Dec 15, 2023

I have the same problem. Any idea how to deal with it?

akas911

Dec 15, 2023

These are fp16 weights, and running them on the CPU gives this error. When I ran it on a Colab Pro V100 GPU, it worked.

nickovs

Dec 16, 2023

@HassanStar I got the same error with Torch 2.1.2 on a Mac when I put the model on the CPU, but if I call torch.set_default_device("mps") to use Metal acceleration, it works just fine.
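A minimal sketch of choosing the device and dtype together, assuming the policy suggested in this thread (fp16 on an accelerator, fp32 on the CPU). The helper name `pick_device_and_dtype` is hypothetical, not part of PyTorch or Transformers.

```python
import torch


def pick_device_and_dtype():
    """Return a (device, dtype) pair. The policy is an assumption for illustration."""
    if torch.cuda.is_available():
        return "cuda", torch.float16  # NVIDIA GPU
    if torch.backends.mps.is_available():
        return "mps", torch.float16  # Apple silicon (Metal)
    return "cpu", torch.float32  # avoid fp16 on the CPU


device, dtype = pick_device_and_dtype()

# Create a small tensor on the chosen device with the chosen dtype.
x = torch.ones(2, 2, dtype=dtype, device=device)
print(device, x.dtype)
```

The returned pair could then be passed on when loading the model, for example as the `torch_dtype` argument and the target of `model.to(...)`.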

gugarosa

Microsoft org Dec 20, 2023

•

edited Dec 20, 2023

Hello everyone!

FP16 on CPU does not work because PyTorch has no FP16 LayerNorm kernel implementation for the CPU.
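A small sketch that shows the behaviour without downloading the model: it runs a standalone `nn.LayerNorm` in fp16 on the CPU, catches the error if the installed PyTorch raises it, and then repeats the call in fp32. The layer size and variable names are arbitrary; whether the fp16 call fails depends on the PyTorch version, so the code does not assume either outcome.

```python
import torch
from torch import nn

# A standalone LayerNorm and input, both cast to fp16 on the CPU.
layer = nn.LayerNorm(8).half()
x = torch.randn(2, 8).half()

with torch.no_grad():
    try:
        layer(x)
        half_ok = True  # this PyTorch build has a CPU fp16 kernel
    except RuntimeError as err:
        half_ok = False  # e.g. '"LayerNormKernelImpl" not implemented for 'Half''
        print(err)

    # Casting both the layer and the input to fp32 works on the CPU.
    out = layer.float()(x.float())

print(half_ok, out.dtype)
```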

Best regards,
Gustavo.

gugarosa changed discussion status to closed Dec 20, 2023

gugarosa changed discussion status to open Dec 20, 2023

gugarosa changed discussion status to closed Dec 20, 2023

andreariboni

Jan 26, 2024

"Did you push the model to GPU before running?"

How do I do this?

susnato

Jan 27, 2024

Hi @andreariboni, if you have an NVIDIA GPU you can call model.to("cuda"), or if you are working on Apple silicon, model.to("mps"). Don't forget to move the inputs to the same device as well.
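A minimal sketch of that pattern, using a tiny stand-in module instead of the real model so it runs without any download. The module and tensor shapes are made up for illustration; only the `.to(device)` calls are the point.

```python
import torch
from torch import nn

# Pick whichever accelerator is available, falling back to the CPU.
if torch.cuda.is_available():
    device = "cuda"
elif torch.backends.mps.is_available():
    device = "mps"
else:
    device = "cpu"

# Stand-in for the real model: move its parameters to the device.
model = nn.Sequential(nn.Linear(8, 8), nn.LayerNorm(8)).to(device)

# The inputs must live on the same device as the model.
inputs = torch.randn(2, 8).to(device)

with torch.no_grad():
    outputs = model(inputs)

print(outputs.device, outputs.shape)
```

With a Hugging Face tokenizer the same step would look something like `inputs = tokenizer(..., return_tensors="pt").to(device)`, since the returned batch object has its own `.to()` method.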

Gnyanesh

Feb 4, 2024

•

edited Feb 4, 2024

Hi all,
On CPU you can use this code, which loads the weights as float32:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Keep every new tensor on the CPU.
torch.set_default_device("cpu")

# Load the weights as float32, since fp16 LayerNorm fails on the CPU.
model = AutoModelForCausalLM.from_pretrained(
    "microsoft/phi-2",
    torch_dtype=torch.float32,
    device_map="cpu",
    trust_remote_code=True,
)

tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-2", trust_remote_code=True)

# The prompt is the start of a Python function for the model to complete.
inputs = tokenizer('''def print_prime(n):
   """
   Print all primes between 1 and n
   """''', return_tensors="pt", return_attention_mask=False)

# Generate up to 200 tokens and decode the first sequence.
outputs = model.generate(**inputs, max_length=200)
text = tokenizer.batch_decode(outputs)[0]
print(text)
