torch and llama.cpp integration

by TobDeBer - opened Sep 19, 2024

Sep 19, 2024 • edited Sep 19, 2024

I just tried the latest torch/transformers and llama.cpp, and inference failed with the error below.
Are there upstream branches I can use?

ValueError: The checkpoint you are trying to load has model type granitemoe but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.
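A quick way to confirm that this error comes from the installed Transformers version rather than from the checkpoint is to check whether the library has the model type registered. The sketch below assumes the `CONFIG_MAPPING_NAMES` table in `transformers.models.auto.configuration_auto`; the helper name `supports_model_type` is made up for illustration.

```python
import importlib


def supports_model_type(model_type):
    """Return True/False if transformers is installed, or None if it is not."""
    try:
        auto_config = importlib.import_module(
            "transformers.models.auto.configuration_auto"
        )
    except ImportError:
        # transformers is not installed in this environment.
        return None
    # CONFIG_MAPPING_NAMES maps each built-in model type to its config class name.
    return model_type in auto_config.CONFIG_MAPPING_NAMES


if __name__ == "__main__":
    # False means the installed version predates support for this architecture.
    print(supports_model_type("granitemoe"))
```

If this prints `False`, the installed Transformers build does not include the architecture and needs to be replaced with one that does.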

mayank-mishra

IBM Research org Sep 20, 2024

@TobDeBer you need this specific PR: https://github.com/huggingface/transformers/pull/33207/
The code for the MoE model is not merged into the HF transformers main branch yet.
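For anyone who needs to try a PR before it is merged, a minimal sketch of building the install command follows. It assumes pip's `git+https` support and GitHub's `refs/pull/<number>/head` reference; the helper name `pr_install_command` is hypothetical and not part of any library.

```python
import sys


def pr_install_command(repo, pr_number):
    """Build a pip command that installs a GitHub repo at a pull request's head."""
    # GitHub exposes the head commit of each pull request as refs/pull/<number>/head.
    requirement = f"git+https://github.com/{repo}.git@refs/pull/{pr_number}/head"
    # Use the current interpreter so the package lands in the active environment.
    return [sys.executable, "-m", "pip", "install", requirement]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(" ".join(pr_install_command("huggingface/transformers", 33207)))
```

The command is only printed here; it could be passed to `subprocess.run` to perform the install. Once a PR is merged, installing from the main branch or the next release is the simpler route.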

TobDeBer

Sep 21, 2024

Thanks!
It was merged last night.

TobDeBer changed discussion status to closed Sep 21, 2024

mayank-mishra

IBM Research org Sep 21, 2024

awesome
