error loading model: missing tensor 'token_embd.weight'
#1 — opened by poita66
I'm not really surprised that it fails to run, given that the file is only 8MB.
I tried running it with llama.cpp using the provided commands and got this error:
```
print_info: max token length = 256
load_tensors: loading model tensors, this can take a while... (mmap = true)
llama_model_load: error loading model: missing tensor 'token_embd.weight'
llama_model_load_from_file_impl: failed to load model
common_init_from_params: failed to load model '/root/.cache/llama.cpp/CronoBJS_fix-json-GGUF_fix-json-Q8_0.gguf'
srv load_model: failed to load model, '/root/.cache/llama.cpp/CronoBJS_fix-json-GGUF_fix-json-Q8_0.gguf'
srv operator(): operator(): cleaning up before exit...
main: exiting due to model loading error
```
Am I doing something wrong?
Oh, and the diff-apply GGUF fails the same way.
OK, so this happens because this GGUF (and the repo it was quantized from) is just a LoRA adapter, not a full model.
For anyone reading this: you'll need the base model this adapter was trained on (https://huggingface.co/unsloth/Meta-Llama-3.1-8B-Instruct) and then apply this GGUF on top of it with the --lora flag in llama.cpp.
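A minimal sketch of what that invocation might look like. The filenames below are placeholders: they assume you have already downloaded a GGUF quantization of the base model and the adapter GGUF from this repo.

```shell
# Hypothetical filenames; substitute the base-model GGUF and the
# adapter GGUF you actually downloaded.
llama-server \
  -m Meta-Llama-3.1-8B-Instruct-Q8_0.gguf \
  --lora fix-json-Q8_0.gguf
```

The same --lora flag works with llama-cli if you want to test the adapter interactively instead of serving it.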
poita66 changed discussion status to closed