Panic when trying to load the model bin files

by chenhunghan - opened May 22, 2023

Discussion

chenhunghan

May 22, 2023

edited May 22, 2023

There seems to be a Rust panic when loading the model.

(Using llm-rs==0.1.1)

I tried to load the model with model = Llama("./mpt-7b-q4_0-ggjt.bin"), but got:

thread '<unnamed>' panicked at 'called `Result::unwrap()` on an `Err` value: InvalidFormatVersion { container_type: Ggjt, version: 2 }', src/models.rs:5:1

I also tried:

from llm_rs import Llama

# Load the model
model = Llama("cache/mpt-7b-q4_0.bin")

# Generate text from a prompt
print(model.generate("The meaning of life is"))

but got

thread '<unnamed>' panicked at 'called `Result::unwrap()` on an `Err` value: Io(Error { kind: UnexpectedEof, message: "failed to fill whole buffer" })', src/models.rs:5:1

Are there any additional instructions for loading the models?

LLukas22

rustformers org May 22, 2023

There were breaking changes in the ggml format, so you need to use llm-rs==0.2.0 or greater (see here).

Also, MPT isn't a Llama model, so you need to load it via the Mpt class. You can also see this in the model card of this repo.

from llm_rs import Mpt

model = Mpt("cache/mpt-7b-q4_0.bin")
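If it is unclear which container format a .bin file uses, the header can be inspected before loading. The following is only a rough sketch, not part of llm-rs: it assumes the file starts with a 4-byte little-endian magic followed by a 4-byte version, as the ggml-based container formats of that time did. The function name read_container_header and the magic table are illustrative assumptions.

```python
import struct

# Assumed magic values for ggml-style containers (read as little-endian uint32).
# This table is an illustration, not something exported by llm-rs.
GGML_MAGICS = {
    0x67676D6C: "ggml",  # unversioned container
    0x67676D66: "ggmf",  # versioned container
    0x67676A74: "ggjt",  # versioned container
}


def read_container_header(path):
    """Return (container_name, version) read from the first bytes of a model file.

    version is None for the unversioned "ggml" container and for unknown magics.
    """
    with open(path, "rb") as f:
        header = f.read(8)
    if len(header) < 4:
        raise ValueError("file is too short to contain a magic number")

    (magic,) = struct.unpack("<I", header[:4])
    container = GGML_MAGICS.get(magic)
    if container is None:
        return ("unknown", None)
    if container == "ggml":
        return (container, None)

    if len(header) < 8:
        raise ValueError("file is too short to contain a version field")
    (version,) = struct.unpack("<I", header[4:8])
    return (container, version)


# Hypothetical usage with a path from this thread:
# print(read_container_header("./mpt-7b-q4_0-ggjt.bin"))
```

A result that matches the container type and version in the panic message (Ggjt, version 2) would suggest the installed llm-rs predates that format version, which is what upgrading to 0.2.0 or greater addresses.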

chenhunghan

May 22, 2023

Thank you, works well!

LLukas22 changed discussion status to closed May 23, 2023
