Error when trying to load the model

#2
by Diskuntrol - opened

First of all, congrats: this model works beautifully in Spanish, even though Spanish probably isn't one of its main target languages.

However, when trying to load it for further finetuning with a custom dataset, I get this error:
"saturated-labs/T-Rex-mini does not appear to have a file named pytorch_model.bin, model.safetensors, tf_model.h5, model.ckpt or flax_model.msgpack."

Which is weird, given that the files are there in the repo. Other Llama 3 based models don't display this behavior. I'm pretty new to this, so maybe you know what's going on.

This is how I'm trying to load it (showing only the relevant parts of the code):

model_name = "saturated-labs/T-Rex-mini"
[...]
bnb_config = BitsAndBytesConfig(
    load_in_8bit=True,
)
[...]
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
)

Thank you and keep up the good work!

Saturated Labs ( LoreMate AI ) org

Glad you liked the model! Even if the files are in the repo, they might not be in the root directory or named exactly as expected. Try giving the full local path:

model_path = "/some_path_on_your_linux/to/T-Rex-mini"

model = AutoModelForCausalLM.from_pretrained(
    model_path,  # use the local path instead of model_name
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
    use_safetensors=True,
)

That generally solves the problem for me. Also make sure the directory contains a config.json.
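If it helps, here is a quick sanity check you can run on a local model directory before calling from_pretrained. This is just a minimal sketch; the helper name is made up for illustration, and the filenames listed are the ones the error message mentions plus the index files that sharded checkpoints use:

```python
import os

def check_model_dir(path):
    """Report whether a local model directory has the files
    AutoModelForCausalLM.from_pretrained looks for."""
    # Weight files from_pretrained can load; sharded checkpoints
    # ship an index JSON instead of a single weights file.
    weight_candidates = [
        "model.safetensors",
        "pytorch_model.bin",
        "model.safetensors.index.json",
        "pytorch_model.bin.index.json",
    ]
    has_config = os.path.isfile(os.path.join(path, "config.json"))
    has_weights = any(
        os.path.isfile(os.path.join(path, f)) for f in weight_candidates
    )
    return has_config, has_weights
```

If this returns (True, True) for your local copy but loading still fails, the problem is likely the path you are passing rather than the files themselves.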

Saturated Labs ( LoreMate AI ) org

I am marking this as resolved. If you are facing more problems, feel free to reach out.

aravshakya changed discussion status to closed
