LM Studio
I have tried to run this model in LM Studio, but I got an error:
π₯² Failed to load the model
Failed to load model
error loading model: error loading model architecture: unknown model architecture: 'instella'
Is there a way to fix it, or do we have to wait until they fix it?
I'm actually working on it; Instella support needs to be added to the llama.cpp binary. Adding it to the conversion script took a while, but AMD actually provided everything we need to get all the weights correct. I'm not sure where to post my changes, however.
If things go well, I will likely package my adjustments to the llama.cpp source code here in a tar archive. If anyone wishes to push this to their GitHub, they're welcome to, though I'm unsure whether I'm allowed to do that myself. Given that the conversion script is a perfect 1:1 translation to GGUF using AMD's own weights, the loading has to be equally 1:1, which is what I'm working on. As far as I can tell, I just need to figure out where to plug the values into the C code, which is what I'm currently hunting for. Once compiled, it should "just work".
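To illustrate the conversion-script side of this, here is a minimal sketch of the kind of HF-to-GGUF tensor-name mapping that adding a new architecture involves. The per-layer names below follow the common Llama-style conventions and GGUF's blk.N.* scheme; they are illustrative assumptions, not Instella's actual layout as AMD defined it.

```python
import re

# Top-level (non-layer) tensors; illustrative subset.
TOP_LEVEL = {
    "model.embed_tokens.weight": "token_embd.weight",
    "model.norm.weight": "output_norm.weight",
    "lm_head.weight": "output.weight",
}

# Per-layer suffix mapping to GGUF names; illustrative subset.
LAYER_SUFFIX = {
    "self_attn.q_proj.weight": "attn_q.weight",
    "self_attn.k_proj.weight": "attn_k.weight",
    "self_attn.v_proj.weight": "attn_v.weight",
    "self_attn.o_proj.weight": "attn_output.weight",
    "mlp.gate_proj.weight": "ffn_gate.weight",
    "mlp.up_proj.weight": "ffn_up.weight",
    "mlp.down_proj.weight": "ffn_down.weight",
    "input_layernorm.weight": "attn_norm.weight",
    "post_attention_layernorm.weight": "ffn_norm.weight",
}

def hf_to_gguf(name: str) -> str:
    """Map an HF checkpoint tensor name to its GGUF equivalent."""
    if name in TOP_LEVEL:
        return TOP_LEVEL[name]
    m = re.match(r"model\.layers\.(\d+)\.(.+)", name)
    if m and m.group(2) in LAYER_SUFFIX:
        return f"blk.{m.group(1)}.{LAYER_SUFFIX[m.group(2)]}"
    raise KeyError(f"unmapped tensor: {name}")
```

In the real tree this mapping lives in gguf-py/gguf/tensor_mapping.py, which is one of the files that has to learn about any new architecture.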
Okay, I've uploaded what I changed for the conversion. If someone, perhaps yourself, can figure out how to implement the changes into the actual loader itself, you could simply run ollama serve and access it from LM Studio, open-webui, etc. I'll keep trying, but I'm having difficulties.
Thanks for trying to convert Instella from safetensors to GGUF, because I failed badly.
You could file a Pull Request (or an Issue, if you're less familiar with Pull Requests) against llama.cpp to get the code in. :) Eager for this, curious to try it out.
This was forever ago. IIRC, the problem was that a lot of different weights, constraints, etc. were needed for this to be added to Ollama. I think I sent them a note about it, but I moved on to other things, so I honestly have no idea anymore.
Thanks for the follow-up reply. I haven't seen any Issues filed for the architecture to be added to Ollama; I might query them about it.
Edit: I found a close match for the release you based your work on, b4856. I'll look at making a PR after I figure out how to use llama.cpp to begin with. :]
Edit: Does the following look alright? I took the most recent llama.cpp and added your changes (what I could find by differential comparison): https://github.com/OdinVex/llama.cpp (three files were updated: convert_hf_to_gguf.py, gguf-py/gguf/constants.py, and gguf-py/gguf/tensor_mapping.py). It may need updating/tweaking; I made some slight modernization changes (Model -> ModelBase, etc.).
I've got it somewhat loading in llama.cpp, but the tokenizer doesn't seem to have been merged. Was that forgotten, or?
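For debugging a missing-tokenizer symptom like this, one thing worth checking is whether the conversion wrote the tokenizer metadata keys into the GGUF at all. A minimal sketch, using key names from the GGUF spec and treating the metadata as a plain dict rather than reading a real file (exact required keys vary by tokenizer type, so this list is an assumption):

```python
# GGUF tokenizer metadata keys that loaders generally expect to find
# (per the GGUF spec; requirements differ between BPE and SPM tokenizers).
REQUIRED_TOKENIZER_KEYS = [
    "tokenizer.ggml.model",
    "tokenizer.ggml.tokens",
]

def missing_tokenizer_keys(metadata: dict) -> list:
    """Return the expected tokenizer keys absent from GGUF metadata."""
    return [k for k in REQUIRED_TOKENIZER_KEYS if k not in metadata]
```

If a converted file comes back with keys missing here, the conversion script's tokenizer handling for the new architecture is the place to look.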