Model architecture?

#2
by damdam1414 - opened

Hello, in the model description it is mentioned: "Initialized from: Qwen2.5 3B"
But config.json refers to LlamaForCausalLM, not Qwen2ForCausalLM?

SpeakLeash | Spichlerz org

Yes, and this is correct. We modified the architecture by adding bias terms in places where Qwen doesn't support them, which is why config.json refers to LlamaForCausalLM.
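
For anyone who wants to check this on their own copy, here is a minimal sketch that reads a parsed config.json and reports the declared architecture along with the bias flags. It assumes the extra biases are exposed through the `attention_bias` and `mlp_bias` fields that `LlamaConfig` in transformers provides; the thread does not confirm which fields this model actually sets. The helper name `summarize_arch` and the example values are hypothetical, not copied from the repo.

```python
import json


def summarize_arch(config: dict) -> dict:
    """Return the declared architecture and bias flags from a parsed config.json."""
    archs = config.get("architectures") or []
    return {
        "architecture": archs[0] if archs else None,
        # Both flags default to False in LlamaConfig when absent from the file.
        "attention_bias": config.get("attention_bias", False),
        "mlp_bias": config.get("mlp_bias", False),
    }


# Hypothetical config contents for illustration only (assumed values).
example = json.loads(
    '{"architectures": ["LlamaForCausalLM"], "attention_bias": true, "mlp_bias": false}'
)
print(summarize_arch(example))
```

To run it against the real file, replace the example with `json.load(open("config.json"))` after downloading the model's config.json.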
