Model architecture?
#2
by damdam1414 - opened
Hello, in the model description it is mentioned: "Initialized from: Qwen2.5 3B"
But config.json refers to LlamaForCausalLM, not Qwen2ForCausalLM?
Yes, and this is correct: we modified the architecture by adding bias where Qwen doesn't support it, so the checkpoint is loaded as LlamaForCausalLM instead of Qwen2ForCausalLM.
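For anyone who wants to check this themselves, here is a minimal sketch that reads the architecture class and bias flags out of a config.json. The sample JSON is hypothetical and its values are not copied from this repo. `attention_bias` and `mlp_bias` are the bias switches exposed by the Llama config in transformers (both default to off). Which of them this model actually sets is an assumption here, so check the real file.

```python
import json

# Hypothetical config.json excerpt; values are illustrative, not taken from this repo.
SAMPLE_CONFIG = """
{
  "architectures": ["LlamaForCausalLM"],
  "model_type": "llama",
  "attention_bias": true,
  "mlp_bias": false
}
"""

def summarize_architecture(config_text):
    """Return the declared architecture classes and bias flags from a config.json string."""
    cfg = json.loads(config_text)
    return {
        "architectures": cfg.get("architectures", []),
        # Missing keys fall back to False, matching the Llama config defaults.
        "attention_bias": cfg.get("attention_bias", False),
        "mlp_bias": cfg.get("mlp_bias", False),
    }

summary = summarize_architecture(SAMPLE_CONFIG)
print(summary)
```

To inspect the real model, pass the contents of the repo's own config.json to `summarize_architecture` in place of `SAMPLE_CONFIG`.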