Is this model initialized from Qwen3-8B?
#3, opened by JosephusCheung
Hello,
I noticed that the model architecture and parameters appear to be structurally identical to Qwen3-8B.
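For anyone who wants to repeat this kind of check, here is a minimal sketch of a structural comparison. It assumes both models' `config.json` files have been downloaded locally. The key list, the helper names `config_diff` and `load_config`, and the example values are hypothetical, chosen for illustration rather than taken from either repository.

```python
import json

# Keys that determine tensor shapes in a typical Hugging Face decoder config.
# This list is an assumption for illustration, not an exhaustive check.
SHAPE_KEYS = (
    "hidden_size",
    "num_hidden_layers",
    "num_attention_heads",
    "num_key_value_heads",
    "intermediate_size",
    "vocab_size",
)


def config_diff(cfg_a, cfg_b, keys=SHAPE_KEYS):
    """Return {key: (a_value, b_value)} for every key whose values differ."""
    return {
        k: (cfg_a.get(k), cfg_b.get(k))
        for k in keys
        if cfg_a.get(k) != cfg_b.get(k)
    }


def load_config(path):
    """Read a locally downloaded config.json into a dict."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# Placeholder values only; they are NOT the real numbers for either model.
example_a = {"hidden_size": 64, "num_hidden_layers": 2, "vocab_size": 1000}
example_b = {"hidden_size": 64, "num_hidden_layers": 2, "vocab_size": 1000}
example_c = {"hidden_size": 64, "num_hidden_layers": 4, "vocab_size": 1000}

print(config_diff(example_a, example_b))  # no differing keys
print(config_diff(example_a, example_c))  # num_hidden_layers differs
```

An empty result only shows that the two configs describe the same tensor shapes. It does not by itself show that one model was initialized from the other's weights, which is why I am asking.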
Could you please clarify if this model was initialized from Qwen3 weights? Currently, the model card only mentions Qwen3 in the context of benchmarks, but lists "WeDLM-8B" as the base. If this is indeed built on top of Qwen3, it would be helpful to see that explicitly mentioned.
Thanks!
Yes, it is based on Qwen3-8B. WeDLM-8B-Base (https://huggingface.co/tencent/WeDLM-8B-Base) is derived from Qwen3-8B-Base, and WeDLM-8B-Instruct is built on WeDLM-8B-Base with further post-training.
Thanks for the suggestion. I have updated the README.