pulsar2: 4.2: question about an error when converting Qwen3-VL

#1
by wangsh0111 - opened

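Hello, and thank you very much for providing the ax models for Qwen3-VL. In some real-world scenarios, however, Qwen3-VL also needs to be fine-tuned before deployment. Following https://github.com/AXERA-TECH/Qwen2.5-VL-3B-Instruct.axera, I successfully converted Qwen2.5-VL to an ax model. I then tried to convert Qwen3-VL to an ax model by following https://github.com/AXERA-TECH/Qwen3-VL.AXERA, but ran into the error below: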

root@ubuntu:/data/Qwen3-VL.AXERA-main/model_convert# pulsar2 llm_build --input_path ../Qwen/Qwen3-VL-2B-Instruct/ --output_path ../Qwen/Qwen3-VL-2B-Instruct-AX650/ --kv_cache_len 2047 --hidden_state_type bf16 --prefill_len 320 --last_kv_cache_len 320 --last_kv_cache_len 512 --last_kv_cache_len 768 --last_kv_cache_len 1024 --chip AX650 -c 1 --parallel 8
:177: DeprecationWarning: label() is deprecated. Use is_required() or is_repeated() instead.
:104: DeprecationWarning: label() is deprecated. Use is_required() or is_repeated() instead.
Config(
model_name='Qwen3-VL-2B-Instruct',
model_type='qwen3_vl_text',
num_hidden_layers=28,
num_attention_heads=16,
num_key_value_heads=8,
hidden_size=2048,
head_dim=128,
intermediate_size=6144,
vocab_size=151936,
rope_theta=5000000,
max_position_embeddings=262144,
rope_partial_factor=1.0,
rms_norm_eps=1e-06,
norm_type='rms_norm',
hidden_act='silu',
hidden_act_param=0.03,
scale_depth=1.4,
scale_emb=1,
dim_model_base=256,
origin_model_type='qwen3_vl',
quant=False,
quant_sym=False,
quant_bits=4,
quant_group_size=128,
rs_factor=32,
rs_high_freq_factor=4.0,
rs_low_freq_factor=1.0,
rs_original_max_position_embeddings=8192,
rs_rope_type='',
rs_mrope_section=[16, 24, 24]
)
Traceback (most recent call last):
File "", line 59, in guard_context
File "", line 207, in llm_build
AssertionError: model_type error qwen3_vl_text

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "", line 4, in
File "", line 276, in
File "", line 272, in pulsar2
File "", line 111, in llm_build
File "/usr/local/lib/python3.12/contextlib.py", line 158, in __exit__
self.gen.throw(value)
File "", line 61, in guard_context
File "", line 73, in error_func
yamain.common.error.CodeException: (<LLMErrorCode.LLM_PREPARE: 0>, AssertionError('model_type error qwen3_vl_text'))
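
The `model_type` values printed in the Config above can be cross-checked against the checkpoint itself. The following is only a minimal sketch: it assumes the standard Hugging Face layout with a `config.json` in the model directory and a nested `text_config` entry, and the helper name `read_model_types` is hypothetical. It does not call pulsar2 and says nothing about which model types pulsar2 accepts.

```python
import json
from pathlib import Path


def read_model_types(model_dir):
    """Return the top-level and nested text-model model_type from config.json.

    Assumption: config.json lives directly in model_dir, and multimodal
    checkpoints keep the language-model settings under "text_config".
    Missing keys are reported as None rather than raising.
    """
    config_path = Path(model_dir) / "config.json"
    config = json.loads(config_path.read_text(encoding="utf-8"))
    text_config = config.get("text_config") or {}
    return {
        "model_type": config.get("model_type"),
        "text_model_type": text_config.get("model_type"),
    }
```

Calling `read_model_types("../Qwen/Qwen3-VL-2B-Instruct/")` on the directory passed to `--input_path` would show whether the `qwen3_vl_text` string in the assertion comes straight from the checkpoint's configuration.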

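My experiments show that the AXERA-TECH/Qwen3-VL-4B-Instruct-GPTQ-Int4 repository runs successfully on the board, and I noticed that it was converted with pulsar2: 5.0. Because the newest image available at https://huggingface.co/AXERA-TECH/Pulsar2 is pulsar2: 4.2, the conversion above was done with the pulsar2: 4.2 toolchain. Is the error above caused by the pulsar2 version? If so, could you provide a pulsar2: 5.0 image? If not, could you tell me how to resolve the error?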

I would be very grateful for any help with this problem.