How do I quantize the LLaMA part of the model?
Please provide more detailed log information to facilitate troubleshooting.
If you see 'build llm model done!' in the log, it indicates that the model compilation was successful.
Subsequent error messages can be ignored; they mainly arise from comparing accuracy before and after quantization, and some libraries may not support that comparison, which leads to errors.
I got it. 👌👌
You attempted to compile a model that is not supported by our toolchain (Pulsar2).
Please strictly use only the models provided by us for compilation. Compiling private or custom models is not recommended unless they are explicitly supported by the AXERA Pulsar2 toolchain.
However, the AXERA-TECH/Janus-Pro-1B model card says that it was quantized with Pulsar2 version 3.4. At the same time, the Janus-Pro-1B.axera repository only describes the quantization method for the ViT; it does not explain how the .axmodel files in the janus_pro_1b_axmodel folder of the model files were produced.
We can try to break down the problem:
- First, compile using the model we provide and go through the entire workflow.
- Then test with your private model.
As a general rule, we only support models that are currently publicly available under the AXERA org.
Thank you for your advice.
I downloaded the Janus model from your original repository (https://huggingface.co/deepseek-ai/Janus-Pro-1B)
and used Pulsar2 v3.4 to quantize Janus-Pro-1B with the following command: pulsar2 llm_build --input_path Janus-Pro-1B/ --output_path Janus-Pro-1B-axmodel/ --prefill_len 2048 --kv_cache_len 2049 --post_topk 50 --hidden_state_type bf16 --chip AX650 -c 1 --parallel 8. However, I still encounter the following issue.
Also, I used the ViT quantization method you provided (https://github.com/AXERA-TECH/Janus-Pro-1B.axera) and was able to quantize the ViT model correctly; inference runs normally on the AX650 board.
However, I also need to quantize the LLaMA part of Janus-Pro-1B. Specifically, my main problem is how to quantize Janus-Pro-1B to obtain the .axmodel files in the janus_pro_1b_axmodel folder.
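For readability, here is the same llm_build invocation with one flag per line. The flags are copied verbatim from the command above; the inline notes are my own reading of their names, not authoritative descriptions, so please consult the Pulsar2 documentation for exact semantics.

```shell
# Flag meanings are inferred from their names (see Pulsar2 docs for details):
# input/output paths, prefill and KV-cache lengths, top-k for the post stage,
# hidden-state dtype, target chip, check level, and build parallelism.
pulsar2 llm_build \
  --input_path Janus-Pro-1B/ \
  --output_path Janus-Pro-1B-axmodel/ \
  --prefill_len 2048 \
  --kv_cache_len 2049 \
  --post_topk 50 \
  --hidden_state_type bf16 \
  --chip AX650 \
  -c 1 \
  --parallel 8
```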
I apologize for the inconvenience. I've noticed that you are using toolchain version Pulsar2 v3.4. Please note that Janus Pro requires Pulsar2 v4.0 or later to compile correctly, so the errors in the log will likely be resolved after you update the toolchain. The latest available version is currently Pulsar2 v4.2.
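As a quick sanity check against the 4.0 minimum mentioned above, you can compare version strings with `sort -V`. This is a hedged sketch: the version values below are placeholders, and in practice you would substitute the real installed version (e.g. as reported by `pulsar2 version`, if your build provides that subcommand).

```shell
# Placeholder values; replace "installed" with your actual toolchain version.
installed="4.2"
required="4.0"

# `sort -V` orders version strings numerically; if the smallest of the two
# is the required version, the installed one is at least as new.
lowest=$(printf '%s\n%s\n' "$installed" "$required" | sort -V | head -n1)
if [ "$lowest" = "$required" ]; then
  echo "toolchain OK ($installed >= $required)"
else
  echo "toolchain too old ($installed < $required); please upgrade"
fi
```

With the placeholder values this prints "toolchain OK (4.2 >= 4.0)"; with installed="3.4" it reports the toolchain as too old.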
Thanks for your suggestion.
I upgraded my Pulsar2 to v4.2, but the above error still occurs.
My sincere apologies 😭😭. Upon reviewing the code commits, it appears that support for this model was not integrated into the AXERA toolchain. I will resolve this issue as soon as possible. Thank you for your contribution and for bringing this to our attention.
Thank you so much for looking into this and getting back to me promptly! To help us better plan our follow-up work, could you share a rough timeline for when the update might be completed? Even a general estimate would be incredibly helpful.
The fixed code has now been merged into the main branch and will be included in the next official release, which is expected to be available within the next month.
Alternatively, for a quicker update, you may reach out to your AXERA support contact to request an interim build that includes the latest fixes.
Best regards.
Thank you for your help. I obtained a temporary version of the quantization toolchain from AXERA support and successfully quantized janus-pro-1b.
However, the quantized model performs poorly (its outputs are a bit unintelligent). Could you share the parameter settings used to quantize your open-source Janus-Pro-1B? I'd like to use them as a reference.
Excuse me, I wanted to follow up on a previous inquiry. While attempting to quantize my fine-tuned model, I’ve encountered the following error:
Could you please provide guidance on how to resolve this issue?
Additionally, I have a follow-up question: which large quantized multimodal models are compatible with your company's chips?