How do I quantize the LLaMA part of the model?
Please provide more detailed log information to facilitate troubleshooting.
If you see 'build llm model done!' in the log, it indicates that the model compilation was successful.
Subsequent error messages can be ignored; they mainly arise from comparing accuracy before and after quantization, and some libraries may not support that comparison, which leads to errors.
I got it. 👌👌
You attempted to compile a model that is not supported by our toolchain (Pulsar2).
Please strictly use only the models provided by us for compilation. Compiling private or custom models is not recommended unless they are explicitly supported by the AXERA Pulsar2 toolchain.
However, the AXERA-TECH/Janus-Pro-1B model card says that it was quantized with Pulsar2 version 3.4. At the same time, the Janus-Pro-1B.axera repository only describes the quantization method for the ViT; it does not explain how the .axmodel files in the janus_pro_1b_axmodel folder of the model files were produced.
We can try to break down the problem:
- First, compile using the model we provide and go through the entire workflow.
- Then test with your private model.
As a general rule, we only support models that are currently publicly available under the AXERA org.
Thank you for your advice.
I downloaded the Janus model from your original repository (https://huggingface.co/deepseek-ai/Janus-Pro-1B)
and used Pulsar2 v3.4 to quantize Janus-Pro-1B with the following command: pulsar2 llm_build --input_path Janus-Pro-1B/ --output_path Janus-Pro-1B-axmodel/ --prefill_len 2048 --kv_cache_len 2049 --post_topk 50 --hidden_state_type bf16 --chip AX650 -c 1 --parallel 8. However, I still encounter the following issue.
Also, I used the ViT quantization method you provided (https://github.com/AXERA-TECH/Janus-Pro-1B.axera) and was able to quantize the ViT model correctly; inference runs normally on the AX650 board.
However, I also need to quantize the LLaMA part of Janus-Pro-1B. Specifically, my main problem is how to quantize Janus-Pro-1B to obtain the .axmodel files in the janus_pro_1b_axmodel folder.
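For readability, here is the same llm_build invocation with one flag per line. The flags are copied verbatim from the command above; the inline notes are my own reading of their names, not authoritative descriptions, so please consult the Pulsar2 documentation for exact semantics.

```shell
# Flag meanings are inferred from their names (see Pulsar2 docs for details):
# input/output paths, prefill and KV-cache lengths, top-k for the post stage,
# hidden-state dtype, target chip, check level, and build parallelism.
pulsar2 llm_build \
  --input_path Janus-Pro-1B/ \
  --output_path Janus-Pro-1B-axmodel/ \
  --prefill_len 2048 \
  --kv_cache_len 2049 \
  --post_topk 50 \
  --hidden_state_type bf16 \
  --chip AX650 \
  -c 1 \
  --parallel 8
```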
I apologize for the inconvenience. I've noticed that you are using toolchain version Pulsar2 v3.4. Please note that Janus Pro requires Pulsar2 v4.0 or later to compile correctly, so the errors in the log will likely be resolved after you update the toolchain. The latest available version is currently Pulsar2 v4.2.
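As a quick sanity check against the 4.0 minimum mentioned above, you can compare version strings with `sort -V`. This is a hedged sketch: the version values below are placeholders, and in practice you would substitute the real installed version (e.g. as reported by `pulsar2 version`, if your build provides that subcommand).

```shell
# Placeholder values; replace "installed" with your actual toolchain version.
installed="4.2"
required="4.0"

# `sort -V` orders version strings numerically; if the smallest of the two
# is the required version, the installed one is at least as new.
lowest=$(printf '%s\n%s\n' "$installed" "$required" | sort -V | head -n1)
if [ "$lowest" = "$required" ]; then
  echo "toolchain OK ($installed >= $required)"
else
  echo "toolchain too old ($installed < $required); please upgrade"
fi
```

With the placeholder values this prints "toolchain OK (4.2 >= 4.0)"; with installed="3.4" it reports the toolchain as too old.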
Thanks for your suggestion.
I upgraded my Pulsar2 to v4.2, but the above error still occurs.
My sincere apologies 😭😭. Upon reviewing the code commits, it appears that support for this model was not integrated into the AXERA toolchain. I will resolve this issue as soon as possible. Thank you for your contribution and for bringing this to our attention.
Thank you so much for looking into this and getting back to me promptly! To help us better plan our follow-up work, could you share a rough timeline for when the update might be completed? Even a general estimate would be incredibly helpful.
The fixed code has now been merged into the main branch and will be included in the next official release, which is expected to be available within the next month.
Alternatively, for a quicker update, you may reach out to your AXERA support contact to request an interim build that includes the latest fixes.
Best regards.
Thank you for your help. I obtained a temporary version of the quantization toolchain from AXERA support and successfully quantized janus-pro-1b.
However, the quantized model performs poorly (its outputs are a bit unintelligent). Could you share the parameter settings used to quantize your open-source Janus-Pro-1B? I'd like to use them as a reference.
Excuse me, I wanted to follow up on a previous inquiry. While attempting to quantize my fine-tuned model, I’ve encountered the following error:
Could you please provide guidance on how to resolve this issue?
Additionally, I have a follow-up question: which large quantized multimodal models are compatible with your company's chips?