Comment about config issues
README.md CHANGED
@@ -73,4 +73,13 @@ torch==2.7.0
vllm==0.9.2
```

+## vLLM Workaround for FP8
+See: https://github.com/vllm-project/vllm/issues/19876
+
+Currently, the Skywork chat config (https://github.com/vllm-project/vllm/blob/e8cc53af5e17205470c04f442e67f276e08623a1/vllm/transformers_utils/configs/skyworkr1v.py#L14)
+uses a custom config class (not a standard transformers AutoConfig), so it does not pick up all the default values that an AutoConfig provides.
+When the raw model is loaded via transformers and then quantized and saved, transformers does not write default values to the config, leaving the saved
+config missing critical values (such as tie_word_embeddings). This was patched in vLLM for InternVL models (https://github.com/vllm-project/vllm/pull/19992) but
+still affects Skywork, and will hopefully be resolved soon.
+
*Quantized with ❤️ using LLM Compressor for the open-source community*
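Until that fix lands, a practical workaround is to patch the missing fields into the quantized checkpoint's config.json before serving it: transformers' save_pretrained only serializes values that differ from the config class defaults, so fields the custom config never set are simply absent from the file. The sketch below is a minimal example of such a patch; the checkpoint path is hypothetical, and the False value for tie_word_embeddings is only a placeholder that must be checked against the original (unquantized) model's config.

```python
import json
from pathlib import Path

# Hypothetical local path to the quantized checkpoint directory.
ckpt = Path("Skywork-R1V-FP8")
cfg_path = ckpt / "config.json"

cfg = json.loads(cfg_path.read_text())

# Restore fields that were dropped when the custom config was saved.
# Placeholder value: copy the real one from the original model's config.json.
cfg.setdefault("tie_word_embeddings", False)

cfg_path.write_text(json.dumps(cfg, indent=2))
print(f"patched {cfg_path}")
```

Using setdefault keeps the patch idempotent: if a future checkpoint already carries the field, re-running the script leaves it untouched.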