Update README.md
README.md
@@ -84,4 +84,15 @@ When loading the raw model via transformers then quantizing and saving, transfor
config to be missing critical values (like `tie_word_embeddings`). This was patched in vLLM for InternVL models (https://github.com/vllm-project/vllm/pull/19992), but the issue remains for Skywork and will hopefully be resolved soon.
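
Until that fix lands, one workaround is to copy the dropped keys back into the quantized checkpoint's config by hand. A minimal sketch, assuming placeholder paths for the original and quantized checkpoints:

```python
import json
from pathlib import Path

# Placeholder paths: the original Skywork checkpoint and the quantized save directory.
base_cfg = json.loads(Path("Skywork-R1V-38B/config.json").read_text())
quant_cfg_path = Path("skywork-r1v-quantized/config.json")
quant_cfg = json.loads(quant_cfg_path.read_text())

# Copy back critical keys that can be dropped during quantize-and-save.
for key in ("tie_word_embeddings",):
    if key not in quant_cfg and key in base_cfg:
        quant_cfg[key] = base_cfg[key]

quant_cfg_path.write_text(json.dumps(quant_cfg, indent=2))
```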

## vLLM Reasoning Parsing Issues

See: https://github.com/vllm-project/vllm/pull/21041
See: https://github.com/SkyworkAI/Skywork-R1V/issues/42

Because the Skywork tokenizer does not encode `<think>` and `</think>` as single tokens, vLLM struggles to parse out the reasoning. Additionally, the chat template for Skywork opens the assistant turn with `'<|im_start|>assistant\n<think>\n'`, which already includes the first `<think>`, so your generation output may not contain an opening `<think>` at all and only emit `</think>`. There is ongoing work to add a string-based reasoning parser to vLLM that parses the `<think></think>` markers as strings (multi-token sequences) as a workaround for this issue.
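
Until that parser lands, the reasoning can be recovered client-side by matching the tags as plain strings. A minimal sketch (the helper is ours, not part of vLLM) that also covers the case where the template already consumed the opening `<think>`:

```python
def split_reasoning(text: str) -> tuple[str, str]:
    """Split generated text into (reasoning, answer) by plain string matching."""
    if "</think>" not in text:
        # No closing tag: treat the whole output as the final answer.
        return "", text
    reasoning, _, answer = text.partition("</think>")
    # The chat template already emits the opening <think>, so it is usually
    # absent from the completion; strip it only if the model echoed it.
    reasoning = reasoning.removeprefix("<think>")
    return reasoning.strip(), answer.strip()
```

With vLLM's offline API this would be applied to each completion, e.g. `reasoning, answer = split_reasoning(outputs[0].outputs[0].text)`.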

The Skywork team has mentioned that they will use a single-token `<think>` in the next model version, so this won't be an issue moving forward.

*Quantized with ❤️ using LLM Compressor for the open-source community*