brandonbeiler committed
Commit 7a5515c · verified · 1 Parent(s): 834eb1b

Update README.md

Files changed (1)
  1. README.md +11 -0
README.md CHANGED
@@ -84,4 +84,15 @@ When loading the raw model via transformers then quantizing and saving, transfor
  config to be missing critical values (like tie_word_embeddings). This was patched in vLLM for InternVL models (https://github.com/vllm-project/vllm/pull/19992) but
  remains for Skywork still, and will hopefully be resolved soon.

+ ## vLLM Reasoning Parsing Issues
+ See: https://github.com/vllm-project/vllm/pull/21041
+ See: https://github.com/SkyworkAI/Skywork-R1V/issues/42
+
+ Because Skywork models do not use single `<think>`/`</think>` tokens in the tokenizer, vLLM struggles to parse out the reasoning. Additionally,
+ the Skywork chat template is `'<|im_start|>assistant\n<think>\n'`, which already includes the opening `<think>` tag, so your generation output may
+ not contain the opening `<think>` at all and only emit `</think>`. There is ongoing work to add a string-based reasoning parser to vLLM
+ that parses the `<think></think>` markers as strings (multi-token sequences) as a workaround for this issue.
+
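As a rough illustration of the string-based workaround, the sketch below splits a generation into reasoning and answer on the `</think>` marker, tolerating a missing opening `<think>` (since the chat template already emitted it). The function name and defaults are hypothetical, not vLLM's actual parser API:

```python
def split_reasoning(text: str,
                    open_tag: str = "<think>",
                    close_tag: str = "</think>") -> tuple[str, str]:
    """Split model output into (reasoning, answer) on string tags.

    Handles the Skywork case where the chat template already emitted the
    opening <think>, so the generation may contain only </think>.
    """
    stripped = text.lstrip()
    # Drop a leading open tag if the model did emit one anyway.
    if stripped.startswith(open_tag):
        stripped = stripped[len(open_tag):]
    if close_tag in stripped:
        reasoning, answer = stripped.split(close_tag, 1)
        return reasoning.strip(), answer.strip()
    # No close tag yet: treat everything as in-progress reasoning.
    return stripped.strip(), ""
```

For example, `split_reasoning("Let me check.</think>The answer is 4.")` returns `("Let me check.", "The answer is 4.")` whether or not the input starts with `<think>`.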
+ The Skywork team has said they will use a single-token `<think>` in the next model version, so this won't be an issue going forward.
+
  *Quantized with ❤️ using LLM Compressor for the open-source community*