Mentioned that repetition_penalty was for M2.1 and may not apply to M2.5

README.md (changed)

````diff
@@ -150,8 +150,10 @@ You have 2 reasoning parsers;
 The reason why `minimax_m2_append_think` was introduced was Interleaved Thinking and having the model build upon its previous thinking (usually frontends discard the thinking trace)
 
 > [!TIP]
-> 💡
-> It
+> 💡 In MiniMax-M2.1 with the recommended parameters the model tended to get stuck in repetition loops in vLLM\
+> It seemed like repetition_penalty: 1.10, frequency_penalty: 0.40 avoided that.
+>
+> You may want to try the recommended settings without repetition_penalty first (it also slows down token generation)
 
 ```bash
 # Model configuration (Mandatory)
````
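As a minimal sketch of how the tip's penalties could be passed to a vLLM OpenAI-compatible server: `repetition_penalty` is a vLLM extension to the sampling parameters (not part of the OpenAI spec), while `frequency_penalty` is standard. The model name below is hypothetical, and the values are the ones the tip reports for MiniMax-M2.1 — per the commit message, they may not apply to M2.5.

```python
import json

# Hypothetical request body for a vLLM OpenAI-compatible endpoint.
# Penalty values are the M2.1-tuned ones from the tip above; try the
# recommended settings without repetition_penalty first.
payload = {
    "model": "MiniMaxAI/MiniMax-M2.1",  # placeholder model id
    "messages": [{"role": "user", "content": "Hello"}],
    "frequency_penalty": 0.40,          # standard OpenAI sampling parameter
    "repetition_penalty": 1.10,         # vLLM-specific extra sampling parameter
}

body = json.dumps(payload, indent=2)
print(body)
```

When calling through an official OpenAI client library, vLLM-only fields like `repetition_penalty` typically need to go through the client's extra-body mechanism rather than as a named argument.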