Mentioned that repetition_penalty was for M2.1 and may not apply to M2.5

README.md (changed)

````diff
@@ -150,8 +150,10 @@ You have 2 reasoning parsers;
 The reason why `minimax_m2_append_think` was introduced was Interleaved Thinking and having the model build upon its previous thinking (usually frontends discard the thinking trace)
 
 > [!TIP]
-> 💡
-> It
+> 💡 In MiniMax-M2.1 with the recommended parameters the model tended to get stuck in repetition loops in vLLM\
+> It seemed like repetition_penalty: 1.10, frequency_penalty: 0.40 avoided that.
+>
+> You may want to try the recommended settings without repetition_penalty first (it also slows down token generation)
 
 ```bash
 # Model configuration (Mandatory)
````
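As a minimal sketch of how the tip's penalties could be passed to a vLLM OpenAI-compatible server: `repetition_penalty` is a vLLM extension to the sampling parameters (not part of the OpenAI spec), while `frequency_penalty` is standard. The model name below is hypothetical, and the values are the ones the tip reports for MiniMax-M2.1 — per the commit message, they may not apply to M2.5.

```python
import json

# Hypothetical request body for a vLLM OpenAI-compatible endpoint.
# Penalty values are the M2.1-tuned ones from the tip above; try the
# recommended settings without repetition_penalty first.
payload = {
    "model": "MiniMaxAI/MiniMax-M2.1",  # placeholder model id
    "messages": [{"role": "user", "content": "Hello"}],
    "frequency_penalty": 0.40,          # standard OpenAI sampling parameter
    "repetition_penalty": 1.10,         # vLLM-specific extra sampling parameter
}

body = json.dumps(payload, indent=2)
print(body)
```

When calling through an official OpenAI client library, vLLM-only fields like `repetition_penalty` typically need to go through the client's extra-body mechanism rather than as a named argument.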