| | --- |
| | pipeline_tag: text-generation |
| | license: other |
| | license_name: modified-mit |
| | license_link: https://github.com/MiniMax-AI/MiniMax-M2.1/blob/main/LICENSE |
| | base_model: |
| | - MiniMaxAI/MiniMax-M2.1 |
| | tags: |
| | - smoothie-qwen |
| | --- |
| | |
| | # Smoothie-MiniMax-M2.1 |
| |
|
| | ## Overview |
| |
|
| | This is a modified version of [MiniMax-M2.1](https://huggingface.co/MiniMaxAI/MiniMax-M2.1), using [Smoothie-Qwen](https://github.com/dnotitia/smoothie-qwen). |
| |
|
| | ## What is it? |
| |
|
| | Reduced probability of Kanji, Hanja, Chinese character(radical) tokens to reduce sudden language mixing. |
| |
|
| | ## For who? |
| |
|
| | If you see Chinese characters during non-Chinese conversation, this model will **help** in this case. |
| |
|
| | It does not *"solve"* the main problem, just improve its occurrence. |
| |
|
| | **For Chinese and Japanese users: Use original model!** This model will behave worse in these languages. |
| |
|
| | ## Result |
| |
|
| | From my testing: |
| |
|
| | * Chinese character did not appear on Korean conversation. |
| | * When I ask about Japanese topic, model sucessfully answered with Kanji and Hiragana (although I can't test correctness of response) |
| |
|
| | ## How I did it? |
| |
|
| | I tried to replicate Unsloth's UD quant as possible because my system only can handle up to 3-bit quants. |
| |
|
| | 1. Download original model |
| | 2. Apply Smoothie-qwen (See configs/config.yaml for reference) |
| | 3. Convert to GGUF (BF16) |
| | 4. Run llama-quantize with Unsloth imatrix and manual override to tensor type from UD quants |
| | 5. Run llama-gguf-split (max size 50GB) |
| |
|
| | ## Recommendation |
| |
|
| | At temperature 1.0, tool calling is bit unstable. I recommend temperature=0.7. |