File size: 1,522 Bytes
1e260b9
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
c3fb9c1
 
 
1e260b9
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
---
pipeline_tag: text-generation
license: other
license_name: modified-mit
license_link: https://github.com/MiniMax-AI/MiniMax-M2.1/blob/main/LICENSE
base_model:
- MiniMaxAI/MiniMax-M2.1
tags:
- smoothie-qwen
---

# Smoothie-MiniMax-M2.1

## Overview

This is a modified version of [MiniMax-M2.1](https://huggingface.co/MiniMaxAI/MiniMax-M2.1), using [Smoothie-Qwen](https://github.com/dnotitia/smoothie-qwen).

## What is it?

Reduced probability of Kanji, Hanja, Chinese character(radical) tokens to reduce sudden language mixing.

## For who?

If you see Chinese characters during non-Chinese conversation, this model will **help** in this case.

It does not *"solve"* the main problem, just improve its occurrence.

**For Chinese and Japanese users: Use original model!** This model will behave worse in these languages.

## Result

From my testing:

* Chinese character did not appear on Korean conversation.
* When I ask about Japanese topic, model sucessfully answered with Kanji and Hiragana (although I can't test correctness of response)

## How I did it?

I tried to replicate Unsloth's UD quant as possible because my system only can handle up to 3-bit quants.

1. Download original model
2. Apply Smoothie-qwen (See configs/config.yaml for reference)
3. Convert to GGUF (BF16)
4. Run llama-quantize with Unsloth imatrix and manual override to tensor type from UD quants
5. Run llama-gguf-split (max size 50GB)

## Recommendation

At temperature 1.0, tool calling is bit unstable. I recommend temperature=0.7.