hell0ks commited on
Commit
1e260b9
·
verified ·
1 Parent(s): 1ff9e4b

Create README.md

Browse files
Files changed (1) hide show
  1. README.md +47 -0
README.md ADDED
@@ -0,0 +1,47 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ ---
2
+ pipeline_tag: text-generation
3
+ license: other
4
+ license_name: modified-mit
5
+ license_link: https://github.com/MiniMax-AI/MiniMax-M2.1/blob/main/LICENSE
6
+ base_model:
7
+ - MiniMaxAI/MiniMax-M2.1
8
+ tags:
9
+ - smoothie-qwen
10
+ ---
11
+
12
+ # Smoothie-MiniMax-M2.1
13
+
14
+ ## Overview
15
+
16
+ This is a modified version of [MiniMax-M2.1](https://huggingface.co/MiniMaxAI/MiniMax-M2.1), using [Smoothie-Qwen](https://github.com/dnotitia/smoothie-qwen).
17
+
18
+ ## What is it?
19
+
20
+ Reduced probability of Kanji, Hanja, Chinese character(radical) tokens to reduce sudden language mixing.
21
+
22
+ ## For who?
23
+
24
+ If you see Chinese characters during non-Chinese conversation, this model will help in this case.
25
+
26
+ **For Chinese and Japanese users: Use original model!** This model will behave worse in these languages.
27
+
28
+ ## Result
29
+
30
+ From my testing:
31
+
32
+ * Chinese character did not appear on Korean conversation.
33
+ * When I ask about Japanese topic, model sucessfully answered with Kanji and Hiragana (although I can't test correctness of response)
34
+
35
+ ## How I did it?
36
+
37
+ I tried to replicate Unsloth's UD quant as possible because my system only can handle up to 3-bit quants.
38
+
39
+ 1. Download original model
40
+ 2. Apply Smoothie-qwen (See configs/config.yaml for reference)
41
+ 3. Convert to GGUF (BF16)
42
+ 4. Run llama-quantize with Unsloth imatrix and manual override to tensor type from UD quants
43
+ 5. Run llama-gguf-split (max size 50GB)
44
+
45
+ ## Recommendation
46
+
47
+ At temperature 1.0, tool calling is bit unstable. I recommend temperature=0.7.