hell0ks
/

Smoothie-MiniMax-M2.1

Text Generation

Model card Files Files and versions

Smoothie-MiniMax-M2.1 / README.md

hell0ks's picture

Update README.md

c3fb9c1 verified about 1 month ago

|

history blame contribute delete

1.52 kB

	---
	pipeline_tag: text-generation
	license: other
	license_name: modified-mit
	license_link: https://github.com/MiniMax-AI/MiniMax-M2.1/blob/main/LICENSE
	base_model:
	- MiniMaxAI/MiniMax-M2.1
	tags:
	- smoothie-qwen
	---

	# Smoothie-MiniMax-M2.1

	## Overview

	This is a modified version of [MiniMax-M2.1](https://huggingface.co/MiniMaxAI/MiniMax-M2.1), using [Smoothie-Qwen](https://github.com/dnotitia/smoothie-qwen).

	## What is it?

	Reduced probability of Kanji, Hanja, Chinese character(radical) tokens to reduce sudden language mixing.

	## For who?

	If you see Chinese characters during non-Chinese conversation, this model will help in this case.

	It does not "solve" the main problem, just improve its occurrence.

	For Chinese and Japanese users: Use original model! This model will behave worse in these languages.

	## Result

	From my testing:

	* Chinese character did not appear on Korean conversation.
	* When I ask about Japanese topic, model sucessfully answered with Kanji and Hiragana (although I can't test correctness of response)

	## How I did it?

	I tried to replicate Unsloth's UD quant as possible because my system only can handle up to 3-bit quants.

	1. Download original model
	2. Apply Smoothie-qwen (See configs/config.yaml for reference)
	3. Convert to GGUF (BF16)
	4. Run llama-quantize with Unsloth imatrix and manual override to tensor type from UD quants
	5. Run llama-gguf-split (max size 50GB)

	## Recommendation

	At temperature 1.0, tool calling is bit unstable. I recommend temperature=0.7.