Reproduction

from auto_round import AutoRound
from auto_round.logger import logger as ar_logger

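# Configure the quantizer for the REAP-pruned checkpoint.
ar = AutoRound(
    './MiniMax-M2.1-REAP-40',    # local path to the source model
    device='cuda',
    device_map='auto',
    nsamples=128,                # number of calibration samples
    seqlen=2048,                 # token length of each calibration sample
    batch_size=8,
    enable_torch_compile=False,  # run without torch.compile
)

# Quantize and write the result in the auto_round export format.
ar.quantize_and_save('./MiniMax-M2.1-REAP-40-W4A16', format='auto_round')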

Acknowledgments

  • REAP conversion by 0xSero
  • REAP implementation by Cerebras
  • Base model by MiniMax