Reproduction
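from auto_round import AutoRound
from auto_round.logger import logger as ar_logger

# Calibrate and quantize the local REAP checkpoint.
ar = AutoRound(
    './MiniMax-M2.1-REAP-40',
    device='cuda',
    device_map='auto',
    nsamples=128,   # number of calibration samples
    seqlen=2048,    # calibration sequence length in tokens
    batch_size=8,
    enable_torch_compile=False,
)
# Quantize, then write the result in the auto_round export format.
ar.quantize_and_save('./MiniMax-M2.1-REAP-40-W4A16', format='auto_round')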
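The "W4A16" suffix in the output path denotes 4-bit weights with 16-bit activations. As a rough illustration of what 4-bit weight storage involves, the sketch below applies plain round-to-nearest quantization to one small group of weights. This is not what AutoRound does internally (it tunes the rounding during calibration), and the asymmetric scheme, the function names, and the sample weights are all assumptions made for the example.

```python
# Illustrative only: plain round-to-nearest (RTN) asymmetric 4-bit quantization
# of a single weight group. AutoRound optimizes the rounding instead of using
# plain RTN; all names and values here are hypothetical.

def quantize_group(weights, bits=4):
    """Map a list of floats to unsigned `bits`-bit integer codes."""
    qmax = (1 << bits) - 1                      # 15 for 4-bit codes
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / qmax if hi > lo else 1.0
    # Round to the nearest code and clamp to the representable range.
    codes = [min(qmax, max(0, round((w - lo) / scale))) for w in weights]
    return codes, scale, lo


def dequantize_group(codes, scale, lo):
    """Reconstruct approximate float weights from integer codes."""
    return [c * scale + lo for c in codes]


weights = [-0.30, -0.10, 0.00, 0.05, 0.20, 0.45]
codes, scale, lo = quantize_group(weights)
restored = dequantize_group(codes, scale, lo)
max_err = max(abs(w - r) for w, r in zip(weights, restored))
```

With round-to-nearest, the reconstruction error of any weight is bounded by half the step size (`scale / 2`), which is the baseline that tuned-rounding methods aim to improve on at the layer-output level.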
Acknowledgments
- REAP conversion by 0xSero
- REAP implementation by Cerebras
- Base model by MiniMax