Is it possible to release a version with low bit quantization?
#11
by
lan0004
- opened
The model works really well with OpenClaw, so a low-bit quantized version would be especially useful for people who want to run it locally.
Are you asking for a 2-bit or even a 1.58-bit version?
For example, the current INT4 quantization does not fit on a machine with an AI MAX 395: the model weights plus context space and system overhead currently exceed 128 GB.
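For a rough sense of what a lower bit width would buy, here is a minimal back-of-the-envelope sketch. The parameter count and the overhead allowance below are placeholders, not the real numbers for this model; the actual KV-cache cost also depends on the architecture and context length.

```python
# Rough memory-footprint estimate for weight-only quantization.
# NOTE: PARAMS_B and OVERHEAD_GIB are placeholder assumptions --
# substitute the real parameter count and a measured KV-cache/runtime
# overhead for this model before drawing conclusions.

def weights_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GiB for a given average bit width."""
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 2**30

PARAMS_B = 300        # placeholder parameter count (billions)
OVERHEAD_GIB = 16     # placeholder allowance for context + system overhead

for bits in (4.5, 2.5, 1.58):   # roughly INT4-class, 2-bit-class, ternary
    total = weights_gib(PARAMS_B, bits) + OVERHEAD_GIB
    verdict = "fits" if total <= 128 else "does not fit"
    print(f"{bits:>5} bpw: ~{total:6.1f} GiB total -> {verdict} in 128 GiB")
```

The point of the sketch is just the scaling: halving the average bits per weight roughly halves the weight footprint, which is what would bring a model of this size under a 128 GB unified-memory budget.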