The model weights look too small to be FP16 or FP8; what quantization does this model use?
I looked in the README.md and the tech blog but couldn't find it...
It says INT4 on the model card
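As a sanity check, you can estimate the expected checkpoint size from nominal bytes per parameter (FP16 ≈ 2, FP8 ≈ 1, INT4 ≈ 0.5). This is a rough sketch; real INT4 checkpoints run a bit larger because of quantization scales/zero-points and layers kept in higher precision, and the 7B parameter count below is just an illustrative example:

```python
# Nominal bytes per parameter for common weight formats (illustrative).
BYTES_PER_PARAM = {"fp16": 2.0, "fp8": 1.0, "int4": 0.5}

def estimated_size_gb(num_params: float, fmt: str) -> float:
    """Estimate checkpoint size in GB, ignoring metadata, quantization
    scales, and any layers left unquantized."""
    return num_params * BYTES_PER_PARAM[fmt] / 1e9

# Hypothetical 7B-parameter model:
for fmt, _ in BYTES_PER_PARAM.items():
    print(f"{fmt}: ~{estimated_size_gb(7e9, fmt):.1f} GB")
```

So if a "7B" model's weight files total around 3.5 GB rather than 14 GB, INT4 (as the model card says) is consistent with what you're seeing.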