Molmo-7B-D NF4 Quant

Only the LLM portion was quantized; the CLIP vision encoder is left in full precision.

Weights: 30 GB -> 7 GB

Approx. 12 GB of VRAM is required for inference.

See the base model card for more information:

https://huggingface.co/allenai/Molmo-7B-D-0924
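As a minimal sketch of how such a quant can be loaded (or reproduced from the base model) with `transformers` and `bitsandbytes`: NF4 is selected via `bnb_4bit_quant_type="nf4"`, and the vision encoder is excluded from quantization with `llm_int8_skip_modules`. The module name `"vision_backbone"` below is an assumption about Molmo's internals, not something stated in this card; check the model's config if loading fails.

```python
import torch
from transformers import AutoModelForCausalLM, AutoProcessor, BitsAndBytesConfig

# NF4 4-bit quantization for the LLM weights; the vision encoder is skipped.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",             # NF4 data type, as used for this quant
    bnb_4bit_compute_dtype=torch.float16,  # compute in fp16
    llm_int8_skip_modules=["vision_backbone"],  # assumed name of the CLIP encoder module
)

# Molmo ships custom modeling code, hence trust_remote_code=True.
model = AutoModelForCausalLM.from_pretrained(
    "reubk/Molmo_7B_D_0924_NF4",
    quantization_config=quant_config,
    trust_remote_code=True,
    device_map="auto",
)
processor = AutoProcessor.from_pretrained(
    "reubk/Molmo_7B_D_0924_NF4", trust_remote_code=True
)
```

With `device_map="auto"`, accelerate places the quantized layers on the GPU automatically; the ~12 GB VRAM figure above assumes the full-precision CLIP encoder also resides on the GPU.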

Safetensors: 8B params; tensor types F32, F16, U8.
Model tree for reubk/Molmo_7B_D_0924_NF4: base model Qwen/Qwen2-7B -> quantized -> this model.