These models are quantized specifically for CPU inference. You can run them in a Docker Space at speeds of up to 9 tokens per second.
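The tokens-per-second figure will vary with the CPU and the model, so it may be worth measuring on your own Space. Below is a minimal sketch of one way to do that. It assumes your inference runtime can hand back generated tokens as an iterable; `tokens_per_second` is a hypothetical helper written for illustration, not part of any library.

```python
import time
from typing import Callable, Iterable


def tokens_per_second(token_stream: Iterable[str],
                      clock: Callable[[], float] = time.perf_counter) -> float:
    """Consume a stream of generated tokens and return the average rate.

    `token_stream` is assumed to yield one item per generated token.
    `clock` is injectable so the function can be checked deterministically.
    """
    start = clock()
    count = 0
    for _ in token_stream:
        count += 1
    elapsed = clock() - start
    # Guard against a zero-length interval on empty or very fast streams.
    return count / elapsed if elapsed > 0 else 0.0
```

Because the timer starts before the first token is requested, the result includes prompt-processing time, so short generations will report a lower rate than long ones.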
Vedika AI
Vedika-AI
Recent Activity
liked a Space 1 day ago: Monster/gemma-4-E2B-it-GGUF
updated a model 2 days ago: Vedika-AI/Qwen2.5-Math-1.5B-Q3_K_M-GGUF
published a model 2 days ago: Vedika-AI/Qwen2.5-Math-1.5B-Q3_K_M-GGUF