Celeste Imperia | Llama-3.2-1B-Instruct (Optimized GGUF)

This repository hosts a hardware-aware quantization of Meta's Llama-3.2-1B-Instruct, specifically optimized for high-speed local execution on consumer edge hardware.

πŸ› οΈ Optimization Forge

  • Quantization: Q4_K_M (K-Quants)
  • Validation Rig: Intel i5-11400 / 40GB RAM / RTX A4000
  • Primary Target: Snapdragon X Elite (ARM64), mobile devices, and low-spec Intel/AMD laptops.
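The 763 MB file size follows directly from the quantization scheme: Q4_K_M averages roughly 4.8 bits per weight across tensors (an approximation; the exact ratio varies per layer), and Llama-3.2-1B has about 1.24B parameters. A back-of-envelope sketch, with hedged constants:

```python
# Rough GGUF file-size estimate. The 4.8 bits/weight figure for Q4_K_M
# is an approximation; actual K-quant layouts mix precisions per tensor.
def gguf_size_mb(n_params: float, bits_per_weight: float = 4.8) -> float:
    """Estimated quantized model size in MB (decimal megabytes)."""
    return n_params * bits_per_weight / 8 / 1e6

# Llama-3.2-1B has ~1.24B parameters:
estimate = gguf_size_mb(1.24e9)  # roughly 744 MB, close to the 763 MB file
```

The small gap between estimate and actual size comes from GGUF metadata and the handful of tensors kept at higher precision.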

πŸš€ Performance

At just 763 MB on disk, this quantization is designed to run within a sub-1GB RAM footprint, making it well suited to background agentic tasks, text summarization, and local tool-calling.

βš–οΈ Attribution & License

Llama 3.2 is licensed under the Llama 3.2 Community License. Built with Llama.
