Celeste Imperia | Llama-3.2-1B-Instruct (Optimized GGUF)

This repository hosts a hardware-aware quantization of Meta's Llama-3.2-1B-Instruct, specifically optimized for high-speed local execution on consumer edge hardware.

πŸ› οΈ Optimization Forge

  • Quantization: Q4_K_M (K-Quants)
  • Validation Rig: Intel i5-11400 / 40GB RAM / RTX A4000
  • Primary Target: Snapdragon X Elite (ARM64), mobile devices, and low-spec Intel/AMD laptops.
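The 763 MB file size follows directly from the quantization scheme: Q4_K_M averages roughly 4.8 bits per weight across tensors (an approximation; the exact ratio varies per layer), and Llama-3.2-1B has about 1.24B parameters. A back-of-envelope sketch, with hedged constants:

```python
# Rough GGUF file-size estimate. The 4.8 bits/weight figure for Q4_K_M
# is an approximation; actual K-quant layouts mix precisions per tensor.
def gguf_size_mb(n_params: float, bits_per_weight: float = 4.8) -> float:
    """Estimated quantized model size in MB (decimal megabytes)."""
    return n_params * bits_per_weight / 8 / 1e6

# Llama-3.2-1B has ~1.24B parameters:
estimate = gguf_size_mb(1.24e9)  # roughly 744 MB, close to the 763 MB file
```

The small gap between estimate and actual size comes from GGUF metadata and the handful of tensors kept at higher precision.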

πŸš€ Performance

At just 763 MB on disk, this quantization is designed to run within a sub-1GB RAM footprint, making it well suited to background agentic tasks, text summarization, and local tool-calling.

βš–οΈ Attribution & License

Llama 3.2 is licensed under the Llama 3.2 Community License. Built with Llama.
