# Abhi99999/open0-2-lite-Q4_K_M-GGUF

This model was converted to GGUF format from aquigpt/open0-2-lite using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model.
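For reference, a conversion like this can also be reproduced locally with llama.cpp's stock tools. The sketch below uses the real `convert_hf_to_gguf.py` script and `llama-quantize` binary from the llama.cpp repo; the local paths and output filenames are illustrative:

```bash
# Convert the original Hugging Face checkpoint to a full-precision GGUF file
# (convert_hf_to_gguf.py ships with llama.cpp).
python convert_hf_to_gguf.py ./open0-2-lite --outfile open0-2-lite-f16.gguf --outtype f16

# Quantize to Q4_K_M, the quantization level of the file in this repo.
./llama-quantize open0-2-lite-f16.gguf open0-2-lite-q4_k_m.gguf Q4_K_M
```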

## Performance Benchmarks

Aqui-open0-2 Lite demonstrates exceptional performance across multiple challenging benchmarks, significantly outperforming other models in its size class:

| Benchmark | Aqui-open0-2 Lite (1.72B) | Gemma 3 (1B) | Qwen3 (2.03B) | Llama 3.2 (1.24B) | LFM2 (1.17B) |
|---|---|---|---|---|---|
| MMLU (General Knowledge) | **67.5%** | 40.1% | *59.1%* | 46.6% | 55.2% |
| GPQA (Science) | **31.8%** | 19.2% | 27.7% | 19.6% | *31.5%* |
| IFEval (Instruction Following) | *73.4%* | 62.9% | 68.4% | 52.4% | **74.5%** |
| GSM8K (Grade School Math) | **63.2%** | *59.6%* | 51.4% | 35.7% | 58.3% |
| MGSM (Multilingual) | **70.2%** | 43.6% | *66.6%* | 29.1% | 55.0% |
| Average Performance | **61.2%** | 45.1% | 54.6% | 36.7% | *54.9%* |

**Bold**: Best performance, *Italics*: Second best

## Use with llama.cpp

Install llama.cpp through brew (works on Mac and Linux):

```bash
brew install llama.cpp
```

Invoke the llama.cpp server or the CLI.

CLI:

```bash
llama-cli --hf-repo Abhi99999/open0-2-lite-Q4_K_M-GGUF --hf-file open0-2-lite-q4_k_m.gguf -p "The meaning to life and the universe is"
```
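If you prefer to manage the download yourself, you can fetch the file once and point the CLI at the local path. A sketch, assuming the `huggingface-cli` tool from the `huggingface_hub` package is installed:

```bash
# One-time download of the quantized file into the current directory.
huggingface-cli download Abhi99999/open0-2-lite-Q4_K_M-GGUF open0-2-lite-q4_k_m.gguf --local-dir .

# Run against the local file instead of the remote repo.
llama-cli -m open0-2-lite-q4_k_m.gguf -p "The meaning to life and the universe is"
```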

Server:

```bash
llama-server --hf-repo Abhi99999/open0-2-lite-Q4_K_M-GGUF --hf-file open0-2-lite-q4_k_m.gguf -c 2048
```
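Once the server is up, you can query its OpenAI-compatible chat endpoint. A minimal sketch, assuming the default port 8080 and an example prompt:

```bash
# Send a chat request to llama-server's OpenAI-compatible endpoint.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "messages": [
          {"role": "user", "content": "Explain what GGUF is in one sentence."}
        ],
        "max_tokens": 128
      }'
```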

Note: You can also use this checkpoint directly through the usage steps listed in the llama.cpp repo.

Step 1: Clone llama.cpp from GitHub.

```bash
git clone https://github.com/ggerganov/llama.cpp
```

Step 2: Move into the llama.cpp folder and build it with the LLAMA_CURL=1 flag along with any hardware-specific flags (e.g. LLAMA_CUDA=1 for Nvidia GPUs on Linux).

```bash
cd llama.cpp && LLAMA_CURL=1 make
```
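Note that recent llama.cpp checkouts have moved from Makefiles to CMake, so `make` may fail on current sources. A rough CMake equivalent, assuming a recent checkout (where GGML_CUDA has replaced LLAMA_CUDA):

```bash
# CURL support is on by default in recent CMake builds;
# add -DGGML_CUDA=ON for Nvidia GPUs. Binaries land in build/bin.
cmake -B build -DLLAMA_CURL=ON
cmake --build build --config Release
```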

Step 3: Run inference through the main binary.

```bash
./llama-cli --hf-repo Abhi99999/open0-2-lite-Q4_K_M-GGUF --hf-file open0-2-lite-q4_k_m.gguf -p "The meaning to life and the universe is"
```

or

```bash
./llama-server --hf-repo Abhi99999/open0-2-lite-Q4_K_M-GGUF --hf-file open0-2-lite-q4_k_m.gguf -c 2048
```