FutureMa/Eva-4B-GGUF

This repository hosts GGUF files for FutureMa/Eva-4B, intended for use with llama.cpp.

  • Base model: FutureMa/Eva-4B
  • Architecture: qwen3 (approx. 4B parameters)
  • Format: GGUF (for llama.cpp)
  • License: Apache-2.0

Refer to the original model card for model details, intended use, limitations, and evaluation information.

Files

  • Eva-4B-F16.gguf (F16: unquantized 16-bit floating point)
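
To download the file directly rather than relying on llama.cpp's built-in downloader, a minimal sketch using the huggingface-cli tool (assumes the huggingface_hub Python package is installed):

huggingface-cli download FutureMa/Eva-4B-GGUF Eva-4B-F16.gguf --local-dir .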

Use with llama.cpp

Option A: Install via Homebrew (macOS/Linux)

brew install llama.cpp
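
To confirm the install, print llama.cpp's version and build information (the --version flag is present in recent builds; treat its availability as an assumption for older ones):

llama-cli --version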

CLI

llama-cli --hf-repo FutureMa/Eva-4B-GGUF --hf-file Eva-4B-F16.gguf -p "The meaning of life and the universe is"
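
A variant of the same command with common generation options, as a sketch (-n caps the number of generated tokens, --temp sets the sampling temperature):

llama-cli --hf-repo FutureMa/Eva-4B-GGUF --hf-file Eva-4B-F16.gguf -p "The meaning of life and the universe is" -n 128 --temp 0.7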

Server

llama-server --hf-repo FutureMa/Eva-4B-GGUF --hf-file Eva-4B-F16.gguf -c 2048
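
Once the server is running (it binds to 127.0.0.1:8080 unless --host/--port say otherwise), you can query its OpenAI-compatible chat endpoint; a minimal sketch with curl:

curl http://127.0.0.1:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "What is the meaning of life?"}], "max_tokens": 64}'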

Option B: Build llama.cpp from source

Step 1: Clone llama.cpp:

git clone https://github.com/ggerganov/llama.cpp

Step 2: Build with curl enabled, which the --hf-repo/--hf-file download flags require:

cd llama.cpp && LLAMA_CURL=1 make
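
Note: recent llama.cpp releases have moved from the Makefile to CMake, so LLAMA_CURL=1 make may fail on a fresh checkout. A rough equivalent under CMake (curl support is typically on by default in current versions; the flag below just makes it explicit):

cmake -B build -DLLAMA_CURL=ON
cmake --build build --config Release

With a CMake build the binaries land in build/bin/, so the Step 3 commands become ./build/bin/llama-cli and ./build/bin/llama-server.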

Step 3: Run:

./llama-cli --hf-repo FutureMa/Eva-4B-GGUF --hf-file Eva-4B-F16.gguf -p "The meaning of life and the universe is"

or

./llama-server --hf-repo FutureMa/Eva-4B-GGUF --hf-file Eva-4B-F16.gguf -c 2048

Notes

  • The -c 2048 value is an example context size; adjust based on your needs and available memory.
  • If you publish additional quantizations (e.g. Q4_K_M, Q5_K_M), add them to the Files section above and reference them in the example commands, as sketched below.
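
For example, if a Q4_K_M quantization were published, the CLI invocation would follow the same pattern (hypothetical file name, shown for illustration only):

llama-cli --hf-repo FutureMa/Eva-4B-GGUF --hf-file Eva-4B-Q4_K_M.gguf -p "The meaning of life and the universe is"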