
# Download the models used in this repository

You can adjust the quantization level to balance model precision against file size:

- `:Q8_0` — higher precision and better output quality, but requires more memory and storage.
- `:Q6_K` — a good balance between size and accuracy (recommended default).
- `:Q5_K_S` — a smaller model that loads faster and uses less memory, at slightly lower precision.

```sh
npx --no node-llama-cpp pull --dir ./models hf:Qwen/Qwen3-1.7B-GGUF:Q8_0 --filename Qwen3-1.7B-Q8_0.gguf
npx --no node-llama-cpp pull --dir ./models hf:giladgd/gpt-oss-20b-GGUF/gpt-oss-20b.MXFP4.gguf
npx --no node-llama-cpp pull --dir ./models hf:unsloth/DeepSeek-R1-0528-Qwen3-8B-GGUF:Q6_K --filename DeepSeek-R1-0528-Qwen3-8B-Q6_K.gguf
npx --no node-llama-cpp pull --dir ./models hf:giladgd/Apertus-8B-Instruct-2509-GGUF:Q6_K
```
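The commands above all follow the same `hf:<owner>/<repo>:<quant>` URI pattern, so switching quantization levels is just a matter of swapping the tag. As a minimal sketch, here is a hypothetical helper (`modelUri` is not part of node-llama-cpp, just an illustration of the pattern) for building such URIs:

```javascript
// Hypothetical helper: builds the hf: model URI accepted by
// `node-llama-cpp pull`, given a repo and an optional quantization tag.
function modelUri(repo, quant) {
  // With a quant tag the URI is "hf:<repo>:<quant>"; without one,
  // the repo (or a direct file path inside it) is used as-is.
  return quant ? `hf:${repo}:${quant}` : `hf:${repo}`;
}

console.log(modelUri("Qwen/Qwen3-1.7B-GGUF", "Q8_0"));
// → hf:Qwen/Qwen3-1.7B-GGUF:Q8_0
console.log(modelUri("unsloth/DeepSeek-R1-0528-Qwen3-8B-GGUF", "Q5_K_S"));
// → hf:unsloth/DeepSeek-R1-0528-Qwen3-8B-GGUF:Q5_K_S
```

For example, to download a smaller, faster-loading variant of the DeepSeek model, replace `:Q6_K` with `:Q5_K_S` in the corresponding command.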