---
license: mit
library_name: distributed-llama
tags:
- distributed-inference
- text-generation
---

This is the [DeepSeek R1 Distill Llama 8B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B) model converted to the Distributed Llama format. The model is quantized to Q40.

## 🚀 How to Run?

* ⏬ Download this repository.
* ⏬ Download the [Distributed Llama](https://github.com/b4rtaz/distributed-llama) repository.
* 🔨 Build Distributed Llama:

```
make dllama
```

* 🚀 Run Distributed Llama:

```
./dllama chat --model dllama_model_deepseek-r1-distill-llama-8b_q40.m --tokenizer dllama_tokenizer_deepseek-r1-distill-llama-8b.t --buffer-float-type q80 --nthreads 4 --max-seq-len 8192
```

## 🎩 License

This model is distributed under the [MIT License](https://github.com/deepseek-ai/DeepSeek-R1/blob/main/LICENSE).
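The command above runs everything on a single machine. Distributed Llama can also split inference across several devices: as a sketch (the `worker` subcommand and `--workers` flag come from the Distributed Llama CLI; the IP addresses and port below are placeholders you must replace with your own nodes), you start a worker on each extra device and point the root node at them:

```
# On each worker device (placeholder port; pick one per node):
./dllama worker --port 9998 --nthreads 4

# On the root device, listing the workers (placeholder IPs):
./dllama chat --model dllama_model_deepseek-r1-distill-llama-8b_q40.m \
  --tokenizer dllama_tokenizer_deepseek-r1-distill-llama-8b.t \
  --buffer-float-type q80 --nthreads 4 --max-seq-len 8192 \
  --workers 10.0.0.2:9998 10.0.0.3:9998
```

All nodes need the built `dllama` binary; only the root node needs the model and tokenizer files.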