---
license: mit
library_name: distributed-llama
tags:
  - distributed-inference
  - text-generation
---

This is the [DeepSeek R1 Distill Llama 8B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B) model converted to the Distributed Llama format and quantized to Q40.

## How to Run?

* Download this repository.
* Download the [Distributed Llama](https://github.com/b4rtaz/distributed-llama) repository.
* Build Distributed Llama:

```
make dllama
```
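
The download step above can be scripted with the Hugging Face CLI. A minimal sketch, assuming `huggingface-cli` is installed; `<this-repo-id>` is a placeholder, since this card does not state the repository id on the Hub:

```
# Fetch the converted model and tokenizer files into the current directory.
# <this-repo-id> is a placeholder for this repository's Hub id.
huggingface-cli download <this-repo-id> \
  dllama_model_deepseek-r1-distill-llama-8b_q40.m \
  dllama_tokenizer_deepseek-r1-distill-llama-8b.t \
  --local-dir .
```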

## Run Distributed Llama

```
./dllama chat --model dllama_model_deepseek-r1-distill-llama-8b_q40.m \
  --tokenizer dllama_tokenizer_deepseek-r1-distill-llama-8b.t \
  --buffer-float-type q80 --nthreads 4 --max-seq-len 8192
```
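
Distributed Llama can also split inference across several devices on a local network. A rough sketch, assuming the `worker` mode and `--workers` flag described in the Distributed Llama repository; the ports and IP addresses below are placeholders:

```
# On each worker device (placeholder port):
./dllama worker --port 9998 --nthreads 4

# On the root device, listing the workers (placeholder addresses):
./dllama chat --model dllama_model_deepseek-r1-distill-llama-8b_q40.m \
  --tokenizer dllama_tokenizer_deepseek-r1-distill-llama-8b.t \
  --buffer-float-type q80 --nthreads 4 --max-seq-len 8192 \
  --workers 10.0.0.2:9998 10.0.0.3:9998
```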

## License

[MIT License](https://github.com/deepseek-ai/DeepSeek-R1/blob/main/LICENSE)