Instructions to use tensorblock/DeepSeek-R1-Distill-Llama-8B-GGUF with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use tensorblock/DeepSeek-R1-Distill-Llama-8B-GGUF with Transformers:

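# Load model directly (a GGUF repo needs gguf_file to pick one quantized file; requires the gguf package)
from transformers import AutoModel
model = AutoModel.from_pretrained("tensorblock/DeepSeek-R1-Distill-Llama-8B-GGUF", gguf_file="DeepSeek-R1-Distill-Llama-8B-Q2_K.gguf", dtype="auto")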

llama-cpp-python

How to use tensorblock/DeepSeek-R1-Distill-Llama-8B-GGUF with llama-cpp-python:

# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="tensorblock/DeepSeek-R1-Distill-Llama-8B-GGUF",
	filename="DeepSeek-R1-Distill-Llama-8B-Q2_K.gguf",
)

# messages must be a list of role/content dicts, not a bare string
llm.create_chat_completion(
	messages = [
		{"role": "user", "content": "What is the capital of France?"}
	]
)

Local Apps

llama.cpp

How to use tensorblock/DeepSeek-R1-Distill-Llama-8B-GGUF with llama.cpp:

Install from brew

brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf tensorblock/DeepSeek-R1-Distill-Llama-8B-GGUF:Q2_K
# Run inference directly in the terminal:
llama-cli -hf tensorblock/DeepSeek-R1-Distill-Llama-8B-GGUF:Q2_K

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf tensorblock/DeepSeek-R1-Distill-Llama-8B-GGUF:Q2_K
# Run inference directly in the terminal:
llama-cli -hf tensorblock/DeepSeek-R1-Distill-Llama-8B-GGUF:Q2_K

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf tensorblock/DeepSeek-R1-Distill-Llama-8B-GGUF:Q2_K
# Run inference directly in the terminal:
./llama-cli -hf tensorblock/DeepSeek-R1-Distill-Llama-8B-GGUF:Q2_K

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf tensorblock/DeepSeek-R1-Distill-Llama-8B-GGUF:Q2_K
# Run inference directly in the terminal:
./build/bin/llama-cli -hf tensorblock/DeepSeek-R1-Distill-Llama-8B-GGUF:Q2_K

Use Docker

docker model run hf.co/tensorblock/DeepSeek-R1-Distill-Llama-8B-GGUF:Q2_K
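
Once `llama-server` is running from any of the install options above, its OpenAI-compatible endpoint can be queried from code. The following is a minimal sketch using only the Python standard library. It assumes the server listens on its default address `http://localhost:8080`; the helper names `build_chat_request` and `chat` are illustrative and not part of llama.cpp.

```python
import json
import urllib.request


def build_chat_request(prompt, base_url="http://localhost:8080"):
    """Build a POST request for the OpenAI-compatible chat endpoint.

    base_url is an assumption: adjust it if llama-server was started
    with a different host or port.
    """
    payload = {"messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, base_url="http://localhost:8080"):
    """Send one user message and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(prompt, base_url)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With the server running, `print(chat("Explain GGUF in one sentence."))` would print the model's reply.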

Ollama
How to use tensorblock/DeepSeek-R1-Distill-Llama-8B-GGUF with Ollama:
```
ollama run hf.co/tensorblock/DeepSeek-R1-Distill-Llama-8B-GGUF:Q2_K
```

Unsloth Studio

How to use tensorblock/DeepSeek-R1-Distill-Llama-8B-GGUF with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for tensorblock/DeepSeek-R1-Distill-Llama-8B-GGUF to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for tensorblock/DeepSeek-R1-Distill-Llama-8B-GGUF to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for tensorblock/DeepSeek-R1-Distill-Llama-8B-GGUF to start chatting

Docker Model Runner
How to use tensorblock/DeepSeek-R1-Distill-Llama-8B-GGUF with Docker Model Runner:
```
docker model run hf.co/tensorblock/DeepSeek-R1-Distill-Llama-8B-GGUF:Q2_K
```

Lemonade

How to use tensorblock/DeepSeek-R1-Distill-Llama-8B-GGUF with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull tensorblock/DeepSeek-R1-Distill-Llama-8B-GGUF:Q2_K

Run and chat with the model

lemonade run user.DeepSeek-R1-Distill-Llama-8B-GGUF-Q2_K

List all available models

lemonade list

morriszms committed on Mar 22, 2025

Commit baab909 (verified)

1 Parent(s): 57cc80e

Upload folder using huggingface_hub

Files changed (13)

DeepSeek-R1-Distill-Llama-8B-Q2_K.gguf +2 -2
DeepSeek-R1-Distill-Llama-8B-Q3_K_L.gguf +2 -2
DeepSeek-R1-Distill-Llama-8B-Q3_K_M.gguf +2 -2
DeepSeek-R1-Distill-Llama-8B-Q3_K_S.gguf +2 -2
DeepSeek-R1-Distill-Llama-8B-Q4_0.gguf +2 -2
DeepSeek-R1-Distill-Llama-8B-Q4_K_M.gguf +2 -2
DeepSeek-R1-Distill-Llama-8B-Q4_K_S.gguf +2 -2
DeepSeek-R1-Distill-Llama-8B-Q5_0.gguf +2 -2
DeepSeek-R1-Distill-Llama-8B-Q5_K_M.gguf +2 -2
DeepSeek-R1-Distill-Llama-8B-Q5_K_S.gguf +2 -2
DeepSeek-R1-Distill-Llama-8B-Q6_K.gguf +2 -2
DeepSeek-R1-Distill-Llama-8B-Q8_0.gguf +2 -2
README.md +15 -5

DeepSeek-R1-Distill-Llama-8B-Q2_K.gguf CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:f9296eaed66d8fd05fb738f6e9b8e6b87a77187a2f38602543bfe66839024a38
-size 3179133504
+oid sha256:5ce1c13294f783b6ef2f1ff850ca8d2686ef0b1bf41efee48003df71232c7ba0
+size 3179134208

DeepSeek-R1-Distill-Llama-8B-Q3_K_L.gguf CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:2082bbedad7c11eacd99f7d67ea2dcb44808531f5f2906ca4521c5b990945ae4
-size 4321958464
+oid sha256:1364f6221346b01f032702e6025ab35813fc9f696fb46520267c4f5df4b93fb1
+size 4321959168

DeepSeek-R1-Distill-Llama-8B-Q3_K_M.gguf CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:3e4673f4449458ac58cf4ecf19c0373c8abd62a75782cfe1cb2227aacb1a5a9e
-size 4018920000
+oid sha256:ebacb1cbf26c3b72dfd6cf1392cc6cb3b5c5b6b34bbc4947ab86d673690ed0f6
+size 4018920704

DeepSeek-R1-Distill-Llama-8B-Q3_K_S.gguf CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:7962f885a3a8abf92b3b976523cb3d3295b1fe5c260ae6043ef71f61fb6391e3
-size 3664501312
+oid sha256:fb614b920e22ae4c4ac7691e52db6f563288976897f56149f6fcf2ab5734cc1c
+size 3664502016

DeepSeek-R1-Distill-Llama-8B-Q4_0.gguf CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:0d69e7290acfa229e77189184b3dee909466fbb7dc9f60d5e0059be5ed31f7c2
-size 4661213760
+oid sha256:9c9629f485dbd0e865c6ed6620e6d77b18145ff88df38d86c9e95ff319451ca2
+size 4661214464

DeepSeek-R1-Distill-Llama-8B-Q4_K_M.gguf CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:e7c4e09e57e90b0e53e3791e1fb1350e9dea5ffe1b82b2925e88bb3b23f6668e
-size 4920736320
+oid sha256:2cb17faebb81dd6bdebf819451c2353e9079e049a89359c803bffdbb27437d35
+size 4920737024

DeepSeek-R1-Distill-Llama-8B-Q4_K_S.gguf CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:dab3b7a61720de2392bfae19fb0b56e686eafd3437a41e6377a34375d6dba312
-size 4692671040
+oid sha256:25d99a1f5ffb12707fccc672466b372a5daac3ddb22345a8c15b012c4ee971aa
+size 4692671744

DeepSeek-R1-Distill-Llama-8B-Q5_0.gguf CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:30e508da73aa37af873016e824d97fc748795971850e946c56eaeb22daa17bd7
-size 5599296064
+oid sha256:131781480962eab4908ddb707b2e90650df33029c66e3ef910271ab3dfd4c865
+size 5599296768

DeepSeek-R1-Distill-Llama-8B-Q5_K_M.gguf CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:a9728b9213573dfe70ced61ccf6985f2a604668c1d8027377450b75a7e84e82d
-size 5732989504
+oid sha256:5a37a80765bead563b275ba8a4786065216fee922d9529c6bf78fea75b71e6bb
+size 5732990208

DeepSeek-R1-Distill-Llama-8B-Q5_K_S.gguf CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:1d3bf52cb4a8453218ef930cf7d069bca13953e9a2978eb0094a2c1befa072ed
-size 5599296064
+oid sha256:a29fc84d7f07ec2a2266a858ea173694211472dfddb5294d3fc7e522b16ef93f
+size 5599296768

DeepSeek-R1-Distill-Llama-8B-Q6_K.gguf CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:fddae0a8f217ddcee997d2fcb75ff39600c889622947f35134e905d19199233d
-size 6596008512
+oid sha256:65c10b07e07b51048ed0f96e9221bde9635ad2b177f8665b898cb91ff8b34fe0
+size 6596009216

DeepSeek-R1-Distill-Llama-8B-Q8_0.gguf CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:fc092d4c49aeb854c3aedf08d1df47c5c33d8ba00c93d22e2f59fc6980210137
-size 8540772928
+oid sha256:6755254b2f52bc461cf6a8f2ce3aa405c261d71bf8ad72231dba73288a0e192d
+size 8540773632
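
Each `.gguf` entry above is a Git LFS pointer, so the `oid sha256:` and `size` lines identify the real file's content. A downloaded file can be checked against its pointer. The sketch below is one way to do that with the standard library; the function names `parse_lfs_pointer` and `matches_pointer` are hypothetical helpers, and the pointer text is the new Q2_K pointer copied from the diff.

```python
import hashlib
from pathlib import Path

# New pointer for the Q2_K file, copied from the diff above.
NEW_Q2_K_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:5ce1c13294f783b6ef2f1ff850ca8d2686ef0b1bf41efee48003df71232c7ba0
size 3179134208
"""


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at path has the pointer's size and hash."""
    path = Path(path)
    # Cheap check first: a size mismatch avoids hashing gigabytes.
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as fh:
        # Read in chunks so multi-gigabyte files are not loaded into memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]
```

For example, `matches_pointer("DeepSeek-R1-Distill-Llama-8B-Q2_K.gguf", parse_lfs_pointer(NEW_Q2_K_POINTER))` should return `True` only for a complete download of the file from this commit.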

README.md CHANGED

@@ -1,6 +1,16 @@
 ---
-base_model: deepseek-ai/DeepSeek-R1-Distill-Llama-8B
+base_model: unsloth/DeepSeek-R1-Distill-Llama-8B
+language:
+- en
+license: llama3.1
+library_name: transformers
 tags:
+- deepseek
+- unsloth
+- transformers
+- llama
+- llama-3
+- meta
 - TensorBlock
 - GGUF
 ---
@@ -16,11 +26,11 @@ tags:
     </div>
 </div>
-## deepseek-ai/DeepSeek-R1-Distill-Llama-8B - GGUF
-This repo contains GGUF format model files for [deepseek-ai/DeepSeek-R1-Distill-Llama-8B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B).
-The files were quantized using machines provided by [TensorBlock](https://tensorblock.co/), and they are compatible with llama.cpp as of [commit ec7f3ac](https://github.com/ggerganov/llama.cpp/commit/ec7f3ac9ab33e46b136eb5ab6a76c4d81f57c7f1).
+## unsloth/DeepSeek-R1-Distill-Llama-8B - GGUF
+This repo contains GGUF format model files for [unsloth/DeepSeek-R1-Distill-Llama-8B](https://huggingface.co/unsloth/DeepSeek-R1-Distill-Llama-8B).
+The files were quantized using machines provided by [TensorBlock](https://tensorblock.co/), and they are compatible with llama.cpp as of [commit b4882](https://github.com/ggml-org/llama.cpp/commit/be7c3034108473beda214fd1d7c98fd6a7a3bdf5).
 <div style="text-align: left; margin: 20px 0;">
     <a href="https://tensorblock.co/waitlist/client" style="display: inline-block; padding: 10px 20px; background-color: #007bff; color: white; text-decoration: none; border-radius: 5px; font-weight: bold;">
@@ -31,7 +41,7 @@ The files were quantized using machines provided by [TensorBlock](https://tensor
 ## Prompt template
 ```
-<｜begin▁of▁sentence｜>{system_prompt}<｜User｜>{prompt}<｜Assistant｜>
+<｜begin▁of▁sentence｜>{system_prompt}<｜User｜>{prompt}<｜Assistant｜><think>
 ```
 ## Model file specification
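
The README change above appends `<think>` to the prompt template, so a raw completion would be expected to start inside the reasoning section. The sketch below fills in the new template and separates reasoning from the final answer. The names `build_prompt` and `split_reasoning` are illustrative, and the assumption that the model closes its reasoning with `</think>` is not stated in this commit.

```python
# New prompt template, copied from the README diff above.
PROMPT_TEMPLATE = (
    "<｜begin▁of▁sentence｜>{system_prompt}<｜User｜>{prompt}<｜Assistant｜><think>"
)


def build_prompt(prompt, system_prompt=""):
    """Fill the README's prompt template for a raw (non-chat) completion call."""
    return PROMPT_TEMPLATE.format(system_prompt=system_prompt, prompt=prompt)


def split_reasoning(completion):
    """Split a completion into (reasoning, answer).

    Assumption: the reasoning ends at the first '</think>' marker. If the
    marker is absent, the whole text is treated as the answer.
    """
    reasoning, marker, answer = completion.partition("</think>")
    if not marker:
        return "", completion.strip()
    return reasoning.strip(), answer.strip()
```

The string from `build_prompt` is meant for raw completion APIs. Chat-style endpoints normally apply the template stored in the GGUF metadata themselves, so it should not be applied twice.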