How to use from the
Use from the
Transformers library
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="mlx-community/DeepSeek-R1-Distill-Llama-8B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("mlx-community/DeepSeek-R1-Distill-Llama-8B")
model = AutoModelForCausalLM.from_pretrained("mlx-community/DeepSeek-R1-Distill-Llama-8B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
Quick Links

mlx-community/DeepSeek-R1-Distill-Llama-8B

The Model mlx-community/DeepSeek-R1-Distill-Llama-8B was converted to MLX format from deepseek-ai/DeepSeek-R1-Distill-Llama-8B using mlx-lm version 0.21.1 by Focused.

Focused Logo

Use with mlx

pip install mlx-lm
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/DeepSeek-R1-Distill-Llama-8B")

prompt = "hello"

if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)

Focused is a technology company at the forefront of AI-driven development, empowering organizations to unlock the full potential of artificial intelligence. From integrating innovative models into existing systems to building scalable, modern AI infrastructures, we specialize in delivering tailored, incremental solutions that meet you where you are. Curious how we can help with your AI next project? Get in Touch

Focused Logo

Downloads last month
55
Safetensors
Model size
8B params
Tensor type
F16
·
MLX
Hardware compatibility
Log In to add your hardware

Quantized

Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for mlx-community/DeepSeek-R1-Distill-Llama-8B

Finetuned
(170)
this model