---
license: mit
library_name: transformers
tags:
- mlx
- open4bits
base_model: deepseek-ai/DeepSeek-R1
pipeline_tag: text-generation
---

# Open4bits / DeepSeek-R1-MLX-2Bit

This repository provides the **DeepSeek-R1 model quantized to 2-bit in MLX format**, published by Open4bits to enable highly efficient local inference with minimal memory usage on Apple Silicon hardware.

The underlying DeepSeek-R1 model and architecture are **developed and owned by DeepSeek AI**. This repository contains only a 2-bit quantized MLX conversion of the original model weights.

The model is designed for lightweight, high-performance text generation and instruction-following tasks, making it well suited for resource-constrained and local deployments.

---

## Model Overview

DeepSeek-R1 is a transformer-based large language model developed for strong general language understanding and generation.
This release provides a **2-bit quantized checkpoint in MLX format**, enabling efficient inference on Apple Silicon devices with a significantly reduced memory footprint.

Open4bits has started supporting **MLX models** to broaden compatibility with emerging quantization formats and efficient runtimes.

---

## Model Details

* **Base Model:** DeepSeek-R1
* **Quantization:** 2-bit
* **Format:** MLX
* **Task:** Text generation, instruction following
* **Weight tying:** Preserved
* **Compatibility:** MLX-enabled inference engines and runtimes (Apple Silicon)

This quantized release is designed to balance strong generation performance with low resource requirements.
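A minimal inference sketch using the `mlx-lm` package is shown below. This assumes `mlx-lm` is installed (`pip install mlx-lm`), an Apple Silicon machine, and that the repository id `Open4bits/DeepSeek-R1-MLX-2Bit` matches this release:

```python
# Minimal text-generation sketch with mlx-lm.
# Assumptions: Apple Silicon hardware, mlx-lm installed, and the
# repository id below matching this model release.
from mlx_lm import load, generate

# Downloads (or reuses a cached copy of) the quantized weights.
model, tokenizer = load("Open4bits/DeepSeek-R1-MLX-2Bit")

prompt = "Explain the difference between a list and a tuple in Python."
response = generate(model, tokenizer, prompt=prompt, max_tokens=256)
print(response)
```

The 2-bit weights keep memory usage low, so this should run on consumer machines with limited unified memory, at some cost in output quality (see Limitations below).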

---

## Intended Use

This model is intended for:

* Local text generation and conversational applications
* Low-resource, memory-constrained deployments
* Research, prototyping, and experimentation
* Self-hosted or offline AI systems

---

## Limitations

* Reduced performance compared to full-precision variants
* Output quality depends on prompt design and inference settings
* Not specifically tuned for highly specialized or domain-specific tasks

---

## License

This model is released under the **MIT License**, as defined by the base model creators.
Users must comply with the licensing conditions of the base DeepSeek-R1 model.

---

## Support

If you find this model useful, please consider supporting the project.
Your support helps Open4bits continue releasing and maintaining high-quality, efficient open models for the community.