# DeepSeek-3B-MoE-Decoder
This is the decoder component of DeepSeek-OCR: a 3B-parameter Mixture-of-Experts (MoE) language model that generates text from the vision encoder's embeddings.
## Architecture
- **Model**: DeepSeek 3B MoE
- **Active Parameters**: ~570M per token
- **Total Parameters**: ~3B
- **Architecture**: Mixture-of-Experts with token-level expert routing (see the sketch below)
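The routing configuration of this checkpoint isn't documented here, but the gap between active (~570M) and total (~3B) parameters is the defining MoE property: a router scores each token and only the top-k experts run for it. Below is a minimal, generic top-k gating sketch to illustrate the idea; all class names, sizes, and the expert count are hypothetical and not taken from this model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Illustrative top-k MoE layer: each token activates only k of
    n_experts feed-forward blocks, so active parameters per token are
    a small fraction of total parameters."""

    def __init__(self, d_model=512, n_experts=8, k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)  # per-token expert scores
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(d_model, 4 * d_model),
                nn.GELU(),
                nn.Linear(4 * d_model, d_model),
            )
            for _ in range(n_experts)
        )
        self.k = k

    def forward(self, x):                          # x: (tokens, d_model)
        scores = self.router(x)                    # (tokens, n_experts)
        weights, idx = scores.topk(self.k, dim=-1) # keep k best experts per token
        weights = F.softmax(weights, dim=-1)       # normalize over chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e           # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

# Only k/n_experts of the expert parameters run per token.
layer = TopKMoE()
y = layer(torch.randn(10, 512))  # (10, 512)
```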
## Usage
This decoder is meant to be used with vision embeddings produced by the encoder component (DeepEncoder).
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the decoder and its tokenizer
model = AutoModelForCausalLM.from_pretrained("junkim100/DeepSeek-3B-MoE-decoder")
tokenizer = AutoTokenizer.from_pretrained("junkim100/DeepSeek-3B-MoE-decoder")

# Feed vision embeddings from the encoder into the decoder
# vision_embeddings = ... (from DeepEncoder)
# outputs = model(inputs_embeds=vision_embeddings, ...)
```
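As a rough end-to-end sketch of the pattern above: a random tensor stands in for real DeepEncoder output, the vision-token count (256) and the prompt are placeholders, and this assumes the checkpoint loads with stock `AutoModelForCausalLM` and supports `generate(inputs_embeds=...)` as decoder-only models in recent `transformers` versions do.

```python
import torch

# Hypothetical stand-in for encoder output; the real embeddings must
# match the decoder's hidden size: (batch, n_vision_tokens, hidden_size)
hidden_size = model.config.hidden_size
vision_embeddings = torch.randn(1, 256, hidden_size)

# Embed a text prompt and append it after the vision tokens
prompt_ids = tokenizer("Transcribe the document:", return_tensors="pt").input_ids
prompt_embeds = model.get_input_embeddings()(prompt_ids)
inputs_embeds = torch.cat([vision_embeddings, prompt_embeds], dim=1)

# Decode text conditioned on the combined embedding sequence
output_ids = model.generate(inputs_embeds=inputs_embeds, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```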
## Source
Extracted from [deepseek-ai/DeepSeek-OCR](https://huggingface.co/deepseek-ai/DeepSeek-OCR)