DeepSeek-R1-Distill-Qwen-7B-Uncensored

This repository hosts uncensored and efficiency-focused builds of DeepSeek-R1-Distill-Qwen-7B, intended for users who require direct model behavior, strong reasoning, and full local control without aggressive automated filtering.

The model is suitable for advanced experimentation, private deployments, and research scenarios where transparency and flexibility are prioritized.


Model Overview

  • Model Name: DeepSeek-R1-Distill-Qwen-7B-Uncensored
  • Base Model: DeepSeek-R1-Distill-Qwen-7B
  • Architecture: Decoder-only Transformer
  • Parameter Count: ~7B
  • Modalities: Text
  • Context Length: Up to 32K tokens (runtime-dependent)
  • Developer (Base): DeepSeek AI
  • Distillation Target: Qwen-based reasoning model
  • License: Apache-2.0 (inherits base model license)
  • Languages: Multilingual (English, Chinese, others)

Project Intent

This release is designed for users who want minimal behavioral constraints while preserving the structured reasoning and instruction-following strengths of the DeepSeek-R1 distillation.

Key objectives include:

  • Predictable, direct responses without heavy content suppression
  • Strong multi-step reasoning and analytical depth
  • Compatibility with local and offline inference setups
  • A solid foundation for further alignment, fine-tuning, or research

This is not a consumer-safety-aligned assistant and is intended for controlled environments.


Quantized Variants (GGUF)

To support a wide range of hardware, multiple GGUF quantization levels are provided.

Q2_K (2-bit)

  • Extremely small memory footprint
  • Intended for experimentation or extreme hardware constraints
  • Severe degradation in reasoning and instruction accuracy

Q3_K_M (3-bit)

  • Slight improvement over 2-bit
  • Lightweight and fast
  • Limited suitability for complex reasoning tasks

Q4_K_M (4-bit)

  • Strong efficiency-to-quality tradeoff
  • Works well on CPUs and low-VRAM GPUs
  • Suitable for general chat and exploratory reasoning

Q5_K_M (5-bit)

  • Recommended default for most users
  • Retains most reasoning and instruction-following ability
  • Balanced memory usage and output quality

Q6_K (6-bit)

  • Higher reasoning fidelity
  • Increased memory requirements
  • Better performance on long or complex prompts

Q8_0 (8-bit)

  • Near full-precision behavior
  • Highest quality quantized variant
  • Best choice when memory is not a limiting factor

Output quality depends heavily on context length, sampling parameters, and inference backend.
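
As a minimal loading sketch using the llama-cpp-python bindings and huggingface_hub (the filename below is illustrative and should be matched to the variant you actually choose; n_ctx is an assumed starting value within the 32K limit):

from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Fetch one quantized variant from this repository; the filename is
# illustrative: pick the quantization level that fits your hardware.
model_path = hf_hub_download(
    repo_id="Andycurrent/DeepSeek-R1-Distill-Qwen-7B-Uncensored_GGUF",
    filename="DeepSeek-R1-Distill-Qwen-7B-Uncensored.Q5_K_M.gguf",
)

# n_ctx sets the usable context window; raise it toward the 32K limit
# only if your memory budget allows.
llm = Llama(model_path=model_path, n_ctx=8192)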


Prompting Format

The model performs best with a structured chat format:


<|system|>
High-level instructions or behavioral guidance
<|user|>
User prompt
<|assistant|>

Clear system messages are recommended to guide tone, verbosity, and task focus.
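
As a small illustration (the build_prompt helper is hypothetical, not part of the model or its tooling), the template above can be assembled from plain strings:

def build_prompt(system: str, user: str) -> str:
    """Assemble the structured chat format shown above."""
    return (
        f"<|system|>\n{system}\n"
        f"<|user|>\n{user}\n"
        f"<|assistant|>\n"
    )

prompt = build_prompt(
    "You are a concise technical assistant.",
    "Summarize the tradeoffs between Q4_K_M and Q8_0.",
)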


Suggested Settings

  • Temperature: 0.6–0.8 for analytical tasks (applied in the sketch below)
  • Use Q5_K_M or higher for reasoning-heavy prompts
  • Avoid ultra-low-bit quantizations for long-context analysis
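
Continuing the sketch above (llm and prompt come from the earlier snippets; the exact values here are assumptions to tune per task), these settings map onto generation parameters as follows:

output = llm(
    prompt,
    max_tokens=512,
    temperature=0.7,   # within the suggested 0.6-0.8 analytical range
    top_p=0.95,        # assumed default; adjust alongside temperature
    stop=["<|user|>", "<|system|>"],  # keep the model from continuing the template
)
print(output["choices"][0]["text"])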

Capabilities

  • Strong logical and mathematical reasoning
  • Effective multi-step analysis and planning
  • Clear instruction-following behavior
  • Suitable for research into reasoning and alignment
  • Performs well in uncensored local deployments
  • Maintains coherence over extended conversations

Recommended Use Cases

  • Local reasoning assistants
  • Research and alignment studies
  • Offline analysis and experimentation
  • Advanced prompt engineering workflows
  • Private deployments requiring full user control

Important Notes

  • This model intentionally avoids strong automated moderation
  • Users are responsible for ensuring lawful and ethical usage
  • Not recommended for unsupervised or public-facing applications
  • Quantized variants may hallucinate more than full-precision models

Always evaluate outputs in the context of your intended application.


Acknowledgements

  • DeepSeek AI for releasing the DeepSeek-R1 model family
  • Qwen team for the underlying architecture contributions
  • The llama.cpp and GGUF ecosystem for enabling efficient local inference
  • Open-source contributors supporting transparent LLM research

Contact

For issues related to quantization files or repository content, please open an issue in this repository.
