DeepSeek-R1-Distill-Qwen-7B-Uncensored

This repository hosts uncensored and efficiency-focused builds of DeepSeek-R1-Distill-Qwen-7B, intended for users who require direct model behavior, strong reasoning, and full local control without aggressive automated filtering.

The model is suitable for advanced experimentation, private deployments, and research scenarios where transparency and flexibility are prioritized.


Model Overview

  • Model Name: DeepSeek-R1-Distill-Qwen-7B-Uncensored
  • Base Model: DeepSeek-R1-Distill-Qwen-7B
  • Architecture: Decoder-only Transformer
  • Parameter Count: ~7B
  • Modalities: Text
  • Context Length: Up to 32K tokens (runtime-dependent)
  • Developer (Base): DeepSeek AI
  • Distillation Target: Qwen-based reasoning model
  • License: Apache-2.0 (inherits base model license)
  • Languages: Multilingual (English, Chinese, others)

Project Intent

This release is designed for users who want minimal behavioral constraints while preserving the structured reasoning and instruction-following strengths of the DeepSeek-R1 distillation.

Key objectives include:

  • Predictable, direct responses without heavy content suppression
  • Strong multi-step reasoning and analytical depth
  • Compatibility with local and offline inference setups
  • A solid foundation for further alignment, fine-tuning, or research

This is not a consumer-safety-aligned assistant and is intended for controlled environments.


Quantized Variants (GGUF)

To support a wide range of hardware, multiple GGUF quantization levels are provided.

Q2_K (2-bit)

  • Extremely small memory footprint
  • Intended for experimentation or extreme hardware constraints
  • Severe degradation in reasoning and instruction accuracy

Q3_K_M (3-bit)

  • Slight improvement over 2-bit
  • Lightweight and fast
  • Limited suitability for complex reasoning tasks

Q4_K_M (4-bit)

  • Strong efficiency-to-quality tradeoff
  • Works well on CPUs and low-VRAM GPUs
  • Suitable for general chat and exploratory reasoning

Q5_K_M (5-bit)

  • Recommended default for most users
  • Retains most reasoning and instruction-following ability
  • Balanced memory usage and output quality

Q6_K (6-bit)

  • Higher reasoning fidelity
  • Increased memory requirements
  • Better performance on long or complex prompts

Q8_0 (8-bit)

  • Near full-precision behavior
  • Highest quality quantized variant
  • Best choice when memory is not a limiting factor

Output quality depends heavily on context length, sampling parameters, and inference backend.
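
As a minimal loading sketch using the llama-cpp-python bindings and huggingface_hub (the filename below is illustrative and should be matched to the variant you actually choose; n_ctx is an assumed starting value within the 32K limit):

from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Fetch one quantized variant from this repository; the filename is
# illustrative: pick the quantization level that fits your hardware.
model_path = hf_hub_download(
    repo_id="Andycurrent/DeepSeek-R1-Distill-Qwen-7B-Uncensored_GGUF",
    filename="DeepSeek-R1-Distill-Qwen-7B-Uncensored.Q5_K_M.gguf",
)

# n_ctx sets the usable context window; raise it toward the 32K limit
# only if your memory budget allows.
llm = Llama(model_path=model_path, n_ctx=8192)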


Prompting Format

The model performs best with a structured chat format:


<|system|>
High-level instructions or behavioral guidance
<|user|>
User prompt
<|assistant|>

Clear system messages are recommended to guide tone, verbosity, and task focus.
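
As a small illustration (the build_prompt helper is hypothetical, not part of the model or its tooling), the template above can be assembled from plain strings:

def build_prompt(system: str, user: str) -> str:
    """Assemble the structured chat format shown above."""
    return (
        f"<|system|>\n{system}\n"
        f"<|user|>\n{user}\n"
        f"<|assistant|>\n"
    )

prompt = build_prompt(
    "You are a concise technical assistant.",
    "Summarize the tradeoffs between Q4_K_M and Q8_0.",
)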


Suggested Settings

  • Temperature: 0.6–0.8 for analytical tasks (applied in the sketch below)
  • Use Q5_K_M or higher for reasoning-heavy prompts
  • Avoid ultra-low-bit quantizations for long-context analysis
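
Continuing the sketch above (llm and prompt come from the earlier snippets; the exact values here are assumptions to tune per task), these settings map onto generation parameters as follows:

output = llm(
    prompt,
    max_tokens=512,
    temperature=0.7,   # within the suggested 0.6-0.8 analytical range
    top_p=0.95,        # assumed default; adjust alongside temperature
    stop=["<|user|>", "<|system|>"],  # keep the model from continuing the template
)
print(output["choices"][0]["text"])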

Capabilities

  • Strong logical and mathematical reasoning
  • Effective multi-step analysis and planning
  • Clear instruction-following behavior
  • Suitable for research into reasoning and alignment
  • Performs well in uncensored local deployments
  • Maintains coherence over extended conversations

Recommended Use Cases

  • Local reasoning assistants
  • Research and alignment studies
  • Offline analysis and experimentation
  • Advanced prompt engineering workflows
  • Private deployments requiring full user control

Important Notes

  • This model intentionally avoids strong automated moderation
  • Users are responsible for ensuring lawful and ethical usage
  • Not recommended for unsupervised or public-facing applications
  • Quantized variants may hallucinate more than full-precision models

Always evaluate outputs in the context of your intended application.


Acknowledgements

  • DeepSeek AI for releasing the DeepSeek-R1 model family
  • Qwen team for the underlying architecture contributions
  • The llama.cpp and GGUF ecosystem for enabling efficient local inference
  • Open-source contributors supporting transparent LLM research

Contact

For issues related to quantization files or repository content, please open an issue in this repository.
