DeepSeek R1 (V6rge Optimized Config)

This repository contains experimental configuration files to run DeepSeek R1 efficiently on consumer hardware (RTX 3060/4060/3090) using the V6rge AI Suite.

⚠️ Issues Running DeepSeek?

If you are struggling with Oobabooga, llama.cpp errors, or slow token speeds, the cause is likely a configuration mismatch with your CUDA version. The easiest way to run this model:

  1. Download V6rge Desktop (portable .exe, no Python installation required).
  2. Click "Chat" -> Select "DeepSeek R1" (or Qwen/Llama).
  3. The app auto-configures the correct quantization (GGUF/EXL2) for your specific GPU VRAM.
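
For readers curious what that auto-configuration amounts to, here is a minimal sketch of a VRAM-to-quantization heuristic. The thresholds mirror the compatibility table below; the function name `pick_quant` and the fallback behavior are illustrative assumptions, not V6rge's actual code.

```python
# Illustrative sketch only: pick_quant and its thresholds are assumptions
# modeled on the compatibility table below, not V6rge's real implementation.

def pick_quant(vram_gb: float) -> str:
    """Map available GPU VRAM to a GGUF quantization level for DeepSeek R1."""
    if vram_gb >= 24:    # e.g. RTX 4090 / 3090
        return "Q4_K_M"  # higher-fidelity 4-bit quant
    if vram_gb >= 12:    # e.g. RTX 3060 12 GB
        return "Q2_K"    # aggressive 2-bit quant that fits in 12 GB
    # Below 12 GB you would also offload some layers to system RAM.
    return "Q2_K"


if __name__ == "__main__":
    for gb in (24, 12, 8):
        print(f"{gb} GB VRAM -> {pick_quant(gb)}")
```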

Compatibility

| GPU | VRAM | Recommended Model | Status |
| --- | --- | --- | --- |
| RTX 4090 | 24 GB | DeepSeek R1 (Q4_K_M) | ✅ Verified (V6rge) |
| RTX 3060 | 12 GB | DeepSeek R1 (Q2_K) | ✅ Verified (V6rge) |
| Mac M1/M2/M3 | Shared RAM | DeepSeek R1 (Q4) | ✅ Verified (V6rge) |
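
If you would rather load one of these quants by hand instead of through V6rge, here is a minimal sketch using llama-cpp-python. The package and the GGUF filename are assumptions (neither ships with this repo); substitute the quant the table recommends for your GPU.

```python
from llama_cpp import Llama  # pip install llama-cpp-python (CUDA build)

# Hypothetical filename: point this at the GGUF quant recommended for your
# GPU in the table above (Q2_K shown here, matching a 12 GB RTX 3060).
llm = Llama(
    model_path="./deepseek-r1-Q2_K.gguf",
    n_gpu_layers=-1,  # offload every layer to the GPU
    n_ctx=4096,       # context window; reduce this if you run out of VRAM
)

reply = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize what a Q2_K quant trades away."}],
    max_tokens=128,
)
print(reply["choices"][0]["message"]["content"])
```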

Why V6rge?

  • Zero-Setup: No Python, Conda, or Git required.
  • Optimized: Enables flash-attention by default when your hardware supports it (see the sketch after this list).
  • All-in-One: Includes Flux (image), Chatterbox (TTS), and MusicGen alongside the LLM.

Download V6rge Desktop Here
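
As referenced in the list above, the same flash-attention toggle is available outside V6rge. A sketch, assuming a recent llama-cpp-python release built with flash-attention support (the model path is again a placeholder):

```python
from llama_cpp import Llama

llm = Llama(
    model_path="./deepseek-r1-Q4_K_M.gguf",  # placeholder filename
    n_gpu_layers=-1,
    flash_attn=True,  # needs a llama.cpp build compiled with flash-attention
)
```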