Hugging Face
Ruslan (uzvisa)
8 followers · 85 following
AI & ML interests
None yet
Recent Activity
New activity 9 days ago in Qwen/Qwen3.6-35B-A3B: "how to enable non-thinking mode of this model in llama.cpp?"
Reacted with 👍 to eaddario's post, 11 days ago:

Experimental global target bits-per-weight quantization of Qwen/Qwen3.6-27B and Qwen/Qwen3.6-35B-A3B. Unlike standard llama.cpp quantizations that rely on fixed type heuristics (e.g., Q4_K_M), the Target BPW approach optimizes per-tensor precision where it matters most, producing high-quality models that meet a precise global file-size target. Key advantages:
- VRAM maximization: can generate high-quality models sized exactly to fit hardware constraints (e.g., fitting the model into exactly 24 GB of VRAM).
- Data-driven precision: the quantization mix is determined by actual weight-error sensitivity rather than hardcoded rules, often yielding better PPL/KLD size trade-offs.

Full benchmarks (PPL, KLD, ARC, GPQA, MMLU, etc.) and methodology in the model cards:
https://huggingface.co/eaddario/Qwen3.6-27B-GGUF
https://huggingface.co/eaddario/Qwen3.6-35B-A3B-GGUF
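The general idea behind a target-BPW allocation like the one the post describes can be sketched as a greedy budget problem: start every tensor at the lowest precision, then repeatedly upgrade whichever tensor buys the most error reduction per extra bit, until the global bits-per-weight budget is spent. This is a minimal illustrative sketch only, not eaddario's actual method: `allocate_bits`, the candidate bit widths, and the exponential error model are all assumptions made up for this example.

```python
# Hypothetical sketch of global target-BPW allocation. The function name,
# the candidate bit widths, and the toy error model (error ~ sens * 2**-bits)
# are illustrative assumptions, not the method used in the linked models.

def allocate_bits(tensors, target_bpw, candidates=(2, 3, 4, 5, 6, 8)):
    """Greedily raise per-tensor precision under a global bits-per-weight budget.

    tensors: list of (name, n_params, sensitivity) tuples, where a higher
    sensitivity means quantization error in that tensor hurts quality more
    (a stand-in for measured weight-error sensitivity).
    Returns (per-tensor bit widths, achieved global BPW).
    """
    total_params = sum(n for _, n, _ in tensors)
    budget = target_bpw * total_params            # total bit budget for the file
    bits = {name: candidates[0] for name, _, _ in tensors}
    used = candidates[0] * total_params

    while True:
        best = None
        for name, n, sens in tensors:
            b = bits[name]
            if b == candidates[-1]:
                continue                          # already at max precision
            nxt = candidates[candidates.index(b) + 1]
            cost = (nxt - b) * n                  # extra bits this upgrade costs
            if used + cost > budget:
                continue                          # upgrade would bust the budget
            # toy error model: gain in error reduction per extra bit spent
            gain = sens * (2.0 ** -b - 2.0 ** -nxt) / cost
            if best is None or gain > best[0]:
                best = (gain, name, nxt, cost)
        if best is None:
            break                                 # no affordable upgrade left
        _, name, nxt, cost = best
        bits[name] = nxt
        used += cost
    return bits, used / total_params
```

With two hypothetical tensors, `allocate_bits([("attn", 100, 5.0), ("ffn", 300, 1.0)], target_bpw=4.0)` keeps the achieved BPW under 4.0 while giving the more sensitive "attn" tensor the higher bit width, which is the data-driven behavior the post contrasts with fixed type heuristics.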
Reacted with 🔥 to the same post by eaddario, 11 days ago.
Organizations
None yet
uzvisa's activity
Liked a model about 2 months ago:
Tesslate/OmniCoder-9B · Text Generation · Updated Mar 13 · 6.76k · 623
Liked 19 models 3 months ago:
steampunque/Qwen3-VL-8B-Instruct-MP-GGUF · 8B · Updated Feb 18 · 76 · 2
steampunque/gemma-3-12b-it-MP-GGUF · 12B · Updated Feb 18 · 32 · 1
steampunque/Ministral-3-8B-Instruct-2512-MP-GGUF · 8B · Updated Feb 18 · 16 · 1
steampunque/Qwen2.5-Coder-14B-Instruct-MP-GGUF · 15B · Updated Feb 18 · 16 · 1
tencent/HY-MT1.5-7B-GGUF · Translation · 8B · Updated Jan 7 · 7.51k · 52
allura-forge/Llama-3.3-8B-Instruct · Updated Dec 31, 2025 · 498 · 204
mradermacher/Nanbeige-4.1-Python-DeepThink-3B-GGUF · 4B · Updated Feb 18 · 36 · 1
deltakitsune/Nanbeige-4.1-Python-DeepThink-3B · Text Generation · 4B · Updated Feb 16 · 667 · 7
TheDrummer/Tiger-Gemma-12B-v3-GGUF · 13B · Updated Jul 9, 2025 · 895 · 14
MuXodious/Nanbeige4.1-3B-PaperWitch-heresy · Text Generation · 4B · Updated Feb 19 · 25 · 4
gabriellarson/WEBGEN-4B-Preview-GGUF · Text Generation · 4B · Updated Sep 2, 2025 · 282 · 20
TheDrummer/Rocinante-X-12B-v1 · 12B · Updated Jan 25 · 782 · 78
TheDrummer/Rivermind-Lux-12B-v1 · 12B · Updated May 6, 2025 · 7 · 21
t-tech/T-lite-it-2.1 · Text Generation · 8B · Updated Dec 23, 2025 · 5.88k · 19
Tesslate/UIGEN-X-8B · Text Generation · 8B · Updated Jul 18, 2025 · 40 · 63
Tesslate/WEBGEN-4B-Preview · Text Generation · Updated Sep 2, 2025 · 77 · 86
Nanbeige/Nanbeige4.1-3B · Text Generation · 4B · Updated Mar 25 · 234k · 1.11k
TeichAI/Qwen3-8B-DeepSeek-v3.2-Speciale-Distill-GGUF · 8B · Updated Dec 10, 2025 · 7.55k · 23
TeichAI/Qwen3-8B-Claude-Sonnet-4.5-Reasoning-Distill-GGUF · 8B · Updated Nov 16, 2025 · 1.52k · 18