---
license: apache-2.0
tags:
- gguf
- llama.cpp
- qwen
- uncensored
- quantized
- offline
- local-ai
---

# Qwen3 1.7B – Q8 GGUF (Uncensored, 32K Context)

This repository contains a **fully uncensored**, **Q8_0-quantized** GGUF build of **Qwen3 1.7B**, intended for **offline, local inference** with `llama.cpp` and compatible runtimes.

By default, the model responds in **thinking mode**. If you prefer a **non-thinking (direct) response mode**, prepend **`/no_think`** to your prompt.

- ✅ **Uncensored**
- ✅ **32K context length**
- ✅ **Q8_0 quantization**
- ✅ **Offline / local use**
- ✅ **No LoRA required (merged / base inference)**

---

## 🔍 Model Details

- **Base Model**: Qwen3 1.7B
- **Format**: GGUF
- **Quantization**: Q8_0
- **Context Length**: 32,768 tokens
- **Intended Use**:
  - Offline assistants
  - Email writing
  - Small coding tasks
  - Automation
  - General daily usage
- **Not intended for**:
  - Hosted public services
  - Safety-restricted environments

---

## ▶️ Usage (llama.cpp)

```bash
./llama-cli \
  -m gguf/qwen3-1.7b-q8_0.gguf \
  -p "Hello"
```

Recommended sampling flags:

```bash
--temp 0.2 --top-p 0.9
```

For concise outputs, add an instruction such as:

```text
Answer directly. Use yes or no when possible.
```

## ⚠️ Disclaimer

- This model is **fully uncensored** and provided **as-is**.
- You are responsible for how you use it.
- Do not deploy it in public-facing applications without moderation.
- Intended for **personal, research, and offline use**.

## 🧠 Quantization Info

- **Q8_0** provides near-FP16 quality
- Stable outputs
- Recommended for CPU and mobile-class devices

## 👤 Author & Organization

- **Creator**: Thirumalai
- **Company**: ZFusionAI

## 📜 License

- Apache 2.0
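
## 🐍 Python Example (llama-cpp-python)

The same GGUF file can also be loaded from Python with the `llama-cpp-python` bindings. The snippet below is a minimal sketch, not an official example: the model path mirrors the layout shown above, and the `/no_think` prefix and sampling values (`temperature=0.2`, `top_p=0.9`) follow the recommendations in this README.

```python
from llama_cpp import Llama

# Load the Q8_0 GGUF file (path assumed from the repo layout above).
llm = Llama(
    model_path="gguf/qwen3-1.7b-q8_0.gguf",
    n_ctx=32768,  # full 32K context; lower this to reduce memory use
)

# Prepend /no_think to get a direct answer without the thinking trace.
response = llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "/no_think Summarize what a GGUF file is in one sentence."}
    ],
    temperature=0.2,
    top_p=0.9,
    max_tokens=128,
)

print(response["choices"][0]["message"]["content"])
```

Removing the `/no_think` prefix leaves the default thinking mode enabled, which generally produces longer, more deliberative answers.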