--- license: apache-2.0 base_model: openai/gpt-oss-120b tags: - uncensored - abliterated - gguf - mxfp4 - moe - gpt-oss language: - en --- # GPTOSS-120B-Uncensored-HauhauCS-Aggressive > **[Join the Discord](https://discord.gg/SZ5vacTXYf)** for updates, roadmaps, projects, or just to chat. Uncensored version of [GPT-OSS 120B](https://huggingface.co/openai/gpt-oss-120b) by OpenAI. This is the aggressive variant - tuned harder for fewer refusals. No changes to datasets or capabilities. Fully functional, 100% of what the original authors intended - just without the refusals. ## Format MXFP4 GGUF. This is the model's **native precision** - GPT-OSS was trained in MXFP4, so no further quantization is needed or recommended. Re-quantizing would only lose quality. Works with llama.cpp, LM Studio, Ollama, and anything else that loads GGUFs. ## Downloads | File | Size | |------|------| | GPTOSS-120B-Uncensored-HauhauCS-Aggressive-MXFP4.gguf | 61 GB | ## Specs - 117B total parameters, ~5.1B active per forward pass (MoE: 128 experts, top-4 routing) - 128K context - Based on [openai/gpt-oss-120b](https://huggingface.co/openai/gpt-oss-120b) ## Recommended Settings - `temperature: 1.0` - `top_k: 40` - Everything else (top_p, min_p, repeat penalty, etc.) should be **disabled** - some clients enable these by default, turn them off **Required flag:** `--jinja` to enable the Harmony response format (the model won't work correctly without it). For llama.cpp: ``` llama-server -m model.gguf --jinja -fa -b 2048 -ub 2048 ``` ## LM Studio Compatible with Reasoning Effort custom buttons. To use them, put the model in: ``` LM Models\lmstudio-community\gpt-oss-120b-GGUF\ ``` ## Hardware Fits in ~61GB VRAM. Single H100 or equivalent. For lower VRAM, use `--n-cpu-moe N` in llama.cpp to offload MoE layers to CPU.