How to use from
llama.cpp
Install from brew
brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf c4tdr0ut/gpt-oss-v2:
# Run inference directly in the terminal:
llama-cli -hf c4tdr0ut/gpt-oss-v2:
Install from WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf c4tdr0ut/gpt-oss-v2:
# Run inference directly in the terminal:
llama-cli -hf c4tdr0ut/gpt-oss-v2:
Use pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf c4tdr0ut/gpt-oss-v2:
# Run inference directly in the terminal:
./llama-cli -hf c4tdr0ut/gpt-oss-v2:
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf c4tdr0ut/gpt-oss-v2:
# Run inference directly in the terminal:
./build/bin/llama-cli -hf c4tdr0ut/gpt-oss-v2:
Use Docker
docker model run hf.co/c4tdr0ut/gpt-oss-v2:
Quick Links

Grok Logo

Grok-OSS-V2: Unleashed & Open-Sourced

Grok-OSS-V2 is a massive open-weight model distilled from xAI's unhinged mode, built on top of Mistral Small 24B Instruct. It runs locally without rate limits or sneaky data collection – perfect for consumer hardware.

Why Grok-OSS-V2?

  • Larger & More Capable than V1. Based on a refined dataset with diverse topics and fixed processing mistakes from the first version.
  • Less Restricted: Mistral Small didn't do reinforcement learning on the base, so it's more honest (and sometimes chaotic 😈) out of the box.
  • Trained on NVIDIA B200 for 3 epochs. Heavy hardware for serious performance.

We made this to bring xAI's wild side into open source, fully runnable on your own machine.

Features

  • Runs locally – no API calls, no telemetry.
  • Built for unfiltered generation (xAI "unhinged" style).
  • Diverse knowledge cutoff and strong instruction following from the Mistral line.

Usage

This is a model, not a full app. You need to run it with a compatible frontend or script.

Recommended Setup

For best results use with:

  • text-generation-webui for a local chat interface.
  • Or integrate via transformers in Python.

Demo

Chat Screenshot See the model in action – raw, honest, and hilarious.

Note: This model is designed to be unfiltered. Expect wild, creative, and sometimes NSFW outputs. Use responsibly.


Made with gunpowder by Catdrout AI lab in Sweden – Inspired by the crazy ones who built xAI.

Star this repo if you like chaotic open LLMs! 🔥

License

The license is for construction. Start an issue if you are doing something that requires legal approval. Enjoy

Downloads last month
1,703
Safetensors
Model size
24B params
Tensor type
BF16
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for c4tdr0ut/gpt-oss-v2