edgy-commenter-GGUF

This repository contains GGUF weights for edgy-commenter, a fine-tuned model based on the Qwen 3.5 architecture (Hybrid Transformer-SSM).

Model Description

  • Developed by: [Your Name/Org]
  • Architecture: Qwen 3.5 (Hybrid Transformer-SSM)
  • Primary Task: Persona imitation / edgy commentary
  • Finetuned from: [Link to your original HF model]

Important: This model uses the Qwen 3.5 / Mamba-hybrid architecture. To run these GGUF files, you must use llama.cpp (build b4000 or higher) or an equivalent runner updated after late 2025/early 2026. Older versions of LM Studio or Ollama may not support the SSM (State Space Model) kernels required for this architecture.
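If your runner is too old, building llama.cpp from source is the safest route. A minimal sketch, assuming git and CMake are installed (binary paths may differ depending on your build configuration):

```shell
# Build a recent llama.cpp from source.
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build
cmake --build build --config Release

# The CLI binary typically lands in build/bin/; check its build tag.
./build/bin/llama-cli --version
```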

Files Included

| File Name | Quantization | Size | Description |
|---|---|---|---|
| edgy-commenter-f16.gguf | None (F16) | ~XX GB | Full precision, recommended for further quantization. |
| edgy-commenter-Q8_0.gguf | Q8_0 | ~XX GB | High quality, minimal loss. |
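If you need other quantization levels, the F16 file can be re-quantized with llama.cpp's llama-quantize tool. A hedged sketch (the Q4_K_M target is an example; the binary name and path depend on your build):

```shell
# Hypothetical example: produce a Q4_K_M quant from the F16 file.
./llama-quantize edgy-commenter-f16.gguf edgy-commenter-Q4_K_M.gguf Q4_K_M
```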

Usage with llama.cpp

You can run the GGUF files directly with llama.cpp from the command line. Give the model the user instruction "You are an edgy commenter." For reference, the Python (Unsloth) snippet below shows the prompt format and sampling settings I used during training.
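A minimal sketch of a llama.cpp invocation, assuming the Q8_0 file sits in the current directory and you are using the llama-cli binary (flag names per recent llama.cpp builds; adjust the model path as needed):

```shell
# Hypothetical invocation; sampling settings mirror the training-time values below.
./llama-cli -m edgy-commenter-Q8_0.gguf \
    --temp 1.4 --min-p 0.1 -n 1024 \
    -p "You are an edgy commenter."
```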

from unsloth import FastLanguageModel
from transformers import TextStreamer

# `model` and `tokenizer` are assumed to be loaded already, e.g. via
# FastLanguageModel.from_pretrained(...) on your fine-tuned checkpoint.
FastLanguageModel.for_inference(model) # Enable for inference!

# This should match the 'instruction' used during your training
instruction = "Write an edgy comment."

messages = [
    {"role": "user", "content": instruction},
]

# Apply the chat template to format it for the model (e.g., ChatML)
input_text = tokenizer.apply_chat_template(
    messages,
    tokenize = False,
    add_generation_prompt = True
)

inputs = tokenizer(
    [input_text],
    add_special_tokens = False,
    return_tensors = "pt",
).to("cuda")

text_streamer = TextStreamer(tokenizer, skip_prompt = True)

# Generate the response
_ = model.generate(
    **inputs,
    streamer = text_streamer,
    do_sample = True,
    # repetition_penalty = 1.01,
    max_new_tokens = 1024, # Increased to allow for longer monologues
    use_cache = True,
    temperature = 1.4,    # Higher temperature makes the humor/drama more creative
    min_p = 0.1
)

The snippet above is exactly how I ran inference while training. For best results, use that exact instruction prompt and the same sampling settings.
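For context, here is a minimal sketch of the ChatML-style prompt string that apply_chat_template with add_generation_prompt=True typically produces for a single user turn. This is an assumption about the template; inspect your tokenizer's chat template to confirm the exact format:

```python
# Hedged sketch: hand-building the ChatML prompt that
# tokenizer.apply_chat_template(messages, tokenize=False,
# add_generation_prompt=True) typically emits for one user message.
instruction = "Write an edgy comment."
prompt = (
    "<|im_start|>user\n"
    f"{instruction}<|im_end|>\n"
    "<|im_start|>assistant\n"
)
print(prompt)
```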

Model Details

  • Format: GGUF
  • Model size: 0.8B params
  • Architecture: qwen35
  • Downloads last month: 127
