Q3.5-9B-GLM-5.1-DA

Q3.5-9B-GLM-5.1-DA (Qwen3.5 GLM Distilled-Abliterated) is a reasoning-focused model built on Qwen/Qwen3.5-9B by way of the prithivMLmods/Qwen3.5-9B-Unredacted-MAX base. It is optimized for long-context mathematical reasoning, structured problem solving, and context-aware generation. Training combines distilled reasoning traces from GLM-5.1 reasoning datasets with refusal direction analysis and ablation-based strategies, with the aim of reducing internal refusal behaviors while preserving reasoning and instruction-following performance.

This model is intended strictly for research and learning purposes. Due to reduced internal refusal mechanisms, it may generate sensitive or unrestricted content. Users assume full responsibility for how the model is used. The authors and hosting platform disclaim any liability for generated outputs.

Note: This model is experimental and may generate artifacts.

Key Highlights

  • GLM-5.1 Reasoning Distillation: Fine-tuned using high-quality reasoning traces derived from GLM-5.1 datasets with a strong focus on mathematical and long-context reasoning.
  • Distilled-Abliterated (DA): Applies refusal direction analysis and ablation-based strategies to reduce internal refusal behaviors while maintaining reasoning quality.
  • Qwen3.5 Backbone: Built on top of Qwen/Qwen3.5-9B via prithivMLmods/Qwen3.5-9B-Unredacted-MAX for strong instruction-following and reasoning performance.
  • Long-Context Mathematical Reasoning: Optimized for multi-step mathematical problem solving, logical decomposition, and extended reasoning chains.
  • Instruction + Reasoning Fusion: Handles prompts that combine instruction following with multi-step reasoning in a single response.
  • Efficient 9B Deployment: Suitable for local inference and quantized deployment setups with lower hardware requirements compared to larger-scale models.
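
The card does not describe the exact procedure behind "refusal direction analysis and ablation". One formulation that appears in published abliteration work is to take the difference between mean activations on two prompt sets, normalize it, and project that direction out of a vector. The sketch below illustrates only that linear algebra on made-up 3-dimensional numbers; the function names and data are hypothetical and nothing here touches a real model.

```python
import math

def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def refusal_direction(refused_acts, complied_acts):
    """Unit vector along the difference of the two activation means."""
    diff = [a - b for a, b in zip(mean_vector(refused_acts),
                                  mean_vector(complied_acts))]
    norm = math.sqrt(sum(x * x for x in diff))
    return [x / norm for x in diff]

def ablate(vec, direction):
    """Remove the component of vec along a unit direction: v - (v . d) d."""
    dot = sum(v * d for v, d in zip(vec, direction))
    return [v - dot * d for v, d in zip(vec, direction)]

# Toy "activations" (made-up numbers, not model data).
refused = [[2.0, 0.0, 1.0], [4.0, 0.0, 1.0]]
complied = [[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]]

direction = refusal_direction(refused, complied)  # [1.0, 0.0, 0.0]
cleaned = ablate([5.0, 2.0, 1.0], direction)      # [0.0, 2.0, 1.0]
```

After ablation the vector has no component along the estimated direction, while the remaining coordinates are unchanged.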

Datasets Used and Training Details

Category | Details
Base Model | Qwen/Qwen3.5-9B
Intermediate Base | prithivMLmods/Qwen3.5-9B-Unredacted-MAX
Final Model Size | 9B Parameters
Training Type | Distillation + abliteration
Objective | Preserve long-context reasoning quality while reducing refusal behaviors and improving mathematical reasoning reliability
Reasoning Dataset | Jackrong/GLM-5.1-Reasoning-1M-Cleaned (Subset-Math, 5000 random samples used)
Alignment / Evaluation Dataset | prithivMLmods/harm_bench
Training Pipeline | TRL (Transformer Reinforcement Learning)
Training Focus | Long-context reasoning, mathematical problem solving, logical decomposition, structured chain-of-thought generation
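
The table states that 5000 random samples were drawn from the math subset, but not how. A minimal sketch of a reproducible draw without replacement is shown below; the `sample_subset` helper, the seed value, and the placeholder pool are assumptions for illustration, not the author's pipeline.

```python
import random

def sample_subset(records, k=5000, seed=0):
    """Return k records drawn uniformly without replacement, reproducibly."""
    if k > len(records):
        raise ValueError("k exceeds the number of available records")
    rng = random.Random(seed)  # private RNG so the global state is untouched
    return rng.sample(records, k)

# Stand-in for the math subset: 20,000 placeholder record IDs.
pool = list(range(20000))
subset = sample_subset(pool, k=5000, seed=0)
```

Fixing the seed means the same 5000 records are selected on every run, which makes a training subset easier to audit or rebuild.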

Quick Start with Transformers

pip install transformers==5.8.0
# or install the latest development version
pip install git+https://github.com/huggingface/transformers.git

from transformers import Qwen3_5ForConditionalGeneration, AutoProcessor
import torch

model = Qwen3_5ForConditionalGeneration.from_pretrained(
    "prithivMLmods/Q3.5-9B-GLM-5.1-DA",
    torch_dtype="auto",
    device_map="auto"
)

processor = AutoProcessor.from_pretrained(
    "prithivMLmods/Q3.5-9B-GLM-5.1-DA"
)

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "Solve this step-by-step: If a train travels 240 km in 3 hours, what is its average speed?"
            }
        ],
    }
]

text = processor.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

inputs = processor(
    text=[text],
    padding=True,
    return_tensors="pt"
).to(model.device)  # follow the device chosen by device_map="auto" instead of hard-coding "cuda"

generated_ids = model.generate(
    **inputs,
    max_new_tokens=512
)

generated_ids_trimmed = [
    out_ids[len(in_ids):]
    for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
]

output_text = processor.batch_decode(
    generated_ids_trimmed,
    skip_special_tokens=True,
    clean_up_tokenization_spaces=False
)

print(output_text[0])  # batch_decode returns a list; print the single completion
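
The card does not document the output format. Assuming this model follows the Qwen-family convention of wrapping its reasoning in `<think>` ... `</think>` before the final answer, a small helper such as the hypothetical `split_reasoning` below could separate the two parts of a decoded completion. If the tags are absent, the whole text is treated as the answer.

```python
def split_reasoning(text, close_tag="</think>"):
    """Split a decoded completion into (reasoning, answer).

    Assumes the reasoning ends at close_tag; returns ("", text) if absent.
    """
    head, sep, tail = text.partition(close_tag)
    if not sep:
        return "", text.strip()
    # Drop the opening tag if it is present at the start of the reasoning.
    reasoning = head.replace("<think>", "", 1).strip()
    return reasoning, tail.strip()

# Made-up completion for the train example in the prompt above.
sample = "<think>240 / 3 = 80</think>\nThe average speed is 80 km/h."
reasoning, answer = split_reasoning(sample)
```

With the Quick Start variables, this would be called on the first decoded string from `batch_decode`.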

Intended Use

  • Mathematical Reasoning: Multi-step arithmetic, algebraic reasoning, and logical problem solving
  • Long-Context Tasks: Extended reasoning chains and context-heavy instruction following
  • Instruction Following: Hybrid prompts combining reasoning and structured responses
  • Research on Abliteration: Studying the impact of refusal reduction techniques on reasoning preservation
  • Alignment & Red-Teaming Research: Evaluating reduced-refusal systems under complex reasoning scenarios

Limitations & Risks

Important Note: This model intentionally minimizes built-in safety refusals.

  • Sensitive Content Risk: May produce unrestricted or controversial outputs
  • User Responsibility: Requires careful and ethical usage
  • Mathematical Hallucinations: Complex reasoning tasks may still contain logical or numerical inconsistencies
  • Abliteration Trade-offs: Reduced refusal behaviors may impact safety alignment and output filtering
  • High Compute Demand: Optimized inference or quantization may still be required for efficient deployment
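
To make the compute and quantization trade-off concrete, the sketch below estimates weight storage alone at different precisions. It assumes a round 9e9 parameters (the card says only "9B") and ignores activations, the KV cache, and framework overhead, so real memory use will be higher.

```python
def weight_memory_gb(n_params, bits_per_param):
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9

N_PARAMS = 9e9  # "9B" from the model card; the exact count is not stated

bf16_gb = weight_memory_gb(N_PARAMS, 16)  # 18.0
int8_gb = weight_memory_gb(N_PARAMS, 8)   # 9.0
int4_gb = weight_memory_gb(N_PARAMS, 4)   # 4.5
```

Under these assumptions, 4-bit weights need roughly a quarter of the storage of BF16 weights, which is why quantized builds are the usual route for local inference on smaller GPUs.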

Model size: 9B params · Tensor type: BF16 (Safetensors)