# Q3.5-9B-GLM-5.1-DA
Q3.5-9B-GLM-5.1-DA (Qwen3.5 GLM Distilled-Abliterated) is a reasoning-focused model built on top of Qwen/Qwen3.5-9B through the prithivMLmods/Qwen3.5-9B-Unredacted-MAX intermediate base. It is optimized for long-context mathematical reasoning, structured problem solving, and context-aware generation, and is fine-tuned on distilled reasoning traces derived from GLM-5.1 reasoning datasets. Refusal direction analysis and ablation-based training strategies are applied to reduce internal refusal behaviors while preserving strong reasoning and instruction-following performance.
This model is intended strictly for research and learning purposes. Due to reduced internal refusal mechanisms, it may generate sensitive or unrestricted content. Users assume full responsibility for how the model is used. The authors and hosting platform disclaim any liability for generated outputs.
Note: This model is experimental and may generate artifacts.
## Key Highlights
- GLM-5.1 Reasoning Distillation: Fine-tuned using high-quality reasoning traces derived from GLM-5.1 datasets with a strong focus on mathematical and long-context reasoning.
- Distilled-Abliterated (DA): Applies refusal direction analysis and ablation-based strategies to reduce internal refusal behaviors while maintaining reasoning quality (a sketch of the technique follows this list).
- Qwen3.5 Backbone: Built on top of Qwen/Qwen3.5-9B via prithivMLmods/Qwen3.5-9B-Unredacted-MAX for strong instruction-following and reasoning performance.
- Long-Context Mathematical Reasoning: Optimized for multi-step mathematical problem solving, logical decomposition, and extended reasoning chains.
- Instruction + Reasoning Fusion: Handles instruction-following and complex reasoning tasks seamlessly.
- Efficient 9B Deployment: Suitable for local inference and quantized deployment setups with lower hardware requirements compared to larger-scale models.
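This card does not document the exact abliteration procedure. A minimal sketch of the commonly used difference-of-means variant, assuming per-layer activations for refused and answered prompts have already been collected (the file paths, tensor names, and layer choice below are illustrative, not taken from this card):

```python
import torch

# Hypothetical inputs: hidden states captured at one transformer layer for
# prompts the base model refuses vs. prompts it answers normally.
# Shapes: (num_prompts, hidden_dim)
harmful_acts = torch.load("harmful_acts.pt")    # placeholder path
harmless_acts = torch.load("harmless_acts.pt")  # placeholder path

# Estimate the "refusal direction" as the difference of mean activations,
# normalized to unit length.
refusal_dir = harmful_acts.mean(dim=0) - harmless_acts.mean(dim=0)
refusal_dir = refusal_dir / refusal_dir.norm()

def ablate(weight: torch.Tensor, direction: torch.Tensor) -> torch.Tensor:
    """Return (I - d d^T) W: remove the component of each output that lies
    along the refusal direction, so the layer can no longer write into
    that subspace."""
    # weight: (out_features, in_features); direction: (out_features,)
    return weight - torch.outer(direction, direction) @ weight

# Illustrative application to every attention output projection:
# for layer in model.model.layers:
#     layer.self_attn.o_proj.weight.data = ablate(
#         layer.self_attn.o_proj.weight.data, refusal_dir
#     )
```

Whether the ablation is applied as a one-shot weight edit, as sketched above, or folded into training is not specified in this card.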
## Datasets Used and Training Details
| Category | Details |
|---|---|
| Base Model | Qwen/Qwen3.5-9B |
| Intermediate Base | prithivMLmods/Qwen3.5-9B-Unredacted-MAX |
| Final Model Size | 9B Parameters |
| Training Type | Distillation + abliteration |
| Objective | Preserve long-context reasoning quality while reducing refusal behaviors and improving mathematical reasoning reliability |
| Reasoning Dataset | Jackrong/GLM-5.1-Reasoning-1M-Cleaned (Subset-Math, 5000 random samples used) |
| Alignment / Evaluation Dataset | prithivMLmods/harm_bench |
| Training Pipeline | TRL (Transformer Reinforcement Learning) |
| Training Focus | Long-context reasoning, mathematical problem solving, logical decomposition, structured chain-of-thought generation |
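The table describes the pipeline only at a high level. A minimal sketch of drawing the 5,000-sample math subset and running a TRL SFT pass over it (the dataset config name and all hyperparameters are assumptions, not taken from this card):

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Draw 5,000 random samples, matching the subset size stated above.
# The "Subset-Math" config name assumes the dataset exposes its math
# split as a named configuration.
dataset = load_dataset(
    "Jackrong/GLM-5.1-Reasoning-1M-Cleaned", "Subset-Math", split="train"
)
dataset = dataset.shuffle(seed=42).select(range(5000))

# Minimal supervised fine-tuning run on the distilled reasoning traces.
trainer = SFTTrainer(
    model="prithivMLmods/Qwen3.5-9B-Unredacted-MAX",  # intermediate base
    train_dataset=dataset,
    args=SFTConfig(
        output_dir="q3.5-9b-glm-5.1-da",
        max_length=8192,  # keep long traces; named max_seq_length in older TRL
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
    ),
)
trainer.train()
```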
## Quick Start with Transformers

```bash
pip install transformers==5.8.0
# or latest
pip install git+https://github.com/huggingface/transformers.git
```
```python
from transformers import Qwen3_5ForConditionalGeneration, AutoProcessor
import torch

# Load the model with automatic dtype selection and device placement
model = Qwen3_5ForConditionalGeneration.from_pretrained(
    "prithivMLmods/Q3.5-9B-GLM-5.1-DA",
    torch_dtype="auto",
    device_map="auto"
)

processor = AutoProcessor.from_pretrained(
    "prithivMLmods/Q3.5-9B-GLM-5.1-DA"
)

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "Solve this step-by-step: If a train travels 240 km in 3 hours, what is its average speed?"
            }
        ],
    }
]

# Render the chat template as text, leaving the generation prompt open
text = processor.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

inputs = processor(
    text=[text],
    padding=True,
    return_tensors="pt"
).to("cuda")

generated_ids = model.generate(
    **inputs,
    max_new_tokens=512
)

# Strip the prompt tokens so only newly generated text is decoded
generated_ids_trimmed = [
    out_ids[len(in_ids):]
    for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
]

output_text = processor.batch_decode(
    generated_ids_trimmed,
    skip_special_tokens=True,
    clean_up_tokenization_spaces=False
)
print(output_text)
```
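For this prompt, a correct response should reason to 240 km ÷ 3 h = 80 km/h. Note that `batch_decode` returns a list with one decoded string per input prompt.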
## Intended Use
- Mathematical Reasoning: Multi-step arithmetic, algebraic reasoning, and logical problem solving
- Long-Context Tasks: Extended reasoning chains and context-heavy instruction following
- Instruction Following: Hybrid prompts combining reasoning and structured responses
- Research on Abliteration: Studying the impact of refusal reduction techniques on reasoning preservation
- Alignment & Red-Teaming Research: Evaluating reduced-refusal systems under complex reasoning scenarios
## Limitations & Risks
Important Note: This model intentionally minimizes built-in safety refusals.
- Sensitive Content Risk: May produce unrestricted or controversial outputs
- User Responsibility: Requires careful and ethical usage
- Mathematical Hallucinations: Complex reasoning tasks may still contain logical or numerical inconsistencies
- Abliteration Trade-offs: Reduced refusal behaviors may impact safety alignment and output filtering
- High Compute Demand: Optimized inference or quantization may still be required for efficient deployment (see the loading sketch after this list)
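For constrained hardware, a sketch of 4-bit loading with bitsandbytes, reusing the model class from the quick start above (the quantization settings are illustrative and have not been validated for this model):

```python
import torch
from transformers import BitsAndBytesConfig, Qwen3_5ForConditionalGeneration

# 4-bit NF4 quantization roughly quarters memory use versus bf16,
# at some cost in numerical fidelity.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = Qwen3_5ForConditionalGeneration.from_pretrained(
    "prithivMLmods/Q3.5-9B-GLM-5.1-DA",
    quantization_config=bnb_config,
    device_map="auto",
)
```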