Aesop v1

A reasoning-focused, safety-aligned adaptation of GLM-5.2.

Aesop v1 is a two-stage (SFT → DPO) adaptation of the flagship zai-org/GLM-5.2 Mixture-of-Experts model. It preserves GLM-5.2's full reasoning and coding capability while systematically shaping safety-relevant behavior through targeted, low-rank parameter updates on the model's upper attention layers.

AESOP (Ablation-engineered Safety Enhancement via Systematic Operation Pruning) refers to the method: safety-relevant behavior is identified and adjusted by touching only a small, precisely scoped set of expert/attention circuits, leaving the base model's knowledge and reasoning intact.


Model Details

  • Base model: zai-org/GLM-5.2 (FP8)
  • Architecture: Mixture-of-Experts (glm_moe_dsa), 78 transformer blocks
  • Parameters: ~671B total / ~37B active per token
  • Method: Two-stage LoRA (SFT then DPO) merged non-destructively onto the FP8 base
  • Precision: BF16 (attention deltas) + FP8 (MoE experts); GPTQ 4-bit variant available
  • Format: Standard transformers checkpoint (151 safetensor shards)

Quantized Variants

Variant             Repo                         Approx. size   Notes
BF16 / FP8 (full)   cfontes/Aesop-v1             ~680 GB        Full precision, this repo
GPTQ 4-bit          cfontes/Aesop-v1-GPTQ-4bit   ~383 GB        Calibrated GPTQ, expert-focused quantization
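
As a rough sanity check on the sizes above, dividing each checkpoint size by the ~671B total parameter count from Model Details gives the average number of bits stored per parameter. This is only a sketch: it assumes decimal gigabytes (1 GB = 10^9 bytes) and ignores file metadata, and the function name is illustrative rather than part of any tooling shipped with the model.

```python
def bits_per_param(size_gb: float, params_billion: float) -> float:
    """Average bits stored per parameter, assuming 1 GB = 1e9 bytes."""
    total_bits = size_gb * 1e9 * 8
    total_params = params_billion * 1e9
    return total_bits / total_params


full = bits_per_param(680, 671)  # BF16 / FP8 checkpoint
gptq = bits_per_param(383, 671)  # GPTQ 4-bit checkpoint
print(f"full: {full:.2f} bits/param, GPTQ: {gptq:.2f} bits/param")
# prints: full: 8.11 bits/param, GPTQ: 4.57 bits/param
```

Under these assumptions the full checkpoint averages slightly over 8 bits per parameter, which fits a mostly FP8 model with some BF16 tensors. The GPTQ figure lands above 4 bits, which may reflect quantization scales and tensors kept at higher precision.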

Benchmarks

Measured against the base GLM-5.2 under identical harnesses:

Benchmark   Base GLM-5.2   Aesop v1   Delta (pts)
MMLU Pro    77%            82%        +5
GPQA        94%            96%        +2
GSM8K       93%            96%        +3
HellaSwag   71%            75%        +4
SimpleQA    60%            62%        +2
HumanEval   79.3%          85.4%      +6.1

Distillation of long-form deep-reasoning traces measurably lifts multi-step reasoning (MMLU Pro, GSM8K) and code generation (HumanEval) without degrading world knowledge.
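
The deltas reported above are differences in percentage points (Aesop v1 score minus base score), not relative improvements. A minimal sketch of that arithmetic follows, with the scores copied from the table; the dictionary layout and function name are illustrative.

```python
# (base GLM-5.2 score, Aesop v1 score), both in percent, copied from the table
scores = {
    "MMLU Pro": (77.0, 82.0),
    "GPQA": (94.0, 96.0),
    "GSM8K": (93.0, 96.0),
    "HellaSwag": (71.0, 75.0),
    "SimpleQA": (60.0, 62.0),
    "HumanEval": (79.3, 85.4),
}


def point_deltas(table):
    """Return the score difference in percentage points for each benchmark."""
    return {name: round(new - base, 1) for name, (base, new) in table.items()}


for name, delta in point_deltas(scores).items():
    print(f"{name:<10} {delta:+.1f} pts")
```

Computed this way, the HumanEval difference is 6.1 points (85.4 minus 79.3).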

Safety alignment (gold validation set, 100 prompts, deterministic temp=0/seed=0)

  • Harmful-prompt refusal: 100% (50/50 refused).
  • Benign-request compliance: ~97–98% when served with an adequate token budget (max_tokens ≥ 2048); the model writes long chain-of-thought before answering, so short caps can truncate otherwise-compliant responses.

Serve with max_tokens ≥ 2048 for best behavior.
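
Because a short cap can cut off an otherwise-compliant answer, it can help to flag responses that ended at the token limit instead of at an end-of-sequence token. The helper below is a hypothetical sketch, not part of the evaluation harness. It assumes you already have the newly generated token ids (prompt removed, no padding) and the set of end-of-sequence ids for the tokenizer.

```python
def hit_token_cap(generated_ids, eos_token_ids, max_new_tokens):
    """True if generation used the whole budget without emitting an EOS token."""
    reached_cap = len(generated_ids) >= max_new_tokens
    saw_eos = any(token in eos_token_ids for token in generated_ids)
    return reached_cap and not saw_eos


EOS = {2}                       # illustrative EOS id, not the model's real one
complete = [11, 12, 13, 2]      # ends with EOS exactly at the cap
cut_off = [11, 12, 13, 14]      # budget exhausted, no EOS seen

print(hit_token_cap(complete, EOS, max_new_tokens=4))  # False
print(hit_token_cap(cut_off, EOS, max_new_tokens=4))   # True
```

Responses flagged this way are candidates for a rerun with a larger budget before they are counted as non-compliant.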


Training

SFT stage — multi-teacher distillation

Supervised fine-tuning corpus (sft_combined_v4: 1,443 train / 160 val, ~1.77M tokens) blends curated long-form reasoning transcripts, ChatML instruction/response pairs, and domain Q&A, deduplicated across multiple teacher sources for reasoning and stylistic diversity.
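
The card does not say how deduplication or the train/val split was performed. As one possible sketch, the code below assumes exact-match deduplication after case and whitespace normalization, followed by a seeded shuffle with a 10% validation fraction. All function names, the `text` and `teacher` fields, and the 10% fraction are assumptions, although 10% of 1,603 examples does round to the reported 1,443 / 160 split.

```python
import hashlib
import random


def normalize(text):
    """Lowercase and collapse whitespace so trivial variants hash identically."""
    return " ".join(text.lower().split())


def dedupe(examples):
    """Keep the first example for each normalized text, across all teachers."""
    seen, unique = set(), []
    for example in examples:
        key = hashlib.sha256(normalize(example["text"]).encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(example)
    return unique


def split(examples, val_fraction=0.1, seed=0):
    """Seeded shuffle, then carve off the validation set."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_val = round(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]


corpus = [{"text": f"example {i}", "teacher": "a"} for i in range(1603)]
corpus.append({"text": "Example   0", "teacher": "b"})  # duplicate from another teacher
train, val = split(dedupe(corpus))
print(len(train), len(val))  # 1443 160
```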

DPO stage — preference alignment

Direct Preference Optimization over 897 hand-curated preference pairs, each contrasting a helpful, technically correct answer (chosen) against a hedged or low-value non-answer (rejected).

Configuration

Stage 1 — SFT (LoRA): rank r=64, alpha=128, attention projections on the top 18 layers (layers ≥ 60), 160 steps, lr 2e-4, seq len 2048.
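
The Stage 1 numbers can be cross-checked with a little arithmetic. The sketch below assumes 0-indexed layers over the 78 blocks listed in Model Details. The figure of 90 SFT adapter targets is taken from the "Base & merge" paragraph, which implies five adapted projections per layer; the card does not name them.

```python
NUM_BLOCKS = 78        # transformer blocks, from Model Details
FIRST_LAYER = 60       # "layers >= 60"
RANK, ALPHA = 64, 128  # Stage 1 LoRA hyperparameters
SFT_TARGETS = 90       # adapter targets reported under "Base & merge"

# Assumes layers are indexed 0..77
target_layers = [i for i in range(NUM_BLOCKS) if i >= FIRST_LAYER]

# Standard LoRA scaling applied to the low-rank update: alpha / r
scaling = ALPHA / RANK

# Implied number of adapted projections in each targeted layer
projections_per_layer = SFT_TARGETS // len(target_layers)

print(len(target_layers), scaling, projections_per_layer)  # 18 2.0 5
```

The scaling factor of 2.0 also holds for Stage 2, where alpha=32 and r=16.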

Stage 2 — DPO (LoRA): rank r=16, alpha=32, beta=0.1, 30 steps, lr 5e-5, init_from=sft (SFT adapter frozen as the DPO reference policy).
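
For reference, the standard per-pair DPO objective with beta=0.1 can be sketched as below. The card does not confirm which trainer or loss variant was used, so treat this as the textbook form, not the exact training code. The inputs are summed log-probabilities of the chosen and rejected responses under the trainable policy and the frozen reference, and the function name is illustrative.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Textbook DPO loss for one preference pair: -log(sigmoid(beta * margin))."""
    chosen_shift = policy_chosen_logp - ref_chosen_logp
    rejected_shift = policy_rejected_logp - ref_rejected_logp
    x = beta * (chosen_shift - rejected_shift)
    # Numerically stable -log(sigmoid(x)) = max(-x, 0) + log(1 + exp(-|x|))
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))


# Policy identical to the reference: margin is zero, loss is log(2)
print(dpo_loss(-10.0, -12.0, -10.0, -12.0))

# Policy has raised the chosen answer and lowered the rejected one: loss drops
print(dpo_loss(-8.0, -14.0, -10.0, -12.0))
```

At initialization from the SFT adapter the policy equals the reference, so every pair starts at a loss of log(2), about 0.693.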

Base & merge: Base zai-org/GLM-5.2 (FP8). Non-destructive merge — attention modules touched by the adapters are dequantized to bf16 for a lossless weight update while the MoE expert tensors stay FP8. 90 SFT + 90 DPO adapter targets merged; 631 modules left untouched. Loads as a standard transformers checkpoint.
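
The merge itself is not published here, but the usual way to fold a LoRA adapter into a weight matrix is W' = W + (alpha / r) * B * A, computed on a higher-precision copy of W. The sketch below shows that update on tiny plain-Python matrices. It does not model FP8 dequantization, and the function names are illustrative.

```python
def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def merge_lora(weight, lora_a, lora_b, alpha, r):
    """Return weight + (alpha / r) * (lora_b @ lora_a) without mutating inputs."""
    scale = alpha / r
    delta = matmul(lora_b, lora_a)  # (out x r) @ (r x in) -> (out x in)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


weight = [[1.0, 0.0], [0.0, 1.0]]  # stand-in for a dequantized attention weight
lora_a = [[0.5, -0.5]]             # r x in, with r = 1
lora_b = [[1.0], [2.0]]            # out x r

merged = merge_lora(weight, lora_a, lora_b, alpha=2, r=1)
print(merged)  # [[2.0, -1.0], [2.0, -1.0]]
```

Calling the function once per adapter would fold in both stages; whether the actual merge was performed that way is not stated in this card.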


Usage

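from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "cfontes/Aesop-v1"
tok = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
# torch_dtype="auto" uses the dtype recorded in the checkpoint;
# device_map="auto" spreads the weights across the available devices
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto", trust_remote_code=True
)

messages = [{"role": "user", "content": "Walk me through the intuition behind gradient descent."}]
# return_dict=True returns input_ids and attention_mask together
inputs = tok.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt", return_dict=True
).to(model.device)
# Keep the budget at 2048 or more so the chain-of-thought is not truncated
out = model.generate(**inputs, max_new_tokens=2048)
# Decode only the newly generated tokens, not the echoed prompt
prompt_len = inputs["input_ids"].shape[-1]
print(tok.decode(out[0][prompt_len:], skip_special_tokens=True))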

The individual LoRA adapters are published at cfontes/GLM-5.2-F5-Molt-LoRA.


Intended use

General-purpose reasoning, coding, analysis, and instruction following, with strengthened refusal behavior on genuinely harmful requests and improved willingness to engage helpfully with legitimate technical work. Use responsibly and legally; you are accountable for what you do with the output.


License

Released under the GLM license, inheriting all terms from the base model zai-org/GLM-5.2. See the license link.

Citation

@misc{aesopv1_2026,
  title  = {Aesop v1: A Reasoning-Focused, Safety-Aligned Adaptation of GLM-5.2},
  author = {Fontes, Chris},
  year   = {2026},
  howpublished = {\url{https://huggingface.co/cfontes/Aesop-v1}}
}

Built on zai-org/GLM-5.2. Adapted via SFT + DPO with non-destructive FP8-preserving merge.
