Preview / testing tune of GLM4-9B to explore the data mix and hyperparameters. The goal is to find a roughly optimal SFT setup for the GLM 9B base model before testing KTO and transferring the setup over to the new Arcee GLM4-32B-Base pretrain.

This is a hybrid thinking model that defaults to no thinking, but it can think if prompted to and prefilled with the `<think>` tag.
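A minimal sketch of how the `<think>` prefill might look with `transformers`; the exact chat template behavior and tag handling are assumptions based on GLM4 conventions, not a confirmed recipe for this checkpoint:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ConicCat/GL-Marvin-9B-Preview"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="bfloat16", device_map="auto"
)

messages = [{"role": "user", "content": "What is 17 * 24?"}]

# Build the prompt, then append the opening <think> tag so generation
# continues inside a reasoning block instead of answering directly.
# Omit the prefill to get the default no-thinking behavior.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
prompt += "<think>"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=False
))
```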

| dataset | version | metric | mode | Marvin-9B_hf-vllm |
|---|---|---|---|---|
| GPQA_diamond | 5aeece | accuracy | gen | 33.33 |
| ARC-c | 1e0de5 | accuracy | gen | 82.71 |

| dataset | version | metric | mode | GLM-4-9B-0414_hf-vllm |
|---|---|---|---|---|
| GPQA_diamond | 5aeece | accuracy | gen | 34.34 |

Roughly equivalent on GPQA to the official checkpoint despite using no RL, which is pretty nice too.
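For reference, a minimal sketch of loading the model with vLLM, matching the hf-vllm inference mode in the tables above; the sampling parameters here are assumptions for illustration, not the exact eval settings:

```python
from vllm import LLM, SamplingParams

# Load the preview checkpoint in bf16, as listed on the card.
llm = LLM(model="ConicCat/GL-Marvin-9B-Preview", dtype="bfloat16")

# Greedy decoding as a stand-in for the "gen" evaluation mode.
params = SamplingParams(temperature=0.0, max_tokens=1024)

outputs = llm.generate(["Question: ...\nAnswer:"], params)
print(outputs[0].outputs[0].text)
```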
