Preview / testing tune of GLM4-9B to explore the data mix and hyperparameters. The goal is to find a roughly optimal SFT setup for the GLM 9B base model before testing KTO and transferring the setup over to the new Arcee GLM4-32B-Base pretrain.

This is a hybrid thinking model that defaults to no thinking, but it can think if prompted to and prefilled with the `<think>` tag.
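A minimal sketch of how the `<think>` prefill might look with `transformers`; the exact chat template behavior and tag handling are assumptions based on GLM4 conventions, not a confirmed recipe for this checkpoint:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ConicCat/GL-Marvin-9B-Preview"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="bfloat16", device_map="auto"
)

messages = [{"role": "user", "content": "What is 17 * 24?"}]

# Build the prompt, then append the opening <think> tag so generation
# continues inside a reasoning block instead of answering directly.
# Omit the prefill to get the default no-thinking behavior.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
prompt += "<think>"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=False
))
```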

| dataset | version | metric | mode | Marvin-9B_hf-vllm |
|---|---|---|---|---|
| GPQA_diamond | 5aeece | accuracy | gen | 33.33 |
| ARC-c | 1e0de5 | accuracy | gen | 82.71 |

| dataset | version | metric | mode | GLM-4-9B-0414_hf-vllm |
|---|---|---|---|---|
| GPQA_diamond | 5aeece | accuracy | gen | 34.34 |

Roughly equivalent on GPQA to the official checkpoint despite using no RL, which is pretty nice too.
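For reference, a minimal sketch of loading the model with vLLM, matching the hf-vllm inference mode in the tables above; the sampling parameters here are assumptions for illustration, not the exact eval settings:

```python
from vllm import LLM, SamplingParams

# Load the preview checkpoint in bf16, as listed on the card.
llm = LLM(model="ConicCat/GL-Marvin-9B-Preview", dtype="bfloat16")

# Greedy decoding as a stand-in for the "gen" evaluation mode.
params = SamplingParams(temperature=0.0, max_tokens=1024)

outputs = llm.generate(["Question: ...\nAnswer:"], params)
print(outputs[0].outputs[0].text)
```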
