---
base_model:
  - ConicCat/GLM-4.1V-Text-9B-Base
datasets:
  - ConicCat/TuluAmoral50K-MIG
---

Preview / testing tune of GLM4-9B for exploring the data mix and hyperparameters. I'm hoping to find a roughly optimal SFT setup for the GLM 9B base model before testing KTO and transferring the setup over to the new Arcee GLM4-32B-Base pretrain.

This is a hybrid thinking model that defaults to no thinking, but it can think if prompted to and prefilled with the `<think>` tag.
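A minimal sketch of what that prefill looks like. The role markers below are hypothetical placeholders for illustration only; in practice you would use the model's actual chat template, with the assistant turn left open and `<think>` appended so the model continues the reasoning block:

```python
def build_prompt(user_msg: str, think: bool = False) -> str:
    """Build a chat prompt, optionally prefilling a <think> block.

    The <|user|>/<|assistant|> markers are illustrative assumptions,
    not the model's real template.
    """
    prompt = f"<|user|>\n{user_msg}\n<|assistant|>\n"
    if think:
        # Leaving the tag unclosed makes the model continue the
        # reasoning block instead of answering directly.
        prompt += "<think>"
    return prompt


# No prefill -> the model answers directly (its default behavior).
print(build_prompt("What is 2+2?"))
# Prefilled -> generation starts inside the thinking block.
print(build_prompt("What is 2+2?", think=True))
```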

| dataset | version | metric | mode | Marvin-9B_hf-vllm |
| --- | --- | --- | --- | --- |
| GPQA_diamond | 5aeece | accuracy | gen | 33.33 |
| ARC-c | 1e0de5 | accuracy | gen | 82.71 |

| dataset | version | metric | mode | GLM-4-9B-0414_hf-vllm |
| --- | --- | --- | --- | --- |
| GPQA_diamond | 5aeece | accuracy | gen | 34.34 |

GPQA is about equivalent to the official checkpoint without RL, which is pretty nice too.