---
base_model:
- ConicCat/GLM-4.1V-Text-9B-Base
datasets:
- ConicCat/TuluAmoral50K-MIG
---
Preview/testing tune of GLM4-9B for exploring the data mix and hyperparameters. The goal is to find a roughly optimal SFT setup for GLM 9B base before testing KTO and transferring the setup over to the new Arcee GLM4-32B-Base pretrain.
This is a hybrid thinking model that defaults to no thinking, but it can think if prompted to and the response is prefilled with the `<think>` tag.
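As a minimal sketch of the prefill trick: append the opening `<think>` tag to the assistant turn so generation continues inside a reasoning block instead of answering directly. The turn delimiters below are assumptions for illustration, not the model's verified chat template; check the tokenizer config for the real one.

```python
def build_prompt(user_msg: str, think: bool = False) -> str:
    """Build a single-turn prompt, optionally prefilling <think>.

    The <|user|>/<|assistant|> delimiters are hypothetical placeholders;
    substitute the tokens from the model's actual chat template.
    """
    prompt = f"<|user|>\n{user_msg}\n<|assistant|>\n"
    if think:
        # Prefilling the opening tag nudges the model to emit its
        # reasoning before the final answer.
        prompt += "<think>"
    return prompt

# Default: no thinking. With think=True the prompt ends in "<think>",
# so the model's completion starts inside the reasoning block.
print(build_prompt("What is 17 * 24?", think=True))
```

In practice the same effect can be achieved by passing a partial assistant message (ending in `<think>`) to whatever inference stack is serving the model, since most servers continue from the prefilled text.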
| dataset | version | metric | mode | Marvin-9B_hf-vllm | GLM-4-9B-0414_hf-vllm |
|---|---|---|---|---|---|
| GPQA_diamond | 5aeece | accuracy | gen | 33.33 | 34.34 |
| ARC-c | 1e0de5 | accuracy | gen | 82.71 | - |
Roughly equivalent on GPQA to the official checkpoint despite having no RL, which is a nice result.