---
base_model:
- ConicCat/GLM-4.1V-Text-9B-Base
datasets:
- ConicCat/TuluAmoral50K-MIG
---

Preview / testing tune of GLM4-9B for exploring the data mix and hyperparameters. I'm hoping to find a roughly optimal SFT setup for GLM 9B base before testing KTO and transferring the setup over to the new Arcee GLM4-32B-Base pretrain.

This is a hybrid thinking model that defaults to no thinking, but it can think if prompted to and prefilled with the `<think>` tag.

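As a minimal sketch, toggling thinking mode comes down to prefilling the assistant turn with the opening `<think>` tag. The role markers below are illustrative placeholders, not the model's exact template — in practice you would build the prompt with the tokenizer's `apply_chat_template` and pass the prefill as the start of the assistant message:

```python
def build_prompt(user_msg: str, think: bool = False) -> str:
    """Build a prompt string; prefill `<think>` to switch on reasoning.

    The <|user|>/<|assistant|> markup is a stand-in for the real chat
    template, shown only to illustrate where the prefill goes.
    """
    prompt = f"<|user|>\n{user_msg}\n<|assistant|>\n"
    if think:
        # Prefilling the opening tag nudges the model into thinking mode;
        # it will continue inside <think>...</think> before answering.
        prompt += "<think>"
    return prompt

# Default behaviour: no thinking prefill, the model answers directly.
print(build_prompt("What is 17 * 23?"))

# Thinking mode: the generation continues from the open <think> tag.
print(build_prompt("What is 17 * 23?", think=True))
```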
| dataset | version | metric | mode | Marvin-9B_hf-vllm |
| ----- | ----- | ----- | ----- | ----- |
| GPQA_diamond | 5aeece | accuracy | gen | 33.33 |
| ARC-c | 1e0de5 | accuracy | gen | 82.71 |

| dataset | version | metric | mode | GLM-4-9B-0414_hf-vllm |
| ----- | ----- | ----- | ----- | ----- |
| GPQA_diamond | 5aeece | accuracy | gen | 34.34 |

About equivalent on GPQA to the official GLM-4-9B-0414 checkpoint without RL, which is pretty nice too.