---
base_model:
- ConicCat/GLM-4.1V-Text-9B-Base
datasets:
- ConicCat/TuluAmoral50K-MIG
---

Preview / testing tune of GLM4-9B for exploring the data mix and hyperparameters. I'm hoping to find a roughly optimal SFT setup for GLM 9B base before testing KTO and transferring the setup over to the new Arcee GLM4-32B-Base pretrain.

This is a hybrid thinking model that defaults to no thinking, but it can think if prompted to and prefilled with the `<think>` tag.

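As a minimal sketch, toggling thinking mode comes down to prefilling the assistant turn with the opening `<think>` tag. The role markers below are illustrative placeholders, not the model's exact template — in practice you would build the prompt with the tokenizer's `apply_chat_template` and pass the prefill as the start of the assistant message:

```python
def build_prompt(user_msg: str, think: bool = False) -> str:
    """Build a prompt string; prefill `<think>` to switch on reasoning.

    The <|user|>/<|assistant|> markup is a stand-in for the real chat
    template, shown only to illustrate where the prefill goes.
    """
    prompt = f"<|user|>\n{user_msg}\n<|assistant|>\n"
    if think:
        # Prefilling the opening tag nudges the model into thinking mode;
        # it will continue inside <think>...</think> before answering.
        prompt += "<think>"
    return prompt

# Default behaviour: no thinking prefill, the model answers directly.
print(build_prompt("What is 17 * 23?"))

# Thinking mode: the generation continues from the open <think> tag.
print(build_prompt("What is 17 * 23?", think=True))
```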
| dataset | version | metric | mode | Marvin-9B_hf-vllm |
| ----- | ----- | ----- | ----- | ----- |
| GPQA_diamond | 5aeece | accuracy | gen | 33.33 |
| ARC-c | 1e0de5 | accuracy | gen | 82.71 |

| dataset | version | metric | mode | GLM-4-9B-0414_hf-vllm |
| ----- | ----- | ----- | ----- | ----- |
| GPQA_diamond | 5aeece | accuracy | gen | 34.34 |

About equivalent on GPQA to the official GLM-4-9B-0414 checkpoint without RL, which is pretty nice too.