Training Data

  1. jaeyong2/Qwen3-06B-Ko-DPO
  2. jaeyong2/Qwen3-06B-Ko-DPO-2
  3. jaeyong2/Qwen3-06B-Ko-DPO-3
  4. jaeyong2/Qwen3-06B-En-DPO-2
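The datasets above are preference pairs used for Direct Preference Optimization (DPO). As a rough illustration only (not the actual training code), the DPO objective scores the policy's log-probability margin between the chosen and rejected responses against a frozen reference model:

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for a single preference pair (illustrative sketch).

    Each argument is the summed log-probability of the chosen/rejected
    response under the policy or the frozen reference model.
    beta controls how far the policy may drift from the reference.
    """
    margin = (policy_chosen_logp - policy_rejected_logp) \
           - (ref_chosen_logp - ref_rejected_logp)
    # -log(sigmoid(beta * margin)); minimized when the policy prefers
    # the chosen response more strongly than the reference does.
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# Positive margin: policy favors the chosen answer more than the reference.
loss = dpo_loss(-10.0, -14.0, -11.0, -12.0)
```

The log-probability values here are made up for illustration; in practice a trainer such as TRL's `DPOTrainer` computes them per token from the model outputs.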

Evaluation

```shell
lm_eval --model hf \
    --model_args pretrained=jaeyong2/Qwen3-0.6B-DPO \
    --tasks kmmlu,mmlu,gsm8k \
    --device cuda:0 \
    --batch_size 1 \
    --num_fewshot 5
```
| Benchmark (5-shot) | Qwen3-0.6B-DPO | Qwen3-0.6B | naver-hyperclovax/HyperCLOVAX-SEED-Text-Instruct-0.5B |
|---|---|---|---|
| MMLU | 0.47 | 0.47 | 0.44 |
| KMMLU | 0.34 | 0.35 | 0.38 |
| GSM8K | 0.47 | 0.42 | 0.39 |
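Besides the printed table, lm-evaluation-harness can dump its results as JSON (via `--output_path`). A minimal sketch of pulling the headline numbers out of such a dump; the exact metric keys (`acc,none`, `exact_match,strict-match`) vary by harness version and are assumptions here:

```python
import json

# Simplified, assumed shape of an lm-eval JSON dump; real dumps carry
# more metadata and version-dependent metric keys.
raw = json.dumps({
    "results": {
        "mmlu":  {"acc,none": 0.47},
        "kmmlu": {"acc,none": 0.34},
        "gsm8k": {"exact_match,strict-match": 0.47},
    }
})

results = json.loads(raw)["results"]
# Take the first reported metric per task.
scores = {task: next(iter(metrics.values()))
          for task, metrics in results.items()}
```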

License

Acknowledgement

This research was supported by the TPU Research Cloud (TRC) program.

Model size: 0.6B params · Tensor type: BF16 (Safetensors)
Base model: Qwen/Qwen3-0.6B (finetuned; see also the adapter jaeyong2/Qwen3-0.6B-DPO-Peft)