DPO test of the Karga EN↔TR model.

- chrF++ without DPO: 45.41
- chrF++ with DPO (this model): 54.84
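chrF++ is a character n-gram (plus word n-gram) F-score between hypothesis and reference translations. As a rough illustration of the metric, here is a simplified pure-Python sketch of the character n-gram F-beta score at the core of chrF. It is not part of this model card's evaluation code and omits the word n-gram component that full chrF++ adds, so it will not reproduce the scores above:

```python
from collections import Counter

def char_ngrams(text, n):
    # character n-grams, ignoring whitespace (as chrF does)
    s = text.replace(" ", "")
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))

def chrf(hypothesis, reference, max_n=6, beta=2.0):
    # average precision/recall over n-gram orders, then F-beta (beta=2
    # weights recall higher, matching the chrF definition)
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp, ref = char_ngrams(hypothesis, n), char_ngrams(reference, n)
        overlap = sum((hyp & ref).values())
        if sum(hyp.values()) == 0 or sum(ref.values()) == 0:
            continue  # string too short for this n-gram order
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    return 100 * (1 + beta**2) * p * r / (beta**2 * p + r)
```

For the reported numbers, a sacreBLEU-style chrF++ implementation scored the model's outputs against the FLORES+ references.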

Test dataset: openlanguagedata/flores_plus

Generation settings:

```json
"temperature": 0.4,
"top_p": 0.95,
"top_k": 10,
```

Uploaded finetuned model

  • Developed by: Ba2han
  • License: apache-2.0
  • Finetuned from model: Ba2han/Karga_EN-TR-8B-A1B

This lfm2_moe model was trained 2x faster with Unsloth and Hugging Face's TRL library.
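DPO optimizes the policy directly on preference pairs, increasing the log-probability margin of the chosen response over the rejected one relative to a frozen reference model. A minimal sketch of the per-pair loss, assuming sequence log-probabilities have already been computed (the function and the beta value are illustrative; TRL's `DPOTrainer` handles this internally):

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    # DPO: -log sigmoid(beta * (policy margin - reference margin))
    policy_margin = policy_chosen_logp - policy_rejected_logp
    ref_margin = ref_chosen_logp - ref_rejected_logp
    x = beta * (policy_margin - ref_margin)
    return -math.log(1.0 / (1.0 + math.exp(-x)))
```

When the policy's margin matches the reference's, the loss sits at log 2; widening the margin on the chosen response drives it toward zero.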


Model tree for Ba2han/Karga-DPO-v0.1

Finetuned
(2)
this model
Quantizations
2 models