lfm2.5-350M-APEX-FINAL

Description

The final release of the 350M Omni series, refined with Odds Ratio Preference Optimization (ORPO) to reduce repetition collapse and raise the model's ceiling on logical reasoning.

Training Details

Method: ORPO (Odds Ratio Preference Optimization).
Dataset: UltraFeedback Binarized.
Optimization: Adds an odds-ratio term to the supervised fine-tuning loss that raises the odds of chosen responses relative to rejected ones, with the goal of higher coherence.
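
The objective described above can be illustrated with a minimal sketch in plain Python. It is not the training code used for this model. It assumes each response is summarized by its length-averaged token log-probability, and the function names (`log_odds`, `orpo_loss`) and the weight `lam=0.1` are hypothetical choices made for illustration only.

```python
import math


def log_odds(avg_logp: float) -> float:
    """Return log(p / (1 - p)) for p = exp(avg_logp); requires avg_logp < 0."""
    return avg_logp - math.log1p(-math.exp(avg_logp))


def orpo_loss(chosen_logp: float, rejected_logp: float, lam: float = 0.1) -> float:
    """Sketch of an ORPO-style loss for one (chosen, rejected) pair.

    chosen_logp / rejected_logp: length-averaged log-probabilities the model
    assigns to the chosen and rejected responses (assumed inputs).
    lam: weight of the odds-ratio term (illustrative value, not from the card).
    """
    # Log of the odds ratio between the chosen and the rejected response.
    log_ratio = log_odds(chosen_logp) - log_odds(rejected_logp)
    # -log(sigmoid(x)) written as log(1 + exp(-x)).
    odds_ratio_term = math.log1p(math.exp(-log_ratio))
    # Standard negative log-likelihood on the chosen response.
    sft_term = -chosen_logp
    return sft_term + lam * odds_ratio_term


if __name__ == "__main__":
    # The loss is lower when the model already prefers the chosen response.
    print(orpo_loss(chosen_logp=-0.5, rejected_logp=-2.0))
    print(orpo_loss(chosen_logp=-2.0, rejected_logp=-0.5))
```

The odds-ratio term shrinks toward zero as the chosen response becomes much more likely than the rejected one, so the loss reduces to ordinary fine-tuning on the chosen response once the preference is learned.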

Format: Safetensors.
Model size: 0.4B params.
Tensor type: BF16.
