lfm2.5-350M-APEX-FINAL

Description

The final release of the 350M Omni series, fine-tuned with Odds Ratio Preference Optimization (ORPO) to reduce repetition collapse and improve logical coherence.

Training Details

  • Method: ORPO (Odds Ratio Preference Optimization).
  • Dataset: UltraFeedback Binarized.
  • Optimization: Adds an odds-ratio penalty to the supervised fine-tuning loss, lowering the odds of rejected responses relative to chosen ones for higher coherence.
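
The objective described above can be sketched for a single preference pair as follows. This is a minimal illustration of the general ORPO formulation, not the training code used for this model. The function names, the use of length-averaged log-probabilities, and the weight `lam=0.1` are assumptions made for the example.

```python
import math


def log_odds(avg_logprob: float) -> float:
    """Return log(p / (1 - p)) for p = exp(avg_logprob).

    avg_logprob is the length-averaged log-probability of a response
    and must be strictly negative so that p < 1.
    """
    return avg_logprob - math.log1p(-math.exp(avg_logprob))


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def orpo_loss(chosen_logprob: float, rejected_logprob: float, lam: float = 0.1) -> float:
    """ORPO loss for one (chosen, rejected) pair.

    loss = NLL(chosen) + lam * -log(sigmoid(log-odds ratio))
    lam is a hypothetical weight chosen for illustration only.
    """
    # Supervised fine-tuning term: negative log-likelihood of the chosen response.
    nll_chosen = -chosen_logprob
    # Log of odds(chosen) / odds(rejected).
    log_odds_ratio = log_odds(chosen_logprob) - log_odds(rejected_logprob)
    # -log(sigmoid(z)) equals softplus(-z).
    odds_ratio_term = softplus(-log_odds_ratio)
    return nll_chosen + lam * odds_ratio_term


if __name__ == "__main__":
    # The penalty shrinks as the rejected response becomes less likely.
    print(orpo_loss(-0.5, -0.5))
    print(orpo_loss(-0.5, -2.0))
```

With the chosen log-probability held fixed, lowering the rejected log-probability reduces the loss, which is the sense in which the rejected response's odds are penalized.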

Model size: 0.4B params (Safetensors format, BF16 tensors)