lfm2.5-350M-APEX-FINAL
Description
The definitive version of the 350M Omni series, refined with Odds Ratio Preference Optimization (ORPO) to address repetition collapse (degenerate, looping output) and limits on logical reasoning.
Training Details
- Method: ORPO (Odds Ratio Preference Optimization).
- Dataset: UltraFeedback Binarized.
- Optimization: Adds an odds-ratio penalty to the supervised fine-tuning loss, lowering the odds of rejected responses relative to chosen ones for higher coherence.
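
The objective described above can be sketched per example as follows. This is a minimal illustration of the published ORPO loss, not this model's training code. The function names, the use of length-averaged token log-probabilities, and the weight `lam=0.1` are assumptions, since the card does not state its hyperparameters.

```python
import math


def log_odds(avg_logp: float) -> float:
    """Return log(p / (1 - p)) for p = exp(avg_logp), with avg_logp < 0."""
    return avg_logp - math.log1p(-math.exp(avg_logp))


def orpo_loss(chosen_avg_logp: float, rejected_avg_logp: float, lam: float = 0.1) -> float:
    """ORPO loss for one preference pair (hypothetical helper).

    Inputs are length-averaged token log-probabilities of the chosen and
    rejected responses under the model being trained. `lam` weights the
    odds-ratio term; 0.1 is an assumed value, not taken from this model card.
    """
    # Supervised fine-tuning term: negative log-likelihood of the chosen response.
    nll = -chosen_avg_logp
    # Log of the odds ratio between the chosen and rejected responses.
    log_odds_ratio = log_odds(chosen_avg_logp) - log_odds(rejected_avg_logp)
    # -log(sigmoid(x)) written as log(1 + exp(-x)).
    or_penalty = math.log1p(math.exp(-log_odds_ratio))
    return nll + lam * or_penalty


# Holding the chosen response fixed, a more likely rejected response costs more.
low_penalty = orpo_loss(-0.5, -2.0)
high_penalty = orpo_loss(-0.5, -0.6)
```

Because the penalty depends only on the model's own probabilities, this formulation needs no separate reference model, unlike DPO-style objectives.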