SLIME: Stabilized Likelihood Implicit Margin Enforcement for Preference Optimization
Submitted by Maksim Afanasyev
Floating Point Sigma Lab