Beyond Length Scaling: Synergizing Breadth and Depth for Generative Reward Models Paper • 2603.01571 • Published 3 days ago • 30
RubricBench: Aligning Model-Generated Rubrics with Human Standards Paper • 2603.01562 • Published 3 days ago • 50
Same Claim, Different Judgment: Benchmarking Scenario-Induced Bias in Multilingual Financial Misinformation Detection Paper • 2601.05403 • Published Jan 8 • 10
LOGO -- Long cOntext aliGnment via efficient preference Optimization Paper • 2410.18533 • Published Oct 24, 2024 • 43
Resonance RoPE: Improving Context Length Generalization of Large Language Models Paper • 2403.00071 • Published Feb 29, 2024 • 24