GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization • Paper • 2601.05242 • Published 28 days ago • 220
OpenNovelty: An LLM-powered Agentic System for Verifiable Scholarly Novelty Assessment • Paper • 2601.01576 • Published Jan 4 • 18
MOSS Transcribe Diarize: Accurate Transcription with Speaker Diarization • Paper • 2601.01554 • Published Jan 4 • 57
SWE-Lego: Pushing the Limits of Supervised Fine-tuning for Software Issue Resolving • Paper • 2601.01426 • Published Jan 4 • 22
Dream-VL & Dream-VLA: Open Vision-Language and Vision-Language-Action Models with Diffusion Language Model Backbone • Paper • 2512.22615 • Published Dec 27, 2025 • 48
Implicit Search via Discrete Diffusion: A Study on Chess • Paper • 2502.19805 • Published Feb 27, 2025 • 1
Why Does the Effective Context Length of LLMs Fall Short? • Paper • 2410.18745 • Published Oct 24, 2024 • 17
Scaling Diffusion Language Models via Adaptation from Autoregressive Models • Paper • 2410.17891 • Published Oct 23, 2024 • 16
Diffusion Model Alignment Using Direct Preference Optimization • Paper • 2311.12908 • Published Nov 21, 2023 • 49