MERIT: Learning Disentangled Music Representations for Audio Similarity Paper • 2605.27346 • Published 9 days ago • 7
Domino: Decoupling Causal Modeling from Autoregressive Drafting in Speculative Decoding Paper • 2605.29707 • Published 7 days ago • 134
StressDream: Steering Video World Models for Robust Policy Evaluation and Improvement Paper • 2606.00267 • Published 6 days ago • 2
Full Attention Strikes Back: Transferring Full Attention into Sparse within Hundred Training Steps Paper • 2605.16928 • Published 19 days ago • 93
InteractWeb-Bench: Can Multimodal Agent Escape Blind Execution in Interactive Website Generation? Paper • 2604.27419 • Published Apr 30 • 13
jackf857/qwen3-8b-base-new-dpo-hh-harmless-4xh200-batch-64-q_t-0.45-s_star-0.4-eta-8 Text Generation • 8B • Updated May 1 • 10 • 1
DiPO: Disentangled Perplexity Policy Optimization for Fine-grained Exploration-Exploitation Trade-Off Paper • 2604.13902 • Published Apr 15 • 62