Towards Pixel-Level VLM Perception via Simple Points Prediction Paper • 2601.19228 • Published 2 days ago • 12
World Craft: Agentic Framework to Create Visualizable Worlds via Text Paper • 2601.09150 • Published 16 days ago • 17
The Flexibility Trap: Why Arbitrary Order Limits Reasoning Potential in Diffusion Language Models Paper • 2601.15165 • Published 8 days ago • 67
When Personalization Misleads: Understanding and Mitigating Hallucinations in Personalized LLMs Paper • 2601.11000 • Published 13 days ago • 26
The Illusion of Specialization: Unveiling the Domain-Invariant "Standing Committee" in Mixture-of-Experts Models Paper • 2601.03425 • Published 23 days ago • 16
GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization Paper • 2601.05242 • Published 21 days ago • 211
LTX-2: Efficient Joint Audio-Visual Foundation Model Paper • 2601.03233 • Published 23 days ago • 141
MOSS Transcribe Diarize: Accurate Transcription with Speaker Diarization Paper • 2601.01554 • Published 25 days ago • 57
InfiniDepth: Arbitrary-Resolution and Fine-Grained Depth Estimation with Neural Implicit Fields Paper • 2601.03252 • Published 23 days ago • 100
SpotEdit: Selective Region Editing in Diffusion Transformers Paper • 2512.22323 • Published Dec 26, 2025 • 39