Intern-S1: A Scientific Multimodal Foundation Model Paper • 2508.15763 • Published Aug 21, 2025 • 262
NextStep-1: Toward Autoregressive Image Generation with Continuous Tokens at Scale Paper • 2508.10711 • Published Aug 14, 2025 • 145
Seed Diffusion: A Large-Scale Diffusion Language Model with High-Speed Inference Paper • 2508.02193 • Published Aug 4, 2025 • 136
EmbRACE-3K: Embodied Reasoning and Action in Complex Environments Paper • 2507.10548 • Published Jul 14, 2025 • 37
MMHU: A Massive-Scale Multimodal Benchmark for Human Behavior Understanding Paper • 2507.12463 • Published Jul 16, 2025 • 27
T-LoRA: Single Image Diffusion Model Customization Without Overfitting Paper • 2507.05964 • Published Jul 8, 2025 • 120
Traceable Evidence Enhanced Visual Grounded Reasoning: Evaluation and Methodology Paper • 2507.07999 • Published Jul 10, 2025 • 50
PosterCraft: Rethinking High-Quality Aesthetic Poster Generation in a Unified Framework Paper • 2506.10741 • Published Jun 12, 2025 • 27