Knowledge is Not Enough: Injecting RL Skills for Continual Adaptation Paper • 2601.11258 • Published 10 days ago • 4
RecTok: Reconstruction Distillation along Rectified Flow Paper • 2512.13421 • Published Dec 15, 2025 • 5
Does Hearing Help Seeing? Investigating Audio-Video Joint Denoising for Video Generation Paper • 2512.02457 • Published Dec 2, 2025 • 14
Uniworld-V2: Reinforce Image Editing with Diffusion Negative-aware Finetuning and MLLM Implicit Feedback Paper • 2510.16888 • Published Oct 19, 2025 • 22
ConsistentID: Portrait Generation with Multimodal Fine-Grained Identity Preserving Paper • 2404.16771 • Published Apr 25, 2024 • 19
Tool-integrated Reinforcement Learning for Repo Deep Search Paper • 2508.03012 • Published Aug 5, 2025 • 20
Muddit: Liberating Generation Beyond Text-to-Image with a Unified Discrete Diffusion Model Paper • 2505.23606 • Published May 29, 2025 • 14
On Path to Multimodal Generalist: General-Level and General-Bench Paper • 2505.04620 • Published May 7, 2025 • 82
RelationBooth: Towards Relation-Aware Customized Object Generation Paper • 2410.23280 • Published Oct 30, 2024 • 1
An Empirical Study of GPT-4o Image Generation Capabilities Paper • 2504.05979 • Published Apr 8, 2025 • 64
Decouple and Track: Benchmarking and Improving Video Diffusion Transformers for Motion Transfer Paper • 2503.17350 • Published Mar 21, 2025 • 1
UPME: An Unsupervised Peer Review Framework for Multimodal Large Language Model Evaluation Paper • 2503.14941 • Published Mar 19, 2025 • 5
MetaTool Benchmark for Large Language Models: Deciding Whether to Use Tools and Which to Use Paper • 2310.03128 • Published Oct 4, 2023 • 1
MLLM-as-a-Judge: Assessing Multimodal LLM-as-a-Judge with Vision-Language Benchmark Paper • 2402.04788 • Published Feb 7, 2024
The Best of Both Worlds: Toward an Honest and Helpful Large Language Model Paper • 2406.00380 • Published Jun 1, 2024
GUI-WORLD: A Dataset for GUI-oriented Multimodal LLM-based Agents Paper • 2406.10819 • Published Jun 16, 2024 • 2
UniGen: A Unified Framework for Textual Dataset Generation Using Large Language Models Paper • 2406.18966 • Published Jun 27, 2024
TrustGPT: A Benchmark for Trustworthy and Responsible Large Language Models Paper • 2306.11507 • Published Jun 20, 2023