Qwen-Image-Agent: Bridging the Context Gap in Real-World Image Generation Paper • 2606.26907 • Published 5 days ago • 45
MemoryRewardBench: Benchmarking Reward Models for Long-Term Memory Management in Large Language Models Paper • 2601.11969 • Published Jan 17 • 27
PPTC Benchmark: Evaluating Large Language Models for PowerPoint Task Completion Paper • 2311.01767 • Published Nov 3, 2023 • 20