$\infty$Bench: Extending Long Context Evaluation Beyond 100K Tokens Paper • 2402.13718 • Published Feb 21, 2024 • 1
MADFormer: Mixed Autoregressive and Diffusion Transformers for Continuous Image Generation Paper • 2506.07999 • Published Jun 9, 2025 • 2
Qwen-VLA: Unifying Vision-Language-Action Modeling across Tasks, Environments, and Robot Embodiments Paper • 2605.30280 • Published 29 days ago • 146
FineVLA: Fine-Grained Instruction Alignment for Steerable Vision-Language-Action Policies Paper • 2605.27284 • Published May 26 • 9
ARKS: Active Retrieval in Knowledge Soup for Code Generation Paper • 2402.12317 • Published Feb 19, 2024
Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows? Paper • 2407.10956 • Published Jul 15, 2024 • 7
Scaling Computer-Use Grounding via User Interface Decomposition and Synthesis Paper • 2505.13227 • Published May 19, 2025 • 46
Learn-by-interact: A Data-Centric Framework for Self-Adaptive Agents in Realistic Environments Paper • 2501.10893 • Published Jan 18, 2025 • 26
One Embedder, Any Task: Instruction-Finetuned Text Embeddings Paper • 2212.09741 • Published Dec 19, 2022 • 4
Selective Annotation Makes Language Models Better Few-Shot Learners Paper • 2209.01975 • Published Sep 5, 2022
OpenAgents: An Open Platform for Language Agents in the Wild Paper • 2310.10634 • Published Oct 16, 2023 • 9
BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval Paper • 2407.12883 • Published Jul 16, 2024 • 13
Lemur: Harmonizing Natural Language and Code for Language Agents Paper • 2310.06830 • Published Oct 10, 2023 • 33
Text2Reward: Automated Dense Reward Function Generation for Reinforcement Learning Paper • 2309.11489 • Published Sep 20, 2023 • 2