From Pixels to Words -- Towards Native One-Vision Models at Scale Paper • 2605.28820 • Published 29 days ago • 75
Pythagoras-Prover: Advancing Efficient Formal Proving via Augmented Lean Formalisation Paper • 2606.12594 • Published 15 days ago • 17
Qwen-VLA: Unifying Vision-Language-Action Modeling across Tasks, Environments, and Robot Embodiments Paper • 2605.30280 • Published 28 days ago • 146
Training Long-Context Vision-Language Models Effectively with Generalization Beyond 128K Context Paper • 2605.13831 • Published May 13 • 88
Can I Have Your Order? Monte-Carlo Tree Search for Slot Filling Ordering in Diffusion Language Models Paper • 2602.12586 • Published Feb 13 • 2
Learning GUI Grounding with Spatial Reasoning from Visual Feedback Paper • 2509.21552 • Published Sep 25, 2025 • 11
NeuralOS: Towards Simulating Operating Systems via Neural Generative Models Paper • 2507.08800 • Published Jul 11, 2025 • 81
Unleashing the Reasoning Potential of Pre-trained LLMs by Critique Fine-Tuning on One Problem Paper • 2506.03295 • Published Jun 3, 2025 • 17
MMLongBench: Benchmarking Long-Context Vision-Language Models Effectively and Thoroughly Paper • 2505.10610 • Published May 15, 2025 • 56
Article 🦸🏻#1: Open-endedness and AI Agents - A Path from Generative to Creative AI? Kseniase • Dec 25, 2024 • 16
Q-Filters: Leveraging QK Geometry for Efficient KV Cache Compression Paper • 2503.02812 • Published Mar 4, 2025 • 10
Q-Filters Collection Pre-computed Q-Filters for efficient KV cache compression. • 15 items • Updated Mar 3, 2025 • 7