-
EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters
Paper • 2402.04252 • Published • 29 -
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models
Paper • 2402.03749 • Published • 14 -
ScreenAI: A Vision-Language Model for UI and Infographics Understanding
Paper • 2402.04615 • Published • 44 -
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss
Paper • 2402.05008 • Published • 23
Collections
Discover the best community collections!
Collections including paper arxiv:2505.19640
-
Agent Learning via Early Experience
Paper • 2510.08558 • Published • 266 -
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey
Paper • 2509.02547 • Published • 225 -
Scaling Agents via Continual Pre-training
Paper • 2509.13310 • Published • 117 -
Agent Lightning: Train ANY AI Agents with Reinforcement Learning
Paper • 2508.03680 • Published • 121
-
Let LLMs Break Free from Overthinking via Self-Braking Tuning
Paper • 2505.14604 • Published • 23 -
AGENTIF: Benchmarking Instruction Following of Large Language Models in Agentic Scenarios
Paper • 2505.16944 • Published • 8 -
Training Step-Level Reasoning Verifiers with Formal Verification Tools
Paper • 2505.15960 • Published • 7 -
The Unreasonable Effectiveness of Entropy Minimization in LLM Reasoning
Paper • 2505.15134 • Published • 6
-
CoRAG: Collaborative Retrieval-Augmented Generation
Paper • 2504.01883 • Published • 9 -
VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning
Paper • 2504.08837 • Published • 43 -
Mavors: Multi-granularity Video Representation for Multimodal Large Language Model
Paper • 2504.10068 • Published • 30 -
xVerify: Efficient Answer Verifier for Reasoning Model Evaluations
Paper • 2504.10481 • Published • 85
-
Interleaved Reasoning for Large Language Models via Reinforcement Learning
Paper • 2505.19640 • Published • 14 -
ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning
Paper • 2510.27492 • Published • 81 -
EmbodiedOneVision: Interleaved Vision-Text-Action Pretraining for General Robot Control
Paper • 2508.21112 • Published • 77 -
SFR-DeepResearch: Towards Effective Reinforcement Learning for Autonomously Reasoning Single Agents
Paper • 2509.06283 • Published • 17
-
lusxvr/nanoVLM-222M
Image-Text-to-Text • 0.2B • Updated • 510 • 98 -
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
Paper • 2503.09516 • Published • 36 -
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time
Paper • 2505.24863 • Published • 97 -
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning
Paper • 2505.17667 • Published • 88
-
microsoft/bitnet-b1.58-2B-4T
Text Generation • 0.8B • Updated • 8.19k • 1.23k -
M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models
Paper • 2504.10449 • Published • 15 -
nvidia/Llama-3.1-Nemotron-8B-UltraLong-2M-Instruct
Text Generation • 8B • Updated • 138 • 15 -
ReTool: Reinforcement Learning for Strategic Tool Use in LLMs
Paper • 2504.11536 • Published • 63
-
Writing in the Margins: Better Inference Pattern for Long Context Retrieval
Paper • 2408.14906 • Published • 144 -
DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads
Paper • 2410.10819 • Published • 7 -
LLMtimesMapReduce: Simplified Long-Sequence Processing using Large Language Models
Paper • 2410.09342 • Published • 39 -
PDFTriage: Question Answering over Long, Structured Documents
Paper • 2309.08872 • Published • 53
-
EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters
Paper • 2402.04252 • Published • 29 -
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models
Paper • 2402.03749 • Published • 14 -
ScreenAI: A Vision-Language Model for UI and Infographics Understanding
Paper • 2402.04615 • Published • 44 -
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss
Paper • 2402.05008 • Published • 23
-
Interleaved Reasoning for Large Language Models via Reinforcement Learning
Paper • 2505.19640 • Published • 14 -
ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning
Paper • 2510.27492 • Published • 81 -
EmbodiedOneVision: Interleaved Vision-Text-Action Pretraining for General Robot Control
Paper • 2508.21112 • Published • 77 -
SFR-DeepResearch: Towards Effective Reinforcement Learning for Autonomously Reasoning Single Agents
Paper • 2509.06283 • Published • 17
-
Agent Learning via Early Experience
Paper • 2510.08558 • Published • 266 -
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey
Paper • 2509.02547 • Published • 225 -
Scaling Agents via Continual Pre-training
Paper • 2509.13310 • Published • 117 -
Agent Lightning: Train ANY AI Agents with Reinforcement Learning
Paper • 2508.03680 • Published • 121
-
lusxvr/nanoVLM-222M
Image-Text-to-Text • 0.2B • Updated • 510 • 98 -
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
Paper • 2503.09516 • Published • 36 -
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time
Paper • 2505.24863 • Published • 97 -
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning
Paper • 2505.17667 • Published • 88
-
Let LLMs Break Free from Overthinking via Self-Braking Tuning
Paper • 2505.14604 • Published • 23 -
AGENTIF: Benchmarking Instruction Following of Large Language Models in Agentic Scenarios
Paper • 2505.16944 • Published • 8 -
Training Step-Level Reasoning Verifiers with Formal Verification Tools
Paper • 2505.15960 • Published • 7 -
The Unreasonable Effectiveness of Entropy Minimization in LLM Reasoning
Paper • 2505.15134 • Published • 6
-
microsoft/bitnet-b1.58-2B-4T
Text Generation • 0.8B • Updated • 8.19k • 1.23k -
M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models
Paper • 2504.10449 • Published • 15 -
nvidia/Llama-3.1-Nemotron-8B-UltraLong-2M-Instruct
Text Generation • 8B • Updated • 138 • 15 -
ReTool: Reinforcement Learning for Strategic Tool Use in LLMs
Paper • 2504.11536 • Published • 63
-
CoRAG: Collaborative Retrieval-Augmented Generation
Paper • 2504.01883 • Published • 9 -
VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning
Paper • 2504.08837 • Published • 43 -
Mavors: Multi-granularity Video Representation for Multimodal Large Language Model
Paper • 2504.10068 • Published • 30 -
xVerify: Efficient Answer Verifier for Reasoning Model Evaluations
Paper • 2504.10481 • Published • 85
-
Writing in the Margins: Better Inference Pattern for Long Context Retrieval
Paper • 2408.14906 • Published • 144 -
DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads
Paper • 2410.10819 • Published • 7 -
LLMtimesMapReduce: Simplified Long-Sequence Processing using Large Language Models
Paper • 2410.09342 • Published • 39 -
PDFTriage: Question Answering over Long, Structured Documents
Paper • 2309.08872 • Published • 53