Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2505.19640

EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters

Paper • 2402.04252 • Published Feb 6, 2024 • 29
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models

Paper • 2402.03749 • Published Feb 6, 2024 • 14
ScreenAI: A Vision-Language Model for UI and Infographics Understanding

Paper • 2402.04615 • Published Feb 7, 2024 • 44
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss

Paper • 2402.05008 • Published Feb 7, 2024 • 23

Agent Learning via Early Experience

Paper • 2510.08558 • Published Oct 9 • 266
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published Sep 2 • 225
Scaling Agents via Continual Pre-training

Paper • 2509.13310 • Published Sep 16 • 117
Agent Lightning: Train ANY AI Agents with Reinforcement Learning

Paper • 2508.03680 • Published Aug 5 • 121

PotentialApplication

Let LLMs Break Free from Overthinking via Self-Braking Tuning

Paper • 2505.14604 • Published May 20 • 23
AGENTIF: Benchmarking Instruction Following of Large Language Models in Agentic Scenarios

Paper • 2505.16944 • Published May 22 • 8
Training Step-Level Reasoning Verifiers with Formal Verification Tools

Paper • 2505.15960 • Published May 21 • 7
The Unreasonable Effectiveness of Entropy Minimization in LLM Reasoning

Paper • 2505.15134 • Published May 21 • 6

CoRAG: Collaborative Retrieval-Augmented Generation

Paper • 2504.01883 • Published Apr 2 • 9
VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning

Paper • 2504.08837 • Published Apr 10 • 43
Mavors: Multi-granularity Video Representation for Multimodal Large Language Model

Paper • 2504.10068 • Published Apr 14 • 30
xVerify: Efficient Answer Verifier for Reasoning Model Evaluations

Paper • 2504.10481 • Published Apr 14 • 85

Interleaved Reasoning for Large Language Models via Reinforcement Learning

Paper • 2505.19640 • Published May 26 • 14
ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning

Paper • 2510.27492 • Published Oct 30 • 81
EmbodiedOneVision: Interleaved Vision-Text-Action Pretraining for General Robot Control

Paper • 2508.21112 • Published Aug 28 • 77
SFR-DeepResearch: Towards Effective Reinforcement Learning for Autonomously Reasoning Single Agents

Paper • 2509.06283 • Published Sep 8 • 17

about 7 hours ago

lusxvr/nanoVLM-222M

Image-Text-to-Text • 0.2B • Updated May 8 • 510 • 98
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning

Paper • 2503.09516 • Published Mar 12 • 36
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time

Paper • 2505.24863 • Published May 30 • 97
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning

Paper • 2505.17667 • Published May 23 • 88

microsoft/bitnet-b1.58-2B-4T

Text Generation • 0.8B • Updated May 1 • 8.19k • 1.23k
M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models

Paper • 2504.10449 • Published Apr 14 • 15
nvidia/Llama-3.1-Nemotron-8B-UltraLong-2M-Instruct

Text Generation • 8B • Updated Apr 17 • 138 • 15
ReTool: Reinforcement Learning for Strategic Tool Use in LLMs

Paper • 2504.11536 • Published Apr 15 • 63

about 11 hours ago

Writing in the Margins: Better Inference Pattern for Long Context Retrieval

Paper • 2408.14906 • Published Aug 27, 2024 • 144
DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads

Paper • 2410.10819 • Published Oct 14, 2024 • 7
LLMtimesMapReduce: Simplified Long-Sequence Processing using Large Language Models

Paper • 2410.09342 • Published Oct 12, 2024 • 39
PDFTriage: Question Answering over Long, Structured Documents

Paper • 2309.08872 • Published Sep 16, 2023 • 53

EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters

Paper • 2402.04252 • Published Feb 6, 2024 • 29
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models

Paper • 2402.03749 • Published Feb 6, 2024 • 14
ScreenAI: A Vision-Language Model for UI and Infographics Understanding

Paper • 2402.04615 • Published Feb 7, 2024 • 44
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss

Paper • 2402.05008 • Published Feb 7, 2024 • 23

Interleaved Reasoning for Large Language Models via Reinforcement Learning

Paper • 2505.19640 • Published May 26 • 14
ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning

Paper • 2510.27492 • Published Oct 30 • 81
EmbodiedOneVision: Interleaved Vision-Text-Action Pretraining for General Robot Control

Paper • 2508.21112 • Published Aug 28 • 77
SFR-DeepResearch: Towards Effective Reinforcement Learning for Autonomously Reasoning Single Agents

Paper • 2509.06283 • Published Sep 8 • 17

Agent Learning via Early Experience

Paper • 2510.08558 • Published Oct 9 • 266
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published Sep 2 • 225
Scaling Agents via Continual Pre-training

Paper • 2509.13310 • Published Sep 16 • 117
Agent Lightning: Train ANY AI Agents with Reinforcement Learning

Paper • 2508.03680 • Published Aug 5 • 121

about 7 hours ago

lusxvr/nanoVLM-222M

Image-Text-to-Text • 0.2B • Updated May 8 • 510 • 98
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning

Paper • 2503.09516 • Published Mar 12 • 36
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time

Paper • 2505.24863 • Published May 30 • 97
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning

Paper • 2505.17667 • Published May 23 • 88

PotentialApplication

Let LLMs Break Free from Overthinking via Self-Braking Tuning

Paper • 2505.14604 • Published May 20 • 23
AGENTIF: Benchmarking Instruction Following of Large Language Models in Agentic Scenarios

Paper • 2505.16944 • Published May 22 • 8
Training Step-Level Reasoning Verifiers with Formal Verification Tools

Paper • 2505.15960 • Published May 21 • 7
The Unreasonable Effectiveness of Entropy Minimization in LLM Reasoning

Paper • 2505.15134 • Published May 21 • 6

microsoft/bitnet-b1.58-2B-4T

Text Generation • 0.8B • Updated May 1 • 8.19k • 1.23k
M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models

Paper • 2504.10449 • Published Apr 14 • 15
nvidia/Llama-3.1-Nemotron-8B-UltraLong-2M-Instruct

Text Generation • 8B • Updated Apr 17 • 138 • 15
ReTool: Reinforcement Learning for Strategic Tool Use in LLMs

Paper • 2504.11536 • Published Apr 15 • 63

CoRAG: Collaborative Retrieval-Augmented Generation

Paper • 2504.01883 • Published Apr 2 • 9
VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning

Paper • 2504.08837 • Published Apr 10 • 43
Mavors: Multi-granularity Video Representation for Multimodal Large Language Model

Paper • 2504.10068 • Published Apr 14 • 30
xVerify: Efficient Answer Verifier for Reasoning Model Evaluations

Paper • 2504.10481 • Published Apr 14 • 85

about 11 hours ago

Writing in the Margins: Better Inference Pattern for Long Context Retrieval

Paper • 2408.14906 • Published Aug 27, 2024 • 144
DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads

Paper • 2410.10819 • Published Oct 14, 2024 • 7
LLMtimesMapReduce: Simplified Long-Sequence Processing using Large Language Models

Paper • 2410.09342 • Published Oct 12, 2024 • 39
PDFTriage: Question Answering over Long, Structured Documents

Paper • 2309.08872 • Published Sep 16, 2023 • 53

Company

TOS Privacy About Careers

Website

Models Datasets Spaces Pricing Docs