- EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters
  Paper • 2402.04252 • Published • 29
- Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models
  Paper • 2402.03749 • Published • 14
- ScreenAI: A Vision-Language Model for UI and Infographics Understanding
  Paper • 2402.04615 • Published • 44
- EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss
  Paper • 2402.05008 • Published • 23
Collections
Collections including paper arxiv:2601.04720
- Textbooks Are All You Need
  Paper • 2306.11644 • Published • 152
- Self-Improving VLM Judges Without Human Annotations
  Paper • 2512.05145 • Published • 19
- FFP-300K: Scaling First-Frame Propagation for Generalizable Video Editing
  Paper • 2601.01720 • Published • 5
- MM-CRITIC: A Holistic Evaluation of Large Multimodal Models as Multimodal Critique
  Paper • 2511.09067 • Published • 2

- Rank1: Test-Time Compute for Reranking in Information Retrieval
  Paper • 2502.18418 • Published • 28
- Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
  Paper • 2503.09516 • Published • 36
- Fixing Data That Hurts Performance: Cascading LLMs to Relabel Hard Negatives for Robust Information Retrieval
  Paper • 2505.16967 • Published • 24
- SitEmb-v1.5: Improved Context-Aware Dense Retrieval for Semantic Association and Long Story Comprehension
  Paper • 2508.01959 • Published • 59

- Can Large Language Models Understand Context?
  Paper • 2402.00858 • Published • 23
- OLMo: Accelerating the Science of Language Models
  Paper • 2402.00838 • Published • 85
- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 151
- SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity
  Paper • 2401.17072 • Published • 25

- microsoft/bitnet-b1.58-2B-4T
  Text Generation • 0.8B • Updated • 5.28k • 1.25k
- M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models
  Paper • 2504.10449 • Published • 15
- nvidia/Llama-3.1-Nemotron-8B-UltraLong-2M-Instruct
  Text Generation • 8B • Updated • 363 • 15
- ReTool: Reinforcement Learning for Strategic Tool Use in LLMs
  Paper • 2504.11536 • Published • 63

- GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models
  Paper • 2508.06471 • Published • 199
- Cognitive Kernel-Pro: A Framework for Deep Research Agents and Agent Foundation Models Training
  Paper • 2508.00414 • Published • 93
- Continuous Autoregressive Language Models
  Paper • 2510.27688 • Published • 71
- MiMo-Embodied: X-Embodied Foundation Model Technical Report
  Paper • 2511.16518 • Published • 25