Article How We Built a Semantic Highlight Model To Save Token Cost for RAG 11 days ago • 59
KoViDoRe Benchmark (BEIR) v2 Collection Korean Vision Document Retrieval Benchmark • 6 items • Updated 11 days ago • 5
Article Nano-BEIR: A Multilingual Information Retrieval Benchmark with Quality-Enhanced Queries Dec 22, 2025 • 6
KaLM-Embedding: Superior Training Data Brings A Stronger Embedding Model Paper • 2501.01028 • Published Jan 2, 2025 • 19
Tarka Embed V1 Collection Efficient DFKD embeddings for language understanding • 5 items • Updated Dec 17, 2025 • 6
Black-Box On-Policy Distillation of Large Language Models Paper • 2511.10643 • Published Nov 13, 2025 • 51
Preserving Multilingual Quality While Tuning Query Encoder on English Only Paper • 2407.00923 • Published Jul 1, 2024 • 1
Llama-Embed-Nemotron-8B: A Universal Text Embedding Model for Multilingual and Cross-Lingual Tasks Paper • 2511.07025 • Published Nov 10, 2025 • 13
Nemotron RAG Collection Set of tools to build retrieval-augmented generation (RAG) systems, improve search and ranking accuracy, and extract structured data from complex do • 11 items • Updated 6 days ago • 64
Pushing on Multilingual Reasoning Models with Language-Mixed Chain-of-Thought Paper • 2510.04230 • Published Oct 5, 2025 • 27
Less is More: Recursive Reasoning with Tiny Networks Paper • 2510.04871 • Published Oct 6, 2025 • 507
jina-embeddings-v4: Universal Embeddings for Multimodal Multilingual Retrieval Paper • 2506.18902 • Published Jun 23, 2025 • 12
DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search Paper • 2509.25454 • Published Sep 29, 2025 • 143
MMR1: Enhancing Multimodal Reasoning with Variance-Aware Sampling and Open Resources Paper • 2509.21268 • Published Sep 25, 2025 • 104
Direct Language Model Alignment from Online AI Feedback Paper • 2402.04792 • Published Feb 7, 2024 • 34