Qwen-AgentWorld: Language World Models for General Agents Paper • 2606.24597 • Published 4 days ago • 124
EvoEmbedding: Evolvable Representations for Long-Context Retrieval and Agentic Memory Paper • 2606.21649 • Published 8 days ago • 29
KaLM-Reranker-V1: Fast but Not Late Interaction for Compressed Document Reranking Paper • 2606.22807 • Published 5 days ago • 44
FastContext: Training Efficient Repository Explorer for Coding Agents Paper • 2606.14066 • Published 15 days ago • 91
SearchSwarm: Towards Delegation Intelligence in Agentic LLMs for Long-Horizon Deep Research Paper • 2606.09730 • Published 19 days ago • 52
Embedding Model Datasets Collection • A curated subset of the datasets that work out of the box with Sentence Transformers: https://huggingface.co/datasets?other=sentence-transformers • 70 items • Updated Dec 10, 2025 • 173
SEA-Embedding: Open and Reproducible Text Embeddings for Southeast Asia Paper • 2606.03027 • Published 25 days ago • 1
GrepSeek: Training Search Agents for Direct Corpus Interaction Paper • 2605.29307 • Published 30 days ago • 113
MiniCPM RAG Suite Collection • Embedding, re-ranking, generation -- the cornerstone of RAG. • 7 items • Updated May 24 • 18
MMTEB: Massive Multilingual Text Embedding Benchmark Paper • 2502.13595 • Published Feb 19, 2025 • 49
ToolOmni: Enabling Open-World Tool Use via Agentic learning with Proactive Retrieval and Grounded Execution Paper • 2604.13787 • Published Apr 15 • 2
jina-embeddings-v5-omni: Text-Geometry-Preserving Multimodal Embeddings via Frozen-Tower Composition Paper • 2605.08384 • Published May 8 • 11