view article Article ATE-2: State-of-the-Art Armenian Text Embeddings and the ArmBench-TextEmbed Benchmark Metric-AI • Mar 19 • 8
PP-OCRv5 Collection PP-OCRv5 is the latest text recognition solution, supporting Simplified Chinese, Chinese Pinyin, Traditional Chinese, English, and Japanese • 13 items • Updated about 6 hours ago • 56
view article Article mmBERT: ModernBERT goes Multilingual +4 mmarone, orionweller, will-fleshman, eugene-yang, dlawrie, vandurme • Sep 9, 2025 • 146
Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth Paper • 2509.03867 • Published Sep 4, 2025 • 213
MMTEB: Massive Multilingual Text Embedding Benchmark Paper • 2502.13595 • Published Feb 19, 2025 • 48
view article Article Welcome EmbeddingGemma, Google's new efficient embedding model +4 tomaarsen, Xenova, alvarobartt, ariG23498, pcuenq, sergiopaniego • Sep 4, 2025 • 273
Health AI Developer Foundations (HAI-DEF) Collection Groups models released for use in health AI by Google. Read more about HAI-DEF at http://goo.gle/hai-def • 22 items • Updated Mar 12 • 217
Tucan Collection A series of open-source Bulgarian language models fine-tuned for function calling and tool use. 2.6B, 9B, and 27B parameter variants. • 12 items • Updated Jul 1, 2025 • 1
view article Article Train 400x faster Static Embedding Models with Sentence Transformers tomaarsen • Jan 15, 2025 • 229
CoLLM: A Large Language Model for Composed Image Retrieval Paper • 2503.19910 • Published Mar 25, 2025 • 15
view article Article Training and Finetuning Embedding Models with Sentence Transformers tomaarsen • May 28, 2024 • 274
view article Article Training and Finetuning Reranker Models with Sentence Transformers tomaarsen • Mar 26, 2025 • 193
SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion Paper • 2503.11576 • Published Mar 14, 2025 • 157
view article Article PaliGemma 2 Mix - New Instruction Vision Language Models by Google +1 merve, ariG23498, andsteing • Feb 19, 2025 • 75
view article Article SigLIP 2: A better multilingual vision language encoder +1 ariG23498, merve, qubvel-hf • Feb 21, 2025 • 212
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models Paper • 2402.03300 • Published Feb 5, 2024 • 145