Translations
All about our translation models:
1. Dialect Arabic to English Model
2. English to Arabic Dialect / MSA model
3. Arabic Dialect to MSA Model
Collection by nadsoft Oct 16, 2024 -

SLM (small language models)
Collection by Ji-Xiang Apr 4 14
HuggingFaceTB/SmolLM-135M Text Generation • 0.1B • Updated Aug 1, 2024 • 180k • 261
HuggingFaceTB/SmolLM-135M-Instruct Text Generation • 0.1B • Updated Sep 4, 2024 • 31.5k • 140
HuggingFaceTB/SmolLM-360M-Instruct Text Generation • 0.4B • Updated Aug 18, 2024 • 6.24k • 85
HuggingFaceTB/SmolLM-360M Text Generation • 0.4B • Updated Aug 1, 2024 • 7.71k • 70

Darija Datasets
A list of Moroccan Darija datasets for various tasks 🤗
Collection by JasperV13 Mar 2 2
JasperV13/Darija_Dataset Viewer • Updated Jul 29, 2024 • 2.73M • 16 • 4
AbderrahmanSkiredj1/moroccan_darija_wikipedia_dataset Viewer • Updated Jul 15, 2024 • 4.86k • 14 • 6
alielfilali01/Darija-Stories-Dataset Viewer • Updated Jul 29, 2023 • 6.14k • 22 • 10
DRAGOO/dataset_dyal_darija Viewer • Updated Aug 25, 2023 • 3.62M • 12 • 5

Bielik-7B-v0.1
A collection of models based on Bielik-7B-v0.1: the base model, instruct and quantized versions, and MLX (Apple).
Collection by speakleash Mar 19 5
speakleash/Bielik-7B-v0.1 Text Generation • 7B • Updated Oct 26, 2024 • 2.82k • 74
speakleash/Bielik-7B-Instruct-v0.1 Text Generation • 7B • Updated Oct 26, 2024 • 3.8k • 64
speakleash/Bielik-7B-Instruct-v0.1-GGUF Text Generation • 7B • Updated Apr 7, 2024 • 1.12k • 15
speakleash/Bielik-7B-Instruct-v0.1-GPTQ Text Generation • 7B • Updated Apr 4, 2024 • 1.44k • 5

Coherence
Collection by Geobeat Mar 2 -
CohereLabs/Cohere-embed-multilingual-v3.0 Updated Nov 7, 2023 • 34.7k • 108
openai/whisper-large-v3 Automatic Speech Recognition • 2B • Updated Aug 12, 2024 • 5.8M • 5.9k
Finegrain Image Enhancer 🖼 • Clarity AI Upscaler Reproduction • Running on Zero • Agents • 2.14k
Open Sora ⚡ • Runtime error • Agents • 2

Multimodal Benchmarks
Collection by btjhjeon Feb 7 29
Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Model Paper • 2407.07053 • Published Jul 9, 2024 • 47
LMMs-Eval: Reality Check on the Evaluation of Large Multimodal Models Paper • 2407.12772 • Published Jul 17, 2024 • 35
VLMEvalKit: An Open-Source Toolkit for Evaluating Large Multi-Modality Models Paper • 2407.11691 • Published Jul 16, 2024 • 17
MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models Paper • 2408.02718 • Published Aug 5, 2024 • 61

Align-Anything
Collection by PKU-Alignment Apr 16, 2025 4
PKU-Alignment/align-anything Viewer • Updated Apr 5, 2025 • 69.4k • 2.22k • 48
PKU-Alignment/Align-Anything-Instruction-100K-zh Viewer • Updated Oct 10, 2024 • 105k • 62 • 10
PKU-Alignment/Align-Anything-Instruction-100K Viewer • Updated Oct 10, 2024 • 105k • 132 • 9
PKU-Alignment/Align-Anything-TI2T-Instruction-100K Viewer • Updated Nov 20, 2024 • 103k • 201 • 1

GTE models
General Text Embedding Models Released by Tongyi Lab of Alibaba Group
Collection by Alibaba-NLP Mar 2 35
Alibaba-NLP/gte-Qwen2-7B-instruct Sentence Similarity • 8B • Updated Mar 24, 2025 • 80k • 482
Alibaba-NLP/gte-Qwen2-1.5B-instruct Sentence Similarity • 2B • Updated May 28, 2025 • 750k • 236
Alibaba-NLP/gte-multilingual-base Sentence Similarity • 0.3B • Updated Jul 5, 2025 • 1.24M • 366
Alibaba-NLP/gte-multilingual-reranker-base Text Ranking • 0.3B • Updated Jul 5, 2025 • 224k • 183

RoLlama3
Collection of Romanian models based on Llama3
Collection by OpenLLM-Ro about 15 hours ago 2
OpenLLM-Ro/RoLlama3-8b-Instruct 8B • Updated 28 days ago • 1.94k • 3
OpenLLM-Ro/RoLlama3-8b-Instruct-DPO 8B • Updated 28 days ago • 42 • 1
OpenLLM-Ro/RoLlama3-8b-Instruct-2025-04-23 8B • Updated 28 days ago • 28
OpenLLM-Ro/RoLlama3-8b-Instruct-2024-10-09 8B • Updated 28 days ago • 33 • 1