Sub-5b Parameter T2T LLMs

d0zz0d's Collections

updated 14 days ago

4B is the new 7B. I meant to make separate sub-3B and sub-4B collections, but in the end I decided to combine them into a single collection of text-to-text LLMs.

ibm-granite/granite-4.0-micro

Text Generation • Updated Nov 3, 2025 • 317k • 271

Note: traditional transformer model
ibm-granite/granite-4.0-h-micro

Text Generation • 3B • Updated Nov 3, 2025 • 6.11k • 144

Note: h stands for hybrid; it combines transformer and Mamba layers
meta-llama/Llama-3.2-3B-Instruct

Text Generation • 3B • Updated Oct 24, 2024 • 2.37M • 2.13k
HuggingFaceTB/SmolLM3-3B

Text Generation • 3B • Updated Sep 10, 2025 • 161k • 952

Note: biggest glow-up compared to SmolLM2
tiiuae/Falcon-H1-3B-Instruct

Text Generation • 3B • Updated Jul 31, 2025 • 5.48k • 14
microsoft/Phi-4-mini-instruct

Text Generation • Updated Dec 10, 2025 • 1.53M • 739
microsoft/Phi-4-mini-flash-reasoning

Text Generation • Updated Dec 10, 2025 • 838 • 275

Note: just an updated version of Phi-4-mini-reasoning. Phi-4.5-mini when?
Nanbeige/Nanbeige4.1-3B

Text Generation • 4B • Updated Mar 25 • 231k • 1.11k

Note: better than the 2511 release
Qwen/Qwen3-4B-Instruct-2507

Text Generation • 4B • Updated Sep 17, 2025 • 10.9M • 841

Note: literally the end-game 4B-parameter model.
Qwen/Qwen3-4B-Thinking-2507

Text Generation • 4B • Updated Aug 6, 2025 • 494k • 585

Note: literally the end-game 4B-parameter model. Now with reasoning!
Tiiny/SmallThinker-4BA0.6B-Instruct

Text Generation • 4B • Updated Jul 31, 2025 • 533 • 60

Note: wow! A tiny MoE!
arcee-ai/AFM-4.5B

Text Generation • 5B • Updated Sep 17, 2025 • 2.42k • 95
tencent/Hunyuan-4B-Instruct

Text Generation • 4B • Updated Oct 31, 2025 • 2.15k • 30
LiquidAI/LFM2-2.6B-Exp

Text Generation • 3B • Updated Mar 30 • 6.24k • 339
CohereLabs/tiny-aya-global

Text Generation • 3B • Updated 12 days ago • 16.6k • 152
ai21labs/AI21-Jamba2-3B

Text Generation • Updated Feb 2 • 977 • 42
nvidia/NVIDIA-Nemotron-3-Nano-4B-BF16

Text Generation • Updated Mar 20 • 97.8k • 85
shoumenchougou/RWKV7-G1e-2.9b-GGUF

3B • Updated Mar 19 • 343 • 6
ibm-granite/granite-4.1-3b

Text Generation • 3B • Updated 9 days ago • 16.8k • 60