Small Language Models Collection
Below is a list of small language models suitable for various tasks:
| Model Name | Task/Capability | Hugging Face Link |
|---|---|---|
| BERT Base | General Text Classification | https://huggingface.co/bert-base-uncased |
| DistilBERT | Efficient Text Classification | https://huggingface.co/distilbert-base-uncased |
| RoBERTa Base | Advanced Text Classification | https://huggingface.co/roberta-base |
| ALBERT Base | Efficient Large-Scale Classification | https://huggingface.co/albert-base-v2 |
| T5 Small | Text-to-Text Generation | https://huggingface.co/t5-small |
| T5 Base | General Text-to-Text Tasks | https://huggingface.co/t5-base |
| T5 Large | Advanced Text-to-Text Generation | https://huggingface.co/t5-large |
| Longformer Base | Long-Sequence Text Processing | https://huggingface.co/allenai/longformer-base-4096 |
| BigBird Base | Long-Sequence Text Processing | https://huggingface.co/google/bigbird-roberta-base |
| Reformer Base | Efficient Long-Sequence Processing | https://huggingface.co/google/reformer-enwik8 |
| BART Base | Text Summarization and Generation | https://huggingface.co/facebook/bart-base |
| ProphetNet Base | Sequence-to-Sequence Generation (Future N-gram Prediction) | https://huggingface.co/microsoft/prophetnet-large-uncased (Note: this link is for the large checkpoint) |
| PPLM Base | Controlled Text Generation | https://huggingface.co/decapoda-research/llama-7b-hf (Note: PPLM is a decoding method applied to a pretrained language model, not a standalone checkpoint; this link is for a general-purpose language model) |
| DeBERTa Base | Advanced Sentiment Analysis | https://huggingface.co/microsoft/deberta-base |
| DeBERTa Large | High-Accuracy Sentiment Analysis | https://huggingface.co/microsoft/deberta-large |
| XLM-R Base | Multilingual Text Classification | https://huggingface.co/xlm-roberta-base |
| XLM-R Large | Advanced Multilingual Tasks | https://huggingface.co/xlm-roberta-large |
| MarianMT | Machine Translation | https://huggingface.co/Helsinki-NLP/opus-mt-en-fr |
| CamemBERT | French Language Tasks | https://huggingface.co/camembert-base |
| FlauBERT | French Language Tasks | https://huggingface.co/flaubert/flaubert_base_uncased |
| DistilCamemBERT | Efficient French Tasks | https://huggingface.co/cmarkea/distilcamembert-base |
| BART Large | Advanced Text Summarization | https://huggingface.co/facebook/bart-large |
| ProphetNet Large | Advanced Sequence-to-Sequence Generation (Future N-gram Prediction) | https://huggingface.co/microsoft/prophetnet-large-uncased |
| T5 3B | Large-Scale Text-to-Text Generation | https://huggingface.co/t5-3b |
| T5 11B | High-Capacity Text-to-Text Generation | https://huggingface.co/t5-11b |
| LLaMA 7B | Large-Scale General Tasks | https://huggingface.co/decapoda-research/llama-7b-hf |
| LLaMA 13B | High-Capacity General Tasks | https://huggingface.co/decapoda-research/llama-13b-hf |
| OPT 175B | Very Large-Scale General Tasks | https://huggingface.co/facebook/opt-175b |
| OPT 2.7B | Large-Scale General Tasks | https://huggingface.co/facebook/opt-2.7b |
| OPT 6.7B | High-Capacity General Tasks | https://huggingface.co/facebook/opt-6.7b |
| OPT 13B | Advanced General Tasks | https://huggingface.co/facebook/opt-13b |
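
The links in the table can be turned into the repository IDs that the `transformers` library expects. The following is a minimal sketch, not part of the original collection: the helper names `repo_id_from_url` and `load_encoder` are hypothetical, and it assumes the `transformers` package is installed and that the linked repository is publicly accessible.

```python
from urllib.parse import urlparse


def repo_id_from_url(url: str) -> str:
    """Return the Hub repository ID (the path after huggingface.co/) for a model link."""
    parsed = urlparse(url)
    if parsed.netloc != "huggingface.co":
        raise ValueError(f"not a Hugging Face Hub URL: {url}")
    repo_id = parsed.path.strip("/")
    if not repo_id:
        raise ValueError(f"no repository ID in URL: {url}")
    return repo_id


def load_encoder(url: str):
    """Load the tokenizer and base model behind a link from the table.

    Assumption: `transformers` is installed and the Hub is reachable.
    The import is deferred so that repo_id_from_url works without it.
    """
    from transformers import AutoModel, AutoTokenizer

    repo_id = repo_id_from_url(url)
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModel.from_pretrained(repo_id)
    return tokenizer, model
```

For example, `load_encoder("https://huggingface.co/distilbert-base-uncased")` would download the DistilBERT tokenizer and weights on first use. The larger entries in the table need far more memory and disk space, so it is probably wise to try one of the small checkpoints first.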