---
title: MTEB Portuguese
emoji: π
colorFrom: green
colorTo: yellow
sdk: static
pinned: false
license: apache-2.0
---
# MTEB Portuguese
A public benchmark for evaluating text embedding models on **Brazilian Portuguese**, built as a thin extension on top of the [`mteb`](https://github.com/embeddings-benchmark/mteb) library.
## What you'll find here
- **[Leaderboard](https://huggingface.co/spaces/mteb-pt/leaderboard)**: interactive ranking, 54 models × 16 tasks, Pareto chart
- **[`mteb-pt-results`](https://huggingface.co/datasets/mteb-pt/mteb-pt-results)**: all per-task JSON files and per-query Parquet files, ~1,100 files
- **[GitHub repo](https://github.com/tardellirs/mteb-pt)**: task definitions, evaluation scripts, paper sources, issue templates
## Submit a model
We accept submissions via either channel; pick whichever fits:
- [HF Discussion on the results dataset](https://huggingface.co/datasets/mteb-pt/mteb-pt-results/discussions/new)
- [GitHub Issue with the model template](https://github.com/tardellirs/mteb-pt/issues/new?template=submit-model.yml)
Required for a submission:
1. `model_id` (HF repo path or vendor product name)
2. Per-task result JSONs for the 16 headline tasks
3. Reproducible evaluation command
We re-run a sample of each submission to verify before merging.
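Before opening a submission, it can help to check locally that the three required items are present. The sketch below is only an illustration: the one-file-per-task layout, the `<task_name>.json` naming, and the helper `check_submission` are assumptions, not the repository's actual validation logic. The real list of 16 headline task names should be taken from the GitHub repo.

```python
import json
from pathlib import Path


def check_submission(model_id, results_dir, expected_tasks, eval_command):
    """Return a list of problems; an empty list means the bundle looks complete.

    Assumed (hypothetical) layout: one file per task at
    <results_dir>/<task_name>.json.
    """
    problems = []

    # Items 1 and 3 of the checklist: both must be non-empty strings.
    if not model_id.strip():
        problems.append("model_id is empty")
    if not eval_command.strip():
        problems.append("evaluation command is empty")

    # Item 2: every expected task needs a result file that parses as JSON.
    results_dir = Path(results_dir)
    for task in expected_tasks:
        path = results_dir / f"{task}.json"
        if not path.is_file():
            problems.append(f"missing result file for task: {task}")
            continue
        try:
            json.loads(path.read_text(encoding="utf-8"))
        except json.JSONDecodeError:
            problems.append(f"result file is not valid JSON: {path.name}")

    return problems
```

Calling `check_submission(model_id, "results/", task_names, command)` with your own values and getting back an empty list suggests the bundle is complete. It does not replace the maintainers' re-run.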
## Propose a new task
Open a [GitHub Issue with the task template](https://github.com/tardellirs/mteb-pt/issues/new?template=propose-task.yml) describing the dataset, license, size, and discrimination evidence. A task is accepted if it's native PT-BR (not machine-translated), has clear licensing, and discriminates between models.
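This page does not define how discrimination is measured. One simple way to summarize such evidence, assuming you already have one main score per model on the candidate task, is to report how far apart the scores are: a task on which every model scores nearly the same adds little to a ranking. The function name `score_spread`, the input format, and the choice of range and standard deviation are illustrative assumptions, not the maintainers' acceptance criterion.

```python
from statistics import pstdev


def score_spread(scores_by_model):
    """Summarize how far apart models score on a single task.

    scores_by_model: dict mapping model name -> main score
    (hypothetical input format, one score per model).
    """
    if len(scores_by_model) < 2:
        raise ValueError("need scores from at least two models")

    values = list(scores_by_model.values())
    return {
        "n_models": len(values),
        # Gap between the best and worst model on this task.
        "range": max(values) - min(values),
        # Population standard deviation across the evaluated models.
        "std": pstdev(values),
    }
```

Reporting these numbers for a handful of models of different sizes, alongside the raw per-model scores, gives reviewers something concrete to judge.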
## Maintainer
**Tardelli Stekel**, IFSP, São Paulo, Brazil
Email: <stekel@ifsp.edu.br>
Contributions, corrections, and discussion all welcome.
## Citation
```bibtex
@misc{mteb-portuguese-2026,
title = {MTEB Portuguese: A Massive Text Embedding Benchmark for Brazilian Portuguese},
author = {Stekel, Tardelli},
year = {2026},
url = {https://huggingface.co/spaces/mteb-pt/leaderboard}
}
```
## Acknowledgments
Built on top of the [`mteb`](https://github.com/embeddings-benchmark/mteb) library by Enevoldsen et al. (2025). Task datasets contributed by their original authors. Compute provided by Modal.