| <!doctype html> |
| <html lang="en"> |
| <head> |
| <meta charset="UTF-8" /> |
| <meta name="viewport" content="width=device-width, initial-scale=1" /> |
| <title>MTEB Portuguese</title> |
| <meta property="og:title" content="MTEB Portuguese: Brazilian Portuguese Embedding Benchmark" /> |
| <meta property="og:description" content="54 models × 16 native PT-BR tasks. Interactive leaderboard with Pareto frontier." /> |
| <meta property="og:image" content="https://huggingface.co/spaces/mteb-pt/README/resolve/main/pareto-banner.png" /> |
| <meta property="og:url" content="https://huggingface.co/mteb-pt" /> |
| <meta name="twitter:card" content="summary_large_image" /> |
| <style> |
| :root { color-scheme: light dark; } |
| body { |
| font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Helvetica, Arial, sans-serif; |
| max-width: 720px; margin: 1.5rem auto; padding: 0 1rem; |
| line-height: 1.55; color: #1f2328; background: #fff; |
| } |
| h1 { font-size: 1.6rem; margin: 0.5rem 0 0.25rem; } |
| h2 { font-size: 1.15rem; margin: 1.4rem 0 0.5rem; } |
| a { color: #0969da; text-decoration: none; } |
| a:hover { text-decoration: underline; } |
| code { background: #f6f8fa; padding: 0.1rem 0.3rem; border-radius: 4px; font-size: 0.85em; } |
| pre { background: #f6f8fa; padding: 0.75rem; border-radius: 6px; overflow-x: auto; } |
| /* Dark-mode overrides follow the base rules so they win the cascade at equal specificity. */ |
| @media (prefers-color-scheme: dark) { |
| body { color: #e6edf3; background: #0e1117; } |
| a { color: #58a6ff; } |
| code { background: #161b22; } |
| pre { background: #161b22; } |
| h1, h2 { color: #fff; } |
| blockquote { color: #8b949e; } |
| } |
| .cta { display: flex; gap: 0.5rem; flex-wrap: wrap; margin: 0.5rem 0 1rem; } |
| .cta a { |
| display: inline-block; padding: 0.45rem 0.9rem; border-radius: 6px; |
| background: #f6f8fa; border: 1px solid #d0d7de; color: #1f2328; |
| font-weight: 500; font-size: 0.9rem; |
| } |
| .cta a.primary { background: #0969da; color: #fff; border-color: #0969da; } |
| .cta a:hover { text-decoration: none; filter: brightness(1.05); } |
| @media (prefers-color-scheme: dark) { .cta a { background: #161b22; color: #e6edf3; border-color: #30363d; } } |
| ul { padding-left: 1.2rem; } |
| .lead { color: #656d76; font-size: 1.05rem; } |
| @media (prefers-color-scheme: dark) { .lead { color: #8b949e; } } |
| .banner { width: 100%; border-radius: 8px; margin: 0.5rem 0 1rem; display: block; } |
| </style> |
| </head> |
| <body> |
| <h1>MTEB Portuguese</h1> |
| <p class="lead">A public benchmark for evaluating text embedding models on Brazilian Portuguese, built on top of the <a href="https://github.com/embeddings-benchmark/mteb">mteb</a> library.</p> |
|
|
| <h2>What you'll find here</h2> |
| <ul> |
| <li><a href="https://huggingface.co/spaces/mteb-pt/leaderboard"><b>Leaderboard</b></a>: interactive ranking of 54 models × 16 tasks, with a Pareto chart</li> |
| <li>💻 <a href="https://github.com/tardellirs/mteb-pt">GitHub repo</a>: task definitions, evaluation scripts, paper sources, issue templates</li> |
| <li><a href="https://github.com/tardellirs/mteb-pt#task-suite-16-headline-tasks">Task list &amp; sources</a>: every task linked to its original dataset or paper</li> |
| </ul> |
|
|
| <h2>Submit a model</h2> |
| <p>Two channels are available; pick whichever fits:</p> |
| <div class="cta"> |
| <a class="primary" href="https://huggingface.co/spaces/mteb-pt/leaderboard/discussions/new">💬 Open HF Discussion</a> |
| <a href="https://github.com/tardellirs/mteb-pt/issues/new?template=submit-model.yml">GitHub Issue</a> |
| </div> |
| <p>Required for a submission:</p> |
| <ol> |
| <li><code>model_id</code> (HF repo path or vendor product name)</li> |
| <li>Per-task result JSONs for the 16 headline tasks</li> |
| <li>Reproducible evaluation command</li> |
| </ol> |
| <p>We re-run a sample of each submission to verify the reported results before merging.</p> |
|
|
| <h2>Propose a new task</h2> |
| <p>Open a <a href="https://github.com/tardellirs/mteb-pt/issues/new?template=propose-task.yml">GitHub Issue with the task template</a> describing the dataset, its license and size, and evidence that the task discriminates between models. A task is accepted if it is native PT-BR (not machine-translated), has clear licensing, and discriminates between models.</p> |
|
|
| <h2>Maintainer</h2> |
| <p><b>Tardelli Stekel</b>, IFSP, São Paulo, Brazil<br> |
| ✉️ <a href="mailto:stekel@ifsp.edu.br">stekel@ifsp.edu.br</a></p> |
| <p>Contributions, corrections, and discussion are all welcome.</p> |
|
|
| <h2>Citation</h2> |
| <pre>@misc{mteb-portuguese-2026, |
| title = {MTEB Portuguese: A Massive Text Embedding Benchmark for Brazilian Portuguese}, |
| author = {Stekel, Tardelli}, |
| year = {2026}, |
| url = {https://huggingface.co/spaces/mteb-pt/leaderboard} |
| }</pre> |
|
|
| <h2>Acknowledgments</h2> |
| <p>Built on top of the <a href="https://github.com/embeddings-benchmark/mteb">mteb</a> library (Muennighoff et al., 2023). The multilingual sub-benchmark methodology follows MMTEB (Enevoldsen et al., 2025). Task datasets were contributed by their original authors; see the <a href="https://github.com/tardellirs/mteb-pt#task-suite-16-headline-tasks">task suite</a> for sources.</p> |
| </body> |
| </html> |
|
|