---
title: Rabbinic Embedding Benchmark
emoji: 📚
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.9.1
app_file: app.py
pinned: false
license: mit
datasets:
  - Sefaria/Rabbinic-Hebrew-English-Pairs
  - Sefaria/Rabbinic-Embedding-Leaderboard
---

# Rabbinic Hebrew/Aramaic Embedding Benchmark

Evaluate embedding models on cross-lingual retrieval between Hebrew/Aramaic source texts and their English translations from Sefaria.

## How It Works

Given a Hebrew/Aramaic text, can the model find its correct English translation in a pool of candidates? Strong performance on this task suggests that a model's embeddings capture the meaning of Rabbinic literature across both languages.
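
As a rough illustration of the retrieval task, the sketch below ranks candidate translations by cosine similarity to each query. It is not taken from `app.py`: the function names, the toy vectors, and the choice of cosine similarity are assumptions made for the example, and it assumes that candidate `i` is the true translation of query `i`.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def rank_of_true_translation(query_vecs, candidate_vecs):
    """For each query i, return the 1-based rank of candidate i.

    Assumption: candidate i is the correct translation of query i.
    """
    ranks = []
    for i, query in enumerate(query_vecs):
        scores = [cosine(query, cand) for cand in candidate_vecs]
        # Rank = 1 + number of candidates scoring strictly higher than the true one.
        ranks.append(1 + sum(s > scores[i] for s in scores))
    return ranks

# Toy 2-dimensional "embeddings" (hypothetical values, not model output).
queries = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
candidates = [[0.9, 0.1], [0.2, 0.8], [1.0, 0.0]]

ranks = rank_of_true_translation(queries, candidates)
print(ranks)  # [2, 1, 3]
```

A rank of 1 means the model retrieved the correct translation first; larger ranks mean other candidates scored higher.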

## Metrics

| Metric | Description |
|--------|-------------|
| **MRR** | Mean Reciprocal Rank: the average of 1/rank of the correct translation across all queries |
| **Recall@k** | Percentage of queries whose correct translation appears in the top k results |
| **Bitext Accuracy** | Accuracy at classifying true translation pairs versus randomly matched pairs |
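
The metrics in the table can be sketched in a few lines of Python. This is an illustration only, not the benchmark's own code: the function names are hypothetical, and the `bitext_accuracy` function reflects one plausible reading of the metric (the true pair must score higher than a randomly matched pair), which may differ from how `app.py` computes it.

```python
def mean_reciprocal_rank(ranks):
    """Average of 1/rank, where rank is the 1-based position of the correct translation."""
    return sum(1.0 / r for r in ranks) / len(ranks)

def recall_at_k(ranks, k):
    """Fraction of queries whose correct translation is ranked within the top k."""
    return sum(r <= k for r in ranks) / len(ranks)

def bitext_accuracy(true_scores, random_scores):
    """Fraction of queries whose true pair outscores a randomly matched pair.

    Assumed definition for illustration; the benchmark may use a different rule.
    """
    wins = sum(t > r for t, r in zip(true_scores, random_scores))
    return wins / len(true_scores)

# Hypothetical ranks of the correct translation for four queries.
example_ranks = [1, 2, 4, 1]

mrr = mean_reciprocal_rank(example_ranks)   # (1 + 1/2 + 1/4 + 1) / 4 = 0.6875
recall_1 = recall_at_k(example_ranks, 1)    # 2 of 4 queries -> 0.5
recall_3 = recall_at_k(example_ranks, 3)    # 3 of 4 queries -> 0.75

# Hypothetical similarity scores for true pairs and randomly matched pairs.
accuracy = bitext_accuracy([0.9, 0.4, 0.7], [0.2, 0.5, 0.1])  # 2 of 3 true pairs win
```

Multiply `recall_at_k` by 100 to report it as a percentage, as in the table.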

## Corpus

The benchmark uses the [Sefaria/Rabbinic-Hebrew-English-Pairs](https://huggingface.co/datasets/Sefaria/Rabbinic-Hebrew-English-Pairs) dataset, which pairs a diverse set of texts with their English translations:

- **Talmud**: Bavli & Yerushalmi
- **Mishnah**: Selected tractates
- **Midrash**: Midrash Rabbah
- **Commentary**: Rashi, Ramban, Radak, Rabbeinu Behaye
- **Philosophy**: Guide for the Perplexed, Sefer HaIkkarim
- **Hasidic/Kabbalistic**: Likutei Moharan, Tomer Devorah, Kalach Pitchei Chokhmah
- **Mussar**: Chafetz Chaim, Kav HaYashar, Iggeret HaRamban
- **Halacha**: Sefer HaChinukh, Mishneh Torah

All texts are sourced from [Sefaria](https://www.sefaria.org).

## Leaderboard

Results are stored persistently in the [Sefaria/Rabbinic-Embedding-Leaderboard](https://huggingface.co/datasets/Sefaria/Rabbinic-Embedding-Leaderboard) dataset.

## Configuration (Space Secrets)

The following environment variables can be set in Space settings:

### Required for Leaderboard Persistence

| Secret | Description |
|--------|-------------|
| `HF_TOKEN` | Hugging Face token with write access to `Sefaria/Rabbinic-Embedding-Leaderboard`. Without it, evaluations still run, but results are not saved to the leaderboard. |

### Optional for API-based Models

| Secret | Description |
|--------|-------------|
| `OPENAI_API_KEY` | For OpenAI embedding models |
| `VOYAGE_API_KEY` | For Voyage AI embedding models |
| `GEMINI_API_KEY` | For Google Gemini embedding models |

Users can also enter API keys directly in the interface (they are not stored).
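
One way an app might resolve these settings is sketched below. The environment variable names come from the tables above; everything else is an assumption for illustration, including the provider labels, the function names, and the rule that a key typed into the interface takes precedence over a Space secret. It is not the logic used in `app.py`.

```python
import os

# Variable names are from the tables above; the provider labels are hypothetical.
API_KEY_ENV_VARS = {
    "openai": "OPENAI_API_KEY",
    "voyage": "VOYAGE_API_KEY",
    "gemini": "GEMINI_API_KEY",
}

def resolve_api_key(provider, user_key=None):
    """Return the key to use for a provider, or None if none is available.

    Assumption: a key entered in the interface wins over the Space secret.
    """
    if user_key and user_key.strip():
        return user_key.strip()
    return os.environ.get(API_KEY_ENV_VARS[provider])

def can_write_leaderboard():
    """True when HF_TOKEN is set, i.e. results could be saved to the leaderboard."""
    return bool(os.environ.get("HF_TOKEN"))

# Example: the key comes from the environment unless the user supplies one.
os.environ["VOYAGE_API_KEY"] = "key-from-space-secret"
print(resolve_api_key("voyage"))                       # key-from-space-secret
print(resolve_api_key("voyage", "key-typed-by-user"))  # key-typed-by-user
```

Keeping the user-supplied key as a function argument, rather than writing it to the environment, is consistent with the note that entered keys are not stored.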

## Local Development

```bash
# Clone and install dependencies
git clone https://huggingface.co/spaces/Sefaria/Rabbinic-Embedding-Benchmark
cd Rabbinic-Embedding-Benchmark
pip install -r requirements.txt

# Run locally (leaderboard will be read-only without HF_TOKEN)
python app.py

# Or with write access to leaderboard
export HF_TOKEN=your_token_here
python app.py
```

## Related

- [Benchmark Dataset](https://huggingface.co/datasets/Sefaria/Rabbinic-Hebrew-English-Pairs)
- [Leaderboard Dataset](https://huggingface.co/datasets/Sefaria/Rabbinic-Embedding-Leaderboard)
- [Sefaria](https://www.sefaria.org)