PhotoBench is the first benchmark constructed from authentic, personal albums, designed to shift the paradigm from visual matching to personalized multi-source intent-driven photo retrieval.
PhotoBench-Protected is the limited-information release: only pre-computed captions, embeddings, and metadata are provided, so this leaderboard focuses exclusively on agent planning ability.
⚠️ Please confirm you are submitting to the correct leaderboard.
The test sets for PhotoBench-Protected and PhotoBench (full) are different.
For unrestricted retrieval with raw images, please use the full PhotoBench leaderboard.
Full dataset download: obox.
"""
SUBMISSION_GUIDE = """
### Submission Format
The dataset provides `test.json` per album. You must **combine all albums into a single JSON array** and add the `album_id` field to each query before submitting.
**Example transformation:**
```python
import json

submission = []
for album_id in ["1", "2", "3"]:
    # Each album ships its own test.json
    with open(f"protected/album{album_id}/test.json") as f:
        queries = json.load(f)
    for q in queries:
        submission.append({
            "album_id": album_id,        # must be a string, not an int
            "query_en": q["query_en"],   # copy verbatim from test.json
            # Placeholder: replace with your ranked predictions for this query
            "pred": ["IMG_0001.jpg", "IMG_0002.jpg"],
        })

with open("submission.json", "w") as f:
    json.dump(submission, f, indent=2)
```
**Final submission format:**
```json
[
{
"album_id": "1",
"query_en": "cluttered desk",
"pred": ["IMG_1234.jpg", "IMG_5678.jpg", ...]
}
]
```
**Required fields:**
- `album_id`: Album number as a string (`"1"`, `"2"`, or `"3"`)
- `query_en`: The English query text (must match exactly, case-sensitive)
- `pred`: Ordered list of predicted image filenames (order matters for NDCG)
You may submit results for any subset of albums. Partial submissions are accepted and evaluated, but **only full submissions** (all 3 albums, all test queries) are eligible for public leaderboard ranking.
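Before uploading, it can help to check the file against the required fields listed above. The following is a minimal sketch, not the leaderboard's own validator: the helper name `validate_submission` and its error messages are hypothetical, and the server may apply additional checks (for example, matching `query_en` against the test set).

```python
REQUIRED_FIELDS = ("album_id", "query_en", "pred")
VALID_ALBUM_IDS = ("1", "2", "3")  # album IDs are strings, per the field list


def validate_submission(entries):
    # Hypothetical helper: returns a list of problems (empty if none were found).
    if not isinstance(entries, list):
        return ["top-level JSON value must be an array"]
    problems = []
    for i, entry in enumerate(entries):
        if not isinstance(entry, dict):
            problems.append(f"entry {i}: expected a JSON object")
            continue
        missing = [key for key in REQUIRED_FIELDS if key not in entry]
        if missing:
            problems.append(f"entry {i}: missing fields {missing}")
            continue
        if entry["album_id"] not in VALID_ALBUM_IDS:
            problems.append(f"entry {i}: album_id must be the string 1, 2, or 3")
        if not isinstance(entry["query_en"], str):
            problems.append(f"entry {i}: query_en must be a string")
        pred = entry["pred"]
        if not isinstance(pred, list) or not all(isinstance(p, str) for p in pred):
            problems.append(f"entry {i}: pred must be a list of filename strings")
    return problems
```

To use it, load `submission.json` with `json.load` and pass the result to `validate_submission`; an empty list means no structural problems were detected by this sketch.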
"""
EVALUATION_INFO = """
### Evaluation Metrics
| Metric | Description |
|--------|-------------|
| **Recall@k** | Proportion of ground-truth images found in top-k predictions |
| **NDCG@k** | Normalized Discounted Cumulative Gain at rank k |
Supported k values: **1, 5, 10, 20, 50, 100**
Results are averaged across all evaluated queries per album, then averaged across albums for the final leaderboard score.
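For local sanity checks, the metrics and the two-stage averaging described above can be approximated as follows. This is a sketch under stated assumptions, not the official scorer: it assumes binary relevance for NDCG (gain 1 for a ground-truth image, 0 otherwise), assumes `pred` contains no duplicate filenames, and the function names are illustrative.

```python
import math


def recall_at_k(pred, truth, k):
    # Fraction of ground-truth images that appear in the top-k predictions.
    truth = set(truth)
    if not truth:
        return 0.0
    return len(truth.intersection(pred[:k])) / len(truth)


def ndcg_at_k(pred, truth, k):
    # Assumption: binary relevance, with a log2 rank discount.
    truth = set(truth)
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, name in enumerate(pred[:k])
        if name in truth
    )
    # Ideal ranking places every ground-truth image first.
    ideal = sum(1.0 / math.log2(rank + 2) for rank in range(min(k, len(truth))))
    return dcg / ideal if ideal > 0 else 0.0


def leaderboard_score(per_album_scores):
    # per_album_scores maps album_id -> list of per-query scores.
    # Average within each album first, then across albums.
    album_means = [sum(s) / len(s) for s in per_album_scores.values() if s]
    return sum(album_means) / len(album_means) if album_means else 0.0
```

Because scores are averaged per album before being averaged across albums, each album carries equal weight regardless of how many queries it contains.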
"""
custom_css = """
/* Grass-green clean theme */
body {
background: #f5f9f0 !important;
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, 'Helvetica Neue', Arial, sans-serif !important;
font-size: 17px !important;
}
/* Tab buttons */
.tab-buttons button {
font-weight: 500 !important;
font-size: 0.9em !important;
border-radius: 10px 10px 0 0 !important;
padding: 12px 24px !important;
background: #e0ead8 !important;
color: #555 !important;
border: none !important;
transition: all 0.25s ease !important;
}
.tab-buttons button.selected {
background: #fff !important;
color: #1a1a1a !important;
box-shadow: 0 -2px 0 #7CB342 inset !important;
}
/* Primary buttons */
.gr-button-primary {
background: #7CB342 !important;
border: none !important;
border-radius: 10px !important;
color: #fff !important;
font-weight: 600 !important;
font-size: 0.95em !important;
padding: 12px 28px !important;
transition: all 0.25s ease !important;
}
.gr-button-primary:hover {
background: #6ba32e !important;
transform: translateY(-1px) !important;
box-shadow: 0 6px 20px rgba(124,179,66,0.25) !important;
}
/* Markdown */
.markdown-text {
max-width: 780px;
margin: 0 auto;
color: #333;
line-height: 1.8;
font-size: 1.05em;
}
/* DataFrame Table */
.gr-dataframe {
border-radius: 14px !important;
overflow: hidden !important;
box-shadow: 0 2px 16px rgba(0,0,0,0.06) !important;
border: 1px solid #d4e0c8 !important;
font-size: 0.95em !important;
}
.gr-dataframe th {
background: #e8f0e0 !important;
color: #444 !important;
font-weight: 600 !important;
font-size: 0.8em !important;
text-transform: uppercase !important;
letter-spacing: 0.5px !important;
padding: 14px 10px !important;
border-bottom: 2px solid #d4e0c8 !important;
}
.gr-dataframe td {
padding: 12px 10px !important;
border-bottom: 1px solid #e0ead8 !important;
color: #333 !important;
}
.gr-dataframe tr:hover td {
background: #f0f7e8 !important;
}
/* Inputs */
input, textarea, select {
border-radius: 10px !important;
border: 1px solid #c4d4b4 !important;
background: #fff !important;
font-size: 1em !important;
padding: 10px 14px !important;
}
input:focus, textarea:focus, select:focus {
border-color: #7CB342 !important;
box-shadow: 0 0 0 3px rgba(124,179,66,0.12) !important;
outline: none !important;
}
/* Form containers */
.gr-form .gr-box {
border-radius: 14px !important;
background: #fff !important;
border: 1px solid #d4e0c8 !important;
padding: 24px !important;
}
/* Labels */
.gr-input-label, .gr-dropdown-label {
font-weight: 500 !important;
color: #444 !important;
font-size: 0.9em !important;
margin-bottom: 6px !important;
}
/* JSON output */
.gr-json {
border-radius: 12px !important;
background: #f5f9f0 !important;
border: 1px solid #d4e0c8 !important;
font-size: 0.9em !important;
}
/* Center submit form */
#submit-form-container {
max-width: 600px;
margin: 0 auto;
}
/* Section headers */
.gr-tab-item h3 {
color: #1a1a1a !important;
font-weight: 600 !important;
font-size: 1.2em !important;
margin-top: 24px;
margin-bottom: 12px;
}
/* Toggle switch for opt-in checkbox */
.toggle-switch input[type="checkbox"] {
appearance: none !important;
-webkit-appearance: none !important;
width: 44px !important;
height: 24px !important;
background: #ccc !important;
border-radius: 12px !important;
position: relative !important;
cursor: pointer !important;
outline: none !important;
transition: background 0.3s ease !important;
flex-shrink: 0 !important;
vertical-align: middle !important;
}
.toggle-switch input[type="checkbox"]::after {
content: '' !important;
position: absolute !important;
width: 20px !important;
height: 20px !important;
background: white !important;
border-radius: 50% !important;
top: 2px !important;
left: 2px !important;
transition: transform 0.3s ease !important;
box-shadow: 0 2px 4px rgba(0,0,0,0.2) !important;
}
.toggle-switch input[type="checkbox"]:checked {
background: #7CB342 !important;
}
.toggle-switch input[type="checkbox"]:checked::after {
transform: translateX(20px) !important;
}
/* Leaderboard controls alignment */
.leaderboard-controls {
align-items: flex-end !important; /* Align items to the bottom */
gap: 12px !important; /* Add space between controls */
}
/* Refresh button styling */
.refresh-btn {
min-width: 0 !important; /* Override gradio default min-width */
height: 42px !important; /* Match dropdown height */
width: auto !important;
padding: 8px 20px !important; /* Adjust padding */
margin: 0 !important; /* Reset margin */
font-size: 0.95em !important; /* Match other inputs */
line-height: normal !important; /* Reset line-height */
border-radius: 10px !important; /* Ensure all corners are rounded */
background: #f0ebe5 !important;
color: #555 !important;
border: 1px solid #d4ccc4 !important;
transition: all 0.2s ease !important;
}
.refresh-btn:hover {
background: #e8e0d8 !important;
border-color: #c8b8ae !important;
}
/* Hide ag-grid header menu button and tooltips to prevent mixed CN/EN UI */
.ag-header-cell-menu-button {
display: none !important;
}
.ag-tooltip {
display: none !important;
}
"""