---
title: eval-card-registry
emoji: 🗂️
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
pinned: false
---

# eval-card-registry

Query-only disambiguation API for AI evaluation entity names. Resolves raw benchmark, model, metric, and harness strings (e.g. `"MATH Level 5"`) to stable canonical IDs (e.g. `math-level-5`).

This Space runs in **read-only mode**: it serves lookups against pre-built entity data. Write operations (entity creation, alias edits) happen in a separate pipeline.

## Base URL

```
https://evaleval-entity-registry.hf.space/api/v1
```

## Resolve

```bash
curl -X POST https://evaleval-entity-registry.hf.space/api/v1/resolve \
  -H 'Content-Type: application/json' \
  -d '{"raw_value": "MATH Level 5", "entity_type": "benchmark"}'
```

Response:

```json
{
  "canonical_id": "math-level-5",
  "strategy": "exact",
  "confidence": 1.0,
  "created_new": false,
  "review_status": "reviewed"
}
```

If nothing matches, `canonical_id` is `null` and `strategy` is `"no_match"`. In read-only mode, no draft entity is created.

`entity_type` must be one of `benchmark`, `model`, `metric`, or `harness`. The optional `source_config` field scopes the lookup to a specific source.
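
The same call can be made from Python. The following is a minimal sketch using only the standard library, not an official client: the helper names (`build_resolve_payload`, `canonical_id_or_none`, `resolve`) are hypothetical, and passing `source_config` through unchanged is an assumption, since its exact type is not specified here.

```python
import json
import urllib.request

BASE_URL = "https://evaleval-entity-registry.hf.space/api/v1"
ENTITY_TYPES = {"benchmark", "model", "metric", "harness"}


def build_resolve_payload(raw_value, entity_type, source_config=None):
    """Build the JSON body for POST /resolve (hypothetical helper)."""
    if entity_type not in ENTITY_TYPES:
        raise ValueError(f"unknown entity_type: {entity_type!r}")
    payload = {"raw_value": raw_value, "entity_type": entity_type}
    # Assumption: source_config is sent as-is when the caller provides it.
    if source_config is not None:
        payload["source_config"] = source_config
    return payload


def canonical_id_or_none(response):
    """Return the canonical ID, or None when the resolver reports no match."""
    if response.get("strategy") == "no_match":
        return None
    return response.get("canonical_id")


def resolve(raw_value, entity_type, source_config=None, timeout=10):
    """Send one resolve request and return the decoded JSON response."""
    body = json.dumps(build_resolve_payload(raw_value, entity_type, source_config))
    request = urllib.request.Request(
        f"{BASE_URL}/resolve",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return json.load(reply)
```

Typical use would be `canonical_id_or_none(resolve("MATH Level 5", "benchmark"))`, which yields `None` for unmatched strings rather than raising.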

**Batch resolve:**

```bash
curl -X POST https://evaleval-entity-registry.hf.space/api/v1/resolve/batch \
  -H 'Content-Type: application/json' \
  -d '[
    {"raw_value": "MATH Level 5", "entity_type": "benchmark"},
    {"raw_value": "meta-llama/Llama-3.1-8B", "entity_type": "model"}
  ]'
```

## Browse entities

```
GET /api/v1/benchmarks?search=math
GET /api/v1/benchmarks/{id}
GET /api/v1/models
GET /api/v1/metrics
GET /api/v1/harnesses
GET /api/v1/aliases?status=uncertain&entity_type=benchmark
```
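
The browse URLs above can be assembled programmatically. This sketch uses hypothetical helpers (`list_url`, `entity_url`) and only builds URLs; it assumes the `/{id}` detail route shown for benchmarks, and does not claim that other collections expose one.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://evaleval-entity-registry.hf.space/api/v1"


def list_url(collection, **params):
    """Build a list URL such as /benchmarks?search=math, skipping unset filters."""
    query = urlencode({k: v for k, v in params.items() if v is not None})
    url = f"{BASE_URL}/{collection}"
    return f"{url}?{query}" if query else url


def entity_url(collection, entity_id):
    """Build a detail URL such as /benchmarks/{id}, percent-encoding the ID."""
    return f"{BASE_URL}/{collection}/{quote(entity_id, safe='')}"
```

For example, `list_url("aliases", status="uncertain", entity_type="benchmark")` reproduces the last request in the list above.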

## Health

```
GET /api/v1/health
GET /api/v1/stats
```

## Write endpoints

Disabled in this Space. `POST` and `PATCH` requests on entities and aliases return `405 Method Not Allowed`; mutations happen in a separate data pipeline.

## Interactive docs

OpenAPI docs at `/docs`.

## Data sources

- Entity data: HF Dataset repo `evaleval/entity-registry-data` (read at startup)
- Resolve logs: HF Storage Bucket `evaleval/entity-registry-storage` (written asynchronously for resolver improvement)