---
title: eval-card-registry
emoji: 🗂️
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
pinned: false
---

# eval-card-registry

Query-only disambiguation API for AI evaluation entity names. Resolves raw benchmark / model / metric / harness strings (e.g. `"MATH Level 5"`) to stable canonical IDs (`math`).

This Space runs in **read-only mode**: it serves lookups against pre-built entity data. Write operations (entity creation, alias edits) happen in a separate pipeline.

## Base URL

```
https://evaleval-entity-registry.hf.space/api/v1
```

## Resolve

```bash
curl -X POST https://evaleval-entity-registry.hf.space/api/v1/resolve \
  -H 'Content-Type: application/json' \
  -d '{"raw_value": "MATH Level 5", "entity_type": "benchmark"}'
```

Response:

```json
{
  "canonical_id": "math-level-5",
  "strategy": "exact",
  "confidence": 1.0,
  "created_new": false,
  "review_status": "reviewed"
}
```

If nothing matches, `canonical_id` is `null` and `strategy` is `"no_match"`. In read-only mode, no draft entity is created.

`entity_type` is one of: `benchmark`, `model`, `metric`, `harness`. Optional `source_config` scopes the lookup to a specific source.

**Batch resolve:**

```bash
curl -X POST https://evaleval-entity-registry.hf.space/api/v1/resolve/batch \
  -H 'Content-Type: application/json' \
  -d '[
    {"raw_value": "MATH Level 5", "entity_type": "benchmark"},
    {"raw_value": "meta-llama/Llama-3.1-8B", "entity_type": "model"}
  ]'
```

## Browse entities

```
GET /api/v1/benchmarks?search=math
GET /api/v1/benchmarks/{id}
GET /api/v1/models
GET /api/v1/metrics
GET /api/v1/harnesses
GET /api/v1/aliases?status=uncertain&entity_type=benchmark
```

## Health

```
GET /api/v1/health
GET /api/v1/stats
```

## Write endpoints

Disabled in this Space. `POST`/`PATCH` on entities and aliases return `405 Method Not Allowed`. Mutations happen in the data pipeline (separate from this Space).

## Interactive docs

OpenAPI docs at `/docs`.
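
## Python client sketch

The `/resolve` call documented above can also be made from Python. The following is a minimal sketch using only the standard library, not an official client: the helper names (`build_resolve_request`, `canonical_id_or_none`, `resolve`) and the timeout default are illustrative assumptions, and `source_config` is passed through unchanged because its exact shape is not specified here.

```python
import json
import urllib.request

# Base URL and entity types as listed in this README.
BASE_URL = "https://evaleval-entity-registry.hf.space/api/v1"
ENTITY_TYPES = {"benchmark", "model", "metric", "harness"}


def build_resolve_request(raw_value, entity_type, source_config=None):
    """Build (but do not send) the POST request for /resolve.

    Hypothetical helper: the name and signature are not part of the API.
    """
    if entity_type not in ENTITY_TYPES:
        raise ValueError(
            f"entity_type must be one of {sorted(ENTITY_TYPES)}, got {entity_type!r}"
        )
    payload = {"raw_value": raw_value, "entity_type": entity_type}
    if source_config is not None:
        # Assumption: source_config is sent as-is alongside the other fields.
        payload["source_config"] = source_config
    return urllib.request.Request(
        f"{BASE_URL}/resolve",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def canonical_id_or_none(response):
    """Return the canonical ID from a decoded response, or None on no_match."""
    if response.get("strategy") == "no_match":
        return None
    return response.get("canonical_id")


def resolve(raw_value, entity_type, source_config=None, timeout=10.0):
    """Send the request and return the decoded JSON response as a dict."""
    request = build_resolve_request(raw_value, entity_type, source_config)
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return json.load(reply)
```

A typical call would be `canonical_id_or_none(resolve("MATH Level 5", "benchmark"))`. Keeping request construction separate from the network call makes it possible to inspect or unit-test the payload without contacting the Space.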

## Data sources

- Entity data: HF Dataset repo `evaleval/entity-registry-data` (read at startup)
- Resolve logs: HF Storage Bucket `evaleval/entity-registry-storage` (written asynchronously for resolver improvement)