Epicure-Core
A 300-dimensional skip-gram ingredient embedding over a 1,790-ingredient canonical vocabulary, trained on a blend of (i) typed FlavorDB ingredient-compound metapath walks and (ii) injected pure ingredient-ingredient walks at ii_repeat=10. Core is the middle sibling on the chemistry-vs-recipe-context spectrum.
The 10x I-I injection is the design lever that concentrates Core's geometry: participation ratio drops to 94.2 of 300 (vs ~180 for the isotropic Cooc and Chem siblings), average pairwise cosine rises to 0.35, and this concentration coincides with the tightest emergent modes of the three siblings.
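For readers who want to check the two isotropy statistics on their own matrix, the following is a minimal sketch, not the paper's evaluation code. It assumes the common covariance-spectrum definition of participation ratio, PR = (sum of eigenvalues)^2 / (sum of squared eigenvalues), and an average cosine taken over all distinct pairs; the paper's exact conventions (centring, pair sampling) may differ. The toy matrices and all function names are illustrative.

```python
import numpy as np

def participation_ratio(X):
    """PR = (sum of eigenvalues)^2 / (sum of squared eigenvalues) of the covariance of X."""
    Xc = X - X.mean(axis=0, keepdims=True)              # centre each dimension
    cov = Xc.T @ Xc / (X.shape[0] - 1)                  # (d, d) sample covariance
    eig = np.clip(np.linalg.eigvalsh(cov), 0.0, None)   # clip tiny negative round-off
    return float(eig.sum() ** 2 / np.square(eig).sum())

def mean_pairwise_cosine(X):
    """Average cosine similarity over all distinct pairs of rows."""
    U = X / np.linalg.norm(X, axis=1, keepdims=True)
    S = U @ U.T
    n = U.shape[0]
    return float((S.sum() - np.trace(S)) / (n * (n - 1)))  # drop the diagonal of ones

# Toy data (assumption: stand-ins for a real embedding matrix).
rng = np.random.default_rng(0)
iso = rng.standard_normal((2000, 50))     # roughly isotropic cloud
shifted = iso + 2.0                       # shared offset pushes all vectors the same way
rank1 = np.outer(rng.standard_normal(2000), rng.standard_normal(50))  # one direction only

pr_iso = participation_ratio(iso)         # close to the ambient dimension (50)
pr_rank1 = participation_ratio(rank1)     # close to 1
cos_iso = mean_pairwise_cosine(iso)       # close to 0
cos_shifted = mean_pairwise_cosine(shifted)  # well above 0
```

PR ranges from 1 (all variance on one axis) up to the embedding dimension, so a value of 94.2 out of 300 means the variance is spread over far fewer effective directions than in the ~180 siblings.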
Companions in the family: epicure-cooc (recipe-context only) and epicure-chem (chemistry only).
Paper: Epicure: Navigating the Emergent Geometry of Food Ingredient Embeddings
Quick start
from epicure import Epicure
m = Epicure.from_pretrained("Kaikaku/epicure-core")
m.neighbors("chicken", k=5)
# -> [('pork', 0.58), ('beef', 0.57), ('chicken_broth', 0.55),
# ('peanut', 0.52), ('cream_of_chicken_soup', 0.52)]
m.slerp("rice", "cuisine:South_Asian", theta_deg=30, k=5)
# -> [('turmeric', 0.76), ('mustard_seed', 0.76), ('fenugreek_seed', 0.75),
# ('coriander', 0.74), ('cumin', 0.74)]
m.closest_mode("chocolate", kind="factor", k=3)
What is in this repo
Identical structure to the Cooc sibling. The Core-specific differences:
- modes.json: 193 modes across 44 properties (vs 150/41 for Cooc, 200/43 for Chem).
- factor_poles.npy shape: (87, 300).
- supervised_poles.json: 113 entries.
See the Cooc model card for the per-file inventory.
Reported numbers (this sibling)
From the paper:
- Isotropy: participation ratio PR = 94.2, average pairwise cosine 0.35. Concentrated geometry by design (10x I-I injection).
- Direction quality (5-fold CV Spearman rho): baked-in CF 0.40; held-out basic-taste CF 0.42; USDA macros 0.45. Cuisine Cohen's d mean 2.70.
- Emergent modes: 193 modes / 44 properties. Mean within-mode coherence 0.833 against random-pair baseline 0.348 (margin 0.485, tightest of the three siblings).
Core's concentrated geometry pulls both pole tightness and the all-pairs floor upward; the tightness margin (mode coherence minus baseline) is comparable to Cooc and Chem at ~0.5, so the concentration is a design lever, not a defect.
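The margin is simply within-mode coherence minus the random-pair baseline (0.833 - 0.348 = 0.485 for Core). The sketch below shows one way such a margin could be computed; it is an illustration under stated assumptions, not the paper's procedure. It assumes coherence is the mean cosine over distinct pairs inside a mode, averaged per mode, and that the baseline is the mean cosine over uniformly sampled pairs. The toy embedding, the mode membership lists, and the function names are all hypothetical.

```python
import numpy as np

def unit_rows(X):
    return X / np.linalg.norm(X, axis=1, keepdims=True)

def within_mode_coherence(U, members):
    """Mean cosine over distinct pairs of one mode's members (U has unit rows)."""
    S = U[members] @ U[members].T
    iu = np.triu_indices(len(members), k=1)   # upper triangle, no diagonal
    return float(S[iu].mean())

def random_pair_baseline(U, n_pairs, rng):
    """Mean cosine over uniformly sampled pairs of distinct rows."""
    i = rng.integers(0, len(U), size=n_pairs)
    j = rng.integers(0, len(U), size=n_pairs)
    keep = i != j                             # discard self-pairs
    return float(np.einsum("ij,ij->i", U[i[keep]], U[j[keep]]).mean())

# Toy data (assumption): two tight clusters around orthogonal centres.
rng = np.random.default_rng(0)
centres = 4.0 * np.eye(16)[:2]
E = np.vstack([c + 0.3 * rng.standard_normal((20, 16)) for c in centres])
modes = [np.arange(0, 20), np.arange(20, 40)]  # hypothetical mode memberships

U = unit_rows(E)
within = float(np.mean([within_mode_coherence(U, m) for m in modes]))
baseline = random_pair_baseline(U, n_pairs=5000, rng=rng)
margin = within - baseline
```

Reporting the margin rather than raw coherence is what makes siblings with different average cosine comparable: a concentrated space lifts both terms, and the subtraction removes the shared floor.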
When to pick Core: you want chemistry-aware structure but cannot afford to lose recipe-context companionship entirely. Core's nearest-neighbour for chicken is pork (chemistry peer) but its full top-5 includes chicken_broth and cream_of_chicken_soup (recipe context).
Operator semantics
Same as Cooc. See epicure-cooc for the full operator reference. The three operator families (top-K neighbours, closest-mode lookup, SLERP direction arithmetic) are identical across siblings; only the geometry they act on differs.
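As a rough picture of what SLERP direction arithmetic followed by a top-K lookup involves, here is a minimal NumPy sketch. It is not the epicure library's implementation: it assumes that theta_deg means "rotate the unit anchor vector by that many degrees along the great circle toward the unit pole vector", and that neighbours are ranked by cosine similarity. `slerp_toward`, `top_k`, and the three-word toy vocabulary are hypothetical.

```python
import numpy as np

def slerp_toward(a, b, theta_deg):
    """Rotate unit(a) by theta_deg degrees toward unit(b) on the unit sphere."""
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    perp = b - np.dot(a, b) * a          # component of b orthogonal to a
    norm = np.linalg.norm(perp)
    if norm < 1e-12:
        return a                         # a and b (anti)parallel: rotation plane undefined
    perp = perp / norm
    t = np.deg2rad(theta_deg)
    return np.cos(t) * a + np.sin(t) * perp

def top_k(query, vocab, E, k=5):
    """Rank vocabulary rows of E by cosine similarity to query."""
    U = E / np.linalg.norm(E, axis=1, keepdims=True)
    sims = U @ (query / np.linalg.norm(query))
    order = np.argsort(-sims)[:k]
    return [(vocab[i], float(sims[i])) for i in order]

# Toy example (assumption): three orthogonal "ingredients".
vocab = ["x", "y", "z"]
E = np.eye(3)
q = slerp_toward(E[0], E[1], theta_deg=30)
ranked = top_k(q, vocab, E, k=2)
```

Because the rotation stays on the unit sphere, the query keeps unit norm at every angle, which is the usual reason to prefer SLERP over adding a scaled direction vector.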
Honesty about cuisine pole reconstruction
See the epicure-cooc model card for the full discussion. Short version: the eight cuisine-macro-region pole vectors used in the paper's Section 4.2 hero examples are reconstructed here as the unit-mean of every mode whose Claude label contains a cuisine keyword. Core happens to reproduce paper-genre results with high fidelity because the chemistry-mediated walks cluster cuisines by aroma-compound profile.
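The reconstruction described above can be sketched as follows. This is an illustration only: the record layout (a `label` string and a `vector` per mode) is a hypothetical stand-in and not the documented schema of modes.json, the keyword list is invented, and "unit-mean" is read here as "normalise each matching mode vector, average, then renormalise", which is an assumption.

```python
import numpy as np

def reconstruct_pole(modes, keywords):
    """Unit-mean of all mode vectors whose label contains any keyword.

    modes: list of {"label": str, "vector": array-like}  (hypothetical schema)
    """
    kws = [k.lower() for k in keywords]
    hits = [np.asarray(m["vector"], dtype=float)
            for m in modes
            if any(k in m["label"].lower() for k in kws)]
    if not hits:
        raise ValueError("no mode label matched the keywords")
    unit = [v / np.linalg.norm(v) for v in hits]   # equal weight per matching mode
    mean = np.mean(unit, axis=0)
    return mean / np.linalg.norm(mean)             # back onto the unit sphere

# Toy modes (assumption): labels and vectors are made up for illustration.
toy_modes = [
    {"label": "South Asian curry spices", "vector": [1.0, 0.0, 0.0]},
    {"label": "Indian pickles",           "vector": [0.0, 2.0, 0.0]},
    {"label": "Dairy desserts",           "vector": [0.0, 0.0, 5.0]},
]
pole = reconstruct_pole(toy_modes, ["south asian", "indian"])
```

A keyword match on labels is a coarse filter, so a pole built this way inherits any labelling noise; that is the caveat the full discussion in the Cooc card addresses.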
Limitations and citation
Same as Cooc. See the paper Section 5.3 for corpus imbalance, hub coverage, and LLM-dependence notes.
@article{radzikowski2026epicure,
title = {Epicure: Navigating the Emergent Geometry of Food Ingredient Embeddings},
author = {Radzikowski, Jakub and Chen, Josef},
journal = {arXiv preprint arXiv:2605.22391},
year = {2026}
}
License: CC BY 4.0.