Update README.md
README.md
CHANGED
|
@@ -0,0 +1,149 @@
| 1 |
+
---
|
| 2 |
+
language: en
|
| 3 |
+
license: mit
|
| 4 |
+
tags:
|
| 5 |
+
- recommendation
|
| 6 |
+
- ranking
|
| 7 |
+
- personalization
|
| 8 |
+
- xgboost
|
| 9 |
+
- xgbranker
|
| 10 |
+
- recipe
|
| 11 |
+
- cold-start
|
| 12 |
+
datasets:
|
| 13 |
+
- your-username/recipe-cleaned-dataset
|
| 14 |
+
model-index:
|
| 15 |
+
- name: Personalized Recipe Ranking Models
|
| 16 |
+
results:
|
| 17 |
+
- task:
|
| 18 |
+
type: recommendation
|
| 19 |
+
name: Personalized Recipe Ranking
|
| 20 |
+
dataset:
|
| 21 |
+
name: Food.com (Cleaned)
|
| 22 |
+
type: your-username/recipe-cleaned-dataset
|
| 23 |
+
metrics:
|
| 24 |
+
- type: ndcg@5
|
| 25 |
+
value: 0.44
|
| 26 |
+
- type: ndcg@10
|
| 27 |
+
value: 0.44
|
| 28 |
+
---
|
| 29 |
+
|
| 30 |
+
# Model Card: Personalized Recipe Ranking Models
|
| 31 |
+
|
| 32 |
+
## Overview
|
| 33 |
+
|
| 34 |
+
This project implements a personalized recipe recommendation system using two model categories:
|
| 35 |
+
|
| 36 |
+
1. **Scratch-trained baseline**: A simple rule-based + embedding matching ranker trained on a synthetic preference dataset (no user-specific rules).
|
| 37 |
+
2. **Rule-enhanced cold-start models**: Five separate XGBRanker models trained with more complex rule-based preference signals and user-specific interaction patterns (user1–user5).
|
| 38 |
+
|
| 39 |
+
The goal is to evaluate how different user profiles affect ranking behavior and recommendation diversity, even when overall NDCG scores are lower than the baseline.
|
| 40 |
+
|
| 41 |
+
---
|
| 42 |
+
|
| 43 |
+
## Model Category 1: Scratch-trained Baseline
|
| 44 |
+
|
| 45 |
+
### Purpose
|
| 46 |
+
Provide a simple cold-start recommendation baseline that matches ingredients and ranks recipes without personalization. It uses parent–child ingredient overlap and a few numeric features (e.g., protein, cost, cooking time).
|
| 47 |
+
|
| 48 |
+
### Data Sources
|
| 49 |
+
- Cleaned Food.com dataset (~180k recipes)
|
| 50 |
+
- 10,000 synthetic preference samples generated via uniform random selection
|
| 51 |
+
|
| 52 |
+
### Training Details
|
| 53 |
+
- Model type: **XGBRanker** (`objective='rank:pairwise'`)
|
| 54 |
+
- Features: ~1000 numeric ingredient-parent ratio features + basic nutrition/time features
|
| 55 |
+
- Train/test split: 80/20 (by recipe ID)
|
| 56 |
+
- Evaluation metrics: NDCG@5 and NDCG@10
|
| 57 |
+
|
| 58 |
+
### Evaluation
|
| 59 |
+
The baseline achieves **very high NDCG scores (above 0.95)** because both training and evaluation rely on the same synthetic preference signals, which align almost perfectly with the ranking structure.
|
| 60 |
+
|
| 61 |
+
### Intended Use
|
| 62 |
+
Serve as a **sanity check** and upper bound for ranking performance, not for deployment.
|
| 63 |
+
|
| 64 |
+
### Limitations
|
| 65 |
+
- Unrealistically clean preference structure
|
| 66 |
+
- No user differentiation
|
| 67 |
+
- Inflated metrics due to synthetic evaluation
|
| 68 |
+
|
| 69 |
+
---
|
| 70 |
+
|
| 71 |
+
## Model Category 2: Rule-enhanced Cold-start Models (User1–User5)
|
| 72 |
+
|
| 73 |
+
### Purpose
|
| 74 |
+
Capture user-specific dietary preferences and ranking heuristics using richer rule sets, leading to more diverse recommendation patterns across different users.
|
| 75 |
+
|
| 76 |
+
### Data Sources
|
| 77 |
+
- Cleaned Food.com dataset (~180k recipes)
|
| 78 |
+
- 5,000 cold-start synthetic interactions per user profile
|
| 79 |
+
- Additional unselected (negative) samples included to simulate realistic cold-start scenarios
|
| 80 |
+
|
| 81 |
+
### Model
|
| 82 |
+
- Model type: **XGBRanker** (scratch-trained)
|
| 83 |
+
- Training objective: `rank:pairwise`
|
| 84 |
+
- Feature space:
|
| 85 |
+
- Ingredient-parent coverage ratios (~1000 parent nodes)
|
| 86 |
+
- Nutrition features: protein, calories, cost, cooking time
|
| 87 |
+
- User preference weights: protein/time/cost
|
| 88 |
+
- Dietary tag filters and exclusion rules
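
The model card does not define how the ingredient-parent coverage ratios are computed. One plausible reading, shown here purely as an assumption, is the fraction of a recipe's ingredients that fall under each parent node. The `PARENT_OF` mapping and the function name are hypothetical.

```python
from collections import Counter

# Hypothetical child -> parent ingredient mapping (the real hierarchy
# would have roughly 1000 parent nodes).
PARENT_OF = {
    "chicken breast": "poultry",
    "chicken thigh": "poultry",
    "cheddar": "cheese",
    "olive oil": "oil",
}
PARENTS = sorted(set(PARENT_OF.values()))  # fixed feature order


def parent_ratio_features(ingredients):
    """Return one ratio per parent node: share of ingredients under it."""
    counts = Counter(PARENT_OF[i] for i in ingredients if i in PARENT_OF)
    total = len(ingredients)
    return [counts[p] / total if total else 0.0 for p in PARENTS]


features = parent_ratio_features(
    ["chicken breast", "chicken thigh", "olive oil", "salt"]
)
print(dict(zip(PARENTS, features)))
```

Ingredients without a known parent (such as `"salt"` here) still count toward the denominator, so the ratios need not sum to 1.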
|
| 89 |
+
|
| 90 |
+
### Training Setup
|
| 91 |
+
- Train/valid/test split: 70/15/15 by recipe ID per profile
|
| 92 |
+
- No fine-tuning between profiles; each profile trained independently
|
| 93 |
+
- Evaluation metrics: NDCG@5 and NDCG@10
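
A 70/15/15 split by recipe ID means every recipe lands in exactly one partition, so no recipe seen in training reappears at validation or test time. The sketch below shows one way to do this. The function name, the seed, and the shuffling strategy are assumptions, not the project's code.

```python
import random


def split_by_recipe_id(recipe_ids, fractions=(0.70, 0.15, 0.15), seed=0):
    """Partition unique recipe IDs into train/valid/test sets."""
    unique_ids = sorted(set(recipe_ids))
    random.Random(seed).shuffle(unique_ids)  # seeded for reproducibility

    n = len(unique_ids)
    n_train = round(n * fractions[0])
    n_valid = round(n * fractions[1])

    train = set(unique_ids[:n_train])
    valid = set(unique_ids[n_train:n_train + n_valid])
    test = set(unique_ids[n_train + n_valid:])  # remainder goes to test
    return train, valid, test


# Hypothetical usage: 100 recipe IDs for one user profile.
train_ids, valid_ids, test_ids = split_by_recipe_id(range(100))
print(len(train_ids), len(valid_ids), len(test_ids))
```

Interaction rows are then assigned to a partition by looking up their recipe ID, which keeps all rows for a given recipe together.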
|
| 94 |
+
|
| 95 |
+
### Evaluation Results
|
| 96 |
+
|
| 97 |
+
| User Profile | NDCG@5 | NDCG@10 |
|
| 98 |
+
|-------------|--------|---------|
|
| 99 |
+
| user1 | 0.4400 | 0.4400 |
|
| 100 |
+
| user2 | 0.4342 | 0.4342 |
|
| 101 |
+
| user3 | 0.4179 | 0.4179 |
|
| 102 |
+
| user4 | 0.1651 | 0.1651 |
|
| 103 |
+
| user5 | 0.4607 | 0.4607 |
|
| 104 |
+
|
| 105 |
+
**Note:** User4 has very restrictive dietary preferences, resulting in very few matching recipes and inherently lower achievable NDCG.
|
| 106 |
+
|
| 107 |
+
|
| 108 |
+
|
| 109 |
+
Although these NDCG values are lower than the baseline, this is expected for several reasons:
|
| 110 |
+
|
| 111 |
+
- The cold-start datasets contain a large proportion of unselected recipes, leading to sparse positive signals.
|
| 112 |
+
- More complex preference rules increase variability and fit poorly with NDCG, which scores each recipe against a single relevance label.
|
| 113 |
+
- The models now produce more differentiated ranking behaviors across user profiles, which aligns with the intended personalization goals.
|
| 114 |
+
|
| 115 |
+
---
|
| 116 |
+
|
| 117 |
+
## Model Selection Justification
|
| 118 |
+
|
| 119 |
+
- **XGBRanker** was chosen for all models due to its effectiveness on structured tabular data, fast training time, and compatibility with large feature spaces (~1000 ingredient-parent features).
|
| 120 |
+
- The **baseline model** acts as a clean control, providing an upper bound on achievable NDCG under idealized preferences.
|
| 121 |
+
- The **rule-enhanced models** trade some raw NDCG performance for greater personalization fidelity, which is critical in multi-user recommendation contexts.
|
| 122 |
+
|
| 123 |
+
---
|
| 124 |
+
|
| 125 |
+
## Evaluation Methodology
|
| 126 |
+
|
| 127 |
+
- Metrics: NDCG@5 and NDCG@10 on held-out cold-start samples
|
| 128 |
+
- Each user model evaluated independently
|
| 129 |
+
- Negative samples retained to approximate real-world recommendation class imbalance
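
For reference, NDCG@k for a single ranked list can be computed as in the sketch below. The model card does not state which gain formula was used, so the exponential gain `2**rel - 1` is an assumption. With binary labels it reduces to the label itself.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k items of a ranked list."""
    return sum(
        (2 ** rel - 1) / math.log2(rank + 2)
        for rank, rel in enumerate(relevances[:k])
    )


def ndcg_at_k(labels, scores, k):
    """NDCG@k for one query group, given true labels and model scores."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranked_labels = [labels[i] for i in order]
    ideal_labels = sorted(labels, reverse=True)
    idcg = dcg_at_k(ideal_labels, k)
    # A group with no relevant item has no defined NDCG; return 0.0 here.
    return dcg_at_k(ranked_labels, k) / idcg if idcg > 0 else 0.0


# The only selected recipe is ranked third, so the score is 1 / log2(4).
print(ndcg_at_k(labels=[1, 0, 0], scores=[0.1, 0.9, 0.5], k=3))
```

How groups with no positive label are handled (counted as 0, as here, or skipped) changes the averaged metric noticeably when negatives dominate, so the convention is worth stating alongside reported scores.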
|
| 130 |
+
|
| 131 |
+
---
|
| 132 |
+
|
| 133 |
+
## Intended Uses and Limitations
|
| 134 |
+
|
| 135 |
+
**Intended Uses**
|
| 136 |
+
- Multi-profile recipe recommendation
|
| 137 |
+
- Studying personalization behaviors under sparse feedback
|
| 138 |
+
- Cold-start scenarios for new users
|
| 139 |
+
|
| 140 |
+
**Limitations**
|
| 141 |
+
- Synthetic user interactions do not perfectly reflect real-world feedback
|
| 142 |
+
- NDCG is not well aligned with multi-rule personalization behavior
|
| 143 |
+
- User4 performance is limited by scarcity of relevant recipes
|
| 144 |
+
|
| 145 |
+
---
|
| 146 |
+
|
| 147 |
+
## Citation
|
| 148 |
+
|
| 149 |
+
Tang, Xinxuan. Personalized Recipe Ranking Models. 2025.
|