Iris314
/

SmartFridgeUserModels

+---
+language: en
+license: mit
+tags:
+- recommendation
+- ranking
+- personalization
+- xgboost
+- xgbranker
+- recipe
+- cold-start
+datasets:
+- your-username/recipe-cleaned-dataset
+model-index:
+- name: Personalized Recipe Ranking Models
+  results:
+  - task:
+      type: recommendation
+      name: Personalized Recipe Ranking
+    dataset:
+      name: Food.com (Cleaned)
+      type: your-username/recipe-cleaned-dataset
+    metrics:
+      - type: ndcg@5
+        value: 0.44
+      - type: ndcg@10
+        value: 0.44
+---
+# Model Card: Personalized Recipe Ranking Models
+## Overview
+This project implements a personalized recipe recommendation system using two model categories:
+1. **Scratch-trained baseline**: A simple ranker that combines rule-based signals with embedding matching, trained from scratch on a synthetic preference dataset with no user-specific rules.
+2. **Rule-enhanced cold-start models**: Five separate XGBRanker models trained with more complex rule-based preference signals and user-specific interaction patterns (user1–user5).
+
+The goal is to evaluate how different user profiles affect ranking behavior and recommendation diversity, even when overall NDCG scores are lower than the baseline.
+
+---
+## Model Category 1: Scratch-trained Baseline
+### Purpose
+Provide a simple cold-start recommendation baseline that matches ingredients and ranks recipes without personalization. It uses parent–child ingredient overlap and a few numeric features (e.g., protein, cost, cooking time).
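One plausible reading of the parent–child ingredient overlap is a per-parent coverage ratio: the share of a recipe's ingredients that fall under each parent category. The sketch below assumes that reading. The `PARENT_OF` mapping and the `parent_coverage` helper are hypothetical, since this card does not document the actual taxonomy or feature definition.

```python
from collections import Counter

# Hypothetical ingredient -> parent mapping; the real taxonomy is not given in this card.
PARENT_OF = {
    "cheddar": "cheese",
    "mozzarella": "cheese",
    "chicken breast": "poultry",
    "spinach": "leafy greens",
}


def parent_coverage(ingredients, parent_of=PARENT_OF):
    """Return {parent: fraction of the recipe's ingredients mapped to that parent}."""
    if not ingredients:
        return {}
    # Ingredients missing from the mapping count toward the total but toward no parent.
    counts = Counter(parent_of[i] for i in ingredients if i in parent_of)
    total = len(ingredients)
    return {parent: n / total for parent, n in counts.items()}


features = parent_coverage(["cheddar", "mozzarella", "spinach", "salt"])
```

With roughly 1000 parent nodes, most entries of such a vector would be zero for any single recipe, which is the kind of sparse tabular input tree-based rankers tend to handle well.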
+### Data Sources
+- Cleaned Food.com dataset (~180k recipes)
+- 10,000 synthetic preference samples generated via uniform random selection
+### Training Details
+- Model type: **XGBRanker** (`objective='rank:pairwise'`)
+- Features: ~1000 numeric ingredient-parent ratio features + basic nutrition/time features
+- Train/test split: 80/20 (by recipe ID)
+- Evaluation metric: NDCG@5, NDCG@10
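Splitting by recipe ID means every sample for a given recipe lands on the same side of the split, so the test set contains only unseen recipes. The card does not say how the split was implemented; the sketch below assumes a deterministic hash of the ID, and `assign_split` is an illustrative name.

```python
import hashlib


def assign_split(recipe_id, train_fraction=0.8):
    """Deterministically map a recipe ID to 'train' or 'test'."""
    digest = hashlib.sha256(str(recipe_id).encode("utf-8")).hexdigest()
    # Interpret the first 32 bits of the hash as a number in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return "train" if bucket < train_fraction else "test"


# Two synthetic samples per recipe; both must end up in the same split.
rows = [{"recipe_id": rid, "sample": s} for rid in range(1000) for s in range(2)]
splits = {}
for row in rows:
    splits.setdefault(assign_split(row["recipe_id"]), set()).add(row["recipe_id"])
```

A hash-based assignment stays stable when new recipes are added, unlike a shuffled index split that changes with the dataset size.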
+### Evaluation
+The baseline achieves **very high NDCG scores (above 0.95)** because training and evaluation both rely on the same synthetic signals, which align closely with the ranking structure.
+### Intended Use
+Serves as a **sanity check** and an upper bound on ranking performance; it is not intended for deployment.
+### Limitations
+- Unrealistically clean preference structure
+- No user differentiation
+- Inflated metrics due to synthetic evaluation
+---
+## Model Category 2: Rule-enhanced Cold Start Models (User1–User5)
+### Purpose
+Capture user-specific dietary preferences and ranking heuristics using richer rule sets, leading to more diverse recommendation patterns across different users.
+### Data Sources
+- Cleaned Food.com dataset (~180k recipes)
+- 5,000 cold-start synthetic interactions per user profile
+- Additional unselected (negative) samples included to simulate realistic cold-start scenarios
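A ranking group for one cold-start query can be assembled from the recipes a profile selected plus a sample of recipes it did not select. The card does not describe how the unselected recipes were drawn, so the uniform sampling below is an assumption, and `build_group` is a hypothetical helper.

```python
import random


def build_group(selected_ids, candidate_ids, n_negatives, seed=0):
    """Return (recipe_id, label) pairs: 1 for selected recipes, 0 for sampled unselected ones."""
    selected = set(selected_ids)
    # Sort the pool so the sample is reproducible for a given seed.
    pool = sorted(set(candidate_ids) - selected)
    rng = random.Random(seed)
    negatives = rng.sample(pool, min(n_negatives, len(pool)))
    return [(rid, 1) for rid in sorted(selected)] + [(rid, 0) for rid in negatives]


group = build_group(selected_ids=[3, 7], candidate_ids=range(20), n_negatives=8)
```

The ratio of negatives to positives controls how sparse the positive signal is within each group, which in turn affects how hard it is to place a selected recipe near the top.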
+### Model
+- Model type: **XGBRanker** (scratch-trained)
+- Training objective: `rank:pairwise`
+- Feature space:
+  - Ingredient-parent coverage ratios (~1000 parent nodes)
+  - Nutrition features: protein, calories, cost, cooking time
+  - User preference weights: protein/time/cost
+  - Dietary tag filters and exclusion rules
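The preference weights and exclusion rules listed above could combine into a single rule-based score per recipe, along the lines of the sketch below. Everything here is illustrative: the `UserProfile` fields, the normalized feature names, and the linear scoring form are assumptions, not the rules actually used to generate the training signals.

```python
from dataclasses import dataclass, field


@dataclass
class UserProfile:
    # All fields are illustrative; the real profile schema is not documented in this card.
    protein_weight: float = 1.0
    time_weight: float = 1.0
    cost_weight: float = 1.0
    excluded_tags: frozenset = field(default_factory=frozenset)


def rule_score(recipe, profile):
    """Score a recipe for a profile; None means a dietary rule excludes it."""
    if profile.excluded_tags & set(recipe["tags"]):
        return None
    # Reward protein, penalize cooking time and cost (features assumed scaled to [0, 1]).
    return (
        profile.protein_weight * recipe["protein_norm"]
        - profile.time_weight * recipe["time_norm"]
        - profile.cost_weight * recipe["cost_norm"]
    )


profile = UserProfile(protein_weight=2.0, excluded_tags=frozenset({"contains-nuts"}))
allowed = {"tags": ["vegetarian"], "protein_norm": 0.8, "time_norm": 0.2, "cost_norm": 0.1}
blocked = {"tags": ["contains-nuts"], "protein_norm": 0.9, "time_norm": 0.1, "cost_norm": 0.1}
scores = [rule_score(allowed, profile), rule_score(blocked, profile)]
```

Under a scheme like this, a profile with many excluded tags leaves few recipes with a score at all, which matches the kind of scarcity described for restrictive profiles.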
+### Training Setup
+- Train/validation/test split: 70/15/15 by recipe ID, applied separately for each profile
+- No fine-tuning between profiles; each profile trained independently
+- Evaluation metric: NDCG@5 and NDCG@10
+### Evaluation Results
+| User Profile | NDCG@5 | NDCG@10 |
+|-------------|--------|---------|
+| user1       | 0.4400 | 0.4400  |
+| user2       | 0.4342 | 0.4342  |
+| user3       | 0.4179 | 0.4179  |
+| user4       | 0.1651 | 0.1651  |
+| user5       | 0.4607 | 0.4607  |
+
+**Note:** User4 has very restrictive dietary preferences, resulting in very few matching recipes and inherently lower achievable NDCG.
+Although these NDCG values are lower than the baseline, this is expected for several reasons:
+- The cold-start datasets contain a large proportion of unselected recipes, leading to sparse positive signals.
+- More complex preference rules increase variability and reduce alignment with NDCG’s single-label relevance assumptions.
+- The models now produce more differentiated ranking behaviors across user profiles, which aligns with the intended personalization goals.
+---
+## Model Selection Justification
+- **XGBRanker** was chosen for all models due to its effectiveness on structured tabular data, fast training time, and compatibility with large feature spaces (1000+ ingredients).
+- The **baseline model** acts as a clean control, providing an upper bound on achievable NDCG under idealized preferences.
+- The **rule-enhanced models** trade some raw NDCG performance for greater personalization fidelity, which is critical in multi-user recommendation contexts.
+---
+## Evaluation Methodology
+- Metric: NDCG@5 and NDCG@10 on held-out cold-start samples
+- Each user model evaluated independently
+- Negative samples retained to approximate real-world recommendation class imbalance
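For reference, NDCG@k for a single ranked list can be computed as in the sketch below, then averaged over queries. This uses linear gain, which coincides with exponential gain when labels are binary. It is a generic implementation and not necessarily the evaluation code behind the numbers reported in this card.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k items (linear gain)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k):
    """relevances: true labels listed in the order the model ranked the items."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


def mean_ndcg(groups, k):
    """Average NDCG@k over a list of per-query relevance lists."""
    return sum(ndcg_at_k(g, k) for g in groups) / len(groups)


# One query where the two selected recipes were ranked 2nd and 5th out of six.
example = ndcg_at_k([0, 1, 0, 0, 1, 0], k=5)
```

Note that NDCG@5 and NDCG@10 are identical for any query whose relevant items all appear within the top five positions of both the model's ranking and the ideal ranking.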
+---
+## Intended Uses and Limitations
+**Intended Uses**
+- Multi-profile recipe recommendation
+- Studying personalization behaviors under sparse feedback
+- Cold-start scenarios for new users
+
+**Limitations**
+- Synthetic user interactions do not perfectly reflect real-world feedback
+- NDCG assumes a single relevance label per recipe, so it does not align well with multi-rule personalization behavior
+- User4 performance is limited by scarcity of relevant recipes
+---
+## Citation
+Tang, Xinxuan. Personalized Recipe Ranking Models. 2025.