RecipeAI Ultra-Performance Model

🏆 BREAKTHROUGH: 100% satisfaction on all nutrients in live testing!

Model Description

This is an ultra-performance reinforcement learning model for generating personalized recipes for diabetic patients. The model uses PPO (Proximal Policy Optimization) with a novel fat-focused curriculum learning approach to achieve publication-ready performance.

Key Achievement

  • 100% satisfaction on all 5 nutrients (Calories, Protein, Fat, Carbs, Sodium) in live testing
  • 81.2% satisfaction in 200-episode stress testing
  • ✅ Resolved critical fat constraint bottleneck: 10% → 100% (+90 percentage points)

Model Details

  • Model Type: Reinforcement Learning (PPO with ActorCriticPolicy)
  • Parameters: 245,574 trainable parameters
  • Training Time: 16 minutes on RTX 4070 Laptop GPU
  • Architecture: MLP with 256×256×128 layers
  • Framework: Stable-Baselines3
  • Training Steps: 800,000 timesteps
  • Curriculum: 5-phase fat-focused curriculum learning

Performance Metrics

Live Testing (20 recipes)

Nutrient Satisfaction Target
Calories 100.0% ≥85% ✅
Protein 100.0% ≥85% ✅
Fat 100.0% ≥85% ✅
Carbs 100.0% ≥85% ✅
Sodium 100.0% ≥85% ✅
Overall 100.0% ≥85% ✅

Average Reward: +214.4

Stress Testing (200 episodes)

Nutrient Satisfaction
Calories 89.5%
Protein 65.5%
Fat 74.5%
Carbs 98.0%
Sodium 78.5%
Overall 81.2%

Average Reward: +537.08

Efficiency

  • Generation Speed: 0.116s per recipe
  • Throughput: 8.61 recipes/second
  • Memory Overhead: 0.01 MB
  • CPU Usage: 3.7%

Training Details

Reward Configuration (v3 - Fat-Priority)

NUTRIENT_IMPORTANCE = {
    'fat': 0.35,      # Increased from 0.25 (critical improvement)
    'calories': 0.18,
    'protein': 0.18,
    'carbs': 0.19,
    'sodium': 0.10
}

# Fat-specific optimizations
FAT_EXPONENTIAL_FACTOR = 2.0  # vs 1.5 for others
FAT_SATISFACTION_BONUS = 100   # Extra reward
FAT_BONUS_MULTIPLIER = 2.0     # Double rewards

5-Phase Fat-Focused Curriculum

  1. Easy Fat (2.5x range) - 150k steps
  2. Medium Fat (2.0x range) - 150k steps
  3. Normal Fat (1.5x range) - 150k steps
  4. Tight Fat (1.2x range) - 150k steps
  5. Target Fat (1.0x range) - 200k steps

Hyperparameters

learning_rate: 3e-4
n_steps: 2048
batch_size: 128
n_epochs: 15
gamma: 0.99
gae_lambda: 0.95
clip_range: 0.2
ent_coef: 0.15 (annealed to 0.005)
vf_coef: 0.5
max_grad_norm: 0.5

Intended Use

Primary Use Case

Generating personalized, nutritionally-balanced recipes for diabetic patients that satisfy multiple constraints:

  • Caloric requirements
  • Protein targets
  • Fat limitations (critical constraint)
  • Carbohydrate management
  • Sodium restrictions

Out-of-Scope Use

  • General population recipe generation (optimized for diabetic constraints)
  • Real-time dietary advice without professional consultation
  • Medical diagnosis or treatment decisions

How to Use

Installation

pip install stable-baselines3 gymnasium numpy pandas

Load and Use Model

from stable_baselines3 import PPO
from stable_baselines3.common.vec_env import VecNormalize

# Load the model
model = PPO.load("ultra_performance_final.zip")

# For normalized environments, also load normalization stats
vec_normalize = VecNormalize.load("vec_normalize_ultra.pkl", env)

# Generate recipes
obs = env.reset()
for _ in range(10):  # Generate 10 ingredients
    action, _states = model.predict(obs, deterministic=True)
    obs, reward, done, info = env.step(action)
    if done:
        break

Full Example

See the GitHub repository for complete usage examples:

  • scripts/quick_test.py - Live recipe generation
  • scripts/comprehensive_model_analysis.py - Full evaluation

Training Procedure

Data

Datasets Used:

  • USDA FoodData Central: 324 enriched ingredients with complete nutritional profiles
  • UCI Diabetes Dataset: 66 diabetic patient profiles with personalized constraints
  • RecipeNLG: 301,689 recipes for ingredient relationship learning

Nutrients Tracked: Calories, Protein, Fat, Carbohydrates, Sodium

Training Evolution

  1. Baseline (100k steps): 89% overall, 60% fat satisfaction
  2. Curriculum v2 (700k steps): 70% overall, 10% fat (bottleneck identified)
  3. Ultra-Performance (800k steps): 100% overall, 100% fat

Critical Breakthrough

The model achieved breakthrough performance by addressing the fat constraint bottleneck through:

  • Increased fat importance in reward weighting (35% vs 18-19% for others)
  • Fat-specific bonuses (+100 points for satisfaction)
  • Stricter fat thresholds (10%/20%/30% vs 15%/25%/35%)
  • Aggressive fat penalties (2.0 exponential factor)
  • 5-phase fat-focused curriculum (gradual constraint tightening)

Hardware

  • GPU: NVIDIA RTX 4070 Laptop (8GB VRAM)
  • Training Speed: ~820 iterations/second
  • Total Training Time: 16 minutes 15 seconds

Evaluation

Metrics

  • Constraint Satisfaction Rate: Percentage of recipes meeting all nutritional constraints
  • Average Reward: Cumulative reward per episode
  • Precision/Recall/F1-Score: Per-nutrient classification metrics
  • Generation Speed: Time per recipe

Results Summary

Model Training Overall Fat Reward Time
Baseline 100k 89.0% 60.0% +76.5 10 min
Curriculum v2 700k 70.0% 10.0% ❌ -78.6 14 min
Ultra-Perf 800k 100.0% 100.0% +214.4 16 min

Visualizations

See the analysis results for comprehensive visualizations:

  • Confusion matrices
  • Accuracy metrics
  • Constraint satisfaction heatmaps
  • Efficiency analysis
  • Reward distributions
  • Model comparisons

Limitations and Bias

Limitations

  • Training Data: Limited to 324 ingredients; may not generalize to all available foods
  • Patient Diversity: Trained on 66 diabetic patient profiles; may need fine-tuning for other populations
  • Stress Testing: Performance drops to 81.2% under extreme edge cases (200-episode stress test)
  • Cultural Bias: Recipe patterns may reflect dataset biases

Recommendations

  • Always validate generated recipes with healthcare professionals
  • Consider individual patient preferences and allergies
  • Monitor nutrient absorption and medication interactions
  • Adjust constraints based on patient feedback

Ethical Considerations

Health Impact

  • Model designed to assist, not replace, professional nutritional guidance
  • Should be used as a tool alongside medical supervision
  • Critical for users to consult healthcare providers

Transparency

  • Complete training methodology documented
  • Evaluation results publicly available
  • Reproducible with provided code and data

Citation

@software{recipeai_ultraperformance_2026,
  title={RecipeAI Ultra-Performance Model: Reinforcement Learning for Diabetic Recipe Generation},
  author={Bhavesh},
  year={2026},
  url={https://huggingface.co/bhxvxshh/recipeai-ultra-performance},
  note={100% satisfaction on all nutrients through fat-focused curriculum learning}
}

Model Card Authors


License: MIT

Downloads last month
53
Video Preview
loading

Evaluation results

  • Overall Satisfaction (Live) on Diabetic Patient Constraints
    self-reported
    100.000
  • Overall Satisfaction (Stress) on Diabetic Patient Constraints
    self-reported
    81.200
  • Average Reward (Live) on Diabetic Patient Constraints
    self-reported
    214.400
  • Average Reward (Stress) on Diabetic Patient Constraints
    self-reported
    537.080