RecipeAI: Hierarchical Recipe Generation with RL

Model Description

This is a trained Proximal Policy Optimization (PPO) agent for nutritionally balanced recipe generation, serving as the Phase 1 component of a hierarchical reinforcement learning (HRM) system.

Phase 1 Baseline Agent:

  • Algorithm: Proximal Policy Optimization (PPO)
  • Training: 150,000 timesteps
  • Framework: Stable Baselines3
  • Environment: Custom Gymnasium env (RecipeEnv)

Performance

Phase 1 (Standalone)

  • Constraint Compliance: 100% (deterministic mode)
  • Recipe Diversity: 1% (deterministic), 96% (stochastic)
  • Average Reward: -164.06
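The large gap between deterministic (1%) and stochastic (96%) diversity comes from how actions are sampled at inference time: a greedy policy picks the same highest-probability ingredient every episode, while sampling from the policy distribution varies the output. A minimal toy illustration (the 5-way distribution below is made up for demonstration, not taken from the model):

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy policy distribution over 5 "ingredients" to illustrate the diversity gap
probs = np.array([0.4, 0.3, 0.15, 0.1, 0.05])

# Deterministic mode: always take the argmax -> one unique action
greedy = [int(np.argmax(probs)) for _ in range(100)]
# Stochastic mode: sample from the distribution -> many unique actions
sampled = [int(rng.choice(5, p=probs)) for _ in range(100)]

print(len(set(greedy)), "unique greedy vs", len(set(sampled)), "unique sampled")
```

In Stable Baselines3 this corresponds to the `deterministic` flag of `model.predict`.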

Phase 2 (HRM Integration)

  • Weekly Target Achievement: 4/5 nutrients within 10%
  • Recipe Diversity: 100% unique recipes
  • Unique Ingredients: 55/324 used
  • Daily Compliance: 14% (expected due to dynamic constraint adjustment)
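The "4/5 nutrients within 10%" metric can be read as a simple relative-tolerance check on weekly totals. A hedged sketch, where the target and achieved numbers below are invented for illustration (the real weekly targets come from the HRM planner in the repository):

```python
# Hypothetical weekly totals, chosen only to illustrate the 10% tolerance check
targets  = {"calories": 14000, "protein": 700, "sodium": 11200, "carbs": 1750, "fat": 490}
achieved = {"calories": 13500, "protein": 690, "sodium": 10800, "carbs": 1800, "fat": 610}

def within_pct(value, target, tol=0.10):
    """True if value is within tol (fractional) of target."""
    return abs(value - target) <= tol * target

hits = sum(within_pct(achieved[k], targets[k]) for k in targets)
print(f"{hits}/5 nutrients within 10%")  # fat misses the band in this example
```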

Training Details

  • State Space: 11-dimensional

    • Current nutrients (5): calories, protein, sodium, carbs, fat
    • Target nutrients (5)
    • Ingredient count (1)
  • Action Space: 325 discrete actions

    • 324 ingredients
    • 1 DONE action
  • Constraints (per meal):

    • Calories: 400-800 kcal
    • Protein: 15-50g
    • Sodium: 0-800mg
    • Carbs: 30-100g
    • Fat: 10-30g
  • Dataset: USDA FoodData Central (324 processed ingredients)

Usage

import gymnasium as gym
from stable_baselines3 import PPO

# Load model
model = PPO.load("recipe_agent_standard_ppo")

# Generate a recipe (requires RecipeEnv - see repository)
# env = RecipeEnv()
# obs, info = env.reset()
# terminated = truncated = False
# while not (terminated or truncated):
#     action, _states = model.predict(obs, deterministic=True)
#     obs, reward, terminated, truncated, info = env.step(action)

Repository

Full code: [RecipeAI GitHub Repository]

Citation

@software{recipeai2024,
  title={RecipeAI: Hierarchical Reinforcement Learning for Recipe Generation},
  author={Bhavesh},
  year={2025},
  url={https://huggingface.co/bhxvxsh/recipe-ai-hrm}
}

License

MIT License
