---
title: Nutrition Multi-Agent System
emoji: 🥗
sdk: gradio
sdk_version: 5.50.0
app_file: app.py
license: cc-by-nc-4.0
short_description: Multi-agent nutrition planner with LangGraph + Gemini
---
# 🥗 MealGraph: Nutrition Multi-Agent System
**Live demo:** <https://huggingface.co/spaces/moazeldegwy/mealgraph>
A clinical-nutrition planner built on **LangGraph** and **Gemini 3.x**
(`gemini-pro-latest` · `gemini-flash-latest` · `gemini-flash-lite-latest`).
Three agents, **Coach**, **MedicalAssessmentAgent**, and
**PlannerAgent**, sit on top of two safe-by-construction tools (a PuLP
linear-program meal solver and a Gemini-grounded web search). Clinical
math runs through closed-form Python formulas (Mifflin-St Jeor BMR,
ACSM activity multipliers); the LLM interprets the numbers but never
recomputes them. The Planner runs a deterministic plan check after the
LP solver and self-revises on allergy / calorie / macro violations
before returning. The Coach does an LLM-graded self-review (medical
flag respect, citation presence, cultural fit) before composing.
```
        ┌───────────────────────────────────────────┐
        │                   Coach                   │
        │   one typed action per turn (LangGraph)   │
        │   + self-review of the Planner's output   │
        └──────────────┬────────────────────────────┘
                       │ call_agent / ask_user
                       │ write_memory / compose_response
        ┌──────────────┴──────────────┐
        ▼                             ▼
MedicalAssessment                  Planner
deterministic math        draft -> LP -> check_plan
+ LLM enrichment          (≤ 2 internal revisions)
        │                             │
        ▼                             ▼
nutrition_formulas         QuantitiesFinder (PuLP)
(BMI / BMR / TDEE)        WebSearchTool (grounded)
```
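The deterministic-math idea above can be sketched in a few lines. This is an illustration only, not the project's `nutrition_formulas` module: the function names, the `calculations` dict layout, and the activity-multiplier values (commonly quoted figures) are assumptions. Only the Mifflin-St Jeor equation itself is standard.

```python
# Activity multipliers: commonly quoted values, assumed here for illustration;
# the project's own table may differ.
ACTIVITY_MULTIPLIERS = {
    "sedentary": 1.2,
    "light": 1.375,
    "moderate": 1.55,
    "active": 1.725,
    "very_active": 1.9,
}


def mifflin_st_jeor_bmr(weight_kg: float, height_cm: float, age_years: float, sex: str) -> float:
    """Mifflin-St Jeor basal metabolic rate in kcal/day."""
    base = 10.0 * weight_kg + 6.25 * height_cm - 5.0 * age_years
    # The equation adds 5 for men and subtracts 161 for women.
    return base + 5.0 if sex == "male" else base - 161.0


def tdee(bmr: float, activity_level: str) -> float:
    """Total daily energy expenditure: BMR scaled by an activity multiplier."""
    return bmr * ACTIVITY_MULTIPLIERS[activity_level]


def overwrite_calculations(llm_output: dict, deterministic: dict) -> dict:
    """Return a copy of the LLM output whose `calculations` are the computed values."""
    merged = dict(llm_output)
    merged["calculations"] = dict(deterministic)  # whatever the LLM wrote is discarded
    return merged


bmr = mifflin_st_jeor_bmr(weight_kg=70, height_cm=175, age_years=30, sex="male")
computed = {"bmr": bmr, "tdee": tdee(bmr, "moderate")}
# Hypothetical LLM output carrying a wrong BMR; the overwrite replaces it.
llm_says = {"flags": ["none"], "calculations": {"bmr": 1700, "tdee": 2600}}
result = overwrite_calculations(llm_says, computed)
print(result["calculations"]["bmr"])  # 1648.75
```

Because the merge step replaces the whole `calculations` entry, a hallucinated number in the LLM output cannot reach the final assessment in this sketch.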
## Quick start
```bash
pip install -r requirements.txt
python app.py
```
Open the Gradio UI, paste a Gemini API key, fill in the profile sidebar,
and ask for a plan.
The same repository deploys directly as a **Hugging Face Space**: the
YAML front-matter above is the Space manifest, and `app.py` is the
entry point it declares via `app_file`.
## Architecture
| Component | Role | Key file |
|---|---|---|
| **CoachAgent** | Orchestrator. Picks one action per turn (`call_agent` / `ask_user` / `write_memory` / `compose_response`). After the Planner returns, runs an LLM-graded self-review (medical-flag respect, citation presence, cultural fit) and triggers a revision when needed. | [agents.py](agents.py) |
| **MedicalAssessmentAgent** | Deterministic clinical math first (`full_assessment` → BMI / BMR / TDEE / macros), then an LLM step that emits flags / recommendations / evidence. The agent **overwrites** the LLM's `calculations` with the deterministic values so the math is exact by construction. | [agents.py](agents.py) |
| **PlannerAgent** | Drafts meals, batches nutrition lookups via the grounded `WebSearchTool`, runs the `QuantitiesFinder` LP, then runs `check_plan` (allergy / calorie / macro tolerances) inline. Up to two internal revisions resolve any blocking issue before returning. | [agents.py](agents.py) |
| **`check_plan`** | Deterministic post-LP critic. Allergy → severity `high` (hard block); calorie deviation beyond ±3 % or macro deviation beyond ±5 % → severity `medium`; disliked food → severity `low`. Same code path the Planner uses internally and the eval harness asserts against. | [agents.py](agents.py) |
| **QuantitiesFinder** | PuLP linear-program meal-quantity solver. Default per-food bounds `min = max(20, est × 0.3)`, `max = min(400, est × 2.5)` keep the LP from suggesting 1 g of butter or 900 g of broccoli. Estimate-anchor weight is `0.3`. | [tools.py](tools.py) |
| **WebSearchTool** | Single round-trip wrapper around Gemini's built-in `google_search` grounding. Returns answer + citations + queries from `grounding_metadata`. Prompt biases toward USDA / WHO / ADA / EFSA / NICE / FDA / MedlinePlus. | [tools.py](tools.py) |
| **LongTermMemory** | SQLite-backed semantic / procedural / episodic tiers. | [memory.py](memory.py) |
| **Guardrails** | Prompt-injection sniff, PII redaction, HITL escalation marker (`<<HITL:CLINICIAN_REVIEW_REQUIRED>>`). | [guardrails.py](guardrails.py) |
| **MCP server** | Exposes `QuantitiesFinder` and `assess_user` to Claude Desktop, Cursor, and any MCP-aware client. | [mcp_server.py](mcp_server.py) |
| **Agent cards** | A2A capability descriptors (three cards) with an in-process registry. | [agent_cards.py](agent_cards.py) |
| **Observability** | LangSmith passthrough + in-process metrics surface. | [observability.py](observability.py) |
| **Eval harness** | Three fixture personas; runs offline (no Gemini calls) against `check_plan`. | [evals/](evals/) |
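
The default per-food bounds quoted for `QuantitiesFinder` are simple enough to check by hand. The helper below is a hypothetical sketch (the name `default_bounds` and the gram units are assumptions, not the project's API) that applies the two formulas from the table.

```python
def default_bounds(estimate_g: float) -> tuple[float, float]:
    """Per-food quantity bounds in grams, following the formulas in the table above."""
    lower = max(20.0, estimate_g * 0.3)   # never below 20 g
    upper = min(400.0, estimate_g * 2.5)  # never above 400 g
    return lower, upper


# A 10 g butter estimate is lifted to the 20 g floor.
butter_low, butter_high = default_bounds(10)
# A 300 g broccoli estimate is capped at the 400 g ceiling.
broccoli_low, broccoli_high = default_bounds(300)
print(butter_low, butter_high, broccoli_high)  # 20.0 25.0 400.0
```

As written, the two formulas cross for estimates below 8 g (the 20 g floor exceeds `est × 2.5`). How the project handles that case is not stated here.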
### Models and rate limits
Three Gemini 3.x rolling aliases are mapped per role. The free-tier
requests-per-minute (RPM) and requests-per-day (RPD) limits below are
conservative defaults; if you have a paid quota, pass it in or disable
limiting with `enable_rate_limiting=False`.
| Alias | RPM | RPD | Default role |
|---|---:|---:|---|
| `gemini-pro-latest` | 5 | 100 | Coach, Medical, Planner |
| `gemini-flash-latest` | 10 | 250 | Available for overrides |
| `gemini-flash-lite-latest` | 15 | 500 | Tools (WebSearch), simulator |
### Safety guarantees
| Guarantee | Where it lives |
|---|---|
| **Allergies never appear in the plan** | Planner's `check_plan`: severity `high` → hard block → internal revision. |
| **Calorie target hit within ±3 %** | Planner's `check_plan`: severity `medium` → revision. |
| **Each macro hit within ±5 %** | Planner's `check_plan`: severity `medium` → revision. |
| **Medical flags respected** | Coach's self-review turn (LLM-graded). |
| **Clinical claims carry citations** | `WebSearchTool` returns `grounding_chunks` natively; Coach checks for them in self-review. |
| **Serious cases escalate** | Medical sets `requires_professional_consultation=True`; Coach appends `<<HITL:CLINICIAN_REVIEW_REQUIRED>>`. |
| **No RCE via LLM-generated code** | No code-from-LLM path exists at all. `nutrition_formulas` is closed-form Python; `QuantitiesFinder` is a pure LP. |
| **Deterministic math** | `full_assessment()` runs server-side; the Medical agent overwrites whatever the LLM emitted for `calculations`. |
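
The first three guarantees can be illustrated with a small deterministic checker. This is a sketch under assumptions rather than the project's `check_plan`: the `Issue` dataclass, the argument shapes, the substring matching on food names, and the macro names are all hypothetical. Only the severities and tolerances come from the table, and targets are assumed to be positive.

```python
from dataclasses import dataclass


@dataclass
class Issue:
    severity: str  # "high", "medium", or "low"
    message: str


def relative_error(actual: float, target: float) -> float:
    """Absolute deviation as a fraction of the (positive) target."""
    return abs(actual - target) / target


def check_plan_sketch(foods, totals, targets, allergies, dislikes,
                      calorie_tol=0.03, macro_tol=0.05) -> list[Issue]:
    issues = []
    for food in foods:
        name = food.lower()
        if any(a.lower() in name for a in allergies):
            issues.append(Issue("high", f"allergen in plan: {food}"))
        elif any(d.lower() in name for d in dislikes):
            issues.append(Issue("low", f"disliked food in plan: {food}"))
    if relative_error(totals["calories"], targets["calories"]) > calorie_tol:
        issues.append(Issue("medium", "calories outside tolerance"))
    for macro in ("protein", "carbs", "fat"):
        if relative_error(totals[macro], targets[macro]) > macro_tol:
            issues.append(Issue("medium", f"{macro} outside tolerance"))
    return issues


def needs_revision(issues) -> bool:
    """High and medium severities trigger a revision; low ones do not."""
    return any(issue.severity in ("high", "medium") for issue in issues)


issues = check_plan_sketch(
    foods=["peanut butter", "oats", "broccoli"],
    totals={"calories": 2100, "protein": 150, "carbs": 200, "fat": 70},
    targets={"calories": 2000, "protein": 150, "carbs": 200, "fat": 70},
    allergies=["peanut"],
    dislikes=["broccoli"],
)
print([issue.severity for issue in issues])  # ['high', 'low', 'medium']
```

Keeping the check free of LLM calls is what lets the same function serve both the Planner's internal revision loop and an offline eval harness.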
### Run the offline eval harness
Three persona fixtures (athlete, diabetic, vegan-budget) exercise the
deterministic surface; no Gemini calls are needed:
```bash
python -m evals.runner
```
### Run the test suite
```bash
pytest -ra
```
Coverage: schemas, solver behaviour, safety surface, rate-limit pool,
memory tiers, and full Coach ↔ specialist loops via a mock LLM. The
post-LP allergy revision and deterministic-calculation overwrite are
both unit-tested.
---
## Library usage
The same code runs as a library. Import the `mealgraph` module,
provide API keys, and call a few setup functions.
### 1. Import
```python
import mealgraph
```
### 2. API keys
Provide a list of keys; the system rotates through them and respects
each model's RPM / RPD limit. A single key is enough for evaluation.
```python
api_keys = [
    "your_api_key1",
    "your_api_key2",
]
```
### 3. (Optional) Model overrides
Override `model_name` or `params` per role. Other configuration is fixed
in the module.
```python
model_overrides = {
    "main": {"model_name": "gemini-pro-latest", "params": {"temperature": 0.5}},
    "agents_llm": {"model_name": "gemini-flash-latest", "params": {"max_tokens": 6000}},
}
```
### 4. Initialise
```python
mealgraph.create_llm_instances(api_keys, model_overrides, enable_rate_limiting=True)
mealgraph.initialize_tools()
mealgraph.initialize_agents()
mealgraph.setup_workflow()
```
### 5. Run
Either interactive mode (collects user data via stdin) or simulation
mode (drives one or more synthetic users through a fixed question list):
```python
mealgraph.run(simulate=False)
# or
mealgraph.run(simulate=True, simulated_users=[...])
```
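
For simulation mode, each `simulated_users` entry is a dict with `user_profile`, `medical_history`, and `questions`. The sketch below shows one plausible entry plus a small validator. The three top-level keys come from this README; every inner field name and the `validate_simulated_users` helper are hypothetical placeholders.

```python
# Top-level keys documented for each simulated user.
REQUIRED_KEYS = {"user_profile", "medical_history", "questions"}

# Inner field names are illustrative guesses, not the module's schema.
simulated_users = [
    {
        "user_profile": {
            "age": 34,
            "sex": "female",
            "weight_kg": 62,
            "height_cm": 168,
            "allergies": ["peanut"],
        },
        "medical_history": {"conditions": ["type 2 diabetes"], "medications": []},
        "questions": [
            "Can you build me a one-day meal plan?",
            "Swap the dinner for something vegetarian.",
        ],
    },
]


def validate_simulated_users(users):
    """Raise ValueError if an entry lacks one of the documented top-level keys."""
    for index, user in enumerate(users):
        missing = REQUIRED_KEYS - set(user)
        if missing:
            raise ValueError(f"simulated_users[{index}] is missing {sorted(missing)}")
    return users


validate_simulated_users(simulated_users)
# With the module initialised, the list would then be passed along:
# mealgraph.run(simulate=True, simulated_users=simulated_users)
```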
## Behaviour notes
* **User mode output**: high-level progress lines, one per agent / tool
action.
* **Debug mode output**: raw LLM input / output (or output only),
scoped per agent / tool. Enable with `mealgraph.debug(level='full',
scopes={'agents': ['CoachAgent'], 'tools': ['all']})`.
* **API-key pooling**: the manager rotates keys and (when rate
limiting is on) enforces per-model RPM / RPD. Keys that exhaust their
daily quota are dropped from the pool until the next UTC day.
* **Interactive mode**: prompts for profile fields, then accepts free
questions; type `exit` to quit.
* **Simulation mode**: each entry in `simulated_users` is a dict with
`user_profile`, `medical_history`, and `questions`; the loop drives
each user's questions sequentially.
* **Error handling**: provide at least one API key (else
`ValueError`). Each initialisation function checks that its
predecessor has been called.
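
The API-key pooling behaviour described above can be sketched as a round-robin pool with a 60-second window and a per-UTC-day counter. This is a minimal illustration, not the module's manager: the `KeyPool` class, its `acquire` method, and the single shared RPM / RPD pair are assumptions made for brevity.

```python
from collections import deque
from datetime import datetime, timedelta, timezone


class KeyPool:
    """Round-robin key pool enforcing per-key RPM and RPD limits."""

    def __init__(self, api_keys, rpm, rpd):
        if not api_keys:
            raise ValueError("provide at least one API key")
        self.rpm, self.rpd = rpm, rpd
        self._keys = list(api_keys)
        self._recent = {key: deque() for key in self._keys}   # calls in the last 60 s
        self._daily = {key: (None, 0) for key in self._keys}  # (UTC date, call count)
        self._next = 0

    def acquire(self, now=None):
        """Return a usable key, or None if every key is currently limited."""
        now = now or datetime.now(timezone.utc)
        for offset in range(len(self._keys)):
            index = (self._next + offset) % len(self._keys)
            key = self._keys[index]
            day, count = self._daily[key]
            if day != now.date():  # a new UTC day restores an exhausted key
                day, count = now.date(), 0
            window = self._recent[key]
            while window and now - window[0] >= timedelta(seconds=60):
                window.popleft()
            if count < self.rpd and len(window) < self.rpm:
                window.append(now)
                self._daily[key] = (day, count + 1)
                self._next = (index + 1) % len(self._keys)
                return key
            self._daily[key] = (day, count)
        return None


start = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
pool = KeyPool(["k1", "k2"], rpm=1, rpd=2)
first = pool.acquire(start)    # "k1"
second = pool.acquire(start)   # "k2" (rotation)
third = pool.acquire(start)    # None: both keys hit their per-minute limit
```

Passing `now` explicitly keeps the sketch testable without sleeping; a real manager would more likely block or back off when `acquire` returns `None`.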