Phase 7: Gradio demo + Hugging Face Space metadata
app.py - the entry point for the Hugging Face Space.
* Sidebar:
- paste Gemini API keys (one per line) + per-role model overrides for
Coach / Workers / Tools+Validator
- rate-limit + debug toggles
- "Initialize system" button -> spins up create_llm_instances /
initialize_tools / initialize_agents / setup_workflow /
initialize_long_term_memory + init_langsmith()
- user-profile form (name, age, sex, anthropometrics, goal, allergies,
dislikes, country, conditions, meds) -> JSON state forwarded into the
LangGraph memory on each turn
* Main pane:
- Chatbot (messages format, Gradio 5/6 compatible)
- "Agent activity" accordion: per-call _BufferHandler captures every
nutrition_mas.* log line so the user can see Coach -> Medical ->
Validator -> Planner unfold in real time
- "System metrics" accordion: live MetricsCollector + ParseMetrics
snapshot rendered as a markdown table
- "Agent registry" accordion: pretty-printed AgentCard list from
agent_cards.build_default_registry()
* SessionState is a plain class kept deepcopy-safe (no Lock - Gradio's per-session
queue serialises handler calls). gr.State seeded with None and lazy-
initialised in chat() so nothing breaks if the session is reused.
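The deepcopy constraint in the last bullet can be seen in isolation with a small standard-library sketch. It is illustrative only: `PlainState`, `LockedState`, `is_deepcopy_safe` and `get_session` are hypothetical names that do not appear in app.py. The sketch shows why an object holding a `threading.Lock` cannot be deep-copied, and how a `None` seed can be turned into a fresh state object on first use.

```python
import copy
import threading


class PlainState:
    """Hypothetical stand-in for a session object that holds only plain data."""

    def __init__(self) -> None:
        self.memory = {"user_profile": {}}
        self.history = []


class LockedState(PlainState):
    """Same data plus a lock, which copy.deepcopy cannot duplicate."""

    def __init__(self) -> None:
        super().__init__()
        self.lock = threading.Lock()


def is_deepcopy_safe(obj: object) -> bool:
    """Return True if copy.deepcopy(obj) succeeds, False on TypeError."""
    try:
        copy.deepcopy(obj)
    except TypeError:
        return False
    return True


def get_session(session):
    """Lazy initialisation: a None seed becomes a fresh PlainState."""
    return PlainState() if session is None else session


plain_ok = is_deepcopy_safe(PlainState())
locked_ok = is_deepcopy_safe(LockedState())
```

Under these assumptions `plain_ok` is true and `locked_ok` is false, which is the reason given above for keeping locks out of the session object and relying on the per-session queue instead.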
requirements.txt
* Added gradio>=4.40,<7. Wide range because app.py uses only the
messages-format Chatbot shape that all 4.40+/5/6 versions support.
README.md
* Prepended the Hugging Face Space metadata block (sdk: gradio,
app_file: app.py, emoji, license).
* Added an Architecture section with the file -> role table for every
module shipped across phases 0-7.
* Original library-usage guide moved under a "Library Usage" header
(kept verbatim for backwards-compat callers).
tests/test_app.py (6 tests)
* test_app_imports - module loads outside Colab.
* test_build_demo_compiles - gr.Blocks construction doesn't raise on the
installed Gradio version.
* test_build_user_profile_round_trip - JSON serialisation of the sidebar
payload matches the LangGraph memory shape.
* test_render_metrics_is_markdown - metrics-table renderer.
* test_session_state_default_shape - empty-memory invariant.
* test_chat_handles_uninitialised_system - friendly error before init.
84/84 tests green. 7-phase modernisation complete.
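The per-call log capture behind the "Agent activity" accordion can also be written as a context manager, so the handler is always detached even if the call raises. This is a sketch of that alternative, not the code in this commit: `capture_logs` is a hypothetical helper, and it uses the standard `logging.StreamHandler` rather than the `_BufferHandler` class that app.py defines.

```python
import logging
from contextlib import contextmanager
from io import StringIO
from typing import Iterator


@contextmanager
def capture_logs(logger_name: str, level: int = logging.INFO) -> Iterator[StringIO]:
    """Collect records from `logger_name` and its child loggers into a buffer."""
    buffer = StringIO()
    handler = logging.StreamHandler(buffer)
    handler.setLevel(level)
    handler.setFormatter(logging.Formatter("%(name)s - %(message)s"))
    logger = logging.getLogger(logger_name)
    previous_level = logger.level
    # Without this, INFO records may be dropped by the root default (WARNING).
    logger.setLevel(level)
    logger.addHandler(handler)
    try:
        yield buffer
    finally:
        # Always detach, so handlers do not pile up across calls.
        logger.removeHandler(handler)
        logger.setLevel(previous_level)


with capture_logs("nutrition_mas") as buf:
    logging.getLogger("nutrition_mas.coach").info("delegating to Medical")
    logging.getLogger("other").info("outside the captured namespace")
trace = buf.getvalue()
```

Only the record from the `nutrition_mas.*` child logger lands in `trace`; the record from the unrelated logger is not captured.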
- README.md +64 -1
- app.py +413 -0
- requirements.txt +5 -0
- tests/test_app.py +96 -0
--- a/README.md
+++ b/README.md
@@ -1,4 +1,67 @@
-
+---
+title: Nutrition Multi-Agent System
+emoji: 🥗
+colorFrom: green
+colorTo: blue
+sdk: gradio
+sdk_version: 5.0.0
+app_file: app.py
+pinned: false
+license: mit
+short_description: A LangGraph + Gemini multi-agent system for personalized nutrition planning with a Validator critic loop.
+---
+
+# 🥗 Nutrition Multi-Agent System
+
+A clinical-nutrition demo built on **LangGraph** + **Gemini 2.5**. A
+`CoachAgent` orchestrates `MedicalAssessmentAgent`, `PlannerAgent`,
+`ValidationAgent` (the critic loop), and `KnowledgeAgent`. The Planner
+uses a **PuLP linear-program solver** to turn LLM-drafted meals into
+exact gram quantities; the Validator deterministically catches allergy
+violations and tolerance breaches before the plan reaches the user.
+
+## Try the demo
+
+The repo doubles as a **Hugging Face Space** — paste your Gemini key into
+the sidebar of `app.py` and chat with the system live.
+
+```bash
+pip install -r requirements.txt
+python app.py
+```
+
+## Architecture at a glance
+
+| Component | Role | Key file |
+|---|---|---|
+| **CoachAgent** | Orchestrator. Decides one action per turn (call_agent / call_tool / ask_user / write_memory / compose_response). | `agents.py` |
+| **MedicalAssessmentAgent** | BMI/BMR/TDEE/macros, clinical flags, evidence sources. | `agents.py` |
+| **PlannerAgent** | Drafts meals; runs `QuantitiesFinder` LP for exact grams. | `agents.py` |
+| **ValidationAgent** | Generator-critic gate. Deterministic allergy/calorie/macro checks + LLM-graded medical-flag respect & citation requirement. | `validation.py` |
+| **KnowledgeAgent** | Citation-first lookup against authoritative domains (USDA, WHO, ADA, EFSA). | `knowledge.py` |
+| **QuantitiesFinder** | PuLP linear-program meal-quantity solver. Deterministic. | `tools.py` |
+| **ComputationTool** | Closed-form clinical formulas (Mifflin-St Jeor, ACSM activity multipliers). **No subprocess, no eval.** | `tools.py`, `nutrition_formulas.py` |
+| **WebSearchTool** | DuckDuckGo + LLM synthesis fallback. | `tools.py` |
+| **LongTermMemory** | SQLite-backed semantic / procedural / episodic tiers. | `memory.py` |
+| **Guardrails** | Prompt-injection sniff + PII redaction + HITL chip. | `guardrails.py` |
+| **MCP server** | Exposes tools to Claude Desktop / Cursor / any MCP client. | `mcp_server.py` |
+| **Agent cards** | A2A capability descriptors. | `agent_cards.py` |
+| **Observability** | LangSmith passthrough + in-process metrics. | `observability.py` |
+| **Eval harness** | 3 fixture personas, runs offline (no LLM). | `evals/` |
+
+Run the eval harness anytime:
+```bash
+python -m evals.runner
+```
+
+Run the test suite (84 tests, no Gemini calls needed):
+```bash
+pytest -ra
+```
+
+---
+
+## Library Usage
 
 This guide explains how to use the Nutrition Multi-Agent System (MAS) by importing the `nutritionmas` module and calling a few simple functions. The system handles all the complex setup internally, so you only need to provide a list of API keys and optionally override model configurations.
 
--- /dev/null
+++ b/app.py
@@ -0,0 +1,413 @@
+"""Gradio demo app for the Nutrition MAS — entry point for the Hugging Face Space.
+
+The app is intentionally thin: settings sidebar -> chat -> trace pane. The
+heavy lifting stays in the agent system. The whole point is to *show* the
+multi-agent architecture, not to build a product UI.
+
+Run locally::
+
+    pip install -r requirements.txt gradio
+    python app.py
+
+On Hugging Face Spaces, this file is the auto-detected entry point. The
+README's metadata block (sdk: gradio, app_file: app.py) tells the platform
+what to do.
+"""
+
+from __future__ import annotations
+
+import json
+import logging
+import traceback
+from io import StringIO
+from typing import Any, Dict, List, Tuple
+
+import gradio as gr
+
+import nutritionmas
+from agent_cards import build_default_registry
+from logging_setup import get_logger
+from observability import get_metrics, init_langsmith, span
+from state import initialize_empty_memory
+
+_logger = get_logger("app")
+
+
+# ---------------------------------------------------------------------------
+# Per-session state
+# ---------------------------------------------------------------------------
+class SessionState:
+    """Holds per-user MAS state. One instance per Gradio session.
+
+    Kept deepcopy-safe (no threads, no locks) because Gradio's ``gr.State``
+    deep-copies the initial value on every session. Concurrency safety is
+    provided by Gradio's per-session queue, which serialises handler calls
+    for a single browser tab.
+    """
+
+    def __init__(self) -> None:
+        self.initialised: bool = False
+        self.memory: Dict[str, Any] = initialize_empty_memory()
+        self.conversation_history: List[Dict[str, str]] = []
+        self.previous_actions: List[str] = []
+        self.thread_id: str = "session-default"
+
+
+# ---------------------------------------------------------------------------
+# Bootstrapping
+# ---------------------------------------------------------------------------
+def initialise_system(
+    api_keys_text: str,
+    coach_model: str,
+    workers_model: str,
+    tools_model: str,
+    rate_limit: bool,
+    debug_on: bool,
+) -> str:
+    """Spin up the MAS once with the supplied keys + per-role model overrides."""
+    keys = [k.strip() for k in api_keys_text.splitlines() if k.strip()]
+    if not keys:
+        return "❌ Please paste at least one Gemini API key (one per line)."
+
+    overrides = {
+        "main": {"model_name": coach_model},
+        "agents_llm": {"model_name": workers_model},
+        "planner_agent": {"model_name": workers_model},
+        "validation_agent": {"model_name": tools_model},
+        "tools_llm": {"model_name": tools_model},
+    }
+    try:
+        if debug_on:
+            nutritionmas.debug(level="output")
+        nutritionmas.create_llm_instances(keys, overrides, enable_rate_limiting=rate_limit)
+        nutritionmas.initialize_tools()
+        nutritionmas.initialize_agents()
+        nutritionmas.setup_workflow()
+        nutritionmas.initialize_long_term_memory()
+        init_langsmith()
+        return (
+            f"✅ System initialised with {len(keys)} key(s). "
+            f"Coach={coach_model}, Workers={workers_model}, Tools/Validator={tools_model}."
+        )
+    except Exception as e:  # noqa: BLE001
+        return f"❌ Initialisation failed: {e}\n\n{traceback.format_exc()}"
+
+
+# ---------------------------------------------------------------------------
+# Per-call log capture
+# ---------------------------------------------------------------------------
+class _BufferHandler(logging.Handler):
+    """Captures every nutrition_mas.* log line into a string buffer for the UI."""
+
+    def __init__(self) -> None:
+        super().__init__(level=logging.INFO)
+        self.buffer = StringIO()
+        self.setFormatter(logging.Formatter("%(name)s — %(message)s"))
+
+    def emit(self, record: logging.LogRecord) -> None:
+        self.buffer.write(self.format(record) + "\n")
+
+    def text(self) -> str:
+        return self.buffer.getvalue()
+
+
+def _attach_buffer() -> _BufferHandler:
+    handler = _BufferHandler()
+    root = logging.getLogger("nutrition_mas")
+    root.addHandler(handler)
+    return handler
+
+
+def _detach_buffer(handler: _BufferHandler) -> None:
+    logging.getLogger("nutrition_mas").removeHandler(handler)
+
+
+# ---------------------------------------------------------------------------
+# Profile builder (sidebar form)
+# ---------------------------------------------------------------------------
+def build_user_profile(
+    name: str,
+    age: float,
+    sex: str,
+    height_cm: float,
+    weight_kg: float,
+    activity: str,
+    goal: str,
+    allergies: str,
+    dislikes: str,
+    country: str,
+    conditions: str,
+    medications: str,
+) -> Dict[str, Any]:
+    return {
+        "user_profile": {
+            "name": name or "Anonymous",
+            "age": age,
+            "sex": sex,
+            "height": height_cm,
+            "weight": weight_kg,
+            "activity_level": activity,
+            "goal": goal,
+            "food_dislikes": dislikes,
+            "allergies": [a.strip() for a in allergies.split(",") if a.strip()],
+            "country": country,
+            "currency": "USD",
+        },
+        "medical_history": {
+            "conditions": [c.strip() for c in conditions.split(",") if c.strip()],
+            "medications": [m.strip() for m in medications.split(",") if m.strip()],
+            "past_issues": [],
+            "lab_results": "",
+        },
+    }
+
+
+# ---------------------------------------------------------------------------
+# Chat handler
+# ---------------------------------------------------------------------------
+def chat(
+    user_message: str,
+    history: List[Dict[str, str]],
+    session: SessionState,
+    profile_json: str,
+) -> Tuple[List[Dict[str, str]], str, str, SessionState]:
+    """Single-turn handler. Returns (chat_history, trace_log, metrics_md, session).
+
+    Uses Gradio 5+ "messages" Chatbot format: ``[{"role": "user"/"assistant",
+    "content": "..."}, ...]``.
+    """
+    if session is None:
+        session = SessionState()
+    if not history:
+        history = []
+    history = history + [{"role": "user", "content": user_message}]
+    if nutritionmas.APP is None:
+        history.append(
+            {"role": "assistant", "content": "❌ System not initialised. Use the sidebar Initialize button."}
+        )
+        return history, "", "", session
+
+    # Update profile if user changed it.
+    try:
+        profile_data = json.loads(profile_json) if profile_json.strip() else {}
+        if profile_data:
+            session.memory["user_profile"] = profile_data.get("user_profile", session.memory["user_profile"])
+            session.memory["medical_history"] = profile_data.get(
+                "medical_history", session.memory["medical_history"]
+            )
+    except json.JSONDecodeError:
+        pass
+
+    handler = _attach_buffer()
+    final_response = ""
+    error_text = ""
+    try:
+        state = {
+            "memory": session.memory,
+            "user_question": user_message,
+            "conversation_history": session.conversation_history
+            + [{"role": "user", "content": user_message}],
+            "current_action": None,
+            "agent_result": None,
+            "num_turns": 0,
+            "max_turns": 12,
+            "previous_actions": session.previous_actions,
+            "response_steps": [],
+        }
+        with span("end_to_end_chat", kind="agent"):
+            final_state = nutritionmas.APP.invoke(
+                state, config={"configurable": {"thread_id": session.thread_id}}
+            )
+        session.memory = final_state["memory"]
+        session.conversation_history = final_state["conversation_history"]
+        session.previous_actions = final_state["previous_actions"]
+        final_response = final_state.get("agent_result") or "(no response)"
+    except Exception as e:  # noqa: BLE001
+        error_text = f"\n\n⚠ Error: {e}"
+    finally:
+        log_text = handler.text()
+        _detach_buffer(handler)
+
+    history.append({"role": "assistant", "content": str(final_response) + error_text})
+    metrics = get_metrics().snapshot()
+    metrics_md = _render_metrics(metrics)
+    return history, log_text, metrics_md, session
+
+
+def _render_metrics(snap: Dict[str, Any]) -> str:
+    lines = ["### System metrics", "", "| Component | Calls | Total (s) | Errors |", "|---|---|---|---|"]
+    for name, m in snap["agents"].items():
+        lines.append(f"| agent · {name} | {m['calls']} | {m['total_seconds']:.2f} | {m['errors']} |")
+    for name, m in snap["tools"].items():
+        lines.append(f"| tool · {name} | {m['calls']} | {m['total_seconds']:.2f} | {m['errors']} |")
+    p = snap["parsing"]
+    lines.append("")
+    lines.append(
+        f"**Parsing**: native={p['native']} fallback={p['fallback']} failure={p['failure']}"
+    )
+    return "\n".join(lines)
+
+
+# ---------------------------------------------------------------------------
+# UI
+# ---------------------------------------------------------------------------
+def build_demo() -> gr.Blocks:
+    registry = build_default_registry()
+    cards_md = "## Active agents\n\n" + "\n".join(
+        f"- **{c.name}** ({c.role}) — {c.description}" for c in registry.list()
+    )
+
+    # ``theme`` moved to launch() in Gradio 6+, so it is not passed here; main()
+    # passes it to launch() and falls back to a plain launch() on TypeError.
+    with gr.Blocks(title="Nutrition MAS — Multi-Agent Demo") as demo:
+        gr.Markdown(
+            """
+            # 🥗 Nutrition Multi-Agent System
+            A LangGraph + Gemini orchestrator that delegates to a Medical
+            Assessment specialist, a Planner (with a PuLP linear-program meal
+            solver), a Validator critic, and a citation-first Knowledge
+            agent. Bring your own Gemini API keys.
+            """
+        )
+
+        # gr.State deepcopies the initial value on every session, so seed it
+        # with None and let the chat handler instantiate SessionState lazily.
+        session_state = gr.State(None)
+
+        with gr.Row():
+            # ---------------- Sidebar ----------------
+            with gr.Column(scale=1):
+                gr.Markdown("### 1. Setup")
+                # Multi-line key input — type="password" doesn't allow >1 line in
+                # Gradio 5+, so we use plain text and rely on browser/HF Space
+                # to keep it ephemeral.
+                api_keys = gr.Textbox(
+                    label="Gemini API key(s) — one per line",
+                    placeholder="AIza...\nAIza...",
+                    lines=3,
+                )
+                coach_model = gr.Dropdown(
+                    label="Coach model",
+                    choices=["gemini-2.5-pro", "gemini-2.5-flash"],
+                    value="gemini-2.5-pro",
+                )
+                workers_model = gr.Dropdown(
+                    label="Workers (Medical / Planner) model",
+                    choices=["gemini-2.5-pro", "gemini-2.5-flash"],
+                    value="gemini-2.5-pro",
+                )
+                tools_model = gr.Dropdown(
+                    label="Validator / Tools model",
+                    choices=["gemini-2.5-flash", "gemini-2.5-pro"],
+                    value="gemini-2.5-flash",
+                )
+                rate_limit = gr.Checkbox(label="Rate-limit Gemini calls", value=True)
+                debug_on = gr.Checkbox(label="Debug logging", value=False)
+                init_btn = gr.Button("Initialize system", variant="primary")
+                init_status = gr.Markdown()
+
+                gr.Markdown("### 2. Your profile")
+                p_name = gr.Textbox(label="Name", value="Demo User")
+                p_age = gr.Number(label="Age", value=30, precision=0)
+                p_sex = gr.Radio(label="Sex", choices=["male", "female"], value="male")
+                p_height = gr.Number(label="Height (cm)", value=175)
+                p_weight = gr.Number(label="Weight (kg)", value=72)
+                p_activity = gr.Dropdown(
+                    label="Activity",
+                    choices=[
+                        "sedentary",
+                        "lightly active",
+                        "moderately active",
+                        "very active",
+                        "extra active",
+                    ],
+                    value="moderately active",
+                )
+                p_goal = gr.Dropdown(
+                    label="Goal",
+                    choices=["lose weight", "maintain weight", "gain muscle", "gain weight"],
+                    value="maintain weight",
+                )
+                p_allergies = gr.Textbox(label="Allergies (comma-separated)", value="")
+                p_dislikes = gr.Textbox(label="Dislikes", value="")
+                p_country = gr.Textbox(label="Country", value="USA")
+                p_conditions = gr.Textbox(label="Medical conditions", value="")
+                p_medications = gr.Textbox(label="Medications", value="")
+                profile_json = gr.Textbox(visible=False)
+
+                def _refresh_profile(*args: Any) -> str:
+                    return json.dumps(build_user_profile(*args))
+
+                for component in [
+                    p_name, p_age, p_sex, p_height, p_weight, p_activity, p_goal,
+                    p_allergies, p_dislikes, p_country, p_conditions, p_medications,
+                ]:
+                    component.change(
+                        _refresh_profile,
+                        inputs=[
+                            p_name, p_age, p_sex, p_height, p_weight, p_activity, p_goal,
+                            p_allergies, p_dislikes, p_country, p_conditions, p_medications,
+                        ],
+                        outputs=profile_json,
+                    )
+
+            # ---------------- Main pane ----------------
+            with gr.Column(scale=2):
+                # Gradio 6 dropped the `type=` parameter; the messages format
+                # ([{role, content}]) is now the only one. Older versions still
+                # accept `type="messages"` so we keep the same payload shape.
+                chatbot = gr.Chatbot(label="Conversation", height=420)
+                user_input = gr.Textbox(
+                    label="Your question",
+                    placeholder="e.g. Build me a one-day meal plan to gain muscle.",
+                    lines=2,
+                )
+                send_btn = gr.Button("Send", variant="primary")
+                with gr.Accordion("🔍 Agent activity (live trace)", open=False):
+                    trace_log = gr.Textbox(label="Log", lines=12, interactive=False)
+                with gr.Accordion("📈 System metrics", open=False):
+                    metrics_md = gr.Markdown()
+                with gr.Accordion("🤖 Agent registry (A2A cards)", open=False):
+                    gr.Markdown(cards_md)
+
+        init_btn.click(
+            initialise_system,
+            inputs=[api_keys, coach_model, workers_model, tools_model, rate_limit, debug_on],
+            outputs=init_status,
+        )
+        send_btn.click(
+            chat,
+            inputs=[user_input, chatbot, session_state, profile_json],
+            outputs=[chatbot, trace_log, metrics_md, session_state],
+        ).then(lambda: "", None, user_input)
+        user_input.submit(
+            chat,
+            inputs=[user_input, chatbot, session_state, profile_json],
+            outputs=[chatbot, trace_log, metrics_md, session_state],
+        ).then(lambda: "", None, user_input)
+
+        gr.Markdown(
+            """
+            ---
+            **About**: This demo runs a 5-agent system (Coach, Medical
+            Assessment, Planner, Validation, Knowledge). The Validator
+            applies *deterministic* checks (allergy violations, calorie /
+            macro tolerances, HITL escalation) plus an LLM-graded layer for
+            medical-flag respect and citation presence. See the GitHub repo
+            for the full architecture writeup.
+            """
+        )
+    return demo
+
+
+def main() -> None:
+    demo = build_demo()
+    try:
+        demo.queue().launch(theme=gr.themes.Soft())
+    except TypeError:
+        # Gradio 4.x doesn't accept theme at launch().
+        demo.queue().launch()
+
+
+if __name__ == "__main__":  # pragma: no cover
+    main()
--- a/requirements.txt
+++ b/requirements.txt
@@ -24,6 +24,11 @@ python-dotenv>=1.0,<2
 # Markdown rendering for notebook display (kept for backwards compat)
 ipython>=8.0
 
+# Demo UI (Hugging Face Space entry point — Phase 7).
+# Pinned to a wide range; the app.py shape supports messages-format Chatbot
+# in v4.40+ all the way through v6.x.
+gradio>=4.40,<7
+
 # Tests / dev
 pytest>=8.0
 pytest-asyncio>=0.24
--- /dev/null
+++ b/tests/test_app.py
@@ -0,0 +1,96 @@
+"""Lightweight smoke tests for app.py.
+
+Building the Gradio Blocks doesn't require the system to be initialised
+(no API keys), so we can verify the UI compiles cleanly and the per-call
+helpers work without ever launching a server.
+"""
+
+from __future__ import annotations
+
+import json
+
+import pytest
+
+
+def test_app_imports() -> None:
+    import app  # noqa: F401
+
+
+def test_build_demo_compiles() -> None:
+    """Calling build_demo() must not raise — catches Gradio API drift."""
+    pytest.importorskip("gradio")
+    from app import build_demo
+
+    demo = build_demo()
+    assert demo is not None
+
+
+def test_build_user_profile_round_trip() -> None:
+    from app import build_user_profile
+
+    payload = build_user_profile(
+        name="Test",
+        age=30,
+        sex="male",
+        height_cm=175,
+        weight_kg=72,
+        activity="moderately active",
+        goal="maintain weight",
+        allergies="peanut, shrimp",
+        dislikes="okra",
+        country="Egypt",
+        conditions="hypertension",
+        medications="lisinopril",
+    )
+    # Round-trip via JSON to mirror what the hidden Textbox carries.
+    serialised = json.dumps(payload)
+    parsed = json.loads(serialised)
+
+    assert parsed["user_profile"]["name"] == "Test"
+    assert parsed["user_profile"]["allergies"] == ["peanut", "shrimp"]
+    assert parsed["medical_history"]["conditions"] == ["hypertension"]
+
+
+def test_render_metrics_is_markdown() -> None:
+    from app import _render_metrics
+
+    snap = {
+        "agents": {"Coach": {"calls": 1, "total_seconds": 0.5, "errors": 0, "last_seconds": 0.5}},
+        "tools": {"QuantitiesFinder": {"calls": 2, "total_seconds": 0.1, "errors": 0, "last_seconds": 0.05}},
+        "parsing": {"native": 5, "fallback": 0, "failure": 0, "by_model": {}},
+    }
+    md = _render_metrics(snap)
+    assert "Coach" in md
+    assert "QuantitiesFinder" in md
+    assert "native=5" in md
+
+
+def test_session_state_default_shape() -> None:
+    from app import SessionState
+
+    s = SessionState()
+    assert s.initialised is False
+    assert s.memory == {
+        "user_profile": {},
+        "medical_history": {},
+        "flags_and_assessments": {},
+        "plans": {},
+    }
+    assert s.conversation_history == []
+
+
+def test_chat_handles_uninitialised_system() -> None:
+    """Calling chat() before init must not crash; returns a friendly error."""
+    pytest.importorskip("gradio")
+    from app import SessionState, chat
+
+    # Make sure nutritionmas.APP is None so we hit the guard.
+    import nutritionmas
+    nutritionmas.APP = None
+
+    history, log, metrics, session = chat(
+        user_message="hi", history=[], session=SessionState(), profile_json=""
+    )
+    # messages-format chatbot: list of {role, content} dicts
+    assert history[-1]["role"] == "assistant"
+    assert history[-1]["content"].startswith("❌ System not initialised")