| # **Product Requirements Document: StepWise Math (Gradio MCP Implementation)** | |
| | Document Info | Details | | |
| | :--------------------- | :------------------------------------------------------------------------------- | | |
| | **Product Name** | StepWise Math - Gradio MCP Framework | | |
| | **Version** | 2.0 (MCP Server with Two-Stage Pipeline) | | |
| | **Status** | Active | | |
| | **Target Demographic** | Middle School (Grades 6-8) to High School (Grades 9-10) | | |
| | **Tech Stack** | **Gradio 6.0+**, Python, **Google Gemini 2.5 Flash & 3.0 Pro**, **MCP Protocol** | | |
| | **Deployment** | Hugging Face Spaces, Docker, Local Development | | |
| ## **1. Executive Summary** | |
| **StepWise Math** is a Gradio-based web application with **Model Context Protocol (MCP) server capabilities** that converts static mathematical problems—supplied via text, screenshots, or URLs—into **interactive, step-by-step visual proofs**. | |
| Unlike a calculator that just gives the answer, StepWise Math builds a bespoke HTML5 application that guides students through the logical stages of a proof or concept. It breaks down complex ideas into incremental steps (e.g., "Step 1: Construct the shape", "Step 2: Apply the transformation", "Step 3: Observe the result"), allowing students to manipulate variables at each stage to internalize the logic. | |
| ### **Key Architecture Features** | |
| * **MCP Server Integration:** Exposes proof generation tools via the Model Context Protocol, enabling AI agents and external tools to programmatically generate mathematical proofs | |
| * **Two-Stage AI Pipeline:** Separates concept analysis (fast, ~10-15s) from code generation (moderate, ~20-30s) to comply with MCP timeout constraints | |
| * **Gradio Framework:** Provides both web UI and MCP server endpoints through a single application instance | |
| * **Docker-Ready:** Fully containerized for deployment to Hugging Face Spaces or any Docker-compatible environment | |
| ## **2. Target Audience** | |
| * **Students (Grades 6-10):** Visual learners who need structured guidance to understand abstract concepts in Geometry, Algebra, and Trigonometry. | |
| * **Math Teachers:** Need "digital manipulatives" that walk the class through a concept phase-by-phase. | |
| * **Tutors:** Need to generate custom step-by-step explanations for specific homework problems. | |
| * **AI Agents & Developers:** Can programmatically generate mathematical proofs via MCP protocol integration for educational tools, chatbots, or automated tutoring systems. | |
| ## **3. User Flow & Experience** | |
| ### **3.1 High-Level Flow (Gradio Web UI)** | |
| 1. **Initial Load:** The application automatically loads the first example proof to demonstrate its capabilities | |
| 2. **Input Selection:** User chooses input mode (Text, URL, or Image) via Gradio interface | |
| 3. **Content Entry:** User provides the mathematical concept or problem | |
| 4. **API Key (Optional):** User can override default API key in settings | |
| 5. **Two-Stage Generation:** | |
| - **Stage 1:** Concept Analysis (Gemini 2.5 Flash, ~10-15s) → JSON Specification | |
| - **Stage 2:** Code Generation (Gemini 3.0 Pro, ~20-30s) → Interactive HTML/JS Application | |
| 6. **Visualization:** Generated proof displays in iframe with step navigation | |
| 7. **Actions:** Save to library, export as JSON, or refine with feedback | |
|  | |
| ### **3.2 MCP Server Flow (Programmatic Access)** | |
| 1. **Tool Discovery:** AI agent connects to Gradio MCP server endpoint | |
| 2. **Available Tools:** | |
| - `analyze_concept_from_text`: Analyzes text-based mathematical concept → returns JSON spec | |
| - `analyze_concept_from_url`: Analyzes concept from URL → returns JSON spec | |
| - `analyze_concept_from_image`: Analyzes concept from image → returns JSON spec | |
| - `generate_code_from_concept`: Generates interactive proof code from JSON spec → returns HTML/JS | |
| 3. **Two-Step Invocation:** | |
| - **Step 1:** Agent calls `analyze_concept_from_text/url/image` with input | |
| - **Step 2:** Agent calls `generate_code_from_concept` with JSON from Step 1 | |
| 4. **Output Handling:** Agent receives HTML/JS code for rendering or further processing | |
|  | |
| **MCP Architecture Diagram:** | |
|  | |
| ### **3.3 Feedback & Iteration Flow** | |
| The user can view the generated proof and provide text feedback (e.g., "Make the triangle red" or "Add a step for area calculation"). | |
| 1. **User Feedback:** User enters text in the "Refinement" panel (Gradio Textbox) | |
| 2. **Intent Analysis:** The system determines if the request requires a structural change (new steps) or just a visual update | |
| 3. **Regeneration:** | |
| - **Stage 1 (Refine Spec):** Gemini 2.5 Flash updates the JSON spec based on feedback | |
| - **Stage 2 (Refine Code):** Gemini 3.0 Pro rewrites the application code using the new spec and specific user instructions | |
| 4. **Update:** The Gradio HTML component refreshes with the modified application | |
| ## **4. Functional Requirements** | |
| ### **4.1 Multi-Modal Input Handling** | |
| The app must accept three distinct types of input: | |
| 1. **Natural Language Text:** e.g., "Prove the Pythagorean Theorem." | |
| 2. **Image/Screenshot:** A photo of a textbook problem. | |
| 3. **URL:** A link to a math concept video or page. | |
| * **Validation:** The system must strictly validate inputs (e.g., check for empty text, valid URL format, or missing image files) before communicating with the AI to prevent "hallucinated" default responses. | |
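The validation rule above can be sketched as a single pre-flight check that runs before any model call. This is a minimal sketch, not the application's actual code: the function name `validate_input`, its parameters, and the error messages are all assumptions made for illustration.

```python
from typing import Optional
from urllib.parse import urlparse


def validate_input(mode: str, text: str = "", url: str = "",
                   image_path: Optional[str] = None) -> Optional[str]:
    """Return an error message, or None when the input is usable."""
    if mode == "Text":
        if not text.strip():
            return "Text input is empty."
    elif mode == "URL":
        parsed = urlparse(url.strip())
        # Require an explicit web scheme and a host name.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            return "URL must start with http:// or https:// and include a host."
    elif mode == "Image":
        if not image_path:
            return "No image was provided."
    else:
        return f"Unknown input mode: {mode!r}"
    return None
```

Returning a message instead of raising keeps the check easy to surface in a UI status field; a caller would skip the AI request whenever the result is not `None`.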
| ### **4.2 The "Thinking" Engine (Gemini Integration)** | |
| **Two-Stage Pipeline Architecture (MCP-Optimized)** | |
| The application uses a **two-stage AI pipeline** designed to keep each tool call within MCP client timeout limits (typically under 30 seconds per call): | |
| * **Stage 1: Concept Decomposition (The Teacher)** | |
| * **Model:** `gemini-2.5-flash` | |
| * **Role:** Identifies the mathematical concept and breaks it down into a logical teaching sequence | |
| * **Execution Time:** ~10-15 seconds | |
| * **Output:** JSON Spec containing a list of **Steps**. Each step defines what the user should do and what they should see | |
| * **MCP Exposure:** Three separate tools based on input type: | |
| - `analyze_concept_from_text(text_input, api_key)` | |
| - `analyze_concept_from_url(url_input, api_key)` | |
| - `analyze_concept_from_image(image_input, api_key)` | |
| * **Return Value:** JSON string containing the `MathSpec` | |
| * **Stage 2: Implementation (The Engineer)** | |
| * **Model:** `gemini-3-pro-preview` | |
| * **Config:** `thinkingConfig: { thinkingBudget: 4096 }` | |
| * **Role:** Writes the HTML5/Canvas code based on the JSON specification | |
| * **Execution Time:** ~20-30 seconds | |
| * **Requirement:** The generated app must include a **Step Navigation System** (Next/Previous buttons, Progress bar) and distinct visual states for each step | |
| * **MCP Exposure:** Single tool for code generation: | |
| - `generate_code_from_concept(concept_json, api_key)` | |
| * **Return Value:** HTML string containing complete interactive application | |
| **Why Two Stages for MCP?** | |
| MCP clients typically enforce a timeout on each tool call. The original single-step `generate_proof` operation took 30-60 seconds, which exceeds the roughly 30-second window this design targets. Splitting it into two independent operations has the following benefits: | |
| - Each operation completes within timeout constraints | |
| - AI agents can cache the JSON spec and regenerate code multiple times without re-analyzing | |
| - More granular control over the generation process | |
| - Better error recovery (if Stage 2 fails, Stage 1 results are preserved) | |
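The caching and error-recovery benefits listed above can be illustrated with a small orchestration sketch. Everything here is hypothetical: `run_pipeline`, `fake_analyze`, and `flaky_generate` are stand-ins invented for the example, and the real stages would call the Gemini models instead of these stubs.

```python
import json
from typing import Callable, Dict


def run_pipeline(text: str,
                 analyze: Callable[[str], str],
                 generate: Callable[[str], str],
                 spec_cache: Dict[str, str],
                 max_attempts: int = 2) -> str:
    """Run Stage 1 at most once per input, then retry Stage 2 on the cached spec."""
    if text not in spec_cache:
        spec_cache[text] = analyze(text)      # Stage 1: concept analysis
    spec_json = spec_cache[text]
    json.loads(spec_json)                     # fail fast if the spec is not valid JSON
    last_error = None
    for _ in range(max_attempts):
        try:
            return generate(spec_json)        # Stage 2: code generation
        except RuntimeError as exc:           # e.g. a timeout surfaced by the caller
            last_error = exc
    raise RuntimeError("Stage 2 failed after retries") from last_error


# Stand-in stages so the sketch runs without any network access.
calls = {"analyze": 0, "generate": 0}


def fake_analyze(text: str) -> str:
    calls["analyze"] += 1
    return json.dumps({"conceptTitle": text, "steps": []})


def flaky_generate(spec_json: str) -> str:
    calls["generate"] += 1
    if calls["generate"] == 1:
        raise RuntimeError("simulated timeout")
    return "<!DOCTYPE html><title>" + json.loads(spec_json)["conceptTitle"] + "</title>"


cache: Dict[str, str] = {}
html = run_pipeline("Pythagorean Theorem", fake_analyze, flaky_generate, cache)
```

In this run Stage 2 fails once and succeeds on the retry, while Stage 1 is called only once because its result stays in `cache`.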
| ### **4.3 Output & Interaction** | |
| * **Guided Experience:** The app starts at "Step 1". The user reads an instruction, interacts with the visual, and clicks "Next" to proceed | |
| * **Interactive Canvas:** Graphics update based on the current step. For example, construction lines might appear only in Step 2 | |
| * **Live Feedback:** Equations and values update in real-time as user drags elements | |
| * **Gradio Components:** | |
| - **HTML Component:** Displays the generated interactive proof application | |
| - **JSON Component:** Shows the MathSpec structure for debugging/inspection | |
| - **Textbox Components:** Display process logs and thinking streams | |
| - **Accordion/Tab Layouts:** Organize different views (Proof, Spec, Code, Logs) | |
| ### **4.4 Feedback Loop** | |
| * **Refinement Interface:** A text input field below the simulation area allows users to request changes. | |
| * **Context Awareness:** The AI must receive the *previous* JSON specification and the *new* user feedback to generate a delta or a completely new version. | |
| * **Logic:** | |
| * If the feedback changes the math concept (e.g., "Switch to Isosceles"), Stage 1 must regenerate the steps. | |
| * If the feedback is cosmetic (e.g., "Dark mode"), Stage 2 must implement it while preserving the logic. | |
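The routing logic above can be sketched as a function that decides which stages to rerun. This sketch swaps in a plain keyword heuristic purely for illustration; the document does not say how intent is detected, and a model-based classifier is equally plausible. `STRUCTURAL_HINTS`, `plan_refinement`, and the stage labels are hypothetical names.

```python
from typing import List

# Hypothetical phrases that suggest the teaching sequence itself must change.
STRUCTURAL_HINTS = ("step", "switch to", "prove", "theorem")


def plan_refinement(feedback: str) -> List[str]:
    """Return the stages to rerun for a piece of user feedback."""
    text = feedback.lower()
    if any(hint in text for hint in STRUCTURAL_HINTS):
        # Concept-level change: refresh the spec, then regenerate the code.
        return ["refine_spec", "refine_code"]
    # Cosmetic change: keep the spec and only rewrite the code.
    return ["refine_code"]
```

With these hints, "Switch to Isosceles" reruns both stages, while "Dark mode" touches only code generation.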
| ### **4.5 Export & Sharing** [Coming Soon] | |
| * **Export Button:** A Gradio Button to download the current session | |
| * **Format:** A JSON file containing: | |
| * **Input Data:** The original problem text, URL, or image data | |
| * **Math Concept:** The JSON Specification (Steps, Explanation) | |
| * **Source Code:** The Generated HTML/JS | |
| * **Metadata:** Timestamp, input mode | |
| * **Exclusion:** Process Logs are **not** included in the export file to keep it clean | |
| * **Import Capability:** A Gradio File Upload component to restore a previously exported JSON file. This restores the input fields, the concept specification, and the interactive proof code | |
| * **Implementation:** Uses Gradio's `gr.DownloadButton` and `gr.File` components | |
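A minimal sketch of how the export payload described above might be assembled. The key names follow the Export/Save Format shown later in this document; the function name `build_export` and its parameters are assumptions. Process logs are accepted as an argument only to show that they are deliberately left out.

```python
import json
from datetime import datetime
from typing import Any, Dict, List, Optional


def build_export(title: str, input_mode: str, input_data: str,
                 concept: Dict[str, Any], code: str,
                 process_logs: Optional[List[str]] = None,
                 now: Optional[datetime] = None) -> str:
    """Serialize the current session to a JSON string for download."""
    record = {
        "title": title,
        "timestamp": (now or datetime.now()).isoformat(timespec="seconds"),
        "input_mode": input_mode,
        "input_data": input_data,
        "concept": concept,
        "code": code,
        "metadata": {"generated_by": "StepWise Math Gradio MCP", "version": "2.0"},
    }
    # process_logs is intentionally not written to the export file.
    return json.dumps(record, indent=2)
```

The resulting string could be written to a temporary file and handed to a download component; importing is the reverse step, `json.loads` on the uploaded file's contents.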
| ### **4.6 Persistence (Local Storage via ProofLibrary)** [Coming Soon] | |
| * **Save Capability:** Users can save the currently generated proof (Math Spec + Code) using a Gradio Button | |
| * **Backend Storage:** Python-based `ProofLibrary` class manages proof persistence: | |
| - Saves to `saved_proofs/` directory as JSON files | |
| - Each proof file contains: title, timestamp, input data, concept spec, generated code | |
| - Filename format: `proof_YYYYMMDD_HHMMSS.json` | |
| * **Library View:** A Gradio component (Dropdown or Gallery) lists previously saved items with timestamps and concept titles | |
| * **Load Capability:** Users can instantly restore a previously generated proof from the library without re-querying the AI | |
| * **File System:** Unlike browser Local Storage, the Gradio implementation stores proofs on the server-side file system, which is more reliable and works inside Docker containers | |
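The storage behaviour described above could look roughly like the following. This is a sketch under assumptions, not the actual `ProofLibrary` class: only the directory name, the file contents, and the `proof_YYYYMMDD_HHMMSS.json` filename format come from this document, while the method names and signatures are invented for illustration.

```python
import json
from datetime import datetime
from pathlib import Path
from typing import Any, Dict, List, Optional


class ProofLibrary:
    """File-backed store for generated proofs (illustrative sketch)."""

    def __init__(self, root: str = "saved_proofs"):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def save(self, title: str, input_data: str, concept: Dict[str, Any],
             code: str, now: Optional[datetime] = None) -> Path:
        now = now or datetime.now()
        record = {
            "title": title,
            "timestamp": now.isoformat(timespec="seconds"),
            "input_data": input_data,
            "concept": concept,
            "code": code,
        }
        path = self.root / f"proof_{now:%Y%m%d_%H%M%S}.json"
        path.write_text(json.dumps(record, indent=2), encoding="utf-8")
        return path

    def list_titles(self) -> List[str]:
        """Return 'timestamp - title' labels, oldest first, for a dropdown."""
        labels = []
        for path in sorted(self.root.glob("proof_*.json")):
            record = json.loads(path.read_text(encoding="utf-8"))
            labels.append(f'{record["timestamp"]} - {record["title"]}')
        return labels

    def load(self, filename: str) -> Dict[str, Any]:
        return json.loads((self.root / filename).read_text(encoding="utf-8"))
```

Because the timestamp is part of the filename, sorting filenames also sorts proofs chronologically; two saves within the same second would collide, which a real implementation would need to handle.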
| ### **4.7 Configuration & API Key Management** | |
| * **Configuration Interface:** Gradio Accordion component in settings panel | |
| * **API Key Management:** | |
| * The app defaults to using the `GEMINI_API_KEY` from environment variables (`os.getenv("GEMINI_API_KEY")`) | |
| * Users can optionally provide their own API Key via a Gradio Textbox (type="password") | |
| * **Logic:** If a custom key is provided to MCP tools or web UI, it takes precedence over the environment variable | |
| * **MCP Tools:** API key is an optional parameter in all MCP-exposed functions: | |
| - `analyze_concept_from_text(text_input, api_key: Optional[str] = None)` | |
| - `generate_code_from_concept(concept_json, api_key: Optional[str] = None)` | |
| * **Security:** Custom API keys are passed per-request and not persisted server-side | |
| * **Environment Variable Setup:** | |
| ```bash | |
| # Linux/Mac | |
| export GEMINI_API_KEY="your-api-key-here" | |
| # Windows PowerShell | |
| $env:GEMINI_API_KEY="your-api-key-here" | |
| # Docker | |
| docker run -e GEMINI_API_KEY="your-key" -p 7860:7860 hf-stepwise-math | |
| ``` | |
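The precedence rule above (custom key first, environment variable second) can be sketched in a few lines. The helper name `resolve_api_key` and the error message are assumptions; only the `GEMINI_API_KEY` variable name comes from this document.

```python
import os
from typing import Optional


def resolve_api_key(custom_key: Optional[str] = None) -> str:
    """Prefer a per-request key, then fall back to GEMINI_API_KEY."""
    key = (custom_key or "").strip() or os.getenv("GEMINI_API_KEY", "").strip()
    if not key:
        raise ValueError("No Gemini API key: set GEMINI_API_KEY or pass api_key.")
    return key
```

The function returns the key without storing it anywhere, which matches the requirement that custom keys are not persisted server-side.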
| ### **4.8 Thinking Process Streaming (Enhanced UI)** [Coming Soon] | |
| * **Streaming Thoughts:** The application utilizes Gemini's `thinkingConfig` with `includeThoughts: true` to capture the model's internal reasoning process | |
| * **Gradio Implementation:** | |
| * **Textbox Component:** Displays thinking stream with `max_lines=20` for scrollable content | |
| * **Real-time Updates:** Uses Gradio's streaming capabilities to update UI as thoughts arrive | |
| * **Markdown Rendering:** `gr.Textbox` displays plain text, so a `gr.Markdown` component is needed wherever the thinking stream should be rendered as Markdown | |
| * **Display Features:** | |
| * **Timer:** Progress message showing elapsed time (e.g., "Running for 12s") | |
| * **Structured Layout:** Separate sections for "Analysis Phase" and "Code Generation Phase" | |
| * **Collapsible Accordions:** Users can expand/collapse thought details to focus on results | |
| * **Process Logs:** | |
| * Separate from thinking stream | |
| * Shows high-level pipeline progress: "Starting Stage 1...", "Concept analyzed", "Generating code..." | |
| * Stored in `GeminiPipeline.process_logs` list for debugging | |
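One way to picture the streaming display and timer described above is a generator that yields the accumulated thoughts together with an elapsed-time label; Gradio event handlers can be generator functions whose yielded values update the output components. The name `stream_thoughts`, the injectable `clock`, and the tuple layout are assumptions made for this sketch.

```python
import time
from typing import Callable, Iterable, Iterator, List, Tuple


def stream_thoughts(chunks: Iterable[str],
                    clock: Callable[[], float] = time.monotonic
                    ) -> Iterator[Tuple[str, str]]:
    """Yield (accumulated thoughts, progress message) after each chunk."""
    start = clock()
    seen: List[str] = []
    for chunk in chunks:
        seen.append(chunk)
        elapsed = int(clock() - start)        # whole seconds since the stream began
        yield "".join(seen), f"Running for {elapsed}s"
```

In an application, `chunks` would be the thought fragments arriving from the model, and the two yielded strings would map to the thinking textbox and the progress message.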
| ### **4.9 Pre-loaded Examples Library** | |
| * **Examples Section:** Gradio Dropdown component populated from `examples/` directory | |
| * **File Format:** Each example is a JSON file containing: | |
| ```json | |
| { | |
| "title": "Visual Proof: Pythagorean Theorem", | |
| "input_mode": "Text", | |
| "input_data": "Prove the Pythagorean theorem...", | |
| "concept": { /* MathSpec JSON */ }, | |
| "code": "<!DOCTYPE html>..." | |
| } | |
| ``` | |
| * **Initial Load Behavior:** On application startup, automatically load the first example (or a designated default example) to: | |
| - Provide immediate visual demonstration of app capabilities | |
| - Avoid empty/blank initial state | |
| - Give users instant understanding of the output format | |
| - Enable immediate interaction without waiting for AI generation | |
| * **One-Click Loading:** Selecting an example from dropdown triggers a Gradio event handler that: | |
| - Populates input fields with example data | |
| - Loads the pre-generated concept spec into JSON viewer | |
| - Renders the code in the HTML iframe | |
| - Bypasses AI generation for instant loading | |
| * **Content:** The library covers diverse topics: | |
| - Geometry: Pythagorean Theorem, Area of Quadrilaterals, Altitude-Hypotenuse Ratios | |
| - Probability: Probability of Odd Sums | |
| - Algebra: Diagonals in Rhombus | |
| * **Example Files:** Located in `examples/` directory with naming convention `NNN-visual-proof-{topic}.json`, where `NNN` is a three-digit sequence number (e.g., `001`) | |
| * **Default Example:** First example in alphabetical order (`001-visual-proof-probability-of-an-odd-sum.json`) loads automatically on app initialization | |
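The example-loading behaviour above can be sketched as follows. The function names (`list_examples`, `load_example`, `default_example`) are hypothetical; the required keys are taken from the example file format shown earlier in this section.

```python
import json
from pathlib import Path
from typing import Any, Dict, List, Optional

EXAMPLE_KEYS = {"title", "input_mode", "input_data", "concept", "code"}


def list_examples(examples_dir: str = "examples") -> List[Path]:
    """Return example files in alphabetical order by filename."""
    return sorted(Path(examples_dir).glob("*.json"), key=lambda p: p.name)


def load_example(path: Path) -> Dict[str, Any]:
    """Read one example and check that the expected keys are present."""
    data = json.loads(path.read_text(encoding="utf-8"))
    missing = EXAMPLE_KEYS - data.keys()
    if missing:
        raise ValueError(f"{path.name} is missing keys: {sorted(missing)}")
    return data


def default_example(examples_dir: str = "examples") -> Optional[Dict[str, Any]]:
    """Load the alphabetically first example, or None if the folder is empty."""
    paths = list_examples(examples_dir)
    return load_example(paths[0]) if paths else None
```

Since the pre-generated `concept` and `code` are already in the file, loading an example needs no AI call; the handler only has to copy the fields into the corresponding components.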
| ### **4.10 MCP Server Integration** | |
| * **Gradio MCP Support:** Application launches with `mcp_server=True` flag to enable MCP protocol endpoints | |
| * **Tool Exposure Mechanism:** | |
| - Gradio only exposes methods that are connected to UI components as MCP tools | |
| - Hidden UI components (created with `visible=False` in a `gr.Group`) are used to expose MCP-specific methods | |
| - Event handlers connect methods to hidden buttons/textboxes for MCP discovery | |
| * **Exposed MCP Tools (4 Total):** | |
| 1. **analyze_concept_from_text** | |
| - **Parameters:** `text_input: str`, `api_key: Optional[str]` | |
| - **Returns:** JSON string containing MathSpec | |
| - **Purpose:** Fast concept analysis for text input | |
| - **Timeout:** ~10-15 seconds | |
| 2. **analyze_concept_from_url** | |
| - **Parameters:** `url_input: str`, `api_key: Optional[str]` | |
| - **Returns:** JSON string containing MathSpec | |
| - **Purpose:** Fast concept analysis from URL content | |
| - **Timeout:** ~10-15 seconds | |
| 3. **analyze_concept_from_image** | |
| - **Parameters:** `image_input: str` (base64 or file path), `api_key: Optional[str]` | |
| - **Returns:** JSON string containing MathSpec | |
| - **Purpose:** Fast concept analysis from image | |
| - **Timeout:** ~10-15 seconds | |
| 4. **generate_code_from_concept** | |
| - **Parameters:** `concept_json: str`, `api_key: Optional[str]` | |
| - **Returns:** HTML string containing interactive proof application | |
| - **Purpose:** Generate code from previously analyzed concept | |
| - **Timeout:** ~20-30 seconds | |
| * **MCP Usage Pattern:** | |
| ```python | |
| # `mcp_client` is a placeholder for an already-connected MCP client session | |
| # Step 1: Analyze concept | |
| concept_json = mcp_client.call_tool( | |
|     "analyze_concept_from_text", | |
|     {"text_input": "Prove Pythagorean theorem", "api_key": "optional-key"} | |
| ) | |
| # Step 2: Generate code from the spec returned by Step 1 | |
| html_code = mcp_client.call_tool( | |
|     "generate_code_from_concept", | |
|     {"concept_json": concept_json, "api_key": "optional-key"} | |
| ) | |
| ``` | |
| * **Testing MCP Server:** | |
| ```bash | |
| # Launch MCP Inspector | |
| npx @modelcontextprotocol/inspector | |
| # Connect to: http://localhost:7860/mcp | |
| # Available tools will appear in inspector UI | |
| ``` | |
| * **Hidden UI Components (MCP Exposure):** | |
| ```python | |
| import gradio as gr | |
| # analyze_concept_from_text is assumed to be defined elsewhere in app.py | |
| with gr.Blocks() as demo: | |
|     with gr.Group(visible=False) as mcp_hidden_group: | |
|         # Analysis tools | |
|         mcp_analyze_text_input = gr.Textbox() | |
|         mcp_analyze_text_btn = gr.Button() | |
|         mcp_analyze_text_output = gr.Textbox() | |
|         # Code generation tool | |
|         mcp_generate_concept_input = gr.Textbox() | |
|         mcp_generate_concept_btn = gr.Button() | |
|         mcp_generate_concept_output = gr.Textbox() | |
|     # Event handlers connect methods for MCP | |
|     mcp_analyze_text_btn.click( | |
|         fn=analyze_concept_from_text, | |
|         inputs=[mcp_analyze_text_input], | |
|         outputs=[mcp_analyze_text_output] | |
|     ) | |
| ``` | |
| ## **5. Feature Specifications (Examples)** | |
| ### **Example 1: The Pythagorean Theorem** | |
| * **Generated App Steps:** | |
| * **Step 1: Setup:** Display a right triangle. User drags vertices to resize. "Observe the legs a, b and hypotenuse c." | |
| * **Step 2: Geometric Construction:** Squares appear on each side. "We build a square on each side of the triangle." | |
| * **Step 3: Area Calculation:** The app calculates the area of each square. "Note the values: A = a², B = b², C = c²." | |
| * **Step 4: The Proof:** The app rearranges the areas or shows the equation `Area A + Area B = Area C`. User drags vertices to verify it holds true for *any* right triangle. | |
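The relation that Steps 3 and 4 ask the student to verify can be checked numerically. This is a small sketch, with `square_areas` as a hypothetical helper name, of the arithmetic a generated app would redo each time a vertex is dragged.

```python
import math
from typing import Tuple


def square_areas(a: float, b: float) -> Tuple[float, float, float]:
    """Areas of the squares built on legs a, b and on the hypotenuse c."""
    c = math.hypot(a, b)          # hypotenuse of the right triangle
    return a * a, b * b, c * c


area_a, area_b, area_c = square_areas(3, 4)
holds = math.isclose(area_a + area_b, area_c)
```

For the 3-4-5 triangle the three areas are 9, 16, and 25, so `Area A + Area B = Area C` holds; the same check passes for any positive leg lengths, up to floating-point rounding.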
| ### **Example 2: Slope Intercept Form** | |
| * **Input:** Text: "Explain y = mx + b" | |
| * **Generated App Steps:** | |
| * **Step 1: The Grid:** Shows a coordinate plane. | |
| * **Step 2: The Y-Intercept:** User adjusts slider `b`. The line moves up/down. "b controls where the line crosses the Y-axis." | |
| * **Step 3: The Slope:** User adjusts slider `m`. The line rotates. "m controls the steepness." | |
| * **Step 4: Prediction:** User is asked to set sliders to match a target line. | |
| ## **9. Development & Testing** | |
| ### **9.1 Local Development** | |
| ```bash | |
| # Clone repository | |
| git clone <repo-url> | |
| cd hf-StepWise-Math/gradio-app | |
| # Install dependencies | |
| pip install -r requirements.txt | |
| # Set API key (run only the line that matches your shell) | |
| export GEMINI_API_KEY="your-api-key" # Linux/Mac (bash/zsh) | |
| # $env:GEMINI_API_KEY="your-api-key" # Windows PowerShell | |
| # Run application | |
| python app.py | |
| ``` | |
| ### **9.2 Testing MCP Integration** | |
| ```bash | |
| # Terminal 1: Launch Gradio app with MCP server | |
| cd gradio-app | |
| python app.py | |
| # Terminal 2: Launch MCP Inspector | |
| npx @modelcontextprotocol/inspector | |
| # In Inspector UI: | |
| # 1. Connect to: http://localhost:7860/mcp | |
| # 2. Verify 4 tools appear: analyze_concept_from_text/url/image, generate_code_from_concept | |
| # 3. Test two-step workflow: | |
| # - Call analyze_concept_from_text with test input | |
| # - Copy JSON result | |
| # - Call generate_code_from_concept with JSON | |
| ``` | |
| ### **9.3 Docker Testing** | |
| ```bash | |
| # Build image | |
| docker build -t hf-stepwise-math . | |
| # Run container | |
| docker run --rm -it -e GEMINI_API_KEY="your-key" -p 7860:7860 hf-stepwise-math | |
| # Access at: http://localhost:7860 | |
| # MCP endpoint: http://localhost:7860/mcp | |
| ``` | |
| ### **9.4 Test Files** | |
| * `test_app.py`: Unit tests for Gradio components | |
| * `test_generate_proof.py`: Integration tests for AI pipeline | |
| * `test_logging.py`: Logging system validation | |
| * Example files in `examples/`: Pre-generated proofs for UI testing | |
| ### **9.5 Deployment Checklist** | |
| - [ ] GEMINI_API_KEY environment variable configured | |
| - [ ] Docker image builds successfully | |
| - [ ] MCP server accessible at `/mcp` endpoint | |
| - [ ] All 4 MCP tools discoverable | |
| - [ ] Example library loads correctly | |
| - [ ] **Default example auto-loads on app initialization** | |
| - [ ] Save/Load functionality works with file system | |
| - [ ] Export/Import produces valid JSON | |
| - [ ] Two-stage pipeline completes within timeout constraints | |
| ## **7. UI/UX Design (Gradio Components)** | |
| * **Input Panel (Left Column):** | |
| - **Radio Buttons:** Select input mode (Text/URL/Image) | |
| - **Conditional Components:** Show relevant input field based on mode | |
| - **Textbox:** For text input | |
| - **Textbox:** For URL input | |
| - **Image Upload:** For image input | |
| - **Textbox (Password):** Optional API key override | |
| - **Button:** "Generate Proof" | |
| - **Dropdown:** Example selection | |
| - **Accordion:** Settings panel | |
| * **Output Panel (Right Column):** | |
| - **Tabs Component:** | |
| 1. **Interactive Proof:** `gr.HTML()` component displaying generated application | |
| 2. **Concept Spec:** `gr.JSON()` component showing MathSpec structure | |
| 3. **Source Code:** `gr.Code(language="html")` for viewing/editing generated code | |
| 4. **Process Logs:** `gr.Textbox()` with process execution details | |
| 5. **Thinking Stream:** `gr.Textbox()` with AI reasoning (collapsible accordion) | |
| * **Action Buttons:** | |
| - **Save to Library:** Stores proof to `saved_proofs/` directory | |
| - **Export/Download:** `gr.DownloadButton()` for JSON export | |
| - **Import/Upload:** `gr.File()` for JSON import | |
| - **Load from Library:** `gr.Dropdown()` with saved proofs | |
| * **Refinement Panel (Below Output):** | |
| - **Textbox:** Multi-line input for feedback | |
| - **Button:** "Refine Proof" to trigger regeneration | |
| ## **8. Technical Constraints & Requirements** | |
| ### **8.1 MCP Protocol Constraints** | |
| * **Timeout Limitation:** Each MCP tool call must complete within ~30 seconds | |
| * **Solution:** Two-stage pipeline splits 30-60s operation into 10-15s + 20-30s stages | |
| * **Tool Discovery:** Tools must be connected to Gradio UI components (even if hidden) to be exposed via MCP | |
| * **Parameter Handling:** All MCP tool parameters must be either required or properly typed with `Optional[]` | |
| * **No Conditional Parameters:** Cannot require different parameters based on conditions (e.g., url_input required when mode="URL") | |
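The parameter constraints above explain why each input type gets its own tool: every function has one always-required argument plus an optional key. The sketch below shows such a signature as a stub; the docstring wording is an assumption, and `required_params` is a hypothetical helper used only to inspect the result. Gradio is understood to derive tool descriptions from type hints and docstrings, which is why both are spelled out.

```python
import inspect
from typing import Callable, List, Optional


def analyze_concept_from_url(url_input: str, api_key: Optional[str] = None) -> str:
    """Analyze a math concept from a URL and return a MathSpec JSON string.

    Args:
        url_input: Link to a page or video that explains the concept.
        api_key: Optional Gemini API key; the environment variable is used when omitted.
    """
    raise NotImplementedError("Stub: the real Stage 1 call is outside this sketch.")


def required_params(fn: Callable) -> List[str]:
    """List the parameters a caller must always supply."""
    return [name for name, p in inspect.signature(fn).parameters.items()
            if p.default is inspect.Parameter.empty]
```

A single combined tool would need `url_input` to be required only when the mode is "URL", which a flat parameter list cannot express.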
| ### **8.2 Gemini API Configuration** | |
| * **Thinking Budget:** Uses high reasoning budget (4096) for Stage 2 (code generation) to ensure robust logic | |
| * **Model Selection:** | |
| - Stage 1: `gemini-2.5-flash` for fast analysis | |
| - Stage 2: `gemini-3-pro-preview` for extended thinking during code generation | |
| * **API Key Management:** Supports environment variable fallback with optional per-request override | |
| ### **8.3 Gradio Framework** | |
| * **Version:** Gradio 6.0+ (supports `mcp_server=True` flag) | |
| * **Launch Command:** `demo.launch(mcp_server=True, share=False, server_port=7860)` | |
| * **Hidden Components:** Required for MCP tool exposure without cluttering UI | |
| * **Event Handlers:** Connect Python methods to Gradio components for both UI and MCP access | |
| ### **8.4 Visual Clarity (Generated Code)** | |
| * **Layout Requirements:** Generated code **MUST** prioritize visual clarity | |
| * **Separation:** Overlapping elements are strictly prohibited | |
| * **CSS/Layout:** Must use Flexbox/Grid to strictly separate graphics area (Canvas/SVG) from controls and textual instructions | |
| * **Step Navigation:** Generated apps must include prominent navigation UI (buttons, progress indicators) | |
| ### **8.5 Docker & Deployment** | |
| * **Containerization:** Application is fully Docker-ready with `Dockerfile` and `requirements.txt` | |
| * **Port Mapping:** Exposes port 7860 for web UI and MCP server | |
| * **Environment Variables:** API key passed via `-e GEMINI_API_KEY="..."` | |
| * **Hugging Face Spaces:** Compatible with HF Spaces deployment (uses `gradio` template) | |
| * **Build Command:** `docker build -t hf-stepwise-math .` | |
| * **Run Command:** `docker run -e GEMINI_API_KEY="key" -p 7860:7860 hf-stepwise-math` | |
| ### **8.6 Security & Safety** | |
| * **Content Filtering:** Image inputs should be validated for appropriate educational content | |
| * **API Key Security:** Custom keys are per-request only, not persisted server-side | |
| * **CORS:** Gradio handles CORS automatically for web UI and MCP endpoints | |
| * **Rate Limiting:** Consider implementing rate limits for MCP tool calls in production | |
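For the rate-limiting suggestion above, one common option is a per-client sliding window. The sketch below is an illustration rather than a recommendation of specific limits: the class name `SlidingWindowLimiter`, the `client_id` key, and the injectable `clock` are all assumptions.

```python
import time
from collections import deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allow at most max_calls per client within any window_seconds span."""

    def __init__(self, max_calls: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self._clock = clock
        self._calls: Dict[str, Deque[float]] = {}

    def allow(self, client_id: str) -> bool:
        now = self._clock()
        calls = self._calls.setdefault(client_id, deque())
        # Drop timestamps that have aged out of the window.
        while calls and now - calls[0] >= self.window_seconds:
            calls.popleft()
        if len(calls) >= self.max_calls:
            return False
        calls.append(now)
        return True
```

A tool function would call `allow(...)` before starting a Gemini request and return an error message when it yields `False`. The state lives in process memory, so it resets on restart and is not shared across workers.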
| ## **10. Data Models (Python Implementation)** | |
| **MathSpec (Python Class)** | |
| ```python | |
| class MathSpec: | |
|     """Structured mathematical concept specification""" | |
|     def __init__(self, data: dict): | |
|         # Missing keys fall back to empty values so partial specs still load | |
|         self.concept_title = data.get("conceptTitle", "") | |
|         self.educational_goal = data.get("educationalGoal", "") | |
|         self.explanation = data.get("explanation", "") | |
|         self.steps = data.get("steps", []) | |
|         self.visual_spec = data.get("visualSpec", {}) | |
|     def to_dict(self) -> dict: | |
|         return { | |
|             "conceptTitle": self.concept_title, | |
|             "educationalGoal": self.educational_goal, | |
|             "explanation": self.explanation, | |
|             "steps": self.steps, | |
|             "visualSpec": self.visual_spec | |
|         } | |
| ``` | |
| **MathSpec JSON Format** | |
| ```json | |
| { | |
| "conceptTitle": "Pythagorean Theorem", | |
| "educationalGoal": "Prove a^2 + b^2 = c^2", | |
| "explanation": "In a right-angled triangle...", | |
| "steps": [ | |
| { | |
| "stepTitle": "The Triangle", | |
| "instruction": "Drag the red dots to change the shape of the right triangle.", | |
| "visualFocus": "Triangle ABC" | |
| }, | |
| { | |
| "stepTitle": "Adding Squares", | |
| "instruction": "Click Next to visualize squares attached to each side.", | |
| "visualFocus": "Squares on sides a, b, c" | |
| } | |
| ], | |
| "visualSpec": { | |
| "elements": ["Triangle", "Squares", "Grid"], | |
| "interactions": ["Drag Vertex", "Hover info"], | |
| "mathLogic": "Calculate distances..." | |
| } | |
| } | |
| ``` | |
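Because Stage 2 receives the spec as a JSON string, a structural check before code generation can catch malformed input early. The sketch below assumes that every top-level key and every per-step key shown in the format above is mandatory; that requirement, along with the name `parse_math_spec`, is an assumption rather than something this document states.

```python
import json
from typing import Any, Dict

REQUIRED_TOP_LEVEL = ("conceptTitle", "educationalGoal", "explanation", "steps", "visualSpec")
REQUIRED_STEP_KEYS = ("stepTitle", "instruction", "visualFocus")


def parse_math_spec(spec_json: str) -> Dict[str, Any]:
    """Parse a MathSpec JSON string and verify its basic shape."""
    spec = json.loads(spec_json)
    if not isinstance(spec, dict):
        raise ValueError("MathSpec must be a JSON object")
    missing = [key for key in REQUIRED_TOP_LEVEL if key not in spec]
    if missing:
        raise ValueError(f"MathSpec is missing keys: {missing}")
    if not isinstance(spec["steps"], list) or not spec["steps"]:
        raise ValueError("MathSpec needs at least one step")
    for index, step in enumerate(spec["steps"], start=1):
        absent = [key for key in REQUIRED_STEP_KEYS if key not in step]
        if absent:
            raise ValueError(f"Step {index} is missing keys: {absent}")
    return spec
```

The returned dictionary can be passed straight to the `MathSpec` constructor.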
| **Export/Save Format** | |
| ```json | |
| { | |
| "title": "Visual Proof: Pythagorean Theorem", | |
| "timestamp": "2025-11-25T10:30:00", | |
| "input_mode": "Text", | |
| "input_data": "Prove the Pythagorean theorem using visual methods", | |
| "concept": { | |
| "conceptTitle": "Pythagorean Theorem", | |
| "educationalGoal": "...", | |
| "steps": [...] | |
| }, | |
| "code": "<!DOCTYPE html>...", | |
| "metadata": { | |
| "generated_by": "StepWise Math Gradio MCP", | |
| "version": "2.0" | |
| } | |
| } | |
| ``` | |
| ## **11. File Structure** | |
| ``` | |
| gradio-app/ | |
| ├── app.py # Main Gradio application with MCP server | |
| ├── requirements.txt # Python dependencies | |
| ├── Dockerfile # Container configuration | |
| ├── setup.sh / setup.bat # Environment setup scripts | |
| ├── README.md # Documentation | |
| ├── saved_proofs/ # User-generated proofs (persistent storage) | |
| │ ├── proof_20251125_103000.json | |
| │ └── proof_20251125_154500.json | |
| ├── examples/ # Pre-loaded example proofs | |
| │ ├── 001-visual-proof-pythagorean-theorem.json | |
| │ ├── 002-visual-proof-probability-odd-sum.json | |
| │ └── ... | |
| ├── test_app.py # Unit tests | |
| ├── test_generate_proof.py # Integration tests | |
| └── docs/ # Additional documentation | |
| ├── DEPLOYMENT.md | |
| ├── TEST_GUIDE.md | |
| └── MCP_SCHEMA_README.md | |
| ``` | |
| ## **12. Appendix: MCP Tool Schema** | |
| ### **Tool 1: analyze_concept_from_text** | |
| ```json | |
| { | |
| "name": "analyze_concept_from_text", | |
| "description": "Analyze a text-based mathematical concept and generate structured JSON specification", | |
| "inputSchema": { | |
| "type": "object", | |
| "properties": { | |
| "text_input": { | |
| "type": "string", | |
| "description": "Mathematical concept or problem description" | |
| }, | |
| "api_key": { | |
| "type": "string", | |
| "description": "Optional Gemini API key (uses environment variable if not provided)" | |
| } | |
| }, | |
| "required": ["text_input"] | |
| }, | |
| "outputSchema": { | |
| "type": "string", | |
| "description": "JSON string containing MathSpec structure" | |
| } | |
| } | |
| ``` | |
| ### **Tool 2: analyze_concept_from_url** | |
| ```json | |
| { | |
| "name": "analyze_concept_from_url", | |
| "description": "Analyze mathematical concept from URL and generate structured JSON specification", | |
| "inputSchema": { | |
| "type": "object", | |
| "properties": { | |
| "url_input": { | |
| "type": "string", | |
| "description": "URL to mathematical concept page or video" | |
| }, | |
| "api_key": { | |
| "type": "string", | |
| "description": "Optional Gemini API key" | |
| } | |
| }, | |
| "required": ["url_input"] | |
| } | |
| } | |
| ``` | |
| ### **Tool 3: analyze_concept_from_image** | |
| ```json | |
| { | |
| "name": "analyze_concept_from_image", | |
| "description": "Analyze mathematical concept from image and generate structured JSON specification", | |
| "inputSchema": { | |
| "type": "object", | |
| "properties": { | |
| "image_input": { | |
| "type": "string", | |
| "description": "Base64-encoded image or file path" | |
| }, | |
| "api_key": { | |
| "type": "string", | |
| "description": "Optional Gemini API key" | |
| } | |
| }, | |
| "required": ["image_input"] | |
| } | |
| } | |
| ``` | |
| ### **Tool 4: generate_code_from_concept** | |
| ```json | |
| { | |
| "name": "generate_code_from_concept", | |
| "description": "Generate interactive HTML/JS proof application from JSON concept specification", | |
| "inputSchema": { | |
| "type": "object", | |
| "properties": { | |
| "concept_json": { | |
| "type": "string", | |
| "description": "JSON string containing MathSpec (from analyze_concept_* tools)" | |
| }, | |
| "api_key": { | |
| "type": "string", | |
| "description": "Optional Gemini API key" | |
| } | |
| }, | |
| "required": ["concept_json"] | |
| }, | |
| "outputSchema": { | |
| "type": "string", | |
| "description": "HTML string containing complete interactive proof application" | |
| } | |
| } | |
| ``` | |
| ## **13. Model Context Protocol (MCP) Integration** | |
| StepWise Math functions as a complete **MCP Server**, exposing its capabilities to external AI agents and automation tools. This enables programmatic access to the visual proof generation pipeline. | |
| ### **13.1 MCP Tools** | |
| The application exposes **4 primary MCP tools** organized in a two-step workflow: | |
| #### **Step 1: Specification Creation Tools** | |
| 1. **`create_math_specification_from_text`** | |
| - Creates a structured teaching specification from natural language descriptions | |
| - Input: Text description of the math problem | |
| - Output: JSON specification with teaching steps | |
| - Processing time: ~10-15 seconds | |
| 2. **`create_math_specification_from_url`** | |
| - Creates a specification from web resources (Wikipedia, Khan Academy, etc.) | |
| - Input: URL pointing to math content | |
| - Output: JSON specification with teaching steps | |
| - Processing time: ~10-15 seconds | |
| 3. **`create_math_specification_from_image`** | |
| - Creates a specification from uploaded images (textbook problems, screenshots, handwritten notes) | |
| - Input: PIL Image object | |
| - Output: JSON specification with teaching steps | |
| - Processing time: ~10-15 seconds | |
| #### **Step 2: Application Building Tool** | |
| 4. **`build_interactive_proof_from_specification`** | |
| - Builds a complete HTML/JavaScript application from a specification | |
| - Input: JSON specification from any Step 1 tool | |
| - Output: Self-contained HTML document | |
| - Processing time: ~20-30 seconds | |
| ### **13.2 MCP Prompts** | |
| Pre-defined prompts guide agents on effective tool usage: | |
| 1. **`create_visual_math_proof`** - Complete workflow for creating visual proofs | |
| 2. **`create_math_specification`** - Focus on pedagogical design | |
| 3. **`build_from_specification`** - Focus on implementation with customization | |
| ### **13.3 MCP Resources** | |
| The server provides helpful templates and examples: | |
| | Resource URI | Description | Type | | |
| | ----------------------------------- | ------------------------------------- | -------- | | |
| | `stepwise://specification-template` | JSON template for math specifications | JSON | | |
| | `stepwise://example-pythagorean` | Complete Pythagorean theorem example | JSON | | |
| | `stepwise://example-probability` | Probability visualization example | JSON | | |
| | `stepwise://workflow-guide` | Two-step workflow documentation | Markdown | | |
| ### **13.4 MCP Use Cases** | |
| **For AI Agents:** | |
| - Automatically generate visual proofs from student questions | |
| - Create custom teaching materials on-demand | |
| - Build interactive homework help applications | |
| **For Automation:** | |
| - Batch-process textbook problems into interactive visualizations | |
| - Convert curriculum PDFs into step-by-step interactive lessons | |
| - Generate proof variations for different learning styles | |
| **For Integration:** | |
| - Embed in learning management systems (LMS) | |
| - Connect to homework platforms | |
| - Integrate with educational chatbots | |
| ### **13.5 MCP Server Configuration** | |
| The application launches with MCP server enabled: | |
| ```python | |
| demo.launch( | |
| server_name="0.0.0.0", | |
| server_port=7860, | |
| mcp_server=True, # Enable MCP protocol | |
| theme=theme, | |
| debug=True | |
| ) | |
| ``` | |
| **Access Points:** | |
| - **Web UI:** `http://localhost:7860` | |
| - **MCP Inspector:** Compatible with `@modelcontextprotocol/inspector` | |
| - **API Endpoints:** Auto-generated for all 4 tools + resource endpoints |