# **Product Requirements Document: StepWise Math (Gradio MCP Implementation)**

| Document Info          | Details                                                                          |
| :--------------------- | :------------------------------------------------------------------------------- |
| **Product Name**       | StepWise Math - Gradio MCP Framework                                             |
| **Version**            | 2.0 (MCP Server with Two-Stage Pipeline)                                         |
| **Status**             | Active                                                                           |
| **Target Demographic** | Middle School (Grades 6-8) to High School (Grades 9-10)                          |
| **Tech Stack**         | **Gradio 6.0+**, Python, **Google Gemini 2.5 Flash & 3.0 Pro**, **MCP Protocol** |
| **Deployment**         | Hugging Face Spaces, Docker, Local Development                                   |

## **1. Executive Summary**

**StepWise Math** is a Gradio-based web application with **Model Context Protocol (MCP) server capabilities** that converts static mathematical problems—supplied via text, screenshots, or URLs—into **interactive, step-by-step visual proofs**. 

Unlike a calculator that just gives the answer, StepWise Math builds a bespoke HTML5 application that guides students through the logical stages of a proof or concept. It breaks down complex ideas into incremental steps (e.g., "Step 1: Construct the shape", "Step 2: Apply the transformation", "Step 3: Observe the result"), allowing students to manipulate variables at each stage to internalize the logic.

### **Key Architecture Features**

* **MCP Server Integration:** Exposes proof generation tools via the Model Context Protocol, enabling AI agents and external tools to programmatically generate mathematical proofs
* **Two-Stage AI Pipeline:** Separates concept analysis (fast, ~10-15s) from code generation (moderate, ~20-30s) to comply with MCP timeout constraints
* **Gradio Framework:** Provides both web UI and MCP server endpoints through a single application instance
* **Docker-Ready:** Fully containerized for deployment to Hugging Face Spaces or any Docker-compatible environment

## **2. Target Audience**

* **Students (Grades 6-10):** Visual learners who need structured guidance to understand abstract concepts in Geometry, Algebra, and Trigonometry.
* **Math Teachers:** Need "digital manipulatives" that walk the class through a concept phase-by-phase.
* **Tutors:** Need to generate custom step-by-step explanations for specific homework problems.
* **AI Agents & Developers:** Can programmatically generate mathematical proofs via MCP protocol integration for educational tools, chatbots, or automated tutoring systems.

## **3. User Flow & Experience**

### **3.1 High-Level Flow (Gradio Web UI)**

1. **Initial Load:** Application automatically loads the first example proof to demonstrate capabilities
2. **Input Selection:** User chooses input mode (Text, URL, or Image) via Gradio interface
3. **Content Entry:** User provides the mathematical concept or problem
4. **API Key (Optional):** User can override default API key in settings
5. **Two-Stage Generation:**
   - **Stage 1:** Concept Analysis (Gemini 2.5 Flash, ~10-15s) → JSON Specification
   - **Stage 2:** Code Generation (Gemini 3.0 Pro, ~20-30s) → Interactive HTML/JS Application
6. **Visualization:** Generated proof displays in iframe with step navigation
7. **Actions:** Save to library, export as JSON, or refine with feedback

![High-Level User Flow Diagram](./img/gradio-ui-user-flow.jpg)

### **3.2 MCP Server Flow (Programmatic Access)**

1. **Tool Discovery:** AI agent connects to Gradio MCP server endpoint
2. **Available Tools:**
   - `analyze_concept_from_text`: Analyzes text-based mathematical concept → returns JSON spec
   - `analyze_concept_from_url`: Analyzes concept from URL → returns JSON spec
   - `analyze_concept_from_image`: Analyzes concept from image → returns JSON spec
   - `generate_code_from_concept`: Generates interactive proof code from JSON spec → returns HTML/JS
3. **Two-Step Invocation:**
   - **Step 1:** Agent calls `analyze_concept_from_text/url/image` with input
   - **Step 2:** Agent calls `generate_code_from_concept` with JSON from Step 1
4. **Output Handling:** Agent receives HTML/JS code for rendering or further processing

![MCP Server Flow Diagram](./img/mcp-server-flow.jpg)

**MCP Architecture Diagram:**

![MCP Architecture Diagram](./img/architecture-diagram.jpg)

### **3.3 Feedback & Iteration Flow**

The user can view the generated proof and provide text feedback (e.g., "Make the triangle red" or "Add a step for area calculation").

1. **User Feedback:** User enters text in the "Refinement" panel (Gradio Textbox)
2. **Intent Analysis:** The system determines if the request requires a structural change (new steps) or just a visual update
3. **Regeneration:**
    - **Stage 1 (Refine Spec):** Gemini 2.5 Flash updates the JSON spec based on feedback
    - **Stage 2 (Refine Code):** Gemini 3.0 Pro rewrites the application code using the new spec and specific user instructions
4. **Update:** The Gradio HTML component refreshes with the modified application

## **4. Functional Requirements**

### **4.1 Multi-Modal Input Handling**

The app must accept three distinct types of input:
1. **Natural Language Text:** e.g., "Prove the Pythagorean Theorem."
2. **Image/Screenshot:** A photo of a textbook problem.
3. **URL:** A link to a math concept video or page.

**Validation:** The system must strictly validate inputs (e.g., reject empty text, malformed URLs, or missing image files) before communicating with the AI, so that the model never returns a "hallucinated" default response for an empty request.
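
The validation rule above could look roughly like the following. This is a minimal sketch under stated assumptions: the function name `validate_input`, the mode strings, and the error messages are illustrative and not taken from the application code.

```python
import os
from typing import Optional
from urllib.parse import urlparse


def validate_input(mode: str, value: Optional[str]) -> Optional[str]:
    """Return an error message, or None when the input may be sent to the AI."""
    if mode == "Text":
        # Reject empty or whitespace-only problems
        if not value or not value.strip():
            return "Text input is empty."
    elif mode == "URL":
        # Require an http(s) scheme and a host part
        parsed = urlparse((value or "").strip())
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            return "URL must start with http:// or https:// and include a host."
    elif mode == "Image":
        # Assumes the image arrives as a file path (hypothetical choice)
        if not value or not os.path.isfile(value):
            return "Image file is missing."
    else:
        return f"Unknown input mode: {mode!r}"
    return None
```

Returning a message instead of raising keeps the check easy to surface in a Gradio warning before any API call is made.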

### **4.2 The "Thinking" Engine (Gemini Integration)**

**Two-Stage Pipeline Architecture (MCP-Optimized)**

The application uses a **two-stage AI pipeline** designed to fit within the per-call timeouts that MCP clients typically enforce (usually under 30 seconds per tool call):

* **Stage 1: Concept Decomposition (The Teacher)**
  * **Model:** `gemini-2.5-flash`
  * **Role:** Identifies the mathematical concept and breaks it down into a logical teaching sequence
  * **Execution Time:** ~10-15 seconds
  * **Output:** JSON Spec containing a list of **Steps**. Each step defines what the user should do and what they should see
  * **MCP Exposure:** Three separate tools based on input type:
    - `analyze_concept_from_text(text_input, api_key)` 
    - `analyze_concept_from_url(url_input, api_key)`
    - `analyze_concept_from_image(image_input, api_key)`
  * **Return Value:** JSON string containing the `MathSpec`
  
* **Stage 2: Implementation (The Engineer)**
  * **Model:** `gemini-3-pro-preview`
  * **Config:** `thinkingConfig: { thinkingBudget: 4096 }`
  * **Role:** Writes the HTML5/Canvas code based on the JSON specification
  * **Execution Time:** ~20-30 seconds
  * **Requirement:** The generated app must include a **Step Navigation System** (Next/Previous buttons, Progress bar) and distinct visual states for each step
  * **MCP Exposure:** Single tool for code generation:
    - `generate_code_from_concept(concept_json, api_key)`
  * **Return Value:** HTML string containing complete interactive application

**Why Two Stages for MCP?**

MCP clients enforce strict per-call timeouts. The original single-step `generate_proof` operation took 30-60 seconds, which exceeded those timeout windows. Splitting it into two independent operations has several benefits:
- Each operation completes within timeout constraints
- AI agents can cache the JSON spec and regenerate code multiple times without re-analyzing
- More granular control over the generation process
- Better error recovery (if Stage 2 fails, Stage 1 results are preserved)
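
The caching and error-recovery benefits listed above can be sketched as follows. This is an illustration only, not the application's orchestration code: `run_two_stage`, the `analyze` and `generate` callables, and the in-memory `spec_cache` dict are hypothetical names standing in for the Stage 1 and Stage 2 tools.

```python
import json
from typing import Callable, Dict


def run_two_stage(
    problem: str,
    analyze: Callable[[str], str],
    generate: Callable[[str], str],
    spec_cache: Dict[str, str],
    max_code_attempts: int = 2,
) -> str:
    """Run Stage 1 once per problem, then retry Stage 2 against the cached spec."""
    if problem not in spec_cache:
        # Stage 1 result is kept even if Stage 2 fails later
        spec_cache[problem] = analyze(problem)
    spec_json = spec_cache[problem]

    last_error = None
    for _ in range(max_code_attempts):
        try:
            return generate(spec_json)
        except Exception as exc:  # e.g. a timeout raised by the client
            last_error = exc
    raise RuntimeError("Stage 2 failed; the cached spec is still available") from last_error
```

Because the spec is cached by problem text, a second call for the same problem skips analysis entirely and only pays for code generation.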

### **4.3 Output & Interaction**

* **Guided Experience:** The app starts at "Step 1". The user reads an instruction, interacts with the visual, and clicks "Next" to proceed
* **Interactive Canvas:** Graphics update based on the current step. For example, construction lines might appear only in Step 2
* **Live Feedback:** Equations and values update in real time as the user drags elements
* **Gradio Components:**
  - **HTML Component:** Displays the generated interactive proof application
  - **JSON Component:** Shows the MathSpec structure for debugging/inspection
  - **Textbox Components:** Display process logs and thinking streams
  - **Accordion/Tab Layouts:** Organize different views (Proof, Spec, Code, Logs)

### **4.4 Feedback Loop**

* **Refinement Interface:** A text input field below the simulation area allows users to request changes.
* **Context Awareness:** The AI must receive the *previous* JSON specification and the *new* user feedback to generate a delta or a completely new version.
* **Logic:**
    *   If the feedback changes the math concept (e.g., "Switch to Isosceles"), Stage 1 must regenerate the steps.
    *   If the feedback is cosmetic (e.g., "Dark mode"), Stage 2 must implement it while preserving the logic.
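
One way to express the context-awareness and routing rules above is sketched below. It assumes the structural-versus-cosmetic decision has already been made (for example by the intent analysis step) and is passed in as a flag; `build_refinement_request` and the returned dictionary shape are hypothetical.

```python
import json
from typing import Any, Dict


def build_refinement_request(
    previous_spec: Dict[str, Any], feedback: str, structural: bool
) -> Dict[str, Any]:
    """Bundle the previous spec with the new feedback and pick the stages to rerun."""
    feedback = feedback.strip()
    if not feedback:
        raise ValueError("Feedback is empty.")
    return {
        # Concept changes rerun Stage 1 and Stage 2; cosmetic changes rerun Stage 2 only
        "stages": ["spec", "code"] if structural else ["code"],
        "prompt": (
            "Previous specification:\n"
            + json.dumps(previous_spec, indent=2)
            + "\n\nUser feedback:\n"
            + feedback
        ),
    }
```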

### **4.5 Export & Sharing** [Coming Soon]

* **Export Button:** A Gradio Button to download the current session
* **Format:** A JSON file containing:
    *   **Input Data:** The original problem text, URL, or image data
    *   **Math Concept:** The JSON Specification (Steps, Explanation)
    *   **Source Code:** The Generated HTML/JS
    *   **Metadata:** Timestamp, input mode
    *   **Exclusion:** Process Logs are **not** included in the export file to keep it clean
* **Import Capability:** A Gradio File Upload component to restore a previously exported JSON file. This restores the input fields, the concept specification, and the interactive proof code
* **Implementation:** Uses Gradio's `gr.DownloadButton` and `gr.File` components
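
Since this feature is not yet implemented, the following is only a sketch of how the export payload might be assembled. The field names follow the export format described in this document; the `session` dictionary, its `process_logs` key, and the helper names `build_export` and `restore_export` are assumptions.

```python
import json
from datetime import datetime
from typing import Any, Dict, Optional

EXPORT_KEYS = ("title", "input_mode", "input_data", "concept", "code")


def build_export(session: Dict[str, Any], now: Optional[datetime] = None) -> str:
    """Serialize the session for download, leaving process logs out."""
    # Only whitelisted keys are copied, so logs never reach the file
    payload = {key: session.get(key) for key in EXPORT_KEYS}
    payload["timestamp"] = (now or datetime.now()).isoformat(timespec="seconds")
    payload["metadata"] = {"generated_by": "StepWise Math Gradio MCP", "version": "2.0"}
    return json.dumps(payload, indent=2)


def restore_export(text: str) -> Dict[str, Any]:
    """Parse an exported file and check that the expected fields are present."""
    data = json.loads(text)
    missing = [key for key in EXPORT_KEYS if key not in data]
    if missing:
        raise ValueError(f"Export file is missing fields: {missing}")
    return data
```

Copying a fixed whitelist of keys, rather than deleting the log field, means any future session fields are excluded by default.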

### **4.6 Persistence (Local Storage via ProofLibrary)**  [Coming Soon]

*   **Save Capability:** Users can save the currently generated proof (Math Spec + Code) using a Gradio Button
*   **Backend Storage:** Python-based `ProofLibrary` class manages proof persistence:
    - Saves to `saved_proofs/` directory as JSON files
    - Each proof file contains: title, timestamp, input data, concept spec, generated code
    - Filename format: `proof_YYYYMMDD_HHMMSS.json`
*   **Library View:** A Gradio component (Dropdown or Gallery) lists previously saved items with timestamps and concept titles
*   **Load Capability:** Users can instantly restore a previously generated proof from the library without re-querying the AI
*   **File System:** Unlike browser Local Storage, the Gradio implementation stores proofs on the server-side file system for better reliability and Docker compatibility
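
A minimal sketch of the storage behavior described above is shown here. The class name, directory, record fields, and filename format come from this section; the method names (`save`, `list_proofs`, `load`) and their signatures are assumptions, since the feature is still marked as coming soon.

```python
import json
from datetime import datetime
from pathlib import Path
from typing import Any, Dict, List, Optional


class ProofLibrary:
    """File-system backed store for generated proofs (illustrative sketch)."""

    def __init__(self, root: str = "saved_proofs") -> None:
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def save(self, title: str, input_data: str, concept: Dict[str, Any],
             code: str, now: Optional[datetime] = None) -> Path:
        now = now or datetime.now()
        record = {
            "title": title,
            "timestamp": now.isoformat(timespec="seconds"),
            "input_data": input_data,
            "concept": concept,
            "code": code,
        }
        # Filename format: proof_YYYYMMDD_HHMMSS.json
        path = self.root / f"proof_{now:%Y%m%d_%H%M%S}.json"
        path.write_text(json.dumps(record, indent=2), encoding="utf-8")
        return path

    def list_proofs(self) -> List[Dict[str, str]]:
        """Newest first, with the fields a Dropdown or Gallery would show."""
        items = []
        for path in sorted(self.root.glob("proof_*.json"), reverse=True):
            record = json.loads(path.read_text(encoding="utf-8"))
            items.append({"file": path.name,
                          "title": record.get("title", ""),
                          "timestamp": record.get("timestamp", "")})
        return items

    def load(self, filename: str) -> Dict[str, Any]:
        return json.loads((self.root / filename).read_text(encoding="utf-8"))
```

Note that second-resolution filenames mean two saves within the same second would overwrite each other; a real implementation may want a uniqueness suffix.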

### **4.7 Configuration & API Key Management** 

*   **Configuration Interface:** Gradio Accordion component in settings panel
*   **API Key Management:**
    *   The app defaults to using the `GEMINI_API_KEY` from environment variables (`os.getenv("GEMINI_API_KEY")`)
    *   Users can optionally provide their own API Key via a Gradio Textbox (type="password")
    *   **Logic:** If a custom key is provided to MCP tools or web UI, it takes precedence over the environment variable
    *   **MCP Tools:** API key is an optional parameter in all MCP-exposed functions:
        - `analyze_concept_from_text(text_input, api_key: Optional[str] = None)`
        - `generate_code_from_concept(concept_json, api_key: Optional[str] = None)`
    *   **Security:** Custom API keys are passed per-request and not persisted server-side
*   **Environment Variable Setup:**
    ```bash
    # Linux/Mac
    export GEMINI_API_KEY="your-api-key-here"

    # Windows PowerShell
    $env:GEMINI_API_KEY="your-api-key-here"

    # Docker
    docker run -e GEMINI_API_KEY="your-key" -p 7860:7860 hf-stepwise-math
    ```
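
The precedence rule for API keys can be captured in a few lines. This is a sketch: the helper name `resolve_api_key` and the error message are illustrative, while the `GEMINI_API_KEY` variable and the "custom key wins" rule come from this section.

```python
import os
from typing import Optional


def resolve_api_key(custom_key: Optional[str] = None) -> str:
    """Prefer a per-request key; fall back to the GEMINI_API_KEY environment variable."""
    key = (custom_key or "").strip() or os.getenv("GEMINI_API_KEY", "").strip()
    if not key:
        raise ValueError("No Gemini API key: pass api_key or set GEMINI_API_KEY.")
    return key
```

The resolved key is returned to the caller and never stored, which matches the requirement that custom keys are not persisted server-side.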


### **4.8 Thinking Process Streaming (Enhanced UI)**  [Coming Soon]

*   **Streaming Thoughts:** The application utilizes Gemini's `thinkingConfig` with `includeThoughts: true` to capture the model's internal reasoning process
*   **Gradio Implementation:**
    *   **Textbox Component:** Displays thinking stream with `max_lines=20` for scrollable content
    *   **Real-time Updates:** Uses Gradio's streaming capabilities to update UI as thoughts arrive
    *   **Markdown Rendering:** Gradio automatically renders Markdown in textboxes when configured
*   **Display Features:**
    *   **Timer:** Progress message showing elapsed time (e.g., "Running for 12s")
    *   **Structured Layout:** Separate sections for "Analysis Phase" and "Code Generation Phase"
    *   **Collapsible Accordions:** Users can expand/collapse thought details to focus on results
*   **Process Logs:**
    *   Separate from thinking stream
    *   Shows high-level pipeline progress: "Starting Stage 1...", "Concept analyzed", "Generating code..."
    *   Stored in `GeminiPipeline.process_logs` list for debugging

### **4.9 Pre-loaded Examples Library**

*   **Examples Section:** Gradio Dropdown component populated from `examples/` directory
*   **File Format:** Each example is a JSON file containing:
    ```json
    {
      "title": "Visual Proof: Pythagorean Theorem",
      "input_mode": "Text",
      "input_data": "Prove the Pythagorean theorem...",
      "concept": { /* MathSpec JSON */ },
      "code": "<!DOCTYPE html>..."
    }
    ```

*   **Initial Load Behavior:** On application startup, automatically load the first example (or a designated default example) to:
    - Provide immediate visual demonstration of app capabilities
    - Avoid empty/blank initial state
    - Give users instant understanding of the output format
    - Enable immediate interaction without waiting for AI generation
*   **One-Click Loading:** Selecting an example from the dropdown triggers a Gradio event handler that:
    - Populates input fields with example data
    - Loads the pre-generated concept spec into JSON viewer
    - Renders the code in the HTML iframe
    - Bypasses AI generation for instant loading
*   **Content:** The library covers diverse topics:
    - Geometry: Pythagorean Theorem, Area of Quadrilaterals, Altitude-Hypotenuse Ratios
    - Probability: Probability of Odd Sums
    - Algebra: Diagonals in Rhombus
*   **Example Files:** Located in the `examples/` directory with naming convention `NNN-visual-proof-{topic}.json`, where `NNN` is a three-digit sequence number
*   **Default Example:** First example in alphabetical order (`001-visual-proof-probability-of-an-odd-sum.json`) loads automatically on app initialization
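
The loading behavior described above might be implemented along these lines. The sketch assumes each file has the fields shown in the example format; the helper names `load_examples` and `default_example` are hypothetical.

```python
import json
from pathlib import Path
from typing import Any, Dict, Optional


def load_examples(directory: str = "examples") -> Dict[str, Dict[str, Any]]:
    """Read every example file, keyed by title, in alphabetical filename order."""
    examples: Dict[str, Dict[str, Any]] = {}
    for path in sorted(Path(directory).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            data = json.load(handle)
        # Fall back to the filename when a file has no title
        examples[data.get("title", path.stem)] = data
    return examples


def default_example(examples: Dict[str, Dict[str, Any]]) -> Optional[Dict[str, Any]]:
    """First example in load order, or None when the library is empty."""
    return next(iter(examples.values()), None)
```

The dictionary keys can populate the Dropdown choices, and the selected entry already carries the `concept` and `code` needed to render without an AI call.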


### **4.10 MCP Server Integration**

*   **Gradio MCP Support:** Application launches with `mcp_server=True` flag to enable MCP protocol endpoints
*   **Tool Exposure Mechanism:** 
    - Gradio only exposes methods that are connected to UI components as MCP tools
    - Hidden UI components (created with `visible=False` in a `gr.Group`) are used to expose MCP-specific methods
    - Event handlers connect methods to hidden buttons/textboxes for MCP discovery
*   **Exposed MCP Tools (4 Total):**
    1. **analyze_concept_from_text**
       - **Parameters:** `text_input: str`, `api_key: Optional[str]`
       - **Returns:** JSON string containing MathSpec
       - **Purpose:** Fast concept analysis for text input
       - **Timeout:** ~10-15 seconds
    2. **analyze_concept_from_url**
       - **Parameters:** `url_input: str`, `api_key: Optional[str]`
       - **Returns:** JSON string containing MathSpec
       - **Purpose:** Fast concept analysis from URL content
       - **Timeout:** ~10-15 seconds
    3. **analyze_concept_from_image**
       - **Parameters:** `image_input: str` (base64 or file path), `api_key: Optional[str]`
       - **Returns:** JSON string containing MathSpec
       - **Purpose:** Fast concept analysis from image
       - **Timeout:** ~10-15 seconds
    4. **generate_code_from_concept**
       - **Parameters:** `concept_json: str`, `api_key: Optional[str]`
       - **Returns:** HTML string containing interactive proof application
       - **Purpose:** Generate code from previously analyzed concept
       - **Timeout:** ~20-30 seconds


*   **MCP Usage Pattern:**
    ```python
    # `mcp_client` is a placeholder for any connected MCP client session

    # Step 1: Analyze concept
    concept_json = mcp_client.call_tool(
        "analyze_concept_from_text",
        {"text_input": "Prove Pythagorean theorem", "api_key": "optional-key"}
    )

    # Step 2: Generate code
    html_code = mcp_client.call_tool(
        "generate_code_from_concept",
        {"concept_json": concept_json, "api_key": "optional-key"}
    )
    ```


*   **Testing MCP Server:**
    ```bash
    # Launch MCP Inspector
    npx @modelcontextprotocol/inspector

    # Connect to: http://localhost:7860/mcp
    # Available tools will appear in inspector UI
    ```


*   **Hidden UI Components (MCP Exposure):**
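    ```python
    import gradio as gr

    # analyze_concept_from_text and generate_code_from_concept are defined elsewhere in the application
    with gr.Blocks() as demo:
        # Hidden components exist only so Gradio registers the handlers as MCP tools
        with gr.Group(visible=False) as mcp_hidden_group:
            # Analysis tools
            mcp_analyze_text_input = gr.Textbox()
            mcp_analyze_text_btn = gr.Button()
            mcp_analyze_text_output = gr.Textbox()

            # Code generation tool
            mcp_generate_concept_input = gr.Textbox()
            mcp_generate_concept_btn = gr.Button()
            mcp_generate_concept_output = gr.Textbox()

        # Event handlers connect methods for MCP (must be attached inside the Blocks context)
        mcp_analyze_text_btn.click(
            fn=analyze_concept_from_text,
            inputs=[mcp_analyze_text_input],
            outputs=[mcp_analyze_text_output]
        )
        mcp_generate_concept_btn.click(
            fn=generate_code_from_concept,
            inputs=[mcp_generate_concept_input],
            outputs=[mcp_generate_concept_output]
        )
    ```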


## **5. Feature Specifications (Examples)**

### **Example 1: The Pythagorean Theorem**
* **Generated App Steps:**
  * **Step 1: Setup:** Display a right triangle. User drags vertices to resize. "Observe the legs a, b and hypotenuse c."
  * **Step 2: Geometric Construction:** Squares appear on each side. "We build a square on each side of the triangle."
  * **Step 3: Area Calculation:** The app calculates the area of each square. "Note the values: A = a², B = b², C = c²."
  * **Step 4: The Proof:** The app rearranges the areas or shows the equation `Area A + Area B = Area C`. User drags vertices to verify it holds true for *any* right triangle.

### **Example 2: Slope Intercept Form**

* **Input:** Text: "Explain y = mx + b"
* **Generated App Steps:**
  * **Step 1: The Grid:** Shows a coordinate plane.
  * **Step 2: The Y-Intercept:** User adjusts slider `b`. The line moves up/down. "b controls where the line crosses the Y-axis."
  * **Step 3: The Slope:** User adjusts slider `m`. The line rotates. "m controls the steepness."
  * **Step 4: Prediction:** User is asked to set sliders to match a target line.

## **9. Development & Testing**

### **9.1 Local Development**
```bash
# Clone repository
git clone <repo-url>
cd hf-StepWise-Math/gradio-app

# Install dependencies
pip install -r requirements.txt

# Set API key
export GEMINI_API_KEY="your-api-key"  # Linux/Mac
$env:GEMINI_API_KEY="your-api-key"    # Windows PowerShell

# Run application
python app.py
```

### **9.2 Testing MCP Integration**
```bash
# Terminal 1: Launch Gradio app with MCP server
cd gradio-app
python app.py

# Terminal 2: Launch MCP Inspector
npx @modelcontextprotocol/inspector

# In Inspector UI:
# 1. Connect to: http://localhost:7860/mcp
# 2. Verify 4 tools appear: analyze_concept_from_text/url/image, generate_code_from_concept
# 3. Test two-step workflow:
#    - Call analyze_concept_from_text with test input
#    - Copy JSON result
#    - Call generate_code_from_concept with JSON
```

### **9.3 Docker Testing**
```bash
# Build image
docker build -t hf-stepwise-math .

# Run container
docker run --rm -it -e GEMINI_API_KEY="your-key" -p 7860:7860 hf-stepwise-math

# Access at: http://localhost:7860
# MCP endpoint: http://localhost:7860/mcp
```

### **9.4 Test Files**
* `test_app.py`: Unit tests for Gradio components
* `test_generate_proof.py`: Integration tests for AI pipeline
* `test_logging.py`: Logging system validation
* Example files in `examples/`: Pre-generated proofs for UI testing

### **9.5 Deployment Checklist**
- [ ] GEMINI_API_KEY environment variable configured
- [ ] Docker image builds successfully
- [ ] MCP server accessible at `/mcp` endpoint
- [ ] All 4 MCP tools discoverable
- [ ] Example library loads correctly
- [ ] **Default example auto-loads on app initialization**
- [ ] Save/Load functionality works with file system
- [ ] Export/Import produces valid JSON
- [ ] Two-stage pipeline completes within timeout constraints

## **7. UI/UX Design (Gradio Components)**

* **Input Panel (Left Column):**
  - **Radio Buttons:** Select input mode (Text/URL/Image)
  - **Conditional Components:** Show relevant input field based on mode
  - **Textbox:** For text input
  - **Textbox:** For URL input
  - **Image Upload:** For image input
  - **Textbox (Password):** Optional API key override
  - **Button:** "Generate Proof"
  - **Dropdown:** Example selection
  - **Accordion:** Settings panel

* **Output Panel (Right Column):**
  - **Tabs Component:**
    1. **Interactive Proof:** `gr.HTML()` component displaying generated application
    2. **Concept Spec:** `gr.JSON()` component showing MathSpec structure
    3. **Source Code:** `gr.Code(language="html")` for viewing/editing generated code
    4. **Process Logs:** `gr.Textbox()` with process execution details
    5. **Thinking Stream:** `gr.Textbox()` with AI reasoning (collapsible accordion)

* **Action Buttons:**
  - **Save to Library:** Stores proof to `saved_proofs/` directory
  - **Export/Download:** `gr.DownloadButton()` for JSON export
  - **Import/Upload:** `gr.File()` for JSON import
  - **Load from Library:** `gr.Dropdown()` with saved proofs

* **Refinement Panel (Below Output):**
  - **Textbox:** Multi-line input for feedback
  - **Button:** "Refine Proof" to trigger regeneration

## **8. Technical Constraints & Requirements**

### **8.1 MCP Protocol Constraints**
* **Timeout Limitation:** Each MCP tool call must complete within ~30 seconds
* **Solution:** Two-stage pipeline splits 30-60s operation into 10-15s + 20-30s stages
* **Tool Discovery:** Tools must be connected to Gradio UI components (even if hidden) to be exposed via MCP
* **Parameter Handling:** All MCP tool parameters must be either required or properly typed with `Optional[]`
* **No Conditional Parameters:** Cannot require different parameters based on conditions (e.g., url_input required when mode="URL")



### **8.2 Gemini API Configuration**

* **Thinking Budget:** Uses high reasoning budget (4096) for Stage 2 (code generation) to ensure robust logic
* **Model Selection:**
  - Stage 1: `gemini-2.5-flash` for fast analysis
  - Stage 2: `gemini-3-pro-preview` for extended thinking during code generation
* **API Key Management:** Supports environment variable fallback with optional per-request override



### **8.3 Gradio Framework**

* **Version:** Gradio 6.0+ (supports `mcp_server=True` flag)
* **Launch Command:** `demo.launch(mcp_server=True, share=False, server_port=7860)`
* **Hidden Components:** Required for MCP tool exposure without cluttering UI
* **Event Handlers:** Connect Python methods to Gradio components for both UI and MCP access

### **8.4 Visual Clarity (Generated Code)**
* **Layout Requirements:** Generated code **MUST** prioritize visual clarity
* **Separation:** Overlapping elements are strictly prohibited
* **CSS/Layout:** Must use Flexbox/Grid to strictly separate graphics area (Canvas/SVG) from controls and textual instructions
* **Step Navigation:** Generated apps must include prominent navigation UI (buttons, progress indicators)

### **8.5 Docker & Deployment**
* **Containerization:** Application is fully Docker-ready with `Dockerfile` and `requirements.txt`
* **Port Mapping:** Exposes port 7860 for web UI and MCP server
* **Environment Variables:** API key passed via `-e GEMINI_API_KEY="..."`
* **Hugging Face Spaces:** Compatible with HF Spaces deployment (uses `gradio` template)
* **Build Command:** `docker build -t hf-stepwise-math .`
* **Run Command:** `docker run -e GEMINI_API_KEY="key" -p 7860:7860 hf-stepwise-math`

### **8.6 Security & Safety**
* **Content Filtering:** Image inputs should be validated for appropriate educational content
* **API Key Security:** Custom keys are per-request only, not persisted server-side
* **CORS:** Gradio handles CORS automatically for web UI and MCP endpoints
* **Rate Limiting:** Consider implementing rate limits for MCP tool calls in production

## **10. Data Models (Python Implementation)**

**MathSpec (Python Class)**

```python
class MathSpec:
    """Structured mathematical concept specification"""

    def __init__(self, data: dict):
        self.concept_title = data.get("conceptTitle", "")
        self.educational_goal = data.get("educationalGoal", "")
        self.explanation = data.get("explanation", "")
        self.steps = data.get("steps", [])
        self.visual_spec = data.get("visualSpec", {})

    def to_dict(self):
        return {
            "conceptTitle": self.concept_title,
            "educationalGoal": self.educational_goal,
            "explanation": self.explanation,
            "steps": self.steps,
            "visualSpec": self.visual_spec
        }
```
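
Because `generate_code_from_concept` receives the spec as a JSON string from an external caller, it may be worth validating the payload before building a `MathSpec`. The sketch below is one possible check; which fields are treated as required is an assumption, and `parse_math_spec` is a hypothetical helper.

```python
import json
from typing import Any, Dict

# Assumed to be required for every step, based on the sample spec
REQUIRED_STEP_KEYS = ("stepTitle", "instruction", "visualFocus")


def parse_math_spec(concept_json: str) -> Dict[str, Any]:
    """Parse a MathSpec JSON string and reject obviously incomplete specs."""
    data = json.loads(concept_json)
    if not isinstance(data, dict):
        raise ValueError("MathSpec must be a JSON object.")
    if not data.get("conceptTitle"):
        raise ValueError("conceptTitle is required.")
    steps = data.get("steps")
    if not isinstance(steps, list) or not steps:
        raise ValueError("steps must be a non-empty list.")
    for index, step in enumerate(steps, start=1):
        if not isinstance(step, dict):
            raise ValueError(f"step {index} must be an object.")
        missing = [key for key in REQUIRED_STEP_KEYS if key not in step]
        if missing:
            raise ValueError(f"step {index} is missing {missing}.")
    return data
```

Failing early with a specific message lets an agent correct the spec without spending a Stage 2 call.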

**MathSpec JSON Format**

```json
{
  "conceptTitle": "Pythagorean Theorem",
  "educationalGoal": "Prove a^2 + b^2 = c^2",
  "explanation": "In a right-angled triangle...",
  "steps": [
    {
      "stepTitle": "The Triangle",
      "instruction": "Drag the red dots to change the shape of the right triangle.",
      "visualFocus": "Triangle ABC"
    },
    {
      "stepTitle": "Adding Squares",
      "instruction": "Click Next to visualize squares attached to each side.",
      "visualFocus": "Squares on sides a, b, c"
    }
  ],
  "visualSpec": {
    "elements": ["Triangle", "Squares", "Grid"],
    "interactions": ["Drag Vertex", "Hover info"],
    "mathLogic": "Calculate distances..."
  }
}
```

**Export/Save Format**

```json
{
  "title": "Visual Proof: Pythagorean Theorem",
  "timestamp": "2025-11-25T10:30:00",
  "input_mode": "Text",
  "input_data": "Prove the Pythagorean theorem using visual methods",
  "concept": {
    "conceptTitle": "Pythagorean Theorem",
    "educationalGoal": "...",
    "steps": [...]
  },
  "code": "<!DOCTYPE html>...",
  "metadata": {
    "generated_by": "StepWise Math Gradio MCP",
    "version": "2.0"
  }
}
```

## **11. File Structure**

```
gradio-app/
├── app.py                          # Main Gradio application with MCP server
├── requirements.txt                # Python dependencies
├── Dockerfile                      # Container configuration
├── setup.sh / setup.bat            # Environment setup scripts
├── README.md                       # Documentation
├── saved_proofs/                   # User-generated proofs (persistent storage)
│   ├── proof_20251125_103000.json
│   └── proof_20251125_154500.json
├── examples/                       # Pre-loaded example proofs
│   ├── 001-visual-proof-pythagorean-theorem.json
│   ├── 002-visual-proof-probability-odd-sum.json
│   └── ...
├── test_app.py                     # Unit tests
├── test_generate_proof.py          # Integration tests
└── docs/                           # Additional documentation
    ├── DEPLOYMENT.md
    ├── TEST_GUIDE.md
    └── MCP_SCHEMA_README.md
```

## **12. Appendix: MCP Tool Schema**

### **Tool 1: analyze_concept_from_text**

```json
{
  "name": "analyze_concept_from_text",
  "description": "Analyze a text-based mathematical concept and generate structured JSON specification",
  "inputSchema": {
    "type": "object",
    "properties": {
      "text_input": {
        "type": "string",
        "description": "Mathematical concept or problem description"
      },
      "api_key": {
        "type": "string",
        "description": "Optional Gemini API key (uses environment variable if not provided)"
      }
    },
    "required": ["text_input"]
  },
  "outputSchema": {
    "type": "string",
    "description": "JSON string containing MathSpec structure"
  }
}
```



### **Tool 2: analyze_concept_from_url**
```json
{
  "name": "analyze_concept_from_url",
  "description": "Analyze mathematical concept from URL and generate structured JSON specification",
  "inputSchema": {
    "type": "object",
    "properties": {
      "url_input": {
        "type": "string",
        "description": "URL to mathematical concept page or video"
      },
      "api_key": {
        "type": "string",
        "description": "Optional Gemini API key"
      }
    },
    "required": ["url_input"]
  }
}
```

### **Tool 3: analyze_concept_from_image**

```json
{
  "name": "analyze_concept_from_image",
  "description": "Analyze mathematical concept from image and generate structured JSON specification",
  "inputSchema": {
    "type": "object",
    "properties": {
      "image_input": {
        "type": "string",
        "description": "Base64-encoded image or file path"
      },
      "api_key": {
        "type": "string",
        "description": "Optional Gemini API key"
      }
    },
    "required": ["image_input"]
  }
}
```



### **Tool 4: generate_code_from_concept**
```json
{
  "name": "generate_code_from_concept",
  "description": "Generate interactive HTML/JS proof application from JSON concept specification",
  "inputSchema": {
    "type": "object",
    "properties": {
      "concept_json": {
        "type": "string",
        "description": "JSON string containing MathSpec (from analyze_concept_* tools)"
      },
      "api_key": {
        "type": "string",
        "description": "Optional Gemini API key"
      }
    },
    "required": ["concept_json"]
  },
  "outputSchema": {
    "type": "string",
    "description": "HTML string containing complete interactive proof application"
  }
}
```
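
An agent or test harness can use these schemas to check arguments before making a call. The sketch below handles only the subset of JSON Schema that appears in this appendix (`required` and string-typed properties); `check_arguments` is a hypothetical helper, not part of Gradio or the MCP SDK.

```python
from typing import Any, Dict, List


def check_arguments(tool_schema: Dict[str, Any], arguments: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the call looks valid."""
    problems: List[str] = []
    input_schema = tool_schema.get("inputSchema", {})
    properties = input_schema.get("properties", {})

    # Every name listed under "required" must be supplied
    for name in input_schema.get("required", []):
        if name not in arguments:
            problems.append(f"missing required argument: {name}")

    # Supplied arguments must be declared and match the declared type
    for name, value in arguments.items():
        if name not in properties:
            problems.append(f"unknown argument: {name}")
        elif properties[name].get("type") == "string" and not isinstance(value, str):
            problems.append(f"argument {name} must be a string")
    return problems
```

For full JSON Schema coverage a dedicated validator library would be the better choice; this check is only meant to catch the most common calling mistakes.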


## **13. Model Context Protocol (MCP) Integration**

StepWise Math functions as a complete **MCP Server**, exposing its capabilities to external AI agents and automation tools. This enables programmatic access to the visual proof generation pipeline.

### **13.1 MCP Tools**

The application exposes **4 primary MCP tools** organized in a two-step workflow:

#### **Step 1: Specification Creation Tools**
1. **`create_math_specification_from_text`**
   - Creates a structured teaching specification from natural language descriptions
   - Input: Text description of the math problem
   - Output: JSON specification with teaching steps
   - Processing time: ~10-15 seconds

2. **`create_math_specification_from_url`**
   - Creates a specification from web resources (Wikipedia, Khan Academy, etc.)
   - Input: URL pointing to math content
   - Output: JSON specification with teaching steps
   - Processing time: ~10-15 seconds

3. **`create_math_specification_from_image`**
   - Creates a specification from uploaded images (textbook problems, screenshots, handwritten notes)
   - Input: PIL Image object
   - Output: JSON specification with teaching steps
   - Processing time: ~10-15 seconds

#### **Step 2: Application Building Tool**
4. **`build_interactive_proof_from_specification`**
   - Builds a complete HTML/JavaScript application from a specification
   - Input: JSON specification from any Step 1 tool
   - Output: Self-contained HTML document
   - Processing time: ~20-30 seconds

### **13.2 MCP Prompts**

Pre-defined prompts guide agents on effective tool usage:

1. **`create_visual_math_proof`** - Complete workflow for creating visual proofs

2. **`create_math_specification`** - Focus on pedagogical design

3. **`build_from_specification`** - Focus on implementation with customization



### **13.3 MCP Resources**



The server provides helpful templates and examples:



| Resource URI                        | Description                           | Type     |
| ----------------------------------- | ------------------------------------- | -------- |
| `stepwise://specification-template` | JSON template for math specifications | JSON     |
| `stepwise://example-pythagorean`    | Complete Pythagorean theorem example  | JSON     |
| `stepwise://example-probability`    | Probability visualization example     | JSON     |
| `stepwise://workflow-guide`         | Two-step workflow documentation       | Markdown |



### **13.4 MCP Use Cases**



**For AI Agents:**

- Automatically generate visual proofs from student questions

- Create custom teaching materials on-demand

- Build interactive homework help applications



**For Automation:**

- Batch-process textbook problems into interactive visualizations

- Convert curriculum PDFs into step-by-step interactive lessons

- Generate proof variations for different learning styles



**For Integration:**

- Embed in learning management systems (LMS)

- Connect to homework platforms

- Integrate with educational chatbots



### **13.5 MCP Server Configuration**



The application launches with MCP server enabled:



```python
demo.launch(
    server_name="0.0.0.0",
    server_port=7860,
    mcp_server=True,  # Enable MCP protocol
    theme=theme,
    debug=True
)
```



**Access Points:**
- **Web UI:** `http://localhost:7860`
- **MCP Inspector:** Compatible with `@modelcontextprotocol/inspector`
- **API Endpoints:** Auto-generated for all 4 tools + resource endpoints