# RPGPortrait Design Document

## Project Overview
RPGPortrait is a Gradio-based web application that helps users build detailed prompts and generate character portraits leveraging Gemini AI models.

## UI Design
The UI is built using `gradio.Blocks` to allow for a structured, multi-column layout.

### Layout Structure
- **Global Control**: Save/Load character (JSON), Randomize features, **Character Naming**.
- **Input Sections**: Organized into columns/rows of dropdown menus.
    - **Identity**: Race, Class (with **🎲 Auto-Name**), Gender, Age.
    - **Appearance**: Hair Style/Color, Eye Color, Build, Skin Tone, Distinguishing Features.
    - **Equipment**: Armor/Clothing, Weapons, Accessories (up to 2), Materials.
    - **Environment**: Background, Lighting, Atmosphere.
    - **Artistic Style**: Art Style, Mood, Special Effects (VFX).
- **Generation Suite**:
    - Large technical prompt textbox (read-only).
    - **🧠 Refine with Gemini**: Calls Gemini 3 Pro to create a vivid prompt.
    - **🖼️ Generate Image**: Routes to selected backend (Gemini or ComfyUI).
- **Backend Selector**:
    - Radio buttons to toggle between **Gemini (Cloud)** and **ComfyUI (Local)**.
- **Output Area**:
    - Image display for the portrait.
    - 📥 Download button (PNG) for the generated image.

## Prompt Assembly Logic
The app uses a 3-stage prompt pipeline:
1. **Technical Segmenting**: Combines dropdown values and extra info text into a base technical prompt using a YAML-based template.
2. **AI Refinement**: The technical prompt is sent to an LLM to "art-ify" the description. The system instructions for this are stored in `prompts.yaml`.
    - **Cloud**: Uses `gemini-3-pro-preview`.
    - **Local**: Uses Ollama (e.g., `llama3`).
3. **Image Synthesis**:
    - **Cloud**: Sent to `imagen-4.0-generate-001`.
    - **Local**: Sent to a ComfyUI endpoint via `POST /prompt`, utilizing a custom workflow (`comfy_rpg_char_gen.json`).

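
The first pipeline stage (Technical Segmenting) can be pictured roughly as follows. This is a minimal sketch, not the project's actual code: the `SEGMENT_FORMATS` list, the key names, and `build_technical_prompt` are hypothetical stand-ins for whatever template `features.yaml` really defines, and the YAML loading step is left out so the example stays self-contained.

```python
# Hypothetical segment order and phrasing (assumption); in the app the
# real template is read from features.yaml.
SEGMENT_FORMATS = [
    ("race", "{value}"),
    ("char_class", "{value}"),
    ("hair", "{value} hair"),
    ("armor", "wearing {value}"),
    ("background", "{value} in the background"),
    ("art_style", "{value} style"),
]


def build_technical_prompt(selections: dict, extra_info: str = "") -> str:
    """Join the chosen dropdown values and free-text notes into one prompt."""
    parts = []
    for key, fmt in SEGMENT_FORMATS:
        value = selections.get(key, "").strip()
        if value:  # skip dropdowns that were left empty
            parts.append(fmt.format(value=value))
    extra = extra_info.strip()
    if extra:
        parts.append(extra)
    return "Character portrait: " + ", ".join(parts)


if __name__ == "__main__":
    print(build_technical_prompt(
        {"race": "dwarf", "char_class": "ranger", "art_style": "oil painting"},
        extra_info="scar over left eye",
    ))
```

The resulting string is what stage 2 would receive for refinement. Keeping the segment phrasing in data rather than in code mirrors the document's choice of a YAML-based template.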
## Naming & Metadata
- **Local Naming**: Uses a procedural generator based on the `fictional-names` library for instant, thematic results.
- **Metadata Embedding**: Generation prompts and character names are embedded directly into the PNG `tEXt` chunks (Comment and CharacterName keys).
- **Filenames**: Character names are sanitized and used for both JSON and PNG output.

## Data & Persistence
- **YAML Configuration**: `features.yaml` stores all possible dropdown values, their descriptive labels, and the final prompt template.
- **JSON Serialization**: "Save/Load" functionality allows users to export and import their full character state as a JSON file.

## Technical Stack
- **Backend**: Python 3
- **SDK**: `google-genai`
- **UI Framework**: Gradio
- **AI Models**:
    - **Text (Cloud)**: `gemini-3-pro-preview`
    - **Text (Local)**: Ollama
    - **Image (Cloud)**: `imagen-4.0-generate-001`
    - **Image (Local)**: ComfyUI (Local Server)
- **Configuration**:
    - `.env` for API keys and connection hosts/ports (ComfyUI, Ollama).

## Maintenance & Development Lessons

### 1. Server Lifecycle
- **Mandatory Restarts**: Any modification to the `modules/` folder or configuration files (`features.yaml`, `prompts.yaml`) requires a manual restart of the Gradio server.
- **Process Management (CRITICAL)**: Always ensure the previously running process is fully terminated before starting a new one.
    - **Safe Command**: Use this PowerShell command to only kill the process using the target port (7860):
      ```powershell
      Stop-Process -Id (Get-NetTCPConnection -LocalPort 7860 -ErrorAction SilentlyContinue).OwningProcess -Force -ErrorAction SilentlyContinue
      ```

### 2. Project Architecture (Modular)
- **`app.py`**: Entry point that launches the Gradio `demo`.
- **`modules/config.py`**: Global constants and environment variables.
- **`modules/integrations.py`**: Wrappers for AI backends (Gemini, Ollama, ComfyUI).
- **`modules/core_logic.py`**: Character state management and prompt assembly.
- **`modules/ui_layout.py`**: The full Gradio UI definition (`build_ui`).
- **`modules/name_generator.py`**: Local procedural name generator using the `fictional-names` library.
- **`comfy/`**: Dedicated folder for ComfyUI-specific JSON workflows and utility scripts.

### 3. Deployment (Hugging Face Spaces)
- **Containerization**: The app is containerized using the provided `Dockerfile` and `.dockerignore`.
- **User Permissions**: The Dockerfile uses `useradd -m -u 1000 user` to comply with Hugging Face's security requirements for non-root users.
- **Port Mapping**: Hugging Face Spaces expects the app on port 7860. The `GRADIO_SERVER_NAME="0.0.0.0"` and `GRADIO_SERVER_PORT=7860` environment variables ensure the app is bound correctly for external routing.
- **Local Testing**: Run `docker-compose up --build` to verify the deployment state locally.
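
To make the port-mapping point concrete, the sketch below resolves the bind address from the two environment variables named above. Gradio already reads `GRADIO_SERVER_NAME` and `GRADIO_SERVER_PORT` on its own, so this is purely illustrative; `resolve_bind_address` is a hypothetical helper, not part of the project or of Gradio.

```python
import os


def resolve_bind_address(env=None):
    """Return (host, port) from the Gradio environment variables.

    Hypothetical helper (assumption): falls back to 127.0.0.1:7860,
    Gradio's documented defaults, when the variables are unset.
    """
    env = os.environ if env is None else env
    host = env.get("GRADIO_SERVER_NAME", "127.0.0.1")
    port = int(env.get("GRADIO_SERVER_PORT", "7860"))
    return host, port


if __name__ == "__main__":
    host, port = resolve_bind_address()
    print(f"Would bind to {host}:{port}")
```

If the values were passed explicitly, they would go to `demo.launch(server_name=host, server_port=port)`. Inside the container the host must be `0.0.0.0`, because the `127.0.0.1` default is not reachable from outside it.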