# Content Engine

Automated AI content generation system with cloud APIs, LoRA training, and multi-backend support.

## Repositories

- **Local Development**: `D:\AI automation\content_engine\`
- **HuggingFace Deployment**: `D:\AI automation\content-engine\` (deployed to https://huggingface.co/spaces/dippoo/content-engine)

Always sync changes between both directories when modifying code.

## Architecture

```
┌─────────────────────────────────────────────────────────────┐
│                     Frontend (ui.html)                      │
│ Generate | Batch | Gallery | Train LoRA | Status | Settings │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                  FastAPI Backend (main.py)                  │
│  routes_generation | routes_video | routes_training | etc   │
└─────────────────────────────────────────────────────────────┘
                              │
        ┌─────────────────────┼─────────────────────┐
        ▼                     ▼                     ▼
┌───────────────┐     ┌───────────────┐     ┌───────────────┐
│   Local GPU   │     │    RunPod     │     │  Cloud APIs   │
│   (ComfyUI)   │     │ (Serverless)  │     │  (WaveSpeed)  │
└───────────────┘     └───────────────┘     └───────────────┘
```

## Cloud Providers

### WaveSpeed (wavespeed_provider.py)

Primary cloud API for image/video generation. Uses the direct HTTP API (SDK optional).
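The submit-then-poll flow against this API can be sketched roughly as follows. The endpoint path and payload fields are illustrative assumptions, not the provider's documented routes; only the async response shape (empty `outputs` plus a `urls.get` poll URL, as noted under **API Pattern**) comes from these notes.

```python
"""Minimal submit-then-poll sketch for WaveSpeed's HTTP API (assumed paths)."""
import time
from typing import Optional


def first_output(data: dict) -> Optional[str]:
    """Return the first output URL once an async job has finished, else None."""
    outputs = data.get("outputs") or []
    return outputs[0] if outputs else None


def generate(prompt: str, api_key: str, model: str = "seedream-4.5") -> str:
    # httpx is imported lazily, mirroring the repo's optional-import pattern
    import httpx

    headers = {"Authorization": f"Bearer {api_key}"}
    with httpx.Client(timeout=60) as client:
        # Submit the job. The URL and payload here are assumed examples;
        # check https://wavespeed.ai/models for the real model endpoints.
        resp = client.post(
            f"https://api.wavespeed.ai/api/v3/{model}",
            headers=headers,
            json={"prompt": prompt},
        )
        resp.raise_for_status()
        data = resp.json()["data"]
        # Async job: outputs[] starts empty, so poll urls.get until populated
        while first_output(data) is None:
            time.sleep(2)
            poll = client.get(data["urls"]["get"], headers=headers)
            poll.raise_for_status()
            data = poll.json()["data"]
        return first_output(data)
```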
**Text-to-Image Models:**

- `seedream-4.5` - Best quality, NSFW OK (ByteDance)
- `seedream-4`, `seedream-3.1` - NSFW friendly
- `gpt-image-1.5`, `gpt-image-1-mini` - OpenAI models
- `nano-banana-pro`, `nano-banana` - Google models
- `wan-2.6`, `wan-2.5` - Alibaba models
- `kling-image-o3` - Kuaishou

**Image-to-Image (Edit) Models:**

- `seedream-4.5-edit` - Best for face preservation
- `seedream-4.5-multi`, `seedream-4-multi` - Multi-reference (up to 3 images)
- `kling-o1-multi` - Multi-reference (up to 10 images)
- `wan-2.6-edit`, `wan-2.5-edit` - NSFW friendly

**Image-to-Video Models:**

- `wan-2.6-i2v-pro` - Best quality ($0.05/s)
- `wan-2.6-i2v-flash` - Fast
- `kling-o3-pro`, `kling-o3` - Kuaishou
- `higgsfield-dop` - Cinematic 5s clips
- `veo-3.1`, `sora-2` - Premium models

**API Pattern:**

```python
# WaveSpeed returns async jobs - must poll for result
response = {"data": {"outputs": [], "urls": {"get": "poll_url"}}}
# Poll urls.get until outputs[] is populated
```

### RunPod

- **Training**: Cloud GPU for LoRA training (runpod_trainer.py)
- **Generation**: Serverless endpoint for inference (runpod_provider.py)

## Character System

Characters link a trained LoRA to generation:

**Config file** (`config/characters/alice.yaml`):

```yaml
id: alice
name: "Alice"
trigger_word: "alicechar"              # Activates the LoRA
lora_filename: "alice_v1.safetensors"  # In D:\ComfyUI\Models\Lora\
lora_strength: 0.85
```

**Generation flow:**

1. User selects character from dropdown
2. System prepends trigger word: `"alicechar, a woman in red dress"`
3. LoRA is loaded into workflow (local/RunPod only)
4.
Character identity is preserved in output

**For cloud-only (no local GPU):**

- Use img2img with reference photo
- Or deploy LoRA to RunPod serverless endpoint

## Templates

Prompt recipes with variables (`config/templates/*.yaml`):

```yaml
id: portrait_glamour
name: "Glamour Portrait"
positive: "{{character}}, {{pose}}, {{lighting}}, professional photo"
variables:
  - name: pose
    options: ["standing", "sitting", "leaning"]
  - name: lighting
    options: ["studio", "natural", "dramatic"]
```

## Key Files

### API Routes

- `routes_generation.py` - txt2img, img2img endpoints
- `routes_video.py` - img2video, WaveSpeed/Higgsfield video
- `routes_training.py` - LoRA training jobs
- `routes_catalog.py` - Gallery/image management
- `routes_system.py` - Health checks, character list

### Services

- `wavespeed_provider.py` - WaveSpeed API client (SDK optional, uses httpx)
- `runpod_trainer.py` - Cloud LoRA training
- `runpod_provider.py` - Cloud generation endpoint
- `comfyui_client.py` - Local ComfyUI integration
- `workflow_builder.py` - ComfyUI workflow JSON builder
- `template_engine.py` - Prompt template rendering
- `variation_engine.py` - Batch variation generation

### Frontend

- `ui.html` - Single-page app with all UI

## Environment Variables

```env
# Cloud APIs
WAVESPEED_API_KEY=ws_xxx   # WaveSpeed.ai API key
RUNPOD_API_KEY=xxx         # RunPod API key
RUNPOD_ENDPOINT_ID=xxx     # RunPod serverless endpoint (for generation)

# Optional
HIGGSFIELD_API_KEY=xxx     # Higgsfield (Kling 3.0, etc.)
```
```env
COMFYUI_URL=http://127.0.0.1:8188   # Local ComfyUI
```

## Database

SQLite with async (aiosqlite):

- `images` - Generated image catalog
- `characters` - Character profiles
- `generation_jobs` - Job tracking
- `scheduled_posts` - Publishing queue

## UI Structure

**Generate Page:**

- Mode chips: Text to Image | Image to Image | Image to Video
- Backend chips: Local GPU | RunPod GPU | Cloud API
- Model dropdowns (conditional on mode/backend)
- Character/Template selectors (2-column grid)
- Prompt textareas
- Output settings (aspect ratio, seed)

**Controls Panel:** 340px width, compact styling

**Drop Zones:** For reference images (character + pose)

## Common Issues

### "Product not found" from WaveSpeed

The model ID doesn't exist. Check `MODEL_MAP`, `EDIT_MODEL_MAP`, `VIDEO_MODEL_MAP` in wavespeed_provider.py against https://wavespeed.ai/models

### "No image URL in output"

WaveSpeed returned an async job: `outputs` is still empty and `urls.get` holds the poll URL. Poll that URL until `outputs` is populated.

### HuggingFace Space startup hang

Check requirements.txt for missing packages. Common: `python-dotenv`, `runpod`, `wavespeed` (optional).

### Import errors on HF Spaces

Wrap optional imports in try/except:

```python
try:
    from wavespeed import Client
    SDK_AVAILABLE = True
except ImportError:
    SDK_AVAILABLE = False
```

## Development Commands

```bash
# Run locally
cd content_engine
python -m uvicorn content_engine.main:app --port 8000 --reload

# Push to HuggingFace
cd content-engine
git add . && git commit -m "message" && git push origin main

# Sync local ↔ HF
cp content_engine/src/content_engine/file.py content-engine/src/content_engine/file.py
```

## Multi-Reference Image Support

For img2img with 2 reference images (character + pose):

1. **UI**: Two drop zones side-by-side
2. **API**: `image` (required) + `image2` (optional) in FormData
3. **Backend**: Both uploaded to temp URLs, sent to WaveSpeed
4.
**Models**: SeeDream Sequential, Kling O1 support multi-ref

## Pricing Notes

- **WaveSpeed**: ~$0.003-0.01 per image, $0.01-0.05/s for video
- **RunPod**: ~$0.0002/s for GPU time (training/generation)
- Cloud API cheaper for light use; RunPod better for volume

## Future Improvements

- [ ] RunPod serverless endpoint for LoRA-based generation
- [ ] Auto-captioning for training images
- [ ] Batch video generation
- [ ] Publishing integrations (social media APIs)
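The "cheaper for light use vs. better for volume" trade-off in the pricing notes can be made concrete with a rough break-even sketch. The per-image rates come from the notes above; `SECONDS_PER_IMAGE` and `COLD_START_SECONDS` are assumed figures, not measurements.

```python
# Rough break-even between WaveSpeed per-image billing and RunPod
# per-second GPU billing, using the ballpark rates from the pricing notes.
# SECONDS_PER_IMAGE and COLD_START_SECONDS are assumptions for illustration.

WAVESPEED_PER_IMAGE = 0.01   # $/image (upper end of the quoted range)
RUNPOD_PER_SECOND = 0.0002   # $/s of GPU time
SECONDS_PER_IMAGE = 20       # assumed inference time per image on RunPod
COLD_START_SECONDS = 60      # assumed serverless cold start, paid per session


def cheaper_backend(images_per_session: int) -> str:
    """Return which backend is cheaper for one generation session."""
    wavespeed_cost = WAVESPEED_PER_IMAGE * images_per_session
    runpod_cost = RUNPOD_PER_SECOND * (
        COLD_START_SECONDS + SECONDS_PER_IMAGE * images_per_session
    )
    return "runpod" if runpod_cost < wavespeed_cost else "wavespeed"
```

Under these assumptions a one-off generation favors the cloud API, while larger batches amortize the cold start and favor RunPod, matching the rule of thumb above.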