
Content Engine

Automated AI content generation system with cloud APIs, LoRA training, and multi-backend support.

Repositories

The code lives in two directories: content_engine (local development) and content-engine (the HuggingFace Space). Always sync changes between both directories when modifying code.

Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                        Frontend (ui.html)                    β”‚
β”‚  Generate | Batch | Gallery | Train LoRA | Status | Settings β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                              β”‚
                              β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                    FastAPI Backend (main.py)                 β”‚
β”‚  routes_generation | routes_video | routes_training | etc   β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                              β”‚
        β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
        β–Ό                     β–Ό                     β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  Local GPU    β”‚    β”‚   RunPod      β”‚    β”‚  Cloud APIs   β”‚
β”‚  (ComfyUI)    β”‚    β”‚  (Serverless) β”‚    β”‚  (WaveSpeed)  β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Cloud Providers

WaveSpeed (wavespeed_provider.py)

Primary cloud API for image/video generation. Uses direct HTTP API (SDK optional).

Text-to-Image Models:

  • seedream-4.5 - Best quality, NSFW OK (ByteDance)
  • seedream-4, seedream-3.1 - NSFW friendly
  • gpt-image-1.5, gpt-image-1-mini - OpenAI models
  • nano-banana-pro, nano-banana - Google models
  • wan-2.6, wan-2.5 - Alibaba models
  • kling-image-o3 - Kuaishou

Image-to-Image (Edit) Models:

  • seedream-4.5-edit - Best for face preservation
  • seedream-4.5-multi, seedream-4-multi - Multi-reference (up to 3 images)
  • kling-o1-multi - Multi-reference (up to 10 images)
  • wan-2.6-edit, wan-2.5-edit - NSFW friendly

Image-to-Video Models:

  • wan-2.6-i2v-pro - Best quality ($0.05/s)
  • wan-2.6-i2v-flash - Fast
  • kling-o3-pro, kling-o3 - Kuaishou
  • higgsfield-dop - Cinematic 5s clips
  • veo-3.1, sora-2 - Premium models
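The short names above are resolved to full WaveSpeed product IDs via the lookup maps in wavespeed_provider.py (MODEL_MAP, EDIT_MODEL_MAP, VIDEO_MODEL_MAP, referenced under Common Issues). A minimal sketch of that shape; the right-hand values here are placeholders, not real WaveSpeed product IDs, so verify against https://wavespeed.ai/models:

```python
# Sketch of the lookup-map shape; right-hand values are PLACEHOLDERS,
# not real WaveSpeed product IDs - check https://wavespeed.ai/models
MODEL_MAP = {             # text-to-image
    "seedream-4.5": "example/seedream-4.5",
    "nano-banana-pro": "example/nano-banana-pro",
}
EDIT_MODEL_MAP = {        # image-to-image (edit)
    "seedream-4.5-edit": "example/seedream-4.5-edit",
}
VIDEO_MODEL_MAP = {       # image-to-video
    "wan-2.6-i2v-pro": "example/wan-2.6-i2v-pro",
}

def resolve_model(name: str, mode: str = "t2i") -> str:
    """Return the provider product ID for a short model name."""
    table = {"t2i": MODEL_MAP, "edit": EDIT_MODEL_MAP, "i2v": VIDEO_MODEL_MAP}[mode]
    if name not in table:
        # An unknown ID surfaces as "Product not found" (see Common Issues)
        raise KeyError(f"unknown model: {name}")
    return table[name]
```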

API Pattern:

# WaveSpeed returns async jobs - must poll for result
response = {"data": {"outputs": [], "urls": {"get": "poll_url"}}}
# Poll urls.get until outputs[] is populated
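The polling loop implied above can be sketched as follows. The `fetch` callable stands in for an HTTP GET against urls.get (e.g. with httpx plus auth headers); the function name and parameters are illustrative, only the response shape follows the pattern shown.

```python
import time

def poll_until_done(fetch, interval=2.0, timeout=120.0):
    """Poll a WaveSpeed job until outputs[] is populated.

    `fetch` is any callable returning the parsed JSON of urls.get.
    Sketch only - field names follow the response shape shown above.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        data = fetch()["data"]
        if data.get("outputs"):       # empty until the job finishes
            return data["outputs"]
        time.sleep(interval)
    raise TimeoutError("WaveSpeed job did not finish in time")
```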

RunPod

  • Training: Cloud GPU for LoRA training (runpod_trainer.py)
  • Generation: Serverless endpoint for inference (runpod_provider.py)

Character System

Characters link a trained LoRA to generation:

Config file (config/characters/alice.yaml):

id: alice
name: "Alice"
trigger_word: "alicechar"              # Activates the LoRA
lora_filename: "alice_v1.safetensors"  # In D:\ComfyUI\Models\Lora\
lora_strength: 0.85

Generation flow:

  1. User selects character from dropdown
  2. System prepends trigger word: "alicechar, a woman in red dress"
  3. LoRA is loaded into workflow (local/RunPod only)
  4. Character identity is preserved in output
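Step 2 of the flow above can be sketched as a small helper. The function name is hypothetical; the behavior (prepend the trigger word unless it is already there) matches the example prompt shown.

```python
def apply_character(prompt: str, trigger_word: str) -> str:
    """Prepend the LoRA trigger word to a prompt (step 2 above)."""
    if prompt.lower().startswith(trigger_word.lower()):
        return prompt  # trigger already present, avoid doubling it
    return f"{trigger_word}, {prompt}"
```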

For cloud-only (no local GPU):

  • Use img2img with reference photo
  • Or deploy LoRA to RunPod serverless endpoint

Templates

Prompt recipes with variables (config/templates/*.yaml):

id: portrait_glamour
name: "Glamour Portrait"
positive: "{{character}}, {{pose}}, {{lighting}}, professional photo"
variables:
  - name: pose
    options: ["standing", "sitting", "leaning"]
  - name: lighting
    options: ["studio", "natural", "dramatic"]

Key Files

API Routes

  • routes_generation.py - txt2img, img2img endpoints
  • routes_video.py - img2video, WaveSpeed/Higgsfield video
  • routes_training.py - LoRA training jobs
  • routes_catalog.py - Gallery/image management
  • routes_system.py - Health checks, character list

Services

  • wavespeed_provider.py - WaveSpeed API client (SDK optional, uses httpx)
  • runpod_trainer.py - Cloud LoRA training
  • runpod_provider.py - Cloud generation endpoint
  • comfyui_client.py - Local ComfyUI integration
  • workflow_builder.py - ComfyUI workflow JSON builder
  • template_engine.py - Prompt template rendering
  • variation_engine.py - Batch variation generation

Frontend

  • ui.html - Single-page app with all UI

Environment Variables

# Cloud APIs
WAVESPEED_API_KEY=ws_xxx           # WaveSpeed.ai API key
RUNPOD_API_KEY=xxx                 # RunPod API key
RUNPOD_ENDPOINT_ID=xxx             # RunPod serverless endpoint (for generation)

# Optional
HIGGSFIELD_API_KEY=xxx             # Higgsfield (Kling 3.0, etc.)
COMFYUI_URL=http://127.0.0.1:8188  # Local ComfyUI
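Reading these variables at startup can be sketched with plain os.environ (python-dotenv, mentioned under Common Issues, would load a .env file first). The settings dict shape is an assumption; only the variable names and the COMFYUI_URL default come from this doc.

```python
import os

def load_settings() -> dict:
    """Read the environment variables above; COMFYUI_URL has a default."""
    return {
        "wavespeed_api_key": os.environ.get("WAVESPEED_API_KEY"),
        "runpod_api_key": os.environ.get("RUNPOD_API_KEY"),
        "runpod_endpoint_id": os.environ.get("RUNPOD_ENDPOINT_ID"),
        "higgsfield_api_key": os.environ.get("HIGGSFIELD_API_KEY"),   # optional
        "comfyui_url": os.environ.get("COMFYUI_URL", "http://127.0.0.1:8188"),
    }
```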

Database

SQLite with async (aiosqlite):

  • images - Generated image catalog
  • characters - Character profiles
  • generation_jobs - Job tracking
  • scheduled_posts - Publishing queue
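The schema can be sketched as below. The real service uses aiosqlite; the synchronous sqlite3 module is used here only for illustration, and the column names are assumptions - only the four table names come from this doc.

```python
import sqlite3

# Column names are illustrative; table names match the doc.
SCHEMA = """
CREATE TABLE IF NOT EXISTS images (
    id INTEGER PRIMARY KEY,
    path TEXT NOT NULL,
    prompt TEXT,
    created_at TEXT
);
CREATE TABLE IF NOT EXISTS characters (
    id TEXT PRIMARY KEY,
    name TEXT,
    trigger_word TEXT
);
CREATE TABLE IF NOT EXISTS generation_jobs (
    id INTEGER PRIMARY KEY,
    status TEXT,
    backend TEXT
);
CREATE TABLE IF NOT EXISTS scheduled_posts (
    id INTEGER PRIMARY KEY,
    image_id INTEGER REFERENCES images(id),
    post_at TEXT
);
"""

def init_db(path=":memory:"):
    """Create the catalog tables if they do not exist yet."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn
```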

UI Structure

Generate Page:

  • Mode chips: Text to Image | Image to Image | Image to Video
  • Backend chips: Local GPU | RunPod GPU | Cloud API
  • Model dropdowns (conditional on mode/backend)
  • Character/Template selectors (2-column grid)
  • Prompt textareas
  • Output settings (aspect ratio, seed)

Controls Panel: 340px width, compact styling
Drop Zones: For reference images (character + pose)

Common Issues

"Product not found" from WaveSpeed

Model ID doesn't exist. Check MODEL_MAP, EDIT_MODEL_MAP, VIDEO_MODEL_MAP in wavespeed_provider.py against https://wavespeed.ai/models

"No image URL in output"

WaveSpeed returned an async job. If outputs is empty and urls.get exists, poll that URL until outputs is populated.

HuggingFace Space startup hang

Check requirements.txt for missing packages. Common: python-dotenv, runpod, wavespeed (optional).

Import errors on HF Spaces

Make optional imports with try/except:

try:
    from wavespeed import Client
    SDK_AVAILABLE = True
except ImportError:
    SDK_AVAILABLE = False

Development Commands

# Run locally
cd content_engine
python -m uvicorn content_engine.main:app --port 8000 --reload

# Push to HuggingFace
cd content-engine
git add . && git commit -m "message" && git push origin main

# Sync local ↔ HF
cp content_engine/src/content_engine/file.py content-engine/src/content_engine/file.py

Multi-Reference Image Support

For img2img with 2 reference images (character + pose):

  1. UI: Two drop zones side-by-side
  2. API: image (required) + image2 (optional) in FormData
  3. Backend: Both uploaded to temp URLs, sent to WaveSpeed
  4. Models: SeeDream Sequential, Kling O1 support multi-ref
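Assembling the FormData fields from step 2 can be sketched as below; image is always sent, image2 only when a second reference was dropped. The function name is hypothetical.

```python
def build_form_fields(image: bytes, image2=None) -> dict:
    """Build the multipart fields for img2img; image2 is optional."""
    fields = {"image": image}          # required character reference
    if image2 is not None:
        fields["image2"] = image2      # optional pose reference
    return fields
```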

Pricing Notes

  • WaveSpeed: ~$0.003-0.01 per image, $0.01-0.05/s for video
  • RunPod: ~$0.0002/s for GPU time (training/generation)
  • Cloud API cheaper for light use; RunPod better for volume
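The light-use vs. volume trade-off above can be made concrete with a rough estimator. The per-image and per-second rates are the midpoints quoted above; seconds-per-image for RunPod is an assumption, so treat the numbers as order-of-magnitude only.

```python
def cloud_cost(images: int, per_image: float = 0.005) -> float:
    """Cloud API cost at ~$0.005/image (midpoint of the range above)."""
    return images * per_image

def runpod_cost(images: int, secs_per_image: float = 10.0,
                per_second: float = 0.0002) -> float:
    """RunPod GPU cost; secs_per_image is an assumed generation time."""
    return images * secs_per_image * per_second
```

Under these assumptions RunPod undercuts the cloud API at any volume, but the cloud API has no cold-start or endpoint-management overhead, which dominates for light use.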

Future Improvements

  • RunPod serverless endpoint for LoRA-based generation
  • Auto-captioning for training images
  • Batch video generation
  • Publishing integrations (social media APIs)