# Content Engine
Automated AI content generation system with cloud APIs, LoRA training, and multi-backend support.
## Repositories
- **Local Development**: `D:\AI automation\content_engine\`
- **HuggingFace Deployment**: `D:\AI automation\content-engine\` (deployed to https://huggingface.co/spaces/dippoo/content-engine)
Always sync changes between both directories when modifying code.
## Architecture
```
┌─────────────────────────────────────────────────────────────┐
│                     Frontend (ui.html)                      │
│ Generate | Batch | Gallery | Train LoRA | Status | Settings │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                  FastAPI Backend (main.py)                  │
│  routes_generation | routes_video | routes_training | etc   │
└─────────────────────────────────────────────────────────────┘
                               │
          ┌────────────────────┼────────────────────┐
          ▼                    ▼                    ▼
  ┌───────────────┐    ┌───────────────┐    ┌───────────────┐
  │   Local GPU   │    │    RunPod     │    │  Cloud APIs   │
  │   (ComfyUI)   │    │ (Serverless)  │    │  (WaveSpeed)  │
  └───────────────┘    └───────────────┘    └───────────────┘
```
## Cloud Providers
### WaveSpeed (wavespeed_provider.py)
Primary cloud API for image/video generation. Uses direct HTTP API (SDK optional).
**Text-to-Image Models:**
- `seedream-4.5` - Best quality, NSFW OK (ByteDance)
- `seedream-4`, `seedream-3.1` - NSFW friendly
- `gpt-image-1.5`, `gpt-image-1-mini` - OpenAI models
- `nano-banana-pro`, `nano-banana` - Google models
- `wan-2.6`, `wan-2.5` - Alibaba models
- `kling-image-o3` - Kuaishou
**Image-to-Image (Edit) Models:**
- `seedream-4.5-edit` - Best for face preservation
- `seedream-4.5-multi`, `seedream-4-multi` - Multi-reference (up to 3 images)
- `kling-o1-multi` - Multi-reference (up to 10 images)
- `wan-2.6-edit`, `wan-2.5-edit` - NSFW friendly
**Image-to-Video Models:**
- `wan-2.6-i2v-pro` - Best quality ($0.05/s)
- `wan-2.6-i2v-flash` - Fast
- `kling-o3-pro`, `kling-o3` - Kuaishou
- `higgsfield-dop` - Cinematic 5s clips
- `veo-3.1`, `sora-2` - Premium models
**API Pattern:**
```python
# WaveSpeed returns async jobs - must poll for result
response = {"data": {"outputs": [], "urls": {"get": "poll_url"}}}
# Poll urls.get until outputs[] is populated
```
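A minimal polling loop for this pattern might look like the sketch below. The `fetch` callable is injected so the loop is transport-agnostic (in the real client it would be something like `lambda url: httpx.get(url, headers=...).json()`); the `status` field name is an assumption:

```python
import time

def poll_for_outputs(fetch, poll_url, interval=2.0, timeout=120.0):
    """Poll a WaveSpeed job URL until data.outputs is populated.

    `fetch` is any callable that returns the parsed JSON of a GET
    on `poll_url`.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        data = fetch(poll_url).get("data", {})
        outputs = data.get("outputs") or []
        if outputs:
            return outputs  # list of result URLs
        if data.get("status") in ("failed", "error"):  # field name assumed
            raise RuntimeError(f"WaveSpeed job failed: {data}")
        time.sleep(interval)
    raise TimeoutError(f"Job at {poll_url} did not finish in {timeout}s")
```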
### RunPod
- **Training**: Cloud GPU for LoRA training (runpod_trainer.py)
- **Generation**: Serverless endpoint for inference (runpod_provider.py)
## Character System
Characters link a trained LoRA to generation:
**Config file** (`config/characters/alice.yaml`):
```yaml
id: alice
name: "Alice"
trigger_word: "alicechar" # Activates the LoRA
lora_filename: "alice_v1.safetensors" # In D:\ComfyUI\Models\Lora\
lora_strength: 0.85
```
**Generation flow:**
1. User selects character from dropdown
2. System prepends trigger word: `"alicechar, a woman in red dress"`
3. LoRA is loaded into workflow (local/RunPod only)
4. Character identity is preserved in output
**For cloud-only (no local GPU):**
- Use img2img with reference photo
- Or deploy LoRA to RunPod serverless endpoint
## Templates
Prompt recipes with variables (`config/templates/*.yaml`):
```yaml
id: portrait_glamour
name: "Glamour Portrait"
positive: "{{character}}, {{pose}}, {{lighting}}, professional photo"
variables:
  - name: pose
    options: ["standing", "sitting", "leaning"]
  - name: lighting
    options: ["studio", "natural", "dramatic"]
```
## Key Files
### API Routes
- `routes_generation.py` - txt2img, img2img endpoints
- `routes_video.py` - img2video, WaveSpeed/Higgsfield video
- `routes_training.py` - LoRA training jobs
- `routes_catalog.py` - Gallery/image management
- `routes_system.py` - Health checks, character list
### Services
- `wavespeed_provider.py` - WaveSpeed API client (SDK optional, uses httpx)
- `runpod_trainer.py` - Cloud LoRA training
- `runpod_provider.py` - Cloud generation endpoint
- `comfyui_client.py` - Local ComfyUI integration
- `workflow_builder.py` - ComfyUI workflow JSON builder
- `template_engine.py` - Prompt template rendering
- `variation_engine.py` - Batch variation generation
### Frontend
- `ui.html` - Single-page app with all UI
## Environment Variables
```env
# Cloud APIs
WAVESPEED_API_KEY=ws_xxx # WaveSpeed.ai API key
RUNPOD_API_KEY=xxx # RunPod API key
RUNPOD_ENDPOINT_ID=xxx # RunPod serverless endpoint (for generation)
# Optional
HIGGSFIELD_API_KEY=xxx # Higgsfield (Kling 3.0, etc.)
COMFYUI_URL=http://127.0.0.1:8188 # Local ComfyUI
```
## Database
SQLite with async (aiosqlite):
- `images` - Generated image catalog
- `characters` - Character profiles
- `generation_jobs` - Job tracking
- `scheduled_posts` - Publishing queue
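An illustrative schema for two of these tables (column names are assumptions, not the project's actual migrations; the app itself uses aiosqlite, whose API mirrors the synchronous `sqlite3` shown here but with `await db.execute(...)`):

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS images (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    filename   TEXT NOT NULL,
    prompt     TEXT,
    model      TEXT,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE IF NOT EXISTS generation_jobs (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    status     TEXT NOT NULL DEFAULT 'pending',
    backend    TEXT,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
"""

def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open the catalog database and ensure the tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn
```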
## UI Structure
**Generate Page:**
- Mode chips: Text to Image | Image to Image | Image to Video
- Backend chips: Local GPU | RunPod GPU | Cloud API
- Model dropdowns (conditional on mode/backend)
- Character/Template selectors (2-column grid)
- Prompt textareas
- Output settings (aspect ratio, seed)
**Controls Panel:** 340px width, compact styling
**Drop Zones:** For reference images (character + pose)
## Common Issues
### "Product not found" from WaveSpeed
The model ID doesn't exist on WaveSpeed. Check `MODEL_MAP`, `EDIT_MODEL_MAP`, and `VIDEO_MODEL_MAP` in wavespeed_provider.py against https://wavespeed.ai/models
### "No image URL in output"
WaveSpeed returned an async job instead of a direct result. If `outputs` is empty but `urls.get` is present, the job is still running; poll that URL until `outputs` is populated.
### HuggingFace Space startup hang
Check requirements.txt for missing packages. Common: `python-dotenv`, `runpod`, `wavespeed` (optional).
### Import errors on HF Spaces
Make optional imports with try/except:
```python
try:
    from wavespeed import Client
    SDK_AVAILABLE = True
except ImportError:
    SDK_AVAILABLE = False
```
## Development Commands
```bash
# Run locally
cd content_engine
python -m uvicorn content_engine.main:app --port 8000 --reload
# Push to HuggingFace
cd content-engine
git add . && git commit -m "message" && git push origin main
# Sync local ↔ HF
cp content_engine/src/content_engine/file.py content-engine/src/content_engine/file.py
```
## Multi-Reference Image Support
For img2img with 2 reference images (character + pose):
1. **UI**: Two drop zones side-by-side
2. **API**: `image` (required) + `image2` (optional) in FormData
3. **Backend**: Both uploaded to temp URLs, sent to WaveSpeed
4. **Models**: SeeDream Sequential, Kling O1 support multi-ref
## Pricing Notes
- **WaveSpeed**: ~$0.003-0.01 per image, $0.01-0.05/s for video
- **RunPod**: ~$0.0002/s for GPU time (training/generation)
- Cloud API cheaper for light use; RunPod better for volume
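A back-of-envelope comparison of the two options; every number below is an assumption picked from the ranges quoted above:

```python
# All figures are illustrative assumptions, not measured costs.
cloud_per_image = 0.005      # mid-range WaveSpeed image price ($)
runpod_per_sec = 0.0002      # RunPod GPU-second price ($)
secs_per_image = 15          # assumed generation time per image on RunPod

runpod_per_image = runpod_per_sec * secs_per_image  # $0.003 per image
cheaper = "runpod" if runpod_per_image < cloud_per_image else "cloud"
# RunPod wins per image at these rates, but cold starts and idle time
# only amortize at volume -- hence "cloud for light use, RunPod for volume".
```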
## Future Improvements
- [ ] RunPod serverless endpoint for LoRA-based generation
- [ ] Auto-captioning for training images
- [ ] Batch video generation
- [ ] Publishing integrations (social media APIs)