# Content Engine
Automated AI content generation system with cloud APIs, LoRA training, and multi-backend support.
## Repositories
- **Local Development**: `D:\AI automation\content_engine\`
- **HuggingFace Deployment**: `D:\AI automation\content-engine\` (deployed to https://huggingface.co/spaces/dippoo/content-engine)
Always sync changes between both directories when modifying code.
## Architecture
```
┌───────────────────────────────────────────────────────────────┐
│                       Frontend (ui.html)                      │
│  Generate | Batch | Gallery | Train LoRA | Status | Settings  │
└───────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌───────────────────────────────────────────────────────────────┐
│                   FastAPI Backend (main.py)                   │
│  routes_generation | routes_video | routes_training | etc.    │
└───────────────────────────────────────────────────────────────┘
                               │
          ┌────────────────────┼────────────────────┐
          ▼                    ▼                    ▼
 ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐
 │    Local GPU    │  │     RunPod      │  │   Cloud APIs    │
 │    (ComfyUI)    │  │  (Serverless)   │  │   (WaveSpeed)   │
 └─────────────────┘  └─────────────────┘  └─────────────────┘
```
## Cloud Providers
### WaveSpeed (wavespeed_provider.py)
Primary cloud API for image/video generation. Uses direct HTTP API (SDK optional).
**Text-to-Image Models:**
- `seedream-4.5` - Best quality, NSFW OK (ByteDance)
- `seedream-4`, `seedream-3.1` - NSFW friendly
- `gpt-image-1.5`, `gpt-image-1-mini` - OpenAI models
- `nano-banana-pro`, `nano-banana` - Google models
- `wan-2.6`, `wan-2.5` - Alibaba models
- `kling-image-o3` - Kuaishou
**Image-to-Image (Edit) Models:**
- `seedream-4.5-edit` - Best for face preservation
- `seedream-4.5-multi`, `seedream-4-multi` - Multi-reference (up to 3 images)
- `kling-o1-multi` - Multi-reference (up to 10 images)
- `wan-2.6-edit`, `wan-2.5-edit` - NSFW friendly
**Image-to-Video Models:**
- `wan-2.6-i2v-pro` - Best quality ($0.05/s)
- `wan-2.6-i2v-flash` - Fast
- `kling-o3-pro`, `kling-o3` - Kuaishou
- `higgsfield-dop` - Cinematic 5s clips
- `veo-3.1`, `sora-2` - Premium models
**API Pattern:**
```python
# WaveSpeed returns async jobs - must poll for result
response = {"data": {"outputs": [], "urls": {"get": "poll_url"}}}
# Poll urls.get until outputs[] is populated
```
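The polling step can be sketched as below. The response shape matches the pattern above; the function itself and its parameters are assumptions, not the actual provider client. The `fetch` callable is injected so the sketch stays transport-agnostic (in practice it would be an httpx GET with the API key header).

```python
import time

def poll_wavespeed(fetch, poll_url, interval=1.0, max_attempts=60):
    """Poll a WaveSpeed job URL until data.outputs[] is populated.

    `fetch` is any callable returning the decoded JSON body for a URL,
    e.g. lambda url: httpx.get(url, headers=auth_headers).json()
    """
    for _ in range(max_attempts):
        data = fetch(poll_url)["data"]
        if data["outputs"]:          # job finished: output URLs present
            return data["outputs"]
        time.sleep(interval)
    raise TimeoutError(f"Job at {poll_url} did not complete in time")
```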
### RunPod
- **Training**: Cloud GPU for LoRA training (runpod_trainer.py)
- **Generation**: Serverless endpoint for inference (runpod_provider.py)
## Character System
Characters link a trained LoRA to generation:
**Config file** (`config/characters/alice.yaml`):
```yaml
id: alice
name: "Alice"
trigger_word: "alicechar" # Activates the LoRA
lora_filename: "alice_v1.safetensors" # In D:\ComfyUI\Models\Lora\
lora_strength: 0.85
```
**Generation flow:**
1. User selects character from dropdown
2. System prepends trigger word: `"alicechar, a woman in red dress"`
3. LoRA is loaded into workflow (local/RunPod only)
4. Character identity is preserved in output
**For cloud-only (no local GPU):**
- Use img2img with reference photo
- Or deploy LoRA to RunPod serverless endpoint
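The trigger-word step above can be sketched as follows. Field names mirror the YAML config; the `Character` dataclass and `build_prompt` helper are illustrative, not the engine's actual code.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Character:
    id: str
    name: str
    trigger_word: str        # prepended to activate the LoRA
    lora_filename: str
    lora_strength: float = 0.85

def build_prompt(character: Optional[Character], user_prompt: str) -> str:
    """Prepend the character's trigger word so the LoRA activates."""
    if character is None:
        return user_prompt
    return f"{character.trigger_word}, {user_prompt}"

alice = Character("alice", "Alice", "alicechar", "alice_v1.safetensors")
build_prompt(alice, "a woman in red dress")
# -> "alicechar, a woman in red dress"
```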
## Templates
Prompt recipes with variables (`config/templates/*.yaml`):
```yaml
id: portrait_glamour
name: "Glamour Portrait"
positive: "{{character}}, {{pose}}, {{lighting}}, professional photo"
variables:
- name: pose
options: ["standing", "sitting", "leaning"]
- name: lighting
options: ["studio", "natural", "dramatic"]
```
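Rendering a recipe like the one above is a straightforward `{{variable}}` substitution. The real logic lives in `template_engine.py`; this is only a minimal sketch of the idea.

```python
import re

def render_template(positive: str, values: dict) -> str:
    """Substitute {{variable}} placeholders with the chosen option."""
    return re.sub(r"\{\{(\w+)\}\}", lambda m: values[m.group(1)], positive)

render_template(
    "{{character}}, {{pose}}, {{lighting}}, professional photo",
    {"character": "alicechar", "pose": "standing", "lighting": "studio"},
)
# -> "alicechar, standing, studio, professional photo"
```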
## Key Files
### API Routes
- `routes_generation.py` - txt2img, img2img endpoints
- `routes_video.py` - img2video, WaveSpeed/Higgsfield video
- `routes_training.py` - LoRA training jobs
- `routes_catalog.py` - Gallery/image management
- `routes_system.py` - Health checks, character list
### Services
- `wavespeed_provider.py` - WaveSpeed API client (SDK optional, uses httpx)
- `runpod_trainer.py` - Cloud LoRA training
- `runpod_provider.py` - Cloud generation endpoint
- `comfyui_client.py` - Local ComfyUI integration
- `workflow_builder.py` - ComfyUI workflow JSON builder
- `template_engine.py` - Prompt template rendering
- `variation_engine.py` - Batch variation generation
### Frontend
- `ui.html` - Single-page app with all UI
## Environment Variables
```env
# Cloud APIs
WAVESPEED_API_KEY=ws_xxx # WaveSpeed.ai API key
RUNPOD_API_KEY=xxx # RunPod API key
RUNPOD_ENDPOINT_ID=xxx # RunPod serverless endpoint (for generation)
# Optional
HIGGSFIELD_API_KEY=xxx # Higgsfield (Kling 3.0, etc.)
COMFYUI_URL=http://127.0.0.1:8188 # Local ComfyUI
```
## Database
SQLite with async (aiosqlite):
- `images` - Generated image catalog
- `characters` - Character profiles
- `generation_jobs` - Job tracking
- `scheduled_posts` - Publishing queue
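A rough sketch of the schema follows. The table names match the list above, but the columns are illustrative rather than the actual migrations, and the real app goes through aiosqlite; plain `sqlite3` is used here for brevity.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS images (
    id INTEGER PRIMARY KEY,
    path TEXT NOT NULL,                     -- illustrative columns
    prompt TEXT,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE IF NOT EXISTS characters (id TEXT PRIMARY KEY, name TEXT);
CREATE TABLE IF NOT EXISTS generation_jobs (id TEXT PRIMARY KEY, status TEXT);
CREATE TABLE IF NOT EXISTS scheduled_posts (id INTEGER PRIMARY KEY, image_id INTEGER);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
tables = {row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table'")}
```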
## UI Structure
**Generate Page:**
- Mode chips: Text to Image | Image to Image | Image to Video
- Backend chips: Local GPU | RunPod GPU | Cloud API
- Model dropdowns (conditional on mode/backend)
- Character/Template selectors (2-column grid)
- Prompt textareas
- Output settings (aspect ratio, seed)
**Controls Panel:** 340px width, compact styling
**Drop Zones:** For reference images (character + pose)
## Common Issues
### "Product not found" from WaveSpeed
Model ID doesn't exist. Check `MODEL_MAP`, `EDIT_MODEL_MAP`, `VIDEO_MODEL_MAP` in wavespeed_provider.py against https://wavespeed.ai/models
### "No image URL in output"
WaveSpeed returned an async job that hasn't finished yet. If `outputs` is empty and `urls.get` exists, the job is still running; poll that URL until `outputs` is populated.
### HuggingFace Space startup hang
Check requirements.txt for missing packages. Common: `python-dotenv`, `runpod`, `wavespeed` (optional).
### Import errors on HF Spaces
Make optional imports with try/except:
```python
try:
from wavespeed import Client
SDK_AVAILABLE = True
except ImportError:
SDK_AVAILABLE = False
```
## Development Commands
```bash
# Run locally
cd content_engine
python -m uvicorn content_engine.main:app --port 8000 --reload
# Push to HuggingFace
cd content-engine
git add . && git commit -m "message" && git push origin main
# Sync local -> HF
cp content_engine/src/content_engine/file.py content-engine/src/content_engine/file.py
```
## Multi-Reference Image Support
For img2img with 2 reference images (character + pose):
1. **UI**: Two drop zones side-by-side
2. **API**: `image` (required) + `image2` (optional) in FormData
3. **Backend**: Both uploaded to temp URLs, sent to WaveSpeed
4. **Models**: SeeDream Sequential, Kling O1 support multi-ref
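Assembling the reference list on the backend might look like the sketch below. The `images`/`prompt` field names are assumptions about the outgoing request shape, not WaveSpeed's documented schema.

```python
from typing import Optional

def build_edit_payload(image_url: str, image2_url: Optional[str] = None,
                       prompt: str = "") -> dict:
    """Build the reference-image list for a multi-ref edit request.

    `image_url` is the required character reference; `image2_url` is the
    optional pose reference from the second drop zone.
    """
    images = [image_url]
    if image2_url:
        images.append(image2_url)
    return {"images": images, "prompt": prompt}
```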
## Pricing Notes
- **WaveSpeed**: ~$0.003-0.01 per image, $0.01-0.05/s for video
- **RunPod**: ~$0.0002/s for GPU time (training/generation)
- Cloud API cheaper for light use; RunPod better for volume
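A back-of-envelope comparison using the figures above. The per-clip RunPod GPU runtime is an assumption for illustration; it also ignores cold starts and idle billing, which is why cloud APIs still win for light use.

```python
# Cost per 5-second video clip, using the rates listed above.
wavespeed_per_sec = 0.05     # wan-2.6-i2v-pro, $ per second of output video
clip_len = 5                 # seconds of video
wavespeed_cost = wavespeed_per_sec * clip_len     # ~$0.25 per clip

runpod_per_sec = 0.0002      # $ per second of GPU time
runpod_runtime = 120         # ASSUMED GPU seconds per clip (hypothetical)
runpod_cost = runpod_per_sec * runpod_runtime     # ~$0.024 per clip
```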
## Future Improvements
- [ ] RunPod serverless endpoint for LoRA-based generation
- [ ] Auto-captioning for training images
- [ ] Batch video generation
- [ ] Publishing integrations (social media APIs)