# Content Engine
Automated AI content generation system with cloud APIs, LoRA training, and multi-backend support.
## Repositories
- **Local Development**: `D:\AI automation\content_engine\`
- **HuggingFace Deployment**: `D:\AI automation\content-engine\` (deployed to https://huggingface.co/spaces/dippoo/content-engine)
Always sync changes between both directories when modifying code.
## Architecture
```
┌───────────────────────────────────────────────────────────────┐
│                       Frontend (ui.html)                      │
│  Generate | Batch | Gallery | Train LoRA | Status | Settings  │
└───────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌───────────────────────────────────────────────────────────────┐
│                   FastAPI Backend (main.py)                   │
│  routes_generation | routes_video | routes_training | etc.    │
└───────────────────────────────────────────────────────────────┘
                               │
          ┌────────────────────┼────────────────────┐
          ▼                    ▼                    ▼
 ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐
 │    Local GPU    │  │     RunPod      │  │   Cloud APIs    │
 │    (ComfyUI)    │  │  (Serverless)   │  │   (WaveSpeed)   │
 └─────────────────┘  └─────────────────┘  └─────────────────┘
```
## Cloud Providers
### WaveSpeed (wavespeed_provider.py)
Primary cloud API for image/video generation. Uses direct HTTP API (SDK optional).
**Text-to-Image Models:**
- `seedream-4.5` - Best quality, NSFW OK (ByteDance)
- `seedream-4`, `seedream-3.1` - NSFW friendly
- `gpt-image-1.5`, `gpt-image-1-mini` - OpenAI models
- `nano-banana-pro`, `nano-banana` - Google models
- `wan-2.6`, `wan-2.5` - Alibaba models
- `kling-image-o3` - Kuaishou
**Image-to-Image (Edit) Models:**
- `seedream-4.5-edit` - Best for face preservation
- `seedream-4.5-multi`, `seedream-4-multi` - Multi-reference (up to 3 images)
- `kling-o1-multi` - Multi-reference (up to 10 images)
- `wan-2.6-edit`, `wan-2.5-edit` - NSFW friendly
**Image-to-Video Models:**
- `wan-2.6-i2v-pro` - Best quality ($0.05/s)
- `wan-2.6-i2v-flash` - Fast
- `kling-o3-pro`, `kling-o3` - Kuaishou
- `higgsfield-dop` - Cinematic 5s clips
- `veo-3.1`, `sora-2` - Premium models
**API Pattern:**
```python
# WaveSpeed returns async jobs - must poll for result
response = {"data": {"outputs": [], "urls": {"get": "poll_url"}}}
# Poll urls.get until outputs[] is populated
```
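The polling step can be sketched as below. The response shape matches the pattern above; the function itself and its parameters are assumptions, not the actual provider client. The `fetch` callable is injected so the sketch stays transport-agnostic (in practice it would be an httpx GET with the API key header).

```python
import time

def poll_wavespeed(fetch, poll_url, interval=1.0, max_attempts=60):
    """Poll a WaveSpeed job URL until data.outputs[] is populated.

    `fetch` is any callable returning the decoded JSON body for a URL,
    e.g. lambda url: httpx.get(url, headers=auth_headers).json()
    """
    for _ in range(max_attempts):
        data = fetch(poll_url)["data"]
        if data["outputs"]:          # job finished: output URLs present
            return data["outputs"]
        time.sleep(interval)
    raise TimeoutError(f"Job at {poll_url} did not complete in time")
```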
### RunPod
- **Training**: Cloud GPU for LoRA training (runpod_trainer.py)
- **Generation**: Serverless endpoint for inference (runpod_provider.py)
## Character System
Characters link a trained LoRA to generation:
**Config file** (`config/characters/alice.yaml`):
```yaml
id: alice
name: "Alice"
trigger_word: "alicechar" # Activates the LoRA
lora_filename: "alice_v1.safetensors" # In D:\ComfyUI\Models\Lora\
lora_strength: 0.85
```
**Generation flow:**
1. User selects character from dropdown
2. System prepends trigger word: `"alicechar, a woman in red dress"`
3. LoRA is loaded into workflow (local/RunPod only)
4. Character identity is preserved in output
**For cloud-only (no local GPU):**
- Use img2img with reference photo
- Or deploy LoRA to RunPod serverless endpoint
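The trigger-word step above can be sketched as follows. Field names mirror the YAML config; the `Character` dataclass and `build_prompt` helper are illustrative, not the engine's actual code.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Character:
    id: str
    name: str
    trigger_word: str        # prepended to activate the LoRA
    lora_filename: str
    lora_strength: float = 0.85

def build_prompt(character: Optional[Character], user_prompt: str) -> str:
    """Prepend the character's trigger word so the LoRA activates."""
    if character is None:
        return user_prompt
    return f"{character.trigger_word}, {user_prompt}"

alice = Character("alice", "Alice", "alicechar", "alice_v1.safetensors")
build_prompt(alice, "a woman in red dress")
# -> "alicechar, a woman in red dress"
```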
## Templates
Prompt recipes with variables (`config/templates/*.yaml`):
```yaml
id: portrait_glamour
name: "Glamour Portrait"
positive: "{{character}}, {{pose}}, {{lighting}}, professional photo"
variables:
- name: pose
options: ["standing", "sitting", "leaning"]
- name: lighting
options: ["studio", "natural", "dramatic"]
```
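Rendering a recipe like the one above is a straightforward `{{variable}}` substitution. The real logic lives in `template_engine.py`; this is only a minimal sketch of the idea.

```python
import re

def render_template(positive: str, values: dict) -> str:
    """Substitute {{variable}} placeholders with the chosen option."""
    return re.sub(r"\{\{(\w+)\}\}", lambda m: values[m.group(1)], positive)

render_template(
    "{{character}}, {{pose}}, {{lighting}}, professional photo",
    {"character": "alicechar", "pose": "standing", "lighting": "studio"},
)
# -> "alicechar, standing, studio, professional photo"
```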
## Key Files
### API Routes
- `routes_generation.py` - txt2img, img2img endpoints
- `routes_video.py` - img2video, WaveSpeed/Higgsfield video
- `routes_training.py` - LoRA training jobs
- `routes_catalog.py` - Gallery/image management
- `routes_system.py` - Health checks, character list
### Services
- `wavespeed_provider.py` - WaveSpeed API client (SDK optional, uses httpx)
- `runpod_trainer.py` - Cloud LoRA training
- `runpod_provider.py` - Cloud generation endpoint
- `comfyui_client.py` - Local ComfyUI integration
- `workflow_builder.py` - ComfyUI workflow JSON builder
- `template_engine.py` - Prompt template rendering
- `variation_engine.py` - Batch variation generation
### Frontend
- `ui.html` - Single-page app with all UI
## Environment Variables
```env
# Cloud APIs
WAVESPEED_API_KEY=ws_xxx # WaveSpeed.ai API key
RUNPOD_API_KEY=xxx # RunPod API key
RUNPOD_ENDPOINT_ID=xxx # RunPod serverless endpoint (for generation)
# Optional
HIGGSFIELD_API_KEY=xxx # Higgsfield (Kling 3.0, etc.)
COMFYUI_URL=http://127.0.0.1:8188 # Local ComfyUI
```
## Database
SQLite with async (aiosqlite):
- `images` - Generated image catalog
- `characters` - Character profiles
- `generation_jobs` - Job tracking
- `scheduled_posts` - Publishing queue
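A rough sketch of the schema follows. The table names match the list above, but the columns are illustrative rather than the actual migrations, and the real app goes through aiosqlite; plain `sqlite3` is used here for brevity.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS images (
    id INTEGER PRIMARY KEY,
    path TEXT NOT NULL,                     -- illustrative columns
    prompt TEXT,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE IF NOT EXISTS characters (id TEXT PRIMARY KEY, name TEXT);
CREATE TABLE IF NOT EXISTS generation_jobs (id TEXT PRIMARY KEY, status TEXT);
CREATE TABLE IF NOT EXISTS scheduled_posts (id INTEGER PRIMARY KEY, image_id INTEGER);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
tables = {row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table'")}
```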
## UI Structure
**Generate Page:**
- Mode chips: Text to Image | Image to Image | Image to Video
- Backend chips: Local GPU | RunPod GPU | Cloud API
- Model dropdowns (conditional on mode/backend)
- Character/Template selectors (2-column grid)
- Prompt textareas
- Output settings (aspect ratio, seed)
**Controls Panel:** 340px width, compact styling
**Drop Zones:** For reference images (character + pose)
## Common Issues
### "Product not found" from WaveSpeed
Model ID doesn't exist. Check `MODEL_MAP`, `EDIT_MODEL_MAP`, `VIDEO_MODEL_MAP` in wavespeed_provider.py against https://wavespeed.ai/models
### "No image URL in output"
WaveSpeed returned an async job that hasn't finished yet. If `outputs` is empty and `urls.get` exists, the job is still running; poll that URL until `outputs` is populated.
### HuggingFace Space startup hang
Check requirements.txt for missing packages. Common: `python-dotenv`, `runpod`, `wavespeed` (optional).
### Import errors on HF Spaces
Make optional imports with try/except:
```python
try:
from wavespeed import Client
SDK_AVAILABLE = True
except ImportError:
SDK_AVAILABLE = False
```
## Development Commands
```bash
# Run locally
cd content_engine
python -m uvicorn content_engine.main:app --port 8000 --reload
# Push to HuggingFace
cd content-engine
git add . && git commit -m "message" && git push origin main
# Sync local -> HF
cp content_engine/src/content_engine/file.py content-engine/src/content_engine/file.py
```
## Multi-Reference Image Support
For img2img with 2 reference images (character + pose):
1. **UI**: Two drop zones side-by-side
2. **API**: `image` (required) + `image2` (optional) in FormData
3. **Backend**: Both uploaded to temp URLs, sent to WaveSpeed
4. **Models**: SeeDream Sequential, Kling O1 support multi-ref
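Assembling the reference list on the backend might look like the sketch below. The `images`/`prompt` field names are assumptions about the outgoing request shape, not WaveSpeed's documented schema.

```python
from typing import Optional

def build_edit_payload(image_url: str, image2_url: Optional[str] = None,
                       prompt: str = "") -> dict:
    """Build the reference-image list for a multi-ref edit request.

    `image_url` is the required character reference; `image2_url` is the
    optional pose reference from the second drop zone.
    """
    images = [image_url]
    if image2_url:
        images.append(image2_url)
    return {"images": images, "prompt": prompt}
```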
## Pricing Notes
- **WaveSpeed**: ~$0.003-0.01 per image, $0.01-0.05/s for video
- **RunPod**: ~$0.0002/s for GPU time (training/generation)
- Cloud API cheaper for light use; RunPod better for volume
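A back-of-envelope comparison using the figures above. The per-clip RunPod GPU runtime is an assumption for illustration; it also ignores cold starts and idle billing, which is why cloud APIs still win for light use.

```python
# Cost per 5-second video clip, using the rates listed above.
wavespeed_per_sec = 0.05     # wan-2.6-i2v-pro, $ per second of output video
clip_len = 5                 # seconds of video
wavespeed_cost = wavespeed_per_sec * clip_len     # ~$0.25 per clip

runpod_per_sec = 0.0002      # $ per second of GPU time
runpod_runtime = 120         # ASSUMED GPU seconds per clip (hypothetical)
runpod_cost = runpod_per_sec * runpod_runtime     # ~$0.024 per clip
```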
## Future Improvements
- [ ] RunPod serverless endpoint for LoRA-based generation
- [ ] Auto-captioning for training images
- [ ] Batch video generation
- [ ] Publishing integrations (social media APIs)