# Content Engine
Automated AI content generation system with cloud APIs, LoRA training, and multi-backend support.
## Repositories
- **Local Development**: `D:\AI automation\content_engine\`
- **HuggingFace Deployment**: `D:\AI automation\content-engine\` (deployed to https://huggingface.co/spaces/dippoo/content-engine)
Always sync changes between both directories when modifying code.
## Architecture
```
┌─────────────────────────────────────────────────────────────┐
│                     Frontend (ui.html)                      │
│ Generate | Batch | Gallery | Train LoRA | Status | Settings │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                  FastAPI Backend (main.py)                  │
│  routes_generation | routes_video | routes_training | etc   │
└─────────────────────────────────────────────────────────────┘
                               │
          ┌────────────────────┼────────────────────┐
          ▼                    ▼                    ▼
  ┌───────────────┐    ┌───────────────┐    ┌───────────────┐
  │   Local GPU   │    │    RunPod     │    │  Cloud APIs   │
  │   (ComfyUI)   │    │ (Serverless)  │    │  (WaveSpeed)  │
  └───────────────┘    └───────────────┘    └───────────────┘
```
## Cloud Providers
### WaveSpeed (wavespeed_provider.py)
Primary cloud API for image/video generation. Uses direct HTTP API (SDK optional).
**Text-to-Image Models:**
- `seedream-4.5` - Best quality, NSFW OK (ByteDance)
- `seedream-4`, `seedream-3.1` - NSFW friendly
- `gpt-image-1.5`, `gpt-image-1-mini` - OpenAI models
- `nano-banana-pro`, `nano-banana` - Google models
- `wan-2.6`, `wan-2.5` - Alibaba models
- `kling-image-o3` - Kuaishou
**Image-to-Image (Edit) Models:**
- `seedream-4.5-edit` - Best for face preservation
- `seedream-4.5-multi`, `seedream-4-multi` - Multi-reference (up to 3 images)
- `kling-o1-multi` - Multi-reference (up to 10 images)
- `wan-2.6-edit`, `wan-2.5-edit` - NSFW friendly
**Image-to-Video Models:**
- `wan-2.6-i2v-pro` - Best quality ($0.05/s)
- `wan-2.6-i2v-flash` - Fast
- `kling-o3-pro`, `kling-o3` - Kuaishou
- `higgsfield-dop` - Cinematic 5s clips
- `veo-3.1`, `sora-2` - Premium models
**API Pattern:**
```python
# WaveSpeed returns async jobs - must poll for result
response = {"data": {"outputs": [], "urls": {"get": "poll_url"}}}
# Poll urls.get until outputs[] is populated
```
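A minimal polling loop for this pattern might look like the sketch below. The `fetch` callable is injected so the loop is transport-agnostic (in the real client it would be something like `lambda url: httpx.get(url, headers=...).json()`); the `status` field name is an assumption:

```python
import time

def poll_for_outputs(fetch, poll_url, interval=2.0, timeout=120.0):
    """Poll a WaveSpeed job URL until data.outputs is populated.

    `fetch` is any callable that returns the parsed JSON of a GET
    on `poll_url`.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        data = fetch(poll_url).get("data", {})
        outputs = data.get("outputs") or []
        if outputs:
            return outputs  # list of result URLs
        if data.get("status") in ("failed", "error"):  # field name assumed
            raise RuntimeError(f"WaveSpeed job failed: {data}")
        time.sleep(interval)
    raise TimeoutError(f"Job at {poll_url} did not finish in {timeout}s")
```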
### RunPod
- **Training**: Cloud GPU for LoRA training (runpod_trainer.py)
- **Generation**: Serverless endpoint for inference (runpod_provider.py)
## Character System
Characters link a trained LoRA to generation:
**Config file** (`config/characters/alice.yaml`):
```yaml
id: alice
name: "Alice"
trigger_word: "alicechar" # Activates the LoRA
lora_filename: "alice_v1.safetensors" # In D:\ComfyUI\Models\Lora\
lora_strength: 0.85
```
**Generation flow:**
1. User selects character from dropdown
2. System prepends trigger word: `"alicechar, a woman in red dress"`
3. LoRA is loaded into workflow (local/RunPod only)
4. Character identity is preserved in output
**For cloud-only (no local GPU):**
- Use img2img with reference photo
- Or deploy LoRA to RunPod serverless endpoint
## Templates
Prompt recipes with variables (`config/templates/*.yaml`):
```yaml
id: portrait_glamour
name: "Glamour Portrait"
positive: "{{character}}, {{pose}}, {{lighting}}, professional photo"
variables:
  - name: pose
    options: ["standing", "sitting", "leaning"]
  - name: lighting
    options: ["studio", "natural", "dramatic"]
```
## Key Files
### API Routes
- `routes_generation.py` - txt2img, img2img endpoints
- `routes_video.py` - img2video, WaveSpeed/Higgsfield video
- `routes_training.py` - LoRA training jobs
- `routes_catalog.py` - Gallery/image management
- `routes_system.py` - Health checks, character list
### Services
- `wavespeed_provider.py` - WaveSpeed API client (SDK optional, uses httpx)
- `runpod_trainer.py` - Cloud LoRA training
- `runpod_provider.py` - Cloud generation endpoint
- `comfyui_client.py` - Local ComfyUI integration
- `workflow_builder.py` - ComfyUI workflow JSON builder
- `template_engine.py` - Prompt template rendering
- `variation_engine.py` - Batch variation generation
### Frontend
- `ui.html` - Single-page app with all UI
## Environment Variables
```env
# Cloud APIs
WAVESPEED_API_KEY=ws_xxx # WaveSpeed.ai API key
RUNPOD_API_KEY=xxx # RunPod API key
RUNPOD_ENDPOINT_ID=xxx # RunPod serverless endpoint (for generation)
# Optional
HIGGSFIELD_API_KEY=xxx # Higgsfield (Kling 3.0, etc.)
COMFYUI_URL=http://127.0.0.1:8188 # Local ComfyUI
```
## Database
SQLite with async (aiosqlite):
- `images` - Generated image catalog
- `characters` - Character profiles
- `generation_jobs` - Job tracking
- `scheduled_posts` - Publishing queue
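An illustrative schema for two of these tables (column names are assumptions, not the project's actual migrations; the app itself uses aiosqlite, whose API mirrors the synchronous `sqlite3` shown here but with `await db.execute(...)`):

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS images (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    filename   TEXT NOT NULL,
    prompt     TEXT,
    model      TEXT,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE IF NOT EXISTS generation_jobs (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    status     TEXT NOT NULL DEFAULT 'pending',
    backend    TEXT,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
"""

def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open the catalog database and ensure the tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn
```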
## UI Structure
**Generate Page:**
- Mode chips: Text to Image | Image to Image | Image to Video
- Backend chips: Local GPU | RunPod GPU | Cloud API
- Model dropdowns (conditional on mode/backend)
- Character/Template selectors (2-column grid)
- Prompt textareas
- Output settings (aspect ratio, seed)
**Controls Panel:** 340px width, compact styling
**Drop Zones:** For reference images (character + pose)
## Common Issues
### "Product not found" from WaveSpeed
The model ID doesn't exist on WaveSpeed. Check `MODEL_MAP`, `EDIT_MODEL_MAP`, and `VIDEO_MODEL_MAP` in wavespeed_provider.py against https://wavespeed.ai/models
### "No image URL in output"
WaveSpeed returned an async job instead of a direct result. If `outputs` is empty but `urls.get` is present, the job is still running; poll that URL until `outputs` is populated.
### HuggingFace Space startup hang
Check requirements.txt for missing packages. Common: `python-dotenv`, `runpod`, `wavespeed` (optional).
### Import errors on HF Spaces
Make optional imports with try/except:
```python
try:
    from wavespeed import Client
    SDK_AVAILABLE = True
except ImportError:
    SDK_AVAILABLE = False
```
## Development Commands
```bash
# Run locally
cd content_engine
python -m uvicorn content_engine.main:app --port 8000 --reload
# Push to HuggingFace
cd content-engine
git add . && git commit -m "message" && git push origin main
# Sync local ↔ HF
cp content_engine/src/content_engine/file.py content-engine/src/content_engine/file.py
```
## Multi-Reference Image Support
For img2img with 2 reference images (character + pose):
1. **UI**: Two drop zones side-by-side
2. **API**: `image` (required) + `image2` (optional) in FormData
3. **Backend**: Both uploaded to temp URLs, sent to WaveSpeed
4. **Models**: SeeDream Sequential, Kling O1 support multi-ref
## Pricing Notes
- **WaveSpeed**: ~$0.003-0.01 per image, $0.01-0.05/s for video
- **RunPod**: ~$0.0002/s for GPU time (training/generation)
- Cloud API cheaper for light use; RunPod better for volume
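A back-of-envelope comparison of the two options; every number below is an assumption picked from the ranges quoted above:

```python
# All figures are illustrative assumptions, not measured costs.
cloud_per_image = 0.005      # mid-range WaveSpeed image price ($)
runpod_per_sec = 0.0002      # RunPod GPU-second price ($)
secs_per_image = 15          # assumed generation time per image on RunPod

runpod_per_image = runpod_per_sec * secs_per_image  # $0.003 per image
cheaper = "runpod" if runpod_per_image < cloud_per_image else "cloud"
# RunPod wins per image at these rates, but cold starts and idle time
# only amortize at volume -- hence "cloud for light use, RunPod for volume".
```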
## Future Improvements
- [ ] RunPod serverless endpoint for LoRA-based generation
- [ ] Auto-captioning for training images
- [ ] Batch video generation
- [ ] Publishing integrations (social media APIs)