

	---
	title: UltraPixel Multi-Stage (Community Fixed)
	emoji: 🎨
	colorFrom: blue
	colorTo: purple
	sdk: gradio
	sdk_version: 5.49.1
	app_file: app.py
	pinned: false
	license: apache-2.0
	---

	# 🎨 UltraPixel Multi-Stage Generator (Community Fixed)

	A properly working UltraPixel-style high-resolution image generator that actually respects your parameter inputs.

	## What's Different From Original UltraPixel Spaces?

	The original public UltraPixel Spaces have a critical flaw: they hardcode CFG and timesteps inside the generation function, so the UI sliders have no effect:

	```python
	# Original broken code:
	extras.sampling_configs['cfg'] = 4 # ← Always uses 4!
	extras.sampling_configs['timesteps'] = 20 # ← Ignores your slider!
	```

	### This Space Fixes That ✅

	- Real CFG Control: Your slider values are actually passed to the model
	- Real Steps Control: Set your own timesteps (10-100) per stage
	- Memory Optimized: Won't OOM on ZeroGPU (max 3072×3072 with tiling)
	- No Login Required: Public access for easy testing
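
A minimal sketch of the fix, assuming the sampling settings live in a plain dictionary like `extras.sampling_configs` in the snippet above. The helper name `apply_sampling_config` is hypothetical and is not taken from this Space's `app.py`; the 10-100 step range comes from the list above.

```python
def apply_sampling_config(sampling_configs: dict, cfg: float, steps: int) -> dict:
    """Copy the UI values into the sampling config instead of hardcoding them.

    `sampling_configs` stands in for `extras.sampling_configs`; this helper
    is an illustrative assumption, not the Space's actual code.
    """
    if not 10 <= steps <= 100:
        # Reject rather than silently clamp, so a UI value is never overridden.
        raise ValueError(f"steps must be between 10 and 100, got {steps}")
    sampling_configs["cfg"] = float(cfg)
    sampling_configs["timesteps"] = int(steps)
    return sampling_configs


# Usage: values as they might arrive from the CFG and Steps sliders.
config = apply_sampling_config({}, cfg=7.5, steps=30)
print(config)
```

Raising on an out-of-range value, rather than clamping it, keeps the rule that whatever the model receives is exactly what the user entered.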

	## Features

	- 🎯 3-Stage Pipeline: Stable Cascade architecture (Stage C → B → A)
	- 🔧 Independent Controls: Separate CFG/steps for each stage
	- 💾 Memory Safe: Aggressive cleanup between stages, forced tiling
	- ⏱️ 120s Per Stage: Each stage gets a fresh GPU allocation
	- 🔓 Public Access: No authentication needed

	## How to Use

	### Standard Workflow (3-4 minutes total)

	1. Stage C - Generate Initial Latent (~30-60s)
	   - Enter your prompt
	   - Set CFG (recommended: 7.5) and Steps (recommended: 30)
	   - Click "Generate Stage C"
	   - Wait for completion

	2. Wait for GPU availability (if needed during high traffic)

	3. Stage B - Upscale Latent (~30-50s)
	   - Adjust CFG (recommended: 5.0) and Steps (recommended: 15)
	   - Click "Generate Stage B"
	   - Uses the latent from Stage C automatically

	4. Wait again if needed

	5. Stage A - Final Decode (~60-90s)
	   - Keep "Use Tiling" checked (prevents OOM)
	   - Click "Generate Final Image"
	   - Download your high-res result!

	### Optimal Settings 💡

	- Stage C: CFG 7-8, Steps 30-40
	- Stage B: CFG 4-6, Steps 15-20
	- Stage A: Always use tiling
	- Resolution Limits: Max 3072×3072 final output for stability (1536×1536 base resolution for Stages C/B)
	- For Training Data: Generate at 3072px, then downscale to 1024px for optimal quality
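
The recommended ranges above can be kept as data and checked before a run. This is only a sketch: the `RECOMMENDED` table and the `outside_recommended` helper are hypothetical names, and the check only reports values outside the ranges without changing them.

```python
# Recommended ranges copied from the list above; the structure is illustrative.
RECOMMENDED = {
    "C": {"cfg": (7.0, 8.0), "steps": (30, 40)},
    "B": {"cfg": (4.0, 6.0), "steps": (15, 20)},
}


def outside_recommended(stage: str, cfg: float, steps: int) -> list[str]:
    """Return the names of the settings that fall outside the recommended range."""
    values = {"cfg": cfg, "steps": steps}
    return [
        name
        for name, (low, high) in RECOMMENDED[stage].items()
        if not low <= values[name] <= high
    ]


# Usage: the Stage C defaults are in range; CFG 7.5 is high for Stage B.
print(outside_recommended("C", cfg=7.5, steps=30))
print(outside_recommended("B", cfg=7.5, steps=15))
```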

	## Technical Details

	### Memory Management

	Each stage runs in its own `@spaces.GPU(duration=120)` call:
	- Models loaded only when needed
	- Aggressive `torch.cuda.empty_cache()` after each stage
	- Latents stored in memory (automatically cleaned after 1 hour)
	- VAE tiling enabled for Stage A decode
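
One way the in-memory latent storage and per-stage cleanup described above could look, as a sketch under stated assumptions. `LatentStore` and `free_gpu_memory` are hypothetical names, the 3600-second default mirrors the 1-hour cleanup, and the real `app.py` may be organised differently.

```python
import gc
import time
import uuid


class LatentStore:
    """In-memory store whose entries expire after `ttl_seconds` (1 hour by default)."""

    def __init__(self, ttl_seconds: float = 3600.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without waiting
        self._items = {}  # key -> (stored_at, latent)

    def put(self, latent) -> str:
        """Store a latent and return the key the next stage uses to fetch it."""
        self.purge_expired()
        key = uuid.uuid4().hex
        self._items[key] = (self._clock(), latent)
        return key

    def get(self, key: str):
        """Return the latent, or None if it is unknown or has expired."""
        self.purge_expired()
        entry = self._items.get(key)
        return None if entry is None else entry[1]

    def purge_expired(self) -> None:
        now = self._clock()
        expired = [k for k, (t, _) in self._items.items() if now - t >= self._ttl]
        for k in expired:
            del self._items[k]


def free_gpu_memory() -> None:
    """Drop unreachable Python objects, then release cached CUDA memory if possible."""
    gc.collect()
    try:
        import torch  # optional: skipped when PyTorch is not installed
    except ImportError:
        return
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
```

In this sketch, Stage C would call `put` with its latent and hand the returned key to Stage B, and each stage would call `free_gpu_memory()` before returning.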

	### Resolution Scaling

	- Stage C Input: 512-1536px (base resolution)
	- Stage B Output: 2× Stage C (1024-3072px)
	- Stage A Output: Full decode to target resolution
	- Memory Usage: ~20-30GB peak per stage (safe for ZeroGPU)
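
The scaling rule above reduces to simple arithmetic. The sketch below assumes square images and uses a hypothetical helper name, `stage_resolutions`; Stage A is left out because the text only says it decodes to the target resolution.

```python
def stage_resolutions(base_px: int) -> dict:
    """Return the Stage C resolution and the Stage B output (2x Stage C)."""
    if not 512 <= base_px <= 1536:
        raise ValueError(f"Stage C resolution must be 512-1536px, got {base_px}")
    return {"stage_c": base_px, "stage_b": base_px * 2}


# Usage: the smallest, a middle, and the largest supported base resolution.
for base in (512, 1024, 1536):
    print(stage_resolutions(base))
```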

	## Why This Matters

	Many public AI spaces claim to offer "full control" but secretly override your parameters. This leads to:
	- ❌ Inconsistent results despite changing settings
	- ❌ Users wasting time tweaking sliders that do nothing
	- ❌ Frustration when trying to reproduce outputs

	This Space passes your inputs to the model unchanged: the values you set are the values the model uses.

	## Deployment Notes

	Built specifically for:
	- ZeroGPU compatibility (120s duration per stage)
	- Public access without login
	- High-resolution output (up to 3072×3072 stable)
	- Proper parameter control

	## Credits

	- Stable Cascade: Stability AI
	- Original UltraPixel Concept: Various community implementations
	- This Implementation: Community-fixed version with proper parameter control

	## License

	Apache 2.0