	---
	title: ComfyUI-Style IPAdapter Generator
	emoji: 🎨
	colorFrom: blue
	colorTo: purple
	sdk: gradio
	sdk_version: 5.39.0
	app_file: app.py
	pinned: false
	license: mit
	---

	# 🎨 ComfyUI-Style IPAdapter Generator

	A Hugging Face Space that replicates core ComfyUI + IPAdapter functionality using Gradio. Generate images from text prompts and reference images with Stable Diffusion 1.5 or SDXL.

	## ✨ Features

	- Text-to-Image Generation: Create images from detailed text descriptions
	- IPAdapter Integration: Use reference images to guide generation (faces, styles, compositions)
	- Multiple Models: Support for Stable Diffusion 1.5 and SDXL
	- Advanced Controls: Fine-tune generation with guidance scale, steps, and resolution
	- Face Enhancement: Optional CodeFormer/GFPGAN integration for face improvement
	- LoRA Support: Apply custom style models for unique aesthetics
	- Side-by-Side Comparison: View reference and generated images together
	- Memory Optimized: Works on both CPU and GPU with automatic fallbacks

	## 🚀 Quick Start

	### Local Installation

	1. Clone and Setup:
	```bash
	git clone <your-repo-url>
	cd comfyui-ipAdapter-space  # use the directory name that git clone created
	pip install -r requirements.txt
	```

	2. Run the Application:
	```bash
	python app.py
	```

	3. Access the Interface:
	Open your browser to `http://localhost:7860`

	### Hugging Face Space Deployment

	1. Create a new Space on Hugging Face
	2. Upload files: `app.py`, `requirements.txt`, `README.md`
	3. Select hardware: CPU (free) or GPU (paid) based on your needs
	4. Deploy: The space will automatically build and launch

	## 📖 Usage Guide

	### Basic Workflow

	1. Select Model: Choose between Stable Diffusion 1.5 or SDXL
	2. Enter Prompt: Describe the image you want to generate
	3. Upload Reference: Provide a reference image (face, style, or composition guide)
	4. Adjust Settings: Fine-tune generation parameters
	5. Generate: Click the generate button and wait for results
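
The five steps above map onto a small Gradio Blocks layout. The sketch below is not this Space's actual `app.py`: the `generate` stub, the component labels, and the slider steps are assumptions, while the ranges and defaults are taken from the parameter list in this README.

```python
def generate(model, prompt, reference, guidance, ip_scale, steps, seed):
    """Hypothetical stub: a real app would run the diffusion pipeline here."""
    # Echo the reference image so the UI can be exercised without a model.
    return reference


def build_ui():
    """Build the interface. Gradio is imported lazily so this file loads without it."""
    import gradio as gr

    with gr.Blocks(title="IPAdapter Generator") as demo:
        # Step 1: select the base model.
        model = gr.Dropdown(
            ["Stable Diffusion 1.5", "SDXL"],
            value="Stable Diffusion 1.5",
            label="Model",
        )
        # Step 2: enter the prompt.
        prompt = gr.Textbox(label="Text Prompt", lines=3)
        # Step 3: upload the reference image.
        reference = gr.Image(type="pil", label="Reference Image")
        # Step 4: adjust settings.
        guidance = gr.Slider(1, 20, value=7.5, step=0.5, label="Guidance Scale")
        ip_scale = gr.Slider(0, 2, value=1.0, step=0.05, label="IPAdapter Scale")
        steps = gr.Slider(10, 50, value=20, step=1, label="Inference Steps")
        seed = gr.Number(value=0, precision=0, label="Seed (0 = random)")
        # Step 5: generate.
        output = gr.Image(label="Generated Image")
        gr.Button("Generate").click(
            generate,
            inputs=[model, prompt, reference, guidance, ip_scale, steps, seed],
            outputs=output,
        )
    return demo
```

Calling `build_ui().launch()` would start the interface; Gradio serves on port 7860 by default.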

	### Parameters Explained

	#### Core Settings
	- Text Prompt: Detailed description of desired image
	- Reference Image: Guide image for IPAdapter (faces work best)
	- Model: Base diffusion model (SD 1.5 for speed, SDXL for quality)

	#### Generation Controls
	- Guidance Scale (1-20): How closely to follow the prompt (7.5 recommended)
	- IPAdapter Scale (0-2): Strength of reference image influence (1.0 recommended)
	- Resolution: Output image dimensions (512x512 for speed, higher for quality)
	- Inference Steps (10-50): Quality vs speed tradeoff (20 recommended)
	- Seed: Fixes the random number generator for reproducible results (0 selects a random seed)
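
One way to enforce the ranges listed above is to normalize the settings before they reach the pipeline. This is a minimal sketch, not the Space's own code: the `GenerationSettings` class and `clamp` helper are hypothetical names. Rounding the resolution down to a multiple of 8 is an added assumption, based on Stable Diffusion pipelines requiring dimensions divisible by 8.

```python
import random
from dataclasses import dataclass


def clamp(value, low, high):
    """Limit value to the closed interval [low, high]."""
    return max(low, min(high, value))


@dataclass
class GenerationSettings:
    """Hypothetical container for the generation controls, with README defaults."""
    guidance_scale: float = 7.5
    ip_adapter_scale: float = 1.0
    width: int = 512
    height: int = 512
    steps: int = 20
    seed: int = 0

    def normalized(self):
        """Return a copy with every field inside its documented range."""
        # A seed of 0 means "random": draw a concrete non-zero seed so the
        # result can still be reproduced later.
        seed = self.seed if self.seed != 0 else random.randint(1, 2**32 - 1)
        return GenerationSettings(
            guidance_scale=clamp(self.guidance_scale, 1.0, 20.0),
            ip_adapter_scale=clamp(self.ip_adapter_scale, 0.0, 2.0),
            # Assumption: round down to a multiple of 8, with a floor of 64 pixels.
            width=max(64, self.width // 8 * 8),
            height=max(64, self.height // 8 * 8),
            steps=clamp(self.steps, 10, 50),
            seed=seed,
        )
```

Drawing the random seed up front, rather than leaving it unset, lets the app show the seed it used, so a good result can be regenerated.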

	#### Enhancement Options
	- Face Enhancement: Improve facial details in generated images
	- CodeFormer vs GFPGAN: Choose which face enhancement algorithm to apply
	- LoRA Path: Local path to custom style models
	- LoRA Scale: Strength of style model application

	### Best Practices

	#### For Face Generation
	- Use clear, well-lit reference photos
	- Keep IPAdapter scale between 0.8-1.2
	- Enable face enhancement for better results
	- Use descriptive prompts: "professional headshot, studio lighting"

	#### For Style Transfer
	- Use artistic references (paintings, illustrations)
	- Adjust IPAdapter scale based on desired style strength
	- Experiment with different guidance scales
	- Consider using LoRA models for consistent styles

	#### Performance Optimization
	- Use 512x512 resolution for faster generation
	- Reduce inference steps to 15-20 for speed
	- Enable face enhancement only when needed
	- Fall back to CPU mode if GPU memory is limited (generation will be slower)

	## 🛠️ Technical Details

	### Architecture
	- Frontend: Gradio web interface
	- Backend: Hugging Face Diffusers + IPAdapter
	- Models: Stable Diffusion 1.5/XL with IPAdapter weights
	- Enhancement: CodeFormer/GFPGAN for face improvement
	- Styling: LoRA support for custom aesthetics
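
The backend described above could be wired together with Diffusers roughly as follows. This is a sketch under stated assumptions, not the Space's implementation: the README does not name the checkpoints it loads, so the repository IDs and weight file names in `MODELS` are assumptions, and `load_pipeline` and `generate` are hypothetical helper names. Running it requires `torch` and `diffusers` and downloads several gigabytes of weights.

```python
# Assumed checkpoints: the README does not say which repositories the app uses.
IP_ADAPTER_REPO = "h94/IP-Adapter"
MODELS = {
    "sd15": {
        "base": "stable-diffusion-v1-5/stable-diffusion-v1-5",
        "subfolder": "models",
        "weight": "ip-adapter_sd15.bin",
    },
    "sdxl": {
        "base": "stabilityai/stable-diffusion-xl-base-1.0",
        "subfolder": "sdxl_models",
        "weight": "ip-adapter_sdxl.bin",
    },
}


def load_pipeline(model_key, device="cpu"):
    """Load a base model and attach IP-Adapter weights (heavy imports are lazy)."""
    import torch
    from diffusers import AutoPipelineForText2Image

    cfg = MODELS[model_key]
    # Half precision saves VRAM on GPU; CPU inference needs float32.
    dtype = torch.float16 if device == "cuda" else torch.float32
    pipe = AutoPipelineForText2Image.from_pretrained(cfg["base"], torch_dtype=dtype)
    pipe.load_ip_adapter(
        IP_ADAPTER_REPO, subfolder=cfg["subfolder"], weight_name=cfg["weight"]
    )
    return pipe.to(device)


def generate(pipe, prompt, reference, guidance_scale=7.5,
             ip_adapter_scale=1.0, steps=20, seed=0, size=512):
    """Run one generation guided by both the prompt and the reference image."""
    import torch

    pipe.set_ip_adapter_scale(ip_adapter_scale)
    # Seed 0 means "random", so no generator is passed in that case.
    generator = torch.Generator(device="cpu").manual_seed(seed) if seed != 0 else None
    result = pipe(
        prompt=prompt,
        ip_adapter_image=reference,
        guidance_scale=guidance_scale,
        num_inference_steps=steps,
        height=size,
        width=size,
        generator=generator,
    )
    return result.images[0]
```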

	### Memory Management
	- Automatic model loading/unloading
	- GPU memory optimization with xformers
	- CPU fallback for limited hardware
	- Efficient attention mechanisms
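
The CPU fallback and xformers points above can be sketched as a device check plus a best-effort optimization step. The function names are hypothetical, the 6 GB threshold is borrowed from the recommended hardware in this README, and falling back to attention slicing when xformers is unavailable is an assumption about how such a fallback might work.

```python
def detect_vram_gb():
    """Return total VRAM of GPU 0 in GiB, or 0.0 when torch or CUDA is unavailable."""
    try:
        import torch
    except ImportError:
        return 0.0
    if not torch.cuda.is_available():
        return 0.0
    return torch.cuda.get_device_properties(0).total_memory / 1024**3


def choose_device(vram_gb, min_vram_gb=6.0):
    """Pick "cuda" only when the GPU meets the VRAM threshold, else "cpu"."""
    return "cuda" if vram_gb >= min_vram_gb else "cpu"


def optimize(pipe, device):
    """Enable the cheapest available attention optimization on a Diffusers pipeline."""
    if device == "cuda":
        try:
            # Fails when the xformers package is not installed.
            pipe.enable_xformers_memory_efficient_attention()
        except Exception:
            pipe.enable_attention_slicing()
    else:
        # Attention slicing trades some speed for lower peak memory.
        pipe.enable_attention_slicing()
    return pipe
```

A caller would combine them as `device = choose_device(detect_vram_gb())` before loading the pipeline.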

	### Supported Formats
	- Input Images: JPG, PNG, WebP
	- Output: PNG format
	- LoRA Models: .safetensors, .ckpt files
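
A file-extension check is a cheap first filter for the formats listed above. This is an illustrative sketch with hypothetical helper names. Accepting `.jpeg` alongside `.jpg` is an assumption, and an extension check does not prove the file contents are valid.

```python
from pathlib import Path

# Extensions from the list above; ".jpeg" is an assumed alias for JPG.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}
LORA_EXTENSIONS = {".safetensors", ".ckpt"}


def is_supported_image(path):
    """True when the file extension matches a supported input image format."""
    return Path(path).suffix.lower() in IMAGE_EXTENSIONS


def is_supported_lora(path):
    """True when the file extension matches a supported LoRA format."""
    return Path(path).suffix.lower() in LORA_EXTENSIONS
```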

	## 🔧 Configuration

	### Environment Variables
	```bash
	# Optional: restrict the app to the first GPU
	export CUDA_VISIBLE_DEVICES=0

	# Optional: set the Hugging Face cache directory
	export HF_HOME=/path/to/cache
	```
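
From Python, the same two variables can be inspected before any model is loaded. The helper names below are hypothetical, and the fallback path is a simplification of the Hugging Face default (`~/.cache/huggingface`), which can also be influenced by `XDG_CACHE_HOME`.

```python
import os
from pathlib import Path


def cache_dir(env=None):
    """Return the Hugging Face cache root implied by HF_HOME, or the usual default."""
    env = os.environ if env is None else env
    return Path(env.get("HF_HOME", Path.home() / ".cache" / "huggingface"))


def visible_gpus(env=None):
    """Return the GPU indices listed in CUDA_VISIBLE_DEVICES, or None when unset."""
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        return None  # unset: CUDA exposes every GPU
    # An empty string hides all GPUs, which yields an empty list here.
    return [item.strip() for item in raw.split(",") if item.strip()]
```

Passing a plain dictionary as `env` makes both helpers easy to check without touching the real environment.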

	### Hardware Requirements

	#### Minimum (CPU)
	- 8GB RAM
	- 10GB storage
	- Generation time: 2-5 minutes

	#### Recommended (GPU)
	- NVIDIA GPU with 6GB+ VRAM
	- 16GB RAM
	- 20GB storage
	- Generation time: 10-30 seconds

	## 📝 Example Prompts

	### Portrait Generation
	```
	"A professional headshot photo of a person, studio lighting, high quality, detailed facial features"
	```

	### Artistic Styles
	```
	"An oil painting portrait in the style of Renaissance masters, dramatic lighting, classical composition"
	```

	### Fantasy/Sci-Fi
	```
	"A cyberpunk character with neon lighting, futuristic elements, digital art style"
	```

	### Anime/Illustration
	```
	"An anime-style character portrait, vibrant colors, detailed eyes, manga illustration"
	```

	## 🐛 Troubleshooting

	### Common Issues

	Model Loading Errors
	- Check internet connection for model downloads
	- Ensure sufficient disk space (20GB+)
	- Try switching to CPU mode if GPU memory is insufficient

	Generation Failures
	- Verify the reference image is a valid JPG, PNG, or WebP file
	- Check prompt length (keep under 200 characters)
	- Reduce resolution if memory errors occur

	Slow Performance
	- Use smaller resolutions (512x512)
	- Reduce inference steps
	- Disable face enhancement
	- Switch to CPU mode only if the GPU is out of memory; CPU generation is slower (2-5 minutes versus 10-30 seconds on GPU)

	Face Enhancement Issues
	- Ensure the face is clearly visible in the reference image
	- Try different enhancement algorithms
	- Adjust IPAdapter scale for better face preservation
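
Several of the checks in this section can run automatically before generation starts. The sketch below is a hypothetical `preflight` helper, not part of the Space. It reuses the 200-character and 20 GB figures from the lists above and the input formats from Supported Formats; the `.jpeg` alias is an assumption.

```python
import shutil

MAX_PROMPT_CHARS = 200   # prompt limit suggested under Generation Failures
MIN_FREE_GB = 20         # disk space suggested under Model Loading Errors
IMAGE_SUFFIXES = (".jpg", ".jpeg", ".png", ".webp")


def preflight(prompt, reference_path, cache_path=".", free_gb=None):
    """Return a list of human-readable problems; an empty list means ready to run."""
    problems = []
    if not prompt.strip():
        problems.append("prompt is empty")
    elif len(prompt) > MAX_PROMPT_CHARS:
        problems.append(f"prompt has {len(prompt)} characters (limit {MAX_PROMPT_CHARS})")
    if not reference_path.lower().endswith(IMAGE_SUFFIXES):
        problems.append("reference image must be JPG, PNG, or WebP")
    if free_gb is None:
        # Measure free space on the drive that holds the model cache.
        free_gb = shutil.disk_usage(cache_path).free / 1024**3
    if free_gb < MIN_FREE_GB:
        problems.append(f"only {free_gb:.1f} GB free (need {MIN_FREE_GB} GB)")
    return problems
```

Returning a list instead of raising on the first failure lets the interface report every problem in one pass.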

	## 🤝 Contributing

	1. Fork the repository
	2. Create a feature branch
	3. Make your changes
	4. Test thoroughly
	5. Submit a pull request

	## 📄 License

	This project is licensed under the MIT License. See the LICENSE file for details.

	## 🙏 Acknowledgments

	- Hugging Face for the Diffusers library and model hosting
	- IPAdapter team for the reference image integration
	- ComfyUI for inspiration and workflow concepts
	- Gradio team for the excellent web interface framework

	## 📞 Support

	- Issues: Report bugs via GitHub Issues
	- Discussions: Join the community discussions
	- Documentation: Check the Hugging Face Spaces documentation

	---

	Note: This is an educational project replicating ComfyUI functionality. For production use, consider the original ComfyUI or commercial alternatives.