---
license: other
library_name: comfyui
pipeline_tag: image-to-image
tags:
- stable-diffusion
- stable-diffusion-diffusers
- image-to-image
- lora
- comfyui-workflow
- education
- portfolio
- art
- onetrainer
base_model: stabilityai/stable-diffusion-xl-base-1.0
---
# 🎨 CS x Design Convergence Project: Generative AI Pipeline & Workflow Archive
> **"Bridging Technical Logic with Aesthetic Sensibility"**
>
> This repository serves as a **Portfolio Archive** documenting the construction of generative AI image pipelines and the optimization of their workflows.
> As the product of an interdisciplinary curriculum merging **Computer Science and Design**, this project demonstrates the end-to-end process, from data collection and model fine-tuning to the design of advanced inference workflows.
---
## 📋 1. Project Overview
The core objective of this project is to demonstrate the ability to **accurately train specific artistic styles** and integrate them into **highly controllable workflows**, going beyond simple prompt engineering. It aims to show both technical proficiency (model architecture, understanding of latent space) and artistic expression (style transfer).
* **Key Activities:** Custom LoRA Training, Advanced ComfyUI Workflow Design, Automated Pipeline Scripting.
* **Tools Used:** ComfyUI, OneTrainer, Stable Diffusion, Python, Hugging Face.
---
## 🧠 2. Model Training Methodology: Kirochy Style LoRA
To replicate the unique style of the illustrator **Kirochy**, I trained a LoRA (Low-Rank Adaptation) on a curated, carefully tagged set of reference illustrations.
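
For readers unfamiliar with LoRA, the core idea can be sketched in a few lines: the pretrained weight `W` stays frozen, and only a low-rank product `B @ A` is trained and added back with a scale of `alpha / rank`. The example below is a minimal pure-Python illustration. The matrix sizes, `rank`, and `alpha` are illustrative assumptions, not the settings used to train this model.

```python
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_effective_weight(w, lora_a, lora_b, alpha):
    """Return W + (alpha / rank) * (B @ A), the weight used at inference."""
    rank = len(lora_a)              # A has shape (rank, in_features)
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)  # (out, rank) @ (rank, in) -> (out, in)
    return [[wv + scale * dv for wv, dv in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

# Hypothetical sizes, chosen only to keep the example small.
out_features, in_features, rank, alpha = 8, 8, 2, 2.0
rng = random.Random(0)
w = [[rng.gauss(0, 1) for _ in range(in_features)] for _ in range(out_features)]
lora_a = [[rng.gauss(0, 0.01) for _ in range(in_features)] for _ in range(rank)]
# B starts at zero, so the adapter changes nothing before training begins.
lora_b = [[0.0] * rank for _ in range(out_features)]

w_eff = lora_effective_weight(w, lora_a, lora_b, alpha)
full_params = out_features * in_features
lora_params = rank * (in_features + out_features)
print(w_eff == w, full_params, lora_params)
```

The parameter counts printed at the end show why the technique suits style fine-tuning: only `A` and `B` are stored and shared, while the base model stays untouched.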
### 2.1 Data Acquisition & Preprocessing
* **Data Source:** Aggregated reference illustrations from the artist's official portfolios ([Instagram @kirochy_00](https://www.instagram.com/kirochy_00/), X).
* **Preprocessing:** Used **OneTrainer**'s bucketing to handle images of varying resolutions and aspect ratios. Tagged each image in detail to capture specific stylistic features (line art weight, color palettes, shading techniques).
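
Aspect-ratio bucketing can be sketched as follows. This illustrates the general technique, not OneTrainer's internal implementation. The bucket list and the `assign_bucket` and `group_by_bucket` helpers are hypothetical.

```python
import math

# Hypothetical bucket resolutions (width, height), each close to 1024 * 1024 pixels.
BUCKETS = [(1024, 1024), (1152, 896), (896, 1152), (1216, 832), (832, 1216)]

def assign_bucket(width, height, buckets=BUCKETS):
    """Pick the bucket whose aspect ratio is closest to the image's, compared in log space."""
    target = math.log(width / height)
    return min(buckets, key=lambda b: abs(math.log(b[0] / b[1]) - target))

def group_by_bucket(sizes):
    """Group image sizes by assigned bucket so each training batch shares one resolution."""
    groups = {}
    for size in sizes:
        groups.setdefault(assign_bucket(*size), []).append(size)
    return groups

# Illustrative image sizes, not taken from the actual dataset.
images = [(2000, 2000), (1920, 1080), (1080, 1920), (1600, 1200)]
for bucket, members in group_by_bucket(images).items():
    print(bucket, members)
```

Comparing ratios in log space treats a 2:1 landscape image and a 1:2 portrait image as equally far from square, which avoids a bias toward one orientation.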
### 2.2 Training Framework & Optimization
* **Engine:** Trained using **OneTrainer** for precise parameter control.
* **Optimization:** Iteratively adjusted the number of epochs and the learning rate to balance style fidelity against generalization, so that the model retains the artist's signature touch without overfitting.
---
## βš™οΈ 3. Workflow Architecture: P2A (Photo to Anime) Pipeline
The `p2a.ai.json` file in this repository is an **Img2Img Workflow** that converts real-world photos into Kirochy-style illustrations. To address the structural distortion common in style transfer, I built it as a multi-stage processing pipeline.
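
Before loading an unfamiliar workflow into ComfyUI, it can help to see which node types it uses. The sketch below counts them, assuming the file follows one of the two common ComfyUI export layouts (a UI export with a top-level `nodes` list, or an API export keyed by node id). The `count_node_types` helper is hypothetical, and the sample data is an illustrative stand-in, not the contents of `p2a.ai.json`.

```python
import json
from collections import Counter

def count_node_types(workflow):
    """Count node types in a ComfyUI workflow dict (UI export or API export)."""
    if isinstance(workflow.get("nodes"), list):
        # UI export: top-level "nodes" list, each node carries a "type" field.
        types = [node.get("type", "unknown") for node in workflow["nodes"]]
    else:
        # API export: mapping of node id -> {"class_type": ..., "inputs": ...}.
        types = [node.get("class_type", "unknown")
                 for node in workflow.values() if isinstance(node, dict)]
    return Counter(types)

# Illustrative stand-in in API-export layout; not read from the real workflow file.
sample = json.loads("""
{
  "1": {"class_type": "LoadImage", "inputs": {"image": "photo.png"}},
  "2": {"class_type": "ControlNetApply", "inputs": {}},
  "3": {"class_type": "KSampler", "inputs": {}},
  "4": {"class_type": "KSampler", "inputs": {}}
}
""")
print(count_node_types(sample).most_common())
```

To inspect the real file, load it with `json.load(open("p2a.ai.json", encoding="utf-8"))` and pass the result to `count_node_types`. Node types that the local installation does not recognize usually point to custom node packs that must be installed first.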
### 3.1 Technical Logic & Customization
This workflow is custom-built rather than copied from a template: it combines several techniques researched from community workflows and technical documentation.
1. **ControlNet Integration (Structural Integrity):**
   * Used ControlNet conditioning to preserve the pose and depth information of the source image, limiting the structural drift and "hallucinated" details often seen in generative models.
2. **SAM (Segment Anything Model) & SAG (Self-Attention Guidance):**
   * Integrated **SAM** for precise object segmentation and **SAG**, which guides sampling using the model's own self-attention maps. Together they keep the subject clearly separated from the background and sharpen the illustration style.
3. **Automatic Detailer (Face & Hand Refinement):**
   * Implemented a post-processing stage using **Face and Hand Detailers**. The workflow automatically detects and masks these complex regions, then resamples them at a higher resolution to fix artifacts and improve anatomical correctness.
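
The detailer stage in item 3 follows a detect, crop, resample, and paste-back pattern. The sketch below covers only the geometry of that pattern, assuming a detector has already returned a bounding box. The `expand_box` and `upscale_factor` helpers and their default values are hypothetical and are not taken from the workflow's nodes. In a typical detailer, the padded crop is then enlarged by the computed factor, resampled, scaled back down, and blended into the image through the mask.

```python
def expand_box(box, image_size, padding=0.25):
    """Grow a (left, top, right, bottom) box by a fraction of its size, clamped to the image."""
    left, top, right, bottom = box
    img_w, img_h = image_size
    pad_x = int((right - left) * padding)
    pad_y = int((bottom - top) * padding)
    return (max(0, left - pad_x), max(0, top - pad_y),
            min(img_w, right + pad_x), min(img_h, bottom + pad_y))

def upscale_factor(box, target_long_side=1024, max_factor=4.0):
    """Scale so the crop's longer side reaches target_long_side, capped at max_factor."""
    left, top, right, bottom = box
    long_side = max(right - left, bottom - top)
    return min(max_factor, max(1.0, target_long_side / long_side))

# Hypothetical detector output for a face in a 1024x1024 image.
face_box = (400, 100, 560, 300)
crop = expand_box(face_box, image_size=(1024, 1024))
factor = upscale_factor(crop)
print(crop, round(factor, 2))
```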
---
## πŸ–ΌοΈ 4. Results & Portfolio Showcase
The final outputs generated using this model and workflow are archived on Instagram. You can compare the reference inputs with the generated results to verify the technical quality.
* **Instagram Portfolio:** [@eom0am](https://www.instagram.com/eom0am)
---
## ⚠️ 5. Ethical Considerations & License
This project was conducted strictly for **Academic Study and Research purposes**.
### ⛔ Copyright & Usage Warning
* **Intellectual Property:** The copyright and stylistic rights of the LoRA model belong entirely to the original artist, **Kirochy** ([@kirochy_00](https://www.instagram.com/kirochy_00/)).
* **Non-Commercial Use Only:** Utilizing this model file or the workflows for **any commercial purpose (sales, paid commissions, advertising, etc.) is strictly prohibited.**
* **Legal Notice:** Any commercial exploitation may result in legal consequences under copyright laws.
### πŸ“ Scope of Permitted Use
* ⭕ **Allowed:** Personal study, portfolio research, non-commercial fan art.
* ❌ **Prohibited:** Commercial use, impersonation of the original artist, unauthorized redistribution for profit.
---
**Author:** Um Yunsang
**Role:** CS & Design Convergence Researcher / AI Engineer Candidate