	---
	license: other
	library_name: comfyui
	pipeline_tag: image-to-image
	tags:
	- stable-diffusion
	- stable-diffusion-diffusers
	- image-to-image
	- lora
	- comfyui-workflow
	- education
	- portfolio
	- art
	- onetrainer
	base_model: stabilityai/stable-diffusion-xl-base-1.0
	---

	# 🎨 CS x Design Convergence Project: Generative AI Pipeline & Workflow Archive

	> "Bridging Technical Logic with Aesthetic Sensibility"
	>
	> This repository is a portfolio archive documenting the construction of generative AI image pipelines and the optimization of their workflows.
	> Produced within an interdisciplinary curriculum that merges Computer Science and Design, it covers the end-to-end process: data collection, model fine-tuning, and the design of advanced inference workflows.

	---

	## 📋 1. Project Overview

	The core objective of this project is to go beyond simple prompt engineering: to train a specific artistic style accurately and deploy it in highly controllable workflows. It aims to demonstrate both technical proficiency (model architecture, understanding of latent space) and artistic expression (style transfer).

	* Key Activities: Custom LoRA Training, Advanced ComfyUI Workflow Design, Automated Pipeline Scripting.
	* Tools Used: ComfyUI, OneTrainer, Stable Diffusion XL (base model), Python, Hugging Face.

	---

	## 🧠 2. Model Training Methodology: Kirochy Style LoRA

	To reproduce the distinctive style of the illustrator Kirochy, I trained a LoRA (Low-Rank Adaptation) on a carefully collected and tagged dataset.
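
For reference, the low-rank update that LoRA training learns can be written as below. This is the general formulation of the method, not a record of this project's settings: the rank $r$ and scaling factor $\alpha$ used for this particular model are not stated in this README, so the symbols are generic.

```latex
W' = W_0 + \frac{\alpha}{r}\, B A,
\qquad B \in \mathbb{R}^{d \times r},\quad
A \in \mathbb{R}^{r \times k},\quad
r \ll \min(d, k)
```

Only $A$ and $B$ are trained while the base weight $W_0$ stays frozen, which is why a LoRA file is small compared with the base model it adapts.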

	### 2.1 Data Acquisition & Preprocessing
	* Data Source: Aggregated reference illustrations from the artist's official portfolios ([Instagram @kirochy_00](https://www.instagram.com/kirochy_00/), X).
	* Preprocessing: Used OneTrainer's aspect-ratio bucketing to handle images of varying resolutions and aspect ratios. Tagged each image in detail to capture specific stylistic features (line weight, color palette, shading technique).
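
The idea behind aspect-ratio bucketing can be sketched in a few lines of Python. This is a generic illustration, not OneTrainer's actual implementation: the function names, the 64-pixel step, the maximum ratio of 2.0, and the 1024x1024 target area (chosen here because the base model is SDXL) are all assumptions.

```python
import math


def make_buckets(target_area=1024 * 1024, step=64, max_ratio=2.0):
    """Enumerate (width, height) buckets whose area stays at or below target_area."""
    buckets = set()
    width = step
    max_side = int(math.sqrt(target_area * max_ratio))
    while width <= max_side:
        # Largest height (a multiple of `step`) that keeps the area within budget.
        height = (target_area // width) // step * step
        if height >= step and max(width / height, height / width) <= max_ratio:
            buckets.add((width, height))
            buckets.add((height, width))  # mirrored orientation
        width += step
    return sorted(buckets)


def assign_bucket(width, height, buckets):
    """Pick the bucket whose aspect ratio is closest to the image's (in log space)."""
    ratio = math.log(width / height)
    return min(buckets, key=lambda b: abs(math.log(b[0] / b[1]) - ratio))
```

Each training image is then resized and cropped to its assigned bucket, so a batch only ever contains images of one shape and portrait or landscape illustrations are not squashed into a square.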

	### 2.2 Training Framework & Optimization
	* Engine: Trained using OneTrainer for precise parameter control.
	* Optimization: Iteratively adjusted the number of epochs and the learning rate to balance style fidelity against generalization, so that the model retains the artist's signature touch without overfitting to the training images.

	---

	## ⚙️ 3. Workflow Architecture: P2A (Photo to Anime) Pipeline

	The `p2a.ai.json` file in this repository is an Img2Img workflow that converts real-world photos into Kirochy-style illustrations. To address the structural distortion common in style transfer, I built a multi-stage processing pipeline.
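
A quick way to see how the stages of a workflow file fit together is to count its nodes by type. The sketch below is a minimal, hypothetical helper (`count_node_types` and `summarize` are not part of ComfyUI). It assumes the file follows one of the two JSON layouts ComfyUI commonly exports: the UI layout with a top-level `nodes` list, or the API layout keyed by node id. Which layout `p2a.ai.json` uses is not stated here.

```python
import json
from collections import Counter


def count_node_types(workflow):
    """Count nodes by type in a ComfyUI workflow dict (UI or API export)."""
    if isinstance(workflow.get("nodes"), list):
        # Assumed UI export: {"nodes": [{"id": 1, "type": "...", ...}, ...]}
        types = [node.get("type", "unknown") for node in workflow["nodes"]]
    else:
        # Assumed API export: {"3": {"class_type": "...", "inputs": {...}}, ...}
        types = [
            node.get("class_type", "unknown")
            for node in workflow.values()
            if isinstance(node, dict)
        ]
    return Counter(types)


def summarize(path):
    """Load a workflow JSON file and return its node-type counts."""
    with open(path, encoding="utf-8") as f:
        return count_node_types(json.load(f))
```

Calling `summarize("p2a.ai.json")` from the repository root would then report how many nodes of each type the workflow contains.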

	### 3.1 Technical Logic & Customization
	This workflow is not a copy of an existing one; it is a custom-built architecture that combines techniques drawn from community workflows and technical documentation.

	1. ControlNet Integration (Structural Integrity):
	* Used ControlNet conditioning to keep generation aligned with the pose and depth of the source image, reducing the structural "hallucinations" often seen in generative models.

	2. SAM (Segment Anything Model) & SAG (Self-Attention Guidance):
	* Integrated SAM for precise object segmentation, which gives a clean separation between the subject and the background, and SAG to guide sampling with the model's own self-attention maps, which sharpens detail in the illustration style.

	3. Automatic Detailer (Face & Hand Refinement):
	* Added a post-processing stage using Face and Hand Detailers. The workflow automatically detects and masks these difficult regions, then resamples them at a higher resolution to reduce artifacts and improve anatomical accuracy.
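
The geometry of the detailer stage (pad the detected box, then enlarge the crop before resampling) can be sketched as follows. This is a simplified illustration under stated assumptions, not the code of any ComfyUI detailer node: the function names and the `crop_factor`, `target_size`, and `max_scale` parameters are hypothetical, and detection, masking, and diffusion resampling are left out.

```python
def padded_crop(bbox, image_size, crop_factor=2.0):
    """Expand a detection box around its centre and clamp it to the image bounds."""
    x0, y0, x1, y1 = bbox
    img_w, img_h = image_size
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    half_w = (x1 - x0) * crop_factor / 2
    half_h = (y1 - y0) * crop_factor / 2
    return (
        max(0, int(cx - half_w)),
        max(0, int(cy - half_h)),
        min(img_w, int(cx + half_w)),
        min(img_h, int(cy + half_h)),
    )


def upscale_factor(crop_box, target_size=512, max_scale=4.0):
    """Scale so the crop's longer side reaches target_size, never downscaling."""
    x0, y0, x1, y1 = crop_box
    longer = max(x1 - x0, y1 - y0)
    if longer <= 0:
        raise ValueError("crop box must have positive width and height")
    return max(1.0, min(max_scale, target_size / longer))
```

The padding gives the sampler some surrounding context, so the refined face or hand blends back into the image when the crop is scaled down and pasted over the original region.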

	---

	## 🖼️ 4. Results & Portfolio Showcase

	The final outputs generated with this model and workflow are archived on Instagram. You can compare the reference inputs with the generated results to judge the output quality for yourself.

	* Instagram Portfolio: [@eom0am](https://www.instagram.com/eom0am)

	---

	## ⚠️ 5. Ethical Considerations & License

	This project was conducted strictly for Academic Study and Research purposes.

	### ⛔ Copyright & Usage Warning
	* Intellectual Property: The copyright and stylistic rights of the LoRA model belong entirely to the original artist, Kirochy ([@kirochy_00](https://www.instagram.com/kirochy_00/)).
	* Non-Commercial Use Only: Utilizing this model file or the workflows for any commercial purpose (sales, paid commissions, advertising, etc.) is strictly prohibited.
	* Legal Notice: Commercial exploitation may have legal consequences under applicable copyright law.

	### 📝 Scope of Permitted Use
	* ⭕ Allowed: Personal study, portfolio research, non-commercial fan art.
	* ❌ Prohibited: Commercial use, impersonation of the original artist, unauthorized redistribution for profit.

	---

	Author: Um Yunsang
	Role: CS & Design Convergence Researcher / AI Engineer Candidate