	<table style="border-collapse: collapse; border: 2px solid #ddd; width: 100%;">
	<tr>
	<td style="border: 1px solid #ddd; padding: 3px; text-align: center; vertical-align: middle;"><a href="https://discord.com/servers/software-engineering-courses-secourses-772774097734074388"><img src="https://img.shields.io/discord/772774097734074388?label=Discord&logo=discord" alt="Discord"></a></td>
	<td style="border: 1px solid #ddd; padding: 3px; text-align: center; vertical-align: middle;"><a href="https://hits.seeyoufarm.com"><img src="https://hits.seeyoufarm.com/api/count/incr/badge.svg?url=https%3A%2F%2Fhuggingface.co%2FMonsterMMORPG%2F3D-Cartoon-Style-FLUX&count_bg=%2379C83D&title_bg=%239E0F0F&icon=apachespark.svg&icon_color=%23E7E7E7&title=views&edge_flat=false" alt="Hits"></a></td>
	<td style="border: 1px solid #ddd; padding: 3px; text-align: center; vertical-align: middle;"><a href="https://www.patreon.com/SECourses"><img src="https://img.shields.io/badge/Patreon-Support%20Me-F2EB0E?style=for-the-badge&logo=patreon" alt="Patreon"></a></td>
	<td style="border: 1px solid #ddd; padding: 3px; text-align: center; vertical-align: middle;"><a href="https://www.buymeacoffee.com/DrFurkan"><img src="https://img.shields.io/badge/Buy%20Me%20a%20Coffee-ffdd00?style=for-the-badge&logo=buy-me-a-coffee&logoColor=black" alt="BuyMeACoffee"></a></td>
	<td style="border: 1px solid #ddd; padding: 3px; text-align: center; vertical-align: middle;"><a href="https://medium.com/@furkangozukara"><img src="https://img.shields.io/badge/Medium-Follow%20Me-800080?style=for-the-badge&logo=medium&logoColor=white" alt="Furkan Gözükara Medium"></a></td>
	<td style="border: 1px solid #ddd; padding: 3px; text-align: center; vertical-align: middle;"><a href="https://civitai.com/user/SECourses/articles"><img src="https://img.shields.io/static/v1?style=for-the-badge&message=Articles&color=4574E0&logo=Codio&logoColor=FFFFFF&label=CivitAI" alt="Codio"></a></td>
	</tr>
	<tr>
	<td style="border: 1px solid #ddd; padding: 3px; text-align: center; vertical-align: middle;"><a href="https://www.deviantart.com/monstermmorpg"><img src="https://img.shields.io/badge/DeviantArt-Follow%20Me-990000?style=for-the-badge&logo=deviantart&logoColor=white" alt="Furkan Gözükara DeviantArt"></a></td>
	<td style="border: 1px solid #ddd; padding: 3px; text-align: center; vertical-align: middle;"><a href="https://www.youtube.com/SECourses"><img src="https://img.shields.io/badge/YouTube-SECourses-C50C0C?style=for-the-badge&logo=youtube" alt="YouTube Channel"></a></td>
	<td style="border: 1px solid #ddd; padding: 3px; text-align: center; vertical-align: middle;"><a href="https://www.linkedin.com/in/furkangozukara/"><img src="https://img.shields.io/badge/LinkedIn-Follow%20Me-0077B5?style=for-the-badge&logo=linkedin&logoColor=white" alt="Furkan Gözükara LinkedIn"></a></td>
	<td style="border: 1px solid #ddd; padding: 3px; text-align: center; vertical-align: middle;"><a href="https://www.udemy.com/course/stable-diffusion-dreambooth-lora-zero-to-hero/?referralCode=E327407C9BDF0CEA8156"><img src="https://img.shields.io/static/v1?style=for-the-badge&message=Stable%20Diffusion%20Course&color=A435F0&logo=Udemy&logoColor=FFFFFF&label=Udemy" alt="Udemy"></a></td>
	<td style="border: 1px solid #ddd; padding: 3px; text-align: center; vertical-align: middle;"><a href="https://twitter.com/GozukaraFurkan"><img src="https://img.shields.io/badge/Twitter-Follow%20Me-1DA1F2?style=for-the-badge&logo=twitter&logoColor=white" alt="Twitter Follow Furkan Gözükara"></a></td>
	</tr>
	</table>

	# Full Training Tutorial and Guide and Research For a FLUX Style

	This repository documents the training of a public style LoRA (4 separate trainings, each on 4x A6000).

	The experiment compares training with captions against training without captions, to see which yields better results for style training on FLUX.

	CivitAI Link : https://civitai.com/models/731347

	The captions were generated with my multi-GPU batch Joycaption app.

	# I used my multi-GPU Joycaption APP (used 8x A6000 for ultra fast captioning)
	# https://www.patreon.com/posts/110613301

	<img src="https://cdn-uploads.huggingface.co/production/uploads/6345bd89fe134dfd7a0dba40/LTfUYHXCpcwzt3_us0R26.png" alt="Joycaption examples" style="max-height: 500px; width: auto;">

	# I used my Gradio batch caption editor to edit some words and add the activation token ohwx 3d render
	# https://www.patreon.com/posts/108992085

	<img src="https://cdn-uploads.huggingface.co/production/uploads/6345bd89fe134dfd7a0dba40/BleDJpEMrCMXXRTCPKJqb.png" alt="Gradio batch caption editor" style="max-height: 500px; width: auto;">

	The no-caption dataset uses only "ohwx 3d render" as the caption for every image.
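A minimal sketch of how such a token-only caption set could be prepared. It assumes the trainer reads each caption from a `.txt` file that sits next to the image and shares its file name; that layout, the function name, and the list of image extensions are illustrative assumptions, not something this README confirms.

```python
from pathlib import Path

ACTIVATION_TOKEN = "ohwx 3d render"
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp"}  # assumed image types


def write_token_only_captions(dataset_dir: str, token: str = ACTIVATION_TOKEN) -> int:
    """Write `token` as the only caption for every image in `dataset_dir`.

    Returns the number of caption files written.
    """
    written = 0
    # sorted() materializes the listing first, so new .txt files are not revisited.
    for image_path in sorted(Path(dataset_dir).iterdir()):
        if not image_path.is_file() or image_path.suffix.lower() not in IMAGE_EXTENSIONS:
            continue
        # Assumption: the caption file is "<image name>.txt" next to the image.
        image_path.with_suffix(".txt").write_text(token, encoding="utf-8")
        written += 1
    return written
```

Calling `write_token_only_captions("path/to/dataset")` on a copy of the image folder would overwrite any existing `.txt` captions there, so keep the captioned dataset in a separate directory.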

	# I used my newest 4x_GPU_Rank_1_SLOW_Better_Quality.json config on 4x A6000 GPUs and trained for 500 epochs on 114 images
	# https://www.patreon.com/posts/110879657

	<img src="https://cdn-uploads.huggingface.co/production/uploads/6345bd89fe134dfd7a0dba40/jK75d8i1x5hAHSYSsJNBd.png" alt="Training configuration" style="max-height: 500px; width: auto;">

	All checkpoints are saved in Float precision with a LoRA Network Rank of 128, so each checkpoint is above 2 GB.

	## Inconsistent Dataset Training

	This is the first training I ran, using the dataset below.

	[Inconsistent-Training-Dataset-Images-Grid.jpg](https://huggingface.co/MonsterMMORPG/3D-Cartoon-Style-FLUX/resolve/main/Inconsistent-Training-Dataset-Images-Grid.jpg)

	If you look closely at the grid image shared above, you will see that the dataset is not consistent.

	The training dataset and the captions used (for the With Captions training only) can be seen in the directory below.

	[Training-Dataset](https://huggingface.co/MonsterMMORPG/3D-Cartoon-Style-FLUX/tree/main/Training-Dataset)

	It has 114 images in total.

	The total step count for this training was 500 * 114 / 4 (4x GPU - batch size 1) = 14250 steps.

	It took about 37 hours on 4x RTX A6000 GPUs with the slow config. A faster config would take roughly half that time.
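The step arithmetic above can be sketched as a small helper. The function names are hypothetical, and rounding up when the image count is not divisible by the number of GPUs is an assumption; the trainer may count partial batches differently.

```python
import math


def total_steps(epochs: int, images: int, gpus: int, batch_size: int = 1) -> int:
    """Training steps when each step consumes gpus * batch_size images."""
    return math.ceil(epochs * images / (gpus * batch_size))


def seconds_per_step(hours: float, steps: int) -> float:
    """Average wall-clock seconds per step for a finished run."""
    return hours * 3600 / steps


steps = total_steps(epochs=500, images=114, gpus=4)
print(steps)  # 14250
print(round(seconds_per_step(37, steps), 1))  # 9.3
```

The same helper applies to any other dataset size or GPU count, which makes it easy to estimate run time before renting hardware.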

	Two trainings were run with this dataset. The epoch 500 checkpoints are named as follows:

	[SECourses_Style_Inconsistent_DATASET_NO_Captions.safetensors](https://huggingface.co/MonsterMMORPG/3D-Cartoon-Style-FLUX/resolve/main/SECourses_Style_Inconsistent_DATASET_NO_Captions.safetensors)
	[SECourses_Style_Inconsistent_DATASET_With_Captions.safetensors](https://huggingface.co/MonsterMMORPG/3D-Cartoon-Style-FLUX/resolve/main/SECourses_Style_Inconsistent_DATASET_With_Captions.safetensors)

	Their checkpoints are saved in the folders below:

	[Training-Checkpoints-NO-Captions](https://huggingface.co/MonsterMMORPG/3D-Cartoon-Style-FLUX/tree/main/Training-Checkpoints-NO-Captions)
	[Training-Checkpoints-With-Captions](https://huggingface.co/MonsterMMORPG/3D-Cartoon-Style-FLUX/tree/main/Training-Checkpoints-With-Captions)

	The grid results are shared below:

	[Inconsistent-Training-Dataset-Results-Grid-26100x23700px.jpg](https://huggingface.co/MonsterMMORPG/3D-Cartoon-Style-FLUX/resolve/main/Inconsistent-Training-Dataset-Results-Grid-26100x23700px.jpg)

	If you look closely at the image above, you will see that the results are inconsistent.

	## Consistent Dataset Training

	After I noticed that the initial training dataset was inconsistent, I pruned the dataset and made it much more consistent.

	[Fixed-Consistent-Training-Dataset-Images-Grid.jpg](https://huggingface.co/MonsterMMORPG/3D-Cartoon-Style-FLUX/resolve/main/Fixed-Consistent-Training-Dataset-Images-Grid.jpg)

	If you look closely at the grid image shared above, you will see that it is far more consistent, though still not perfect.

	It now has 66 images in total.

	The training dataset and the captions used for this training (for the With Captions training only) can be seen in the directory below.

	[Fixed-Consistent-Training-Dataset](https://huggingface.co/MonsterMMORPG/3D-Cartoon-Style-FLUX/tree/main/Fixed-Consistent-Training-Dataset)

	The total step count for this training was 500 * 66 / 4 (4x GPU - batch size 1) = 8250 steps.

	It took about 24 hours on 4x RTX A6000 GPUs with the slow config. A faster config would take roughly half that time.

	Two trainings were run with this dataset. The epoch 500 checkpoints are named as follows:

	[SECourses_3D_Render_Style_Fixed_Dataset_NO_Captions.safetensors](https://huggingface.co/MonsterMMORPG/3D-Cartoon-Style-FLUX/resolve/main/SECourses_3D_Render_Style_Fixed_Dataset_NO_Captions.safetensors)
	[SECourses_3D_Render_Style_Fixed_Dataset_With_Captions.safetensors](https://huggingface.co/MonsterMMORPG/3D-Cartoon-Style-FLUX/resolve/main/SECourses_3D_Render_Style_Fixed_Dataset_With_Captions.safetensors)

	Their checkpoints are saved in the folders below:

	[Training-Checkpoints-Fixed-DATASET-NO-Captions](https://huggingface.co/MonsterMMORPG/3D-Cartoon-Style-FLUX/tree/main/Training-Checkpoints-Fixed-DATASET-NO-Captions)
	[Training-Checkpoints-Fixed-DATASET-With-Captions](https://huggingface.co/MonsterMMORPG/3D-Cartoon-Style-FLUX/tree/main/Training-Checkpoints-Fixed-DATASET-With-Captions)

	The grid results are shared below. This grid also includes the results from the inconsistent dataset.

	[Fixed-Consistent-Training-Dataset-Results-Grid-50700x15500px.jpg](https://huggingface.co/MonsterMMORPG/3D-Cartoon-Style-FLUX/resolve/main/Fixed-Consistent-Training-Dataset-Results-Grid-50700x15500px.jpg)

	If you look closely at the image above, you will see that the results are now far more consistent.

	# Best Checkpoint And Conclusion

	When the inconsistent dataset was used, training with captions yielded much better results.

	However, when training was done with the consistent dataset, no captions yielded better and more consistent results at early epochs.

	Thus I concluded that epoch 75 of the no-captions training on the consistent dataset is the best checkpoint.

	Comparison images for the fixed dataset are below:

	[Fixed-Consistent-Training-Dataset-No-Captions-Only-Grid.jpg](https://huggingface.co/MonsterMMORPG/3D-Cartoon-Style-FLUX/resolve/main/Fixed-Consistent-Training-Dataset-No-Captions-Only-Grid.jpg)

	[Fixed-Consistent-Training-Dataset-With-Captions-Only-Grid.jpg](https://huggingface.co/MonsterMMORPG/3D-Cartoon-Style-FLUX/resolve/main/Fixed-Consistent-Training-Dataset-With-Captions-Only-Grid.jpg)

	Best checkpoint download link : [Training-Checkpoints-Fixed-DATASET-NO-Captions/SECourses_3D_Render_Style_Fixed_Dataset_NO_Captions-000075.safetensors](https://huggingface.co/MonsterMMORPG/3D-Cartoon-Style-FLUX/resolve/main/Training-Checkpoints-Fixed-DATASET-NO-Captions/SECourses_3D_Render_Style_Fixed_Dataset_NO_Captions-000075.safetensors)

	Epoch 75 corresponds to 75 * 66 / 4 = 1237.5, or about 1238 steps.
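For comparing other epochs against the best checkpoint, the download link above can be rebuilt from its parts. This is a sketch under one assumption: that every saved epoch uses the same 6-digit, zero-padded suffix as the epoch 75 file. The README does not state which epochs were saved, so a URL for an unsaved epoch will not resolve, and the final epoch 500 files listed earlier carry no suffix at all. The function name is hypothetical.

```python
REPO_URL = "https://huggingface.co/MonsterMMORPG/3D-Cartoon-Style-FLUX"
FOLDER = "Training-Checkpoints-Fixed-DATASET-NO-Captions"
BASE_NAME = "SECourses_3D_Render_Style_Fixed_Dataset_NO_Captions"


def checkpoint_url(epoch: int) -> str:
    """Build the download URL for the checkpoint saved at `epoch`.

    Assumption: the file name pattern matches the epoch 75 link in this README.
    """
    filename = f"{BASE_NAME}-{epoch:06d}.safetensors"
    return f"{REPO_URL}/resolve/main/{FOLDER}/{filename}"


print(checkpoint_url(75))
```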

	# Tutorials To Train Your Style

	1 : https://youtu.be/bupRePUOA18

	### [FLUX: The First Ever Open Source txt2img Model Truly Beats Midjourney & Others - FLUX is Awaited SD3](https://youtu.be/bupRePUOA18)

	[![image](https://cdn-uploads.huggingface.co/production/uploads/6345bd89fe134dfd7a0dba40/dguyYoaghc8IVdBrKMDkl.png)](https://youtu.be/bupRePUOA18)

	2 : https://youtu.be/nySGu12Y05k

	### [FLUX LoRA Training Simplified: From Zero to Hero with Kohya SS GUI (8GB GPU, Windows) Tutorial Guide](https://youtu.be/nySGu12Y05k)

	[![image](https://cdn-uploads.huggingface.co/production/uploads/6345bd89fe134dfd7a0dba40/5oeVl6mmaRyYZkxuXSShm.png)](https://youtu.be/nySGu12Y05k)

	3 : https://youtu.be/-uhL2nW7Ddw

	### [Blazing Fast & Ultra Cheap FLUX LoRA Training on Massed Compute & RunPod Tutorial - No GPU Required!](https://youtu.be/-uhL2nW7Ddw)

	[![image](https://cdn-uploads.huggingface.co/production/uploads/6345bd89fe134dfd7a0dba40/hPBegzqT2A52hrveI7buf.png)](https://youtu.be/-uhL2nW7Ddw)

	The dataset can't be used commercially

	<img src="https://cdn-uploads.huggingface.co/production/uploads/6345bd89fe134dfd7a0dba40/7ZFz_ZW53ipp8LHYuPPSg.png" alt="Training progress" style="max-height: 500px; width: auto;">

	# Grid Testing Prompts - Example Images on CivitAI Taken From Grid - No Cherry Pick

	```

	a ohwx 3d rendering of a car

	a car rendered in ohwx 3d style

	a ohwx style car image

	a ohwx render of a car

	a ohwx car

	a ohwx 3d rendering of a chest, depicted in a cartoon style. The background is a plain white, making the chest and its contents stand out clearly. The overall style is playful and whimsical, with clean lines and bright colors, suggesting a fantasy or adventure theme. The illustration is highly detailed, with a focus on textures and shading to give the chest a realistic, three-dimensional appearance. The metal bands and rivets add a sense of realism and durability to the chest. The image is vibrant and eye-catching, inviting the viewer to imagine the treasure within. The illustration is likely used in a digital context, such as a game or a children's book. The colors are bright and bold, with a focus on oranges, browns, and golds to create a sense of warmth and excitement. The overall mood is one of excitement and discovery.

	a ohwx 3d rendering of an airplane, depicted in a cartoon style. The background is a plain white. The overall style is playful and whimsical, with clean lines and bright colors, suggesting a fantasy or adventure theme. The illustration is highly detailed, with a focus on textures and shading to give a realistic, three-dimensional appearance. The image is vibrant and eye-catching. The illustration is likely used in a digital context, such as a game or a children's book. The colors are bright and bold to create a sense of warmth and excitement.

	a ohwx 3d rendering of a battleship, depicted in a cartoon style. The background is a plain white. The overall style is playful and whimsical, with clean lines and bright colors, suggesting a fantasy or adventure theme. The illustration is highly detailed, with a focus on textures and shading to give a realistic, three-dimensional appearance. The image is vibrant and eye-catching. The illustration is likely used in a digital context, such as a game or a children's book. The colors are bright and bold to create a sense of warmth and excitement.

	a ohwx 3d rendering of a robot, depicted in a cartoon style. The background is a plain white. The overall style is playful and whimsical, with clean lines and bright colors, suggesting a fantasy or adventure theme. The illustration is highly detailed, with a focus on textures and shading to give a realistic, three-dimensional appearance. The image is vibrant and eye-catching. The illustration is likely used in a digital context, such as a game or a children's book. The colors are bright and bold to create a sense of warmth and excitement.

	a ohwx 3d rendering of a dog, depicted in a cartoon style. The background is a plain white. The overall style is playful and whimsical, with clean lines and bright colors, suggesting a fantasy or adventure theme. The illustration is highly detailed, with a focus on textures and shading to give a realistic, three-dimensional appearance. The image is vibrant and eye-catching. The illustration is likely used in a digital context, such as a game or a children's book. The colors are bright and bold to create a sense of warmth and excitement.

	a ohwx 3d rendering of a cat, depicted in a cartoon style. The background is a plain white. The overall style is playful and whimsical, with clean lines and bright colors, suggesting a fantasy or adventure theme. The illustration is highly detailed, with a focus on textures and shading to give a realistic, three-dimensional appearance. The image is vibrant and eye-catching. The illustration is likely used in a digital context, such as a game or a children's book. The colors are bright and bold to create a sense of warmth and excitement.

	a ohwx 3d rendering of an axe, depicted in a cartoon style. The background is a plain white. The overall style is playful and whimsical, with clean lines and bright colors, suggesting a fantasy or adventure theme. The illustration is highly detailed, with a focus on textures and shading to give a realistic, three-dimensional appearance. The image is vibrant and eye-catching. The illustration is likely used in a digital context, such as a game or a children's book. The colors are bright and bold to create a sense of warmth and excitement.

	a ohwx 3d rendering of a house, depicted in a cartoon style. The background is a plain white. The overall style is playful and whimsical, with clean lines and bright colors, suggesting a fantasy or adventure theme. The illustration is highly detailed, with a focus on textures and shading to give a realistic, three-dimensional appearance. The image is vibrant and eye-catching. The illustration is likely used in a digital context, such as a game or a children's book. The colors are bright and bold to create a sense of warmth and excitement.

	a ohwx 3d rendering of a dragon, depicted in a cartoon style. The background is a plain white. The overall style is playful and whimsical, with clean lines and bright colors, suggesting a fantasy or adventure theme. The illustration is highly detailed, with a focus on textures and shading to give a realistic, three-dimensional appearance. The image is vibrant and eye-catching. The illustration is likely used in a digital context, such as a game or a children's book. The colors are bright and bold to create a sense of warmth and excitement.

	a ohwx 3d rendering of a flower, depicted in a cartoon style. The background is a plain white. The overall style is playful and whimsical, with clean lines and bright colors, suggesting a fantasy or adventure theme. The illustration is highly detailed, with a focus on textures and shading to give a realistic, three-dimensional appearance. The image is vibrant and eye-catching. The illustration is likely used in a digital context, such as a game or a children's book. The colors are bright and bold to create a sense of warmth and excitement.

	a ohwx 3d rendering of a rose, depicted in a cartoon style. The background is a plain white. The overall style is playful and whimsical, with clean lines and bright colors, suggesting a fantasy or adventure theme. The illustration is highly detailed, with a focus on textures and shading to give a realistic, three-dimensional appearance. The image is vibrant and eye-catching. The illustration is likely used in a digital context, such as a game or a children's book. The colors are bright and bold to create a sense of warmth and excitement.

	a ohwx 3d rendering of a tank, depicted in a cartoon style. The background is a plain white. The overall style is playful and whimsical, with clean lines and bright colors, suggesting a fantasy or adventure theme. The illustration is highly detailed, with a focus on textures and shading to give a realistic, three-dimensional appearance. The image is vibrant and eye-catching. The illustration is likely used in a digital context, such as a game or a children's book. The colors are bright and bold to create a sense of warmth and excitement.

	a ohwx 3d rendering of a computer, depicted in a cartoon style. The background is a plain white. The overall style is playful and whimsical, with clean lines and bright colors, suggesting a fantasy or adventure theme. The illustration is highly detailed, with a focus on textures and shading to give a realistic, three-dimensional appearance. The image is vibrant and eye-catching. The illustration is likely used in a digital context, such as a game or a children's book. The colors are bright and bold to create a sense of warmth and excitement.

	a ohwx 3d rendering of a graphics processing unit (gpu), depicted in a cartoon style. The background is a plain white. The overall style is playful and whimsical, with clean lines and bright colors, suggesting a fantasy or adventure theme. The illustration is highly detailed, with a focus on textures and shading to give a realistic, three-dimensional appearance. The image is vibrant and eye-catching. The illustration is likely used in a digital context, such as a game or a children's book. The colors are bright and bold to create a sense of warmth and excitement.

	a ohwx 3d rendering of a fork, depicted in a cartoon style. The background is a plain white. The overall style is playful and whimsical, with clean lines and bright colors, suggesting a fantasy or adventure theme. The illustration is highly detailed, with a focus on textures and shading to give a realistic, three-dimensional appearance. The image is vibrant and eye-catching. The illustration is likely used in a digital context, such as a game or a children's book. The colors are bright and bold to create a sense of warmth and excitement.

	a ohwx 3d rendering of a lock, depicted in a cartoon style. The background is a plain white. The overall style is playful and whimsical, with clean lines and bright colors, suggesting a fantasy or adventure theme. The illustration is highly detailed, with a focus on textures and shading to give a realistic, three-dimensional appearance. The image is vibrant and eye-catching. The illustration is likely used in a digital context, such as a game or a children's book. The colors are bright and bold to create a sense of warmth and excitement.

	a ohwx 3d rendering of a umbrella, depicted in a cartoon style. The background is a plain white. The overall style is playful and whimsical, with clean lines and bright colors, suggesting a fantasy or adventure theme. The illustration is highly detailed, with a focus on textures and shading to give a realistic, three-dimensional appearance. The image is vibrant and eye-catching. The illustration is likely used in a digital context, such as a game or a children's book. The colors are bright and bold to create a sense of warmth and excitement.

	```
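The long prompts above, from the airplane onward, share one template with only the subject swapped. A minimal sketch of generating them programmatically follows; the template text is copied from the prompts above, while the function name and the shortened subject list are illustrative. The chest prompt is slightly different and is not covered by this template.

```python
TEMPLATE = (
    "a ohwx 3d rendering of {subject}, depicted in a cartoon style. "
    "The background is a plain white. "
    "The overall style is playful and whimsical, with clean lines and bright colors, "
    "suggesting a fantasy or adventure theme. "
    "The illustration is highly detailed, with a focus on textures and shading "
    "to give a realistic, three-dimensional appearance. "
    "The image is vibrant and eye-catching. "
    "The illustration is likely used in a digital context, "
    "such as a game or a children's book. "
    "The colors are bright and bold to create a sense of warmth and excitement."
)

# Subjects include their article so the template needs no grammar logic.
SUBJECTS = ["an airplane", "a battleship", "a robot", "a dog", "a cat"]


def build_prompts(subjects: list[str]) -> list[str]:
    """Return one grid-test prompt per subject."""
    return [TEMPLATE.format(subject=subject) for subject in subjects]


for prompt in build_prompts(SUBJECTS):
    print(prompt)
    print()
```

Writing the output to a text file, one prompt per line, gives a list that can be pasted into a grid or batch generation tool.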