# SD3 Controlnet Softedge

The softedge ControlNet is finetuned from SD3-medium. It was trained on 12M images from open-source and internal e-commerce datasets, and achieves good performance on both general and e-commerce image generation. It supports preprocessors such as pidinet and hed, as well as their safe modes.
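The soft-edge maps themselves come from the `controlnet_aux` preprocessors. A minimal sketch, assuming the commonly used `lllyasviel/Annotators` checkpoint; `safe=True` selects the safe mode of each detector:

```python
# A minimal sketch of producing soft-edge control images with controlnet_aux.
# The `lllyasviel/Annotators` checkpoint repo is an assumption here;
# `safe=True` enables the safe variant of each detector.
from controlnet_aux import HEDdetector, PidiNetDetector
from diffusers.utils import load_image

image = load_image(
    "https://huggingface.co/alimama-creative/SD3-Controlnet-Softedge/resolve/main/images/im1_0.png"
)

pidi = PidiNetDetector.from_pretrained("lllyasviel/Annotators")
hed = HEDdetector.from_pretrained("lllyasviel/Annotators")

pidi_edge = pidi(image, safe=True)  # pidinet soft-edge map, safe mode
hed_edge = hed(image, safe=True)    # hed soft-edge map, safe mode
```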
## Examples

From left to right: pidinet preprocessor, ours with pidinet, hed preprocessor, ours with hed.
| `pidinet` | `controlnet` | `hed` | `controlnet` |
| :--: | :--: | :--: | :--: |
|  |  |  |  |
|  |  |  |  |
|  |  |  |  |
|  |  |  |  |
|  |  |  |  |
## Usage with Diffusers
```python
import torch
from diffusers import StableDiffusion3ControlNetPipeline
from diffusers.models import SD3ControlNetModel
from diffusers.utils import load_image
from controlnet_aux import PidiNetDetector

# Load the softedge ControlNet and the SD3-medium base pipeline in fp16.
controlnet = SD3ControlNetModel.from_pretrained(
    "alimama-creative/SD3-Controlnet-Softedge", torch_dtype=torch.float16
)
pipe = StableDiffusion3ControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers",
    controlnet=controlnet,
    variant="fp16",
    torch_dtype=torch.float16,
)
pipe.text_encoder.to(torch.float16)
pipe.controlnet.to(torch.float16)
pipe.to("cuda")

# Load the source image and extract a soft-edge map with the pidinet preprocessor.
image = load_image(
    "https://huggingface.co/alimama-creative/SD3-Controlnet-Softedge/resolve/main/images/im1_0.png"
)
prompt = "A dog sitting on a park bench."
width = 1024
height = 1024
edge_processor = PidiNetDetector.from_pretrained("lllyasviel/Annotators")
edge_image = edge_processor(image, detect_resolution=width, image_resolution=width)

# Generate an image conditioned on the soft-edge map.
res_image = pipe(
    prompt=prompt,
    negative_prompt="deformed, distorted, disfigured, poorly drawn, bad anatomy, wrong anatomy, extra limb, missing limb, floating limbs, mutated hands and fingers, disconnected limbs, mutation, mutated, ugly, disgusting, blurry, amputation, NSFW",
    height=height,
    width=width,
    control_image=edge_image,
    num_inference_steps=25,
    controlnet_conditioning_scale=0.95,
    guidance_scale=5,
).images[0]
res_image.save("sd3.png")
```
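If the full pipeline does not fit in GPU memory, Diffusers' standard offloading helper can be used instead of `pipe.to("cuda")`. A minimal sketch:

```python
# A minimal sketch: replace `pipe.to("cuda")` with model CPU offload.
# Each sub-model is moved to the GPU only while it runs, reducing peak VRAM
# at the cost of some throughput.
pipe.enable_model_cpu_offload()
```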
## Training Details

The model was trained for 20k steps at 1024x1024 resolution on 12M images from LAION-2B and internal sources with an aesthetic score of 6+. ControlNets with 6, 12, and 23 layers were explored; the 12-layer model achieves a good balance between performance and model size, so we release the 12-layer model.

Mixed precision: FP16<br/>
Learning rate: 1e-4<br/>
Batch size: 256<br/>
Timestep sampling mode: 'logit_normal'<br/>
Loss: Flow Matching<br/>
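For reference, 'logit_normal' timestep sampling draws the flow-matching timestep as the sigmoid of a normal sample, as described in the SD3 paper. A minimal sketch; the mean/std values are illustrative assumptions, not confirmed training values:

```python
import torch

# A minimal sketch of 'logit_normal' timestep sampling for flow matching:
# t = sigmoid(u) with u ~ Normal(mean, std), so t lies in (0, 1) and
# probability mass is concentrated on mid-range timesteps.
# The mean/std defaults below are illustrative assumptions.
def sample_logit_normal_timesteps(batch_size, mean=0.0, std=1.0):
    u = torch.randn(batch_size) * std + mean
    return torch.sigmoid(u)

t = sample_logit_normal_timesteps(256)  # batch size 256, as in training
```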
## LICENSE

The model is finetuned from SD3; therefore, the license follows the original SD3 license.