Garment UV-Texture ControlNet (v3)

A ControlNet for Stable Diffusion XL that generates UV-space texture atlases for 3D garment meshes, conditioned on tangent-space normal maps baked into UV space.

Given a UV-space normal map of a garment mesh and a text prompt describing the material/pattern, this ControlNet produces a flat 2D texture atlas with the garment panels correctly placed for the mesh's UV layout. The atlas can then be applied as a texture to the 3D mesh.
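To make the last step concrete, here is a minimal NumPy sketch of a texture lookup: it reads the atlas colour at each vertex's UV coordinate with nearest-neighbour sampling. The function name sample_atlas, the flip_v flag, and the convention that v = 0 maps to the bottom row of the image are assumptions for illustration. A renderer or DCC tool would normally perform this lookup itself, with filtering.

```python
import numpy as np

def sample_atlas(atlas, uv, flip_v=True):
    """Nearest-neighbour lookup of atlas colours at UV coordinates.

    atlas: (H, W, 3) array, e.g. np.asarray(PIL image).
    uv:    (N, 2) array of UV coordinates in [0, 1].
    """
    atlas = np.asarray(atlas)
    uv = np.asarray(uv, dtype=np.float64)
    h, w = atlas.shape[:2]
    u = np.clip(uv[:, 0], 0.0, 1.0)
    v = np.clip(uv[:, 1], 0.0, 1.0)
    if flip_v:
        # Assumption: v = 0 is the bottom of the image, row 0 is the top.
        v = 1.0 - v
    # Map [0, 1] to pixel indices, keeping u = 1 or v = 1 inside the image.
    cols = np.minimum((u * w).astype(int), w - 1)
    rows = np.minimum((v * h).astype(int), h - 1)
    return atlas[rows, cols]
```

For example, sample_atlas(np.asarray(atlas), mesh_uvs) would return one RGB colour per UV coordinate, where mesh_uvs is a hypothetical (N, 2) array taken from your mesh.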

Categories trained on

Category	Samples
long-shirt	~383
long-dress	~413
short-shirt	~236
shorts	~74
pants	~38
Total	~1144

Training details

Base: stabilityai/stable-diffusion-xl-base-1.0
VAE: madebyollin/sdxl-vae-fp16-fix
Resolution: 1024×1024
Steps: 20000 (warm-started from a 12000-step single-category checkpoint)
Batch size: 2
Learning rate: 1e-5, cosine schedule, 500 warmup steps
Mixed precision: fp16
Loss masking: per-pixel weighted MSE with UV-island mask (background weight 0.1)
Captions: per-sample, generated with Gemma 3 27B vision and trimmed to fit the 77-token CLIP limit
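
The loss masking entry above can be sketched as follows. This is an illustrative NumPy version, not the training code: the function name masked_mse, a weight of 1.0 inside UV islands, and averaging over all elements are assumptions. Only the background weight of 0.1 comes from the list above. If the loss is computed in latent space, the mask would also have to be resized to the latent resolution, which is not shown here.

```python
import numpy as np

def masked_mse(pred, target, island_mask, background_weight=0.1):
    """Per-pixel weighted MSE.

    pred, target: (C, H, W) arrays.
    island_mask:  (H, W) array, 1 inside UV islands and 0 in the background.
    """
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    mask = np.asarray(island_mask, dtype=np.float64)
    # Assumption: island pixels get weight 1.0, background gets 0.1.
    weights = np.where(mask > 0.5, 1.0, background_weight)
    squared_error = (pred - target) ** 2
    # Broadcast the (H, W) weights over the channel axis, then average.
    return float((squared_error * weights[None, :, :]).mean())
```

With this weighting, errors in the empty background still contribute to the loss, but ten times less than errors on the garment panels.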

Usage

import torch
from diffusers import (
    AutoencoderKL, ControlNetModel,
    StableDiffusionXLControlNetPipeline, UniPCMultistepScheduler,
)
from PIL import Image

# Load the ControlNet weights and the fp16-safe VAE listed under Training details.
controlnet = ControlNetModel.from_pretrained(
    "JorgeAskur/garment-uv-controlnet-v3", torch_dtype=torch.float16
)
vae = AutoencoderKL.from_pretrained(
    "madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16
)

# Build the SDXL pipeline around the ControlNet and swap in the UniPC scheduler.
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet, vae=vae,
    torch_dtype=torch.float16,
)
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
pipe = pipe.to("cuda")

# The conditioning image is resized to the 1024×1024 training resolution.
normal_map = Image.open("normal.png").convert("RGB").resize((1024, 1024))
atlas = pipe(
    prompt="long-sleeved plaid shirt, cotton, red and cream checkered pattern",
    image=normal_map,
    num_inference_steps=40,
    guidance_scale=7.5,
    controlnet_conditioning_scale=1.0,
    height=1024, width=1024,
).images[0]
atlas.save("atlas.png")

Conditioning input

The conditioning image is a UV-space tangent normal map. Render your mesh in UV space (using the UV coordinates as 2D positions) and encode the per-fragment surface normal N as 8-bit RGB: R = (N.x * 0.5 + 0.5) * 255, with G and B computed the same way from N.y and N.z. Pixels outside the UV islands (the background) should be black (0, 0, 0).
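
The encoding above can be sketched in NumPy as follows. The sketch assumes you already have a per-pixel normal image rasterised in UV space plus a mask of the UV islands; the function name encode_normals, both input arrays, and rounding to the nearest integer are assumptions for illustration, and the rasterisation step itself is not shown.

```python
import numpy as np

def encode_normals(normals, island_mask):
    """Encode unit normals as an 8-bit RGB conditioning image.

    normals:     (H, W, 3) float array with components in [-1, 1].
    island_mask: (H, W) boolean array, True inside UV islands.
    """
    normals = np.asarray(normals, dtype=np.float64)
    mask = np.asarray(island_mask, dtype=bool)
    # Map each component from [-1, 1] to [0, 255].
    rgb = (normals * 0.5 + 0.5) * 255.0
    # Assumption: round to nearest before converting to 8-bit.
    rgb = np.clip(np.rint(rgb), 0, 255).astype(np.uint8)
    # Background pixels are black (0, 0, 0).
    rgb[~mask] = 0
    return rgb
```

The result can be written out with PIL, for example Image.fromarray(encode_normals(normals, island_mask)).save("normal.png"), and then loaded as in the Usage section.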

Limitations

Trained on registered/fitted garment meshes, so it works best on meshes with similar topology.
Five garment categories only; out-of-distribution garments (e.g. jackets, hats) will produce poor results.
Captions should follow the training distribution: a single comma-separated line describing material, pattern, color, and notable details. Avoid 3D-photo wording.

License

OpenRAIL++ (inherits from SDXL base).
