---
library_name: diffusers
tags:
  - image-decomposition
  - layered-image-editing
  - diffusion
  - flux
  - lora
  - image-to-image
  - transparent-rgba
  - arxiv:2605.15167
---

# SynLayers Stage 2 Checkpoints

This repository hosts the **Stage 2 checkpoints and runtime assets** for SynLayers, our real-world image layer decomposition system.

The main assets in this repo include:

- `SynLayers_checkpoints/FLUX.1-dev`
- `SynLayers_checkpoints/FLUX.1-dev-Controlnet-Inpainting-Alpha`
- `SynLayers_ckpt/step_120000`
- `ckpt/trans_vae/0008000.pt`
- `ckpt/pre_trained_LoRA`
- `ckpt/prism_ft_LoRA`

These assets are used by our public Space:
[SynLayers/synlayers](https://huggingface.co/spaces/SynLayers/synlayers)

The full SynLayers system has two stages:

1. Bounding-box and whole-image caption prediction with [`SynLayers/Bbox-caption-8b`](https://huggingface.co/SynLayers/Bbox-caption-8b)
2. Layer decomposition into transparent RGBA outputs using the assets in this repository

This repository is intended for use within the SynLayers decomposition pipeline; it is not a standalone model that can be loaded and prompted as a generic `DiffusionPipeline`.

## Stage 2 Inference

The standalone Stage 2 entrypoint consists of a script and its config file:

- `infer/infer.py` (inference script)
- `infer/infer.yaml` (configuration)

Stage 2 expects images plus a JSONL file containing the whole-image caption and bounding boxes. The easiest way to get those inputs is to run Stage 1 first with [`SynLayers/Bbox-caption-8b`](https://huggingface.co/SynLayers/Bbox-caption-8b), or use the public Space for the full two-stage pipeline.
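The exact field names in the JSONL are determined by the Stage 1 output and are not listed in this card. As a quick sanity check before running Stage 2, the sketch below loads a JSONL file and reports which top-level keys its records carry, so you can compare them against what `infer/infer.py` reads. `summarize_jsonl` is a hypothetical helper, not part of this repository, and it makes no assumption about the key names.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_jsonl(path):
    """Return (number of records, {top-level key: how many records have it})."""
    key_counts = Counter()
    n_records = 0
    with Path(path).open(encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                raise ValueError(f"{path}:{line_no}: invalid JSON ({exc.msg})") from exc
            if not isinstance(record, dict):
                raise ValueError(f"{path}:{line_no}: expected a JSON object")
            n_records += 1
            key_counts.update(record.keys())
    return n_records, dict(key_counts)
```

Calling `summarize_jsonl("path/to/caption_bbox_infer.jsonl")` returns the record count and per-key counts; a key that appears in fewer records than the total points to incomplete entries.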

After preparing your inputs, update these fields in `infer/infer.yaml`:

```yaml
data_dir: "path/to/your/work_dir"
image_dir: "path/to/your/images"
test_jsonl: "path/to/caption_bbox_infer.jsonl"
save_dir: "path/to/save/results"
```
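Before launching a run, it can help to confirm that these four fields point somewhere sensible. The following is a minimal sketch under stated assumptions: that the config uses flat `key: "value"` lines as shown above, that `data_dir` and `image_dir` are directories, and that `test_jsonl` is a file. Whether `save_dir` must already exist is not stated here, so the sketch only checks that it is set. `read_flat_yaml` and `check_inputs` are hypothetical helpers, not part of this repository.

```python
import re
from pathlib import Path

# Matches only flat `key: value` or `key: "value"` lines (an assumption about
# infer/infer.yaml). Nested YAML or trailing comments need a real YAML parser.
_FLAT_LINE = re.compile(r'^(\w+):\s*"?([^"#]*?)"?\s*$')


def read_flat_yaml(text):
    """Parse flat key/value lines into a dict, skipping everything else."""
    values = {}
    for line in text.splitlines():
        match = _FLAT_LINE.match(line.strip())
        if match and match.group(2):
            values[match.group(1)] = match.group(2)
    return values


def check_inputs(cfg):
    """Return a list of problems; an empty list means the inputs look usable."""
    problems = []
    for key in ("data_dir", "image_dir"):
        value = cfg.get(key)
        if not value:
            problems.append(f"{key} is not set")
        elif not Path(value).is_dir():
            problems.append(f"{key} is not a directory: {value}")
    jsonl = cfg.get("test_jsonl")
    if not jsonl:
        problems.append("test_jsonl is not set")
    elif not Path(jsonl).is_file():
        problems.append(f"test_jsonl is not a file: {jsonl}")
    if not cfg.get("save_dir"):
        problems.append("save_dir is not set")
    return problems
```

A typical use is `check_inputs(read_flat_yaml(Path("infer/infer.yaml").read_text()))`, printing any returned messages before starting inference.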

Then run:

```bash
python infer/infer.py \
  --config_path infer/infer.yaml
```

The default checkpoint paths in `infer/infer.yaml` are repo-relative and point to the assets in this repository:

```yaml
pretrained_model_name_or_path: "SynLayers_checkpoints/FLUX.1-dev"
pretrained_adapter_path: "SynLayers_checkpoints/FLUX.1-dev-Controlnet-Inpainting-Alpha"
lora_ckpt: "SynLayers_ckpt/step_120000/transformer"
layer_ckpt: "SynLayers_ckpt/step_120000"
adapter_lora_dir: "SynLayers_ckpt/step_120000/adapter"
```
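Because these paths are relative, a run started from the wrong directory or with a partial download will not find the assets. The sketch below lists which of the default paths are missing under a given root. It assumes the paths resolve against the directory where this repository was downloaded; `missing_checkpoints` is a hypothetical helper, not part of this repository.

```python
from pathlib import Path

# Default values copied from the infer/infer.yaml snippet above.
CHECKPOINT_PATHS = {
    "pretrained_model_name_or_path": "SynLayers_checkpoints/FLUX.1-dev",
    "pretrained_adapter_path": "SynLayers_checkpoints/FLUX.1-dev-Controlnet-Inpainting-Alpha",
    "lora_ckpt": "SynLayers_ckpt/step_120000/transformer",
    "layer_ckpt": "SynLayers_ckpt/step_120000",
    "adapter_lora_dir": "SynLayers_ckpt/step_120000/adapter",
}


def missing_checkpoints(repo_root, paths=CHECKPOINT_PATHS):
    """Return {config key: resolved path} for every path that does not exist."""
    root = Path(repo_root)
    return {
        key: str(root / rel)
        for key, rel in paths.items()
        if not (root / rel).exists()
    }
```

An empty result from `missing_checkpoints(".")` suggests the defaults can be used unchanged from the current directory.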

For most users, the public Space is the recommended interface because it runs both Stage 1 and Stage 2 in one workflow.

For more details, please check our paper:
[https://arxiv.org/abs/2605.15167](https://arxiv.org/abs/2605.15167)

If you find our work useful, please consider citing:

```bibtex
@article{wu2026does,
  title={Does Synthetic Layered Design Data Benefit Layered Design Decomposition?},
  author={Wu, Kam Man and Yang, Haolin and Chen, Qingyu and Tang, Yihu and Chen, Jingye and Chen, Qifeng},
  journal={arXiv preprint arXiv:2605.15167},
  year={2026}
}
```

Thanks for trying SynLayers.