eafn committed on
Commit
1e11899
·
verified ·
1 Parent(s): 84d21f3

Update README.md

Files changed (1)
  1. README.md +83 -6
README.md CHANGED
@@ -1,6 +1,83 @@
- ---
- license: other
- license_name: stabilityai-ai-community
- license_link: >-
-   https://huggingface.co/stabilityai/stable-diffusion-3-medium/blob/main/LICENSE.md
- ---
+ ---
+ license: other
+ license_name: stabilityai-ai-community
+ license_link: >-
+   https://huggingface.co/stabilityai/stable-diffusion-3-medium/blob/main/LICENSE.md
+ language:
+ - en
+ base_model:
+ - alimama-creative/SD3-Controlnet-Inpainting
+ - stabilityai/stable-diffusion-3-medium
+ pipeline_tag: text-to-image
+ library_name: diffusers
+ tags:
+ - alimama-creative
+ - stable-diffusion
+ ---
+
+
+ # PosterMaker
+ ![demo images](assets/tesear.jpg)
+
+ To learn more about PosterMaker, please visit [Project page](https://arxiv.org/abs/2504.06632).
+
+
+ ## Model
+
+ ![method](assets/method.png)
+
+ PosterMaker is a framework for generating promotional product posters with accurate text rendering and high product fidelity. It uses TextRenderNet for precise character-level text control and SceneGenNet to maintain product fidelity. A two-stage training strategy optimizes text rendering and background generation separately, and the paper reports that PosterMaker significantly outperforms existing methods.
+
+ For more technical details, please refer to the [Research paper](https://arxiv.org/abs/2504.06632).
+
+
+ ### Model Weight
+
+ The released models and their weight files are listed below.
+
+ | Model Name | Weight Name | Download Link |
+ | --- | --- | --- |
+ | TextRenderNet_v1 | textrender_net-0415.pth | [HuggingFace](https://huggingface.co/alimama-creative/PosterMaker/tree/main) |
+ | SceneGenNet_v1 | scenegen_net-0415.pth | [HuggingFace](https://huggingface.co/alimama-creative/PosterMaker/tree/main) |
+ | SceneGenNet_v1 with Reward Learning | scenegen_net-rl-0415.pth | [HuggingFace](https://huggingface.co/alimama-creative/PosterMaker/tree/main) |
+ | TextRenderNet_v2 | textrender_net-1m-0415.pth | [HuggingFace](https://huggingface.co/alimama-creative/PosterMaker/tree/main) |
+ | SceneGenNet_v2 | scenegen_net-1m-0415.pth | [HuggingFace](https://huggingface.co/alimama-creative/PosterMaker/tree/main) |
+
+ **NOTE:** TextRenderNet_v2 is trained on more data in Stage 1, which improves text rendering. See Section 8 of the Supplementary Materials for details.
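As a rough sketch of fetching one of the checkpoints listed above, the snippet below maps each model name to its weight file and downloads it with `hf_hub_download` from the `huggingface_hub` package. It assumes the `.pth` files sit at the root of the `alimama-creative/PosterMaker` repository, which the table does not confirm. The names `REPO_ID`, `WEIGHTS`, and `download_weight` are illustrative and not part of the PosterMaker code.

```python
# Sketch only: REPO_ID, WEIGHTS and download_weight are hypothetical helper
# names introduced for illustration; they are not part of PosterMaker.
REPO_ID = "alimama-creative/PosterMaker"

# Model name -> weight file, copied from the table above.
WEIGHTS = {
    "TextRenderNet_v1": "textrender_net-0415.pth",
    "SceneGenNet_v1": "scenegen_net-0415.pth",
    "SceneGenNet_v1 with Reward Learning": "scenegen_net-rl-0415.pth",
    "TextRenderNet_v2": "textrender_net-1m-0415.pth",
    "SceneGenNet_v2": "scenegen_net-1m-0415.pth",
}


def download_weight(model_name: str) -> str:
    """Download one checkpoint and return its local cache path.

    Assumes the file is stored at the repository root; pass a different
    filename (or a subfolder) if the repository is laid out differently.
    """
    if model_name not in WEIGHTS:
        raise KeyError(f"unknown model: {model_name!r}")
    # Imported lazily so the mapping can be inspected without the dependency.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=REPO_ID, filename=WEIGHTS[model_name])
```

Calling `download_weight("TextRenderNet_v2")` would then return the path of the cached file, provided the layout assumption holds.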
+
+
+ ### Known Limitations
+ The current model has the following known limitations, which stem from how text elements and captions were processed when constructing the training dataset:
+
+ **Text**
+ - During training, text is restricted to 7 lines of up to 16 characters each; the same limits apply at inference.
+ - The training data comes from e-commerce platforms, so text colors and font styles are relatively simple and design diversity is limited. Inference outputs show similarly simple styles.
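The text limits above can be checked before inference with a small helper. This is a minimal sketch: `check_text_lines` is a hypothetical function, and counting characters with Python's `len()` (Unicode code points) is an assumption, since the README does not say how characters are counted.

```python
MAX_LINES = 7   # line limit stated above
MAX_CHARS = 16  # per-line character limit stated above


def check_text_lines(lines: list[str]) -> list[str]:
    """Return human-readable problems; an empty list means the input fits.

    Characters are counted with len(), i.e. Unicode code points. This is an
    assumption and may differ from how the model counts characters.
    """
    problems = []
    if len(lines) > MAX_LINES:
        problems.append(f"{len(lines)} lines given, at most {MAX_LINES} supported")
    for i, line in enumerate(lines):
        if len(line) > MAX_CHARS:
            problems.append(
                f"line {i} has {len(line)} characters, at most {MAX_CHARS} supported"
            )
    return problems
```

For example, `check_text_lines(["Summer Sale", "50% OFF"])` reports no problems, while a single 17-character line is flagged.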
+
+
+ **Layout**
+ - Only horizontal text boxes are supported (vertical text boxes were too few in the data, so we excluded them from training).
+ - For best results, a text box's aspect ratio should be proportional to the length of its content (training used tight bounding-box annotations).
+ - Text is not wrapped automatically within a box (multi-line text was split into separate boxes during training).
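One way to prepare layouts that respect these constraints is to split multi-line text into one horizontal box per line and then tighten each box to its content. The sketch below is illustrative only: `TextBox`, `split_lines`, and `fit_width` are hypothetical names, and `char_aspect` (the width-to-height ratio of one glyph) is a guessed parameter, not a value documented for PosterMaker.

```python
from dataclasses import dataclass


@dataclass
class TextBox:
    text: str
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height


def split_lines(box: TextBox) -> list[TextBox]:
    """Split a multi-line box into one box per line, stacked top to bottom."""
    lines = box.text.split("\n")
    line_h = box.h / len(lines)
    return [
        TextBox(line, box.x, box.y + i * line_h, box.w, line_h)
        for i, line in enumerate(lines)
    ]


def fit_width(box: TextBox, char_aspect: float = 1.0) -> TextBox:
    """Shrink the width so that w / h is about len(text) * char_aspect.

    char_aspect is an assumed glyph width / height ratio; tune it for the
    script and font in use. The left edge stays fixed and the box never grows.
    """
    target_w = len(box.text) * char_aspect * box.h
    return TextBox(box.text, box.x, box.y, min(box.w, target_w), box.h)
```

A 400 by 100 box containing `"NEW\nARRIVAL"` becomes two boxes of height 50, and `fit_width` then narrows the first one to 150 under the default `char_aspect`.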
+
+ **Prompt Behavior**
+ - Text content should not be specified in prompts (to match the training setting).
+ - Precise control over text attributes is limited. For poster generation, we expect the model to determine attributes such as fonts and colors automatically, so descriptions of text attributes were intentionally suppressed in the training captions.
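A simple guard for the first point is to check that none of the poster texts appear in the prompt. The helper below is a hypothetical sketch, not part of PosterMaker; it only catches verbatim, case-insensitive matches and will miss paraphrased text.

```python
def find_text_in_prompt(prompt: str, texts: list[str]) -> list[str]:
    """Return the poster texts that appear verbatim in the prompt.

    Matching is case-insensitive; empty or whitespace-only texts are skipped.
    """
    lowered = prompt.lower()
    return [t for t in texts if t.strip() and t.lower() in lowered]
```

If the returned list is non-empty, the matching strings can be removed from the prompt and supplied only through the text boxes.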
+
+
+ ## Citation
+ If you find PosterMaker useful for your research and applications, please cite using this BibTeX:
+
+ ```BibTeX
+ @misc{gao2025postermakerhighqualityproductposter,
+   title={PosterMaker: Towards High-Quality Product Poster Generation with Accurate Text Rendering},
+   author={Yifan Gao and Zihang Lin and Chuanbin Liu and Min Zhou and Tiezheng Ge and Bo Zheng and Hongtao Xie},
+   year={2025},
+   eprint={2504.06632},
+   archivePrefix={arXiv},
+   primaryClass={cs.CV},
+   url={https://arxiv.org/abs/2504.06632},
+ }
+ ```
+
+ ## LICENSE
+ The model is fine-tuned from SD3; therefore, it follows the original [SD3 license](https://huggingface.co/stabilityai/stable-diffusion-3-medium#license).