Tony1109 commited on
Commit
73b54e6
·
verified ·
1 Parent(s): 15d4172

Upload README.md with huggingface_hub

Browse files
Files changed (1) hide show
  1. README.md +95 -0
README.md ADDED
@@ -0,0 +1,95 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ ---
2
+ language:
3
+ - en
4
+ license: apache-2.0
5
+ library_name: transformers
6
+ pipeline_tag: text-generation
7
+ tags:
8
+ - graphic-design
9
+ - design-generation
10
+ - layout-planning
11
+ - qwen3
12
+ base_model: Qwen/Qwen3-8B
13
+ ---
14
+
15
+ # DesignAsCode Semantic Planner
16
+
17
+ The Semantic Planner for the [DesignAsCode](https://github.com/liuziyuan1109/design-as-code) pipeline. Given a natural-language design request, it generates a structured design plan — including layout reasoning, layer grouping, image generation prompts, and text element specifications.
18
+
19
+ ## Model Details
20
+
21
+ | | |
22
+ |---|---|
23
+ | **Base Model** | Qwen3-8B |
24
+ | **Fine-tuning** | Supervised Fine-Tuning (SFT) |
25
+ | **Size** | 16 GB (fp16) |
26
+ | **Context Window** | 8,192 tokens |
27
+
28
+ ## Training Data
29
+
30
+ Trained on ~10k examples sampled from the [DesignAsCode Training Data](https://huggingface.co/datasets/Tony1109/DesignAsCode-training-data), which contains 19,479 design samples distilled from the [Crello](https://huggingface.co/datasets/cyberagent/crello) dataset using GPT-4o and GPT-o3. No additional data was used.
31
+
32
+ ### Training Format
33
+
34
+ - **Input:** `prompt` — natural-language design request
35
+ - **Output:** `layout_thought` + `grouping` + `image_generator` + `generate_text`
36
+
37
+ See the [training data repo](https://huggingface.co/datasets/Tony1109/DesignAsCode-training-data) for field details.
38
+
39
+ ## Training Configuration
40
+
41
+ | | |
42
+ |---|---|
43
+ | **Batch Size** | 1 |
44
+ | **Gradient Accumulation** | 2 |
45
+ | **Learning Rate** | 5e-5 (AdamW) |
46
+ | **Epochs** | 2 |
47
+ | **Max Sequence Length** | 8,192 tokens |
48
+ | **Precision** | bfloat16 |
49
+ | **Loss** | Completion-only (only on generated tokens) |
50
+
51
+ ## Usage
52
+
53
+ ```python
54
+ from transformers import AutoTokenizer, AutoModelForCausalLM
55
+ import torch
56
+
57
+ model_path = "Tony1109/DesignAsCode-planner"
58
+ tokenizer = AutoTokenizer.from_pretrained(model_path)
59
+ model = AutoModelForCausalLM.from_pretrained(
60
+ model_path,
61
+ torch_dtype=torch.float16,
62
+ device_map="auto"
63
+ )
64
+ ```
65
+
66
+ For full pipeline usage (plan → retrieve → implement → refine), see the [project repo](https://github.com/liuziyuan1109/design-as-code) and [QUICKSTART.md](https://github.com/liuziyuan1109/design-as-code/blob/main/QUICKSTART.md).
67
+
68
+ ## Outputs
69
+
70
+ The model generates semi-structured text with XML tags:
71
+
72
+ - `<layout_thought>...</layout_thought>` — detailed layout reasoning
73
+ - `<grouping>...</grouping>` — JSON array grouping related layers with thematic labels
74
+ - `<image_generator>...</image_generator>` — JSON array of per-layer image generation prompts
75
+ - `<generate_text>...</generate_text>` — JSON array of text element specifications (font, size, alignment, etc.)
76
+
77
+ ## Ethical Considerations
78
+
79
+ - Designs should be reviewed by humans before production use.
80
+ - May reflect biases present in the training data.
81
+ - Generated content should be checked for copyright compliance.
82
+
83
+ ## Citation
84
+
85
+ ```bibtex
86
+ @article{liu2025designascode,
87
+ title = {DesignAsCode: Bridging Structural Editability and
88
+ Visual Fidelity in Graphic Design Generation},
89
+ author = {Liu, Ziyuan and Sun, Shizhao and Huang, Danqing
90
+ and Shi, Yingdong and Zhang, Meisheng and Li, Ji
91
+ and Yu, Jingsong and Bian, Jiang},
92
+ journal = {arXiv preprint},
93
+ year = {2025}
94
+ }
95
+ ```