luzimu committed on
Commit 771caa6 · verified · 1 Parent(s): ac06d32

Update README.md

  1. README.md +74 -3
README.md CHANGED
@@ -1,3 +1,74 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ ---
+
+ # WebGen-Agent
+
+ WebGen-Agent is a website generation agent that autonomously creates websites from natural language instructions. It was introduced in the paper [WebGen-Agent: Enhancing Interactive Website Generation with Multi-Level Feedback and Step-Level Reinforcement Learning](fig/WebGen_Agent.pdf).
+
+ ## Project Overview
+
+ WebGen-Agent pairs language models with supervised fine-tuning (SFT) and step-level reinforcement learning (Step-GRPO). The agent interprets natural language instructions that specify appearance and functional requirements, iteratively generates a website codebase, and refines it using visual and functional feedback.
+
+ ![WebGen-Agent Workflow](fig/webgen-agent.png)
+
+ ![Step-GRPO with Screenshot and GUI-agent Feedback](fig/step-grpo.png)
+
+ ## Resources
+
+ Links to the data and model parameters are as follows:
+
+ | Data | HF Link |
+ |----------|------|
+ | webgen-agent_train_sft | 🤗 [luzimu/webgen-agent_train_sft](https://huggingface.co/datasets/luzimu/webgen-agent_train_sft) |
+ | webgen-agent_train_step-grpo | 🤗 [luzimu/webgen-agent_train_step-grpo](https://huggingface.co/datasets/luzimu/webgen-agent_train_step-grpo) |
+
+ | Model | HF Link |
+ |----------|------|
+ | WebGenAgent-LM-7B-SFT | 🤗 [luzimu/WebGenAgent-LM-7B-SFT](https://huggingface.co/luzimu/WebGenAgent-LM-7B-SFT) |
+ | WebGenAgent-LM-7B-Step-GRPO | 🤗 [luzimu/WebGenAgent-LM-7B-Step-GRPO](https://huggingface.co/luzimu/WebGenAgent-LM-7B-Step-GRPO) |
+ | WebGenAgent-LM-8B-SFT | 🤗 [luzimu/WebGenAgent-LM-8B-SFT](https://huggingface.co/luzimu/WebGenAgent-LM-8B-SFT) |
+ | WebGenAgent-LM-8B-Step-GRPO | 🤗 [luzimu/WebGenAgent-LM-8B-Step-GRPO](https://huggingface.co/luzimu/WebGenAgent-LM-8B-Step-GRPO) |
+
+ ## How WebGen-Agent Works
+
+ WebGen-Agent follows an iterative, multi-step paradigm for website generation:
+
+ 1. **Code Generation**: The agent generates code to create or edit website files based on natural language instructions.
+ 2. **Code Execution**: Dependencies are installed and the website service is started.
+ 3. **Feedback Gathering**:
+    - A screenshot of the website is captured.
+    - A Visual Language Model (VLM) provides appearance feedback and scores.
+    - A GUI-agent tests the website functionality and provides functional feedback.
+ 4. **Refinement**: Based on the feedback, the agent continues to improve the website until it meets requirements.
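
The four steps above can be sketched as a simple loop. This is a minimal illustration and not the project's actual implementation: `run_agent`, `Feedback`, `Step`, the `max_steps` limit, the `pass_score` threshold, and the toy stand-in functions are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Feedback:
    screenshot_score: float  # appearance score (would come from the VLM)
    gui_score: float         # functional score (would come from the GUI-agent)
    text: str                # natural-language feedback passed to the next step


@dataclass
class Step:
    code: str
    feedback: Feedback


def run_agent(
    instruction: str,
    generate_code: Callable[[str, str], str],
    gather_feedback: Callable[[str], Feedback],
    max_steps: int = 10,      # assumed step budget
    pass_score: float = 4.0,  # assumed "meets requirements" threshold
) -> List[Step]:
    """Generate, execute, score, and refine until both scores pass or steps run out."""
    history: List[Step] = []
    feedback_text = ""
    for _ in range(max_steps):
        code = generate_code(instruction, feedback_text)  # 1. code generation
        feedback = gather_feedback(code)                  # 2-3. execution and feedback
        history.append(Step(code, feedback))
        if min(feedback.screenshot_score, feedback.gui_score) >= pass_score:
            break                                         # requirements met
        feedback_text = feedback.text                     # 4. input for refinement
    return history


# Toy stand-ins so the sketch runs without a model, browser, or VLM.
def toy_generate(instruction: str, feedback_text: str) -> str:
    return instruction + ("" if not feedback_text else " | fix: " + feedback_text)


def toy_feedback(code: str) -> Feedback:
    quality = 3.0 + code.count("fix:")  # each revision raises the toy score by one
    return Feedback(quality, quality, "improve layout")


steps = run_agent("Build a to-do list site", toy_generate, toy_feedback)
print(len(steps), steps[-1].feedback.screenshot_score)
```

In a real setup, `generate_code` would call the language model and `gather_feedback` would start the site, capture the screenshot, and query the VLM and GUI-agent.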
+
+ ## Key Features
+
+ - **Iterative Refinement**: Continuously improves website appearance and functionality
+ - **Feedback Integration**: Uses both visual and functional feedback for enhanced performance
+ - **Backtracking Mechanism**: Reverts to previous states when encountering persistent errors
+ - **Best Step Selection**: Selects the optimal version based on screenshot and GUI-agent scores
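
Backtracking and best-step selection might look roughly like the sketch below. The README does not state how the two scores are combined or how many errors count as "persistent", so the summed score, the tie-break toward later steps, and the `max_consecutive_errors` threshold are assumptions; `StepRecord`, `select_best_step`, and `should_backtrack` are hypothetical names.

```python
from typing import List, NamedTuple, Optional


class StepRecord(NamedTuple):
    index: int
    screenshot_score: float
    gui_score: float
    had_error: bool


def select_best_step(steps: List[StepRecord]) -> Optional[StepRecord]:
    """Pick the error-free step with the highest combined score (assumed: sum)."""
    candidates = [s for s in steps if not s.had_error]
    if not candidates:
        return None
    # Ties are broken in favor of the later step (an assumption).
    return max(candidates, key=lambda s: (s.screenshot_score + s.gui_score, s.index))


def should_backtrack(steps: List[StepRecord], max_consecutive_errors: int = 3) -> bool:
    """Signal a revert once the most recent steps have all failed."""
    recent = steps[-max_consecutive_errors:]
    return len(recent) == max_consecutive_errors and all(s.had_error for s in recent)


trajectory = [
    StepRecord(0, 3.0, 2.0, False),
    StepRecord(1, 4.0, 4.0, False),
    StepRecord(2, 0.0, 0.0, True),
    StepRecord(3, 0.0, 0.0, True),
    StepRecord(4, 0.0, 0.0, True),
]

best = select_best_step(trajectory)
print(best.index, should_backtrack(trajectory))
```

Here the agent would revert to the state saved at the best step before continuing, since the last three steps all ended in errors.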
+
+ ## Step-GRPO with Screenshot and GUI-agent Feedback
+
+ The Step-GRPO with Screenshot and GUI-agent Feedback approach uses the screenshot and GUI-agent scores inherently produced in the WebGen-Agent workflow as step-level rewards:
+ - **Screenshot Score**: Quantifies the visual appeal and aesthetics of the website
+ - **GUI-agent Score**: Measures how well the website meets functional requirements
+
+ These dual rewards provide dense, reliable process supervision that significantly improves the model's ability to generate high-quality websites.
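
To make the idea of step-level rewards concrete, the sketch below combines the two scores into one reward per step and normalizes rewards within a group, as standard GRPO does with group-relative advantages. The README does not give the exact reward formula or grouping used by Step-GRPO, so summing the two scores and the helper names `step_rewards` and `group_relative_advantages` are assumptions made for illustration.

```python
from statistics import mean, pstdev
from typing import List


def step_rewards(screenshot_scores: List[float], gui_scores: List[float]) -> List[float]:
    """Combine the two scores into one reward per step (assumed: simple sum)."""
    return [s + g for s, g in zip(screenshot_scores, gui_scores)]


def group_relative_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Standard GRPO-style normalization: (reward - group mean) / group std."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Three candidate steps sampled for the same state, each scored by the workflow.
rewards = step_rewards([3.0, 4.0, 5.0], [2.0, 4.0, 3.0])
advantages = group_relative_advantages(rewards)
print(rewards, [round(a, 3) for a in advantages])
```

Steps scoring above the group mean receive a positive advantage and are reinforced, while those below it are discouraged.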
+
+ ## Citation
+
+ If you find our project useful, please cite:
+
+ ```
+ @misc{lu2025webgenbenchevaluatingllmsgenerating,
+ title={WebGen-Bench: Evaluating LLMs on Generating Interactive and Functional Websites from Scratch},
+ author={Zimu Lu and Yunqiao Yang and Houxing Ren and Haotian Hou and Han Xiao and Ke Wang and Weikang Shi and Aojun Zhou and Mingjie Zhan and Hongsheng Li},
+ year={2025},
+ eprint={2505.03733},
+ archivePrefix={arXiv},
+ primaryClass={cs.CL},
+ url={https://arxiv.org/abs/2505.03733},
+ }
+ ```