| license: mit | |
| task_categories: | |
| - text-generation | |
| language: | |
| - en | |
| tags: | |
| - code-generation | |
| - web-development | |
| - llm-evaluation | |
| - benchmark | |
| - llm-agents | |
| dataset_info: | |
| features: | |
| - name: id | |
| dtype: string | |
| - name: instruction | |
| dtype: string | |
| - name: Category | |
| struct: | |
| - name: primary_category | |
| dtype: string | |
| - name: subcategories | |
| sequence: string | |
| - name: application_type | |
| dtype: string | |
| - name: ui_instruct | |
| list: | |
| - name: task | |
| dtype: string | |
| - name: expected_result | |
| dtype: string | |
| - name: task_category | |
| struct: | |
| - name: primary_category | |
| dtype: string | |
| - name: subcategories | |
| sequence: string | |
| splits: | |
| - name: train | |
| num_bytes: 4038022 | |
| num_examples: 6667 | |
| - name: test | |
| num_bytes: 244776 | |
| num_examples: 101 | |
| download_size: 1566240 | |
| dataset_size: 4282798 | |
| configs: | |
| - config_name: default | |
| data_files: | |
| - split: train | |
| path: data/train-* | |
| - split: test | |
| path: data/test-* | |
| # WebGen-Instruct: Training Data for WebGen-Bench | |
| This repository contains `WebGen-Instruct`, the training data used in the paper [WebGen-Bench: Evaluating LLMs on Generating Interactive and Functional Websites from Scratch](https://arxiv.org/abs/2505.03733). | |
| WebGen-Bench is a benchmark that measures an LLM-based agent's ability to create multi-file website codebases from scratch. The benchmark itself consists of 101 instructions and 647 test cases. This dataset (`WebGen-Instruct`) provides 6,667 website-generation instructions, including 600 trajectories collected from DeepSeek-V3 and filtered by appearance score (greater than or equal to 3). | |
| The evaluation code, the training code, and the full WebGen-Bench data are released at [WebGen-Bench (GitHub)](https://github.com/mnluzimu/WebGen-Bench). | |
| ## Sample Usage | |
| You can easily load the training dataset using the `load_dataset` function from the 🤗 Datasets library: | |
| ```python | |
| from datasets import load_dataset | |
| # Load the WebGen-Instruct training dataset | |
| train_dataset = load_dataset("luzimu/WebGen-Bench_train_data", split="train") | |
| # Print dataset information | |
| print(train_dataset) | |
| # Access an example | |
| print(train_dataset[0]) | |
| ``` | |
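Each record is nested, so it can help to flatten one before working with it. The sketch below is a minimal illustration, assuming the layout suggested by the metadata above: a top-level `Category` struct and a `ui_instruct` list whose items carry their own `task_category`. The helper `summarize_example` and the sample record are hypothetical and are not part of the dataset or the WebGen-Bench code; the sample values are placeholders, not real data.

```python
from collections import Counter


def summarize_example(example):
    """Summarize one record, assuming the nested layout listed in the metadata."""
    # Some records may carry no UI test cases, so treat None as an empty list.
    ui_tasks = example.get("ui_instruct") or []
    category = example.get("Category") or {}
    # Count UI test cases per primary task category.
    task_categories = Counter(
        (task.get("task_category") or {}).get("primary_category", "unknown")
        for task in ui_tasks
    )
    return {
        "id": example.get("id"),
        "primary_category": category.get("primary_category"),
        "application_type": example.get("application_type"),
        "num_ui_tasks": len(ui_tasks),
        "task_categories": dict(task_categories),
    }


# Hypothetical record written by hand for illustration (placeholder values only).
sample = {
    "id": "example-0",
    "instruction": "Build a small website.",
    "Category": {
        "primary_category": "example-category",
        "subcategories": ["example-subcategory"],
    },
    "application_type": "example-app",
    "ui_instruct": [
        {
            "task": "Open the home page.",
            "expected_result": "The home page is displayed.",
            "task_category": {
                "primary_category": "example-task-category",
                "subcategories": ["example-task-subcategory"],
            },
        },
        {
            "task": "Submit the form.",
            "expected_result": "A confirmation message appears.",
            "task_category": {
                "primary_category": "example-task-category",
                "subcategories": ["example-task-subcategory"],
            },
        },
    ],
}

summary = summarize_example(sample)
print(summary)
```

The same helper should work on a row returned by `load_dataset`, for example `summarize_example(train_dataset[0])`, provided the loaded rows follow the assumed layout.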
| ## Training Results | |
| The performance of the WebGen-LM models trained on this data is shown below: | |
|  | |
| ## Citation | |
| If you find our project useful, please cite: | |
| ```bibtex | |
| @misc{lu2025webgenbenchevaluatingllmsgenerating, | |
| title={WebGen-Bench: Evaluating LLMs on Generating Interactive and Functional Websites from Scratch}, | |
| author={Zimu Lu and Yunqiao Yang and Houxing Ren and Haotian Hou and Han Xiao and Ke Wang and Weikang Shi and Aojun Zhou and Mingjie Zhan and Hongsheng Li}, | |
| year={2025}, | |
| eprint={2505.03733}, | |
| archivePrefix={arXiv}, | |
| primaryClass={cs.CL}, | |
| url={https://arxiv.org/abs/2505.03733}, | |
| } | |
| @misc{lu2025webgenagentenhancinginteractivewebsite, | |
| title={WebGen-Agent: Enhancing Interactive Website Generation with Multi-Level Feedback and Step-Level Reinforcement Learning}, | |
| author={Zimu Lu and Houxing Ren and Yunqiao Yang and Ke Wang and Zhuofan Zong and Junting Pan and Mingjie Zhan and Hongsheng Li}, | |
| year={2025}, | |
| eprint={2509.22644}, | |
| archivePrefix={arXiv}, | |
| primaryClass={cs.CL}, | |
| url={https://arxiv.org/abs/2509.22644}, | |
| } | |
| ``` |