---
base_model:
- Qwen/Qwen3-4B-Instruct-2507
pipeline_tag: text-generation
library_name: transformers
license: apache-2.0
---
# [ICLR 2026] Code Aesthetics with Agentic Reward Feedback
<div align="center">
<a href='https://bangx7.github.io/code-aesthetics/'><img src='https://img.shields.io/badge/Project-Page-Green'></a>
<a href="https://huggingface.co/SamuelBang/AesCoder-4B"><img alt="Hugging Face"
src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Model-ffc107?color=ffc107&logoColor=white"/></a>
<br>
<a href="https://arxiv.org/abs/2510.23272"><b>Paper Link</b></a>
</div>
<div align="center">
<p>
<span class="author-block">
<sup>1,2</sup><a href="https://bangx7.github.io" target="_blank">Bang Xiao</a><sup>#</sup>,</span>
<span class="author-block">
<sup>1,3</sup><a href="https://github.com/JackLingjie" target="_blank">Lingjie Jiang</a><sup>#</sup>,</span>
<span class="author-block">
<sup>1</sup><a href="https://www.microsoft.com/en-us/research/people/shaohanh/" target="_blank">Shaohan Huang</a><sup>†</sup>,</span>
<span class="author-block">
<sup>1</sup><a href="https://www.microsoft.com/en-us/research/people/tengchaolv/" target="_blank">Tengchao Lv</a>,
</span>
<span class="author-block">
<sup>1</sup><a href="https://www.microsoft.com/en-us/research/people/yupanhuang/" target="_blank">Yupan Huang</a>,
</span>
<span class="author-block">
<sup>1</sup><a href="https://yushuiwx.github.io/" target="_blank">Xun Wu</a>,
</span>
<span class="author-block">
<sup>1</sup><a href="https://www.microsoft.com/en-us/research/people/lecu/" target="_blank">Lei Cui</a>,
</span>
<span class="author-block">
<sup>1</sup><a href="https://www.microsoft.com/en-us/research/people/fuwei/" target="_blank">Furu Wei</a>
</span>
</p>
<p>
<sup>1</sup>Microsoft Research Asia
<sup>2</sup>Zhiyuan College, Shanghai Jiao Tong University
<sup>3</sup>Peking University<br>
<sup>#</sup>Equal Contribution
<sup>†</sup>Corresponding author
</p>
</div>
For the codebase, refer to: https://github.com/bangx7/code_aesthetics
## News
- __[2026.01.12]__: Released the [AesCode](https://huggingface.co/datasets/SamuelBang/AesCode-358K) dataset.
- __[2025.10.29]__: Released the [AesCoder-4B](https://huggingface.co/SamuelBang/AesCoder-4B/) model.
- __[2025.10.27]__: Released the [Project Page](https://bangx7.github.io/code-aesthetics/) and the [arXiv](https://arxiv.org/abs/2510.23272) version.
## Abstract
Large Language Models (LLMs) have become valuable assistants for developers in code-related tasks. While LLMs excel at traditional programming tasks such as code generation and bug fixing, they struggle with visually-oriented coding tasks, often producing suboptimal aesthetics. In this paper, we introduce a new pipeline to enhance the aesthetic quality of LLM-generated code. We first construct AesCode-358K, a large-scale instruction-tuning dataset focused on code aesthetics. Next, we propose agentic reward feedback, a multi-agent system that evaluates executability, static aesthetics, and interactive aesthetics. Building on this, we develop GRPO-AR, which integrates these signals into the GRPO algorithm for joint optimization of functionality and code aesthetics. Finally, we develop OpenDesign, a benchmark for assessing code aesthetics. Experimental results show that combining supervised fine-tuning on AesCode-358K with reinforcement learning using agentic reward feedback significantly improves performance on OpenDesign and also enhances results on existing benchmarks such as PandasPlotBench. Notably, our AesCoder-4B surpasses GPT-4o and GPT-4.1, and achieves performance comparable to large open-source models with 480B-685B parameters, underscoring the effectiveness of our approach.
## To-do List
- [x] Release paper and project page
- [x] Release our AesCoder model
- [x] Release code
**Note: This version of the AesCoder-4B model is intended for webpage design only.**
## Quickstart
### vLLM Deployment (Recommended)
We recommend using `vllm>=0.8.5` for efficient inference and deployment. Here's how to get started:
**Installation:**
```bash
# Quote the requirement so the shell does not treat ">=" as a redirection
pip install "vllm>=0.8.5"
```
**API Server Deployment:**
To create an OpenAI-compatible API endpoint:
```bash
vllm serve SamuelBang/AesCoder-4B --max-model-len 262144
```
**Using with OpenAI Client:**
```python
from openai import OpenAI

# Initialize the client
client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="None",
)

# Generate completion
response = client.chat.completions.create(
    model="SamuelBang/AesCoder-4B",
    messages=[
        {"role": "user", "content": "Create a user-friendly website for a landing page dedicated to selling dog-related products."}
    ],
    temperature=0.8,
    max_tokens=16384,
)

# Get the generated content
content = response.choices[0].message.content
print("Generated content:", content)
```
**Basic vLLM Usage:**
```python
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams

model_name = "SamuelBang/AesCoder-4B"

# Initialize the model
llm = LLM(
    model=model_name,
    max_model_len=262144,    # Maximum context length
    tensor_parallel_size=1,  # Adjust based on your GPU setup
)

# Define sampling parameters
sampling_params = SamplingParams(
    temperature=0.8,
    top_p=0.8,
    top_k=20,
    min_p=0,
    max_tokens=16384,
)

# Prepare the prompt
prompt = "Create a user-friendly website for a landing page dedicated to selling dog-related products. "
messages = [
    {"role": "user", "content": prompt}
]

# Apply the chat template (this requires the model's tokenizer)
tokenizer = AutoTokenizer.from_pretrained(model_name)
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

# Generate text
outputs = llm.generate([text], sampling_params)

# Get the result
content = outputs[0].outputs[0].text
print("Generated content:", content)
```
**Note:** If you encounter out-of-memory (OOM) issues, consider reducing the context length to a shorter value, such as `32768`, or adjusting `tensor_parallel_size` based on your available GPU memory.
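To see roughly why a shorter context eases memory pressure, the sketch below estimates the key-value cache needed for one full-length sequence. The layer count, KV-head count, and head dimension are assumptions based on the Qwen3-4B base architecture and should be checked against the model's `config.json`; the helper `kv_cache_bytes` is hypothetical and not part of vLLM.

```python
def kv_cache_bytes(
    num_tokens: int,
    num_layers: int = 36,      # assumed; see num_hidden_layers in config.json
    num_kv_heads: int = 8,     # assumed; see num_key_value_heads in config.json
    head_dim: int = 128,       # assumed; see head_dim in config.json
    bytes_per_value: int = 2,  # 2 bytes per value for bf16/fp16
) -> int:
    """Approximate KV-cache size in bytes for one sequence of num_tokens tokens."""
    # The factor 2 accounts for storing both keys and values in every layer.
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_value * num_tokens


GIB = 1024 ** 3
for context_len in (262144, 32768):
    print(f"{context_len:>7} tokens: {kv_cache_bytes(context_len) / GIB:.1f} GiB")
```

Under these assumptions a single 262144-token sequence needs about 36 GiB of cache on top of the model weights, while 32768 tokens need about 4.5 GiB. Actual usage depends on the serving engine's allocator and settings.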
### SGLang Deployment
You can use `sglang>=0.4.6.post1` to create an OpenAI-compatible API endpoint:
```shell
python -m sglang.launch_server --model-path SamuelBang/AesCoder-4B --context-length 262144
```
**Note: If you encounter out-of-memory (OOM) issues, consider reducing the context length to a shorter value, such as `32768`.**
### Use with the original `transformers` package (NOT recommended, very slow)
Qwen3 support is included in recent Hugging Face `transformers` releases, and we advise you to use the latest version of `transformers`.
With `transformers<4.51.0`, you will encounter the following error:
```
KeyError: 'qwen3'
```
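A quick way to catch this before loading the model is to compare the installed version against the `4.51.0` threshold mentioned above. The following is a minimal sketch; `supports_qwen3` is a hypothetical helper, and its parsing is deliberately rough (it only looks at the first three numeric components of the version string).

```python
import re


def supports_qwen3(version: str, minimum: tuple = (4, 51, 0)) -> bool:
    """Return True if a transformers version string is at least `minimum`."""
    # Keep only the leading numeric components, so "4.52.0.dev0" -> (4, 52, 0).
    parts = tuple(int(p) for p in re.findall(r"\d+", version)[:3])
    # Pad short versions such as "4.51" to three components before comparing.
    parts = parts + (0,) * (3 - len(parts))
    return parts >= minimum


print(supports_qwen3("4.50.3"))  # older than 4.51.0
print(supports_qwen3("4.51.0"))  # meets the threshold
```

In practice you would pass `transformers.__version__` to the helper and upgrade the package if it returns `False`.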
The following code snippet illustrates how to use the model to generate content from a given input.
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "SamuelBang/AesCoder-4B"

# Load the tokenizer and the model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
)

# Prepare the model input
prompt = "Create a user-friendly website for a landing page dedicated to selling dog-related products. "
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Run text completion
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=16384,
    temperature=0.8,
)

# Drop the prompt tokens and decode only the newly generated ones
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()
content = tokenizer.decode(output_ids, skip_special_tokens=True)
print("content:", content)
```
## Best Practices
### Sampling Parameters
To achieve optimal performance, we suggest using `Temperature=0.8`, `TopP=0.8`, `TopK=20`, and `MinP=0` for sampling.
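When calling an OpenAI-compatible endpoint, `temperature` and `top_p` are standard request fields, but `top_k` and `min_p` are not part of the OpenAI schema. The sketch below assumes the server accepts them as extra parameters, as vLLM's OpenAI-compatible server does, and passes them through the OpenAI client's `extra_body` argument. `build_chat_request` is a hypothetical helper that only assembles the request arguments.

```python
def build_chat_request(user_prompt: str, max_tokens: int = 16384) -> dict:
    """Assemble keyword arguments for client.chat.completions.create()."""
    return {
        "model": "SamuelBang/AesCoder-4B",
        "messages": [{"role": "user", "content": user_prompt}],
        "temperature": 0.8,
        "top_p": 0.8,
        "max_tokens": max_tokens,
        # Not OpenAI fields; assumed to be accepted by the server as extras.
        "extra_body": {"top_k": 20, "min_p": 0},
    }


request = build_chat_request("Create a landing page for a bakery.")
print(request["extra_body"])
```

With a running server and a client configured as in the Quickstart, the request can be sent with `client.chat.completions.create(**request)`.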
### System Prompt
For better webpage generation results, use a system prompt that matches the task. We group webpage generation into five main categories; the recommended system prompt for each category is given below (reference: https://designarena.ai):
Website:
```txt
You are an expert web developer and designer specializing in modern websites. Create a complete, working HTML page with embedded CSS and JavaScript if needed. Feel free to use lightweight libraries like Tailwind CSS to enhance the design as long as they can be rendered in an iframe.
Requirements:
1. Create a fully functional, modern, and responsive website design
2. Use only HTML, CSS, and JavaScript, but feel free to use libraries like Tailwind CSS to make the design better. Libraries such as ThreeJS, react three fiber, drei, @react-three/postprocessing, @react-three/cannon, d3, and recharts as additional libraries can be imported.
3. Include all styles inline within <style> tags
4. Focus on clean layouts, typography, and user experience
5. Implement modern web design trends (gradients, shadows, smooth animations)
6. Ensure excellent mobile responsiveness
7. Include interactive elements where appropriate
8. Make it production-ready and professional
9. You must include all relevant script tags for libraries to work properly.
Return ONLY the complete HTML code, starting with <!DOCTYPE html> and ending with </html>. Do not include any explanations or markdown formatting.
```
Game Development:
```txt
You are an expert game developer specializing in browser-based games. Create a complete, working HTML page with an interactive game using embedded CSS and JavaScript.
Requirements:
1. Create a fully functional, playable browser game
2. Use HTML, CSS, and JavaScript, but feel free to use libraries Tailwind CSS to make the game better as long as they will render in iframe. Libraries such as ThreeJS, react three fiber, drei, @react-three/postprocessing, @react-three/cannon, d3, and recharts (and others) can be imported.
3. Include all styles inline within <style> tags
4. Implement game mechanics, controls, and scoring systems
5. Include game states (start screen, gameplay, game over)
6. Add visual feedback, animations, and sound effects using Web Audio API if needed
7. Ensure responsive design that works on both desktop and mobile
8. Make the game engaging and fun to play
9. You must include all relevant script tags for libraries to work properly.
Return ONLY the complete HTML code, starting with <!DOCTYPE html> and ending with </html>. Do not include any explanations or markdown formatting.
```
3D Design:
```txt
You are an expert in 3D graphics and WebGL. Create a complete, working HTML page with 3D graphics and animations using embedded CSS and JavaScript.
Requirements:
1. Create a fully functional 3D scene or application
2. Use only CSS, and vanilla JavaScript with WebGL or CSS 3D transforms. Feel free to use lightweight libraries like Three.js or Babylon.js to make the 3D design better as long as they can be rendered in an iframe. Libraries such as ThreeJS, react three fiber, drei, @react-three/postprocessing, @react-three/cannon, d3, and recharts (and others) can be imported.
3. Include all styles inline within <style> tags
4. Implement 3D models, animations, and interactive controls
5. Add proper lighting, materials, and camera controls
6. Include smooth animations and user interaction
7. Ensure good performance and responsive design
8. Make it visually impressive and production-ready
9. You must include all relevant script tags for libraries to work properly.
Return ONLY the complete HTML code, starting with <!DOCTYPE html> and ending with </html>. Do not include any explanations or markdown formatting.
```
Data Visualization:
```txt
You are an expert in data visualization and interactive charts. Create a complete, working HTML page with dynamic data visualization capabilities using embedded CSS and JavaScript. Feel free to use lightweight libraries (as long as it can be rendered in an iframe) such as Tailwind CSS.
Requirements:
1. Create a fully functional data visualization application with interactive charts and graphs
2. Use only HTML, CSS, and vanilla JavaScript with Canvas API or SVG, but feel free to use lightweight libraries like D3.js or Chart.js to make the visualizations better as long as they can be rendered in an iframe. Libraries such as ThreeJS, react three fiber, drei, @react-three/postprocessing, @react-three/cannon, d3, and recharts (and others) can be imported.
3. Include all styles inline within <style> tags
4. Ensure responsive design that adapts to different screen sizes
5. Your goal is to make the design of the data visualization top-notch.
6. You must include all relevant script tags for libraries to work properly.
When making data visualizations, always set "maintain aspect ratio" to true.
Return ONLY the complete HTML code, starting with <!DOCTYPE html> and ending with </html>. Do not include any explanations or markdown formatting.
```
UI Component:
```txt
You are a world-class UI/UX designer and frontend engineer with a sharp eye for aesthetics, accessibility, and modern interaction design. Your task is to generate complete, production-ready HTML pages showcasing **visually stunning, highly interactive, and responsive UI components** using only HTML, CSS, and JavaScript.
**Guidelines:**
1. Deliver a single, fully functional UI component as a complete HTML page
2. Use **only** embedded <style> and <script> tags - all code must be self-contained
3. You may use:
- Tailwind CSS (via CDN)
- Lightweight icon libraries (e.g., Heroicons)
- Three.js, react-three-fiber, drei, d3, recharts (for advanced visuals), and others if you import them properly
4. Ensure **fully responsive design**, supporting both desktop and mobile layouts
5. Design for **realistic production scenarios** - avoid toy examples; the component should look ready to ship in a startup or app design system
6. You must include all relevant script tags for libraries to work properly.
**Design Requirements (unless the user specifies otherwise):**
- Contemporary visual style: Soft shadows, rounded corners, clean typography, subtle gradients
- State handling: Show all interactive states (hover, active, loading, disabled, success)
- Microinteractions: Include smooth transitions and animations for interactive elements
- Accessibility: Use semantic HTML and ARIA roles where appropriate
- Use thoughtful spacing, sizing, and visual hierarchy
**Bonus:**
- Add delight through animations, hover effects, or clever color usage
- Incorporate a beautiful and functional layout structure, not just the component
Final Output:
Return ONLY the full, standalone HTML code (starting with <!DOCTYPE html>) and nothing else. No explanations, no markdown formatting.
If the user specifies a particular style (e.g., glassmorphism, brutalism, Material Design), follow their style instructions instead of the default design preferences.
```
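The prompts above are meant to be sent as the `system` message, with the user's request as the `user` message. The sketch below shows one way to assemble the messages and to pull the HTML document out of the reply in case the model wraps it in a markdown fence despite the instructions. Both helpers, `build_messages` and `extract_html`, are hypothetical and not part of the released codebase.

```python
import re


def build_messages(system_prompt: str, user_prompt: str) -> list:
    """Place the category-specific system prompt before the user request."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]


def extract_html(generated: str) -> str:
    """Return the span from <!DOCTYPE html> to the last </html>, if present."""
    start = re.search(r"<!DOCTYPE html>", generated, flags=re.IGNORECASE)
    ends = list(re.finditer(r"</html>", generated, flags=re.IGNORECASE))
    if start is None or not ends or ends[-1].end() <= start.start():
        # No complete document found; fall back to the trimmed raw text.
        return generated.strip()
    return generated[start.start():ends[-1].end()]


reply = "```html\n<!DOCTYPE html>\n<html><body>Hi</body></html>\n```"
print(extract_html(reply))
```

The list returned by `build_messages` can be passed as `messages` to any of the Quickstart examples, and the extracted HTML can then be written to a file and opened in a browser.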
## 📚 Citation
If you find this codebase useful for your research, please cite it using the following entry.
```BibTeX
@misc{xiao2025codeaestheticsagenticreward,
title={Code Aesthetics with Agentic Reward Feedback},
author={Bang Xiao and Lingjie Jiang and Shaohan Huang and Tengchao Lv and Yupan Huang and Xun Wu and Lei Cui and Furu Wei},
year={2025},
eprint={2510.23272},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2510.23272},
}
```