Add model card and metadata for CES
#1
by nielsr HF Staff - opened
README.md
CHANGED
@@ -1,3 +1,46 @@
----
-license: apache-2.0
-
+---
+license: apache-2.0
+library_name: transformers
+pipeline_tag: image-text-to-text
+tags:
+- gui-agent
+- reinforcement-learning
+- multi-agent
+- vlm
+---
+
+# Training High-Level Schedulers with Execution-Feedback Reinforcement Learning for Long-Horizon GUI Automation
+
+This repository provides the models and code for the **CES (Coordinator-Executor-State Tracker)** framework, a multi-agent system designed to handle long-horizon GUI automation tasks.
+
+- **Paper:** [Training High-Level Schedulers with Execution-Feedback Reinforcement Learning for Long-Horizon GUI Automation](https://huggingface.co/papers/2511.22235)
+- **Code:** [Official GitHub Repository](https://github.com/hehehahi4/CES)
+
+## Introduction
+
+The CES framework addresses the limitations of single-agent GUI models in long-horizon tasks by decoupling high-level strategic planning from low-level execution. The system consists of three specialized components:
+
+* **Coordinator:** Responsible for strategic planning and task decomposition.
+* **State Tracker:** Handles context compression and information management to maintain task coherence and state awareness.
+* **Executor:** A low-level model (such as [GUI-R1](https://github.com/ritzz-ai/GUI-R1)) that executes the grounded operations.
+
+The high-level scheduling modules (Coordinator and State Tracker) are trained using a staged execution-feedback reinforcement learning algorithm. This design makes them generalizable, plug-and-play modules that can be integrated with various Executor models to enhance their planning and state management capabilities.
+
+## Key Features
+
+* **Multi-Agent Decoupling:** Decouples high-level reasoning from execution, resolving responsibility coupling and capability conflicts.
+* **State Context Compression:** Resolves state unawareness problems in long-horizon tasks through dynamic summarization.
+* **Staged Execution-Feedback RL:** A training strategy that freezes a pre-trained Executor and uses its reward signals to train the scheduling models.
+
+## Citation
+
+If you find this work helpful, please cite the following paper:
+
+```bibtex
+@article{deng2025training,
+  title={Training High-Level Schedulers with Execution-Feedback Reinforcement Learning for Long-Horizon GUI Automation},
+  author={Deng, Zehao and Ju, Tianjie and Wu, Zheng and Zhang, Zhuosheng and Liu, Gongshen},
+  journal={arXiv preprint arXiv:2511.22235},
+  year={2025}
+}
+```
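
Note on the Introduction added by this change: the Coordinator / State Tracker / Executor split it describes can be read as a simple control loop. The sketch below is only an illustration of that reading and is not taken from the CES repository. Every name in it (`run_ces`, `Step`, `DONE`, the `toy_*` functions) is hypothetical, the three components are plain Python callables standing in for the actual models, and screenshots or other GUI observations are omitted for brevity.

```python
from dataclasses import dataclass
from typing import Callable, List

DONE = "DONE"  # hypothetical sentinel the coordinator returns when the task is finished


@dataclass
class Step:
    subgoal: str  # high-level instruction produced by the coordinator
    action: str   # low-level operation produced by the executor


def run_ces(
    task: str,
    coordinator: Callable[[str, str], str],
    executor: Callable[[str], str],
    state_tracker: Callable[[str, str, str], str],
    max_steps: int = 10,
) -> List[Step]:
    """Run a coordinator -> executor -> state tracker loop (illustrative only)."""
    state = ""  # compressed summary of progress, owned by the state tracker
    history: List[Step] = []
    for _ in range(max_steps):
        subgoal = coordinator(task, state)             # plan the next subgoal
        if subgoal == DONE:
            break
        action = executor(subgoal)                     # ground the subgoal into an operation
        state = state_tracker(state, subgoal, action)  # update the compressed summary
        history.append(Step(subgoal, action))
    return history


# Toy stand-ins so the loop can be exercised without any model.
PLAN = ["open settings", "tap wifi", "enable wifi"]


def toy_coordinator(task: str, state: str) -> str:
    # Pick the first planned subgoal that the summary does not mention yet.
    for subgoal in PLAN:
        if subgoal not in state:
            return subgoal
    return DONE


def toy_executor(subgoal: str) -> str:
    return f"click<{subgoal}>"


def toy_state_tracker(state: str, subgoal: str, action: str) -> str:
    # Append the finished subgoal to the running summary.
    return f"{state}; {subgoal}" if state else subgoal


if __name__ == "__main__":
    for step in run_ces("turn on wifi", toy_coordinator, toy_executor, toy_state_tracker):
        print(step.subgoal, "->", step.action)
```

Because the Executor is passed in as an argument, swapping it for a different low-level model leaves the loop unchanged, which is one way to picture the plug-and-play claim in the model card.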