Add model card and metadata for CES

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +46 -3
README.md CHANGED
@@ -1,3 +1,46 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ library_name: transformers
+ pipeline_tag: image-text-to-text
+ tags:
+ - gui-agent
+ - reinforcement-learning
+ - multi-agent
+ - vlm
+ ---
+
+ # Training High-Level Schedulers with Execution-Feedback Reinforcement Learning for Long-Horizon GUI Automation
+
+ This repository provides the models and code for the **CES (Coordinator-Executor-State Tracker)** framework, a multi-agent system designed to handle long-horizon GUI automation tasks.
+
+ - **Paper:** [Training High-Level Schedulers with Execution-Feedback Reinforcement Learning for Long-Horizon GUI Automation](https://huggingface.co/papers/2511.22235)
+ - **Code:** [Official GitHub Repository](https://github.com/hehehahi4/CES)
+
+ ## Introduction
+
+ The CES framework addresses the limitations of single-agent GUI models in long-horizon tasks by decoupling high-level strategic planning from low-level execution. The system consists of three specialized components:
+
+ * **Coordinator:** Responsible for strategic planning and task decomposition.
+ * **State Tracker:** Compresses context and manages task information to maintain task coherence and state awareness.
+ * **Executor:** A low-level model (such as [GUI-R1](https://github.com/ritzz-ai/GUI-R1)) that executes the grounded operations.
+
+ The high-level scheduling modules (Coordinator and State Tracker) are trained using a staged execution-feedback reinforcement learning algorithm. This design makes them generalizable, plug-and-play modules that can be paired with different Executor models to improve their planning and state management.
+
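To make the division of labor among the three components concrete, here is a minimal Python sketch of one possible control loop. It is not taken from the CES codebase: the function signatures, the `"DONE"` stop signal, the `Step` record, and the stub models are all hypothetical stand-ins chosen for illustration, and the real components are vision-language models rather than plain functions.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    instruction: str  # sub-instruction issued by the Coordinator
    action: str       # grounded action returned by the Executor


def run_episode(
    task: str,
    coordinator: Callable[[str, str], str],   # (task, state summary) -> next sub-instruction
    executor: Callable[[str, str], str],      # (sub-instruction, screenshot) -> action
    tracker: Callable[[str, str, str], str],  # (summary, sub-instruction, action) -> new summary
    observe: Callable[[], str],               # returns the current screenshot (hypothetical)
    max_steps: int = 20,
) -> List[Step]:
    """Run the plan -> execute -> summarize loop until the Coordinator stops."""
    summary = ""
    history: List[Step] = []
    for _ in range(max_steps):
        instruction = coordinator(task, summary)
        if instruction == "DONE":  # assumed stop signal, not from the paper
            break
        action = executor(instruction, observe())
        # The State Tracker folds the latest step into a compact summary,
        # so the Coordinator never needs the full interaction history.
        summary = tracker(summary, instruction, action)
        history.append(Step(instruction, action))
    return history


# Stub components standing in for the real models (illustration only).
PLAN = ["open settings", "tap Wi-Fi"]


def stub_coordinator(task: str, summary: str) -> str:
    done = len(summary.splitlines())  # one summary line per finished step
    return PLAN[done] if done < len(PLAN) else "DONE"


def stub_executor(instruction: str, screenshot: str) -> str:
    return "act:" + instruction


def stub_tracker(summary: str, instruction: str, action: str) -> str:
    line = f"{instruction} -> {action}"
    return f"{summary}\n{line}" if summary else line


history = run_episode(
    "enable Wi-Fi", stub_coordinator, stub_executor, stub_tracker,
    observe=lambda: "screenshot",
)
```

Because the loop only depends on the three call signatures, swapping in a different Executor leaves the Coordinator and State Tracker untouched, which is the plug-and-play property described above.
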
+ ## Key Features
+
+ * **Multi-Agent Decoupling:** Separates high-level reasoning from execution, resolving responsibility coupling and capability conflicts.
+ * **State Context Compression:** Addresses the loss of state awareness in long-horizon tasks through dynamic summarization.
+ * **Staged Execution-Feedback RL:** A training strategy that freezes a pre-trained Executor and uses its reward signals to train the scheduling models.
+
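The execution-feedback idea can be illustrated with a small Python sketch of a reward function. This is an assumption-laden illustration rather than the paper's reward: the exact-match comparison, the `reference_action` argument, and the stub Executor are hypothetical, and the actual reward design and staging schedule are defined in the paper and official code.

```python
from typing import Callable


def execution_feedback_reward(
    instruction: str,
    screenshot: str,
    reference_action: str,
    frozen_executor: Callable[[str, str], str],
) -> float:
    """Score a scheduler's sub-instruction by what the frozen Executor does with it.

    The Executor's weights are never updated; only its output is used as feedback.
    Exact string match against a reference action is an assumed scoring rule.
    """
    predicted_action = frozen_executor(instruction, screenshot)
    return 1.0 if predicted_action == reference_action else 0.0


def stub_frozen_executor(instruction: str, screenshot: str) -> str:
    # Hypothetical Executor: succeeds only on clear, specific instructions.
    return "click(wifi_toggle)" if "Wi-Fi" in instruction else "noop()"


good = execution_feedback_reward(
    "Tap the Wi-Fi toggle", "screenshot", "click(wifi_toggle)", stub_frozen_executor
)
bad = execution_feedback_reward(
    "Do the network thing", "screenshot", "click(wifi_toggle)", stub_frozen_executor
)
```

In this sketch a sub-instruction earns reward only when the frozen Executor can turn it into the right action, so the scheduler being trained is pushed toward instructions the Executor can actually carry out.
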
+ ## Citation
+
+ If you find this work helpful, please cite the following paper:
+
+ ```bibtex
+ @article{deng2025training,
+   title={Training High-Level Schedulers with Execution-Feedback Reinforcement Learning for Long-Horizon GUI Automation},
+   author={Deng, Zehao and Ju, Tianjie and Wu, Zheng and Zhang, Zhuosheng and Liu, Gongshen},
+   journal={arXiv preprint arXiv:2511.22235},
+   year={2025}
+ }
+ ```