Update README.md
README.md
CHANGED
@@ -6,17 +6,14 @@ colorTo: blue
 sdk: static
 pinned: false
 ---
+## Model/Data associated with research project *Autonomous Evaluation and Refinement of Digital Agents*.
 
 ### [Paper - Stay tuned for the Monday release!]() | [Project Page](https://berkeley-nlp.github.io/agent-eval-refine) | [Code](https://github.com/Berkeley-NLP/Agent-Eval-Refine)
 
-Model/Data associated with research project *Autonomous Evaluation and Refinement of Digital Agents*.
 
 TLDR: We explore the design and use of model-based evaluators to both evaluate and autonomously refine the performance of digital agents. Experiments show that domain-general automated evaluators can significantly improve the performance of digital agents, without any extra supervision.
 
 
 [Jiayi Pan](https://www.jiayipan.me/), [Yichi Zhang](https://sled.eecs.umich.edu/author/yichi-zhang/), [Nicholas Tomlin](https://people.eecs.berkeley.edu/~nicholas_tomlin/), [Yifei Zhou](https://yifeizhou02.github.io/), [Sergey Levine](https://people.eecs.berkeley.edu/~svlevine/), [Alane Suhr](https://www.alanesuhr.com/)
 
-UC Berkeley, University of Michigan
-
----
-
+UC Berkeley, University of Michigan