Commit 4b7e335
Parent(s): 1e570e7
Add custom banner, upgrade conclusion style
- README.md +10 -5
- banner.png +0 -0
README.md
CHANGED
@@ -11,12 +11,9 @@ language:
 ---
 
 <p align="center">
-<img src="
+<img src="banner.png" width="100%" alt="ComtradeBench — An OpenEnv Benchmark for Reliable LLM Tool-Use"/>
 </p>
 
-<h1 align="center">ComtradeBench</h1>
-<h3 align="center">An OpenEnv Benchmark for Reliable LLM Tool-Use</h3>
-
 <p align="center">
 <a href="https://github.com/yonghongzhang-io/comtrade-openenv">
 <img src="https://img.shields.io/badge/GitHub-Repository-181717?logo=github" alt="GitHub"/>
@@ -239,7 +236,15 @@ All benchmark data is generated procedurally from a seeded PRNG — no external
 
 ## Conclusion
 
-
+<div align="center">
+<br>
+
+| |
+|:---:|
+| **Can an agent still finish the job when the API fights back?** |
+
+<br>
+</div>
 
 That question matters far beyond trade data. It applies to any agent expected to operate against real interfaces with pagination, retries, noisy outputs, and resource limits.
 
banner.png
ADDED