Update README.md
README.md
CHANGED
@@ -23,9 +23,12 @@ This is a Hugging Face Space that hosts a leaderboard for comparing model perfor
 
 ## Instructions
 
-
-
-
+* Please refer to our GitHub repository at https://github.com/patronus-ai/trail-benchmark for step‑by‑step instructions on how to run your model with the TRAIL dataset.
+* Please upload a zip file containing your model outputs. The zip file should contain:
+  - One or more directories with model outputs
+  - Each directory should contain JSON files with the model's predictions
+  - Directory names should indicate the split (GAIA_ or SWE_)
+* Once the evaluation is complete, we’ll upload the scores (this process will soon be automated).
 
 ## Benchmarking on TRAIL
 
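
The upload instructions added in this change describe a zip layout (split-named directories holding JSON prediction files). A minimal sketch of a local pre-upload check follows. It is not part of the TRAIL repository: the function name `check_submission` is hypothetical, and it assumes that the split is marked by a `GAIA_` or `SWE_` prefix on the top-level directory name and that prediction files end in `.json`. The leaderboard's actual validation may differ.

```python
import io
import json
import zipfile
from pathlib import PurePosixPath

# Assumption: the split is indicated by a prefix on the top-level directory name.
SPLIT_PREFIXES = ("GAIA_", "SWE_")


def check_submission(zip_source):
    """Return a list of problems found in a submission zip (empty if none).

    `zip_source` may be a path or a seekable file-like object.
    """
    problems = []
    valid_dirs = set()
    with zipfile.ZipFile(zip_source) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue  # only files are checked; directories are implied
            path = PurePosixPath(info.filename)
            if len(path.parts) < 2:
                problems.append(f"{info.filename}: file is not inside a directory")
                continue
            top = path.parts[0]
            if not top.startswith(SPLIT_PREFIXES):
                problems.append(f"{info.filename}: directory '{top}' lacks a GAIA_ or SWE_ prefix")
                continue
            if path.suffix.lower() != ".json":
                problems.append(f"{info.filename}: not a .json file")
                continue
            try:
                json.loads(zf.read(info))  # json.loads accepts bytes
            except ValueError as exc:  # covers invalid JSON and bad encodings
                problems.append(f"{info.filename}: invalid JSON ({exc})")
                continue
            valid_dirs.add(top)
    if not valid_dirs:
        problems.append("no GAIA_ or SWE_ directory with valid JSON files found")
    return problems


if __name__ == "__main__":
    # Build a tiny in-memory zip with made-up file names and check it.
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("GAIA_outputs/example.json", '{"prediction": "placeholder"}')
    buf.seek(0)
    for problem in check_submission(buf):
        print(problem)
```

Running the check before uploading catches misnamed directories or malformed JSON early; the file and directory names in the usage block are placeholders only.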