Commit 797106d by Valentin Buchner (1 parent: 03b54a3)
remove markdown leaderboard, add huggingface space as remote
- .github/scripts/generate-leaderboard.py +0 -63
- .github/workflows/update-leaderboard.yaml +0 -31
- README.md +2 -3
- leaderboard/Leaderboard.md +0 -32
.github/scripts/generate-leaderboard.py
DELETED
@@ -1,63 +0,0 @@
-import json
-
-
-def generate_markdown(data):
-    markdown = """<div align="center">\n\n"""
-    markdown += "# 🔥🏅️GenCeption Leaderboard 🏅️🔥\n\n"
-    markdown += """\n\n</div>\n\n"""
-    markdown += "#### GC@3 scores for different models and categories:\n"
-    markdown += "| Model | **Mean** | Exist. | Count | Posi. | Col. | Post. | Cel. | Sce. | Lan. | Art. | Comm. | **Vis Mean** | Code | Num. | Tran. | OCR | **Text Mean** |\n"
-    markdown += (
-        "|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|\n"
-    )
-    for model in data["models"]:
-        scores = model["scores"]
-        markdown += f"| [{model['name']}]({model['url']}) "
-        for score_key in [
-            "Mean",
-            "Exist",
-            "Count",
-            "Posi",
-            "Col",
-            "Post",
-            "Cel",
-            "Sce",
-            "Lan",
-            "Art",
-            "Comm",
-            "VisMean",
-            "Code",
-            "Num",
-            "Tran",
-            "OCR",
-            "TextMean",
-        ]:
-            if "Mean" in score_key:
-                markdown += f"| **{scores[score_key]}** "
-            else:
-                markdown += f"| {scores[score_key]} "
-        markdown += "|\n"
-    markdown += """\n\nLegend:
-- Exist.: Existence
-- Count: Count
-- Posi.: Position
-- Col.: Color
-- Post.: Poster
-- Cel.: Celebrity
-- Sce.: Scene
-- Lan.: Landmark
-- Art.: Artwork
-- Com. R.: Commonsense Reasoning
-- Code: Code Reasoning
-- Num.: Numerical Calculation
-- Tran.: Text Translation
-- OCR: OCR"""
-    return markdown
-
-
-if __name__ == "__main__":
-    with open("leaderboard/leaderboard.json", "r") as f:
-        data = json.load(f)
-    markdown_content = generate_markdown(data)
-    with open("leaderboard/Leaderboard.md", "w") as f:
-        f.write(markdown_content)
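For readers reconstructing what the removed script consumed, here is a minimal sketch of the input shape implied by the deleted code: a top-level `models` list whose entries carry `name`, `url`, and a `scores` mapping keyed by the short category names. The sample entry, its URL, its scores, and the helper name `render_row` are hypothetical and used only for illustration; the actual `leaderboard/leaderboard.json` is not part of this diff.

```python
# Score keys in the column order used by the deleted script.
SCORE_KEYS = [
    "Mean", "Exist", "Count", "Posi", "Col", "Post", "Cel", "Sce", "Lan",
    "Art", "Comm", "VisMean", "Code", "Num", "Tran", "OCR", "TextMean",
]


def render_row(model):
    """Render one table row; cells whose key contains "Mean" are bolded."""
    cells = [f"[{model['name']}]({model['url']})"]
    for key in SCORE_KEYS:
        value = model["scores"][key]
        cells.append(f"**{value}**" if "Mean" in key else f"{value}")
    return "| " + " | ".join(cells) + " |"


# Hypothetical entry: name, URL, and scores are placeholders, not real results.
example = {
    "name": "ExampleModel",
    "url": "https://example.com/model",
    "scores": {key: 0.5 for key in SCORE_KEYS},
}
row = render_row(example)
print(row)
```

Because the bolding test is a substring check on the key, `Mean`, `VisMean`, and `TextMean` are all bolded, which matches the three bold columns in the removed table.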
.github/workflows/update-leaderboard.yaml
DELETED
@@ -1,31 +0,0 @@
-name: Update Markdown from JSON
-
-on:
-  push:
-    paths:
-      - 'leaderboard/leaderboard.json'
-
-jobs:
-  update-markdown:
-    runs-on: ubuntu-latest
-    steps:
-    - uses: actions/checkout@v4 # Updated to the latest version at the time of writing
-    - name: Set up Python
-      uses: actions/setup-python@v5
-      with:
-        python-version: '3.10'
-    - name: Install dependencies
-      run: |
-        python -m pip install --upgrade pip
-        pip install pyyaml
-    - name: Generate Markdown
-      run: python .github/scripts/generate-leaderboard.py
-    - name: Commit and push if changed
-      env:
-        GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
-      run: |
-        git config --global user.email "action@github.com"
-        git config --global user.name "GitHub Action"
-        git add leaderboard/Leaderboard.md
-        git commit -m "Update Leaderboard.md from JSON data" -a || echo "No changes to commit"
-        git push
README.md
CHANGED
@@ -2,10 +2,9 @@
 
 <div>
 <p align="center">
-<a href="https://
+<a href="https://huggingface.co/spaces/valbuc/GenCeption">🔥🏅️🤗 Leaderboard🏅️🔥</a> • 
 <a href="#contribute">Contribute</a> • 
 <a href="https://arxiv.org/abs/2402.14973">Paper</a> • 
-<a href="">HF Space 🤗</a> • 
 <a href="#cite-this-work">Citation</a>
 </p>
 
@@ -23,7 +22,7 @@ We demostrate a 5-iteration GenCeption procedure below run on a seed images to e
 
 
 ## Contribute
-Please add your model details and results to `leaderboard/leaderboard.json` and **create a PR (Pull-Request)** to contribute your results to the [🔥🏅️**Leaderboard**🏅️🔥](https://
+Please add your model details and results to `leaderboard/leaderboard.json` and **create a PR (Pull-Request)** to contribute your results to the [🔥🏅️**Leaderboard**🏅️🔥](https://huggingface.co/spaces/). Start by creating your virtual environment:
 
 ```{bash}
 conda create --name genception python=3.10 -y
leaderboard/Leaderboard.md
DELETED
@@ -1,32 +0,0 @@
-<div align="center">
-
-# 🔥🏅️GenCeption Leaderboard 🏅️🔥
-
-
-
-</div>
-
-#### GC@3 scores for different models and categories:
-| Model | **Mean** | Exist. | Count | Posi. | Col. | Post. | Cel. | Sce. | Lan. | Art. | Comm. | **Vis Mean** | Code | Num. | Tran. | OCR | **Text Mean** |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| [ChatGPT-4V](https://cdn.openai.com/papers/GPTV_System_Card.pdf) | **0.351** | 0.422 | 0.404 | 0.408 | 0.403 | 0.324 | 0.332 | 0.393 | 0.353 | 0.421 | 0.471 | **0.393** | 0.193 | 0.24 | 0.157 | 0.393 | **0.246** |
-| [mPLUG-Owl2](https://arxiv.org/pdf/2311.04257.pdf) | **0.257** | 0.323 | 0.299 | 0.306 | 0.29 | 0.243 | 0.232 | 0.299 | 0.275 | 0.252 | 0.353 | **0.287** | 0.176 | 0.192 | 0.081 | 0.276 | **0.181** |
-| [LLaVA-13B](https://arxiv.org/pdf/2304.08485.pdf) | **0.238** | 0.305 | 0.294 | 0.255 | 0.3 | 0.215 | 0.206 | 0.277 | 0.242 | 0.212 | 0.334 | **0.264** | 0.144 | 0.195 | 0.116 | 0.239 | **0.174** |
-| [LLaVA-7B](https://arxiv.org/pdf/2304.08485.pdf) | **0.225** | 0.308 | 0.253 | 0.285 | 0.284 | 0.214 | 0.188 | 0.266 | 0.252 | 0.21 | 0.294 | **0.255** | 0.107 | 0.155 | 0.111 | 0.222 | **0.149** |
-
-
-Legend:
-- Exist.: Existence
-- Count: Count
-- Posi.: Position
-- Col.: Color
-- Post.: Poster
-- Cel.: Celebrity
-- Sce.: Scene
-- Lan.: Landmark
-- Art.: Artwork
-- Com. R.: Commonsense Reasoning
-- Code: Code Reasoning
-- Num.: Numerical Calculation
-- Tran.: Text Translation
-- OCR: OCR
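The summary columns in the removed table appear to be plain averages: for the ChatGPT-4V row, **Vis Mean** matches the average of the ten visual categories, **Text Mean** the average of the four text categories, and **Mean** the average of all fourteen. The sketch below checks this against the values in that row. The averaging rule is inferred from the numbers, not stated anywhere in this commit, and the helper name `average` is illustrative.

```python
# Values copied from the ChatGPT-4V row of the removed table.
visual = [0.422, 0.404, 0.408, 0.403, 0.324, 0.332, 0.393, 0.353, 0.421, 0.471]
text = [0.193, 0.24, 0.157, 0.393]
reported = {"Mean": 0.351, "VisMean": 0.393, "TextMean": 0.246}


def average(values):
    """Arithmetic mean of a non-empty list of floats."""
    return sum(values) / len(values)


# Assumption: each summary column is an unweighted mean over its categories.
computed = {
    "Mean": average(visual + text),
    "VisMean": average(visual),
    "TextMean": average(text),
}

for key, value in computed.items():
    # Compare with a tolerance, since the table values are rounded to 3 decimals.
    print(key, f"{value:.4f}", "reported:", reported[key])
    assert abs(value - reported[key]) < 1e-3
```

The comparison uses a tolerance because the table stores rounded values, so exact equality would be brittle.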