## 📊 Evaluation
A collection of 10 benchmarks:
| Model | VQAv2 | GQA | VizWiz | SQA | TextVQA | POPE | MME | MM-Bench | MM-Bench-cn | MM-Vet |
|:-----------------------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:----------:|:--------:|:-----------:|:--------:|
| LLaVA-1.5-7b | 78.5 | 62.0 | **50.0** | 66.8 | 58.2 | 85.9 | **1510.7** | 64.3 | 58.3 | 31.1 |
| Spatial-LLaVA-7b | **79.7** | **62.7** | 48.7 | **68.7** | **58.5** | **87.2** | 1472.7 | **67.8** | **60.7** | **31.6** |
|:-----------------------:|:------------------------:|:----------------------------:|:--------------------------:|:------------------:|:-------------------:|:----------------------:|
| LLaVA-1.5-7b | 12.90 / 1.06 | 10.68 / 2.03 | 20.79 / 0.94 | **24.19 / 0.50** | 14.29 / 5.27 | 10.23 / 58.33 |
| Spatial-LLaVA-7b | **24.19 / 0.57** | **14.56 / 0.62** | **41.58 / 0.42** | 22.58 / 1.12 | **18.25 / 2.92** | **20.45 / 56.47** |
## 🙏 Acknowledgements
We thank Liu Haotian et al. for the LLaVA pretraining script, weights, and the LLaVA-v1.5 mixture dataset; the teams behind CLEVR, TextCaps, VisualMRC, and VQAv2 (via "HuggingFaceM4/the_cauldron"); remyxai for OpenSpaces; Anjie Cheng et al. for Spatial-Bench and its data pipeline; Google for OpenImages; and Hugging Face for their datasets infrastructure.