rogerxi committed on
Commit ea864cf · verified · 1 Parent(s): 37fd3cc

Update README.md

Files changed (1):
  1. README.md +4 -1
README.md CHANGED
@@ -30,7 +30,7 @@ Instruction following training: [rogerxi/LLaVA-Spatial-Instruct-850K](https://hu
 ## 📊 Evaluation
 A collection of 10 benchmarks:
 | Model | VQAv2 | GQA | VizWiz | SQA | TextVQA | POPE | MME | MM-Bench | MM-Bench-cn | MM-Vet |
-|:----------------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:----------:|:--------:|:-----------:|:--------:|
+|:-----------------------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:----------:|:--------:|:-----------:|:--------:|
 | LLaVA-1.5-7b | 78.5 | 62.0 | **50.0** | 66.8 | 58.2 | 85.9 | **1510.7** | 64.3 | 58.3 | 31.1 |
 | Spatial-LLaVA-7b | **79.7** | **62.7** | 48.7 | **68.7** | **58.5** | **87.2** | 1472.7 | **67.8** | **60.7** | **31.6** |

@@ -48,3 +48,6 @@ A collection of 10 benchmarks:
 |:-----------------------:|:------------------------:|:----------------------------:|:--------------------------:|:------------------:|:-------------------:|:----------------------:|
 | LLaVA-1.5-7b | 12.90 / 1.06 | 10.68 / 2.03 | 20.79 / 0.94 | **24.19 / 0.50** | 14.29 / 5.27 | 10.23 / 58.33 |
 | Spatial-LLaVA-7b | **24.19 / 0.57** | **14.56 / 0.62** | **41.58 / 0.42** | 22.58 / 1.12 | **18.25 / 2.92** | **20.45 / 56.47** |
+
+## 🙏 Acknowledgements
+We thank Liu Haotian et al. for the LLaVA pretrained script, weights and LLaVA-v1.5 mixture dataset; the teams behind CLEVR, TextCaps, VisualMRC and VQAv2 (via "HuggingFaceM4/the_cauldron"); remyxai for OpenSpaces; Anjie Cheng et al. for Spatial-Bench and data pipeline; Google for OpenImages; and Hugging Face for their datasets infrastructure.