Update README.md
README.md CHANGED

@@ -133,7 +133,7 @@ processor.save_pretrained(SAVE_DIR)
 
 ## Evaluation
 
-The model was evaluated on the OpenLLMv1 leaderboard task, using [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness), on reasoning tasks using [lighteval](https://github.com/huggingface/lighteval).
+The model was evaluated on the OpenLLMv1 leaderboard tasks using [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness), on reasoning tasks using [lighteval](https://github.com/huggingface/lighteval), and on vision tasks using [lmms-eval](https://github.com/EvolvingLMMs-Lab/lmms-eval).
 [vLLM](https://docs.vllm.ai/en/stable/) was used for all evaluations.
 
 <details>
@@ -173,6 +173,15 @@ The model was evaluated on the OpenLLMv1 leaderboard task, using [lm-evaluation-
   --tasks lighteval|aime25|0 \
 ```
 
+**lmms-eval**
+```
+python3 -m lmms_eval \
+  --model vllm \
+  --model_args model=RedHatAI/Qwen3-VL-235B-A22B-Instruct-FP8-dynamic,tensor_parallel_size=4,max_model_len=8192,gpu_memory_utilization=0.9 \
+  --tasks mmmu_val,chartqa \
+  --batch_size 1
+```
+
 </details>
 
 ### Accuracy
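The `--tasks lighteval|aime25|0` argument in the hunk above uses lighteval's pipe-separated task-spec string. As a minimal sketch of how such a spec breaks down, assuming the three fields are suite name, task name, and few-shot count (an illustrative reading, not stated in this diff):

```python
# Illustrative parser for a lighteval-style task spec such as "lighteval|aime25|0".
# The field meanings (suite, task, few-shot count) are an assumption for
# illustration, not taken from this README.
def parse_task_spec(spec: str) -> dict:
    suite, task, few_shot = spec.split("|")
    return {"suite": suite, "task": task, "few_shot": int(few_shot)}

print(parse_task_spec("lighteval|aime25|0"))
# {'suite': 'lighteval', 'task': 'aime25', 'few_shot': 0}
```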