nm-research committed on
Commit 96efa89 · verified · 1 Parent(s): f452fe1

Update README.md

Files changed (1)
  1. README.md +10 -1
README.md CHANGED
@@ -133,7 +133,7 @@ processor.save_pretrained(SAVE_DIR)
  ## Evaluation


- The model was evaluated on the OpenLLMv1 leaderboard task, using [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness), on reasoning tasks using [lighteval](https://github.com/huggingface/lighteval).
+ The model was evaluated on the OpenLLMv1 leaderboard tasks using [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness), on reasoning tasks using [lighteval](https://github.com/huggingface/lighteval), and on vision tasks using [lmms-eval](https://github.com/EvolvingLMMs-Lab/lmms-eval).
  [vLLM](https://docs.vllm.ai/en/stable/) was used for all evaluations.

  <details>
@@ -173,6 +173,15 @@ The model was evaluated on the OpenLLMv1 leaderboard task, using [lm-evaluation-
  --tasks lighteval|aime25|0 \
  ```

+ **lmms-eval**
+ ```
+ python3 -m lmms_eval \
+ --model vllm \
+ --model_args model=RedHatAI/Qwen3-VL-235B-A22B-Instruct-FP8-dynamic,tensor_parallel_size=4,max_model_len=8192,gpu_memory_utilization=0.9 \
+ --tasks mmmu_val,chartqa \
+ --batch_size 1
+ ```
+
  </details>

  ### Accuracy
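
The lmms-eval invocation in this diff passes comma-separated values to `--model_args` and `--tasks`, and stray whitespace inside those values changes how the shell splits them into words. The following is a minimal sketch, not part of the README, of assembling the same arguments programmatically so the separators stay consistent. The helper name `build_lmms_eval_command` is hypothetical; the flags and values are copied from the diff and are not validated against lmms-eval here.

```python
import shlex


def build_lmms_eval_command(model_args, tasks, batch_size=1):
    """Assemble the lmms-eval invocation from the diff as an argv list.

    Hypothetical helper for illustration; the flag names are the ones shown
    in the README change.
    """
    # Join key=value pairs and task names with bare commas. A space after a
    # comma would make the shell split the value into two separate words.
    model_args_str = ",".join(f"{key}={value}" for key, value in model_args.items())
    tasks_str = ",".join(task.strip() for task in tasks)
    return [
        "python3", "-m", "lmms_eval",
        "--model", "vllm",
        "--model_args", model_args_str,
        "--tasks", tasks_str,
        "--batch_size", str(batch_size),
    ]


command = build_lmms_eval_command(
    model_args={
        "model": "RedHatAI/Qwen3-VL-235B-A22B-Instruct-FP8-dynamic",
        "tensor_parallel_size": 4,
        "max_model_len": 8192,
        "gpu_memory_utilization": 0.9,
    },
    tasks=["mmmu_val", "chartqa"],
)

# Print a shell-safe, single-line form of the command for copy and paste.
print(shlex.join(command))
```

Passing the resulting list to `subprocess.run(command, check=True)` would avoid shell word splitting altogether, assuming lmms-eval and vLLM are installed in the active environment.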