Update README.md
README.md
We trained four models using RLinf:
### Benchmark Results
SFT models for LIBERO-90 and LIBERO-130 are trained by ourselves following the training recipe from [OpenVLA-OFT](https://github.com/moojink/openvla-oft/blob/main/vla-scripts/finetune.py). The other SFT models are from [SimpleVLA-RL](https://huggingface.co/collections/Haozhan72/simplevla-rl-6833311430cd9df52aeb1f86).
> We evaluate each model according to its training configuration, using libero_seed = 0 and evaluating 500 episodes each for the Object, Spatial, Goal, and Long suites, 4,500 episodes for LIBERO-90, and 6,500 episodes for LIBERO-130.
> For the SFT-trained (LoRA-based) models, we set do_sample = False.
> For the RL-trained models, we set do_sample = True, temperature = 1.6, and rollout_epoch = 2; the final results are reported as the average across the two runs.
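
The two sampling setups above can be sketched in plain Python. This is a hypothetical illustration only: the dict keys mirror the parameter names quoted in the notes (do_sample, temperature, rollout_epoch), not RLinf's actual configuration schema, and `averaged_success_rate` is an assumed helper, not part of the codebase.

```python
# Illustrative sketch only: key names mirror the README's notes, but these
# dicts are NOT RLinf's actual configuration schema.
sft_eval = {"do_sample": False}  # greedy decoding for SFT (LoRA-based) models
rl_eval = {"do_sample": True, "temperature": 1.6, "rollout_epoch": 2}

def averaged_success_rate(per_epoch_rates):
    """Mean of a metric over rollout epochs; the RL numbers are reported
    as the average across the two evaluation runs."""
    return sum(per_epoch_rates) / len(per_epoch_rates)
```
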