# SkillZero Best Checkpoints Export

This package contains the two best checkpoints selected by validation success rate from the completed SkillZero runs.

## Included Checkpoints

1. ALFWorld `global_step_160`
   - Validation metric: `val/success_rate = 0.594`
   - Archive path: `checkpoints/SkillZero_alfworld/skillzero_alfworld_vl_3b_safe/global_step_160`
2. ALFWorld `global_step_150`
   - Validation metric: `val/success_rate = 0.477`
   - Archive path: `checkpoints/SkillZero_alfworld/skillzero_alfworld_vl_3b_safe/global_step_150`

## Related Search Checkpoint

The best Search checkpoint by validation success rate is not included in the "top two overall" package, but is useful for reproducing the Search run:

- Search `global_step_180`
  - Validation metric: `val/success_rate = 0.356`
  - Test metrics:
    - `test/full_skill/success_rate = 0.282`
    - `test/no_skill/success_rate = 0.310`
  - Checkpoint path, if packaged separately: `checkpoints/SkillZero_search/skillzero_search_vl_3b_local_retriever/global_step_180`

## Hardware Used

Training was submitted through Slurm on the `a100` partition.

- ALFWorld:
  - GPUs: 4 x A100
  - CPUs per task: 32
  - Memory: 200GB
  - Time limit: 2 days
- Search local retriever:
  - GPUs: 4 x A100 allocated
    - Training used GPUs 0,1,2
    - Local retriever used GPU 3
  - CPUs per task: 32
  - Memory: 220GB
  - Time limit: 2 days

## Runtime Notes

- Python environment name used on the cluster: `skillzero`
- Retriever environment name: `retriever`
- Main model: `Qwen/Qwen2.5-VL-3B-Instruct`
- Training entry point: `python3 -m verl.trainer.main_ppo`
- Original training logs are not required to use the checkpoints.

## Restore

After extracting the archive, place checkpoint directories under:

```bash
checkpoints/SkillZero_alfworld/skillzero_alfworld_vl_3b_safe/
```

Then use `trainer.resume_mode=resume_path` and set `trainer.resume_from_path` to the target `global_step_*` directory.
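
The restore steps can be sketched as a small shell snippet. This is a minimal sketch, not the original launch script: it assumes you run it from the directory where the archive was extracted, and the variable names (`CKPT_ROOT`, `STEP`, `CKPT_DIR`, `CMD`) are illustrative. It only prints the resume command, because the remaining overrides of the original run (data, model, rollout settings) are not listed in this README and would need to be supplied alongside the two shown here.

```shell
#!/bin/sh
set -eu

# Assumption: the archive was extracted into the current working directory.
CKPT_ROOT="checkpoints/SkillZero_alfworld/skillzero_alfworld_vl_3b_safe"
STEP="global_step_160"   # best ALFWorld checkpoint by val/success_rate
CKPT_DIR="${CKPT_ROOT}/${STEP}"

# Warn (without aborting) if the checkpoint directory is not where we expect it.
[ -d "${CKPT_DIR}" ] || echo "warning: ${CKPT_DIR} not found" >&2

# The two overrides named in the Restore section, appended to the
# training entry point from Runtime Notes.
CMD="python3 -m verl.trainer.main_ppo trainer.resume_mode=resume_path trainer.resume_from_path=${CKPT_DIR}"

# Print rather than execute: the rest of the run configuration must be added first.
echo "${CMD}"
```

To resume from the second checkpoint instead, set `STEP` to `global_step_150`.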