SkillZero Best Checkpoints Export

This package contains the two best checkpoints from the completed SkillZero runs, ranked by validation success rate (val/success_rate). Both come from the ALFWorld run.

Included Checkpoints

  1. ALFWorld global_step_160

    • Validation metric: val/success_rate = 0.594
    • Archive path: checkpoints/SkillZero_alfworld/skillzero_alfworld_vl_3b_safe/global_step_160
  2. ALFWorld global_step_150

    • Validation metric: val/success_rate = 0.477
    • Archive path: checkpoints/SkillZero_alfworld/skillzero_alfworld_vl_3b_safe/global_step_150

Related Search Checkpoint

The best Search checkpoint by validation success rate falls outside the overall top two, so it is not included in this package. It is listed here because it is useful for reproducing the Search run:

  • Search global_step_180
  • Validation metric: val/success_rate = 0.356
  • Test metrics:
    • test/full_skill/success_rate = 0.282
    • test/no_skill/success_rate = 0.310
  • Checkpoint path, if packaged separately: checkpoints/SkillZero_search/skillzero_search_vl_3b_local_retriever/global_step_180
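The ranking behind the "top two overall" selection can be checked from the validation numbers reported in this card. The following is a minimal sketch, not the selection script used for the export: the dictionary layout and the top_k helper are hypothetical, and only the three checkpoints listed here are included.

```python
# val/success_rate values copied from this card.
# Keys are (run, global_step); the layout is an illustrative assumption.
val_success = {
    ("ALFWorld", 160): 0.594,
    ("ALFWorld", 150): 0.477,
    ("Search", 180): 0.356,
}


def top_k(scores, k=2):
    """Return the k keys with the highest score, best first."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


best_two = top_k(val_success)
for run, step in best_two:
    print(f"{run} global_step_{step}: {val_success[(run, step)]:.3f}")
```

With these three entries, both selected checkpoints come from the ALFWorld run, which matches the package contents.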

Hardware Used

Training was submitted through Slurm on the a100 partition.

  • ALFWorld:

    • GPUs: 4 x A100
    • CPUs per task: 32
    • Memory: 200GB
    • Time limit: 2 days
  • Search local retriever:

    • GPUs: 4 x A100 allocated
    • Training used GPUs 0,1,2
    • Local retriever used GPU 3
    • CPUs per task: 32
    • Memory: 220GB
    • Time limit: 2 days
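One common way to get the GPU split described for the Search run is to give each process its own CUDA_VISIBLE_DEVICES value. The sketch below assumes that mechanism; the card does not say how the original job pinned devices, and the gpu_env helper is hypothetical.

```python
import os


def gpu_env(gpu_ids, base_env=None):
    """Copy an environment and restrict it to the given GPU indices."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    return env


# Split from this card: training on GPUs 0,1,2 and the local retriever on GPU 3.
train_env = gpu_env([0, 1, 2])
retriever_env = gpu_env([3])
```

Each dictionary can then be passed as the env argument of subprocess.Popen when launching the trainer and the retriever as separate processes.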

Runtime Notes

  • Python environment name used on the cluster: skillzero
  • Retriever environment name: retriever
  • Main model: Qwen/Qwen2.5-VL-3B-Instruct
  • Training entry point: python3 -m verl.trainer.main_ppo
  • Original training logs are not required to use the checkpoints.

Restore

After extracting the archive, place checkpoint directories under:

checkpoints/SkillZero_alfworld/skillzero_alfworld_vl_3b_safe/

Then use trainer.resume_mode=resume_path and set trainer.resume_from_path to the target global_step_* directory.
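The restore step can be sketched as a small helper that assembles the command line. This is an illustration only: build_resume_command is a hypothetical name, and the result contains just the entry point and the two resume overrides named above. The remaining overrides from the original training configuration (model, data, and so on) are not listed in this card and would need to be appended.

```python
from pathlib import Path

# Directory from the Restore section, relative to the working directory.
CKPT_ROOT = Path("checkpoints/SkillZero_alfworld/skillzero_alfworld_vl_3b_safe")


def build_resume_command(step, root=CKPT_ROOT):
    """Return the argv for resuming from root/global_step_<step>.

    Only the resume-related overrides are included; the rest of the
    training configuration is assumed to be supplied by the caller.
    """
    ckpt_dir = root / f"global_step_{step}"
    return [
        "python3", "-m", "verl.trainer.main_ppo",
        "trainer.resume_mode=resume_path",
        f"trainer.resume_from_path={ckpt_dir.as_posix()}",
    ]


if __name__ == "__main__":
    # Print the command for the best ALFWorld checkpoint.
    print(" ".join(build_resume_command(160)))
```

Passing 150 instead of 160 targets the second packaged checkpoint.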
