	# SkillZero Best Checkpoints Export

	This package contains the two checkpoints with the highest validation success rate across the completed SkillZero runs. Both come from the ALFWorld run.

	## Included Checkpoints

	1. ALFWorld `global_step_160`
	   - Validation metric: `val/success_rate = 0.594`
	   - Archive path: `checkpoints/SkillZero_alfworld/skillzero_alfworld_vl_3b_safe/global_step_160`

	2. ALFWorld `global_step_150`
	   - Validation metric: `val/success_rate = 0.477`
	   - Archive path: `checkpoints/SkillZero_alfworld/skillzero_alfworld_vl_3b_safe/global_step_150`

	## Related Search Checkpoint

	The best Search checkpoint by validation success rate does not rank among the top two overall, so it is not included in this package. It is still useful for reproducing the Search run:

	- Search `global_step_180`
	  - Validation metric: `val/success_rate = 0.356`
	  - Test metrics:
	    - `test/full_skill/success_rate = 0.282`
	    - `test/no_skill/success_rate = 0.310`
	  - Checkpoint path, if packaged separately: `checkpoints/SkillZero_search/skillzero_search_vl_3b_local_retriever/global_step_180`
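
The selection rule can be checked with a short shell sketch that ranks the three checkpoints by `val/success_rate` and keeps the top two. The values are copied from this README; the short `alfworld/...` and `search/...` labels are hypothetical names used only for this illustration.

```bash
# Rank checkpoints by val/success_rate (second column) and keep the top two.
# Labels are illustrative; the metric values are the ones listed in this README.
TOP_TWO=$(
  printf '%s\n' \
    "alfworld/global_step_150 0.477" \
    "search/global_step_180 0.356" \
    "alfworld/global_step_160 0.594" |
    LC_ALL=C sort -k2,2nr | head -n 2
)
echo "${TOP_TWO}"
```

Both surviving entries are ALFWorld checkpoints, which is why the Search checkpoint is listed separately.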

	## Hardware Used

	Training was submitted through Slurm on the `a100` partition.

	- ALFWorld:
	  - GPUs: 4 x A100
	  - CPUs per task: 32
	  - Memory: 200 GB
	  - Time limit: 2 days

	- Search local retriever:
	  - GPUs: 4 x A100 allocated
	  - Training used GPUs 0, 1, 2
	  - Local retriever used GPU 3
	  - CPUs per task: 32
	  - Memory: 220 GB
	  - Time limit: 2 days
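
As a rough illustration, the Search allocation above might map onto a Slurm batch header like the one below. This is a sketch, not the original submission script: the `--gres=gpu:4` request syntax is an assumption that varies between clusters, and the launch lines are only printed because the retriever start command and the full set of training overrides are not given in this README.

```bash
#!/bin/bash
#SBATCH --partition=a100
#SBATCH --gres=gpu:4            # assumption: GPU request syntax is cluster-specific
#SBATCH --cpus-per-task=32
#SBATCH --mem=220G              # the ALFWorld run used 200 GB instead
#SBATCH --time=2-00:00:00       # 2 days

# Split the four allocated GPUs: 0,1,2 for training, 3 for the local retriever.
TRAIN_GPUS="0,1,2"
RETRIEVER_GPU="3"

# Print the intended device assignment instead of launching anything.
echo "retriever: CUDA_VISIBLE_DEVICES=${RETRIEVER_GPU}"
echo "training:  CUDA_VISIBLE_DEVICES=${TRAIN_GPUS} python3 -m verl.trainer.main_ppo"
```

For the ALFWorld run, the same header would apply with `--mem=200G` and all four GPUs given to training.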

	## Runtime Notes

	- Python environment name used on the cluster: `skillzero`
	- Retriever environment name: `retriever`
	- Main model: `Qwen/Qwen2.5-VL-3B-Instruct`
	- Training entry point: `python3 -m verl.trainer.main_ppo`
	- Original training logs are not required to use the checkpoints.

	## Restore

	After extracting the archive, place checkpoint directories under:

	```bash
	checkpoints/SkillZero_alfworld/skillzero_alfworld_vl_3b_safe/
	```

	Then use `trainer.resume_mode=resume_path` and set `trainer.resume_from_path` to the target `global_step_*` directory.
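
A minimal sketch of the resume step, assuming the checkpoint has already been extracted into the directory above. Only the two `trainer.*` overrides come from this README; the command is printed rather than executed because the remaining overrides of the original run are not listed here and would need to be added.

```bash
# Directory and step name as listed in this README.
CKPT_ROOT="checkpoints/SkillZero_alfworld/skillzero_alfworld_vl_3b_safe"
STEP="global_step_160"
RESUME_PATH="${CKPT_ROOT}/${STEP}"

# Warn early if the checkpoint has not been placed where the trainer expects it.
if [ ! -d "${RESUME_PATH}" ]; then
  echo "warning: ${RESUME_PATH} not found; extract the archive first" >&2
fi

# Build the resume overrides and print the command for inspection.
RESUME_CMD="python3 -m verl.trainer.main_ppo trainer.resume_mode=resume_path trainer.resume_from_path=${RESUME_PATH}"
echo "${RESUME_CMD}"
```

Changing `STEP` to `global_step_150` selects the second packaged checkpoint.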