CoVT-Phase2-3expert-Strict

Canonical strict 3-expert Phase2 LoRA adapter and merged weights for our CoVT reproduction. Training ran after the model.global_steps-on-resume bug was fixed, so the anchor-loss schedule matches the paper.
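To illustrate why a step counter that is not restored on resume distorts a step-dependent loss schedule, here is a toy sketch. The actual anchor-loss schedule is not described in this card, so the linear decay, the function names, and the step counts below are all hypothetical and chosen only to make the effect visible.

```python
from __future__ import annotations

# Toy illustration only: the linear decay and every name here are hypothetical,
# not the schedule or code used by CoVT or by this repo.


def anchor_weight(step: int, total_steps: int, start: float = 1.0, end: float = 0.0) -> float:
    """Hypothetical anchor-loss weight that decays linearly with the step counter."""
    frac = min(max(step / total_steps, 0.0), 1.0)
    return start + (end - start) * frac


def weights_over_run(total_steps: int, resume_at: int, restore_counter: bool) -> list[float]:
    """Simulate one run that is interrupted and resumed at `resume_at`."""
    weights = []
    counter = 0  # stands in for a model-side step counter
    for trainer_step in range(total_steps):
        if trainer_step == resume_at:
            # A correct resume restores the counter; the buggy one restarts at zero.
            counter = trainer_step if restore_counter else 0
        weights.append(anchor_weight(counter, total_steps))
        counter += 1
    return weights


fixed = weights_over_run(total_steps=10, resume_at=5, restore_counter=True)
buggy = weights_over_run(total_steps=10, resume_at=5, restore_counter=False)
```

In this toy run both variants agree up to the resume point; after it, the buggy variant replays the schedule from the beginning while the fixed one continues the decay.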

Contents

| Path | Description |
| --- | --- |
| adapter_config.json, adapter_model.safetensors, non_lora_state_dict.bin | Final Phase2 LoRA adapter + non-LoRA state. |
| checkpoint-9000/, checkpoint-10000/ | Intermediate Phase2 checkpoints (LoRA + DeepSpeed ZeRO-2 optimizer states). |
| stage4_merged_strict/ | Final adapter merged into the Qwen2.5-VL-7B base; ready for inference / eval. |
| eval_stage4_strict/ | VLMEvalKit results on CV-Bench 2D / 3D. |
| config.json, config_patched.py, trainer_patched.py, trainer_state.json | Final config and trainer state. |
| runs/ | TensorBoard event files. |
| *.log | Phase2 strict training / eval / continuation logs. |
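As a quick sanity check before loading anything, the layout listed above can be verified against a local copy of the repo. This is a minimal sketch using only the standard library; it assumes the repo has already been downloaded to a local directory, and the function name `inspect_snapshot` and the returned keys are illustrative, not part of this repo.

```python
from pathlib import Path

# File and directory names taken from the Contents listing above.
ADAPTER_FILES = (
    "adapter_config.json",
    "adapter_model.safetensors",
    "non_lora_state_dict.bin",
)
MERGED_DIR = "stage4_merged_strict"


def inspect_snapshot(root):
    """Report which of the listed artifacts exist under a local copy of the repo."""
    root = Path(root)
    missing = [name for name in ADAPTER_FILES if not (root / name).is_file()]
    # Collect checkpoint-<step>/ directories and sort them by step number.
    steps = []
    for path in root.glob("checkpoint-*"):
        suffix = path.name.rsplit("-", 1)[1]
        if path.is_dir() and suffix.isdigit():
            steps.append(int(suffix))
    return {
        "adapter_complete": not missing,
        "missing_adapter_files": missing,
        "has_merged_weights": (root / MERGED_DIR).is_dir(),
        "checkpoint_steps": sorted(steps),
    }
```

Calling `inspect_snapshot("path/to/local/copy")` returns a small dictionary; `has_merged_weights` indicates whether the merged model directory is present, and `missing_adapter_files` lists any adapter files that did not download.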

Companion repos

Caveat

This is a 3-expert variant of CoVT. The paper's main configuration uses 4 experts, so absolute numbers are not directly comparable to the paper's headline table.
