CoVT-Phase2-3expert-Strict
Canonical strict 3-expert Phase2 LoRA + merged weights for our CoVT reproduction. Trained after the model.global_steps-on-resume bug was fixed, so the anchor-loss schedule matches the paper.
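
To illustrate why a step counter that is not restored on resume changes a step-dependent loss weight, here is a minimal sketch. The linear decay, the function names, and the step counts are all hypothetical; the actual anchor-loss schedule and the details of the fixed bug are not reproduced here.

```python
def anchor_weight(step: int, total_steps: int,
                  start: float = 1.0, end: float = 0.0) -> float:
    """Hypothetical linear schedule: weight moves from `start` to `end`."""
    frac = min(max(step / total_steps, 0.0), 1.0)
    return start + (end - start) * frac


def weights_after_resume(resume_step: int, n_steps: int, total_steps: int,
                         restore_counter: bool) -> list[float]:
    """Weights seen in the first `n_steps` after resuming from a checkpoint.

    If the counter is not restored, the schedule restarts from step 0 even
    though the weights were loaded from `resume_step`.
    """
    first = resume_step if restore_counter else 0
    return [anchor_weight(first + i, total_steps) for i in range(n_steps)]


if __name__ == "__main__":
    # Illustrative numbers only: resume at step 600 of a 1000-step schedule.
    print("counter restored:", weights_after_resume(600, 3, 1000, True))
    print("counter reset:   ", weights_after_resume(600, 3, 1000, False))
```

With the counter restored, the schedule continues where it left off; with the counter reset, the resumed run replays the early-training weights, so the effective schedule no longer matches the intended one.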
Contents
| Path | Description |
|---|---|
| adapter_config.json, adapter_model.safetensors, non_lora_state_dict.bin | Final Phase2 LoRA adapter + non-LoRA state. |
| checkpoint-9000/, checkpoint-10000/ | Intermediate Phase2 checkpoints (LoRA + DeepSpeed ZeRO-2 optimizer states). |
| stage4_merged_strict/ | Final adapter merged into the Qwen2.5-VL-7B base, ready for inference / eval. |
| eval_stage4_strict/ | VLMEvalKit results on CV-Bench 2D / 3D. |
| config.json, config_patched.py, trainer_patched.py, trainer_state.json | Final config and trainer state. |
| runs/ | TensorBoard event files. |
| *.log | Phase2 strict training / eval / continuation logs. |
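
After downloading the repository, a quick layout check can confirm that the entries listed above are present. The following is a minimal sketch using only the standard library; the helper name `missing_entries` and the local path in the usage example are hypothetical, and `*.log` files are skipped because their exact names are not listed here.

```python
from pathlib import Path

# Entries taken from the Contents table above (log files omitted).
EXPECTED = [
    "adapter_config.json",
    "adapter_model.safetensors",
    "non_lora_state_dict.bin",
    "checkpoint-9000",
    "checkpoint-10000",
    "stage4_merged_strict",
    "eval_stage4_strict",
    "config.json",
    "config_patched.py",
    "trainer_patched.py",
    "trainer_state.json",
    "runs",
]


def missing_entries(root) -> list[str]:
    """Return the expected files/directories that do not exist under `root`."""
    root = Path(root)
    return [name for name in EXPECTED if not (root / name).exists()]


if __name__ == "__main__":
    # Hypothetical local path; replace with wherever the repo was downloaded.
    missing = missing_entries("./CoVT-Phase2-3expert-Strict")
    print("missing:", missing if missing else "none")
```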
Companion repos
- Steven668866/CoVT-3expert-Stage1-LoRA: Phase1 6K-step LoRA adapter (clean).
- Steven668866/CoVT-Phase2-3expert-Full: Historical repo from the non-strict run; buggy adapter weights have been removed (2026-06-29). Only the clean Phase1 merge + scripts + logs remain there.
Caveat
This is a 3-expert variant of CoVT. The paper's main configuration uses 4 experts, so absolute numbers are not directly comparable to the paper's headline table.