CoVT-Phase2-3expert-Strict

Canonical strict 3-expert Phase2 LoRA + merged weights for our CoVT reproduction. Trained after the model.global_steps-on-resume bug was fixed, so the anchor-loss schedule matches the paper.
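To see why a step counter that is lost on resume can distort a step-dependent loss weight, here is a toy sketch. It is not the CoVT code: the linear ramp in `anchor_weight`, the `ToyModel` class, the `resume` helper, and the step values 6000 and 8000 are all hypothetical, chosen only for illustration. The only name taken from this card is the `global_steps` attribute.

```python
def anchor_weight(step: int, ramp_steps: int = 8000) -> float:
    """Hypothetical linear ramp from 0 to 1; not the schedule from the paper."""
    return min(step / ramp_steps, 1.0)


class ToyModel:
    """Stand-in for a model that keeps its own step counter."""

    def __init__(self) -> None:
        self.global_steps = 0  # a freshly built instance always starts at 0


def resume(trainer_step: int, restore_counter: bool) -> ToyModel:
    """Rebuild the model as a resumed run would, optionally syncing the counter."""
    model = ToyModel()
    if restore_counter:
        model.global_steps = trainer_step
    return model


# Resume at an arbitrary step, with and without restoring the counter.
buggy = resume(trainer_step=6000, restore_counter=False)
fixed = resume(trainer_step=6000, restore_counter=True)

print(anchor_weight(buggy.global_steps))  # 0.0  (schedule restarts from the beginning)
print(anchor_weight(fixed.global_steps))  # 0.75 (schedule continues where it left off)
```

In this toy setup, the unsynced counter makes the weight restart from its initial value after every resume, which is the kind of mismatch a strict run avoids.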

Contents

| Path | Description |
| --- | --- |
| `adapter_config.json`, `adapter_model.safetensors`, `non_lora_state_dict.bin` | Final Phase2 LoRA adapter + non-LoRA state. |
| `checkpoint-9000/`, `checkpoint-10000/` | Intermediate Phase2 checkpoints (LoRA + DeepSpeed ZeRO-2 optimizer states). |
| `stage4_merged_strict/` | Final adapter merged into the Qwen2.5-VL-7B base; ready for inference / eval. |
| `eval_stage4_strict/` | VLMEvalKit results on CV-Bench 2D / 3D. |
| `config.json`, `config_patched.py`, `trainer_patched.py`, `trainer_state.json` | Final config, patched config and trainer scripts, and trainer state. |
| `runs/` | TensorBoard event files. |
| `*.log` | Phase2 strict training / eval / continuation logs. |
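Because the checkpoints include optimizer states, it may be convenient to fetch only one group of files. The sketch below assumes `huggingface_hub` is installed and relies on its `snapshot_download` function with `allow_patterns`. The group names (`adapter`, `merged`, `eval`, `logs`) and the helpers `select_files` and `download` are hypothetical labels introduced here; the paths come from the contents listed above.

```python
from fnmatch import fnmatch

REPO_ID = "Steven668866/CoVT-Phase2-3expert-Strict"

# Glob patterns per artifact group (group names are illustrative only).
ARTIFACT_PATTERNS = {
    "adapter": [
        "adapter_config.json",
        "adapter_model.safetensors",
        "non_lora_state_dict.bin",
    ],
    "merged": ["stage4_merged_strict/*"],
    "eval": ["eval_stage4_strict/*"],
    "logs": ["runs/*", "*.log"],
}


def select_files(filenames, group):
    """Preview which of the given filenames a group's patterns would match."""
    patterns = ARTIFACT_PATTERNS[group]
    return [f for f in filenames if any(fnmatch(f, p) for p in patterns)]


def download(group, local_dir=None):
    """Download one artifact group and return the local snapshot path."""
    # Imported lazily so the preview helper works without the dependency.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=ARTIFACT_PATTERNS[group],
        local_dir=local_dir,
    )
```

For example, `download("merged")` would fetch only `stage4_merged_strict/`, leaving out the intermediate checkpoints.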

Companion repos

Steven668866/CoVT-3expert-Stage1-LoRA — Phase1 6K-step LoRA adapter (clean).
Steven668866/CoVT-Phase2-3expert-Full — Historical repo from the non-strict run; buggy adapter weights have been removed (2026-06-29). Only the clean Phase1 merge + scripts + logs remain there.

Caveat

This is a 3-expert variant of CoVT. The paper's main configuration uses 4 experts, so absolute numbers are not directly comparable to the paper's headline table.
