Upload GVE-7B.json (#107), opened by Zhuoning
Here, we present the results of GVE-7B.
- Model: GVE-7B
- Paper: Towards Universal Video Retrieval: Generalizing Video Embedding via Synthesized Multimodal Pyramid Curriculum
- Evaluation: mostly from Qwen3-VL-Embedding
Note that:
- All Public Data: GVE-7B was trained on 13M publicly available retrieval examples (including 1.55M examples synthesized from public videos), and detailed reproduction configurations are provided
- Fully Zero-shot: No in-domain data from the MMEB-V2-Video datasets is included in any training stage of the GVE series
- Retrieval Data Only: We do not use any video QA, classification, or grounding data
- Test-time Scaling: We report the best result among the test-time context-length configurations we tested
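
The test-time scaling note above amounts to a best-of selection over context-length configurations. The following is a minimal sketch of that idea, not the actual evaluation code: the function name `best_per_dataset`, the context lengths, the dataset names, and all scores are hypothetical placeholders. It also assumes the best configuration is picked per dataset, which the note does not specify.

```python
from typing import Dict


def best_per_dataset(scores: Dict[int, Dict[str, float]]) -> Dict[str, float]:
    """For each dataset, keep the highest score across context-length configs.

    `scores` maps a context length (hypothetical values) to a mapping of
    dataset name -> score obtained with that context length.
    """
    best: Dict[str, float] = {}
    for _context_length, per_dataset in scores.items():
        for dataset, score in per_dataset.items():
            # Keep the score only if it beats what we have seen so far.
            if dataset not in best or score > best[dataset]:
                best[dataset] = score
    return best


# Placeholder values for illustration only; these are NOT GVE-7B results.
placeholder_scores = {
    8192: {"dataset_a": 0.50, "dataset_b": 0.70},
    16384: {"dataset_a": 0.55, "dataset_b": 0.65},
}

reported = best_per_dataset(placeholder_scores)
print(reported)
```

If the selection were instead made once for the whole benchmark, the same data could be reduced by averaging each configuration's scores first and then taking the configuration with the highest mean.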
ziyjiang changed pull request status to merged