Temporal-Visual Semantic Alignment: A Unified Architecture for Transferring Spatial Priors from Vision Models to Zero-Shot Temporal Tasks
Paper • 2511.19856 • Published -
InternVideo-Next: Towards General Video Foundation Models without Video-Text Supervision
Paper • 2512.01342 • Published • 16
Tong Da
dtong
AI & ML interests
None yet
Recent Activity
Updated a collection 1 day ago: Video Foundation Models
Liked a Space 28 days ago: microsoft/VITRA
Liked a dataset 28 days ago: google/frames-benchmark
Organizations
None yet