My CLIP Video-Text Model

This model was trained on the MSR-VTT dataset using a custom CLIP-based architecture.

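The video projection head is a three-layer MLP that maps 512-dimensional video features through two 2048-dimensional hidden layers and back to 512 dimensions:

    # Defined inside the model's __init__; nn is torch.nn
    self.video_proj = nn.Sequential(
        nn.Linear(512, 2048),
        nn.ReLU(),
        nn.Linear(2048, 2048),
        nn.ReLU(),
        nn.Linear(2048, 512),
    )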
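To show how a head like this might be used, here is a minimal sketch that wraps the same layers in a standalone module and compares the projected video embeddings with text embeddings. The class name `VideoProjection`, the L2 normalization, the cosine-similarity scoring, and the random input tensors are all assumptions for illustration. The README does not say how the features are pooled or how the model scores video-text pairs.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class VideoProjection(nn.Module):
    """Hypothetical wrapper around the projection head shown in this README."""

    def __init__(self, dim: int = 512, hidden: int = 2048):
        super().__init__()
        # Same layer layout as the README: 512 -> 2048 -> 2048 -> 512
        self.video_proj = nn.Sequential(
            nn.Linear(dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, dim),
        )

    def forward(self, video_features: torch.Tensor) -> torch.Tensor:
        # Assumption: L2-normalize so that dot products are cosine similarities
        return F.normalize(self.video_proj(video_features), dim=-1)


torch.manual_seed(0)
model = VideoProjection()

# Random stand-ins for pooled per-video features and text embeddings (not real data)
video_features = torch.randn(4, 512)
text_embeddings = F.normalize(torch.randn(3, 512), dim=-1)

with torch.no_grad():
    video_embeddings = model(video_features)             # shape (4, 512)
    similarity = video_embeddings @ text_embeddings.T    # shape (4, 3)

# Total number of weights and biases in the projection head
num_params = sum(p.numel() for p in model.parameters())
print(video_embeddings.shape, similarity.shape, num_params)
```

With these layer widths the head has 6,296,064 trainable parameters, most of them in the middle 2048 x 2048 layer.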