# My CLIP Video-Text Model
This model was trained on the MSR-VTT dataset using a custom CLIP-based architecture.
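The video projection head (`video_proj`) is a three-layer MLP that maps 512-dimensional features through two 2048-dimensional hidden layers and back to 512 dimensions:

```python
# Defined inside the model's __init__; requires `import torch.nn as nn`.
self.video_proj = nn.Sequential(
    nn.Linear(512, 2048),
    nn.ReLU(),
    nn.Linear(2048, 2048),
    nn.ReLU(),
    nn.Linear(2048, 512)
)
```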
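To show how the projection head might be used, here is a minimal, self-contained sketch. Only the layer stack comes from this README. The wrapper class name `VideoProjectionHead`, the L2 normalization in `forward`, and the random stand-in tensors are assumptions for illustration, not the actual model code or training setup.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class VideoProjectionHead(nn.Module):  # hypothetical wrapper name
    def __init__(self):
        super().__init__()
        # Same layer stack as the snippet in this README.
        self.video_proj = nn.Sequential(
            nn.Linear(512, 2048),
            nn.ReLU(),
            nn.Linear(2048, 2048),
            nn.ReLU(),
            nn.Linear(2048, 512),
        )

    def forward(self, video_features: torch.Tensor) -> torch.Tensor:
        # Assumption: embeddings are L2-normalized, as is common for
        # CLIP-style models, so dot products become cosine similarities.
        return F.normalize(self.video_proj(video_features), dim=-1)


torch.manual_seed(0)
head = VideoProjectionHead().eval()

# Random stand-ins for pooled 512-d video features and text embeddings.
video_features = torch.randn(4, 512)
text_embeddings = F.normalize(torch.randn(3, 512), dim=-1)

with torch.no_grad():
    video_embeddings = head(video_features)
    # Cosine similarity between every video and every text embedding.
    similarity = video_embeddings @ text_embeddings.T

print(video_embeddings.shape)  # torch.Size([4, 512])
print(similarity.shape)        # torch.Size([4, 3])
```

Because the head maps 512 dimensions back to 512, its output can be compared directly against 512-dimensional text embeddings. Whether the trained model normalizes or scores embeddings this way is not stated in this README.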