Feature Extraction
Transformers
Safetensors
English
qwen2_vl
image-text-to-text
multimodal
video embedding
ncsoft
ncai
varco
Instructions to use NCSOFT/GME-VARCO-VISION-Embedding with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use NCSOFT/GME-VARCO-VISION-Embedding with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("feature-extraction", model="NCSOFT/GME-VARCO-VISION-Embedding")
```

```python
# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("NCSOFT/GME-VARCO-VISION-Embedding")
model = AutoModelForImageTextToText.from_pretrained("NCSOFT/GME-VARCO-VISION-Embedding")
```
- Notebooks
- Google Colab
- Kaggle
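The README context in the diff below ends its video example with `video_emb = F.normalize(video_emb, dim=-1)`, which suggests embeddings are compared after L2 normalization. As a rough illustration only, the following pure-Python sketch shows how unit-length vectors can be ranked by cosine similarity. The helper names (`l2_normalize`, `cosine_similarity`) and the toy vectors are hypothetical and are not part of the model's API or its README; real embeddings would come from the model itself.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (an all-zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def cosine_similarity(a, b):
    """Cosine similarity, computed as the dot product of the normalized vectors."""
    a_n, b_n = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))


# Toy stand-ins for a query embedding and two video embeddings (hypothetical values).
query_emb = [0.2, 0.9, 0.1]
video_embs = {"clip_a": [0.1, 0.8, 0.2], "clip_b": [0.9, 0.1, 0.0]}

# Rank candidate videos from most to least similar to the query.
ranked = sorted(
    video_embs,
    key=lambda name: cosine_similarity(query_emb, video_embs[name]),
    reverse=True,
)
print(ranked)
```

With tensors, the same idea is usually expressed as a matrix product of the normalized query and video embeddings; the list-based version here only keeps the sketch dependency-free.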
Update README.md #3
by sun0park - opened
README.md CHANGED
@@ -205,6 +205,18 @@ video_emb = F.normalize(video_emb, dim=-1)
 
 <br>
 
+## Citation
+```bibtex
+@misc{park2026vagentinteractivevideosearch,
+title={V-Agent: An Interactive Video Search System Using Vision-Language Models},
+author={SunYoung Park and Jong-Hyeon Lee and Youngjune Kim and Daegyu Sung and Younghyun Yu and Young-rok Cha and Jeongho Ju},
+year={2026},
+eprint={2512.16925},
+archivePrefix={arXiv},
+primaryClass={cs.CV},
+url={https://arxiv.org/abs/2512.16925},
+}
+```
 ---
 license: cc-by-nc-4.0
 ---