--- license: apache-2.0 library_name: transformers pipeline_tag: video-text-to-text base_model: Qwen/Qwen3-8B language: - en tags: - video-understanding - long-video-understanding - agentic-llm - video-question-answering - vision-language-model - grpo - reinforcement-learning - icml-2026 ---

๐ŸŽฌ VideoSEAL: Mitigating Evidence Misalignment in Agentic Long Video Understanding by Decoupling Answer Authority

Code HF Model ICML 2026

๐Ÿค— HuggingFace model: CewEhao/VideoSEAL_8B  ยท  ๐Ÿ’ป Code: Echochef/VideoSEAL

## ๐Ÿ‘‰ Introduction This is the official model card for **VideoSEAL: Mitigating Evidence Misalignment in Agentic Long Video Understanding by Decoupling Answer Authority** (ICML 2026). VideoSEAL provides offline build utilities for long video indexing: - OCR subtitles (SRT) โ†’ OCR captions + (optional) embeddings - Clip captions (VLM) โ†’ clip captions + (optional) embeddings - Merge into a unified semantic index under `indexes/semantic//` - (Optional) generate a global `full_story.txt` summary ## ๐Ÿ“ฆ Layout - ๐Ÿงฐ Shell entrypoints: `scripts/` - ๐Ÿ Python package: `videoseal/` - โœ… Tests: `test/` - ๐Ÿงฉ OCR toolchain (vendored): `third_party/video-subtitle-extractor/` ## โš™๏ธ Configuration - Defaults live in the scripts under `scripts/`. - Put real API keys/endpoints in your shell environment / job launcher. ## ๐Ÿ—๏ธ Run offline build ```bash cd /path/to/VideoSEAL export MLLM_API_KEY="sk_your_api_key" export EMBEDDING_API_KEY="sk_your_api_key" export AGENT_LLM_API_KEY="sk_your_api_key" export VISUAL_INSPECT_API_KEY="sk_your_api_key" VIDEO=/path/to/video.mp4 BENCHMARK=LVBench ./scripts/run_offline_build.sh ``` ## โœ… Run tests ```bash /root/miniconda3/envs/rllm/bin/python -m unittest discover -s test -v ``` ## ๐Ÿ‹๏ธ GRPO training (video tool workflow) This repo vendors a minimal copy of the `rllm/` + `verl/` Python packages (under the repo root) to make the video tool-agent GRPO workflow runnable without an extra repo checkout. ### ๐Ÿงช Training environment (conda) ```bash conda create -n videoseal python=3.12 -y conda activate videoseal pip install vllm==0.11.0 cd rllm pip install -e . cd ../verl pip install -e . ``` ### ๐Ÿš€ Launcher - `scripts/train/run_video_workflow_grpo.sh` ### ๐Ÿงฉ Example ```bash cd /path/to/VideoSEAL # Export real API keys/endpoints in your environment before launching. TRAIN_PARQUET='["/path/to/train.parquet"]' \ VAL_PARQUET='/path/to/val.parquet' \ MODEL_PATH='Qwen/Qwen3-8B' \ ./scripts/train/run_video_workflow_grpo.sh train ``` ### ๐Ÿ”Ž Quick checks ```bash ./scripts/train/run_video_workflow_grpo.sh test-reward pytest -q tests/rewards/test_video_reward_tool_env_integration.py ```