Spaces:

VEFX-Reward
/

VEFX-Code

Running

App Files Files Community

VEFX-Reward commited on May 7

Commit

d0216b0

verified ·

1 Parent(s): fee27b1

Initialize Space README with frontmatter

Browse files

Files changed (1) hide show

README.md +242 -5

README.md CHANGED Viewed

@@ -1,10 +1,247 @@
 ---
-title: VEFX Code
-emoji: 🚀
-colorFrom: gray
-colorTo: red
 sdk: static
 pinned: false
 ---
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

 ---
+title: VEFX-Code
+emoji: 🎬
+colorFrom: indigo
+colorTo: pink
 sdk: static
 pinned: false
+license: apache-2.0
+short_description: VEFX-Bench reference code & inference utils
 ---
+<div align="center">
+# VEFX-Bench
+### Benchmarking Generic Video Editing and Visual Effects
+**VEFX-Bench** is a comprehensive benchmark for evaluating text-driven video editing and visual effects. It includes **5,049 annotated examples** spanning **9 categories** and **32 subcategories**, evaluated by **VEFX-Reward** — a VLM-based reward model that scores edits across three dimensions on a 1–4 scale:
+| Dimension | What it measures |
+|---|---|
+| **Instructional Following (IF)** | Does the edit accurately reflect the editing instruction? |
+| **Render Quality (RQ)** | Visual clarity, temporal consistency, and physical plausibility |
+| **Edit Exclusivity (EE)** | Were only the intended regions modified, without side-effects? |
+---
+## 🏆 Model Leaderboard
+VEFX-Reward scores on 1–4 scale. Ranked by **GeoAgg** (α=2 for IF, β=1 for RQ, γ=1 for EE). Higher is better.
+> **📅 Updated: May 2, 2026** — For the latest results & submissions, visit the **[live leaderboard →](https://vefx-leaderboard.com/)**
+| Rank | Model | Type | IF ↑ | RQ ↑ | EE ↑ | GeoAgg ↑ |
+|:---:|---|---|:---:|:---:|:---:|:---:|
+| 🥇 | **Kling o3 Omni** | Commercial | 3.033 | **3.588** | 3.043 | **3.057** |
+| 🥈 | **Kling o1** | Commercial | **3.040** | 3.534 | 2.976 | 2.985 |
+| 🥉 | **Runway Gen-4.5** | Commercial | 2.817 | 3.319 | 2.923 | 2.912 |
+| 4 | Seedance 2.0 | Commercial | 2.811 | 3.421 | 3.088 | 2.766 |
+| 5 | Grok Imagine | Commercial | 2.606 | 3.346 | **3.376** | 2.723 |
+| 6 | Luma Ray 3 | Commercial | 2.702 | 3.403 | 2.705 | 2.717 |
+| 7 | UniVideo | Open-source | 2.294 | 3.266 | 3.091 | 2.516 |
+| 8 | Wan 2.6 | Commercial | 2.012 | 3.317 | 2.446 | 2.146 |
+| 9 | Luma Ray 2 | Commercial | 2.038 | 2.532 | 1.363 | 1.804 |
+| 10 | VACE | Open-source | 2.027 | 3.172 | 1.180 | 1.775 |
+---
+## 🎬 Demo Videos
+Each demo shows the **original video** (left) alongside the **edited video** (right).
+<table>
+<tr>
+<td align="center"><b>Attribute Change</b><br><sub>"Change the color of the red industrial trailer to a bright yellow while maintaining the texture and appearance of the metal surface."</sub></td>
+<td align="center"><b>Object Removal</b><br><sub>"Remove the woman with the grey backpack walking on the right side of the frame."</sub></td>
+</tr>
+<tr>
+<td align="center"><img src="assets/demo_attribute_change.gif" width="400"></td>
+<td align="center"><img src="assets/demo_object_removal.gif" width="400"></td>
+</tr>
+<tr>
+<td align="center"><b>Style Transfer</b><br><sub>"Restore the natural, realistic colors to the entire scene, replacing the current black and white style with a full-color rendition."</sub></td>
+<td align="center"><b>Camera Motion</b><br><sub>"Perform a smooth zoom in on the distant snowy mountain peaks to create a more immersive view."</sub></td>
+</tr>
+<tr>
+<td align="center"><img src="assets/demo_style_transfer.gif" width="400"></td>
+<td align="center"><img src="assets/demo_camera_zoom.gif" width="400"></td>
+</tr>
+</table>
+---
+## 📊 Benchmark at a Glance
+| | |
+|---|---|
+| 📝 **5,049** Annotated Examples | 🎬 **1,419** Source Videos |
+| 📂 **9 / 32** Categories / Subcategories | 🤖 **10** Editing Systems |
+| 📐 **3** Quality Dimensions (IF, RQ, EE) | 🧪 **300** Benchmark Test Pairs |
+---
+## 🤗 VEFX-Reward Models
+| Model | Backbone | Params | HuggingFace | Status |
+|---|---|---|---|---|
+| **VEFX-Reward-4B** | Qwen3-VL-4B-Instruct | 4B | [VEFX-Reward/VEFX-Reward-4B](https://huggingface.co/VEFX-Reward/VEFX-Reward-4B) | ✅ Available |
+---
+## 📦 VEFX-Bench Dataset
+The benchmark dataset is hosted on HuggingFace at **[VEFX-Reward/VEFX-Bench](https://huggingface.co/datasets/VEFX-Reward/VEFX-Bench)**.
+| | |
+|---|---|
+| 🎬 **300** Source Videos (720p) | 📝 `prompts.json` with editing instructions |
+| 📂 **9** Task Categories | 🗂️ `benchmark_meta.json` with category labels |
+**Task Categories:** Style Transfer · Object Manipulation · Background Change · Color/Lighting · Motion/Animation · Text/Overlay · Composition · Removal/Inpainting · Complex/Multi-step
+### Download and Evaluate
+```python
+from huggingface_hub import snapshot_download
+# Download the benchmark dataset
+snapshot_download(repo_id="VEFX-Reward/VEFX-Bench", repo_type="dataset", local_dir="./vefx_bench")
+```
+**Evaluation workflow:**
+1. Download the 300 source videos and `prompts.json`
+2. Apply your video editing model to each source video following its prompt
+3. Save edited videos as `0000.mp4` through `0299.mp4` (matching source index)
+4. Score with VEFX-Reward:
+```python
+import json
+from vefx_reward import VEFXReward
+model = VEFXReward("VEFX-Reward/VEFX-Reward-4B", device="cuda")
+with open("vefx_bench/prompts.json") as f:
+    prompts = json.load(f)
+for idx, item in enumerate(prompts):
+    scores = model.score(
+        original_video=f"vefx_bench/{idx:04d}.mp4",
+        edited_video=f"your_edits/{idx:04d}.mp4",
+        instruction=item["instruction"],
+    )
+    print(f"[{idx:04d}] IF={scores['IF']:.2f}  RQ={scores['RQ']:.2f}  EE={scores['EE']:.2f}")
+```
+---
+## 🚀 Quick Start
+### Installation
+```bash
+conda create -n vefx-bench python=3.10 -y
+conda activate vefx-bench
+# Install PyTorch first (match your CUDA version)
+# See https://pytorch.org/get-started/locally/ for the right command
+pip install torch torchvision --index-url https://download.pytorch.org/whl/cu124
+# Install remaining dependencies
+pip install -r requirements.txt
+# Install the package
+pip install -e .
+```
+> **Requirements:** Python ≥ 3.10, CUDA GPU, ~10 GB VRAM (bfloat16). Make sure your PyTorch CUDA version matches your driver.
+### Score a Video Edit (Python API)
+```python
+from vefx_reward import VEFXReward
+model = VEFXReward("VEFX-Reward/VEFX-Reward-4B", device="cuda")
+scores = model.score(
+    original_video="examples/sample_videos/object_removal_original.mp4",
+    edited_video="examples/sample_videos/object_removal_edited.mp4",
+    instruction="Remove the woman with the grey backpack walking on the right side of the frame.",
+)
+print(scores)
+# {'IF': 2.34, 'RQ': 1.93, 'EE': 1.82, 'Overall': 6.09}
+```
+### CLI Usage
+```bash
+python examples/quick_start.py \
+    --original examples/sample_videos/object_removal_original.mp4 \
+    --edited examples/sample_videos/object_removal_edited.mp4 \
+    --instruction "Remove the woman with the grey backpack walking on the right side of the frame."
+```
+### Score All Included Samples
+The repo includes 4 sample video pairs with prompts. Score them all:
+```python
+import json
+from vefx_reward import VEFXReward
+model = VEFXReward("VEFX-Reward/VEFX-Reward-4B", device="cuda")
+with open("examples/sample_videos/prompts.json") as f:
+    samples = json.load(f)
+for sample in samples:
+    scores = model.score(
+        original_video=f"examples/sample_videos/{sample['original']}",
+        edited_video=f"examples/sample_videos/{sample['edited']}",
+        instruction=sample["instruction"],
+    )
+    print(f"[{sample['category']}] IF={scores['IF']:.2f}  RQ={scores['RQ']:.2f}  EE={scores['EE']:.2f}")
+```
+### Batch Scoring
+Prepare a CSV with columns `original_video`, `edited_video`, `instruction`:
+```bash
+python examples/batch_scoring.py --csv edits.csv --output results.csv
+```
+### Multi-GPU Scoring
+For large-scale evaluation across multiple GPUs:
+```bash
+python examples/multi_gpu_scoring.py --csv edits.csv --num_gpus 4 --output results.csv
+```
+---
+## 📖 API Reference
+### `VEFXReward`
+```python
+VEFXReward(
+    model_path="VEFX-Reward/VEFX-Reward-4B",  # HuggingFace ID or local path
+    device="cuda",                           # "cuda", "cuda:0", "cpu"
+    dtype=torch.bfloat16,                    # torch.bfloat16 or torch.float16
+    fps=4.0,                                 # Video sampling rate
+    max_frame_pixels=399360,                 # Max pixels per frame
+)
+```
+#### `model.score(original_video, edited_video, instruction) → dict`
+Score a single video edit. Returns `{'IF': float, 'RQ': float, 'EE': float, 'Overall': float}`.
+#### `model.score_batch(original_videos, edited_videos, instructions) → list[dict]`
+Score multiple edits sequentially. Each sample is processed independently to avoid OOM.
+---