File size: 4,075 Bytes
056852e
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
---
license: mit
pipeline_tag: video-text-to-text
library_name: transformers
---

# PhyDetEx: Detecting and Explaining the Physical Plausibility of T2V Models

This repository contains the PhyDetEx model, designed for detecting and explaining physically implausible content in videos generated by Text-to-Video (T2V) models. PhyDetEx introduces a lightweight fine-tuning approach, enabling Vision-Language Models (VLMs) to not only detect physically implausible events but also generate textual explanations on the violated physical principles.

This work was presented in the paper:
[PhyDetEx: Detecting and Explaining the Physical Plausibility of T2V Models](https://huggingface.co/papers/2512.01843)

- πŸ“– **Paper**: [PhyDetEx: Detecting and Explaining the Physical Plausibility of T2V Models](https://huggingface.co/papers/2512.01843)
- πŸ’» **Code**: [https://github.com/Zeqing-Wang/PhyDetEx](https://github.com/Zeqing-Wang/PhyDetEx)
- πŸ€— **PID Dataset**: [https://huggingface.co/datasets/NNaptmn/PhyDetExDatasets](https://huggingface.co/datasets/NNaptmn/PhyDetExDatasets)

<img src="https://github.com/Zeqing-Wang/PhyDetEx/raw/main/assets/overall_figs.png" width="100%" alt="Overall Figure" />

## πŸ”₯ News
- **[2025.12.01]** πŸ”₯ We release the PID Dataset and the PhyDetEx Model!

## Introduction

PhyDetEx is a model designed for detecting physical implausible content. Additionally, to better address and test physical implausible content detection, we provide the PID Physical Implausibility Detection dataset.

## πŸ”§ How to Start

### Download the PID Test split

Download `PID_Test_split.zip` from [πŸ€— PID Dataset](https://huggingface.co/datasets/NNaptmn/PhyDetExDatasets), place it in the `Data/PID_test` directory, and organize it as follows:
PID_test/
    pos/
        video_xxx.mp4
        ......
    neg/
        video_xxx.mp4
        ......
    anno_file.json
```

### Download the PhyDetEx

Download PhyDetEx from [πŸ€— PhyDetEx Model](https://huggingface.co/NNaptmn/PhyDetEx).

### Prepare the Environment

```bash
pip install -r requirements.txt
```

Please note that the version of transformers may affect specific metrics, so it is recommended to use the version specified in requirements.txt.

### Set variables
In benchmark_on_pid_test_split.py, set the corresponding path for PhyDetEx, then run:
```
python benchmark_on_pid_test_split.py
```
The resulting ./res/res_on_pid_test.json will contain the F1 Score, Acc Plausible, and Acc Implausible.

### Get the reasoning score
Deploy any LLM using [lmdeploy](https://github.com/InternLM/lmdeploy). In the paper, we report results using LLaMa3 8B.

In infer_llm_score_for_pid_test_lmdeploy.py, set the corresponding port and evaluation file path, then run:

```
python infer_llm_score_for_pid_test_lmdeploy.py
```

### πŸ§ͺ Test on ImpossibleVideos

You can download and process the Physical Law-related data from [Impossible-Videos](https://github.com/showlab/Impossible-Videos). Alternatively, we recommend directly downloading our preprocessed data: [πŸ€— PID Dataset](https://huggingface.co/datasets/NNaptmn/PhyDetExDatasets) "ImpossibleVideos_Physical_Law_Only.zip", and placing it in `Data/PID_test`. The remaining steps are the same as for the PID Test.

Please note that the scripts for running ImpossibleVideos are `benchmark_on_impossible_videos.py` and `infer_llm_score_for_impossible_video_lmdeploy.py`.

## πŸ”§ Train the PhyDetEx

In the [πŸ€— PID Dataset](https://huggingface.co/datasets/NNaptmn/PhyDetExDatasets), we also provide the PID Train Split. For training PhyDetEx, we use [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory).

## Acknowledgement
We heavily borrow the data and code from ImpossibleVideos, and LLaMA-Factory. Thanks for sharing their code. 

## πŸ“Œ Citation

If you find the code useful for your work, please star this repo and consider citing:

```bibtex
@article{wang2025phydetex,
  title={PhyDetEx: Detecting and Explaining the Physical Plausibility of T2V Models},
  author={},
  journal={arXiv preprint arXiv:2512.01843},
  year={2025}
}
```