faisalishfaq2005 committed on
Commit
386408b
·
verified ·
1 Parent(s): c044153

Update README.md

Files changed (1)
  1. README.md +141 -117
README.md CHANGED
@@ -1,117 +1,141 @@
1
- # Deepfake Detection with Improved EfficientViT
2
-
3
- ## Model Architecture
4
-
5
- ![Model Architecture](assets/architecture.png)
6
-
7
- ## Inference Pipeline
8
-
9
- ![Inference Pipeline](assets/inference_pipeline.png)
10
-
11
-
12
- This repository contains a **PyTorch model for deepfake detection** based on an improved **EfficientViT** architecture, trained on video data.
13
-
14
- The model predicts whether a video is **real (0)** or **fake (1)** using both visual information and temporal cues.
15
-
16
- ---
17
-
18
- ## 🧩 Model Description
19
-
20
- **Architecture:** Improved EfficientViT
21
- **Backbone:** EfficientNet-B0 for feature extraction
22
- **Head:** Transformer-based temporal modeling with classification head
23
- **Input:** Video frames (224×224 RGB images)
24
- **Output:** Binary label (0=Real, 1=Fake) and frame-level probabilities
25
-
26
- **Key Features:**
27
-
28
- - Extracts faces from frames using MTCNN
29
- - Supports inference on raw video files
30
- - Provides frame-level probabilities for fine-grained analysis
31
-
32
- ---
33
-
34
- ## 📁 Repository Structure
35
-
36
- ```
37
- deepfake-efficientvit/
38
- │
39
- ├── model.py # ImprovedEfficientViT class
40
- ├── inference.py # Functions to run inference on videos
41
- ├── model.pth # Trained weights
42
- ├── config.json # Optional model metadata
43
- ├── requirements.txt # Required packages
44
- ├── README.md
45
-
46
- ```
47
-
48
- ## ⚡ Installation
49
- git clone https://huggingface.co/faisalishfaq2005/deepfake-detection-efficientnet-vit
50
-
51
- cd deepfake-detection-efficientnet-vit
52
-
53
- pip install -r requirements.txt
54
-
55
- ## 🚀 Usage
56
- # 1.Programmatic Inference
57
-
58
- ```python
59
-
60
- from huggingface_hub import hf_hub_download
61
- import torch
62
- from model import ImprovedEfficientViT
63
- from inference import predict_vedio # your inference function
64
-
65
- # 1️⃣ Download the checkpoint from Hugging Face
66
- checkpoint_path = hf_hub_download(
67
- repo_id="faisalishfaq2005/deepfake-detection-efficientnet-vit",
68
- filename="model.pth"
69
- )
70
-
71
- # 2️⃣ Load the model
72
- model = ImprovedEfficientViT()
73
- model.load_state_dict(torch.load(checkpoint_path, map_location="cpu"))
74
- model.eval()
75
-
76
- # 3️⃣ Run inference on a video
77
- video_path = "sample_video.mp4"
78
- result = predict_vedio(video_path, model)
79
- print(result)
80
- # Example Output: {'class': 1}
81
-
82
-
83
- ```
84
- # 2. Manual Download
85
-
86
- Go to the Hugging Face model page
87
-
88
- Download:
89
-
90
- model.pth
91
-
92
- model.py
93
-
94
- inference.py
95
-
96
- Place them in the same folder locally.
97
-
98
- Install requirements and run predict_video().
99
-
100
- ## 📄 License
101
-
102
- This model is released under the MIT License.
103
- You are free to use, modify, and distribute it, with attribution.
104
-
105
- ## 📚 Citation
106
-
107
- If you use this model in your research, please cite:
108
-
109
- ```bibtex
110
- @inproceedings{faisalishfaq2025efficientvit,
111
- title={Deepfake Detection with Efficientnet and ViT},
112
- author={Faisal Ishfaq},
113
- year={2025}
114
- }
115
- ```
116
-
117
-
1
+ ---
2
+ language:
3
+ - en
4
+ license: mit
5
+ tags:
6
+ - deepfake-detection
7
+ - vision-transformer
8
+ - efficientnet
9
+ - multimodal
10
+ - pytorch
11
+ - computer-vision
12
+ - model
13
+ - image-classification
14
+ datasets:
15
+ - custom
16
+ metrics:
17
+ - accuracy
18
+ - f1
19
+ pipeline_tag: image-classification
20
+ library_name: pytorch
21
+ widget:
22
+ - text: "sample_video.mp4"
23
+ ---
24
+
25
+ # Deepfake Detection with Improved EfficientViT
26
+
27
+ ## Model Architecture
28
+
29
+ ![Model Architecture](assets/architecture.png)
30
+
31
+ ## Inference Pipeline
32
+
33
+ ![Inference Pipeline](assets/inference_pipeline.png)
34
+
35
+
36
+ This repository contains a **PyTorch model for deepfake detection** based on an improved **EfficientViT** architecture, trained on video data.
37
+
38
+ The model predicts whether a video is **real (0)** or **fake (1)** using both visual information and temporal cues.
39
+
40
+ ---
41
+
42
+ ## 🧩 Model Description
43
+
44
+ **Architecture:** Improved EfficientViT
45
+ **Backbone:** EfficientNet-B0 for feature extraction
46
+ **Head:** Transformer-based temporal modeling with classification head
47
+ **Input:** Video frames (224×224 RGB images)
48
+ **Output:** Binary label (0=Real, 1=Fake) and frame-level probabilities
49
+
50
+ **Key Features:**
51
+
52
+ - Extracts faces from frames using MTCNN
53
+ - Supports inference on raw video files
54
+ - Provides frame-level probabilities for fine-grained analysis
55
+
56
+ ---
57
+
58
+ ## 📁 Repository Structure
59
+
60
+ ```
61
+ deepfake-efficientvit/
62
+ │
63
+ ├── model.py # ImprovedEfficientViT class
64
+ ├── inference.py # Functions to run inference on videos
65
+ ├── model.pth # Trained weights
66
+ ├── config.json # Optional model metadata
67
+ ├── requirements.txt # Required packages
68
+ ├── README.md
69
+
70
+ ```
71
+
72
+ ## ⚡ Installation
73
+ git clone https://huggingface.co/faisalishfaq2005/deepfake-detection-efficientnet-vit
74
+
75
+ cd deepfake-detection-efficientnet-vit
76
+
77
+ pip install -r requirements.txt
78
+
79
+ ## 🚀 Usage
80
+ # 1.Programmatic Inference
81
+
82
+ ```python
83
+
84
+ from huggingface_hub import hf_hub_download
85
+ import torch
86
+ from model import ImprovedEfficientViT
87
+ from inference import predict_vedio # your inference function
88
+
89
+ # 1️⃣ Download the checkpoint from Hugging Face
90
+ checkpoint_path = hf_hub_download(
91
+ repo_id="faisalishfaq2005/deepfake-detection-efficientnet-vit",
92
+ filename="model.pth"
93
+ )
94
+
95
+ # 2️⃣ Load the model
96
+ model = ImprovedEfficientViT()
97
+ model.load_state_dict(torch.load(checkpoint_path, map_location="cpu"))
98
+ model.eval()
99
+
100
+ # 3️⃣ Run inference on a video
101
+ video_path = "sample_video.mp4"
102
+ result = predict_vedio(video_path, model)
103
+ print(result)
104
+ # Example Output: {'class': 1}
105
+
106
+
107
+ ```
108
+ # 2. Manual Download
109
+
110
+ Go to the Hugging Face model page
111
+
112
+ Download:
113
+
114
+ model.pth
115
+
116
+ model.py
117
+
118
+ inference.py
119
+
120
+ Place them in the same folder locally.
121
+
122
+ Install requirements and run predict_video().
123
+
124
+ ## 📄 License
125
+
126
+ This model is released under the MIT License.
127
+ You are free to use, modify, and distribute it, with attribution.
128
+
129
+ ## 📚 Citation
130
+
131
+ If you use this model in your research, please cite:
132
+
133
+ ```bibtex
134
+ @inproceedings{faisalishfaq2025efficientvit,
135
+ title={Deepfake Detection with Efficientnet and ViT},
136
+ author={Faisal Ishfaq},
137
+ year={2025}
138
+ }
139
+ ```
140
+
141
+
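
Note on the README's **Output** line: it promises both a binary label (0=Real, 1=Fake) and frame-level probabilities, but the diff does not show how the per-frame scores become one video-level label. The sketch below is one plausible way to do it and is not taken from `inference.py`. The function name `aggregate_frame_probs`, the mean pooling, the 0.5 threshold, and the `prob_fake` key are all assumptions for illustration; only the `class` key and the 0/1 convention come from the README.

```python
from statistics import mean
from typing import Dict, List, Union


def aggregate_frame_probs(
    frame_probs: List[float], threshold: float = 0.5
) -> Dict[str, Union[int, float]]:
    """Combine per-frame fake probabilities into one video-level result.

    Hypothetical helper: mean pooling and the 0.5 threshold are
    illustrative assumptions, not the repository's confirmed rule.
    """
    if not frame_probs:
        # For example, no face was detected in any sampled frame.
        raise ValueError("no frame probabilities to aggregate")
    video_prob = mean(frame_probs)
    # README convention: 0 = real, 1 = fake.
    return {"class": int(video_prob >= threshold), "prob_fake": video_prob}


if __name__ == "__main__":
    print(aggregate_frame_probs([0.9, 0.8, 0.7]))
```

Mean pooling keeps a single noisy frame from flipping the label; if the actual pipeline uses a different rule (majority vote, max, or the transformer head's own sequence output), this helper would need to change accordingly.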