hamzenium committed commit 42053c6 (verified) · 1 parent: c8c46cf

Create README.md

Files changed (1): README.md (+64 -0)
README.md ADDED
---
license: mit
language:
- en
base_model:
- google/vit-base-patch16-224
pipeline_tag: image-classification
tags:
- deepfake
- fakeimages
- detector
- fake
- vision-transformer
- vit
- image-classification
- computer-vision
- deep-learning
model_name: ViT Deepfake Detector
model_creator: Hamza Sohail, Ayaan Mohammed, Shadab Karim, Kirti Dhir
model_type: vision-transformer
datasets:
- faceforensics++
- celeb-df
- dfdc
- custom-generated
library_name: transformers
library_version: "4.40.0"
inference: true
model_description: |
  This model is a fine-tuned version of Google's `vit-base-patch16-224` Vision Transformer, trained for the binary classification task of detecting deepfake images. It outputs probabilities indicating whether a given image is real or fake.

  The model was trained on a combination of real and manipulated images sourced from the FaceForensics++, Celeb-DF, and DFDC datasets, along with additional synthetic samples. It leverages the ViT architecture's ability to capture spatial and contextual features across image patches for effective fake-content detection.

  The primary applications of this model are fake-image detection, digital media integrity validation, and social-platform moderation tools.
training_details: |
  - Base model: google/vit-base-patch16-224
  - Epochs: 10
  - Optimizer: AdamW
  - Loss: CrossEntropyLoss
  - Learning rate: 5e-5
  - Scheduler: CosineAnnealingLR
  - Batch size: 32
  - Framework: PyTorch with Hugging Face Transformers
  - Hardware: Tesla T4 GPU
evaluation: |
  The model was evaluated on a stratified test set of 10,000 images from multiple sources, achieving:
  - Accuracy: 95.7%
  - Precision: 95.7%
  - Recall: 95.7%
  - F1-score: 95.7%

  A confusion matrix and ROC curves were generated to analyze misclassifications and detection confidence.
intended_uses: |
  This model is intended for:
  - Automated detection of manipulated or deepfake images in social media content.
  - Research in digital forensics and AI ethics.
  - Educational purposes for understanding the application of Vision Transformers.

  **Limitations:** This model may not generalize to manipulation techniques that are not present in the training datasets. It is not intended for use in real-time legal or security-critical applications without additional verification mechanisms.
example_usage:
---
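The example-usage entry is left empty in this commit. Below is a minimal, hedged sketch of how such a detector is typically invoked: the actual Hugging Face repo id is not stated in the card, so `<repo-id>` is a placeholder, and the `("real", "fake")` label order is an assumption. The function shows the softmax step the card's description implies (logits in, per-class probabilities out).

```python
# Hypothetical usage sketch -- the Hugging Face repo id for this detector is
# not stated in the card, so "<repo-id>" below is a placeholder.
#
# from transformers import pipeline
# detector = pipeline("image-classification", model="<repo-id>")
# detector("suspect.jpg")   # returns a list of {"label": ..., "score": ...} dicts

# What the pipeline does after the ViT forward pass: softmax over the two
# class logits, then map the winning index to its label.
import math

def logits_to_prediction(logits, labels=("real", "fake")):
    m = max(logits)                      # subtract max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = probs.index(max(probs))
    return labels[best], probs[best]

label, score = logits_to_prediction([-1.3, 2.1])  # illustrative logits
print(label, round(score, 3))  # -> fake 0.968
```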
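The hyperparameters listed under training_details can be sketched as a PyTorch setup. This is an assumed reconstruction, not the authors' actual training script; the `model` below is a stand-in linear head, not the fine-tuned ViT.

```python
# Assumed reconstruction of the optimization setup from training_details;
# `model` is a placeholder module, not the actual fine-tuned ViT.
import torch

model = torch.nn.Linear(768, 2)  # stand-in for the 2-class ViT head
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)   # Optimizer / Learning rate
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=10)
criterion = torch.nn.CrossEntropyLoss()                      # Loss

for epoch in range(10):  # Epochs: 10
    # ...iterate minibatches of size 32, compute criterion(outputs, targets),
    # call loss.backward() and optimizer.step()...
    scheduler.step()     # anneal the learning rate once per epoch
```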
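For reference, the four evaluation figures can all be derived from a binary confusion matrix. The counts below are illustrative placeholders chosen to match the reported 95.7% on a 10,000-image test set; they are not the model's published error breakdown.

```python
# Illustrative confusion-matrix counts (NOT the model's published results):
# a symmetric error pattern on a 10,000-image test set.
tp, fp = 4785, 215   # fakes correctly flagged / reals wrongly flagged
fn, tn = 215, 4785   # fakes missed / reals correctly passed

accuracy = (tp + tn) / (tp + fp + fn + tn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(accuracy, precision, recall, f1)  # each equals 0.957 for these counts
```

Note that precision, recall, and F1 coincide exactly when false positives and false negatives are balanced, which is consistent with all four reported metrics being 95.7%.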