Vision Transformer (ViT) model fine-tuned for detecting AI-generated images in f

- **Finetuned from:** timm/vit_small_patch16_384.augreg_in21k_ft_in1k

### Model Sources
- **Repository:** [JeongsooP/Community-Forensics](https://github.com/JeongsooP/Community-Forensics)
- **Paper:** [arXiv:2411.04125](https://arxiv.org/pdf/2411.04125)

## Uses
### Direct Use
Detect AI-generated images in:
- Digital forensic investigations
- Media authenticity verification

### Out-of-Scope Use
- Detecting videos or text content
- Identifying generative model architectures (use Transformers-based detectors instead)

## Bias, Risks, and Limitations
- **Performance variance:** Accuracy drops 15-20% on diffusion-generated images vs GAN-generated
- **Geometric artifacts:** Struggles with rotated/flipped synthetic images
- **Data bias:** Trained primarily on LAION and COCO derivatives ([source](https://arxiv.org/pdf/2411.04125))
- **ADDED BY UPLOADER:** The model is already out of date and fails to detect images from newer generators.

### Recommendations
- Combine with error-level analysis for improved robustness
- Update model quarterly to address new generator architectures
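The error-level analysis mentioned above is not spelled out in this card; the following is a minimal sketch of the usual recompression-difference technique, assuming Pillow and an illustrative JPEG quality of 90 (none of these choices come from the repository):

```python
# Minimal error-level analysis (ELA) sketch -- an illustration of the general
# technique, not code from this repository.
from io import BytesIO
from PIL import Image, ImageChops

def error_level_analysis(image_path, quality=90):
    """Return the per-pixel difference between an image and a JPEG recompression of it."""
    original = Image.open(image_path).convert("RGB")
    buf = BytesIO()
    original.save(buf, format="JPEG", quality=quality)  # recompress at a known quality
    buf.seek(0)
    recompressed = Image.open(buf).convert("RGB")
    # Regions whose compression history differs from the rest of the image
    # tend to stand out in this difference map.
    return ImageChops.difference(original, recompressed)
```

Bright, localized regions in the returned difference image suggest areas with a different compression history; any decision threshold is application-specific.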

## How to Use
```python
# Only the final line of this snippet survives in the source; everything above
# it is assumed transformers boilerplate, and the model id is a placeholder.
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForImageClassification

model_id = "path/to/model"  # placeholder: the card does not state the hosted id here
processor = AutoImageProcessor.from_pretrained(model_id)
model = AutoModelForImageClassification.from_pretrained(model_id)

image = Image.open("input.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")
outputs = model(**inputs)
predicted_class = outputs.logits.argmax(-1)  # class index; label mapping not stated in the card
```
## Training Details
### Training Data
- 2.7M images from 15+ generators and 4,600+ models
- Over 1.15 TB of images
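A rough back-of-the-envelope from the two figures above (assuming decimal terabytes; illustrative only):

```python
# Average file size implied by the training-data figures above
total_bytes = 1.15e12   # 1.15 TB (decimal)
num_images = 2.7e6      # 2.7 million images
avg_kb = total_bytes / num_images / 1e3
print(round(avg_kb))    # ~426 KB per image on average
```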
### Training Hyperparameters
- **Framework:** PyTorch 2.0

| Metric  | Value |
|---------|-------|
| AUC-ROC | 0.992 |
| FP Rate | 2.1%  |
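To make the false-positive rate concrete (simple arithmetic, not a figure from the card):

```python
# Expected false flags on a batch of authentic images at the reported FP rate
fp_rate = 0.021          # 2.1% from the table above
real_images = 10_000     # hypothetical batch size
expected_false_flags = fp_rate * real_images
print(int(expected_false_flags))  # 210
```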

### Model Architecture
- ViT-Small with 16x16 patch embeddings
- 384x384 input resolution
- 12 transformer layers
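As a quick consistency check on the numbers above (standard ViT arithmetic, not stated in the card):

```python
# Token count implied by the architecture bullets above
image_size, patch_size = 384, 16
grid = image_size // patch_size   # 24 patches per side
num_tokens = grid * grid + 1      # 576 patch tokens + 1 class token
print(grid, num_tokens)           # 24 577
```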
## Citation
**BibTeX:**