ash12321/flux-detector-vit

@@ -28,12 +28,6 @@ model-index:
     - type: f1
       value: 0.9985
       name: F1 Score
-    - type: precision
-      value: 1.0000
-      name: Precision
-    - type: recall
-      value: 0.9970
-      name: Recall
 ---
 # FLUX Detector - Vision Transformer
@@ -50,85 +44,20 @@ This model is a **specialized binary classifier** trained to detect images gener
 - ⚡ **Fast Inference**: ~10ms per image on GPU
 - 📊 **Well-Validated**: Separate train/val/test splits with no overlap
-### Model Details
-- **Base Model**: google/vit-base-patch16-224 (Vision Transformer)
-- **Task**: Binary Image Classification (Real vs FLUX-Fake)
-- **Input**: 224×224 RGB images
-- **Output**: 2 classes (0: Real, 1: FLUX-Fake)
-- **Parameters**: 85.8M total
-## Performance
-### Test Set Results
 ```
-Accuracy:  0.9985
-Precision: 1.0000 (PERFECT!)
-Recall:    0.9970
-F1 Score:  0.9985
-AUC-ROC:   1.0000 (PERFECT!)
 False Positive Rate: 0.0000 (0.0%!)
 False Negative Rate: 0.0030
 ```
-### Confusion Matrix
-```
-                Predicted
-              Real    Fake
-Actual Real   1000       0  ← Perfect!
-Actual Fake      3     997
-```
-**Interpretation:**
-- Out of 1,000 real images: **ALL 1,000 correctly identified (100%)**
-- Out of 1,000 FLUX images: 997 correctly identified (99.7%)
-- **ZERO false positives** - never calls real images fake!
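The reported test metrics follow directly from the confusion matrix counts above. As a quick sanity check, the sketch below recomputes them in plain Python. The variable names are illustrative and not taken from the repository, and FLUX-Fake is assumed to be the positive class.

```python
# Counts read off the confusion matrix (positive class = FLUX-Fake).
tn, fp = 1000, 0   # actual real: predicted real / predicted fake
fn, tp = 3, 997    # actual fake: predicted real / predicted fake

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
fpr = fp / (fp + tn)   # share of real images flagged as fake
fnr = fn / (fn + tp)   # share of FLUX images missed

print(f"Accuracy:  {accuracy:.4f}")
print(f"Precision: {precision:.4f}")
print(f"Recall:    {recall:.4f}")
print(f"F1 Score:  {f1:.4f}")
print(f"False Positive Rate: {fpr:.4f}")
print(f"False Negative Rate: {fnr:.4f}")
```

Note that the zero false positive rate is measured on 1,000 real test images, so it describes this test set and does not guarantee the same rate on other data.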
-## Training Details
-### Dataset
-**Training Data:**
-- Real Images: 8,000 (WikiArt paintings)
-- FLUX Images: 8,000 (generated with FLUX.1-dev at 20 steps)
-- Total: 16,000 images
-**Validation & Test:**
-- 2,000 images each (1,000 real + 1,000 FLUX)
-- Completely separate from training data
-- Different random seed than SDXL detector (no overlap)
-### Training Configuration
-```python
-Model: Vision Transformer (ViT-base-patch16-224)
-Optimizer: AdamW
-Learning Rate: 2e-5 (reduced to 1e-5 via scheduling)
-Batch Size: 32
-Epochs: 6 (early stopping from max 20)
-Training Time: 16.2 minutes
-Overfitting Prevention:
-- Early Stopping (patience=5)
-- Data Augmentation (random crops, flips, rotations, color jitter)
-- Dropout (0.1)
-- Label Smoothing (0.1)
-- Weight Decay (0.01)
-- Learning Rate Scheduling
-```
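The configuration lists early stopping with `patience=5`. The repository's training loop is not shown here, so the following is only a generic sketch of how patience-based early stopping is commonly implemented. The `EarlyStopping` class, the `min_delta` parameter, and the validation losses are all hypothetical.

```python
class EarlyStopping:
    """Signal a stop once validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=5, min_delta=0.0):
        self.patience = patience      # epochs to wait without improvement
        self.min_delta = min_delta    # minimum decrease that counts as improvement
        self.best_loss = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True if training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Hypothetical validation losses, for illustration only.
val_losses = [0.10, 0.12, 0.11, 0.13, 0.12, 0.14, 0.09]

stopper = EarlyStopping(patience=5)
stopped_at = None
for epoch, loss in enumerate(val_losses, start=1):
    if stopper.step(loss):
        stopped_at = epoch
        break

print(f"Stopped at epoch {stopped_at}, best validation loss {stopper.best_loss:.2f}")
```

In practice the checkpoint from the best epoch is usually restored after stopping, since the last epochs before the stop are by definition not the best ones.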
-## Usage
-### Installation
-```bash
-pip install transformers torch pillow
-```
-### Quick Start
 ```python
 import torch
@@ -143,105 +72,41 @@ processor = ViTImageProcessor.from_pretrained(
     "google/vit-base-patch16-224"
 )
-# Load and preprocess image
-image = Image.open("your_image.jpg")
 inputs = processor(images=image, return_tensors="pt")
 # Get prediction
 model.eval()
 with torch.no_grad():
     outputs = model(**inputs)
-    logits = outputs.logits
-    probs = torch.softmax(logits, dim=1)
-    prediction = logits.argmax(dim=1).item()
-# Interpret results
-if prediction == 1:
-    confidence = probs[0][1].item()
-    print(f"FLUX-Generated (confidence: {confidence:.2%})")
-else:
-    confidence = probs[0][0].item()
-    print(f"Real Image (confidence: {confidence:.2%})")
 ```
-### Advanced Usage with Threshold
 ```python
-def detect_flux(image_path, threshold=0.5):
-    """
-    Detect if image is FLUX-generated
-    Args:
-        image_path: Path to image
-        threshold: Classification threshold (default 0.5)
-    Returns:
-        dict: {is_flux: bool, confidence: float, label: str}
-    """
-    image = Image.open(image_path).convert('RGB')
-    inputs = processor(images=image, return_tensors="pt")
-    with torch.no_grad():
-        outputs = model(**inputs)
-        probs = torch.softmax(outputs.logits, dim=1)
-        flux_prob = probs[0][1].item()
-    is_flux = flux_prob > threshold
-    return {
-        'is_flux': is_flux,
-        'confidence': flux_prob if is_flux else (1 - flux_prob),
-        'label': 'FLUX-Generated' if is_flux else 'Real Image',
-        'flux_probability': flux_prob,
-        'real_probability': 1 - flux_prob
-    }
-# Example
-result = detect_flux("test_image.jpg")
-print(f"{result['label']} ({result['confidence']:.2%} confident)")
 ```
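The thresholding step in `detect_flux` can be examined without loading the model. The sketch below reproduces only the decision logic on raw logits, using a pure-Python softmax as a stand-in for `torch.softmax`. The helper names `softmax` and `decide` and the example logits are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decide(logits, threshold=0.5):
    """Map [real_logit, flux_logit] to a label using a threshold on the FLUX probability."""
    real_prob, flux_prob = softmax(logits)
    is_flux = flux_prob > threshold
    return {
        "is_flux": is_flux,
        "confidence": flux_prob if is_flux else real_prob,
        "label": "FLUX-Generated" if is_flux else "Real Image",
        "flux_probability": flux_prob,
        "real_probability": real_prob,
    }


# Hypothetical logits: index 0 = Real, index 1 = FLUX-Fake.
example_logits = [0.2, 1.1]
default_result = decide(example_logits)
strict_result = decide(example_logits, threshold=0.9)

print(default_result["label"], f"{default_result['confidence']:.2%}")
print(strict_result["label"], f"{strict_result['confidence']:.2%}")
```

Raising the threshold above 0.5 trades recall for fewer false positives. One side effect of this scheme is that an image labeled "Real Image" under a strict threshold can carry a reported confidence below 50%, because the confidence is the real-class probability, not the distance from the threshold.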
-## Limitations
-### What This Model Detects
-✅ **FLUX.1-dev generated images** (Black Forest Labs)
-### What This Model Does NOT Detect
-❌ Other AI generators (SDXL, Midjourney, DALL-E, etc.)
-❌ FLUX.1-schnell (4-step variant) - not tested
-❌ FLUX 2 (newer version) - not tested
-❌ Edited/manipulated real images
-❌ Heavily compressed or low-quality images may reduce accuracy
-**Note**: This model was trained on FLUX.1-dev images generated at 20 steps. It should generalize to other step counts (15-50 steps), though no results for those settings are reported here.
-**Recommendation**: Use as part of an ensemble with other specialized detectors for comprehensive AI detection.
-## Intended Use
-### Primary Use Cases
-- Content moderation platforms
-- Academic research on AI-generated content
-- Watermarking and provenance systems
-- Educational tools for AI literacy
-- FLUX-specific image verification
-### Out-of-Scope Uses
-- Sole basis for legal decisions
-- Detection of non-FLUX generators without validation
-- Processing of illegal or harmful content
-## Ethical Considerations
-- This model should be used responsibly as part of broader content verification systems
-- Performance may degrade on images outside the training distribution
-- Always combine automated detection with human review for critical decisions
-- Be transparent about using AI detection systems
-- The model's zero false positive rate makes it particularly suitable for applications where falsely flagging real content is problematic
 ## Citation
@@ -255,16 +120,7 @@ print(f"{result['label']} ({result['confidence']:.2%} confident)")
 }
 ```
-## Model Card Authors
-ash12321
-## Model Card Contact
-For questions or feedback, please open an issue on the model repository.
 ---
 **Created**: 2025-12-31
-**Framework**: PyTorch + Transformers
-**License**: Apache 2.0

     - type: f1
       value: 0.9985
       name: F1 Score
 ---
 # FLUX Detector - Vision Transformer
 - ⚡ **Fast Inference**: ~10ms per image on GPU
 - 📊 **Well-Validated**: Separate train/val/test splits with no overlap
+### Performance
 ```
+Test Accuracy:  0.9985
+Precision:      1.0000 (PERFECT!)
+Recall:         0.9970
+F1 Score:       0.9985
+AUC-ROC:        1.0000 (PERFECT!)
 False Positive Rate: 0.0000 (0.0%!)
 False Negative Rate: 0.0030
 ```
+## Quick Start
 ```python
 import torch
     "google/vit-base-patch16-224"
 )
+# Load image (convert to RGB so grayscale or RGBA files match the model's 3-channel input)
+image = Image.open("test.jpg").convert("RGB")
 inputs = processor(images=image, return_tensors="pt")
 # Get prediction
 model.eval()
 with torch.no_grad():
     outputs = model(**inputs)
+    probs = torch.softmax(outputs.logits, dim=1)
+    if probs[0][1] > 0.5:
+        print(f"FLUX-Generated ({probs[0][1]:.2%} confident)")
+    else:
+        print(f"Real Image ({probs[0][0]:.2%} confident)")
 ```
+## Using the model.py Helper
 ```python
+from model import detect_image
+result = detect_image("test.jpg", model_path="ash12321/flux-detector-vit")
+print(f"Is Fake: {result['is_fake']}")
+print(f"Confidence: {result['confidence']:.2%}")
 ```
+## Files in this Repository
+- `pytorch_model.bin` - Model weights
+- `config.json` - Model configuration
+- `model.py` - Model architecture and helper functions
+- `README.md` - This documentation
+- `training_results.json` - Detailed training metrics
+- `training_curves.png` - Training visualization
+- `confusion_matrix.png` - Test set confusion matrix
 ## Citation
 }
 ```
 ---
+**License**: Apache 2.0
 **Created**: 2025-12-31