verity-1A / README.md
zelus82's picture
Upload folder using huggingface_hub
9b737ee verified
---
license: mit
library_name: transformers
tags:
- florence-2
- deepfake-detection
- computer-vision
- multimodal
- lora
pipeline_tag: image-to-text
---
# Verity-1A: Florence-2 + FLODA Deepfake Detection Model
## 🎯 Model Description
**Verity-1A** is an advanced multimodal model combining Microsoft's Florence-2-base with the FLODA-deepfake LoRA adapter for enhanced AI-generated content detection. This fusion creates a specialized model optimized for identifying deepfakes and AI-generated images while maintaining Florence-2's powerful vision-language capabilities.
## πŸ—οΈ Model Architecture
- **Base Model**: Microsoft Florence-2-base (768d architecture)
- **Enhancement**: FLODA-deepfake LoRA adapter
- **Model Size**: ~447 MB
- **Optimization**: PEFT-based fusion for efficient inference
## πŸš€ Key Features
- βœ… **Deepfake Detection**: Specialized for AI-generated content identification
- βœ… **Multimodal**: Combines vision and language understanding
- βœ… **Compact**: 6.7x smaller than Florence-2-large
- βœ… **Production-Ready**: Fully validated and optimized
## πŸ“Š Performance
- **Architecture**: 768-dimensional embeddings
- **Parameters**: ~232M parameters
- **Inference**: Optimized for real-time detection
- **Compatibility**: Full Transformers ecosystem support
## πŸ› οΈ Usage
```python
from transformers import AutoModelForCausalLM, AutoProcessor
import torch
# Load model
model = AutoModelForCausalLM.from_pretrained(
"zelus82/verity-1A",
torch_dtype=torch.float16,
trust_remote_code=True
)
# Load processor
processor = AutoProcessor.from_pretrained(
"zelus82/verity-1A",
trust_remote_code=True
)
# Example usage for deepfake detection
def detect_deepfake(image, text_prompt="Is this image AI-generated?"):
inputs = processor(text=text_prompt, images=image, return_tensors="pt")
with torch.no_grad():
generated_ids = model.generate(
input_ids=inputs["input_ids"],
pixel_values=inputs["pixel_values"],
max_new_tokens=1024,
num_beams=3
)
generated_text = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
return generated_text
```
## πŸŽ“ Training Details
- **Base Training**: Microsoft Florence-2-base foundation
- **Specialization**: FLODA-deepfake LoRA fine-tuning
- **Fusion Method**: PEFT merge_and_unload for optimal performance
- **Validation**: Comprehensive 666-tensor validation passed
## πŸ“‹ Model Card
| Attribute | Value |
|-----------|-------|
| Model Type | Multimodal Vision-Language |
| Base Architecture | Florence-2 |
| Specialization | Deepfake Detection |
| Model Size | 447 MB |
| Parameters | ~232M |
| Precision | Float16 |
| License | MIT |
## πŸ”§ Technical Specifications
- **Hidden Size**: 768
- **Vocabulary Size**: 51,289
- **Vision Encoder**: Advanced transformer-based
- **Language Model**: Optimized for detection tasks
- **LoRA Rank**: 8 (optimal efficiency/performance)
## ⚠️ Limitations
- Optimized specifically for deepfake detection tasks
- Based on Florence-2-base architecture (768d)
- Not compatible with Florence-2-large components
- Requires trust_remote_code=True for full functionality
## πŸ“„ Citation
```bibtex
@model{verity1a2024,
title={Verity-1A: Florence-2 Enhanced Deepfake Detection},
author={zelus82},
year={2024},
publisher={Hugging Face},
url={https://huggingface.co/zelus82/verity-1A}
}
```
## 🀝 Acknowledgments
- **Microsoft** for the Florence-2 foundation model
- **FLODA** team for the deepfake detection adapter
- **Hugging Face** for the ecosystem and hosting
## πŸ“ž Contact
For questions or collaborations, please reach out through the Hugging Face community discussions.
---
*Built with ❀️ for safer AI content detection*