| | --- |
| | license: mit |
| | library_name: transformers |
| | tags: |
| | - florence-2 |
| | - deepfake-detection |
| | - computer-vision |
| | - multimodal |
| | - lora |
| | pipeline_tag: image-to-text |
| | --- |
| | |
| | # Verity-1A: Florence-2 + FLODA Deepfake Detection Model |
| |
|
| | ## π― Model Description |
| |
|
| | **Verity-1A** is an advanced multimodal model combining Microsoft's Florence-2-base with the FLODA-deepfake LoRA adapter for enhanced AI-generated content detection. This fusion creates a specialized model optimized for identifying deepfakes and AI-generated images while maintaining Florence-2's powerful vision-language capabilities. |
| |
|
| | ## ποΈ Model Architecture |
| |
|
| | - **Base Model**: Microsoft Florence-2-base (768d architecture) |
| | - **Enhancement**: FLODA-deepfake LoRA adapter |
| | - **Model Size**: ~447 MB |
| | - **Optimization**: PEFT-based fusion for efficient inference |
| |
|
| | ## π Key Features |
| |
|
| | - β
**Deepfake Detection**: Specialized for AI-generated content identification |
| | - β
**Multimodal**: Combines vision and language understanding |
| | - β
**Compact**: 6.7x smaller than Florence-2-large |
| | - β
**Production-Ready**: Fully validated and optimized |
| |
|
| | ## π Performance |
| |
|
| | - **Architecture**: 768-dimensional embeddings |
| | - **Parameters**: ~232M parameters |
| | - **Inference**: Optimized for real-time detection |
| | - **Compatibility**: Full Transformers ecosystem support |
| |
|
| | ## π οΈ Usage |
| |
|
| | ```python |
| | from transformers import AutoModelForCausalLM, AutoProcessor |
| | import torch |
| | |
| | # Load model |
| | model = AutoModelForCausalLM.from_pretrained( |
| | "zelus82/verity-1A", |
| | torch_dtype=torch.float16, |
| | trust_remote_code=True |
| | ) |
| | |
| | # Load processor |
| | processor = AutoProcessor.from_pretrained( |
| | "zelus82/verity-1A", |
| | trust_remote_code=True |
| | ) |
| | |
| | # Example usage for deepfake detection |
| | def detect_deepfake(image, text_prompt="Is this image AI-generated?"): |
| | inputs = processor(text=text_prompt, images=image, return_tensors="pt") |
| | |
| | with torch.no_grad(): |
| | generated_ids = model.generate( |
| | input_ids=inputs["input_ids"], |
| | pixel_values=inputs["pixel_values"], |
| | max_new_tokens=1024, |
| | num_beams=3 |
| | ) |
| | |
| | generated_text = processor.batch_decode(generated_ids, skip_special_tokens=False)[0] |
| | return generated_text |
| | ``` |
| |
|
| | ## π Training Details |
| |
|
| | - **Base Training**: Microsoft Florence-2-base foundation |
| | - **Specialization**: FLODA-deepfake LoRA fine-tuning |
| | - **Fusion Method**: PEFT merge_and_unload for optimal performance |
| | - **Validation**: Comprehensive 666-tensor validation passed |
| |
|
| | ## π Model Card |
| |
|
| | | Attribute | Value | |
| | |-----------|-------| |
| | | Model Type | Multimodal Vision-Language | |
| | | Base Architecture | Florence-2 | |
| | | Specialization | Deepfake Detection | |
| | | Model Size | 447 MB | |
| | | Parameters | ~232M | |
| | | Precision | Float16 | |
| | | License | MIT | |
| |
|
| | ## π§ Technical Specifications |
| |
|
| | - **Hidden Size**: 768 |
| | - **Vocabulary Size**: 51,289 |
| | - **Vision Encoder**: Advanced transformer-based |
| | - **Language Model**: Optimized for detection tasks |
| | - **LoRA Rank**: 8 (optimal efficiency/performance) |
| |
|
| | ## β οΈ Limitations |
| |
|
| | - Optimized specifically for deepfake detection tasks |
| | - Based on Florence-2-base architecture (768d) |
| | - Not compatible with Florence-2-large components |
| | - Requires trust_remote_code=True for full functionality |
| |
|
| | ## π Citation |
| |
|
| | ```bibtex |
| | @model{verity1a2024, |
| | title={Verity-1A: Florence-2 Enhanced Deepfake Detection}, |
| | author={zelus82}, |
| | year={2024}, |
| | publisher={Hugging Face}, |
| | url={https://huggingface.co/zelus82/verity-1A} |
| | } |
| | ``` |
| |
|
| | ## π€ Acknowledgments |
| |
|
| | - **Microsoft** for the Florence-2 foundation model |
| | - **FLODA** team for the deepfake detection adapter |
| | - **Hugging Face** for the ecosystem and hosting |
| |
|
| | ## π Contact |
| |
|
| | For questions or collaborations, please reach out through the Hugging Face community discussions. |
| |
|
| | --- |
| |
|
| | *Built with β€οΈ for safer AI content detection* |
| |
|