---
license: mit
language:
- en
base_model:
- google/gemma-4-E2B-it
tags:
- gemma
- gguf
- multimodal
- vision
- wildlife-monitoring
- quantized
- audio
- text
---

## 🐘 EleGuard: Multimodal Elephant Detection

**EleGuard** is a specialized, multimodal Vision-Language Model (VLM) developed for the **24/7 monitoring of elephant activity** in natural habitats. By leveraging infrared (IR) imagery and bioacoustic signals, EleGuard provides a robust solution for human-elephant conflict mitigation and wildlife conservation.

## Model Summary

* **Project Name:** EleGuard
* **Base Architecture:** This model is a variant based on **Gemma 4 E2B**.
* **Modality:** Multimodal (Vision + Acoustic via Spectrograms).
* **Format:** GGUF (Optimized for edge deployment).
* **Training data:** [EleGuard Dataset](https://www.kaggle.com/datasets/malithabandara/eleguard-dataset)
* **Training Method:** Knowledge Distillation from Gemini 3.1 Flash.

## Technical Innovation: Reasoning Distillation

The core breakthrough of EleGuard is the shift from simple classification to **expert reasoning**. Instead of training only on labels, the model was fine-tuned on "thought blocks" generated by a Teacher model (Gemini 3.1 Flash).

For every image or audio sample, the model is trained to explain its reasoning (such as identifying thermal signatures in thick brush or frequency patterns in a rumble) before outputting a final status:

* **ALERT:** Elephant presence confirmed.
* **SAFE:** No threat detected.

## Dataset Details

The model was trained on a curated dataset of **2,600 samples** organized into:

* **Visual Imagery:** High-resolution daytime and **Infrared (IR)** forest captures.
* **Acoustic Data:** Mel Spectrograms identifying vocalizations like rumbles, roars, and trumpets.
* **Paired Expert Labels:** Detailed JSON reasoning files for every media asset.

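
As a minimal sketch of how downstream code might consume the reason-then-status output described under Reasoning Distillation: the snippet below assumes the final status appears in the generated text as a standalone upper-case `ALERT` or `SAFE` token and takes the last such token. The exact output template is not specified in this card, and `extract_status` is a hypothetical helper name, so treat this as an illustration rather than the project's own post-processing.

```python
import re
from typing import Optional

# Assumption: the status is emitted as a standalone upper-case word.
# The real EleGuard output template is not documented in this card.
_STATUS_RE = re.compile(r"\b(ALERT|SAFE)\b")


def extract_status(model_output: str) -> Optional[str]:
    """Return the last ALERT/SAFE token in the text, or None if none is found.

    The last match is used because the model is described as reasoning
    first and stating its final status afterwards.
    """
    matches = _STATUS_RE.findall(model_output)
    return matches[-1] if matches else None


if __name__ == "__main__":
    # Hypothetical model output, written by hand for illustration only.
    sample = (
        "A large warm silhouette is visible behind the brush in the IR frame. "
        "Status: ALERT"
    )
    print(extract_status(sample))
```

Returning `None` when no status token is present lets the caller decide how to handle an inconclusive generation (for example, by retrying or escalating) instead of silently defaulting to `SAFE`.
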
## Usage & Deployment

This repository contains the model weights in **GGUF** format, intended for edge devices (Raspberry Pi, Jetson Nano, or standard laptops) and runnable with tools like `llama.cpp` or `Ollama`.

### Required Files:

1. `EleGuard-gemma-4-e2b-it.GGUF` (Main model weights)
2. `EleGuard-gemma-4-e2b-it.mmproj.GGUF` (Multimodal vision projector)

## Acknowledgments & Trademarks

* Gemma is a trademark of Google LLC.
* EleGuard is a fine-tuned model based on Gemma 4 E2B and trained on the EleGuard Dataset.
* This project was developed for [The Gemma 4 Good Hackathon](https://www.kaggle.com/competitions/gemma-4-good-hackathon/overview) using the Unsloth fine-tuning framework.

---