---
library_name: peft
base_model: google/gemma-3-4b-it
tags:
- vision
- image-classification
- beans
- plant-disease
- gemma-3
- lora
- fine-tuned
license: gemma
---

# Gemma-3-4B Fine-tuned for Bean Disease Classification

This model is a fine-tuned version of [google/gemma-3-4b-it](https://huggingface.co/google/gemma-3-4b-it) for classifying bean plant diseases.

## Model Description

- **Base Model:** Gemma-3-4B-IT (Vision)
- **Fine-tuning Method:** LoRA (r=8, alpha=16)
- **Dataset:** [beans](https://huggingface.co/datasets/beans) (100 samples)
- **Task:** Image captioning / disease classification
- **Final Validation Loss:** 0.001

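To make the LoRA settings listed above concrete, the sketch below computes the scaling factor that standard LoRA applies to the adapter update (`alpha / r`) and the number of trainable parameters a single adapter adds. The 2560 x 2560 layer shape is a hypothetical value chosen for illustration; this card does not state which modules were adapted or what their dimensions are.

```python
# LoRA hyperparameters stated in this card.
r = 8
alpha = 16


def lora_scaling(r: int, alpha: int) -> float:
    """Scaling applied to the low-rank update B @ A in standard LoRA."""
    return alpha / r


def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters of one adapter: A is (r x d_in), B is (d_out x r)."""
    return r * d_in + d_out * r


# Hypothetical layer shape, for illustration only (not taken from this card).
d_in, d_out = 2560, 2560

scaling = lora_scaling(r, alpha)
adapter_params = lora_param_count(d_in, d_out, r)
full_params = d_in * d_out

print(f"scaling factor: {scaling}")
print(f"adapter params: {adapter_params} vs. full matrix: {full_params}")
print(f"trainable fraction: {adapter_params / full_params:.4%}")
```

Under these assumptions the adapter trains well under 1% of the parameters of the matrix it modifies, which is what makes fine-tuning a 4B model on a single small GPU plausible.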
## Classes

1. Healthy bean plant
2. Angular leaf spot disease
3. Bean rust disease

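Because the model answers with free text rather than a class index, a downstream application usually needs to map the generated caption back to one of the three classes above. The following is a minimal sketch of one way to do that with keyword matching. The label names follow the beans dataset (`healthy`, `angular_leaf_spot`, `bean_rust`); the helper `caption_to_label` and its keyword list are hypothetical, and the exact wording of the model's captions is an assumption.

```python
from typing import Optional

# Label names used by the beans dataset, mapped to keywords to look for.
# The keywords are assumptions about the caption wording, not confirmed output.
# Disease labels come first so they take priority over "healthy".
LABEL_KEYWORDS = {
    "angular_leaf_spot": ("angular leaf spot",),
    "bean_rust": ("bean rust", "rust"),
    "healthy": ("healthy",),
}


def caption_to_label(caption: str) -> Optional[str]:
    """Return the first label whose keyword appears in the caption, else None."""
    text = caption.lower()
    for label, keywords in LABEL_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return label
    return None


print(caption_to_label("This bean plant shows angular leaf spot disease."))
print(caption_to_label("A healthy bean plant."))
```

Plain substring matching is fragile: a caption such as "not healthy" would still map to `healthy`, so check the mapping against real model output before relying on it.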
## Usage

```python
from transformers import AutoProcessor, Gemma3ForConditionalGeneration
from peft import PeftModel
from PIL import Image
import torch

# Load base model
base_model = Gemma3ForConditionalGeneration.from_pretrained(
    "google/gemma-3-4b-it",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "Nefflymicn/gemma3-4b-bean-captioning")
processor = AutoProcessor.from_pretrained("Nefflymicn/gemma3-4b-bean-captioning")

# Prepare input (convert to RGB so grayscale or RGBA files are handled too)
image = Image.open("bean_plant.jpg").convert("RGB")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Describe this plant image."}
        ]
    }
]

text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=text, images=image, return_tensors="pt").to(model.device)

# Generate (greedy decoding)
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=False)

# Decode only the newly generated tokens, skipping the echoed prompt
prompt_length = inputs["input_ids"].shape[-1]
response = processor.decode(outputs[0][prompt_length:], skip_special_tokens=True)
print(response)
```

## Training Details

- **Epochs:** 10
- **Batch Size:** 1 (effective: 4 with gradient accumulation)
- **Learning Rate:** 5e-5
- **Precision:** FP16
- **Hardware:** NVIDIA T4 GPU
- **Training Time:** ~25 minutes
- **Max Sequence Length:** 512 tokens

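The batch-size entry above implies a gradient accumulation factor, and together with the epoch count it fixes the rough number of optimizer updates. The sketch below works through that arithmetic. It assumes a single GPU and that all 100 samples were used for training; the card does not state the train/validation split, so treat the step counts as an upper-bound estimate.

```python
import math

# Values stated in this card.
epochs = 10
per_device_batch_size = 1
effective_batch_size = 4

# Assumption: all 100 samples are used for training (the split is not stated).
num_train_samples = 100

# Micro-batches whose gradients are summed before each optimizer update.
gradient_accumulation_steps = effective_batch_size // per_device_batch_size

# One optimizer step consumes one effective batch.
steps_per_epoch = math.ceil(num_train_samples / effective_batch_size)
total_optimizer_steps = steps_per_epoch * epochs

print(f"gradient accumulation steps: {gradient_accumulation_steps}")
print(f"optimizer steps per epoch:   {steps_per_epoch}")
print(f"total optimizer steps:       {total_optimizer_steps}")
```

With a held-out validation set, `num_train_samples` would be smaller and the step counts would shrink accordingly.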
## Performance

- **Final Training Loss:** 0.69
- **Final Validation Loss:** 0.001
- **Accuracy:** Not measured directly; assumed to be high based on the low validation loss

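Since the accuracy entry above is inferred from validation loss rather than measured, it may help to see how classification accuracy could be computed once predicted labels are available. This is a minimal sketch, not the evaluation used for this card. The helpers `accuracy` and `per_class_recall` are hypothetical, and the label lists are made-up values that only demonstrate the calculation.

```python
from collections import Counter


def accuracy(y_true: list, y_pred: list) -> float:
    """Fraction of predictions that exactly match the reference label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty list")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_recall(y_true: list, y_pred: list) -> dict:
    """For each reference class, the fraction of its samples predicted correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


# Made-up labels purely to show the calculation; not results from this model.
y_true = ["healthy", "bean_rust", "angular_leaf_spot", "bean_rust"]
y_pred = ["healthy", "bean_rust", "bean_rust", "bean_rust"]

print(accuracy(y_true, y_pred))
print(per_class_recall(y_true, y_pred))
```

Reporting per-class recall alongside overall accuracy shows whether errors are concentrated in one disease, which a single loss value cannot reveal.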
## Limitations

- Trained on only 100 images, for demonstration purposes
- Covers only the three classes in the training data (healthy, angular leaf spot, bean rust)
- May not generalize to other bean varieties or diseases
- Should be validated on real-world data before production use

## Citation

If you use this model, please cite:

```bibtex
@misc{gemma3-bean-captioning,
  author = {younaice},
  title = {Gemma-3-4B Fine-tuned for Bean Disease Classification},
  year = {2024},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/Nefflymicn/gemma3-4b-bean-captioning}}
}
```

## License

This model inherits the Gemma license from the base model. Please refer to the [Gemma license](https://ai.google.dev/gemma/terms) for usage terms.