---
library_name: peft
base_model: google/gemma-3-4b-it
tags:
- vision
- image-classification
- beans
- plant-disease
- gemma-3
- lora
- fine-tuned
license: gemma
---

# Gemma-3-4B Fine-tuned for Bean Disease Classification

This model is a fine-tuned version of [google/gemma-3-4b-it](https://huggingface.co/google/gemma-3-4b-it) for classifying bean plant diseases.

## Model Description

- **Base Model:** Gemma-3-4B-IT (Vision)
- **Fine-tuning Method:** LoRA (r=8, alpha=16)
- **Dataset:** [beans](https://huggingface.co/datasets/beans) (100 samples)
- **Task:** Image captioning / disease classification
- **Final Validation Loss:** 0.001
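As a rough illustration of what `r=8, alpha=16` means, the sketch below computes the scaling factor applied to the low-rank update and the number of trainable parameters LoRA adds to a single weight matrix. It assumes the standard LoRA scaling of `alpha / r`; the 2048 x 2048 layer size is a hypothetical value chosen only for the arithmetic, not a dimension taken from Gemma-3.

```python
def lora_scaling(r: int, alpha: int) -> float:
    """Scaling applied to the low-rank update B @ A (standard LoRA: alpha / r)."""
    return alpha / r


def lora_params(d_out: int, d_in: int, r: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix.

    A has shape (r, d_in) and B has shape (d_out, r).
    """
    return r * (d_in + d_out)


scaling = lora_scaling(r=8, alpha=16)

# Hypothetical layer size, used only for illustration.
added = lora_params(d_out=2048, d_in=2048, r=8)
full = 2048 * 2048

print(scaling)      # 2.0
print(added, full)  # 32768 4194304
```

Under these assumptions the adapter trains under 1% of the parameters of the matrix it wraps, which is why the run fits on a single small GPU.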

## Classes

1. Healthy bean plant
2. Angular leaf spot disease
3. Bean rust disease

## Usage

```python
from transformers import AutoProcessor, Gemma3ForConditionalGeneration
from peft import PeftModel
from PIL import Image
import torch

# Load base model
base_model = Gemma3ForConditionalGeneration.from_pretrained(
    "google/gemma-3-4b-it",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "Khytron/gemma3-4b-bean-captioning")
processor = AutoProcessor.from_pretrained("Khytron/gemma3-4b-bean-captioning")

# Prepare input
image = Image.open("bean_plant.jpg").convert("RGB")  # ensure 3-channel input
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Describe this plant image."}
        ]
    }
]

text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=text, images=image, return_tensors="pt").to(model.device)

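# Generate with greedy decoding
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=False)

# Decode only the newly generated tokens; outputs[0] also contains the prompt
prompt_length = inputs["input_ids"].shape[-1]
response = processor.decode(outputs[0][prompt_length:], skip_special_tokens=True)
print(response)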
```
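The model returns free text rather than a class index, so a small post-processing step is needed to turn a caption into one of the three classes. The following is a minimal sketch using keyword matching; the helper `caption_to_label`, the keyword patterns, and the label strings are all assumptions for illustration and are not part of this repository.

```python
import re

# Hypothetical label names and keyword patterns; adjust to the captions
# the model actually produces.
CLASS_PATTERNS = {
    "angular_leaf_spot": re.compile(r"\bangular leaf spot\b"),
    "bean_rust": re.compile(r"\brust\b"),
    # Word boundaries keep "unhealthy" from matching "healthy".
    "healthy": re.compile(r"\bhealthy\b"),
}


def caption_to_label(caption: str) -> str:
    """Map a generated caption to a class label, or "unknown" if none match."""
    text = caption.lower()
    for label, pattern in CLASS_PATTERNS.items():
        if pattern.search(text):
            return label
    return "unknown"


print(caption_to_label("Angular leaf spot disease"))  # angular_leaf_spot
print(caption_to_label("Healthy bean plant"))         # healthy
print(caption_to_label("A photo of a tractor"))       # unknown
```

Returning `"unknown"` instead of guessing makes off-topic or malformed generations visible when the predictions are scored.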

## Training Details

- **Epochs:** 10
- **Batch Size:** 1 (effective: 4 with gradient accumulation)
- **Learning Rate:** 5e-5
- **Precision:** FP16
- **Hardware:** NVIDIA T4 GPU
- **Training Time:** ~25 minutes
- **Max Sequence Length:** 512 tokens
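The effective batch size and the resulting number of optimizer steps follow from simple arithmetic, sketched below. The gradient accumulation value of 4 is derived from the figures above (effective 4 / batch size 1). The sketch assumes all 100 samples were used for training, which this card does not state, and it rounds partial batches up; the exact rounding depends on the training framework.

```python
import math


def optimizer_steps(num_train_samples: int, per_device_batch: int,
                    grad_accum: int, epochs: int) -> tuple[int, int]:
    """Return (effective batch size, total optimizer steps)."""
    effective_batch = per_device_batch * grad_accum
    # Assumption: a final partial batch still triggers an optimizer step.
    steps_per_epoch = math.ceil(num_train_samples / effective_batch)
    return effective_batch, steps_per_epoch * epochs


# Assumption: all 100 samples are in the training split.
effective, total = optimizer_steps(
    num_train_samples=100, per_device_batch=1, grad_accum=4, epochs=10
)
print(effective, total)  # 4 250
```

If part of the 100 samples was held out for validation, the step count is proportionally lower.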

## Performance

- **Final Training Loss:** 0.69
- **Final Validation Loss:** 0.001
- **Accuracy:** Not measured directly; expected to be high given the low validation loss
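A token-level loss does not translate directly into classification accuracy, so a direct measurement is more informative. The sketch below shows one way to score predicted labels against ground truth. The function names are hypothetical, and the label lists are placeholder values made up to exercise the code, not results from this model.

```python
from collections import Counter


def accuracy(y_true: list[str], y_pred: list[str]) -> float:
    """Fraction of predictions that match the ground-truth label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty set")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_recall(y_true: list[str], y_pred: list[str]) -> dict[str, float]:
    """Recall for each class that appears in y_true."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


# Placeholder labels for illustration only; not measured results.
y_true = ["healthy", "bean_rust", "angular_leaf_spot", "healthy"]
y_pred = ["healthy", "bean_rust", "healthy", "healthy"]

print(accuracy(y_true, y_pred))  # 0.75
print(per_class_recall(y_true, y_pred))
```

Per-class recall is worth reporting alongside overall accuracy, since a model can score well overall while missing one disease entirely.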

## Limitations

- Trained on 100 images for demonstration purposes
- Best suited for the 3 specific bean disease types in the training data
- May not generalize to other bean varieties or diseases
- Should be validated on real-world data before production use

## Citation

If you use this model, please cite:

```bibtex
@misc{gemma3-bean-captioning,
  author = {younaice},
  title = {Gemma-3-4B Fine-tuned for Bean Disease Classification},
  year = {2024},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/Khytron/gemma3-4b-bean-captioning}}
}
```

## License

This model inherits the Gemma license from the base model. Please refer to the [Gemma license](https://ai.google.dev/gemma/terms) for usage terms.