---
library_name: peft
base_model: google/gemma-3-4b-it
tags:
- vision
- image-classification
- beans
- plant-disease
- gemma-3
- lora
- fine-tuned
license: gemma
---

# Gemma-3-4B Fine-tuned for Bean Disease Classification

This model is a fine-tuned version of [google/gemma-3-4b-it](https://huggingface.co/google/gemma-3-4b-it) for classifying bean plant diseases.

## Model Description

- **Base Model:** Gemma-3-4B-IT (Vision)
- **Fine-tuning Method:** LoRA (r=8, alpha=16)
- **Dataset:** [beans](https://huggingface.co/datasets/beans) (100-sample subset)
- **Task:** Image captioning / disease classification
- **Final Validation Loss:** 0.001

## Classes

1. Healthy bean plant
2. Angular leaf spot disease
3. Bean rust disease
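
Because the model answers in free text rather than with a class index, a downstream script usually needs to map the generated caption back to one of the three classes. The following is a minimal sketch of such a mapping. The helper name `caption_to_label`, the label strings, and the keyword patterns are all assumptions chosen for illustration; they are not taken from the training captions and may need adjusting to the wording the model actually produces.

```python
import re

# Hypothetical keyword patterns for each class (assumptions, not derived
# from the training captions). Word boundaries keep "unhealthy" from
# matching the "healthy" pattern.
CLASS_PATTERNS = {
    "angular_leaf_spot": re.compile(r"\bangular leaf spot\b"),
    "bean_rust": re.compile(r"\brust\b"),
    "healthy": re.compile(r"\bhealthy\b"),
}


def caption_to_label(caption: str) -> str:
    """Return the first class whose pattern occurs in the caption, else 'unknown'."""
    text = caption.lower()
    for label, pattern in CLASS_PATTERNS.items():
        if pattern.search(text):
            return label
    return "unknown"


if __name__ == "__main__":
    print(caption_to_label("Angular leaf spot disease"))
```

Returning `"unknown"` instead of guessing keeps unparseable outputs visible, so they can be counted separately when evaluating the model.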

## Usage

```python
from transformers import AutoProcessor, Gemma3ForConditionalGeneration
from peft import PeftModel
from PIL import Image
import torch

# Load base model
base_model = Gemma3ForConditionalGeneration.from_pretrained(
    "google/gemma-3-4b-it",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "Nefflymicn/gemma3-4b-bean-captioning")
processor = AutoProcessor.from_pretrained("Nefflymicn/gemma3-4b-bean-captioning")

# Prepare input
image = Image.open("bean_plant.jpg").convert("RGB")  # ensure 3-channel input
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Describe this plant image."}
        ]
    }
]

text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=text, images=image, return_tensors="pt").to(model.device)

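# Generate
with torch.inference_mode():
    outputs = model.generate(**inputs, max_new_tokens=50, do_sample=False)

# Decode only the newly generated tokens, not the echoed prompt
prompt_length = inputs["input_ids"].shape[-1]
response = processor.decode(outputs[0][prompt_length:], skip_special_tokens=True)
print(response)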
```

## Training Details

- **Epochs:** 10
- **Batch Size:** 1 (effective: 4 with gradient accumulation)
- **Learning Rate:** 5e-5
- **Precision:** FP16
- **Hardware:** NVIDIA T4 GPU
- **Training Time:** ~25 minutes
- **Max Sequence Length:** 512 tokens
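
The hyperparameters in this card can be collected into plain dictionaries, as sketched below. The key names follow the keyword arguments of `peft.LoraConfig` and `transformers.TrainingArguments`, on the assumption that the model was trained with that stack; the card does not include the actual training script. The helper functions are hypothetical and only illustrate how the effective batch size of 4 arises.

```python
import math

# Values copied from this model card. Key names mirror peft.LoraConfig and
# transformers.TrainingArguments keyword arguments (assumed training stack).
lora_kwargs = {"r": 8, "lora_alpha": 16}
training_kwargs = {
    "num_train_epochs": 10,
    "per_device_train_batch_size": 1,
    "gradient_accumulation_steps": 4,
    "learning_rate": 5e-5,
    "fp16": True,
}


def effective_batch_size(kwargs: dict, num_devices: int = 1) -> int:
    """Samples contributing to each optimizer step."""
    return (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_devices
    )


def approx_optimizer_steps(num_train_samples: int, kwargs: dict) -> int:
    """Rough step count; the exact number depends on the trainer's rounding."""
    per_epoch = math.ceil(num_train_samples / effective_batch_size(kwargs))
    return per_epoch * kwargs["num_train_epochs"]


if __name__ == "__main__":
    print(effective_batch_size(training_kwargs))
```

The card does not state how the 100 samples were split between training and validation, so `num_train_samples` has to be supplied by the reader.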

## Performance

- **Final Training Loss:** 0.69
- **Final Validation Loss:** 0.001
- **Accuracy:** Not measured directly; the low validation loss suggests high accuracy, but no classification accuracy was computed
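
A loss value does not translate directly into a classification accuracy. One way to obtain an explicit figure is to compare predicted labels with reference labels on a held-out set, as in the sketch below. The function names and the label strings are assumptions for illustration, and the example inputs are placeholders, not results from this model.

```python
from collections import Counter


def accuracy(predicted: list, reference: list) -> float:
    """Fraction of positions where the predicted label equals the reference."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == r for p, r in zip(predicted, reference))
    return correct / len(reference)


def confusion_counts(predicted: list, reference: list) -> Counter:
    """Count (reference, predicted) pairs to show which classes get confused."""
    return Counter(zip(reference, predicted))


if __name__ == "__main__":
    # Placeholder labels only; these are not outputs of the fine-tuned model.
    reference = ["healthy", "bean_rust", "angular_leaf_spot", "healthy"]
    predicted = ["healthy", "bean_rust", "bean_rust", "healthy"]
    print(accuracy(predicted, reference))
    print(confusion_counts(predicted, reference))
```

The per-pair counts make it easy to see whether errors concentrate on one disease, which a single accuracy number would hide.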

## Limitations

- Trained on 100 images for demonstration purposes
- Best suited for the 3 classes in the training data (healthy plants and the two listed diseases)
- May not generalize to other bean varieties or diseases
- Should be validated on real-world data before production use

## Citation

If you use this model, please cite:

```bibtex
@misc{gemma3-bean-captioning,
  author = {younaice},
  title = {Gemma-3-4B Fine-tuned for Bean Disease Classification},
  year = {2024},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/Nefflymicn/gemma3-4b-bean-captioning}}
}
```

## License

This model inherits the Gemma license from the base model. Please refer to the [Gemma license](https://ai.google.dev/gemma/terms) for usage terms.