Crop Guard: End-to-End Setup Guide

This guide walks through the full workflow: testing the UI locally, preparing your dataset, fine-tuning your vision model, and deploying to Hugging Face Spaces.

1. Local UI Testing & Virtual Environment

You should absolutely use a Virtual Environment (venv) when testing locally to avoid conflicting with other Python projects on your machine.

To test the UI locally, follow these steps in your terminal (the commands below are for PowerShell; on macOS or Linux, activate with source venv/bin/activate instead):

  1. Create the virtual environment:
    python -m venv venv
    
  2. Activate the virtual environment:
    .\venv\Scripts\activate
    
  3. Install the dependencies:
    pip install -r requirements.txt
    
  4. Run the application:
    python app.py
    

    The first time you run app.py, it will download the base ViT model (340MB) and the Qwen LLM weights (4.3GB). If you only want to test the UI layout without waiting for the large LLM download, open app.py, find the try...except block that contains the hf_hub_download call for Qwen, temporarily comment it out, and set llm = None in its place.
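One way to make that switch less error-prone is to gate the LLM load behind an environment variable instead of editing the block by hand each time. The sketch below is an assumption about how app.py could be organised, not its actual contents: `load_optional_llm`, `describe_mode`, the `CROPGUARD_SKIP_LLM` variable, and the `loader` callable are all hypothetical names.

```python
import os


def load_optional_llm(loader, skip_env="CROPGUARD_SKIP_LLM"):
    """Return loader()'s result, or None if loading is skipped or fails.

    `loader` is any zero-argument callable that downloads and builds the
    LLM (for example, a function wrapping the hf_hub_download call).
    The environment variable name is a hypothetical choice.
    """
    if os.environ.get(skip_env) == "1":
        print("LLM loading skipped; running in UI-only mode.")
        return None
    try:
        return loader()
    except Exception as exc:  # network error, missing file, out of memory...
        print(f"LLM unavailable ({exc}); running in UI-only mode.")
        return None


def describe_mode(llm):
    """Small helper so the UI can report which mode is active."""
    return "UI-only (no LLM)" if llm is None else "full (LLM loaded)"
```

With something like this in place, running `$env:CROPGUARD_SKIP_LLM = "1"` in PowerShell before `python app.py` would skip the download. The rest of the app still has to handle `llm` being `None`.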

2. Dataset Preparation (PlantVillage)

To make Crop Guard accurate for diseases, you need to fine-tune it on the PlantVillage dataset.

  1. Get the Dataset (The Kaggle Way): You do not need to download this to your local computer! Since you are using Kaggle Notebooks, you have two extremely easy options to get the data directly into your environment:

    • Option A (Kaggle Native): In your Kaggle Notebook, click "Add Data" on the right-side panel. Search for PlantVillage and click + to add it. It will instantly appear in your notebook's /kaggle/input/ directory!
    • Option B (Hugging Face via Code): You can download it directly inside your Python code using the datasets library from the Hugging Face hub (e.g., nateraw/plant-village).
  2. Structure: If you use Option A (Kaggle Native), the images will already be arranged in one folder per class. If you use Option B, the load_dataset("nateraw/plant-village") command handles the structure for you automatically.

  3. Mapping: Keep track of the exact folder names or class labels. These 38 names will become the keys for the LABEL_TRANSLATOR dictionary inside your app.py!
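As a rough illustration of the mapping step, the sketch below collects the class folder names and generates a starting LABEL_TRANSLATOR from them. It assumes the folder names follow the "Crop___Condition" pattern (for example "Tomato___Late_blight"). The helper names and the "Crop: Condition" value format are hypothetical, since the values app.py expects are not specified here.

```python
import os


def list_class_folders(dataset_root):
    """Return the sorted names of the class sub-folders under dataset_root."""
    return sorted(
        entry for entry in os.listdir(dataset_root)
        if os.path.isdir(os.path.join(dataset_root, entry))
    )


def build_label_translator(class_names):
    """Map raw class names to readable labels.

    Assumes names look like "Tomato___Late_blight" (crop, three
    underscores, condition); names without "___" are only de-underscored.
    """
    translator = {}
    for name in class_names:
        crop, sep, condition = name.partition("___")
        crop = crop.replace("_", " ").strip()
        if sep:
            condition = condition.replace("_", " ").strip()
            translator[name] = f"{crop}: {condition}"
        else:
            translator[name] = crop
    return translator
```

Calling `build_label_translator(list_class_folders(...))` on the dataset's class directory gives a dictionary you can paste into app.py and then edit by hand. Treat the generated values as placeholders.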

3. Fine-Tuning the Vision Model

Since you only have a few days for the hackathon, the fastest way to fine-tune the google/vit-base-patch16-224 model is by using Hugging Face AutoTrain or a short training script in a Kaggle (or Google Colab) notebook.

Option A: The No-Code Way (AutoTrain)

  1. Go to Hugging Face AutoTrain.
  2. Create a new Image Classification project.
  3. Upload your PlantVillage dataset.
  4. Select google/vit-base-patch16-224 as the base model.
  5. Train and push directly to your Hugging Face Hub profile!

Option B: Python / Kaggle Notebook

If you want to script it, use the transformers library inside a Kaggle Notebook (which gives you free access to P100/T4 GPUs!). Here is a high-level snippet of what the training script looks like:

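from datasets import load_dataset
from transformers import AutoImageProcessor, AutoModelForImageClassification, Trainer, TrainingArguments

# 1. Load dataset (one sub-folder per class) and hold out 10% for evaluation
dataset = load_dataset("imagefolder", data_dir="path/to/dataset")
splits = dataset["train"].train_test_split(test_size=0.1, seed=42)
labels = splits["train"].features["label"].names

# 2. Load processor & base model
processor = AutoImageProcessor.from_pretrained("google/vit-base-patch16-224")
model = AutoModelForImageClassification.from_pretrained(
    "google/vit-base-patch16-224",
    num_labels=len(labels),  # 38 for PlantVillage
    id2label=dict(enumerate(labels)),
    label2id={name: i for i, name in enumerate(labels)},
    ignore_mismatched_sizes=True,  # swap out the 1000-class ImageNet head
)

# 3. Turn PIL images into model-ready tensors on the fly
def transform(batch):
    inputs = processor([img.convert("RGB") for img in batch["image"]], return_tensors="pt")
    inputs["labels"] = batch["label"]
    return inputs

splits = splits.with_transform(transform)

# 4. Define training arguments & Trainer
training_args = TrainingArguments(
    output_dir="./vit-plantvillage",
    per_device_train_batch_size=16,
    eval_strategy="epoch",  # called evaluation_strategy in older transformers releases
    save_strategy="epoch",
    num_train_epochs=3,
    remove_unused_columns=False,  # keep the "image" column for transform()
    push_to_hub=True,
    hub_model_id="your-username/cropguard-vit",
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=splits["train"],
    eval_dataset=splits["test"],
)
trainer.train()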
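The Trainer configured above evaluates once per epoch, but without a metrics function it only reports the evaluation loss. A minimal sketch of a top-1 accuracy metric is shown below. It is written without NumPy so it runs anywhere (`np.argmax` is the usual shortcut), and the function body is an illustration rather than part of the original script.

```python
def compute_metrics(eval_pred):
    """Return top-1 accuracy for a (logits, labels) pair.

    Works on NumPy arrays (what Trainer passes) and on nested lists.
    """
    logits, labels = eval_pred
    correct = 0
    total = 0
    for row, label in zip(logits, labels):
        # Index of the largest logit is the predicted class id
        predicted = max(range(len(row)), key=lambda i: row[i])
        correct += int(predicted == int(label))
        total += 1
    return {"accuracy": correct / total if total else 0.0}
```

To use it, pass `compute_metrics=compute_metrics` when constructing the Trainer. The accuracy then appears alongside the loss in each epoch's evaluation log.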

4. Deployment Steps (Hugging Face Spaces)

Once your UI is tested and your model is fine-tuned, you are ready to deploy.

  1. Create the Space: Log into Hugging Face, click your profile picture, and select New Space.
  2. Configure the Space:
    • Give it a name (e.g., CropGuard-AI).
    • Choose Gradio as the Space SDK.
    • For hardware, select the Free CPU tier (or T4 GPU if you have it available).
  3. Upload Files: Upload your local app.py and requirements.txt into the "Files" tab of your new space.
  4. Update the App File: If you fine-tuned your model and pushed it to the hub (e.g., your-username/cropguard-vit), make sure you set VISION_MODEL_ID = "your-username/cropguard-vit" in app.py before uploading!
  5. Build: The space will automatically begin building. It will install the dependencies from requirements.txt (including the CPU wheel for llama-cpp-python) and start the Gradio server.
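Since forgetting step 4 silently deploys the base model, a small pre-upload check can help. The sketch below assumes VISION_MODEL_ID is assigned a plain string literal at the start of a line in app.py. The helper names are hypothetical and not part of the project.

```python
import re

# Matches an uncommented, top-level assignment such as:
#   VISION_MODEL_ID = "your-username/cropguard-vit"
_ASSIGNMENT = re.compile(
    r"""^VISION_MODEL_ID\s*=\s*["']([^"']+)["']""", re.MULTILINE
)


def find_vision_model_id(source_text):
    """Return the string assigned to VISION_MODEL_ID, or None if not found."""
    match = _ASSIGNMENT.search(source_text)
    return match.group(1) if match else None


def check_app_file(path, expected_model_id):
    """Return True if the file at `path` points at expected_model_id."""
    with open(path, encoding="utf-8") as handle:
        return find_vision_model_id(handle.read()) == expected_model_id
```

For example, `check_app_file("app.py", "your-username/cropguard-vit")` should return `True` before you upload. A `False` result means the file still references another model or builds the ID in a way the pattern does not recognise.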

The first boot on Hugging Face Spaces will take a few minutes as it downloads the 4.3GB LLM weights. Subsequent boots will be much faster.