	# Crop Guard: End-to-End Setup Guide

	This guide covers everything you need, from local UI testing to preparing your dataset, fine-tuning your vision model, and deploying to Hugging Face Spaces.

	## 1. Local UI Testing & Virtual Environment

	Use a virtual environment (venv) when testing locally so that this project's dependencies do not conflict with other Python projects on your machine.

	To test the UI locally, follow these steps in your terminal (PowerShell):

	1. Create the virtual environment:
	```powershell
	python -m venv venv
	```
	2. Activate the virtual environment:
	```powershell
	.\venv\Scripts\activate
	```
	3. Install the dependencies:
	```powershell
	pip install -r requirements.txt
	```
	4. Run the application:
	```powershell
	python app.py
	```
	> [!TIP]
	> The first time you run `app.py`, it downloads the base ViT model (~340MB) and the Qwen LLM weights (~4.3GB). To test only the UI layout without waiting for the large LLM download, open `app.py`, find the `hf_hub_download` call for Qwen, temporarily comment out the surrounding `try...except` block, and set `llm = None` instead.
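
One way to avoid hand-editing `app.py` every time is to put the download behind an opt-out switch. The sketch below is an illustration only: the `load_llm` helper and the `CROPGUARD_SKIP_LLM` environment variable are hypothetical names that are not part of the original `app.py`, and the repo id and filename you pass in must match whatever Qwen GGUF file your app actually uses.

```python
import os


def load_llm(repo_id, filename):
    """Return a llama-cpp model, or None if loading is skipped or fails.

    Hypothetical helper: `CROPGUARD_SKIP_LLM` is an assumed variable name.
    """
    if os.environ.get("CROPGUARD_SKIP_LLM") == "1":
        # UI-only mode: skip the multi-gigabyte download entirely.
        return None
    try:
        # Imported lazily so UI-only runs do not need these packages loaded.
        from huggingface_hub import hf_hub_download
        from llama_cpp import Llama

        model_path = hf_hub_download(repo_id=repo_id, filename=filename)
        return Llama(model_path=model_path)
    except Exception as exc:
        # Mirror the guide's fallback: keep the UI alive without the LLM.
        print(f"LLM unavailable, continuing without it: {exc}")
        return None
```

With this in place, running `$env:CROPGUARD_SKIP_LLM = "1"` in PowerShell before `python app.py` would leave `llm` as `None`, provided the rest of the app already handles that case.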

	## 2. Dataset Preparation (PlantVillage)

	To make Crop Guard accurate at identifying plant diseases, you need to fine-tune its vision model on the PlantVillage dataset.

	1. Get the Dataset (The Kaggle Way): You do not need to download this to your local computer! Since you are using Kaggle Notebooks, you have two extremely easy options to get the data directly into your environment:

	- Option A (Kaggle Native): In your Kaggle Notebook, click "Add Data" on the right-side panel. Search for `PlantVillage` and click `+` to add it. It will instantly appear in your notebook's `/kaggle/input/` directory!
	- Option B (Hugging Face via Code): You can download it directly inside your Python code using the `datasets` library from the Hugging Face hub (e.g., `nateraw/plant-village`).

	2. Structure: If you use Option A (Kaggle Native), the folders will already be arranged. If you use Option B, the `load_dataset("nateraw/plant-village")` command handles the structure for you automatically.

	3. Mapping: Keep track of the exact folder names or class labels. These 38 names will become the keys for the `LABEL_TRANSLATOR` dictionary inside your `app.py`!
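
The mapping step can be bootstrapped from the dataset folders themselves. This is a minimal sketch under two assumptions: that the class folders follow the common PlantVillage naming pattern `Crop___Condition` (for example `Tomato___Late_blight`), and that `LABEL_TRANSLATOR` maps each raw label to a display string. The actual value format in `app.py` may differ, and `build_label_translator` is a hypothetical helper name.

```python
from pathlib import Path


def build_label_translator(dataset_dir):
    """Map each class folder name to a readable label (assumed value format)."""
    class_names = sorted(p.name for p in Path(dataset_dir).iterdir() if p.is_dir())
    translator = {}
    for name in class_names:
        # Split "Crop___Condition" at the first triple underscore, if present.
        crop, _, condition = name.partition("___")
        readable = f"{crop}: {condition}" if condition else crop
        translator[name] = readable.replace("_", " ")
    return translator
```

Printing the result for your `/kaggle/input/` dataset folder gives a starting dictionary to paste into `app.py` and edit by hand; check that it contains exactly the 38 class names your model was trained on.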

	## 3. Fine-Tuning the Vision Model

	Since you only have a few days for the hackathon, the fastest ways to fine-tune the `google/vit-base-patch16-224` model are Hugging Face AutoTrain or a short Kaggle (or Google Colab) notebook.

	Option A: The No-Code Way (AutoTrain)
	1. Go to [Hugging Face AutoTrain](https://huggingface.co/autotrain).
	2. Create a new Image Classification project.
	3. Upload your PlantVillage dataset.
	4. Select `google/vit-base-patch16-224` as the base model.
	5. Train and push directly to your Hugging Face Hub profile!

	Option B: Python / Kaggle Notebook
	If you want to script it, use the `transformers` library inside a Kaggle Notebook (which gives you free access to P100/T4 GPUs!). Here is a high-level snippet of what the training script looks like:
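	```python
	import torch
	from datasets import load_dataset
	from transformers import AutoImageProcessor, AutoModelForImageClassification, Trainer, TrainingArguments

	# 1. Load dataset (one folder per class) and hold out 10% for evaluation
	dataset = load_dataset("imagefolder", data_dir="path/to/dataset")
	splits = dataset["train"].train_test_split(test_size=0.1, seed=42)

	# 2. Load processor & base model
	processor = AutoImageProcessor.from_pretrained("google/vit-base-patch16-224")
	model = AutoModelForImageClassification.from_pretrained(
	    "google/vit-base-patch16-224",
	    num_labels=38,
	    ignore_mismatched_sizes=True,  # swap the 1000-class ImageNet head for a 38-class one
	)

	# 3. Turn PIL images into normalized pixel tensors on the fly
	def transform(batch):
	    inputs = processor([img.convert("RGB") for img in batch["image"]], return_tensors="pt")
	    inputs["labels"] = batch["label"]
	    return inputs

	def collate_fn(examples):
	    return {
	        "pixel_values": torch.stack([ex["pixel_values"] for ex in examples]),
	        "labels": torch.tensor([ex["labels"] for ex in examples]),
	    }

	splits = splits.with_transform(transform)

	# 4. Define training arguments & Trainer
	training_args = TrainingArguments(
	    output_dir="./vit-plantvillage",
	    per_device_train_batch_size=16,
	    eval_strategy="epoch",  # called `evaluation_strategy` in older transformers releases
	    save_strategy="epoch",
	    num_train_epochs=3,
	    remove_unused_columns=False,  # keep the raw "image" column for transform()
	    push_to_hub=True,
	    hub_model_id="your-username/cropguard-vit",
	)

	trainer = Trainer(
	    model=model,
	    args=training_args,
	    data_collator=collate_fn,
	    train_dataset=splits["train"],
	    eval_dataset=splits["test"],
	)
	trainer.train()
	```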
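
Evaluating every epoch reports only the loss unless the `Trainer` is given a metric function. The sketch below shows one possible `compute_metrics` callback that reports top-1 accuracy. It is an addition for illustration, not part of the original script, and is written in plain Python so that it works on both NumPy arrays and nested lists.

```python
def compute_metrics(eval_pred):
    """Top-1 accuracy from a (logits, labels) pair, as the Trainer passes it."""
    logits, labels = eval_pred
    # Index of the largest logit in each row is the predicted class.
    predictions = [max(range(len(row)), key=row.__getitem__) for row in logits]
    correct = sum(int(p == y) for p, y in zip(predictions, labels))
    return {"accuracy": correct / len(predictions)}
```

Passing `compute_metrics=compute_metrics` when constructing the `Trainer` should add an accuracy value to each epoch's evaluation log.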

	## 4. Deployment Steps (Hugging Face Spaces)

	Once your UI is tested and your model is fine-tuned, you are ready to deploy.

	1. Create the Space: Log into Hugging Face, click your profile picture, and select New Space.
	2. Configure the Space:
	- Give it a name (e.g., `CropGuard-AI`).
	- Choose Gradio as the Space SDK.
	- For hardware, select the Free CPU tier (or T4 GPU if you have it available).
	3. Upload Files: Upload your local `app.py` and `requirements.txt` into the "Files" tab of your new space.
	4. Update the App File: If you fine-tuned your model and pushed it to the hub (e.g., `your-username/cropguard-vit`), make sure you change `VISION_MODEL_ID = "your-username/cropguard-vit"` in the `app.py` file before uploading!
	5. Build: The space will automatically begin building. It will install the dependencies from `requirements.txt` (including the CPU wheel for `llama-cpp-python`) and start the Gradio server.
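
As an alternative to dragging files into the "Files" tab, the upload in step 3 can be scripted with the `huggingface_hub` client. This is a rough sketch, assuming you are already authenticated (for example via `huggingface-cli login`). `upload_space_files` is a hypothetical helper name, and `api` is expected to be a `huggingface_hub.HfApi` instance.

```python
from pathlib import Path


def upload_space_files(api, repo_id, folder=".", files=("app.py", "requirements.txt")):
    """Upload the listed files from `folder` to a Space and return their names."""
    uploaded = []
    for name in files:
        local_path = Path(folder) / name
        if not local_path.is_file():
            raise FileNotFoundError(f"Missing {local_path}")
        api.upload_file(
            path_or_fileobj=str(local_path),
            path_in_repo=name,
            repo_id=repo_id,
            repo_type="space",  # target a Space rather than a model repo
        )
        uploaded.append(name)
    return uploaded
```

A typical call would be `upload_space_files(HfApi(), "your-username/CropGuard-AI")` after `from huggingface_hub import HfApi`. Each upload creates a commit, which triggers the Space rebuild described in step 5.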

	> [!NOTE]
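	> The first boot on Hugging Face Spaces will take a few minutes as it downloads the 4.3GB LLM weights. Later boots are faster only while those weights remain in the Space's cache; without persistent storage, a restart or rebuild may download them again.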