# Crop Guard: End-to-End Setup Guide

This guide covers everything you need, from local UI testing to preparing your dataset, fine-tuning your vision model, and deploying to Hugging Face Spaces.

## 1. Local UI Testing & Virtual Environment

Use a virtual environment (venv) when testing locally so that Crop Guard's dependencies do not conflict with other Python projects on your machine.

**To test the UI locally, follow these steps in your terminal (PowerShell):**

1. **Create the virtual environment:**
   ```powershell
   python -m venv venv
   ```
2. **Activate the virtual environment:**
   ```powershell
   .\venv\Scripts\activate
   ```
3. **Install the dependencies:**
   ```powershell
   pip install -r requirements.txt
   ```
4. **Run the application:**
   ```powershell
   python app.py
   ```

> [!TIP]
> The first time you run `app.py`, it will download the base ViT model (~340MB) and the Qwen LLM weights (~4.3GB). If you *only* want to test the UI layout without waiting for the LLM download, open `app.py`, find the `hf_hub_download` line for Qwen, temporarily comment out the `try...except` block around it, and set `llm = None` instead.

## 2. Dataset Preparation (PlantVillage)

To make Crop Guard accurate at recognizing diseases, you need to fine-tune it on the PlantVillage dataset.

1. **Get the Dataset (The Kaggle Way)**: You do **not** need to download this to your local computer! Since you are using Kaggle Notebooks, you have two easy options to get the data directly into your environment:
   - **Option A (Kaggle Native)**: In your Kaggle Notebook, click **"Add Data"** on the right-side panel. Search for `PlantVillage` and click `+` to add it. It will appear in your notebook's `/kaggle/input/` directory.
   - **Option B (Hugging Face via Code)**: You can download it directly inside your Python code using the `datasets` library from the Hugging Face hub (e.g., `nateraw/plant-village`).
2. **Structure**: If you use Option A (Kaggle Native), the folders will already be arranged.
If you use Option B, the `load_dataset("nateraw/plant-village")` call handles the structure for you automatically.
3. **Mapping**: Keep track of the exact folder names or class labels. These 38 names will become the keys for the `LABEL_TRANSLATOR` dictionary inside your `app.py`!

## 3. Fine-Tuning the Vision Model

Since you only have a few days for the hackathon, the fastest way to fine-tune the `google/vit-base-patch16-224` model is by using **Hugging Face AutoTrain** or a simple Kaggle Notebook.

**Option A: The No-Code Way (AutoTrain)**

1. Go to [Hugging Face AutoTrain](https://huggingface.co/autotrain).
2. Create a new Image Classification project.
3. Upload your PlantVillage dataset.
4. Select `google/vit-base-patch16-224` as the base model.
5. Train and push directly to your Hugging Face Hub profile!

**Option B: Python / Kaggle Notebook**

If you want to script it, use the `transformers` library inside a Kaggle Notebook (which gives you free access to P100/T4 GPUs!). Here is a high-level snippet of what the training script looks like:

```python
import torch
from datasets import load_dataset
from transformers import (AutoImageProcessor, AutoModelForImageClassification,
                          Trainer, TrainingArguments)

MODEL_ID = "google/vit-base-patch16-224"

# 1. Load dataset (one sub-folder per class) and hold out 10% for evaluation
dataset = load_dataset("imagefolder", data_dir="path/to/dataset")
splits = dataset["train"].train_test_split(test_size=0.1, seed=42)
labels = splits["train"].features["label"].names

# 2. Load processor & base model (the ImageNet head is replaced by a new one)
processor = AutoImageProcessor.from_pretrained(MODEL_ID)
model = AutoModelForImageClassification.from_pretrained(
    MODEL_ID,
    num_labels=len(labels),  # 38 for PlantVillage
    id2label=dict(enumerate(labels)),
    label2id={name: i for i, name in enumerate(labels)},
    ignore_mismatched_sizes=True,
)

# 3. Convert PIL images into the pixel tensors the model expects
def preprocess(batch):
    images = [img.convert("RGB") for img in batch["image"]]
    inputs = processor(images, return_tensors="pt")
    inputs["labels"] = batch["label"]
    return inputs

splits = splits.with_transform(preprocess)

def collate_fn(examples):
    return {"pixel_values": torch.stack([ex["pixel_values"] for ex in examples]),
            "labels": torch.tensor([ex["labels"] for ex in examples])}

# 4. Define training arguments & Trainer
training_args = TrainingArguments(
    output_dir="./vit-plantvillage",
    per_device_train_batch_size=16,
    eval_strategy="epoch",  # named `evaluation_strategy` in older transformers releases
    save_strategy="epoch",
    num_train_epochs=3,
    remove_unused_columns=False,  # keep the "image" column for the transform
    push_to_hub=True,
    hub_model_id="your-username/cropguard-vit",
)

trainer = Trainer(model=model, args=training_args, data_collator=collate_fn,
                  train_dataset=splits["train"], eval_dataset=splits["test"])
trainer.train()
```

## 4. Deployment Steps (Hugging Face Spaces)

Once your UI is tested and your model is fine-tuned, you are ready to deploy.

1. **Create the Space**: Log into Hugging Face, click your profile picture, and select **New Space**.
2. **Configure the Space**:
   - Give it a name (e.g., `CropGuard-AI`).
   - Choose **Gradio** as the Space SDK.
   - For hardware, select the **Free CPU** tier (or **T4 GPU** if you have it available).
3. **Update the App File**: If you fine-tuned your model and pushed it to the hub (e.g., `your-username/cropguard-vit`), set `VISION_MODEL_ID = "your-username/cropguard-vit"` in `app.py` before uploading!
4. **Upload Files**: Upload your local `app.py` and `requirements.txt` into the "Files" tab of your new Space.
5. **Build**: The Space will automatically begin building. It will install the dependencies from `requirements.txt` (including the CPU wheel for `llama-cpp-python`) and start the Gradio server.

> [!NOTE]
> The first boot on Hugging Face Spaces will take a few minutes as it downloads the 4.3GB LLM weights. Later boots are faster only while the weights are still cached; a Space without persistent storage may need to download them again after a restart.
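
For the **Mapping** step in Section 2, the class names can be collected with a few lines of Python instead of by hand. The sketch below is one possible approach, not the way `app.py` necessarily builds `LABEL_TRANSLATOR`. It assumes the dataset has one sub-folder per class and that folder names separate crop and condition with `___`. The helper names `list_class_names` and `build_label_translator` are hypothetical, and the generated values are only placeholders to edit.

```python
import tempfile
from pathlib import Path


def list_class_names(dataset_root):
    """Return the sorted names of the class sub-folders under dataset_root."""
    root = Path(dataset_root)
    return sorted(p.name for p in root.iterdir() if p.is_dir())


def build_label_translator(class_names):
    """Map each raw class name to a readable placeholder label.

    Assumption: names look like "Crop___Condition_words"; names without
    "___" are kept whole and only have their underscores replaced.
    """
    translator = {}
    for name in class_names:
        crop, _, condition = name.partition("___")
        readable = f"{crop} - {condition}" if condition else crop
        translator[name] = readable.replace("_", " ").strip()
    return translator


if __name__ == "__main__":
    # Demo on a throwaway directory; on Kaggle, pass the dataset folder
    # under /kaggle/input/ to list_class_names instead.
    with tempfile.TemporaryDirectory() as tmp:
        for folder in ("Tomato___Late_blight", "Apple___healthy"):
            (Path(tmp) / folder).mkdir()
        names = list_class_names(tmp)
        print(build_label_translator(names))
```

The printed dictionary can be pasted into `app.py` as a starting point. Its keys match the folder names exactly, which is what the fine-tuned model reports as class labels when it is trained from the same folders.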