# Crop Guard: End-to-End Setup Guide

This guide walks through the full workflow: testing the UI locally, preparing your dataset, fine-tuning your vision model, and deploying to Hugging Face Spaces.

## 1. Local UI Testing & Virtual Environment

Use a virtual environment (venv) when testing locally so that Crop Guard's dependencies do not conflict with other Python projects on your machine.

**To test the UI locally, follow these steps in your terminal (PowerShell):**

1. **Create the virtual environment:**
   ```powershell
   python -m venv venv
   ```
2. **Activate the virtual environment:**
   ```powershell
   .\venv\Scripts\activate
   ```
3. **Install the dependencies:**
   ```powershell
   pip install -r requirements.txt
   ```
4. **Run the application:**
   ```powershell
   python app.py
   ```
   > [!TIP]
   > The first time you run `app.py`, it downloads the base ViT model (~340MB) and the Qwen LLM weights (~4.3GB). If you *only* want to test the UI layout without waiting for the LLM download, open `app.py`, find the `hf_hub_download` line for Qwen, temporarily comment out the `try...except` block around it, and set `llm = None` instead.
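
The fallback described in the tip can be sketched as a small helper. This is a minimal sketch, not the actual code in `app.py`: the `load_llm` name, the `CROPGUARD_SKIP_LLM` environment variable, and the `loader` callable are all assumptions made for illustration.

```python
import os


def load_llm(loader, skip_env="CROPGUARD_SKIP_LLM"):
    """Return loader()'s result, or None if loading is skipped or fails.

    `loader` is any zero-argument callable that downloads and builds the LLM.
    `skip_env` is a hypothetical environment variable; set it to "1" to skip
    the download entirely while working on the UI layout.
    """
    if os.environ.get(skip_env) == "1":
        return None
    try:
        return loader()
    except Exception as exc:  # e.g. no network, or not enough disk space
        print(f"LLM unavailable, continuing with the UI only: {exc}")
        return None
```

With a helper like this, the existing download-and-load code would go inside the callable passed as `loader`, and any code that uses `llm` would need to handle the `None` case.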

## 2. Dataset Preparation (PlantVillage)

To make Crop Guard accurate for diseases, you need to fine-tune it on the PlantVillage dataset.

1. **Get the Dataset (The Kaggle Way)**: You do **not** need to download this to your local computer! Since you are using Kaggle Notebooks, you have two extremely easy options to get the data directly into your environment:
   
   - **Option A (Kaggle Native)**: In your Kaggle Notebook, click **"Add Data"** on the right-side panel. Search for `PlantVillage` and click `+` to add it. It will instantly appear in your notebook's `/kaggle/input/` directory!
   - **Option B (Hugging Face via Code)**: You can download it directly inside your Python code using the `datasets` library from the Hugging Face hub (e.g., `nateraw/plant-village`).

2. **Structure**: If you use Option A (Kaggle Native), the folders will already be arranged. If you use Option B, the `load_dataset("nateraw/plant-village")` command handles the structure for you automatically.

3. **Mapping**: Keep track of the exact folder names or class labels. These 38 names will become the keys for the `LABEL_TRANSLATOR` dictionary inside your `app.py`!
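
One way to collect the class names and draft the dictionary is sketched below. It assumes the dataset root contains one sub-folder per class and that folder names follow the `Crop___Condition` pattern found in common PlantVillage copies; the helper names and the display strings used as values are hypothetical, since the real values in `LABEL_TRANSLATOR` depend on your `app.py`.

```python
from pathlib import Path


def list_class_folders(root):
    """Return the sorted names of the class sub-folders under `root`."""
    return sorted(p.name for p in Path(root).iterdir() if p.is_dir())


def pretty_label(folder_name):
    """Turn 'Tomato___Late_blight' into 'Tomato: Late blight'."""
    crop, _, condition = folder_name.partition("___")
    crop = crop.replace("_", " ").strip()
    condition = condition.replace("_", " ").strip()
    return f"{crop}: {condition}" if condition else crop


def build_label_translator(class_names):
    """Map each exact class name to a display string (placeholder values)."""
    return {name: pretty_label(name) for name in class_names}
```

Calling `list_class_folders` on the dataset root and checking that it returns 38 names is a quick sanity check before training.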

## 3. Fine-Tuning the Vision Model

Since you only have a few days for the hackathon, the fastest way to fine-tune the `google/vit-base-patch16-224` model is by using **Hugging Face AutoTrain** or a short script in a Kaggle Notebook.

**Option A: The No-Code Way (AutoTrain)**
1. Go to [Hugging Face AutoTrain](https://huggingface.co/autotrain).
2. Create a new Image Classification project.
3. Upload your PlantVillage dataset.
4. Select `google/vit-base-patch16-224` as the base model.
5. Train and push directly to your Hugging Face Hub profile!

**Option B: Python / Kaggle Notebook**
If you want to script it, use the `transformers` library inside a Kaggle Notebook (which gives you free access to P100/T4 GPUs!). Here is a high-level snippet of what the training script looks like:
```python
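from datasets import load_dataset
from transformers import (
    AutoImageProcessor,
    AutoModelForImageClassification,
    Trainer,
    TrainingArguments,
)

# 1. Load dataset (one sub-folder per class) and hold out 10% for evaluation
dataset = load_dataset("imagefolder", data_dir="path/to/dataset")
splits = dataset["train"].train_test_split(test_size=0.1, seed=42)
class_names = splits["train"].features["label"].names

# 2. Load processor & base model
processor = AutoImageProcessor.from_pretrained("google/vit-base-patch16-224")
model = AutoModelForImageClassification.from_pretrained(
    "google/vit-base-patch16-224",
    num_labels=len(class_names),  # 38 for PlantVillage
    id2label=dict(enumerate(class_names)),  # predictions report folder names
    label2id={name: i for i, name in enumerate(class_names)},
    ignore_mismatched_sizes=True,  # swap out the 1000-class ImageNet head
)

# 3. Turn PIL images into the pixel tensors the model expects
def preprocess(batch):
    images = [image.convert("RGB") for image in batch["image"]]
    pixel_values = processor(images, return_tensors="pt")["pixel_values"]
    return {"pixel_values": pixel_values, "label": batch["label"]}

splits = splits.with_transform(preprocess)

# 4. Define training arguments & Trainer
training_args = TrainingArguments(
    output_dir="./vit-plantvillage",
    per_device_train_batch_size=16,
    eval_strategy="epoch",  # named `evaluation_strategy` in older transformers releases
    save_strategy="epoch",
    num_train_epochs=3,
    remove_unused_columns=False,  # keep the "image" column for preprocess()
    push_to_hub=True,
    hub_model_id="your-username/cropguard-vit",
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=splits["train"],
    eval_dataset=splits["test"],
)
trainer.train()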
```
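
With per-epoch evaluation enabled, `Trainer` reports only the evaluation loss unless you pass a metric function through its `compute_metrics` argument. The function below is a minimal sketch of top-1 accuracy, written in plain Python so it needs no extra dependencies; on large evaluation sets a NumPy `argmax` would be faster.

```python
def compute_metrics(eval_pred):
    """Top-1 accuracy from the (logits, labels) pair Trainer passes at eval time."""
    logits, labels = eval_pred
    correct = 0
    total = 0
    for row, label in zip(logits, labels):
        scores = list(row)
        predicted = scores.index(max(scores))  # index of the highest logit
        correct += int(predicted == int(label))
        total += 1
    return {"accuracy": correct / total if total else 0.0}
```

To use it, add `compute_metrics=compute_metrics` to the `Trainer(...)` call; the accuracy should then appear next to the loss in each epoch's evaluation log.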

## 4. Deployment Steps (Hugging Face Spaces)

Once your UI is tested and your model is fine-tuned, you are ready to deploy.

1. **Create the Space**: Log into Hugging Face, click your profile picture, and select **New Space**.
2. **Configure the Space**: 
   - Give it a name (e.g., `CropGuard-AI`).
   - Choose **Gradio** as the Space SDK.
   - For hardware, select the **Free CPU** tier (or **T4 GPU** if you have it available).
3. **Update the App File**: If you fine-tuned your model and pushed it to the Hub (e.g., `your-username/cropguard-vit`), set `VISION_MODEL_ID = "your-username/cropguard-vit"` in `app.py` before uploading!
4. **Upload Files**: Upload your local `app.py` and `requirements.txt` in the "Files" tab of your new Space.
5. **Build**: The space will automatically begin building. It will install the dependencies from `requirements.txt` (including the CPU wheel for `llama-cpp-python`) and start the Gradio server. 
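
As an alternative to editing `app.py` by hand for each deployment, the model ID could be read from an environment variable, which a Space can define under its settings. The sketch below assumes `app.py` currently defaults to the base ViT checkpoint; the `resolve_vision_model_id` helper and the choice of `VISION_MODEL_ID` as the variable name are illustrative assumptions, not part of the existing app.

```python
import os

# Assumed default: the base checkpoint used before fine-tuning.
BASE_MODEL_ID = "google/vit-base-patch16-224"


def resolve_vision_model_id(env=None):
    """Pick the fine-tuned model ID from the environment, else the base model."""
    env = os.environ if env is None else env
    model_id = env.get("VISION_MODEL_ID", "").strip()
    return model_id or BASE_MODEL_ID


VISION_MODEL_ID = resolve_vision_model_id()
```

With this in place, the same `app.py` runs locally against the base model and on the Space against the fine-tuned one, as long as the variable is set there.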

> [!NOTE]  
> The first boot on Hugging Face Spaces will take a few minutes as it downloads the 4.3GB LLM weights. Later boots are faster only while the weights remain cached; a Space without persistent storage downloads them again after a restart.