add data prep script clarification
README.md
CHANGED
@@ -129,7 +129,7 @@ results[0].plot()
 
 ### Training Data
 
-The three datasets are available in the [MMLA Data Collection](https://huggingface.co/collections/imageomics/mmla). See `prepare_yolo_dataset.py` for details on train/test splits.
+The three datasets are available in the [MMLA Data Collection](https://huggingface.co/collections/imageomics/mmla). See `prepare_yolo_dataset.py` for details on train/test splits; the script runs on standard Python 3.10+ packages, and generates the splits.
 
 #### Dataset splitting strategy
 We applied a stratified 60/40 train-test split across species and locations to evaluate model generalizability. Data was collected from three distinct environments: Mpala Research Centre (location_1), Ol Pejeta Conservancy (location_2), and The Wilds Conservation Center (location_3). The dataset includes four target classes: Zebra, Giraffe, Onager, and African Wild Dog.
@@ -189,7 +189,7 @@ results = model.train(
 
 #### Testing Data
 
-The model was evaluated on a held-out test set located at `images/test` containing:
+The model was evaluated on a held-out test set located at `images/test` (created by running the [data prep script](https://huggingface.co/imageomics/mmla/blob/main/prepare_yolo_dataset.py)) containing:
 - 7658 test images with instances of Zebra, Giraffe, Onager, and Dog
 
 
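
For readers unfamiliar with the stratified 60/40 split mentioned in the README text above, here is a minimal sketch of the general idea. It is not taken from `prepare_yolo_dataset.py`: the function name `stratified_split`, the record fields (`image`, `species`, `location`), and the toy file names are all hypothetical, and the actual script may split at a different granularity (for example by video or session rather than by image).

```python
import random
from collections import defaultdict


def stratified_split(records, train_frac=0.6, seed=0):
    """Split records so each (species, location) group is divided ~60/40.

    Hypothetical helper for illustration only; not the repository's script.
    """
    # Group records by the stratification key.
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["species"], rec["location"])].append(rec)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for key in sorted(groups):  # sorted for a deterministic group order
        items = groups[key]
        rng.shuffle(items)
        cut = round(len(items) * train_frac)
        train.extend(items[:cut])
        test.extend(items[cut:])
    return train, test


# Toy records (made up): 2 species x 2 locations x 10 images each.
records = [
    {"image": f"{species}_{location}_{i}.jpg", "species": species, "location": location}
    for species in ("Zebra", "Giraffe")
    for location in ("location_1", "location_2")
    for i in range(10)
]

train, test = stratified_split(records)
print(len(train), len(test))
```

Splitting inside each group, rather than over the pooled records, keeps the species and location proportions roughly equal in both partitions, which is the property a stratified split is meant to provide.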