clarify a few points and formatting
README.md
CHANGED
|
@@ -89,7 +89,7 @@ for more information on how this model can be used generate time-budgets from ae
|
|
| 89 |
|
| 90 |
### Training Data
|
| 91 |
|
| 92 |
-
[KABR
|
| 93 |
|
| 94 |
### Training Procedure
|
| 95 |
|
|
@@ -103,7 +103,7 @@ For each tracklet, we create a separate video, called a mini-scene, by extractin
|
|
| 103 |
detection in a video frame.
|
| 104 |
This allows us to compensate for the drone's movement and provides a stable, zoomed-in representation of the animal.
|
| 105 |
|
| 106 |
-
See [project page](https://kabrdata.xyz/) and the [paper](https://openaccess.thecvf.com/content/WACV2024W/CV4Smalls/papers/Kholiavchenko_KABR_In-Situ_Dataset_for_Kenyan_Animal_Behavior_Recognition_From_Drone_WACVW_2024_paper.pdf) for data preprocessing details.
|
| 107 |
|
| 108 |
We applied data augmentation techniques during training, including horizontal flipping to randomly
|
| 109 |
mirror the input frames horizontally and color augmentations to randomly modify the
|
|
@@ -118,20 +118,17 @@ We used a sample rate of 16x5, and random weight initialization.
|
|
| 118 |
|
| 119 |
## Evaluation
|
| 120 |
|
| 121 |
-
The dataset was evaluated on X3D-L model utilizing [SlowFast](https://github.com/facebookresearch/SlowFast) framework.
|
| 122 |
|
| 123 |
-
###
|
| 124 |
|
| 125 |
-
[KABR Dataset](https://huggingface.co/datasets/imageomics/KABR)
|
| 126 |
-
|
| 127 |
-
We provide a train-test split of the mini-scenes for evaluation purposes, with 75% for train and 25% for testing. No mini-scene was divided by the split. The splits ensured a stratified representation of giraffes, Plains zebras, and Grevy’s zebras.
|
| 128 |
|
| 129 |
#### Metrics
|
| 130 |
|
| 131 |
-
We report precision, recall, and F1 score
|
| 132 |
-
|
| 133 |
|
| 134 |
-
|
| 135 |
|
| 136 |
| WI | BS | mAP Overall | mAP Head | mAP Tail | P | R | F1 |
|
| 137 |
|----------|----|-------------|----------|----------|--------|--------|--------|
|
|
@@ -140,7 +137,7 @@ We report precision, recall, and F1 score. We also report mean Average Precision
|
|
| 140 |
|
| 141 |
### Model Architecture and Objective
|
| 142 |
|
| 143 |
-
[Model Description](https://arxiv.org/pdf/2004.04730)
|
| 144 |
|
| 145 |
#### Hardware
|
| 146 |
|
|
@@ -150,7 +147,6 @@ Running the X3D model requires a modern NVIDIA GPU with CUDA support. X3D-L is d
|
|
| 150 |
|
| 151 |
**BibTeX:**
|
| 152 |
|
| 153 |
-
|
| 154 |
If you use our model in your work, please cite the model and associated paper.
|
| 155 |
|
| 156 |
**Model**
|
|
@@ -207,5 +203,4 @@ Jenna Kline and Maksim Kholiavchenko
|
|
| 207 |
|
| 208 |
## Model Card Contact
|
| 209 |
|
| 210 |
-
|
| 211 |
-
<!-- Could include who to contact with questions, but this is also what the "Discussions" tab is for. -->
|
|
|
|
| 89 |
|
| 90 |
### Training Data
|
| 91 |
|
| 92 |
+
This model was trained on the [KABR mini-scene dataset](https://huggingface.co/datasets/imageomics/KABR).
|
| 93 |
|
| 94 |
### Training Procedure
|
| 95 |
|
|
|
|
| 103 |
detection in a video frame.
|
| 104 |
This allows us to compensate for the drone's movement and provides a stable, zoomed-in representation of the animal.
|
| 105 |
|
| 106 |
+
See the [KABR mini-scene project page](https://kabrdata.xyz/) and the [paper](https://openaccess.thecvf.com/content/WACV2024W/CV4Smalls/papers/Kholiavchenko_KABR_In-Situ_Dataset_for_Kenyan_Animal_Behavior_Recognition_From_Drone_WACVW_2024_paper.pdf) for data preprocessing details.
|
| 107 |
|
| 108 |
We applied data augmentation techniques during training, including horizontal flipping to randomly
|
| 109 |
mirror the input frames horizontally and color augmentations to randomly modify the
|
|
|
|
| 118 |
|
| 119 |
## Evaluation
|
| 120 |
|
| 121 |
+
The dataset was evaluated on the X3D-L model using the [SlowFast](https://github.com/facebookresearch/SlowFast) framework, specifically the [test_net script](https://github.com/facebookresearch/SlowFast/blob/main/tools/test_net.py).
|
| 122 |
|
| 123 |
+
### Testing Data
|
| 124 |
|
| 125 |
+
We provide a train-test split of the mini-scenes from the [KABR Dataset](https://huggingface.co/datasets/imageomics/KABR) for evaluation purposes (test set indicated in [annotation/val.csv](https://huggingface.co/datasets/imageomics/KABR/blob/main/KABR/annotation/val.csv)), with 75% for training and 25% for testing. No mini-scene was divided by the split. The splits ensured a stratified representation of giraffes, Plains zebras, and Grevy’s zebras.
|
|
|
|
|
|
|
| 126 |
|
| 127 |
#### Metrics
|
| 128 |
|
| 129 |
+
We report precision, recall, and F1 score on the KABR mini-scene test set, along with the mean Average Precision (mAP) for overall, head-class, and tail-class performance.
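
As a rough illustration of the precision, recall, and F1 metrics named above, the sketch below computes them per class from single-label predictions and then macro-averages them. It is not the SlowFast evaluation code: the function names `per_class_prf` and `macro_average`, the example behavior labels, and the choice of macro-averaging are all assumptions made for illustration, since the model card does not state how the scores were aggregated.

```python
from collections import Counter


def per_class_prf(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for single-label predictions.

    Hypothetical helper for illustration; not part of SlowFast.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        scores[label] = (precision, recall, f1)
    return scores


def macro_average(scores):
    """Unweighted mean of (precision, recall, f1) across classes (an assumed aggregation)."""
    n = len(scores)
    return tuple(sum(s[i] for s in scores.values()) / n for i in range(3))


# Illustrative labels only; these are not results from the model.
y_true = ["walk", "walk", "graze", "graze"]
y_pred = ["walk", "graze", "graze", "graze"]
scores = per_class_prf(y_true, y_pred)
macro_p, macro_r, macro_f1 = macro_average(scores)
```

Macro-averaging weights every behavior class equally, which is one reason head-class and tail-class scores can differ noticeably on an imbalanced dataset.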
|
|
|
|
| 130 |
|
| 131 |
+
**Results**
|
| 132 |
|
| 133 |
| WI | BS | mAP Overall | mAP Head | mAP Tail | P | R | F1 |
|
| 134 |
|----------|----|-------------|----------|----------|--------|--------|--------|
|
|
|
|
| 137 |
|
| 138 |
### Model Architecture and Objective
|
| 139 |
|
| 140 |
+
Please see the [Base Model Description](https://arxiv.org/pdf/2004.04730).
|
| 141 |
|
| 142 |
#### Hardware
|
| 143 |
|
|
|
|
| 147 |
|
| 148 |
**BibTeX:**
|
| 149 |
|
|
|
|
| 150 |
If you use our model in your work, please cite the model and associated paper.
|
| 151 |
|
| 152 |
**Model**
|
|
|
|
| 203 |
|
| 204 |
## Model Card Contact
|
| 205 |
|
| 206 |
+
For questions on this model, please open a [discussion](https://huggingface.co/imageomics/x3d-kabr-kinetics/discussions) on this repo.
|
|
|