| | --- |
| | license: apache-2.0 |
| | language: |
| | - en |
| | metrics: |
| | - accuracy |
| | pipeline_tag: image-classification |
| | tags: |
| | - climate |
| | --- |
| | |
| | ## Model description |
| |
|
| | This is a Transformer-based image classification model built with transfer learning. |
| | The base model is the [Vision Transformer](https://huggingface.co/google/vit-base-patch16-224) (ViT), pretrained on ImageNet-21k. |
| |
|
| | ## Datasets |
| |
|
| | The dataset was downloaded from the GitHub repository [Agri-Hub/Space2Ground](https://github.com/Agri-Hub/Space2Ground/tree/main). |
| | This model uses the "Street-level image patches" folder, which contains cropped vegetation regions of |
| | Mapillary street-level images. Further details are available in the linked repository. |
| |
|
| | ### How to use |
| |
|
| | You can use this model directly with the `pipeline` function from the Hugging Face Transformers library: |
| |
|
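| | ```python |
| | >>> from PIL import Image |
| | >>> from transformers import pipeline |
| | >>> # Load the fine-tuned classifier from the Hugging Face Hub |
| | >>> classifier = pipeline("image-classification", model="iammartian0/vegetation_classification_model") |
| | >>> # "image.jpg" is a placeholder; pass your own PIL image, local path, or URL |
| | >>> image = Image.open("image.jpg") |
| | >>> classifier(image) |
| | ``` |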
| | Alternatively, upload a target image to the Hosted Inference API. |
| |
|
| | ## Training procedure |
| |
|
| |
|
| |
|
| | ### Preprocessing |
| |
|
| | Labels are assigned from the name of each image's parent folder. |
| |
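The folder-based labelling described above can be sketched as follows. This is only an illustration, not the code used to train the model: the helper names `label_from_parent` and `build_label_maps`, the example paths, and the class names `grass` and `maize` are all hypothetical.

```python
from pathlib import Path


def label_from_parent(image_path):
    """Return the name of the folder that directly contains the image."""
    return Path(image_path).parent.name


def build_label_maps(image_paths):
    """Map each distinct folder name to an integer id, and back."""
    labels = sorted({label_from_parent(p) for p in image_paths})
    label2id = {label: i for i, label in enumerate(labels)}
    id2label = {i: label for label, i in label2id.items()}
    return label2id, id2label


# Hypothetical layout: <root>/<class name>/<image file>
paths = [
    "patches/grass/001.jpg",
    "patches/maize/002.jpg",
    "patches/grass/003.jpg",
]
label2id, id2label = build_label_maps(paths)
print(label2id)  # {'grass': 0, 'maize': 1}
```

Sorting the folder names keeps the id assignment stable across runs, which matters when the mapping is stored in the model config.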
|
| | ### Image Transformations |
| |
|
| | `RandomResizedCrop` from `torchvision.transforms` is applied to all training images. |
| |
|
| | ### Finetuning |
| |
|
| | The model is fine-tuned on the dataset for four epochs. |
| |
|
| | ## Evaluation results |
| |
|
| | The model achieved a top-1 accuracy of 0.929. |
| |
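For reference, top-1 accuracy is the fraction of images whose highest-scoring class matches the true label. The sketch below shows the computation on made-up scores; the function name `top1_accuracy` and the example data are hypothetical and unrelated to the reported 0.929.

```python
def top1_accuracy(scores, targets):
    """Fraction of samples whose highest-scoring class equals the target."""
    if len(scores) != len(targets):
        raise ValueError("scores and targets must have the same length")
    if not targets:
        raise ValueError("at least one sample is required")
    correct = 0
    for row, target in zip(scores, targets):
        # Index of the largest score in this row is the predicted class
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == target:
            correct += 1
    return correct / len(targets)


# Made-up per-class scores for four images and three classes
scores = [
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.3, 0.3, 0.4],
    [0.2, 0.5, 0.3],
]
targets = [1, 0, 2, 2]
accuracy = top1_accuracy(scores, targets)
print(accuracy)  # 0.75
```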
|
| | ## Further exploration to do |
| | - Training a multi-label model that also predicts whether the image was taken from the left or the right side, |
| | in addition to classifying the vegetation |
| | - Fine-grained classification of crop labels using the raw/initial set of street-level images |
| |
|
| |
|
| | ### BibTeX entry and citation info |
| |
|
| | ```bibtex |
| | @misc{wu2020visual, |
| | title={Visual Transformers: Token-based Image Representation and Processing for Computer Vision}, |
| | author={Bichen Wu and Chenfeng Xu and Xiaoliang Dai and Alvin Wan and Peizhao Zhang and Zhicheng Yan and Masayoshi Tomizuka and Joseph Gonzalez and Kurt Keutzer and Peter Vajda}, |
| | year={2020}, |
| | eprint={2006.03677}, |
| | archivePrefix={arXiv}, |
| | primaryClass={cs.CV} |
| | } |
| | ``` |
| | ```bibtex |
| | @inproceedings{9816335, |
| | author={Choumos, George and Koukos, Alkiviadis and Sitokonstantinou, Vasileios and Kontoes, Charalampos}, |
| | booktitle={2022 IEEE 14th Image, Video, and Multidimensional Signal Processing Workshop (IVMSP)}, |
| | title={Towards Space-to-Ground Data Availability for Agriculture Monitoring}, |
| | year={2022}, |
| | pages={1-5}, |
| | doi={10.1109/IVMSP54334.2022.9816335} |
| | } |
| | ``` |
| |
|
| |
|