	---
	language: en
	license: mit
	tags:
	- pytorch
	- computer-vision
	- image-processing
	- corn-kernel-classification
	pipeline_tag: image-classification
	library_name: keras
	---

	# CornViT

	A Multi-Stage Convolutional Vision Transformer Framework for Corn Kernel Analysis

	## Overview

	CornViT is a three-stage hierarchical classification pipeline for automated corn kernel quality assessment:

	- Stage 1: Purity detection (Pure vs Impure)
	- Stage 2: Shape classification (Flat vs Round)
	- Stage 3: Embryo orientation (Up vs Down)
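
The three stages above can be chained into a single decision per kernel. The sketch below is illustrative only: it assumes that shape is classified only for kernels judged pure and that embryo orientation is classified only for flat kernels, which this README does not state explicitly. The names `classify_kernel` and `KernelResult`, and the lowercase label strings, are hypothetical and do not come from the repository.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical classifier type: takes an image and returns a label string.
Classifier = Callable[[object], str]


@dataclass
class KernelResult:
    purity: str                   # "pure" or "impure"
    shape: Optional[str] = None   # "flat" or "round", set only for pure kernels
    embryo: Optional[str] = None  # "up" or "down", set only for flat kernels


def classify_kernel(
    image: object,
    purity_model: Classifier,
    shape_model: Classifier,
    embryo_model: Classifier,
) -> KernelResult:
    """Run the stages in order, stopping early when a later stage does not apply."""
    result = KernelResult(purity=purity_model(image))
    if result.purity != "pure":
        return result  # assumption: impure kernels skip Stages 2 and 3

    result.shape = shape_model(image)
    if result.shape != "flat":
        return result  # assumption: only flat kernels reach Stage 3

    result.embryo = embryo_model(image)
    return result
```

Each `*_model` argument can be any callable that wraps a trained stage model, so the same control flow works with stub functions in tests and with real CvT-13 classifiers in production.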

	## Architecture

	- Model: CvT-13 (384×384) with ImageNet-22k pretraining
	- Framework: PyTorch + Microsoft CvT
	- Test Accuracy: 93.8% (Stage 1), 94.1% (Stage 2), 91.1% (Stage 3)
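
The accuracies above are reported per stage. As a rough back-of-the-envelope illustration, the sketch below multiplies them to estimate how often a kernel that passes through all three stages would be labeled correctly at every stage. This assumes stage errors are independent and that each stage's test accuracy carries over to the chained setting. Neither assumption is confirmed here, so the figure is an estimate and not a reported result. The function name `chained_accuracy` is hypothetical.

```python
import math

# Per-stage test accuracies as listed in the Architecture section.
STAGE_ACCURACY = {"purity": 0.938, "shape": 0.941, "embryo": 0.911}


def chained_accuracy(accuracies):
    """Product of per-stage accuracies (assumes independent errors)."""
    return math.prod(accuracies)


if __name__ == "__main__":
    estimate = chained_accuracy(STAGE_ACCURACY.values())
    print(f"Estimated all-stages-correct rate: {estimate:.3f}")
```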

	## Setup

	```bash
	# Clone the Microsoft CvT repository, which provides the CvT implementation
	git clone https://github.com/microsoft/CvT.git

	# Install dependencies
	pip install -r requirements.txt
	```

	## Training

	Each stage has its own training script:

	```bash
	python stage1/train_cvt13.py # Purity classification
	python stage2/train_cvt13.py # Shape classification
	python stage3/train_cvt13.py # Embryo orientation
	```

	## Inference

	```bash
	python stage1/inference_cvt13.py
	python stage2/inference_cvt13.py
	python stage3/inference_cvt13.py
	```
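
To run the training or inference commands above for all stages in one go, a small wrapper along these lines may help. It is a minimal sketch, not part of the repository: `build_commands` and `run_all` are hypothetical helpers, and the script paths are taken from the commands shown above and assumed to be relative to the repository root.

```python
import subprocess
import sys

STAGES = ("stage1", "stage2", "stage3")
SCRIPTS = {"train": "train_cvt13.py", "inference": "inference_cvt13.py"}


def build_commands(mode):
    """Return one command per stage for the given mode ("train" or "inference")."""
    script = SCRIPTS[mode]  # raises KeyError for an unknown mode
    return [[sys.executable, f"{stage}/{script}"] for stage in STAGES]


def run_all(mode):
    """Run the stage scripts in order, stopping at the first failure."""
    for cmd in build_commands(mode):
        print("Running:", " ".join(cmd))
        subprocess.run(cmd, check=True)  # check=True raises on a non-zero exit code
```

Calling `run_all("train")` or `run_all("inference")` from the repository root runs the stages sequentially with the current Python interpreter.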

	## Baselines

	ResNet50 and DenseNet121 baselines are available in `baselines/`.

	## Structure

	```
	├── stage1/ # Purity classification
	├── stage2/ # Shape classification
	├── stage3/ # Embryo orientation
	├── baselines/ # ResNet50 and DenseNet121 baselines
	└── preprocess/ # Data preprocessing scripts
	```

	## Requirements

	- Python 3.13+
	- PyTorch 2.9+
	- CUDA (optional, for GPU training)
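
A quick way to check these requirements before training is sketched below. The helper names `parse_major_minor` and `check_environment` are hypothetical. The minimum versions are copied from the list above, and the script only reports problems without installing anything.

```python
import re
import sys

MIN_PYTHON = (3, 13)
MIN_TORCH = (2, 9)


def parse_major_minor(version):
    """Extract (major, minor) from a version string such as '2.9.1+cu128'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def check_environment():
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        problems.append(f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ required")
    try:
        import torch
    except ImportError:
        problems.append("PyTorch is not installed")
    else:
        if parse_major_minor(torch.__version__) < MIN_TORCH:
            problems.append(f"PyTorch {MIN_TORCH[0]}.{MIN_TORCH[1]}+ required")
        elif not torch.cuda.is_available():
            # CUDA is optional, so this is a notice and not a problem.
            print("CUDA not available; training will run on CPU")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("Problem:", problem)
```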

	---

	## Citation

	If you use this code or these models in your research, please cite:

	```bibtex
	@Article{computers15010002,
	AUTHOR = {Erukude, Sai Teja and Mascarenhas, Jane and Shamir, Lior},
	TITLE = {CornViT: A Multi-Stage Convolutional Vision Transformer Framework for Hierarchical Corn Kernel Analysis},
	JOURNAL = {Computers},
	VOLUME = {15},
	YEAR = {2026},
	NUMBER = {1},
	ARTICLE-NUMBER = {2},
	URL = {https://www.mdpi.com/2073-431X/15/1/2},
	ISSN = {2073-431X},
	DOI = {10.3390/computers15010002}
	}
	```